101
Dey P, Ray SD, Manchi S, Pramod P, Kochiganti VHS, Singh RP. Whole genome sequencing and microsatellite motif discovery of farmed Japanese quail (Coturnix japonica): a first record from India. PROCEEDINGS OF THE INDIAN NATIONAL SCIENCE ACADEMY 2022. [DOI: 10.1007/s43538-022-00118-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
102
Rayamajhi N, Cheng CHC, Catchen JM. Evaluating Illumina-, Nanopore-, and PacBio-based genome assembly strategies with the bald notothen, Trematomus borchgrevinki. G3 (BETHESDA, MD.) 2022; 12:jkac192. [PMID: 35904764 PMCID: PMC9635638 DOI: 10.1093/g3journal/jkac192] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/21/2022] [Accepted: 07/18/2022] [Indexed: 11/16/2022]
Abstract
For any genome-based research, a robust genome assembly is required. De novo assembly strategies have evolved with changes in DNA sequencing technologies and have been through at least 3 phases: (1) short-read only, (2) short- and long-read hybrid, and (3) long-read only assemblies. Each of the phases has its own error model. We hypothesized that hidden short-read scaffolding errors and erroneous long-read contigs degrade the quality of short- and long-read hybrid assemblies. We assembled the genome of Trematomus borchgrevinki from data generated during each of the 3 phases and assessed the quality problems we encountered. We developed strategies such as k-mer-assembled region replacement, parameter optimization, and long-read sampling to address the error models. We demonstrated that a k-mer-based strategy improved short-read assemblies as measured by Benchmarking Universal Single-Copy Ortholog while mate-pair libraries introduced hidden scaffolding errors and perturbed Benchmarking Universal Single-Copy Ortholog scores. Furthermore, we found that although hybrid assemblies can generate higher contiguity they tend to suffer from lower quality. In addition, we found long-read-only assemblies can be optimized for contiguity by subsampling length-restricted raw reads. Our results indicate that long-read contig assembly is the current best choice and that assemblies from phase I and phase II were of lower quality.
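The abstract's last strategy, subsampling length-restricted raw reads, can be pictured with a minimal Python sketch. It is not the authors' pipeline: the function name, the length cutoff, the genome size, and the target coverage are all hypothetical values chosen for illustration.

```python
import random

def subsample_long_reads(reads, min_len, genome_size, target_cov, seed=0):
    """Keep reads of at least min_len bases, then draw them at random
    until roughly target_cov-fold coverage of genome_size is reached."""
    rng = random.Random(seed)          # fixed seed so the draw is repeatable
    eligible = [r for r in reads if len(r) >= min_len]
    rng.shuffle(eligible)
    budget = genome_size * target_cov  # total bases we want to keep
    picked, total = [], 0
    for read in eligible:
        if total >= budget:
            break
        picked.append(read)
        total += len(read)
    return picked

# Toy input: four "reads" of 5, 20, 30, and 40 bases (hypothetical data).
reads = ["A" * 5, "C" * 20, "G" * 30, "T" * 40]
picked = subsample_long_reads(reads, min_len=10, genome_size=10, target_cov=3)
```

In practice the reads would come from a FASTQ file and the cutoff and coverage would be tuned per data set; the sketch only shows the filter-then-sample order.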
Affiliation(s)
- Niraj Rayamajhi
- Department of Evolution, Ecology, and Behavior, University of Illinois, Urbana-Champaign, Champaign, IL 61801, USA
- Chi-Hing Christina Cheng
- Department of Evolution, Ecology, and Behavior, University of Illinois, Urbana-Champaign, Champaign, IL 61801, USA
- Julian M Catchen
- Department of Evolution, Ecology, and Behavior, University of Illinois, Urbana-Champaign, Champaign, IL 61801, USA
103
Characterization of the complete mitochondrial genome of Miamiensis avidus causing flatfish scuticociliatosis. Genetica 2022; 150:407-420. [PMID: 36269500 DOI: 10.1007/s10709-022-00167-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/26/2022] [Accepted: 09/13/2022] [Indexed: 11/04/2022]
Abstract
Miamiensis avidus is a parasitic pathogen that causes the disease scuticociliatosis in teleost fish species. It is a ciliate and a free-living marine protozoan belonging to the order Philasterida, subclass Scuticociliatida, class Oligohymenophorea, and phylum Ciliophora. The complete mt-genome of M. avidus was linear and 38,695 bp in length with 47 genes, including 40 protein-coding genes, two ribosomal RNA (rRNA) genes, and five transfer RNA (tRNA) genes. Of these, 20 genes typically belong to the clusters of orthologous groups, playing roles in energy production and conversion, translation, ribosomal structure and biogenesis, and defense mechanisms. This is the first report of sequencing and characterization of the mt-genome of M. avidus, which was observed to be linear and possessing the typical ciliate mitochondrial genome organization and phylogenetic relationships. Remarkable differences were observed between M. avidus and other ciliates in the mitochondrially encoded rRNAs, extensive gene loss in ribosomal genes and tRNAs, terminal repeat sequences, and stop codon usage. A comparative and phylogenetic analysis of M. avidus and Uronema marinum of the order Hymenostomatida, which is most closely related to the order Philasterida, signified the promise of the mitogenome data of M. avidus as a valuable genetic marker in species detection and taxonomic research. The present study has potential applications in epidemiological studies and host-parasite interaction investigations facilitating disease control.
104
Zhang X, Zhao Y, Kou Y, Chen X, Yang J, Zhang H, Zhao Z, Zhao Y, Zhao G, Li Z. Diploid chromosome-level reference genome and population genomic analyses provide insights into Gypenoside biosynthesis and demographic evolution of Gynostemma pentaphyllum (Cucurbitaceae). HORTICULTURE RESEARCH 2022; 10:uhac231. [PMID: 36643751 PMCID: PMC9832869 DOI: 10.1093/hr/uhac231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2022] [Accepted: 10/01/2022] [Indexed: 06/17/2023]
Abstract
Gynostemma pentaphyllum (Thunb.) Makino is a perennial creeping herbaceous plant in the family Cucurbitaceae, which has great medicinal value and commercial potential, but urgent conservation efforts are needed due to the gradual decreases and fragmented distribution of its wild populations. Here, we report the high-quality diploid chromosome-level genome of G. pentaphyllum obtained using a combination of next-generation sequencing short reads, Nanopore long reads, and Hi-C sequencing technologies. The genome is anchored to 11 pseudo-chromosomes with a total size of 608.95 Mb and 26 588 predicted genes. Comparative genomic analyses indicate that G. pentaphyllum is estimated to have diverged from Momordica charantia 60.7 million years ago, with no recent whole-genome duplication event. Genomic population analyses based on genotyping-by-sequencing and ecological niche analyses indicated low genetic diversity but a strong population structure within the species, which could classify 32 G. pentaphyllum populations into three geographical groups shaped jointly by geographic and climate factors. Furthermore, comparative transcriptome analyses showed that the genes encoding enzyme involved in gypenoside biosynthesis had higher expression levels in the leaves and tendrils. Overall, the findings obtained in this study provide an effective molecular basis for further studies of demographic genetics, ecological adaption, and systematic evolution in Cucurbitaceae species, as well as contributing to molecular breeding, and the biosynthesis and biotransformation of gypenoside.
Affiliation(s)
- Xiao Zhang
- Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Sciences, Northwest University, Xi’an, Shaanxi, 710069, China
- Yuhe Zhao
- Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Sciences, Northwest University, Xi’an, Shaanxi, 710069, China
- Yixuan Kou
- Laboratory of Subtropical Biodiversity, Jiangxi Agricultural University, Nanchang, 330045, China
- Xiaodan Chen
- Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Sciences, Northwest University, Xi’an, Shaanxi, 710069, China
- College of Life Sciences, Shanxi Normal University, Taiyuan, Shanxi, 030012, China
- Jia Yang
- Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Sciences, Northwest University, Xi’an, Shaanxi, 710069, China
- Hao Zhang
- Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Sciences, Northwest University, Xi’an, Shaanxi, 710069, China
- College of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, 510275, China
- Zhe Zhao
- Key Laboratory of Resource Biology and Biotechnology in Western China (Ministry of Education), College of Life Sciences, Northwest University, Xi’an, Shaanxi, 710069, China
- Yuemei Zhao
- School of Biological Sciences, Guizhou Education University, Guiyang, Guizhou, 550018, China
105
Kaya Y, Aydın ZU, Cai X, Wang X, Dönmez AA. Genome-wide characterization of two Aubrieta taxa: Aubrieta canescens subsp. canescens and Au. macrostyla (Brassicaceae). AOB PLANTS 2022; 14:plac035. [PMID: 36196394 PMCID: PMC9521481 DOI: 10.1093/aobpla/plac035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2022] [Accepted: 09/09/2022] [Indexed: 06/16/2023]
Abstract
Aubrieta canescens complex is divided into two subspecies, Au. canescens subsp. canescens, Au. canescens subsp. cilicica and a distinct species, Au. macrostyla, based on molecular phylogeny. We generated a draft assembly of Au. canescens subsp. canescens and Au. macrostyla using paired-end shotgun sequencing. This is the first attempt at genome characterization for the genus. In the presented study, ~165 and ~157 Mbp of the genomes of Au. canescens subsp. canescens and Au. macrostyla were assembled, respectively, and a total of 32 425 and 31 372 gene models were predicted in the genomes of the target taxa, respectively. We corroborated the phylogenomic affinity of taxa with some core Brassicaceae species (Clades A and B) including Arabis alpina. The orthology-based tree suggested that Aubrieta species differentiated from A. alpina 1.3-2.0 mya (million years ago). The genome-wide syntenic comparison of two Aubrieta taxa revealed that Au. canescens subsp. canescens (46 %) and Au. macrostyla (45 %) have an almost identical syntenic gene pair ratio. These novel genome assemblies are the first steps towards the chromosome-level assembly of Au. canescens and understanding the genome diversity within the genus.
Affiliation(s)
- Zübeyde Uğurlu Aydın
- Molecular Plant Systematic Laboratory (MOBIS), Department of Biology, Faculty of Science, Hacettepe University, Ankara 06800, Turkey
- Xu Cai
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Xiaowu Wang
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Ali A Dönmez
- Molecular Plant Systematic Laboratory (MOBIS), Department of Biology, Faculty of Science, Hacettepe University, Ankara 06800, Turkey
106
Finding a home for the ram’s horn squid: phylogenomic analyses support Spirula spirula (Cephalopoda: Decapodiformes) as a close relative of Oegopsida. ORG DIVERS EVOL 2022. [DOI: 10.1007/s13127-022-00583-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
107
Trabuco Amaral D, Mitani Y, Aparecida Silva Bonatelli I, Cerri R, Ohmiya Y, Viviani V. Genome analysis of Phrixothrix hirtus (Phengodidae) railroad worm shows the expansion of odorant-binding gene families and positive selection on morphogenesis and sex determination genes. Gene X 2022; 850:146917. [PMID: 36174905 DOI: 10.1016/j.gene.2022.146917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Revised: 09/14/2022] [Accepted: 09/21/2022] [Indexed: 10/14/2022]
Abstract
Among bioluminescent beetles of the Elateroidea superfamily, Phengodidae is the third largest family, with 244 bioluminescent species distributed only in the Americas, but is still the least studied from the phylogenetic and evolutionary points of view. The railroad worm Phrixothrix hirtus is an essential biological model and symbolic species due to its bicolor bioluminescence, being the only organism that produces true red light among bioluminescent terrestrial species. Here, we performed partial genome assembly of P. hirtus, combining short and long reads generated with Illumina sequencing, providing the first source of genomic information and a framework for comparative analyses of the bioluminescent system in Elateroidea. This is the largest genome described in the Elateroidea superfamily, with an estimated size of ∼3.4 Gb, displaying 32 % GC content, and 67 % transposable elements. Comparative genomic analyses showed a positive selection of genes and gene family expansion events of growths and morphogenesis gene products, which could be associated with the atypical anatomical development and morphogenesis found in paedomorphic females and underdeveloped males. We also observed gene family expansion among distinct odorant-binding receptors, which could be associated with the pheromone communication system typical of these beetles, and retrotransposable elements. Common genes putatively regulating bioluminescence production and control, including two luciferase genes corresponding to lateral lanterns green-emitting and head lanterns red-emitting luciferases with 7 exons and 6 introns, and genes potentially involved in luciferin biosynthesis were found, indicating that there are no clear differences about the presence or absence of gene families associated with bioluminescence in Elateroidea.
Affiliation(s)
- Danilo Trabuco Amaral
- Programa de Pós-Graduação em Biotecnociência, Centro de Ciências Naturais e Humanas. Universidade Federal do ABC (UFABC), Santo André, Brazil
- Yasuo Mitani
- Bioproduction Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Sapporo, Japan
- Ricardo Cerri
- Department of Computational Science, Universidade Federal de São Carlos (UFSCar), São Carlos, Brazil
- Yoshihiro Ohmiya
- Biomedical Research Institute, AIST, Ikeda-Osaka, Japan; Osaka Institute of Technology, OIT, Osaka, Japan
- Vadim Viviani
- Graduate Program of Evolutive Genetics and Molecular Biology, Federal University of São Carlos (UFSCar), São Carlos, Brazil; Graduate Program of Biotechnology and Environmental Monitoring, Federal University of São Carlos (UFSCar), Sorocaba, Brazil.
108
Khan J, Kokot M, Deorowicz S, Patro R. Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2. Genome Biol 2022; 23:190. [PMID: 36076275 PMCID: PMC9454175 DOI: 10.1186/s13059-022-02743-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 01/17/2022] [Accepted: 08/01/2022] [Indexed: 11/13/2022]
Abstract
The de Bruijn graph is a key data structure in modern computational genomics, and construction of its compacted variant resides upstream of many genomic analyses. As the quantity of genomic data grows rapidly, this often forms a computational bottleneck. We present Cuttlefish 2, significantly advancing the state-of-the-art for this problem. On a commodity server, it reduces the graph construction time for 661K bacterial genomes, of size 2.58Tbp, from 4.5 days to 17-23 h; and it constructs the graph for 1.52Tbp white spruce reads in approximately 10 h, while the closest competitor requires 54-58 h, using considerably more memory.
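To make the term "compacted de Bruijn graph" concrete, here is a toy Python sketch that collapses every non-branching chain of k-mers into a single unitig. It is not Cuttlefish 2's algorithm: it holds everything in memory, and as a simplifying assumption it ignores reverse complements (real tools work on canonical k-mers). The function name and the input strings are hypothetical.

```python
def compacted_unitigs(seqs, k):
    """Return the maximal non-branching paths (unitigs) of a toy de Bruijn graph."""
    # Nodes are the distinct k-mers; an edge u -> v exists when the last k-1
    # bases of u equal the first k-1 bases of v and both k-mers were observed.
    nodes = {s[i:i + k] for s in seqs for i in range(len(s) - k + 1)}

    def successors(u):
        return [u[1:] + c for c in "ACGT" if u[1:] + c in nodes]

    def predecessors(u):
        return [c + u[:-1] for c in "ACGT" if c + u[:-1] in nodes]

    def merges_with_next(u):
        # u can be glued to its successor only if the link is unambiguous both ways.
        nxt = successors(u)
        return len(nxt) == 1 and nxt[0] != u and len(predecessors(nxt[0])) == 1

    def starts_unitig(u):
        prev = predecessors(u)
        return not (len(prev) == 1 and merges_with_next(prev[0]))

    def walk(start, seen):
        path, cur = start, start
        seen.add(start)
        while merges_with_next(cur):
            cur = successors(cur)[0]
            if cur in seen:  # closed an isolated cycle
                break
            seen.add(cur)
            path += cur[-1]  # each step adds exactly one new base
        return path

    seen, unitigs = set(), []
    for u in sorted(nodes):
        if starts_unitig(u):
            unitigs.append(walk(u, seen))
    for u in sorted(nodes):  # leftover nodes lie on isolated cycles
        if u not in seen:
            unitigs.append(walk(u, seen))
    return sorted(unitigs)

print(compacted_unitigs(["ACGTT", "ACGTA"], 3))  # ['ACGT', 'GTA', 'GTT']
```

The two inputs share the prefix ACGT and then diverge, so the graph branches at CGT and compaction yields one shared unitig plus one per branch.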
Affiliation(s)
- Jamshed Khan
- Department of Computer Science, University of Maryland, College Park, USA
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, USA
- Marek Kokot
- Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
- Sebastian Deorowicz
- Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
- Rob Patro
- Department of Computer Science, University of Maryland, College Park, USA
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, USA
109
Chen Y, Zhang Y, Wang H, Sun J, Ma L, Miao F, Zhang Z, Cheng Y, Huang J, Yang G, Wang Z. A High-Quality Genome Assembly of Sorghum dochna. Front Genet 2022; 13:844385. [PMID: 36035157 PMCID: PMC9412107 DOI: 10.3389/fgene.2022.844385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2021] [Accepted: 05/24/2022] [Indexed: 11/13/2022]
Abstract
Sweet sorghum (Sorghum dochna) is a high-quality bio-energy crop that also serves as food for humans and animals. However, there is little information on the genomic characteristics of S. dochna. In this study, we presented a high-quality assembly of S. dochna with PacBio long reads, Illumina short reads, high-throughput chromosome capture technology (Hi-C) sequencing data, gene annotation, and a comparative genome analysis. The results showed that the genome of S. dochna was assembled to 777 Mb with a contig N50 of 553.47 kb and a scaffold N50 of 727.11 kb. In addition, the gene annotation predicted 37,971 genes and 39,937 transcripts in the genome of S. dochna. A Venn analysis revealed a set of 7,988 common gene annotations by integrating five databases. A Cafe software analysis showed that 191 gene families were significantly expanded, while 3,794 were significantly contracted in S. dochna. A GO enrichment analysis showed that the expanded gene families were primarily clustered in the metabolic process, DNA reconstruction, and DNA binding among others. The high-quality genome map constructed in this study provides a biological basis for the future analysis of the biological characteristics of S. dochna, which is crucial for its breeding.
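The contig and scaffold N50 values quoted above follow the standard definition: the length L such that sequences of length L or longer together hold at least half of the assembled bases. A minimal Python sketch, using made-up lengths rather than the S. dochna data:

```python
def n50(lengths):
    """Length of the sequence at which the running total, taken from the
    longest sequence downward, first reaches half of the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe form of running >= total / 2
            return length
    raise ValueError("lengths must be a non-empty collection of positive values")

print(n50([2, 3, 4, 5, 6]))  # 5: 6 + 5 = 11 covers at least half of 20
```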
Affiliation(s)
- Yu Chen
- College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, Qingdao Agricultural University, Qingdao, China
- Yongbai Zhang
- College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, Qingdao Agricultural University, Qingdao, China
- Hongjie Wang
- College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, Qingdao Agricultural University, Qingdao, China
- Juan Sun
- College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, Qingdao Agricultural University, Qingdao, China
- Lichao Ma
- College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, Qingdao Agricultural University, Qingdao, China
- Fuhong Miao
- College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, Qingdao Agricultural University, Qingdao, China
- Zixin Zhang
- Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, Qingdao Agricultural University, Qingdao, China
- Yang Cheng
- College of Animal Science, Qingdao Agricultural University, Qingdao, China
- Guofeng Yang
- College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, Qingdao Agricultural University, Qingdao, China
- *Correspondence: Guofeng Yang
- Zengyu Wang
- College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, Qingdao Agricultural University, Qingdao, China
110
Khairi MHF, Nor Muhammad NA, Bunawan H, Abdul Murad AM, Ramzi AB. Unveiling the Core Effector Proteins of Oil Palm Pathogen Ganoderma boninense via Pan-Secretome Analysis. J Fungi (Basel) 2022; 8:jof8080793. [PMID: 36012782 PMCID: PMC9409662 DOI: 10.3390/jof8080793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Revised: 07/04/2022] [Accepted: 07/12/2022] [Indexed: 12/10/2022]
Abstract
Ganoderma boninense is the major causal agent of basal stem rot (BSR) disease in oil palm, causing the progressive rot of the basal part of the stem. Despite its prominence, the key pathogenicity determinants for the aggressive nature of hemibiotrophic infection remain unknown. In this study, genome sequencing and the annotation of G. boninense T10 were carried out using the Illumina sequencing platform, and comparative genome analysis was performed with previously reported G. boninense strains (NJ3 and G3). The pan-secretome of G. boninense was constructed and comprised 937 core orthogroups, 243 accessory orthogroups, and 84 strain-specific orthogroups. In total, 320 core orthogroups were enriched with candidate effector proteins (CEPs) that could be classified as carbohydrate-active enzymes, hydrolases, and non-catalytic proteins. Differential expression analysis revealed an upregulation of five CEP genes that was linked to the suppression of PTI signaling cascade, while the downregulation of four CEP genes was linked to the inhibition of PTI by preventing host defense elicitation. Genome architecture analysis revealed the one-speed architecture of the G. boninense genome and the lack of preferential association of CEP genes to transposable elements. The findings obtained from this study aid in the characterization of pathogenicity determinants and molecular biomarkers of BSR disease.
Affiliation(s)
- Mohamad Hazwan Fikri Khairi
- Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Nor Azlan Nor Muhammad
- Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Hamidun Bunawan
- Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Abdul Munir Abdul Murad
- Department of Biological Sciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Ahmad Bazli Ramzi
- Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Correspondence: Tel.: +603-8921-4546; Fax: +603-8921-3398
111
Pan-genomic, transcriptomic, and miRNA analyses to decipher genetic diversity and anthocyanin pathway genes among the traditional rice landraces. Genomics 2022; 114:110436. [PMID: 35902069 DOI: 10.1016/j.ygeno.2022.110436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2022] [Revised: 07/18/2022] [Accepted: 07/21/2022] [Indexed: 11/21/2022]
Abstract
Black rice is known for its high anthocyanin content, while Joha rice is an aromatic rice with low anthocyanin content from the North Eastern Region (NER) of India. However, there are limited reports on anthocyanin biosynthesis in Manipur Black rice. Therefore, the present study aimed to understand the origin, domestication and anthocyanin biosynthesis pathways in Black rice using next-generation sequencing approaches. With the sequencing data, various analyses were carried out for differential expression and construction of a pan-genome. Protein-coding RNA and small RNA sequencing analyses identified 7415 differentially expressed transcripts and 131 differentially expressed miRNAs, respectively, in NER rice. This is the first extensive study on the identification and expression analysis of miRNAs and their target genes in regulating anthocyanin biosynthesis in NER rice. This study will aid in better understanding the basis of high or low anthocyanin content in different rice genotypes.
112
Xu X, Wang Y, Wang C, Guo G, Yu X, Dai Y, Liu Y, Wei G, He X, Jin G, Zhang Z, Guan Q, Pain A, Wang S, Zhang W, Young ND, Gasser RB, McManus DP, Cao J, Zhou Q, Zhang Q. Chromosome-level genome assembly defines female-biased genes associated with sex determination and differentiation in the human blood fluke Schistosoma japonicum. Mol Ecol Resour 2022; 23:205-221. [PMID: 35844053 DOI: 10.1111/1755-0998.13689] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/10/2021] [Revised: 07/05/2022] [Accepted: 07/11/2022] [Indexed: 12/01/2022]
Abstract
Schistosomiasis is a neglected tropical disease of humans caused by blood flukes of the genus Schistosoma, the only dioecious parasitic flatworm. Although aspects of sex determination, differentiation and reproduction have been studied in some Schistosoma species, almost nothing is known for Schistosoma japonicum, the causative agent of schistosomiasis japonica. This mainly reflects the lack of high-quality genomic and transcriptomic resources for this species. As current genomes for S. japonicum are highly fragmented, we assembled and report a chromosome-level reference genome (seven autosomes, the Z-chromosome and partial W-chromosome), achieving a substantially enhanced gene annotation. Utilizing this genome, we discovered that the sex chromosomes of S. japonicum and its congener S. mansoni independently suppressed recombination during evolution, forming five and two evolutionary strata, respectively. By exploring the W-chromosome and sex-specific transcriptomes, we identified 35 W-linked genes and 257 female-preferentially transcribed genes (FTGs) from our chromosomal assembly and uncovered a signature for sex determination and differentiation in S. japonicum. These FTGs clustering within autosomes or the Z-chromosome exhibit a highly dynamic transcription profile during the pairing of female and male schistosomula, thereby representing a critical phase for the maturation of the female worms and suggesting distinct layers of regulatory control of gene transcription at this development stage. Collectively, these data provide a valuable resource for further functional genomic characterization of S. japonicum, shed light on the evolution of sex chromosomes in this highly virulent human blood fluke, and provide a pathway to identify novel targets for development of intervention tools against schistosomiasis.
Affiliation(s)
- Xindong Xu
- Laboratory of Molecular Parasitology, Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Tongji Hospital, and Clinical Center for Brain and Spinal Cord Research School of Medicine, School of Medicine, Tongji University, Shanghai, China
- Yifeng Wang
- MOE Laboratory of Biosystems Homeostasis and Protection and Zhejiang Provincial Key Laboratory for Cancer Molecular Cell Biology, Life Sciences Institute, Zhejiang University, Hangzhou, China
- Changhong Wang
- Laboratory of Molecular Parasitology, Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Tongji Hospital, and Clinical Center for Brain and Spinal Cord Research School of Medicine, School of Medicine, Tongji University, Shanghai, China
- Gangqiang Guo
- Laboratory of Molecular Parasitology, Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Tongji Hospital, and Clinical Center for Brain and Spinal Cord Research School of Medicine, School of Medicine, Tongji University, Shanghai, China
- Xinyu Yu
- National Health Commission Key Laboratory of Parasitic Disease Control and Prevention, Jiangsu Provincial Key Laboratory on Parasite and Vector Control Technology, Jiangsu Institute of Parasitic Diseases, Wuxi, China
- Yang Dai
- National Health Commission Key Laboratory of Parasitic Disease Control and Prevention, Jiangsu Provincial Key Laboratory on Parasite and Vector Control Technology, Jiangsu Institute of Parasitic Diseases, Wuxi, China
- Yaobao Liu
- National Health Commission Key Laboratory of Parasitic Disease Control and Prevention, Jiangsu Provincial Key Laboratory on Parasite and Vector Control Technology, Jiangsu Institute of Parasitic Diseases, Wuxi, China
- Guiying Wei
- Laboratory of Molecular Parasitology, Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Tongji Hospital, and Clinical Center for Brain and Spinal Cord Research School of Medicine, School of Medicine, Tongji University, Shanghai, China
- Xiaohui He
- Laboratory of Molecular Parasitology, Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Tongji Hospital, and Clinical Center for Brain and Spinal Cord Research School of Medicine, School of Medicine, Tongji University, Shanghai, China
- Ge Jin
- Novogene Bioinformatics Institute, Beijing, China
- Ziqiu Zhang
- Novogene Bioinformatics Institute, Beijing, China
- Qingtian Guan
- Pathogen Genomics Laboratory, Biological and Environmental Sciences and Engineering (BESE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Arnab Pain
- Pathogen Genomics Laboratory, Biological and Environmental Sciences and Engineering (BESE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Shengyue Wang
- National Research Center for Translational Medicine, State Key Laboratory of Medical Genomics, Ruijin Hospital Affiliated to Shanghai Jiao Tong University (SJTU) School of Medicine, Shanghai, China
- Wenbao Zhang
- State Key Laboratory of Pathogenesis, Prevention and Treatment of High Incidence Diseases in Central Asia, Clinical Medical Research Institute, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, China
- Neil D Young
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, Victoria, Australia
- Robin B Gasser
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, Victoria, Australia
- Donald P McManus
- Department of Immunology, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Jun Cao
- National Health Commission Key Laboratory of Parasitic Disease Control and Prevention, Jiangsu Provincial Key Laboratory on Parasite and Vector Control Technology, Jiangsu Institute of Parasitic Diseases, Wuxi, China; Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Qi Zhou
- MOE Laboratory of Biosystems Homeostasis and Protection and Zhejiang Provincial Key Laboratory for Cancer Molecular Cell Biology, Life Sciences Institute, Zhejiang University, Hangzhou, China.,Department of Neuroscience and Developmental Biology, University of Vienna, Vienna, Austria.,Center for Reproductive Medicine, the Second Affiliated Hospital School of Medicine and Life Sciences Institute, Zhejiang University, Hangzhou, China
| | - Qingfeng Zhang
- Laboratory of Molecular Parasitology, Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Tongji Hospital, and Clinical Center for Brain and Spinal Cord Research School of Medicine, School of Medicine, Tongji University, Shanghai, China
|
113
|
K-Mer Spectrum-Based Error Correction Algorithm for Next-Generation Sequencing Data. Computational Intelligence and Neuroscience 2022; 2022:8077664. [PMID: 35875730 PMCID: PMC9303089 DOI: 10.1155/2022/8077664] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/18/2022] [Accepted: 06/13/2022] [Indexed: 11/26/2022]
Abstract
In the mid-1970s, the first-generation sequencing technique (Sanger) was created. It used Advanced BioSystems sequencing devices and Beckman's GeXP genetic testing technology. The second-generation sequencing (2GS) technique arrived just several years after the first human genome was published in 2003. 2GS devices are much faster than Sanger sequencing equipment, with considerably lower manufacturing costs and far higher throughput in the form of short reads. The third-generation sequencing (3GS) method, initially introduced in 2005, offers further reduced manufacturing costs and higher throughput. Even though sequencing techniques have advanced through several generations, they remain error-prone because of the large number of reads they produce. The study of this massive amount of data will aid in the decoding of life's secrets, the detection of infections, the development of improved crops, and the improvement of quality of life, among other things. This is a challenging task, complicated not just by the large number of reads but also by the occurrence of sequencing errors. As a result, error correction is a crucial step in data processing; it entails identifying and correcting read errors. As demonstrated in this work, the performance of various k-spectrum-based error correction algorithms can be influenced by characteristics such as coverage depth, read length, and genome size. Consequently, time and effort must be put into selecting suitable approaches for error correction of particular NGS data.
|
114
|
Santoro D, Pellegrina L, Comin M, Vandin F. SPRISS: approximating frequent k-mers by sampling reads, and applications. Bioinformatics 2022; 38:3343-3350. [PMID: 35583271 PMCID: PMC9237683 DOI: 10.1093/bioinformatics/btac180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Revised: 02/25/2022] [Accepted: 05/16/2022] [Indexed: 11/29/2022] Open
Abstract
MOTIVATION The extraction of k-mers is a fundamental component in many complex analyses of large next-generation sequencing datasets, including reads classification in genomics and the characterization of RNA-seq datasets. The extraction of all k-mers and their frequencies is extremely demanding in terms of running time and memory, owing to the size of the data and to the exponential number of k-mers to be considered. However, in several applications, only frequent k-mers, which are k-mers appearing in a relatively high proportion of the data, are required by the analysis. RESULTS In this work, we present SPRISS, a new efficient algorithm to approximate frequent k-mers and their frequencies in next-generation sequencing data. SPRISS uses a simple yet powerful reads sampling scheme, which allows to extract a representative subset of the dataset that can be used, in combination with any k-mer counting algorithm, to perform downstream analyses in a fraction of the time required by the analysis of the whole data, while obtaining comparable answers. Our extensive experimental evaluation demonstrates the efficiency and accuracy of SPRISS in approximating frequent k-mers, and shows that it can be used in various scenarios, such as the comparison of metagenomic datasets, the identification of discriminative k-mers, and SNP (single nucleotide polymorphism) genotyping, to extract insights in a fraction of the time required by the analysis of the whole dataset. AVAILABILITY AND IMPLEMENTATION SPRISS [a preliminary version (Santoro et al., 2021) of this work was presented at RECOMB 2021] is available at https://github.com/VandinLab/SPRISS. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
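The idea of estimating frequent k-mers from a sample of reads can be illustrated with a short sketch. This is not the SPRISS algorithm or its sampling scheme: it simply draws a uniform random subset of reads and counts k-mers in that subset with a plain dictionary counter. The function names (`count_kmers`, `frequent_kmers`, `sample_reads`) and the frequency threshold `theta` are assumptions made for illustration.

```python
import random
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurrence across a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def frequent_kmers(reads, k, theta):
    """Return {k-mer: relative frequency} for k-mers with frequency >= theta."""
    counts = count_kmers(reads, k)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: c / total for kmer, c in counts.items() if c / total >= theta}


def sample_reads(reads, fraction, seed=0):
    """Draw a uniform random subset of reads (illustrative, not SPRISS's scheme)."""
    rng = random.Random(seed)
    n = max(1, round(fraction * len(reads)))
    return rng.sample(reads, n)


if __name__ == "__main__":
    reads = ["ACGTACGT"] * 50 + ["TTTTGGGG"] * 50
    exact = frequent_kmers(reads, k=4, theta=0.15)
    approx = frequent_kmers(sample_reads(reads, 0.2), k=4, theta=0.15)
    print("exact:", exact)
    print("from sample:", approx)
```

The sampled estimate touches only a fraction of the reads, so it is cheaper to compute, but its frequencies vary with the sample drawn; bounding that error is the part a method such as SPRISS addresses and this sketch does not.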
Affiliation(s)
- Diego Santoro
- Department of Information Engineering, University of Padova, 35131 Padova, Italy
| | - Leonardo Pellegrina
- Department of Information Engineering, University of Padova, 35131 Padova, Italy
| | - Matteo Comin
- Department of Information Engineering, University of Padova, 35131 Padova, Italy
| | - Fabio Vandin
- Department of Information Engineering, University of Padova, 35131 Padova, Italy
|
115
|
Liu S, Koslicki D. CMash: fast, multi-resolution estimation of k-mer-based Jaccard and containment indices. Bioinformatics 2022; 38:i28-i35. [PMID: 35758788 PMCID: PMC9235470 DOI: 10.1093/bioinformatics/btac237] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022] Open
Abstract
Motivation K-mer-based methods are used ubiquitously in the field of computational biology. However, determining the optimal value of k for a specific application often remains heuristic. Simply reconstructing a new k-mer set with another k-mer size is computationally expensive, especially in metagenomic analysis where datasets are large. Here, we introduce a hashing-based technique that leverages a kind of bottom-m sketch as well as a k-mer ternary search tree (KTST) to obtain k-mer-based similarity estimates for a range of k values. By truncating k-mers stored in a pre-built KTST with a large k=kmax value, we can simultaneously obtain k-mer-based estimates for all k values up to kmax. This truncation approach circumvents the reconstruction of new k-mer sets when changing k values, making analysis more time and space-efficient. Results We derived the theoretical expression of the bias factor due to truncation. And we showed that the biases are negligible in practice: when using a KTST to estimate the containment index between a RefSeq-based microbial reference database and simulated metagenome data for 10 values of k, the running time was close to 10× faster compared to a classic MinHash approach while using less than one-fifth the space to store the data structure. Availability and implementation A python implementation of this method, CMash, is available at https://github.com/dkoslicki/CMash. The reproduction of all experiments presented herein can be accessed via https://github.com/KoslickiLab/CMASH-reproducibles. Supplementary information Supplementary data are available at Bioinformatics online.
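The truncation idea in this abstract can be made concrete with exact sets. The sketch below uses plain Python sets rather than a bottom-m sketch or a ternary search tree, so it is not CMash's implementation; the helper names (`kmer_set`, `truncate`, `containment`) are assumptions. It shows that prefixes of the kmax-mers recover most smaller k-mers, and that the ones missed are those near the end of the sequence, which is one source of the truncation bias the abstract mentions.

```python
def kmer_set(seq, k):
    """All distinct k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def truncate(kmers, k):
    """Derive k-mers for a smaller k by keeping the length-k prefix of each k-mer."""
    return {kmer[:k] for kmer in kmers}


def containment(a, b):
    """Containment index of set a in set b: |a & b| / |a|."""
    return len(a & b) / len(a) if a else 0.0


if __name__ == "__main__":
    seq = "ACGTTGCA"
    k_max, k = 5, 3
    from_truncation = truncate(kmer_set(seq, k_max), k)
    exact = kmer_set(seq, k)
    # k-mers starting in the last (k_max - k) positions have no k_max-mer prefix
    print("missed by truncation:", sorted(exact - from_truncation))
    print("containment of truncated in exact:", containment(from_truncation, exact))
```

Building the k-mer set once at kmax and truncating avoids re-scanning the data for each k, at the cost of the small number of missed k-mers shown above.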
Affiliation(s)
- Shaopeng Liu
- Huck Institutes of Life Sciences, Pennsylvania State University, State College, PA 16801, USA
| | - David Koslicki
- Huck Institutes of Life Sciences, Pennsylvania State University, State College, PA 16801, USA.,Department of Computer Science and Engineering, Pennsylvania State University, State College, PA 16801, USA.,Department of Biology, Pennsylvania State University, State College, PA 16801, USA
|
116
|
Phazna TA, Ngashangva N, Yentrembam RBS, Maurya R, Mukherjee P, Sharma C, Verma PK, Sarangthem I. Draft genome sequence and functional analysis of Lysinibacillus xylanilyticus t26, a plant growth-promoting bacterium isolated from Capsicum chinense rhizosphere. J Biosci 2022. [DOI: 10.1007/s12038-022-00264-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
|
117
|
Salinas-Restrepo C, Misas E, Estrada-Gómez S, Quintana-Castillo JC, Guzman F, Calderón JC, Giraldo MA, Segura C. Improving the Annotation of the Venom Gland Transcriptome of Pamphobeteus verdolaga, Prospecting Novel Bioactive Peptides. Toxins (Basel) 2022; 14:408. [PMID: 35737069 PMCID: PMC9228390 DOI: 10.3390/toxins14060408] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/30/2022] [Revised: 06/06/2022] [Accepted: 06/07/2022] [Indexed: 02/01/2023] Open
Abstract
Spider venoms constitute a trove of novel peptides with biotechnological interest. Paucity of next-generation-sequencing (NGS) data generation has led to a description of less than 1% of these peptides. Increasing evidence supports the underestimation of the assembled genes a single transcriptome assembler can predict. Here, the transcriptome of the venom gland of the spider Pamphobeteus verdolaga was re-assembled, using three free access algorithms, Trinity, SOAPdenovo-Trans, and SPAdes, to obtain a more complete annotation. Assembler's performance was evaluated by contig number, N50, read representation on the assembly, and BUSCO's terms retrieval against the arthropod dataset. Out of all the assembled sequences with all software, 39.26% were common between the three assemblers, and 27.88% were uniquely assembled by Trinity, while 27.65% were uniquely assembled by SPAdes. The non-redundant merging of all three assemblies' output permitted the annotation of 9232 sequences, which was 23% more when compared to each software and 28% more when compared to the previous P. verdolaga annotation; moreover, the description of 65 novel theraphotoxins was possible. In the generation of data for non-model organisms, as well as in the search for novel peptides with biotechnological interest, it is highly recommended to employ at least two different transcriptome assemblers.
Affiliation(s)
- Cristian Salinas-Restrepo
- Grupo Toxinología, Alternativas Terapéuticas y Alimentarias, Facultad de Ciencias Farmacéuticas y Alimentarias, Universidad de Antioquia, Medellín 050012, Colombia; (C.S.-R.); (S.E.-G.)
| | - Elizabeth Misas
- Corporación para Investigaciones Biológicas, Medellín 050012, Colombia;
| | - Sebastian Estrada-Gómez
- Grupo Toxinología, Alternativas Terapéuticas y Alimentarias, Facultad de Ciencias Farmacéuticas y Alimentarias, Universidad de Antioquia, Medellín 050012, Colombia; (C.S.-R.); (S.E.-G.)
- Centro de Investigación en Recursos Naturales y Sustentabilidad, Universidad Bernardo O’Higgins, Avenida Viel 1497, Santiago 7750000, Chile
| | | | - Fanny Guzman
- Núcleo Biotecnología Curauma (NBC), Pontifícia Universidad Católica de Valparaíso, Valparaíso 2374631, Chile;
| | - Juan C. Calderón
- Physiology and Biochemistry Research Group-PHYSIS, Faculty of Medicine, University of Antioquia, Medellín 050012, Colombia;
| | - Marco A. Giraldo
- Biophysics Group, Institute of Physics, University of Antioquia, Medellín 050012, Colombia;
| | - Cesar Segura
- Grupo Malaria, Facultad de Medicina, Universidad de Antioquia, Medellín 050012, Colombia
|
118
|
Lenz AR, Balbinot E, de Abreu FP, de Oliveira NS, Fontana RC, de Avila E Silva S, Park MS, Lim YW, Houbraken J, Camassola M, Dillon AJP. Taxonomy, comparative genomics and evolutionary insights of Penicillium ucsense: a novel species in series Oxalica. Antonie Van Leeuwenhoek 2022; 115:1009-1029. [PMID: 35678932 DOI: 10.1007/s10482-022-01746-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/02/2021] [Accepted: 05/03/2022] [Indexed: 10/18/2022]
Abstract
The genomes of two Penicillium strains were sequenced and analyzed in this study: strain 2HH, isolated from the digestive tract of an Anobium punctatum beetle larva in 1979, and the cellulase hypersecretory strain S1M29, derived from strain 2HH by a long-term mutagenesis process. With these data, the strains were reclassified and insight was obtained into molecular features related to cellulase hyperproduction and the albino phenotype of the mutant. Both strains were previously identified as Penicillium echinulatum, and this investigation indicated that they should be reclassified. Phylogenetic and phenotype data showed that these strains represent a new Penicillium species in series Oxalica, for which the name Penicillium ucsense is proposed here. Six additional strains (SFC101850, SFCP10873, SFCP10886, SFCP10931, SFCP10932 and SFCP10933) collected from the marine environment in the Republic of Korea were also classified as this species, indicating a worldwide distribution of this new taxon. Compared to the closely related strain Penicillium oxalicum 114-2, the composition of cell wall-associated proteins of P. ucsense 2HH shows five fewer chitinases and considerable differences in the number of proteins related to β-D-glucan metabolism. The genomic comparison of 2HH and S1M29 highlighted single amino-acid substitutions in two major proteins (BGL2 and FlbA) that can be associated with the hyperproduction of cellulases. The study of melanin pathways shows that the S1M29 albino phenotype resulted from a single amino-acid substitution in the enzyme ALB1, a precursor of the 1,8-dihydroxynaphthalene (DHN)-melanin biosynthesis. Our study provides important knowledge towards understanding species distribution, molecular mechanisms, melanin production and cell wall biosynthesis of this new Penicillium species.
Affiliation(s)
- Alexandre Rafael Lenz
- Bioinformatics and Computational Biology Laboratory, Institute of Biotechnology, University of Caxias Do Sul, Francisco Getúlio Vargas Street 1130, Caxias do Sul, RS, 95070-560, Brazil. .,Bahia State University, Silveira Martins Street 2555, Salvador, BA, 41150-000, Brazil.
| | - Eduardo Balbinot
- Bioinformatics and Computational Biology Laboratory, Institute of Biotechnology, University of Caxias Do Sul, Francisco Getúlio Vargas Street 1130, Caxias do Sul, RS, 95070-560, Brazil
| | - Fernanda Pessi de Abreu
- Bioinformatics and Computational Biology Laboratory, Institute of Biotechnology, University of Caxias Do Sul, Francisco Getúlio Vargas Street 1130, Caxias do Sul, RS, 95070-560, Brazil
| | - Nikael Souza de Oliveira
- Bioinformatics and Computational Biology Laboratory, Institute of Biotechnology, University of Caxias Do Sul, Francisco Getúlio Vargas Street 1130, Caxias do Sul, RS, 95070-560, Brazil
| | - Roselei Claudete Fontana
- Laboratory of Enzymes and Biomass, Institute of Biotechnology, University of Caxias Do Sul, Francisco Getúlio Vargas Street 1130, Caxias do Sul, RS, 95070-560, Brazil
| | - Scheila de Avila E Silva
- Bioinformatics and Computational Biology Laboratory, Institute of Biotechnology, University of Caxias Do Sul, Francisco Getúlio Vargas Street 1130, Caxias do Sul, RS, 95070-560, Brazil
| | - Myung Soo Park
- School of Biological Sciences and Institution of Microbiology, Seoul National University, Seoul, 08826, South Korea
| | - Young Woon Lim
- School of Biological Sciences and Institution of Microbiology, Seoul National University, Seoul, 08826, South Korea
| | - Jos Houbraken
- Westerdijk Fungal Biodiversity Institute, Uppsalalaan 8, 3584 CT, Utrecht, The Netherlands
| | - Marli Camassola
- Laboratory of Enzymes and Biomass, Institute of Biotechnology, University of Caxias Do Sul, Francisco Getúlio Vargas Street 1130, Caxias do Sul, RS, 95070-560, Brazil
| | - Aldo José Pinheiro Dillon
- Laboratory of Enzymes and Biomass, Institute of Biotechnology, University of Caxias Do Sul, Francisco Getúlio Vargas Street 1130, Caxias do Sul, RS, 95070-560, Brazil
|
119
|
Lu XY, Zhang QF, Jiang DD, Du CH, Xu R, Guo XG, Yang X. Characterization of the complete mitochondrial genome of Ixodes granulatus (Ixodidae) and its phylogenetic implications. Parasitol Res 2022; 121:2347-2358. [PMID: 35650429 DOI: 10.1007/s00436-022-07561-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/21/2022] [Accepted: 05/23/2022] [Indexed: 11/25/2022]
Abstract
Ticks are deemed to be second only to mosquitoes as the most common vector of human infectious diseases worldwide, giving rise to human and animal diseases and economic losses to livestock production. Our understanding of the phylogenetic relationships among tick lineages has been restricted by reliance on phylogenetic markers from individual genes. Genomic data research could help advance our understanding of phylogenetic analysis and molecular evolution. Mitochondrial genomic DNA has facilitated the phylogenetic analysis of eukaryotes, including ticks. In this study, we sequenced and assembled the complete circular mitogenome of Ixodes granulatus. The 14,540-bp mitogenome consists of 37 genes, including 13 protein-coding genes (PCGs), two genes for ribosomal RNA (rRNAs), and 22 genes for transfer RNA (tRNAs), and the origin of the L-strand replication region. The directions of the coding strand and component genes in the non-Australasian Ixodes mitochondrial genome were similar to those found in most other Australasian Ixodes, except for the loss of a lengthy control region. The phylogenetic tree based on maximum likelihood (ML) and Bayesian inference (BI) computational algorithms showed that I. granulatus exhibits a close relationship with I. hexagonus and I. ricinus. To our knowledge, this is the first study exploring the complete mitogenome for the species I. granulatus. Our results provide new insights for further research on the evolution, population genetics, systematics, and molecular ecology of ticks.
Affiliation(s)
- Xin-Yan Lu
- Integrated Laboratory of Pathogenic Biology, College of Preclinical Medicine, Dali University, Dali, 671000, People's Republic of China
| | - Quan-Fu Zhang
- Department of Gastroenterology, Clinical Medical College and the First Affiliated Hospital of Chengdu Medical College, Chengdu, China
| | - Dan-Dan Jiang
- School of Public Health, Dali University, Dali, 671000, People's Republic of China
| | - Chun-Hong Du
- Yunnan Institute of Endemic Diseases Control and Prevention, Dali, Yunnan, 671000, People's Republic of China
| | - Rong Xu
- College of Preclinical Medicine, Dali University, Dali, 671000, People's Republic of China
| | - Xian-Guo Guo
- Institute of Pathogens and Vectors, Yunnan Provincial Key Laboratory for Zoonosis Control and Prevention, Dali University, Dali, 671000, Yunnan, China.
| | - Xing Yang
- Integrated Laboratory of Pathogenic Biology, College of Preclinical Medicine, Dali University, Dali, 671000, People's Republic of China.
|
120
|
Ant phylogenomics reveals a natural selection hotspot preceding the origin of complex eusociality. Curr Biol 2022; 32:2942-2947.e4. [PMID: 35623348 DOI: 10.1016/j.cub.2022.05.001] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 12/17/2021] [Revised: 03/09/2022] [Accepted: 05/02/2022] [Indexed: 12/30/2022]
Abstract
The evolution of eusociality has allowed ants to become one of the most conspicuous and ecologically dominant groups of organisms in the world. A large majority of the current ∼14,000 ant species belong to the formicoids,1 a clade of nine subfamilies that exhibit the most extreme forms of reproductive division of labor, large colony size,2 worker polymorphism,3 and extended queen longevity.4 The eight remaining non-formicoid subfamilies are less well studied, with few genomes having been sequenced so far and unclear phylogenetic relationships.5 By sequencing 65 genomes, we provide a robust phylogeny of the 17 ant subfamilies, retrieving high support to the controversial leptanillomorph clade (Leptanillinae and Martialinae) as the sister group to all other extant ants. Moreover, our genomic analyses revealed that the emergence of the formicoids was accompanied by an elevated number of positive selection events. Importantly, the top three gene functions under selection are linked to key features of complex eusociality, with histone acetylation being implicated in caste differentiation, gene silencing by RNA in worker sterility, and autophagy in longevity. These results show that the key pathways associated with eusociality have been under strong selection during the Cretaceous, suggesting that the molecular foundations of complex eusociality may have evolved rapidly in less than 20 Ma.
|
121
|
Li JX, Coombe L, Wong J, Birol I, Warren RL. ntEdit+Sealer: Efficient Targeted Error Resolution and Automated Finishing of Long-Read Genome Assemblies. Curr Protoc 2022; 2:e442. [PMID: 35567771 PMCID: PMC9196995 DOI: 10.1002/cpz1.442] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/11/2022]
Abstract
High‐quality genome assemblies are crucial to many biological studies, and utilizing long sequencing reads can help achieve higher assembly contiguity. While long reads can resolve complex and repetitive regions of a genome, their relatively high associated error rates are still a major limitation. Long reads generally produce draft genome assemblies with lower base quality, which must be corrected with a genome polishing step. Hybrid genome polishing solutions can greatly improve the quality of long‐read genome assemblies by utilizing more accurate short reads to validate bases and correct errors. Currently available hybrid polishing methods rely on read alignments, and are therefore memory‐intensive and do not scale well to large genomes. Here we describe ntEdit+Sealer, an alignment‐free, k‐mer‐based genome finishing protocol that employs memory‐efficient Bloom filters. The protocol includes ntEdit for correcting base errors and small indels, and for marking potentially problematic regions, then Sealer for filling both assembly gaps and problematic regions flagged by ntEdit. ntEdit+Sealer produces highly accurate, error‐corrected genome assemblies, and is available as a Makefile pipeline from https://github.com/bcgsc/ntedit_sealer_protocol. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol: Automated long‐read genome finishing with short reads Support Protocol: Selecting optimal values for k‐mer lengths (k) and Bloom filter size (b)
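The alignment-free, Bloom-filter-based approach described here can be sketched in a few lines. The code below is not ntEdit or Sealer and does not reproduce their algorithms or parameters: it is a minimal illustration, assuming a toy `BloomFilter` class built on `hashlib` and a hypothetical helper `unsupported_kmer_starts`, of how k-mers from accurate short reads can be stored compactly and then used to flag positions in a draft assembly whose k-mers have no read support.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, a small chance of false positives."""

    def __init__(self, size_bits, num_hashes):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def unsupported_kmer_starts(draft, reads, k, size_bits=1 << 16, num_hashes=3):
    """Start positions of draft k-mers that are absent from the read k-mer filter."""
    bf = BloomFilter(size_bits, num_hashes)
    for read in reads:
        for i in range(len(read) - k + 1):
            bf.add(read[i:i + k])
    return [i for i in range(len(draft) - k + 1) if draft[i:i + k] not in bf]


if __name__ == "__main__":
    reads = ["ACGTACGTAC"]
    draft = "ACGTTCGTAC"  # one substituted base at index 4
    print(unsupported_kmer_starts(draft, reads, k=4))
```

A single wrong base makes every k-mer overlapping it unsupported, so a run of flagged positions localizes the error. The filter size and number of hash functions trade memory against the false-positive rate, which is why the protocol's support section covers choosing k and the Bloom filter size.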
Affiliation(s)
- Janet X Li
- Canada's Michael Smith Genome Sciences Center, Vancouver, BC, Canada.,Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, Canada
| | - Lauren Coombe
- Canada's Michael Smith Genome Sciences Center, Vancouver, BC, Canada
| | - Johnathan Wong
- Canada's Michael Smith Genome Sciences Center, Vancouver, BC, Canada
| | - Inanç Birol
- Canada's Michael Smith Genome Sciences Center, Vancouver, BC, Canada.,Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
| | - René L Warren
- Canada's Michael Smith Genome Sciences Center, Vancouver, BC, Canada
|
122
|
Dias MC, Caldeira C, Gastauer M, Ramos S, Oliveira G. Cross-species transcriptomes reveal species-specific and shared molecular adaptations for plants development on iron-rich rocky outcrops soils. BMC Genomics 2022; 23:313. [PMID: 35439930 PMCID: PMC9020022 DOI: 10.1186/s12864-022-08449-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/22/2021] [Accepted: 02/23/2022] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Canga is the Brazilian term for the savanna-like vegetation harboring several endemic species on iron-rich rocky outcrops, usually considered for mining activities. Parkia platycephala Benth. and Stryphnodendron pulcherrimum (Willd.) Hochr. naturally occur in the cangas of Serra dos Carajás (eastern Amazonia, Brazil) and the surrounding forest, indicating high phenotypic plasticity. The morphological and physiological mechanisms of the plants' establishment in the canga environment are well studied, but the molecular adaptative responses are still unknown. To understand these adaptative responses, we aimed to identify molecular mechanisms that allow the establishment of these plants in the canga environment. RESULTS Plants were grown in canga and forest substrates collected in the Carajás Mineral Province. RNA was extracted from pooled leaf tissue, and RNA-seq paired-end reads were assembled into representative transcriptomes for P. platycephala and S. pulcherrimum containing 31,728 and 31,311 primary transcripts, respectively. We identified both species-specific and core molecular responses in plants grown in the canga substrate using differential expression analyses. In the species-specific analysis, we identified 1,112 and 838 differentially expressed genes for P. platycephala and S. pulcherrimum, respectively. Enrichment analyses showed that unique biological processes and metabolic pathways were affected for each species. Comparative differential expression analysis was based on shared single-copy orthologs. The overall pattern of ortholog expression was species-specific. Even so, we identified almost 300 altered genes between plants in canga and forest substrates with conserved responses in the two species. The genes were functionally associated with the response to light stimulus and the circadian rhythm pathway. CONCLUSIONS Plants possess species-specific adaptative responses to cope with the substrates. Our results also suggest that plants adapted to both canga and forest environments can adjust the circadian rhythm in a substrate-dependent manner. The circadian clock gene modulation might be a central mechanism regulating the plants' development in the canga substrate in the studied legume species. The mechanism may be shared as a common mechanism to abiotic stress compensation in other native species.
Affiliation(s)
- Mariana Costa Dias
- Instituto Tecnológico Vale, Rua Boaventura da Silva 955, Belém, Pará, CEP 66055-090, Brazil
- Universidade Federal de Minas Gerais, Avenida Antônio Carlos 6627, Belo Horizonte, Minas Gerais, CEP 31270-901, Brazil
| | - Cecílio Caldeira
- Instituto Tecnológico Vale, Rua Boaventura da Silva 955, Belém, Pará, CEP 66055-090, Brazil
| | - Markus Gastauer
- Instituto Tecnológico Vale, Rua Boaventura da Silva 955, Belém, Pará, CEP 66055-090, Brazil
| | - Silvio Ramos
- Instituto Tecnológico Vale, Rua Boaventura da Silva 955, Belém, Pará, CEP 66055-090, Brazil
| | - Guilherme Oliveira
- Instituto Tecnológico Vale, Rua Boaventura da Silva 955, Belém, Pará, CEP 66055-090, Brazil.
|
123
|
Graciano RCD, Oliveira RS, Santos IM, Yazbeck GM. Genomic Resources for Salminus brasiliensis. Front Genet 2022; 13:855718. [PMID: 35419039 PMCID: PMC8995856 DOI: 10.3389/fgene.2022.855718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2022] [Accepted: 02/18/2022] [Indexed: 11/13/2022] Open
Abstract
The Neotropical region bears the most diverse freshwater fish fauna on the planet and is the stage for dramatic conservation struggles. Initiatives aiming for conservation of a single emblematic fish, a flagship species, to which different onlookers relate on a cultural/personal level, hold promise for engagement and conservation actions benefiting whole biological communities and ecosystems. Here, we present the first comprehensive genomic resources for Salminus brasiliensis, a potential flagship Neotropical species. This fish faces pressing conservation issues, as well as taxonomic uncertainty, being a main species relevant to angling and commercial fisheries. We make available 178 million Illumina paired-end reads, 90 bases long, comprising 16 Gb (≈15X coverage) of filtered data, obtained from a primary genomic library of 500-bp fragments. We present the first de novo genomic assembly for S. brasiliensis, with ∼1 Gb (N50 = 10,889), as well as the coding genome annotation of 12,962 putative genes from assembled genomic fragments over 10 kb, most of which could be identified from the Ostariophysi GenBank database. We also provide a genome-wide panel of more than 80,000 predicted microsatellite loci for low-cost, fast and abundant DNA marker development for this species. A total of 47, among 52 candidates, empirically assayed microsatellites were confirmed as polymorphic in this fish. All genomic data produced for S. brasiliensis is hereby made publicly accessible. With the disclosure of these results, we intend to foster general biology studies and to provide tools to be applied immediately in conservation and aquaculture in this candidate flagship Neotropical species.
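The sequencing figures quoted in this abstract follow from simple arithmetic, sketched below. The read count and read length come from the abstract; the genome size passed to `coverage` is an assumed round figure based on the reported ∼1 Gb assembly, not a value stated in the source, and the contig lengths in the N50 example are invented purely to show the calculation.

```python
def total_bases(num_reads, read_length):
    """Total sequenced bases for fixed-length reads."""
    return num_reads * read_length


def coverage(bases, genome_size):
    """Average depth of coverage: sequenced bases divided by genome size."""
    return bases / genome_size


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the assembly."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("n50 requires at least one contig length")


if __name__ == "__main__":
    bases = total_bases(178_000_000, 90)   # figures from the abstract
    print(bases / 1e9, "Gb")               # about 16 Gb
    # Assumed genome size of 1.0e9 bp (illustrative; see lead-in).
    print(round(coverage(bases, 1.0e9), 1), "X")
    # Made-up contig lengths, only to demonstrate the N50 definition.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

With a 1.0 Gb genome this gives roughly 16X; the ≈15X quoted in the abstract would correspond to a slightly larger genome size estimate.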
Affiliation(s)
- Raissa Cristina Dias Graciano
- Laboratório de Recursos Genéticos, Programa de Pós Graduação Em Biotecnologia, Universidade Federal de São João Del Rei, São João Del Rei, Brazil
| | - Rafael Sachetto Oliveira
- Departamento de Ciência da Computação, Universidade Federal de São João Del Rei, São João Del Rei, Brazil
| | - Isllas Miguel Santos
- Laboratório de Recursos Genéticos, Departamento de Zootecnia, Universidade Federal de São João Del Rei, São João Del Rei, Brazil
| | - Gabriel M Yazbeck
- Laboratório de Recursos Genéticos, Programa de Pós Graduação Em Biotecnologia, Universidade Federal de São João Del Rei, São João Del Rei, Brazil.,Laboratório de Recursos Genéticos, Departamento de Zootecnia, Universidade Federal de São João Del Rei, São João Del Rei, Brazil
|
124
|
Chen Y, Zhang T, Xian M, Zhang R, Yang W, Su B, Yang G, Sun L, Xu W, Xu S, Gao H, Xu L, Gao X, Li J. A draft genome of Drung cattle reveals clues to its chromosomal fusion and environmental adaptation. Commun Biol 2022; 5:353. [PMID: 35418663 PMCID: PMC9008013 DOI: 10.1038/s42003-022-03298-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/06/2021] [Accepted: 03/21/2022] [Indexed: 12/02/2022] Open
Abstract
Drung cattle (Bos frontalis) have 58 chromosomes, differing from the Bos taurus 2n = 60 karyotype. To date, its origin and evolution history have not been proven conclusively, and the mechanisms of chromosome fusion and environmental adaptation have not been clearly elucidated. Here, we assembled a high integrity and good contiguity genome of Drung cattle with 13.7-fold contig N50 and 4.1-fold scaffold N50 improvements over the recently published Indian mithun assembly, respectively. Speciation time estimation and phylogenetic analysis showed that Drung cattle diverged from Bos taurus into an independent evolutionary clade. Sequence evidence of centromere regions provides clues to the breakpoints in BTA2 and BTA28 centromere satellites. We furthermore integrated a circulation and contraction-related biological process involving 43 evolutionary genes that participated in pathways associated with the evolution of the cardiovascular system. These findings may have important implications for understanding the molecular mechanisms of chromosome fusion, alpine valleys adaptability and cardiovascular function.
Affiliation(s)
- Yan Chen
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
| | - Tianliu Zhang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
| | - Ming Xian
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
| | - Rui Zhang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
| | - Weifei Yang
- 1 Gene Co., Ltd, 310051, Hangzhou, P.R. China
- Annoroad Gene Technology (Beijing) Co., Ltd, 100176, Beijing, P.R. China
| | - Baqi Su
- Drung Cattle Conservation Farm in Jiudang Wood, Drung and Nu Minority Autonomous County, Gongshan, 673500, Kunming, Yunnan, P.R. China
| | - Guoqiang Yang
- Livestock and Poultry Breed Improvement Center, Nujiang Lisu Minority Autonomous Prefecture, 673199, Kunming, Yunnan, P.R. China
| | - Limin Sun
- Yunnan Animal Husbandry Service, 650224, Kunming, Yunnan, P.R. China
| | - Wenkun Xu
- Yunnan Animal Husbandry Service, 650224, Kunming, Yunnan, P.R. China
| | - Shangzhong Xu
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
| | - Huijiang Gao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
| | - Lingyang Xu
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
| | - Xue Gao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China.
| | - Junya Li
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China.
| |
|
125
|
Huang W, Zhang L, Columbus JT, Hu Y, Zhao Y, Tang L, Guo Z, Chen W, McKain M, Bartlett M, Huang CH, Li DZ, Ge S, Ma H. A well-supported nuclear phylogeny of Poaceae and implications for the evolution of C4 photosynthesis. Molecular Plant 2022; 15:755-777. [PMID: 35093593 DOI: 10.1016/j.molp.2022.01.015] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 12/08/2020] [Revised: 06/09/2021] [Accepted: 01/24/2022] [Indexed: 05/11/2023]
Abstract
Poaceae (the grasses) includes rice, maize, wheat, and other crops, and is the most economically important angiosperm family. Poaceae is also one of the largest plant families, consisting of over 11 000 species with a global distribution that contributes to diverse ecosystems. Poaceae species are classified into 12 subfamilies, with generally strong phylogenetic support for their monophyly. However, many relationships within subfamilies, among tribes and/or subtribes, remain uncertain. To better resolve the Poaceae phylogeny, we generated 342 transcriptomic and seven genomic datasets; these were combined with other genomic and transcriptomic datasets to provide sequences for 357 Poaceae species in 231 genera, representing 45 tribes and all 12 subfamilies. Over 1200 low-copy nuclear genes were retrieved from these datasets, with several subsets obtained using additional criteria, and used for coalescent analyses to reconstruct a Poaceae phylogeny. Our results strongly support the monophyly of 11 subfamilies; however, the subfamily Puelioideae was separated into two non-sister clades, one for each of the two previously defined tribes, supporting a hypothesis that places each tribe in a separate subfamily. Molecular clock analyses estimated the crown age of Poaceae to be ∼101 million years old. Ancestral character reconstruction of C3/C4 photosynthesis supports the hypothesis of multiple independent origins of C4 photosynthesis. These origins are further supported by phylogenetic analysis of the ppc gene family that encodes the phosphoenolpyruvate carboxylase, which suggests that members of three paralogous subclades (ppc-aL1a, ppc-aL1b, and ppc-B2) were recruited as functional C4 ppc genes. This study provides valuable resources and a robust phylogenetic framework for evolutionary analyses of the grass family.
Affiliation(s)
- Weichen Huang
- Department of Biology, 510 Mueller Laboratory, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, State College, PA 16802, USA
| | - Lin Zhang
- Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering and State Key Laboratory of Genetic Engineering, Institute of Biodiversity Sciences and Institute of Plant Biology, School of Life Sciences, Fudan University, 2005 Songhu Road, Shanghai 200438, China
| | - J Travis Columbus
- Rancho Santa Ana Botanic Garden and Claremont Graduate University, 1500 North College Avenue, Claremont, CA 91711, USA
| | - Yi Hu
- Department of Biology, 510 Mueller Laboratory, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, State College, PA 16802, USA
| | - Yiyong Zhao
- Department of Biology, 510 Mueller Laboratory, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, State College, PA 16802, USA; Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering and State Key Laboratory of Genetic Engineering, Institute of Biodiversity Sciences and Institute of Plant Biology, School of Life Sciences, Fudan University, 2005 Songhu Road, Shanghai 200438, China
| | - Lin Tang
- Department of Biology, 510 Mueller Laboratory, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, State College, PA 16802, USA; College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Zhenhua Guo
- Plant Germplasm and Genomics Center, Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan 650201 China
| | - Wenli Chen
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
| | - Michael McKain
- Department of Biological Sciences, University of Alabama, 411 Mary Harmon Bryant Hall, Tuscaloosa, AL 35487, USA
| | - Madelaine Bartlett
- Biology Department, University of Massachusetts Amherst, 611 North Pleasant Street, 221 Morrill 3, Amherst, MA 01003 USA
| | - Chien-Hsun Huang
- Department of Biology, 510 Mueller Laboratory, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, State College, PA 16802, USA; Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering and State Key Laboratory of Genetic Engineering, Institute of Biodiversity Sciences and Institute of Plant Biology, School of Life Sciences, Fudan University, 2005 Songhu Road, Shanghai 200438, China
| | - De-Zhu Li
- Plant Germplasm and Genomics Center, Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan 650201 China
| | - Song Ge
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
| | - Hong Ma
- Department of Biology, 510 Mueller Laboratory, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, State College, PA 16802, USA.
| |
|
126
|
Angelova N, Danis T, Lagnel J, Tsigenopoulos CS, Manousaki T. SnakeCube: containerized and automated pipeline for de novo genome assembly in HPC environments. BMC Res Notes 2022; 15:98. [PMID: 35255960 PMCID: PMC8900408 DOI: 10.1186/s13104-022-05978-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2021] [Accepted: 02/17/2022] [Indexed: 11/10/2022]
Abstract
Objective
The rapid progress in sequencing technology and related bioinformatics tools aims at disentangling diversity and conservation issues through genome analyses. The foremost challenges of the field involve coping with questions emerging from the swift development and application of new algorithms, as well as the establishment of standardized analysis approaches that promote transparency and transferability in research.
Results
Here, we present SnakeCube, an automated and containerized whole de novo genome assembly pipeline that runs within isolated, secured environments and scales for use in High Performance Computing (HPC) domains. SnakeCube was optimized for its performance and tested for its effectiveness with various inputs, highlighting its safe and robust universal use in the field.
|
127
|
Van Dam AR, Covas Orizondo JO, Lam AW, McKenna DD, Van Dam MH. Metagenomic clustering reveals microbial contamination as an essential consideration in ultraconserved element design for phylogenomics with insect museum specimens. Ecol Evol 2022; 12:e8625. [PMID: 35342556 PMCID: PMC8932080 DOI: 10.1002/ece3.8625] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/26/2021] [Revised: 01/03/2022] [Accepted: 01/17/2022] [Indexed: 11/30/2022]
Abstract
Phylogenomics via ultraconserved elements (UCEs) has led to improved phylogenetic reconstructions across the tree of life. However, inadvertently incorporating non-targeted DNA into the UCE marker design will lead to misinformation being incorporated into subsequent analyses. To date, the effectiveness of basic metagenomic filtering strategies has not been assessed in arthropods. Designing markers from museum specimens requires careful consideration of methods due to the high levels of microbial contamination typically found in such specimens. We investigate if contaminant sequences are carried forward into a UCE marker set we developed from insect museum specimens using a standard bioinformatics pipeline. We find that the methods currently employed by most researchers do not exclude contamination from the final set of targets. Lastly, we highlight several paths forward for reducing contamination in UCE marker design.
Affiliation(s)
- Alex R. Van Dam
- Department of Biology, University of Puerto Rico Mayagüez, Mayagüez, Puerto Rico
| | | | - Athena W. Lam
- Department of Entomology, California Academy of Sciences, San Francisco, California, USA
| | - Duane D. McKenna
- Department of Biological Sciences, University of Memphis, Memphis, Tennessee, USA
- Center for Biodiversity Research, University of Memphis, Memphis, Tennessee, USA
| | - Matthew H. Van Dam
- Department of Entomology, California Academy of Sciences, San Francisco, California, USA
| |
|
128
|
Jaron KS, Parker DJ, Anselmetti Y, Tran Van P, Bast J, Dumas Z, Figuet E, François CM, Hayward K, Rossier V, Simion P, Robinson-Rechavi M, Galtier N, Schwander T. Convergent consequences of parthenogenesis on stick insect genomes. Science Advances 2022; 8:eabg3842. [PMID: 35196080 PMCID: PMC8865771 DOI: 10.1126/sciadv.abg3842] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 05/09/2023]
Abstract
The shift from sexual reproduction to parthenogenesis has occurred repeatedly in animals, but how the loss of sex affects genome evolution remains poorly understood. We generated reference genomes for five independently evolved parthenogenetic species in the stick insect genus Timema and their closest sexual relatives. Using these references and population genomic data, we show that parthenogenesis results in an extreme reduction of heterozygosity and often leads to genetically uniform populations. We also find evidence for less effective positive selection in parthenogenetic species, suggesting that sex is ubiquitous in natural populations because it facilitates fast rates of adaptation. Parthenogenetic species did not show increased transposable element (TE) accumulation, likely because there is little TE activity in the genus. By using replicated sexual-parthenogenetic comparisons, our study reveals how the absence of sex affects genome evolution in natural populations, providing empirical support for the negative consequences of parthenogenesis as predicted by theory.
Affiliation(s)
- Kamil S. Jaron
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FL, UK
- Corresponding author. (D.J.P.); (K.S.J.); (T.S.)
| | - Darren J. Parker
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Corresponding author. (D.J.P.); (K.S.J.); (T.S.)
| | | | - Patrick Tran Van
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Jens Bast
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Zoé Dumas
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Emeric Figuet
- ISEM—Institut des Sciences de l’Evolution, Montpellier, France
| | | | - Keith Hayward
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Victor Rossier
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Paul Simion
- ISEM—Institut des Sciences de l’Evolution, Montpellier, France
| | - Marc Robinson-Rechavi
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Nicolas Galtier
- ISEM—Institut des Sciences de l’Evolution, Montpellier, France
| | - Tanja Schwander
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Corresponding author. (D.J.P.); (K.S.J.); (T.S.)
| |
|
129
|
Kitano T, Sato H, Takahashi N, Igarashi S, Hatanaka Y, Igarashi K, Umetsu K. Complete mitochondrial genomes of three fairy shrimps from snowmelt pools in Japan. BMC Zool 2022; 7:11. [PMID: 37170326 PMCID: PMC10127424 DOI: 10.1186/s40850-022-00111-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/21/2021] [Accepted: 01/27/2022] [Indexed: 11/10/2022]
Abstract
Background
Fairy shrimps belong to order Anostraca, class Branchiopoda, subphylum Crustacea, and phylum Arthropoda. Three fairy shrimp species (Eubranchipus uchidai, E. asanumai, and E. hatanakai) that inhabit snowmelt pools are currently known in Japan. Whole mitochondrial genomes are useful genetic information for conducting phylogenetic analyses. Mitochondrial genome sequences for Branchiopoda members are gradually being collated.
Results
Six whole mitochondrial genomes from the three Eubranchipus species are presented here. Eubranchipus species share the anostracan pattern of gene arrangement in their mitochondrial genomes. The mitochondrial genomes of the Eubranchipus species have a higher GC content than those of other anostracans. Accelerated substitution rates in the lineage of Eubranchipus species were observed.
Conclusion
This study is the first to obtain whole mitochondrial genomes for Far Eastern Eubranchipus species. We show that the nucleotide sequences of cytochrome oxidase subunit I and the 16S ribosomal RNA of E. asanumai presented in a previous study were nuclear mitochondrial DNA segments. Higher GC contents and accelerated substitution rates are specific characteristics of the mitochondrial genomes of Far Eastern Eubranchipus. The results will be useful for further investigations of the evolution of Anostraca as well as Branchiopoda.
|
130
|
Widanagama SD, Freeland JR, Xu X, Shafer ABA. Genome assembly, annotation, and comparative analysis of the cattail Typha latifolia. G3 Genes|Genomes|Genetics 2022; 12:6433155. [PMID: 34871392 PMCID: PMC9210280 DOI: 10.1093/g3journal/jkab401] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/23/2021] [Accepted: 11/13/2021] [Indexed: 11/19/2022]
Abstract
Cattails (Typha species) comprise a genus of emergent wetland plants with a global distribution. Typha latifolia and Typha angustifolia are two of the most widespread species, and in areas of sympatry can interbreed to produce the hybrid Typha × glauca. In some regions, the relatively high fitness of Typha × glauca allows it to outcompete and displace both parent species, while simultaneously reducing plant and invertebrate biodiversity, and modifying nutrient and water cycling. We generated a high-quality whole-genome assembly of T. latifolia using PacBio long-read and high-coverage Illumina sequences that will facilitate evolutionary and ecological studies in this hybrid zone. Genome size was 287 Mb and consisted of 1158 scaffolds, with an N50 of 8.71 Mb; 43.84% of the genome was identified as repetitive elements. The assembly has a BUSCO score of 96.03%, and 27,432 genes and 2700 RNA sequences were putatively identified. Comparative analysis detected over 9000 shared orthologs with related taxa, and phylogenomic analysis supported T. latifolia as a divergent lineage within Poales. This high-quality scaffold-level reference genome will provide a useful resource for future population genomic analyses and improve our understanding of Typha hybrid dynamics.
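The scaffold N50 quoted in this abstract is a standard contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The following is a minimal sketch of that calculation, not the authors' code; the function name `n50` and the toy lengths are illustrative assumptions and do not come from the T. latifolia assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2            # half of the assembly size
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:            # half of the assembly is covered
            return length


# Hypothetical toy assembly (lengths in bp), not real data.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # prints 70
```

Because N50 depends only on the length distribution, a high value can coexist with misjoined scaffolds, which is why abstracts such as this one report it alongside a completeness measure like BUSCO.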
Affiliation(s)
- Shane D Widanagama
- Department of Computer Science, Trent University, Peterborough, ON K9L 0G2, Canada
| | - Joanna R Freeland
- Department of Biology, Trent University, Peterborough, ON K9L 0G2, Canada
| | - Xinwei Xu
- Department of Ecology, College of Life Sciences, Wuhan University, Wuhan 430072, China
| | - Aaron B A Shafer
- Department of Forensic Sciences, Trent University, Peterborough, ON K9L 0G2, Canada
- Corresponding author: Department of Forensic Sciences, Trent University, DNA Building, 2140 East Bank Drive, Peterborough, ON, K9L 0G2, Canada.
| |
|
131
|
Iwanicki NS, Botelho ABRZ, Klingen I, Júnior ID, Rossmann S, Lysøe E. Genomic signatures and insights into host niche adaptation of the entomopathogenic fungus Metarhizium humberi. G3 (Bethesda, Md.) 2022; 12:6449448. [PMID: 34865006 PMCID: PMC9210286 DOI: 10.1093/g3journal/jkab416] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/26/2021] [Accepted: 11/22/2021] [Indexed: 12/30/2022]
Abstract
The genus Metarhizium is composed of species used in biological control programs of agricultural pests worldwide. This genus includes common fungal pathogens of many insects and mites, as well as endophytes that can increase plant growth. Metarhizium humberi was recently described as a new species. This species is highly virulent against some insect pests and promotes growth in sugarcane, strawberry, and soybean crops. In this study, we sequenced the genome of M. humberi, isolate ESALQ1638, and performed a functional analysis to determine its genomic signatures and highlight the genes and biological processes associated with its lifestyle. The genome annotation predicted 10633 genes in M. humberi, of which 92.0% are assigned putative functions, and ∼17% of the genome was annotated as repetitive sequences. We found that 18.5% of the M. humberi genome is similar to experimentally validated proteins associated with pathogen-host interaction. Compared to the genomes of eight Metarhizium species, the M. humberi ESALQ1638 genome revealed some unique traits that stood out, e.g., more genes functionally annotated as polyketide synthases (PKSs), overrepresented GO terms associated with the transport of ions, organic acids, and amino acids, a higher percentage of repetitive elements, and higher levels of RIP-induced point mutations. The M. humberi genome will serve as a resource for promoting studies on genome structure and evolution that can contribute to research on biological control and plant biostimulation. Thus, the genomic data supported the broad host range of this species within the generalist PARB clade and suggested that M. humberi ESALQ1638 might be particularly good at producing secondary metabolites and might be more efficient in transporting amino acids and organic compounds.
Affiliation(s)
- Natasha Sant′Anna Iwanicki
- Department of Entomology and Acarology, “Luiz de Queiroz” College of Agriculture (ESALQ/USP), Piracicaba 13418-900, Brazil
- Corresponding author: (N.S.I.); (E.L.)
| | | | - Ingeborg Klingen
- Division of Biotechnology and Plant Health, Norwegian Institute of Bioeconomy Research (NIBIO), Ås 1431, Norway
| | - Italo Delalibera Júnior
- Department of Entomology and Acarology, “Luiz de Queiroz” College of Agriculture (ESALQ/USP), Piracicaba 13418-900, Brazil
| | - Simeon Rossmann
- Division of Biotechnology and Plant Health, Norwegian Institute of Bioeconomy Research (NIBIO), Ås 1431, Norway
| | - Erik Lysøe
- Division of Biotechnology and Plant Health, Norwegian Institute of Bioeconomy Research (NIBIO), Ås 1431, Norway
- Corresponding author: (N.S.I.); (E.L.)
| |
|
132
|
Liu HL, Harris AJ, Wang ZF, Chen HF, Li ZA, Wei X. The genome of the Paleogene relic tree Bretschneidera sinensis: insights into trade-offs in gene family evolution, demographic history, and adaptive SNPs. DNA Res 2022; 29:6523039. [PMID: 35137004 PMCID: PMC8825261 DOI: 10.1093/dnares/dsac003] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/10/2021] [Indexed: 11/13/2022]
Abstract
Among relic species, genomic information may provide the key to inferring their long-term survival. Therefore, in this study, we investigated the genome of the Paleogene relic tree species, Bretschneidera sinensis, which is a rare endemic species within southeastern Asia. Specifically, we assembled a high-quality genome for B. sinensis using PacBio high-fidelity and high-throughput chromosome conformation capture reads and annotated it with long and short RNA sequencing reads. Using the genome, we then detected a trade-off between active and passive disease defences among the gene families. Gene families involved in salicylic acid and MAPK signalling pathways expanded as active defence mechanisms against disease, but families involved in terpene synthase activity as passive defences contracted. When inferring the long evolutionary history of B. sinensis, we detected population declines corresponding to historical climate change around the Eocene–Oligocene transition and to climatic fluctuations in the Quaternary. Additionally, based on this genome, we identified 388 single nucleotide polymorphisms (SNPs) that were likely under selection, and showed diverse functions in growth and stress responses. Among them, we further found 41 climate-associated SNPs. The genome of B. sinensis and the SNP dataset will be important resources for understanding extinction/diversification processes using comparative genomics in different lineages.
Affiliation(s)
- Hai-Lin Liu
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China; University of Chinese Academy of Sciences, Beijing, 100049, China; Environmental Horticulture Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640, China; Key Laboratory of Ornamental Plant Germplasm Innovation and Utilization, Guangzhou, 510640, China
| | - A J Harris
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China; Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
| | - Zheng-Feng Wang
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China; Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, 511458, China; Center of Plant Ecology, Core Botanical Gardens, Chinese Academy of Sciences, Guangzhou, 510650, China; Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
| | - Hong-Feng Chen
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China; Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
| | - Zhi-An Li
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China; Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, 511458, China; Center of Plant Ecology, Core Botanical Gardens, Chinese Academy of Sciences, Guangzhou, 510650, China; Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
| | - Xiao Wei
- Guangxi Institute of Botany, Chinese Academy of Sciences, Guilin, 541006, China
| |
|
133
|
Long R, Zhang F, Zhang Z, Li M, Chen L, Wang X, Liu W, Zhang T, Yu LX, He F, Jiang X, Yang X, Yang C, Wang Z, Kang J, Yang Q. Genome assembly of alfalfa cultivar Zhongmu-4 and identification of SNPs associated with agronomic traits. Genomics, Proteomics & Bioinformatics 2022; 20:14-28. [PMID: 35033678 PMCID: PMC9510860 DOI: 10.1016/j.gpb.2022.01.002] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 02/02/2021] [Revised: 12/23/2021] [Accepted: 01/07/2022] [Indexed: 12/21/2022]
Abstract
Alfalfa (Medicago sativa L.) is the most important legume forage crop worldwide with high nutritional value and yield. For a long time, the breeding of alfalfa was hampered by the lack of reliable information on the autotetraploid genome and of molecular markers linked to important agronomic traits. We herein report the de novo assembly of the allele-aware chromosome-level genome of Zhongmu-4, a cultivar widely cultivated in China, and a comprehensive database of genomic variations based on resequencing of 220 germplasms. Approximately 2.74 Gb of contigs (N50 of 2.06 Mb), accounting for 88.39% of the estimated genome, were assembled, and 2.56 Gb of contigs were anchored to 32 pseudo-chromosomes. A total of 34,922 allelic genes were identified from the allele-aware genome. We observed the expansion of gene families, especially those related to nitrogen metabolism, and an increase in repetitive elements including transposable elements, which probably resulted in the larger size of the Zhongmu-4 genome compared with that of Medicago truncatula. Population structure analysis revealed that the accessions from Asia and South America had relatively lower genetic diversity than those from Europe, suggesting that geography may influence alfalfa genetic divergence during local adaptation. Genome-wide association studies identified 101 single nucleotide polymorphisms (SNPs) associated with 27 agronomic traits. Two candidate genes were predicted to be correlated with fall dormancy and salt response. We believe that the allele-aware chromosome-level genome sequence of Zhongmu-4 combined with the resequencing data of the diverse alfalfa germplasms will facilitate genetic research and genomics-assisted breeding in variety improvement of alfalfa.
Affiliation(s)
- Ruicai Long
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Fan Zhang
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China; Department of Crop and Soil Sciences, Washington State University, Pullman, WA, 99163, United States
| | - Zhiwu Zhang
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, 99163, United States
| | - Mingna Li
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Lin Chen
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Xue Wang
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Wenwen Liu
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Tiejun Zhang
- School of Grassland Science, Beijing Forestry University, Beijing 100083, China
| | - Long-Xi Yu
- United States Department of Agriculture-Agricultural Research Service, Plant and Germplasm Introduction and Testing Research, Prosser, WA, 99350, United States
| | - Fei He
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Xueqian Jiang
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Xijiang Yang
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Changfu Yang
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Zhen Wang
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China.
| | - Junmei Kang
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China.
| | - Qingchuan Yang
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China.
| |
|
134
|
Filée J, Merle M, Bastide H, Mougel F, Bérenger JM, Folly-Ramos E, Almeida CE, Harry M. Phylogenomics for Chagas Disease Vectors of the Rhodnius Genus (Hemiptera, Triatominae): What We Learn From Mito-Nuclear Conflicts and Recommendations. Front Ecol Evol 2022. [DOI: 10.3389/fevo.2021.750317] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/24/2022]
Abstract
We provide in this study a very large DNA dataset on Rhodnius species including 36 samples representing 16 valid species of the three Rhodnius groups, pictipes, prolixus and pallescens. Samples were sequenced at low depth with whole-genome shotgun sequencing (Illumina technology). Using phylogenomics including 15 mitochondrial genes (13.3 kb), partial nuclear rDNA (5.2 kb) and 51 nuclear protein-coding genes (36.3 kb), we resolve sticking points in the Rhodnius phylogeny. At the species level, we confirm the species-specific status of R. montenegrensis and R. marabaensis and we agree with the synonymy of R. taquarussuensis with R. neglectus. We also invite a re-examination of the species-specific status of R. milesi, which is more likely R. nasutus. We propose defining a robustus species complex that comprises four closely related species: R. marabaensis, R. montenegrensis, R. prolixus and R. robustus. As Psammolestes tertius was included in the Rhodnius clade, we strongly recommend reclassifying this species as R. tertius. At the Rhodnius group level, molecular data consistently support the clustering of the pictipes and pallescens groups, which are more closely related to each other than they are to the prolixus group. Moreover, comparing mitochondrial and nuclear tree topologies, our results demonstrate that various introgression events occurred in all three Rhodnius groups, in laboratory strains but also in wild specimens. We demonstrate that introgressions occurred frequently in the prolixus group, involving the related species of the robustus complex but also the pair R. nasutus and R. neglectus. A genome-wide analysis highlighted an introgression event in the pictipes group between R. stali and R. brethesi and suggested a complex gene flow between the three species of the pallescens group, R. colombiensis, R. pallescens and R. ecuadoriensis. The molecular data also support a sylvatic distribution of R. prolixus in Brazil (Pará state) and the monophyly of R. robustus.
As we detected extensive introgression events and selective pressure on mitochondrial genes, we strongly recommend performing separate mitochondrial and nuclear phylogenies and taking advantage of mito-nuclear conflicts in order to obtain a comprehensive evolutionary view of this genus.
|
135
|
Sharma A, Jain P, Mahgoub A, Zhou Z, Mahadik K, Chaterji S. Lerna: transformer architectures for configuring error correction tools for short- and long-read genome sequencing. BMC Bioinformatics 2022; 23:25. [PMID: 34991450 PMCID: PMC8734100 DOI: 10.1186/s12859-021-04547-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2021] [Accepted: 12/20/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Sequencing technologies are prone to errors, making error correction (EC) necessary for downstream applications. EC tools need to be manually configured for optimal performance. We find that the optimal parameters (e.g., k-mer size) are both tool- and dataset-dependent. Moreover, evaluating the performance (i.e., Alignment-rate or Gain) of a given tool usually relies on a reference genome, but quality reference genomes are not always available. We introduce Lerna for the automated configuration of k-mer-based EC tools. Lerna first creates a language model (LM) of the uncorrected genomic reads, and then, based on this LM, calculates a metric called the perplexity metric to evaluate the corrected reads for different parameter choices. Next, it finds the one that produces the highest alignment rate without using a reference genome. The fundamental intuition of our approach is that the perplexity metric is inversely correlated with the quality of the assembly after error correction. Therefore, Lerna leverages the perplexity metric for automated tuning of k-mer sizes without needing a reference genome. RESULTS First, we show that the best k-mer value can vary for different datasets, even for the same EC tool. This motivates our design that automates k-mer size selection without using a reference genome. Second, we show the gains of our LM using its component attention-based transformers. We show the model's estimation of the perplexity metric before and after error correction. The lower the perplexity after correction, the better the k-mer size. We also show that the alignment rate and assembly quality computed for the corrected reads are strongly negatively correlated with the perplexity, enabling the automated selection of k-mer values for better error correction, and hence, improved assembly quality. We validate our approach on both short and long reads. Additionally, we show that our attention-based models have significant runtime improvement for the entire pipeline, 18× faster than previous works, due to parallelizing the attention mechanism and the use of JIT compilation for GPU inferencing. CONCLUSION Lerna improves de novo genome assembly by optimizing EC tools. Our code is made available in a public repository at: https://github.com/icanforce/lerna-genomics.
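The perplexity idea in this abstract can be illustrated with a much simpler model than the one the authors describe. The sketch below is not Lerna's transformer: it swaps in a smoothed unigram model over k-mers, trained on the uncorrected reads, and scores candidate corrected read sets by perplexity (lower is better). The function names, the smoothing constant `alpha`, and the toy reads are all hypothetical.

```python
import math
from collections import Counter


def kmer_tokens(read, k):
    """Split a read into its overlapping k-mers."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def train_unigram(reads, k):
    """Count k-mers in the (uncorrected) reads; this stands in for the language model."""
    return Counter(tok for read in reads for tok in kmer_tokens(read, k))


def perplexity(reads, counts, k, alpha=1.0):
    """Perplexity of `reads` under an add-alpha smoothed unigram k-mer model.

    Assumes an ACGT alphabet, so the vocabulary size is 4**k.
    """
    vocab = 4 ** k
    total = sum(counts.values())
    log_sum, n = 0.0, 0
    for read in reads:
        for tok in kmer_tokens(read, k):
            p = (counts[tok] + alpha) / (total + alpha * vocab)
            log_sum += math.log(p)
            n += 1
    if n == 0:
        raise ValueError("no k-mers to score")
    return math.exp(-log_sum / n)


# Toy data: nine clean reads and one read carrying a single substitution.
uncorrected = ["ACGTACGTACGT"] * 9 + ["ACGTTCGTACGT"]
model = train_unigram(uncorrected, k=3)

well_corrected = ["ACGTACGTACGT"] * 10   # the error was fixed
poorly_corrected = list(uncorrected)     # the error was left in place

ppl_good = perplexity(well_corrected, model, k=3)
ppl_bad = perplexity(poorly_corrected, model, k=3)
```

In this toy setting the read set with the error removed scores a lower perplexity, which mirrors the selection rule stated in the abstract; a real pipeline would compare read sets produced by an EC tool at different k-mer sizes.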
Affiliation(s)
- Pranjal Jain
- Indian Institute of Technology Bombay, Mumbai, India
|
136
|
Tanaka A, Ryder MH, Suzuki T, Uesaka K, Yamaguchi N, Amimoto T, Otani M, Nakayachi O, Arakawa K, Tanaka N, Takemoto D. Production of Agrocinopine A by Ipomoea batatas Agrocinopine Synthase in Transgenic Tobacco and Its Effect on the Rhizosphere Microbial Community. MOLECULAR PLANT-MICROBE INTERACTIONS : MPMI 2022; 35:73-84. [PMID: 34585955 DOI: 10.1094/mpmi-05-21-0114-r] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Agrobacterium tumefaciens is a bacterial pathogen that causes crown gall disease on a wide range of eudicot plants by genetic transformation. Besides T-DNA integrated by natural transformation of plant vegetative tissues by pathogenic Agrobacterium spp., previous reports have indicated that T-DNA sequences originating from an ancestral Agrobacterium sp. are present in the genomes of all cultivated sweet potato (Ipomoea batatas) varieties analyzed. Expression of an Agrobacterium-derived agrocinopine synthase (ACS) gene was detected in leaf and root tissues of sweet potato, suggesting that the plant can produce agrocinopine, a sugar-phosphodiester opine considered to be utilized by some strains of Agrobacterium spp. in crown gall. To validate the product synthesized by Ipomoea batatas ACS (IbACS), we introduced IbACS into tobacco under a constitutive promoter. High-voltage paper electrophoresis followed by alkaline silver nitrate staining detected the production of an agrocinopine-like substance in IbACS1-expressing tobacco, and further mass spectrometry and nuclear magnetic resonance analyses of the product confirmed that IbACS can produce agrocinopine A from natural plant substrates. The partially purified compound was biologically active in an agrocinopine A bioassay. 16S ribosomal RNA amplicon sequencing and meta-transcriptome analysis revealed that the rhizosphere microbial community of tobacco was affected by the expression of IbACS. A new species of Leifsonia (actinobacteria) was isolated as an enriched bacterium in the rhizosphere of IbACS1-expressing tobacco. This Leifsonia sp. can catabolize agrocinopine A produced in tobacco, indicating that the production of agrocinopine A attracts rhizosphere bacteria that can utilize this sugar-phosphodiester. These results suggest a potential role of IbACS conserved among sweet potato cultivars in manipulating their microbial community. Copyright © 2021 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.
Affiliation(s)
- Aiko Tanaka
- Graduate School of Bioagricultural Sciences, Nagoya University, Chikusa, Nagoya, Aichi 464-8601, Japan
- Maarten H Ryder
- School of Agriculture, Food & Wine, The University of Adelaide, Glen Osmond, South Australia 5064, Australia
- Takamasa Suzuki
- College of Bioscience and Biotechnology, Chubu University, Kasugai, Aichi 478-8501, Japan
- Kazuma Uesaka
- Center for Gene Research, Nagoya University, Chikusa, Nagoya, Aichi 464-8602, Japan
- Nobuo Yamaguchi
- Natural Science Center for Basic Research and Development, Hiroshima University, Higashi-Hiroshima, Hiroshima 739-8527, Japan
- Tomoko Amimoto
- Natural Science Center for Basic Research and Development, Hiroshima University, Higashi-Hiroshima, Hiroshima 739-8526, Japan
- Motoyasu Otani
- Research Institute for Bioresources and Biotechnology, Ishikawa Prefectural University, Ishikawa 921-8836, Japan
- Osamu Nakayachi
- Research Institute for Bioresources and Biotechnology, Ishikawa Prefectural University, Ishikawa 921-8836, Japan
- Kenji Arakawa
- Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Hiroshima 739-8530, Japan
- Nobukazu Tanaka
- Natural Science Center for Basic Research and Development, Hiroshima University, Higashi-Hiroshima, Hiroshima 739-8527, Japan
- Daigo Takemoto
- Graduate School of Bioagricultural Sciences, Nagoya University, Chikusa, Nagoya, Aichi 464-8601, Japan
|
137
|
Danis T, Papadogiannis V, Tsakogiannis A, Kristoffersen JB, Golani D, Tsaparis D, Sterioti A, Kasapidis P, Kotoulas G, Magoulas A, Tsigenopoulos CS, Manousaki T. Genome Analysis of Lagocephalus sceleratus: Unraveling the Genomic Landscape of a Successful Invader. Front Genet 2021; 12:790850. [PMID: 34956332 PMCID: PMC8692874 DOI: 10.3389/fgene.2021.790850] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2021] [Accepted: 11/16/2021] [Indexed: 11/13/2022] Open
Abstract
The Tetraodontidae family encompasses several species which attract scientific interest in terms of their ecology and evolution. The silver-cheeked toadfish (Lagocephalus sceleratus) is a well-known “invasive sprinter” that has invaded and spread, in less than a decade, throughout the Eastern and part of the Western Mediterranean Sea from the Red Sea through the Suez Canal. In this study, we built and analysed the first near-chromosome level genome assembly of L. sceleratus and explored its evolutionary landscape. Through a phylogenomic analysis, we positioned L. sceleratus closer to T. nigroviridis, compared to other members of the family, while gene family evolution analysis revealed that genes associated with the immune response have experienced rapid expansion, providing a genetic basis for studying how L. sceleratus is able to achieve highly successful colonisation. Moreover, we found that voltage-gated sodium channel (NaV 1.4) mutations previously connected to tetrodotoxin resistance in other pufferfishes are not found in L. sceleratus, highlighting the complex evolution of this trait. The high-quality genome assembly built here is expected to lay the groundwork for future studies on the species' biology.
Affiliation(s)
- Theodoros Danis
- School of Medicine, University of Crete, Heraklion, Greece
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Greece
- Vasileios Papadogiannis
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Greece
- Alexandros Tsakogiannis
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Greece
- Jon B Kristoffersen
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Greece
- Daniel Golani
- Department of Ecology, Evolution and Behavior and the National Natural History Collections, The Hebrew University, Jerusalem, Israel
- Dimitris Tsaparis
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Greece
- Aspasia Sterioti
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Greece
- Panagiotis Kasapidis
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Greece
- Georgios Kotoulas
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Greece
- Antonios Magoulas
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Greece
- Costas S Tsigenopoulos
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Greece
- Tereza Manousaki
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Greece
|
138
|
Shyamli PS, Pradhan S, Panda M, Parida A. De novo Whole-Genome Assembly of Moringa oleifera Helps Identify Genes Regulating Drought Stress Tolerance. FRONTIERS IN PLANT SCIENCE 2021; 12:766999. [PMID: 34970282 PMCID: PMC8712769 DOI: 10.3389/fpls.2021.766999] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/02/2021] [Accepted: 11/12/2021] [Indexed: 06/14/2023]
Abstract
Abiotic stresses, especially drought stress, are responsible for heavy losses in productivity, which in turn poses an imminent threat for future food security. Understanding plants' response to abiotic stress at the molecular level is crucially important for mitigating the impacts of climate change. Moringa oleifera is an important multipurpose plant with medicinal and nutritional properties and with an ability to grow in low water conditions, which makes the species an ideal candidate to study the regulatory mechanisms that modulate drought tolerance and its possible use in agroforestry systems. In the present communication, we report whole-genome sequencing (WGS) of this species and assemble about 90% of the genome of M. oleifera var. Bhagya into 915 contigs with an N50 value of 4.7 Mb and predicted 32,062 putative protein-coding genes. After annotating the genome, we have chosen to study the heat shock transcription factor (HSF) family of genes to analyze their role in drought tolerance in M. oleifera. We predicted a total of 21 HSFs in the M. oleifera genome and carried out phylogenetic analyses, motif identification, analysis of gene duplication events, and differential expression of the HSF-coding genes in M. oleifera. Our analysis reveals that members of the HSF family have an important role in the plant's response to abiotic stress and are viable candidates for further characterization.
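The N50 value quoted in this abstract is a contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that standard definition follows; the function name and the toy contig lengths are illustrative and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the assembly size.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly of five contigs totalling 20 units.
toy_n50 = n50([2, 3, 4, 5, 6])
```

Because N50 depends only on the length distribution, it says nothing about correctness, which is why assemblies are usually reported with a completeness measure as well.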
Affiliation(s)
- P Sushree Shyamli
- Institute of Life Sciences, An Autonomous Institute Under Department of Biotechnology Government of India, NALCO Square, Bhubaneswar, India
- Regional Centre for Biotechnology, NCR Biotech Science Cluster, Faridabad, India
- Seema Pradhan
- Institute of Life Sciences, An Autonomous Institute Under Department of Biotechnology Government of India, NALCO Square, Bhubaneswar, India
- Mitrabinda Panda
- Institute of Life Sciences, An Autonomous Institute Under Department of Biotechnology Government of India, NALCO Square, Bhubaneswar, India
- Regional Centre for Biotechnology, NCR Biotech Science Cluster, Faridabad, India
- Ajay Parida
- Institute of Life Sciences, An Autonomous Institute Under Department of Biotechnology Government of India, NALCO Square, Bhubaneswar, India
|
139
|
Mafiz AI, He Y, Zhang W, Zhang Y. Soil Bacteria in Urban Community Gardens Have the Potential to Disseminate Antimicrobial Resistance Through Horizontal Gene Transfer. Front Microbiol 2021; 12:771707. [PMID: 34887843 PMCID: PMC8650581 DOI: 10.3389/fmicb.2021.771707] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2021] [Accepted: 10/14/2021] [Indexed: 11/29/2022] Open
Abstract
Fifteen soil and 45 vegetable samples from Detroit community gardens were analyzed for potential antimicrobial resistance contamination. Soil bacteria were isolated and tested by antimicrobial susceptibility profiling, horizontal gene transfer, and whole-genome sequencing. High-throughput 16S rRNA sequencing analysis was conducted on collected soil samples to determine the total bacterial composition. Of 226 bacterial isolates recovered, 54 were from soil and 172 from vegetables. A high minimal inhibitory concentration (MIC) was defined as the MIC greater than or equal to the resistance breakpoint of Escherichia coli for Gram-negative bacteria or Staphylococcus aureus for Gram-positive bacteria. The high MIC was observed in 63.4 and 69.8% of Gram-negative isolates from soil and vegetables, respectively, against amoxicillin/clavulanic acid, as well as 97.5 and 82.7% against ampicillin, 97.6 and 90.7% against ceftriaxone, 85.4 and 81.3% against cefoxitin, 65.8 and 70.5% against chloramphenicol, and 80.5 and 59.7% against ciprofloxacin. All Gram-positive bacteria showed a high MIC to gentamicin, kanamycin, and penicillin. Forty of 57 isolates carrying tetM (70.2%) successfully transferred tetracycline resistance to a susceptible recipient via conjugation. Whole-genome sequencing analysis identified a wide array of antimicrobial resistance genes (ARGs), including those encoding AdeIJK, Mex, and SmeDEF efflux pumps, suggesting a high potential of the isolates to become antimicrobial resistant, despite some inconsistency between the gene profile and the resistance phenotype. In conclusion, soil bacteria in urban community gardens can serve as a reservoir of antimicrobial resistance with the potential to transfer to clinically important pathogens, resulting in food safety and public health concerns.
Affiliation(s)
- Abdullah Ibn Mafiz
- Department of Nutrition and Food Science, Wayne State University, Detroit, MI, United States
- Department of Human Sciences, Tennessee State University, Nashville, TN, United States
- Yingshu He
- Department of Food Science and Nutrition, Illinois Institute of Technology, Chicago, IL, United States
- Center for Food Safety, University of Georgia, Griffin, GA, United States
- Wei Zhang
- Department of Food Science and Nutrition, Illinois Institute of Technology, Chicago, IL, United States
- Yifan Zhang
- Department of Nutrition and Food Science, Wayne State University, Detroit, MI, United States
|
140
|
Comparative Genomics and Physiological Investigation of a New Arthrospira/Limnospira Strain O9.13F Isolated from an Alkaline, Winter Freezing, Siberian Lake. Cells 2021; 10:cells10123411. [PMID: 34943919 PMCID: PMC8700078 DOI: 10.3390/cells10123411] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2021] [Revised: 11/24/2021] [Accepted: 12/01/2021] [Indexed: 11/24/2022] Open
Abstract
Cyanobacteria from the genus Arthrospira/Limnospira are considered haloalkalotolerant organisms with optimal growth temperatures around 35 °C. They are most abundant in soda lakes in tropical and subtropical regions. Here, we report the comprehensive genome-based characterisation and physiological investigation of the new strain O9.13F that was isolated in a temperate climate zone from the winter freezing Solenoye Lake in Western Siberia. Based on genomic analyses, the Siberian strain belongs to the Arthrospira/Limnospira genus. The described strain O9.13F showed the highest relative growth index upon cultivation at 20 °C, lower than the 35 °C reported as optimal for Arthrospira/Limnospira strains. We assessed the composition of fatty acids, proteins and photosynthetic pigments in the biomass of strain O9.13F grown at different temperatures, showing its potential suitability for cultivation in a temperate climate zone. We observed a decrease of gamma-linolenic acid favouring palmitic acid in the case of strain O9.13F compared to tropical strains. Comparative genomics found no genes unique to the Siberian strain that relate to its tolerance to low temperatures. In addition, this strain does not possess a different set of genes associated with the salinity stress response from those typically found in tropical strains. We confirmed the absence of plasmids and functional prophage sequences. The genome is 4.94 Mbp in size, has a GC content of 44.47% and encodes 5355 proteins. The Arthrospira/Limnospira strain O9.13F presented in this work is the first representative of a new clade III based on the 16S rRNA gene, for which a genomic sequence is available in public databases (PKGD00000000).
|
141
|
Draft Whole-Genome Sequence of Sphingobium sp. Strain PNB, a Versatile Polycyclic Aromatic Hydrocarbon-Degrading Bacterium. Microbiol Resour Announc 2021; 10:e0092021. [PMID: 34854712 PMCID: PMC8638569 DOI: 10.1128/mra.00920-21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open
Abstract
Sphingobium sp. strain PNB can completely degrade phenanthrene, naphthalene, and biphenyl as the sole carbon and energy source. The strain is also capable of cometabolizing benzo[a]pyrene, pyrene, acenaphthene, fluoranthene, etc. Here, we report the 5.69-Mb assembly and annotation of the genome sequence of strain PNB, obtained using Illumina sequencing.
|
142
|
Identification of an Exopolysaccharide Biosynthesis Gene in Bradyrhizobium diazoefficiens USDA110. Microorganisms 2021; 9:microorganisms9122490. [PMID: 34946092 PMCID: PMC8707904 DOI: 10.3390/microorganisms9122490] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2021] [Revised: 11/24/2021] [Accepted: 11/25/2021] [Indexed: 01/27/2023] Open
Abstract
Exopolysaccharides (EPS) play critical roles in rhizobium-plant interactions. However, the EPS biosynthesis pathway in Bradyrhizobium diazoefficiens USDA110 remains elusive. Here we used transposon (Tn) mutagenesis to identify genetic elements required for EPS biosynthesis in B. diazoefficiens USDA110. Phenotypic screening of Tn5 insertion mutants grown on agar plates led to the identification of a mutant with a transposon insertion site in the blr2358 gene. This gene is predicted to encode a phospho-glycosyltransferase that transfers a phosphosugar onto a polyprenol phosphate substrate. The disruption of the blr2358 gene resulted in defective EPS synthesis. Accordingly, the blr2358 mutant showed a reduced capacity to induce nodules and stimulate the growth of soybean plants. Glycosyltransferase genes related to blr2358 were found to be well conserved and widely distributed among strains of the Bradyrhizobium genus. In conclusion, our study identified a gene involved in EPS biosynthesis and highlights the importance of EPS in the symbiotic interaction between USDA110 and soybeans.
|
143
|
Population genetic and genomic analyses of Western Massasauga (Sistrurus tergeminus ssp.): implications for subspecies delimitation and conservation. CONSERV GENET 2021. [DOI: 10.1007/s10592-021-01420-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
|
144
|
Azmal Ali S, Singh AK, Tomar SK, Behare P. Genome Sequence of Lacticaseibacillus rhamnosus Strain NCDC610, Isolated from a Traditional Cereal-Based Fermented Milk Product (Raabadi). Microbiol Resour Announc 2021; 10:e0067221. [PMID: 34761961 PMCID: PMC8582304 DOI: 10.1128/mra.00672-21] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2021] [Accepted: 10/20/2021] [Indexed: 12/13/2022] Open
Abstract
We announce the draft genome sequence of Lacticaseibacillus rhamnosus NCDC610, an isolate from an Indian traditional cereal-based fermented milk product (Raabadi). The assembled genome of Lacticaseibacillus rhamnosus NCDC610 is 2.91 Mb in size and consists of 67 contigs.
Affiliation(s)
- Syed Azmal Ali
- Proteomics and Cell Biology Laboratory, Animal Biotechnology Center, ICAR, National Dairy Research Institute, Karnal, Haryana, India
- Ashish Kumar Singh
- Dairy Technology Division, ICAR, National Dairy Research Institute, Karnal, Haryana, India
- Sudhir K. Tomar
- National Collection of Dairy Cultures Laboratory, Dairy Microbiology Division, ICAR, National Dairy Research Institute, Karnal, Haryana, India
- Pradip Behare
- National Collection of Dairy Cultures Laboratory, Dairy Microbiology Division, ICAR, National Dairy Research Institute, Karnal, Haryana, India
|
145
|
Draft Genome Sequence of Vibrio jasicida 20LP, an Opportunistic Bacterium Isolated from Fish Larvae. Microbiol Resour Announc 2021; 10:e0081321. [PMID: 34734757 PMCID: PMC8567781 DOI: 10.1128/mra.00813-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
We present the genome sequence of Vibrio jasicida 20LP, a bacterial strain retrieved from larvae of gilthead seabream (Sparus aurata), a highly valuable, model fish species in land-based aquaculture. Annotation of the V. jasicida 20LP genome reveals multiple genomic features potentially underpinning opportunistic associations with diverse marine animals.
|
146
|
Sahlin K. Effective sequence similarity detection with strobemers. Genome Res 2021; 31:2080-2094. [PMID: 34667119 PMCID: PMC8559714 DOI: 10.1101/gr.275648.121] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2021] [Accepted: 08/20/2021] [Indexed: 01/08/2023]
Abstract
k-mer-based methods are widely used in bioinformatics for various types of sequence comparisons. However, a single mutation will mutate k consecutive k-mers and make most k-mer-based applications for sequence comparison sensitive to variable mutation rates. Many techniques have been studied to overcome this sensitivity, for example, spaced k-mers and k-mer permutation techniques, but these techniques do not handle indels well. For indels, pairs or groups of small k-mers are commonly used, but these methods first produce k-mer matches, and only in a second step, a pairing or grouping of k-mers is performed. Such techniques produce many redundant k-mer matches owing to the size of k. Here, we propose strobemers as an alternative to k-mers for sequence comparison. Intuitively, strobemers consist of two or more linked shorter k-mers, where the combination of linked k-mers is decided by a hash function. We use simulated data to show that strobemers provide more evenly distributed sequence matches and are less sensitive to different mutation rates than k-mers and spaced k-mers. Strobemers also produce higher match coverage across sequences. We further implement a proof-of-concept sequence-matching tool StrobeMap and use synthetic and biological Oxford Nanopore sequencing data to show the utility of using strobemers for sequence comparison in different contexts such as sequence clustering and alignment scenarios.
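The linking step described in this abstract, where a hash function decides which shorter k-mers are combined, can be sketched for the two-strobe case. This is an illustrative reading rather than the StrobeMap implementation: the window parameters `w_min` and `w_max`, the choice of BLAKE2 as the hash, the rule of taking the window position with the smallest hash, and the requirement `w_min >= k` are all assumptions made for the example.

```python
import hashlib


def _hash64(text):
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.blake2b(text.encode(), digest_size=8).digest(), "big")


def two_strobe_seeds(seq, k, w_min, w_max):
    """Build seeds made of two linked k-mers.

    For each start position i, the first strobe is seq[i:i+k]. The second
    strobe is the k-mer starting in [i + w_min, i + w_max] whose hash,
    combined with the first strobe, is smallest. Near the end of the
    sequence the window is truncated; positions with no window left are skipped.
    Returns a list of (first_start, second_start, seed_string).
    """
    if w_min < k or w_max < w_min:
        raise ValueError("expected k <= w_min <= w_max (assumption: strobes do not overlap)")
    seeds = []
    last_start = len(seq) - k
    for i in range(last_start + 1):
        lo, hi = i + w_min, min(i + w_max, last_start)
        if lo > hi:
            break
        first = seq[i:i + k]
        j = min(range(lo, hi + 1), key=lambda p: _hash64(first + seq[p:p + k]))
        seeds.append((i, j, first + seq[j:j + k]))
    return seeds


seeds = two_strobe_seeds("ACGTTGCATGCCGATAGCTA", k=3, w_min=3, w_max=6)
```

Because the gap between the two strobes varies from seed to seed, an insertion or deletion between them does not necessarily destroy the seed, which is the property the abstract contrasts with fixed-spacing k-mer variants.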
Affiliation(s)
- Kristoffer Sahlin
- Department of Mathematics, Science for Life Laboratory, Stockholm University, 10691 Stockholm, Sweden
|
147
|
Ekim B, Berger B, Chikhi R. Minimizer-space de Bruijn graphs: Whole-genome assembly of long reads in minutes on a personal computer. Cell Syst 2021; 12:958-968.e6. [PMID: 34525345 PMCID: PMC8562525 DOI: 10.1016/j.cels.2021.08.009] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2021] [Revised: 08/01/2021] [Accepted: 08/19/2021] [Indexed: 10/20/2022]
Abstract
DNA sequencing data continue to progress toward longer reads with increasingly lower sequencing error rates. Here, we define an algorithmic approach, mdBG, that makes use of minimizer-space de Bruijn graphs to enable long-read genome assembly. mdBG achieves orders-of-magnitude improvement in both speed and memory usage over existing methods without compromising accuracy. A human genome is assembled in under 10 min using 8 cores and 10 GB RAM, and 60 Gbp of metagenome reads are assembled in 4 min using 1 GB RAM. In addition, we constructed a minimizer-space de Bruijn graph-based representation of 661,405 bacterial genomes, comprising 16 million nodes and 45 million edges, and successfully searched it for anti-microbial resistance (AMR) genes in 12 min. We expect our advances to be essential to sequence analysis, given the rise of long-read sequencing in genomics, metagenomics, and pangenomics. Code for constructing mdBGs is freely available for download at https://github.com/ekimb/rust-mdbg/.
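One way to picture a de Bruijn graph in minimizer space is to replace each read by the ordered list of its selected short substrings and then build the graph over tuples of those substrings instead of over nucleotides. The sketch below assumes a density-based selection rule (keep an l-mer when its hash falls below a fraction `density` of the hash range); that rule, the parameter names, and the hash are assumptions for illustration and are not taken from the rust-mdbg code.

```python
import hashlib

HASH_RANGE = 1 << 64


def _hash64(text):
    """Deterministic 64-bit hash of a string."""
    return int.from_bytes(hashlib.blake2b(text.encode(), digest_size=8).digest(), "big")


def to_minimizer_space(seq, l, density):
    """Return the ordered list of l-mers whose hash is below density * 2**64."""
    threshold = int(density * HASH_RANGE)
    return [seq[i:i + l] for i in range(len(seq) - l + 1)
            if _hash64(seq[i:i + l]) < threshold]


def minimizer_space_edges(reads, l, density, k):
    """Edges of a de Bruijn graph whose nodes are tuples of k consecutive minimizers."""
    edges = set()
    for read in reads:
        mins = to_minimizer_space(read, l, density)
        nodes = [tuple(mins[i:i + k]) for i in range(len(mins) - k + 1)]
        edges.update(zip(nodes, nodes[1:]))  # consecutive nodes overlap by k-1 minimizers
    return edges


# With density=1.0 every l-mer is kept, which makes the toy example easy to follow.
toy_edges = minimizer_space_edges(["ACGTAC"], l=2, density=1.0, k=2)
```

With a small density only a sparse subset of l-mers survives, so each read shrinks to a short list of tokens; that reduction in input size is the intuition behind the speed and memory figures reported in the abstract.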
Affiliation(s)
- Barış Ekim
- Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA; Department of Mathematics, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA; Department of Mathematics, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Rayan Chikhi
- Department of Computational Biology, Institut Pasteur, Paris 75015, France
|
148
|
K RM, Antony G, Arvind K, Godwin J, P GK, M S, A J, Grace T. Draft genome sequence, annotation and SSR mining data of Oryctes rhinoceros Linn. (Coleoptera: Scarabaeidae), the coconut rhinoceros beetle. Data Brief 2021; 38:107424. [PMID: 34660857 PMCID: PMC8503585 DOI: 10.1016/j.dib.2021.107424] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2021] [Revised: 07/01/2021] [Accepted: 08/11/2021] [Indexed: 11/24/2022] Open
Abstract
The coconut rhinoceros beetle (CRB), Oryctes rhinoceros Linn. (Coleoptera: Scarabaeidae), is one of the major pests of coconut, causing severe yield losses. The adult beetles feed on unopened spear leaf (resulting in the typical ‘V’-shaped cuts), spathes, inflorescence, and tender nut, leading to stunted palm growth and yield reduction. Moreover, this damage predisposes palms to other fatal enemies, viz., red palm weevil and bud rot disease, causing yield loss as high as 10%. CRB attacks juvenile palms through the collar region, affecting their growth and initial establishment. While the immature stages of CRB subsist on organic debris, the adult beetles are ubiquitous pests on coconut and other palms. The discovery of a new invasive haplotype of CRB from Guam and other Pacific Islands, insensitive to Oryctes rhinoceros nudivirus (OrNV), a potent biocontrol agent, has raised serious concerns. The draft genome sequence and simple sequence repeat (SSR) marker data for this important pest of coconut are presented here. A total of 30 Gb of sequence data from an individual third instar larva was obtained on an Illumina HiSeq X Five platform. The draft genome assembly was found to be 372 Mb, with 97.6% completeness based on Benchmarking Universal Single-Copy Orthologs (BUSCO) assessment. Functional gene annotation predicted 16,241 genes. In addition, a total of 21,999 putative SSR markers were identified. The obtained draft genome is a valuable resource for comprehending population genetics, dispersal patterns, phylogenetics, and species behavior.
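SSR mining of the kind mentioned in this abstract amounts to scanning a sequence for short motifs repeated in tandem. The abstract does not say which tool or thresholds were used, so the sketch below is only a generic regular-expression approach; the minimum repeat counts in `MIN_REPEATS` and the function names are illustrative assumptions.

```python
import re

# Illustrative thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA' is not)."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    Assumes an uppercase ACGT sequence; ambiguous bases simply never match.
    """
    hits = []
    for size, reps in sorted(min_repeats.items()):
        # A motif of `size` bases followed by at least reps-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, reps - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


toy_seq = "TTG" + "AC" * 7 + "GGT" + "A" * 12 + "C"
toy_ssrs = find_ssrs(toy_seq)
```

The primitive-motif check keeps a run of twelve A bases from being reported twice, once as a mononucleotide repeat and again as an "AA" dinucleotide repeat.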
Affiliation(s)
- Rajesh M. K
- ICAR-Central Plantation Crops Research Institute, Kasaragod, Kerala 671124, India
- Ginny Antony
- Central University of Kerala, Kasaragod, Kerala 671320, India
- Kumar Arvind
- Central University of Kerala, Kasaragod, Kerala 671320, India
- Jeffrey Godwin
- Bionivid Technology Private Limited, Bengaluru, Karnataka 560043, India
- Gangaraj K. P
- ICAR-Central Plantation Crops Research Institute, Kasaragod, Kerala 671124, India
- Sujithra M
- ICAR-Central Plantation Crops Research Institute, Kasaragod, Kerala 671124, India
- Josephrajkumar A
- Regional Station, ICAR-Central Plantation Crops Research Institute, Kayamkulam 690533, India
- Tony Grace
- Central University of Kerala, Kasaragod, Kerala 671320, India
- Corresponding author.
|
149
|
Wang D, Yang L, Ning C, Liu JF, Zhao X. Breed-specific reference sequence optimized mapping accuracy of NGS analyses for pigs. BMC Genomics 2021; 22:736. [PMID: 34641784 PMCID: PMC8507312 DOI: 10.1186/s12864-021-08030-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2020] [Accepted: 09/22/2021] [Indexed: 11/17/2022] Open
Abstract
Background: Reference sequences play a vital role in next-generation sequencing (NGS), impacting mapping quality during genome analyses. However, reference genomes usually do not represent the full range of genetic diversity of a species as a result of geographical divergence and independent demographic events of different populations. For the mitochondrial genome (mitogenome), which occurs in high copy numbers in cells and is strictly maternally inherited, an optimal reference sequence has the potential to make mitogenome alignment both more accurate and more efficient. In this study, we used three types of reference sequences for mitogenome mapping, i.e., the commonly used reference sequence (CU-ref), the breed-specific reference sequence (BS-ref) and the sample-specific reference sequence (SS-ref), and compared the accuracy of mitogenome alignment and SNP calling among them, in order to propose the optimal reference sequence for mitochondrial DNA (mtDNA) analyses of specific populations.
Results: Four pigs, representing three different breeds, were sequenced by high-throughput sequencing, and the reads were mapped to the reference sequences mentioned above. Aligning reads to a BS-ref gave the largest mapping ratio and the deepest coverage without increased running time. Next, single nucleotide polymorphism (SNP) calling was carried out with 18 detection strategies using three tools (SAMtools, VarScan and GATK) with different parameters, on the BAM files from mapping to the BS-ref. All 18 strategies achieved the same high specificity and sensitivity, which suggests that the BS-ref yields highly accurate mitogenome alignment, since the results depended little on the choice of SNP calling tool or parameters.
Conclusions: This study showed that reference sequences with different genetic relationships to the sample reads influenced mitogenome alignment, with breed-specific reference sequences being optimal for mitogenome analyses, which provides a refined processing perspective for NGS data.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-08030-1.
Affiliation(s)
- Dan Wang
  - National Engineering Laboratory for Animal Breeding, Ministry of Agricultural Key Laboratory of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, China Agricultural University, Beijing, China; College of Animal Science and Technology, Shandong Agricultural University, Tai'an, China
- Liu Yang
  - National Engineering Laboratory for Animal Breeding, Ministry of Agricultural Key Laboratory of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Chao Ning
  - National Engineering Laboratory for Animal Breeding, Ministry of Agricultural Key Laboratory of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, China Agricultural University, Beijing, China; College of Animal Science and Technology, Shandong Agricultural University, Tai'an, China
- Jian-Feng Liu
  - National Engineering Laboratory for Animal Breeding, Ministry of Agricultural Key Laboratory of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Xingbo Zhao
  - National Engineering Laboratory for Animal Breeding, Ministry of Agricultural Key Laboratory of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, China Agricultural University, Beijing, China
|
150
|
Tedim AP, Lanza VF, Rodríguez CM, Freitas AR, Novais C, Peixe L, Baquero F, Coque TM. Fitness cost of vancomycin-resistant Enterococcus faecium plasmids associated with hospital infection outbreaks. J Antimicrob Chemother 2021; 76:2757-2764. [PMID: 34450635 DOI: 10.1093/jac/dkab249] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/27/2021] [Accepted: 06/14/2021] [Indexed: 11/12/2022]
Abstract
BACKGROUND: Vancomycin resistance is mostly associated with Enterococcus faecium due to Tn1546-vanA located on narrow- and broad-host plasmids of various families. This study's aim was to analyse the effects of acquiring Tn1546-carrying plasmids with proven epidemicity in different bacterial host backgrounds.
METHODS: Widespread Tn1546-carrying plasmids of different families RepA_N (n = 5), Inc18 (n = 4) and/or pHTβ (n = 1), and prototype plasmids RepA_N (pRUM) and Inc18 (pRE25, pIP501) were analysed. Plasmid transferability and fitness cost were assessed using E. faecium (GE1, 64/3) and Enterococcus faecalis (JH2-2/FA202/UV202) recipient strains. Growth curves (Bioscreen C) and Relative Growth Rates were obtained in the presence/absence of vancomycin. Plasmid stability was analysed (300 generations). WGS (Illumina-MiSeq) of non-evolved and evolved strains (GE1/64/3 transconjugants, n = 49) was performed. SNP calling (Breseq software) of non-evolved strains was used for comparison.
RESULTS: All plasmids were successfully transferred to different E. faecium clonal backgrounds. Most Tn1546-carrying plasmids and Inc18 and RepA_N prototypes reduced host fitness (-2% to 18%) while the cost of Tn1546 expression varied according to the Tn1546-variant and the recipient strain (9%-49%). Stability of Tn1546-carrying plasmids was documented in all cases, often with loss of phenotypic resistance and/or partial plasmid deletions. SNPs and/or indels associated with essential bacterial functions were observed on the chromosome of evolved strains, some of them linked to increased fitness.
CONCLUSIONS: The stability of E. faecium Tn1546-carrying plasmids in the absence of selective pressure and the high intra-species conjugation rates might explain the persistence of vancomycin resistance in E. faecium populations despite the significant burden they might impose on bacterial host strains.
Affiliation(s)
- Ana P Tedim
  - Department of Microbiology, University Hospital Ramón y Cajal-IRYCIS, Madrid, Spain
- Val F Lanza
  - Unit of Bioinformatics, University Hospital Ramón y Cajal-IRYCIS, Madrid, Spain
- Ana R Freitas
  - UCIBIO/REQUIMTE, Department of Biological Sciences, Microbiology Laboratory, Faculty of Pharmacy, University of Porto, Porto, Portugal
- Carla Novais
  - UCIBIO/REQUIMTE, Department of Biological Sciences, Microbiology Laboratory, Faculty of Pharmacy, University of Porto, Porto, Portugal
- Luísa Peixe
  - UCIBIO/REQUIMTE, Department of Biological Sciences, Microbiology Laboratory, Faculty of Pharmacy, University of Porto, Porto, Portugal
- Fernando Baquero
  - Department of Microbiology, University Hospital Ramón y Cajal-IRYCIS, Madrid, Spain; Centres for Biomedical Research in the Epidemiology and Public Health Network (CIBER-ESP), Madrid, Spain
- Teresa M Coque
  - Department of Microbiology, University Hospital Ramón y Cajal-IRYCIS, Madrid, Spain
|