1. Schmitz MA, Dimonaco NJ, Clavel T, Hitch TCA. Lineage-specific microbial protein prediction enables large-scale exploration of protein ecology within the human gut. Nat Commun 2025; 16:3204. [PMID: 40180917 PMCID: PMC11968815 DOI: 10.1038/s41467-025-58442-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Accepted: 03/20/2025] [Indexed: 04/05/2025] Open
Abstract
Microbes use a range of genetic codes and gene structures, yet these are often ignored during metagenomic analysis. This causes spurious protein predictions, preventing functional assignment which limits our understanding of ecosystems. To resolve this, we developed a lineage-specific gene prediction approach that uses the correct genetic code based on the taxonomic assignment of genetic fragments, removes incomplete protein predictions, and optimises prediction of small proteins. Applied to 9634 metagenomes and 3594 genomes from the human gut, this approach increased the landscape of captured expressed microbial proteins by 78.9%, including previously hidden functional groups. Optimised small protein prediction captured 3,772,658 small protein clusters, which form an improved microbial protein catalogue of the human gut (MiProGut). To enable the ecological study of a protein's prevalence and association with host parameters, we developed InvestiGUT, a tool which integrates both the protein sequences and sample metadata. Accurate prediction of proteins is critical to providing a functional understanding of microbiomes, enhancing our ability to study interactions between microbes and hosts.
Affiliation(s)
- Matthias A Schmitz
  - Functional Microbiome Research Group, RWTH University Hospital, Aachen, Germany
- Nicholas J Dimonaco
  - Institute for Global Food Security, School of Biological Sciences, Queen's University Belfast, Belfast, UK
  - Department of Computer Science, Aberystwyth University, Aberystwyth, UK
- Thomas Clavel
  - Functional Microbiome Research Group, RWTH University Hospital, Aachen, Germany
- Thomas C A Hitch
  - Functional Microbiome Research Group, RWTH University Hospital, Aachen, Germany

2. Kim IE, Fola AA, Puig E, Maina TK, Hui ST, Ma H, Zuckerman K, Agwati EO, Leonetti A, Crudale R, Luftig MA, Moormann AM, Oduor C, Bailey JA. Comparison of nanopore with illumina whole genome assemblies of the Epstein-Barr virus in Burkitt lymphoma. Sci Rep 2025; 15:10970. [PMID: 40164811 PMCID: PMC11958722 DOI: 10.1038/s41598-025-94737-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2024] [Accepted: 03/17/2025] [Indexed: 04/02/2025] Open
Abstract
Endemic Burkitt lymphoma (eBL) is one of the most prevalent cancers in children in sub-Saharan Africa. While prior studies have found that Epstein-Barr virus (EBV) type and variation may alter the tumor driver genes necessary for tumor survival, the precise relationship between EBV variation and EBV-associated tumorigenesis remains unclear due to a lack of scalable, cost-effective viral whole-genome sequencing from tumor samples. This study introduces a rapid and cost-effective method of enriching, sequencing, and assembling accurate EBV genomes in BL tumor cell lines through a combination of selective whole genome amplification (sWGA) and subsequent 2-tube multiplex polymerase chain reaction, along with long-read sequencing on a portable sequencer. The method was optimized across a range of parameters to yield a high percentage of EBV reads and sufficient coverage across the EBV genome except for large repeat regions. After optimization, we applied our method to sequence 18 cell lines and 3 patient tumors from fine needle biopsies and assembled them with median coverages of 99.62% and 99.68%, respectively. The assemblies showed high concordance (99.61% similarity) to available Illumina-based assemblies. The improved method and assembly pipeline will allow for better understanding of EBV variation in relation to BL and are applicable more broadly to translational research studies; they are especially useful for laboratories in Africa, where eBL is most widespread.
Affiliation(s)
- Isaac E Kim
  - Center for Computational Molecular Biology, Brown University, Box G-E5, Providence, 02912, RI, USA
  - Warren Alpert Medical School, Brown University, Providence, RI, USA
- Abebe A Fola
  - Center for Computational Molecular Biology, Brown University, Box G-E5, Providence, 02912, RI, USA
- Enrique Puig
  - Center for Computational Molecular Biology, Brown University, Box G-E5, Providence, 02912, RI, USA
- Titus Kipkemboi Maina
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI, USA
- Sin Ting Hui
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI, USA
- Hongyu Ma
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI, USA
- Kaleb Zuckerman
  - Center for Computational Molecular Biology, Brown University, Box G-E5, Providence, 02912, RI, USA
- Eddy O Agwati
  - Department of Zoology, Maseno University, Maseno, Kenya
  - Center for Global Health Research (CGHR), Kenya Medical Research Institute, Kisumu, Kenya
- Alec Leonetti
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI, USA
- Rebecca Crudale
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI, USA
- Micah A Luftig
  - Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC, USA
  - Center for Virology, Duke University School of Medicine, Durham, NC, USA
- Ann M Moormann
  - Division of Infectious Diseases and Immunology, Department of Medicine, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Cliff Oduor
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI, USA
- Jeffrey A Bailey
  - Center for Computational Molecular Biology, Brown University, Box G-E5, Providence, 02912, RI, USA
  - Warren Alpert Medical School, Brown University, Providence, RI, USA
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI, USA

3. Kim IE, Fola AA, Puig E, Maina TK, Hui ST, Ma H, Zuckerman K, Agwati E, Leonetti A, Crudale R, Luftig MA, Moormann AM, Oduor C, Bailey JA. Comparison of Nanopore with Illumina Whole Genome Assemblies of the Epstein-Barr Virus in Burkitt Lymphoma. medRxiv 2025:2025.02.21.25322471. [PMID: 40061313 PMCID: PMC11888525 DOI: 10.1101/2025.02.21.25322471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2025]
Abstract
Endemic Burkitt lymphoma (eBL) is one of the most prevalent cancers in children in sub-Saharan Africa. While prior studies have found that Epstein-Barr virus (EBV) type and variation may alter the tumor driver genes necessary for tumor survival, the precise relationship between EBV variation and EBV-associated tumorigenesis remains unclear due to a lack of scalable, cost-effective viral whole-genome sequencing from tumor samples. This study introduces a rapid and cost-effective method of enriching, sequencing, and assembling accurate EBV genomes in BL tumor cell lines through a combination of selective whole genome amplification (sWGA) and subsequent 2-tube multiplex polymerase chain reaction, along with long-read sequencing on a portable sequencer. The method was optimized across a range of parameters to yield a high percentage of EBV reads and sufficient coverage across the EBV genome except for large repeat regions. After optimization, we applied our method to sequence 18 cell lines and 3 patient tumors from fine needle biopsies and assembled them with median coverages of 99.62% and 99.68%, respectively. The assemblies showed high concordance (99.61% similarity) to available Illumina-based assemblies. The improved method and assembly pipeline will allow for better understanding of EBV variation in relation to BL and are applicable more broadly to translational research studies; they are especially useful for laboratories in Africa, where eBL is most widespread.
Affiliation(s)
- Isaac E. Kim
  - Center for Computational Molecular Biology, Brown University, Providence, RI, USA
  - Warren Alpert Medical School, Brown University, Providence, RI, USA
- Abebe A. Fola
  - Center for Computational Molecular Biology, Brown University, Providence, RI, USA
- Enrique Puig
  - Center for Computational Molecular Biology, Brown University, Providence, RI, USA
- Titus K. Maina
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI
- Sin Ting Hui
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI
- Hongyu Ma
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI
- Kaleb Zuckerman
  - Center for Computational Molecular Biology, Brown University, Providence, RI, USA
- Eddy Agwati
  - Department of Zoology, Maseno University, Maseno, Kenya
  - Center for Global Health Research (CGHR), Kenya Medical Research Institute, Kisumu, Kenya
- Alec Leonetti
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI
- Rebecca Crudale
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI
- Micah A. Luftig
  - Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, North Carolina, USA
  - Center for Virology, Duke University School of Medicine, Durham, North Carolina, USA
- Ann M. Moormann
  - Division of Infectious Diseases and Immunology, Department of Medicine, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Cliff Oduor
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI
- Jeffrey A. Bailey
  - Center for Computational Molecular Biology, Brown University, Providence, RI, USA
  - Warren Alpert Medical School, Brown University, Providence, RI, USA
  - Department of Pathology and Laboratory Medicine, Brown University, Providence, RI

4. Oberstaller J, Xu S, Naskar D, Zhang M, Wang C, Gibbons J, Pires CV, Mayho M, Otto TD, Rayner JC, Adams JH. Supersaturation mutagenesis reveals adaptive rewiring of essential genes among malaria parasites. Science 2025; 387:eadq7347. [PMID: 39913589 DOI: 10.1126/science.adq7347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Accepted: 12/05/2024] [Indexed: 03/27/2025]
Abstract
Malaria parasites are highly divergent from model eukaryotes. Large-scale genome engineering methods effective in model organisms are frequently inapplicable, and systematic studies of gene function are few. We generated more than 175,000 transposon insertions in the Plasmodium knowlesi genome, averaging an insertion every 138 base pairs, and used this "supersaturation" mutagenesis to score essentiality for 98% of genes. The density of mutations allowed mapping of putative essential domains within genes, providing a completely new level of genome annotation for any Plasmodium species. Although gene essentiality was largely conserved across P. knowlesi, Plasmodium falciparum, and rodent malaria model Plasmodium berghei, a large number of shared genes are differentially essential, revealing species-specific adaptations. Our results indicated that Plasmodium essential gene evolution was conditionally linked to adaptive rewiring of metabolic networks for different hosts.
Affiliation(s)
- Jenna Oberstaller
  - Center for Global Health and Interdisciplinary Research and USF Genomics Program, College of Public Health, University of South Florida, Tampa, FL, USA
- Shulin Xu
  - Center for Global Health and Interdisciplinary Research and USF Genomics Program, College of Public Health, University of South Florida, Tampa, FL, USA
- Deboki Naskar
  - Cambridge Institute for Medical Research, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- Min Zhang
  - Center for Global Health and Interdisciplinary Research and USF Genomics Program, College of Public Health, University of South Florida, Tampa, FL, USA
- Chengqi Wang
  - Center for Global Health and Interdisciplinary Research and USF Genomics Program, College of Public Health, University of South Florida, Tampa, FL, USA
- Justin Gibbons
  - Center for Global Health and Interdisciplinary Research and USF Genomics Program, College of Public Health, University of South Florida, Tampa, FL, USA
- Camilla Valente Pires
  - Center for Global Health and Interdisciplinary Research and USF Genomics Program, College of Public Health, University of South Florida, Tampa, FL, USA
- Matthew Mayho
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Thomas D Otto
  - School of Infection and Immunity, University of Glasgow, Glasgow, UK
  - Laboratory of Pathogens and Host Immunity, Centre National de la Recherche Scientifique, and Institut National de la Santé et de la Recherche Médicale, Université de Montpellier, Montpellier, France
- Julian C Rayner
  - Cambridge Institute for Medical Research, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- John H Adams
  - Center for Global Health and Interdisciplinary Research and USF Genomics Program, College of Public Health, University of South Florida, Tampa, FL, USA
  - Department of Clinical Tropical Medicine, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand

5. Ruiz JL, Marin A, Carreira de Paula J, Gómez-Moracho T, García Olmedo P, Andrés-León E, Solano Parada J, Orantes Bermejo F, Osuna A, de Pablos LM. Genome drafts of Lotmaria passim strains C2 and C3 isolated from honey bees in Spain. Microbiol Resour Announc 2025; 14:e0064224. [PMID: 39601516 PMCID: PMC11737077 DOI: 10.1128/mra.00642-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Accepted: 11/06/2024] [Indexed: 11/29/2024] Open
Abstract
Lotmaria passim is a highly prevalent parasite of honey bees. Here we report the draft genome sequences of L. passim strains C2 and C3, of 27.15 Mbp and 26.94 Mbp, respectively. The genomes were sequenced using the Illumina MiSeq platform and will enable further comparative and functional genomics studies.
Affiliation(s)
- José Luis Ruiz
  - Bioinformatics Unit, Institute of Parasitology and Biomedicine “López-Neyra” (IPBLN), CSIC, Granada, Spain
- Jessica Carreira de Paula
  - Department of Parasitology, Biochemical and Molecular Parasitology Group CTS-183, University of Granada, Granada, Spain
  - Institute of Biotechnology, University of Granada, Granada, Spain
- Tamara Gómez-Moracho
  - Department of Parasitology, Biochemical and Molecular Parasitology Group CTS-183, University of Granada, Granada, Spain
  - Institute of Biotechnology, University of Granada, Granada, Spain
- Pedro García Olmedo
  - Department of Parasitology, Biochemical and Molecular Parasitology Group CTS-183, University of Granada, Granada, Spain
  - Institute of Biotechnology, University of Granada, Granada, Spain
- Eduardo Andrés-León
  - Bioinformatics Unit, Institute of Parasitology and Biomedicine “López-Neyra” (IPBLN), CSIC, Granada, Spain
- Jennifer Solano Parada
  - Department of Parasitology, Biochemical and Molecular Parasitology Group CTS-183, University of Granada, Granada, Spain
  - Institute of Biotechnology, University of Granada, Granada, Spain
- Antonio Osuna
  - Department of Parasitology, Biochemical and Molecular Parasitology Group CTS-183, University of Granada, Granada, Spain
  - Institute of Biotechnology, University of Granada, Granada, Spain
- Luis Miguel de Pablos
  - Department of Parasitology, Biochemical and Molecular Parasitology Group CTS-183, University of Granada, Granada, Spain
  - Institute of Biotechnology, University of Granada, Granada, Spain

6. Delandre O, Lamer O, Loreau JM, Papa Mze N, Fonta I, Mosnier J, Gomez N, Javelle E, Pradines B. Long-Read Sequencing and De Novo Genome Assembly Pipeline of Two Plasmodium falciparum Clones (Pf3D7, PfW2) Using Only the PromethION Sequencer from Oxford Nanopore Technologies without Whole-Genome Amplification. Biology 2024; 13:89. [PMID: 38392307 PMCID: PMC10886359 DOI: 10.3390/biology13020089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 01/25/2024] [Accepted: 01/29/2024] [Indexed: 02/24/2024]
Abstract
Antimalarial drug resistance has become a real public health problem despite WHO measures. New sequencing technologies make it possible to investigate genomic variations associated with resistant phenotypes at the genome-wide scale. Based on the use of hemisynthetic nanopores, the PromethION technology from Oxford Nanopore Technologies (ONT) can produce long-read sequences, in contrast to the short-read technologies previously used as the gold standard to sequence Plasmodium. Two clones of P. falciparum (Pf3D7 and PfW2) were sequenced with long reads using the PromethION sequencer without genomic amplification. This made it possible to create a processing and analysis pipeline for human Plasmodium from ONT FASTQ files only. De novo assembly revealed N50 lengths of 18,488 kb and 17,502 kb for the Pf3D7 and PfW2 clones, respectively. The genome size was estimated at 23,235,407 base pairs for the Pf3D7 clone and 21,712,038 base pairs for the PfW2 clone. The average genome coverage depth was estimated at 787X and 653X for the Pf3D7 and PfW2 clones, respectively. This study proposes an assembly processing pipeline for the human Plasmodium genome using software adapted to large ONT data and the high AT percentage of Plasmodium, and it provides all the parameters that were optimized for the software selected in the pipeline.
Affiliation(s)
- Océane Delandre
  - Unité Parasitologie et Entomologie, Département Microbiologie et Maladies Infectieuses, Institut de Recherche Biomédicale des Armées, 13005 Marseille, France
  - Aix Marseille Univ, IRD, SSA, AP-HM, VITROME, 13005 Marseille, France
  - IHU Méditerranée Infection, 13005 Marseille, France
- Ombeline Lamer
  - Unité Bactériologie, Département Microbiologie et Maladies Infectieuses, Institut de Recherche Biomédicale des Armées, 91220 Brétigny-sur-Orge, France
  - Aix-Marseille Univ, INSERM, SSA, IRBA, MCT, 13005 Marseille, France
- Jean-Marie Loreau
  - French Armed Forces Center for Epidemiology and Public Health (CESPA), 13014 Marseille, France
- Nasserdine Papa Mze
  - Service de Biologie, Unité de Microbiologie, Hôpital Mignot, Centre Hospitalier de Versailles, 78150 Versailles, France
- Isabelle Fonta
  - Unité Parasitologie et Entomologie, Département Microbiologie et Maladies Infectieuses, Institut de Recherche Biomédicale des Armées, 13005 Marseille, France
  - Aix Marseille Univ, IRD, SSA, AP-HM, VITROME, 13005 Marseille, France
  - IHU Méditerranée Infection, 13005 Marseille, France
  - Centre National de Référence du Paludisme, 13005 Marseille, France
- Joel Mosnier
  - Unité Parasitologie et Entomologie, Département Microbiologie et Maladies Infectieuses, Institut de Recherche Biomédicale des Armées, 13005 Marseille, France
  - Aix Marseille Univ, IRD, SSA, AP-HM, VITROME, 13005 Marseille, France
  - IHU Méditerranée Infection, 13005 Marseille, France
  - Centre National de Référence du Paludisme, 13005 Marseille, France
- Nicolas Gomez
  - Unité Parasitologie et Entomologie, Département Microbiologie et Maladies Infectieuses, Institut de Recherche Biomédicale des Armées, 13005 Marseille, France
  - Aix Marseille Univ, IRD, SSA, AP-HM, VITROME, 13005 Marseille, France
  - IHU Méditerranée Infection, 13005 Marseille, France
  - Centre National de Référence du Paludisme, 13005 Marseille, France
- Emilie Javelle
  - Unité Parasitologie et Entomologie, Département Microbiologie et Maladies Infectieuses, Institut de Recherche Biomédicale des Armées, 13005 Marseille, France
  - Aix Marseille Univ, IRD, SSA, AP-HM, VITROME, 13005 Marseille, France
  - IHU Méditerranée Infection, 13005 Marseille, France
  - Centre National de Référence du Paludisme, 13005 Marseille, France
- Bruno Pradines
  - Unité Parasitologie et Entomologie, Département Microbiologie et Maladies Infectieuses, Institut de Recherche Biomédicale des Armées, 13005 Marseille, France
  - Aix Marseille Univ, IRD, SSA, AP-HM, VITROME, 13005 Marseille, France
  - IHU Méditerranée Infection, 13005 Marseille, France
  - Centre National de Référence du Paludisme, 13005 Marseille, France

7. Wang J, Veldsman WP, Fang X, Huang Y, Xie X, Lyu A, Zhang L. Benchmarking multi-platform sequencing technologies for human genome assembly. Brief Bioinform 2023; 24:bbad300. [PMID: 37594299 DOI: 10.1093/bib/bbad300] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/25/2023] [Revised: 07/12/2023] [Accepted: 07/26/2023] [Indexed: 08/19/2023] Open
Abstract
Genome assembly is a computational technique that involves piecing together deoxyribonucleic acid (DNA) fragments generated by sequencing technologies to create a comprehensive and precise representation of the entire genome. Generating a high-quality human reference genome is a crucial prerequisite for comprehending human biology, and it is also vital for downstream genomic variation analysis. Many efforts have been made over the past few decades to create a complete and gapless reference genome for humans by using a diverse range of advanced sequencing technologies. Several available tools are aimed at enhancing the quality of haploid and diploid human genome assemblies; these cover contig assembly, polishing of contig errors, scaffolding and variant phasing. Selecting the appropriate tools and technologies remains a daunting task, even though several studies have investigated the pros and cons of different assembly strategies. The goal of this paper was to benchmark various strategies for human genome assembly by combining sequencing technologies and tools on two publicly available samples (NA12878 and NA24385) from Genome in a Bottle. We then compared their performances in terms of continuity, accuracy, completeness, variant calling and phasing. We observed that PacBio HiFi long-reads are the optimal choice for generating an assembly with low base errors. On the other hand, we were able to produce the most continuous contigs with Oxford Nanopore long-reads, but they may require further polishing to improve quality. We recommend using short-reads rather than the long-reads themselves to improve the base accuracy of contigs from Oxford Nanopore long-reads. Hi-C is the best choice for chromosome-level scaffolding because it can capture the longest-range DNA connectedness compared to 10× linked-reads and Bionano optical maps. However, a combination of multiple technologies can be used to further improve the quality and completeness of genome assembly. For human diploid genome assembly, hifiasm is the best tool when using PacBio HiFi and Hi-C data. Looking to the future, we expect that further advancements in human diploid assemblers will leverage the power of PacBio HiFi reads and other technologies with long-range DNA connectedness to enable the generation of high-quality, chromosome-level and haplotype-resolved human genome assemblies.
Affiliation(s)
- Jingjing Wang
  - Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
- Werner Pieter Veldsman
  - Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
- Aiping Lyu
  - School of Chinese Medicine, Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
- Lu Zhang
  - Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
  - Institute for Research and Continuing Education, Hong Kong Baptist University, Shenzhen, China

8. Waizumi R, Tsubota T, Jouraku A, Kuwazaki S, Yokoi K, Iizuka T, Yamamoto K, Sezutsu H. Highly accurate genome assembly of an improved high-yielding silkworm strain, Nichi01. G3 (Bethesda) 2023; 13:jkad044. [PMID: 36814357 PMCID: PMC10085791 DOI: 10.1093/g3journal/jkad044] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/18/2022] [Revised: 01/23/2023] [Accepted: 02/14/2023] [Indexed: 02/24/2023]
Abstract
The silkworm (Bombyx mori) is an important lepidopteran model insect and an industrial domestic animal traditionally used for silk production. Here, we report the genome assembly of an improved Japanese strain, Nichi01, in which the cocoon yield is comparable to that of commercial silkworm strains. The integration of PacBio Sequel II long reads and a ddRAD-seq-based high-density genetic linkage map achieved the highest quality genome assembly of silkworms to date; 22 of the 28 pseudomolecules contained telomeric repeats at both ends, and only four gaps were present in the assembly. A total of 452 Mbp of the assembly with an N50 of 16.614 Mbp covered 99.3% of the complete orthologs of the lepidopteran core genes. Although comparison of the Nichi01 genome sequence with that of the previously reported low-yielding tropical strain p50T confirmed their accuracy in most regions, our assembly corrects several regions that were misassembled in p50T. A total of 18,397 proteins were predicted using over 95 Gb of mRNA-seq derived from 10 different organs, covering 96.9% of the complete orthologs of the lepidopteran core genes. The final assembly and annotation files are available in KAIKObase (https://kaikobase.dna.affrc.go.jp/index.html) along with a genome browser and BLAST searching service, which should facilitate further studies and the breeding of silkworms and other insects.
Affiliation(s)
- Ryusei Waizumi
  - Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Takuya Tsubota
  - Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Akiya Jouraku
  - Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Seigo Kuwazaki
  - Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Kakeru Yokoi
  - Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Tetsuya Iizuka
  - Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Kimiko Yamamoto
  - Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
- Hideki Sezutsu
  - Silkworm Research Group, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan