1
Gupta P, O’Neill H, Wolvetang E, Chatterjee A, Gupta I. Advances in single-cell long-read sequencing technologies. NAR Genom Bioinform 2024; 6:lqae047. [PMID: 38774511 PMCID: PMC11106032 DOI: 10.1093/nargab/lqae047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 04/18/2024] [Accepted: 04/29/2024] [Indexed: 05/24/2024]
Abstract
With an increase in accuracy and throughput of long-read sequencing technologies, they are rapidly being assimilated into the single-cell sequencing pipelines. For transcriptome sequencing, these techniques provide RNA isoform-level information in addition to the gene expression profiles. Long-read sequencing technologies not only help in uncovering complex patterns of cell-type specific splicing, but also offer unprecedented insights into the origin of cellular complexity and thus potentially new avenues for drug development. Additionally, single-cell long-read DNA sequencing enables high-quality assemblies, structural variant detection, haplotype phasing, resolving high-complexity regions, and characterization of epigenetic modifications. Given that significant progress has primarily occurred in single-cell RNA isoform sequencing (scRiso-seq), this review will delve into these advancements in depth and highlight the practical considerations and operational challenges, particularly pertaining to downstream analysis. We also aim to offer a concise introduction to complementary technologies for single-cell sequencing of the genome, epigenome and epitranscriptome. We conclude by identifying certain key areas of innovation that may drive these technologies further and foster more widespread application in biomedical science.
Affiliation(s)
- Pallavi Gupta
  - University of Queensland – IIT Delhi Research Academy, Hauz Khas, New Delhi 110016, India
  - Australian Institute of Bioengineering and Nanotechnology (AIBN), The University of Queensland, St Lucia, QLD 4072, Australia
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
- Hannah O’Neill
  - Department of Pathology, Dunedin School of Medicine, University of Otago, 58 Hanover Street, Dunedin 9054, New Zealand
- Ernst J Wolvetang
  - Australian Institute of Bioengineering and Nanotechnology (AIBN), The University of Queensland, St Lucia, QLD 4072, Australia
- Aniruddha Chatterjee
  - Department of Pathology, Dunedin School of Medicine, University of Otago, 58 Hanover Street, Dunedin 9054, New Zealand
- Ishaan Gupta
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
2
Wolf M, Greve C, Schell T, Janke A, Schmitt T, Pauls SU, Aspöck H, Aspöck U. The de novo genome of the Black-necked Snakefly (Venustoraphidia nigricollis Albarda, 1891): A resource to study the evolution of living fossils. J Hered 2024; 115:112-119. [PMID: 37988623 PMCID: PMC10838129 DOI: 10.1093/jhered/esad074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 11/15/2023] [Accepted: 11/21/2023] [Indexed: 11/23/2023]
Abstract
Snakeflies (Raphidioptera) are the smallest order of holometabolous insects that have kept their distinct and name-giving appearance since the Mesozoic, probably since the Jurassic, and possibly even since their emergence in the Carboniferous, more than 300 million years ago. Despite their interesting nature and numerous publications on their morphology, taxonomy, systematics, and biogeography, snakeflies have never received much attention from the general public, and only a few studies were devoted to their molecular biology. Due to this lack of molecular data, it is unknown whether the conserved morphological nature of these living fossils translates to conserved genomic structures. Here, we present the first genome of the species and of the entire order of Raphidioptera. The final genome assembly has a total length of 669 Mbp and reached a high continuity with an N50 of 5.07 Mbp. Further quality controls also indicate a high completeness and no meaningful contamination. The newly generated data was used in a large-scale phylogenetic analysis of snakeflies using shared orthologous sequences. Quartet score and gene concordance analyses revealed high amounts of conflicting signals within this group that might speak for substantial incomplete lineage sorting and introgression after their presumed re-radiation after the asteroid impact 66 million years ago. Overall, this reference genome will be a door-opening dataset for many future research applications, and we demonstrated its utility in a phylogenetic analysis that provides new insights into the evolution of this group of living fossils.
Affiliation(s)
- Magnus Wolf
  - Senckenberg Biodiversity and Climate Research Centre (BiK-F), Frankfurt am Main, Germany
  - Institute for Ecology, Evolution and Diversity, Goethe University, Frankfurt am Main, Germany
- Carola Greve
  - Senckenberg Biodiversity and Climate Research Centre (BiK-F), Frankfurt am Main, Germany
  - LOEWE-Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt am Main, Germany
- Tilman Schell
  - Senckenberg Biodiversity and Climate Research Centre (BiK-F), Frankfurt am Main, Germany
  - LOEWE-Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt am Main, Germany
- Axel Janke
  - Senckenberg Biodiversity and Climate Research Centre (BiK-F), Frankfurt am Main, Germany
  - Institute for Ecology, Evolution and Diversity, Goethe University, Frankfurt am Main, Germany
  - LOEWE-Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt am Main, Germany
- Thomas Schmitt
  - Senckenberg German Entomological Institute, Müncheberg, Germany
  - Entomology and Biogeography, Faculty of Science, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Steffen U Pauls
  - LOEWE-Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt am Main, Germany
  - Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany
  - Institute of Insects Biotechnology, Justus-Liebig-University Giessen, Giessen, Germany
- Horst Aspöck
  - Institute of Specific Prophylaxis and Tropical Medicine, Medical Parasitology, Medical University of Vienna (MUW), Vienna, Austria
- Ulrike Aspöck
  - Department of Evolutionary Biology, University of Vienna, Vienna, Austria
  - Department of Entomology, Natural History Museum Vienna, Vienna, Austria
3
Collins G, Schneider C, Boštjančić LL, Burkhardt U, Christian A, Decker P, Ebersberger I, Hohberg K, Lecompte O, Merges D, Muelbaier H, Romahn J, Römbke J, Rutz C, Schmelz R, Schmidt A, Theissinger K, Veres R, Lehmitz R, Pfenninger M, Bálint M. The MetaInvert soil invertebrate genome resource provides insights into below-ground biodiversity and evolution. Commun Biol 2023; 6:1241. [PMID: 38066075 PMCID: PMC10709333 DOI: 10.1038/s42003-023-05621-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 11/21/2023] [Indexed: 12/18/2023]
Abstract
Soil invertebrates are among the least understood metazoans on Earth. Thus far, the lack of taxonomically broad and dense genomic resources has made it hard to thoroughly investigate their evolution and ecology. With MetaInvert we provide draft genome assemblies for 232 soil invertebrate species, representing 14 common groups and 94 families. We show that this data substantially extends the taxonomic scope of DNA- or RNA-based taxonomic identification. Moreover, we confirm that theories of genome evolution cannot be generalised across evolutionarily distinct invertebrate groups. The soil invertebrate genomes presented here will support the management of soil biodiversity through molecular monitoring of community composition and function, and the discovery of evolutionary adaptations to the challenges of soil conditions.
Affiliation(s)
- Gemma Collins
  - Senckenberg Biodiversity and Climate Research Centre, Frankfurt am Main, Germany
  - LOEWE Centre for Translational Biodiversity Genomics, Frankfurt am Main, Germany
- Clément Schneider
  - LOEWE Centre for Translational Biodiversity Genomics, Frankfurt am Main, Germany
  - Soil Zoology, Senckenberg Museum of Natural History, Görlitz, Germany
- Ljudevit Luka Boštjančić
  - Senckenberg Biodiversity and Climate Research Centre, Frankfurt am Main, Germany
  - Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Centre de Recherche en Biomédecine de Strasbourg, Strasbourg, France
  - Department of Molecular Ecology, Institute for Environmental Sciences, Rhineland-Palatinate Technical University Kaiserslautern Landau, Landau, Germany
- Axel Christian
  - Soil Zoology, Senckenberg Museum of Natural History, Görlitz, Germany
- Peter Decker
  - Soil Zoology, Senckenberg Museum of Natural History, Görlitz, Germany
- Ingo Ebersberger
  - Senckenberg Biodiversity and Climate Research Centre, Frankfurt am Main, Germany
  - LOEWE Centre for Translational Biodiversity Genomics, Frankfurt am Main, Germany
  - Institute of Cell Biology and Neuroscience, Goethe University, Frankfurt am Main, Germany
- Karin Hohberg
  - Soil Zoology, Senckenberg Museum of Natural History, Görlitz, Germany
- Odile Lecompte
  - Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Centre de Recherche en Biomédecine de Strasbourg, Strasbourg, France
- Dominik Merges
  - Department of Forest Mycology and Plant Pathology, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Hannah Muelbaier
  - Institute of Cell Biology and Neuroscience, Goethe University, Frankfurt am Main, Germany
- Juliane Romahn
  - Senckenberg Biodiversity and Climate Research Centre, Frankfurt am Main, Germany
  - LOEWE Centre for Translational Biodiversity Genomics, Frankfurt am Main, Germany
- Jörg Römbke
  - ECT Oekotoxikologie GmbH, Flörsheim, Germany
- Christelle Rutz
  - Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Centre de Recherche en Biomédecine de Strasbourg, Strasbourg, France
- Alexandra Schmidt
  - Senckenberg Biodiversity and Climate Research Centre, Frankfurt am Main, Germany
  - Limnological Institute, University of Konstanz, Konstanz, Germany
- Kathrin Theissinger
  - Senckenberg Biodiversity and Climate Research Centre, Frankfurt am Main, Germany
  - LOEWE Centre for Translational Biodiversity Genomics, Frankfurt am Main, Germany
  - Department of Molecular Ecology, Institute for Environmental Sciences, Rhineland-Palatinate Technical University Kaiserslautern Landau, Landau, Germany
- Robert Veres
  - Senckenberg Biodiversity and Climate Research Centre, Frankfurt am Main, Germany
  - Institute of Biology and Geology, Babeș-Bolyai University, Cluj-Napoca, Romania
- Ricarda Lehmitz
  - Soil Zoology, Senckenberg Museum of Natural History, Görlitz, Germany
- Markus Pfenninger
  - Senckenberg Biodiversity and Climate Research Centre, Frankfurt am Main, Germany
  - LOEWE Centre for Translational Biodiversity Genomics, Frankfurt am Main, Germany
  - Johannes Gutenberg University, Mainz, Germany
- Miklós Bálint
  - Senckenberg Biodiversity and Climate Research Centre, Frankfurt am Main, Germany
  - LOEWE Centre for Translational Biodiversity Genomics, Frankfurt am Main, Germany
  - Department of Insect Biotechnology, Justus-Liebig University, Gießen, Germany
4
Stevens L, Martínez-Ugalde I, King E, Wagah M, Absolon D, Bancroft R, Gonzalez de la Rosa P, Hall JL, Kieninger M, Kloch A, Pelan S, Robertson E, Pedersen AB, Abreu-Goodger C, Buck AH, Blaxter M. Ancient diversity in host-parasite interaction genes in a model parasitic nematode. Nat Commun 2023; 14:7776. [PMID: 38012132 PMCID: PMC10682056 DOI: 10.1038/s41467-023-43556-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 11/13/2023] [Indexed: 11/29/2023]
Abstract
Host-parasite interactions exert strong selection pressures on the genomes of both host and parasite. These interactions can lead to negative frequency-dependent selection, a form of balancing selection that is hypothesised to explain the high levels of polymorphism seen in many host immune and parasite antigen loci. Here, we sequence the genomes of several individuals of Heligmosomoides bakeri, a model parasite of house mice, and Heligmosomoides polygyrus, a closely related parasite of wood mice. Although H. bakeri is commonly referred to as H. polygyrus in the literature, their genomes show levels of divergence that are consistent with at least a million years of independent evolution. The genomes of both species contain hyper-divergent haplotypes that are enriched for proteins that interact with the host immune response. Many of these haplotypes originated prior to the divergence between H. bakeri and H. polygyrus, suggesting that they have been maintained by long-term balancing selection. Together, our results suggest that the selection pressures exerted by the host immune response have played a key role in shaping patterns of genetic diversity in the genomes of parasitic nematodes.
Affiliation(s)
- Lewis Stevens
  - Tree of Life, Wellcome Sanger Institute, Hinxton, UK
- Isaac Martínez-Ugalde
  - Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Erna King
  - Tree of Life, Wellcome Sanger Institute, Hinxton, UK
- Martin Wagah
  - Tree of Life, Wellcome Sanger Institute, Hinxton, UK
- Rowan Bancroft
  - Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Jessica L Hall
  - Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Sarah Pelan
  - Tree of Life, Wellcome Sanger Institute, Hinxton, UK
- Elaine Robertson
  - Institute of Immunology & Infection Research, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Amy B Pedersen
  - Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Cei Abreu-Goodger
  - Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Amy H Buck
  - Institute of Immunology & Infection Research, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Mark Blaxter
  - Tree of Life, Wellcome Sanger Institute, Hinxton, UK
5
Schelkunov MI. Mabs, a suite of tools for gene-informed genome assembly. BMC Bioinformatics 2023; 24:377. [PMID: 37794322 PMCID: PMC10548655 DOI: 10.1186/s12859-023-05499-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/04/2023] [Accepted: 09/26/2023] [Indexed: 10/06/2023]
Abstract
BACKGROUND Despite constantly improving genome sequencing methods, error-free eukaryotic genome assembly has not yet been achieved. Among other kinds of problems of eukaryotic genome assembly are so-called "haplotypic duplications", which may manifest themselves as cases of alleles being mistakenly assembled as paralogues. Haplotypic duplications are dangerous because they create illusions of gene family expansions and, thus, may lead scientists to incorrect conclusions about genome evolution and functioning. RESULTS Here, I present Mabs, a suite of tools that serve as parameter optimizers of the popular genome assemblers Hifiasm and Flye. By optimizing the parameters of Hifiasm and Flye, Mabs tries to create genome assemblies with the genes assembled as accurately as possible. Tests on 6 eukaryotic genomes showed that in 6 out of 6 cases, Mabs created assemblies with more accurately assembled genes than those generated by Hifiasm and Flye when they were run with default parameters. When assemblies of Mabs, Hifiasm and Flye were postprocessed by a popular tool for haplotypic duplication removal, Purge_dups, genes were better assembled by Mabs in 5 out of 6 cases. CONCLUSIONS Mabs is useful for making high-quality genome assemblies. It is available at https://github.com/shelkmike/Mabs.
6
Abalde S, Tellgren-Roth C, Heintz J, Vinnere Pettersson O, Jondelius U. The draft genome of the microscopic Nemertoderma westbladi sheds light on the evolution of Acoelomorpha genomes. Front Genet 2023; 14:1244493. [PMID: 37829276 PMCID: PMC10565955 DOI: 10.3389/fgene.2023.1244493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Accepted: 09/12/2023] [Indexed: 10/14/2023]
Abstract
Background: Xenacoelomorpha is a marine clade of microscopic worms that is an important model system for understanding the evolution of key bilaterian novelties, such as the excretory system. Nevertheless, Xenacoelomorpha genomics has been restricted to a few species that either can be cultured in the lab or are centimetres long. Thus far, no genomes are available for Nemertodermatida, one of the group's main clades and whose origin has been dated more than 400 million years ago. Methods: DNA was extracted from a single specimen and sequenced with HiFi following the PacBio Ultra-Low DNA Input protocol. After genome assembly, decontamination, and annotation, the genome quality was benchmarked using two acoel genomes and one Illumina genome as reference. The gene content of three cnidarians, three acoelomorphs, four deuterostomes, and eight protostomes was clustered in orthogroups to make inferences of gene content evolution. Finally, we focused on the genes related to the ultrafiltration excretory system to compare patterns of presence/absence and gene architecture among these clades. Results: We present the first nemertodermatid genome sequenced from a single specimen of Nemertoderma westbladi. Although genome contiguity remains challenging (N50: 60 kb), it is very complete (BUSCO: 80.2%, Metazoa; 88.6%, Eukaryota) and the quality of the annotation allows fine-detail analyses of genome evolution. Acoelomorph genomes seem to be relatively conserved in terms of the percentage of repeats, number of genes, number of exons per gene and intron size. In addition, a high fraction of genes present in both protostomes and deuterostomes are absent in Acoelomorpha. Interestingly, we show that all genes related to the excretory system are present in Xenacoelomorpha except Osr, a key element in the development of these organs and whose acquisition seems to be interconnected with the origin of the specialised excretory system. 
Conclusion: Overall, these analyses highlight the potential of the Ultra-Low Input DNA protocol and HiFi to generate high-quality genomes from single animals, even for relatively large genomes, making it a feasible option for sequencing challenging taxa, which will be an exciting resource for comparative genomics analyses.
Affiliation(s)
- Samuel Abalde
  - Department of Zoology, Swedish Museum of Natural History, Stockholm, Sweden
- Christian Tellgren-Roth
  - Department of Immunology, Genetics and Pathology, SciLifeLab, Uppsala University, Uppsala, Sweden
- Julia Heintz
  - Department of Immunology, Genetics and Pathology, SciLifeLab, Uppsala University, Uppsala, Sweden
- Olga Vinnere Pettersson
  - Department of Immunology, Genetics and Pathology, SciLifeLab, Uppsala University, Uppsala, Sweden
- Ulf Jondelius
  - Department of Zoology, Swedish Museum of Natural History, Stockholm, Sweden
  - Department of Zoology, Stockholm University, Stockholm, Sweden
7
Bachmann L, Beermann J, Brey T, de Boer HJ, Dannheim J, Edvardsen B, Ericson PGP, Holston KC, Johansson VA, Kloss P, Konijnenberg R, Osborn KJ, Pappalardo P, Pehlke H, Piepenburg D, Struck TH, Sundberg P, Markussen SS, Teschke K, Vanhove MPM. The role of systematics for understanding ecosystem functions: Proceedings of the Zoologica Scripta Symposium, Oslo, Norway, 25 August 2022. Zool Scr 2023. [DOI: 10.1111/zsc.12593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/28/2023]
8
Heckenhauer J, Razuri-Gonzales E, Mwangi FN, Schneider J, Pauls SU. Holotype sequencing of Silvatares holzenthali Rázuri-Gonzales, Ngera & Pauls, 2022 (Trichoptera, Pisuliidae). Zookeys 2023; 1159:1-15. [PMID: 37213527 PMCID: PMC10193998 DOI: 10.3897/zookeys.1159.98439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Accepted: 03/06/2023] [Indexed: 05/23/2023]
Abstract
While DNA barcodes are increasingly provided in descriptions of new species, the whole mitochondrial and nuclear genomes are still rarely included. This is unfortunate because whole genome sequencing of holotypes allows perpetual genetic characterization of the most representative specimen for a given species. Thus, de novo genomes are invaluable additional diagnostic characters in species descriptions, provided the structural integrity of the holotype specimens remains intact. Here, we used a minimally invasive method to extract DNA of the type specimen of the recently described caddisfly species Silvatares holzenthali Rázuri-Gonzales, Ngera & Pauls, 2022 (Trichoptera: Pisuliidae) from the Democratic Republic of the Congo. A low-cost next generation sequencing strategy was used to generate the complete mitochondrial and draft nuclear genome of the holotype. The data in its current form is an important extension to the morphological species description and valuable for phylogenomic studies.
Affiliation(s)
- Jacqueline Heckenhauer
  - Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt, Germany
  - LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt, Germany
- Ernesto Razuri-Gonzales
  - Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt, Germany
- Francois Ngera Mwangi
  - Centre de Recherche en Sciences Naturelles, Lwiro, Bukavu, Democratic Republic of the Congo
- Julio Schneider
  - Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt, Germany
- Steffen U. Pauls
  - Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt, Germany
  - LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt, Germany
  - Institute for Insect Biotechnology, Justus-Liebig-University, Gießen, Germany
9
Whiteford S, van’t Hof AE, Krishna R, Marubbi T, Widdison S, Saccheri IJ, Guest M, Morrison NI, Darby AC. Recovering individual haplotypes and a contiguous genome assembly from pooled long-read sequencing of the diamondback moth (Lepidoptera: Plutellidae). G3 (Bethesda) 2022; 12:jkac210. [PMID: 35980174 PMCID: PMC9526047 DOI: 10.1093/g3journal/jkac210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Accepted: 07/25/2022] [Indexed: 06/15/2023]
Abstract
The assembly of divergent haplotypes using noisy long-read data presents a challenge to the reconstruction of haploid genome assemblies, due to overlapping distributions of technical sequencing error, intralocus genetic variation, and interlocus similarity within these data. Here, we present a comparative analysis of assembly algorithms representing overlap-layout-consensus, repeat graph, and de Bruijn graph methods. We examine how postprocessing strategies attempting to reduce redundant heterozygosity interact with the choice of initial assembly algorithm and ultimately produce a series of chromosome-level assemblies for an agricultural pest, the diamondback moth, Plutella xylostella (L.). We compare evaluation methods and show that BUSCO analyses may overestimate haplotig removal processing in long-read draft genomes, in comparison to a k-mer method. We discuss the trade-offs inherent in assembly algorithm and curation choices and suggest that "best practice" is research question dependent. We demonstrate a link between allelic divergence and allele-derived contig redundancy in final genome assemblies and document the patterns of coding and noncoding diversity between redundant sequences. We also document a link between an excess of nonsynonymous polymorphism and haplotigs that are unresolved by assembly or postassembly algorithms. Finally, we discuss how this phenomenon may have relevance for the usage of noisy long-read genome assemblies in comparative genomics.
Affiliation(s)
- Samuel Whiteford
  - Corresponding author: Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK
- Arjen E van’t Hof
  - Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, UK
- Ritesh Krishna
  - Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, UK
  - IBM Research UK, STFC Daresbury Laboratory, Warrington WA4 4AD, UK
- Stephanie Widdison
  - General Bioinformatics, Jealott's Hill International Research Centre, Bracknell RG42 6EY, UK
- Ilik J Saccheri
  - Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, UK
- Marcus Guest
  - Syngenta, Jealott's Hill International Research Centre, Bracknell, RG42 6EY, UK
- Alistair C Darby
  - Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, UK
10
Carlson CR, ter Horst AM, Johnston JS, Henry E, Falk BW, Kuo YW. High-quality, chromosome-scale genome assemblies: comparisons of three Diaphorina citri (Asian citrus psyllid) geographic populations. DNA Res 2022; 29:6648404. [PMID: 35866687 PMCID: PMC9338690 DOI: 10.1093/dnares/dsac027] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/14/2022] [Indexed: 11/13/2022]
Abstract
The Asian citrus psyllid, Diaphorina citri, is the insect vector of the causal agent of huanglongbing (HLB), a devastating bacterial disease of commercial citrus. Presently, few genomic resources exist for D. citri. In this study, we utilized PacBio HiFi and chromatin conformation capture (Hi-C) sequencing to sequence, assemble, and compare three high-quality, chromosome-scale genome assemblies of D. citri collected from California, Taiwan, and Uruguay. Our assemblies had final sizes of 282.67 Mb (California), 282.89 Mb (Taiwan), and 266.67 Mb (Uruguay) assembled into 13 pseudomolecules, a reduction in assembly size of 41–45% compared with previous assemblies, which we validated using flow cytometry. We identified the X chromosome in D. citri and annotated each assembly for repetitive elements, protein-coding genes, transfer RNAs, ribosomal RNAs, piwi-interacting RNA clusters, and endogenous viral elements. Between 19,083 and 20,357 protein-coding genes were predicted. Repetitive DNA accounts for 36.87–38.26% of each assembly. Comparative analyses and mitochondrial haplotype networks suggest that Taiwan and Uruguay D. citri are more closely related, while California D. citri are closely related to Florida D. citri. These high-quality, chromosome-scale assemblies provide new genomic resources to researchers to further D. citri and HLB research.
Affiliation(s)
- Curtis R Carlson
  - Department of Plant Pathology, University of California Davis, Davis, CA 95616, USA
- Anneliek M ter Horst
  - Department of Plant Pathology, University of California Davis, Davis, CA 95616, USA
- J Spencer Johnston
  - Department of Entomology, Texas A&M University, College Station, TX 77843, USA
- Elizabeth Henry
  - Department of Plant Pathology, University of California Davis, Davis, CA 95616, USA
- Bryce W Falk
  - Department of Plant Pathology, University of California Davis, Davis, CA 95616, USA
- Yen-Wen Kuo
  - Department of Plant Pathology, University of California Davis, Davis, CA 95616, USA
11
EBP-Colombia and the bioeconomy: Genomics in the service of biodiversity conservation and sustainable development. Proc Natl Acad Sci U S A 2022; 119:2115641119. [PMID: 35042804 PMCID: PMC8795567 DOI: 10.1073/pnas.2115641119] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/18/2022]
Abstract
The 2016 Peace Agreement has increased access to Colombia’s unique ecosystems, which remain understudied and increasingly under threat. The Colombian government has recently announced its National Bioeconomic Strategy (NBS), founded on the sustainable characterization, management, and conservation of the nation's biodiversity as a means to achieve sustainability and peace. Molecular tools will accelerate such endeavors, but capacity remains limited in Colombia. The Earth Biogenome Project's (EBP) objective is to characterize the genomes of all eukaryotic life on Earth through networks of partner institutions focused on sequencing either specific taxa or eukaryotic communities at regional or national scales. Colombia’s immense biodiversity and emerging network of stakeholders have inspired the creation of the national partnership “EBP-Colombia.” Here, we discuss how this Colombian-driven collaboration between government, academia, and the private sector is integrating research with sustainable, environmentally focused strategies to develop Colombia’s postconflict bioeconomy and conserve biological and cultural diversity. EBP-Colombia will accelerate the uptake of technology and promote partnership and exchange of knowledge among Colombian stakeholders and the EBP’s global network of experts; assist with conservation strategies to preserve Colombia’s vast biological wealth; and promote innovative approaches among public and private institutions in sectors such as agriculture, tourism, recycling, and medicine. EBP-Colombia can thus support Colombia’s NBS with the objective of sustainable and inclusive development to address the many social, environmental, and economic challenges, including conflict, inequality, poverty, and low agricultural productivity, and so, offer an alternative model for economic development that similarly placed countries can adopt.
|
12
|
Kim BY, Wang JR, Miller DE, Barmina O, Delaney E, Thompson A, Comeault AA, Peede D, D'Agostino ERR, Pelaez J, Aguilar JM, Haji D, Matsunaga T, Armstrong EE, Zych M, Ogawa Y, Stamenković-Radak M, Jelić M, Veselinović MS, Tanasković M, Erić P, Gao JJ, Katoh TK, Toda MJ, Watabe H, Watada M, Davis JS, Moyle LC, Manoli G, Bertolini E, Košťál V, Hawley RS, Takahashi A, Jones CD, Price DK, Whiteman N, Kopp A, Matute DR, Petrov DA. Highly contiguous assemblies of 101 drosophilid genomes. eLife 2021; 10:e66405. [PMID: 34279216 PMCID: PMC8337076 DOI: 10.7554/elife.66405] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 01/11/2021] [Accepted: 07/16/2021] [Indexed: 12/13/2022] Open
Abstract
Over 100 years of studies in Drosophila melanogaster and related species in the genus Drosophila have facilitated key discoveries in genetics, genomics, and evolution. While high-quality genome assemblies exist for several species in this group, they only encompass a small fraction of the genus. Recent advances in long-read sequencing allow high-quality genome assemblies for tens or even hundreds of species to be efficiently generated. Here, we utilize Oxford Nanopore sequencing to build an open community resource of genome assemblies for 101 lines of 93 drosophilid species encompassing 14 species groups and 35 sub-groups. The genomes are highly contiguous and complete, with an average contig N50 of 10.5 Mb and greater than 97% BUSCO completeness in 97/101 assemblies. We show that Nanopore-based assemblies are highly accurate in coding regions, particularly with respect to coding insertions and deletions. These assemblies, along with a detailed laboratory protocol and assembly pipelines, are released as a public resource and will serve as a starting point for addressing broad questions of genetics, ecology, and evolution at the scale of hundreds of species.
Affiliation(s)
- Bernard Y Kim
- Department of Biology, Stanford University, Stanford, United States
- Jeremy R Wang
- Department of Genetics, University of North Carolina, Chapel Hill, United States
- Danny E Miller
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children’s Hospital, Seattle, United States
- Olga Barmina
- Department of Evolution and Ecology, University of California Davis, Davis, United States
- Emily Delaney
- Department of Evolution and Ecology, University of California Davis, Davis, United States
- Ammon Thompson
- Department of Evolution and Ecology, University of California Davis, Davis, United States
- Aaron A Comeault
- School of Natural Sciences, Bangor University, Bangor, United Kingdom
- David Peede
- Biology Department, University of North Carolina, Chapel Hill, United States
- Julianne Pelaez
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
- Jessica M Aguilar
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
- Diler Haji
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
- Teruyuki Matsunaga
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
- Molly Zych
- Molecular and Cellular Biology Program, University of Washington, Seattle, United States
- Yoshitaka Ogawa
- Department of Biological Sciences, Tokyo Metropolitan University, Hachioji, Japan
- Mihailo Jelić
- Faculty of Biology, University of Belgrade, Belgrade, Serbia
- Marija Tanasković
- University of Belgrade, Institute for Biological Research "Siniša Stanković", National Institute of Republic of Serbia, Belgrade, Serbia
- Pavle Erić
- University of Belgrade, Institute for Biological Research "Siniša Stanković", National Institute of Republic of Serbia, Belgrade, Serbia
- Jian-Jun Gao
- School of Ecology and Environmental Science, Yunnan University, Kunming, China
- Takehiro K Katoh
- School of Ecology and Environmental Science, Yunnan University, Kunming, China
- Hideaki Watabe
- Biological Laboratory, Sapporo College, Hokkaido University of Education, Sapporo, Japan
- Masayoshi Watada
- Graduate School of Science and Engineering, Ehime University, Matsuyama, Japan
- Jeremy S Davis
- Department of Biology, University of Kentucky, Lexington, United States
- Leonie C Moyle
- Department of Biology, Indiana University, Bloomington, United States
- Giulia Manoli
- Neurobiology and Genetics, Theodor Boveri Institute, Biocentre, University of Würzburg, Würzburg, Germany
- Enrico Bertolini
- Neurobiology and Genetics, Theodor Boveri Institute, Biocentre, University of Würzburg, Würzburg, Germany
- Vladimír Košťál
- Institute of Entomology, Biology Centre, Academy of Sciences of the Czech Republic, Prague, Czech Republic
- R Scott Hawley
- Department of Molecular and Integrative Physiology, University of Kansas Medical Center, Stowers Institute for Medical Research, Kansas City, United States
- Aya Takahashi
- Department of Biological Sciences, Tokyo Metropolitan University, Hachioji, Japan
- Corbin D Jones
- Biology Department, University of North Carolina, Chapel Hill, United States
- Donald K Price
- School of Life Science, University of Nevada, Las Vegas, United States
- Noah Whiteman
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
- Artyom Kopp
- Department of Evolution and Ecology, University of California Davis, Davis, United States
- Daniel R Matute
- Biology Department, University of North Carolina, Chapel Hill, United States
- Dmitri A Petrov
- Department of Biology, Stanford University, Stanford, United States
|
13
|
The USDA-ARS Ag100Pest Initiative: High-Quality Genome Assemblies for Agricultural Pest Arthropod Research. INSECTS 2021; 12:insects12070626. [PMID: 34357286 PMCID: PMC8307976 DOI: 10.3390/insects12070626] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 06/02/2021] [Revised: 06/20/2021] [Accepted: 06/22/2021] [Indexed: 12/16/2022]
Abstract
Simple Summary: High-quality genome assemblies are essential tools for modern biological research. In the past, creating genome assemblies was prohibitively expensive and time-consuming for most non-model insect species due to, in part, the technical challenge of isolating the necessary quantity and quality of DNA from many species. Sequencing methods have now improved such that many insect genomes can be sequenced and assembled at scale. We created the Ag100Pest Initiative to propel agricultural research forward by assembling reference-quality genomes of important arthropod pest species. Here, we describe the Ag100Pest Initiative’s processes and experimental procedures. We show that the Ag100Pest Initiative will greatly expand the diversity of publicly available arthropod genome assemblies. We also demonstrate the high quality of preliminary contig assemblies. We share arthropod-specific technical details and insights that we have gained during the project. The methods and preliminary results presented herein should help other researchers attain similarly high-quality assemblies, effectively changing the landscape of insect genomics.
Abstract: The phylum Arthropoda includes species crucial for ecosystem stability, soil health, crop production, and others that present obstacles to crop and animal agriculture. The United States Department of Agriculture’s Agricultural Research Service initiated the Ag100Pest Initiative to generate reference genome assemblies of arthropods that are (or may become) pests to agricultural production and global food security. We describe the project goals, process, status, and future. The first three years of the project were focused on species selection, specimen collection, and the construction of lab and bioinformatics pipelines for the efficient production of assemblies at scale. Contig-level assemblies of 47 species are presented, all of which were generated from single specimens. Lessons learned and optimizations leading to the current pipeline are discussed. The project name implies a target of 100 species, but the efficiencies gained during the project have supported an expansion of the original goal and a total of 158 species are currently in the pipeline. We anticipate that the processes described in the paper will help other arthropod research groups or other consortia considering genome assembly at scale.
|
14
|
Schneider C, Woehle C, Greve C, D'Haese CA, Wolf M, Hiller M, Janke A, Bálint M, Huettel B. Two high-quality de novo genomes from single ethanol-preserved specimens of tiny metazoans (Collembola). Gigascience 2021; 10:giab035. [PMID: 34018554 PMCID: PMC8138834 DOI: 10.1093/gigascience/giab035] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 12/12/2020] [Revised: 02/05/2021] [Accepted: 04/27/2021] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND: Genome sequencing of all known eukaryotes on Earth promises unprecedented advances in biological sciences and in biodiversity-related applied fields such as environmental management and natural product research. Advances in long-read DNA sequencing make it feasible to generate high-quality genomes for many non-genetic model species. However, long-read sequencing today relies on sizable quantities of high-quality, high molecular weight DNA, which is mostly obtained from fresh tissues. This is a challenge for biodiversity genomics of most metazoan species, which are tiny and need to be preserved immediately after collection. Here we present de novo genomes of 2 species of submillimeter Collembola. For each, we prepared the sequencing library from high molecular weight DNA extracted from a single specimen and using a novel ultra-low input protocol from Pacific Biosciences. This protocol requires a DNA input of only 5 ng, permitted by a whole-genome amplification step.
RESULTS: The 2 assembled genomes have N50 values >5.5 and 8.5 Mb, respectively, and both contain ∼96% of BUSCO genes. Thus, they are highly contiguous and complete. The genomes are supported by an integrative taxonomy approach including placement in a genome-based phylogeny of Collembola and designation of a neotype for 1 of the species. Higher heterozygosity values are recorded in the more mobile species. Both species are devoid of the biosynthetic pathway for β-lactam antibiotics known in several Collembola, confirming the tight correlation of antibiotic synthesis with the species way of life.
CONCLUSIONS: It is now possible to generate high-quality genomes from single specimens of minute, field-preserved metazoans, exceeding the minimum contig N50 (1 Mb) required by the Earth BioGenome Project.
Affiliation(s)
- Clément Schneider
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Senckenberganlage 25, 60325 Frankfurt am Main, Germany
- Senckenberg Gesellschaft für Naturforschung, Abteilung Bodenzoologie, Am Museum 1, 02826 Görlitz, Germany
- Christian Woehle
- Max Planck Institute for Plant Breeding Research, Max Planck Genome-centre Cologne, Carl-von-Linné-Weg 10, 50829 Cologne, Germany
- Carola Greve
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Senckenberganlage 25, 60325 Frankfurt am Main, Germany
- Cyrille A D'Haese
- Unité Mécanismes adaptatifs & Evolution (MECADEV), CNRS, Muséum national d'Histoire naturelle, 45 rue Buffon 75005 Paris, France
- Magnus Wolf
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Senckenberganlage 25, 60325 Frankfurt am Main, Germany
- Senckenberg Biodiversity and Climate Research Centre, Senckenberganlage 25, 60325 Frankfurt am Main, Germany
- Goethe University, Max-von-Laue-Str. 9, 60438 Frankfurt am Main, Germany
- Michael Hiller
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Senckenberganlage 25, 60325 Frankfurt am Main, Germany
- Goethe University, Max-von-Laue-Str. 9, 60438 Frankfurt am Main, Germany
- Senckenberg Research Institute, Senckenberganlage 25, 60325 Frankfurt, Germany
- Axel Janke
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Senckenberganlage 25, 60325 Frankfurt am Main, Germany
- Senckenberg Biodiversity and Climate Research Centre, Senckenberganlage 25, 60325 Frankfurt am Main, Germany
- Goethe University, Max-von-Laue-Str. 9, 60438 Frankfurt am Main, Germany
- Miklós Bálint
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Senckenberganlage 25, 60325 Frankfurt am Main, Germany
- Senckenberg Biodiversity and Climate Research Centre, Senckenberganlage 25, 60325 Frankfurt am Main, Germany
- Bruno Huettel
- Max Planck Institute for Plant Breeding Research, Max Planck Genome-centre Cologne, Carl-von-Linné-Weg 10, 50829 Cologne, Germany
|