1
Lin Q, Xavier BB, Alako BTF, Mitchell AL, Rajakani SG, Glupczynski Y, Finn RD, Cochrane G, Malhotra-Kumar S. Screening of global microbiomes implies ecological boundaries impacting the distribution and dissemination of clinically relevant antimicrobial resistance genes. Commun Biol 2022; 5:1217. [DOI: 10.1038/s42003-022-04187-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Accepted: 10/28/2022] [Indexed: 11/19/2022] Open
Abstract
Understanding the myriad pathways by which antimicrobial-resistance genes (ARGs) spread across biomes is necessary to counteract the global menace of antimicrobial resistance. We screened 17939 assembled metagenomic samples covering 21 biomes, differing in sequencing quality and depth, unevenly across 46 countries, 6 continents, and 14 years (2005-2019) for clinically crucial ARGs, mobile colistin resistance (mcr), carbapenem resistance (CR), and (extended-spectrum) beta-lactamase (ESBL and BL) genes. These ARGs were most frequent in human gut, oral and skin biomes, followed by anthropogenic (wastewater, bioreactor, compost, food), and natural biomes (freshwater, marine, sediment). Mcr-9 was the most prevalent mcr gene, spatially and temporally; blaOXA-233 and blaTEM-1 were the most prevalent CR and BL/ESBL genes, but blaGES-2 and blaTEM-116 showed the widest distribution. Redundancy analysis and Bayesian analysis showed ARG distribution was non-random and best-explained by potential host genera and biomes, followed by collection year, anthropogenic factors and collection countries. Preferential ARG occurrence, and potential transmission, between characteristically similar biomes indicate strong ecological boundaries. Our results provide a high-resolution global map of ARG distribution and importantly, identify checkpoint biomes wherein interventions aimed at disrupting ARGs dissemination are likely to be most effective in reducing dissemination and in the long term, the ARG global burden.
2
Lange M, Alako BTF, Cochrane G, Ghaffar M, Mascher M, Habekost PK, Hillebrand U, Scholz U, Schorch F, Freitag J, Scholz AH. Quantitative monitoring of nucleotide sequence data from genetic resources in context of their citation in the scientific literature. Gigascience 2021; 10:giab084. [PMID: 34966925 PMCID: PMC8716361 DOI: 10.1093/gigascience/giab084] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/01/2021] [Revised: 08/04/2021] [Accepted: 11/29/2021] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND: Linking nucleotide sequence data (NSD) to scientific publication citations can enhance understanding of NSD provenance, scientific use, and reuse in the community. By connecting publications with NSD records, NSD geographical provenance information, and author geographical information, it becomes possible to assess the contribution of NSD to infer trends in scientific knowledge gain at the global level.
FINDINGS: We extracted and linked records from the European Nucleotide Archive to citations in open-access publications aggregated at Europe PubMed Central. A total of 8,464,292 ENA accessions with geographical provenance information were associated with publications. We conducted a data quality review to uncover potential issues in publication citation information extraction and author affiliation tagging and developed and implemented best-practice recommendations for citation extraction. We constructed flat data tables and a data warehouse with an interactive web application to enable ad hoc exploration of NSD use and summary statistics.
CONCLUSIONS: The extraction and linking of NSD with associated publication citations enables transparency. The quality review contributes to enhanced text mining methods for identifier extraction and use. Furthermore, the global provision and use of NSD enable scientists worldwide to join literature and sequence databases in a multidimensional fashion. As a concrete use case, we visualized statistics of country clusters concerning NSD access in the context of discussions around digital sequence information under the United Nations Convention on Biological Diversity.
Affiliation(s)
- Matthias Lange
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
- Blaise T F Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Mehmood Ghaffar
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
- Martin Mascher
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Puschstraße 4, 04103 Leipzig, Germany
- Pia-Katharina Habekost
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
  - The Harz University of Applied Science, Department of Automation and Computer Science, Friedrichstraße 57, 38855 Wernigerode, Germany
- Upneet Hillebrand
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures GmbH, Department Research - Microbial Ecology and Diversity, Inhoffenstraße 7B, 38124 Braunschweig, Germany
- Uwe Scholz
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
- Florian Schorch
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
  - The Harz University of Applied Science, Department of Automation and Computer Science, Friedrichstraße 57, 38855 Wernigerode, Germany
- Jens Freitag
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany
- Amber Hartman Scholz
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures GmbH, Department Research - Microbial Ecology and Diversity, Inhoffenstraße 7B, 38124 Braunschweig, Germany
3
Blackwell GA, Hunt M, Malone KM, Lima L, Horesh G, Alako BTF, Thomson NR, Iqbal Z. Exploring bacterial diversity via a curated and searchable snapshot of archived DNA sequences. PLoS Biol 2021; 19:e3001421. [PMID: 34752446 PMCID: PMC8577725 DOI: 10.1371/journal.pbio.3001421] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 03/03/2021] [Accepted: 09/21/2021] [Indexed: 12/15/2022] Open
Abstract
The open sharing of genomic data provides an incredibly rich resource for the study of bacterial evolution and function and even anthropogenic activities such as the widespread use of antimicrobials. However, these data consist of genomes assembled with different tools and levels of quality checking, and of large volumes of completely unprocessed raw sequence data. In both cases, considerable computational effort is required before biological questions can be addressed. Here, we assembled and characterised 661,405 bacterial genomes retrieved from the European Nucleotide Archive (ENA) in November of 2018 using a uniform standardised approach. Of these, 311,006 did not previously have an assembly. We produced a searchable COmpact Bit-sliced Signature (COBS) index, facilitating the easy interrogation of the entire dataset for a specific sequence (e.g., gene, mutation, or plasmid). Additional MinHash and pp-sketch indices support genome-wide comparisons and estimations of genomic distance. Combined, this resource will allow data to be easily subset and searched, phylogenetic relationships between genomes to be quickly elucidated, and hypotheses rapidly generated and tested. We believe that this combination of uniform processing and variety of search/filter functionalities will make this a resource of very wide utility. In terms of diversity within the data, a breakdown of the 639,981 high-quality genomes emphasised the uneven species composition of the ENA/public databases, with just 20 of the total 2,336 species making up 90% of the genomes. The overrepresented species tend to be acute/common human pathogens, aligning with research priorities at different levels from individual interests to funding bodies and national and global public health agencies.
Affiliation(s)
- Grace A. Blackwell
  - EMBL-EBI, Wellcome Genome Campus, Hinxton, United Kingdom
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Martin Hunt
  - EMBL-EBI, Wellcome Genome Campus, Hinxton, United Kingdom
  - Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Leandro Lima
  - EMBL-EBI, Wellcome Genome Campus, Hinxton, United Kingdom
- Gal Horesh
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Nicholas R. Thomson
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
  - London School of Hygiene & Tropical Medicine, London, United Kingdom
- Zamin Iqbal
  - EMBL-EBI, Wellcome Genome Campus, Hinxton, United Kingdom
4
Harrison PW, Ahamed A, Aslam R, Alako BTF, Burgin J, Buso N, Courtot M, Fan J, Gupta D, Haseeb M, Holt S, Ibrahim T, Ivanov E, Jayathilaka S, Balavenkataraman Kadhirvelu V, Kumar M, Lopez R, Kay S, Leinonen R, Liu X, O'Cathail C, Pakseresht A, Park Y, Pesant S, Rahman N, Rajan J, Sokolov A, Vijayaraja S, Waheed Z, Zyoud A, Burdett T, Cochrane G. The European Nucleotide Archive in 2020. Nucleic Acids Res 2021; 49:D82-D85. [PMID: 33175160 PMCID: PMC7778925 DOI: 10.1093/nar/gkaa1028] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Received: 09/18/2020] [Accepted: 10/20/2020] [Indexed: 11/12/2022] Open
Abstract
The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), has for almost forty years continued in its mission to freely archive and present the world's public sequencing data for the benefit of the entire scientific community and for the acceleration of the global research effort. Here we highlight the major developments to ENA services and content in 2020, focussing in particular on the recently released updated ENA browser, modernisation of our release process and our data coordination collaborations with specific research communities.
Affiliation(s)
- Peter W Harrison
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alisha Ahamed
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Raheela Aslam
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Blaise T F Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josephine Burgin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nicola Buso
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Mélanie Courtot
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jun Fan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dipayan Gupta
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Muhammad Haseeb
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sam Holt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Talal Ibrahim
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Eugene Ivanov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suran Jayathilaka
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Manish Kumar
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rodrigo Lopez
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Simon Kay
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rasko Leinonen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Xin Liu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Colman O'Cathail
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Amir Pakseresht
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Youngmi Park
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephane Pesant
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nadim Rahman
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jeena Rajan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alexey Sokolov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Senthilnathan Vijayaraja
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Zahra Waheed
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ahmad Zyoud
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tony Burdett
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
5
Amid C, Alako BTF, Balavenkataraman Kadhirvelu V, Burdett T, Burgin J, Fan J, Harrison PW, Holt S, Hussein A, Ivanov E, Jayathilaka S, Kay S, Keane T, Leinonen R, Liu X, Martinez-Villacorta J, Milano A, Pakseresht A, Rahman N, Rajan J, Reddy K, Richards E, Smirnov D, Sokolov A, Vijayaraja S, Cochrane G. The European Nucleotide Archive in 2019. Nucleic Acids Res 2020; 48:D70-D76. [PMID: 31722421 PMCID: PMC7145635 DOI: 10.1093/nar/gkz1063] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 09/04/2019] [Revised: 10/25/2019] [Accepted: 11/07/2019] [Indexed: 11/12/2022] Open
Abstract
The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena) at the European Molecular Biology Laboratory's European Bioinformatics Institute provides open and freely available data deposition and access services across the spectrum of nucleotide sequence data types. Making the world's public sequencing datasets available to the scientific community, the ENA represents a globally comprehensive nucleotide sequence resource. Here, we outline ENA services and content in 2019 and provide an insight into selected key areas of development in this period.
Affiliation(s)
- Clara Amid
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Blaise T F Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tony Burdett
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josephine Burgin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jun Fan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Peter W Harrison
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sam Holt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Abdulrahman Hussein
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Eugene Ivanov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suran Jayathilaka
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Simon Kay
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Thomas Keane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rasko Leinonen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Xin Liu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josue Martinez-Villacorta
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Annalisa Milano
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Amir Pakseresht
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nadim Rahman
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jeena Rajan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Kethi Reddy
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Edward Richards
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dmitriy Smirnov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alexey Sokolov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Senthilnathan Vijayaraja
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
6
Xavier BB, Mysara M, Bolzan M, Ribeiro-Gonçalves B, Alako BTF, Harrison P, Lammens C, Kumar-Singh S, Goossens H, Carriço JA, Cochrane G, Malhotra-Kumar S. BacPipe: A Rapid, User-Friendly Whole-Genome Sequencing Pipeline for Clinical Diagnostic Bacteriology. iScience 2019; 23:100769. [PMID: 31887656 PMCID: PMC6941874 DOI: 10.1016/j.isci.2019.100769] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 06/03/2019] [Revised: 10/21/2019] [Accepted: 12/09/2019] [Indexed: 02/07/2023] Open
Abstract
Despite rapid advances in whole genome sequencing (WGS) technologies, their integration into routine microbiological diagnostics has been hampered by the lack of standardized downstream bioinformatics analysis. We developed a comprehensive and computationally low-resource bioinformatics pipeline (BacPipe) enabling direct analyses of bacterial whole-genome sequences (raw reads or contigs) obtained from second- or third-generation sequencing technologies. A graphical user interface was developed to visualize real-time progression of the analysis. The scalability and speed of BacPipe in handling large datasets was demonstrated using 4,139 Illumina paired-end sequence files of publicly available bacterial genomes (2.9–5.4 Mb) from the European Nucleotide Archive. BacPipe is integrated in EBI-SELECTA, a project-specific portal (H2020-COMPARE), and is available as an independent docker image that can be used across Windows- and Unix-based systems. BacPipe offers a fully automated “one-stop” bacterial WGS analysis pipeline to overcome the major hurdle of WGS data analysis in hospitals and public-health and for infection control monitoring.
- BacPipe is an automated whole genome sequencing pipeline
- Interactive user-friendly GUI
- BacPipe can process raw reads, contigs, or scaffolds
- Time-to-analysis for a 5 Mb genome is ∼30–40 min
Affiliation(s)
- Basil B Xavier
  - Laboratory of Medical Microbiology, Campus Drie Eiken, University of Antwerp, S6, Universiteitsplein 1, B-2610 Wilrijk, Belgium; Vaccine & Infectious Disease Institute, University of Antwerp, Antwerp 2610, Belgium
- Mohamed Mysara
  - Laboratory of Medical Microbiology, Campus Drie Eiken, University of Antwerp, S6, Universiteitsplein 1, B-2610 Wilrijk, Belgium; Vaccine & Infectious Disease Institute, University of Antwerp, Antwerp 2610, Belgium; Microbiology Unit, Belgian Nuclear Research Center (SCK•CEN), Mol 2400, Belgium
- Mattia Bolzan
  - Laboratory of Medical Microbiology, Campus Drie Eiken, University of Antwerp, S6, Universiteitsplein 1, B-2610 Wilrijk, Belgium
- Bruno Ribeiro-Gonçalves
  - Instituto de Microbiologia and Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Av. Professor Egaz Moniz, Lisboa 1649-028, Portugal
- Blaise T F Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SD, UK
- Peter Harrison
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SD, UK
- Christine Lammens
  - Laboratory of Medical Microbiology, Campus Drie Eiken, University of Antwerp, S6, Universiteitsplein 1, B-2610 Wilrijk, Belgium; Vaccine & Infectious Disease Institute, University of Antwerp, Antwerp 2610, Belgium
- Samir Kumar-Singh
  - Laboratory of Medical Microbiology, Campus Drie Eiken, University of Antwerp, S6, Universiteitsplein 1, B-2610 Wilrijk, Belgium; Molecular Pathology Group, Cell Biology and Histology, University of Antwerp, Antwerp 2610, Belgium
- Herman Goossens
  - Laboratory of Medical Microbiology, Campus Drie Eiken, University of Antwerp, S6, Universiteitsplein 1, B-2610 Wilrijk, Belgium; Vaccine & Infectious Disease Institute, University of Antwerp, Antwerp 2610, Belgium
- João A Carriço
  - Instituto de Microbiologia and Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Av. Professor Egaz Moniz, Lisboa 1649-028, Portugal
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SD, UK
- Surbhi Malhotra-Kumar
  - Laboratory of Medical Microbiology, Campus Drie Eiken, University of Antwerp, S6, Universiteitsplein 1, B-2610 Wilrijk, Belgium; Vaccine & Infectious Disease Institute, University of Antwerp, Antwerp 2610, Belgium.
7
Amid C, Pakseresht N, Silvester N, Jayathilaka S, Lund O, Dynovski LD, Pataki BÁ, Visontai D, Xavier BB, Alako BTF, Belka A, Cisneros JLB, Cotten M, Haringhuizen GB, Harrison PW, Höper D, Holt S, Hundahl C, Hussein A, Kaas RS, Liu X, Leinonen R, Malhotra-Kumar S, Nieuwenhuijse DF, Rahman N, dos S Ribeiro C, Skiby JE, Schmitz D, Stéger J, Szalai-Gindl JM, Thomsen MCF, Cacciò SM, Csabai I, Kroneman A, Koopmans M, Aarestrup F, Cochrane G. The COMPARE Data Hubs. Database (Oxford) 2019; 2019:baz136. [PMID: 31868882 PMCID: PMC6927095 DOI: 10.1093/database/baz136] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 02/21/2019] [Revised: 11/06/2019] [Accepted: 11/07/2019] [Indexed: 11/12/2022]
Abstract
Data sharing enables research communities to exchange findings and build upon the knowledge that arises from their discoveries. Areas of public and animal health as well as food safety would benefit from rapid data sharing when it comes to emergencies. However, ethical, regulatory and institutional challenges, as well as lack of suitable platforms which provide an infrastructure for data sharing in structured formats, often lead to data not being shared or at most shared in form of supplementary materials in journal publications. Here, we describe an informatics platform that includes workflows for structured data storage, managing and pre-publication sharing of pathogen sequencing data and its analysis interpretations with relevant stakeholders.
Affiliation(s)
- Clara Amid
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nima Pakseresht
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nicole Silvester
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suran Jayathilaka
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ole Lund
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Lukasz D Dynovski
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Bálint Á Pataki
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest 1117, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest 1121, Hungary
- Dávid Visontai
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest 1117, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest 1121, Hungary
- Basil Britto Xavier
  - Laboratory of Medical Microbiology, Vaccine and Infectious Disease Institute, University of Antwerp, Wilrijk 2610, Belgium
- Blaise T F Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ariane Belka
  - Institute of Diagnostic Virology, Friedrich-Loeffler-Institut, Greifswald 17493, Germany
- Jose L B Cisneros
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Matthew Cotten
  - Department of Viroscience, Erasmus Medical Center, Rotterdam 3015, Netherlands
- George B Haringhuizen
  - National Institute for Public Health and the Environment (RIVM), Bilthoven 3720, The Netherlands
- Peter W Harrison
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dirk Höper
  - Institute of Diagnostic Virology, Friedrich-Loeffler-Institut, Greifswald 17493, Germany
- Sam Holt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Camilla Hundahl
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Abdulrahman Hussein
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rolf S Kaas
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Xin Liu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rasko Leinonen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Surbhi Malhotra-Kumar
  - Laboratory of Medical Microbiology, Vaccine and Infectious Disease Institute, University of Antwerp, Wilrijk 2610, Belgium
- Nadim Rahman
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Carolina dos S Ribeiro
  - National Institute for Public Health and the Environment (RIVM), Bilthoven 3720, The Netherlands
- Jeffrey E Skiby
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Dennis Schmitz
  - Department of Viroscience, Erasmus Medical Center, Rotterdam 3015, Netherlands
  - National Institute for Public Health and the Environment (RIVM), Bilthoven 3720, The Netherlands
- József Stéger
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest 1117, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest 1121, Hungary
- János M Szalai-Gindl
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest 1117, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest 1121, Hungary
- Martin C F Thomsen
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Simone M Cacciò
  - European Union Reference Laboratory for Parasites, Istituto Superiore di Sanità (ISS), Rome 00161, Italy
- István Csabai
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest 1117, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest 1121, Hungary
- Annelies Kroneman
  - National Institute for Public Health and the Environment (RIVM), Bilthoven 3720, The Netherlands
- Marion Koopmans
  - Department of Viroscience, Erasmus Medical Center, Rotterdam 3015, Netherlands
- Frank Aarestrup
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
8
|
Flueck C, Bartfai R, Niederwieser I, Witmer K, Alako BTF, Moes S, Bozdech Z, Jenoe P, Stunnenberg HG, Voss TS. A major role for the Plasmodium falciparum ApiAP2 protein PfSIP2 in chromosome end biology. PLoS Pathog 2010; 6:e1000784. [PMID: 20195509 PMCID: PMC2829057 DOI: 10.1371/journal.ppat.1000784] [Citation(s) in RCA: 138] [Impact Index Per Article: 9.9] [Received: 10/20/2009] [Accepted: 01/20/2010] [Indexed: 12/30/2022]
Abstract
The heterochromatic environment and physical clustering of chromosome ends at the nuclear periphery provide a functional and structural framework for antigenic variation and evolution of subtelomeric virulence gene families in the malaria parasite Plasmodium falciparum. While recent studies assigned important roles for reversible histone modifications, silent information regulator 2 and heterochromatin protein 1 (PfHP1) in epigenetic control of variegated expression, factors involved in the recruitment and organization of subtelomeric heterochromatin remain unknown. Here, we describe the purification and characterization of PfSIP2, a member of the ApiAP2 family of putative transcription factors, as the unknown nuclear factor interacting specifically with cis-acting SPE2 motif arrays in subtelomeric domains. Interestingly, SPE2 is not bound by the full-length protein but rather by a 60kDa N-terminal domain, PfSIP2-N, which is released during schizogony. Our experimental re-definition of the SPE2/PfSIP2-N interaction highlights the strict requirement of both adjacent AP2 domains and a conserved bipartite SPE2 consensus motif for high-affinity binding. Genome-wide in silico mapping identified 777 putative binding sites, 94% of which cluster in heterochromatic domains upstream of subtelomeric var genes and in telomere-associated repeat elements. Immunofluorescence and chromatin immunoprecipitation (ChIP) assays revealed co-localization of PfSIP2-N with PfHP1 at chromosome ends. Genome-wide ChIP demonstrated the exclusive binding of PfSIP2-N to subtelomeric SPE2 landmarks in vivo but not to single chromosome-internal sites. Consistent with this specialized distribution pattern, PfSIP2-N over-expression has no effect on global gene transcription. 
Hence, contrary to the previously proposed role for this factor in gene activation, our results provide strong evidence for the first time for the involvement of an ApiAP2 factor in heterochromatin formation and genome integrity. These findings are highly relevant for our understanding of chromosome end biology and variegated expression in P. falciparum and other eukaryotes, and for the future analysis of the role of ApiAP2-DNA interactions in parasite biology.
Affiliation(s)
- Christian Flueck
- Department of Medical Parasitology and Infection Biology, Swiss Tropical Institute, University of Basel, Basel, Switzerland
| | - Richard Bartfai
- Department of Molecular Biology, Nijmegen Center of Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands
| | - Igor Niederwieser
- Department of Medical Parasitology and Infection Biology, Swiss Tropical Institute, University of Basel, Basel, Switzerland
| | - Kathrin Witmer
- Department of Medical Parasitology and Infection Biology, Swiss Tropical Institute, University of Basel, Basel, Switzerland
| | - Blaise T. F. Alako
- Department of Molecular Biology, Nijmegen Center of Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands
| | - Suzette Moes
- Biozentrum, University of Basel, Basel, Switzerland
| | - Zbynek Bozdech
- School of Biological Sciences, Nanyang Technological University, Singapore
| | - Paul Jenoe
- Biozentrum, University of Basel, Basel, Switzerland
| | - Hendrik G. Stunnenberg
- Department of Molecular Biology, Nijmegen Center of Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands
| | - Till S. Voss
- Department of Medical Parasitology and Infection Biology, Swiss Tropical Institute, University of Basel, Basel, Switzerland
- * E-mail:
| |
|
9
|
Flueck C, Bartfai R, Volz J, Niederwieser I, Salcedo-Amaya AM, Alako BTF, Ehlgen F, Ralph SA, Cowman AF, Bozdech Z, Stunnenberg HG, Voss TS. Plasmodium falciparum heterochromatin protein 1 marks genomic loci linked to phenotypic variation of exported virulence factors. PLoS Pathog 2009; 5:e1000569. [PMID: 19730695 PMCID: PMC2731224 DOI: 10.1371/journal.ppat.1000569] [Citation(s) in RCA: 217] [Impact Index Per Article: 14.5] [Received: 04/06/2009] [Accepted: 08/07/2009] [Indexed: 02/01/2023]
Abstract
Epigenetic processes are the main conductors of phenotypic variation in eukaryotes. The malaria parasite Plasmodium falciparum employs antigenic variation of the major surface antigen PfEMP1, encoded by 60 var genes, to evade acquired immune responses. Antigenic variation of PfEMP1 occurs through in situ switches in mono-allelic var gene transcription, which is PfSIR2-dependent and associated with the presence of repressive H3K9me3 marks at silenced loci. Here, we show that P. falciparum heterochromatin protein 1 (PfHP1) binds specifically to H3K9me3 but not to other repressive histone methyl marks. Based on nuclear fractionation and detailed immuno-localization assays, PfHP1 constitutes a major component of heterochromatin in perinuclear chromosome end clusters. High-resolution genome-wide chromatin immuno-precipitation demonstrates the striking association of PfHP1 with virulence gene arrays in subtelomeric and chromosome-internal islands and a high correlation with previously mapped H3K9me3 marks. These include not only var genes, but also the majority of P. falciparum lineage-specific gene families coding for exported proteins involved in host-parasite interactions. In addition, we identified a number of PfHP1-bound genes that were not enriched in H3K9me3, many of which code for proteins expressed during invasion or at different life cycle stages. Interestingly, PfHP1 is absent from centromeric regions, implying important differences in centromere biology between P. falciparum and its human host. Over-expression of PfHP1 results in an enhancement of variegated expression and highlights the presence of well-defined heterochromatic boundaries. In summary, we identify PfHP1 as a major effector of virulence gene silencing and phenotypic variation. Our results are instrumental for our understanding of this widely used survival strategy in unicellular pathogens.
Affiliation(s)
- Christian Flueck
- Department of Medical Parasitology and Infection Biology, Swiss Tropical Institute, Basle, Switzerland
| | - Richard Bartfai
- Department of Molecular Biology, Nijmegen Center of Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands
| | - Jennifer Volz
- Division of Infection and Immunity, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
| | - Igor Niederwieser
- Department of Medical Parasitology and Infection Biology, Swiss Tropical Institute, Basle, Switzerland
| | - Adriana M. Salcedo-Amaya
- Department of Molecular Biology, Nijmegen Center of Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands
| | - Blaise T. F. Alako
- Department of Molecular Biology, Nijmegen Center of Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands
| | - Florian Ehlgen
- Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Victoria, Australia
| | - Stuart A. Ralph
- Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Victoria, Australia
| | - Alan F. Cowman
- Division of Infection and Immunity, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
| | - Zbynek Bozdech
- School of Biological Sciences, Nanyang Technological University, Singapore
| | - Hendrik G. Stunnenberg
- Department of Molecular Biology, Nijmegen Center of Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands
| | - Till S. Voss
- Department of Medical Parasitology and Infection Biology, Swiss Tropical Institute, Basle, Switzerland
- * E-mail:
| |
|
10
|
Abstract
Phylogenetic analysis and examination of protein domains allow accurate genome annotation and are invaluable for studying the evolution of proteins and protein complexes. However, two sequences can be homologous without sharing statistically significant amino acid or nucleotide identity, presenting a challenging bioinformatics problem. We present TreeDomViewer, a visualization tool available as a web-based interface that combines a phylogenetic tree description, a multiple sequence alignment and InterProScan data for the sequences, and generates a phylogenetic tree that projects the corresponding protein domain information onto the multiple sequence alignment. It thereby makes use of existing domain prediction tools such as InterProScan. TreeDomViewer adopts an evolutionary perspective on how the domain structure of two or more sequences can be aligned and compared, to subsequently infer the function of an unknown homolog. This provides insight into the function assignment of family members that are closely related yet highly divergent in terms of amino acid substitutions. Our tool produces an interactive Scalable Vector Graphics (SVG) image that shows the orthologous relationships and domain content of proteins of interest at a glance. PDF, JPEG or PNG output is also provided. These features make TreeDomViewer a valuable addition to the annotation pipeline of unknown genes or gene products. TreeDomViewer is available at .
Affiliation(s)
- Blaise T. F. Alako
- Laboratory of Bioinformatics, Wageningen University and Research Centre, PO Box 8128, 6700 ET Wageningen, The Netherlands
- Centre for BioSystems Genomics, PO Box 98, 6700 AB Wageningen, The Netherlands
| | - Daphne Rainey
- KEYGENE NV, PO Box 216, 6700 AE Wageningen, The Netherlands
| | - Harm Nijveen
- Laboratory of Bioinformatics, Wageningen University and Research Centre, PO Box 8128, 6700 ET Wageningen, The Netherlands
| | - Jack A. M. Leunissen
- Laboratory of Bioinformatics, Wageningen University and Research Centre, PO Box 8128, 6700 ET Wageningen, The Netherlands
- To whom correspondence should be addressed. Tel: +31 317 482 036; Fax: +31 317 483 584;
| |
|