1
Dooley D, Nguyen MH, Hsiao WWL. OntoTrek: 3D visualization of application ontology class hierarchies. PLoS One 2023; 18:e0286728. [PMID: 37267413 DOI: 10.1371/journal.pone.0286728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 05/22/2023] [Indexed: 06/04/2023]
Abstract
An application ontology often reuses terms from other related, compatible ontologies. The extent of this interconnectedness is not readily apparent when browsing through larger textual presentations of term class hierarchies, whether in Manchester text format OWL files or within an ontology editor like Protégé. Users must either note ontology sources in term identifiers, or look at ontology import file term origins. Diagrammatically, this same information may be easier to perceive in 2-dimensional network or hierarchical graphs that visually code ontology term origins. However, humans, having stereoscopic vision and navigational acuity around colored and textured shapes, should benefit even more from a coherent 3-dimensional interactive visualization of an ontology that takes advantage of perspective to offer both foreground focus on content and a stable background context. We present OntoTrek, a 3D ontology visualizer that enables ontology stakeholders (students, software developers, curation teams, and funders) to recognize the presence of imported terms and their domains, ultimately illustrating how projects can capture knowledge through a vocabulary of interwoven community-supported ontology resources.
Affiliation(s)
- Damion Dooley
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
- Matthew H Nguyen
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
  - Bioinformatics Graduate Program, University of British Columbia, Vancouver, British Columbia, Canada
- William W L Hsiao
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
  - Bioinformatics Graduate Program, University of British Columbia, Vancouver, British Columbia, Canada
2
Savic Kallesoe SA, Rabbani T, Gill EE, Brinkman F, Griffiths EJ, Zawati M, Liu H, Palmour N, Joly Y, Hsiao WWL. Canadians' opinions towards COVID-19 data-sharing: a national cross-sectional survey. BMJ Open 2023; 13:e066418. [PMID: 36750286 PMCID: PMC9905784 DOI: 10.1136/bmjopen-2022-066418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2023]
Abstract
OBJECTIVES: COVID-19 research has significantly contributed to pandemic response and the enhancement of public health capacity. COVID-19 data collected by provincial/territorial health authorities in Canada are valuable for research advancement yet not readily available to the public, including researchers. To inform developments in public health data-sharing in Canada, we explored Canadians' opinions of public health authorities sharing deidentified individual-level COVID-19 data publicly.

DESIGN/SETTING/INTERVENTIONS/OUTCOMES: A national cross-sectional survey was administered in Canada in March 2022, assessing Canadians' opinions on publicly sharing COVID-19 datatypes. Market research firm Léger was employed for recruitment and data collection.

PARTICIPANTS: Anyone aged 18 years or older and currently living in Canada.

RESULTS: 4981 participants completed the survey with a 92.3% response rate. 79.7% were supportive of provincial/territorial authorities publicly sharing deidentified COVID-19 data, while 20.3% were hesitant/averse/unsure. Datatypes most supported for being shared publicly were symptoms (83.0% in support), geographical region (82.6%) and COVID-19 vaccination status (81.7%). Datatypes with the most aversion were employment sector (27.4% averse), postal area (26.7%) and international travel history (19.7%). Generally supportive Canadians were characterised as being ≥50 years, with higher education, and being vaccinated against COVID-19 at least once. Vaccination status was the most influential predictor of data-sharing opinion, with respondents who were ever vaccinated being 4.20 times more likely (95% CI 3.21 to 5.48, p=0.000) to be generally supportive of data-sharing than those unvaccinated.

CONCLUSIONS: These findings suggest that the Canadian public is generally favourable to deidentified data-sharing. Identifying factors that are likely to improve attitudes towards data-sharing is useful to stakeholders involved in data-sharing initiatives, such as public health agencies, in informing the development of public health communication and data-sharing policies. As Canada progresses through the COVID-19 pandemic, and with limited testing and reporting of COVID-19 data, it is essential to improve deidentified data-sharing given the public's general support for these efforts.
Affiliation(s)
- Sarah A Savic Kallesoe
  - Simon Fraser University Faculty of Health Sciences, Burnaby, British Columbia, Canada
  - Department of Public Health and Primary Care, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Tian Rabbani
  - Simon Fraser University Faculty of Health Sciences, Burnaby, British Columbia, Canada
  - School of Kinesiology, The University of British Columbia Faculty of Education, Vancouver, British Columbia, Canada
- Erin E Gill
  - Department of Molecular Biology and Biochemistry, Simon Fraser University Faculty of Sciences, Burnaby, British Columbia, Canada
- Fiona Brinkman
  - Department of Molecular Biology and Biochemistry, Simon Fraser University Faculty of Sciences, Burnaby, British Columbia, Canada
- Emma J Griffiths
  - Simon Fraser University Faculty of Health Sciences, Burnaby, British Columbia, Canada
- Ma'n Zawati
  - Department of Human Genetics, McGill University Faculty of Medicine and Health Sciences, Montreal, Québec, Canada
- Hanshi Liu
  - Department of Human Genetics, McGill University Faculty of Medicine and Health Sciences, Montreal, Québec, Canada
- Nicole Palmour
  - Department of Human Genetics, McGill University Faculty of Medicine and Health Sciences, Montreal, Québec, Canada
- Yann Joly
  - Department of Human Genetics, McGill University Faculty of Medicine and Health Sciences, Montreal, Québec, Canada
- William W L Hsiao
  - Simon Fraser University Faculty of Health Sciences, Burnaby, British Columbia, Canada
3
Gill IS, Griffiths EJ, Dooley D, Cameron R, Savić Kallesøe S, John NS, Sehar A, Gosal G, Alexander D, Chapel M, Croxen MA, Delisle B, Di Tullio R, Gaston D, Duggan A, Guthrie JL, Horsman M, Joshi E, Kearny L, Knox N, Lau L, LeBlanc JJ, Li V, Lyons P, MacKenzie K, McArthur AG, Panousis EM, Palmer J, Prystajecky N, Smith KN, Tanner J, Townend C, Tyler A, Van Domselaar G, Hsiao WWL. The DataHarmonizer: a tool for faster data harmonization, validation, aggregation and analysis of pathogen genomics contextual information. Microb Genom 2023; 9:mgen000908. [PMID: 36748616 PMCID: PMC9973856 DOI: 10.1099/mgen.0.000908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
Pathogen genomics is a critical tool for public health surveillance, infection control, outbreak investigations as well as research. In order to make use of pathogen genomics data, they must be interpreted using contextual data (metadata). Contextual data include sample metadata, laboratory methods, patient demographics, clinical outcomes and epidemiological information. However, the variability in how contextual information is captured by different authorities and how it is encoded in different databases poses challenges for data interpretation, integration and their use/re-use. The DataHarmonizer is a template-driven spreadsheet application for harmonizing, validating and transforming genomics contextual data into submission-ready formats for public or private repositories. The tool's web browser-based JavaScript environment enables validation, and its offline functionality and local installation increase data security. The DataHarmonizer was developed to address the data sharing needs that arose during the COVID-19 pandemic, and was used by members of the Canadian COVID Genomics Network (CanCOGeN) to harmonize SARS-CoV-2 contextual data for national surveillance and for public repository submission. In order to support coordination of international surveillance efforts, we have partnered with the Public Health Alliance for Genomic Epidemiology to also provide a template conforming to its SARS-CoV-2 contextual data specification for use worldwide. Templates are also being developed for One Health and foodborne pathogens. Overall, the DataHarmonizer tool improves the effectiveness and fidelity of contextual data capture as well as its subsequent usability. Harmonization of contextual information across authorities, platforms and systems globally improves interoperability and reusability of data for concerted public health and research initiatives to fight the current pandemic and future public health emergencies. While the tool was initially developed for the COVID-19 pandemic, its expansion to other data management applications and pathogens is already underway.
Affiliation(s)
- Ivan S Gill
  - University of British Columbia, Vancouver, BC, Canada
- Emma J Griffiths
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
- Damion Dooley
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
- Rhiannon Cameron
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
- Nithu Sara John
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
- Anoosha Sehar
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
- Gurinder Gosal
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
- Madison Chapel
  - National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- Matthew A Croxen
  - Alberta Precision Labs, Edmonton, AB, Canada
  - Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, AB, Canada
- Daniel Gaston
  - Department of Pathology and Laboratory Medicine, Nova Scotia Health, Halifax, NS, Canada
- Ana Duggan
  - National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- Mark Horsman
  - National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
  - Public Health Ontario Laboratory, Toronto, ON, Canada
- Esha Joshi
  - Public Health Ontario Laboratory, Toronto, ON, Canada
- Levon Kearny
  - National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- Natalie Knox
  - National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- Lynette Lau
  - The Hospital for Sick Children, Toronto, ON, Canada
- Jason J LeBlanc
  - Department of Pathology and Laboratory Medicine, Nova Scotia Health, Halifax, NS, Canada
- Vincent Li
  - Alberta Precision Labs, Edmonton, AB, Canada
- Pierre Lyons
  - Public Health Agency of Canada, Moncton, NB, Canada
- Andrew G McArthur
  - Michael G. DeGroote Institute for Infectious Disease Research & Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
- Emily M Panousis
  - Michael G. DeGroote Institute for Infectious Disease Research & Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
- John Palmer
  - Public Health Ontario Laboratory, Toronto, ON, Canada
- Natalie Prystajecky
  - University of British Columbia, Vancouver, BC, Canada
  - BCCDC Public Health Laboratory, Vancouver, BC, Canada
- Jennifer Tanner
  - National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- Christopher Townend
  - National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- Andrea Tyler
  - National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- Gary Van Domselaar
  - National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- William W L Hsiao
  - University of British Columbia, Vancouver, BC, Canada
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
4
Liu CC, Hsiao WWL. Large-scale comparative genomics to refine the organization of the global Salmonella enterica population structure. Microb Genom 2022; 8:mgen000906. [PMID: 36748524 PMCID: PMC9837569 DOI: 10.1099/mgen.0.000906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
Abstract
The White-Kauffmann-Le Minor (WKL) scheme is the most widely used Salmonella typing scheme for reporting the disease prevalence of the enteric pathogen. With the advent of whole-genome sequencing (WGS), in silico methods have increasingly replaced traditional serotyping due to reproducibility, speed and coverage. However, despite integrating genomic-based typing by in silico serotyping tools such as SISTR, in silico serotyping in certain contexts remains ambiguous and insufficiently informative. Specifically, in silico serotyping does not attempt to resolve polyphyly. Furthermore, in spite of the widespread acknowledgement of polyphyly from genomic studies, the prevalence of polyphyletic serovars is not well characterized. Here, we applied a genomics approach to acquire the necessary resolution to classify genetically discordant serovars and propose an alternative typing scheme that consistently reflects natural Salmonella populations. By accessing the unprecedented volume of bacterial genomic data publicly available in GenomeTrakr and PubMLST databases (>180 000 genomes representing 723 serovars), we characterized the global Salmonella population structure and systematically identified putative non-monophyletic serovars. The proportion of putative non-monophyletic serovars was estimated to be higher than in previous reports, reinforcing the inability of antigenic determinants to depict the complexity of Salmonella evolutionary history. We explored the extent of genetic diversity masked by serotyping labels and found significant intra-serovar molecular differences across many clinically important serovars. To avoid false discovery due to incorrect in silico serotyping calls, we cross-referenced reported serovar labels and concluded that the error rate in in silico serotyping is low. The combined application of clustering statistics and genome-wide association methods demonstrated effective characterization of stable bacterial populations and explained functional differences. The collective methods adopted in our study have practical value in establishing genomic-based typing nomenclatures for an entire microbial species or closely related subpopulations. Ultimately, we foresee an improved typing scheme to be a hybrid that integrates both genomic and antigenic information such that the resolution from WGS is leveraged to improve the precision of subpopulation classification while preserving the common names defined by the WKL scheme.
Affiliation(s)
- Chao Chun Liu
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- William W. L. Hsiao
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
  - *Correspondence: William W. L. Hsiao
5
Alcock BP, Huynh W, Chalil R, Smith KW, Raphenya A, Wlodarski MA, Edalatmand A, Petkau A, Syed SA, Tsang KK, Baker SJC, Dave M, McCarthy M, Mukiri KM, Nasir JA, Golbon B, Imtiaz H, Jiang X, Kaur K, Kwong M, Liang ZC, Niu KC, Shan P, Yang JYJ, Gray K, Hoad GR, Jia B, Bhando T, Carfrae L, Farha M, French S, Gordzevich R, Rachwalski K, Tu M, Bordeleau E, Dooley D, Griffiths E, Zubyk HL, Brown ED, Maguire F, Beiko R, Hsiao WWL, Brinkman FSL, Van Domselaar G, McArthur AG. CARD 2023: expanded curation, support for machine learning, and resistome prediction at the Comprehensive Antibiotic Resistance Database. Nucleic Acids Res 2022; 51:D690-D699. [PMID: 36263822 PMCID: PMC9825576 DOI: 10.1093/nar/gkac920] [Citation(s) in RCA: 187] [Impact Index Per Article: 93.5] [Received: 09/15/2022] [Revised: 10/03/2022] [Accepted: 10/11/2022] [Indexed: 01/30/2023]
Abstract
The Comprehensive Antibiotic Resistance Database (CARD; card.mcmaster.ca) combines the Antibiotic Resistance Ontology (ARO) with curated AMR gene (ARG) sequences and resistance-conferring mutations to provide an informatics framework for annotation and interpretation of resistomes. As of version 3.2.4, CARD encompasses 6627 ontology terms, 5010 reference sequences, 1933 mutations, 3004 publications, and 5057 AMR detection models that can be used by the accompanying Resistance Gene Identifier (RGI) software to annotate genomic or metagenomic sequences. Focused curation enhancements since 2020 include expanded β-lactamase curation, incorporation of likelihood-based AMR mutations for Mycobacterium tuberculosis, addition of disinfectants and antiseptics plus their associated ARGs, and systematic curation of resistance-modifying agents. This expanded curation includes 180 new AMR gene families, 15 new drug classes, 1 new resistance mechanism, and two new ontological relationships: evolutionary_variant_of and is_small_molecule_inhibitor. In silico prediction of resistomes and prevalence statistics of ARGs has been expanded to 377 pathogens, 21,079 chromosomes, 2,662 genomic islands, 41,828 plasmids and 155,606 whole-genome shotgun assemblies, resulting in collation of 322,710 unique ARG allele sequences. New features include the CARD:Live collection of community-submitted isolate resistome data and the introduction of standardized 15-character CARD Short Names for ARGs to support machine learning efforts.
Affiliation(s)
- Brian P Alcock
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- William Huynh
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Romeo Chalil
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Keaton W Smith
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Amogelang R Raphenya
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Mateusz A Wlodarski
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Arman Edalatmand
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Aaron Petkau
  - Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada
  - National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, Manitoba, Canada
- Sohaib A Syed
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Kara K Tsang
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Sheridan J C Baker
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Mugdha Dave
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Madeline C McCarthy
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Karyn M Mukiri
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Jalees A Nasir
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Bahar Golbon
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Hamna Imtiaz
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Xingjian Jiang
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Komal Kaur
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Megan Kwong
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Zi Cheng Liang
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Keyu C Niu
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Prabakar Shan
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Jasmine Y J Yang
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Kristen L Gray
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Gemma R Hoad
  - Research Computing Group, Simon Fraser University, Burnaby, British Columbia, Canada
- Baofeng Jia
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Timsy Bhando
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Lindsey A Carfrae
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Maya A Farha
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Shawn French
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Rodion Gordzevich
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Kenneth Rachwalski
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Megan M Tu
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Emily Bordeleau
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Damion Dooley
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
- Emma Griffiths
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
- Haley L Zubyk
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Eric D Brown
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
- Finlay Maguire
  - Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada
  - Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Community Health & Epidemiology, Dalhousie University, Halifax, Nova Scotia, Canada
- Robert G Beiko
  - Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada
  - Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
- William W L Hsiao
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
- Fiona S L Brinkman
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Gary Van Domselaar
  - National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, Manitoba, Canada
  - Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, University of Manitoba, Winnipeg, Manitoba, Canada
- Andrew G McArthur
  - To whom correspondence should be addressed. Tel: +1 905 525 9140 (Ext 21663);
6
Griffiths EJ, Timme RE, Mendes CI, Page AJ, Alikhan NF, Fornika D, Maguire F, Campos J, Park D, Olawoye IB, Oluniyi PE, Anderson D, Christoffels A, da Silva AG, Cameron R, Dooley D, Katz LS, Black A, Karsch-Mizrachi I, Barrett T, Johnston A, Connor TR, Nicholls SM, Witney AA, Tyson GH, Tausch SH, Raphenya AR, Alcock B, Aanensen DM, Hodcroft E, Hsiao WWL, Vasconcelos ATR, MacCannell DR. Future-proofing and maximizing the utility of metadata: The PHA4GE SARS-CoV-2 contextual data specification package. Gigascience 2022; 11:6529104. [PMID: 35169842 PMCID: PMC8847733 DOI: 10.1093/gigascience/giac003] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 08/12/2021] [Revised: 12/15/2021] [Accepted: 01/07/2022] [Indexed: 12/23/2022]
Abstract
Background: The Public Health Alliance for Genomic Epidemiology (PHA4GE) (https://pha4ge.org) is a global coalition that is actively working to establish consensus standards, document and share best practices, improve the availability of critical bioinformatics tools and resources, and advocate for greater openness, interoperability, accessibility, and reproducibility in public health microbial bioinformatics. In the face of the current pandemic, PHA4GE has identified a need for a fit-for-purpose, open-source SARS-CoV-2 contextual data standard. Results: As such, we have developed a SARS-CoV-2 contextual data specification package based on harmonizable, publicly available community standards. The specification can be implemented via a collection template, as well as an array of protocols and tools to support both the harmonization and submission of sequence data and contextual information to public biorepositories. Conclusions: Well-structured, rich contextual data add value, promote reuse, and enable aggregation and integration of disparate datasets. Adoption of the proposed standard and practices will better enable interoperability between datasets and systems, improve the consistency and utility of generated data, and ultimately facilitate novel insights and discoveries in SARS-CoV-2 and COVID-19. The package is now supported by the NCBI’s BioSample database.
Affiliation(s)
| | - Ruth E Timme
- Center for Food Safety and Applied Nutrition, U.S. Food and Drug Administration, College Park, MD 20740, USA
| | - Catarina Inês Mendes
- Instituto de Microbiologia, Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Lisboa 1649-028, Portugal
| | - Andrew J Page
- Microbes in the Food Chain, Quadram Institute Bioscience, Norwich, Norfolk NR4 7UQ, UK
| | - Nabil-Fareed Alikhan
- Microbes in the Food Chain, Quadram Institute Bioscience, Norwich, Norfolk NR4 7UQ, UK
| | - Dan Fornika
- BC Centre for Disease Control Public Health Laboratory, Vancouver, BC V5Z 4R4, Canada
| | - Finlay Maguire
- Faculty of Computer Science, Dalhousie University, Halifax, NS B3H 1W5, Canada
| | - Josefina Campos
- INEI-ANLIS “Dr Carlos G. Malbrán,” Buenos Aires C1282AFF, Argentina
| | - Daniel Park
- Infectious Disease and Microbiome Program, The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Idowu B Olawoye
- African Center of Excellence for Genomics of Infectious Diseases (ACEGID), Redeemer's University, Ede, Osun State 232103, Nigeria
- Department of Biological Sciences, College of Natural Sciences, Redeemer's University, Ede, Osun State 232103, Nigeria
| | - Paul E Oluniyi
- African Center of Excellence for Genomics of Infectious Diseases (ACEGID), Redeemer's University, Ede, Osun State 232103, Nigeria
- Department of Biological Sciences, College of Natural Sciences, Redeemer's University, Ede, Osun State 232103, Nigeria
| | - Dominique Anderson
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville 7530, South Africa
| | - Alan Christoffels
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville 7530, South Africa
| | - Anders Gonçalves da Silva
- Microbiological Diagnostic Unit Public Health Laboratory, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC 3000, Australia
| | - Rhiannon Cameron
- Faculty of Health Sciences, Simon Fraser University, Burnaby V5A 1S6, BC, Canada
| | - Damion Dooley
- Faculty of Health Sciences, Simon Fraser University, Burnaby V5A 1S6, BC, Canada
| | - Lee S Katz
- Center for Food Safety, University of Georgia, Atlanta, GA 30333, USA
- Office of Advanced Molecular Detection, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, GA 30333, USA
| | - Allison Black
- Department of Epidemiology, University of Washington, WA 98109, USA
| | - Ilene Karsch-Mizrachi
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Tanya Barrett
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Anjanette Johnston
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Thomas R Connor
- Organisms and Environment Division, School of Biosciences, Cardiff University, Cardiff CF10 3AX, UK
- Public Health Wales, University Hospital of Wales, Cardiff CF14 4XW, UK
| | | | - Adam A Witney
- Institute for Infection and Immunity, St George's, University of London, London SW17 0RE, UK
| | - Gregory H Tyson
- Center for Veterinary Medicine, U.S. Food and Drug Administration, Laurel, MD 20708, USA
| | - Simon H Tausch
- Department of Biological Safety, German Federal Institute for Risk Assessment, Berlin 12277, Germany
| | - Amogelang R Raphenya
- Department of Biochemistry and Biomedical Sciences and the Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, ON L8S 4L8, Canada
| | - Brian Alcock
- Department of Biochemistry and Biomedical Sciences and the Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, ON L8S 4L8, Canada
| | - David M Aanensen
- Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Cambridge CB10 1SA, UK
- The Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7LF, UK
| | - Emma Hodcroft
- Biozentrum, University of Basel, Basel 3012, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - William W L Hsiao
- Faculty of Health Sciences, Simon Fraser University, Burnaby V5A 1S6, BC, Canada
- BC Centre for Disease Control Public Health Laboratory, Vancouver, BC V5Z 4R4, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC V6T 1Z7, Canada
| | - Ana Tereza R Vasconcelos
- Bioinformatics Laboratory National Laboratory of Scientific Computation LNCC/MCTI, Petrópolis 25651-075, Brazil
| | - Duncan R MacCannell
- Office of Advanced Molecular Detection, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, GA 30333, USA
| |
|
7
|
Alcock BP, Raphenya AR, Lau TTY, Tsang KK, Bouchard M, Edalatmand A, Huynh W, Nguyen ALV, Cheng AA, Liu S, Min SY, Miroshnichenko A, Tran HK, Werfalli RE, Nasir JA, Oloni M, Speicher DJ, Florescu A, Singh B, Faltyn M, Hernandez-Koutoucheva A, Sharma AN, Bordeleau E, Pawlowski AC, Zubyk HL, Dooley D, Griffiths E, Maguire F, Winsor GL, Beiko RG, Brinkman FSL, Hsiao WWL, Domselaar GV, McArthur AG. CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database. Nucleic Acids Res 2020; 48:D517-D525. [PMID: 31665441 PMCID: PMC7145624 DOI: 10.1093/nar/gkz935] [Citation(s) in RCA: 1091] [Impact Index Per Article: 272.8] [Received: 09/23/2019] [Revised: 10/03/2019] [Accepted: 10/08/2019] [Indexed: 02/06/2023] Open
Abstract
The Comprehensive Antibiotic Resistance Database (CARD; https://card.mcmaster.ca) is a curated resource providing reference DNA and protein sequences, detection models and bioinformatics tools on the molecular basis of bacterial antimicrobial resistance (AMR). CARD focuses on providing high-quality reference data and molecular sequences within a controlled vocabulary, the Antibiotic Resistance Ontology (ARO), designed by the CARD biocuration team to integrate with software development efforts for resistome analysis and prediction, such as CARD's Resistance Gene Identifier (RGI) software. Since 2017, CARD has expanded through extensive curation of reference sequences, revision of the ontological structure, curation of over 500 new AMR detection models, development of a new classification paradigm and expansion of analytical tools. Most notably, a new Resistomes & Variants module provides analysis and statistical summary of in silico predicted resistance variants from 82 pathogens and over 100 000 genomes. By adding these resistance variants to CARD, we are able to summarize predicted resistance using the information included in CARD, identify trends in AMR mobility and determine previously undescribed and novel resistance variants. Here, we describe updates and recent expansions to CARD and its biocuration process, including new resources for community biocuration of AMR molecular reference data.
Affiliation(s)
- Brian P Alcock
- David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Amogelang R Raphenya
- David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Tammy T Y Lau
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Kara K Tsang
- David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Mégane Bouchard
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Bachelor of Health Sciences Program, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Arman Edalatmand
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - William Huynh
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Anna-Lisa V Nguyen
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Bachelor of Health Sciences Program, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Annie A Cheng
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Sihan Liu
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Sally Y Min
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Anatoly Miroshnichenko
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Hiu-Ki Tran
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Rafik E Werfalli
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Jalees A Nasir
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Martins Oloni
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - David J Speicher
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Alexandra Florescu
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Bachelor of Health Sciences Program, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Bhavya Singh
- Honours Biology Program, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Mateusz Faltyn
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Bachelor of Arts & Science Program, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | | | - Arjun N Sharma
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Emily Bordeleau
- David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Andrew C Pawlowski
- Department of Genetics, Harvard Medical School, Harvard University, Boston, MA 02115, USA
| | - Haley L Zubyk
- David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| | - Damion Dooley
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, V6T 2B5, British Columbia, Canada
| | - Emma Griffiths
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, V5A 1S6, Canada
| | - Finlay Maguire
- Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, B3H 1W5, Canada
| | - Geoff L Winsor
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, V5A 1S6, Canada
| | - Robert G Beiko
- Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, B3H 1W5, Canada
| | - Fiona S L Brinkman
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, V5A 1S6, Canada
| | - William W L Hsiao
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, V6T 2B5, British Columbia, Canada; Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, V5A 1S6, Canada; British Columbia Centre for Disease Control Public Health Laboratory, Vancouver, British Columbia, V5Z 4R4, Canada
| | - Gary V Domselaar
- National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, Manitoba, R3E 3R2, Canada; Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, University of Manitoba, Winnipeg, Manitoba, R3E 0J9, Canada
| | - Andrew G McArthur
- David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, L8S 4K1, Canada; Department of Biochemistry and Biomedical Science, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| |
|
8
|
Dooley DM, Griffiths EJ, Gosal GS, Buttigieg PL, Hoehndorf R, Lange MC, Schriml LM, Brinkman FSL, Hsiao WWL. FoodOn: a harmonized food ontology to increase global food traceability, quality control and data integration. NPJ Sci Food 2018; 2:23. [PMID: 31304272 PMCID: PMC6550238 DOI: 10.1038/s41538-018-0032-6] [Citation(s) in RCA: 84] [Impact Index Per Article: 14.0] [Received: 02/12/2018] [Accepted: 09/25/2018] [Indexed: 11/09/2022] Open
Abstract
The construction of high capacity data sharing networks to support increasing government and commercial data exchange has highlighted a key roadblock: the content of existing Internet-connected information remains siloed due to a multiplicity of local languages and data dictionaries. This lack of a digital lingua franca is obvious in the domain of human food as materials travel from their wild or farm origin, through processing and distribution chains, to consumers. Well defined, hierarchical vocabulary, connected with logical relationships (in other words, an ontology) is urgently needed to help tackle data harmonization problems that span the domains of food security, safety, quality, production, distribution, and consumer health and convenience. FoodOn (http://foodon.org) is a consortium-driven project to build a comprehensive and easily accessible global farm-to-fork ontology about food, that accurately and consistently describes foods commonly known in cultures from around the world. FoodOn addresses food product terminology gaps and supports food traceability. Focusing on human and domesticated animal food description, FoodOn contains animal and plant food sources, food categories and products, and other facets like preservation processes, contact surfaces, and packaging. Much of FoodOn's vocabulary comes from transforming LanguaL, a mature and popular food indexing thesaurus, into a World Wide Web Consortium (W3C) OWL Web Ontology Language-formatted vocabulary that provides system interoperability, quality control, and software-driven intelligence. FoodOn complements other technologies facilitating food traceability, which is becoming critical in this age of increasing globalization of food networks.
Affiliation(s)
- Damion M. Dooley
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC Canada
| | - Emma J. Griffiths
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC Canada
- Present Address: Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC Canada
| | - Gurinder S. Gosal
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC Canada
| | - Pier L. Buttigieg
- Alfred-Wegener-Institut, Helmholtz-Zentrum für Polar- und Meeresforschung, Bremen, Germany
| | - Robert Hoehndorf
- King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - Matthew C. Lange
- Department of Food Science and Technology, UC Davis, Davis, CA USA
| | - Lynn M. Schriml
- Epidemiology & Public Health, University of Maryland School of Medicine, Baltimore, MD USA
| | - Fiona S. L. Brinkman
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC Canada
| | - William W. L. Hsiao
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC Canada
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC Canada
- British Columbia Centre for Disease Control Public Health Laboratory, Vancouver, BC Canada
| |
|
9
|
Griffiths E, Dooley D, Graham M, Van Domselaar G, Brinkman FSL, Hsiao WWL. Context Is Everything: Harmonization of Critical Food Microbiology Descriptors and Metadata for Improved Food Safety and Surveillance. Front Microbiol 2017; 8:1068. [PMID: 28694792 PMCID: PMC5483436 DOI: 10.3389/fmicb.2017.01068] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 02/21/2017] [Accepted: 05/29/2017] [Indexed: 11/18/2022] Open
Abstract
Globalization of food networks increases opportunities for the spread of foodborne pathogens beyond borders and jurisdictions. High resolution whole-genome sequencing (WGS) subtyping of pathogens promises to vastly improve our ability to track and control foodborne disease, but to do so it must be combined with epidemiological, clinical, laboratory and other health care data (called “contextual data”) to be meaningfully interpreted for regulatory and health interventions, outbreak investigation, and risk assessment. However, current multi-jurisdictional pathogen surveillance and investigation efforts are complicated by time-consuming data re-entry, curation and integration of contextual information owing to a lack of interoperable standards and inconsistent reporting. A solution to these challenges is the use of ‘ontologies’ - hierarchies of well-defined and standardized vocabularies interconnected by logical relationships. Terms are specified by universal IDs enabling integration into highly regulated areas and multi-sector sharing (e.g., food and water microbiology with the veterinary sector). Institution-specific terms can be mapped to a given standard at different levels of granularity, maximizing comparability of contextual information according to jurisdictional policies. Fit-for-purpose ontologies provide contextual information with the auditability required for food safety laboratory accreditation. Our research efforts include the development of a Genomic Epidemiology Ontology (GenEpiO), and Food Ontology (FoodOn) that harmonize important laboratory, clinical and epidemiological data fields, as well as existing food resources. These efforts are supported by a global consortium of researchers and stakeholders worldwide. Since foodborne diseases do not respect international borders, uptake of such vocabularies will be crucial for multi-jurisdictional interpretation of WGS results and data sharing.
Affiliation(s)
- Emma Griffiths
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Vancouver, BC, Canada
| | - Damion Dooley
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada
| | - Morag Graham
- National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada; Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, University of Manitoba, Winnipeg, MB, Canada
| | - Gary Van Domselaar
- National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada; Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, University of Manitoba, Winnipeg, MB, Canada
| | - Fiona S L Brinkman
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Vancouver, BC, Canada
| | - William W L Hsiao
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada; British Columbia Centre for Disease Control Public Health Laboratory, Vancouver, BC, Canada
| |
|
10
|
Van Rossum T, Peabody MA, Uyaguari-Diaz MI, Cronin KI, Chan M, Slobodan JR, Nesbitt MJ, Suttle CA, Hsiao WWL, Tang PKC, Prystajecky NA, Brinkman FSL. Year-Long Metagenomic Study of River Microbiomes Across Land Use and Water Quality. Front Microbiol 2015; 6:1405. [PMID: 26733955 PMCID: PMC4681185 DOI: 10.3389/fmicb.2015.01405] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 09/03/2015] [Accepted: 11/25/2015] [Indexed: 01/04/2023] Open
Abstract
Select bacteria, such as Escherichia coli or coliforms, have been widely used as sentinels of low water quality; however, there are concerns regarding their predictive accuracy for the protection of human and environmental health. To develop improved monitoring systems, a greater understanding of bacterial community structure, function, and variability across time is required in the context of different pollution types, such as agricultural and urban contamination. Here, we present a year-long survey of free-living bacterial DNA collected from seven sites along rivers in three watersheds with varying land use in Southwestern Canada. This is the first study to examine the bacterial metagenome in flowing freshwater (lotic) environments over such a time span, providing an opportunity to describe bacterial community variability as a function of land use and environmental conditions. Characteristics of the metagenomic data, such as sequence composition and average genome size (AGS), vary with sampling site, environmental conditions, and water chemistry. For example, AGS was correlated with hours of daylight in the agricultural watershed and, across the agriculturally and urban-affected sites, k-mer composition clustering corresponded to nutrient concentrations. In addition to indicating a community shift, this change in AGS has implications in terms of the normalization strategies required, and considerations surrounding such strategies in general are discussed. When comparing abundances of gene functional groups between high- and low-quality water samples collected from an agricultural area, the latter had a higher abundance of nutrient metabolism and bacteriophage groups, possibly reflecting an increase in agricultural runoff. This work presents a valuable dataset representing a year of monthly sampling across watersheds and an analysis targeted at establishing a foundational understanding of how bacterial lotic communities vary across time and land use. 
The results provide important context for future studies, including further analyses of watershed ecosystem health, and the identification and development of biomarkers for improved water quality monitoring systems.
Affiliation(s)
- Thea Van Rossum
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
| | - Michael A Peabody
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
| | - Miguel I Uyaguari-Diaz
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada
| | - Kirby I Cronin
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada
| | - Michael Chan
- British Columbia Public Health Microbiology and Reference Laboratory, British Columbia Centre for Disease Control, Vancouver, BC, Canada
| | | | | | - Curtis A Suttle
- Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC, Canada; Department of Earth, Ocean and Atmospheric Sciences, University of British Columbia, Vancouver, BC, Canada; Department of Botany, University of British Columbia, Vancouver, BC, Canada; Canadian Institute for Advanced Research, Toronto, ON, Canada
| | - William W L Hsiao
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada; British Columbia Public Health Microbiology and Reference Laboratory, British Columbia Centre for Disease Control, Vancouver, BC, Canada
| | - Patrick K C Tang
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada; British Columbia Public Health Microbiology and Reference Laboratory, British Columbia Centre for Disease Control, Vancouver, BC, Canada
| | - Natalie A Prystajecky
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada; British Columbia Public Health Microbiology and Reference Laboratory, British Columbia Centre for Disease Control, Vancouver, BC, Canada
| | - Fiona S L Brinkman
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
| |
|
11
|
Dooley DM, Petkau AJ, Van Domselaar G, Hsiao WWL. Sequence database versioning for command line and Galaxy bioinformatics servers. Bioinformatics 2015; 32:1275-7. [PMID: 26656932 PMCID: PMC4824126 DOI: 10.1093/bioinformatics/btv724] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 08/12/2015] [Accepted: 12/06/2015] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: There are various reasons for rerunning bioinformatics tools and pipelines on sequencing data, including reproducing a past result, validation of a new tool or workflow using a known dataset, or tracking the impact of database changes. For identical results to be achieved, regularly updated reference sequence databases must be versioned and archived. Database administrators have tried to fill the requirements by supplying users with one-off versions of databases, but these are time consuming to set up and are inconsistent across resources. Disk storage and data backup performance has also discouraged maintaining multiple versions of databases since databases such as NCBI nr can consume 50 Gb or more disk space per version, with growth rates that parallel Moore's law. RESULTS: Our end-to-end solution combines our own Kipper software package (a simple key-value large file versioning system) with BioMAJ (software for downloading sequence databases), and Galaxy (a web-based bioinformatics data processing platform). Available versions of databases can be recalled and used by command-line and Galaxy users. The Kipper data store format makes publishing curated FASTA databases convenient since in most cases it can store a range of versions into a file marginally larger than the size of the latest version. AVAILABILITY AND IMPLEMENTATION: Kipper v1.0.0 and the Galaxy Versioned Data tool are written in Python and released as free and open source software available at https://github.com/Public-Health-Bioinformatics/kipper and https://github.com/Public-Health-Bioinformatics/versioned_data, respectively; detailed setup instructions can be found at https://github.com/Public-Health-Bioinformatics/versioned_data/blob/master/doc/setup.md CONTACT: Damion.Dooley@Bccdc.Ca or William.Hsiao@Bccdc.Ca SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Damion M Dooley
- Department of Pathology, University of British Columbia, Vancouver, BC, Canada
- Aaron J Petkau
- National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- Gary Van Domselaar
- National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- William W L Hsiao
- Department of Pathology, University of British Columbia, Vancouver, BC, Canada; BC Public Health Microbiology and Reference Laboratory, Vancouver, BC, Canada
12
Chu HT, Hsiao WWL, Chen JC, Yeh TJ, Tsai MH, Lin H, Liu YW, Lee SA, Chen CC, Tsao TTH, Kao CY. EBARDenovo: highly accurate de novo assembly of RNA-Seq with efficient chimera-detection. Bioinformatics 2013; 29:1004-10. [PMID: 23457040 DOI: 10.1093/bioinformatics/btt092] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 11/13/2022]
Abstract
MOTIVATION High-accuracy de novo assembly of the short sequencing reads from RNA-Seq technology is very challenging. We introduce a de novo assembly algorithm, EBARDenovo, which stands for Extension, Bridging And Repeat-sensing Denovo. This algorithm uses an efficient chimera-detection function to abrogate the effect of aberrant chimeric reads in RNA-Seq data. RESULTS EBARDenovo resolves the complications of RNA-Seq assembly arising from sequencing errors, repetitive sequences and aberrant chimeric amplicons. In a series of assembly experiments, our algorithm is the most accurate among the examined programs, including de Bruijn graph assemblers, Trinity and Oases. AVAILABILITY AND IMPLEMENTATION EBARDenovo is available at http://ebardenovo.sourceforge.net/. This software package (with patent pending) is free of charge for academic use only. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hsueh-Ting Chu
- Department of Biomedical informatics, Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan.
13
Chu HT, Hsiao WWL, Tsao TTH, Chang CM, Liu YW, Fan CC, Lin H, Chang HH, Yeh TJ, Chen JC, Huang DM, Chen CC, Kao CY. Quantitative assessment of mitochondrial DNA copies from whole genome sequencing. BMC Genomics 2012; 13 Suppl 7:S5. [PMID: 23282223 PMCID: PMC3521385 DOI: 10.1186/1471-2164-13-s7-s5] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 12/17/2022] Open
Abstract
Background Mitochondrial dysfunction is associated with various aging diseases. The copy number of mtDNA in human cells may therefore be a potential biomarker for diagnostics of aging. Here we propose a new computational method for the accurate assessment of mtDNA copies from whole genome sequencing data. Results Two families of human whole genome sequencing datasets from the HapMap and the 1000 Genomes projects were used for the accurate counting of mitochondrial DNA copy numbers. The results revealed that the parental mitochondrial DNA copy numbers are significantly lower than those of their children in these samples. There are 8% to 21% more copies of mtDNA in samples from the children than in samples from their parents. The experiment demonstrated possible correlations between the quantity of mitochondrial DNA and aging-related diseases. Conclusions Since next-generation sequencing technology strives to deliver affordable and non-biased sequencing results, accurate assessment of mtDNA copy numbers can be achieved effectively from the output of whole genome sequencing. We implemented the method as a software package, MitoCounter, with the source code and user's guide available to the public at http://sourceforge.net/projects/mitocounter/.
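The abstract does not describe how MitoCounter counts copies, so the following is only a sketch of one widely used approximation, not the authors' method: relative mtDNA copy number per cell is estimated as the ratio of mean mitochondrial read depth to mean autosomal read depth, scaled by the nuclear ploidy. The function names and all depth values are hypothetical.

```python
def mtdna_copies_per_cell(mt_depth: float, autosomal_depth: float, ploidy: int = 2) -> float:
    """Approximate mtDNA copies per cell from mean sequencing depths.

    Assumption (not taken from the paper): autosomes are present in
    `ploidy` copies per cell, so depth ratios translate into copy ratios.
    """
    if autosomal_depth <= 0:
        raise ValueError("autosomal_depth must be positive")
    return ploidy * mt_depth / autosomal_depth


def percent_more(child: float, parent: float) -> float:
    """Percentage by which the child's estimate exceeds the parent's."""
    return 100.0 * (child - parent) / parent


# Hypothetical depths, chosen only to illustrate the arithmetic.
parent = mtdna_copies_per_cell(mt_depth=3000.0, autosomal_depth=30.0)
child = mtdna_copies_per_cell(mt_depth=3450.0, autosomal_depth=30.0)
print(parent, child, percent_more(child, parent))
```

With these made-up depths the child's estimate is 15% higher than the parent's, which falls inside the 8% to 21% range reported in the abstract.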
Affiliation(s)
- Hsueh-Ting Chu
- Department of Computer Science and Information Engineering, National Taiwan University, Taipei 10617, Taiwan
14
Abstract
Summary: Analysis of microbial genomes often requires the general organization and comparison of tens to thousands of genomes both from public repositories and unpublished sources. MicrobeDB provides a foundation for such projects by automating the download of published, completed bacterial and archaeal genomes from key sources, parsing annotations of all genomes (both public and private) into a local database, and allowing interaction with the database through an easy-to-use programming interface. MicrobeDB creates a simple-to-use, easy-to-maintain, centralized local resource for various large-scale comparative genomic analyses and a back-end for future microbial application design. Availability: MicrobeDB is freely available under the GNU-GPL at: http://github.com/mlangill/microbedb/ Contact: morgan.g.i.langille@gmail.com
Affiliation(s)
- Morgan G I Langille
- Department of Biochemistry & Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada.
15
Li L, Hsiao WWL, Nandakumar R, Barbuto SM, Mongodin EF, Paster BJ, Fraser-Liggett CM, Fouad AF. Analyzing endodontic infections by deep coverage pyrosequencing. J Dent Res 2010; 89:980-4. [PMID: 20519493 DOI: 10.1177/0022034510370026] [Citation(s) in RCA: 98] [Impact Index Per Article: 7.0] [Indexed: 01/20/2023] Open
Abstract
Bacterial diversity in endodontic infections has not been sufficiently studied. The use of modern pyrosequencing technology should allow for more comprehensive analysis than traditional Sanger sequencing. This study investigated bacterial diversity in endodontic infections through taxonomic classification based on 16S rRNA gene sequences generated by 454 GS-FLX pyrosequencing and conventional Sanger capillary sequencing technologies. Sequencing was performed on 7 specimens from endodontic infections. On average, 47 sequences per sample were obtained with Sanger sequencing vs. 28,590 with pyrosequencing, representing a 600-fold difference in "depth-of-coverage". Based on Ribosomal Database Project (RDP II) Classifier analysis, pyrosequencing identified 179 bacterial genera in 13 phyla, which was significantly more than Sanger sequencing. The phylum Bacteroidetes was the most prevalent bacterial phylum. These results indicate that bacterial communities in endodontic infections are more diverse than previously demonstrated. In addition, deep-coverage pyrosequencing of the 16S rRNA gene revealed low-abundance micro-organisms with potential clinical implications.
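The "600-fold" figure follows directly from the two per-sample averages quoted above. A minimal check, using only those two numbers (the function name is introduced here for illustration):

```python
def fold_difference(deep: float, shallow: float) -> float:
    """Ratio of the deeper per-sample average to the shallower one."""
    return deep / shallow


sanger_mean = 47        # average sequences per sample, Sanger
pyro_mean = 28_590      # average sequences per sample, pyrosequencing

fold = fold_difference(pyro_mean, sanger_mean)
print(f"{fold:.1f}-fold")  # about 608-fold, reported as roughly 600-fold
```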
Affiliation(s)
- L Li
- Department of Endodontics, Prosthodontics and Operative Dentistry, Dental School, University of Maryland, 650 W. Baltimore St., Baltimore, MD 21201, USA
16
Abstract
BACKGROUND It has been noted that many bacterial virulence factor genes are located within genomic islands (GIs; clusters of genes in a prokaryotic genome of probable horizontal origin). However, such studies have been limited to single genera or isolated observations. We have performed the first large-scale analysis of multiple diverse pathogens to examine this association. We additionally identified genes found predominantly in pathogens, but not non-pathogens, across multiple genera using 631 complete bacterial genomes, and we identified common trends in virulence for genes in GIs. Furthermore, we examined the relationship between GIs and clustered regularly interspaced palindromic repeats (CRISPRs) proposed to confer resistance to phage. METHODOLOGY/PRINCIPAL FINDINGS We show quantitatively that GIs disproportionately contain more virulence factors than the rest of a given genome (p<1E-40 using three GI datasets) and that CRISPRs are also over-represented in GIs. Virulence factors in GIs and pathogen-associated virulence factors are enriched for proteins having more "offensive" functions, e.g. active invasion of the host, and are disproportionately components of type III/IV secretion systems or toxins. Numerous hypothetical pathogen-associated genes were identified, meriting further study. CONCLUSIONS/SIGNIFICANCE This is the first systematic analysis across diverse genera indicating that virulence factors are disproportionately associated with GIs. "Offensive" virulence factors, as opposed to host-interaction factors, may more often be a recently acquired trait (on an evolutionary time scale detected by GI analysis). Newly identified pathogen-associated genes warrant further study. We discuss the implications of these results, which cement the significant role of GIs in the evolution of many pathogens.
Affiliation(s)
- Shannan J. Ho Sui
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Amber Fedynak
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- William W. L. Hsiao
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Morgan G. I. Langille
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Fiona S. L. Brinkman
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
17
Langille MGI, Hsiao WWL, Brinkman FSL. Evaluation of genomic island predictors using a comparative genomics approach. BMC Bioinformatics 2008; 9:329. [PMID: 18680607 PMCID: PMC2518932 DOI: 10.1186/1471-2105-9-329] [Citation(s) in RCA: 200] [Impact Index Per Article: 12.5] [Received: 03/29/2008] [Accepted: 08/05/2008] [Indexed: 01/08/2023] Open
Abstract
Background Genomic islands (GIs) are clusters of genes in prokaryotic genomes of probable horizontal origin. GIs are disproportionately associated with microbial adaptations of medical or environmental interest. Recently, multiple programs for automated detection of GIs have been developed that utilize sequence composition characteristics, such as G+C ratio and dinucleotide bias. To robustly evaluate the accuracy of such methods, we propose that a dataset of GIs be constructed using criteria that are independent of sequence composition-based analysis approaches. Results We developed a comparative genomics approach (IslandPick) that identifies both very probable islands and non-island regions. The approach involves 1) flexible, automated selection of comparative genomes for each query genome, using a distance function that picks appropriate genomes for identification of GIs, 2) identification of regions unique to the query genome, compared with the chosen genomes (positive dataset) and 3) identification of regions conserved across all genomes (negative dataset). Using our constructed datasets, we investigated the accuracy of several sequence composition-based GI prediction tools. Conclusion Our results indicate that AlienHunter has the highest recall but the lowest measured precision, while SIGI-HMM is the most precise method. SIGI-HMM and IslandPath/DIMOB have comparable overall highest accuracy. Our comparative genomics approach, IslandPick, was the most accurate when compared with a curated list of GIs, indicating that we have constructed suitable datasets. This represents the first evaluation of the accuracy of several sequence composition-based GI predictors using diverse and independent datasets that were not artificially constructed. The caveats associated with this analysis and proposals for optimal island prediction are discussed.
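The precision, recall and accuracy comparisons above rest on standard definitions computed against a positive dataset (known islands) and a negative dataset (known non-islands). The sketch below uses those textbook formulas; the class name and the counts are hypothetical and are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of predictions scored against positive and negative datasets."""
    tp: int  # predicted island, in positive dataset
    fp: int  # predicted island, in negative dataset
    tn: int  # predicted non-island, in negative dataset
    fn: int  # predicted non-island, in positive dataset

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)


# Hypothetical counts for one predictor, for illustration only.
c = Confusion(tp=80, fp=20, tn=880, fn=20)
print(c.precision(), c.recall(), c.accuracy())
```

A tool can therefore have high recall and low precision at the same time, as reported for AlienHunter: it finds most true islands but also flags many non-island regions.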
Affiliation(s)
- Morgan G I Langille
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada.
18
McLeod MP, Warren RL, Hsiao WWL, Araki N, Myhre M, Fernandes C, Miyazawa D, Wong W, Lillquist AL, Wang D, Dosanjh M, Hara H, Petrescu A, Morin RD, Yang G, Stott JM, Schein JE, Shin H, Smailus D, Siddiqui AS, Marra MA, Jones SJM, Holt R, Brinkman FSL, Miyauchi K, Fukuda M, Davies JE, Mohn WW, Eltis LD. The complete genome of Rhodococcus sp. RHA1 provides insights into a catabolic powerhouse. Proc Natl Acad Sci U S A 2006; 103:15582-7. [PMID: 17030794 PMCID: PMC1622865 DOI: 10.1073/pnas.0607048103] [Citation(s) in RCA: 431] [Impact Index Per Article: 23.9] [Indexed: 11/18/2022] Open
Abstract
Rhodococcus sp. RHA1 (RHA1) is a potent polychlorinated biphenyl-degrading soil actinomycete that catabolizes a wide range of compounds and represents a genus of considerable industrial interest. RHA1 has one of the largest bacterial genomes sequenced to date, comprising 9,702,737 bp (67% G+C) arranged in a linear chromosome and three linear plasmids. A targeted insertion methodology was developed to determine the telomeric sequences. RHA1's 9,145 predicted protein-encoding genes are exceptionally rich in oxygenases (203) and ligases (192). Many of the oxygenases occur in the numerous pathways predicted to degrade aromatic compounds (30) or steroids (4). RHA1 also contains 24 nonribosomal peptide synthase genes, six of which exceed 25 kbp, and seven polyketide synthase genes, providing evidence that rhodococci harbor an extensive secondary metabolism. Among sequenced genomes, RHA1 is most similar to those of nocardial and mycobacterial strains. The genome contains few recent gene duplications. Moreover, three different analyses indicate that RHA1 has acquired fewer genes by recent horizontal transfer than most bacteria characterized to date and far fewer than Burkholderia xenovorans LB400, whose genome size and catabolic versatility rival those of RHA1. RHA1 and LB400 thus appear to demonstrate that ecologically similar bacteria can evolve large genomes by different means. Overall, RHA1 appears to have evolved to simultaneously catabolize a diverse range of plant-derived compounds in an O2-rich environment. In addition to establishing RHA1 as an important model for studying actinomycete physiology, this study provides critical insights that facilitate the exploitation of these industrially important microorganisms.
Affiliation(s)
- Michael P. McLeod
- Department of Microbiology and Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada V6T 1Z3
- René L. Warren
- Michael Smith Genome Sciences Centre, Vancouver, BC, Canada V5Z 1L3
- William W. L. Hsiao
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada V5A 1S6
- Naoto Araki
- Department of Bioengineering, Nagaoka University of Technology, Nagaoka 940-2118, Japan
- Matthew Myhre
- Department of Microbiology and Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada V6T 1Z3
- Clinton Fernandes
- Department of Microbiology and Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada V6T 1Z3
- Daisuke Miyazawa
- Department of Microbiology and Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada V6T 1Z3
- Wendy Wong
- Department of Microbiology and Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada V6T 1Z3
- Anita L. Lillquist
- Department of Microbiology and Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada V6T 1Z3
- Dennis Wang
- Department of Microbiology and Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada V6T 1Z3
- Manisha Dosanjh
- Department of Microbiology and Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada V6T 1Z3
- Hirofumi Hara
- Department of Microbiology and Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada V6T 1Z3
- Anca Petrescu
- Michael Smith Genome Sciences Centre, Vancouver, BC, Canada V5Z 1L3
- Ryan D. Morin
- Michael Smith Genome Sciences Centre, Vancouver, BC, Canada V5Z 1L3
- George Yang
- Michael Smith Genome Sciences Centre, Vancouver, BC, Canada V5Z 1L3
- Jeff M. Stott
- Michael Smith Genome Sciences Centre, Vancouver, BC, Canada V5Z 1L3
- Heesun Shin
- Michael Smith Genome Sciences Centre, Vancouver, BC, Canada V5Z 1L3
- Duane Smailus
- Michael Smith Genome Sciences Centre, Vancouver, BC, Canada V5Z 1L3
- Asim S. Siddiqui
- Michael Smith Genome Sciences Centre, Vancouver, BC, Canada V5Z 1L3
- Marco A. Marra
- Michael Smith Genome Sciences Centre, Vancouver, BC, Canada V5Z 1L3
- Robert Holt
- Michael Smith Genome Sciences Centre, Vancouver, BC, Canada V5Z 1L3
- Fiona S. L. Brinkman
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada V5A 1S6
- Keisuke Miyauchi
- Department of Bioengineering, Nagaoka University of Technology, Nagaoka 940-2118, Japan
- Masao Fukuda
- Department of Bioengineering, Nagaoka University of Technology, Nagaoka 940-2118, Japan
- Julian E. Davies
- Department of Microbiology and Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada V6T 1Z3
- William W. Mohn
- Department of Microbiology and Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada V6T 1Z3
- Lindsay D. Eltis
- Department of Microbiology and Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada V6T 1Z3
- To whom correspondence should be addressed. E-mail:
19
Coombes BK, Wickham ME, Brown NF, Lemire S, Bossi L, Hsiao WWL, Brinkman FSL, Finlay BB. Genetic and Molecular Analysis of GogB, a Phage-encoded Type III-secreted Substrate in Salmonella enterica Serovar Typhimurium with Autonomous Expression from its Associated Phage. J Mol Biol 2005; 348:817-30. [PMID: 15843015 DOI: 10.1016/j.jmb.2005.03.024] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.6] [Received: 12/08/2004] [Revised: 02/26/2005] [Accepted: 03/01/2005] [Indexed: 12/29/2022]
Abstract
Salmonella enterica serovar Typhimurium is lysogenized by several temperate bacteriophages that encode lysogenic conversion genes, which can act as virulence factors during infection and contribute to the genetic diversity and pathogenic potential of the lysogen. We have investigated the temperate bacteriophage called Gifsy-1 in S. enterica serovar Typhimurium and show here that the product of the gogB gene encoded within this phage shares similarity with proteins from other Gram-negative pathogens. The amino-terminal portion of GogB shares similarity with leucine-rich repeat-containing virulence-associated proteins from other Gram-negative pathogens, whereas the carboxyl-terminal portion of GogB shares similarity with uncharacterized proteins in other pathogens. We show that GogB is secreted by both type III secretion systems encoded in Salmonella Pathogenicity Island-1 (SPI-1) and SPI-2 but translocation into host cells is a SPI-2-mediated process. Once translocated, GogB localizes to the cytoplasm of infected host cells. The genetic regulation of gogB in Salmonella is influenced by the transcriptional activator, SsrB, under SPI-2-inducing conditions, but the modular nature of the gogB gene allows for autonomous expression and type III secretion following horizontal gene transfer into a heterologous pathogen. These data define the first autonomously expressed lysogenic conversion gene within Gifsy-1 that acts as a modular and promiscuous type III-secreted substrate of the infection process.
Affiliation(s)
- Brian K Coombes
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
20
Warren R, Hsiao WWL, Kudo H, Myhre M, Dosanjh M, Petrescu A, Kobayashi H, Shimizu S, Miyauchi K, Masai E, Yang G, Stott JM, Schein JE, Shin H, Khattra J, Smailus D, Butterfield YS, Siddiqui A, Holt R, Marra MA, Jones SJM, Mohn WW, Brinkman FSL, Fukuda M, Davies J, Eltis LD. Functional characterization of a catabolic plasmid from polychlorinated-biphenyl-degrading Rhodococcus sp. strain RHA1. J Bacteriol 2004; 186:7783-95. [PMID: 15516593 PMCID: PMC524921 DOI: 10.1128/jb.186.22.7783-7795.2004] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.7] [Indexed: 11/20/2022] Open
Abstract
Rhodococcus sp. strain RHA1, a potent polychlorinated-biphenyl (PCB)-degrading strain, contains three linear plasmids ranging in size from 330 to 1,100 kb. As part of a genome sequencing project, we report here the complete sequence and characterization of the smallest and least-well-characterized of the RHA1 plasmids, pRHL3. The plasmid is an actinomycete invertron, containing large terminal inverted repeats with a tightly associated protein and a predicted open reading frame (ORF) that is similar to that of a mycobacterial rep gene. The pRHL3 plasmid has 300 putative genes, almost 21% of which are predicted to have a catabolic function. Most of these are organized into three clusters. One of the catabolic clusters was predicted to include limonene degradation genes. Consistent with this prediction, RHA1 grew on limonene, carveol, or carvone as the sole carbon source. The plasmid carries three cytochrome P450-encoding (CYP) genes, a finding consistent with the high number of CYP genes found in other actinomycetes. Two of the CYP genes appear to belong to novel families; the third belongs to CYP family 116 but appears to belong to a novel class based on the predicted domain structure of its reductase. Analyses indicate that pRHL3 also contains four putative "genomic islands" (likely to have been acquired by horizontal transfer), insertion sequence elements, 19 transposase genes, and a duplication that spans two ORFs. One of the genomic islands appears to encode resistance to heavy metals. The plasmid does not appear to contain any housekeeping genes. However, each of the three catabolic clusters contains related genes that appear to be involved in glucose metabolism.