1
Wang Y, Zheng S, Guo R, Li Y, Yin H, Qiu X, Chen J, Ni C, Yuan Y, Gong Y. Assessment for antibiotic resistance in Helicobacter pylori: A practical and interpretable machine learning model based on genome-wide genetic variation. Virulence 2025; 16:2481503. [PMID: 40119500 PMCID: PMC11934168 DOI: 10.1080/21505594.2025.2481503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 08/09/2024] [Accepted: 03/06/2025] [Indexed: 03/24/2025] Open
Abstract
Helicobacter pylori (H. pylori) antibiotic resistance poses a global health threat, and accurate identification of antibiotic-resistant strains is essential for infection control. In the present study, our goal was to leverage whole-genome data of H. pylori to develop practical and interpretable machine learning (ML) models for comprehensive antibiotic resistance assessment. A total of 296 H. pylori isolates with genome-wide data were downloaded from the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) and the National Center for Biotechnology Information (NCBI) databases. By training ML models on feature sets of single nucleotide polymorphisms from SNP calling (SNPs-1), antibiotic-resistance SNPs annotated by the Comprehensive Antibiotic Resistance Database (SNPs-2), and gene presence or absence (GPA), we generated predictive models for four antibiotics and multidrug resistance (MDR). Among them, the models that combined SNPs-1, SNPs-2, and GPA data demonstrated the best performance, with eXtreme Gradient Boosting (XGBoost) consistently outperforming the other algorithms. We then used the SHapley Additive exPlanations (SHAP) method to interpret the ML models. Furthermore, a free web application for the MDR model was deployed to the GitHub repository (https://H.pylori/MDR/App/). Our study demonstrated the promise of employing whole-genome data in conjunction with ML algorithms to predict H. pylori antibiotic resistance. In the future, applying this approach to predict H. pylori antibiotic resistance could help reduce empiric antibiotic administration.
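The abstract reports that models trained on the combination of the three feature sets (SNPs-1, SNPs-2, GPA) performed best. As a rough illustration only, the sketch below shows one way per-isolate binary features from several sets could be concatenated into a single matrix before training. The isolate IDs, feature names, and the `combine_feature_sets` helper are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# isolate ID -> {feature name -> 0/1}
FeatureSet = Dict[str, Dict[str, int]]


def combine_feature_sets(
    named_sets: Dict[str, FeatureSet],
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Concatenate several binary feature sets into one matrix.

    Returns (isolate IDs, column names, rows). Only isolates present in
    every feature set are kept; a feature missing for an isolate is
    encoded as 0 (an assumption made for this sketch).
    """
    # Keep isolates that appear in all feature sets, in sorted order.
    isolates = sorted(set.intersection(*(set(fs) for fs in named_sets.values())))

    # Build one column per (feature set, feature name) pair.
    columns: List[str] = []
    for set_name, fs in named_sets.items():
        names = sorted({f for feats in fs.values() for f in feats})
        columns.extend(f"{set_name}:{name}" for name in names)

    rows: List[List[int]] = []
    for iso in isolates:
        row = []
        for col in columns:
            set_name, feat = col.split(":", 1)
            row.append(named_sets[set_name][iso].get(feat, 0))
        rows.append(row)
    return isolates, columns, rows


# Toy input, invented purely for illustration.
toy = {
    "SNPs-1": {
        "iso_A": {"pos_100_A>G": 1},
        "iso_B": {"pos_100_A>G": 0, "pos_250_C>T": 1},
    },
    "SNPs-2": {"iso_A": {"known_marker_1": 1}, "iso_B": {"known_marker_1": 0}},
    "GPA": {"iso_A": {"gene_x": 1}, "iso_B": {"gene_x": 0}},
}
isolates, columns, rows = combine_feature_sets(toy)
```

The resulting `rows` matrix could then be passed to any classifier; the choice of encoding for missing features is a design decision that would need to match how the real features were called.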
Affiliation(s)
- Yingying Wang
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang, China
- Shuwen Zheng
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang, China
- Rui Guo
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang, China
- Yanke Li
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang, China
- Honghao Yin
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang, China
- Xunan Qiu
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang, China
- Jijun Chen
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang, China
- Chuxuan Ni
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang, China
- Yuan Yuan
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang, China
- Yuehua Gong
- Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, The First Hospital of China Medical University, Shenyang, China
- Key Laboratory of GI Cancer Etiology and Prevention in Liaoning Province, The First Hospital of China Medical University, Shenyang, China
2
Espinoza ME, Swing AM, Elghraoui A, Modlin SJ, Valafar F. Inferred mechanisms of resistance and host immune evasion revealed through network-connectivity analysis of M. tuberculosis complex graph pangenome. mSystems 2025; 10:e0049924. [PMID: 40261029 PMCID: PMC12013269 DOI: 10.1128/msystems.00499-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Accepted: 12/16/2024] [Indexed: 04/24/2025] Open
Abstract
Mycobacterium tuberculosis complex successfully adapts to environmental pressures through mechanisms of rapid adaptation which remain poorly understood despite knowledge gained through decades of research. In this study, we used 110 reference-quality, complete de novo assembled, long-read sequenced clinical genomes to study patterns of structural adaptation through a graph-based pangenome analysis, elucidating rarely studied mechanisms that enable enhanced clinical phenotypes and offering a novel perspective on the species' adaptation. Across isolates, we identified a pangenome of 4,325 genes (3,767 core and 558 accessory), revealing 290 novel genes, and a substantially more complete account of difficult-to-sequence esx/pe/pgrs/ppe genes. Seventy-four percent of core genes were deemed non-essential in vitro, 38% of which support the pathogen's survival in vivo, suggesting a need to broaden current perspectives on essentiality. Through information-theoretic analysis, we reveal the ppe genes that contribute most to the species' diversity, several with known consequences for antigenic variation and immune evasion. Construction of a graph pangenome revealed topological variations that implicate genes known to modulate host immunity (Rv0071-73, Rv2817c, cas2), defense against phages/viruses (cas2, csm6, and Rv2817c-2821c), and others associated with host tissue colonization. Here, the prominent trehalose transport pathway stands out for its involvement in caseous granuloma catabolism and the development of post-primary disease. We show paralogous duplications of genes implicated in bedaquiline (mmpL5 in all L1 isolates) and ethambutol (embC-A) resistance, with a paralogous duplication of its regulator (embR) in 96 isolates. We provide hypotheses for novel mechanisms of immune evasion and antibiotic resistance through gene dosing that can escape detection by molecular diagnostics.
IMPORTANCE: M. tuberculosis complex (MTBC) has killed over a billion people in the past 200 years alone and continues to kill nearly 1.5 million annually. The pathogen has a versatile ability to diversify under immune and drug pressure and survive, even becoming antibiotic persistent or resistant in the face of harsh chemotherapy. For proper diagnosis and design of an appropriate treatment regimen, a full understanding of this diversification and its clinical consequences is desperately needed. A mechanism of diversification that is rarely studied systematically is MTBC's ability to structurally change its genome. In this article, we have de novo assembled 110 clinical genomes (the largest de novo assembled set to date) and performed a pangenomic analysis. Our pangenome provides structural variation-based hypotheses for novel mechanisms of immune evasion and antibiotic resistance through gene dosing that can compromise molecular diagnostics and lead to further emergence of antibiotic resistance.
Affiliation(s)
- Monica E. Espinoza
- Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, California, USA
- Ashley M. Swing
- Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, California, USA
- San Diego State University/University of California, San Diego | Joint Doctoral Program in Public Health (Global Health), San Diego, California, USA
- Afif Elghraoui
- Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, California, USA
- Department of Electrical and Computer Engineering, San Diego State University, San Diego, California, USA
- Department of Electrical and Computer Engineering, University of California San Diego, San Diego, California, USA
- Samuel J. Modlin
- Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, California, USA
- Faramarz Valafar
- Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, California, USA
3
Khalaf WS, Morgan RN, Elkhatib WF. Clinical microbiology and artificial intelligence: Different applications, challenges, and future prospects. J Microbiol Methods 2025; 232-234:107125. [PMID: 40188989 DOI: 10.1016/j.mimet.2025.107125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2024] [Revised: 03/10/2025] [Accepted: 04/03/2025] [Indexed: 04/10/2025]
Abstract
Conventional clinical microbiological techniques are being enhanced by the introduction of artificial intelligence (AI). Comprehensive data processing and analysis have enabled the development of curated datasets that have been used effectively to train different AI algorithms. Recently, a number of machine learning (ML) and deep learning (DL) algorithms have been developed and evaluated using diverse microbiological datasets. These datasets include spectral analysis (Raman and MALDI-TOF spectroscopy), microscopic images (Gram and acid-fast stains), and genomic and protein sequences (whole genome sequencing (WGS) and protein data banks (PDBs)). The primary objective of these algorithms is to minimize the time, effort, and expense linked to conventional analytical methods. Furthermore, AI algorithms are combined with quantitative structure-activity relationship (QSAR) models to predict novel antimicrobial agents that address the continuing surge of antimicrobial resistance. During the COVID-19 pandemic, AI algorithms played a crucial role in vaccine development and the discovery of new antiviral agents, and introduced potential drug candidates via drug repurposing. However, despite their significant benefits, the implementation of AI encounters various challenges, including ethical considerations, the potential for bias, and errors related to data training. This review seeks to provide an overview of the most recent applications of artificial intelligence in clinical microbiology, with the intention of educating a wider audience of clinical practitioners regarding the current uses of machine learning algorithms and encouraging their implementation. Furthermore, it discusses the challenges related to the incorporation of AI into clinical microbiology laboratories and examines future opportunities for AI within the realm of infectious disease epidemiology.
Affiliation(s)
- Wafaa S Khalaf
- Department of Microbiology and Immunology, Faculty of Pharmacy (Girls), Al-Azhar University, Nasr city, Cairo 11751, Egypt.
- Radwa N Morgan
- National Centre for Radiation Research and Technology (NCRRT), Drug Radiation Research Department, Egyptian Atomic Energy Authority (EAEA), Cairo 11787, Egypt.
- Walid F Elkhatib
- Department of Microbiology & Immunology, Faculty of Pharmacy, Galala University, New Galala City, Suez, Egypt; Microbiology and Immunology Department, Faculty of Pharmacy, Ain Shams University, African Union Organization St., Abbassia, Cairo 11566, Egypt.
4
Arnold A, McLellan S, Stokes JM. How AI can help us beat AMR. NPJ ANTIMICROBIALS AND RESISTANCE 2025; 3:18. [PMID: 40082590 PMCID: PMC11906734 DOI: 10.1038/s44259-025-00085-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2024] [Accepted: 02/06/2025] [Indexed: 03/16/2025]
Abstract
Antimicrobial resistance (AMR) is an urgent public health threat. Advancements in artificial intelligence (AI) and increases in computational power have resulted in the adoption of AI for biological tasks. This review explores the application of AI in bacterial infection diagnostics, AMR surveillance, and antibiotic discovery. We summarize contemporary AI models applied to each of these domains, important considerations when applying AI across diverse tasks, and current limitations in the field.
Affiliation(s)
- Autumn Arnold
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
- Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, ON, Canada
- David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, ON, Canada
- Stewart McLellan
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
- Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, ON, Canada
- David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, ON, Canada
- Jonathan M Stokes
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada.
- Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, ON, Canada.
- David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, ON, Canada.
5
Xie F, Wang L, Li S, Hu L, Wen Y, Li X, Ye K, Duan Z, Wang Q, Guan Y, Zhang Y, Shi Q, Yang J, Xia H, Xie L. Large-scale genomic analysis reveals significant role of insertion sequences in antimicrobial resistance of Acinetobacter baumannii. mBio 2025; 16:e0285224. [PMID: 39976435 PMCID: PMC11898611 DOI: 10.1128/mbio.02852-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2024] [Accepted: 01/22/2025] [Indexed: 02/21/2025] Open
Abstract
Acinetobacter baumannii, a prominent nosocomial pathogen renowned for its extensive resistance to antimicrobial agents, poses a significant challenge for the accurate prediction of antimicrobial resistance (AMR) from genomic data. Despite thorough research on the molecular mechanisms of AMR, gaps remain in our understanding of key contributors. This study used a rule-based method and three machine learning models to predict AMR phenotypes, aiming to decipher key genomic factors associated with AMR. Genomes and antibiotic resistance phenotypes from 1,012 public isolates were employed for model construction and training. To validate the models, a data set comprising 164 self-collected strains underwent next-generation sequencing, nanopore long-read sequencing, and antimicrobial susceptibility testing using the broth dilution method. The presence of antibiotic resistance genes (ARGs) alone was insufficient to accurately predict the AMR phenotype for the majority of antibiotics (90%, 18 out of 20) in the public data set. Conversely, combining ARGs with insertion sequence (IS) elements significantly enhanced predictive performance. The Random Forest model outperformed the support vector machine (SVM), the logistic regression model, and the rule-based method across all 20 antibiotics, with accuracies ranging from 83.80% to 97.70%. In the validation data set, even higher accuracies were achieved, ranging from 85.63% to 99.31%. Furthermore, conserved sequence patterns between IS elements and ARGs were validated using self-collected long-read sequencing data, substantially enhancing the accuracy of AMR prediction in A. baumannii. This study underscores the pivotal role of IS elements in AMR.
IMPORTANCE: The interplay between insertion sequences (ISs) and antibiotic resistance genes (ARGs) in Acinetobacter baumannii contributes to resistance against specific antibiotics. Conventionally, genetic variations and ARGs have been used to predict resistance phenotypes, while the potentially pivotal role of IS elements has been largely overlooked. Our study advances this approach by integrating both rule-based and machine learning models to predict AMR in A. baumannii. This significantly enhances the accuracy of AMR prediction, emphasizing the pivotal function of IS elements in antibiotic resistance. Notably, we uncover a series of conserved sequence patterns linking IS elements and ARGs, which outperform ARGs alone in phenotypic prediction. Our findings are crucial for bioinformatics strategies aimed at studying and tracking AMR, offering novel insights into combating the escalating AMR challenge.
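To make the comparison between an ARG-only predictor and one that also considers IS elements more concrete, here is a minimal sketch of two rule-based classifiers scored by accuracy. The rules, the `Isolate` fields, and the toy isolates are hypothetical illustrations and are not the rules or data used in the study.

```python
from typing import Callable, List, NamedTuple


class Isolate(NamedTuple):
    has_arg: bool          # a resistance gene (ARG) was detected
    has_is_near_arg: bool  # an IS element was detected next to that ARG
    resistant: bool        # phenotype from susceptibility testing


def rule_arg_only(iso: Isolate) -> bool:
    """Hypothetical rule: call resistant whenever an ARG is present."""
    return iso.has_arg


def rule_arg_with_is(iso: Isolate) -> bool:
    """Hypothetical rule: call resistant only if the ARG has an adjacent IS."""
    return iso.has_arg and iso.has_is_near_arg


def accuracy(isolates: List[Isolate], rule: Callable[[Isolate], bool]) -> float:
    """Fraction of isolates whose predicted phenotype matches the observed one."""
    correct = sum(rule(iso) == iso.resistant for iso in isolates)
    return correct / len(isolates)


# Toy isolates, invented purely for illustration.
TOY = [
    Isolate(has_arg=True, has_is_near_arg=True, resistant=True),
    Isolate(has_arg=True, has_is_near_arg=False, resistant=False),
    Isolate(has_arg=True, has_is_near_arg=True, resistant=True),
    Isolate(has_arg=False, has_is_near_arg=False, resistant=False),
]

acc_arg_only = accuracy(TOY, rule_arg_only)        # 0.75 on this toy set
acc_arg_with_is = accuracy(TOY, rule_arg_with_is)  # 1.0 on this toy set
```

On the toy set the ARG-only rule misclassifies the isolate that carries an ARG without a neighbouring IS element, which is the kind of case the abstract suggests IS information can help resolve.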
Affiliation(s)
- Fei Xie
- College of Pulmonary and Critical Care Medicine, Chinese PLA General Hospital, Beijing, China
- Lifeng Wang
- Laboratory Medicine Department, First Medical Center of Chinese PLA General Hospital, Beijing, China
- Song Li
- Department of Research and Development, Hugobiotech Co., Ltd, Beijing, China
- Long Hu
- Department of Research and Development, Hugobiotech Co., Ltd, Beijing, China
- Yanhua Wen
- Department of Research and Development, Hugobiotech Co., Ltd, Beijing, China
- Xuming Li
- Department of Research and Development, Hugobiotech Co., Ltd, Beijing, China
- Kun Ye
- Laboratory Medicine Department, First Medical Center of Chinese PLA General Hospital, Beijing, China
- Zhimei Duan
- College of Pulmonary and Critical Care Medicine, Chinese PLA General Hospital, Beijing, China
- Qi Wang
- Department of Research and Development, Hugobiotech Co., Ltd, Beijing, China
- Yuanlin Guan
- Department of Research and Development, Hugobiotech Co., Ltd, Beijing, China
- Ye Zhang
- Department of Research and Development, Hugobiotech Co., Ltd, Beijing, China
- Qiqi Shi
- Department of Research and Development, Hugobiotech Co., Ltd, Beijing, China
- Jiyong Yang
- Laboratory Medicine Department, First Medical Center of Chinese PLA General Hospital, Beijing, China
- Han Xia
- Department of Research and Development, Hugobiotech Co., Ltd, Beijing, China
- Lixin Xie
- College of Pulmonary and Critical Care Medicine, Chinese PLA General Hospital, Beijing, China
6
Pal A, Mohanty D. Machine learning-based approach for identification of new resistance associated mutations from whole genome sequences of Mycobacterium tuberculosis. BIOINFORMATICS ADVANCES 2025; 5:vbaf050. [PMID: 40125545 PMCID: PMC11930343 DOI: 10.1093/bioadv/vbaf050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2024] [Revised: 02/27/2025] [Accepted: 03/06/2025] [Indexed: 03/25/2025]
Abstract
Motivation: Currently available methods for the prediction of genotypic drug resistance in Mycobacterium tuberculosis utilize information on known markers of drug resistance. Hence, machine learning approaches are needed that can discover new resistance markers. Results: Whole genome sequences with known phenotypic drug resistance profiles have been utilized to train XGBoost and ANN classifiers for 5 first-line and 8 second-line tuberculosis drugs. Benchmarking on a completely independent dataset from the CRyPTIC database revealed that our method has high sensitivity (90%-95%) and specificity (94%-99%) for five first-line drugs and robust performance for six second-line drugs, with a sensitivity of 77%-89% at over 95% specificity. An explainable AI method, SHapley Additive exPlanations, has successfully identified resistance mutations for each drug in a completely automated way. This approach not only identified known resistance-associated mutations in agreement with the WHO mutation catalogue, but also predicted >100 other potential resistance-associated mutations for 13 antibiotics in new genes outside the known resistance loci. Identification of new resistance markers opens up the opportunity for the discovery of novel mechanisms of drug resistance. Availability and implementation: Our prediction method has been implemented as the TB-AMRpred webserver and command line tool, available freely at http://www.nii.ac.in/TB-AMRpred.html and https://github.com/Ankitapal1995/TB-AMRprd.
Affiliation(s)
- Ankita Pal
- Bioinformatics Center, National Institute of Immunology, New Delhi 110067, India
- Debasisa Mohanty
- Bioinformatics Center, National Institute of Immunology, New Delhi 110067, India
7
Guliaev A, Hjort K, Rossi M, Jonsson S, Nicoloff H, Guy L, Andersson DI. Machine learning detection of heteroresistance in Escherichia coli. EBioMedicine 2025; 113:105618. [PMID: 39986174 PMCID: PMC11893328 DOI: 10.1016/j.ebiom.2025.105618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2024] [Revised: 02/10/2025] [Accepted: 02/11/2025] [Indexed: 02/24/2025] Open
Abstract
BACKGROUND: Heteroresistance (HR) is a significant type of antibiotic resistance observed for several bacterial species and antibiotic classes where a susceptible main population contains small subpopulations of resistant cells. Mathematical models, animal experiments and clinical studies associate HR with treatment failure. Currently used susceptibility tests do not detect heteroresistance reliably, which can result in misclassification of heteroresistant isolates as susceptible, which in turn might lead to treatment failure. Here we examined if whole genome sequence (WGS) data and machine learning (ML) can be used to detect bacterial HR. METHODS: We classified 467 Escherichia coli clinical isolates as HR or non-HR to the often used β-lactam/inhibitor combination piperacillin-tazobactam using pre-screening and Population Analysis Profiling tests. We sequenced the isolates, assembled the whole genomes and created a set of predictors based on current knowledge of HR mechanisms. Then we trained several machine learning models on 80% of this data set aiming to detect HR isolates. We compared performance of the best ML models on the remaining 20% of the data set with a baseline model based solely on the presence of β-lactamase genes. Furthermore, we sequenced the resistant sub-populations in order to analyse the genetic mechanisms underlying HR. FINDINGS: The best ML model achieved 100% sensitivity and 84.6% specificity, outperforming the baseline model. The strongest predictors of HR were the total number of β-lactamase genes, β-lactamase gene variants and presence of IS elements flanking them. Genetic analysis of HR strains confirmed that HR is caused by an increased copy number of resistance genes via gene amplification or plasmid copy number increase. This aligns with the ML model's findings, reinforcing the hypothesis that this mechanism underlies HR in Gram-negative bacteria.
INTERPRETATION: We demonstrate that a combination of WGS and ML can identify HR in bacteria with perfect sensitivity and high specificity. This improved detection would allow for better-informed treatment decisions and potentially reduce the occurrence of treatment failures associated with HR. FUNDING: Funding provided to DIA from the Swedish Research Council (2021-02091) and NIH (1U19AI158080-01).
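The findings are reported as sensitivity and specificity on the held-out 20% of isolates. For readers less familiar with these metrics, the sketch below computes both from true and predicted labels. The label vectors are invented toy values, not the study's data, and the `sensitivity_specificity` helper is a hypothetical name.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = HR, 0 = non-HR).

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Toy held-out labels, invented purely for illustration.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)  # (1.0, 0.75)
```

In this toy case every HR isolate is detected (sensitivity 1.0) while one non-HR isolate is flagged incorrectly (specificity 0.75), the same qualitative trade-off the abstract describes.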
Affiliation(s)
- Andrei Guliaev
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Karin Hjort
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Michele Rossi
- Department of Biosciences, University of Milan, Milan, Italy; Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Sofia Jonsson
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Hervé Nicoloff
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Lionel Guy
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden; SciLifeLab, Uppsala University, Uppsala, Sweden
- Dan I Andersson
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden.
8
Yönden Z, Reshadi S, Hayati AF, Hooshiar MH, Ghasemi S, Yönden H, Daemi A. Reviewing on AI-Designed Antibiotic Targeting Drug-Resistant Superbugs by Emphasizing Mechanisms of Action. Drug Dev Res 2025; 86:e70066. [PMID: 39932058 DOI: 10.1002/ddr.70066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2024] [Revised: 01/15/2025] [Accepted: 02/02/2025] [Indexed: 05/08/2025]
Abstract
The emergence of drug-resistant bacteria, often referred to as "superbugs," poses a profound and escalating challenge to global health systems, surpassing the capabilities of traditional antibiotic discovery methods. As resistance mechanisms evolve rapidly, the need for innovative solutions has never been more critical. This review delves into the transformative role of AI-driven methodologies in antibiotic development, particularly in targeting drug-resistant bacterial strains (DRSBs), with an emphasis on understanding their mechanisms of action. AI algorithms have revolutionized the antibiotic discovery process by efficiently collecting, analyzing, and modeling complex datasets to predict both the effectiveness of potential antibiotics and the mechanisms of bacterial resistance. These computational advancements enable researchers to identify promising antibiotic candidates with unique mechanisms that effectively bypass conventional resistance pathways. By specifically targeting critical bacterial processes or disrupting essential cellular components, these AI-designed antibiotics offer robust solutions for combating even the most resilient bacterial strains. The application of AI in antibiotic design represents a paradigm shift, enabling the rapid and precise identification of novel compounds with tailored mechanisms of action. This approach not only accelerates the drug development timeline but also enhances the precision of targeting superbugs, significantly improving therapeutic outcomes. Furthermore, understanding the underlying mechanisms of these AI-designed antibiotics is crucial for optimizing their clinical efficacy and devising proactive strategies to prevent the emergence of further resistance. AI-driven antibiotic discovery is poised to play a pivotal role in the global fight against antimicrobial resistance. 
By leveraging the power of artificial intelligence, researchers are opening new frontiers in the development of effective treatments, ensuring a proactive and sustainable response to the growing threat of drug-resistant bacteria.
Affiliation(s)
- Zafer Yönden
- Department of Medical Biochemistry, Faculty of Medicine, Cukurova University, Adana, Turkey
- Samira Reshadi
- School of Dentistry, Mashhad University of Medical Sciences, Mashhad, Iran
- Mohammad Hossein Hooshiar
- Department of Periodontics, School of Dentistry, Tehran University of Medical Sciences, Tehran, Iran
- Sholeh Ghasemi
- Department of Internal Medicine, School of Medicine, Urmia University of Medical Science, Urmia, Iran
- Hakan Yönden
- Department of Health Services and Vocational School Management, Health Institution Management, Tarsus University, Tarsus, Turkey
- Amin Daemi
- Department of Medical Biochemistry, Faculty of Medicine, Cukurova University, Adana, Turkey
9
Shankar G, Akhter Y. Stealing survival: Iron acquisition strategies of Mycobacterium tuberculosis. Biochimie 2024; 227:37-60. [PMID: 38901792 DOI: 10.1016/j.biochi.2024.06.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Revised: 06/07/2024] [Accepted: 06/18/2024] [Indexed: 06/22/2024]
Abstract
Mycobacterium tuberculosis (Mtb), the causative agent of tuberculosis (TB), faces iron scarcity within the host due to immune defenses. This review explores the importance of iron for Mtb and its strategies to overcome iron restriction. We discuss how the host limits iron as an innate immune response and how Mtb utilizes various iron acquisition systems, particularly the siderophore-mediated pathway. The review illustrates the structure and biosynthesis of mycobactin, a key siderophore in Mtb, and the regulation of its production. We explore the potential of targeting siderophore biosynthesis and uptake as a novel therapeutic approach for TB. Finally, we summarize current knowledge on Mtb's iron acquisition and highlight promising directions for future research to exploit this pathway for developing new TB interventions.
Affiliation(s)
- Gauri Shankar
- Department of Biotechnology, Babasaheb Bhimrao Ambedkar University, Vidya Vihar, Raebareli Road, Lucknow, Uttar Pradesh, 226 025, India
| | - Yusuf Akhter
- Department of Biotechnology, Babasaheb Bhimrao Ambedkar University, Vidya Vihar, Raebareli Road, Lucknow, Uttar Pradesh, 226 025, India.
| |
|
10
|
Hernandez-Velazquez D, Vasquez MK, Petre R, Kyndt JA. Genome sequences of Mycobacterium sp. Elmwood and accompanying phage, isolated from a public swimming pool in Nebraska. Microbiol Resour Announc 2024; 13:e0089624. [PMID: 39345178 PMCID: PMC11556037 DOI: 10.1128/mra.00896-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2024] [Accepted: 08/28/2024] [Indexed: 10/01/2024] Open
Abstract
Genome sequencing of a non-tuberculosis Mycobacterium species, isolated from a public pool, shows that the genome contains several genes for antibiotic resistance and anti-phage defense, which are absent from other related Mycobacteria. Metagenomic binning also provided the genome of the accompanying phage, which is distinct from other mycobacterial phages.
Affiliation(s)
| | - Madelynn K. Vasquez
- College of Science and Technology, Bellevue University, Bellevue, Nebraska, USA
| | - Rana Petre
- Erasmus Brussels University of Applied Sciences and Art, Brussels, Belgium
| | - John A. Kyndt
- College of Science and Technology, Bellevue University, Bellevue, Nebraska, USA
| |
|
11
|
Verma D, Satyanarayana T, Dias PJ. Editorial: Microbial comparative genomics and pangenomics: new tools, approaches and insights into gene and genome evolution. Front Genet 2024; 15:1490645. [PMID: 39512798 PMCID: PMC11540761 DOI: 10.3389/fgene.2024.1490645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2024] [Accepted: 10/10/2024] [Indexed: 11/15/2024] Open
Affiliation(s)
- Digvijay Verma
- Department of Environmental Microbiology, School of Earth and Environmental Sciences, Babasaheb Bhimrao Ambedkar University, Lucknow, Uttar Pradesh, India
| | - Tulasi Satyanarayana
- Department of Biological Sciences and Engineering, Netaji Subhas University of Technology, New Delhi, India
| | - Paulo Jorge Dias
- iBB - Institute for Bioengineering and Biosciences, Instituto Superior Técnico, University of Lisbon, Lisbon, Portugal
- Associate Laboratory i4HB - Institute for Health and Bioeconomy at Instituto Superior Técnico, University of Lisbon, Lisbon, Portugal
| |
|
12
|
Matthews CA, Watson-Haigh NS, Burton RA, Sheppard AE. A gentle introduction to pangenomics. Brief Bioinform 2024; 25:bbae588. [PMID: 39552065 PMCID: PMC11570541 DOI: 10.1093/bib/bbae588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Revised: 09/12/2024] [Accepted: 11/01/2024] [Indexed: 11/19/2024] Open
Abstract
Pangenomes have emerged in response to limitations associated with traditional linear reference genomes. In contrast to a traditional reference that is (usually) assembled from a single individual, pangenomes aim to represent all of the genomic variation found in a group of organisms. The term 'pangenome' is currently used to describe multiple different types of genomic information, and limited language is available to differentiate between them. This is frustrating for researchers working in the field and confusing for researchers new to the field. Here, we provide an introduction to pangenomics relevant to both prokaryotic and eukaryotic organisms and propose a formalization of the language used to describe pangenomes (see the Glossary) to improve the specificity of discussion in the field.
Affiliation(s)
- Chelsea A Matthews
- School of Agriculture, Food and Wine, Waite Campus, University of Adelaide, Urrbrae, South Australia 5064, Australia
| | - Nathan S Watson-Haigh
- Australian Genome Research Facility, Victorian Comprehensive Cancer Centre, Melbourne, Victoria 3000, Australia
- South Australian Genomics Centre, SAHMRI, North Terrace, Adelaide, South Australia 5000, Australia
- Alkahest Inc., San Carlos, CA 94070, United States
| | - Rachel A Burton
- School of Agriculture, Food and Wine, Waite Campus, University of Adelaide, Urrbrae, South Australia 5064, Australia
| | - Anna E Sheppard
- School of Biological Sciences, University of Adelaide, Adelaide, South Australia 5005, Australia
| |
|
13
|
Atasoy M, Bartkova S, Çetecioğlu-Gürol Z, Mira NP, O'Byrne C, Pérez-Rodríguez F, Possas A, Scheler O, Sedláková-Kaduková J, Sinčák M, Steiger M, Ziv C, Lund PA. Methods for studying microbial acid stress responses: from molecules to populations. FEMS Microbiol Rev 2024; 48:fuae015. [PMID: 38760882 PMCID: PMC11418653 DOI: 10.1093/femsre/fuae015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 03/27/2024] [Accepted: 05/16/2024] [Indexed: 05/20/2024] Open
Abstract
The study of how micro-organisms detect and respond to different stresses has a long history of producing fundamental biological insights while being simultaneously of significance in many applied microbiological fields including infection, food and drink manufacture, and industrial and environmental biotechnology. This is well-illustrated by the large body of work on acid stress. Numerous different methods have been used to understand the impacts of low pH on growth and survival of micro-organisms, ranging from studies of single cells to large and heterogeneous populations, from the molecular or biophysical to the computational, and from well-understood model organisms to poorly defined and complex microbial consortia. Much is to be gained from an increased general awareness of these methods, and so the present review looks at examples of the different methods that have been used to study acid resistance, acid tolerance, and acid stress responses, and the insights they can lead to, as well as some of the problems involved in using them. We hope this will be of interest both within and well beyond the acid stress research community.
Affiliation(s)
- Merve Atasoy
- UNLOCK, Wageningen University and Research, PO Box 9101, 6700 HB, the Netherlands
| | - Simona Bartkova
- Department of Chemistry and Biotechnology, Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn, Estonia
| | - Zeynep Çetecioğlu-Gürol
- Department of Industrial Biotechnology, KTH Royal Institute of Technology, Roslagstullsbacken 21 106 91 Stockholm, Stockholm, Sweden
| | - Nuno P Mira
- iBB, Institute for Bioengineering and Biosciences, Department of Bioengineering, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal
- Associate Laboratory i4HB, Institute for Health and Bioeconomy, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal
| | - Conor O'Byrne
- Microbiology, School of Biological and Chemical Sciences, University of Galway, University Road, Galway, H91 TK33, Ireland
| | - Fernando Pérez-Rodríguez
- Department of Food Science and Technology, UIC Zoonosis y Enfermedades Emergentes ENZOEM, University of Córdoba, 14014 Córdoba, Spain
| | - Aricia Possas
- Department of Food Science and Technology, UIC Zoonosis y Enfermedades Emergentes ENZOEM, University of Córdoba, 14014 Córdoba, Spain
| | - Ott Scheler
- Department of Chemistry and Biotechnology, Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn, Estonia
| | - Jana Sedláková-Kaduková
- Institute of Chemistry and Environmental Sciences, University of Ss. Cyril and Methodius, 91701 Trnava, Republic of Slovakia
| | - Mirka Sinčák
- Institute of Chemistry and Environmental Sciences, University of Ss. Cyril and Methodius, 91701 Trnava, Republic of Slovakia
| | - Matthias Steiger
- Institute of Chemical, Environmental and Bioscience Engineering, TU Wien, Getreidemarkt 9, 1060 Vienna, Austria
| | - Carmit Ziv
- Department of Postharvest Science, Agricultural Research Organization, Volcani Center, 7505101 Rishon LeZion, Israel
| | - Peter A Lund
- School of Biosciences and Institute of Microbiology of Infection, University of Birmingham, Birmingham B15 2TT, United Kingdom
| |
|
14
|
Le DQ, Nguyen TA, Nguyen SH, Nguyen TT, Nguyen CH, Phung HT, Ho TH, Vo NS, Nguyen T, Nguyen HA, Cao MD. Efficient inference of large prokaryotic pangenomes with PanTA. Genome Biol 2024; 25:209. [PMID: 39107817 PMCID: PMC11304767 DOI: 10.1186/s13059-024-03362-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Accepted: 07/30/2024] [Indexed: 08/10/2024] Open
Abstract
Pangenome inference is an indispensable step in bacterial genomics, yet its scalability poses a challenge due to the rapid growth of genomic collections. This paper presents PanTA, a software package designed for constructing pangenomes of large bacterial datasets, showing unprecedented efficiency levels multiple times higher than existing tools. PanTA introduces a novel mechanism to construct the pangenome progressively without rebuilding the accumulated collection from scratch. The progressive mode is shown to consume orders of magnitude less computational resources than existing solutions in managing growing datasets. The software is open source and is publicly available at https://github.com/amromics/panta and at 10.6084/m9.figshare.23724705.
Affiliation(s)
- Duc Quang Le
- AMROMICS JSC, Nghe An, Vietnam
- Faculty of IT, Hanoi University of Civil Engineering, Hanoi, Vietnam
| | - Tien Anh Nguyen
- AMROMICS JSC, Nghe An, Vietnam
- Faculty of Biotechnology, Hanoi University of Pharmacy, Hanoi, Vietnam
| | | | - Tam Thi Nguyen
- Oxford University Clinical Research Unit, Hanoi, Vietnam
| | - Canh Hao Nguyen
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
| | - Huong Thanh Phung
- Faculty of Biotechnology, Hanoi University of Pharmacy, Hanoi, Vietnam
| | - Tho Huu Ho
- Department of Medical Microbiology, The 103 Military Hospital, Vietnam Military Medical University, Hanoi, Vietnam
- Department of Genomics & Cytogenetics, Institute of Biomedicine & Pharmacy, Vietnam Military Medical University, Hanoi, Vietnam
| | - Nam S Vo
- Center for Biomedical Informatics, Vingroup Big Data Institute, Hanoi, Vietnam
| | | | | | | |
|
15
|
Bhalla N, Nanda RK. Pangenome-wide association study reveals the selective absence of CRISPR genes (Rv2816c-19c) in drug-resistant Mycobacterium tuberculosis. Microbiol Spectr 2024; 12:e0052724. [PMID: 38916315 PMCID: PMC11302280 DOI: 10.1128/spectrum.00527-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 05/31/2024] [Indexed: 06/26/2024] Open
Abstract
The presence of intermittently dispersed insertion sequences and transposases in the Mycobacterium tuberculosis (Mtb) genome makes intra-genome recombination events inevitable. Understanding their effect on the gene repertoires (GR), which may contribute to the development of drug-resistant Mtb, is critical. In this study, publicly available WGS data of clinical Mtb isolates (endemic region n = 2,601; non-endemic region n = 1,130) were de novo assembled, filtered, scaffolded into assemblies, and functionally annotated. Out of 2,601 Mtb WGS data sets from endemic regions, 2,184 (drug resistant/sensitive: 1,386/798) qualified as high quality. We identified 3,784 core genes, 123 softcore genes, 224 shell genes, and 762 cloud genes in the pangenome of Mtb clinical isolates from endemic regions. Sets of 33 and 39 genes showed positive and negative associations (P < 0.01) with drug resistance status, respectively. Gene ontology clustering showed compromised immunity to phages and impaired DNA repair in drug-resistant Mtb clinical isolates compared to the sensitive ones. Multidrug efflux pump repressor genes (Rv3830c and Rv3855c) and CRISPR genes (Rv2816c-19c) were absent in the drug-resistant Mtb. A separate WGS data analysis of drug-resistant Mtb clinical isolates from the Netherlands (n = 1,130) also showed the absence of CRISPR genes (Rv2816c-17c). This study highlights the role of CRISPR genes in drug resistance development in Mtb clinical isolates and helps in understanding its evolutionary trajectory and as useful targets for diagnostics development.
IMPORTANCE: The results from the present Pan-GWAS study comparing gene sets in drug-resistant and drug-sensitive Mtb clinical isolates revealed intricate presence-absence patterns of genes encoding DNA-binding proteins having gene regulatory as well as DNA modification and DNA repair roles. Apart from the genes with known functions, some uncharacterized and hypothetical genes that seem to have a potential role in drug resistance development in Mtb were identified. We have been able to extrapolate many findings of the present study with the existing literature on the molecular aspects of drug-resistant Mtb, further strengthening the relevance of the results presented in this study.
Affiliation(s)
- Nikhil Bhalla
- Translational Health Group, International Center of Genetic Engineering and Biotechnology, New Delhi, India
| | - Ranjan Kumar Nanda
- Translational Health Group, International Center of Genetic Engineering and Biotechnology, New Delhi, India
| |
|
16
|
Liu CSC, Pandey R. Integrative genomics would strengthen AMR understanding through ONE health approach. Heliyon 2024; 10:e34719. [PMID: 39816336 PMCID: PMC11734142 DOI: 10.1016/j.heliyon.2024.e34719] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/06/2024] [Revised: 07/13/2024] [Accepted: 07/15/2024] [Indexed: 01/18/2025] Open
Abstract
Emergence of drug-induced antimicrobial resistance (AMR) forms a crippling health and economic crisis worldwide, causing high mortality from otherwise treatable diseases and infections. Next Generation Sequencing (NGS) has significantly augmented detection of culture independent microbes, potential AMR in pathogens and elucidation of mechanisms underlying it. Here, we review recent findings of AMR evolution in pathogens aided by integrated genomic investigation strategies inclusive of bacteria, virus, fungi and AMR alleles. While AMR monitoring is dominated by data from hospital-related infections, we review genomic surveillance of both biotic and abiotic components involved in global AMR emergence and persistence. Identification of pathogen-intrinsic as well as environmental and/or host factors through robust genomics/bioinformatics, along with monitoring of type and frequency of antibiotic usage will greatly facilitate prediction of regional and global patterns of AMR evolution. Genomics-enabled AMR prediction and surveillance will be crucial in shaping health and economic policies within the One Health framework to combat this global concern.
Affiliation(s)
- Chinky Shiu Chen Liu
- Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) Laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Mall Road, Delhi, 110007, India
| | - Rajesh Pandey
- Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) Laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Mall Road, Delhi, 110007, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
| |
|
17
|
Deb S, Basu J, Choudhary M. An overview of next generation sequencing strategies and genomics tools used for tuberculosis research. J Appl Microbiol 2024; 135:lxae174. [PMID: 39003248 DOI: 10.1093/jambio/lxae174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 06/07/2024] [Accepted: 07/10/2024] [Indexed: 07/15/2024]
Abstract
Tuberculosis (TB) is a grave public health concern and is considered the foremost contributor to human mortality resulting from infectious disease. Due to the stringent clonality and extremely restricted genomic diversity, conventional methods prove inefficient for in-depth exploration of minor genomic variations and the evolutionary dynamics operating in Mycobacterium tuberculosis (M.tb) populations. Until now, the majority of reviews have primarily focused on delineating the application of whole-genome sequencing (WGS) in predicting antibiotic resistant genes, surveillance of drug resistance strains, and M.tb lineage classifications. Despite the growing use of next generation sequencing (NGS) and WGS analysis in TB research, there are limited studies that provide a comprehensive summary of their role in studying macroevolution, minor genetic variations, assessing mixed TB infections, and tracking transmission networks at an individual level. This highlights the need for systematic effort to fully explore the potential of WGS and its associated tools in advancing our understanding of TB epidemiology and disease transmission. We delve into the recent bioinformatics pipelines and NGS strategies that leverage various genetic features and simultaneous exploration of host-pathogen protein expression profile to decipher the genetic heterogeneity and host-pathogen interaction dynamics of the M.tb infections. This review highlights the potential benefits and limitations of NGS and bioinformatics tools and discusses their role in TB detection and epidemiology. Overall, this review could be a valuable resource for researchers and clinicians interested in NGS-based approaches in TB research.
Affiliation(s)
- Sushanta Deb
- Department of Veterinary Microbiology and Pathology, College of Veterinary Medicine, Washington State University, Pullman 99164, WA, United States
- All India Institute of Medical Sciences, New Delhi 110029, India
| | - Jhinuk Basu
- Department of Clinical Immunology and Rheumatology, Kalinga Institute of Medical Sciences (KIMS), KIIT University, Bhubaneswar 751024, India
| | - Megha Choudhary
- All India Institute of Medical Sciences, New Delhi 110029, India
| |
|
18
|
Marin MG, Wippel C, Quinones-Olvera N, Behruznia M, Jeffrey BM, Harris M, Mann BC, Rosenthal A, Jacobson KR, Warren RM, Li H, Meehan CJ, Farhat MR. Analysis of the limited M. tuberculosis accessory genome reveals potential pitfalls of pan-genome analysis approaches. bioRxiv 2024:2024.03.21.586149. [PMID: 38585972 PMCID: PMC10996470 DOI: 10.1101/2024.03.21.586149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Pan-genome analysis is a fundamental tool for studying bacterial genome evolution; however, the variety of methods used to define and measure the pan-genome poses challenges to the interpretation and reliability of results. To quantify sources of bias and error related to common pan-genome analysis approaches, we evaluated different approaches applied to a curated collection of 151 Mycobacterium tuberculosis (Mtb) isolates. Mtb is characterized by its clonal evolution, absence of horizontal gene transfer, and limited accessory genome, making it an ideal test case for this study. Using a state-of-the-art graph-genome approach, we found that a majority of the structural variation observed in Mtb originates from rearrangement, deletion, and duplication of redundant nucleotide sequences. In contrast, we found that pan-genome analyses that focus on comparison of coding sequences (at the amino acid level) can yield surprisingly variable results, driven by differences in assembly quality and the software used. Upon closer inspection, we found that coding sequence annotation discrepancies were a major contributor to inflated Mtb accessory genome estimates. To address this, we developed panqc, a software tool that detects annotation discrepancies and collapses nucleotide redundancy in pan-genome estimates. When applied to Mtb and E. coli pan-genomes, panqc exposed distinct biases influenced by the genomic diversity of the population studied. Our findings underscore the need for careful methodological selection and quality control to accurately map the evolutionary dynamics of a bacterial species.
|
19
|
Rusic D, Kumric M, Seselja Perisin A, Leskur D, Bukic J, Modun D, Vilovic M, Vrdoljak J, Martinovic D, Grahovac M, Bozic J. Tackling the Antimicrobial Resistance "Pandemic" with Machine Learning Tools: A Summary of Available Evidence. Microorganisms 2024; 12:842. [PMID: 38792673 PMCID: PMC11123121 DOI: 10.3390/microorganisms12050842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2024] [Revised: 04/16/2024] [Accepted: 04/19/2024] [Indexed: 05/26/2024] Open
Abstract
Antimicrobial resistance is recognised as one of the top threats healthcare is bound to face in the future. There have been various attempts to preserve the efficacy of existing antimicrobials, develop new and efficient antimicrobials, manage infections with multi-drug resistant strains, and improve patient outcomes, resulting in a growing mass of routinely available data, including electronic health records and microbiological information that can be employed to develop individualised antimicrobial stewardship. Machine learning methods have been developed to predict antimicrobial resistance from whole-genome sequencing data, forecast medication susceptibility, recognise epidemic patterns for surveillance purposes, or propose new antibacterial treatments and accelerate scientific discovery. Unfortunately, there is an evident gap between the number of machine learning applications in science and the effective implementation of these systems. This narrative review highlights some of the outstanding opportunities that machine learning offers when applied in research related to antimicrobial resistance. In the future, machine learning tools may prove to be superbugs' kryptonite. This review aims to provide an overview of available publications to aid researchers that are looking to expand their work with new approaches and to acquaint them with the current application of machine learning techniques in this field.
Affiliation(s)
- Doris Rusic
- Department of Pharmacy, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (D.R.); (A.S.P.); (D.L.); (J.B.); (D.M.)
| | - Marko Kumric
- Department of Pathophysiology, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (M.K.); (M.V.); (J.V.); (D.M.)
- Laboratory for Cardiometabolic Research, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia
| | - Ana Seselja Perisin
- Department of Pharmacy, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (D.R.); (A.S.P.); (D.L.); (J.B.); (D.M.)
| | - Dario Leskur
- Department of Pharmacy, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (D.R.); (A.S.P.); (D.L.); (J.B.); (D.M.)
| | - Josipa Bukic
- Department of Pharmacy, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (D.R.); (A.S.P.); (D.L.); (J.B.); (D.M.)
| | - Darko Modun
- Department of Pharmacy, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (D.R.); (A.S.P.); (D.L.); (J.B.); (D.M.)
| | - Marino Vilovic
- Department of Pathophysiology, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (M.K.); (M.V.); (J.V.); (D.M.)
- Laboratory for Cardiometabolic Research, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia
| | - Josip Vrdoljak
- Department of Pathophysiology, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (M.K.); (M.V.); (J.V.); (D.M.)
- Laboratory for Cardiometabolic Research, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia
| | - Dinko Martinovic
- Department of Pathophysiology, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (M.K.); (M.V.); (J.V.); (D.M.)
- Department of Maxillofacial Surgery, University Hospital of Split, Spinciceva 1, 21000 Split, Croatia
| | - Marko Grahovac
- Department of Pharmacology, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia;
| | - Josko Bozic
- Department of Pathophysiology, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (M.K.); (M.V.); (J.V.); (D.M.)
- Laboratory for Cardiometabolic Research, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia
| |
|
20
|
Asnicar F, Thomas AM, Passerini A, Waldron L, Segata N. Machine learning for microbiologists. Nat Rev Microbiol 2024; 22:191-205. [PMID: 37968359 PMCID: PMC11980903 DOI: 10.1038/s41579-023-00984-1] [Citation(s) in RCA: 44] [Impact Index Per Article: 44.0] [Accepted: 10/03/2023] [Indexed: 11/17/2023]
Abstract
Machine learning is increasingly important in microbiology where it is used for tasks such as predicting antibiotic resistance and associating human microbiome features with complex host diseases. The applications in microbiology are quickly expanding and the machine learning tools frequently used in basic and clinical research range from classification and regression to clustering and dimensionality reduction. In this Review, we examine the main machine learning concepts, tasks and applications that are relevant for experimental and clinical microbiologists. We provide the minimal toolbox for a microbiologist to be able to understand, interpret and use machine learning in their experimental and translational activities.
Affiliation(s)
- Francesco Asnicar
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Andrew Maltez Thomas
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Andrea Passerini
- Department of Information Engineering and Computer Science, University of Trento, Trento, Italy
| | - Levi Waldron
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy.
- Department of Epidemiology and Biostatistics, City University of New York, New York, NY, USA.
| | - Nicola Segata
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy.
- Department of Experimental Oncology, European Institute of Oncology IRCCS, Milan, Italy.
| |
|
21
|
Hu K, Meyer F, Deng ZL, Asgari E, Kuo TH, Münch PC, McHardy AC. Assessing computational predictions of antimicrobial resistance phenotypes from microbial genomes. Brief Bioinform 2024; 25:bbae206. [PMID: 38706320 PMCID: PMC11070729 DOI: 10.1093/bib/bbae206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 04/08/2024] [Accepted: 04/11/2024] [Indexed: 05/07/2024] Open
Abstract
The advent of rapid whole-genome sequencing has created new opportunities for computational prediction of antimicrobial resistance (AMR) phenotypes from genomic data. Both rule-based and machine learning (ML) approaches have been explored for this task, but systematic benchmarking is still needed. Here, we evaluated four state-of-the-art ML methods (Kover, PhenotypeSeeker, Seq2Geno2Pheno and Aytan-Aktug), an ML baseline and the rule-based ResFinder by training and testing each of them across 78 species-antibiotic datasets, using a rigorous benchmarking workflow that integrates three evaluation approaches, each paired with three distinct sample splitting methods. Our analysis revealed considerable variation in the performance across techniques and datasets. Whereas ML methods generally excelled for closely related strains, ResFinder excelled for handling divergent genomes. Overall, Kover most frequently ranked top among the ML approaches, followed by PhenotypeSeeker and Seq2Geno2Pheno. AMR phenotypes for antibiotic classes such as macrolides and sulfonamides were predicted with the highest accuracies. The quality of predictions varied substantially across species-antibiotic combinations, particularly for beta-lactams; across species, resistance phenotyping of the beta-lactams compound, aztreonam, amoxicillin/clavulanic acid, cefoxitin, ceftazidime and piperacillin/tazobactam, alongside tetracyclines demonstrated more variable performance than the other benchmarked antibiotics. By organism, Campylobacter jejuni and Enterococcus faecium phenotypes were more robustly predicted than those of Escherichia coli, Staphylococcus aureus, Salmonella enterica, Neisseria gonorrhoeae, Klebsiella pneumoniae, Pseudomonas aeruginosa, Acinetobacter baumannii, Streptococcus pneumoniae and Mycobacterium tuberculosis. In addition, our study provides software recommendations for each species-antibiotic combination. 
It furthermore highlights the need for optimization for robust clinical applications, particularly for strains that diverge substantially from those used for training.
Affiliation(s)
- Kaixin Hu
- Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig, Germany
| | - Fernando Meyer
- Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig, Germany
| | - Zhi-Luo Deng
- Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig, Germany
| | - Ehsaneddin Asgari
- Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
- Molecular Cell Biomechanics Laboratory, Department of Bioengineering and Mechanical Engineering, University of California, Berkeley, USA
| | - Tzu-Hao Kuo
- Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig, Germany
| | - Philipp C Münch
- Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig, Germany
- Cluster of Excellence RESIST (EXC 2155), Hannover Medical School, Hannover, Germany
- German Center for Infection Research (DZIF), partner site Hannover Braunschweig, Braunschweig, Germany
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
| | - Alice C McHardy
- Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig, Germany
|
22
|
Bundhoo E, Ghoorah AW, Jaufeerally-Fakim Y. Large-scale Pan Genomic Analysis of Mycobacterium tuberculosis Reveals Key Insights Into Molecular Evolutionary Rate of Specific Processes and Functions. Evol Bioinform Online 2024; 20:11769343241239463. [PMID: 38532808 PMCID: PMC10964447 DOI: 10.1177/11769343241239463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Accepted: 02/28/2024] [Indexed: 03/28/2024] Open
Abstract
Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis (TB), an infectious disease that is a major killer worldwide. Due to selection pressure caused by the use of antibacterial drugs, Mtb is characterised by mutational events that have given rise to multi drug resistant (MDR) and extensively drug resistant (XDR) phenotypes. The rate at which mutations occur is an important factor in the study of molecular evolution, and it helps understand gene evolution. Within the same species, different protein-coding genes evolve at different rates. To estimate the rates of molecular evolution of protein-coding genes, a commonly used parameter is the ratio dN/dS, where dN is the rate of non-synonymous substitutions and dS is the rate of synonymous substitutions. Here, we determined the estimated rates of molecular evolution of select biological processes and molecular functions across 264 strains of Mtb. We also investigated the molecular evolutionary rates of core genes of Mtb by computing the dN/dS values, and estimated the pan genome of the 264 strains of Mtb. Our results show that the cellular amino acid metabolic process and the kinase activity function evolve at a significantly higher rate, while the carbohydrate metabolic process evolves at a significantly lower rate for M. tuberculosis. These high rates of evolution correlate well with Mtb physiology and pathogenicity. We further propose that the core genome of M. tuberculosis likely experiences varying rates of molecular evolution which may drive an interplay between core genome and accessory genome during M. tuberculosis evolution.
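The dN/dS ratio defined in the abstract can be illustrated with a small calculation. The abstract does not say which estimator the authors used, so this sketch assumes a simple counting approach: the proportion of differing sites in each class is corrected for multiple substitutions with the Jukes-Cantor formula, and the two corrected rates are divided. All counts below are hypothetical.

```python
import math


def jukes_cantor(p):
    """Correct a proportion of differing sites p for multiple substitutions."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (dN, dS, dN/dS) from counts of differences and of sites."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero, so the ratio is undefined")
    return dn, ds, dn / ds


# Hypothetical counts for a single gene, for illustration only.
dn, ds, omega = dn_ds(nonsyn_diffs=6, nonsyn_sites=700.0, syn_diffs=9, syn_sites=300.0)
print(f"dN={dn:.4f} dS={ds:.4f} dN/dS={omega:.3f}")
```

A ratio below 1 is conventionally read as purifying selection, a ratio near 1 as neutral evolution, and a ratio above 1 as positive selection.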
Affiliation(s)
- Eshan Bundhoo
- Department of Agricultural & Food Science, Faculty of Agriculture, University of Mauritius, Reduit, Mauritius
| | - Anisah W Ghoorah
- Department of Digital Technologies, Faculty of Information, Communication & Digital Technologies, University of Mauritius, Reduit, Mauritius
| | - Yasmina Jaufeerally-Fakim
- Department of Agricultural & Food Science, Faculty of Agriculture, University of Mauritius, Reduit, Mauritius
|
23
|
Silva-Pereira TT, Soler-Camargo NC, Guimarães AMS. Diversification of gene content in the Mycobacterium tuberculosis complex is determined by phylogenetic and ecological signatures. Microbiol Spectr 2024; 12:e0228923. [PMID: 38230932 PMCID: PMC10871547 DOI: 10.1128/spectrum.02289-23] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/12/2023] [Accepted: 12/19/2023] [Indexed: 01/18/2024] Open
Abstract
We analyzed the pan-genome and gene content modulation of the most diverse genome data set of the Mycobacterium tuberculosis complex (MTBC) gathered to date. The closed pan-genome of the MTBC was characterized by reduced accessory and strain-specific genomes, compatible with its clonal nature. However, significantly fewer gene families were shared between MTBC genomes as their phylogenetic distance increased. This effect was only observed in inter-species comparisons, not within-species, which suggests that species-specific ecological characteristics are associated with changes in gene content. Gene loss, resulting from genomic deletions and pseudogenization, was found to drive the variation in gene content. This gene erosion differed among MTBC species and lineages, even within M. tuberculosis, where L2 showed more gene loss than L4. We also show that phylogenetic proximity is not always a good proxy for gene content relatedness in the MTBC, as the gene repertoire of Mycobacterium africanum L6 deviated from its expected phylogenetic niche conservatism. Gene disruptions of virulence factors, represented by pseudogene annotations, are mostly not conserved, being poor predictors of MTBC ecotypes. Each MTBC ecotype carries its own accessory genome, likely influenced by distinct selective pressures such as host and geography. It is important to investigate how gene loss confers new adaptive traits to MTBC strains; the detected heterogeneous gene loss poses a significant challenge in elucidating genetic factors responsible for the diverse phenotypes observed in the MTBC. By detailing specific gene losses, our study serves as a resource for researchers studying the MTBC phenotypes and their immune evasion strategies. IMPORTANCE In this study, we analyzed the gene content of different ecotypes of the Mycobacterium tuberculosis complex (MTBC), the pathogens of tuberculosis. We found that changes in their gene content are associated with their ecological features, such as host preference. Gene loss was identified as the primary driver of these changes, which can vary even among different strains of the same ecotype. Our study also revealed that the gene content relatedness of these bacteria does not always mirror their evolutionary relationships. In addition, some virulence genes can be variably lost among strains of the same MTBC ecotype, likely helping them to evade the immune system. Overall, our study highlights the importance of understanding how gene loss can lead to new adaptations in these bacteria and how different selective pressures may influence their genetic makeup.
Affiliation(s)
- Taiana Tainá Silva-Pereira
- Laboratory of Applied Research in Mycobacteria, Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, Brazil
| | - Naila Cristina Soler-Camargo
- Laboratory of Applied Research in Mycobacteria, Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, Brazil
- Department of Preventive Veterinary Medicine and Animal Health, School of Veterinary Medicine and Animal Sciences, University of São Paulo, São Paulo, Brazil
| | - Ana Marcia Sá Guimarães
- Laboratory of Applied Research in Mycobacteria, Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, Brazil
|
24
|
Coenye T. Biofilm antimicrobial susceptibility testing: where are we and where could we be going? Clin Microbiol Rev 2023; 36:e0002423. [PMID: 37812003 PMCID: PMC10732061 DOI: 10.1128/cmr.00024-23] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 03/21/2023] [Accepted: 07/27/2023] [Indexed: 10/10/2023] Open
Abstract
Our knowledge about the fundamental aspects of biofilm biology, including the mechanisms behind the reduced antimicrobial susceptibility of biofilms, has increased drastically over the last decades. However, this knowledge has so far not been translated into major changes in clinical practice. While the biofilm concept is increasingly on the radar of clinical microbiologists, physicians, and healthcare professionals in general, the standardized tools to study biofilms in the clinical microbiology laboratory are still lacking; one area in which this is particularly obvious is that of antimicrobial susceptibility testing (AST). It is generally accepted that the biofilm lifestyle has a tremendous impact on antibiotic susceptibility, yet AST is typically still carried out with planktonic cells. On top of that, the microenvironment at the site of infection is an important driver for microbial physiology and hence susceptibility; but this is poorly reflected in current AST methods. The goal of this review is to provide an overview of the state of the art concerning biofilm AST and highlight the knowledge gaps in this area. Subsequently, potential ways to improve biofilm-based AST will be discussed. Finally, bottlenecks currently preventing the use of biofilm AST in clinical practice, as well as the steps needed to get past these bottlenecks, will be discussed.
Affiliation(s)
- Tom Coenye
- Laboratory of Pharmaceutical Microbiology, Ghent University, Ghent, Belgium
|
25
|
Li T, Huang J, Yang S, Chen J, Yao Z, Zhong M, Zhong X, Ye X. Pan-Genome-Wide Association Study of Serotype 19A Pneumococci Identifies Disease-Associated Genes. Microbiol Spectr 2023; 11:e0407322. [PMID: 37358412 PMCID: PMC10433855 DOI: 10.1128/spectrum.04073-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 06/04/2023] [Indexed: 06/27/2023] Open
Abstract
Despite the widespread implementation of pneumococcal vaccines, hypervirulent Streptococcus pneumoniae serotype 19A is endemic worldwide. It is still unclear whether specific genetic elements contribute to complex pathogenicity of serotype 19A isolates. We performed a large-scale pan-genome-wide association study (pan-GWAS) of 1,292 serotype 19A isolates sampled from patients with invasive disease and asymptomatic carriers. To address the underlying disease-associated genotypes, a comprehensive analysis using three methods (Scoary, a linear mixed model, and random forest) was performed to compare disease and carriage isolates to identify genes consistently associated with disease phenotype. By using three pan-GWAS methods, we found consensus on statistically significant associations between genotypes and disease phenotypes (disease or carriage), with a subset of 30 consistently significant disease-associated genes. The results of functional annotation revealed that these disease-associated genes had diverse predicted functions, including those that participated in mobile genetic elements, antibiotic resistance, virulence, and cellular metabolism. Our findings suggest the multifactorial pathogenicity nature of this hypervirulent serotype and provide important evidence for the design of novel protein-based vaccines to prevent and control pneumococcal disease. IMPORTANCE It is important to understand the genetic and pathogenic characteristics of S. pneumoniae serotype 19A, which may provide important information for the prevention and treatment of pneumococcal disease. This global large-sample pan-GWAS study has identified a subset of 30 consistently significant disease-associated genes that are involved in mobile genetic elements, antibiotic resistance, virulence, and cellular metabolism. These findings suggest the multifactorial pathogenicity nature of hypervirulent S. pneumoniae serotype 19A isolates and provide implications for the design of novel protein-based vaccines.
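The consensus step described above, keeping only genes that all three pan-GWAS methods flag as significant, amounts to a set intersection. The sketch below assumes each method has already produced its own list of significant genes; the gene names and the function name are hypothetical placeholders, not results from the study.

```python
def consensus_genes(*hit_sets):
    """Return the genes reported as significant by every method."""
    if not hit_sets:
        return set()
    return set.intersection(*(set(hits) for hits in hit_sets))


# Hypothetical significant-gene lists from three methods, for illustration only.
scoary_hits = {"geneA", "geneB", "geneC", "geneD"}
lmm_hits = {"geneB", "geneC", "geneD", "geneE"}
random_forest_hits = {"geneC", "geneD", "geneF"}

consensus = consensus_genes(scoary_hits, lmm_hits, random_forest_hits)
print(sorted(consensus))
```

Requiring agreement across methods trades sensitivity for specificity: a gene missed by any one method is dropped, which is one plausible reason a consensus list ends up short.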
Affiliation(s)
- Ting Li
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
| | - Jiayin Huang
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
| | - Shimin Yang
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
| | - Jianyu Chen
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
| | - Zhenjiang Yao
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
| | - Minghao Zhong
- Department of Prevention and Health Care, The Sixth People’s Hospital of Dongguan City, Guangdong, China
| | - Xinguang Zhong
- Department of Prevention and Health Care, The Sixth People’s Hospital of Dongguan City, Guangdong, China
| | - Xiaohua Ye
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
|
26
|
Yang Z, Guarracino A, Biggs PJ, Black MA, Ismail N, Wold JR, Merriman TR, Prins P, Garrison E, de Ligt J. Pangenome graphs in infectious disease: a comprehensive genetic variation analysis of Neisseria meningitidis leveraging Oxford Nanopore long reads. Front Genet 2023; 14:1225248. [PMID: 37636268 PMCID: PMC10448961 DOI: 10.3389/fgene.2023.1225248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Accepted: 08/01/2023] [Indexed: 08/29/2023] Open
Abstract
Whole genome sequencing has revolutionized infectious disease surveillance for tracking and monitoring the spread and evolution of pathogens. However, using a linear reference genome for genomic analyses may introduce biases, especially when studies are conducted on highly variable bacterial genomes of the same species. Pangenome graphs provide an efficient model for representing and analyzing multiple genomes and their variants as a graph structure that includes all types of variations. In this study, we present a practical bioinformatics pipeline that employs the PanGenome Graph Builder and the Variation Graph toolkit to build pangenomes from assembled genomes, align whole genome sequencing data and call variants against a graph reference. The pangenome graph enables the identification of structural variants, rearrangements, and small variants (e.g., single nucleotide polymorphisms and insertions/deletions) simultaneously. We demonstrate that using a pangenome graph, instead of a single linear reference genome, improves mapping rates and variant calling for both simulated and real datasets of the pathogen Neisseria meningitidis. Overall, pangenome graphs offer a promising approach for comparative genomics and comprehensive genetic variation analysis in infectious disease. Moreover, this innovative pipeline, leveraging pangenome graphs, can bridge variant analysis, genome assembly, population genetics, and evolutionary biology, expanding the reach of genomic understanding and applications.
Affiliation(s)
- Zuyu Yang
- Institute of Environmental Science and Research, Porirua, New Zealand
| | - Andrea Guarracino
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, United States
- Genomics Research Centre, Human Technopole, Milan, Italy
| | - Patrick J. Biggs
- Molecular Biosciences Group, School of Natural Sciences, Massey University, Palmerston North, New Zealand
- Molecular Epidemiology and Public Health Laboratory, School of Veterinary Science, Massey University, Palmerston North, New Zealand
| | - Michael A. Black
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Nuzla Ismail
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Jana Renee Wold
- School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
| | - Tony R. Merriman
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Division of Clinical Immunology and Rheumatology, University of Alabama at Birmingham, Birmingham, AL, United States
| | - Pjotr Prins
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, United States
| | - Erik Garrison
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, United States
| | - Joep de Ligt
- Institute of Environmental Science and Research, Porirua, New Zealand
|
27
|
Baker M, Zhang X, Maciel-Guerra A, Dong Y, Wang W, Hu Y, Renney D, Hu Y, Liu L, Li H, Tong Z, Zhang M, Geng Y, Zhao L, Hao Z, Senin N, Chen J, Peng Z, Li F, Dottorini T. Machine learning and metagenomics reveal shared antimicrobial resistance profiles across multiple chicken farms and abattoirs in China. Nature Food 2023; 4:707-720. [PMID: 37563495 PMCID: PMC10444626 DOI: 10.1038/s43016-023-00814-w] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 01/09/2023] [Accepted: 07/07/2023] [Indexed: 08/12/2023]
Abstract
China is the largest global consumer of antimicrobials and improving surveillance methods could help to reduce antimicrobial resistance (AMR) spread. Here we report the surveillance of ten large-scale chicken farms and four connected abattoirs in three Chinese provinces over 2.5 years. Using a data mining approach based on machine learning, we analysed 461 microbiomes from birds, carcasses and environments, identifying 145 potentially mobile antibiotic resistance genes (ARGs) shared between chickens and environments across all farms. A core set of 233 ARGs and 186 microbial species extracted from the chicken gut microbiome correlated with the AMR profiles of Escherichia coli colonizing the same gut, including Arcobacter, Acinetobacter and Sphingobacterium, clinically relevant for humans, and 38 clinically relevant ARGs. Temperature and humidity in the barns were also correlated with ARG presence. We reveal an intricate network of correlations between environments, microbial communities and AMR, suggesting multiple routes to improving AMR surveillance in livestock production.
Affiliation(s)
- Michelle Baker
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, UK
| | - Xibin Zhang
- Shandong New Hope Liuhe Group Co. Ltd and Qingdao Key Laboratory of Animal Feed Safety, Qingdao, People's Republic of China
| | | | - Yinping Dong
- NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, People's Republic of China
| | - Wei Wang
- NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, People's Republic of China
| | - Yujie Hu
- NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, People's Republic of China
| | - David Renney
- Nimrod Veterinary Products Ltd., Moreton-in-Marsh, UK
| | - Yue Hu
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, UK
| | - Longhai Liu
- Shandong Kaijia Food Co., Weifang, People's Republic of China
| | - Hui Li
- Luoyang Center for Disease Control and Prevention, Luoyang City, People's Republic of China
| | - Zhiqin Tong
- Luoyang Center for Disease Control and Prevention, Luoyang City, People's Republic of China
| | - Meimei Zhang
- Liaoning Provincial Center for Disease Control and Prevention, Shenyang City, People's Republic of China
| | - Yingzhi Geng
- Liaoning Provincial Center for Disease Control and Prevention, Shenyang City, People's Republic of China
| | - Li Zhao
- Agricultural Biopharmaceutical Laboratory, College of Chemistry and Pharmaceutical Sciences, Qingdao Agricultural University, Qingdao City, People's Republic of China
| | - Zhihui Hao
- Chinese Veterinary Medicine Innovation Center, College of Veterinary Medicine, China Agricultural University, Beijing City, People's Republic of China
| | - Nicola Senin
- Department of Engineering, University of Perugia, Perugia, Italy
| | - Junshi Chen
- NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, People's Republic of China
| | - Zixin Peng
- NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, People's Republic of China
| | - Fengqin Li
- NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, People's Republic of China
| | - Tania Dottorini
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, UK
- Centre for Smart Food Research, Nottingham Ningbo China Beacons of Excellence Research and Innovation Institute, University of Nottingham Ningbo China, Ningbo, People's Republic of China
|
28
|
Perea-Jacobo R, Paredes-Gutiérrez GR, Guerrero-Chevannier MÁ, Flores DL, Muñiz-Salazar R. Machine Learning of the Whole Genome Sequence of Mycobacterium tuberculosis: A Scoping PRISMA-Based Review. Microorganisms 2023; 11:1872. [PMID: 37630431 PMCID: PMC10456961 DOI: 10.3390/microorganisms11081872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 07/13/2023] [Accepted: 07/14/2023] [Indexed: 08/27/2023] Open
Abstract
Tuberculosis (TB) remains one of the most significant global health problems, posing a significant challenge to public health systems worldwide. However, diagnosing drug-resistant tuberculosis (DR-TB) has become increasingly challenging due to the rising number of multidrug-resistant (MDR-TB) cases, despite the development of new TB diagnostic tools. Even the World Health Organization-recommended methods such as Xpert MTB/XDR or Truenat are unable to detect all the Mycobacterium tuberculosis genome mutations associated with drug resistance. While Whole Genome Sequencing offers a more precise DR profile, the lack of user-friendly bioinformatics analysis applications hinders its widespread use. This review focuses on exploring various artificial intelligence models for predicting DR-TB profiles, analyzing relevant English-language articles using the PRISMA methodology through the Covidence platform. Our findings indicate that an Artificial Neural Network is the most commonly employed method, with non-statistical dimensionality reduction techniques preferred over traditional statistical approaches such as Principal Component Analysis or t-distributed Stochastic Neighbor Embedding.
Affiliation(s)
- Ricardo Perea-Jacobo
- Facultad de Ingeniería Arquitectura y Diseño, Universidad Autónoma de Baja California, Campus Ensenada, Ensenada 22860, Mexico; (R.P.-J.); (G.R.P.-G.); (M.Á.G.-C.)
- Escuela de Ciencias de la Salud, Universidad Autónoma de Baja California, Campus Ensenada, Ensenada 22890, Mexico
| | - Guillermo René Paredes-Gutiérrez
- Facultad de Ingeniería Arquitectura y Diseño, Universidad Autónoma de Baja California, Campus Ensenada, Ensenada 22860, Mexico; (R.P.-J.); (G.R.P.-G.); (M.Á.G.-C.)
| | - Miguel Ángel Guerrero-Chevannier
- Facultad de Ingeniería Arquitectura y Diseño, Universidad Autónoma de Baja California, Campus Ensenada, Ensenada 22860, Mexico; (R.P.-J.); (G.R.P.-G.); (M.Á.G.-C.)
| | - Dora-Luz Flores
- Facultad de Ingeniería Arquitectura y Diseño, Universidad Autónoma de Baja California, Campus Ensenada, Ensenada 22860, Mexico; (R.P.-J.); (G.R.P.-G.); (M.Á.G.-C.)
| | - Raquel Muñiz-Salazar
- Escuela de Ciencias de la Salud, Universidad Autónoma de Baja California, Campus Ensenada, Ensenada 22890, Mexico
|
29
|
Wong F, de la Fuente-Nunez C, Collins JJ. Leveraging artificial intelligence in the fight against infectious diseases. Science 2023; 381:164-170. [PMID: 37440620 PMCID: PMC10663167 DOI: 10.1126/science.adh1114] [Citation(s) in RCA: 96] [Impact Index Per Article: 48.0] [Received: 04/01/2023] [Accepted: 06/05/2023] [Indexed: 07/15/2023]
Abstract
Despite advances in molecular biology, genetics, computation, and medicinal chemistry, infectious disease remains an ominous threat to public health. Addressing the challenges posed by pathogen outbreaks, pandemics, and antimicrobial resistance will require concerted interdisciplinary efforts. In conjunction with systems and synthetic biology, artificial intelligence (AI) is now leading to rapid progress, expanding anti-infective drug discovery, enhancing our understanding of infection biology, and accelerating the development of diagnostics. In this Review, we discuss approaches for detecting, treating, and understanding infectious diseases, underscoring the progress supported by AI in each case. We suggest future applications of AI and how it might be harnessed to help control infectious disease outbreaks and pandemics.
Affiliation(s)
- Felix Wong
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Institute for Medical Engineering & Science and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Cesar de la Fuente-Nunez
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - James J. Collins
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Institute for Medical Engineering & Science and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
|
30
|
Karlsen ST, Rau MH, Sánchez BJ, Jensen K, Zeidan AA. From genotype to phenotype: computational approaches for inferring microbial traits relevant to the food industry. FEMS Microbiol Rev 2023; 47:fuad030. [PMID: 37286882 PMCID: PMC10337747 DOI: 10.1093/femsre/fuad030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 05/31/2023] [Accepted: 06/06/2023] [Indexed: 06/09/2023] Open
Abstract
When selecting microbial strains for the production of fermented foods, various microbial phenotypes need to be taken into account to achieve target product characteristics, such as biosafety, flavor, texture, and health-promoting effects. Through continuous advances in sequencing technologies, microbial whole-genome sequences of increasing quality can now be obtained both cheaper and faster, which increases the relevance of genome-based characterization of microbial phenotypes. Prediction of microbial phenotypes from genome sequences makes it possible to quickly screen large strain collections in silico to identify candidates with desirable traits. Several microbial phenotypes relevant to the production of fermented foods can be predicted using knowledge-based approaches, leveraging our existing understanding of the genetic and molecular mechanisms underlying those phenotypes. In the absence of this knowledge, data-driven approaches can be applied to estimate genotype-phenotype relationships based on large experimental datasets. Here, we review computational methods that implement knowledge- and data-driven approaches for phenotype prediction, as well as methods that combine elements from both approaches. Furthermore, we provide examples of how these methods have been applied in industrial biotechnology, with special focus on the fermented food industry.
Affiliation(s)
- Signe T Karlsen
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
| | - Martin H Rau
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
| | - Benjamín J Sánchez
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
| | - Kristian Jensen
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
| | - Ahmad A Zeidan
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
|
31
|
Negrete-Paz AM, Vázquez-Marrufo G, Gutiérrez-Moraga A, Vázquez-Garcidueñas MS. Pangenome Reconstruction of Mycobacterium tuberculosis as a Guide to Reveal Genomic Features Associated with Strain Clinical Phenotype. Microorganisms 2023; 11:1495. [PMID: 37374997 DOI: 10.3390/microorganisms11061495] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/07/2023] [Revised: 05/31/2023] [Accepted: 06/02/2023] [Indexed: 06/29/2023] Open
Abstract
Tuberculosis (TB) is one of the leading causes of human deaths worldwide caused by infectious diseases. TB infection by Mycobacterium tuberculosis can occur in the lungs, causing pulmonary tuberculosis (PTB), or in any other organ of the body, resulting in extrapulmonary tuberculosis (EPTB). There is no consensus on the genetic determinants of this pathogen that may contribute to EPTB. In this study, we constructed the M. tuberculosis pangenome and used it as a tool to seek genomic signatures associated with the clinical presentation of TB based on its accessory genome differences. The analysis carried out in the present study includes the raw reads of 490 M. tuberculosis genomes (PTB n = 245, EPTB n = 245) retrieved from public databases that were assembled, as well as ten genomes from Mexican strains (PTB n = 5, EPTB n = 5) that were sequenced and assembled. All genomes were annotated and then used to construct the pangenome with Roary and Panaroo. The pangenome obtained using Roary consisted of 2231 core genes and 3729 accessory genes. On the other hand, the pangenome resulting from Panaroo consisted of 2130 core genes and 5598 accessory genes. Associations between the distribution of accessory genes and the PTB/EPTB phenotypes were examined using the Scoary and Pyseer tools. Both tools found a significant association between the hspR, plcD, Rv2550c, pe_pgrs5, pe_pgrs25, and pe_pgrs57 genes and the PTB phenotype. In contrast, the deletion of the aceA, esxR, plcA, and ppe50 genes was significantly associated with the EPTB phenotype. Rv1759c and Rv3740 were found to be associated with the PTB phenotype according to Scoary; however, these associations were not observed when using Pyseer. The robustness of the constructed pangenome and the gene-phenotype associations is supported by several factors, including the analysis of a large number of genomes, the inclusion of the same number of PTB/EPTB genomes, and the reproducibility of results thanks to the different bioinformatic tools used. Such characteristics surpass most previous M. tuberculosis pangenomes. Thus, it can be inferred that the deletion of these genes can lead to changes in the processes involved in stress response and fatty acid metabolism, conferring phenotypic advantages associated with pulmonary or extrapulmonary presentation of TB. This study represents the first attempt to use the pangenome to seek gene-phenotype associations in M. tuberculosis.
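The core/accessory distinction reported above can be illustrated from a gene presence/absence table. This is a minimal sketch, not the Roary or Panaroo implementation: the 0.99 cutoff is an assumption rather than a setting reported in the abstract, and the strain IDs and presence pattern are hypothetical (the gene names are used only as labels).

```python
def split_pangenome(presence, genomes, core_fraction=0.99):
    """Partition genes into core and accessory sets.

    presence: dict mapping gene name -> set of genome IDs that carry the gene.
    genomes: set of all genome IDs in the analysis.
    core_fraction: assumed minimum fraction of genomes for a gene to count as core.
    """
    n = len(genomes)
    if n == 0:
        raise ValueError("at least one genome is required")
    core = {
        gene
        for gene, carriers in presence.items()
        if len(carriers & genomes) / n >= core_fraction
    }
    accessory = set(presence) - core
    return core, accessory


# Hypothetical presence/absence data, for illustration only.
genomes = {"s1", "s2", "s3", "s4"}
presence = {
    "rpoB": {"s1", "s2", "s3", "s4"},
    "plcD": {"s1", "s2"},
    "esxR": {"s1", "s2", "s3"},
}
core, accessory = split_pangenome(presence, genomes)
print(sorted(core), sorted(accessory))
```

Association tools such as Scoary and Pyseer then test the accessory genes, whose presence varies between strains, against the phenotype labels.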
Affiliation(s)
- Andrea Monserrat Negrete-Paz
- División de Estudios de Posgrado, Facultad de Ciencias Médicas y Biológicas "Dr. Ignacio Chávez", Universidad Michoacana de San Nicolás de Hidalgo, Morelia 58020, Michoacán, Mexico
- Centro Multidisciplinario de Estudios en Biotecnología, Facultad de Medicina Veterinaria y Zootecnia, Universidad Michoacana de San Nicolás de Hidalgo, Tarímbaro 58893, Michoacán, Mexico
| | - Gerardo Vázquez-Marrufo
- Centro Multidisciplinario de Estudios en Biotecnología, Facultad de Medicina Veterinaria y Zootecnia, Universidad Michoacana de San Nicolás de Hidalgo, Tarímbaro 58893, Michoacán, Mexico
| | - Ana Gutiérrez-Moraga
- Instituto de Ciencias Biomédicas, Vicerrectoría de Investigación y Doctorados, Universidad Autónoma de Chile, Santiago 7500912, Chile
| | - Ma Soledad Vázquez-Garcidueñas
- División de Estudios de Posgrado, Facultad de Ciencias Médicas y Biológicas "Dr. Ignacio Chávez", Universidad Michoacana de San Nicolás de Hidalgo, Morelia 58020, Michoacán, Mexico
| |
|
32
|
Green AG, Vargas R, Marin MG, Freschi L, Xie J, Farhat MR. Analysis of Genome-Wide Mutational Dependence in Naturally Evolving Mycobacterium tuberculosis Populations. Mol Biol Evol 2023; 40:msad131. [PMID: 37352142 PMCID: PMC10292908 DOI: 10.1093/molbev/msad131] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/29/2022] [Revised: 05/12/2023] [Accepted: 05/23/2023] [Indexed: 06/25/2023] Open
Abstract
Pathogenic microorganisms are in a perpetual struggle for survival in changing host environments, where host pressures necessitate changes in pathogen virulence, antibiotic resistance, or transmissibility. The genetic basis of phenotypic adaptation by pathogens is difficult to study in vivo. In this work, we develop a phylogenetic method to detect genetic dependencies that promote pathogen adaptation using 31,428 in vivo sampled Mycobacterium tuberculosis genomes, a globally prevalent bacterial pathogen with increasing levels of antibiotic resistance. We find that dependencies between mutations are enriched in antigenic and antibiotic resistance functions and discover 23 mutations that potentiate the development of antibiotic resistance. Between 11% and 92% of resistant strains harbor a dependent mutation acquired after a resistance-conferring variant. We demonstrate the pervasiveness of genetic dependency in adaptation of naturally evolving populations and the utility of the proposed computational approach.
Affiliation(s)
- Anna G Green
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Roger Vargas
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Center for Computational Biomedicine, Harvard Medical School, Boston, MA, USA
| | - Maximillian G Marin
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Luca Freschi
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Jiaqi Xie
- Department of Genetics, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Maha R Farhat
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Division of Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, MA, USA
| |
|
33
|
Yang MR, Su SF, Wu YW. Using bacterial pan-genome-based feature selection approach to improve the prediction of minimum inhibitory concentration (MIC). Front Genet 2023; 14:1054032. [PMID: 37323667 PMCID: PMC10267731 DOI: 10.3389/fgene.2023.1054032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Accepted: 05/16/2023] [Indexed: 06/17/2023] Open
Abstract
Background: Predicting the resistance profiles of antimicrobial resistance (AMR) pathogens is becoming more and more important in treating infectious diseases. Various attempts have been made to build machine learning models to classify resistant or susceptible pathogens based on either known antimicrobial resistance genes or the entire gene set. However, the phenotypic annotations are translated from minimum inhibitory concentration (MIC), which is the lowest concentration of antibiotic drugs in inhibiting certain pathogenic strains. Since the MIC breakpoints that classify a strain to be resistant or susceptible to specific antibiotic drug may be revised by governing institutes, we refrained from translating these MIC values into the categories "susceptible" or "resistant" but instead attempted to predict the MIC values using machine learning approaches. Results: By applying a machine learning feature selection approach on a Salmonella enterica pan-genome, in which the protein sequences were clustered to identify highly similar gene families, we showed that the selected features (genes) performed better than known AMR genes, and that models built on the selected genes achieved very accurate MIC prediction. Functional analysis revealed that about half of the selected genes were annotated as hypothetical proteins (i.e., with unknown functional roles), and that only a small portion of known AMR genes were among the selected genes, indicating that applying feature selection on the entire gene set has the potential of uncovering novel genes that may be associated with and may contribute to pathogenic antimicrobial resistances. Conclusion: The application of the pan-genome-based machine learning approach was indeed capable of predicting MIC values with very high accuracy. The feature selection process may also identify novel AMR genes for inferring bacterial antimicrobial resistance phenotypes.
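The motivation given in this abstract, that revised breakpoints can change a strain's category while its measured MIC stays the same, can be shown with a small sketch. The MIC values and both breakpoints below are hypothetical, and the "MIC at or above the breakpoint means resistant" rule is a simplifying assumption rather than the convention of any particular standards body.

```python
def categorize(mic_mg_per_l, resistant_breakpoint):
    """Label a strain from its MIC (mg/L) given a resistance breakpoint (mg/L)."""
    return "resistant" if mic_mg_per_l >= resistant_breakpoint else "susceptible"


# Hypothetical MIC measurements (mg/L) for three isolates.
mics = {"isolate_A": 0.5, "isolate_B": 2.0, "isolate_C": 8.0}

# Two hypothetical versions of the breakpoint for the same drug.
old_labels = {k: categorize(v, resistant_breakpoint=4.0) for k, v in mics.items()}
new_labels = {k: categorize(v, resistant_breakpoint=2.0) for k, v in mics.items()}

# Isolates whose category flips even though their MIC did not change.
changed = [k for k in mics if old_labels[k] != new_labels[k]]
print(changed)
```

A model trained on the categorical labels would have to be relabelled and retrained after such a revision, whereas a model that predicts the MIC itself only needs the new threshold applied to its output.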
Affiliation(s)
- Ming-Ren Yang
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
| | - Shun-Feng Su
- Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
| | - Yu-Wei Wu
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
- TMU Research Center for Digestive Medicine, Taipei Medical University, Taipei, Taiwan
| |
|
34
|
|
35
|
Tharmakulasingam M, Gardner B, La Ragione R, Fernando A. Rectified Classifier Chains for Prediction of Antibiotic Resistance From Multi-Labelled Data With Missing Labels. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:625-636. [PMID: 35130168 DOI: 10.1109/tcbb.2022.3148577] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/04/2023]
Abstract
Predicting Antimicrobial Resistance (AMR) from genomic data has important implications for human and animal healthcare, especially given its potential for more rapid diagnostics and informed treatment choices. With the recent advances in sequencing technologies, applying machine learning techniques to AMR prediction has shown promising results. Despite this, there are shortcomings in the literature concerning methodologies suitable for multi-drug AMR prediction, especially where samples with missing labels exist. To address this shortcoming, we introduce a Rectified Classifier Chain (RCC) method for predicting multi-drug resistance. This RCC method was tested using annotated features of genomics sequences and compared with similar multi-label classification methodologies. We found that applying the eXtreme Gradient Boosting (XGBoost) base model to our RCC model outperformed the second-best model, an XGBoost-based binary relevance model, by 3.3% in Hamming accuracy and 7.8% in F1-score. Additionally, we note that machine learning models applied to AMR prediction in the literature are typically unsuitable for identifying biomarkers informative of their decisions; in this study, we show that biomarkers contributing to AMR prediction can also be identified using the proposed RCC method. We expect this can facilitate genome annotation and pave the path towards identifying new biomarkers indicative of AMR.
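Hamming accuracy, one of the two metrics quoted in this abstract, is the fraction of individual drug labels predicted correctly. The sketch below shows one plausible way to compute it when some true labels are missing, by skipping those entries. This is an assumption for illustration: the abstract does not state how the authors score missing labels, and the toy labels are invented.

```python
def hamming_accuracy(y_true, y_pred):
    """Fraction of correctly predicted labels, skipping missing (None) true labels."""
    correct = 0
    counted = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t is None:
                continue  # no ground truth for this isolate/drug pair
            counted += 1
            correct += int(t == p)
    if counted == 0:
        raise ValueError("no observed labels to score")
    return correct / counted


# Toy data: rows are isolates, columns are drugs, 1 = resistant, 0 = susceptible.
y_true = [[1, 0, None], [0, None, 1], [1, 1, 0]]
y_pred = [[1, 0, 1], [1, 0, 1], [1, 0, 0]]
score = hamming_accuracy(y_true, y_pred)
print(f"{score:.3f}")
```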
|
36
|
Maciel-Guerra A, Baker M, Hu Y, Wang W, Zhang X, Rong J, Zhang Y, Zhang J, Kaler J, Renney D, Loose M, Emes RD, Liu L, Chen J, Peng Z, Li F, Dottorini T. Dissecting microbial communities and resistomes for interconnected humans, soil, and livestock. The ISME Journal 2023; 17:21-35. [PMID: 36151458 PMCID: PMC9751072 DOI: 10.1038/s41396-022-01315-7] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 11/16/2021] [Revised: 08/26/2022] [Accepted: 09/01/2022] [Indexed: 12/24/2022]
Abstract
A debate is currently ongoing as to whether intensive livestock farms may constitute reservoirs of clinically relevant antimicrobial resistance (AMR), thus posing a threat to surrounding communities. Here, combining shotgun metagenome sequencing, machine learning (ML), and culture-based methods, we focused on a poultry farm and connected slaughterhouse in China, investigating the gut microbiome of livestock, workers and their households, and microbial communities in carcasses and soil. For both the microbiome and resistomes in this study, differences are observed across environments and hosts. However, at a finer scale, several similar clinically relevant antimicrobial resistance genes (ARGs) and similar associated mobile genetic elements were found in both human and broiler chicken samples. Next, we focused on Escherichia coli, an important indicator for the surveillance of AMR on the farm. Strains of E. coli were found intermixed between humans and chickens. We observed that several ARGs present in the chicken faecal resistome showed correlation to resistance/susceptibility profiles of E. coli isolates cultured from the same samples. Finally, by using environmental sensing these ARGs were found to be correlated to variations in environmental temperature and humidity. Our results show the importance of adopting a multi-domain and multi-scale approach when studying microbial communities and AMR in complex, interconnected environments.
Affiliation(s)
- Alexandre Maciel-Guerra
- School of Veterinary Medicine and Science, University of Nottingham, College Road, Sutton Bonington, Leicestershire, LE12 5RD UK
| | - Michelle Baker
- School of Veterinary Medicine and Science, University of Nottingham, College Road, Sutton Bonington, Leicestershire, LE12 5RD UK
| | - Yue Hu
- School of Veterinary Medicine and Science, University of Nottingham, College Road, Sutton Bonington, Leicestershire, LE12 5RD UK
| | - Wei Wang
- NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, 100021 People’s Republic of China
| | - Xibin Zhang
- New Hope Liuhe Co., Ltd., Laboratory of Feed and Livestock and Poultry Products Quality & Safety Control, Ministry of Agriculture, Beijing 100102 and Weifang Heshengyuan Food Co. Ltd., Weifang, 262167 People’s Republic of China
| | - Jia Rong
- New Hope Liuhe Co., Ltd., Laboratory of Feed and Livestock and Poultry Products Quality & Safety Control, Ministry of Agriculture, Beijing 100102 and Weifang Heshengyuan Food Co. Ltd., Weifang, 262167 People’s Republic of China
| | - Yimin Zhang
- College of Food Science and Engineering, Shandong Agricultural University, Tai’an, Shandong 271018 People’s Republic of China
| | - Jing Zhang
- NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, 100021 People’s Republic of China
| | - Jasmeet Kaler
- School of Veterinary Medicine and Science, University of Nottingham, College Road, Sutton Bonington, Leicestershire, LE12 5RD UK
| | - David Renney
- Nimrod Veterinary Products Limited, 2, Wychwood Court, Cotswold Business Village, Moreton-in-Marsh, GL56 0JQ UK
| | - Matthew Loose
- DeepSeq, School of Life Sciences, Queens Medical Centre, University of Nottingham, Nottingham, NG7 2UH UK
| | - Richard D. Emes
- School of Veterinary Medicine and Science, University of Nottingham, College Road, Sutton Bonington, Leicestershire, LE12 5RD UK
| | - Longhai Liu
- New Hope Liuhe Co., Ltd., Laboratory of Feed and Livestock and Poultry Products Quality & Safety Control, Ministry of Agriculture, Beijing 100102 and Weifang Heshengyuan Food Co. Ltd., Weifang, 262167 People’s Republic of China
| | - Junshi Chen
- NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, 100021 People’s Republic of China
| | - Zixin Peng
- NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, 100021, People's Republic of China.
| | - Fengqin Li
- NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, 100021, People's Republic of China.
| | - Tania Dottorini
- School of Veterinary Medicine and Science, University of Nottingham, College Road, Sutton Bonington, Leicestershire, LE12 5RD, UK.
| |
|
37
|
Khan MA, Amin A, Farid A, Ullah A, Waris A, Shinwari K, Hussain Y, Alsharif KF, Alzahrani KJ, Khan H. Recent Advances in Genomics-Based Approaches for the Development of Intracellular Bacterial Pathogen Vaccines. Pharmaceutics 2022; 15:pharmaceutics15010152. [PMID: 36678781 PMCID: PMC9863128 DOI: 10.3390/pharmaceutics15010152] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/11/2022] [Revised: 12/12/2022] [Accepted: 12/19/2022] [Indexed: 01/04/2023] Open
Abstract
Infectious diseases continue to be a leading cause of morbidity and mortality worldwide. The majority of infectious diseases are caused by intracellular pathogenic bacteria (IPB). Historically, conventional vaccination drives have helped control the pathogenesis of intracellular bacteria and the emergence of antimicrobial resistance, saving millions of lives. However, in light of various limitations, many diseases that involve IPB still do not have adequate vaccines. In response to increasing demand for novel vaccine development strategies, a new area of vaccine research emerged following the advent of genomics technology, which changed the paradigm of vaccine development by utilizing the complete genomic data of microorganisms against them. It became possible to identify genes related to disease virulence, genetic patterns linked to disease virulence, as well as the genetic components that supported immunity and favorable vaccine responses. Complete genomic databases, and advancements in transcriptomics, metabolomics, structural genomics, proteomics, immunomics, pan-genomics, synthetic genomics, and population biology have allowed researchers to identify potential vaccine candidates and predict their effects in patients. New vaccines have been created against diseases for which previously there were no vaccines available, and existing vaccines have been improved. This review highlights the key issues and explores the evolution of vaccines. The increasing volume of IPB genomic data, and their application in novel genome-based techniques for vaccine development, were also examined, along with their characteristics, and the opportunities and obstacles involved. Critically, the application of genomics technology has helped researchers rapidly select and evaluate candidate antigens. Novel vaccines capable of addressing the limitations associated with conventional vaccines have been developed and pressing healthcare issues are being addressed.
Affiliation(s)
- Muhammad Ajmal Khan
- Division of Life Science, Center for Cancer Research, and State Key Lab of Molecular Neuroscience, Hong Kong University of Science and Technology, Hong Kong, China
- Correspondence: (M.A.K.); or (H.K.)
| | - Aftab Amin
- Division of Life Science, Center for Cancer Research, and State Key Lab of Molecular Neuroscience, Hong Kong University of Science and Technology, Hong Kong, China
| | - Awais Farid
- Division of Environment and Sustainability, Hong Kong University of Science and Technology, Hong Kong, China
| | - Amin Ullah
- Molecular Virology Laboratory, Department of Microbiology and Biotechnology, Abasyn University, Peshawar 25000, Pakistan
| | - Abdul Waris
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China
| | - Khyber Shinwari
- Institute of Chemical Engineering, Department Immuno-Chemistry, Ural Federal University, Yekaterinburg 620002, Russia
| | - Yaseen Hussain
- Department of Pharmacy, Abdul Wali Khan University Mardan, Mardan 23200, Pakistan
| | - Khalaf F. Alsharif
- Department of Clinical Laboratory, College of Applied Medical Science, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
| | - Khalid J. Alzahrani
- Department of Clinical Laboratory, College of Applied Medical Science, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
| | - Haroon Khan
- Department of Clinical Laboratory, College of Applied Medical Science, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Correspondence: (M.A.K.); or (H.K.)
| |
|
38
|
Kim JI, Maguire F, Tsang KK, Gouliouris T, Peacock SJ, McAllister TA, McArthur AG, Beiko RG. Machine Learning for Antimicrobial Resistance Prediction: Current Practice, Limitations, and Clinical Perspective. Clin Microbiol Rev 2022; 35:e0017921. [PMID: 35612324 PMCID: PMC9491192 DOI: 10.1128/cmr.00179-21] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Indexed: 12/17/2022] Open
Abstract
Antimicrobial resistance (AMR) is a global health crisis that poses a great threat to modern medicine. Effective prevention strategies are urgently required to slow the emergence and further dissemination of AMR. Given the availability of data sets encompassing hundreds or thousands of pathogen genomes, machine learning (ML) is increasingly being used to predict resistance to different antibiotics in pathogens based on gene content and genome composition. A key objective of this work is to advocate for the incorporation of ML into front-line settings but also highlight the further refinements that are necessary to safely and confidently incorporate these methods. The question of what to predict is not trivial given the existence of different quantitative and qualitative laboratory measures of AMR. ML models typically treat genes as independent predictors, with no consideration of structural and functional linkages; they also may not be accurate when new mutational variants of known AMR genes emerge. Finally, to have the technology trusted by end users in public health settings, ML models need to be transparent and explainable to ensure that the basis for prediction is clear. We strongly advocate that the next set of AMR-ML studies should focus on the refinement of these limitations to be able to bridge the gap to diagnostic implementation.
Affiliation(s)
- Jee In Kim
- Faculty of Computer Science, Dalhousie University, Halifax, Canada
- Institute for Comparative Genomics, Dalhousie University, Halifax, Canada
- Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, Canada
| | - Finlay Maguire
- Faculty of Computer Science, Dalhousie University, Halifax, Canada
- Institute for Comparative Genomics, Dalhousie University, Halifax, Canada
- Department of Community Health and Epidemiology, Faculty of Medicine, Dalhousie University, Halifax, Canada
- Shared Hospital Laboratory, Toronto, Canada
- Sunnybrook Research Institute, Sunnybrook Health Sciences Centre, Toronto, Canada
| | - Kara K. Tsang
- London School of Hygiene & Tropical Medicine, London, United Kingdom
| | - Theodore Gouliouris
- Department of Medicine, University of Cambridge, Cambridge, United Kingdom
- Clinical Microbiology and Public Health Laboratory, Public Health England, Cambridge, United Kingdom
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
| | - Sharon J. Peacock
- Department of Medicine, University of Cambridge, Cambridge, United Kingdom
| | - Tim A. McAllister
- Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, Canada
| | - Andrew G. McArthur
- David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Canada
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Canada
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Canada
| | - Robert G. Beiko
- Faculty of Computer Science, Dalhousie University, Halifax, Canada
- Institute for Comparative Genomics, Dalhousie University, Halifax, Canada
| |
|
39
|
Romano GE, Silva-Pereira TT, de Melo FM, Sisco MC, Banari AC, Zimpel CK, Soler-Camargo NC, Guimarães AMDS. Unraveling the metabolism of Mycobacterium caprae using comparative genomics. Tuberculosis (Edinb) 2022; 136:102254. [PMID: 36126496 DOI: 10.1016/j.tube.2022.102254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2022] [Revised: 08/01/2022] [Accepted: 08/25/2022] [Indexed: 11/19/2022]
Abstract
In our laboratory, Mycobacterium caprae has poor growth in standard medium (SM) 7H9-OADC supplemented with pyruvate and Tween-80. Our objectives were to identify mutations affecting M. caprae metabolism and use this information to design a culture medium to improve its growth. We selected 77 M. caprae genomes and sequenced M. caprae NLA000201913 used in our experiments. Mutations present in >95% of the strains compared to Mycobacterium tuberculosis H37Rv were analyzed in silico for their deleterious effects on proteins of metabolic pathways. Apart from the known defect in the pyruvate kinase, M. caprae has important lesions in enzymes of the TCA cycle, methylmalonyl cycle, B12 metabolism, and electron-transport chain. We provide evidence of enzymatic redundancy elimination and epistatic mutations, and possible production of toxic metabolites hindering M. caprae growth in vitro. A newly designed SM supplemented with l-glutamate allowed faster growth and increased final microbial mass of M. caprae. However, possible accumulation of metabolic waste-products and/or nutritional limitations halted M. caprae growth prior to a M. tuberculosis-like stationary phase. Our findings suggest that M. caprae relies on GABA and/or glyoxylate shunts for in vitro growth in routine media. The newly developed medium will improve experiments with this bacterium by allowing faster growth in vitro.
Affiliation(s)
- Giovanni Emiddio Romano
- Laboratory of Applied Research in Mycobacteria (LaPAM), Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, 1374 Prof Lineu Prestes Avenue, Room 229, São Paulo, SP, 05508-000, Brazil.
| | - Taiana Tainá Silva-Pereira
- Laboratory of Applied Research in Mycobacteria (LaPAM), Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, 1374 Prof Lineu Prestes Avenue, Room 229, São Paulo, SP, 05508-000, Brazil.
| | - Filipe Menegatti de Melo
- Laboratory of Applied Research in Mycobacteria (LaPAM), Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, 1374 Prof Lineu Prestes Avenue, Room 229, São Paulo, SP, 05508-000, Brazil.
| | - Maria Carolina Sisco
- Laboratory of Applied Research in Mycobacteria (LaPAM), Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, 1374 Prof Lineu Prestes Avenue, Room 229, São Paulo, SP, 05508-000, Brazil.
| | - Alexandre Campos Banari
- Laboratory of Applied Research in Mycobacteria (LaPAM), Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, 1374 Prof Lineu Prestes Avenue, Room 229, São Paulo, SP, 05508-000, Brazil; Department of Preventive Veterinary Medicine and Animal Health, College of Veterinary Medicine, University of São Paulo, 87 Prof Dr Orlando Marques de Paiva Avenue, São Paulo, SP, 05508-270, Brazil.
| | - Cristina Kraemer Zimpel
- Laboratory of Applied Research in Mycobacteria (LaPAM), Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, 1374 Prof Lineu Prestes Avenue, Room 229, São Paulo, SP, 05508-000, Brazil; Department of Preventive Veterinary Medicine and Animal Health, College of Veterinary Medicine, University of São Paulo, 87 Prof Dr Orlando Marques de Paiva Avenue, São Paulo, SP, 05508-270, Brazil.
| | - Naila Cristina Soler-Camargo
- Laboratory of Applied Research in Mycobacteria (LaPAM), Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, 1374 Prof Lineu Prestes Avenue, Room 229, São Paulo, SP, 05508-000, Brazil; Department of Preventive Veterinary Medicine and Animal Health, College of Veterinary Medicine, University of São Paulo, 87 Prof Dr Orlando Marques de Paiva Avenue, São Paulo, SP, 05508-270, Brazil.
| | - Ana Marcia de Sá Guimarães
- Laboratory of Applied Research in Mycobacteria (LaPAM), Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, 1374 Prof Lineu Prestes Avenue, Room 229, São Paulo, SP, 05508-000, Brazil; Department of Comparative Pathobiology, College of Veterinary Medicine, Purdue University. 625 Harrison Street, West Lafayette, IN, 47907, USA.
| |
|
40
|
Aljeldah MM. Antimicrobial Resistance and Its Spread Is a Global Threat. Antibiotics (Basel) 2022; 11:antibiotics11081082. [PMID: 36009948 PMCID: PMC9405321 DOI: 10.3390/antibiotics11081082] [Citation(s) in RCA: 97] [Impact Index Per Article: 32.3] [Received: 06/18/2022] [Revised: 07/20/2022] [Accepted: 07/27/2022] [Indexed: 02/07/2023] Open
Abstract
Antimicrobial resistance (AMR) is a challenge to human wellbeing the world over and is one of the more serious public health concerns. AMR has the potential to emerge as a serious healthcare threat if left unchecked, and could set in motion another pandemic. This creates the need for global health solutions around AMR that take into account microdata from different parts of the world. The positive influences in this regard could be establishing conducive social norms, charting individual and group behavior practices that favor global human health, and lastly, increasing collective awareness around the need for such action. Apart from being an emerging threat in the clinical space, AMR also increases treatment complexity, posing a real challenge to the existing guidelines around the management of antibiotic resistance. The development of resistance has been linked to many genetic elements, some of which have complex transmission pathways between microbes. Beyond this, new mechanisms underlying the development of AMR are being discovered, making this field an important aspect of medical microbiology. Apart from the genetic aspects of AMR, other practices, including misdiagnosis, exposure to broad-spectrum antibiotics, and lack of rapid diagnosis, add to the creation of resistance. However, upgrades and innovations in DNA sequencing technologies with bioinformatics have revolutionized the diagnostic industry, aiding the real-time detection of causes of AMR and its elements, which are important to delineating control and prevention approaches to fight the threat.
Affiliation(s)
- Mohammed M Aljeldah
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, University of Hafr Al Batin, Hafar al-Batin 31991, Saudi Arabia
| |
|
41
|
Li J, Zhu Y, Ma Z, Yang F. Genome sequence and pathogenicity of Vibrio vulnificus strain MCCC 1A08743 isolated from contaminated prawns. Biol Open 2022; 11:275848. [PMID: 35766638 PMCID: PMC9253834 DOI: 10.1242/bio.059299] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/27/2022] [Accepted: 05/19/2022] [Indexed: 12/17/2022] Open
Abstract
Vibrio vulnificus is an opportunistic pathogen that naturally inhabits sea water globally and is responsible for most vibriosis-related deaths. The consumption of V. vulnificus-contaminated seafood and exposure of wounds to Vibrio can result in systemic infection, with increased risks of amputation and extremely high rates of mortality. However, the pathogenicity and virulence factors of V. vulnificus are not fully understood. The genomic characterization of V. vulnificus will help extend our understanding of V. vulnificus at the genomic level. In this manuscript, the genome of V. vulnificus strain MCCC 1A08743 isolated from contaminated prawns from Zhanjiang, China, was sequenced using the Illumina HiSeq X Ten system and annotated through multiple databases. The strain MCCC 1A08743 genome included 4371 protein-coding genes and 117 RNA genes. Average nucleotide identity analysis and core genome phylogenetic analysis revealed that MCCC 1A08743 was most closely related to strains from clinical samples from the United States. Pathogenicity annotation of the MCCC 1A08743 genome, using the Virulence Factor Database and the Pathogen-Host Interactions database, predicted the pathogenicity of the strain, and this was confirmed using mouse infection experiments, which indicated that V. vulnificus strain MCCC 1A08743 could infect C57BL/6J mice and cause liver lesions. This article has an associated First Person interview with the first author of the paper. Summary: Vibrio vulnificus strain MCCC 1A08743 was newly isolated, sequenced and tested for its pathogenicity in mice.
Affiliation(s)
- Jie Li
- Department of Medical Genetics, Naval Medical University, Shanghai 200433, China
| | - Yiqing Zhu
- Department of Medical Genetics, Naval Medical University, Shanghai 200433, China
| | - Zhenxia Ma
- Department of Biochemistry and Molecular Biology, Naval Medical University, Shanghai, 200433, China
| | - Fu Yang
- Department of Medical Genetics, Naval Medical University, Shanghai 200433, China
| |
|
42
|
Ceres KM, Stanhope MJ, Gröhn YT. A critical evaluation of Mycobacterium bovis pangenomics, with reference to its utility in outbreak investigation. Microb Genom 2022; 8:mgen000839. [PMID: 35763423 PMCID: PMC9455707 DOI: 10.1099/mgen.0.000839] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/29/2021] [Accepted: 04/29/2022] [Indexed: 11/18/2022] Open
Abstract
The increased accessibility of next generation sequencing has allowed enough genomes from a given bacterial species to be sequenced to describe the distribution of genes in the pangenome, without limiting analyses to genes present in reference strains. Although some taxa have thousands of whole genome sequences available on public databases, most genomes were sequenced with short read technology, resulting in incomplete assemblies. Studying pangenomes could lead to important insights into adaptation, pathogenicity, or molecular epidemiology; however, given the known information loss inherent in analyzing contig-level assemblies, these inferences may be biased or inaccurate. In this study we describe the pangenome of a clonally evolving pathogen, Mycobacterium bovis, and examine the utility of gene content variation in M. bovis outbreak investigation. We constructed the M. bovis pangenome using 1463 de novo assembled genomes. We tested the assumption of strict clonal evolution by studying evidence of recombination in core genes and analyzing the distribution of accessory genes among core monophyletic groups. To determine if gene content variation could be utilized in outbreak investigation, we carefully examined accessory genes detected in a well described M. bovis outbreak in Minnesota. We found significant errors in accessory gene classification. After accounting for these errors, we show that M. bovis has a much smaller accessory genome than previously described and provide evidence supporting ongoing clonal evolution and a closed pangenome, with little gene content variation generated over outbreaks. We also identified frameshift mutations in multiple genes, including a mutation in glpK, which has recently been associated with antibiotic tolerance in Mycobacterium tuberculosis.
A pangenomic approach enables a more comprehensive analysis of genome dynamics than is possible with reference-based approaches; however, without critical evaluation of accessory gene content, inferences of transmission patterns employing these loci could be misguided.
Affiliation(s)
- Kristina M. Ceres
  - Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, New York, USA
  - Population and Ecosystem Health, College of Veterinary Medicine, Cornell University, Ithaca, NY, USA
- Michael J Stanhope
  - Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, New York, USA
  - Population and Ecosystem Health, College of Veterinary Medicine, Cornell University, Ithaca, NY, USA
- Yrjö T. Gröhn
  - Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, New York, USA
  - Population and Ecosystem Health, College of Veterinary Medicine, Cornell University, Ithaca, NY, USA
|
43
|
Posada-Reyes AB, Balderas-Martínez YI, Ávila-Ríos S, Vinuesa P, Fonseca-Coronado S. An Epistatic Network Describes oppA and glgB as Relevant Genes for Mycobacterium tuberculosis. Front Mol Biosci 2022; 9:856212. [PMID: 35712352 PMCID: PMC9194097 DOI: 10.3389/fmolb.2022.856212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2022] [Accepted: 03/11/2022] [Indexed: 11/18/2022]
Abstract
Mycobacterium tuberculosis is an acid-fast bacterium that causes tuberculosis worldwide. The role of epistatic interactions among different loci of the M. tuberculosis genome under selective pressure may be crucial for understanding the disease and the molecular basis of antibiotic resistance acquisition. Here, we analyzed polymorphic loci interactions by applying a model-free method for epistasis detection, SpydrPick, on a pan–genome-wide alignment created from a set of 254 complete reference genomes. By means of the analysis of an epistatic network created with the detected epistatic interactions, we found that glgB (α-1,4-glucan branching enzyme) and oppA (oligopeptide-binding protein) are putative targets of co-selection in M. tuberculosis as they were associated in the network with M. tuberculosis genes related to virulence, pathogenesis, transport system modulators of the immune response, and antibiotic resistance. In addition, our work unveiled potential pharmacological applications for genotypic antibiotic resistance inherent to the mutations of glgB and oppA as they epistatically interact with fprA and embC, two genes recently included as antibiotic-resistant genes in the catalog of the World Health Organization. Our findings showed that this approach allows the identification of relevant epistatic interactions that may lead to a better understanding of M. tuberculosis by deciphering the complex interactions of molecules involved in its metabolism, virulence, and pathogenesis and that may be applied to different bacterial populations.
Affiliation(s)
- Ali-Berenice Posada-Reyes
  - Posgrado en Ciencias Biológicas, UNAM, Mexico, Mexico
  - Facultad de Estudios Superiores Cuautitlán, UNAM, Estado de Mexico, Mexico
  - *Correspondence: Ali-Berenice Posada-Reyes; Salvador Fonseca-Coronado
- Santiago Ávila-Ríos
  - Instituto Nacional de Enfermedades Respiratorias “Ismael Cosio Villegas”, Ciudad de Mexico, Mexico
- Pablo Vinuesa
  - Centro de Ciencias Genómicas, UNAM, Cuernavaca, Mexico
- Salvador Fonseca-Coronado
  - Facultad de Estudios Superiores Cuautitlán, UNAM, Estado de Mexico, Mexico
  - *Correspondence: Ali-Berenice Posada-Reyes; Salvador Fonseca-Coronado
|
44
|
Marini S, Oliva M, Slizovskiy IB, Das RA, Noyes NR, Kahveci T, Boucher C, Prosperi M. AMR-meta: a k-mer and metafeature approach to classify antimicrobial resistance from high-throughput short-read metagenomics data. Gigascience 2022; 11:giac029. [PMID: 35583675 PMCID: PMC9116207 DOI: 10.1093/gigascience/giac029] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/01/2021] [Revised: 01/27/2022] [Indexed: 12/15/2022]
Abstract
BACKGROUND: Antimicrobial resistance (AMR) is a global health concern. High-throughput metagenomic sequencing of microbial samples enables profiling of AMR genes through comparison with curated AMR databases. However, the performance of current methods is often hampered by database incompleteness and the presence of homology/homoplasy with other non-AMR genes in sequenced samples. RESULTS: We present AMR-meta, a database-free and alignment-free approach, based on k-mers, which combines algebraic matrix factorization into metafeatures with regularized regression. Metafeatures capture multi-level gene diversity across the main antibiotic classes. AMR-meta takes in reads from metagenomic shotgun sequencing and outputs predictions about whether those reads contribute to resistance against specific classes of antibiotics. In addition, AMR-meta uses an augmented training strategy that joins an AMR gene database with non-AMR genes (used as negative examples). We compare AMR-meta with AMRPlusPlus, DeepARG, and Meta-MARC, further testing their ensemble via a voting system. In cross-validation, AMR-meta has a median f-score of 0.7 (interquartile range, 0.2-0.9). On semi-synthetic metagenomic data (the external test), AMR-meta yields on average a 1.3-fold hit rate increase over existing methods. In terms of run-time, AMR-meta is 3 times faster than DeepARG, 30 times faster than Meta-MARC, and as fast as AMRPlusPlus. Finally, we note that differences in AMR ontologies and observed variance of all tools in classification outputs call for further development on standardization of benchmarking data and protocols. CONCLUSIONS: AMR-meta is a fast, accurate classifier that exploits non-AMR negative sets to improve sensitivity and specificity. The differences in AMR ontologies and the high variance of all tools in classification outputs call for the deployment of standard benchmarking data and protocols, to fairly compare AMR prediction tools.
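As a rough illustration of the k-mer representation that alignment-free approaches of this kind start from, the sketch below counts overlapping k-mers in a single read. The function name `kmer_counts`, the choice of k = 3, and the toy read are assumptions made for illustration only; this is not AMR-meta's implementation, which according to the abstract goes on to factorize k-mer features into metafeatures and fit a regularized regression.

```python
from collections import Counter


def kmer_counts(read: str, k: int = 3) -> Counter:
    """Count overlapping k-mers in a read, skipping k-mers with non-ACGT bases."""
    read = read.upper()
    counts = Counter()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        # Ambiguous bases such as N carry no sequence information, so skip them.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


# Hypothetical toy read; real input would be reads from a FASTQ file.
toy_read = "ACGTACGTN"
profile = kmer_counts(toy_read, k=3)
```

Stacking such profiles for many reads gives a reads-by-k-mers count matrix, which is the kind of input a matrix factorization step would operate on.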
Affiliation(s)
- Simone Marini
  - Department of Computer and Information Science and Engineering, University of Florida, 2004 Mowry Road Gainesville, FL 32610, USA
- Marco Oliva
  - Department of Computer and Information Science and Engineering, University of Florida, 432 Newell Dr, Gainesville, FL 32611, USA
- Ilya B Slizovskiy
  - Department of Veterinary Population Medicine, University of Minnesota, 1365 Gortner Avenue 225, St. Paul, MN 55108, USA
- Rishabh A Das
  - Department of Computer and Information Science and Engineering, University of Florida, 2004 Mowry Road Gainesville, FL 32610, USA
- Noelle Robertson Noyes
  - Department of Veterinary Population Medicine, University of Minnesota, 1365 Gortner Avenue 225, St. Paul, MN 55108, USA
- Tamer Kahveci
  - Department of Computer and Information Science and Engineering, University of Florida, 432 Newell Dr, Gainesville, FL 32611, USA
- Christina Boucher
  - Department of Computer and Information Science and Engineering, University of Florida, 432 Newell Dr, Gainesville, FL 32611, USA
- Mattia Prosperi
  - Department of Computer and Information Science and Engineering, University of Florida, 2004 Mowry Road Gainesville, FL 32610, USA
|
45
|
Distribution of Common and Rare Genetic Markers of Second-Line-Injectable-Drug Resistance in Mycobacterium tuberculosis Revealed by a Genome-Wide Association Study. Antimicrob Agents Chemother 2022; 66:e0207521. [PMID: 35532237 DOI: 10.1128/aac.02075-21] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/20/2022]
Abstract
Point mutations in the rrs gene and the eis promoter are known to confer resistance to the second-line injectable drugs (SLIDs) amikacin (AMK), capreomycin (CAP), and kanamycin (KAN). While mutations in these canonical genes confer the majority of SLID resistance, alternative mechanisms of resistance are not uncommon and threaten effective treatment decisions when using conventional molecular diagnostics. In total, 1,184 clinical Mycobacterium tuberculosis isolates from 7 countries were studied for genomic markers associated with phenotypic resistance. The markers rrs:A1401G and rrs:G1484T were associated with resistance to all three SLIDs, and three known markers in the eis promoter (eis:G-10A, eis:C-12T, and eis:C-14T) were similarly associated with kanamycin resistance (KAN-R). Among 325, 324, and 270 AMK-R, CAP-R, and KAN-R isolates, 274 (84.3%), 250 (77.2%), and 249 (92.3%) harbored canonical mutations, respectively. Thirteen isolates harbored more than one canonical mutation. Canonical mutations did not account for 103 of the phenotypically resistant isolates. A genome-wide association study identified three genes and promoters with mutations that, on aggregate, were associated with unexplained resistance to at least one SLID. Our analysis associated whiB7 5'-untranslated-region mutations with KAN resistance, supporting clinical relevance for this previously demonstrated mechanism of KAN resistance. We also provide evidence for the novel association of CAP resistance with the promoter of the Rv2680-Rv2681 operon, which encodes an exoribonuclease that may influence the binding of CAP to the ribosome. Aggregating mutations by gene can provide additional insight and therefore is recommended for identifying rare mechanisms of resistance when individual mutations carry insufficient statistical power.
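The gene-level aggregation recommended at the end of this abstract can be sketched as a simple collapsing step: instead of tracking each mutation separately, record which isolates carry any mutation in a given gene or promoter. The function name `carriers_by_gene`, the `gene:mutation` string format, and the isolate identifiers are hypothetical; the marker names are taken from the abstract, but their assignment to toy isolates is invented for illustration. An actual association study would go on to test each aggregated locus against the resistance phenotype, which is not shown here.

```python
from collections import defaultdict


def carriers_by_gene(isolate_variants: dict) -> dict:
    """Collapse per-mutation calls into per-gene sets of carrier isolates."""
    carriers = defaultdict(set)
    for isolate, variants in isolate_variants.items():
        for variant in variants:
            # Assumed format "gene:mutation", e.g. "rrs:A1401G".
            gene, _, _mutation = variant.partition(":")
            carriers[gene].add(isolate)
    return dict(carriers)


# Hypothetical toy data: isolate IDs and their mutation calls are made up.
toy_calls = {
    "isolate_1": ["rrs:A1401G"],
    "isolate_2": ["eis:G-10A"],
    "isolate_3": ["eis:C-14T", "rrs:G1484T"],
    "isolate_4": [],
}
by_gene = carriers_by_gene(toy_calls)
```

Here two different rare `eis` promoter mutations, each seen once, combine into a single locus carried by two isolates, which is the gain in statistical power the abstract refers to.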
|
46
|
Systems biology approach to functionally assess the Clostridioides difficile pangenome reveals genetic diversity with discriminatory power. Proc Natl Acad Sci U S A 2022; 119:e2119396119. [PMID: 35476524 PMCID: PMC9170149 DOI: 10.1073/pnas.2119396119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Significance: Clostridioides difficile infections are the most common source of hospital-acquired infections and are responsible for an extensive burden on the health care system. Strains of the C. difficile species comprise diverse lineages and demonstrate genome variability, with advantageous trait acquisition driving the emergence of endemic lineages. Here, we present a systems biology analysis of C. difficile that evaluates strain-specific genotypes and phenotypes to investigate the overall diversity of the species. We develop a strain typing method based on similarity of accessory genomes to identify and contextualize genetic loci capable of discriminating between strain groups.
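The abstract does not say which similarity measure the typing method uses. As one plausible way to compare accessory genomes, the sketch below computes the Jaccard index between sets of accessory genes. The choice of Jaccard, the function name `jaccard`, and the strain and gene identifiers are all assumptions for illustration, not the authors' method.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index: shared genes divided by all genes seen in either strain."""
    if not a and not b:
        # Two empty accessory genomes are treated as identical by convention.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical accessory gene content per strain.
accessory = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g2", "g3", "g4"},
    "strain_C": {"g7"},
}
sim_ab = jaccard(accessory["strain_A"], accessory["strain_B"])  # 2 shared of 4 total
sim_ac = jaccard(accessory["strain_A"], accessory["strain_C"])  # no shared genes
```

A pairwise matrix of such scores could then be clustered to propose strain groups.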
|
47
|
Aytan-Aktug D, Clausen PTLC, Szarvas J, Munk P, Otani S, Nguyen M, Davis JJ, Lund O, Aarestrup FM. PlasmidHostFinder: Prediction of Plasmid Hosts Using Random Forest. mSystems 2022; 7:e0118021. [PMID: 35382558 PMCID: PMC9040769 DOI: 10.1128/msystems.01180-21] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/27/2021] [Accepted: 03/16/2022] [Indexed: 11/20/2022]
Abstract
Plasmids play a major role in facilitating the spread of antimicrobial resistance between bacteria. Understanding the host range and dissemination trajectories of plasmids is critical for surveillance and prevention of antimicrobial resistance. Identification of plasmid host ranges could be improved using automated pattern detection methods compared to homology-based methods due to the diversity and genetic plasticity of plasmids. In this study, we developed a method for predicting the host range of plasmids using machine learning, specifically random forests. We trained the models with 8,519 plasmids from 359 different bacterial species per taxonomic level; the models achieved Matthews correlation coefficients of 0.662 and 0.867 at the species and order levels, respectively. Our results suggest that despite the diverse nature and genetic plasticity of plasmids, our random forest model can accurately distinguish between plasmid hosts. This tool is available online through the Center for Genomic Epidemiology (https://cge.cbs.dtu.dk/services/PlasmidHostFinder/). IMPORTANCE: Antimicrobial resistance is a global health threat to humans and animals, causing high mortality and morbidity while effectively ending decades of success in fighting against bacterial infections. Plasmids confer extra genetic capabilities to the host organisms through accessory genes that can encode antimicrobial resistance and virulence. In addition to lateral inheritance, plasmids can be transferred horizontally between bacterial taxa. Therefore, detection of the host range of plasmids is crucial for understanding and predicting the dissemination trajectories of extrachromosomal genes and bacterial evolution as well as taking effective countermeasures against antimicrobial resistance.
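For readers unfamiliar with the metric reported above, the Matthews correlation coefficient for a binary classifier can be computed from the four confusion-matrix counts as MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below is a minimal binary version with invented counts; the function name `mcc` and the numbers are assumptions, and the abstract does not state how the score was computed across multiple host classes.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for one host class versus all others.
score = mcc(tp=90, tn=80, fp=20, fn=10)
```

The score ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and unlike plain accuracy it stays informative when classes are imbalanced.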
Affiliation(s)
- Derya Aytan-Aktug
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, Denmark
- Judit Szarvas
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, Denmark
- Patrick Munk
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, Denmark
- Saria Otani
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, Denmark
- Marcus Nguyen
  - Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
  - Data Science and Learning Division, Argonne National Laboratory, Argonne, Illinois, USA
- James J. Davis
  - Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
  - Data Science and Learning Division, Argonne National Laboratory, Argonne, Illinois, USA
  - Northwestern Argonne Institute for Science and Engineering, Evanston, Illinois, USA
- Ole Lund
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, Denmark
- Frank M. Aarestrup
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, Denmark
|
48
|
Zhang Z, Cheng S, Solis-Lemus C. Towards a robust out-of-the-box neural network model for genomic data. BMC Bioinformatics 2022; 23:125. [PMID: 35397517 PMCID: PMC8994362 DOI: 10.1186/s12859-022-04660-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2021] [Accepted: 03/21/2022] [Indexed: 11/10/2022]
Abstract
Background
The accurate prediction of biological features from genomic data is paramount for precision medicine and sustainable agriculture. For decades, neural network models have been widely popular in fields like computer vision, astrophysics and targeted marketing given their prediction accuracy and their robust performance under big data settings. Yet neural network models have not made a successful transition into the medical and biological world due to the ubiquitous characteristics of biological data such as modest sample sizes, sparsity, and extreme heterogeneity.
Results
Here, we investigate the robustness, generalization potential and prediction accuracy of widely used convolutional neural network and natural language processing models with a variety of heterogeneous genomic datasets. Mainly, recurrent neural network models outperform convolutional neural network models in terms of prediction accuracy, overfitting and transferability across the datasets under study.
Conclusions
While the perspective of a robust out-of-the-box neural network model is out of reach, we identify certain model characteristics that translate well across datasets and could serve as a baseline model for translational researchers.
|
49
|
Li J, Li X, Li M, Qiu H, Saad C, Zhao B, Li F, Wu X, Kuang D, Tang F, Chen Y, Shu H, Zhang J, Wang Q, Huang H, Qi S, Ye C, Bryant A, Yuan X, Kurts C, Hu G, Cheng W, Mei Q. Differential early diagnosis of benign versus malignant lung cancer using systematic pathway flux analysis of peripheral blood leukocytes. Sci Rep 2022; 12:5070. [PMID: 35332177 PMCID: PMC8948197 DOI: 10.1038/s41598-022-08890-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2021] [Accepted: 03/07/2022] [Indexed: 12/24/2022]
Abstract
Early diagnosis of lung cancer is critically important to reduce disease severity and improve overall survival. Newer, minimally invasive biopsy procedures often fail to provide adequate specimens for accurate tumor subtyping or staging, which is necessary to inform appropriate use of molecular targeted therapies and immune checkpoint inhibitors. Thus, newer approaches to diagnosis and staging in early lung cancer are needed. This exploratory pilot study obtained peripheral blood samples from 139 individuals with clinically evident pulmonary nodules (benign and malignant), as well as ten healthy persons. They were divided into three cohorts: original cohort (n = 99), control cohort (n = 10), and validation cohort (n = 40). Average RNAseq sequencing of leukocytes in these samples was conducted. Subsequently, the data were integrated into an artificial intelligence (AI)-based computational approach with system-wide gene expression technology to develop a rapid, effective, non-invasive immune index for early diagnosis of lung cancer. An immune-related index system, IM-Index, was defined and validated for the diagnostic application. IM-Index was applied to assess the malignancies of pulmonary nodules of 109 participants (original + control cohorts) with high accuracy (AUC: 0.822 [95% CI: 0.75-0.91, p < 0.001]), and to differentiate between phases of cancer immunoediting concept (odds ratio: 1.17 [95% CI: 1.1-1.25, p < 0.001]). The predictive ability of IM-Index was validated in a validation cohort with an AUC of 0.883 (95% CI: 0.73-1.00, p < 0.001). The difference between molecular mechanisms of adenocarcinoma and squamous carcinoma histology was also determined via the IM-Index (OR: 1.2 [95% CI 1.14-1.35, p = 0.019]). In addition, a structural metabolic behavior pattern and signaling property in host immunity were found (Bonferroni correction, p = 1.32e-16).
Taken together, our findings indicate that this AI-based approach may be used for "Super Early" cancer diagnosis and amend the current immunotherapy for lung cancer.
Affiliation(s)
- Jian Li
  - Institute of Molecular Medicine and Experimental Immunology, University Clinic of Rheinische Friedrich-Wilhelms-University, Bonn, Germany
- Xiaoyu Li
  - Department of Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Ming Li
  - Department of Oncology, Wuhan Pulmonary Hospital, Wuhan, Hubei, People's Republic of China
- Hong Qiu
  - Department of Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Christian Saad
  - Department of Computer Science, University of Augsburg, Augsburg, Germany
- Bo Zhao
  - Department of Thoracic Surgery, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Fan Li
  - Department of Thoracic Surgery, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Xiaowei Wu
  - Department of Thoracic Surgery, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Dong Kuang
  - Institute of Pathology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
  - Department of Pathology, School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Fengjuan Tang
  - Institute of Pathology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
  - Department of Pathology, School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Yaobing Chen
  - Institute of Pathology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
  - Department of Pathology, School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Hongge Shu
  - Radiology Department, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Jing Zhang
  - Radiology Department, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Qiuxia Wang
  - Radiology Department, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- He Huang
  - Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, People's Republic of China
- Shankang Qi
  - Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, People's Republic of China
- Changkun Ye
  - Medical Research Center of Yu Huang Hospital, Yu Huang, Zhejiang, People's Republic of China
- Amy Bryant
  - Department of Biochemical and Pharmaceutical Sciences, College of Pharmacy, Idaho State University, Pocatello, USA
- Xianglin Yuan
  - Department of Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Christian Kurts
  - Institute of Molecular Medicine and Experimental Immunology, University Clinic of Rheinische Friedrich-Wilhelms-University, Bonn, Germany
- Guangyuan Hu
  - Department of Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Weiting Cheng
  - Department of Oncology, Wuhan No. 1 Hospital, Wuhan, Hubei, People's Republic of China
- Qi Mei
  - Department of Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
|
50
|
Zhao W, Luo S, Wu H, Jiang X, He T, Hu X. A multi-label learning framework for predicting antibiotic resistance genes via dual-view modeling. Brief Bioinform 2022; 23:6546259. [PMID: 35272349 DOI: 10.1093/bib/bbac052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2021] [Revised: 01/27/2022] [Accepted: 01/31/2022] [Indexed: 11/13/2022]
Abstract
The increasing prevalence of antibiotic resistance has become a global health crisis. For the purpose of safety regulation, it is of high importance to identify antibiotic resistance genes (ARGs) in bacteria. Although culture-based methods can identify ARGs more accurately, the identification process is time-consuming and requires specialized knowledge. With the rapid development of whole genome sequencing technology, researchers attempt to identify ARGs by computing sequence similarity from public databases. However, these computational methods might fail to detect ARGs due to the low sequence identity to known ARGs. Moreover, existing methods cannot effectively address the issue of multidrug resistance prediction for ARGs, which is a great challenge to clinical treatments. To address these challenges, we propose an end-to-end multi-label learning framework for predicting ARGs. More specifically, the task of ARG prediction is modeled as a multi-label learning problem, and a deep neural network-based end-to-end framework is proposed, in which a specific loss function is introduced to exploit the advantage of multi-label learning for ARG prediction. In addition, a dual-view modeling mechanism is employed to make full use of the semantic associations between two views of ARGs, i.e. sequence-based information and structure-based information. Extensive experiments are conducted on publicly available data, and experimental results demonstrate the effectiveness of the proposed framework on the task of ARG prediction.
Affiliation(s)
- Weizhong Zhao
  - School of Computer, Central China Normal University, Wuhan, Hubei, 430079, PR China
- Shujie Luo
  - School of Computer, Central China Normal University, Wuhan, Hubei, 430079, PR China
- Haifang Wu
  - School of Computer, Central China Normal University, Wuhan, Hubei, 430079, PR China
- Xingpeng Jiang
  - School of Computer, Central China Normal University, Wuhan, Hubei, 430079, PR China
- Tingting He
  - School of Computer, Central China Normal University, Wuhan, Hubei, 430079, PR China
- Xiaohua Hu
  - College of Computing & Informatics, Drexel University, Philadelphia, PA 19104, USA
|