1
Carter JJ, Walker TM, Walker AS, Whitfield MG, Morlock GP, Lynch CI, Adlard D, Peto TEA, Posey JE, Crook DW, Fowler PW. Prediction of pyrazinamide resistance in Mycobacterium tuberculosis using structure-based machine-learning approaches. JAC Antimicrob Resist 2024; 6:dlae037. [PMID: 38500518 PMCID: PMC10946228 DOI: 10.1093/jacamr/dlae037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 02/19/2024] [Indexed: 03/20/2024] Open
Abstract
Background: Pyrazinamide is one of four first-line antibiotics used to treat tuberculosis; however, antibiotic susceptibility testing for pyrazinamide is challenging. Resistance to pyrazinamide is primarily driven by genetic variation in pncA, encoding an enzyme that converts pyrazinamide into its active form. Methods: We curated a dataset of 664 non-redundant, missense amino acid mutations in PncA with associated high-confidence phenotypes from published studies and then trained three different machine-learning models to predict pyrazinamide resistance. All models had access to a range of protein structural-, chemical- and sequence-based features. Results: The best model, a gradient-boosted decision tree, achieved a sensitivity of 80.2% and a specificity of 76.9% on the hold-out test dataset. The clinical performance of the models was then estimated by predicting the binary pyrazinamide resistance phenotype of 4027 samples harbouring 367 unique missense mutations in pncA derived from 24 231 clinical isolates. Conclusions: This work demonstrates how machine learning can enhance the sensitivity/specificity of pyrazinamide resistance prediction in genetics-based clinical microbiology workflows, highlights novel mutations for future biochemical investigation, and is a proof of concept for using this approach in other drugs.
Affiliation(s)
- Joshua J Carter
  - Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK
- Timothy M Walker
  - Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK
- A Sarah Walker
  - Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK
  - National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK
  - NIHR Health Protection Research Unit in Healthcare Associated Infection and Antimicrobial Resistance, University of Oxford, Oxford, UK
- Michael G Whitfield
  - Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, SAMRC Centre for Tuberculosis Research, DST/NRF Centre of Excellence for Biomedical Tuberculosis Research, Stellenbosch University, Tygerberg, South Africa
- Glenn P Morlock
  - Division of Tuberculosis Elimination, National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Charlotte I Lynch
  - Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK
- Dylan Adlard
  - Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK
- Timothy E A Peto
  - Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK
  - National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK
- James E Posey
  - Division of Tuberculosis Elimination, National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Derrick W Crook
  - Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK
  - National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK
  - NIHR Health Protection Research Unit in Healthcare Associated Infection and Antimicrobial Resistance, University of Oxford, Oxford, UK
- Philip W Fowler
  - Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK
  - National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK
2
Quek ZBR, Ng SH. Hybrid-Capture Target Enrichment in Human Pathogens: Identification, Evolution, Biosurveillance, and Genomic Epidemiology. Pathogens 2024; 13:275. [PMID: 38668230 PMCID: PMC11054155 DOI: 10.3390/pathogens13040275] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/30/2024] [Revised: 03/11/2024] [Accepted: 03/18/2024] [Indexed: 04/29/2024] Open
Abstract
High-throughput sequencing (HTS) has revolutionised the field of pathogen genomics, enabling the direct recovery of pathogen genomes from clinical and environmental samples. However, pathogen nucleic acids are often overwhelmed by those of the host, requiring deep metagenomic sequencing to recover sufficient sequences for downstream analyses (e.g., identification and genome characterisation). To circumvent this, hybrid-capture target enrichment (HC) is able to enrich pathogen nucleic acids across multiple scales of divergences and taxa, depending on the panel used. In this review, we outline the applications of HC in human pathogens (bacteria, fungi, parasites and viruses), including identification, genomic epidemiology, antimicrobial resistance genotyping, and evolution. Importantly, we explored the applicability of HC to clinical metagenomics, which ultimately requires more work before it is a reliable and accurate tool for clinical diagnosis. Relatedly, the utility of HC was exemplified by COVID-19, which was used as a case study to illustrate the maturity of HC for recovering pathogen sequences. As we unravel the origins of COVID-19, zoonoses remain more relevant than ever. Therefore, the role of HC in biosurveillance studies is also highlighted in this review, which is critical in preparing us for the next pandemic. We also found that while HC is a popular tool to study viruses, it remains underutilised in parasites and fungi and, to a lesser extent, bacteria. Finally, we evaluated the future of HC with respect to bait design in the eukaryotic groups and the prospect of combining HC with long-read HTS.
Affiliation(s)
- Z. B. Randolph Quek
  - Defence Medical & Environmental Research Institute, DSO National Laboratories, Singapore 117510, Singapore
3
Padhi AK, Maurya S. Uncovering the secrets of resistance: An introduction to computational methods in infectious disease research. Advances in Protein Chemistry and Structural Biology 2024; 139:173-220. [PMID: 38448135 DOI: 10.1016/bs.apcsb.2023.11.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2024]
Abstract
Antimicrobial resistance (AMR) is a growing global concern with significant implications for infectious disease control and therapeutics development. This chapter presents a comprehensive overview of computational methods in the study of AMR. We explore the prevalence and statistics of AMR, underscoring its alarming impact on public health. The role of AMR in infectious disease outbreaks and its impact on therapeutics development are discussed, emphasizing the need for novel strategies. Resistance mutations are pivotal in AMR, enabling pathogens to evade antimicrobial treatments. We delve into their importance and contribution to the spread of AMR. Experimental methods for quantitatively evaluating resistance mutations are described, along with their limitations. To address these challenges, computational methods provide promising solutions. We highlight the advantages of computational approaches, including rapid analysis of large datasets and prediction of resistance profiles. A comprehensive overview of computational methods for studying AMR is presented, encompassing genomics, proteomics, structural bioinformatics, network analysis, and machine learning algorithms. The strengths and limitations of each method are briefly outlined. Additionally, we introduce ResScan-design, our own computational method, which employs a protein (re)design protocol to identify potential resistance mutations and adaptation signatures in pathogens. Case studies are discussed to showcase the application of ResScan in elucidating hotspot residues, understanding underlying mechanisms, and guiding the design of effective therapies. In conclusion, we emphasize the value of computational methods in understanding and combating AMR. Integration of experimental and computational approaches can expedite the discovery of innovative antimicrobial treatments and mitigate the threat posed by AMR.
Affiliation(s)
- Aditya K Padhi
  - Laboratory for Computational Biology & Biomolecular Design, School of Biochemical Engineering, Indian Institute of Technology (BHU), Varanasi, Uttar Pradesh, India
- Shweata Maurya
  - Laboratory for Computational Biology & Biomolecular Design, School of Biochemical Engineering, Indian Institute of Technology (BHU), Varanasi, Uttar Pradesh, India
4
Ko S, Kim J, Lim J, Lee SM, Park JY, Woo J, Scott-Nevros ZK, Kim JR, Yoon H, Kim D. Blanket antimicrobial resistance gene database with structural information, BOARDS, provides insights on historical landscape of resistance prevalence and effects of mutations in enzyme structure. mSystems 2024; 9:e0094323. [PMID: 38085058 PMCID: PMC10871167 DOI: 10.1128/msystems.00943-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 11/02/2023] [Indexed: 01/24/2024] Open
Abstract
Antimicrobial resistance (AMR) in pathogenic bacteria poses a significant threat to public health, yet there is still a need for development in the tools to deeply understand AMR genes based on genetic or structural information. In this study, we present an interactive web database named Blanket Overarching Antimicrobial-Resistance gene Database with Structural information (BOARDS, sbml.unist.ac.kr), a database that comprehensively includes information on 3,943 reported AMR genes (1,997 extended-spectrum beta-lactamase (ESBL) genes and 1,946 other genes) as well as a total of 27,395 predicted protein structures. These structures, which include both wild-type AMR genes and their mutants, were derived from 80,094 publicly available whole-genome sequences. In addition, we developed the rapid analysis and detection tool of antimicrobial-resistance (RADAR), a one-stop analysis pipeline to detect AMR genes across whole-genome sequences (WGSs). By integrating BOARDS and RADAR, the AMR prevalence landscape for eight multi-drug resistant pathogens was reconstructed, leading to unexpected findings such as the pre-existence of the MCR genes before their official reports. Enzymatic structure prediction-based analysis revealed that the occurrence of mutations in some ESBL genes was closely related to the binding affinities with their antibiotic substrates. Overall, BOARDS can play a significant role in performing in-depth analysis on AMR.
IMPORTANCE: While the increasing antibiotic resistance (AMR) in pathogens has been a burden on public health, effective tools for deep understanding of AMR based on genetic or structural information remain limited. In this study, a blanket overarching antimicrobial-resistance gene database with structure information (BOARDS), a web-based database that comprehensively collects AMR gene data with predictive protein structural information, was constructed. Additionally, we report the development of a RADAR pipeline that can analyze whole-genome sequences as well. BOARDS, which includes sequence and structural information, has shown the historical landscape and prevalence of the AMR genes and can provide insight into single-nucleotide polymorphism effects on antibiotic degrading enzymes within protein structures.
Affiliation(s)
- Seyoung Ko
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
  - School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Jaehyung Kim
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Jaewon Lim
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Sang-Mok Lee
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Joon Young Park
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Jihoon Woo
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Zoe K. Scott-Nevros
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Jong R. Kim
  - School of Engineering and Digital Sciences, Nazarbayev University, Astana, Kazakhstan
- Hyunjin Yoon
  - Department of Molecular Science and Technology, Ajou University, Suwon, South Korea
- Donghyuk Kim
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
  - School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
5
Abstract
The greatest challenge in drug discovery remains the high rate of attrition across the different phases of the process, which costs the industry billions of dollars every year. While all phases remain crucial to ensure pharmaceutical-level safety, quality, and efficacy of the end product, streamlining these efforts toward compounds with success potential is pivotal for a more efficient and cost-effective process. The use of artificial intelligence (AI) within the pharmaceutical industry aims at just this, and has applications in preclinical screening for biological activity, optimization of pharmacokinetic properties for improved drug formulation, early toxicity prediction which reduces attrition, and pre-emptively screening for genetic changes in the biological target to improve therapeutic longevity. Here, we present a series of in silico tools that address these applications in small molecule development and describe how they can be embedded within the current pharmaceutical development pipeline.
Affiliation(s)
- Adam Serghini
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- Stephanie Portelli
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia
- David B Ascher
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
6
Samantray D, Tanwar AS, Murali TS, Brand A, Satyamoorthy K, Paul B. A Comprehensive Bioinformatics Resource Guide for Genome-Based Antimicrobial Resistance Studies. OMICS: A Journal of Integrative Biology 2023; 27:445-460. [PMID: 37861712 DOI: 10.1089/omi.2023.0140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2023]
Abstract
The use of high-throughput sequencing technologies and bioinformatic tools has greatly transformed microbial genome research. With the help of sophisticated computational tools, it has become easier to perform whole genome assembly, identify and compare different species based on their genomes, and predict the presence of genes responsible for proteins, antimicrobial resistance, and toxins. These bioinformatics resources are likely to continuously improve in quality, become more user-friendly to analyze the multiple genomic data, efficient in generating information and translating it into meaningful knowledge, and enhance our understanding of the genetic mechanism of AMR. In this manuscript, we provide an essential guide for selecting the popular resources for microbial research, such as genome assembly and annotation, antibiotic resistance gene profiling, identification of virulence factors, and drug interaction studies. In addition, we discuss the best practices in computer-oriented microbial genome research, emerging trends in microbial genomic data analysis, integration of multi-omics data, the appropriate use of machine-learning algorithms, and open-source bioinformatics resources for genome data analytics.
Affiliation(s)
- Debyani Samantray
  - Department of Bioinformatics, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, India
- Ankit Singh Tanwar
  - United Nations University-Maastricht Economic and Social Research Institute on Innovation and Technology (UNU-MERIT), Maastricht, The Netherlands
  - Faculty of Health, Medicine and Life Sciences (FHML), Maastricht University, Maastricht, The Netherlands
- Thokur Sreepathy Murali
  - Department of Biotechnology, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, India
- Angela Brand
  - United Nations University-Maastricht Economic and Social Research Institute on Innovation and Technology (UNU-MERIT), Maastricht, The Netherlands
  - Faculty of Health, Medicine and Life Sciences (FHML), Maastricht University, Maastricht, The Netherlands
  - Department of Health Information, Prasanna School of Public Health (PSPH), Manipal Academy of Higher Education, Manipal, India
- Kapaettu Satyamoorthy
  - SDM College of Medical Sciences and Hospital, Shri Dharmasthala Manjunatheshwara (SDM) University, Dharwad, India
- Bobby Paul
  - Department of Bioinformatics, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, India
7
Ascher DB, Kaminskas LM, Myung Y, Pires DEV. Using Graph-Based Signatures to Guide Rational Antibody Engineering. Methods Mol Biol 2023; 2552:375-397. [PMID: 36346604 DOI: 10.1007/978-1-0716-2609-2_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Antibodies are essential experimental and diagnostic tools and as biotherapeutics have significantly advanced our ability to treat a range of diseases. With recent innovations in computational tools to guide protein engineering, we can now rationally design better antibodies with improved efficacy, stability, and pharmacokinetics. Here, we describe the use of the mCSM web-based in silico suite, which uses graph-based signatures to rapidly identify the structural and functional consequences of mutations, to guide rational antibody engineering to improve stability, affinity, and specificity.
Affiliation(s)
- David B Ascher
  - Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - Department of Biochemistry, Cambridge University, Cambridge, UK
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland, Australia
- Lisa M Kaminskas
  - School of Biological Sciences, University of Queensland, St Lucia, QLD, Australia
- Yoochan Myung
  - Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland, Australia
- Douglas E V Pires
  - Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - School of Computing and Information Systems, University of Melbourne, Parkville, VIC, Australia
8
Pan Q, Nguyen TB, Ascher DB, Pires DEV. Systematic evaluation of computational tools to predict the effects of mutations on protein stability in the absence of experimental structures. Brief Bioinform 2022; 23:bbac025. [PMID: 35189634 PMCID: PMC9155634 DOI: 10.1093/bib/bbac025] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 07/19/2021] [Revised: 01/13/2022] [Accepted: 01/30/2022] [Indexed: 12/26/2022] Open
Abstract
Changes in protein sequence can have dramatic effects on how proteins fold, their stability and dynamics. Over the last 20 years, pioneering methods have been developed to try to estimate the effects of missense mutations on protein stability, leveraging growing availability of protein 3D structures. These, however, have been developed and validated using experimentally derived structures and biophysical measurements. A large proportion of protein structures remain to be experimentally elucidated and, while many studies have based their conclusions on predictions made using homology models, there has been no systematic evaluation of the reliability of these tools in the absence of experimental structural data. We have, therefore, systematically investigated the performance and robustness of ten widely used structural methods when presented with homology models built using templates at a range of sequence identity levels (from 15% to 95%) and contrasted performance with sequence-based tools, as a baseline. We found there is indeed performance deterioration on homology models built using templates with sequence identity below 40%, where sequence-based tools might become preferable. This was most marked for mutations in solvent exposed residues and stabilizing mutations. As structure prediction tools improve, the reliability of these predictors is expected to follow; however, we strongly suggest that these factors should be taken into consideration when interpreting results from structure-based predictors of mutation effects on protein stability.
Affiliation(s)
- Qisheng Pan
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane City, Queensland 4072, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville, Victoria 3052, Australia
- Thanh Binh Nguyen
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane City, Queensland 4072, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville, Victoria 3052, Australia
- David B Ascher
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane City, Queensland 4072, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville, Victoria 3052, Australia
  - Department of Biochemistry, University of Cambridge, 80 Tennis Ct Rd, Cambridge CB2 1GA, UK
- Douglas E V Pires
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane City, Queensland 4072, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville, Victoria 3052, Australia
  - School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria 3053, Australia
9
Genome-Wide Mutation Scoring for Machine-Learning-Based Antimicrobial Resistance Prediction. Int J Mol Sci 2021; 22:ijms222313049. [PMID: 34884852 PMCID: PMC8657983 DOI: 10.3390/ijms222313049] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/15/2021] [Revised: 11/25/2021] [Accepted: 11/29/2021] [Indexed: 01/21/2023] Open
Abstract
The prediction of antimicrobial resistance (AMR) based on genomic information can improve patient outcomes. Genetic mechanisms have been shown to explain AMR with accuracies in line with standard microbiology laboratory testing. To translate genetic mechanisms into phenotypic AMR, machine learning has been successfully applied. AMR machine learning models typically use nucleotide k-mer counts to represent genomic sequences. While k-mer representation efficiently captures sequence variation, it also results in high-dimensional and sparse data. With limited training data available, achieving acceptable model performance or model interpretability is challenging. In this study, we explore the utility of feature engineering with several biologically relevant signals. We propose to predict the functional impact of observed mutations with PROVEAN to use the predicted impact as a new feature for each protein in an organism’s proteome. The addition of the new features was tested on a total of 19,521 isolates across nine clinically relevant pathogens and 30 different antibiotics. The new features significantly improved the predictive performance of trained AMR models for Pseudomonas aeruginosa, Citrobacter freundii, and Escherichia coli. The balanced accuracy of the respective models of those three pathogens improved by 6.0% on average.
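As an illustration of the nucleotide k-mer count representation mentioned in this abstract, the following is a minimal Python sketch, not the authors' pipeline. The function names `kmer_counts` and `kmer_vector`, the toy sequence, and the choice of k = 3 are all assumptions made for the example. It shows why the representation becomes high-dimensional and sparse: the vector has 4^k entries, yet a short sequence fills only a few of them.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count overlapping k-mers in a nucleotide sequence (hypothetical helper)."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_vector(sequence: str, k: int) -> list[int]:
    """Fixed-length count vector over all 4**k possible k-mers, in ACGT order."""
    counts = kmer_counts(sequence, k)
    return [counts.get("".join(p), 0) for p in product("ACGT", repeat=k)]


# Toy sequence, chosen only for illustration.
example = "ACGTACGTAC"
vec = kmer_vector(example, 3)
nonzero = sum(1 for v in vec if v > 0)
print(len(vec), nonzero)  # vector length vs. number of populated entries
```

With k = 3 the vector already has 64 dimensions; at the larger k values typically needed to capture sequence variation the dimensionality grows as 4^k, which is the sparsity problem the abstract describes.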
10
Rodrigues CHM, Pires DEV, Ascher DB. mmCSM-PPI: predicting the effects of multiple point mutations on protein-protein interactions. Nucleic Acids Res 2021; 49:W417-W424. [PMID: 33893812 PMCID: PMC8262703 DOI: 10.1093/nar/gkab273] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 01/27/2021] [Revised: 03/18/2021] [Accepted: 04/15/2021] [Indexed: 11/16/2022] Open
Abstract
Protein-protein interactions play a crucial role in all cellular functions and biological processes and mutations leading to their disruption are enriched in many diseases. While a number of computational methods to assess the effects of variants on protein-protein binding affinity have been proposed, they are in general limited to the analysis of single point mutations and have been shown to perform poorly on independent test sets. Here, we present mmCSM-PPI, a scalable and effective machine learning model for accurately assessing changes in protein-protein binding affinity caused by single and multiple missense mutations. We expanded our well-established graph-based signatures in order to capture physicochemical and geometrical properties of multiple wild-type residue environments and integrated them with substitution scores and dynamics terms from normal mode analysis. mmCSM-PPI was able to achieve a Pearson's correlation of up to 0.75 (RMSE = 1.64 kcal/mol) under 10-fold cross-validation and 0.70 (RMSE = 2.06 kcal/mol) on a non-redundant blind test, outperforming existing methods. Our method is freely available as a user-friendly and easy-to-use web server and API at http://biosig.unimelb.edu.au/mmcsm_ppi.
Affiliation(s)
- Carlos H M Rodrigues
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Douglas E V Pires
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- David B Ascher
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
11
Portelli S, Barr L, de Sá AG, Pires DE, Ascher DB. Distinguishing between PTEN clinical phenotypes through mutation analysis. Comput Struct Biotechnol J 2021; 19:3097-3109. [PMID: 34141133 PMCID: PMC8180946 DOI: 10.1016/j.csbj.2021.05.028] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/31/2021] [Revised: 04/29/2021] [Accepted: 05/19/2021] [Indexed: 12/28/2022] Open
Abstract
Phosphatase and tensin homolog on chromosome ten (PTEN) germline mutations are associated with an overarching condition known as PTEN hamartoma tumor syndrome. Clinical phenotypes associated with this syndrome range from macrocephaly and autism spectrum disorder (ASD) to Cowden syndrome, which manifests as multiple noncancerous tumor-like growths (hamartomas), and an increased predisposition to certain cancers. The basis by which mutations might lead to these very diverse phenotypic outcomes, however, remains unclear. Here we show that, by considering the molecular consequences of mutations in PTEN on protein structure and function, we can accurately distinguish PTEN mutations exhibiting different phenotypes. Changes in phosphatase activity, protein stability, and intramolecular interactions appeared to be major drivers of clinical phenotype, with cancer-associated variants leading to the most drastic changes, while ASD and non-pathogenic variants were associated with milder and neutral changes, respectively. Importantly, we show via saturation mutagenesis that more than half of variants of unknown significance could be associated with disease phenotypes, while over half of Cowden syndrome mutations likely lead to cancer. These insights can assist in exploring potentially important clinical outcomes delineated by PTEN variation.
Affiliation(s)
- Stephanie Portelli
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Lucy Barr
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Alex G.C. de Sá
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Melbourne, Victoria, Australia
- Douglas E.V. Pires
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- David B. Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Melbourne, Victoria, Australia
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK
|
12
|
Gwenzi W, Chaukura N, Muisa-Zikali N, Teta C, Musvuugwa T, Rzymski P, Abia ALK. Insects, Rodents, and Pets as Reservoirs, Vectors, and Sentinels of Antimicrobial Resistance. Antibiotics (Basel) 2021; 10:68. [PMID: 33445633 PMCID: PMC7826649 DOI: 10.3390/antibiotics10010068] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 12/23/2020] [Revised: 01/07/2021] [Accepted: 01/08/2021] [Indexed: 12/22/2022]
Abstract
This paper reviews the occurrence of antimicrobial resistance (AMR) in insects, rodents, and pets. Insects (e.g., houseflies, cockroaches), rodents (rats, mice), and pets (dogs, cats) act as reservoirs of AMR for first-line and last-resort antimicrobial agents. AMR proliferates in insects, rodents, and pets, and in their skin and gut systems. Subsequently, insects, rodents, and pets act as vectors that disseminate AMR to humans via direct contact, human food contamination, and horizontal gene transfer. Thus, insects, rodents, and pets might act as sentinels or bioindicators of AMR. Human health risks are discussed, including those unique to low-income countries. Current evidence on human health risks is largely inferential and based on qualitative data, and comprehensive statistics based on quantitative microbial risk assessment (QMRA) are still lacking. Hence, tracing human health risks of AMR to insects, rodents, and pets remains a challenge. To safeguard human health, mitigation measures are proposed, based on the one-health approach. Future research should include human health risk analysis using QMRA, and the application of in-silico techniques, genomics, network analysis, and 'big data' analytical tools to understand the role of household insects, rodents, and pets in the persistence, circulation, and health risks of AMR.
Affiliation(s)
- Willis Gwenzi
- Biosystems and Environmental Engineering Research Group, Department of Agricultural and Biosystems Engineering, University of Zimbabwe, Mount Pleasant, Harare P.O. Box MP167, Zimbabwe
- Correspondence: W.G.; A.L.K.A.
- Nhamo Chaukura
- Department of Physical and Earth Sciences, Sol Plaatje University, Kimberley 8300, South Africa
- Norah Muisa-Zikali
- Department of Environmental Sciences and Technology, School of Agricultural Sciences and Technology, Chinhoyi University of Technology, Private Bag, Chinhoyi 7724, Zimbabwe
- Charles Teta
- Future Water Institute, Faculty of Engineering & Built Environment, University of Cape Town, Cape Town 7700, South Africa
- Tendai Musvuugwa
- Department of Biological and Agricultural Sciences, Sol Plaatje University, Kimberley 8300, South Africa
- Piotr Rzymski
- Department of Environmental Medicine, Poznan University of Medical Sciences, 60-806 Poznan, Poland
- Integrated Science Association (ISA), Universal Scientific Education and Research Network (USERN), 60-806 Poznań, Poland
- Akebe Luther King Abia
- Antimicrobial Research Unit, College of Health Sciences, University of KwaZulu-Natal, Private Bag X54001, Durban 4000, South Africa
- Correspondence: W.G.; A.L.K.A.
|