1. Deb S, Basu J, Choudhary M. An overview of next generation sequencing strategies and genomics tools used for tuberculosis research. J Appl Microbiol 2024; 135:lxae174. [PMID: 39003248 DOI: 10.1093/jambio/lxae174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 06/07/2024] [Accepted: 07/10/2024] [Indexed: 07/15/2024]
Abstract
Tuberculosis (TB) is a grave public health concern and is considered the foremost contributor to human mortality resulting from infectious disease. Due to the stringent clonality and extremely restricted genomic diversity, conventional methods prove inefficient for in-depth exploration of minor genomic variations and the evolutionary dynamics operating in Mycobacterium tuberculosis (M.tb) populations. Until now, the majority of reviews have primarily focused on delineating the application of whole-genome sequencing (WGS) in predicting antibiotic resistance genes, surveillance of drug-resistant strains, and M.tb lineage classifications. Despite the growing use of next generation sequencing (NGS) and WGS analysis in TB research, few studies provide a comprehensive summary of their role in studying macroevolution and minor genetic variations, assessing mixed TB infections, and tracking transmission networks at an individual level. This highlights the need for a systematic effort to fully explore the potential of WGS and its associated tools in advancing our understanding of TB epidemiology and disease transmission. We delve into the recent bioinformatics pipelines and NGS strategies that leverage various genetic features and simultaneous exploration of host-pathogen protein expression profiles to decipher the genetic heterogeneity and host-pathogen interaction dynamics of M.tb infections. This review highlights the potential benefits and limitations of NGS and bioinformatics tools and discusses their role in TB detection and epidemiology. Overall, this review could be a valuable resource for researchers and clinicians interested in NGS-based approaches in TB research.
Affiliation(s)
- Sushanta Deb
  - Department of Veterinary Microbiology and Pathology, College of Veterinary Medicine, Washington State University, Pullman 99164, WA, United States
  - All India Institute of Medical Sciences, New Delhi 110029, India
- Jhinuk Basu
  - Department of Clinical Immunology and Rheumatology, Kalinga Institute of Medical Sciences (KIMS), KIIT University, Bhubaneswar 751024, India
- Megha Choudhary
  - All India Institute of Medical Sciences, New Delhi 110029, India
2. Gaudelet T, Day B, Jamasb AR, Soman J, Regep C, Liu G, Hayter JBR, Vickers R, Roberts C, Tang J, Roblin D, Blundell TL, Bronstein MM, Taylor-King JP. Utilizing graph machine learning within drug discovery and development. Brief Bioinform 2021; 22:bbab159. [PMID: 34013350 PMCID: PMC8574649 DOI: 10.1093/bib/bbab159] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Received: 12/09/2020] [Revised: 04/01/2021] [Accepted: 04/05/2021] [Indexed: 12/15/2022]
Abstract
Graph machine learning (GML) is receiving growing interest within the pharmaceutical and biotechnology industries for its ability to model biomolecular structures and the functional relationships between them, and to integrate multi-omic datasets, amongst other data types. Herein, we present a multidisciplinary academic-industrial review of the topic within the context of drug discovery and development. After introducing key terms and modelling approaches, we move chronologically through the drug development pipeline to identify and summarize work incorporating: target identification, design of small molecules and biologics, and drug repurposing. Whilst the field is still emerging, key milestones, including repurposed drugs entering in vivo studies, suggest GML will become a modelling framework of choice within biomedical machine learning.
Affiliation(s)
- Ben Day
  - Relation Therapeutics, London, UK
  - The Computer Laboratory, University of Cambridge, UK
- Arian R Jamasb
  - Relation Therapeutics, London, UK
  - The Computer Laboratory, University of Cambridge, UK
  - Department of Biochemistry, University of Cambridge, UK
- Jian Tang
  - Mila, the Quebec AI Institute, Canada
  - HEC Montreal, Canada
- David Roblin
  - Relation Therapeutics, London, UK
  - Juvenescence, London, UK
  - The Francis Crick Institute, London, UK
- Michael M Bronstein
  - Relation Therapeutics, London, UK
  - Department of Computing, Imperial College London, UK
  - Twitter, UK
3. Torres PHM, Rossi AD, Blundell TL. ProtCHOIR: a tool for proteome-scale generation of homo-oligomers. Brief Bioinform 2021; 22:bbab182. [PMID: 34015821 PMCID: PMC8574958 DOI: 10.1093/bib/bbab182] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/02/2021] [Revised: 04/04/2021] [Accepted: 04/20/2021] [Indexed: 01/10/2023]
Abstract
The rapid developments in gene sequencing technologies achieved in recent decades, along with the expansion of knowledge on the three-dimensional structures of proteins, have enabled the construction of proteome-scale databases of protein models such as Genome3D and ModBase. Nevertheless, although gene products are usually expressed as individual polypeptide chains, most biological processes are associated with either transient or stable oligomerisation. In the Protein Data Bank (PDB), for example, ~40% of the deposited structures contain at least one homo-oligomeric interface. Unfortunately, databases of protein models are generally devoid of multimeric structures. To tackle this particular issue, we have developed ProtCHOIR, a tool that is able to generate homo-oligomeric structures in an automated fashion, providing detailed information for the input protein and output complex. ProtCHOIR requires input of either a sequence or a protomeric structure that is queried against a pre-constructed local database of homo-oligomeric structures, then extensively analyzed using well-established tools such as PSI-Blast, MAFFT, PISA and Molprobity. Finally, MODELLER is employed to achieve the construction of the homo-oligomers. The output complex is thoroughly analyzed taking into account its stereochemical quality, interfacial stabilities, hydrophobicity and conservation profile. All these data are then summarized in a user-friendly HTML report that can be saved or printed as a PDF file. The software is easily parallelizable and also outputs a comma-separated file with summary statistics that can straightforwardly be concatenated as a spreadsheet-like document for large-scale data analyses. As a proof-of-concept, we built oligomeric models for the Mabellini Mycobacterium abscessus structural proteome database. ProtCHOIR can be run as a web-service and the code can be obtained free-of-charge at http://lmdm.biof.ufrj.br/protchoir.
4. Vedithi SC, Malhotra S, Acebrón-García-de-Eulate M, Matusevicius M, Torres PHM, Blundell TL. Structure-Guided Computational Approaches to Unravel Druggable Proteomic Landscape of Mycobacterium leprae. Front Mol Biosci 2021; 8:663301. [PMID: 34026836 PMCID: PMC8138464 DOI: 10.3389/fmolb.2021.663301] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/02/2021] [Accepted: 04/12/2021] [Indexed: 02/02/2023]
Abstract
Leprosy, caused by Mycobacterium leprae (M. leprae), is treated with a multidrug regimen comprising Dapsone, Rifampicin, and Clofazimine. These drugs exhibit bacteriostatic, bactericidal and anti-inflammatory properties, respectively, and control the dissemination of infection in the host. However, the current treatment is not cost-effective, does not favor patient compliance due to its long duration (12 months) and does not protect against the incumbent nerve damage, which is a severe leprosy complication. The chronic infectious peripheral neuropathy associated with the disease is primarily due to the bacterial components infiltrating the Schwann cells that protect neuronal axons, thereby inducing a demyelinating phenotype. There is a need to discover novel/repurposed drugs that can act as short duration and effective alternatives to the existing treatment regimens, preventing nerve damage and consequent disability associated with the disease. Mycobacterium leprae is an obligate pathogen, which makes the bacillus intractable to in vitro cultivation and limits drug discovery efforts to repositioning screens in mouse footpad models. The dearth of knowledge related to structural proteomics of M. leprae, coupled with emerging antimicrobial resistance to all three drugs in the multidrug therapy, poses a need for concerted novel drug discovery efforts. A comprehensive understanding of the proteomic landscape of M. leprae is indispensable to unravel druggable targets that are essential for bacterial survival and predilection of human neuronal Schwann cells. Of the 1,614 protein-coding genes in the genome of M. leprae, only 17 protein structures are available in the Protein Data Bank. In this review, we discuss efforts made to model the proteome of M. leprae using a suite of software for protein modeling that has been developed in the Blundell laboratory. Precise template selection by employing sequence-structure homology recognition software, multi-template modeling of the monomeric models and accurate quality assessment are the hallmarks of the modeling process. Tools that map interfaces and enable building of homo-oligomers are discussed in the context of interface stability. Other software is described to determine the druggable proteome by using information related to the chokepoint analysis of the metabolic pathways, gene essentiality, homology to human proteins, functional sites, druggable pockets and fragment hotspot maps.
Affiliation(s)
- Sundeep Chaitanya Vedithi (corresponding author)
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Sony Malhotra
  - Rutherford Appleton Laboratory, Science and Technology Facilities Council, Oxon, United Kingdom
- Pedro Henrique Monteiro Torres
  - Laboratório de Modelagem e Dinâmica Molecular, Instituto de Biofísica Carlos Chagas Filho, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Tom L. Blundell
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
5. Alsulami AF, Thomas SE, Jamasb AR, Beaudoin CA, Moghul I, Bannerman B, Copoiu L, Vedithi SC, Torres P, Blundell TL. SARS-CoV-2 3D database: understanding the coronavirus proteome and evaluating possible drug targets. Brief Bioinform 2021; 22:769-780. [PMID: 33416848 PMCID: PMC7929435 DOI: 10.1093/bib/bbaa404] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 09/16/2020] [Revised: 12/08/2020] [Accepted: 11/27/2020] [Indexed: 12/30/2022]
Abstract
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a rapidly growing infectious disease, widely spread with high mortality rates. Since the release of the SARS-CoV-2 genome sequence in March 2020, there has been an international focus on developing target-based drug discovery, which also requires knowledge of the 3D structure of the proteome. Where there are no experimentally solved structures, our group has created 3D models with coverage of 97.5% and characterized them using state-of-the-art computational approaches. Models of protomers and oligomers, together with predictions of substrate and allosteric binding sites, protein-ligand docking, SARS-CoV-2 protein interactions with human proteins, impacts of mutations, and mapped solved experimental structures are freely available for download. These are implemented in SARS CoV-2 3D, a comprehensive and user-friendly database, available at https://sars3d.com/. This provides essential information for drug discovery, both to evaluate targets and design new potential therapeutics.
Affiliation(s)
- Ali F Alsulami
  - Department of Biochemistry, at the University of Cambridge, UK
- Sherine E Thomas
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
- Arian R Jamasb
  - Department of Biochemistry, at the University of Cambridge, UK
- Liviu Copoiu
  - Department of Biochemistry, at the University of Cambridge, UK
- Sundeep Chaitanya Vedithi
  - Molecular Immunity Unit, Department of Medicine, University of Cambridge, MRC Laboratory of Molecular Biology, UK
- Pedro Torres
  - Laboratório de Modelagem e Dinâmica Molecular, Instituto de Biofísica Carlos Chagas Filho, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brasil
6. Munir A, Vedithi SC, Chaplin AK, Blundell TL. Genomics, Computational Biology and Drug Discovery for Mycobacterial Infections: Fighting the Emergence of Resistance. Front Genet 2020; 11:965. [PMID: 33101362 PMCID: PMC7498718 DOI: 10.3389/fgene.2020.00965] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 05/16/2020] [Accepted: 07/31/2020] [Indexed: 12/14/2022]
Abstract
Tuberculosis (TB) and leprosy are mycobacterial infections caused by Mycobacterium tuberculosis and Mycobacterium leprae respectively. These diseases continue to be endemic in developing countries where the cost of new medicines presents major challenges. The situation is further exacerbated by the emergence of resistance to many front-line antibiotics. A priority now is to design new antimycobacterials that are not only effective in combatting the diseases but are also less likely to give rise to resistance. In both these respects understanding the structure of drug targets in M. tuberculosis and M. leprae is crucial. In this review we describe structure-guided approaches to understanding the impacts of mutations that give rise to antimycobacterial resistance and the use of this information in the design of new medicines.
Affiliation(s)
- Asma Munir
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Amanda K Chaplin
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Tom L Blundell
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
7. A Personal History of Using Crystals and Crystallography to Understand Biology and Advance Drug Discovery. CRYSTALS 2020. [DOI: 10.3390/cryst10080676] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/07/2023]
Abstract
Over the past 60 years, the use of crystals to define structures of complexes using X-ray analysis has contributed to the discovery of new medicines in a very significant way. This has been in understanding not only small-molecule inhibitors of proteins, such as enzymes, but also protein or peptide hormones or growth factors that bind to cell surface receptors. Experimental structures from crystallography have also been exploited in software to allow prediction of structures of important targets based on knowledge of homologues. Crystals and crystallography continue to contribute to drug design and provide a successful example of academia–industry collaboration.
8. Waman VP, Blundell TL, Buchan DWA, Gough J, Jones D, Kelley L, Murzin A, Pandurangan AP, Sillitoe I, Sternberg M, Torres P, Orengo C. The Genome3D Consortium for Structural Annotations of Selected Model Organisms. Methods Mol Biol 2020; 2165:27-67. [PMID: 32621218 DOI: 10.1007/978-1-0716-0708-4_3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/11/2023]
Abstract
The Genome3D consortium is a collaborative project involving protein structure prediction and annotation resources developed by six world-leading structural bioinformatics groups based in the United Kingdom (namely Blundell, Murzin, Gough, Sternberg, Orengo, and Jones). The main objective of Genome3D is to serve as a common portal providing both predicted models and annotations of proteins in model organisms, using several resources developed by these labs such as CATH-Gene3D, DOMSERF, pDomTHREADER, PHYRE, SUPERFAMILY, FUGUE/TOCATTA, and VIVACE. These resources primarily use SCOP- and/or CATH-based protein domain assignments. Another objective of Genome3D is to compare structural classifications of protein domains in CATH and SCOP databases and to provide a consensus mapping of CATH and SCOP protein superfamilies. CATH/SCOP mapping analyses led to the identification of a total of 1429 consensus superfamilies. Currently, Genome3D provides structural annotations for ten model organisms, including Homo sapiens, Arabidopsis thaliana, Mus musculus, Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Plasmodium falciparum, Staphylococcus aureus, and Schizosaccharomyces pombe. Thus, Genome3D serves as a common gateway to each structure prediction/annotation resource and allows users to perform comparative assessment of the predictions. It thus helps researchers broaden their perspective on structure/function predictions of their query protein of interest in selected model organisms.
Affiliation(s)
- Vaishali P Waman
  - Institute of Structural and Molecular Biology, University College London, London, UK
- Tom L Blundell
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
- Daniel W A Buchan
  - Department of Computer Science, University College London, London, UK
- Julian Gough
  - MRC Laboratory of Molecular Biology, Cambridge, UK
- David Jones
  - Department of Computer Science, University College London, London, UK
- Lawrence Kelley
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, UK
- Ian Sillitoe
  - Institute of Structural and Molecular Biology, University College London, London, UK
- Michael Sternberg
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, UK
- Pedro Torres
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
- Christine Orengo
  - Institute of Structural and Molecular Biology, University College London, London, UK
9. Roy KK, Wani MA. Emerging opportunities of exploiting mycobacterial electron transport chain pathway for drug-resistant tuberculosis drug discovery. Expert Opin Drug Discov 2019; 15:231-241. [PMID: 31774006 DOI: 10.1080/17460441.2020.1696771] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 01/09/2023]
Abstract
Introduction: Tuberculosis (TB) is a leading infectious disease worldwide whose chemotherapy is challenged by the continued rise of drug resistance. This epidemic urges the need to discover anti-TB drugs with novel modes of action. Areas covered: The mycobacterial electron transport chain (ETC) pathway represents a hub of anti-TB drug targets. Herein, the authors highlight the various targets within the mycobacterial ETC and highlight some of the promising ETC-targeted drugs and clinical candidates that have been discovered or repurposed. Furthermore, recent breakthroughs in the availability of X-ray and/or cryo-EM structures of some targets are discussed, and various opportunities of exploiting these structures for the discovery of new anti-TB drugs are emphasized. Expert opinion: The drug discovery efforts targeting the ETC pathway have led to the FDA approval of bedaquiline, a FOF1-ATP synthase inhibitor, and the discovery of Q203, a clinical candidate drug targeting the mycobacterial cytochrome bcc-aa3 supercomplex. Moreover, clofazimine, a proposed prodrug competing with menaquinone for its reduction by mycobacterial NADH dehydrogenase 2, has been repurposed for TB treatment. Recently available structures of the mycobacterial ATP synthase C9 rotary ring and the cytochrome bcc-aa3 supercomplex represent further opportunities for the structure-based drug design (SBDD) of the next-generation of inhibitors against Mycobacterium tuberculosis.
Affiliation(s)
- Kuldeep K Roy
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Kolkata, India
- Mushtaq Ahmad Wani
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Kolkata, India
10. Skwark MJ, Torres PHM, Copoiu L, Bannerman B, Floto RA, Blundell TL. Mabellini: a genome-wide database for understanding the structural proteome and evaluating prospective antimicrobial targets of the emerging pathogen Mycobacterium abscessus. Database (Oxford) 2019; 2019:5611286. [PMID: 31681953 PMCID: PMC6853642 DOI: 10.1093/database/baz113] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 04/03/2019] [Revised: 07/31/2019] [Accepted: 08/28/2019] [Indexed: 02/02/2023]
Abstract
Mycobacterium abscessus, a rapidly growing, multidrug-resistant, nontuberculous mycobacterium, can cause a wide range of opportunistic infections, particularly in immunocompromised individuals. M. abscessus has emerged as a growing threat to patients with cystic fibrosis, where it causes accelerated inflammatory lung damage, is difficult and sometimes impossible to treat and can prevent safe transplantation. There is therefore an urgent unmet need to develop new therapeutic strategies. The elucidation of the M. abscessus genome in 2009 opened a wide range of research possibilities in the field of drug discovery that can be more effectively exploited upon the characterization of the structural proteome. Where there are no experimental structures, we have used the available amino acid sequences to create 3D models of the majority of the remaining proteins that constitute the M. abscessus proteome (3394 proteins and over 13 000 models) using a range of up-to-date computational tools, many developed by our own group. The models are freely available for download in an on-line database, together with quality data and functional annotation. Furthermore, we have developed an intuitive and user-friendly web interface (http://www.mabellinidb.science) that enables easy browsing, querying and retrieval of the proteins of interest. We believe that this resource will be of use in evaluating the prospective targets for design of antimicrobial agents and will serve as a cornerstone to support the development of new molecules to treat M. abscessus infections.
Affiliation(s)
- Marcin J Skwark
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
- Pedro H M Torres
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
- Liviu Copoiu
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
- Bridget Bannerman
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
- R Andres Floto
  - Molecular Immunity Unit, Department of Medicine, University of Cambridge, MRC-Laboratory of Molecular Biology, Cambridge CB2 0QH, UK
  - Cambridge Centre for Lung Infection, Royal Papworth Hospital, Cambridge CB23 3RE, UK
- Tom L Blundell (corresponding author; Tel: +44 1223 333628; Fax: +44 1223 766002)
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
11. A platform for target prediction of phenotypic screening hit molecules. J Mol Graph Model 2019; 95:107485. [PMID: 31836397 PMCID: PMC6983931 DOI: 10.1016/j.jmgm.2019.107485] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/01/2019] [Revised: 09/25/2019] [Accepted: 10/21/2019] [Indexed: 01/09/2023]
Abstract
Many drug discovery programmes, particularly for infectious diseases, are conducted phenotypically. Identifying the targets of phenotypic screening hits experimentally can be complex, time-consuming, and expensive. However, it would be valuable to know what the molecular target(s) is, as knowledge of the binding pose of the hit molecule in the binding site can facilitate the compound optimisation. Furthermore, knowing the target would allow de-prioritisation of less attractive chemical series or molecular targets. To generate target-hypotheses for phenotypic active compounds, an in silico platform was developed that utilises both ligand and protein-structure information to generate a ranked set of predicted molecular targets. As a result of the web-based workflow the user obtains a set of 3D structures of the predicted targets with the active molecule bound. The platform was exemplified using Mycobacterium tuberculosis, the causative organism of tuberculosis. In a test that we performed, the platform was able to predict the targets of 60% of compounds investigated, where there was some similarity to a ligand in the protein database.
- An algorithm to predict the molecular target(s) of phenotypic hits against TB.
- Uses information based on the ligand and protein structure.
- Allows visualisation of proposed binding pose.
- Web interface developed.
12. Malhotra S, Alsulami AF, Heiyun Y, Ochoa BM, Jubb H, Forbes S, Blundell TL. Understanding the impacts of missense mutations on structures and functions of human cancer-related genes: A preliminary computational analysis of the COSMIC Cancer Gene Census. PLoS One 2019; 14:e0219935. [PMID: 31323058 PMCID: PMC6641202 DOI: 10.1371/journal.pone.0219935] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/06/2019] [Accepted: 07/03/2019] [Indexed: 12/12/2022]
Abstract
Genomics and genome screening are proving central to the study of cancer. However, a good appreciation of the protein structures coded by cancer genes is also invaluable, especially for the understanding of functions, for assessing ligandability of potential targets, and for designing new drugs. To complement the wealth of information on the genetics of cancer in COSMIC, the most comprehensive database for cancer somatic mutations available, structural information obtained experimentally has been brought together recently in COSMIC-3D. Even where structural information is available for a gene in the Cancer Gene Census, a list of genes in COSMIC with substantial evidence supporting their impacts in cancer, this information is quite often for a single domain in a larger protein or for a single protomer in a multiprotein assembly. Here, we show that over 60% of the genes included in the Cancer Gene Census are predicted to possess multiple domains. Many are also multicomponent and membrane-associated molecular assemblies, with mutations recorded in COSMIC affecting such assemblies. However, only 469 of the gene products have a structure represented in the PDB, and of these only 87 structures have 90-100% coverage over the sequence and 69 have less than 10% coverage. As a first step to bridging gaps in our knowledge in the many cases where individual protein structures and domains are lacking, we discuss our attempts at protein structure modelling using our pipeline, investigating the effects of mutations using two of our in-house methods (SDM2 and mCSM), and identifying potential driver mutations. This allows us to begin to understand the effects of mutations not only on protein stability but also on protein-protein, protein-ligand and protein-nucleic acid interactions. In addition, we consider ways to combine the structural information with the wealth of mutation data available in COSMIC. We discuss the impacts of COSMIC missense mutations on protein structure in order to identify and assess the molecular consequences of cancer-driving mutations.
Affiliation(s)
- Sony Malhotra
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Ali F. Alsulami
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Yang Heiyun
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Harry Jubb
  - Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom
- Simon Forbes
  - Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom
- Tom L. Blundell
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
13. Pandurangan AP, Ochoa-Montaño B, Ascher DB, Blundell TL. SDM: a server for predicting effects of mutations on protein stability. Nucleic Acids Res 2017; 45:W229-W235. [PMID: 28525590 PMCID: PMC5793720 DOI: 10.1093/nar/gkx439] [Citation(s) in RCA: 359] [Impact Index Per Article: 59.8] [Received: 02/10/2017] [Accepted: 05/15/2017] [Indexed: 02/02/2023]
Abstract
Here, we report a webserver for the improved SDM, used for predicting the effects of mutations on protein stability. As a pioneering knowledge-based approach, SDM has been highlighted as the most appropriate method to use in combination with many other approaches. We have updated the environment-specific amino-acid substitution tables based on the current expanded PDB (a 5-fold increase in information), and introduced new residue-conformation and interaction parameters, including packing density and residue depth. The updated server has been extensively tested using a benchmark containing 2690 point mutations from 132 different protein structures. The revised method correlates well against the hypothetical reverse mutations, better than comparable methods built using machine-learning approaches, highlighting the strength of our knowledge-based approach for identifying stabilising mutations. Given a PDB file (a Protein Data Bank file format containing the 3D coordinates of the protein atoms), and a point mutation, the server calculates the stability difference score between the wildtype and mutant protein. The server is available at http://structure.bioc.cam.ac.uk/sdm2
Affiliation(s)
- David B Ascher
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
  - Department of Biochemistry and Molecular Biology, University of Melbourne, Australia
- Tom L Blundell
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
14. Waman VP, Vedithi SC, Thomas SE, Bannerman BP, Munir A, Skwark MJ, Malhotra S, Blundell TL. Mycobacterial genomics and structural bioinformatics: opportunities and challenges in drug discovery. Emerg Microbes Infect 2019; 8:109-118. [PMID: 30866765 PMCID: PMC6334779 DOI: 10.1080/22221751.2018.1561158] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 10/15/2018] [Revised: 12/03/2018] [Accepted: 12/09/2018] [Indexed: 01/08/2023]
Abstract
Of the more than 190 distinct species of Mycobacterium genus, many are economically and clinically important pathogens of humans or animals. Among those mycobacteria that infect humans, three species namely Mycobacterium tuberculosis (causative agent of tuberculosis), Mycobacterium leprae (causative agent of leprosy) and Mycobacterium abscessus (causative agent of chronic pulmonary infections) pose concern to global public health. Although antibiotics have been successfully developed to combat each of these, the emergence of drug-resistant strains is an increasing challenge for treatment and drug discovery. Here we describe the impact of the rapid expansion of genome sequencing and genome/pathway annotations that have greatly improved the progress of structure-guided drug discovery. We focus on the applications of comparative genomics, metabolomics, evolutionary bioinformatics and structural proteomics to identify potential drug targets. The opportunities and challenges for the design of drugs for M. tuberculosis, M. leprae and M. abscessus to combat resistance are discussed.
Affiliation(s)
- Asma Munir
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Marcin J. Skwark
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Sony Malhotra
- Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck College, University of London, London, UK
- Tom L. Blundell
- Department of Biochemistry, University of Cambridge, Cambridge, UK
15
Portelli S, Phelan JE, Ascher DB, Clark TG, Furnham N. Understanding molecular consequences of putative drug resistant mutations in Mycobacterium tuberculosis. Sci Rep 2018; 8:15356. [PMID: 30337649 PMCID: PMC6193939 DOI: 10.1038/s41598-018-33370-6] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Received: 05/22/2018] [Accepted: 09/26/2018] [Indexed: 12/21/2022]
Abstract
Genomic studies of Mycobacterium tuberculosis bacteria have revealed loci associated with resistance to anti-tuberculosis drugs. However, the molecular consequences of polymorphism within these candidate loci remain poorly understood. To address this, we have used computational tools to quantify the effects of point mutations conferring resistance to three major anti-tuberculosis drugs, isoniazid (n = 189), rifampicin (n = 201) and D-cycloserine (n = 48), within their primary targets, katG, rpoB, and alr. Notably, mild biophysical effects brought about by high-incidence mutations were considered more tolerable, while different structural effects brought about by haplotype combinations reflected differences in their functional importance. Additionally, highly destabilising mutations, such as alr Y388, highlighted the functional importance of the wildtype residue. Our qualitative analysis enabled us to map resistance mutations onto a theoretical landscape linking enthalpic changes with phenotype. Such insights will aid the development of new resistance-resistant drugs and, via integration into predictive tools, pathogen surveillance.
Affiliation(s)
- Stephanie Portelli
- Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Victoria, 3051, Australia
- Jody E Phelan
- Department of Pathogen Molecular Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- David B Ascher
- Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Victoria, 3051, Australia
- Taane G Clark
- Department of Pathogen Molecular Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- Nicholas Furnham
- Department of Pathogen Molecular Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
16
Malhotra S, Mugumbate G, Blundell TL, Higueruelo AP. TIBLE: a web-based, freely accessible resource for small-molecule binding data for mycobacterial species. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2018; 2017:3866794. [PMID: 29220433 PMCID: PMC5502366 DOI: 10.1093/database/bax041] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 01/31/2017] [Accepted: 04/25/2017] [Indexed: 02/03/2023]
Abstract
Database URL: http://www-cryst.bioc.cam.ac.uk/tible/
Affiliation(s)
- Sony Malhotra
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA, UK
- Grace Mugumbate
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA, UK
- Tom L Blundell
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA, UK
- Alicia P Higueruelo
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA, UK
17
Decoding the similarities and differences among mycobacterial species. PLoS Negl Trop Dis 2017; 11:e0005883. [PMID: 28854187 PMCID: PMC5595346 DOI: 10.1371/journal.pntd.0005883] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 05/20/2017] [Revised: 09/12/2017] [Accepted: 08/18/2017] [Indexed: 11/19/2022]
Abstract
Mycobacteriaceae comprises pathogenic species such as Mycobacterium tuberculosis, M. leprae and M. abscessus, as well as non-pathogenic species, for example, M. smegmatis and M. thermoresistibile. Genome comparison and annotation studies provide insights into genome evolutionary relatedness, identify unique and pathogenicity-related genes in each species, and explore new targets that could be used for developing new diagnostics and therapeutics. Here, we present a comparative analysis of ten mycobacterial genomes with the objective of identifying similarities and differences between pathogenic and non-pathogenic species. We identified 1080 core orthologous clusters that were enriched in proteins involved in amino acid and purine/pyrimidine biosynthetic pathways, DNA-related processes (replication, transcription, recombination and repair), RNA-methylation and modification, and cell-wall polysaccharide biosynthetic pathways. For their pathogenicity and survival in the host cell, pathogenic species have gained specific sets of genes involved in repair and protection of their genomic DNA. M. leprae is of special interest because it has the smallest genome (1600 genes and ~1300 pseudogenes), yet its genome is poorly annotated. More than 75% of the pseudogenes were found to have a functional ortholog in the other mycobacterial genomes and belong to protein families such as transferases, oxidoreductases and hydrolases.
Members of the Mycobacteriaceae family, which are known to adapt to different environmental niches, comprise bacterial species with varied genome sizes. They are unique in their cell-wall composition, which is remarkably thick and lipid-rich as compared to other bacteria. We performed a comparative analysis at the proteome level for ten mycobacterial species that differ in their pathogenicity, genome size and environmental niches. A total of 1080 orthologous clusters with representation from all ten species were obtained, and these were further examined for their domain annotations, domain architecture similarities and enriched GO terms. These core orthologous clusters are enriched in various biosynthetic pathways. The proteins that are specific to each of the ten species were also investigated for their GO functions. The M. leprae genome has a large number of pseudogenes, and we searched for their functional orthologs in other mycobacterial species in order to understand the functions that are lost from the M. leprae genome. The proteins present exclusively in the M. leprae genome were studied in more detail in order to predict putative drug targets and diagnostic markers. These findings, which have implications for understanding the evolution of mycobacterial genomes, identify species-specific proteins that have potential for use in developing new diagnostic tools and therapeutics.
18
Maitra A, Kamil TK, Shaik M, Danquah CA, Chrzastek A, Bhakta S. Early diagnosis and effective treatment regimens are the keys to tackle antimicrobial resistance in tuberculosis (TB): A report from Euroscicon's international TB Summit 2016. Virulence 2017; 8:1005-1024. [PMID: 27813702 PMCID: PMC5626228 DOI: 10.1080/21505594.2016.1256536] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 08/30/2016] [Accepted: 10/27/2016] [Indexed: 12/22/2022]
Abstract
To say that tuberculosis (TB) has regained a strong foothold in the global human health and wellbeing scenario would be an understatement. Ranking alongside HIV/AIDS as the top reason for mortality due to a single infectious disease, the impact of TB extends far into socio-economic context worldwide. As global efforts led by experts and political bodies converge to mitigate the predicted outcome of growing antimicrobial resistance, the academic community of students, practitioners and researchers have mobilised to develop integrated, inter-disciplinary programmes to bring the plans of the former to fruition. Enabling this crucial requirement for unimpeded dissemination of scientific discovery was the TB Summit 2016, held in London, United Kingdom. This report critically discusses the recent breakthroughs made in diagnostics and treatment while bringing to light the major hurdles in the control of the disease as discussed in the course of the 3-day international event. Conferences and symposia such as these are the breeding grounds for successful local and global collaborations and therefore must be supported to expand the understanding and outreach of basic science research.
Affiliation(s)
- Arundhati Maitra
- Mycobacteria Research Laboratory, Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck, University of London, London, UK
- Tengku Karmila Kamil
- Mycobacteria Research Laboratory, Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck, University of London, London, UK
- Monisha Shaik
- Mycobacteria Research Laboratory, Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck, University of London, London, UK
- Cynthia Amaning Danquah
- Mycobacteria Research Laboratory, Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck, University of London, London, UK
- Alina Chrzastek
- Mycobacteria Research Laboratory, Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck, University of London, London, UK
- Sanjib Bhakta
- Mycobacteria Research Laboratory, Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck, University of London, London, UK
19
Lam SD, Das S, Sillitoe I, Orengo C. An overview of comparative modelling and resources dedicated to large-scale modelling of genome sequences. Acta Crystallogr D Struct Biol 2017; 73:628-640. [PMID: 28777078 PMCID: PMC5571743 DOI: 10.1107/s2059798317008920] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 11/17/2016] [Accepted: 06/14/2017] [Indexed: 12/02/2022]
Abstract
Computational modelling of proteins has been a major catalyst in structural biology. Bioinformatics groups have exploited the repositories of known structures to predict high-quality structural models with high efficiency at low cost. This article provides an overview of comparative modelling, reviews recent developments and describes resources dedicated to large-scale comparative modelling of genome sequences. The value of subclustering protein domain superfamilies to guide the template-selection process is investigated. Some recent cases in which structural modelling has aided experimental work to determine very large macromolecular complexes are also cited.
Affiliation(s)
- Su Datt Lam
- Institute of Structural and Molecular Biology, UCL, Darwin Building, Gower Street, London WC1E 6BT, England
- School of Biosciences and Biotechnology, Faculty of Science and Technology, University Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia
- Sayoni Das
- Institute of Structural and Molecular Biology, UCL, Darwin Building, Gower Street, London WC1E 6BT, England
- Ian Sillitoe
- Institute of Structural and Molecular Biology, UCL, Darwin Building, Gower Street, London WC1E 6BT, England
- Christine Orengo
- Institute of Structural and Molecular Biology, UCL, Darwin Building, Gower Street, London WC1E 6BT, England
20
Ochoa-Montaño B, Blundell TL. XSuLT: a web server for structural annotation and representation of sequence-structure alignments. Nucleic Acids Res 2017; 45:W381-W387. [PMID: 28510698 PMCID: PMC5793734 DOI: 10.1093/nar/gkx421] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/21/2017] [Accepted: 05/04/2017] [Indexed: 12/16/2022]
Abstract
The web server XSuLT, an enhanced version of the protein alignment annotation program JoY, formats a submitted multiple-sequence alignment using three-dimensional (3D) structural information in order to assist in the comparative analysis of protein evolution and in the optimization of alignments for comparative modelling and construct design. In addition to the features analysed by JoY, which include secondary structure, solvent accessibility and sidechain hydrogen bonds, XSuLT annotates each amino acid residue with residue depth, chain and ligand interactions, inter-residue contacts, sequence entropy, root mean square deviation and secondary structure and disorder prediction. It is also now integrated with built-in 3D visualization which interacts with the formatted alignment to facilitate inspection and understanding. Results can be downloaded as stand-alone HTML for the formatted alignment and as XML with the underlying annotation data. XSuLT is freely available at http://structure.bioc.cam.ac.uk/xsult/.
Affiliation(s)
- Tom L Blundell
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
21
Metri R, Hariharaputran S, Ramakrishnan G, Anand P, Raghavender US, Ochoa-Montaño B, Higueruelo AP, Sowdhamini R, Chandra NR, Blundell TL, Srinivasan N. SInCRe-structural interactome computational resource for Mycobacterium tuberculosis. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2015; 2015:bav060. [PMID: 26130660 PMCID: PMC4485431 DOI: 10.1093/database/bav060] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 01/23/2015] [Accepted: 05/26/2015] [Indexed: 11/20/2022]
Abstract
We have developed an integrated database for Mycobacterium tuberculosis H37Rv (Mtb) that collates information on protein sequences, domain assignments, functional annotation and 3D structural information along with protein–protein and protein–small molecule interactions. SInCRe (Structural Interactome Computational Resource) was developed out of the CamBan (Cambridge and Bangalore) collaboration. The motivation for developing this database is to provide an integrated platform that allows easy access to, and interpretation of, data and results obtained by all the groups in CamBan in the field of Mtb informatics. In-house algorithms and databases developed independently by various academic groups in CamBan are used to generate Mtb-specific datasets and are integrated in this database to provide a structural dimension to studies on tuberculosis. The SInCRe database readily provides information on identification of functional domains, genome-scale modelling of structures of Mtb proteins and characterization of the small-molecule binding sites within Mtb. The resource also provides structure-based function annotation, information on small-molecule binders including FDA (Food and Drug Administration)-approved drugs, protein–protein interactions (PPIs) and natural compounds that potentially bind to pathogen proteins and weaken or eliminate host–pathogen protein–protein interactions. Together they provide prerequisites for identification of off-target binding. Database URL: http://proline.biochem.iisc.ernet.in/sincre
Affiliation(s)
- Rahul Metri
- Department of Biochemistry and Indian Institute of Science Mathematics Initiative, Indian Institute of Science, Bangalore, India
- Sridhar Hariharaputran
- Department of Biochemistry and National Centre for Biological Sciences, TIFR, UAS-GKVK Campus, Bellary Road, Bangalore, India
- Gayatri Ramakrishnan
- Indian Institute of Science Mathematics Initiative, Indian Institute of Science, Bangalore, India
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Alicia P Higueruelo
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, UK
- Ramanathan Sowdhamini
- National Centre for Biological Sciences, TIFR, UAS-GKVK Campus, Bellary Road, Bangalore, India
- Tom L Blundell
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, UK