1
Meng Y, Zhang Z, Zhou C, Tang X, Hu X, Tian G, Yang J, Yao Y. Protein structure prediction via deep learning: an in-depth review. Front Pharmacol 2025; 16:1498662. [PMID: 40248099 PMCID: PMC12003282 DOI: 10.3389/fphar.2025.1498662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2024] [Accepted: 02/28/2025] [Indexed: 04/19/2025] [Open access]
Abstract
The application of deep learning algorithms in protein structure prediction has greatly influenced drug discovery and development. Accurate protein structures are crucial for understanding biological processes and designing effective therapeutics. Traditionally, experimental methods like X-ray crystallography, nuclear magnetic resonance, and cryo-electron microscopy have been the gold standard for determining protein structures. However, these approaches are often costly, inefficient, and time-consuming. At the same time, the number of known protein sequences far exceeds the number of experimentally determined structures, creating a gap that necessitates the use of computational approaches. Deep learning has emerged as a promising solution to address this challenge over the past decade. This review provides a comprehensive guide to applying deep learning methodologies and tools in protein structure prediction. We first outline the databases related to protein structure prediction, then delve into recently developed large language models as well as state-of-the-art deep learning-based methods. The review concludes with a perspective on the future of protein structure prediction, highlighting potential challenges and opportunities.
Affiliation(s)
- Yajie Meng
  - College of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, China
- Zhuang Zhang
  - College of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, China
- Chang Zhou
  - College of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, China
- Xianfang Tang
  - College of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, China
- Xinrong Hu
  - College of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, China
- Yuhua Yao
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
  - Key Laboratory of Data Science and Intelligence Education, Ministry of Education, Hainan Normal University, Haikou, China
  - Key Laboratory of Computational Science and Application of Hainan Province, Hainan Normal University, Haikou, China
2
Islamaj R, Wei CH, Lai PT, Huston M, Coss C, Kochar PG, Miliaras N, Mork JG, Rodionov O, Sekiya K, Trinh D, Whitman D, Wallin C, Lu Z. Assessing Artificial Intelligence (AI) Implementation for Assisting Gene Linking (at the National Library of Medicine). JAMIA Open 2025; 8:ooae129. [PMID: 39776621 PMCID: PMC11706533 DOI: 10.1093/jamiaopen/ooae129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2024] [Revised: 10/18/2024] [Accepted: 11/16/2024] [Indexed: 01/11/2025] [Open access]
Abstract
Objectives: The National Library of Medicine (NLM) currently indexes close to a million articles each year from more than 5300 medicine and life sciences journals. Of these, a significant number of articles contain critical information about the structure, genetics, and function of genes and proteins in normal and disease states. These articles are identified by the NLM curators, and a manual link is created between these articles and the corresponding gene records at the NCBI Gene database. Thus, the information is interconnected with all the NLM resources and services, which brings considerable value to life sciences. The NLM aims to provide timely access to all metadata, and this necessitates that article indexing scales to the volume of the published literature. On the other hand, although automatic information extraction methods have been shown to achieve accurate results in biomedical text mining research, it remains difficult to evaluate them on established pipelines and integrate them within daily workflows. Materials and Methods: Here, we demonstrate how our machine learning model, GNorm2, which achieved state-of-the-art performance in identifying genes and their corresponding species while handling innate textual ambiguities, could be integrated with the established daily workflow at the NLM and evaluated for its performance in this new environment. Results: We worked with 8 biomedical curator experts and evaluated the integration using these parameters: (1) gene identification accuracy, (2) interannotator agreement with and without GNorm2, (3) GNorm2 potential bias, and (4) indexing consistency and efficiency. We identified key interface changes that significantly helped the curators to maximize the GNorm2 benefit, and further improved the GNorm2 algorithm to cover 135 species of genes, including viral and bacterial genes, based on the biocurator expert survey. Conclusion: GNorm2 is currently in the process of being fully integrated into the regular curators' workflow.
Affiliation(s)
- Rezarta Islamaj
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Chih-Hsuan Wei
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Po-Ting Lai
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Melanie Huston
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Cathleen Coss
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Preeti Gokal Kochar
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Nicholas Miliaras
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- James G Mork
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Oleg Rodionov
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Keiko Sekiya
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Dorothy Trinh
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Deborah Whitman
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Craig Wallin
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Zhiyong Lu
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
3
Iyyappan Y, Dhayabaran V, Elayappan M, Chaudhary SK, Palaniappan C, Kanagaraj S. Functional characterization of a hypothetical protein (TTHA1873) from Thermus thermophilus. Proteins 2023; 91:1427-1436. [PMID: 37254593 DOI: 10.1002/prot.26530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 04/06/2023] [Accepted: 04/19/2023] [Indexed: 06/01/2023]
Abstract
Thermus thermophilus is an extremely thermophilic organism that thrives at a temperature of 65°C. The T. thermophilus genome has ~2218 genes, of which 66% (1482 genes) have been annotated; the remaining 34% (736 genes) are assigned as hypothetical proteins. In this work, biochemical and biophysical experiments were performed to characterize the hypothetical protein TTHA1873 from T. thermophilus. TTHA1873 acts as a nuclease that indiscriminately cuts methylated and non-methylated DNA in the presence of divalent metal ions and relaxes plasmid DNA in the presence of ATP. Chelation of the metal ions with EDTA inhibits its activity. These results suggest that TTHA1873 could be a CRISPR-associated protein with non-specific DNase activity and ATP-dependent DNA-relaxing activity.
Affiliation(s)
- Yuvaraj Iyyappan
  - Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India
  - National Institute for Plant Biotechnology, New Delhi, India
- Vaigundan Dhayabaran
  - Genomics and Central Research Laboratory, Department of Cell Biology and Molecular Genetics, Sri Devaraj Urs Academy of Higher Education and Research, Kolar, India
- Mohanapriya Elayappan
  - Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India
- Santosh Kumar Chaudhary
  - Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India
  - Chemical Biology and Therapeutics Sciences, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Chandrasekaran Palaniappan
  - Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Sekar Kanagaraj
  - Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India
4
Pan G, Sun C, Liao Z, Tang J. Machine and Deep Learning for Prediction of Subcellular Localization. Methods Mol Biol 2022; 2361:249-261. [PMID: 34236666 DOI: 10.1007/978-1-0716-1641-3_15] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/07/2023]
Abstract
Protein subcellular localization prediction (PSLP), which plays an important role in computational biology, identifies the position and function of proteins in cells without the cost and labor of experimental approaches. In the past few decades, various methods with different algorithms have been proposed to solve the problem of subcellular localization prediction; machine learning and deep learning account for a large portion of them. To provide an overview of those methods, the first part of this article briefly reviews several state-of-the-art machine learning methods for subcellular localization prediction; then the materials used for subcellular localization prediction are described, and a simple prediction method that takes protein sequences as input and uses a convolutional neural network as the classifier is introduced. Finally, a list of notes indicates the major problems that may occur with this method.
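As an illustration of how a protein sequence might be turned into numeric input for a convolutional classifier, the following is a minimal sketch of one-hot encoding. It is not the chapter's own preprocessing: the function name `one_hot_encode`, the residue ordering, the fixed length `max_len`, and the zero-padding convention are all assumptions made for this example.

```python
# Assumption: the 20 standard amino acids, in an arbitrary fixed order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, max_len):
    """Encode a protein sequence as a max_len x 20 matrix (list of lists).

    Hypothetical conventions for this sketch: sequences longer than
    max_len are truncated, shorter ones are zero-padded, and residues
    outside the 20 standard letters become all-zero rows.
    """
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, residue in enumerate(sequence.upper()[:max_len]):
        idx = AA_INDEX.get(residue)
        if idx is not None:
            matrix[pos][idx] = 1
    return matrix


# Toy sequence: three standard residues and one unknown letter (X).
encoded = one_hot_encode("MKTX", max_len=6)
```

A matrix of this shape could then be fed to a 1D convolution over the sequence axis; the choice of `max_len` trades off truncation against padding.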
Affiliation(s)
- Gaofeng Pan
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, USA
- Chao Sun
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, USA
- Zijun Liao
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, USA
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fujian, China
- Jijun Tang
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, USA
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
5
Widowati W, Handono K, Marlina M, Sholihah IA, Jasaputra DK, Wargasetia TL, Subangkit M, Faried A, Girsang E, Lister IN, Ginting CN, Nainggolan IM, Rizal R, Kusuma H, Chiuman L. In Silico Approach for Pro-inflammatory Protein Interleukin 1β and Interleukin-1 Receptor Antagonist Protein Docking as Potential Therapy for COVID-19 Disease. Open Access Maced J Med Sci 2022. [DOI: 10.3889/oamjms.2022.7405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022] [Open access]
Abstract
Background: Interleukin-1 receptor antagonist (IL-1Ra), also known as anakinra, is a receptor antagonist of IL-1, especially IL-1β. IL-1β is increased in patients infected with COVID-19. This study proposes, through an in silico approach, that the IL-1Ra contained in conditioned medium from Wharton's jelly mesenchymal stem cells (CM-WJMSCs) has the potential to inhibit IL-1β, one of the cytokines involved in the cytokine storm that occurs in COVID-19 patients. Objective: This study uses an in silico approach to examine the interaction between the pro-inflammatory protein interleukin-1β (IL-1β) and interleukin-1 receptor antagonist protein, a cytokine from WJ-MSCs, as a potential treatment for COVID-19. Methods: 3D structures were built by homology modeling on the SWISS-MODEL web server. Molecular docking was performed to analyze the binding mode of COVID-19-related IL-1β with IL-1Ra, and the docking results were refined using the FireDock web server. Results: Docking between IL-1β and the CM-WJMSCs component IL-1Ra showed that IL-1Ra meets the criteria for docking on IL-1β, with good QMEAN, CscoreLB, and BS-score results; the lowest energy obtained was -585.1 kJ/mol. IL-1Ra is therefore predicted to inhibit IL-1β, which contributes to cytokine storms in COVID-19 patients. Conclusion: CM-WJMSCs are a potential treatment for reducing the severity of COVID-19 infection.
6
Shrestha R, Fajardo E, Gil N, Fidelis K, Kryshtafovych A, Monastyrskyy B, Fiser A. Assessing the accuracy of contact predictions in CASP13. Proteins 2019; 87:1058-1068. [PMID: 31587357 PMCID: PMC6851495 DOI: 10.1002/prot.25819] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 04/25/2019] [Revised: 09/17/2019] [Accepted: 09/17/2019] [Indexed: 01/07/2023]
Abstract
The accuracy of sequence-based tertiary contact predictions was assessed in a blind prediction experiment at the CASP13 meeting. After 4 years of significant improvements in prediction accuracy, another dramatic advance has taken place since CASP12 was held 2 years ago. The precision of predicting the top L/5 contacts in the free modeling category, where L is the corresponding length of the protein in residues, has exceeded 70%. As a comparison, the best-performing group at CASP12 with a 47% precision would have finished below the top 1/3 of the CASP13 groups. Extensively trained deep neural network approaches dominate the top performing algorithms, which appear to efficiently integrate information on coevolving residues and interacting fragments or possibly utilize memories of sequence similarities and sometimes can deliver accurate results even in the absence of virtually any target specific evolutionary information. If the current performance is evaluated by F-score on L contacts, it stands around 24% right now, which, despite the tremendous impact and advance in improving its utility for structure modeling, also suggests that there is much room left for further improvement.
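The top-L/5 precision mentioned in this abstract can be sketched as follows: rank predicted residue pairs by score, keep the top L/5 (L being the sequence length), and count what fraction are true contacts. This is a simplified illustration only; the function name, the tuple layout of predictions, and the toy data are assumptions, and the official CASP evaluation applies further details (such as how contacts are defined and which residue separations are counted) that are not reproduced here.

```python
def top_fraction_precision(predicted, true_contacts, seq_len, k=5):
    """Precision of the top seq_len // k highest-scoring predicted contacts.

    predicted:     list of (i, j, score) tuples (hypothetical layout)
    true_contacts: set of (i, j) pairs with i < j
    """
    n_top = max(1, seq_len // k)
    # Highest-scoring predictions first, then keep only the top n_top.
    ranked = sorted(predicted, key=lambda c: c[2], reverse=True)[:n_top]
    if not ranked:
        return 0.0
    hits = sum(
        1 for i, j, _ in ranked if (min(i, j), max(i, j)) in true_contacts
    )
    return hits / len(ranked)


# Toy example with L = 10, so the top L/5 = 2 predictions are evaluated.
preds = [(1, 8, 0.9), (2, 9, 0.8), (3, 7, 0.4)]
truth = {(1, 8), (3, 7)}
precision = top_fraction_precision(preds, truth, seq_len=10)
```

In this toy case one of the two top-ranked pairs is a true contact, so the precision is 0.5; the correct but low-scoring pair (3, 7) does not count because it falls outside the top L/5.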
Affiliation(s)
- Rojan Shrestha
  - Department of Systems and Computational Biology, and Department of Biochemistry, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Eduardo Fajardo
  - Department of Systems and Computational Biology, and Department of Biochemistry, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Nelson Gil
  - Department of Systems and Computational Biology, and Department of Biochemistry, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Krzysztof Fidelis
  - Genome Center, University of California, Davis, 451 Health Sciences Dr., Davis CA 95616-8816, USA
- Andriy Kryshtafovych
  - Genome Center, University of California, Davis, 451 Health Sciences Dr., Davis CA 95616-8816, USA
- Bohdan Monastyrskyy
  - Genome Center, University of California, Davis, 451 Health Sciences Dr., Davis CA 95616-8816, USA
- Andras Fiser
  - Department of Systems and Computational Biology, and Department of Biochemistry, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
7
Saidi R, Boudellioua I, Martin MJ, Solovyev V. Rule Mining Techniques to Predict Prokaryotic Metabolic Pathways. Methods Mol Biol 2017; 1613:311-331. [PMID: 28849566 DOI: 10.1007/978-1-4939-7027-8_12] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/14/2022]
Abstract
It is becoming more evident that computational methods are needed for the identification and mapping of pathways in new genomes. We introduce an automatic annotation system (ARBA4Path: Association Rule-Based Annotator for Pathways) that utilizes rule mining techniques to predict metabolic pathways across a wide range of prokaryotes. It was demonstrated that specific combinations of protein domains (recorded in our rules) strongly determine the pathways in which proteins are involved and thus provide information that lets us very accurately assign pathway membership (with precision of 0.999 and recall of 0.966) to proteins of a given prokaryotic taxon. Our system can be used to enhance the quality of automatically generated annotations as well as to annotate proteins with unknown function. The prediction models are represented in the form of human-readable rules, and they can be used effectively to add absent pathway information to many proteins in the UniProtKB/TrEMBL database.
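For readers who want to combine the reported precision and recall into a single figure, the standard F1 score (the harmonic mean of the two) can be computed as below. The function name is an assumption for this sketch, and the abstract itself does not report an F1 value; only the two input numbers come from it.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract above.
f1 = f1_score(0.999, 0.966)  # approximately 0.982
```

Because F1 is a harmonic mean, it sits closer to the lower of the two inputs, so the recall of 0.966 is what limits the combined score here.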
Affiliation(s)
- Rabie Saidi
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
- Imane Boudellioua
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Kingdom of Saudi Arabia
- Maria J Martin
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Kingdom of Saudi Arabia
- Victor Solovyev
  - Softberry Inc., 116 Radio Circle, Suite 400, Mount Kisco, NY, 10549, USA
8
Singhal A, Leaman R, Catlett N, Lemberger T, McEntyre J, Polson S, Xenarios I, Arighi C, Lu Z. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges. Database (Oxford) 2016; 2016:baw161. [PMID: 28025348 PMCID: PMC5199160 DOI: 10.1093/database/baw161] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 04/19/2016] [Revised: 11/10/2016] [Accepted: 11/11/2016] [Indexed: 12/24/2022]
Abstract
Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large-scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system 'accuracy' remains a challenge and identify several additional common difficulties and potential research directions including (i) the 'scalability' issue due to the increasing need to mine information from millions of full-text articles, (ii) the 'interoperability' issue of integrating various text-mining systems into existing curation workflows and (iii) the 'reusability' issue, namely the difficulty of applying trained systems to text genres not seen during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. Finally, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators.
Affiliation(s)
- Ayush Singhal
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Robert Leaman
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Johanna McEntyre
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Shawn Polson
  - Center for Bioinformatics and Computational Biology and Department of Computer and Information Sciences, Delaware Biotechnology Institute, University of Delaware, Newark, DE 19711, USA
- Cecilia Arighi
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
  - Center for Bioinformatics and Computational Biology and Department of Computer and Information Sciences, Delaware Biotechnology Institute, University of Delaware, Newark, DE 19711, USA
- Zhiyong Lu
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
9
Boudellioua I, Saidi R, Hoehndorf R, Martin MJ, Solovyev V. Prediction of Metabolic Pathway Involvement in Prokaryotic UniProtKB Data by Association Rule Mining. PLoS One 2016; 11:e0158896. [PMID: 27390860 PMCID: PMC4938425 DOI: 10.1371/journal.pone.0158896] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/01/2016] [Accepted: 06/23/2016] [Indexed: 11/21/2022] [Open access]
Abstract
The widening gap between known proteins and their functions has encouraged the development of methods to automatically infer annotations. Automatic functional annotation of proteins is expected to meet the conflicting requirements of maximizing annotation coverage while minimizing erroneous functional assignments. This trade-off poses a great challenge in designing intelligent systems to tackle the problem of automatic protein annotation. In this work, we present a system that utilizes rule mining techniques to predict metabolic pathways in prokaryotes. The resulting knowledge represents predictive models that assign pathway involvement to UniProtKB entries. We evaluated the performance of our system using cross-validation. We found that it achieved very promising results in pathway identification, with an F1-measure of 0.982 and an AUC of 0.987. Our prediction models were then successfully applied to 6.2 million UniProtKB/TrEMBL reference proteome entries of prokaryotes. As a result, 663,724 entries were covered, of which 436,510 lacked any previous pathway annotations.
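To make the idea of association rules concrete, the sketch below computes the two classic rule-quality measures, support and confidence, for a rule of the form "entries with these features belong to this pathway". It is a generic illustration, not the authors' system: the function name, the data layout, and the identifiers `D1`, `D2`, `D3`, `P1`, `P2` are all hypothetical, and the abstract does not state which features or thresholds the real rules use.

```python
def rule_metrics(entries, antecedent, consequent):
    """Support and confidence of the rule: antecedent features -> consequent.

    entries:    list of (set_of_features, set_of_pathways) pairs
    antecedent: iterable of features that must all be present
    consequent: a single pathway label
    """
    antecedent = frozenset(antecedent)
    # Entries that contain every antecedent feature.
    n_antecedent = sum(1 for feats, _ in entries if antecedent <= feats)
    # Entries that contain the antecedent AND carry the pathway label.
    n_both = sum(
        1 for feats, paths in entries
        if antecedent <= feats and consequent in paths
    )
    support = n_both / len(entries) if entries else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Toy, made-up annotation table (hypothetical feature and pathway IDs).
entries = [
    ({"D1", "D2"}, {"P1"}),
    ({"D1", "D2"}, {"P1"}),
    ({"D1"}, set()),
    ({"D1", "D2", "D3"}, {"P2"}),
]
support, confidence = rule_metrics(entries, {"D1", "D2"}, "P1")
```

Here the rule holds for 2 of 4 entries (support 0.5) and for 2 of the 3 entries matching the antecedent (confidence about 0.67); a rule miner would typically keep only rules above chosen thresholds for both measures.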
Affiliation(s)
- Imane Boudellioua
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Rabie Saidi
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
- Robert Hoehndorf
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Maria J. Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
- Victor Solovyev
  - Softberry Inc., 116 Radio Circle, Suite 400, Mount Kisco, NY 10549, United States of America
10
Pullan ST, Allnutt JC, Devine R, Hatch KA, Jeeves RE, Hendon-Dunn CL, Marsh PD, Bacon J. The effect of growth rate on pyrazinamide activity in Mycobacterium tuberculosis - insights for early bactericidal activity? BMC Infect Dis 2016; 16:205. [PMID: 27184366 PMCID: PMC4869200 DOI: 10.1186/s12879-016-1533-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 12/08/2015] [Accepted: 04/29/2016] [Indexed: 11/24/2022] [Open access]
Abstract
Background: Pyrazinamide (PZA) plays an essential part in the shortened six-month tuberculosis (TB) treatment course due to its activity against slow-growing and non-replicating organisms. We tested whether PZA preferentially targets slow-growing cells of Mycobacterium tuberculosis that could be representative of bacteria that remain after the initial kill with isoniazid (INH), by observing the response of either slow-growing or fast-growing bacilli to differing concentrations of PZA. Methods: M. tuberculosis H37Rv was grown in continuous culture at either a constant fast growth rate (mean generation time (MGT) of 23.1 h) or slow growth rate (69.3 h MGT) at a controlled dissolved oxygen tension of 10% and a controlled acidity of pH 6.3 ± 0.1. Cultures were exposed to step-wise increases in the concentration of PZA (25 to 500 μg ml−1) every two MGTs, and bacterial survival was measured. PZA-induced global gene expression was explored for each increase in PZA concentration, using DNA microarray. Results: At a constant pH 6.3, actively dividing mycobacteria were susceptible to PZA, with similar responses to increasing concentrations of PZA at both growth rates. Three distinct phases of drug response could be distinguished for both slow-growing (69.3 h MGT) and fast-growing (23.1 h MGT) bacilli. A bacteriostatic phase at a low concentration of PZA was followed by a recovery period in which the culture adapted to the presence of PZA and bacteria were actively dividing in steady state. In contrast, there was a rapid loss of viability at bactericidal concentrations. There was a notable delay in the onset of the recovery period in quickly dividing cells compared with those dividing more slowly. Fast growers and slow growers adapted to PZA exposure via very similar mechanisms, through reduced gene expression of tRNA, 50S, and 30S ribosomal proteins. Conclusions: PZA had an equivalent level of activity against fast-growing and slow-growing M. tuberculosis. At both growth rates, drug tolerance to sub-lethal concentrations may have been due to reduced expression of tRNA, 50S, and 30S ribosomal proteins. The findings from this study show that PZA has utility against more than one phenotypic sub-population of bacilli and could be re-assessed for its early bactericidal activity, in combination with other drugs, during TB treatment. Electronic supplementary material: The online version of this article (doi:10.1186/s12879-016-1533-z) contains supplementary material, which is available to authorized users.
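The two mean generation times quoted in the Methods can be converted to specific growth rates with the standard exponential-growth relation, growth rate = ln 2 / doubling time. The sketch below assumes that MGT is the doubling time of a culture in steady state; the function name is made up for this example, and the abstract itself does not report growth or dilution rates.

```python
import math


def specific_growth_rate(mgt_hours):
    """Specific growth rate (per hour) from a mean generation time in hours.

    Assumes MGT equals the doubling time under exponential growth.
    """
    return math.log(2) / mgt_hours


fast = specific_growth_rate(23.1)  # roughly 0.03 per hour
slow = specific_growth_rate(69.3)  # roughly 0.01 per hour
```

Under this assumption the fast-growing culture divides three times as quickly as the slow-growing one, which follows directly from the ratio of the two generation times.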
Affiliation(s)
- Steven T Pullan
  - Public Health England, National Infection Service, Porton Down, Salisbury, Wiltshire, SP4 0JG, UK
- Jon C Allnutt
  - Public Health England, National Infection Service, Porton Down, Salisbury, Wiltshire, SP4 0JG, UK
- Rebecca Devine
  - School of Biological Sciences, University of East Anglia, Norwich Research Park, NR4 7TJ, UK
- Kim A Hatch
  - Public Health England, National Infection Service, Porton Down, Salisbury, Wiltshire, SP4 0JG, UK
- Rose E Jeeves
  - Public Health England, National Infection Service, Porton Down, Salisbury, Wiltshire, SP4 0JG, UK
- Charlotte L Hendon-Dunn
  - Public Health England, National Infection Service, Porton Down, Salisbury, Wiltshire, SP4 0JG, UK
- Philip D Marsh
  - Public Health England, National Infection Service, Porton Down, Salisbury, Wiltshire, SP4 0JG, UK
- Joanna Bacon
  - Public Health England, National Infection Service, Porton Down, Salisbury, Wiltshire, SP4 0JG, UK
11
Wu TJ, Shamsaddini A, Pan Y, Smith K, Crichton DJ, Simonyan V, Mazumder R. A framework for organizing cancer-related variations from existing databases, publications and NGS data using a High-performance Integrated Virtual Environment (HIVE). Database (Oxford) 2014; 2014:bau022. [PMID: 24667251 PMCID: PMC3965850 DOI: 10.1093/database/bau022] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.2] [Indexed: 01/22/2023]
Abstract
Years of sequence feature curation by UniProtKB/Swiss-Prot, PIR-PSD, NCBI-CDD, RefSeq and other database biocurators have led to a rich repository of information on functional sites of genes and proteins. This information, along with variation-related annotation, can be used to scan human short sequence reads from next-generation sequencing (NGS) pipelines for the presence of non-synonymous single-nucleotide variations (nsSNVs) that affect functional sites. This and similar workflows are becoming more important because thousands of NGS data sets are being made available through projects such as The Cancer Genome Atlas (TCGA), and researchers want to evaluate their biomarkers in genomic data. BioMuta, an integrated sequence feature database, provides a framework for automated and manual curation and integration of cancer-related sequence features so that they can be used in NGS analysis pipelines. Sequence feature information in BioMuta is collected from the Catalogue of Somatic Mutations in Cancer (COSMIC), ClinVar, UniProtKB and through biocuration of information available from publications. Additionally, nsSNVs identified through automated analysis of NGS data from TCGA are included in the database. Because of the petabytes of data and information present in NGS primary repositories, a platform, HIVE (High-performance Integrated Virtual Environment), for storing, analyzing, computing and curating NGS data and associated metadata has been developed. Using HIVE, 31 979 nsSNVs were identified in TCGA-derived NGS data from breast cancer patients. All variations identified through this process are stored in a Curated Short Read archive, and the nsSNVs from the tumor samples are included in BioMuta. Currently, BioMuta has 26 cancer types with 13 896 small-scale and 308 986 large-scale study-derived variations. Integration of variation data allows identification of novel or common nsSNVs that can be prioritized in validation studies.
Database URL: BioMuta: http://hive.biochemistry.gwu.edu/tools/biomuta/index.php; CSR: http://hive.biochemistry.gwu.edu/dna.cgi?cmd=csr; HIVE: http://hive.biochemistry.gwu.edu
Collapse
Affiliation(s)
- Tsung-Jung Wu
- Department of Biochemistry and Molecular Medicine, George Washington University, Washington, DC 20037, USA, Data Systems and Technology Jet Propulsion Laboratory 4800 Oak Grove Drive Pasadena, CA 91109 Center for Biologics Evaluation and Research, Food and Drug Administration, Rockville, MD 20852, USA and McCormick Genomic and Proteomic Center, George Washington University, Washington, DC 20037, USA
12
|
Mavridis L, Nath N, Mitchell JBO. PFClust: a novel parameter free clustering algorithm. BMC Bioinformatics 2013; 14:213. [PMID: 23819480 PMCID: PMC3747858 DOI: 10.1186/1471-2105-14-213] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 01/29/2013] [Accepted: 07/01/2013] [Indexed: 12/02/2022] Open
Abstract
Background We present the algorithm PFClust (Parameter Free Clustering), which is able automatically to cluster data and identify a suitable number of clusters to group them into without requiring any parameters to be specified by the user. The algorithm partitions a dataset into a number of clusters that share some common attributes, such as their minimum expectation value and variance of intra-cluster similarity. A set of n objects can be clustered into any number of clusters from one to n, and there are many different hierarchical and partitional, agglomerative and divisive, clustering methodologies available that can be used to do this. Nonetheless, automatically determining the number of clusters present in a dataset constitutes a significant challenge for clustering algorithms. Identifying a putative optimum number of clusters to group the objects into involves computing and evaluating a range of clusterings with different numbers of clusters. However, there is no agreed or unique definition of optimum in this context. Thus, we test PFClust on datasets for which an external gold standard of ‘correct’ cluster definitions exists, noting that this division into clusters may be suboptimal according to other reasonable criteria. PFClust is heuristic in the sense that it cannot be described in terms of optimising any single simply-expressed metric over the space of possible clusterings. Results We validate PFClust firstly with reference to a number of synthetic datasets consisting of 2D vectors, showing that its clustering performance is at least equal to that of six other leading methodologies – even though five of the other methods are told in advance how many clusters to use. We also demonstrate the ability of PFClust to classify the three dimensional structures of protein domains, using a set of folds taken from the structural bioinformatics database CATH. 
Conclusions We show that PFClust is able to cluster the test datasets a little better, on average, than any of the other algorithms, and furthermore is able to do this without the need to specify any external parameters. Results on the synthetic datasets demonstrate that PFClust generates meaningful clusters, while our algorithm also shows excellent agreement with the correct assignments for a dataset extracted from the CATH part-manually curated classification of protein domain structures.
Affiliation(s)
- Lazaros Mavridis
- Biomedical Sciences Research Complex and EaStCHEM School of Chemistry, Purdie Building, University of St Andrews, North Haugh, St Andrews, KY16 9ST, Scotland, UK.
13
|
Garaguso I, Borlak J. A rapid screening assay to search for phosphorylated proteins in tissue extracts. PLoS One 2012; 7:e50025. [PMID: 23166814 PMCID: PMC3499474 DOI: 10.1371/journal.pone.0050025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/30/2012] [Accepted: 10/19/2012] [Indexed: 11/19/2022] Open
Abstract
Reversible protein phosphorylation is an essential mechanism in the regulation of diverse biological processes, yet it is frequently altered in disease. As most phosphoproteome studies are based on optimized in-vitro cell culture studies, new methods are needed to improve de novo identification and characterization of phosphoproteins in extracts from tissues. Here, we describe a rapid and reliable method for the detection of phosphoproteins in tissue extracts based on an experimental strategy that employs 1D and 2D SDS PAGE, Western immunoblotting of phosphoproteins, in-gel protease digestion and enrichment of phosphopeptides using metal oxide affinity chromatography (MOAC). Subsequently, phosphoproteins are identified by MALDI-TOF-MS/MS with the CHCA-TL or DHB ML sample matrix preparation method and further characterized by various bioinformatic software tools to search for candidate kinases and phosphorylation-dependent binding motifs. The method was applied to mouse lung tissue extracts and resulted in the identification of 160 unique phosphoproteins. Notably, TiO₂ enrichment of pulmonary protein extracts resulted in the identification of 17 additional phosphoproteins and 20 phosphorylation sites. By use of MOAC, new phosphorylation sites were identified, as evidenced for the advanced glycosylation end product-specific receptor. Until now, this protein was not known to be phosphorylated in lung tissue of mice. Overall, the developed methodology allowed efficient and rapid screening of phosphorylated proteins and can be employed as a general experimental strategy for the identification of phosphoproteins in tissue extracts.
Affiliation(s)
- Ignazio Garaguso
- Centre for Pharmacology and Toxicology, Hannover Medical School, Hannover, Germany
- Juergen Borlak
- Centre for Pharmacology and Toxicology, Hannover Medical School, Hannover, Germany
14
|
Vollan HS, Tannaes T, Yamaoka Y, Bukholm G. In silico evolutionary analysis of Helicobacter pylori outer membrane phospholipase A (OMPLA). BMC Microbiol 2012; 12:206. [PMID: 22974200 PMCID: PMC3490997 DOI: 10.1186/1471-2180-12-206] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 02/13/2012] [Accepted: 08/31/2012] [Indexed: 01/19/2023] Open
Abstract
Background In the past decade, researchers have proposed that the pldA gene for outer membrane phospholipase A (OMPLA) is important for bacterial colonization of the human gastric ventricle. Several conserved Helicobacter pylori genes have distinct genotypes in different parts of the world, biogeographic patterns that can be analyzed through phylogenetic trees. The current study will shed light on the importance of the pldA gene in H. pylori. In silico sequence analysis will be used to investigate whether the bacteria are in the process of preserving, optimizing, or rejecting the pldA gene. The pldA gene will be phylogenetically compared to other housekeeping (HK) genes, and a possible origin via horizontal gene transfer (HGT) will be evaluated through both intra- and inter-species evolutionary analyses. Results In this study, pldA gene sequences were phylogenetically analyzed and compared with a large reference set of concatenated HK gene sequences. A total of 246 pldA nucleotide sequences were used; 207 were from Norwegian isolates, 20 were from Korean isolates, and 19 were from the NCBI database. Best-fit evolutionary models were determined with MEGA5 ModelTest for the pldA (K80 + I + G) and HK (GTR + I + G) sequences, and maximum likelihood trees were constructed. Both HK and pldA genes showed biogeographic clustering. Horizontal gene transfer was inferred based on significantly different GC contents, the codon adaptation index, and a phylogenetic conflict between a tree of OMPLA protein sequences representing 171 species and a tree of the AtpA HK protein for 169 species. Although a vast majority of the residues in OMPLA were predicted to be under purifying selection, sites undergoing positive selection were also found. Conclusions Our findings indicate that the pldA gene could have been more recently acquired than seven of the HK genes found in H. pylori. However, the common biogeographic patterns of both the HK and pldA sequences indicated that the transfer occurred long ago. Our results indicate that the bacterium is preserving the function of OMPLA, although some sites are still being evolutionarily optimized.
Affiliation(s)
- Hilde S Vollan
- Department of Clinical Molecular Biology, Division of Medicine, Akershus University Hospital, University of Oslo, Norway.
15
|
Brown AM, Zondlo NJ. A propensity scale for type II polyproline helices (PPII): aromatic amino acids in proline-rich sequences strongly disfavor PPII due to proline-aromatic interactions. Biochemistry 2012; 51:5041-51. [PMID: 22667692 DOI: 10.1021/bi3002924] [Citation(s) in RCA: 97] [Impact Index Per Article: 7.5] [Indexed: 11/30/2022]
Abstract
Type II polyproline helices (PPII) are a fundamental secondary structure of proteins, common in globular and nonglobular regions and important in cellular signaling. We developed a propensity scale for PPII using a host-guest system with sequence Ac-GPPXPPGY-NH₂, where X represents any amino acid. We found that proline has the highest PPII propensity, but most other amino acids display significant PPII propensities. The PPII propensity of leucine was the highest of all propensities of non-proline residues. Alanine and residues with linear side chains displayed the next highest PPII propensities. Three classes of residues displayed lower PPII propensities: β-branched amino acids (Thr, Val, and Ile), short amino acids with polar side chains (Asn, protonated Asp, Ser, Thr, and Cys), and aromatic amino acids (Phe, Tyr, and Trp). tert-Leucine particularly disfavored PPII. The basis of the low PPII propensities of aromatic amino acids in this context was significant cis-trans isomerism, with proline-rich peptides containing aromatic residues exhibiting 45-60% cis amide bonds, due to Pro-cis-Pro-aromatic and aromatic-cis-Pro amide bonds.
Affiliation(s)
- Alaina M Brown
- Department of Chemistry and Biochemistry, University of Delaware, Newark, Delaware 19716, USA
16
|
Tebaldi T, Re A, Viero G, Pegoretti I, Passerini A, Blanzieri E, Quattrone A. Widespread uncoupling between transcriptome and translatome variations after a stimulus in mammalian cells. BMC Genomics 2012; 13:220. [PMID: 22672192 PMCID: PMC3441405 DOI: 10.1186/1471-2164-13-220] [Citation(s) in RCA: 98] [Impact Index Per Article: 7.5] [Received: 01/19/2012] [Accepted: 06/06/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The classical view on eukaryotic gene expression proposes the scheme of a forward flow in which fluctuations in mRNA levels upon a stimulus help determine variations in mRNA availability for translation. Here we address this issue by simultaneously profiling with microarrays the total mRNAs (the transcriptome) and the polysome-associated mRNAs (the translatome) after EGF treatment of human cells, and extending the analysis to 19 other transcriptome/translatome comparisons in mammalian cells following different stimuli or undergoing cell programs. RESULTS Triggering of the EGF pathway results in an early induction of transcriptome and translatome changes, but 90% of the significant variation is limited to the translatome and the degree of concordant changes is less than 5%. The survey of the 19 other transcriptome/translatome comparisons shows that extensive uncoupling is a general rule, in terms of both RNA movements and inferred cell activities, with a strong tendency of translation-related genes to be controlled purely at the translational level. By different statistical approaches, we finally provide evidence of the lack of dependence between changes at the transcriptome and translatome levels. CONCLUSIONS We propose a model of diffused independency between variation in transcript abundances and variation in their engagement on polysomes, which implies the existence of specific mechanisms to couple these two ways of regulating gene expression.
Affiliation(s)
- Toma Tebaldi
- Laboratory of Translational Genomics, Centre for Integrative Biology, University of Trento, 38123 Trento, Italy
17
|
Yu GX. RuleMiner: a knowledge system for supporting high-throughput protein function annotations. J Bioinform Comput Biol 2011; 2:615-37. [PMID: 15617156 DOI: 10.1142/s0219720004000752] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 11/04/2003] [Revised: 03/23/2004] [Accepted: 03/24/2004] [Indexed: 11/18/2022]
Abstract
In this paper, we present RuleMiner, a knowledge system to facilitate a seamless integration of multi-sequence analysis tools and define profile-based rules for supporting high-throughput protein function annotations. This system consists of three essential components: Protein Function Groups (PFGs), PFG profiles and rules. The PFGs, established from an integrated analysis of current knowledge of protein functions from the Swiss-Prot database and protein family-based sequence classifications, cover all possible cellular functions available in the database. The PFG profiles illustrate detailed protein features in the PFGs, such as sequence conservation, the occurrence of sequence-based motifs, domains and species distributions. The rules, extracted from the PFG profiles, describe the clear relationships between these PFGs and all possible features. As a result, RuleMiner is able to provide an enhanced capability for protein function analysis: for example, results from the integrated sequence analysis tools for given proteins can be comparatively analyzed thanks to the clear feature-PFG relationships. Also, much needed guidance is readily available for such analysis. If the rules describe one-to-one (unique) relationships between the protein features and the PFGs, then these features can be utilized as unique functional identifiers and cellular functions of unknown proteins can be reliably determined. Otherwise, additional information has to be provided.
Affiliation(s)
- Gong-Xin Yu
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA.
18
|
Madhavan S, Gusev Y, Harris M, Tanenbaum DM, Gauba R, Bhuvaneshwar K, Shinohara A, Rosso K, Carabet LA, Song L, Riggins RB, Dakshanamurthy S, Wang Y, Byers SW, Clarke R, Weiner LM. G-DOC: a systems medicine platform for personalized oncology. Neoplasia 2011; 13:771-83. [PMID: 21969811 PMCID: PMC3182270 DOI: 10.1593/neo.11806] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.5] [Received: 06/09/2011] [Revised: 07/28/2011] [Accepted: 08/01/2011] [Indexed: 01/05/2023]
Abstract
Currently, cancer therapy remains limited by a "one-size-fits-all" approach, whereby treatment decisions are based mainly on the clinical stage of disease, yet fail to reference the individual's underlying biology and its role driving malignancy. Identifying better personalized therapies for cancer treatment is hindered by the lack of high-quality "omics" data of sufficient size to produce meaningful results and the ability to integrate biomedical data from disparate technologies. Resolving these issues will help translation of therapies from research to clinic by helping clinicians develop patient-specific treatments based on the unique signatures of the patient's tumor. Here we describe the Georgetown Database of Cancer (G-DOC), a Web platform that enables basic and clinical research by integrating patient characteristics and clinical outcome data with a variety of high-throughput research data in a unified environment. While several rich data repositories for high-dimensional research data exist in the public domain, most focus on a single data type and do not support integration across multiple technologies. Currently, G-DOC contains data from more than 2500 breast cancer patients and 800 gastrointestinal cancer patients. G-DOC includes a broad collection of bioinformatics and systems biology tools for analysis and visualization of four major "omics" types: DNA, mRNA, microRNA, and metabolites. We believe that G-DOC will help facilitate systems medicine by providing identification of trends and patterns in integrated data sets and hence facilitate the use of better targeted therapies for cancer. A set of representative usage scenarios is provided to highlight the technical capabilities of this resource.
Affiliation(s)
- Subha Madhavan
- Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC 20007, USA.
19
|
Cueva JP, Gallardo-Godoy A, Juncosa JI, Vidi PA, Lill MA, Watts VJ, Nichols DE. Probing the steric space at the floor of the D1 dopamine receptor orthosteric binding domain: 7α-, 7β-, 8α-, and 8β-methyl substituted dihydrexidine analogues. J Med Chem 2011; 54:5508-21. [PMID: 21714510 PMCID: PMC3150624 DOI: 10.1021/jm200334c] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/29/2022]
Abstract
To probe the space at the floor of the orthosteric ligand binding site in the dopamine D₁ receptor, four methylated analogues of dihydrexidine (DHX) were synthesized with substitutions at the 7 and 8 positions. The 8α-axial, 8β-equatorial, and 7α-equatorial analogues were synthesized by photochemical cyclization of appropriately substituted N-benzoyl enamines, and the 7β-axial analogue was prepared by an intramolecular Henry reaction. All of the methylated analogues displayed losses in affinity when compared to DHX (20 nM): 8β-Me(ax)-DHX (270 nM), 8α-Me(eq)-DHX (920 nM), 7β-Me(eq)-DHX (6540 nM), and 7α-Me(ax)-DHX (>10000 nM). Molecular modeling studies suggest that although the disruption of an aromatic interaction between Phe203(5.47) and Phe288(6.51) is the cause for the 14-fold loss in affinity associated with 8β-axial substitution, unfavorable steric interactions with Ser107(3.36) result in the more dramatic decreases in binding affinity suffered by the rest of the analogues.
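The 14-fold figure quoted in this abstract follows from dividing the analogue's affinity value by that of DHX (270 nM / 20 nM = 13.5, which rounds to 14). A minimal sketch of the same arithmetic for the analogues with exact values is given below; the helper name `fold_loss` and the dictionary layout are illustrative assumptions, and only the numbers come from the abstract.

```python
# Affinity values (nM) as quoted in the abstract; a larger value means weaker binding.
DHX_NM = 20.0
analogues_nm = {
    "8β-Me(ax)-DHX": 270.0,
    "8α-Me(eq)-DHX": 920.0,
    "7β-Me(eq)-DHX": 6540.0,
}


def fold_loss(analogue_nm, parent_nm=DHX_NM):
    """Ratio of the analogue's affinity value to the parent's (hypothetical helper)."""
    return analogue_nm / parent_nm


fold_losses = {name: fold_loss(value) for name, value in analogues_nm.items()}
```

The 7α-Me(ax) analogue is reported only as >10000 nM, so its loss can be bounded (more than 500-fold) but not computed exactly.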
Affiliation(s)
- Juan Pablo Cueva
- Department of Pharmacy and Pharmacology, University of Bath, Bath, BA2 7AY, United Kingdom
- Alejandra Gallardo-Godoy
- Small Molecule Discovery Center (SMDC), School of Pharmacy, University of California, San Francisco, San Francisco, CA 94143, USA
- Jose I. Juncosa
- Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, Indiana 47907
- Pierre A. Vidi
- Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, Indiana 47907
- Markus A. Lill
- Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, Indiana 47907
- Val J. Watts
- Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, Indiana 47907
- David E. Nichols
- Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, Indiana 47907
20
|
Bonner LA, Laban U, Chemel BR, Juncosa JI, Lill MA, Watts VJ, Nichols DE. Mapping the catechol binding site in dopamine D₁ receptors: synthesis and evaluation of two parallel series of bicyclic dopamine analogues. ChemMedChem 2011; 6:1024-40. [PMID: 21538900 DOI: 10.1002/cmdc.201100010] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 01/12/2011] [Revised: 03/19/2011] [Indexed: 11/08/2022]
Abstract
A novel class of isochroman dopamine analogues, originally reported by Abbott Laboratories, have >100-fold selectivity for D₁-like over D₂-like receptors. We synthesized a parallel series of chroman compounds and showed that repositioning the oxygen atom in the heterocyclic ring decreases potency and confers D₂-like receptor selectivity to these compounds. In silico modeling supports the hypothesis that the altered pharmacology for the chroman series is due to potential intramolecular hydrogen bonding between the oxygen in the chroman ring and the meta-hydroxy group of the catechol moiety. This interaction realigns the catechol hydroxy groups and disrupts key interactions between these ligands and critical serine residues in TM5 of the D₁-like receptors. This hypothesis was tested by the synthesis and pharmacological evaluation of a parallel series of carbocyclic compounds. Our results suggest that if the potential for intramolecular hydrogen bonding is removed, D₁-like receptor potency and selectivity are restored.
Affiliation(s)
- Lisa A Bonner
- Department of Chemistry, Saint Anselm College, Manchester, NH 03102, USA
21
|
Hung CL, Lee C, Lin CY, Chang CH, Chung YC, Yi Tang C. Feature amplified voting algorithm for functional analysis of protein superfamily. BMC Genomics 2010; 11 Suppl 3:S14. [PMID: 21143781 PMCID: PMC2999344 DOI: 10.1186/1471-2164-11-s3-s14] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Identifying the regions associated with protein function is a singularly important task in the post-genomic era. Biological studies often identify functional enzyme residues by amino acid sequences, particularly when related structural information is unavailable. In some cases of protein superfamilies, functional residues are difficult to detect by current alignment tools or evolutionary strategies when phylogenetic relationships do not parallel their protein functions. The solution proposed in this study is the Feature Amplified Voting Algorithm with Three-profile alignment (FAVAT). The core concept of FAVAT is to reveal the desired features of a target enzyme or protein by voting on three different property groups aligned by a three-profile alignment method. Functional residues of a target protein can then be retrieved by FAVAT analysis. In this study, the amidohydrolase superfamily was an interesting case for verifying the proposed approach because it contains divergent enzymes and proteins. RESULTS FAVAT was used to identify critical residues of mammalian imidase, a member of the amidohydrolase superfamily. Members of this superfamily were first classified by their functional properties and source organisms. After FAVAT analysis, candidate residues were identified and compared to a bacterial hydantoinase whose crystal structure (1GKQ) has been fully elucidated. One modified lysine, three histidines and one aspartate were found to participate in the coordination of metal ions in the active site. The FAVAT analysis also redressed the misrecognition of metal coordinator Asp57 by the multiple sequence alignment (MSA) method. Several other amino acid residues known to be related to the function or structure of mammalian imidase were also identified. CONCLUSIONS FAVAT is shown to predict functionally important amino acids in the amidohydrolase superfamily. This strategy effectively identifies functionally important residues by analyzing the discrepancy between the sequence and functional properties of related proteins in a superfamily, and it should be applicable to other protein families.
Affiliation(s)
- Che-Lun Hung
- Department of Computer Science, National Tsing Hua University, 101, Section 2 Kuang Fu Road, Hsinchu, Taiwan
22
|
Capone G, Novello G, Fasano C, Trost B, Bickis M, Kusalik A, Kanduc D. The oligodeoxynucleotide sequences corresponding to never-expressed peptide motifs are mainly located in the non-coding strand. BMC Bioinformatics 2010; 11:383. [PMID: 20646284 PMCID: PMC2919516 DOI: 10.1186/1471-2105-11-383] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 06/03/2010] [Accepted: 07/20/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND We study the usage of specific peptide platforms in protein composition. Using the pentapeptide as a unit of length, we find that in the universal proteome many pentapeptides are heavily repeated (even thousands of times), whereas some are quite rare, and a small number do not appear at all. To understand the physico-chemical-biological basis underlying peptide usage at the proteomic level, in this study we analyse the energetic costs for the synthesis of rare and never-expressed versus frequent pentapeptides. In addition, we explore residue bulkiness, hydrophobicity, and codon number as factors able to modulate specific peptide frequencies. Then, the possible influence of amino acid composition is investigated in zero- and high-frequency pentapeptide sets by analysing the frequencies of the corresponding inverse-sequence pentapeptides. As a final step, we analyse the pentadecamer oligodeoxynucleotide sequences corresponding to the never-expressed pentapeptides. RESULTS We find that only DNA context-dependent constraints (such as oligodeoxynucleotide sequence location in the minus strand, introns, pseudogenes, frameshifts, etc.) provide a coherent mechanistic platform to explain the occurrence of never-expressed versus frequent pentapeptides in the protein world. CONCLUSIONS This study is of importance in cell biology. Indeed, the rarity (or lack of expression) of specific 5-mer peptide modules implies the rarity (or lack of expression) of the corresponding n-mer peptide sequences (with n < 5), so possibly modulating protein compositional trends. Moreover the data might further our understanding of the role exerted by rare pentapeptide modules as critical biological effectors in protein-protein interactions.
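The pentapeptide census this abstract relies on amounts to sliding a window of length five along each protein and tallying the results. The sketch below shows that counting step on a made-up three-sequence "proteome"; the sequences, the function name `count_kmers`, and the variable names are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter

AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard residues


def count_kmers(proteins, k=5):
    """Count every overlapping k-mer made only of standard residues."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if AMINO_ACIDS.issuperset(kmer):
                counts[kmer] += 1
    return counts


# Toy input, invented for illustration only.
toy_proteome = ["MKTAYIAKQR", "AYIAKQRQIS", "MKTAY"]
counts = count_kmers(toy_proteome)

# Of the 20**5 possible pentapeptides, those never observed in the input.
never_seen = len(AMINO_ACIDS) ** 5 - len(counts)
```

On a real proteome the same tally separates heavily repeated pentapeptides (high counts) from rare ones and from those absent altogether, which is the starting point for the comparisons described above.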
Affiliation(s)
- Giovanni Capone
- Department of Biochemistry and Molecular Biology "Ernesto Quagliariello", University of Bari, Bari, Italy
- Giuseppe Novello
- Department of Biochemistry and Molecular Biology "Ernesto Quagliariello", University of Bari, Bari, Italy
- Candida Fasano
- Department of Biochemistry and Molecular Biology "Ernesto Quagliariello", University of Bari, Bari, Italy
- Brett Trost
- Department of Computer Science, University of Saskatchewan, Saskatoon, Canada
- Mik Bickis
- Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, Canada
- Anthony Kusalik
- Department of Computer Science, University of Saskatchewan, Saskatoon, Canada
- Darja Kanduc
- Department of Biochemistry and Molecular Biology "Ernesto Quagliariello", University of Bari, Bari, Italy
23
|
Greaves LC, Barron MJ, Plusa S, Kirkwood TB, Mathers JC, Taylor RW, Turnbull DM. Defects in multiple complexes of the respiratory chain are present in ageing human colonic crypts. Exp Gerontol 2010; 45:573-9. [PMID: 20096767 PMCID: PMC2887930 DOI: 10.1016/j.exger.2010.01.013] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Received: 10/28/2009] [Revised: 01/12/2010] [Accepted: 01/14/2010] [Indexed: 01/21/2023]
Abstract
Mitochondrial DNA (mtDNA) mutations accumulate in a number of ageing tissues and are proposed to play a role in the ageing process. We have previously shown that colonic crypt stem cells accumulate somatic mtDNA point mutations during ageing. These mtDNA mutations result in the loss of the activity of complex IV (cytochrome c oxidase (COX)) of the respiratory chain in the stem cells and their progeny, producing colonic crypts which are entirely COX deficient. However it is not known whether the other complexes of the respiratory chain are similarly affected during ageing. Here we have used antibodies to individual subunits of complexes I–IV to investigate their expression in the colonic epithelium from human subjects aged 18–84. We show that in ∼50% of crypts with any form of respiratory chain deficiency, decreased expression of subunits of multiple complexes is observed. Furthermore we have sequenced the entire mitochondrial genome of a number of cells with multiple complex defects and have found a wide variety of point mutations in these cells affecting a number of different protein encoding and RNA encoding genes. Finally we discuss the possible mechanisms by which multiple respiratory chain complex defects may occur in these cells.
Affiliation(s)
- Laura C Greaves
- Mitochondrial Research Group, Institute for Ageing and Health, Medical School, University of Newcastle upon Tyne, Framlington Place, Newcastle upon Tyne, UK
24
|
Kimura A, Tyacke RJ, Robinson JJ, Husbands SM, Minchin MC, Nutt DJ, Hudson AL. Identification of an imidazoline binding protein: creatine kinase and an imidazoline-2 binding site. Brain Res 2009; 1279:21-8. [PMID: 19410564 PMCID: PMC2722693 DOI: 10.1016/j.brainres.2009.04.044] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Received: 01/18/2009] [Revised: 03/28/2009] [Accepted: 04/22/2009] [Indexed: 12/11/2022]
Abstract
Drugs that bind to imidazoline binding proteins have major physiological actions. To date, three subtypes of such proteins, I₁, I₂ and I₃, have been proposed, although characterisations of these binding proteins are lacking. I₂ binding sites are found throughout the brain and are particularly dense in the arcuate nucleus of the hypothalamus. Selective I₂ ligands demonstrate antidepressant-like activity and the identity of the proteins that respond to such ligands remained unknown until now. Here we report the isolation of an approximately 45 kDa imidazoline binding protein from rabbit and rat brain using a high affinity ligand for the I₂ subtype, 2-BFI, to generate an affinity column. Following protein sequencing of the isolated approximately 45 kDa imidazoline binding protein, we identified it to be brain creatine kinase (B-CK). B-CK shows high binding capacity to selective I₂ ligands; [³H]-2-BFI (5 nM) specifically bound to B-CK (2330 ± 815 fmol mg protein⁻¹). We predicted an I₂ binding pocket near the active site of B-CK using molecular modelling. Furthermore, B-CK activity was inhibited by a selective I₂ irreversible ligand, where 20 µM BU99006 reduced the enzyme activity by 16%, confirming the interaction between B-CK and the I₂ ligand. In summary, we have identified B-CK to be the approximately 45 kDa imidazoline binding protein and we have demonstrated the existence of an I₂ binding site within this enzyme. The importance of B-CK in regulating neuronal activity and neurotransmitter release may well explain the various actions of I₂ ligands in brain and the alterations in densities of I₂ binding sites in psychiatric disorders.
Key Words
- 2-BFI, 2-(2-benzofuranyl)2-imidazoline
- BU224, 2-(4,5-dihydroimidaz-2-yl)quinoline
- BU99006, 5-isothiocyanoato-2-benzofuranyl-2-imidazoline
- B-CK, brain creatine kinase
- CK, creatine kinase
- GOLD, genetic optimisation for ligand docking
- GR, glucose-responsive
- I2, imidazoline-2 subtype
- KATP channel, ATP sensitive potassium channel
- MAO, monoamine oxidase
- MOE, molecular operating environment
- imidazoline binding protein
- creatine kinase
- 2-bfi
- harmane and psychiatric disorders
Affiliation(s)
- Atsuko Kimura
- Psychopharmacology Unit, University of Bristol, BS1 3NY, UK
- James J. Robinson
- Department of Pharmacy and Pharmacology, University of Bath, BA2 7AY, UK
- David J. Nutt
- Psychopharmacology Unit, University of Bristol, BS1 3NY, UK
- Alan L. Hudson
- Department of Pharmacology, 9-70 Medical Sciences Building, University of Alberta, Edmonton, Alberta, Canada T6G 2H7
|
25
|
Lucchese A, Serpico R, Crincoli V, Shoenfeld Y, Kanduc D. Sequence Uniqueness as a Molecular Signature of HIV-1-Derived B-Cell Epitopes. Int J Immunopathol Pharmacol 2009; 22:639-46. [DOI: 10.1177/039463200902200309] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Indexed: 01/18/2023] Open
Abstract
The complex pathophysiology of human immunodeficiency virus (HIV) infection and the relatively high mutation rate of the retrovirus make it challenging to design effective anti-HIV vaccines. Several attempts have been made during the last decades to elucidate the enigmatic immunology of HIV infection and to predict potential immunogenic peptides for active vaccination using bioinformatic analysis methods. The results obtained to date to address this important problem are scarce. In this study, we exploit available HIV databases and analyse previously characterized HIV-encoded linear B-cell epitopes for their amino acid sequence similarity to the human or murine host proteome. We obtained further documentation that the HIV-derived antibody-targeted sequences mostly coincide with peptide areas rarely shared with the host proteins. In toto, our past and present data give clear-cut support to the statement that low-similarity to the host proteome is a major mechanism in defining viral peptide immunogenicity and indicate a possible way for inducing effective, high-titer, and non-cross-reactive antibodies to be used in anti-HIV vaccine therapy.
Affiliation(s)
- V. Crincoli
- Department of Odontostomatology and Surgery, University of Bari, Italy
- Y. Shoenfeld
- Center for Autoimmune Diseases, Department of Medicine ‘B’, Sheba Medical Center, Israel and Sackler Faculty of Medicine, Tel-Aviv University, Israel
- D. Kanduc
- Department of Biochemistry and Molecular Biology, University of Bari, Italy
|
26
|
Song CM, Lim SJ, Tong JC. Recent advances in computer-aided drug design. Brief Bioinform 2009; 10:579-91. [PMID: 19433475 DOI: 10.1093/bib/bbp023] [Citation(s) in RCA: 175] [Impact Index Per Article: 10.9] [Indexed: 01/28/2023] Open
Abstract
Modern drug discovery is characterized by the production of vast quantities of compounds and the need to examine these huge libraries in short periods of time. The need to store, manage and analyze these rapidly increasing resources has given rise to the field known as computer-aided drug design (CADD). CADD represents computational methods and resources that are used to facilitate the design and discovery of new therapeutic solutions. Digital repositories, containing detailed information on drugs and other useful compounds, are goldmines for the study of chemical reaction capabilities. Design libraries, with the potential to generate molecular variants in their entirety, allow the selection and sampling of chemical compounds with diverse characteristics. Fold recognition, for studying sequence-structure homology between protein sequences and structures, is helpful for inferring binding sites and molecular functions. Virtual screening, the in silico analog of high-throughput screening, offers great promise for systematic evaluation of huge chemical libraries to identify potential lead candidates that can be synthesized and tested. In this article, we present an overview of the most important data sources and computational methods for the discovery of new molecular entities. The workflow of the entire virtual screening campaign is discussed, from data collection through to post-screening analysis.
Affiliation(s)
- Chun Meng Song
- Institute for Infocomm Research, Connexis South Tower, Singapore 138632
|
27
|
Stufano A, Kanduc D. Proteome-based epitopic peptide scanning along PSA. Exp Mol Pathol 2009; 86:36-40. [DOI: 10.1016/j.yexmp.2008.11.009] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 11/11/2008] [Accepted: 11/26/2008] [Indexed: 12/18/2022]
|
28
|
Peng Y, Reyes JL, Wei H, Yang Y, Karlson D, Covarrubias AA, Krebs SL, Fessehaie A, Arora R. RcDhn5, a cold acclimation-responsive dehydrin from Rhododendron catawbiense rescues enzyme activity from dehydration effects in vitro and enhances freezing tolerance in RcDhn5-overexpressing Arabidopsis plants. Physiol Plant 2008; 134:583-97. [PMID: 19000195 DOI: 10.1111/j.1399-3054.2008.01164.x] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Indexed: 05/04/2023]
Abstract
Dehydrins (DHNs) are typically induced in response to abiotic stresses that impose cellular dehydration. As extracellular freezing results in cellular dehydration, accumulation of DHNs and development of desiccation tolerance are believed to be key components of the cold acclimation (CA) process. The present study shows that RcDhn5, one of the DHNs from Rhododendron catawbiense leaf tissues, encodes an acidic, SK₂-type DHN and is upregulated during seasonal CA and downregulated during spring deacclimation (DA). Data from in vitro partial water loss assays indicate that purified RcDhn5 protects enzyme activity against a dehydration treatment and that this protection is comparable with that of acidic SKₙ DHNs from other species. To investigate the contribution of RcDhn5 to freezing tolerance (FT), Arabidopsis plants overexpressing RcDhn5 under the control of the 35S promoter were generated. Transgenic plants exhibited improved 'constitutive' FT compared with the control plants. Furthermore, a small but significant improvement in FT of RcDhn5-overexpressing plants was observed after 12 h of CA; however, this gained acclimation capacity was not sustained after a 6-day CA. Transcript profiles of cold-regulated native Arabidopsis DHNs (COR47, ERD10 and ERD14) during a CA time-course suggest that the apparent lack of improvement in cold-acclimated FT of RcDhn5-overexpressing plants over that of wild-type controls after a 6-day CA might have been because the effect of RcDhn5 overproduction was diluted by a strong CA-induced expression of native Arabidopsis DHNs. This study provides evidence that RcDhn5 contributes to freezing stress tolerance and that this could be, in part, because of its dehydration stress-protective ability.
Affiliation(s)
- Yanhui Peng
- Department of Horticulture, Iowa State University, Ames, IA 50011, USA
|
29
|
Chiang HI, Swaggerty CL, Kogut MH, Dowd SE, Li X, Pevzner IY, Zhou H. Gene expression profiling in chicken heterophils with Salmonella enteritidis stimulation using a chicken 44 K Agilent microarray. BMC Genomics 2008; 9:526. [PMID: 18990222 PMCID: PMC2588606 DOI: 10.1186/1471-2164-9-526] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.2] [Received: 04/17/2008] [Accepted: 11/06/2008] [Indexed: 06/04/2025] Open
Abstract
Background: Salmonella enterica serovar Enteritidis (SE) is one of the most common food-borne pathogens causing human salmonellosis, which usually results from the consumption of contaminated poultry products. The mechanism of SE resistance in chickens remains largely unknown. Previously, heterophils isolated from broilers with different genetic backgrounds (SE-resistant [line A] and -susceptible [line B]) have been shown to be important in defending against SE infections. To dissect the interplay between heterophils and SE infection, we utilized large-scale gene expression profiling. Results: More differentially expressed genes were found between the two lines than between infected (SE-treated) and non-infected (control) samples within a line. However, the numbers of expressed immune-related genes between these two comparisons were dramatically different. More genes related to immune function were down-regulated in line B than in line A. The analysis of the immune-related genes indicated that SE infection induced stronger up-regulation of gene expression in line A heterophils than in line B, and these genes include several components of the Toll-like receptor (TLR) signaling pathway and genes involved in T-helper cell activation. Conclusion: We found: (1) A divergent expression pattern of immune-related genes between lines of different genetic backgrounds. The higher expression of immune-related genes might be more beneficial to enhance host immunity in the resistant line; (2) a similar TLR regulatory network might exist in both lines, where a possible MyD88-independent pathway may participate in the regulation of host innate immunity; (3) the genes exclusively differentially expressed in line A or line B with SE infection provided strong candidates for further investigating SE resistance and susceptibility. These findings have laid the foundation for future studies of TLR pathway regulation and cellular modulation of SE infection in chickens.
Affiliation(s)
- Hsin-I Chiang
- Department of Poultry Science, Texas A&M University, College Station, TX 77843, USA.
|
30
|
Dai Q, Wang T. Comparison study on k-word statistical measures for protein: from sequence to 'sequence space'. BMC Bioinformatics 2008; 9:394. [PMID: 18811946 PMCID: PMC2571980 DOI: 10.1186/1471-2105-9-394] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Received: 04/09/2008] [Accepted: 09/23/2008] [Indexed: 11/30/2022] Open
Abstract
Background: Many proposed statistical measures can efficiently compare protein sequences to further infer protein structure, function and evolutionary information. They share the same idea of using k-word frequencies of protein sequences. Given a protein sequence, the information on its related protein sequences has not been used for protein sequence comparison until now. This paper proposes a scheme to construct a protein 'sequence space' associated with the protein sequences related to the given protein, and compares the performance of statistical measures with and without the information in the protein 'sequence space'. This paper also presents two statistical measures for protein: gre.k (generalized relative entropy) and gsm.k (gapped similarity measure). Results: We tested statistical measures, with and without protein 'sequence space', on three data sets. This not only offers a systematic and quantitative experimental assessment of these statistical measures, but also naturally complements the available comparison of statistical measures based on protein sequence. Moreover, we compared our statistical measures with alignment-based measures and the existing statistical measures. The experiments were grouped into two sets. The first one, performed via ROC (Receiver Operating Characteristic) analysis, aims at assessing the intrinsic ability of the statistical measures to discriminate and classify protein sequences. The second set of experiments aims at assessing how well our measure does in phylogenetic analysis. Based on the experiments, several conclusions can be drawn and, from them, novel valuable guidelines for the use of protein 'sequence space' and statistical measures were obtained. Conclusion: Alignment-based measures have a clear advantage when the data is highly redundant. Among the statistical measures, the novel gsm.k introduced in this article is the most efficient, followed by cos.k. When the data becomes less redundant, gre.k proposed by us achieves a better performance, but all the other measures perform poorly on classification tasks. Almost all the statistical measures achieve improvement by exploring the information on 'sequence space' as word length increases, especially for less redundant data. The reasonable results of phylogenetic analysis confirm that Gdis.k based on 'sequence space' is a reliable measure for phylogenetic analysis. In summary, our quantitative analysis verifies that exploring the information on 'sequence space' is a promising way to improve the abilities of statistical measures for protein comparison.
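The k-word idea in the abstract above can be illustrated with a minimal Python sketch. It counts overlapping k-words in two protein sequences and compares the count vectors by cosine similarity. This is only an assumption about what a measure such as cos.k looks like in its simplest form; the abstract does not define cos.k, gre.k, gsm.k or the 'sequence space' construction, so none of those are reproduced here, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def kword_counts(seq: str, k: int) -> Counter:
    """Count overlapping k-words (k-mers) in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a: str, b: str, k: int = 2) -> float:
    """Cosine of the angle between the k-word count vectors of a and b.

    Illustrative stand-in for a k-word measure; not the paper's definition.
    """
    ca, cb = kword_counts(a, k), kword_counts(b, k)
    # Only k-words present in both sequences contribute to the dot product.
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = sqrt(sum(v * v for v in ca.values()))
    norm_b = sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a sequence shorter than k has no k-words
    return dot / (norm_a * norm_b)
```

Calling `cosine_similarity(seq1, seq2, k=3)` returns a value between 0 (no shared k-words) and 1 (identical k-word composition); no alignment is computed, which is what makes such measures cheap compared with alignment-based ones.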
Affiliation(s)
- Qi Dai
- Department of Applied Mathematics, Dalian University of Technology, Dalian 116024, PR China.
|
31
|
Dowd SE, Killinger-Mann K, Brashears M, Fralick J. Evaluation of gene expression in a single antibiotic exposure-derived isolate of Salmonella enterica typhimurium 14028 possessing resistance to multiple antibiotics. Foodborne Pathog Dis 2008; 5:205-21. [PMID: 18407759 DOI: 10.1089/fpd.2007.0062] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 12/22/2022] Open
Abstract
Antibiotics are important tools used to control infections. Unfortunately, microbes can become resistant to antibiotics, which limits the drugs' usefulness for clinical and veterinary use. It is necessary to improve our understanding of mechanisms that contribute to or enhance antibiotic resistance. Using nalidixic acid (NA) exposure as a sole selective agent, a resistant strain of Salmonella enterica Typhimurium 14028 was derived (2a) that had acquired resistance to chloramphenicol, sulfisoxazole, cefoxitin, tetracycline, and NA. We employed gene array analysis to further characterize this derivative. Results indicate a significant difference (FDR < 5%) in the expression of 338 genes (fold regulation > 1.3) between the derivative and the parent strain growing exponentially under the same conditions at 37 °C. Strain 2a showed comparative induction of Salmonella pathogenicity island 2 (SPI2) transcripts and repression of SPI1 genes. Differences in expression were related to efflux pumps (increased expression), porins (decreased expression), type III secretion systems (increased expression), lipopolysaccharide synthesis (decreased expression), motility-related genes (decreased expression), and PhoP/PhoQ and peptidoglycan synthesis (increased expression). It appears that 2a developed altered regulation of gene expression to decrease the influx and increase the efflux of deleterious environmental agents (antibiotics) into and out of the cell, respectively. The mechanism(s) by which this was accomplished and the reasons for alterations in gene expression of other genetic systems (curli, flagella, PhoP/PhoQ, and peptidoglycan) are not immediately apparent. Evaluation of transcriptomes within multiple antibiotic-resistant mutants will hopefully enable us to better understand the generalized mechanisms by which bacteria become resistant to multiple antibiotics. Future work in sequencing these genomes and evaluating pathogenicity is suggested.
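The two thresholds quoted in the abstract above (FDR < 5% and fold regulation > 1.3) can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the authors' pipeline: the abstract does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment is assumed here, fold regulation is assumed to be a signed value whose magnitude is compared with the cutoff, and the function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order.

    Assumed procedure; the abstract only states "FDR < 5%".
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(genes, pvalues, fold_regulation, fdr=0.05, min_fold=1.3):
    """Keep genes passing both the FDR cutoff and the fold-regulation cutoff."""
    qvalues = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, q, fold in zip(genes, qvalues, fold_regulation)
        if q < fdr and abs(fold) > min_fold
    ]
```

With per-gene p-values and fold-regulation values from an array comparison, `select_genes(...)` returns the subset that would be reported as differentially expressed under these two cutoffs.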
Affiliation(s)
- Scot E Dowd
- Livestock Issues Research Unit, U.S. Department of Agriculture, Agricultural Research Service, Lubbock, Texas 79403, USA.
|
32
|
Kojetin DJ, Sullivan DM, Thompson RJ, Cavanagh J. Classification of response regulators based on their surface properties. Methods Enzymol 2008; 422:141-69. [PMID: 17628138 DOI: 10.1016/s0076-6879(06)22007-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 04/08/2023]
Abstract
The two-component signal transduction system is a ubiquitous signaling module present in most prokaryotic and some eukaryotic systems. Two conserved components, a histidine protein kinase (HPK) protein and a response regulator (RR) protein, function as a biological switch, sensing and responding to changes in the environment, thereby eliciting a specific response. Extensive studies have classified the HPK and RR proteins using primary sequence characteristics, domain identity, domain organization, and biological function. We propose that structural analysis of the surface properties of the highly conserved receiver domain of RRs can be used to build on previous classification methods. Our studies of the OmpR subfamily RRs in Bacillus subtilis and Escherichia coli reveal a notable correlation between the RR receiver domain surface classification and previous classification of cognate HPK proteins. We have extended these studies to analyze the receiver domains of all predicted RR proteins in the marine-dwelling bacterium Vibrio vulnificus.
Affiliation(s)
- Douglas J Kojetin
- Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati, College of Medicine, Cincinnati, Ohio, USA
|
33
|
Ladunga I. Finding homologs in amino acid sequences using network BLAST searches. Curr Protoc Bioinformatics 2008; Chapter 3:Unit 3.4. [PMID: 18428697 DOI: 10.1002/0471250953.bi0304s00] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
BLAST (Basic Local Alignment Search Tool) is used more frequently than any other biosequence database search program. The purpose of this unit is not only to show how to run searches on the Web, but also to demonstrate how to fine-tune arguments for a specific research project. It also offers guidance for interpreting results, handling statistical significance and biological relevance issues, and selecting complementary analyses. This unit covers three classes of the BLAST program: standard protein-to-protein searches, translated searches when either the query or the database consists of nucleotide sequences translated into proteins, and finally programs for comparing two sequences (as opposed to searching one sequence against a database of sequences).
Affiliation(s)
- Istvan Ladunga
- Celera Genomics, Foster City, California and Research Group for Evolutionary Genetics Hungarian Academy of Sciences Eötvös University, Budapest, Hungary
|
34
|
Sulakhe D, Rodriguez A, Wilde M, Foster I, Maltsev N. Interoperability of GADU in Using Heterogeneous Grid Resources for Bioinformatics Applications. IEEE Trans Inf Technol Biomed 2008; 12:241-6. [DOI: 10.1109/titb.2007.897783] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022]
|
35
|
Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, Zhang H, Pacifico S, Fotouhi F, DiRita VJ, Ideker T, Andrews P, Finley RL. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biol 2008; 8:R130. [PMID: 17615063 PMCID: PMC2323224 DOI: 10.1186/gb-2007-8-7-r130] [Citation(s) in RCA: 176] [Impact Index Per Article: 10.4] [Received: 01/02/2007] [Revised: 05/14/2007] [Accepted: 07/05/2007] [Indexed: 11/12/2022] Open
Abstract
Systematic identification of protein interactions for the bacterium Campylobacter jejuni using high-throughput yeast two-hybrid screens detected interactions for 80% of the organism's proteins. Background: Data from large-scale protein interaction screens for humans and model eukaryotes have been invaluable for developing systems-level models of biological processes. Despite this value, only a limited amount of interaction data is available for prokaryotes. Here we report the systematic identification of protein interactions for the bacterium Campylobacter jejuni, a food-borne pathogen and a major cause of gastroenteritis worldwide. Results: Using high-throughput yeast two-hybrid screens we detected and reproduced 11,687 interactions. The resulting interaction map includes 80% of the predicted C. jejuni NCTC11168 proteins and places a large number of poorly characterized proteins into networks that provide initial clues about their functions. We used the map to identify a number of conserved subnetworks by comparison to protein networks from Escherichia coli and Saccharomyces cerevisiae. We also demonstrate the value of the interactome data for mapping biological pathways by identifying the C. jejuni chemotaxis pathway. Finally, the interaction map also includes a large subnetwork of putative essential genes that may be used to identify potential new antimicrobial drug targets for C. jejuni and related organisms. Conclusion: The C. jejuni protein interaction map is one of the most comprehensive yet determined for a free-living organism and nearly doubles the binary interactions available for the prokaryotic kingdom. This high level of coverage facilitates pathway mapping and function prediction for a large number of C. jejuni proteins as well as orthologous proteins from other organisms. The broad coverage also facilitates cross-species comparisons for the identification of evolutionarily conserved subnetworks of protein interactions.
Affiliation(s)
- Jodi R Parrish
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI, USA 48201
- Jingkai Yu
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI, USA 48201
- Guozhen Liu
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI, USA 48201
- Julie A Hines
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI, USA 48201
- Jason E Chan
- Department of Bioengineering and Program in Bioinformatics, University of California at San Diego, San Diego, CA, USA 92093
- Bernie A Mangiola
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI, USA 48201
- Huamei Zhang
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI, USA 48201
- Svetlana Pacifico
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI, USA 48201
- Farshad Fotouhi
- Department of Computer Science, Wayne State University, Detroit, MI, USA 48201
- Victor J DiRita
- Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI, USA 48109
- Trey Ideker
- Department of Bioengineering and Program in Bioinformatics, University of California at San Diego, San Diego, CA, USA 92093
- Phillip Andrews
- Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, MI, USA 48109
- Russell L Finley
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI, USA 48201
- Department of Biochemistry and Molecular Biology, Wayne State University School of Medicine, Detroit, MI, USA 48201
|
36
|
Correlating low-similarity peptide sequences and HIV B-cell epitopes. Autoimmun Rev 2008; 7:291-6. [DOI: 10.1016/j.autrev.2007.11.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Received: 10/17/2007] [Accepted: 11/06/2007] [Indexed: 11/20/2022]
|
37
|
Li X, Chiang HI, Zhu J, Dowd SE, Zhou H. Characterization of a newly developed chicken 44K Agilent microarray. BMC Genomics 2008; 9:60. [PMID: 18237426 PMCID: PMC2262898 DOI: 10.1186/1471-2164-9-60] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.7] [Received: 11/21/2007] [Accepted: 01/31/2008] [Indexed: 12/05/2022] Open
Abstract
Background: The development of microarray technology has greatly enhanced our ability to evaluate gene expression. In theory, the expression of all genes in a given organism can be monitored simultaneously. Sequencing of the chicken genome has provided the crucial information for the design of a comprehensive chicken transcriptome microarray. A long oligonucleotide microarray has been manually curated and designed by our group and manufactured using Agilent inkjet technology. This provides a flexible and powerful platform with high sensitivity and specificity for gene expression studies. Results: A chicken 60-mer oligonucleotide microarray consisting of 42,034 features including the entire Marek's disease virus, two avian influenza viruses (H5N2 and H5N3), and 150 chicken microRNAs has been designed and tested. In an important validation study, total RNA isolated from four major chicken tissues, cecal tonsil (C), ileum (I), liver (L), and spleen (S), was used for comparative hybridizations. More than 95% of spots had a high signal-to-noise ratio (SNR > 10). There were 2886, 2660, 358, 3208, 3355, and 3710 genes differentially expressed between liver and spleen, spleen and cecal tonsil, cecal tonsil and ileum, liver and cecal tonsil, liver and ileum, and spleen and ileum, respectively (P < 10⁻⁷). A number of tissue-selective genes were identified for cecal tonsil, ileum, liver, and spleen (95, 71, 535, and 108, respectively; P < 10⁻⁷). Another highlight of these data was that the antimicrobial peptides GAL1, GAL2, GAL6 and GAL7 were highly expressed in the spleen compared to other tissues tested. Conclusion: A chicken 60-mer oligonucleotide 44K microarray was designed and validated in a comprehensive survey of gene expression in diverse tissues. The results of these tissue expression analyses have demonstrated that this microarray has high specificity and sensitivity, and will be a useful tool for chicken functional genomics. Novel data on the expression of putative tissue-specific genes and antimicrobial peptides are highlighted as part of this comprehensive microarray validation study. The information for accessing and ordering this 44K chicken array can be found at
Affiliation(s)
- Xianyao Li
- Texas Agricultural Experiment Station, Texas A&M University, College Station, TX, USA.
|
38
|
Baerson SR, Dayan FE, Rimando AM, Nanayakkara NPD, Liu CJ, Schröder J, Fishbein M, Pan Z, Kagan IA, Pratt LH, Cordonnier-Pratt MM, Duke SO. A functional genomics investigation of allelochemical biosynthesis in Sorghum bicolor root hairs. J Biol Chem 2007; 283:3231-3247. [PMID: 17998204 DOI: 10.1074/jbc.m706587200] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.6] [Indexed: 11/06/2022] Open
Abstract
Sorghum is considered to be one of the more allelopathic crop species, producing phytotoxins such as the potent benzoquinone sorgoleone (2-hydroxy-5-methoxy-3-[(Z,Z)-8',11',14'-pentadecatriene]-p-benzoquinone) and its analogs. Sorgoleone likely accounts for much of the allelopathy of Sorghum spp., typically representing the predominant constituent of Sorghum bicolor root exudates. Previous and ongoing studies suggest that the biosynthetic pathway for this plant growth inhibitor occurs in root hair cells, involving a polyketide synthase activity that utilizes an atypical 16:3 fatty acyl-CoA starter unit, resulting in the formation of a pentadecatrienyl resorcinol intermediate. Subsequent modifications of this resorcinolic intermediate are likely to be mediated by S-adenosylmethionine-dependent O-methyltransferases and dihydroxylation by cytochrome P450 monooxygenases, although the precise sequence of reactions has not been determined previously. Analyses performed by gas chromatography-mass spectrometry with sorghum root extracts identified a 3-methyl ether derivative of the likely pentadecatrienyl resorcinol intermediate, indicating that dihydroxylation of the resorcinol ring is preceded by O-methylation at the 3'-position by a novel 5-n-alk(en)ylresorcinol-utilizing O-methyltransferase activity. An expressed sequence tag data set consisting of 5,468 sequences selected at random from an S. bicolor root hair-specific cDNA library was generated to identify candidate sequences potentially encoding enzymes involved in the sorgoleone biosynthetic pathway. Quantitative real time reverse transcription-PCR and recombinant enzyme studies with putative O-methyltransferase sequences obtained from the expressed sequence tag data set have led to the identification of a novel O-methyltransferase highly and predominantly expressed in root hairs (designated SbOMT3), which preferentially utilizes alk(en)ylresorcinols among a panel of benzene-derivative substrates tested. SbOMT3 is therefore proposed to be involved in the biosynthesis of the allelochemical sorgoleone.
Affiliation(s)
- Scott R Baerson
- United States Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, Mississippi 38677
- Franck E Dayan
- United States Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, Mississippi 38677
- Agnes M Rimando
- United States Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, Mississippi 38677
- N P Dhammika Nanayakkara
- National Center for Natural Products Research, School of Pharmacy, University of Mississippi, University, Mississippi 38677
- Chang-Jun Liu
- Biology Department, Brookhaven National Laboratory, Upton, New York 11973
- Joachim Schröder
- Universität Freiburg, Institut für Biologie II, Schänzlestrasse 1, D-79104 Freiburg, Germany
- Mark Fishbein
- Department of Biology, Portland State University, Portland, Oregon 97207
- Zhiqiang Pan
- United States Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, Mississippi 38677
- Isabelle A Kagan
- United States Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, Mississippi 38677
- Lee H Pratt
- Department of Plant Biology, University of Georgia, Athens, Georgia 30602
- Stephen O Duke
- United States Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, Mississippi 38677
|
39
|
Dowd SE, Killinger-Mann K, Blanton J, San Francisco M, Brashears M. Positive adaptive state: microarray evaluation of gene expression in Salmonella enterica Typhimurium exposed to nalidixic acid. Foodborne Pathog Dis 2007; 4:187-200. [PMID: 17600486 DOI: 10.1089/fpd.2006.0075] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
Abstract
The emergence of antimicrobial resistance among foodborne bacteria associated with food animal production is an important global issue. We hypothesised that antibiotics generate a positive adaptive state in Salmonella that actively contributes to the development of antimicrobial resistance. This is opposed to the common view that antimicrobials only act as a passive selective pressure. Microarray analysis was used to evaluate changes in gene expression that occur upon exposure of Salmonella enterica Typhimurium ATCC 14028 to 1.6 µg/mL of nalidixic acid. The results showed a significant (P < 0.02) difference (fold expression differences > 2.0) in the expression of 226 genes. Comparatively repressed transcripts included Salmonella pathogenicity islands 1 and 2 (SPI1 and SPI2). Induced genes included efflux pumps representing all five families of multidrug-resistance efflux pumps, outer membrane lipoproteins, and genes involved in regulating lipopolysaccharide chain length. This profile suggests both enhanced antimicrobial export from the cell and membrane permeability adaptations to limit diffusion of nalidixic acid into the cell. Finally, increased expression of the error-prone DNA repair mechanisms was also observed. From these data we show a highly integrated genetic response to nalidixic acid that places Salmonella into a positive adaptive state that elicits mutations. Evaluation of gene expression profile changes that occur during exposure to antibiotics will continue to improve our understanding of the development of antibiotic resistance.
Affiliation(s)
- Scot E Dowd
- United States Department of Agriculture, Agricultural Research Service, Livestock Issues Research Unit, Lubbock, Texas 79403, USA.
40
Ferrè F, Ponty Y, Lorenz WA, Clote P. DIAL: a web server for the pairwise alignment of two RNA three-dimensional structures using nucleotide, dihedral angle and base-pairing similarities. Nucleic Acids Res 2007; 35:W659-68. [PMID: 17567620 PMCID: PMC1933154 DOI: 10.1093/nar/gkm334] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.6] [Indexed: 12/04/2022] Open
Abstract
DIAL (dihedral alignment) is a web server that provides public access to a new dynamic programming algorithm for pairwise 3D structural alignment of RNA. DIAL achieves quadratic time by performing an alignment that accounts for (i) pseudo-dihedral and/or dihedral angle similarity, (ii) nucleotide sequence similarity and (iii) nucleotide base-pairing similarity. DIAL provides access to three alignment algorithms: global (Needleman–Wunsch), local (Smith–Waterman) and semiglobal (modified to yield motif search). Suboptimal alignments are optionally returned, and also Boltzmann pair probabilities Pr(a_i, b_j) for aligned positions a_i, b_j from the optimal alignment. If a non-zero suboptimal alignment score ratio is entered, then the semiglobal alignment algorithm may be used to detect structurally similar occurrences of a user-specified 3D motif. The query motif may be contiguous in the linear chain or fragmented in a number of noncontiguous regions. The DIAL web server provides graphical output which allows the user to view, rotate and enlarge the 3D superposition for the optimal (and suboptimal) alignment of query to target. Although graphical output is available for all three algorithms, the semiglobal motif search may be of most interest in attempts to identify RNA motifs. DIAL is available at http://bioinformatics.bc.edu/clotelab/DIAL.
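To illustrate the kind of quadratic-time dynamic programming the abstract describes, the following is a minimal sketch of a global (Needleman–Wunsch) alignment score in which each aligned pair is scored by a weighted sum of nucleotide identity and dihedral-angle similarity. It is not DIAL's implementation: the cosine-based angle similarity, the weights `w_seq` and `w_ang`, the linear gap penalty and the function names are all assumptions made for illustration, and base-pairing similarity is omitted.

```python
import math


def angle_similarity(a_deg, b_deg):
    """Hypothetical angle score in [-1, 1]; 1 means identical angles (degrees)."""
    return math.cos(math.radians(a_deg - b_deg))


def global_align_score(seq_a, ang_a, seq_b, ang_b, w_seq=1.0, w_ang=1.0, gap=-1.0):
    """Needleman-Wunsch score over nucleotides plus one angle per position.

    seq_a, seq_b: nucleotide strings; ang_a, ang_b: per-position angles in degrees.
    All weights and the linear gap penalty are illustrative assumptions.
    """
    n, m = len(seq_a), len(seq_b)
    # dp[i][j] = best score aligning the first i positions of A with the first j of B
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            nucleotide = 1.0 if seq_a[i - 1] == seq_b[j - 1] else -1.0
            pair_score = w_seq * nucleotide + w_ang * angle_similarity(ang_a[i - 1], ang_b[j - 1])
            dp[i][j] = max(
                dp[i - 1][j - 1] + pair_score,  # align position i with position j
                dp[i - 1][j] + gap,             # gap in B
                dp[i][j - 1] + gap,             # gap in A
            )
    return dp[n][m]
```

The table has (n+1)(m+1) cells and each is filled in constant time, which is where the quadratic running time comes from. A local or semiglobal variant would change only the boundary conditions and where the best score is read off.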
Affiliation(s)
- F. Ferrè
- Harvard Medical School, Children's Hospital, Hematology/Oncology Department, Boston, MA 02115 and Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
| | - Y. Ponty
- Harvard Medical School, Children's Hospital, Hematology/Oncology Department, Boston, MA 02115 and Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
| | - W. A. Lorenz
- Harvard Medical School, Children's Hospital, Hematology/Oncology Department, Boston, MA 02115 and Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
| | - Peter Clote
- Harvard Medical School, Children's Hospital, Hematology/Oncology Department, Boston, MA 02115 and Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
- *To whom correspondence should be addressed. +1 617 552 1332; +1 617 552 2011
41
Biro JC, Biro JMK. The BlastNP: a novel, sensitive sequence similarity searching method using overlappingly translated sequences. CONFERENCE PROCEEDINGS : ... ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL CONFERENCE 2007; 2004:2777-80. [PMID: 17270853 DOI: 10.1109/iembs.2004.1403794] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022]
Abstract
An alternative method to TblastX has been developed, known as blastNP. Nucleic acids in database and query sequences were translated into overlapping protein-like sequences (overlappingly translated sequences or OTSs) before searching with blastP. Thus, each nucleic acid sequence is represented by a single "protein like" sequence (instead of three hypothetical proteins in different reading frames). The BlastNP method is defined as a BlastP that is performed on an overlappingly translated nucleic acid database using a similarly converted nucleic acid query. The specificity and sensitivity of blastNP and TblastX are quantitatively very similar, except that blastNP is more sensitive in detecting short sequence similarities (less than 50 residues). However, a qualitative comparison of the observed similarities showed that only 56% were detected by both methods, while 22% were indicated only by blastNP and 22% only by TblastX. For example, a statistically significant similarity between prion protein (PrP) and transcription factors (TF) was only detected by blastNP. A signal amplification was seen when OTS sequences were used in similarity visualisation methods (like LALIGN) instead of nucleic acids.
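The overlapping translation can be sketched as follows. This is one plausible reading of the abstract, not the authors' code: it assumes the window advances one nucleotide at a time, so that all three reading frames are interleaved into a single protein-like string, and it assumes the standard genetic code. The function name and the use of "X" for unrecognised codons are illustrative choices.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def overlapping_translate(nucleotides):
    """Translate every codon starting at every position (step of 1, assumed).

    Returns one protein-like string of length len(nucleotides) - 2;
    codons containing unknown characters are written as 'X'.
    """
    seq = nucleotides.upper().replace("U", "T")
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(len(seq) - 2))


print(overlapping_translate("ATGGCC"))  # MWGA (codons ATG, TGG, GGC, GCC)
```

Under this reading, both the query and every database entry would be converted in the same way before a protein-level search is run on the resulting strings.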
Affiliation(s)
- J C Biro
- Homulus Informatics, San Francisco, CA, USA
42
Polimeno L, Mittelman A, Gennero L, Ponzetto A, Lucchese G, Stufano A, Kusalik A, Kanduc D. Sub-epitopic dissection of HCV E1(315-328)HRMAWDMMMNWSPT sequence by similarity analysis. Amino Acids 2007; 34:479-84. [PMID: 17458624 DOI: 10.1007/s00726-007-0539-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 01/15/2007] [Accepted: 02/23/2007] [Indexed: 11/24/2022]
Abstract
Our labs are focused on identifying amino acid sequences having the ability to react specifically with the functional binding site of a complementary antibody. Our epitopic definition is based on the analysis of the similarity level of antigenic amino acid sequences to the host proteome. Here, the similarity profile to the human proteome of an HCV E1 immunodominant epitope, i.e. the HCV E1(315-328)HRMAWDMMMNWSPT sequence, led to i) characterizing the immunoreactive HCV E1(315-328) region as a sequence endowed with a low level of similarity to human proteins; ii) defining two contiguous immunodominant linear determinants located at the NH2 and COOH termini, respectively, of the conserved viral antigenic sequence. This study supports the hypothesis that low sequence similarity to the host's proteome modulates the pool of epitopic amino acid sequences in a viral antigen, and appears of potential value in defining immunogenic viral peptide sequences to be used in immunotherapeutic approaches for HCV treatment.
Affiliation(s)
- L Polimeno
- Department of Emergency and Organ Transplantation, Gastroenterology Section, University of Bari, Bari, Italy
43
Evans EE, Henn AD, Jonason A, Paris MJ, Schiffhauer LM, Borrello MA, Smith ES, Sahasrabudhe DM, Zauderer M. C35 (C17orf37) is a novel tumor biomarker abundantly expressed in breast cancer. Mol Cancer Ther 2007; 5:2919-30. [PMID: 17121940 DOI: 10.1158/1535-7163.mct-06-0389] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Indexed: 11/16/2022]
Abstract
Identification of shared tumor-specific targets is useful in developing broadly applicable therapies. In a study designed to identify genes up-regulated in breast cancer, a cDNA clone corresponding to a novel gene C35 (C17orf37) was selected by representational difference analysis of tumor and normal human mammary cell lines. Abundant expression of C35 transcript in tumors was confirmed by Northern blot and real-time PCR. The C35 gene is located on chromosome 17q12, 505 nucleotides from the 3' end of the ERBB2 oncogene, the antigenic target for trastuzumab (Herceptin) therapy. The chromosomal arrangement of the genes encoding C35 and ERBB2 is tail to tail. An open reading frame encodes a 12-kDa protein of unknown function. Immunohistochemical analysis detected robust and frequent expression of C35 protein, including 32% of grade 1 and 66% of grades 2 and 3 infiltrating ductal carcinomas of the breast (in contrast to 20% overexpressing HER-2/neu), 38% of infiltrating lobular carcinoma (typically HER-2/neu negative), as well as tumors arising in other tissues. C35 was not detected in 38 different normal human tissues, except Leydig cells in the testes and trace levels in a small percentage of normal breast tissue samples. The distinct and favorable expression profile of C35 spanning early through late stages of disease, including high frequency of overexpression in various breast carcinoma, abundant expression in distant metastases, and either absence or low level expression in normal human tissues, warrants further investigation of the relevance of C35 as a biomarker and/or a target for development of broadly applicable cancer-specific therapies.
Affiliation(s)
- Elizabeth E Evans
- Vaccinex, Inc., 1875 Mt. Hope Avenue, Rochester, NY 14620, USA.
44
Abstract
BACKGROUND Protein sequence clustering has been widely used as a part of the analysis of protein structure and function. In most cases single linkage or graph-based clustering algorithms have been applied. OPTICS (Ordering Points To Identify the Clustering Structure) is an attractive approach due to its emphasis on visualization of results and support for interactive work, e.g., in choosing parameters. However, OPTICS has not been used, as far as we know, for protein sequence clustering. RESULTS In this paper, a system of clustering proteins, SEQOPTICS (SEQuence clustering with OPTICS) is demonstrated. The system is implemented with Smith-Waterman as protein distance measurement and OPTICS at its core to perform protein sequence clustering. SEQOPTICS is tested with four data sets from different data sources. Visualization of the sequence clustering structure is demonstrated as well. CONCLUSION The system was evaluated by comparison with other existing methods. Analysis of the results demonstrates that SEQOPTICS performs better based on some evaluation criteria including Jaccard coefficient, Precision, and Recall. It is a promising protein sequence clustering method with future possible improvement on parallel computing and other protein distance measurements.
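The evaluation criteria named in the abstract (Jaccard coefficient, Precision, Recall) are commonly computed for clusterings by counting pairs of items. The sketch below shows that pair-counting form; whether SEQOPTICS was scored in exactly this way is an assumption, and the function name and input layout are hypothetical.

```python
from itertools import combinations


def pair_counting_scores(predicted, reference):
    """Compare two clusterings given as dicts mapping sequence id -> cluster label.

    A pair counts as a true positive when both clusterings put it in one cluster.
    Returns (jaccard, precision, recall); a zero denominator yields 0.0.
    """
    tp = fp = fn = 0
    for a, b in combinations(sorted(predicted), 2):
        same_pred = predicted[a] == predicted[b]
        same_ref = reference[a] == reference[b]
        if same_pred and same_ref:
            tp += 1
        elif same_pred:
            fp += 1  # grouped by the method, separated in the reference
        elif same_ref:
            fn += 1  # separated by the method, grouped in the reference

    def ratio(num, den):
        return num / den if den else 0.0

    return ratio(tp, tp + fp + fn), ratio(tp, tp + fp), ratio(tp, tp + fn)


predicted = {"p1": 0, "p2": 0, "p3": 1, "p4": 1}
reference = {"p1": "A", "p2": "A", "p3": "A", "p4": "B"}
print(pair_counting_scores(predicted, reference))  # (0.25, 0.5, 0.333...)
```

In the example, one pair is grouped by both clusterings, one only by the prediction and two only by the reference, giving Jaccard 1/4, precision 1/2 and recall 1/3.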
Affiliation(s)
- Yonghui Chen
- Department of Computer and Information Sciences, University of Alabama at Birmingham, Birmingham, AL 35294-1170, USA
| | - Kevin D Reilly
- Department of Computer and Information Sciences, University of Alabama at Birmingham, Birmingham, AL 35294-1170, USA
| | - Alan P Sprague
- Department of Computer and Information Sciences, University of Alabama at Birmingham, Birmingham, AL 35294-1170, USA
| | - Zhijie Guan
- San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093-0505, USA
45
Zhang W, Culley DE, Gritsenko MA, Moore RJ, Nie L, Scholten JCM, Petritis K, Strittmatter EF, Camp DG, Smith RD, Brockman FJ. LC-MS/MS based proteomic analysis and functional inference of hypothetical proteins in Desulfovibrio vulgaris. Biochem Biophys Res Commun 2006; 349:1412-9. [PMID: 16982031 DOI: 10.1016/j.bbrc.2006.09.019] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 09/01/2006] [Accepted: 09/07/2006] [Indexed: 11/26/2022]
Abstract
High efficiency capillary liquid chromatography-tandem mass spectrometry (LC-MS/MS) was used to examine the proteins extracted from Desulfovibrio vulgaris cells across six treatment conditions. While our previous study provided a proteomic overview of the cellular metabolism based on proteins with known functions [W. Zhang, M.A. Gritsenko, R.J. Moore, D.E. Culley, L. Nie, K. Petritis, E.F. Strittmatter, D.G. Camp II, R.D. Smith, F.J. Brockman, A proteomic view of the metabolism in Desulfovibrio vulgaris determined by liquid chromatography coupled with tandem mass spectrometry, Proteomics 6 (2006) 4286-4299], this study describes the global detection and functional inference for hypothetical D. vulgaris proteins. Using criteria that a given peptide of a protein is identified from at least two out of three independent LC-MS/MS measurements and that for any protein at least two different peptides are identified among the three measurements, 129 open reading frames (ORFs) originally annotated as hypothetical proteins were found to encode expressed proteins. Functional inference for the conserved hypothetical proteins was performed by a combination of several non-homology based methods: genomic context analysis, phylogenomic profiling, and analysis of a combination of experimental information, including peptide detection in cells grown under specific culture conditions and cellular location of the proteins. Using this approach we were able to assign possible functions to 20 conserved hypothetical proteins. This study demonstrated that a combination of proteomics and bioinformatics methodologies can provide verification of the expression of hypothetical proteins and improve genome annotation.
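The two identification criteria in the abstract can be expressed as a small filter. The sketch below takes one reading of them: a peptide is confirmed when it appears in at least two of the three measurements, and a protein is accepted when it has at least two different confirmed peptides. The data layout, the function name and the example identifiers are hypothetical.

```python
from collections import defaultdict


def expressed_proteins(runs, min_runs=2, min_peptides=2):
    """Apply the two-out-of-three peptide rule and the two-peptide protein rule.

    runs: list of sets of (protein_id, peptide_sequence) pairs, one set per
    independent LC-MS/MS measurement (layout assumed for illustration).
    Returns the set of protein ids that satisfy both criteria.
    """
    # Count in how many measurements each (protein, peptide) pair was identified.
    run_count = defaultdict(int)
    for run in runs:
        for pair in set(run):
            run_count[pair] += 1

    # Keep peptides seen in enough measurements, grouped by protein.
    confirmed = defaultdict(set)
    for (protein, peptide), count in run_count.items():
        if count >= min_runs:
            confirmed[protein].add(peptide)

    return {p for p, peptides in confirmed.items() if len(peptides) >= min_peptides}


runs = [
    {("ORF1", "AAK"), ("ORF1", "CCR"), ("ORF2", "DDK")},
    {("ORF1", "AAK"), ("ORF1", "CCR"), ("ORF2", "EEK")},
    {("ORF2", "DDK")},
]
print(expressed_proteins(runs))  # {'ORF1'}
```

In the example, ORF2 has only one peptide that recurs across measurements, so it fails the second criterion under this reading.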
Affiliation(s)
- Weiwen Zhang
- Microbiology Group, Pacific Northwest National Laboratory, 902 Battelle Boulevard, P.O. Box 999, Richland, WA 99352, USA.
46
Lucchese A, Mittelman A, Tessitore L, Serpico R, Sinha AA, Kanduc D. Proteomic definition of a desmoglein linear determinant common to Pemphigus vulgaris and Pemphigus foliaceous. J Transl Med 2006; 4:37. [PMID: 16925820 PMCID: PMC1590053 DOI: 10.1186/1479-5876-4-37] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 06/28/2006] [Accepted: 08/22/2006] [Indexed: 11/25/2022] Open
Abstract
Background A number of autoimmune diseases have been clinically and pathologically characterized. In contrast, target antigens have been identified only in a few cases and, in these few cases, the knowledge of the exact epitopic antigenic sequence is still lacking. Thus the major objective of current work in the autoimmunity field is the identification of the epitopic sequences that are related to autoimmune reactions. Our labs propose that autoantigen peptide epitopes able to evoke humoral (auto)immune response are defined by the sequence similarity to the host proteome. The underlying scientific rationale is that antigen peptides acquire immunoreactivity in the context of their proteomic similarity level. Sequences uniquely owned by a protein will have high potential to evoke an immune reaction, whereas motifs with high proteomic redundancy should be immunogenically silenced by the tolerance phenomenon. The relationship between sequence redundancy and peptide immunoreactivity has been successfully validated in a number of experimental models. Here the hypothesis has been applied to pemphigus diseases and the corresponding desmoglein autoantigens. Methods Desmoglein 3 sequence similarity analysis to the human proteome followed by dot-blot/NMR immunoassays were carried out to identify and validate possible epitopic sequences. Results Computational analysis led to identifying a linear immunodominant desmoglein-3 epitope highly reactive with the sera from Pemphigus vulgaris as well as Pemphigus foliaceous. The epitopic peptide corresponded to the amino acid REWVKFAKPCRE sequence, was located in the extreme N-terminal region (residues 49 to 60), and had low redundancy to the human proteome. Sequence alignment showed that human desmoglein 1 and 3 share the REW-KFAK–RE sequence as a common motif with 75% residue identity. 
Conclusion This study 1) validates sequence redundancy to the autoproteome as a main factor in shaping desmoglein peptide immunogenicity; 2) offers a molecular mechanistic basis for analyzing the commonality of autoimmune responses exhibited by the two forms of pemphigus; 3) indicates possible peptide-immunotherapeutic approaches for pemphigus diseases.
Affiliation(s)
| | | | | | - Rosario Serpico
- Institute of Clinical Odontostomatology, 2University of Naples, Italy
| | - Animesh A Sinha
- Division of Dermatology and Cutaneous Sciences, Center for Investigative Dermatology, Michigan State University, East Lansing, MI, USA
| | - Darja Kanduc
- Dept. of Biochemistry and Molecular Biology, University of Bari, Italy
47
Sulakhe D, Rodriguez A, D'Souza M, Wilde M, Nefedova V, Foster I, Maltsev N. GNARE: automated system for high-throughput genome analysis with grid computational backend. J Clin Monit Comput 2006; 19:361-9. [PMID: 16328950 DOI: 10.1007/s10877-005-3463-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 06/30/2005] [Accepted: 06/30/2005] [Indexed: 10/25/2022]
Abstract
Recent progress in genomics and experimental biology has brought exponential growth of the biological information available for computational analysis in public genomics databases. However, applying the potentially enormous scientific value of this information to the understanding of biological systems requires computing and data storage technology of an unprecedented scale. The Grid, with its aggregated and distributed computational and storage infrastructure, offers an ideal platform for high-throughput bioinformatics analysis. To leverage this we have developed the Genome Analysis Research Environment (GNARE), a scalable computational system for the high-throughput analysis of genomes, which provides an integrated database and computational backend for data-driven bioinformatics applications. GNARE efficiently automates the major steps of genome analysis including acquisition of data from multiple genomic databases; data analysis by a diverse set of bioinformatics tools; and storage of results and annotations. High-throughput computations in GNARE are performed using distributed heterogeneous Grid computing resources such as Grid2003, TeraGrid, and the DOE Science Grid. Multi-step genome analysis workflows involving massive data processing, the use of application-specific tools and algorithms and updating of an integrated database to provide interactive web access to results are all expressed and controlled by a "virtual data" model which transparently maps computational workflows to distributed Grid resources. This paper describes how Grid technologies such as Globus, Condor, and the GriPhyN Virtual Data System were applied in the development of GNARE. It focuses on our approach to Grid resource allocation and to the use of GNARE as a computational framework for the development of bioinformatics applications.
Affiliation(s)
- Dinanath Sulakhe
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
48
Güldener U, Münsterkötter M, Oesterheld M, Pagel P, Ruepp A, Mewes HW, Stümpflen V. MPact: the MIPS protein interaction resource on yeast. Nucleic Acids Res 2006; 34:D436-41. [PMID: 16381906 PMCID: PMC1347366 DOI: 10.1093/nar/gkj003] [Citation(s) in RCA: 234] [Impact Index Per Article: 12.3] [Indexed: 11/23/2022] Open
Abstract
In recent years, the Munich Information Center for Protein Sequences (MIPS) yeast protein–protein interaction (PPI) dataset has been used in numerous analyses of protein networks and has been called a gold standard because of its quality and comprehensiveness [H. Yu, N. M. Luscombe, H. X. Lu, X. Zhu, Y. Xia, J. D. Han, N. Bertin, S. Chung, M. Vidal and M. Gerstein (2004) Genome Res., 14, 1107–1118]. MPact and the yeast protein localization catalog provide information related to the proximity of proteins in yeast. Beside the integration of high-throughput data, information about experimental evidence for PPIs in the literature was compiled by experts adding up to 4300 distinct PPIs connecting 1500 proteins in yeast. As the interaction data is a complementary part of CYGD, interactive mapping of data on other integrated data types such as the functional classification catalog [A. Ruepp, A. Zollner, D. Maier, K. Albermann, J. Hani, M. Mokrejs, I. Tetko, U. Güldener, G. Mannhaupt, M. Münsterkötter and H. W. Mewes (2004) Nucleic Acids Res., 32, 5539–5545] is possible. A survey of signaling proteins and comparison with pathway data from KEGG demonstrates that based on these manually annotated data only an extensive overview of the complexity of this functional network can be obtained in yeast. The implementation of a web-based PPI-analysis tool allows analysis and visualization of protein interaction networks and facilitates integration of our curated data with high-throughput datasets. The complete dataset as well as user-defined sub-networks can be retrieved easily in the standardized PSI-MI format. The resource can be accessed through .
Affiliation(s)
- Ulrich Güldener
- Institute for Bioinformatics, GSF National Research Center for Environment and Health, Ingolstädter Landstrasse 1, D-85764 Neuherberg, Germany.
49
Hazai E, Visy J, Fitos I, Bikádi Z, Simonyi M. Selective binding of coumarin enantiomers to human alpha1-acid glycoprotein genetic variants. Bioorg Med Chem 2005; 14:1959-65. [PMID: 16290938 DOI: 10.1016/j.bmc.2005.10.045] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Received: 08/31/2005] [Revised: 10/19/2005] [Accepted: 10/25/2005] [Indexed: 11/23/2022]
Abstract
Coumarin-type anticoagulants, warfarin, phenprocoumon and acenocoumarol, were tested for their stereoselective binding to the human orosomucoid (ORM; AGP) genetic variants ORM 1 and ORM 2. Direct binding studies with racemic ligands were carried out by the ultrafiltration method; the concentrations of free enantiomers were determined by capillary electrophoresis. The binding of pure enantiomers was investigated with quinaldine red fluorescence displacement measurements. Our results demonstrated that all investigated compounds bind more strongly to the ORM 1 variant than to ORM 2. ORM 1 and human native AGP preferred the binding of (S)-enantiomers of warfarin and acenocoumarol, while no enantioselectivity was observed in phenprocoumon binding. Acenocoumarol possessed the highest enantioselectivity in AGP binding due to the weak binding of its (R)-enantiomer. Furthermore, a new homology model of AGP was built, and the models of ORM 1 and ORM 2 suggested that the difference in binding to AGP genetic variants is caused by steric factors.
Affiliation(s)
- Eszter Hazai
- Department of Molecular Pharmacology, Institute of Biomolecular Chemistry, Chemical Research Center, Hungarian Academy of Sciences, PO Box 17, H-1525 Budapest, Hungary.
50
Lucchese A, Willers J, Mittelman A, Kanduc D, Dummer R. Proteomic Scan for Tyrosinase Peptide Antigenic Pattern in Vitiligo and Melanoma: Role of Sequence Similarity and HLA-DR1 Affinity. J Immunol 2005; 175:7009-20. [PMID: 16272362 DOI: 10.4049/jimmunol.175.10.7009] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Indexed: 11/19/2022]
Abstract
Immune responses contribute to the pathogenesis of vitiligo and target melanoma sometimes associated with vitiligo-like depigmentation in some melanoma patients. We analyzed the sera from patients with vitiligo and cutaneous melanoma for reactivity toward tyrosinase peptide sequences 1) endowed with a low level of similarity to the human proteome, and 2) potentially able to bind HLA-DR1 Ags. We report that the tyrosinase autoantigen was immunorecognized with the same molecular pattern by sera from vitiligo and melanoma patients. Five autoantigen peptides composed the immunodominant anti-tyrosinase response: aa95-104 FMGFNCGNCK; aa175-182 LFVWMHYY; aa176-190 FVWMHYYVSMDALLG; aa222-236 IQKLTGDENFTIPYW; and aa233-247 IPYWDWRDAEKCDIC. All five antigenic peptides were characterized by being (or containing) a sequence with a low similarity level to the self proteome. Sera from healthy subjects were responsive to aa95-104 FMGFNCGNCK, aa222-236 IQKLTGDENFTIPYW, and aa233-247 IPYWDWRDAEKCDIC, but did not react with the aa175-182 LFVWMHYY and aa176-190 FVWMHYYVSMDALLG peptide sequences containing the copper-binding His180 and the oculocutaneous albinism I-A variant position F176. Our results indicate a clear-cut link between peptide immunogenicity and low similarity level of the corresponding amino acid sequence, and are an example of a comparative analysis that might make it possible to comprehensively distinguish the epitopic peptide sequences within a disease from those associated with natural autoantibodies. In particular, these data, for the first time, delineate the linear B epitope pattern on tyrosinase autoantigen and provide definitive evidence of humoral immune responses against tyrosinase.
Affiliation(s)
- Alberta Lucchese
- Department of Odontostomatology and Surgery, University of Bari, Bari, Italy