51. Zhang B, Li J, Quan L, Chen Y, Lü Q. Sequence-based prediction of protein-protein interaction sites by simplified long short-term memory network. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.05.013] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Indexed: 11/24/2022]
52. Sarkar D, Saha S. Machine-learning techniques for the prediction of protein–protein interactions. J Biosci 2019. [DOI: 10.1007/s12038-019-9909-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 02/06/2023]
53. Reille S, Garnier M, Robert X, Gouet P, Martin J, Launay G. Identification and visualization of protein binding regions with the ArDock server. Nucleic Acids Res 2019; 46:W417-W422. [PMID: 29905873 PMCID: PMC6031020 DOI: 10.1093/nar/gky472] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 01/31/2018] [Accepted: 05/28/2018] [Indexed: 12/21/2022] Open
Abstract
ArDock (ardock.ibcp.fr) is a structural bioinformatics web server for the prediction and visualization of potential interaction regions at protein surfaces. ArDock ranks the surface residues of a protein according to their tendency to form interfaces in a set of predefined docking experiments between the query protein and a set of arbitrary protein probes. The ArDock methodology is derived from large-scale cross-docking studies in which randomly chosen proteins were observed to dock in a non-random way at protein surfaces. The method predicts the interaction site of the protein, or alternate interfaces in the case of proteins with multiple interaction modes. The server takes a protein structure as input and computes a score for each surface residue. Its output focuses on the interactive visualization of results and on interoperability with other services.
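The ranking idea described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not ArDock's implementation: the function name `rank_surface_residues`, the input format (one set of contacted residue identifiers per docking pose), and the use of a plain contact frequency as the score are all hypothetical.

```python
from collections import Counter


def rank_surface_residues(surface_residues, pose_interfaces):
    """Rank residues by how often they fall in a docking interface.

    surface_residues: iterable of residue identifiers (hypothetical format).
    pose_interfaces: list of sets, one per docking pose, holding the
        residues in contact with the probe in that pose.
    Returns (residue, score) pairs sorted by descending score, where the
    score is the fraction of poses in which the residue is at the interface.
    """
    surface = list(surface_residues)
    surface_set = set(surface)
    counts = Counter()
    for interface in pose_interfaces:
        # Count each surface residue at most once per pose.
        counts.update(set(interface) & surface_set)
    n_poses = len(pose_interfaces) or 1  # avoid division by zero
    scores = {res: counts[res] / n_poses for res in surface}
    # Highest score first; ties broken by identifier for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy data (made up): four surface residues and three docking poses.
poses = [{"A10", "A11"}, {"A11", "A12"}, {"A11"}]
ranking = rank_surface_residues(["A10", "A11", "A12", "A13"], poses)
```

In this toy run the residue contacted in every pose ranks first and the residue never contacted ranks last; a real scoring scheme may weight poses or probes differently.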
Affiliation(s)
- Sébastien Reille
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
- Mélanie Garnier
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
- Xavier Robert
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
- Patrice Gouet
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
- Juliette Martin
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
- Guillaume Launay
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
- To whom correspondence should be addressed. Tel: +33 437 652 936; Fax: +33 472 722 601;
54. Gil N, Fajardo EJ, Fiser A. Discovery of receptor-ligand interfaces in the immunoglobulin superfamily. Proteins 2019; 88:135-142. [PMID: 31298437 DOI: 10.1002/prot.25778] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/09/2019] [Revised: 06/21/2019] [Accepted: 07/06/2019] [Indexed: 12/13/2022]
Abstract
Cell-surface-anchored immunoglobulin superfamily (IgSF) proteins are widespread throughout the human proteome, forming crucial components of diverse biological processes including immunity, cell-cell adhesion, and carcinogenesis. IgSF proteins generally function through protein-protein interactions carried out between extracellular, membrane-bound proteins on adjacent cells, known as trans-binding interfaces. These protein-protein interactions constitute a class of pharmaceutical targets important in the treatment of autoimmune diseases, chronic infections, and cancer. A molecular-level understanding of IgSF protein-protein interactions would greatly benefit further drug development. A critical step toward this goal is the reliable identification of IgSF trans-binding interfaces. We propose a novel combination of structure and sequence information to identify trans-binding interfaces in IgSF proteins. We developed a structure-based binding interface prediction approach that can identify broad regions of the protein surface that encompass the binding interfaces and suggests that IgSF proteins possess binding supersites. These interfaces could theoretically be pinpointed using sequence-based conservation analysis, with performance approaching the theoretical upper limit of binding interface prediction accuracy, but achieving this in practice is limited by the current ability to identify an appropriate multiple sequence alignment for conservation analysis. However, an important contribution of combining the two orthogonal methods is that agreement between these approaches can estimate the reliability of the predictions. This approach was benchmarked on the set of 22 IgSF proteins with experimentally solved structures in complex with their ligands. Additionally, we provide structure-based predictions and reliability scores for the 62 IgSF proteins with known structure but yet uncharacterized binding interfaces.
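The idea that agreement between two independent predictors can serve as a reliability estimate can be illustrated with a simple overlap measure. The abstract does not say how agreement is quantified, so the Jaccard index used here, the function name `agreement_score`, and the residue-number inputs are assumptions made purely for illustration.

```python
def agreement_score(structure_pred, conservation_pred):
    """Jaccard overlap between two predicted interface residue sets.

    Returns a value in [0, 1]; higher values mean the structure-based and
    conservation-based predictions agree on more residues. This metric is
    an assumption, not the reliability score defined in the paper.
    """
    a, b = set(structure_pred), set(conservation_pred)
    union = a | b
    if not union:
        return 0.0  # nothing predicted by either method
    return len(a & b) / len(union)


# Toy residue numbers (made up): two of four distinct residues are shared.
score = agreement_score({1, 2, 3}, {2, 3, 4})
```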
Affiliation(s)
- Nelson Gil
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York; Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York
- Eduardo J Fajardo
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York; Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York
- Andras Fiser
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York; Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York
55. Wang X, Yu B, Ma A, Chen C, Liu B, Ma Q. Protein-protein interaction sites prediction by ensemble random forests with synthetic minority oversampling technique. Bioinformatics 2019; 35:2395-2402. [PMID: 30520961 PMCID: PMC6612859 DOI: 10.1093/bioinformatics/bty995] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Received: 06/24/2018] [Revised: 11/19/2018] [Accepted: 12/03/2018] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: The prediction of protein-protein interaction (PPI) sites is key to mutation design, catalytic reaction and the reconstruction of PPI networks. It is a challenging task given the abundance of sequences and the class imbalance among samples. RESULTS: A new ensemble learning-based method, Ensemble Learning of synthetic minority oversampling technique (SMOTE) for Unbalancing samples and RF algorithm (EL-SMURF), was proposed for PPI sites prediction in this study. The sequence profile feature and the residue evolution rates were combined for feature extraction of neighboring residues using a sliding window, and SMOTE was applied to oversample interface residues in the feature space to address the imbalance problem. The Multi-dimensional Scaling feature selection method was implemented to reduce feature redundancy and select a feature subset. Finally, Random Forest classifiers were applied to build the ensemble learning model, and the optimal feature vectors were fed into EL-SMURF to predict PPI sites. Validation of EL-SMURF on two independent datasets showed 77.1% and 77.7% accuracy, which were 6.2-15.7% and 6.1-18.9% higher than other existing tools, respectively. AVAILABILITY AND IMPLEMENTATION: The source codes and data used in this study are publicly available at http://github.com/QUST-AIBBDRC/EL-SMURF/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
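Two of the steps named in this abstract, sliding-window feature extraction and SMOTE oversampling of the minority class, can be sketched in plain Python. This is a toy illustration and not the EL-SMURF code: the function names, the zero padding at sequence ends, the window size, and the feature values are assumptions, and the feature selection and random forest stages are omitted.

```python
import random


def sliding_window_features(per_residue, window=3, pad=0.0):
    """Concatenate each residue's features with those of its neighbors.

    per_residue: list of equal-length feature vectors, one per residue.
    window is assumed to be odd; sequence ends are padded with `pad`.
    """
    half = window // 2
    dim = len(per_residue[0])
    padding = [[pad] * dim for _ in range(half)]
    padded = padding + list(per_residue) + padding
    features = []
    for i in range(len(per_residue)):
        row = []
        for vec in padded[i:i + window]:
            row.extend(vec)
        features.append(row)
    return features


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by linear interpolation.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Sort the remaining samples by squared Euclidean distance to base.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


# Toy per-residue features (made up): four residues, two features each.
profile = [[0.1, 0.9], [0.4, 0.2], [0.8, 0.5], [0.3, 0.3]]
X = sliding_window_features(profile, window=3)
interface = [X[1], X[2]]  # pretend residues 1 and 2 are interface residues
new_points = smote(interface, n_new=3, k=1)
```

Oversampling is applied in the windowed feature space, matching the order described in the abstract; in practice it should be applied only to training data so that synthetic points never leak into validation sets.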
Affiliation(s)
- Xiaoying Wang
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, China
- School of Mathematics, Shandong University, Jinan, China
- Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, China
- Bin Yu
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, China
- Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, China
- School of Life Sciences, University of Science and Technology of China, Hefei, China
- Anjun Ma
- Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, USA
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
- Cheng Chen
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, China
- Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, China
- Bingqiang Liu
- School of Mathematics, Shandong University, Jinan, China
- Qin Ma
- Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, USA
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
56. Dequeker C, Laine E, Carbone A. Decrypting protein surfaces by combining evolution, geometry, and molecular docking. Proteins 2019; 87:952-965. [PMID: 31199528 PMCID: PMC6852240 DOI: 10.1002/prot.25757] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 01/17/2019] [Revised: 05/09/2019] [Accepted: 06/07/2019] [Indexed: 01/30/2023]
Abstract
The growing body of experimental and computational data describing how proteins interact with each other has emphasized the multiplicity of protein interactions and the complexity underlying protein surface usage and deformability. In this work, we propose new concepts and methods toward deciphering such complexity. We introduce the notion of interacting region to account for the multiple usage of a protein's surface residues by several partners and for the variability of protein interfaces coming from molecular flexibility. We predict interacting patches by crossing evolutionary, physicochemical and geometrical properties of the protein surface with information coming from complete cross-docking (CC-D) simulations. We show that our predictions match interacting regions well and that the different sources of information are complementary. We further propose an indicator of whether a protein has a few or many partners. Our prediction strategies are implemented in the dynJET2 algorithm and assessed on a new dataset of 262 proteins on which we performed CC-D. The code and the data are available at: http://www.lcqb.upmc.fr/dynJET2/.
Affiliation(s)
- Chloé Dequeker
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, France
- Elodie Laine
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, France
- Alessandra Carbone
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, France; Institut Universitaire de France (IUF), Paris, France
57. Formylated N-terminal methionine is absent from the Mycoplasma hyopneumoniae proteome: Implications for translation initiation. Int J Med Microbiol 2019; 309:288-298. [PMID: 31126750 DOI: 10.1016/j.ijmm.2019.03.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/02/2018] [Revised: 02/28/2019] [Accepted: 03/17/2019] [Indexed: 12/31/2022] Open
Abstract
N-terminal methionine excision (NME) is a proteolytic pathway that cleaves the N-termini of proteins, a process that influences where proteins localise in the cell and their turnover rates. In bacteria, protein biosynthesis is initiated by formylated methionine start tRNA (fMet-tRNAfMet). The formyl group is attached by formyltransferase (FMT) and is subsequently removed by peptide deformylase (PDF) in most but not all proteins. Methionine aminopeptidase then cleaves deformylated methionine to complete the process. Components of NME, particularly PDF, are promising therapeutic targets for bacterial pathogens. In Mycoplasma hyopneumoniae, a genome-reduced, major respiratory pathogen of swine, pdf and fmt are absent from the genome. Our bioinformatic analysis uncovered additional enzymes involved in formylated N-terminal methionine (fnMet) processing that are missing in fourteen mycoplasma species, including M. hyopneumoniae but not Mycoplasma pneumoniae, a major respiratory pathogen of humans. Consistent with our bioinformatic studies, an analysis of in-house tryptic peptide libraries confirmed the absence of fnMet in M. hyopneumoniae proteins but, as expected, fnMet peptides were detected in the proteome of M. pneumoniae. Additionally, computational molecular modelling of M. hyopneumoniae translation initiation factors reveals structural and sequence differences in areas known to interact with fMet-tRNAfMet. Our data suggest that some mycoplasmas have evolved a translation process that does not require fnMet.
58. Kolvenbach CM, Dworschak GC, Frese S, Japp AS, Schuster P, Wenzlitschke N, Yilmaz Ö, Lopes FM, Pryalukhin A, Schierbaum L, van der Zanden LFM, Kause F, Schneider R, Taranta-Janusz K, Szczepańska M, Pawlaczyk K, Newman WG, Beaman GM, Stuart HM, Cervellione RM, Feitz WFJ, van Rooij IALM, Schreuder MF, Steffens M, Weber S, Merz WM, Feldkötter M, Hoppe B, Thiele H, Altmüller J, Berg C, Kristiansen G, Ludwig M, Reutter H, Woolf AS, Hildebrandt F, Grote P, Zaniew M, Odermatt B, Hilger AC. Rare Variants in BNC2 Are Implicated in Autosomal-Dominant Congenital Lower Urinary-Tract Obstruction. Am J Hum Genet 2019; 104:994-1006. [PMID: 31051115 PMCID: PMC6506863 DOI: 10.1016/j.ajhg.2019.03.023] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 09/21/2018] [Accepted: 03/22/2019] [Indexed: 12/29/2022] Open
Abstract
Congenital lower urinary-tract obstruction (LUTO) is caused by anatomical blockage of the bladder outflow tract or by functional impairment of urinary voiding. About three out of 10,000 pregnancies are affected. Although several monogenic causes of functional obstruction have been defined, it is unknown whether congenital LUTO caused by anatomical blockage has a monogenic cause. Exome sequencing in a family with four affected individuals with anatomical blockage of the urethra identified a rare nonsense variant (c.2557C>T [p.Arg853∗]) in BNC2, encoding basonuclin 2, tracking with LUTO over three generations. Re-sequencing BNC2 in 697 individuals with LUTO revealed three further independent missense variants in three unrelated families. In human and mouse embryogenesis, basonuclin 2 was detected in lower urinary-tract rudiments. In zebrafish embryos, bnc2 was expressed in the pronephric duct and cloaca, analogs of the mammalian lower urinary tract. Experimental knockdown of Bnc2 in zebrafish caused pronephric-outlet obstruction and cloacal dilatation, phenocopying human congenital LUTO. Collectively, these results support the conclusion that variants in BNC2 are strongly implicated in LUTO etiology as a result of anatomical blockage.
Affiliation(s)
- Caroline M Kolvenbach
- Department of Pediatrics, Children's Hospital, University Hospital Bonn, 53113 Bonn, Germany; Institute of Anatomy, University of Bonn, 53115 Bonn, Germany; Division of Nephrology, Department of Medicine, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, United States
- Gabriel C Dworschak
- Department of Pediatrics, Children's Hospital, University Hospital Bonn, 53113 Bonn, Germany; Institute of Anatomy, University of Bonn, 53115 Bonn, Germany; Institute of Human Genetics, University of Bonn, 53127 Bonn, Germany
- Sandra Frese
- Department of Pediatrics, Children's Hospital, University Hospital Bonn, 53113 Bonn, Germany; Institute of Human Genetics, University of Bonn, 53127 Bonn, Germany
- Anna S Japp
- Institute of Neuropathology, University of Bonn Medical Center, 53127 Bonn, Germany
- Peggy Schuster
- Institute of Cardiovascular Regeneration, Center for Molecular Medicine, Goethe University, 60439 Frankfurt am Main, Germany
- Nina Wenzlitschke
- Institute of Cardiovascular Regeneration, Center for Molecular Medicine, Goethe University, 60439 Frankfurt am Main, Germany
- Öznur Yilmaz
- Institute of Anatomy, University of Bonn, 53115 Bonn, Germany
- Filipa M Lopes
- Division of Cell Matrix and Regenerative Medicine, School of Biological Sciences, Faculty of Biology, Medicine, and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester M13 9PT, United Kingdom
- Alexey Pryalukhin
- Institute of Pathology, University Hospital Bonn, 53127 Bonn, Germany
- Luca Schierbaum
- Department of Pediatrics, Children's Hospital, University Hospital Bonn, 53113 Bonn, Germany; Institute of Human Genetics, University of Bonn, 53127 Bonn, Germany
- Loes F M van der Zanden
- Radboud Institute for Health Sciences, Department for Health Evidence, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Franziska Kause
- Department of Pediatrics, Children's Hospital, University Hospital Bonn, 53113 Bonn, Germany; Division of Nephrology, Department of Medicine, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, United States
- Ronen Schneider
- Division of Nephrology, Department of Medicine, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, United States
- Katarzyna Taranta-Janusz
- Department of Pediatrics and Nephrology, Medical University of Białystok, 15-089 Białystok, Poland
- Maria Szczepańska
- Department and Clinics of Pediatrics, School of Medicine with the Division of Dentistry in Zabrze, Medical University of Silesia in Katowice, 40-055 Zabrze, Poland
- Krzysztof Pawlaczyk
- Department of Nephrology, Transplantology, and Internal Medicine, Poznan University of Medical Sciences, 61-701 Poznan, Poland
- William G Newman
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine, and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester M13 9PT, United Kingdom
- Glenda M Beaman
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine, and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester M13 9PT, United Kingdom
- Helen M Stuart
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine, and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester M13 9PT, United Kingdom
- Raimondo M Cervellione
- Paediatric Urology, Royal Manchester Children's Hospital, Central Manchester University Hospitals NHS Foundation Trust, Manchester M13 9WL, United Kingdom
- Wouter F J Feitz
- Department of Urology, Pediatric Urology, Radboudumc Amalia Children's Hospital, 6525 GA Nijmegen, the Netherlands
- Iris A L M van Rooij
- Radboud Institute for Health Sciences, Department for Health Evidence, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands; Department of Surgery-Pediatric Surgery, Radboudumc Amalia Children's Hospital, 6525 GA Nijmegen, the Netherlands
- Michiel F Schreuder
- Department of Pediatric Nephrology, Radboud University Medical Center, Radboud Institute for Molecular Life Sciences, Amalia Children's Hospital, 6525 GA Nijmegen, the Netherlands
- Stefanie Weber
- Department of Pediatrics, University Hospital Marburg, 35037 Marburg, Germany
- Waltraut M Merz
- Department of Obstetrics and Prenatal Medicine, University of Bonn, 53127 Bonn, Germany
- Markus Feldkötter
- Division of Pediatric Nephrology, Department of Pediatrics, University Hospital Bonn, 53129 Bonn, Germany
- Bernd Hoppe
- Division of Pediatric Nephrology, Department of Pediatrics, University Hospital Bonn, 53129 Bonn, Germany
- Holger Thiele
- Cologne Center for Genomics, University of Cologne, 50391 Cologne, Germany
- Janine Altmüller
- Cologne Center for Genomics, University of Cologne, 50391 Cologne, Germany; Center for Molecular Medicine Cologne, University of Cologne, 50391 Cologne, Germany
- Christoph Berg
- Department of Obstetrics and Prenatal Medicine, University of Bonn, 53127 Bonn, Germany
- Glen Kristiansen
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine, and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester M13 9PT, United Kingdom
- Michael Ludwig
- Department of Clinical Chemistry and Clinical Pharmacology, University of Bonn, 53127 Bonn, Germany
- Heiko Reutter
- Institute of Human Genetics, University of Bonn, 53127 Bonn, Germany; Department of Neonatology and Pediatric Intensive Care, Children's Hospital, University of Bonn, 53127 Bonn, Germany
- Adrian S Woolf
- Division of Cell Matrix and Regenerative Medicine, School of Biological Sciences, Faculty of Biology, Medicine, and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester M13 9PT, United Kingdom
- Friedhelm Hildebrandt
- Division of Nephrology, Department of Medicine, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, United States
- Phillip Grote
- Institute of Cardiovascular Regeneration, Center for Molecular Medicine, Goethe University, 60439 Frankfurt am Main, Germany
- Marcin Zaniew
- Department of Pediatrics, University of Zielona Góra, 56-417 Zielona Góra, Poland
- Benjamin Odermatt
- Institute of Anatomy, University of Bonn, 53115 Bonn, Germany; Institute of Neuro-Anatomy, University of Bonn, 53115 Bonn, Germany.
- Alina C Hilger
- Department of Pediatrics, Children's Hospital, University Hospital Bonn, 53113 Bonn, Germany; Institute of Human Genetics, University of Bonn, 53127 Bonn, Germany.
59. Zhang X, Lin X, Zhao J, Huang Q, Xu X. Efficiently Predicting Hot Spots in PPIs by Combining Random Forest and Synthetic Minority Over-Sampling Technique. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:774-781. [PMID: 33156780 DOI: 10.1109/tcbb.2018.2871674] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 06/11/2023]
Abstract
Hot spot residues play a vital role in bioinformatics applications such as drug design. However, current datasets are predominantly composed of non-hot spots, with only a tiny percentage of hot spots. Conventional hot spot prediction methods may therefore face great challenges from imbalanced training samples. This paper presents a classification method that combines random forest classification with an oversampling strategy to improve training performance. The oversampling strategy is used to generate hot spot data to balance the given training set. Random forest classification is then invoked to generate a set of trees for this oversampled training set. The final prediction performance can be computed recursively after the oversampling and training process. The proposed method randomly selects features and constructs a robust random forest to avoid overfitting the training set. Experimental results from three data sets indicate that the performance of hot spot prediction is significantly improved compared with existing classification methods.
60. Improved measures for evolutionary conservation that exploit taxonomy distances. Nat Commun 2019; 10:1556. [PMID: 30952844 PMCID: PMC6450959 DOI: 10.1038/s41467-019-09583-2] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 04/19/2018] [Accepted: 03/19/2019] [Indexed: 11/30/2022] Open
Abstract
Selective pressures on protein-coding regions that provide fitness advantages can lead to the regions' fixation and conservation in genome duplications and speciation events. Consequently, conservation analyses relying on sequence similarities are exploited by a myriad of applications across all biosciences to identify functionally important protein regions. While very potent, existing conservation measures based on multiple sequence alignments are so pervasive that improvements to solutions of many problems have become incremental. We introduce a new framework for evolutionary conservation with measures that exploit taxonomy distances across species. Results show that our taxonomy-based framework comfortably outperforms existing conservation measures in identifying deleterious variants observed in the human population, including variants located in non-abundant sequence domains such as intrinsically disordered regions. The predictive power of our approach emphasizes that the phenotypic effects of sequence variants can be taxonomy-level specific and thus, conservation needs to be interpreted accordingly. Information on protein sequence variability and conservation can be leveraged to identify functionally important regions. Here, the authors develop new conservation measures that exploit taxonomy distances and LIST, a tool for predicting deleteriousness of human variants.
61. Emamjomeh A, Choobineh D, Hajieghrari B, MahdiNezhad N, Khodavirdipour A. DNA-protein interaction: identification, prediction and data analysis. Mol Biol Rep 2019; 46:3571-3596. [PMID: 30915687 DOI: 10.1007/s11033-019-04763-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/22/2018] [Accepted: 03/14/2019] [Indexed: 12/30/2022]
Abstract
Life in living organisms depends on specific and purposeful interactions between molecules. Such interactions make the various processes inside cells and organisms possible. Among all types of molecular interactions, DNA-protein interactions are of considerable importance. With the development of numerous experimental techniques, diverse methods are now available for recognizing and investigating such interactions. Traditional experimental techniques for identifying DNA-protein complexes are time-consuming and unsuitable for genome-scale studies; current high-throughput approaches determine such interactions more efficiently at large scale, but they are too costly to be practical for routine use. Hence, the availability of abundant information on biological sequences and on the conditions under which such interactions form, together with developments in computing, mathematics, and statistics, has motivated scientists to develop bioinformatics tools for predicting interaction sites. There has been much progress in this field. This review addresses the factors and conditions governing the interaction and the laboratory techniques for examining it. In addition, the bioinformatics tools developed for this purpose are introduced and compared and, finally, several suggestions are offered for improving the precision of such tools.
Affiliation(s)
- Abbasali Emamjomeh
- Laboratory of Computational Biotechnology and Bioinformatics (CBB), Department of Plant Breeding and Biotechnology (PBB), University of Zabol, Zabol, 98615-538, Iran.
- Darush Choobineh
- Agricultural Biotechnology, Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
- Behzad Hajieghrari
- Department of Agricultural Biotechnology, College of Agriculture, Jahrom University, Jahrom, 74135-111, Iran.
- Nafiseh MahdiNezhad
- Laboratory of Computational Biotechnology and Bioinformatics (CBB), Department of Plant Breeding and Biotechnology (PBB), University of Zabol, Zabol, 98615-538, Iran
- Amir Khodavirdipour
- Division of Human Genetics, Department of Anatomy, St. John's hospital, Bangalore, India
62. Lipinski S, Petersen BS, Barann M, Piecyk A, Tran F, Mayr G, Jentzsch M, Aden K, Stengel ST, Klostermeier UC, Sheth V, Ellinghaus D, Rausch T, Korbel JO, Nothnagel M, Krawczak M, Gilissen C, Veltman JA, Forster M, Forster P, Lee CC, Fritscher-Ravens A, Schreiber S, Franke A, Rosenstiel P. Missense variants in NOX1 and p22phox in a case of very-early-onset inflammatory bowel disease are functionally linked to NOD2. Cold Spring Harb Mol Case Stud 2019; 5:mcs.a002428. [PMID: 30709874 PMCID: PMC6371741 DOI: 10.1101/mcs.a002428] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/08/2017] [Accepted: 10/29/2018] [Indexed: 02/07/2023] Open
Abstract
Whole-genome and whole-exome sequencing of individual patients allow the study of rare and potentially causative genetic variation. In this study, we sequenced DNA of a trio comprising a boy with very-early-onset inflammatory bowel disease (veoIBD) and his unaffected parents. We identified a rare, X-linked missense variant in the NADPH oxidase NOX1 gene (c.C721T, p.R241C) in heterozygous state in the mother and in hemizygous state in the patient. We discovered that, in addition, the patient was homozygous for a common missense variant in the CYBA gene (c.T214C, p.Y72H). CYBA encodes the p22phox protein, a cofactor for NOX1. Functional assays revealed reduced cellular ROS generation and antibacterial capacity of NOX1 and p22phox variants in intestinal epithelial cells. Moreover, the identified NADPH oxidase complex variants affected NOD2-mediated immune responses, and p22phox was identified as a novel NOD2 interactor. In conclusion, we detected missense variants in a veoIBD patient that disrupt the host response to bacterial challenges and reduce protective innate immune signaling via NOD2. We assume that the patient's individual genetic makeup favored disturbed intestinal mucosal barrier function.
Affiliation(s)
- Simone Lipinski
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany
| | - Britt-Sabina Petersen
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany
| | - Matthias Barann
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany
| | - Agnes Piecyk
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany
| | - Florian Tran
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany.,Department of General Internal Medicine, Christian-Albrechts-University, University Hospital Schleswig-Holstein, 24105 Kiel, Germany
| | - Gabriele Mayr
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany
| | - Marlene Jentzsch
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany
| | - Konrad Aden
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany.,Department of General Internal Medicine, Christian-Albrechts-University, University Hospital Schleswig-Holstein, 24105 Kiel, Germany
| | - Stephanie T Stengel
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany
| | - Ulrich C Klostermeier
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany
| | - Vrunda Sheth
- Life Technologies, Beverly, Massachusetts 01915, USA
| | - David Ellinghaus
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany
| | - Tobias Rausch
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, 69117 Heidelberg, Germany
| | - Jan O Korbel
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, 69117 Heidelberg, Germany
| | - Michael Nothnagel
- Institute of Medical Informatics and Statistics (IMIS), Christian-Albrechts University, 24105 Kiel, Germany
| | - Michael Krawczak
- Institute of Medical Informatics and Statistics (IMIS), Christian-Albrechts University, 24105 Kiel, Germany
| | - Christian Gilissen
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behavior, Radboud University Medical Center, Nijmegen 6525, The Netherlands
| | - Joris A Veltman
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behavior, Radboud University Medical Center, Nijmegen 6525, The Netherlands.,Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne NE1 3BZ, United Kingdom
| | - Michael Forster
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany
| | - Peter Forster
- Murray Edwards College, University of Cambridge, Cambridge CB3 0DF, United Kingdom
| | - Clarence C Lee
- Department of General Internal Medicine, Christian-Albrechts-University, University Hospital Schleswig-Holstein, 24105 Kiel, Germany
| | - Annette Fritscher-Ravens
- Department of General Internal Medicine, Christian-Albrechts-University, University Hospital Schleswig-Holstein, 24105 Kiel, Germany
| | - Stefan Schreiber
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany.,Department of General Internal Medicine, Christian-Albrechts-University, University Hospital Schleswig-Holstein, 24105 Kiel, Germany
| | - Andre Franke
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany
| | - Philip Rosenstiel
- Institute of Clinical Molecular Biology (IKMB), Christian-Albrechts-University, 24105 Kiel, Germany
|
63
|
Bakhman A, Rabinovich E, Shlamkovich T, Papo N, Kosloff M. Residue-level determinants of angiopoietin-2 interactions with its receptor Tie2. Proteins 2018; 87:185-197. [PMID: 30520519 DOI: 10.1002/prot.25638] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/03/2018] [Revised: 09/04/2018] [Accepted: 11/29/2018] [Indexed: 11/11/2022]
Abstract
We combined computational and experimental methods to interrogate the binding determinants of angiopoietin-2 (Ang2) to its receptor tyrosine kinase (RTK) Tie2, a central signaling system in angiogenesis, inflammation, and tumorigenesis. We used physics-based electrostatic and surface-area calculations to identify the subset of interfacial Ang2 and Tie2 residues that can affect binding directly. Using random and site-directed mutagenesis and yeast surface display (YSD), we validated these predictions and identified additional Ang2 positions that affected receptor binding. We then used burial-based calculations to classify the larger set of Ang2 residues that are buried in the Ang2 core, whose mutations can perturb the Ang2 structure and thereby affect interactions with Tie2 indirectly. Our analysis showed that the Ang2-Tie2 interface is dominated by nonpolar contributions, with only three Ang2 and two Tie2 residues that contribute electrostatically to intermolecular interactions. Individual interfacial residues contributed only moderately to binding, suggesting that engineering of this interface will require multiple mutations to reach major effects. Conversely, substitutions in substantially buried Ang2 residues were more prevalent in our experimental screen, reduced binding substantially, and are therefore more likely to have a deleterious effect that might contribute to oncogenesis. Computational analysis of additional RTK-ligand complexes, c-Kit-SCF and M-CSF-c-FMS, and comparison to previous YSD results, further show the utility of our combined methodology.
Affiliation(s)
- Anna Bakhman
- Department of Human Biology, Faculty of Natural Sciences, University of Haifa, Haifa, Israel
| | - Eitan Rabinovich
- Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Tomer Shlamkovich
- Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Niv Papo
- Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Mickey Kosloff
- Department of Human Biology, Faculty of Natural Sciences, University of Haifa, Haifa, Israel
|
64
|
The novel EHEC gene asa overlaps the TEGT transporter gene in antisense and is regulated by NaCl and growth phase. Sci Rep 2018; 8:17875. [PMID: 30552341 PMCID: PMC6294744 DOI: 10.1038/s41598-018-35756-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 08/20/2018] [Accepted: 11/08/2018] [Indexed: 12/02/2022] Open
Abstract
Only a few overlapping gene pairs are known in the best-analyzed bacterial model organism Escherichia coli. Automatic annotation programs usually annotate only one out of six reading frames at a locus, allowing only small overlaps between protein-coding sequences. However, both RNAseq and RIBOseq show signals corresponding to non-trivially overlapping reading frames in antisense to annotated genes, which may constitute protein-coding genes. The transcription and translation of the novel 264 nt gene asa, which overlaps in antisense to a putative TEGT (Testis-Enhanced Gene Transfer) transporter gene is detected in pathogenic E. coli, but not in two apathogenic E. coli strains. The gene in E. coli O157:H7 (EHEC) was further analyzed. An overexpression phenotype was identified in two stress conditions, i.e. excess in salt or arginine. For this, EHEC overexpressing asa was grown competitively against EHEC with a translationally arrested asa mutant gene. RT-qPCR revealed conditional expression dependent on growth phase, sodium chloride, and arginine. Two potential promoters were computationally identified and experimentally verified by reporter gene expression and determination of the transcription start site. The protein Asa was verified by Western blot. Close homologues of asa have not been found in protein databases, but bioinformatic analyses showed that it may be membrane associated, having a largely disordered structure.
|
65
|
Compound heterozygous mutations in IL10RA combined with a complement factor properdin mutation in infantile-onset inflammatory bowel disease. Eur J Gastroenterol Hepatol 2018; 30:1491-1496. [PMID: 30199474 DOI: 10.1097/meg.0000000000001247] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/31/2022]
Abstract
OBJECTIVES: Inflammatory bowel diseases (IBDs) are chronic and multifactorial diseases resulting from a complex interaction of host genetic factors and environmental stimuli. Although many genome-wide association studies have identified host genetic factors associated with IBD, rare Mendelian forms of IBD have been reported in patients with very early onset forms. Therefore, this study aimed to identify genetic variants associated with infantile-onset IBD. PARTICIPANTS AND METHODS: We obtained genomic DNA from whole blood samples of a male patient with infantile-onset IBD and nonconsanguineous Korean parents. Whole-exome sequencing was performed using trio samples. Then, we analyzed the data using susceptibility genes for monogenic forms of IBD and various immunodeficiencies and protein structural analysis. RESULTS: The patient who presented with oral aphthous ulcers at the age of 14 days suffered from severe colitis and was refractory to medical treatment. Compound heterozygous mutations in IL10RA (p.R101W; p.T179T) were found in the patient. In addition, a hemizygous mutation in complement factor properdin (CFP) (p.L456V) located on the X-chromosome was detected, inherited from the patient's mother. Protein structural modeling suggested impaired properdin subunit interactions by p.L456V that may hamper protein oligomerization required for complement activation. CONCLUSION: This study identified compound heterozygous mutations in IL10RA combined with a hemizygous CFP mutation in infantile-onset IBD by using whole-exome sequencing. CFP p.L456V may exacerbate symptoms of infantile-onset IBD by disturbing oligomerization of properdin.
|
66
|
Macalino SJY, Basith S, Clavio NAB, Chang H, Kang S, Choi S. Evolution of In Silico Strategies for Protein-Protein Interaction Drug Discovery. Molecules 2018; 23:E1963. [PMID: 30082644 PMCID: PMC6222862 DOI: 10.3390/molecules23081963] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Received: 07/17/2018] [Revised: 08/03/2018] [Accepted: 08/04/2018] [Indexed: 12/14/2022] Open
Abstract
The advent of advanced molecular modeling software, big data analytics, and high-speed processing units has led to the exponential evolution of modern drug discovery and better insights into complex biological processes and disease networks. This has progressively steered current research interests to understanding protein-protein interaction (PPI) systems that are related to a number of relevant diseases, such as cancer, neurological illnesses, and metabolic disorders. However, targeting PPIs is challenging due to their "undruggable" binding interfaces. In this review, we focus on the current obstacles that impede PPI drug discovery, and how recent discoveries and advances in in silico approaches can alleviate these barriers to expedite the search for potential leads, as shown in several exemplary studies. We will also discuss currently available information on PPI compounds and systems, along with their usefulness in molecular modeling. Finally, we conclude by presenting the limits of in silico application in drug discovery and offer a perspective in the field of computer-aided PPI drug discovery.
Affiliation(s)
- Stephani Joy Y Macalino
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.
| | - Shaherin Basith
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.
| | - Nina Abigail B Clavio
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.
| | - Hyerim Chang
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.
| | - Soosung Kang
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.
| | - Sun Choi
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.
|
67
|
Vikram T, Kumar P. Analysis of Hepatitis E virus (HEV) X-domain structural model. Bioinformation 2018; 14:398-403. [PMID: 30262978 PMCID: PMC6143357 DOI: 10.6026/97320630014398] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 06/25/2018] [Revised: 06/26/2018] [Accepted: 06/30/2018] [Indexed: 01/22/2023] Open
Abstract
Hepatitis E viral infection is emerging as a global health concern that needs to be addressed. The mechanisms of viral replication and release are attributed to the different genomic components of HEV. However, a few proteins and domains, such as the X and Y domains, remain unexplored, so we aimed to explore the physicochemical, structural and functional features of the HEV ORF-1 X domain. Molecular modeling of the unknown X domain was carried out using Phyre2 and Swiss Model. Active ligand binding sites were predicted using Phyre2. The X-domain protein was found to be stable and acidic in nature, with high thermostability and better hydrophilic property. Twelve binding sites were predicted along with putative transferase and catalytic functional activity. Homology modeling showed 10 binding sites along with Mg2+ and Zn2+ as metallic heterogen ligands binding to predicted ligand-binding sites. This study may help to decipher the role of this unexplored X-domain of HEV, thereby improving our understanding of the pathogenesis of HEV infection.
Affiliation(s)
- Thakur Vikram
- Department of Virology, Postgraduate Institute of Medical Education and Research (PGIMER), Sec-12, Chandigarh, India
| | - Pradeep Kumar
- Faculty of Applied Sciences and Biotechnology, Shoolini University, Solan, (HP) India
|
68
|
Atas H, Tuncbag N, Doğan T. Phylogenetic and Other Conservation-Based Approaches to Predict Protein Functional Sites. Methods Mol Biol 2018; 1762:51-69. [PMID: 29594767 DOI: 10.1007/978-1-4939-7756-7_4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 03/05/2023]
Abstract
Proteins use their functional regions to exploit various activities, including binding to other proteins, nucleic acids, or drugs. Functional sites of the proteins have a tendency to be more conserved than the rest of the protein surface. Therefore, detection of the conserved residues using phylogenetic analysis is a general approach to predict functionally critical residues. In this chapter, we describe some of the available methods to predict functional sites and demonstrate a complete pipeline with tool alternatives at several steps. We explain the standard procedure and all intermediate stages including homology detection with BLAST search, multiple sequence alignment (MSA) and the construction of a phylogenetic tree for a given query sequence. Additionally, we demonstrate the prediction results of these methods on a case study. Finally, we discuss the possible challenges and bottlenecks throughout the pipeline. Our step-by-step description about the functional site prediction could be a helpful resource for the researchers interested in finding protein functional sites, to be used in drug discovery research.
Affiliation(s)
- Heval Atas
- Department of Health Informatics, Graduate School of Informatics, METU, Ankara, 06800, Turkey.,Cancer Systems Biology Laboratory (CanSyL), METU, Ankara, 06800, Turkey
| | - Nurcan Tuncbag
- Department of Health Informatics, Graduate School of Informatics, METU, Ankara, 06800, Turkey.,Cancer Systems Biology Laboratory (CanSyL), METU, Ankara, 06800, Turkey
| | - Tunca Doğan
- Department of Health Informatics, Graduate School of Informatics, METU, Ankara, 06800, Turkey.,Cancer Systems Biology Laboratory (CanSyL), METU, Ankara, 06800, Turkey.,European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
|
69
|
Abstract
The structural modeling of protein complexes by docking simulations has been attracting increasing interest with the rise of proteomics and of the number of experimentally identified binary interactions. Structures of unbound partners, either modeled or experimentally determined, can be used as input to sample as extensively as possible all putative binding modes and single out the most plausible ones. At the scoring step, evolutionary information contained in the joint multiple sequence alignments of both partners can provide key insights to recognize correct interfaces. Here, we describe a computational protocol based on the InterEvDock web server to exploit coevolution constraints in protein-protein docking methods. We provide methodology guidelines to prepare the input protein structures and generate improved alignments. We also explain how to extract and use the information returned by the server through the analysis of two representative examples.
Affiliation(s)
- Aravindan Arun Nadaradjane
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, 91198, Gif-sur-Yvette Cedex, France
| | - Raphael Guerois
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, 91198, Gif-sur-Yvette Cedex, France.
| | - Jessica Andreani
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, 91198, Gif-sur-Yvette Cedex, France.
|
70
|
Qiu Z, Zhou B, Yuan J. Protein–protein interaction site predictions with minimum covariance determinant and Mahalanobis distance. J Theor Biol 2017; 433:57-63. [DOI: 10.1016/j.jtbi.2017.08.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/26/2017] [Revised: 08/26/2017] [Accepted: 08/30/2017] [Indexed: 10/18/2022]
|
71
|
Elongation factor Tu is a multifunctional and processed moonlighting protein. Sci Rep 2017; 7:11227. [PMID: 28894125 PMCID: PMC5593925 DOI: 10.1038/s41598-017-10644-z] [Citation(s) in RCA: 83] [Impact Index Per Article: 10.4] [Received: 05/15/2017] [Accepted: 08/10/2017] [Indexed: 01/10/2023] Open
Abstract
Many bacterial moonlighting proteins were originally described in medically, agriculturally, and commercially important members of the low G + C Firmicutes. We show Elongation factor Tu (Ef-Tu) moonlights on the surface of the human pathogens Staphylococcus aureus (SaEf-Tu) and Mycoplasma pneumoniae (MpnEf-Tu), and the porcine pathogen Mycoplasma hyopneumoniae (MhpEf-Tu). Ef-Tu is also a target of multiple processing events on the cell surface and these were characterised using an N-terminomics pipeline. Recombinant MpnEf-Tu bound strongly to a diverse range of host molecules, and when bound to plasminogen, was able to convert plasminogen to plasmin in the presence of plasminogen activators. Fragments of Ef-Tu retain binding capabilities to host proteins. Bioinformatics and structural modelling studies indicate that the accumulation of positively charged amino acids in short linear motifs (SLiMs), and protein processing promote multifunctional behaviour. Codon bias engendered by an A + T rich genome may influence how positively-charged residues accumulate in SLiMs.
|
72
|
Berry IJ, Jarocki VM, Tacchi JL, Raymond BBA, Widjaja M, Padula MP, Djordjevic SP. N-terminomics identifies widespread endoproteolysis and novel methionine excision in a genome-reduced bacterial pathogen. Sci Rep 2017; 7:11063. [PMID: 28894154 PMCID: PMC5593965 DOI: 10.1038/s41598-017-11296-9] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 05/18/2017] [Accepted: 08/21/2017] [Indexed: 12/12/2022] Open
Abstract
Proteolytic processing alters protein function. Here we present the first systems-wide analysis of endoproteolysis in the genome-reduced pathogen Mycoplasma hyopneumoniae. 669 N-terminal peptides from 164 proteins were identified, demonstrating that functionally diverse proteins are processed, more than half of which 75 (53%) were accessible on the cell surface. Multiple cleavage sites were characterised, but cleavage with arginine in P1 predominated. Putative functions for a subset of cleaved fragments were assigned by affinity chromatography using heparin, actin, plasminogen and fibronectin as bait. Binding affinity was correlated with the number of cleavages in a protein, indicating that novel binding motifs are exposed, and protein disorder increases, after a cleavage event. Glyceraldehyde 3-phosphate dehydrogenase was used as a model protein to demonstrate this. We define the rules governing methionine excision, show that several aminopeptidases are involved, and propose that through processing, genome-reduced organisms can expand protein function.
Affiliation(s)
- Iain J Berry
- The ithree institute, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia.,Proteomics Core Facility, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia
| | - Veronica M Jarocki
- The ithree institute, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia.,Proteomics Core Facility, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia
| | - Jessica L Tacchi
- The ithree institute, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia.,Proteomics Core Facility, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia
| | - Benjamin B A Raymond
- The ithree institute, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia.,Proteomics Core Facility, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia
| | - Michael Widjaja
- The ithree institute, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia.,Proteomics Core Facility, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia
| | - Matthew P Padula
- The ithree institute, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia.,Proteomics Core Facility, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia
| | - Steven P Djordjevic
- The ithree institute, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia.,Proteomics Core Facility, University of Technology Sydney, PO Box 123, Broadway, NSW, 2007, Australia
|
73
|
Bertoni M, Kiefer F, Biasini M, Bordoli L, Schwede T. Modeling protein quaternary structure of homo- and hetero-oligomers beyond binary interactions by homology. Sci Rep 2017; 7:10480. [PMID: 28874689 PMCID: PMC5585393 DOI: 10.1038/s41598-017-09654-8] [Citation(s) in RCA: 545] [Impact Index Per Article: 68.1] [Received: 03/09/2017] [Accepted: 07/28/2017] [Indexed: 01/01/2023] Open
Abstract
Cellular processes often depend on interactions between proteins and the formation of macromolecular complexes. The impairment of such interactions can lead to deregulation of pathways resulting in disease states, and it is hence crucial to gain insights into the nature of macromolecular assemblies. Detailed structural knowledge about complexes and protein-protein interactions is growing, but experimentally determined three-dimensional multimeric assemblies are outnumbered by complexes supported by non-structural experimental evidence. Here, we aim to fill this gap by modeling multimeric structures by homology, only using amino acid sequences to infer the stoichiometry and the overall structure of the assembly. We ask which properties of proteins within a family can assist in the prediction of correct quaternary structure. Specifically, we introduce a description of protein-protein interface conservation as a function of evolutionary distance to reduce the noise in deep multiple sequence alignments. We also define a distance measure to structurally compare homologous multimeric protein complexes. This allows us to hierarchically cluster protein structures and quantify the diversity of alternative biological assemblies known today. We find that a combination of conservation scores, structural clustering, and classical interface descriptors, can improve the selection of homologous protein templates leading to reliable models of protein complexes.
Affiliation(s)
- Martino Bertoni
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland.,Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland
| | - Florian Kiefer
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland.,Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland
| | - Marco Biasini
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland.,Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland
| | - Lorenza Bordoli
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland.,Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland
| | - Torsten Schwede
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland.,Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland
|
74
|
Ma L, Wang DD, Zou B, Yan H. An Eigen-Binding Site Based Method for the Analysis of Anti-EGFR Drug Resistance in Lung Cancer Treatment. IEEE/ACM Trans Comput Biol Bioinform 2017; 14:1187-1194. [PMID: 27187970 DOI: 10.1109/tcbb.2016.2568184] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
We explore the drug resistance mechanism in non-small cell lung cancer treatment by characterizing the drug-binding site of a protein mutant based on local surface and energy features. These features are transformed to an eigen-binding site space and used for drug resistance level prediction and analysis.
|
75
|
Protein binding hot spots prediction from sequence only by a new ensemble learning method. Amino Acids 2017; 49:1773-1785. [DOI: 10.1007/s00726-017-2474-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 11/21/2016] [Accepted: 07/24/2017] [Indexed: 01/31/2023]
|
76
|
Tahir M, Hayat M. Machine learning based identification of protein–protein interactions using derived features of physiochemical properties and evolutionary profiles. Artif Intell Med 2017; 78:61-71. [DOI: 10.1016/j.artmed.2017.06.006] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 03/15/2017] [Revised: 06/09/2017] [Accepted: 06/11/2017] [Indexed: 02/09/2023]
|
77
|
Murakami Y, Tripathi LP, Prathipati P, Mizuguchi K. Network analysis and in silico prediction of protein-protein interactions with applications in drug discovery. Curr Opin Struct Biol 2017; 44:134-142. [PMID: 28364585 DOI: 10.1016/j.sbi.2017.02.005] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Received: 11/26/2016] [Revised: 02/05/2017] [Accepted: 02/23/2017] [Indexed: 11/29/2022]
Abstract
Protein-protein interactions (PPIs) are vital to maintaining cellular homeostasis. Several PPI dysregulations have been implicated in the etiology of various diseases and hence PPIs have emerged as promising targets for drug discovery. Surface residues and hotspot residues at the interface of PPIs form the core regions, which play a key role in modulating cellular processes such as signal transduction and are used as starting points for drug design. In this review, we briefly discuss how PPI networks (PPINs) inferred from experimentally characterized PPI data have been utilized for knowledge discovery and how in silico approaches to PPI characterization can contribute to PPIN-based biological research. Next, we describe the principles of in silico PPI prediction and survey the existing PPI and PPI site prediction servers that are useful for drug discovery. Finally, we discuss the potential of in silico PPI prediction in drug discovery.
Affiliation(s)
- Yoichi Murakami
- National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan.
| | - Lokesh P Tripathi
- National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan.
| | - Philip Prathipati
- National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan
| | - Kenji Mizuguchi
- National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan.
|
78
|
Zhang J, Kurgan L. Review and comparative assessment of sequence-based predictors of protein-binding residues. Brief Bioinform 2017; 19:821-837. [DOI: 10.1093/bib/bbx022] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 12/05/2016] [Indexed: 12/31/2022] Open
Affiliation(s)
- Jian Zhang
- School of Computer and Information Technology, Xinyang Normal University
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
|
79
|
Integrating computational methods and experimental data for understanding the recognition mechanism and binding affinity of protein-protein complexes. Prog Biophys Mol Biol 2017; 128:33-38. [PMID: 28069340 DOI: 10.1016/j.pbiomolbio.2017.01.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 03/30/2016] [Revised: 01/04/2017] [Accepted: 01/05/2017] [Indexed: 01/09/2023]
Abstract
Protein-protein interactions perform several functions inside the cell. Understanding the recognition mechanism and binding affinity of protein-protein complexes is a challenging problem in experimental and computational biology. In this review, we focus on two aspects (i) understanding the recognition mechanism and (ii) predicting the binding affinity. The first part deals with computational techniques for identifying the binding site residues and the contribution of important interactions for understanding the recognition mechanism of protein-protein complexes in comparison with experimental observations. The second part is devoted to the methods developed for discriminating high and low affinity complexes, and predicting the binding affinity of protein-protein complexes using three-dimensional structural information and just from the amino acid sequence. The overall view enhances our understanding of the integration of experimental data and computational methods, recognition mechanism of protein-protein complexes and the binding affinity.
|
80
|
Computational Approaches for Predicting Binding Partners, Interface Residues, and Binding Affinity of Protein-Protein Complexes. Methods Mol Biol 2017; 1484:237-253. [PMID: 27787830 DOI: 10.1007/978-1-4939-6406-2_16] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/20/2022]
Abstract
Studying protein-protein interactions leads to a better understanding of the underlying principles of several biological pathways. Cost and labor-intensive experimental techniques suggest the need for computational methods to complement them. Several such state-of-the-art methods have been reported for analyzing diverse aspects such as predicting binding partners, interface residues, and binding affinity for protein-protein complexes with reliable performance. However, there are specific drawbacks for different methods that indicate the need for their improvement. This review highlights various available computational algorithms for analyzing diverse aspects of protein-protein interactions and endorses the necessity for developing new robust methods for gaining deep insights about protein-protein interactions.
|
81
|
Ren J, Gao F, Wu X, Lu X, Zeng L, Lv J, Su X, Luo H, Ren G. Bph32, a novel gene encoding an unknown SCR domain-containing protein, confers resistance against the brown planthopper in rice. Sci Rep 2016; 6:37645. [PMID: 27876888 PMCID: PMC5120289 DOI: 10.1038/srep37645] [Citation(s) in RCA: 86] [Impact Index Per Article: 9.6] [Received: 06/15/2016] [Accepted: 10/31/2016] [Indexed: 12/03/2022] Open
Abstract
An urgent need exists to identify more brown planthopper (Nilaparvata lugens Stål, BPH) resistance genes, which will allow the development of rice varieties with resistance to BPH to counteract the increased incidence of this pest species. Here, using bioinformatics and DNA sequencing approaches, we identified a novel BPH resistance gene, LOC_Os06g03240 (MSU LOCUS ID), from the rice variety Ptb33 in the interval between the markers RM19291 and RM8072 on the short arm of chromosome 6, where a gene for resistance to BPH was mapped by Jirapong Jairin et al. and renamed as "Bph32". This gene encodes a unique short consensus repeat (SCR) domain protein. Sequence comparison revealed that the Bph32 gene shares 100% sequence identity with its allele in Oryza latifolia. The transgenic introgression of Bph32 into a susceptible rice variety significantly improved resistance to BPH. Expression analysis revealed that Bph32 was highly expressed in the leaf sheaths, where BPH primarily settles and feeds, at 2 and 24 h after BPH infestation, suggesting that Bph32 may inhibit feeding in BPH. Western blotting revealed the presence of Pph (Ptb33) and Tph (TN1) proteins using a Penta-His antibody, and both proteins were insoluble. This study provides information regarding a valuable gene for rice defence against insect pests.
Affiliation(s)
- Juansheng Ren
  - Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, 610066, P.R. China
- Fangyuan Gao
  - Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, 610066, P.R. China
- Xianting Wu
  - Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, 610066, P.R. China
- Xianjun Lu
  - Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, 610066, P.R. China
- Lihua Zeng
  - Sichuan Normal University, Chengdu, 610066, P.R. China
- Jianqun Lv
  - Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, 610066, P.R. China
- Xiangwen Su
  - Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, 610066, P.R. China
- Hong Luo
  - Department of Genetics and Biochemistry, Clemson University, 110 Biosystems Research Complex, Clemson, SC 29634-0318, USA
- Guangjun Ren
  - Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, 610066, P.R. China
|
82
|
Kuo TH, Li KB. Predicting Protein-Protein Interaction Sites Using Sequence Descriptors and Site Propensity of Neighboring Amino Acids. Int J Mol Sci 2016; 17:ijms17111788. [PMID: 27792167 PMCID: PMC5133789 DOI: 10.3390/ijms17111788] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 09/08/2016] [Revised: 10/14/2016] [Accepted: 10/18/2016] [Indexed: 12/17/2022] Open
Abstract
Information about the interface sites of Protein–Protein Interactions (PPIs) is useful for many biological research works. However, despite the advancement of experimental techniques, the identification of PPI sites still remains as a challenging task. Using a statistical learning technique, we proposed a computational tool for predicting PPI interaction sites. As an alternative to similar approaches requiring structural information, the proposed method takes all of the input from protein sequences. In addition to typical sequence features, our method takes into consideration that interaction sites are not randomly distributed over the protein sequence. We characterized this positional preference using protein complexes with known structures, proposed a numerical index to estimate the propensity and then incorporated the index into a learning system. The resulting predictor, without using structural information, yields an area under the ROC curve (AUC) of 0.675, recall of 0.597, precision of 0.311 and accuracy of 0.583 on a ten-fold cross-validation experiment. This performance is comparable to the previous approach in which structural information was used. Upon introducing the B-factor data to our predictor, we demonstrated that the AUC can be further improved to 0.750. The tool is accessible at http://bsaltools.ym.edu.tw/predppis.
Affiliation(s)
- Tzu-Hao Kuo
  - Institute of Biomedical Informatics, National Yang-Ming University, Taipei 112, Taiwan.
- Kuo-Bin Li
  - Institute of Biomedical Informatics, National Yang-Ming University, Taipei 112, Taiwan.
  - Office of Information Management, National Yang-Ming University Hospital, Yilan 260, Taiwan.
|
83
|
Phylogenomic analysis supports a recent change in nitrate assimilation in the White-nose Syndrome pathogen, Pseudogymnoascus destructans. Fungal Ecol 2016. [DOI: 10.1016/j.funeco.2016.04.010] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 12/18/2022]
|
84
|
Wei ZS, Han K, Yang JY, Shen HB, Yu DJ. Protein–protein interaction sites prediction by ensembling SVM and sample-weighted random forests. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.02.022] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Indexed: 01/04/2023]
|
85
|
Keskin O, Tuncbag N, Gursoy A. Predicting Protein–Protein Interactions from the Molecular to the Proteome Level. Chem Rev 2016; 116:4884-909. [PMID: 27074302 DOI: 10.1021/acs.chemrev.5b00683] [Citation(s) in RCA: 221] [Impact Index Per Article: 24.6] [Indexed: 01/17/2023]
Affiliation(s)
- Nurcan Tuncbag
  - Graduate School of Informatics, Department of Health Informatics, Middle East Technical University, 06800 Ankara, Turkey
|
86
|
Esmaielbeiki R, Krawczyk K, Knapp B, Nebel JC, Deane CM. Progress and challenges in predicting protein interfaces. Brief Bioinform 2016; 17:117-31. [PMID: 25971595 PMCID: PMC4719070 DOI: 10.1093/bib/bbv027] [Citation(s) in RCA: 85] [Impact Index Per Article: 9.4] [Received: 01/29/2015] [Revised: 03/18/2015] [Indexed: 12/31/2022] Open
Abstract
The majority of biological processes are mediated via protein-protein interactions. Determination of residues participating in such interactions improves our understanding of molecular mechanisms and facilitates the development of therapeutics. Experimental approaches to identifying interacting residues, such as mutagenesis, are costly and time-consuming and thus, computational methods for this purpose could streamline conventional pipelines. Here we review the field of computational protein interface prediction. We make a distinction between methods which address proteins in general and those targeted at antibodies, owing to the radically different binding mechanism of antibodies. We organize the multitude of currently available methods hierarchically based on required input and prediction principles to provide an overview of the field.
|
87
|
Fellner L, Simon S, Scherling C, Witting M, Schober S, Polte C, Schmitt-Kopplin P, Keim DA, Scherer S, Neuhaus K. Evidence for the recent origin of a bacterial protein-coding, overlapping orphan gene by evolutionary overprinting. BMC Evol Biol 2015; 15:283. [PMID: 26677845 PMCID: PMC4683798 DOI: 10.1186/s12862-015-0558-z] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 08/19/2015] [Accepted: 12/06/2015] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND: Gene duplication is believed to be the classical way to form novel genes, but overprinting may be an important alternative. Overprinting allows entirely novel proteins to evolve de novo, i.e., formerly non-coding open reading frames within functional genes become expressed. Only three cases have been described for Escherichia coli. Here, a fourth example is presented.
RESULTS: RNA sequencing revealed an open reading frame weakly transcribed in cow dung, coding for 101 residues and embedded completely in the -2 reading frame of citC in enterohemorrhagic E. coli. This gene is designated novel overlapping gene, nog1. The promoter region fused to gfp exhibits specific activities and 5' rapid amplification of cDNA ends indicated the transcriptional start 40-bp upstream of the start codon. nog1 was strand-specifically arrested in translation by a nonsense mutation silent in citC. This Nog1-mutant showed a phenotype in competitive growth against wild type in the presence of MgCl2. Small differences in metabolite concentrations were also found. Bioinformatic analyses propose Nog1 to be inner membrane-bound and to possess at least one membrane-spanning domain. A phylogenetic analysis suggests that the orphan gene nog1 arose by overprinting after Escherichia/Shigella separated from the other γ-proteobacteria.
CONCLUSIONS: Since nog1 is of recent origin, non-essential, short, weakly expressed and only marginally involved in E. coli's central metabolism, we propose that this gene is in an initial stage of evolution. While we present specific experimental evidence for the existence of a fourth overlapping gene in enterohemorrhagic E. coli, we believe that this may be an initial finding only and overlapping genes in bacteria may be more common than is currently assumed by microbiologists.
Affiliation(s)
- Lea Fellner
  - Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85350, Freising, Germany.
- Svenja Simon
  - Lehrstuhl für Datenanalyse und Visualisierung, Fachbereich Informatik und Informationswissenschaft, Universität Konstanz, Box 78, 78457, Constance, Germany.
- Christian Scherling
  - Lehrstuhl für Ernährungsphysiologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Gregor-Mendel-Straße 2, D-85354, Freising, Germany.
- Michael Witting
  - Research Unit Analytical BioGeoChemistry, Deutsches Forschungszentrum für Gesundheit und Umwelt GmbH, Helmholtz Zentrum München, Ingolstädter Landstraße 1, 85754, Neuherberg, Germany.
- Steffen Schober
  - Institute of Communications Engineering, Universität Ulm, Albert-Einstein-Allee 43, 89081, Ulm, Germany.
  - Present address: Blue Yonder GmbH, Ohiostraße 8, Karlsruhe, Germany.
- Christine Polte
  - Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85350, Freising, Germany.
  - Present address: Institut für Biochemie und Molekularbiologie, Universität Hamburg, Martin-Luther-King Platz 6, 20146, Hamburg, Germany.
- Philippe Schmitt-Kopplin
  - Research Unit Analytical BioGeoChemistry, Deutsches Forschungszentrum für Gesundheit und Umwelt GmbH, Helmholtz Zentrum München, Ingolstädter Landstraße 1, 85754, Neuherberg, Germany.
- Daniel A Keim
  - Lehrstuhl für Datenanalyse und Visualisierung, Fachbereich Informatik und Informationswissenschaft, Universität Konstanz, Box 78, 78457, Constance, Germany.
- Siegfried Scherer
  - Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85350, Freising, Germany.
- Klaus Neuhaus
  - Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85350, Freising, Germany.
|
88
|
Prediction of Protein-Protein Interaction Sites Based on Naive Bayes Classifier. Biochem Res Int 2015; 2015:978193. [PMID: 26697220 PMCID: PMC4677168 DOI: 10.1155/2015/978193] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 09/14/2015] [Revised: 11/05/2015] [Accepted: 11/12/2015] [Indexed: 11/18/2022] Open
Abstract
Protein functions through interactions with other proteins and biomolecules and these interactions occur on the so-called interface residues of the protein sequences. Identifying interface residues makes us better understand the biological mechanism of protein interaction. Meanwhile, information about the interface residues contributes to the understanding of metabolic, signal transduction networks and indicates directions in drug designing. In recent years, researchers have focused on developing new computational methods for predicting protein interface residues. Here we creatively used a 181-dimension protein sequence feature vector as input to the Naive Bayes Classifier- (NBC-) based method to predict interaction sites in protein-protein complexes interaction. The prediction of interaction sites in protein interactions is regarded as an amino acid residue binary classification problem by applying NBC with protein sequence features. Independent test results suggested that Naive Bayes Classifier-based method with the protein sequence features as input vectors performed well.
|
89
|
Prediction of Protein–Protein Interaction Sites with Machine-Learning-Based Data-Cleaning and Post-Filtering Procedures. J Membr Biol 2015; 249:141-53. [DOI: 10.1007/s00232-015-9856-z] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 08/07/2015] [Accepted: 11/03/2015] [Indexed: 12/12/2022]
|
90
|
Soner S, Ozbek P, Garzon JI, Ben-Tal N, Haliloglu T. DynaFace: Discrimination between Obligatory and Non-obligatory Protein-Protein Interactions Based on the Complex's Dynamics. PLoS Comput Biol 2015; 11:e1004461. [PMID: 26506003 PMCID: PMC4623975 DOI: 10.1371/journal.pcbi.1004461] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 03/30/2015] [Accepted: 07/08/2015] [Indexed: 12/31/2022] Open
Abstract
Protein-protein interfaces have been evolutionarily-designed to enable transduction between the interacting proteins. Thus, we hypothesize that analysis of the dynamics of the complex can reveal details about the nature of the interaction, and in particular whether it is obligatory, i.e., persists throughout the entire lifetime of the proteins, or not. Indeed, normal mode analysis, using the Gaussian network model, shows that for the most part obligatory and non-obligatory complexes differ in their decomposition into dynamic domains, i.e., the mobile elements of the protein complex. The dynamic domains of obligatory complexes often mix segments from the interacting chains, and the hinges between them do not overlap with the interface between the chains. In contrast, in non-obligatory complexes the interface often hinges between dynamic domains, held together through few anchor residues on one side of the interface that interact with their counterpart grooves in the other end. In automatic analysis, 117 of 139 obligatory (84.2%) and 203 of 246 non-obligatory (82.5%) complexes are correctly classified by our method: DynaFace. We further use DynaFace to predict obligatory and non-obligatory interactions among a set of 300 putative protein complexes. DynaFace is available at: http://safir.prc.boun.edu.tr/dynaface.
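The classification rates quoted in this abstract can be rechecked from the stated counts. The snippet below is illustrative arithmetic only: the pooled figure is derived here from the abstract's counts and is not reported by the authors, and the variable names are hypothetical.

```python
# Counts quoted in the DynaFace abstract: (correctly classified, total).
OBLIGATORY = (117, 139)
NON_OBLIGATORY = (203, 246)


def percent(correct, total):
    """Return correct/total as a percentage rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


obligatory_pct = percent(*OBLIGATORY)
non_obligatory_pct = percent(*NON_OBLIGATORY)

# Derived here, not stated in the abstract: accuracy pooled over both classes.
pooled_pct = percent(
    OBLIGATORY[0] + NON_OBLIGATORY[0],
    OBLIGATORY[1] + NON_OBLIGATORY[1],
)

print(obligatory_pct, non_obligatory_pct, pooled_pct)
```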
Affiliation(s)
- Seren Soner
  - Department of Computer Engineering and Polymer Research Center, Bogazici University, Istanbul, Turkey
- Pemra Ozbek
  - Department of Bioengineering, Marmara University, Istanbul, Turkey
- Jose Ignacio Garzon
  - Departments of Biochemistry and Molecular Biophysics and Systems Biology and Howard Hughes Medical Institute, Columbia University, New York, New York, United States of America
- Nir Ben-Tal
  - Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Turkan Haliloglu
  - Department of Chemical Engineering and Polymer Research Center, Bogazici University, Istanbul, Turkey
|
91
|
Xia B, Zhang H, Li Q, Li T. PETs: A Stable and Accurate Predictor of Protein-Protein Interacting Sites Based on Extremely-Randomized Trees. IEEE Trans Nanobioscience 2015; 14:882-93. [PMID: 26529772 DOI: 10.1109/tnb.2015.2491303] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/06/2022]
Abstract
Protein-protein interaction (PPI) plays crucial roles in the performance of various biological processes. A variety of methods are dedicated to identify whether proteins have interaction residues, but it is often more crucial to recognize each amino acid. In practical applications, the stability of a prediction model is as important as its accuracy. However, random sampling, which is widely used in previous prediction models, often brings large difference between each training model. In this paper, a Predictor of protein-protein interaction sites based on Extremely-randomized Trees (PETs) is proposed to improve the prediction accuracy while maintaining the prediction stability. In PETs, a cluster-based sampling strategy is proposed to ensure the model stability: first, the training dataset is divided into subsets using specific features; second, the subsets are clustered using K-means; and finally the samples are selected from each cluster. Using the proposed sampling strategy, samples which have different types of significant features could be selected independently from different clusters. The evaluation shows that PETs is able to achieve better accuracy while maintaining a good stability. The source code and toolkit are available at https://github.com/BinXia/PETs.
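The three-step sampling strategy described in this abstract (split into subsets, cluster each subset with K-means, draw samples from every cluster) can be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions, not the PETs implementation: `subset_key`, `per_cluster`, the fixed iteration count, and the seeded random initialisation are all hypothetical choices made for the example.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_based_sample(samples, subset_key, k, per_cluster, seed=0):
    """Return sorted indices of samples drawn from every cluster of every subset.

    subset_key is a hypothetical function mapping a sample to a subset id;
    the abstract only says subsets are formed "using specific features".
    """
    rng = random.Random(seed)
    subsets = defaultdict(list)
    for idx, sample in enumerate(samples):
        subsets[subset_key(sample)].append(idx)

    chosen = []
    for key in sorted(subsets):
        indices = subsets[key]
        points = [samples[i] for i in indices]
        labels = kmeans(points, min(k, len(points)), seed=seed)
        clusters = defaultdict(list)
        for idx, label in zip(indices, labels):
            clusters[label].append(idx)
        for label in sorted(clusters):
            members = clusters[label]
            chosen.extend(rng.sample(members, min(per_cluster, len(members))))
    return sorted(chosen)
```

Because every cluster contributes to the selection, repeated runs draw from the same regions of feature space, which is the stability argument the abstract makes against plain random sampling.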
|
92
|
Chen Y, Wang XM, Zhou L, He Y, Wang D, Qi YH, Jiang DA. Rubisco Activase Is Also a Multiple Responder to Abiotic Stresses in Rice. PLoS One 2015; 10:e0140934. [PMID: 26479064 PMCID: PMC4610672 DOI: 10.1371/journal.pone.0140934] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 06/01/2015] [Accepted: 09/30/2015] [Indexed: 11/19/2022] Open
Abstract
Ribulose-1,5-bisphosphate carboxylase/oxygenase activase (RCA) is a nuclear gene that encodes a chloroplast protein that plays an important role in photosynthesis. Some reports have indicated that it may play a role in acclimation to different abiotic stresses. In this paper, we analyzed the stress-responsive elements in the 2.0 kb 5’-upstream regions of the RCA gene promoter and the primary, secondary and tertiary structure of the protein. We identified some cis-elements of multiple stress-related components in the RCA promoter. Amino acid and evolution analyses showed that the RCA protein had conserved regions between different species; however, the size and type varied. The secondary structures, binding sites and tertiary structures of the RCA proteins were also different. This might reflect the differences in the transcription and translation levels of the two RCA isoforms during adaptation to different abiotic stresses. Although both the transcription and translation levels of RCA isoforms in the rice leaves increased under various stresses, the large isoform was increased more significantly in the chloroplast stroma and thylakoid. It can be concluded that RCA, especially RCAL, is also a multiple responder to abiotic stresses in rice, which provides new insights into RCA functions.
Affiliation(s)
- Yue Chen
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Xiao-Man Wang
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Li Zhou
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Yi He
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Dun Wang
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Yan-Hua Qi
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- De-An Jiang
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
|
93
|
Hou Q, Dutilh BE, Huynen MA, Heringa J, Feenstra KA. Sequence specificity between interacting and non-interacting homologs identifies interface residues--a homodimer and monomer use case. BMC Bioinformatics 2015; 16:325. [PMID: 26449222 PMCID: PMC4599308 DOI: 10.1186/s12859-015-0758-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 03/30/2015] [Accepted: 09/30/2015] [Indexed: 11/17/2022] Open
Abstract
Background: Protein families participating in protein-protein interactions may contain sub-families that have different binding characteristics, ranging from tight binding to showing no interaction at all. Composition differences at the sequence level in these sub-families are often decisive to their differential functional interaction. Methods to predict interface sites from protein sequences typically exploit conservation as a signal. Here, instead, we provide proof of concept that the sequence specificity between interacting versus non-interacting groups can be exploited to recognise interaction sites.
Results: We collected homodimeric and monomeric proteins and formed homologous groups, each having an interacting (homodimer) subgroup and a non-interacting (monomer) subgroup. We then compiled multiple sequence alignments of the proteins in the homologous groups and identified compositional differences between the homodimeric and monomeric subgroups for each of the alignment positions. Our results show that this specificity signal distinguishes interface and other surface residues with 40.9% recall and up to 25.1% precision.
Conclusions: To our best knowledge, this is the first large scale study that exploits sequence specificity between interacting and non-interacting homologs to predict interaction sites from sequence information only. The performance obtained indicates that this signal contains valuable information to identify protein-protein interaction sites.
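The idea of scoring each alignment position by how differently the two subgroups use amino acids can be illustrated with a small sketch. The abstract does not name the measure the authors used, so the total variation distance below is a stand-in chosen for simplicity, and the function names and toy sequences are hypothetical.

```python
from collections import Counter


def column_frequencies(seqs, col):
    """Relative residue frequencies in one alignment column, ignoring gaps ('-')."""
    counts = Counter(s[col] for s in seqs if s[col] != "-")
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()} if total else {}


def specificity_scores(interacting, non_interacting):
    """Score every column by how much the two subgroups differ in composition.

    Assumption: total variation distance between the two frequency
    distributions (0 = identical composition, 1 = disjoint residue sets).
    Columns that are all gaps in either subgroup score 0.0.
    """
    length = len(interacting[0])
    scores = []
    for col in range(length):
        p = column_frequencies(interacting, col)
        q = column_frequencies(non_interacting, col)
        if not p or not q:
            scores.append(0.0)
            continue
        residues = set(p) | set(q)
        scores.append(0.5 * sum(abs(p.get(aa, 0.0) - q.get(aa, 0.0)) for aa in residues))
    return scores


# Toy aligned subgroups: only the middle column differs between the groups.
homodimers = ["AKL", "AKL"]
monomers = ["AEL", "AEL"]
print(specificity_scores(homodimers, monomers))
```

High-scoring columns would be the candidate interface positions under this simplified scheme; a conservation-based predictor would instead flag the first and last columns, which is the contrast the abstract draws.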
Affiliation(s)
- Qingzhen Hou
  - Center for Integrative Bioinformatics VU (IBIVU), Vrije University Amsterdam, De Boelelaan 1081A, 1081 HV, Amsterdam, The Netherlands.
- Bas E Dutilh
  - Theoretical Biology and Bioinformatics, Utrecht University, Padualaan 8, 3584 CH, Utrecht, The Netherlands.
  - Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 28, 6525 GA, Nijmegen, The Netherlands.
  - Department of Marine Biology, Institute of Biology, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil.
- Martijn A Huynen
  - Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 28, 6525 GA, Nijmegen, The Netherlands.
- Jaap Heringa
  - Center for Integrative Bioinformatics VU (IBIVU), Vrije University Amsterdam, De Boelelaan 1081A, 1081 HV, Amsterdam, The Netherlands.
- K Anton Feenstra
  - Center for Integrative Bioinformatics VU (IBIVU), Vrije University Amsterdam, De Boelelaan 1081A, 1081 HV, Amsterdam, The Netherlands.
|
94
|
Wei ZS, Yang JY, Shen HB, Yu DJ. A Cascade Random Forests Algorithm for Predicting Protein-Protein Interaction Sites. IEEE Trans Nanobioscience 2015; 14:746-60. [DOI: 10.1109/tnb.2015.2475359] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Indexed: 01/25/2023]
|
95
|
Calvete O, Martinez P, Garcia-Pavia P, Benitez-Buelga C, Paumard-Hernández B, Fernandez V, Dominguez F, Salas C, Romero-Laorden N, Garcia-Donas J, Carrillo J, Perona R, Triviño JC, Andrés R, Cano JM, Rivera B, Alonso-Pulpon L, Setien F, Esteller M, Rodriguez-Perales S, Bougeard G, Frebourg T, Urioste M, Blasco MA, Benítez J. A mutation in the POT1 gene is responsible for cardiac angiosarcoma in TP53-negative Li-Fraumeni-like families. Nat Commun 2015; 6:8383. [PMID: 26403419 DOI: 10.1038/ncomms9383] [Citation(s) in RCA: 127] [Impact Index Per Article: 12.7] [Received: 04/06/2015] [Accepted: 08/14/2015] [Indexed: 12/30/2022] Open
Abstract
Cardiac angiosarcoma (CAS) is a rare malignant tumour whose genetic basis is unknown. Here we show, by whole-exome sequencing of a TP53-negative Li-Fraumeni-like (LFL) family including CAS cases, that a missense variant (p.R117C) in POT1 (protection of telomeres 1) gene is responsible for CAS. The same gene alteration is found in two other LFL families with CAS, supporting the causal effect of the identified mutation. We extend the analysis to TP53-negative LFL families with no CAS and find the same mutation in a breast AS family. The mutation is recently found once in 121,324 studied alleles in ExAC server but it is not described in any other database or found in 1,520 Spanish controls. In silico structural analysis suggests how the mutation disrupts POT1 structure. Functional and in vitro studies demonstrate that carriers of the mutation show reduced telomere-bound POT1 levels, abnormally long telomeres and increased telomere fragility.
Collapse
Affiliation(s)
- Oriol Calvete
- Human Genetics Group, Spanish National Cancer Research Center (CNIO), Melchor Fernandez Almagro 3, Madrid 28029, Spain.,Center for Biomedical Network Research on Rare Diseases (CIBERER), Madrid 28029, Spain
| | - Paula Martinez
- Telomeres and Telomerase Group, Spanish National Cancer Research Center (CNIO), Madrid 28029, Spain
| | - Pablo Garcia-Pavia
- Department of Cardiology. Hospital Universitario Puerta de Hierro, Mahadahonda, Madrid 28222, Spain.,Department of Cardiovascular Development and Repair, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid 28029, Spain
| | - Carlos Benitez-Buelga
- Human Genetics Group, Spanish National Cancer Research Center (CNIO), Melchor Fernandez Almagro 3, Madrid 28029, Spain
| | - Beatriz Paumard-Hernández
- Human Genetics Group, Spanish National Cancer Research Center (CNIO), Melchor Fernandez Almagro 3, Madrid 28029, Spain
| | - Victoria Fernandez
- Human Genetics Group, Spanish National Cancer Research Center (CNIO), Melchor Fernandez Almagro 3, Madrid 28029, Spain
| | - Fernando Dominguez
- Department of Cardiology. Hospital Universitario Puerta de Hierro, Mahadahonda, Madrid 28222, Spain
| | - Clara Salas
- Department of Pathology. Hospital Universitario Puerta de Hierro Majadahonda, Madrid 28222, Spain
| | - Nuria Romero-Laorden
- Oncology Department, Clara Campal Comprehensive Cancer Center, Sanchinarro, Madrid 28050, Spain
| | - Jesus Garcia-Donas
- Oncology Department, Clara Campal Comprehensive Cancer Center, Sanchinarro, Madrid 28050, Spain
| | - Jaime Carrillo
- Department of Experimental Models of Human Disease. Instituto Investigaciones Biomédicas (CSIC/UAM), Madrid 28029, Spain
| | - Rosario Perona
- Center for Biomedical Network Research on Rare Diseases (CIBERER), Madrid 28029, Spain.,Department of Experimental Models of Human Disease. Instituto Investigaciones Biomédicas (CSIC/UAM), Madrid 28029, Spain
| | | | - Raquel Andrés
- Medical Oncology Service, Hospital Universitario Lozano Blesa, Zaragoza 50009, Spain
| | - Juana María Cano
- Medical Oncology Service, Hospital General de Ciudad Real, Ciudad Real 13005, Spain
| | - Bárbara Rivera
- Familial Cancer Clinical Unit, Spanish National Cancer Research Center (CNIO), Madrid 28029, Spain
| | - Luis Alonso-Pulpon
- Department of Cardiology. Hospital Universitario Puerta de Hierro, Mahadahonda, Madrid 28222, Spain
| | - Fernando Setien
- Cancer Epigenetics and Biology Program (PEBC), Bellvitge Biomedical Research Institute (IDIBELL), Barcelona 08908, Spain
| | - Manel Esteller
- Cancer Epigenetics and Biology Program (PEBC), Bellvitge Biomedical Research Institute (IDIBELL), Barcelona 08908, Spain.,Department of Physiological Sciences II, School of Medicine, University of Barcelona, Barcelona 08007, Spain.,Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona 08010, Spain
| | | | - Gaelle Bougeard
- Genetics Department, Rouen University Hospital, Rouen 76000, France
| | - Tierry Frebourg
- Genetics Department, Rouen University Hospital, Rouen 76000, France
| | - Miguel Urioste
- Center for Biomedical Network Research on Rare Diseases (CIBERER), Madrid 28029, Spain.,Familial Cancer Clinical Unit, Spanish National Cancer Research Center (CNIO), Madrid 28029, Spain
| | - Maria A Blasco
- Telomeres and Telomerase Group, Spanish National Cancer Research Center (CNIO), Madrid 28029, Spain
| | - Javier Benítez
- Human Genetics Group, Spanish National Cancer Research Center (CNIO), Melchor Fernandez Almagro 3, Madrid 28029, Spain.,Center for Biomedical Network Research on Rare Diseases (CIBERER), Madrid 28029, Spain
| |
|
96
|
Mosaic composition of ribA and wspB genes flanking the virB8-D4 operon in the Wolbachia supergroup B-strain, wStr. Arch Microbiol 2015; 198:53-69. [PMID: 26400107 PMCID: PMC4705124 DOI: 10.1007/s00203-015-1154-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/27/2015] [Revised: 09/09/2015] [Accepted: 09/14/2015] [Indexed: 01/28/2023]
Abstract
The obligate intracellular bacterium, Wolbachia pipientis (Rickettsiales), is a widespread, vertically transmitted endosymbiont of filarial nematodes and arthropods. In insects, Wolbachia modifies reproduction, and in mosquitoes, infection interferes with replication of arboviruses, bacteria and plasmodia. Development of Wolbachia as a tool to control pest insects will be facilitated by an understanding of molecular events that underlie genetic exchange between Wolbachia strains. Here, we used nucleotide sequence, transcriptional and proteomic analyses to evaluate expression levels and establish the mosaic nature of genes flanking the T4SS virB8-D4 operon from wStr, a supergroup B-strain from a planthopper (Hemiptera) that maintains a robust, persistent infection in an Aedes albopictus mosquito cell line. Based on protein abundance, ribA, which contains promoter elements at the 5′-end of the operon, is weakly expressed. The 3′-end of the operon encodes an intact wspB, which encodes an outer membrane protein and is co-transcribed with the vir genes. WspB and vir proteins are expressed at similar, above average abundance levels. In wStr, both ribA and wspB are mosaics of conserved sequence motifs from Wolbachia supergroup A- and B-strains, and wspB is nearly identical to its homolog from wCobU4-2, an A-strain from weevils (Coleoptera). We describe conserved repeated sequence elements that map within or near pseudogene lesions and transitions between A- and B-strain motifs. These studies contribute to ongoing efforts to explore interactions between Wolbachia and its host cell in an in vitro system.
|
97
|
Koberg S, Mohamed MDA, Faulhaber K, Neve H, Heller KJ. Identification and characterization of cis- and trans-acting elements involved in prophage induction in Streptococcus thermophilus J34. Mol Microbiol 2015; 98:535-52. [PMID: 26193959 DOI: 10.1111/mmi.13140] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Accepted: 07/17/2015] [Indexed: 11/29/2022]
Abstract
The genetic switch region of temperate Streptococcus thermophilus phage TP-J34 contains two divergently oriented promoters and several predicted operator sites. It separates lytic cycle-promoting genes from those promoting lysogeny. A polycistronic transcript comprises the genes coding for repressor Crh, metalloproteinase-motif protein Rir and superinfection exclusion lipoprotein Ltp. Weak promoters effecting monocistronic transcripts were localized for ltp and int (encoding integrase) by Northern blot and 5'-RACE-PCR. These transcripts appeared in lysogenic as well as lytic state. A polycistronic transcript comprising genes coh (encoding Cro homolog), ant (encoding putative antirepressor), orf7, orf8 and orf9 was only detected in the lytic state. Four operator sites, of which three were located in the intergenic regions between crh and coh, and one between coh and ant, were identified by competition electromobility shift assays. Cooperative binding of Crh to two operator sites immediately upstream of coh could be demonstrated. Coh was shown to bind to the operator closest to crh only. Oligomerization was proven by cross-linking Crh by glutaraldehyde. Knock-out of rir revealed a key role in prophage induction. Rir and Crh were shown to form a complex in solution and Rir prevented binding of Crh to its operator sites.
Affiliation(s)
- Sabrina Koberg
- Department of Microbiology and Biotechnology, Max Rubner-Institut (Federal Research Institute of Nutrition and Food), Kiel, Germany
| | - Mazhar Desouki Ali Mohamed
- Department of Microbiology and Biotechnology, Max Rubner-Institut (Federal Research Institute of Nutrition and Food), Kiel, Germany
| | - Katharina Faulhaber
- Department of Microbiology and Biotechnology, Max Rubner-Institut (Federal Research Institute of Nutrition and Food), Kiel, Germany
| | - Horst Neve
- Department of Microbiology and Biotechnology, Max Rubner-Institut (Federal Research Institute of Nutrition and Food), Kiel, Germany
| | - Knut J Heller
- Department of Microbiology and Biotechnology, Max Rubner-Institut (Federal Research Institute of Nutrition and Food), Kiel, Germany
| |
|
98
|
Biogeography of Nocardiopsis strains from hypersaline environments of Yunnan and Xinjiang Provinces, western China. Sci Rep 2015; 5:13323. [PMID: 26289784 PMCID: PMC4542603 DOI: 10.1038/srep13323] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 12/29/2014] [Accepted: 07/23/2015] [Indexed: 12/19/2022] Open
Abstract
The genus Nocardiopsis is a widespread group within the phylum Actinobacteria and has been isolated from various salty environments worldwide. However, little is known about whether biogeography affects Nocardiopsis distribution in various hypersaline environments. Such information is essential for understanding the ecology of Nocardiopsis. Here we analyzed 16S rRNA, gyrB, rpoB and sodA genes of 78 Nocardiopsis strains isolated from hypersaline environments in Yunnan and Xinjiang Provinces of western China. The obtained Nocardiopsis strains were classified into five operational taxonomic units, each comprising location-specific phylo- and genotypes. Statistical analyses showed that spatial distance and environmental factors substantially influenced Nocardiopsis distribution in hypersaline environments: the former had stronger influence at large spatial scales, whereas the latter was more influential at small spatial scales.
|
99
|
Thompson CM, Visick KL. Assessing the function of STAS domain protein SypA in Vibrio fischeri using a comparative analysis. Front Microbiol 2015; 6:760. [PMID: 26284045 PMCID: PMC4517449 DOI: 10.3389/fmicb.2015.00760] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 05/28/2015] [Accepted: 07/13/2015] [Indexed: 01/15/2023] Open
Abstract
Colonization of the squid Euprymna scolopes by Vibrio fischeri requires biofilm formation dependent on the 18-gene symbiosis polysaccharide locus, syp. One key regulator, SypA, controls biofilm formation by an as-yet unknown mechanism; however, it is known that SypA itself is regulated by SypE. Biofilm-proficient strains form wrinkled colonies on solid media, while sypA mutants form biofilm-defective smooth colonies. To begin to understand the function of SypA, we used comparative analyses and mutagenesis approaches. sypA (and the syp locus) is conserved in other Vibrios, including two food-borne human pathogens, Vibrio vulnificus (rbdA) and Vibrio parahaemolyticus (sypAVP). We found that both homologs could complement the biofilm defect of the V. fischeri sypA mutant, but their phenotypes varied depending on the biofilm-inducing conditions used. Furthermore, while SypAVP retained an ability to be regulated by SypE, RbdA was resistant to this control. To better understand SypA function, we examined the biofilm-promoting ability of a number of mutant SypA proteins with substitutions in conserved residues, and found many that were biofilm-defective. The most severe biofilm-defective phenotypes occurred when changes were made to a conserved stretch of amino acids within a predicted α-helix of SypA; we hypothesize that this region of SypA may interact with another protein to promote biofilm formation. Finally, we identified a residue required for negative control by SypE. Together, our data provide insights into the function of this key biofilm regulator and suggest that the SypA orthologs may play similar roles in their native Vibrio species.
Affiliation(s)
- Cecilia M Thompson
- Department of Microbiology and Immunology, Loyola University Chicago, Maywood, IL USA
| | - Karen L Visick
- Department of Microbiology and Immunology, Loyola University Chicago, Maywood, IL USA
| |
|
100
|
Abstract
Elucidating the effects of naturally occurring genetic variation is one of the major challenges for personalized health and personalized medicine. Here, we introduce SNAP2, a novel neural network based classifier that improves over the state-of-the-art in distinguishing between effect and neutral variants. Our method's improved performance results from screening many potentially relevant protein features and from refining our development data sets. Cross-validated on >100k experimentally annotated variants, SNAP2 significantly outperformed other methods, attaining a two-state accuracy (effect/neutral) of 83%. SNAP2 also outperformed combinations of other methods. Performance increased for human variants but much more so for other organisms. Our method's carefully calibrated reliability index informs selection of variants for experimental follow up, with the most strongly predicted half of all effect variants predicted at over 96% accuracy. As expected, the evolutionary information from automatically generated multiple sequence alignments gave the strongest signal for the prediction. However, we also optimized our new method to perform surprisingly well even without alignments. This feature reduces prediction runtime by over two orders of magnitude, enables cross-genome comparisons, and renders our new method as the best solution for the 10-20% of sequence orphans. SNAP2 is available at: https://rostlab.org/services/snap2web
|