1
Manavi F, Sharma A, Sharma R, Tsunoda T, Shatabda S, Dehzangi I. CNN-Pred: Prediction of single-stranded and double-stranded DNA-binding protein using convolutional neural networks. Gene X 2023; 853:147045. [PMID: 36503892 DOI: 10.1016/j.gene.2022.147045] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/21/2022] [Revised: 10/10/2022] [Accepted: 11/08/2022] [Indexed: 11/27/2022] Open
Abstract
DNA-binding proteins play a vital role in biological activities including DNA replication, DNA packing, and DNA repair. DNA-binding proteins can be classified into single-stranded DNA-binding proteins (SSBs) or double-stranded DNA-binding proteins (DSBs). Determining whether a protein is a DSB or an SSB helps determine the protein's function. Therefore, many studies have been conducted in recent years to accurately identify DSBs and SSBs. Despite the efforts made so far, DSB and SSB prediction performance remains limited. In this study, we propose a new method called CNN-Pred to accurately predict DSBs and SSBs. To build CNN-Pred, we first extract evolutionary-based features in the form of mono-gram and bi-gram profiles using the position-specific scoring matrix (PSSM). We then use a 1D convolutional neural network (CNN) as the classifier on our extracted features. Our results demonstrate that CNN-Pred can enhance the DSB and SSB prediction accuracies by more than 4% on the independent test set compared with previous studies found in the literature. CNN-Pred as a standalone tool and all its source code are publicly available at: https://github.com/MLBC-lab/CNN-Pred.
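The mono-gram and bi-gram profiles mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions (mono-gram as the column-wise mean of the PSSM, bi-gram as the sum over consecutive positions of the product of two column scores); the abstract does not spell out the formulas, so these definitions, the function names, and the toy matrix are assumptions and not taken from the CNN-Pred source code. The CNN classifier itself is not sketched.

```python
from typing import List

Matrix = List[List[float]]  # one row per residue, one column per amino-acid type


def monogram(pssm: Matrix) -> List[float]:
    """Column-wise mean of an L x W PSSM (assumed mono-gram definition)."""
    length = len(pssm)
    width = len(pssm[0])
    return [sum(row[j] for row in pssm) / length for j in range(width)]


def bigram(pssm: Matrix) -> List[float]:
    """Sum of P[i][m] * P[i+1][n] over consecutive rows, flattened row-major
    into W*W features (assumed bi-gram definition)."""
    width = len(pssm[0])
    feats = [0.0] * (width * width)
    for cur, nxt in zip(pssm, pssm[1:]):
        for m in range(width):
            for n in range(width):
                feats[m * width + n] += cur[m] * nxt[n]
    return feats


# Toy 3 x 2 matrix standing in for a real L x 20 PSSM (hypothetical values).
toy_pssm = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mono = monogram(toy_pssm)
bi = bigram(toy_pssm)
```

For a real 20-column PSSM this yields 20 mono-gram and 400 bi-gram values regardless of sequence length, which is what makes such profiles usable as fixed-size classifier input.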
Affiliation(s)
- Farnoush Manavi
  - Computer Science and Engineering and Information Technology Department, Shiraz University, Shiraz, Iran
- Alok Sharma
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan; Institute for Integrated and Intelligent Systems, Griffith University, Nathan, Brisbane, QLD 4111, Australia
- Ronesh Sharma
  - School of Electrical and Electronics Engineering, Fiji National University, Suva, Fiji
- Tatsuhiko Tsunoda
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan; Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo 113-0033, Japan; Laboratory for Medical Science Mathematics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo 113-0033, Japan
- Swakkhar Shatabda
  - Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
- Iman Dehzangi
  - Department of Computer Science, Rutgers University, Camden, NJ, USA; Center for Computational and Integrative Biology, Rutgers University, Camden, USA

2
Welte H, Sinn P, Kovermann M. Fluorine NMR Spectroscopy Enables to Quantify the Affinity Between DNA and Proteins in Cell Lysate. Chembiochem 2021; 22:2973-2980. [PMID: 34390111 PMCID: PMC8596521 DOI: 10.1002/cbic.202100304] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/19/2021] [Revised: 07/30/2021] [Indexed: 11/12/2022]
Abstract
The determination of the binding affinity quantifying the interaction between proteins and nucleic acids is of crucial interest in biological and chemical research. Here, we have made use of site-specific fluorine labeling of the cold shock protein from Bacillus subtilis, BsCspB, enabling direct monitoring of the interaction with single-stranded DNA molecules in cell lysate. High-resolution 19F NMR spectroscopy has been applied to exclusively report on resonance signals arising from the protein under study. We have found that this experimental approach advances the reliable determination of the binding affinity between single-stranded DNA molecules and their target protein in this complex biological environment by intertwining analyses based on NMR chemical shifts, signal heights, line shapes and simulations. We propose that the developed experimental platform offers a potent approach for the identification of binding affinities characterizing intermolecular interactions in native surroundings covering the nano-to-micromolar range, which can even be expanded to in-cell applications in future studies.
Affiliation(s)
- Hannah Welte
  - Department of Chemistry, University of Konstanz, Universitätsstrasse 10, 78467 Konstanz, Germany
- Pia Sinn
  - Department of Chemistry, University of Konstanz, Universitätsstrasse 10, 78467 Konstanz, Germany
- Michael Kovermann
  - Department of Chemistry, University of Konstanz, Universitätsstrasse 10, 78467 Konstanz, Germany

3
Heinemann U, Roske Y. Cold-Shock Domains-Abundance, Structure, Properties, and Nucleic-Acid Binding. Cancers (Basel) 2021; 13:cancers13020190. [PMID: 33430354 PMCID: PMC7825780 DOI: 10.3390/cancers13020190] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 12/22/2020] [Revised: 01/05/2021] [Accepted: 01/06/2021] [Indexed: 02/06/2023] Open
Abstract
Simple Summary: Proteins are composed of compact domains, often of known three-dimensional structure, and natively unstructured polypeptide regions. The abundant cold-shock domain is among the set of canonical nucleic acid-binding domains and conserved from bacteria to man. Proteins containing cold-shock domains serve a large variety of biological functions, which are mostly linked to DNA or RNA binding. These functions include the regulation of transcription, RNA splicing, translation, stability and sequestration. Cold-shock domains have a simple architecture with a conserved surface ideally suited to bind single-stranded nucleic acids. Because the binding is mostly by non-specific molecular interactions which do not involve the sugar-phosphate backbone, cold-shock domains are not strictly sequence-specific and do not discriminate reliably between DNA and RNA. Many, but not all, functions of cold-shock domain proteins in health and disease can be understood based on the physical and structural properties of their cold-shock domains.

Abstract: The cold-shock domain has a deceptively simple architecture but supports a complex biology. It is conserved from bacteria to man and has representatives in all kingdoms of life. Bacterial cold-shock proteins consist of a single cold-shock domain and some, but not all, are induced by cold shock. Cold-shock domains in human proteins are often associated with natively unfolded protein segments and more rarely with other folded domains. Cold-shock proteins and domains share a five-stranded all-antiparallel β-barrel structure and a conserved surface that binds single-stranded nucleic acids, predominantly by stacking interactions between nucleobases and aromatic protein sidechains. This conserved binding mode explains the cold-shock domains’ ability to associate with both DNA and RNA strands and their limited sequence selectivity. The promiscuous DNA and RNA binding provides a rationale for the ability of cold-shock domain-containing proteins to function in transcription regulation and DNA-damage repair as well as in regulating splicing, translation, mRNA stability and RNA sequestration.

4
Tan C, Wang T, Yang W, Deng L. PredPSD: A Gradient Tree Boosting Approach for Single-Stranded and Double-Stranded DNA Binding Protein Prediction. Molecules 2019; 25:molecules25010098. [PMID: 31888057 PMCID: PMC6982935 DOI: 10.3390/molecules25010098] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 12/07/2019] [Revised: 12/20/2019] [Accepted: 12/21/2019] [Indexed: 11/16/2022] Open
Abstract
Interactions between proteins and DNAs play essential roles in many biological processes. DNA binding proteins can be classified into two categories. Double-stranded DNA-binding proteins (DSBs) bind to double-stranded DNA and are involved in a series of cell functions such as gene expression and regulation. Single-stranded DNA-binding proteins (SSBs) are necessary for DNA replication, recombination, and repair and are responsible for binding to the single-stranded DNA. Therefore, the effective classification of DNA-binding proteins is helpful for functional annotations of proteins. In this work, we propose PredPSD, a computational method based on sequence information that accurately predicts SSBs and DSBs. It introduces three novel feature extraction algorithms. In particular, we use the autocross-covariance (ACC) transformation to transform feature matrices into fixed-length vectors. Then, we put the optimal feature subset obtained by the minimal-redundancy-maximal-relevance criterion (mRMR) feature selection algorithm into the gradient tree boosting (GTB). In 10-fold cross-validation based on a benchmark dataset, PredPSD achieves promising performances with an AUC score of 0.956 and an accuracy of 0.912, which are better than those of existing methods. Moreover, our method has significantly improved the prediction accuracy in independent testing. The experimental results show that PredPSD can significantly recognize the binding specificity and differentiate DSBs and SSBs.
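How an auto-cross-covariance (ACC) transformation turns a variable-length feature matrix into a fixed-length vector can be sketched as follows. This uses the textbook covariance form with lags 1..max_lag; the exact normalisation, lag range, and input matrices used by PredPSD are not given in the abstract, so the function name, the `max_lag` parameter, and the toy matrix are assumptions.

```python
from typing import List


def acc_transform(matrix: List[List[float]], max_lag: int) -> List[float]:
    """Auto-cross-covariance of an L x W matrix for lags 1..max_lag.

    Returns max_lag * W * W values: for each lag, the covariance between
    column j1 at position i and column j2 at position i + lag
    (j1 == j2 gives the auto-covariance terms, j1 != j2 the cross terms).
    """
    length = len(matrix)
    width = len(matrix[0])
    if not 0 < max_lag < length:
        raise ValueError("max_lag must be between 1 and L - 1")
    # Centre every column on its mean over the whole sequence.
    means = [sum(row[j] for row in matrix) / length for j in range(width)]
    centered = [[row[j] - means[j] for j in range(width)] for row in matrix]
    feats: List[float] = []
    for lag in range(1, max_lag + 1):
        span = length - lag
        for j1 in range(width):
            for j2 in range(width):
                total = sum(centered[i][j1] * centered[i + lag][j2] for i in range(span))
                feats.append(total / span)
    return feats


# Toy 3 x 2 matrix (hypothetical values) with a single lag.
acc = acc_transform([[1.0, 2.0], [2.0, 4.0], [3.0, 9.0]], max_lag=1)
```

The output length depends only on the number of columns and on `max_lag`, not on the sequence length, which is the property the abstract relies on.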
Affiliation(s)
- Changgeng Tan
  - School of Computer Science and Engineering, Central South University, Changsha 410075, China; (C.T.); (T.W.); (W.Y.)
- Tong Wang
  - School of Computer Science and Engineering, Central South University, Changsha 410075, China; (C.T.); (T.W.); (W.Y.)
- Wenyi Yang
  - School of Computer Science and Engineering, Central South University, Changsha 410075, China; (C.T.); (T.W.); (W.Y.)
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Changsha 410075, China; (C.T.); (T.W.); (W.Y.)
  - School of Software, Xinjiang University, Urumqi 830008, China
  - Correspondence: ; Tel.: +86-731-82539736

5
Caruso IP, Panwalkar V, Coronado MA, Dingley AJ, Cornélio ML, Willbold D, Arni RK, Eberle RJ. Structure and interaction of Corynebacterium pseudotuberculosis cold shock protein A with Y-box single-stranded DNA fragment. FEBS J 2017; 285:372-390. [PMID: 29197185 DOI: 10.1111/febs.14350] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 08/01/2017] [Revised: 11/07/2017] [Accepted: 11/29/2017] [Indexed: 11/28/2022]
Abstract
Cold shock proteins (Csps) function to preserve cell viability at low temperatures by binding to nucleic acids and consequently controlling gene expression. The mesophilic bacterium Corynebacterium pseudotuberculosis (Cp) is the causative agent of caseous lymphadenitis in animals, and infection in livestock is a considerable economic burden worldwide. In this report, the structure of cold shock protein A from Cp (Cp-CspA) and a biochemical analysis of its temperature-dependent interaction with a Y-box ssDNA motif are presented. The Cp-CspA structure contains five β-strands making up a β-barrel fold with 11 hydrophobic core residues and two salt bridges that confer on it a melting temperature of ~54 °C, similar to that of mesophilic Bs-CspB. Chemical shift perturbation analysis revealed that residues in the nucleic acid-binding motifs (RNP 1 and 2) and loop 3 are involved in binding to the Y-box fragment either by direct interaction or by conformational rearrangements remote from the binding region. Fluorescence quenching experiments on Cp-CspA showed that the dissociation constant for Y-box ssDNA binding is nanomolar and that the binding affinity decreases as the temperature increases, indicating that the interaction is enthalpically driven and that hydrogen bonds and van der Waals forces make important contributions to complex stabilization. The Y31 of Cp-CspA is a particular occurrence among Csps from mesophilic bacteria that provides a possible explanation for the higher binding affinity to ssDNA than that observed for Bs-CspB. Anisotropy measurements indicated that the reduction in molecular mobility of Cp-CspA upon Y-box binding is characterized by a cooperative process. Database: Resonance assignment and structural data are available in the Biological Magnetic Resonance Data Bank and Protein Data Bank under accession numbers 26802 and 5O6F, respectively.
Affiliation(s)
- Icaro P Caruso
  - Department of Physics, Multiuser Center for Biomolecular Innovation (CMIB), IBILCE/UNESP, São José do Rio Preto, São Paulo, Brazil
- Vineet Panwalkar
  - Institute of Complex System, Structural Biochemistry (ICS-6), Forschungszentrum Jülich, Germany; Institut für Physikalische Biologie, Heinrich-Heine-Universität Düsseldorf, Universitätsstraße, Germany
- Monika A Coronado
  - Department of Physics, Multiuser Center for Biomolecular Innovation (CMIB), IBILCE/UNESP, São José do Rio Preto, São Paulo, Brazil
- Andrew J Dingley
  - Institute of Complex System, Structural Biochemistry (ICS-6), Forschungszentrum Jülich, Germany; Institut für Physikalische Biologie, Heinrich-Heine-Universität Düsseldorf, Universitätsstraße, Germany
- Marinônio L Cornélio
  - Department of Physics, Multiuser Center for Biomolecular Innovation (CMIB), IBILCE/UNESP, São José do Rio Preto, São Paulo, Brazil
- Dieter Willbold
  - Institute of Complex System, Structural Biochemistry (ICS-6), Forschungszentrum Jülich, Germany; Institut für Physikalische Biologie, Heinrich-Heine-Universität Düsseldorf, Universitätsstraße, Germany
- Raghuvir K Arni
  - Department of Physics, Multiuser Center for Biomolecular Innovation (CMIB), IBILCE/UNESP, São José do Rio Preto, São Paulo, Brazil
- Raphael J Eberle
  - Department of Physics, Multiuser Center for Biomolecular Innovation (CMIB), IBILCE/UNESP, São José do Rio Preto, São Paulo, Brazil

6
Benhalevy D, Biran I, Bochkareva ES, Sorek R, Bibi E. Evidence for a cytoplasmic pool of ribosome-free mRNAs encoding inner membrane proteins in Escherichia coli. PLoS One 2017; 12:e0183862. [PMID: 28841711 PMCID: PMC5571963 DOI: 10.1371/journal.pone.0183862] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 03/16/2017] [Accepted: 08/11/2017] [Indexed: 12/13/2022] Open
Abstract
Translation-independent mRNA localization represents an emerging concept in cell biology. In Escherichia coli, mRNAs encoding integral membrane proteins (MPRs) are targeted to the membrane where they are translated by membrane associated ribosomes and the produced proteins are inserted into the membrane co-translationally. In order to better understand aspects of the biogenesis and localization of MPRs, we investigated their subcellular distribution using cell fractionation, RNA-seq and qPCR. The results show that MPRs are overrepresented in the membrane fraction, as expected, and depletion of the signal recognition particle-receptor, FtsY reduced the amounts of all mRNAs on the membrane. Surprisingly, however, MPRs were also found relatively abundant in the soluble ribosome-free fraction and their amount in this fraction is increased upon overexpression of CspE, which was recently shown to interact with MPRs. CspE also conferred a positive effect on the membrane-expression of integral membrane proteins. We discuss the possibility that the effects of CspE overexpression may link the intriguing subcellular localization of MPRs to the cytosolic ribosome-free fraction with their translation into membrane proteins and that the ribosome-free pool of MPRs may represent a stage during their targeting to the membrane, which precedes translation.
Affiliation(s)
- Daniel Benhalevy
  - Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Ido Biran
  - Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Elena S. Bochkareva
  - Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Rotem Sorek
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Eitan Bibi
  - Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel

7
Yu J, Blom J, Glaeser SP, Jaenicke S, Juhre T, Rupp O, Schwengers O, Spänig S, Goesmann A. A review of bioinformatics platforms for comparative genomics. Recent developments of the EDGAR 2.0 platform and its utility for taxonomic and phylogenetic studies. J Biotechnol 2017; 261:2-9. [PMID: 28705636 DOI: 10.1016/j.jbiotec.2017.07.010] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 03/16/2017] [Revised: 07/06/2017] [Accepted: 07/07/2017] [Indexed: 12/12/2022]
Abstract
The rapid development of next generation sequencing technology has greatly increased the amount of available microbial genomes. As a result of this development, there is a rising demand for fast and automated approaches in analyzing these genomes in a comparative way. Whole genome sequencing also bears a huge potential for obtaining a higher resolution in phylogenetic and taxonomic classification. During the last decade, several software tools and platforms have been developed in the field of comparative genomics. In this manuscript, we review the most commonly used platforms and approaches for ortholog group analyses with a focus on their potential for phylogenetic and taxonomic research. Furthermore, we describe the latest improvements of the EDGAR platform for comparative genome analyses and present recent examples of its application for the phylogenomic analysis of different taxa. Finally, we illustrate the role of the EDGAR platform as part of the BiGi Center for Microbial Bioinformatics within the German network on Bioinformatics Infrastructure (de.NBI).
Affiliation(s)
- J Yu
  - Int. Research Training Group 1906 (DiDy), Bielefeld University, Bielefeld, 33501, Germany; Bioinformatics and Systems Biology, Justus-Liebig-University Giessen, Giessen, 35392, Germany
- J Blom
  - Bioinformatics and Systems Biology, Justus-Liebig-University Giessen, Giessen, 35392, Germany
- S P Glaeser
  - Institute of Applied Microbiology, Justus-Liebig-University Giessen, Giessen, 35392, Germany
- S Jaenicke
  - Bioinformatics and Systems Biology, Justus-Liebig-University Giessen, Giessen, 35392, Germany
- T Juhre
  - Bioinformatics and Systems Biology, Justus-Liebig-University Giessen, Giessen, 35392, Germany
- O Rupp
  - Bioinformatics and Systems Biology, Justus-Liebig-University Giessen, Giessen, 35392, Germany
- O Schwengers
  - Bioinformatics and Systems Biology, Justus-Liebig-University Giessen, Giessen, 35392, Germany
- S Spänig
  - Bioinformatics and Systems Biology, Justus-Liebig-University Giessen, Giessen, 35392, Germany
- A Goesmann
  - Bioinformatics and Systems Biology, Justus-Liebig-University Giessen, Giessen, 35392, Germany

8
Wang W, Sun L, Zhang S, Zhang H, Shi J, Xu T, Li K. Analysis and prediction of single-stranded and double-stranded DNA binding proteins based on protein sequences. BMC Bioinformatics 2017; 18:300. [PMID: 28606086 PMCID: PMC5469069 DOI: 10.1186/s12859-017-1715-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 01/09/2017] [Accepted: 06/06/2017] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND DNA-binding proteins perform important functions in a great number of biological activities. DNA-binding proteins can interact with ssDNA (single-stranded DNA) or dsDNA (double-stranded DNA), and DNA-binding proteins can be categorized as single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). The identification of DNA-binding proteins from amino acid sequences can help to annotate protein functions and understand the binding specificity. In this study, we systematically consider a variety of schemes to represent protein sequences: OAAC (overall amino acid composition) features, dipeptide compositions, PSSM (position-specific scoring matrix profiles) and split amino acid composition (SAA), and then we adopt SVM (support vector machine) and RF (random forest) classification model to distinguish SSBs from DSBs. RESULTS Our results suggest that some sequence features can significantly differentiate DSBs and SSBs. Evaluated by 10 fold cross-validation on the benchmark datasets, our prediction method can achieve the accuracy of 88.7% and AUC (area under the curve) of 0.919. Moreover, our method has good performance in independent testing. CONCLUSIONS Using various sequence-derived features, a novel method is proposed to distinguish DSBs and SSBs accurately. The method also explores novel features, which could be helpful to discover the binding specificity of DNA-binding proteins.
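Two of the sequence representations named in this abstract, overall amino acid composition (OAAC) and dipeptide composition, are simple enough to sketch directly. The version below assumes the usual frequency-based definitions over the 20 standard residues and skips any other symbol; the function names are hypothetical and the paper's exact normalisation may differ.

```python
from itertools import product
from typing import Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq: str) -> Dict[str, float]:
    """Fraction of each standard residue in the sequence (20 features)."""
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    total = len(residues)
    return {aa: (residues.count(aa) / total if total else 0.0) for aa in AMINO_ACIDS}


def dipeptide_composition(seq: str) -> Dict[str, float]:
    """Fraction of each ordered residue pair among adjacent positions (400 features)."""
    seq = seq.upper()
    counts = {"".join(pair): 0 for pair in product(AMINO_ACIDS, repeat=2)}
    for first, second in zip(seq, seq[1:]):
        if first + second in counts:  # pairs with non-standard symbols are skipped
            counts[first + second] += 1
    total = sum(counts.values())
    return {pair: (n / total if total else 0.0) for pair, n in counts.items()}


# Toy sequence, for illustration only.
oaac = amino_acid_composition("AAGA")
dipep = dipeptide_composition("AAGA")
```

Concatenating the two dictionaries' values in a fixed key order gives a 420-dimensional vector that could be passed to an SVM or random forest, as the abstract describes.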
Affiliation(s)
- Wei Wang
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan Province 453007 China
  - Laboratory of Computation Intelligence and Information Processing, Engineering Technology Research Center for Computing Intelligence and Data Mining, Xinxiang, Henan Province 453007 China
- Lin Sun
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan Province 453007 China
- Shiguang Zhang
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan Province 453007 China
- Hongjun Zhang
  - School of Aviation Engineering, Anyang University, Anyang, Henan Province 455000 China
- Jinling Shi
  - School of International Education, Xuchang University, Xuchang, Henan Province 461000 China
- Tianhe Xu
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan Province 453007 China
- Keliang Li
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan Province 453007 China

9
Blom J, Kreis J, Spänig S, Juhre T, Bertelli C, Ernst C, Goesmann A. EDGAR 2.0: an enhanced software platform for comparative gene content analyses. Nucleic Acids Res 2016; 44:W22-8. [PMID: 27098043 PMCID: PMC4987874 DOI: 10.1093/nar/gkw255] [Citation(s) in RCA: 281] [Impact Index Per Article: 31.2] [Received: 01/31/2016] [Accepted: 04/02/2016] [Indexed: 12/29/2022] Open
Abstract
The rapidly increasing availability of microbial genome sequences has led to a growing demand for bioinformatics software tools that support the functional analysis based on the comparison of closely related genomes. By utilizing comparative approaches on gene level it is possible to gain insights into the core genes which represent the set of shared features for a set of organisms under study. Vice versa singleton genes can be identified to elucidate the specific properties of an individual genome. Since initial publication, the EDGAR platform has become one of the most established software tools in the field of comparative genomics. Over the last years, the software has been continuously improved and a large number of new analysis features have been added. For the new version, EDGAR 2.0, the gene orthology estimation approach was newly designed and completely re-implemented. Among other new features, EDGAR 2.0 provides extended phylogenetic analysis features like AAI (Average Amino Acid Identity) and ANI (Average Nucleotide Identity) matrices, genome set size statistics and modernized visualizations like interactive synteny plots or Venn diagrams. Thereby, the software supports a quick and user-friendly survey of evolutionary relationships between microbial genomes and simplifies the process of obtaining new biological insights into their differential gene content. All features are offered to the scientific community via a web-based and therefore platform-independent user interface, which allows easy browsing of precomputed datasets. The web server is accessible at http://edgar.computational.bio.
Affiliation(s)
- Jochen Blom
  - Bioinformatics & Systems Biology, Justus-Liebig-University Giessen, 35392 Giessen, Hesse, Germany
- Julian Kreis
  - Bioinformatics & Systems Biology, Justus-Liebig-University Giessen, 35392 Giessen, Hesse, Germany
- Sebastian Spänig
  - Bioinformatics & Systems Biology, Justus-Liebig-University Giessen, 35392 Giessen, Hesse, Germany
- Tobias Juhre
  - Bioinformatics & Systems Biology, Justus-Liebig-University Giessen, 35392 Giessen, Hesse, Germany
- Claire Bertelli
  - Institute of Microbiology, University Hospital Center and University of Lausanne, 1011 Lausanne, VD, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, VD, Switzerland
- Corinna Ernst
  - Center for Familial Breast and Ovarian Cancer, Medical Faculty, University Hospital Cologne, University of Cologne, 50931 Cologne, NRW, Germany
- Alexander Goesmann
  - Bioinformatics & Systems Biology, Justus-Liebig-University Giessen, 35392 Giessen, Hesse, Germany

10
Wang W, Liu J, Sun L. Surface shapes and surrounding environment analysis of single- and double-stranded DNA-binding proteins in protein-DNA interface. Proteins 2016; 84:979-89. [PMID: 27038080 DOI: 10.1002/prot.25045] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 01/04/2016] [Revised: 03/15/2016] [Accepted: 03/25/2016] [Indexed: 11/12/2022]
Abstract
Protein-DNA bindings are critical to many biological processes. However, the structural mechanisms underlying these interactions are not fully understood. Here, we analyzed the residues shape (peak, flat, or valley) and the surrounding environment of double-stranded DNA-binding proteins (DSBs) and single-stranded DNA-binding proteins (SSBs) in protein-DNA interfaces. In the results, we found that the interface shapes, hydrogen bonds, and the surrounding environment present significant differences between the two kinds of proteins. Built on the investigation results, we constructed a random forest (RF) classifier to distinguish DSBs and SSBs with satisfying performance. In conclusion, we present a novel methodology to characterize protein interfaces, which will deepen our understanding of the specificity of proteins binding to ssDNA (single-stranded DNA) or dsDNA (double-stranded DNA). Proteins 2016; 84:979-989. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Wei Wang
  - Department of Computer Science and Technology, College of Computer and Information Engineering, Henan Normal University, Xinxiang, 453007, China; Laboratory of Computation Intelligence and Information Processing, Engineering Technology Research Center for Computing Intelligence and Data Mining, Henan Province, China
- Juan Liu
  - Institute of Computer Software, School of Computer, Wuhan University, Wuhan, 430072, China
- Lin Sun
  - Department of Computer Science and Technology, College of Computer and Information Engineering, Henan Normal University, Xinxiang, 453007, China; Laboratory of Computation Intelligence and Information Processing, Engineering Technology Research Center for Computing Intelligence and Data Mining, Henan Province, China

11
MacGregor BJ. Abundant Intergenic TAACTGA Direct Repeats and Putative Alternate RNA Polymerase β' Subunits in Marine Beggiatoaceae Genomes: Possible Regulatory Roles and Origins. Front Microbiol 2015; 6:1397. [PMID: 26733950 PMCID: PMC4679880 DOI: 10.3389/fmicb.2015.01397] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/28/2015] [Accepted: 11/23/2015] [Indexed: 12/15/2022] Open
Abstract
The genome sequences of several giant marine sulfur-oxidizing bacteria present evidence of a possible post-transcriptional regulatory network that may have been transmitted to or from two distantly related bacteria lineages. The draft genome of a Cand. “Maribeggiatoa” filament from the Guaymas Basin (Gulf of California, Mexico) seafloor contains 169 sets of TAACTGA direct repeats and one indirect repeat, with two to six copies per set. Related heptamers are rarely or never found as direct repeats. TAACTGA direct repeats are also found in some other Beggiatoaceae, Thiocystis violascens, a range of Cyanobacteria, and five Bacteroidetes. This phylogenetic distribution suggests they may have been transmitted horizontally, but no mechanism is evident. There is no correlation between total TAACTGA occurrences and repeats per genome. In most species the repeat units are relatively short, but longer arrays of up to 43 copies are found in several Bacteroidetes and Cyanobacteria. The majority of TAACTGA repeats in the Cand. “Maribeggiatoa” Orange Guaymas (BOGUAY) genome are within several nucleotides upstream of a putative start codon, suggesting they may be binding sites for a post-transcriptional regulator. Candidates include members of the ribosomal protein S1, Csp (cold shock protein), and Csr (carbon storage regulator) families. No pattern was evident in the predicted functions of the open reading frames (ORFs) downstream of repeats, but some encode presumably essential products such as ribosomal proteins. Among these is an ORF encoding a possible alternate or modified RNA polymerase beta prime subunit, predicted to have the expected subunit interaction domains but lacking most catalytic residues. A similar ORF was found in the Thioploca ingrica draft genome, but in no others. In both species they are immediately upstream of putative sensor kinase genes with nearly identical domain structures. 
In the marine Beggiatoaceae, a role for the TAACTGA repeats in translational regulation is suggested. More speculatively, the putative alternate RNA polymerase subunit could be a negative transcriptional regulator.
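Locating sets of TAACTGA direct repeats of the kind described in this abstract can be sketched with a regular expression. The sketch assumes that the copies in a set are immediately adjacent (tandem) and searches the given strand only; the abstract does not state how repeat sets were detected, so the function name, the adjacency rule, and the example sequence are illustrative assumptions.

```python
import re
from typing import List, Tuple


def find_direct_repeats(sequence: str, unit: str = "TAACTGA",
                        min_copies: int = 2) -> List[Tuple[int, int]]:
    """Return (start, copies) for each run of at least min_copies adjacent units.

    Start positions are 0-based offsets into the sequence.
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_copies))
    return [
        (match.start(), len(match.group(0)) // len(unit))
        for match in pattern.finditer(sequence.upper())
    ]


# Hypothetical sequence: a 3-copy set, a lone unit (ignored), and a 2-copy set.
example = "GG" + "TAACTGA" * 3 + "CC" + "TAACTGA" + "AT" + "TAACTGA" * 2
repeat_sets = find_direct_repeats(example)
```

Scanning the reverse complement as well, or allowing short spacers between copies, would need an extended pattern.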
Affiliation(s)
- Barbara J MacGregor
  - Department of Marine Sciences, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA

12
Lindae A, Eberle RJ, Caruso IP, Coronado MA, de Moraes FR, Azevedo V, Arni RK. Expression, purification and characterization of cold shock protein A of Corynebacterium pseudotuberculosis. Protein Expr Purif 2015; 112:15-20. [DOI: 10.1016/j.pep.2015.04.006] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 02/24/2015] [Revised: 03/27/2015] [Accepted: 04/14/2015] [Indexed: 10/23/2022]

13
Wang W, Liu J, Xiong Y, Zhu L, Zhou X. Analysis and classification of DNA-binding sites in single-stranded and double-stranded DNA-binding proteins using protein information. IET Syst Biol 2014; 8:176-83. [PMID: 25075531 DOI: 10.1049/iet-syb.2013.0048] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/19/2022] Open
Abstract
Single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs) play different roles in biological processes when they bind to single-stranded DNA (ssDNA) or double-stranded DNA (dsDNA). However, the underlying binding mechanisms of SSBs and DSBs are not yet fully understood. Here, the authors first constructed two groups of ssDNA-specific and dsDNA-specific binding sites from two non-redundant sets of SSBs and DSBs. They further analysed the relationship between the two classes of binding sites and a newly proposed set of features (residue charge distribution, secondary structure and spatial shape). To assess and utilise the predictive power of these features, they trained a support vector machine classification model to predict ssDNA and dsDNA binding sites. The authors' analysis and prediction results indicated that the two classes of binding sites can be distinguished by the three types of features, and the final classifier using all the features achieved satisfactory performance. In conclusion, the proposed features will deepen the understanding of the specificity of proteins that bind to ssDNA or dsDNA.
Affiliation(s)
- Wei Wang
- School of Computer, Wuhan University, Wuhan, Hubei, People's Republic of China
| | - Juan Liu
- School of Computer, Wuhan University, Wuhan, Hubei, People's Republic of China
| | - Yi Xiong
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907, USA
| | - Lida Zhu
- School of Computer, Wuhan University, Wuhan, Hubei, People's Republic of China
| | - Xionghui Zhou
- School of Computer, Wuhan University, Wuhan, Hubei, People's Republic of China
|
14
|
Wang W, Liu J, Zhou X. Identification of single-stranded and double-stranded DNA binding proteins based on protein structure. BMC Bioinformatics 2014; 15 Suppl 12:S4. [PMID: 25474071 PMCID: PMC4243121 DOI: 10.1186/1471-2105-15-s12-s4] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 01/26/2023]
Abstract
Background: Protein-DNA interactions are essential for many biological processes, but the structural mechanisms underlying them are not fully understood. DNA-binding proteins can be classified into double-stranded DNA-binding proteins (DSBs) and single-stranded DNA-binding proteins (SSBs), which take part in different biological functions. DSBs usually act as transcription factors that regulate gene expression, while SSBs usually play roles in DNA replication, recombination, and repair. Understanding the binding specificity of a DNA-binding protein therefore helps in studying its function. Results: We investigated the differences between DSBs and SSBs in surface tunnels and in OB-fold domain information. We detected the largest clefts on the protein surfaces to obtain several features for distinguishing the potential interfaces of SSBs and DSBs. We also compared each protein's structure with each of six OB-fold protein templates and used the maximal alignment score (TM-score) as the protein's OB-fold feature. On these features we built a support vector machine (SVM) classification model to distinguish the two kinds of proteins automatically, with prediction accuracies of 87%, 83% and 83% for the HOLO-set, APO-set and Mixed-set, respectively. Conclusions: SSBs and DSBs have different ranges of tunnel lengths and tunnel curvatures; moreover, the alignment results with OB-fold templates were also found to be a discriminative feature. Experimental results from 10-fold cross-validation indicate that the new feature set is effective for describing DNA-binding proteins. The evaluation results on both bound (DNA-bound) and non-bound (DNA-free) proteins show satisfactory performance of our method.
|
15
|
Zasedateleva OA, Vasiliskov VA, Surzhikov SA, Sazykin AY, Putlyaeva LV, Schwarz AM, Kuprash DV, Rubina AY, Barsky VE, Zasedatelev AS. UV fluorescence of tryptophan residues effectively measures protein binding to nucleic acid fragments immobilized in gel elements of microarrays. Biotechnol J 2014; 9:1074-80. [DOI: 10.1002/biot.201300556] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 12/25/2013] [Revised: 05/04/2014] [Accepted: 06/11/2014] [Indexed: 12/13/2022]
|
16
|
Mayr F, Heinemann U. Mechanisms of Lin28-mediated miRNA and mRNA regulation--a structural and functional perspective. Int J Mol Sci 2013; 14:16532-53. [PMID: 23939427 PMCID: PMC3759924 DOI: 10.3390/ijms140816532] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.8] [Received: 05/31/2013] [Revised: 07/22/2013] [Accepted: 07/25/2013] [Indexed: 12/14/2022]
Abstract
Lin28 is an essential RNA-binding protein that is ubiquitously expressed in embryonic stem cells. Its physiological function has been linked to the regulation of differentiation, development, and oncogenesis as well as glucose metabolism. Lin28 mediates these pleiotropic functions by inhibiting let-7 miRNA biogenesis and by modulating the translation of target mRNAs. Both activities strongly depend on Lin28’s RNA-binding domains (RBDs), an N-terminal cold-shock domain (CSD) and a C-terminal Zn-knuckle domain (ZKD). Recent biochemical and structural studies revealed the mechanisms of how Lin28 controls let-7 biogenesis. Lin28 binds to the terminal loop of pri- and pre-let-7 miRNA and represses their processing by Drosha and Dicer. Several biochemical and structural studies showed that the specificity of this interaction is mainly mediated by the ZKD with a conserved GGAGA or GGAGA-like motif. Further RNA crosslinking and immunoprecipitation coupled to high-throughput sequencing (CLIP-seq) studies confirmed this binding motif and uncovered a large number of new mRNA binding sites. Here we review exciting recent progress in our understanding of how Lin28 binds structurally diverse RNAs and fulfills its pleiotropic functions.
Affiliation(s)
- Florian Mayr
- Crystallography, Max-Delbrück Center for Molecular Medicine, Robert-Rössle Straße 10, Berlin 13125, Germany
- Institute for Chemistry and Biochemistry, Freie Universität Berlin, Takustraße 6, Berlin 14195, Germany
| | - Udo Heinemann
- Crystallography, Max-Delbrück Center for Molecular Medicine, Robert-Rössle Straße 10, Berlin 13125, Germany
- Institute for Chemistry and Biochemistry, Freie Universität Berlin, Takustraße 6, Berlin 14195, Germany
- Author to whom correspondence should be addressed; Tel.: +49-30-9406-3420; Fax: +49-30-9406-2548
|
17
|
Sachs R, Max KE, Heinemann U, Balbach J. RNA single strands bind to a conserved surface of the major cold shock protein in crystals and solution. RNA (NEW YORK, N.Y.) 2012; 18:65-76. [PMID: 22128343 PMCID: PMC3261745 DOI: 10.1261/rna.02809212] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.8] [Received: 05/09/2011] [Accepted: 08/29/2011] [Indexed: 05/26/2023]
Abstract
Bacterial cold shock proteins (CSPs) regulate the cellular response to temperature downshift. Their general principle of function involves RNA chaperoning and transcriptional antitermination. Here we present two crystal structures of cold shock protein B from Bacillus subtilis (Bs-CspB) in complex with either a hexanucleotide (5'-UUUUUU-3') or heptanucleotide (5'-GUCUUUA-3') single-stranded RNA (ssRNA). Hydrogen bonds and stacking interactions between RNA bases and aromatic sidechains characterize individual binding subsites. Additional binding subsites which are not occupied by the ligand in the crystal structure were revealed by NMR spectroscopy in solution on Bs-CspB·RNA complexes. Binding studies demonstrate that Bs-CspB associates with ssDNA as well as ssRNA with moderate sequence specificity. Varying affinities of oligonucleotides are reflected mainly in changes of the dissociation rates. The generally lower binding affinity of ssRNA compared to its ssDNA analog is attributed solely to the substitution of thymine by uracil bases in RNA.
Affiliation(s)
- Rolf Sachs
- Fachgruppe Biophysik Institut für Physik, Martin-Luther-Universität Halle-Wittenberg, 06120 Halle (Saale), Germany
| | - Klaas E.A. Max
- Max-Delbrück-Centrum für Molekulare Medizin Berlin-Buch, 13125 Berlin, Germany
| | - Udo Heinemann
- Max-Delbrück-Centrum für Molekulare Medizin Berlin-Buch, 13125 Berlin, Germany
- Institut für Chemie und Biochemie, Freie Universität Berlin, 14195 Berlin, Germany
| | - Jochen Balbach
- Fachgruppe Biophysik Institut für Physik, Martin-Luther-Universität Halle-Wittenberg, 06120 Halle (Saale), Germany
|
18
|
Morgan HP, Wear MA, McNae I, Gallagher MP, Walkinshaw MD. Crystallization and X-ray structure of cold-shock protein E from Salmonella typhimurium. Acta Crystallogr Sect F Struct Biol Cryst Commun 2009; 65:1240-5. [PMID: 20054119 PMCID: PMC2802871 DOI: 10.1107/s1744309109033788] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 07/03/2009] [Accepted: 08/24/2009] [Indexed: 11/10/2022]
Abstract
In prokaryotic organisms, cold shock triggers the production of a small, highly conserved family of cold-shock proteins (CSPs). CSPs have been well studied structurally and functionally in Escherichia coli and Bacillus subtilis, but Salmonella typhimurium CSPs remain relatively uncharacterized. In S. typhimurium, six homologous CSPs have been identified: StCspA-E and StCspH. The crystal structure of cold-shock protein E from S. typhimurium (StCspE) has been determined at 1.1 Å resolution and has an R factor of 0.203 after refinement. The three-dimensional structure is similar to those of previously determined CSPs and is composed of five antiparallel beta-strands forming a classic OB fold/five-stranded beta-barrel. This first structure of a CSP from S. typhimurium provides new insight into the cold-shock response of this bacterium.
Affiliation(s)
- Hugh P. Morgan
- Centre for Translational and Chemical Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, Scotland
| | - Martin A. Wear
- Centre for Translational and Chemical Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, Scotland
| | - Iain McNae
- Centre for Translational and Chemical Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, Scotland
| | - Maurice P. Gallagher
- Centre for Translational and Chemical Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, Scotland
| | - Malcolm D. Walkinshaw
- Centre for Translational and Chemical Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, Scotland
|
19
|
Zasedateleva OA, Mikheikin AL, Turygin AY, Prokopenko DV, Chudinov AV, Belobritskaya EE, Chechetkin VR, Zasedatelev AS. Gel-based oligonucleotide microarray approach to analyze protein-ssDNA binding specificity. Nucleic Acids Res 2008; 36:e61. [PMID: 18474529 PMCID: PMC2425478 DOI: 10.1093/nar/gkn246] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]
Abstract
A gel-based oligonucleotide microarray approach was developed for quantitative profiling of the binding affinity of a protein to single-stranded DNA (ssDNA). To demonstrate additional capabilities of this method, we analyzed the binding specificity of ribonuclease (RNase) binase from Bacillus intermedius (EC 3.1.27.3) to ssDNA using a generic hexamer oligodeoxyribonucleotide microchip. Single-stranded octamer oligonucleotides were immobilized within 3D hemispherical gel pads. The octanucleotides in individual pads, 5'-{N}N(1)N(2)N(3)N(4)N(5)N(6){N}-3', consisted of a fixed hexamer motif N(1)N(2)N(3)N(4)N(5)N(6) in the middle and variable parts {N} at the ends, where {N} represents A, C, G and T in equal proportions. The chip has 4096 pads with a complete set of hexamer sequences. The affinity was determined by measuring dissociation of the RNase-ssDNA complexes as the temperature increased from 0 °C to 50 °C under quasi-equilibrium conditions. RNase binase showed the highest sequence specificity of binding to motifs 5'-NNG(A/T/C)GNN-3', with the order of preference GAG > GTG > GCG. High specificity towards G(A/T/C)G triplets was also confirmed by measuring the fluorescence anisotropy of complexes of binase with selected oligodeoxyribonucleotides in solution. The affinity of RNase binase to other 3-nt sequences was also ranked. These results demonstrate the applicability of the method and provide grounds for further investigation of nonenzymatic functions of RNases.
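The chip layout described in this abstract can be checked with a short enumeration. The sketch below is only an illustration of the arithmetic: it lists every hexamer over A, C, G and T (4^6 = 4096, matching the stated pad count) and expands one pad into the 16 octamers implied by an equimolar mixed base at each end. The function names are hypothetical and nothing here reproduces the authors' analysis.

```python
from itertools import product

BASES = "ACGT"


def all_hexamers():
    """Enumerate every fixed hexamer motif, one per pad on a generic chip."""
    return ["".join(letters) for letters in product(BASES, repeat=6)]


def pad_octamers(hexamer):
    """Expand one pad into the octamers implied by a mixed base {N} at each end."""
    return [left + hexamer + right for left in BASES for right in BASES]


def contains_gxg(hexamer, middle="ATC"):
    """True if the hexamer contains a G-X-G triplet with X drawn from `middle`."""
    return any(
        hexamer[i] == "G" and hexamer[i + 1] in middle and hexamer[i + 2] == "G"
        for i in range(len(hexamer) - 2)
    )


hexamers = all_hexamers()
pad_count = len(hexamers)                      # 4 ** 6 pads
species_per_pad = len(pad_octamers("GAGTTT"))  # 4 * 4 octamers in one pad
gxg_pads = [h for h in hexamers if contains_gxg(h)]
```

Filtering pads with `contains_gxg` is one way to pull out the subset carrying a G(A/T/C)G triplet before comparing measured affinities; how the authors grouped pads is not stated in the abstract.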
Affiliation(s)
- Olga A Zasedateleva
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 32 Vavilov Street, 119991 Moscow, Russian Federation
|