1
Zhou H, Yan X, Song Y, Yang X, Chen X, Huang Y. Development of a Magnetic Solid-Phase Extraction-Liquid Chromatography Targeted to Five Fluoroquinolones in Food Based on Aptamer Recognition. Foods 2025; 14:798. [PMID: 40077500 PMCID: PMC11899132 DOI: 10.3390/foods14050798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2025] [Revised: 02/20/2025] [Accepted: 02/22/2025] [Indexed: 03/14/2025] Open
Abstract
Fluoroquinolones (FQs) are present in trace amounts in the environment, from where they enter animal- and plant-derived food products. Long-term exposure to low doses of these drugs poses a risk to human health and increases antibiotic selection pressure. Building on previous screening of aptamers with high FQ specificity, this study combined a new aptamer recognition probe with a metal-organic framework (MOF) to obtain a sample pretreatment composite material with strong FQ specificity for multi-target analysis. Residual FQs were extracted from the complex food matrix via magnetic dispersive solid-phase extraction and analyzed using high-performance liquid chromatography. The method showed good linearity in the range of 0.39 to 200 µg/kg for five FQs in milk and fish samples, with detection limits of 0.04-0.10 µg/kg and quantification limits of 0.13-0.33 µg/kg. This study developed an effective sample pretreatment material and methodology for identifying trace FQs in complex animal-derived food matrices.
Affiliation(s)
- Haiyan Zhou
  - Food Microbiology Key Laboratory of Sichuan Province, School of Food and Bioengineering, Xihua University, Chengdu 610039, China
- Xiaofeng Yan
  - Food Microbiology Key Laboratory of Sichuan Province, School of Food and Bioengineering, Xihua University, Chengdu 610039, China
- Yaning Song
  - Food Microbiology Key Laboratory of Sichuan Province, School of Food and Bioengineering, Xihua University, Chengdu 610039, China
- Xiao Yang
  - Food Microbiology Key Laboratory of Sichuan Province, School of Food and Bioengineering, Xihua University, Chengdu 610039, China
  - Chongqing Key Laboratory of Speciality Food Co-Built by Sichuan and Chongqing, Chengdu 610039, China
- Xianggui Chen
  - Food Microbiology Key Laboratory of Sichuan Province, School of Food and Bioengineering, Xihua University, Chengdu 610039, China
  - Chongqing Key Laboratory of Speciality Food Co-Built by Sichuan and Chongqing, Chengdu 610039, China
- Yukun Huang
  - Food Microbiology Key Laboratory of Sichuan Province, School of Food and Bioengineering, Xihua University, Chengdu 610039, China
  - Chongqing Key Laboratory of Speciality Food Co-Built by Sichuan and Chongqing, Chengdu 610039, China
2
Krautwurst S, Lamkiewicz K. RNA-protein interaction prediction without high-throughput data: An overview and benchmark of in silico tools. Comput Struct Biotechnol J 2024; 23:4036-4046. [PMID: 39610906 PMCID: PMC11603007 DOI: 10.1016/j.csbj.2024.11.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2024] [Revised: 11/05/2024] [Accepted: 11/05/2024] [Indexed: 11/30/2024] Open
Abstract
RNA-protein interactions (RPIs) are crucial for the accurate operation of various processes in and between organisms across kingdoms of life. Mutual detection of RPI partner molecules depends on distinct sequential, structural, or thermodynamic features, which can be determined via experimental and bioinformatic methods. Still, the underlying molecular mechanisms of many RPIs are poorly understood. It is further hypothesized that many RPIs have not even been described yet. Computational RPI prediction is continuously challenged by the lack of data and detailed research of very specific examples. With the discovery of novel RPI complexes in all kingdoms of life, adaptations of existing RPI prediction methods are necessary. Continuously improving computational RPI prediction is key to advancing the detailed understanding of RPIs and supplementing experimental RPI determination. The growing amount of data covering more species and detailed mechanisms supports the accuracy of prediction tools, which in turn support specific experimental research on RPIs. Here, we give an overview of RPI prediction tools that do not use high-throughput data as the user's input. We review the tools according to their input, usability, and output. We then apply the tools to known RPI examples across different kingdoms of life. Our comparison shows that the investigated prediction tools do not favor a certain species and equip the user with results varying in degree of information, from an overall RPI score to detailed interacting residues. Furthermore, we provide a guide tree to help users decide which RPI prediction tool is appropriate for their available input data and desired output.
Affiliation(s)
- Sarah Krautwurst
  - RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
  - European Virus Bioinformatics Center, Leutragraben 1, 07743 Jena, Germany
- Kevin Lamkiewicz
  - RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
  - European Virus Bioinformatics Center, Leutragraben 1, 07743 Jena, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Puschstr. 4, 04103 Leipzig, Germany
3
Corley M, Burns MC, Yeo GW. How RNA-Binding Proteins Interact with RNA: Molecules and Mechanisms. Mol Cell 2020; 78:9-29. [PMID: 32243832 PMCID: PMC7202378 DOI: 10.1016/j.molcel.2020.03.011] [Citation(s) in RCA: 475] [Impact Index Per Article: 95.0] [Received: 09/24/2019] [Revised: 01/13/2020] [Accepted: 03/09/2020] [Indexed: 12/17/2022]
Abstract
RNA-binding proteins (RBPs) comprise a large class of over 2,000 proteins that interact with transcripts in all manner of RNA-driven processes. The structures and mechanisms that RBPs use to bind and regulate RNA are incredibly diverse. In this review, we take a look at the components of protein-RNA interaction, from the molecular level to multi-component interaction. We first summarize what is known about protein-RNA molecular interactions based on analyses of solved structures. We additionally describe software currently available for predicting protein-RNA interaction and other resources useful for the study of RBPs. We then review the structure and function of seventeen known RNA-binding domains and analyze the hydrogen bonds adopted by protein-RNA structures on a domain-by-domain basis. We conclude with a summary of the higher-level mechanisms that regulate protein-RNA interactions.
Affiliation(s)
- Meredith Corley
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA
- Margaret C Burns
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA
  - Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA, USA
- Gene W Yeo
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA
  - Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA, USA
  - Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
4
Luo J, Liu L, Venkateswaran S, Song Q, Zhou X. RPI-Bind: a structure-based method for accurate identification of RNA-protein binding sites. Sci Rep 2017; 7:614. [PMID: 28377624 PMCID: PMC5429624 DOI: 10.1038/s41598-017-00795-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 11/24/2016] [Accepted: 03/13/2017] [Indexed: 01/11/2023] Open
Abstract
RNA-protein interactions play crucial roles in multiple biological processes, and these interactions are significantly influenced by the structures and sequences of the protein and RNA molecules. In this study, we first analyzed RNA-protein interacting complexes and identified interface properties of sequences and structures, which reveal the diverse nature of the binding sites. Based on these observations, we built a three-step prediction model, RPI-Bind, for the identification of RNA-protein binding regions using the sequences and structures of both proteins and RNAs. The three steps are 1) the prediction of RNA binding regions on protein, 2) the prediction of protein binding regions on RNA, and 3) the prediction of interacting regions on both RNA and protein simultaneously, using the results from steps 1) and 2). Compared with existing methods, most of which employ only sequences, our model significantly improves the prediction accuracy at each of the three steps. In particular, our model outperforms catRAPID by >20% at the third step. All of these results indicate the importance of structures in RNA-protein interactions and suggest that the RPI-Bind model is a powerful theoretical framework for studying RNA-protein interactions.
Affiliation(s)
- Jiesi Luo
  - Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
- Liang Liu
  - Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
- Suresh Venkateswaran
  - Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
- Qianqian Song
  - Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
- Xiaobo Zhou
  - Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
5
Liu ZP, Liu S, Chen R, Huang X, Wu LY. Structure alignment-based classification of RNA-binding pockets reveals regional RNA recognition motifs on protein surfaces. BMC Bioinformatics 2017; 18:27. [PMID: 28077065 PMCID: PMC5225598 DOI: 10.1186/s12859-016-1410-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 07/21/2016] [Accepted: 12/07/2016] [Indexed: 11/23/2022] Open
Abstract
Background: Many critical biological processes are strongly related to protein-RNA interactions. Revealing the protein structure motifs for RNA-binding will provide valuable information for deciphering protein-RNA recognition mechanisms and benefit complementary structural design in bioengineering. RNA-binding events often take place at pockets on protein surfaces. The structural classification of local binding pockets determines the major patterns of RNA recognition. Results: In this work, we provide a novel framework for systematically identifying the structure motifs of protein-RNA binding sites in the form of pockets on regional protein surfaces via a structure alignment-based method. We first construct a similarity network of RNA-binding pockets based on a non-sequential-order structure alignment method for local structure alignment. By using network community decomposition, the RNA-binding pockets on protein surfaces are clustered into groups with structural similarity. With a multiple structure alignment strategy, the consensus RNA-binding pockets in each group are identified. The crucial recognition patterns, as well as the protein-RNA binding motifs, are then identified and analyzed. Conclusions: Large-scale RNA-binding pockets on protein surfaces are grouped by measuring their structural similarities. This similarity network-based framework provides a convenient method for modeling the structural relationships of functional pockets. The local structural patterns identified serve as structure motifs for the recognition with RNA on protein surfaces. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1410-1) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Zhi-Ping Liu
  - Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong, 250061, China
- Shutang Liu
  - Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong, 250061, China
- Ruitang Chen
  - Department of Computer Science, Stanford University, Stanford, CA, 94305, USA
- Xiaopeng Huang
  - Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
  - National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Ling-Yun Wu
  - Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
  - National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
6
7
Miao Z, Westhof E. A Large-Scale Assessment of Nucleic Acids Binding Site Prediction Programs. PLoS Comput Biol 2015; 11:e1004639. [PMID: 26681179 PMCID: PMC4683125 DOI: 10.1371/journal.pcbi.1004639] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Received: 08/25/2015] [Accepted: 10/30/2015] [Indexed: 11/18/2022] Open
Abstract
Computational prediction of nucleic acid binding sites in proteins is necessary to disentangle functional mechanisms in most biological processes and to explore the binding mechanisms. Several strategies have been proposed, but the state-of-the-art approaches display a great diversity in i) the definition of nucleic acid binding sites; ii) the training and test datasets; iii) the algorithmic methods for the prediction strategies; iv) the performance measures and v) the distribution and availability of the prediction programs. Here we report a large-scale assessment of 19 web servers and 3 stand-alone programs on 41 datasets including more than 5000 proteins derived from 3D structures of protein-nucleic acid complexes. Well-defined binary assessment criteria (specificity, sensitivity, precision, accuracy…) are applied. We found that i) the tools have been greatly improved over the years; ii) some of the approaches suffer from theoretical defects and there is still room for sorting out the essential mechanisms of binding; iii) RNA binding and DNA binding appear to follow similar driving forces and iv) dataset bias may exist in some methods.
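The binary assessment criteria named in this abstract can be illustrated with a minimal Python sketch. It assumes per-residue labels (1 = binding, 0 = non-binding); the function names and the toy labels are hypothetical and are not taken from the assessed programs or from the paper's own evaluation code.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for per-residue binary labels (1 = binding)."""
    if len(actual) != len(predicted):
        raise ValueError("label sequences must have equal length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, fp, tn, fn


def safe_ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(tp, fp, tn, fn):
    """Standard binary criteria computed from confusion-matrix counts."""
    return {
        "sensitivity": safe_ratio(tp, tp + fn),  # recall on binding residues
        "specificity": safe_ratio(tn, tn + fp),  # recall on non-binding residues
        "precision": safe_ratio(tp, tp + fp),
        "accuracy": safe_ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical toy labels for a 10-residue protein (illustration only).
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
counts = confusion_counts(actual, predicted)
metrics = binary_metrics(*counts)
```

Because binding residues are usually a minority, accuracy alone can look high even when sensitivity is poor, which is one reason to report several criteria side by side.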
Affiliation(s)
- Zhichao Miao
  - Architecture et Réactivité de l'ARN, Université de Strasbourg, Institut de Biologie Moléculaire et Cellulaire du CNRS, Strasbourg, France
- Eric Westhof
  - Architecture et Réactivité de l'ARN, Université de Strasbourg, Institut de Biologie Moléculaire et Cellulaire du CNRS, Strasbourg, France
8
Prediction of Protein-RNA Interactions Using Sequence and Structure Descriptors. (This work was partially supported by the National Natural Science Foundation of China (NSFC) Grant No. 31100949, the Scientific Research Foundation for the Returned Overseas Chinese Scholars, Ministry of Education of China, the Fundamental Research Funds of Shandong University Grant No. 2014TB006, University of Rochester Center for AIDS Research Grant P30 AI078498 (NIH/NIAID) and NIH R01 Grant GM100788-01.) ACTA ACUST UNITED AC 2015. [DOI: 10.1016/j.ifacol.2015.12.090] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/19/2022]
9
Predicting protein-RNA interaction amino acids using random forest based on submodularity subset selection. Comput Biol Chem 2014; 53PB:324-330. [PMID: 25462339 DOI: 10.1016/j.compbiolchem.2014.11.002] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 05/02/2014] [Revised: 10/31/2014] [Accepted: 11/08/2014] [Indexed: 12/25/2022]
Abstract
Protein-RNA interaction plays a crucial role in many biological processes, such as protein synthesis, transcription and post-transcriptional regulation of gene expression, and disease pathogenesis. In particular, RNAs always function through binding to proteins. Identifying the binding interface region is especially useful for cellular pathway analysis and drug design. In this study, we propose a novel approach for identifying binding sites in proteins, which not only integrates local and global features derived directly from the protein sequence, but also constructs a balanced training dataset using sub-sampling based on submodularity subset selection. First, we extracted local and global features from the protein sequence, such as evolutionary information and molecular weight. Second, non-interaction sites far outnumber interaction sites, which leads to a sample imbalance problem and hence a machine learning model biased toward non-interaction sites. To better resolve this problem, instead of randomly sub-sampling the over-represented non-interaction sites as in previous work, we employed a novel sampling approach based on submodularity subset selection, which can select a more representative data subset. Finally, random forests were trained on the optimally selected training subsets to predict interaction sites. Our results show that the proposed method is very promising for predicting protein-RNA interaction residues: it achieved an accuracy of 0.863, which is better than other state-of-the-art methods. Furthermore, random forest feature importance analysis indicated that the extracted global features have very strong discriminative ability for identifying interaction residues.
10
Re A, Joshi T, Kulberkyte E, Morris Q, Workman CT. RNA-protein interactions: an overview. Methods Mol Biol 2014; 1097:491-521. [PMID: 24639174 DOI: 10.1007/978-1-62703-709-9_23] [Citation(s) in RCA: 81] [Impact Index Per Article: 7.4] [Indexed: 02/05/2023]
Abstract
RNA binding proteins (RBPs) are key players in the regulation of gene expression. In this chapter we discuss the main protein-RNA recognition modes used by RBPs in order to regulate multiple steps of RNA processing. We discuss traditional and state-of-the-art technologies that can be used to study RNAs bound by individual RBPs, or vice versa, for both in vitro and in vivo methodologies. To help highlight the biological significance of RBP mediated regulation, online resources on experimentally verified protein-RNA interactions are briefly presented. Finally, we present the major tools to computationally infer RNA binding sites according to the modeling features and to the unsupervised or supervised frameworks that are adopted. Since some RNA binding site search algorithms are derived from DNA binding site search algorithms, we discuss the commonalities and novelties introduced to handle both sequence and structural features uniquely characterizing protein-RNA interactions.
Affiliation(s)
- Angela Re
  - University of Trento, Mattarello, Italy
11
Parisien M, Wang X, Perdrizet G, Lamphear C, Fierke CA, Maheshwari KC, Wilde MJ, Sosnick TR, Pan T. Discovering RNA-protein interactome by using chemical context profiling of the RNA-protein interface. Cell Rep 2013; 3:1703-13. [PMID: 23665222 DOI: 10.1016/j.celrep.2013.04.010] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 06/12/2012] [Revised: 03/04/2013] [Accepted: 04/12/2013] [Indexed: 02/04/2023] Open
Abstract
RNA-protein (RNP) interactions are generally required for RNA function. At least 5% of human genes code for RNA-binding proteins. Whereas many approaches can identify the RNA partners for a specific protein, finding the protein partners for a specific RNA is difficult. We present a machine-learning method that scores a protein's binding potential for an RNA structure by utilizing the chemical context profiles of the interface from known RNP structures. Our approach is applicable even when only a single RNP structure is available. We examined 801 mammalian proteins and found that 37 (4.6%) potentially bind transfer RNA (tRNA). Most are enzymes involved in cellular processes unrelated to translation and were not known to interact with RNA. We experimentally tested six positive and three negative predictions for tRNA binding in vivo, and all nine predictions were correct. Our computational approach provides a powerful complement to experiments in discovering new RNPs.
Affiliation(s)
- Marc Parisien
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637, USA
12
Jahandideh S, Srinivasasainagendra V, Zhi D. Comprehensive comparative analysis and identification of RNA-binding protein domains: multi-class classification and feature selection. J Theor Biol 2012; 312:65-75. [PMID: 22884576 DOI: 10.1016/j.jtbi.2012.07.013] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 02/29/2012] [Revised: 07/09/2012] [Accepted: 07/13/2012] [Indexed: 01/11/2023]
Abstract
RNA-protein interaction plays an important role in various cellular processes, such as protein synthesis, gene regulation, post-transcriptional gene regulation, alternative splicing, and infections by RNA viruses. In this study, using the Gene Ontology Annotation (GOA) and Structural Classification of Proteins (SCOP) databases, an automatic procedure was designed to capture structurally solved RNA-binding protein domains in different subclasses. Subsequently, we applied tuned multi-class SVM (TMCSVM), Random Forest (RF), and multi-class ℓ1/ℓq-regularized logistic regression (MCRLR) to analyze and classify RNA-binding protein domains based on a comprehensive set of sequence and structural features, and compared the prediction accuracy of these three state-of-the-art methods. In our results, TMCSVM outperforms the other methods, which suggests its potential as a useful tool for the multi-class prediction of RNA-binding protein domains. MCRLR, on the other hand, elucidates the contribution of individual features to the predictive accuracy for RNA-binding protein domain subclasses, and thereby provides some biological insights into the roles of sequences and structures in protein-RNA interactions.
Affiliation(s)
- Samad Jahandideh
  - Section on Statistical Genetics, Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA
- Vinodh Srinivasasainagendra
  - Section on Statistical Genetics, Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA
- Degui Zhi
  - Section on Statistical Genetics, Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA
13
Barik A, Mishra A, Bahadur RP. PRince: a web server for structural and physicochemical analysis of protein-RNA interface. Nucleic Acids Res 2012; 40:W440-4. [PMID: 22689640 PMCID: PMC3394290 DOI: 10.1093/nar/gks535] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 01/15/2023] Open
Abstract
We have developed a web server, PRince, which analyzes the structural features and physicochemical properties of the protein–RNA interface. Users need to submit a PDB file containing the atomic coordinates of both the protein and the RNA molecules in complex form (in ‘.pdb’ format). They should also specify the chain identifiers of the interacting protein and RNA molecules. The size of the protein–RNA interface is estimated by measuring the solvent accessible surface area buried in contact. For a given protein–RNA complex, PRince calculates structural, physicochemical and hydration properties of the interacting surfaces. All parameters generated by the server are presented in a tabular format. The interacting surfaces can also be visualized with a software plug-in such as Jmol. In addition, the output files containing the list of the atomic coordinates of the interacting protein, RNA and interface water molecules can be downloaded. The parameters generated by PRince are novel, and users can correlate them with experimentally determined biophysical and biochemical parameters to better understand the specificity of the protein–RNA recognition process. This server will be continuously upgraded to include more parameters. PRince is publicly accessible and free for use. Available at http://www.facweb.iitkgp.ernet.in/~rbahadur/prince/home.html.
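The interface-size estimate mentioned in this abstract can be sketched in a few lines of Python. The sketch assumes the common convention that the buried area equals the solvent accessible surface area (SASA) of the isolated protein plus that of the isolated RNA, minus that of the complex; whether PRince uses exactly this form is an assumption, and the function name and numeric values are hypothetical.

```python
def buried_surface_area(sasa_protein, sasa_rna, sasa_complex):
    """Return the SASA (in square angstroms) buried when the complex forms.

    Assumed convention: B = SASA(protein) + SASA(RNA) - SASA(complex),
    with all three areas computed from the same bound-state coordinates.
    """
    for name, value in (
        ("sasa_protein", sasa_protein),
        ("sasa_rna", sasa_rna),
        ("sasa_complex", sasa_complex),
    ):
        if value < 0:
            raise ValueError(f"{name} must be non-negative")
    buried = sasa_protein + sasa_rna - sasa_complex
    if buried < 0:
        raise ValueError("complex SASA exceeds the sum of its components")
    return buried


# Hypothetical SASA values in square angstroms, for illustration only.
example_area = buried_surface_area(9500.0, 4200.0, 11900.0)
```

The SASA inputs themselves would come from a surface-area program run separately on the protein chain, the RNA chain, and the full complex.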
Affiliation(s)
- Amita Barik
  - Department of Biotechnology, Indian Institute of Technology, Kharagpur 721302, India
14
Wu MY, Dai DQ, Yan H. PRL-dock: Protein-ligand docking based on hydrogen bond matching and probabilistic relaxation labeling. Proteins 2012; 80:2137-53. [DOI: 10.1002/prot.24104] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 01/20/2012] [Revised: 04/14/2012] [Accepted: 04/17/2012] [Indexed: 11/08/2022]
15
Iwakiri J, Tateishi H, Chakraborty A, Patil P, Kenmochi N. Dissecting the protein-RNA interface: the role of protein surface shapes and RNA secondary structures in protein-RNA recognition. Nucleic Acids Res 2011; 40:3299-306. [PMID: 22199255 PMCID: PMC3333874 DOI: 10.1093/nar/gkr1225] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.4] [Indexed: 01/28/2023] Open
Abstract
Protein-RNA interactions are essential for many biological processes. However, the structural mechanisms underlying these interactions are not fully understood. Here, we analyzed the protein surface shape (dented, intermediate or protruded) and the RNA base pairing properties (paired or unpaired nucleotides) at the interfaces of 91 protein-RNA complexes derived from the Protein Data Bank. Dented protein surfaces prefer unpaired nucleotides to paired ones at the interface, and hydrogen bonds frequently occur between the protein backbone and RNA bases. In contrast, protruded protein surfaces do not show such a preference; rather, electrostatic interactions initiate the formation of hydrogen bonds between positively charged amino acids and RNA phosphate groups. Interestingly, in many protein-RNA complexes that interact via an RNA loop, an aspartic acid is favored at the interface. Moreover, in most of these complexes, nucleotide bases in the RNA loop are flipped out and form hydrogen bonds with the protein, which suggests that aspartic acid is important for RNA loop recognition through a base-flipping process. This study provides fundamental insights into the role of the shape of the protein surface and RNA secondary structures in mediating protein-RNA interactions.
Affiliation(s)
- Naoya Kenmochi
  - To whom correspondence should be addressed. Tel/Fax: +81 985 85 9084
16
Fernandez M, Kumagai Y, Standley DM, Sarai A, Mizuguchi K, Ahmad S. Prediction of dinucleotide-specific RNA-binding sites in proteins. BMC Bioinformatics 2011; 12 Suppl 13:S5. [PMID: 22373260 PMCID: PMC3278845 DOI: 10.1186/1471-2105-12-s13-s5] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Indexed: 02/02/2023] Open
Abstract
Background: Regulation of gene expression, protein synthesis, replication and assembly of many viruses involve RNA–protein interactions. Although some successful computational tools have been reported to recognize RNA binding sites in proteins, the problem of specificity remains poorly investigated. After the nucleotide base composition, the dinucleotide is the smallest unit of RNA sequence information and many RNA-binding proteins simply bind to regions enriched in one dinucleotide. Interaction preferences of protein subsequences and dinucleotides can be inferred from protein-RNA complex structures, enabling a training-based prediction approach. Results: We analyzed basic statistics of amino acid-dinucleotide contacts in protein-RNA complexes and found their pairing preferences could be identified. Using a standard approach to represent protein subsequences by their evolutionary profile, we trained neural networks to predict multiclass target vectors corresponding to 16 possible contacting dinucleotide subsequences. In the cross-validation experiments, the accuracies of the optimum network, measured as areas under the curve (AUC) of the receiver operating characteristic (ROC) graphs, were in the range of 65-80%. Conclusions: Dinucleotide-specific contact predictions have also been extended to the prediction of interacting protein and RNA fragment pairs, which shows the applicability of this method to predict targets of RNA-binding proteins. A web server predicting the 16-dimensional contact probability matrix directly from a user-defined protein sequence was implemented and made available at: http://tardis.nibio.go.jp/netasa/srcpred.
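The 16 dinucleotide classes follow from the four RNA bases taken two at a time (4 × 4 = 16). The Python sketch below enumerates them and builds a 16-dimensional target vector for one residue. The multi-hot encoding, the alphabetical class ordering, and all names are assumptions made for illustration; the abstract does not specify how the authors encoded their targets.

```python
from itertools import product

RNA_BASES = "ACGU"

# All ordered pairs of bases: AA, AC, AG, AU, CA, ..., UU (16 in total).
DINUCLEOTIDES = ["".join(pair) for pair in product(RNA_BASES, repeat=2)]
DINUCLEOTIDE_INDEX = {dinuc: i for i, dinuc in enumerate(DINUCLEOTIDES)}


def target_vector(contacted):
    """Return a 16-element multi-hot vector for one protein residue.

    `contacted` is an iterable of dinucleotide strings the residue touches;
    an unknown dinucleotide raises KeyError.
    """
    vector = [0] * len(DINUCLEOTIDES)
    for dinuc in contacted:
        vector[DINUCLEOTIDE_INDEX[dinuc.upper()]] = 1
    return vector


# Hypothetical residue contacting AU and GG dinucleotides.
example_vector = target_vector({"AU", "GG"})
```

A residue with no RNA contact maps to the all-zero vector under this encoding.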
17
Shazman S, Elber G, Mandel-Gutfreund Y. From face to interface recognition: a differential geometric approach to distinguish DNA from RNA binding surfaces. Nucleic Acids Res 2011; 39:7390-9. [PMID: 21693557 PMCID: PMC3177183 DOI: 10.1093/nar/gkr395] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 12/15/2022] Open
Abstract
Protein-nucleic acid interactions play a critical role in all steps of the gene expression pathway. Nucleic acid (NA) binding proteins interact with their partners, DNA or RNA, via distinct regions on their surface that are characterized by an ensemble of chemical, physical and geometrical properties. In this study, we introduce a novel methodology based on differential geometry, commonly used in face recognition, to characterize and predict NA binding surfaces on proteins. Applying the method to experimentally solved three-dimensional structures of proteins, we successfully distinguish double-stranded DNA (dsDNA) from single-stranded RNA (ssRNA) binding proteins with 83% accuracy. We show that the method is insensitive to conformational changes that occur upon binding and can be applied to de novo protein-function prediction. Remarkably, when concentrating on the zinc finger motif, we distinguish successfully between RNA and DNA binding interfaces possessing the same binding motif even within the same protein, as demonstrated for the RNA polymerase transcription factor TFIIIA. In conclusion, we present a novel methodology to characterize protein surfaces, which can accurately tell apart dsDNA and ssRNA binding interfaces. The strength of our method in recognizing fine-tuned differences on NA binding interfaces makes it applicable to many other molecular recognition problems, with potential implications for drug design.
Affiliation(s)
- Shula Shazman
- Department of Computer Science, Technion-Israel Institute of Technology, Haifa, Israel
|
18
|
Gupta A, Gribskov M. The role of RNA sequence and structure in RNA-protein interactions. J Mol Biol 2011; 409:574-87. [PMID: 21514302 DOI: 10.1016/j.jmb.2011.04.007] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Received: 10/29/2010] [Revised: 02/07/2011] [Accepted: 04/04/2011] [Indexed: 11/17/2022]
Abstract
We investigate the sequence and structural properties of RNA-protein interaction sites in 211 RNA-protein chain pairs, the largest set of RNA-protein complexes analyzed to date. Statistical analysis confirms and extends earlier analyses made on smaller data sets. Of the hydrogen bonds between RNA and protein, 24.6% are nucleobase specific, indicating the importance of both nucleobase-specific and -nonspecific interactions. While there is no significant difference between RNA base frequencies in protein-binding and non-binding regions, distinct preferences for RNA bases, RNA structural states, protein residues, and protein secondary structure emerge when nucleobase-specific and -nonspecific interactions are considered separately. Guanine nucleobase and unpaired RNA structural states are significantly preferred in nucleobase-specific interactions; however, nonspecific interactions disfavor guanine, while still favoring unpaired RNA structural states. The opposite preferences of nucleobase-specific and -nonspecific interactions for guanine may explain discrepancies between earlier studies with regard to base preferences in RNA-protein interaction regions. Preferences for amino acid residues differ significantly between nucleobase-specific and -nonspecific interactions, with nonspecific interactions showing the expected bias towards positively charged residues. Irregular protein structures are strongly favored in interactions with the protein backbone, whereas there is little preference for specific protein secondary structure in either nucleobase-specific or -nonspecific interactions. Overall, this study shows strong preferences for both RNA bases and RNA structural states in protein-RNA interactions, indicating their mutual importance in protein recognition.
Affiliation(s)
- Aditi Gupta
- Department of Biological Sciences, Purdue University, Hockmeyer Hall of Structural Biology, West Lafayette, IN 47907, USA
|
19
|
Pancaldi V, Bähler J. In silico characterization and prediction of global protein-mRNA interactions in yeast. Nucleic Acids Res 2011; 39:5826-36. [PMID: 21459850 PMCID: PMC3152324 DOI: 10.1093/nar/gkr160] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Indexed: 11/15/2022] Open
Abstract
Post-transcriptional gene regulation is mediated through complex networks of protein-RNA interactions. The targets of only a few RNA binding proteins (RBPs) are known, even in the well-characterized budding yeast. In silico prediction of protein-RNA interactions is therefore useful to guide experiments and to provide insight into regulatory networks. Computational approaches have identified RBP targets based on sequence binding preferences. We investigate here to what extent RBP-RNA interactions can be predicted based on RBP and mRNA features other than sequence motifs. We analyze global relationships between gene and protein properties in general and between selected RBPs and known mRNA targets in particular. Highly translated RBPs tend to bind to shorter transcripts, and transcripts bound by the same RBP show high expression correlation across different biological conditions. Surprisingly, a given RBP preferentially binds to mRNAs that encode interaction partners for this RBP, suggesting coordinated post-transcriptional auto-regulation of protein complexes. We apply a machine-learning approach to predict specific RBP targets in yeast. Although this approach performs well for RBPs with known targets, predictions for uncharacterized RBPs remain challenging due to limiting experimental data. We also predict targets of fission yeast RBPs, indicating that the suggested framework could be applied to other species once more experimental data are available.
Affiliation(s)
- Vera Pancaldi
- Department of Genetics, Evolution & Environment and UCL Cancer Institute, University College London, Gower Street, London WC1E 6BT, UK.
|
20
|
Abstract
Rapid improvements in high-throughput experimental technologies now make it possible to study in detail the expression, as well as changes in expression, of whole transcriptomes under different environmental conditions. We describe current approaches to identify genome-wide functional RNA transcripts (experimentally as well as computationally), and focus on computational methods that may be used to disclose their function. While genome databases offer a wealth of information about known and putative functions for protein-coding genes, functional information for novel non-coding RNA genes is almost nonexistent. This is mainly explained by the lack of established software tools to efficiently reveal the function and evolutionary origin of non-coding RNA genes. Here, we describe in detail computational approaches one may follow to annotate and classify an RNA transcript.
Affiliation(s)
- Kristin Reiche
- Fraunhofer Institute for Cell Therapy and Immunology, Leipzig, Germany
|
21
|
Abstract
RNA localisation is an important mode of delivering proteins to their site of function. Cis-acting signals within the RNAs, which can be thought of as zip-codes, determine the site of localisation. There are few examples of fully characterised RNA signals, but the signals are thought to be defined through a combination of primary, secondary, and tertiary structures. In this chapter, we describe a selection of computational methods for predicting RNA secondary structure, identifying localisation signals, and searching for similar localisation signals on a genome-wide scale. The chapter is aimed at the biologist rather than presenting the details of each of the individual methods.
|
22
|
Xie ZR, Hwang MJ. An interaction-motif-based scoring function for protein-ligand docking. BMC Bioinformatics 2010; 11:298. [PMID: 20525216 PMCID: PMC3098071 DOI: 10.1186/1471-2105-11-298] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Received: 02/24/2010] [Accepted: 06/02/2010] [Indexed: 01/08/2023] Open
Abstract
Background A good scoring function is essential for molecular docking computations. In conventional scoring functions, energy terms modeling pairwise interactions are cumulatively summed, and the best docking solution is selected. Here, we propose to transform protein-ligand interactions into three-dimensional geometric networks, from which recurring network substructures, or network motifs, are selected and used to provide probability-ranked interaction templates with which to score docking solutions. Results A novel scoring function for protein-ligand docking, MotifScore, was developed. It is non-energy-based, and docking is, instead, scored by counting the occurrences of motifs of protein-ligand interaction networks constructed using structures of protein-ligand complexes. MotifScore has been tested on a benchmark set established by others to assess its ability to identify near-native complex conformations among a set of decoys. In this benchmark test, 84% of the highest-scored docking conformations had root-mean-square deviations (rmsds) below 2.0 Å from the native conformation, which is comparable with the best of several energy-based docking scoring functions. Many of the top motifs, which comprise a multitude of chemical groups that interact simultaneously and make a highly significant contribution to MotifScore, capture recurrent interacting patterns beyond pairwise interactions. Conclusions While providing quite good docking scores, MotifScore is quite different from conventional energy-based functions. MotifScore thus represents a new, network-based approach for exploring problems associated with molecular docking.
Affiliation(s)
- Zhong-Ru Xie
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan
|
23
|
Liu ZP, Wu LY, Wang Y, Zhang XS, Chen L. Prediction of protein-RNA binding sites by a random forest method with combined features. Bioinformatics 2010; 26:1616-22. [PMID: 20483814 DOI: 10.1093/bioinformatics/btq253] [Citation(s) in RCA: 110] [Impact Index Per Article: 7.3] [Indexed: 11/14/2022]
Abstract
MOTIVATION Protein-RNA interactions play a key role in a number of biological processes, such as protein synthesis, mRNA processing, mRNA assembly, ribosome function and eukaryotic spliceosomes. As a result, reliable identification of the RNA binding sites of a protein is important for functional annotation and site-directed mutagenesis. Accumulated data on experimental protein-RNA interactions reveal that an RNA binding residue with different neighbor amino acids often exhibits different preferences for its RNA partners, which in turn can be assessed by the interacting interdependence of the amino acid fragment and RNA nucleotide. RESULTS In this work, we propose a novel classification method to identify the RNA binding sites in proteins by combining a new interacting feature (interaction propensity) with other sequence- and structure-based features. Specifically, the interaction propensity represents the binding specificity of a protein residue to the interacting RNA nucleotide by considering its two-side neighborhood in a protein residue triplet. The sequence- and structure-based features of the residues are combined to discriminate the interaction propensity of amino acids with RNA. We predict RNA interacting residues in proteins by implementing a well-built random forest classifier. The experiments show that our method is able to detect the annotated protein-RNA interaction sites with high accuracy. Our method achieves an accuracy of 84.5%, an F-measure of 0.85 and an AUC of 0.92 in predicting the RNA binding residues for a dataset containing 205 non-homologous RNA binding proteins, and also outperforms several existing RNA binding residue predictors, such as RNABindR, BindN, RNAProB and PPRint, and some alternative machine learning methods, such as support vector machine, naive Bayes and neural network, in the comparison study. 
Furthermore, we provide some biological insights into the roles of sequences and structures in protein-RNA interactions by both evaluating the importance of features for their contributions in predictive accuracy and analyzing the binding patterns of interacting residues. AVAILABILITY All the source data and code are available at http://www.aporc.org/doc/wiki/PRNA or http://www.sysbio.ac.cn/datatools.asp CONTACT lnchen@sibs.ac.cn SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhi-Ping Liu
- Key Laboratory of Systems Biology, SIBS-Novo Nordisk Translational Research Centre for PreDiabetes, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
|
24
|
Zhou P, Zou J, Tian F, Shang Z. Geometric similarity between protein-RNA interfaces. J Comput Chem 2010; 30:2738-51. [PMID: 19399760 DOI: 10.1002/jcc.21300] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
Abstract
A new method is described to measure the geometric similarity between protein-RNA interfaces quantitatively. The method is based on a procedure that dissects the interface geometry in terms of the spatial relationships between individual amino acid-nucleotide pairs. Using this technique, we performed an all-on-all comparison of 586 protein-RNA interfaces deposited in the current Protein Data Bank; as a result, an interface-interface similarity score matrix was obtained. Based upon this matrix, hierarchical clustering was carried out, which yielded a complete clustering tree for the 586 protein-RNA interfaces. By investigating the organizing behavior of the clustering tree and the SCOP classification of protein partners in complexes, a geometrically nonredundant, diverse data set (representative data set) consisting of 45 distinct protein-RNA interfaces was extracted for the purpose of studying protein-RNA interactions, RNA regulation, and drug design. We classified protein-RNA interfaces into three types. In type I, the families and interface structural classes of the protein partners, as well as the interface geometries, are all similar. In type II, the interface geometries and the interface structural classes are similar, whereas the protein families are different. In type III, only the interface geometries are similar, but the protein families and the interface structural classes are distinct. Furthermore, we also show two new RNA recognition themes derived from the representative data set.
Affiliation(s)
- Peng Zhou
- Institute of Molecular Design and Molecular Thermodynamics, Department of Chemistry, Zhejiang University, Hangzhou 310027, China
|
25
|
Levy R, Edelman M, Sobolev V. Prediction of 3D metal binding sites from translated gene sequences based on remote-homology templates. Proteins 2010; 76:365-74. [PMID: 19173310 DOI: 10.1002/prot.22352] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 11/05/2022]
Abstract
Database-scale analysis was performed to determine whether structural models, based on remote homologues, are effective in predicting 3D transition metal binding sites in proteins directly from translated gene sequences. The extent to which side chain modeling alone reduces sensitivity and selectivity is shown to be <10%. Surprisingly, selectivity was not dependent on the level of sequence homology between template and target, or on the presence of a metal ion in the structural template. Applying a modification of the CHED algorithm (Babor et al., Proteins 2008;70:208-217) and machine learning filters, a selectivity of approximately 90% was achieved for protein sequences using unrelated structural templates over a sequence identity range of 18-100%. Below approximately 18% identity, the number of analyzable target-template pairs and the predictability of metal binding sites fall off sharply. A full third of structural templates were found to have target partners only in the remote homology range of 18-30%. In this range, nonmetal-binding templates are calculated to be the majority and serve to predict with 50% sensitivity at the geometric level. Overall, sensitivity at the geometric level for targets having templates in the 18-30% sequence identity range is 73%, with an average of one false positive site per true site. Protein sequences described as "unknown" in the UniProt database and composed largely of unidentified genome project sequences were studied and metal binding sites predicted. A web server for prediction of metal binding sites from protein sequence is provided.
Affiliation(s)
- Ronen Levy
- Department of Plant Sciences, Weizmann Institute of Science, Rehovot, Israel
|
26
|
Ciriello G, Gallina C, Guerra C. Analysis of interactions between ribosomal proteins and RNA structural motifs. BMC Bioinformatics 2010; 11 Suppl 1:S41. [PMID: 20122215 PMCID: PMC3009514 DOI: 10.1186/1471-2105-11-s1-s41] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 01/14/2023] Open
Abstract
Background One important goal of structural bioinformatics is to recognize and predict the interactions between protein binding sites and RNA. Recently, a comprehensive analysis of ribosomal proteins and their interactions with rRNA has been done. Interesting results emerged from the comparison of r-proteins within the small subunit in T. thermophilus and E. coli, supporting the idea of a core made by both RNA and proteins, conserved by evolution. Recent work also showed that ribosomal RNA is modularly composed. Motifs are generally single-stranded sequences of consecutive nucleotides (ssRNA) with characteristic folding. The role of these motifs in protein-RNA interactions has so far been only sparsely investigated. Results This work explores the role of RNA structural motifs in the interaction of proteins with ribosomal RNA (rRNA). We analyze composition, local geometries and conformation of interface regions involving motifs such as tetraloops, kink-turns and single extruded nucleotides. We construct an interaction map of protein binding sites that allows us to identify the common types of shared 3-D physicochemical binding patterns for tetraloops. Furthermore, we investigate the protein binding pockets that accommodate single extruded nucleotides either involved in kink-turns or in arbitrary RNA strands. This analysis reveals a new structural motif, called tripod. It corresponds to small pockets consisting of three amino acids arranged at the vertices of an almost equilateral triangle. We developed a search procedure for the recognition of tripods, based on an empirical tripod fingerprint. Conclusion A comparative analysis with the overall RNA surface and interfaces shows that contact surfaces involving RNA motifs have distinctive features that may be useful for the recognition and prediction of interactions.
Affiliation(s)
- Giovanni Ciriello
- Dept. of Information Engineering, University of Padova, Via Gradenigo 6a, 35131 Padova, Italy.
|
27
|
Sutch BT, Chambers EJ, Bayramyan MZ, Gallaher TK, Haworth IS. Similarity of Protein-RNA Interfaces Based on Motif Analysis. J Chem Inf Model 2009; 49:2139-46. [DOI: 10.1021/ci900154a] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/29/2022]
Affiliation(s)
- Brian T. Sutch
- Department of Pharmacology & Pharmaceutical Sciences, University of Southern California, Los Angeles, California 90089-9121
- Eric J. Chambers
- Department of Pharmacology & Pharmaceutical Sciences, University of Southern California, Los Angeles, California 90089-9121
- Melina Z. Bayramyan
- Department of Pharmacology & Pharmaceutical Sciences, University of Southern California, Los Angeles, California 90089-9121
- Timothy K. Gallaher
- Department of Pharmacology & Pharmaceutical Sciences, University of Southern California, Los Angeles, California 90089-9121
- Ian S. Haworth
- Department of Pharmacology & Pharmaceutical Sciences, University of Southern California, Los Angeles, California 90089-9121
|
28
|
Czyżnikowska Ż, Lipkowski P, Góra RW, Zaleśny R, Cheng AC. On the Nature of Intermolecular Interactions in Nucleic Acid Base-Amino Acid Side-Chain Complexes. J Phys Chem B 2009; 113:11511-20. [DOI: 10.1021/jp904146m] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 12/20/2022]
Affiliation(s)
- Ż. Czyżnikowska
- Theoretical Chemistry Group, Institute of Physical and Theoretical Chemistry, Wrocław University of Technology, Wyb. Wyspiańskiego 27, 50-370 Wrocław, Poland, and Department of Molecular Structure, Amgen Inc., One Kendall Square, Building 1000, Cambridge, Massachusetts 02139
- P. Lipkowski
- Theoretical Chemistry Group, Institute of Physical and Theoretical Chemistry, Wrocław University of Technology, Wyb. Wyspiańskiego 27, 50-370 Wrocław, Poland, and Department of Molecular Structure, Amgen Inc., One Kendall Square, Building 1000, Cambridge, Massachusetts 02139
- R. W. Góra
- Theoretical Chemistry Group, Institute of Physical and Theoretical Chemistry, Wrocław University of Technology, Wyb. Wyspiańskiego 27, 50-370 Wrocław, Poland, and Department of Molecular Structure, Amgen Inc., One Kendall Square, Building 1000, Cambridge, Massachusetts 02139
- R. Zaleśny
- Theoretical Chemistry Group, Institute of Physical and Theoretical Chemistry, Wrocław University of Technology, Wyb. Wyspiańskiego 27, 50-370 Wrocław, Poland, and Department of Molecular Structure, Amgen Inc., One Kendall Square, Building 1000, Cambridge, Massachusetts 02139
- A. C. Cheng
- Theoretical Chemistry Group, Institute of Physical and Theoretical Chemistry, Wrocław University of Technology, Wyb. Wyspiańskiego 27, 50-370 Wrocław, Poland, and Department of Molecular Structure, Amgen Inc., One Kendall Square, Building 1000, Cambridge, Massachusetts 02139
|
29
|
Wallach I, Lilien RH. Prediction of sub-cavity binding preferences using an adaptive physicochemical structure representation. Bioinformatics 2009; 25:i296-304. [PMID: 19478002 PMCID: PMC2687958 DOI: 10.1093/bioinformatics/btp204] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 11/24/2022] Open
Abstract
MOTIVATION The ability to predict binding profiles for an arbitrary protein can significantly improve the areas of drug discovery, lead optimization and protein function prediction. At present, there are no successful algorithms capable of predicting binding profiles for novel proteins. Existing methods typically rely on manually curated templates or entire active site comparison. Consequently, they perform best when analyzing proteins sharing significant structural similarity with known proteins (i.e. proteins resulting from divergent evolution). These methods fall short when used to characterize the binding profile of a novel active site or one for which a template is not available. In contrast to previous approaches, our method characterizes the binding preferences of sub-cavities within the active site by exploiting a large set of known protein-ligand complexes. The uniqueness of our approach lies not only in the consideration of sub-cavities, but also in the more complete structural representation of these sub-cavities, their parametrization and the method by which they are compared. By only requiring local structural similarity, we are able to leverage previously unused structural information and perform binding inference for proteins that do not share significant structural similarity with known systems. RESULTS Our algorithm demonstrates the ability to accurately cluster similar sub-cavities and to predict binding patterns across a diverse set of protein-ligand complexes. When applied to two high-profile drug targets, our algorithm successfully generates a binding profile that is consistent with known inhibitors. The results suggest that our algorithm should be useful in structure-based drug discovery and lead optimization.
Affiliation(s)
- Izhar Wallach
- Department of Computer Science, Donnelly Centre for Cellular and Biomolecular Research and Banting and Best, University of Toronto, Toronto, Ontario, Canada.
|
30
|
Kalinina OV, Gelfand MS, Russell RB. Combining specificity determining and conserved residues improves functional site prediction. BMC Bioinformatics 2009; 10:174. [PMID: 19508719 PMCID: PMC2709924 DOI: 10.1186/1471-2105-10-174] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 01/23/2009] [Accepted: 06/09/2009] [Indexed: 11/16/2022] Open
Abstract
Background Predicting the location of functionally important sites from protein sequence and/or structure is a long-standing problem in computational biology. Most current approaches make use of sequence conservation, assuming that amino acid residues conserved within a protein family are most likely to be functionally important. Most often these approaches do not consider many residues that act to define specific sub-functions within a family, or they make no distinction between residues important for function and those more relevant for maintaining structure (e.g. in the hydrophobic core). Many protein families bind and/or act on a variety of ligands, meaning that conserved residues often only bind a common ligand sub-structure or perform general catalytic activities. Results Here we present a novel method for functional site prediction based on identification of conserved positions, as well as those responsible for determining ligand specificity. We define Specificity-Determining Positions (SDPs) as those occupied by residues that are conserved within sub-groups of proteins in a family having a common specificity but that differ between groups, and are thus likely to account for specific recognition events. We benchmark the approach on enzyme families of known 3D structure with bound substrates, and find that in nearly all families residues predicted by SDPsite are in contact with the bound substrate, and that the addition of SDPs significantly improves functional site prediction accuracy. We apply SDPsite to various families of proteins containing known three-dimensional structures, but lacking clear functional annotations, and discuss several illustrative examples. Conclusion The results suggest a better means to predict functional details for the thousands of protein structures determined prior to a clear understanding of molecular function.
|
31
|
Lee S, Blundell TL. BIPA: a database for protein-nucleic acid interaction in 3D structures. Bioinformatics 2009; 25:1559-60. [PMID: 19357098 DOI: 10.1093/bioinformatics/btp243] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Indexed: 11/13/2022]
Abstract
UNLABELLED BIPA is a database for protein-nucleic acid interactions in 3D structures. The database provides various physicochemical features of protein-nucleic acid interface such as size, shape, residue propensity, secondary structure composition and intermolecular interactions. The database also contains multiple structural alignments of nucleic acid-binding protein families with annotations of local environments in order to allow definition of features that influence acceptability of mutations at a particular position in a protein family. A web interface has been designed to present the results of these analyses and facilitate navigation of protein-nucleic acid interfaces. AVAILABILITY http://www-cryst.bioc.cam.ac.uk/bipa SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Semin Lee
- Department of Biochemistry, University of Cambridge, Old Addenbrooke's Site, Cambridge, UK.
|
32
|
Relating Macromolecular Function and Association: The Structural Basis of Protein–DNA and RNA Recognition. Cell Mol Bioeng 2008. [DOI: 10.1007/s12195-008-0032-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 10/21/2022] Open
|
33
|
Shulman-Peleg A, Nussinov R, Wolfson HJ. RsiteDB: a database of protein binding pockets that interact with RNA nucleotide bases. Nucleic Acids Res 2008; 37:D369-73. [PMID: 18953028 PMCID: PMC2686467 DOI: 10.1093/nar/gkn759] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Indexed: 11/26/2022] Open
Abstract
We present a new database and an on-line search engine, which store and query the protein binding pockets that interact with single-stranded RNA nucleotide bases. The database consists of a classification of binding sites derived from protein–RNA complexes. Each binding site is assigned to a cluster of similar binding sites in other protein–RNA complexes. Cluster members share similar spatial arrangements of physicochemical properties, and thus can reveal novel similarity between proteins and RNAs with different sequences and folds. The clusters provide 3D consensus binding patterns important for protein–nucleotide recognition. The database search engine allows two types of useful queries: first, given a PDB code of a protein–RNA complex, RsiteDB can detail and classify the properties of the protein binding pockets accommodating extruded RNA nucleotides not involved in local RNA base pairing. Second, given an unbound protein structure, RsiteDB can perform an on-line structural search against the constructed database of 3D consensus binding patterns. Regions similar to known patterns are predicted to serve as binding sites. Alignment of the query to these patterns with their corresponding RNA nucleotides allows making unique predictions of the protein–RNA interactions at the atomic level of detail. This database is accessible at http://bioinfo3d.cs.tau.ac.il/RsiteDB.
Affiliation(s)
- Alexandra Shulman-Peleg
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel.
|