1
Hamre JR, Klimov DK, McCoy MD, Jafri MS. Machine learning-based prediction of drug and ligand binding in BCL-2 variants through molecular dynamics. Comput Biol Med 2022; 140:105060. [PMID: 34920365 DOI: 10.1016/j.compbiomed.2021.105060] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/20/2021] [Revised: 11/13/2021] [Accepted: 11/20/2021] [Indexed: 12/13/2022]
Abstract
Venetoclax is a BH3 (BCL-2 Homology 3) mimetic used to treat leukemia and lymphoma; it inhibits the anti-apoptotic BCL-2 protein, thereby promoting apoptosis of cancerous cells. Acquired resistance to Venetoclax via specific variants in BCL-2 is a major obstacle to the successful treatment of cancer patients. Replica exchange molecular dynamics (REMD) simulations combined with machine learning were used to define the average structure of the variants in aqueous solution and to predict changes in drug and ligand binding in BCL-2 variants. The variant structures all show shifts in residue positions that occlude the binding groove, and these are the primary contributors to drug resistance. Correspondingly, we established a method that can predict the severity of a variant, as measured by the inhibitory constant (Ki) of Venetoclax, by measuring the structural deviations of the binding cleft. In addition, we applied machine learning to the phi and psi angles of the amino acid backbone across the ensemble of conformations, demonstrating a generalizable method for predicting drug resistance in BCL-2 proteins that elucidates changes where a detailed understanding of the structure-function relationship is less clear.
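The abstract does not say how the backbone angles were extracted from the simulated conformations. As a minimal sketch of the underlying geometry only, assuming Cartesian coordinates of four consecutive backbone atoms are available, a dihedral angle can be computed as below. For residue i, phi uses the atoms C(i-1), N(i), CA(i), C(i) and psi uses N(i), CA(i), C(i), N(i+1). The function name `dihedral_deg` and the helper names are illustrative and do not come from the paper.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points.

    The angle is measured about the p1 -> p2 axis between the
    p0-p1-p2 plane and the p1-p2-p3 plane.
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)

    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)

    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))

    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

Applied to every residue of every conformation, a function like this would yield one (phi, psi) pair per residue per frame, which is the kind of feature table a classifier or regressor could be trained on. How the authors actually built their features is not stated here.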
Affiliation(s)
- John R Hamre
- School of Systems Biology, George Mason University, Manassas, VA, USA.
| | - Dmitri K Klimov
- School of Systems Biology, George Mason University, Manassas, VA, USA.
| | - Matthew D McCoy
- Innovation Center for Biomedical Informatics, Department of Oncology, Georgetown University Medical Center, Georgetown University, Washington DC, USA.
| | - M Saleet Jafri
- School of Systems Biology, George Mason University, Fairfax, VA and Center for Biomedical Technology and Engineering, University of Maryland School of Medicine, Baltimore, MD, USA.
2
Basu S, Assaf SS, Teheux F, Rooman M, Pucci F. BRANEart: Identify Stability Strength and Weakness Regions in Membrane Proteins. Frontiers in Bioinformatics 2021; 1:742843. [PMID: 36303753 PMCID: PMC9581023 DOI: 10.3389/fbinf.2021.742843] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/16/2021] [Accepted: 11/03/2021] [Indexed: 11/22/2022]
Abstract
Understanding the role of stability strengths and weaknesses in proteins is a key objective for rationalizing their dynamical and functional properties such as conformational changes, catalytic activity, and protein-protein and protein-ligand interactions. We present BRANEart, a new, fast and accurate method to evaluate the per-residue contributions to the overall stability of membrane proteins. It is based on an extended set of recently introduced statistical potentials derived from membrane protein structures, which better describe the stability properties of this class of proteins than standard potentials derived from globular proteins. We defined a per-residue membrane propensity index from combinations of these potentials, which can be used to identify residues that strongly contribute to the stability of the transmembrane region or that would, on the contrary, be more stable in extramembrane regions, or vice versa. Large-scale application to membrane and globular protein sets and application to test cases show excellent agreement with experimental data. BRANEart thus appears as a useful instrument to analyze in detail the overall stability properties of a target membrane protein, to position it relative to the lipid bilayer, and to rationally modify its biophysical characteristics and function. BRANEart can be freely accessed from http://babylone.3bio.ulb.ac.be/BRANEart.
Affiliation(s)
- Sankar Basu
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
- Department of Microbiology, Asutosh College, under University of Calcutta, Kolkata, India
| | - Simon S. Assaf
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
| | - Fabian Teheux
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
| | - Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Brussels, Belgium
- *Correspondence: Marianne Rooman; Fabrizio Pucci
| | - Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Brussels, Belgium
- *Correspondence: Marianne Rooman; Fabrizio Pucci
3
Lensink MF, Brysbaert G, Mauri T, Nadzirin N, Velankar S, Chaleil RAG, Clarence T, Bates PA, Kong R, Liu B, Yang G, Liu M, Shi H, Lu X, Chang S, Roy RS, Quadir F, Liu J, Cheng J, Antoniak A, Czaplewski C, Giełdoń A, Kogut M, Lipska AG, Liwo A, Lubecka EA, Maszota-Zieleniak M, Sieradzan AK, Ślusarz R, Wesołowski PA, Zięba K, Del Carpio Muñoz CA, Ichiishi E, Harmalkar A, Gray JJ, Bonvin AMJJ, Ambrosetti F, Vargas Honorato R, Jandova Z, Jiménez-García B, Koukos PI, Van Keulen S, Van Noort CW, Réau M, Roel-Touris J, Kotelnikov S, Padhorny D, Porter KA, Alekseenko A, Ignatov M, Desta I, Ashizawa R, Sun Z, Ghani U, Hashemi N, Vajda S, Kozakov D, Rosell M, Rodríguez-Lumbreras LA, Fernandez-Recio J, Karczynska A, Grudinin S, Yan Y, Li H, Lin P, Huang SY, Christoffer C, Terashi G, Verburgt J, Sarkar D, Aderinwale T, Wang X, Kihara D, Nakamura T, Hanazono Y, Gowthaman R, Guest JD, Yin R, Taherzadeh G, Pierce BG, Barradas-Bautista D, Cao Z, Cavallo L, Oliva R, Sun Y, Zhu S, Shen Y, Park T, Woo H, Yang J, Kwon S, Won J, Seok C, Kiyota Y, Kobayashi S, Harada Y, Takeda-Shitaka M, Kundrotas PJ, Singh A, Vakser IA, Dapkūnas J, Olechnovič K, Venclovas Č, Duan R, Qiu L, Xu X, Zhang S, Zou X, Wodak SJ. Prediction of protein assemblies, the next frontier: The CASP14-CAPRI experiment. Proteins 2021; 89:1800-1823. [PMID: 34453465 PMCID: PMC8616814 DOI: 10.1002/prot.26222] [Citation(s) in RCA: 82] [Impact Index Per Article: 20.5] [Received: 05/20/2021] [Revised: 07/24/2021] [Accepted: 08/05/2021] [Indexed: 12/19/2022]
Abstract
We present the results for CAPRI Round 50, the fourth joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of twelve targets, including six dimers, three trimers, and three higher-order oligomers. Four of these were easy targets, for which good structural templates were available either for the full assembly, or for the main interfaces (of the higher-order oligomers). Eight were difficult targets for which only distantly related templates were found for the individual subunits. Twenty-five CAPRI groups, including eight automatic servers, submitted ~1250 models per target. Twenty groups, including six servers, participated in the CAPRI scoring challenge and submitted ~190 models per target. The accuracy of the predicted models was evaluated using the classical CAPRI criteria. The prediction performance was measured by a weighted scoring scheme that takes into account the number of models of acceptable quality or higher submitted by each group as part of their five top-ranking models. Compared to the previous CASP-CAPRI challenge, top performing groups submitted such models for a larger fraction (70-75%) of the targets in this Round, but fewer of these models were of high accuracy. Scorer groups achieved stronger performance, with more groups submitting correct models for 70-80% of the targets or achieving high accuracy predictions. Servers performed less well in general, except for the MDOCKPP and LZERD servers, which performed on par with human groups. In addition to these results, major advances in methodology are discussed, providing an informative overview of where the prediction of protein assemblies currently stands.
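The abstract names a weighted scoring scheme over each group's five top-ranking models but does not give the weights or the exact formula. The sketch below is one hypothetical way such a score could be computed, not the scheme used in the paper: it assumes that each target contributes the weight of the best-quality model among the first five submitted, with assumed weights of 1, 2 and 3 for acceptable, medium and high quality. The names `QUALITY_WEIGHT`, `best_in_top5` and `group_score` are illustrative.

```python
# Assumed weights per CAPRI quality class; the abstract does not state them.
QUALITY_WEIGHT = {"incorrect": 0, "acceptable": 1, "medium": 2, "high": 3}

def best_in_top5(ranked_qualities):
    """Return the best quality class among the first five ranked models."""
    top5 = ranked_qualities[:5]
    return max(top5, key=QUALITY_WEIGHT.__getitem__, default="incorrect")

def group_score(models_by_target):
    """Sum, over targets, the weight of the best model in the top five.

    models_by_target maps a target identifier to the list of quality
    classes of a group's models, in the group's own ranking order.
    """
    return sum(
        QUALITY_WEIGHT[best_in_top5(qualities)]
        for qualities in models_by_target.values()
    )

# Toy data for three hypothetical targets.
example = {
    "T1": ["incorrect", "medium", "acceptable"],
    "T2": ["incorrect"] * 5,
    "T3": ["incorrect"] * 5 + ["high"],  # the high-quality model is ranked sixth
}
```

With the toy data above, only the medium-quality model for `T1` contributes, because the high-quality model for `T3` falls outside the top five; this illustrates why ranking matters as much as model quality under a top-five scheme.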
Affiliation(s)
- Marc F Lensink
- CNRS UMR8576 UGSF, Institute for Structural and Functional Glycobiology, University of Lille, Lille, France
| | - Guillaume Brysbaert
- CNRS UMR8576 UGSF, Institute for Structural and Functional Glycobiology, University of Lille, Lille, France
| | - Théo Mauri
- CNRS UMR8576 UGSF, Institute for Structural and Functional Glycobiology, University of Lille, Lille, France
| | - Nurul Nadzirin
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Sameer Velankar
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | | | - Tereza Clarence
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
| | - Paul A Bates
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
| | - Ren Kong
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Bin Liu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Guangbo Yang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Ming Liu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Hang Shi
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Xufeng Lu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Shan Chang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Raj S Roy
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Farhan Quadir
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Jian Liu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA
| | - Anna Antoniak
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
| | | | - Artur Giełdoń
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
| | - Mateusz Kogut
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
| | | | - Adam Liwo
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
| | - Emilia A Lubecka
- Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Gdansk, Poland
| | | | | | - Rafał Ślusarz
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
| | - Patryk A Wesołowski
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
- Intercollegiate Faculty of Biotechnology, University of Gdansk and Medical University of Gdansk, Gdansk, Poland
| | - Karolina Zięba
- Faculty of Chemistry, University of Gdansk, Gdansk, Poland
| | | | - Eiichiro Ichiishi
- International University of Health and Welfare Hospital (IUHW Hospital), Nasushiobara City, Japan
| | - Ameya Harmalkar
- Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
| | - Jeffrey J Gray
- Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
| | - Alexandre M J J Bonvin
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Francesco Ambrosetti
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Rodrigo Vargas Honorato
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Zuzana Jandova
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Brian Jiménez-García
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Panagiotis I Koukos
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Siri Van Keulen
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Charlotte W Van Noort
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Manon Réau
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Jorge Roel-Touris
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Sergei Kotelnikov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
- Innopolis University, Russia
| | - Dzmitry Padhorny
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Kathryn A Porter
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Andrey Alekseenko
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
- Institute of Computer-Aided Design of the Russian Academy of Sciences, Moscow, Russia
| | - Mikhail Ignatov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Israel Desta
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Ryota Ashizawa
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Zhuyezi Sun
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Usman Ghani
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Nasser Hashemi
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Department of Chemistry, Boston University, Boston, Massachusetts, USA
| | - Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Mireia Rosell
- Instituto de Ciencias de la Vid y del Vino (ICVV), CSIC - Universidad de la Rioja - Gobierno de La Rioja, Logrono, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
| | - Luis A Rodríguez-Lumbreras
- Instituto de Ciencias de la Vid y del Vino (ICVV), CSIC - Universidad de la Rioja - Gobierno de La Rioja, Logrono, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
| | - Juan Fernandez-Recio
- Instituto de Ciencias de la Vid y del Vino (ICVV), CSIC - Universidad de la Rioja - Gobierno de La Rioja, Logrono, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
| | | | - Sergei Grudinin
- Université Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble, France
| | - Yumeng Yan
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
| | - Hao Li
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
| | - Peicong Lin
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
| | - Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
| | - Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
| | - Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
| | - Daipayan Sarkar
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
| | - Tunde Aderinwale
- Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
| | - Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
| | - Tsukasa Nakamura
- Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi, Japan
| | - Yuya Hanazono
- Institute for Quantum Life Science, National Institutes for Quantum and Radiological Science and Technology, Tokai, Ibaraki, Japan
| | - Ragul Gowthaman
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland, USA
| | - Johnathan D Guest
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland, USA
| | - Rui Yin
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland, USA
| | - Ghazaleh Taherzadeh
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland, USA
| | - Brian G Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland, USA
| | | | - Zhen Cao
- King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - Luigi Cavallo
- King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - Romina Oliva
- University of Naples "Parthenope", Napoli, Italy
| | - Yuanfei Sun
- Department of Electrical and Computer Engineering, Texas A&M University, Texas, USA
| | - Shaowen Zhu
- Department of Electrical and Computer Engineering, Texas A&M University, Texas, USA
| | - Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, Texas, USA
| | - Taeyong Park
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Hyeonuk Woo
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Jinsol Yang
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Sohee Kwon
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Jonghun Won
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Yasuomi Kiyota
- School of Pharmacy, Kitasato University, Minato-ku, Tokyo, Japan
| | | | - Yoshiki Harada
- School of Pharmacy, Kitasato University, Minato-ku, Tokyo, Japan
| | | | - Petras J Kundrotas
- Computational Biology Program and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, USA
| | - Amar Singh
- Computational Biology Program and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, USA
| | - Ilya A Vakser
- Computational Biology Program and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, USA
| | - Justas Dapkūnas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Rui Duan
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
| | - Liming Qiu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
| | - Xianjin Xu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
| | - Shuang Zhang
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
| | - Xiaoqin Zou
- Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Department of Physics and Astronomy, University of Missouri, Columbia, Missouri, USA
- Department of Biochemistry, University of Missouri, Columbia, Missouri, USA
4
Hou Q, Pucci F, Ancien F, Kwasigroch JM, Bourgeas R, Rooman M. SWOTein: a structure-based approach to predict stability Strengths and Weaknesses of prOTEINs. Bioinformatics 2021; 37:1963–1971. [PMID: 33471089 DOI: 10.1093/bioinformatics/btab034] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/12/2020] [Revised: 12/05/2020] [Accepted: 01/15/2021] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Although structured proteins adopt their lowest free energy conformation in physiological conditions, the individual residues are generally not in their lowest free energy conformation. Residues that are stability weaknesses are often involved in functional regions, whereas stability strengths ensure local structural stability. The detection of strengths and weaknesses provides key information to guide protein engineering experiments aiming to modulate folding and various functional processes. RESULTS: We developed the SWOTein predictor which identifies strong and weak residues in proteins on the basis of three types of statistical energy functions describing local interactions along the chain, hydrophobic forces and tertiary interactions. The large-scale analysis of the different types of strengths and weaknesses demonstrated their complementarity and the enhancement of the information they provide. Moreover, a good average correlation was observed between predicted and experimental strengths and weaknesses obtained from native hydrogen exchange data. SWOTein application to three test cases further showed its suitability to predict and interpret strong and weak residues in the context of folding, conformational changes and protein-protein binding. In summary, SWOTein is both fast and accurate and can be applied at small and large scale to analyze and modulate folding and molecular recognition processes. AVAILABILITY: The SWOTein webserver provides the list of predicted strengths and weaknesses and a protein structure visualization tool that facilitates the interpretation of the predictions. It is freely available for academic use at http://babylone.ulb.ac.be/SWOTein/.
Affiliation(s)
- Qingzhen Hou
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Shandong 250002, P. R. China; National Institute of Health Data Science of China, Shandong University, Shandong 250002, P. R. China; Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium
| | - Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Boulevard du Triomphe, 1050 Brussels, Belgium
| | - François Ancien
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Boulevard du Triomphe, 1050 Brussels, Belgium
| | - Jean-Marc Kwasigroch
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium
| | - Raphaël Bourgeas
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium
| | - Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Boulevard du Triomphe, 1050 Brussels, Belgium
5
Hou Q, Kwasigroch JM, Rooman M, Pucci F. SOLart: a structure-based method to predict protein solubility and aggregation. Bioinformatics 2020; 36:1445-1452. [PMID: 31603466 DOI: 10.1093/bioinformatics/btz773] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 06/03/2019] [Revised: 08/31/2019] [Accepted: 10/08/2019] [Indexed: 12/12/2022]
Abstract
MOTIVATION: The solubility of a protein is often decisive for its proper functioning. Lack of solubility is a major bottleneck in high-throughput structural genomic studies and in high-concentration protein production, and the formation of protein aggregates causes a wide variety of diseases. Since solubility measurements are time-consuming and expensive, there is a strong need for solubility prediction tools. RESULTS: We have recently introduced solubility-dependent distance potentials that are able to unravel the role of residue-residue interactions in promoting or decreasing protein solubility. Here, we extended their construction by defining solubility-dependent potentials based on backbone torsion angles and solvent accessibility, and integrated them, together with other structure- and sequence-based features, into a random forest model trained on a set of Escherichia coli proteins with experimental structures and solubility values. We thus obtained the SOLart protein solubility predictor, whose most informative features turned out to be folding free energy differences computed from our solubility-dependent statistical potentials. SOLart performs very well, with a Pearson correlation coefficient between experimental and predicted solubility values of almost 0.7 both in cross-validation on the training dataset and in an independent set of Saccharomyces cerevisiae proteins. On test sets of modeled structures, only a limited drop in performance is observed. SOLart can thus be used with both high-resolution and low-resolution structures, and clearly outperforms state-of-the-art solubility predictors. It is available through a user-friendly webserver, which is easy for non-expert scientists to use. AVAILABILITY AND IMPLEMENTATION: The SOLart webserver is freely available at http://babylone.ulb.ac.be/SOLART/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Qingzhen Hou
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Avenue Roosevelt 50, 1050 Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Boulevard du Triomphe, 1050 Brussels, Belgium
| | - Jean Marc Kwasigroch
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Avenue Roosevelt 50, 1050 Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Boulevard du Triomphe, 1050 Brussels, Belgium
| | - Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Avenue Roosevelt 50, 1050 Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Boulevard du Triomphe, 1050 Brussels, Belgium
| | - Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Avenue Roosevelt 50, 1050 Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Boulevard du Triomphe, 1050 Brussels, Belgium; John von Neumann Institute for Computing, Jülich Supercomputer Centre, Forschungszentrum Jülich, 52428 Jülich, Germany
6
Lensink MF, Nadzirin N, Velankar S, Wodak SJ. Modeling protein‐protein, protein‐peptide, and protein‐oligosaccharide complexes: CAPRI 7th edition. Proteins 2020; 88:916-938. [DOI: 10.1002/prot.25870] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 11/13/2019] [Revised: 12/19/2019] [Accepted: 12/26/2019] [Indexed: 12/19/2022]
Affiliation(s)
- Marc F. Lensink
- University of Lille, CNRS UMR8576 UGSF, Unité de Glycobiologie Structurale et Fonctionnelle, F‐59000 Lille, France
| | - Nurul Nadzirin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL‐EBI), Wellcome Trust Genome Campus, Cambridge, UK
| | - Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL‐EBI), Wellcome Trust Genome Campus, Cambridge, UK
7
Lensink MF, Brysbaert G, Nadzirin N, Velankar S, Chaleil RAG, Gerguri T, Bates PA, Laine E, Carbone A, Grudinin S, Kong R, Liu RR, Xu XM, Shi H, Chang S, Eisenstein M, Karczynska A, Czaplewski C, Lubecka E, Lipska A, Krupa P, Mozolewska M, Golon Ł, Samsonov S, Liwo A, Crivelli S, Pagès G, Karasikov M, Kadukova M, Yan Y, Huang SY, Rosell M, Rodríguez-Lumbreras LA, Romero-Durana M, Díaz-Bueno L, Fernandez-Recio J, Christoffer C, Terashi G, Shin WH, Aderinwale T, Subraman SRMV, Kihara D, Kozakov D, Vajda S, Porter K, Padhorny D, Desta I, Beglov D, Ignatov M, Kotelnikov S, Moal IH, Ritchie DW, de Beauchêne IC, Maigret B, Devignes MD, Echartea MER, Barradas-Bautista D, Cao Z, Cavallo L, Oliva R, Cao Y, Shen Y, Baek M, Park T, Woo H, Seok C, Braitbard M, Bitton L, Scheidman-Duhovny D, Dapkūnas J, Olechnovič K, Venclovas Č, Kundrotas PJ, Belkin S, Chakravarty D, Badal VD, Vakser IA, Vreven T, Vangaveti S, Borrman T, Weng Z, Guest JD, Gowthaman R, Pierce BG, Xu X, Duan R, Qiu L, Hou J, Merideth BR, Ma Z, Cheng J, Zou X, Koukos PI, Roel-Touris J, Ambrosetti F, Geng C, Schaarschmidt J, Trellet ME, Melquiond ASJ, Xue L, Jiménez-García B, van Noort CW, Honorato RV, Bonvin AMJJ, Wodak SJ. Blind prediction of homo- and hetero-protein complexes: The CASP13-CAPRI experiment. Proteins 2019; 87:1200-1221. [PMID: 31612567 PMCID: PMC7274794 DOI: 10.1002/prot.25838] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Received: 05/27/2019] [Revised: 09/26/2019] [Accepted: 09/27/2019] [Indexed: 12/28/2022]
Abstract
We present the results for CAPRI Round 46, the third joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of 20 targets including 14 homo-oligomers and 6 heterocomplexes. Eight of the homo-oligomer targets and one heterodimer comprised proteins that could be readily modeled using templates from the Protein Data Bank, often available for the full assembly. The remaining 11 targets comprised 5 homodimers, 3 heterodimers, and two higher-order assemblies. These were more difficult to model, as their prediction mainly involved "ab-initio" docking of subunit models derived from distantly related templates. A total of ~30 CAPRI groups, including 9 automatic servers, submitted on average ~2000 models per target. About 17 groups participated in the CAPRI scoring rounds, offered for most targets, submitting ~170 models per target. The prediction performance, measured by the fraction of models of acceptable quality or higher submitted across all predictors groups, was very good to excellent for the nine easy targets. Poorer performance was achieved by predictors for the 11 difficult targets, with medium and high quality models submitted for only 3 of these targets. A similar performance "gap" was displayed by scorer groups, highlighting yet again the unmet challenge of modeling the conformational changes of the protein components that occur upon binding or that must be accounted for in template-based modeling. Our analysis also indicates that residues in binding interfaces were less well predicted in this set of targets than in previous Rounds, providing useful insights for directions of future improvements.
Affiliation(s)
- Marc F. Lensink
- University of Lille, CNRS UMR8576 UGSF, Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
| | - Guillaume Brysbaert
- University of Lille, CNRS UMR8576 UGSF, Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
| | - Nurul Nadzirin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | | | - Tereza Gerguri
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
| | - Paul A. Bates
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
| | - Elodie Laine
- CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Sorbonne Université, Paris, France
| | - Alessandra Carbone
- CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Sorbonne Université, Paris, France
- Institut Universitaire de France (IUF), Paris, France
| | - Sergei Grudinin
- Université Grenoble Alpes, CNRS, Inria, Grenoble INP, LJK, Grenoble, France
| | - Ren Kong
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Ran-Ran Liu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Xi-Ming Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Hang Shi
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Shan Chang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Miriam Eisenstein
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | | | | | - Emilia Lubecka
- Institute of Informatics, Faculty of Mathematics, Physics, and Informatics, University of Gdańsk, Gdańsk, Poland
| | | | - Paweł Krupa
- Polish Academy of Sciences, Institute of Physics, Warsaw, Poland
| | | | - Łukasz Golon
- Faculty of Chemistry, University of Gdańsk, Gdańsk, Poland
| | | | - Adam Liwo
- Faculty of Chemistry, University of Gdańsk, Gdańsk, Poland
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul, South Korea
| | | | - Guillaume Pagès
- Université Grenoble Alpes, CNRS, Inria, Grenoble INP, LJK, Grenoble, France
| | | | - Maria Kadukova
- Université Grenoble Alpes, CNRS, Inria, Grenoble INP, LJK, Grenoble, France
- Moscow Institute of Physics and Technology, Dolgoprudniy, Russia
| | - Yumeng Yan
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Mireia Rosell
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Instituto de Ciencias de la Vid y del Vino (ICVV-CSIC), Logroño, Spain
| | - Luis A. Rodríguez-Lumbreras
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Instituto de Ciencias de la Vid y del Vino (ICVV-CSIC), Logroño, Spain
| | | | | | - Juan Fernandez-Recio
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Instituto de Ciencias de la Vid y del Vino (ICVV-CSIC), Logroño, Spain
- Instituto de Biología Molecular de Barcelona (IBMB-CSIC), Barcelona, Spain
| | | | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana
| | - Woong-Hee Shin
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana
| | - Tunde Aderinwale
- Department of Computer Science, Purdue University, West Lafayette, Indiana
| | | | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, Indiana
| | - Dima Kozakov
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
- Department of Chemistry, Boston University, Boston, Massachusetts
| | - Kathryn Porter
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
| | - Dzmitry Padhorny
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
| | - Israel Desta
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
| | - Dmitri Beglov
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
| | - Mikhail Ignatov
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
| | - Sergey Kotelnikov
- Moscow Institute of Physics and Technology, Dolgoprudniy, Russia
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
| | - Iain H. Moal
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | | | | | | | | | | | - Didier Barradas-Bautista
- Physical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Zhen Cao
- Physical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Luigi Cavallo
- Physical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Romina Oliva
- Department of Sciences and Technologies, University of Naples “Parthenope”, Napoli, Italy
| | - Yue Cao
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas
| | - Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas
| | - Minkyung Baek
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Taeyong Park
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Hyeonuk Woo
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Merav Braitbard
- Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Lirane Bitton
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Dina Scheidman-Duhovny
- Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Justas Dapkūnas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Petras J. Kundrotas
- Computational Biology Program and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas
| | - Saveliy Belkin
- Computational Biology Program and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas
| | - Devlina Chakravarty
- Computational Biology Program and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas
| | - Varsha D. Badal
- Computational Biology Program and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas
| | - Ilya A. Vakser
- Computational Biology Program and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas
| | - Thom Vreven
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts
| | - Sweta Vangaveti
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts
| | - Tyler Borrman
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts
| | - Zhiping Weng
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts
| | - Johnathan D. Guest
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland
| | - Ragul Gowthaman
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland
| | - Brian G. Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland
| | - Xianjin Xu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri
| | - Rui Duan
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri
| | - Liming Qiu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri
| | - Jie Hou
- Department of Computer Science, University of Missouri, Columbia, Missouri
| | - Benjamin Ryan Merideth
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri
- Informatics Institute, University of Missouri, Columbia, Missouri
| | - Zhiwei Ma
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri
- Department of Physics and Astronomy, University of Missouri, Columbia, Missouri
| | - Jianlin Cheng
- Department of Computer Science, University of Missouri, Columbia, Missouri
- Informatics Institute, University of Missouri, Columbia, Missouri
| | - Xiaoqin Zou
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri
- Informatics Institute, University of Missouri, Columbia, Missouri
- Department of Physics and Astronomy, University of Missouri, Columbia, Missouri
- Department of Biochemistry, University of Missouri, Columbia, Missouri
| | - Panagiotis I. Koukos
- Computational Structural Biology Group, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Jorge Roel-Touris
- Computational Structural Biology Group, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Francesco Ambrosetti
- Computational Structural Biology Group, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Cunliang Geng
- Computational Structural Biology Group, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Jörg Schaarschmidt
- Computational Structural Biology Group, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Mikael E. Trellet
- Computational Structural Biology Group, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Adrien S. J. Melquiond
- Computational Structural Biology Group, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Li Xue
- Computational Structural Biology Group, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Brian Jiménez-García
- Computational Structural Biology Group, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Charlotte W. van Noort
- Computational Structural Biology Group, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Rodrigo V. Honorato
- Computational Structural Biology Group, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Alexandre M. J. J. Bonvin
- Computational Structural Biology Group, Department of Chemistry, Faculty of Science, Utrecht University, Utrecht, The Netherlands
|
8
|
Mbaye MN, Hou Q, Basu S, Teheux F, Pucci F, Rooman M. A comprehensive computational study of amino acid interactions in membrane proteins. Sci Rep 2019; 9:12043. [PMID: 31427701 PMCID: PMC6700154 DOI: 10.1038/s41598-019-48541-2] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 05/08/2019] [Accepted: 08/07/2019] [Indexed: 01/26/2023] Open
Abstract
Transmembrane proteins play a fundamental role in a wide series of biological processes but, despite their importance, they are less studied than globular proteins, essentially because their embedding in lipid membranes hampers their experimental characterization. In this paper, we improved our understanding of their structural stability through the development of new knowledge-based energy functions describing amino acid pair interactions that prevail in the transmembrane and extramembrane regions of membrane proteins. The comparison of these potentials and those derived from globular proteins yields an objective view of the relative strength of amino acid interactions in the different protein environments, and their role in protein stabilization. Separate potentials were also derived from α-helical and β-barrel transmembrane regions to investigate possible dissimilarities. We found that, in extramembrane regions, hydrophobic residues are less frequent but interactions between aromatic and aliphatic amino acids as well as aromatic-sulfur interactions contribute more to stability. In transmembrane regions, polar residues are less abundant but interactions between residues of equal or opposite charges or non-charged polar residues as well as anion-π interactions appear stronger. This shows indirectly the preference of the water and lipid molecules to interact with polar and hydrophobic residues, respectively. We applied these new energy functions to predict whether a residue is located in the trans- or extramembrane region, and obtained an AUC score of 83% in cross validation, which demonstrates their accuracy. As their application is, moreover, extremely fast, they are optimal instruments for membrane protein design and large-scale investigations of membrane protein stability.
Affiliation(s)
- Mame Ndew Mbaye
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
- Department of Mathematics and Informatics, Cheikh Anta Diop University, Dakar-Fann, Senegal
| | - Qingzhen Hou
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
| | - Sankar Basu
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
| | - Fabian Teheux
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
| | - Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
- John von Neumann Institute for Computing, Jülich Supercomputer Centre, Forschungszentrum Jülich, Jülich, Germany
| | - Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium.
|
9
|
Wardah W, Khan M, Sharma A, Rashid MA. Protein secondary structure prediction using neural networks and deep learning: A review. Comput Biol Chem 2019; 81:1-8. [DOI: 10.1016/j.compbiolchem.2019.107093] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 07/12/2018] [Revised: 12/28/2018] [Accepted: 07/10/2019] [Indexed: 02/02/2023]
|
10
|
Hou Q, Bourgeas R, Pucci F, Rooman M. Computational analysis of the amino acid interactions that promote or decrease protein solubility. Sci Rep 2018; 8:14661. [PMID: 30279585 PMCID: PMC6168528 DOI: 10.1038/s41598-018-32988-w] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 06/04/2018] [Accepted: 09/11/2018] [Indexed: 11/24/2022] Open
Abstract
The solubility of globular proteins is a basic biophysical property that is usually a prerequisite for their functioning. In this study, we probed the solubility of globular proteins with the help of the statistical potential formalism, in view of objectifying the connection of solubility with structural and energetic properties and of the solubility-dependence of specific amino acid interactions. We started by setting up two independent datasets containing either soluble or aggregation-prone proteins with known structures. From these two datasets, we computed solubility-dependent distance potentials that are by construction biased towards the solubility of the proteins from which they are derived. Their analysis showed the clear preference of amino acid interactions such as Lys-containing salt bridges and aliphatic interactions to promote protein solubility, whereas others such as aromatic, His-π, cation-π, amino-π and anion-π interactions rather tend to reduce it. These results indicate that interactions involving delocalized π-electrons favor aggregation, unlike those involving no (or few) dispersion forces. Furthermore, using our potentials derived from either highly or weakly soluble proteins to compute protein folding free energies, we found that the difference between these two energies correlates better with solubility than other properties analyzed before such as protein length, isoelectric point and aliphatic index. This is, to the best of our knowledge, the first comprehensive in silico study of the impact of residue-residue interactions on protein solubility properties. The results of this analysis provide new insights that will facilitate future rational protein design applications aimed at modulating the solubility of targeted proteins.
Affiliation(s)
- Qingzhen Hou
- Department of BioModeling BioInformatics & BioProcesses, Université Libre de Bruxelles, Brussels, 1050, Belgium
| | - Raphaël Bourgeas
- Department of BioModeling BioInformatics & BioProcesses, Université Libre de Bruxelles, Brussels, 1050, Belgium
| | - Fabrizio Pucci
- Department of BioModeling BioInformatics & BioProcesses, Université Libre de Bruxelles, Brussels, 1050, Belgium
| | - Marianne Rooman
- Department of BioModeling BioInformatics & BioProcesses, Université Libre de Bruxelles, Brussels, 1050, Belgium.
|
11
|
Ancien F, Pucci F, Godfroid M, Rooman M. Prediction and interpretation of deleterious coding variants in terms of protein structural stability. Sci Rep 2018. [PMID: 29540703 PMCID: PMC5852127 DOI: 10.1038/s41598-018-22531-2] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Indexed: 12/20/2022] Open
Abstract
The classification of human genetic variants into deleterious and neutral is a challenging issue, whose complexity is rooted in the large variety of biophysical mechanisms that can be responsible for disease conditions. For non-synonymous mutations in structured proteins, one of these is the protein stability change, which can lead to loss of protein structure or function. We developed a stability-driven knowledge-based classifier that uses protein structure, artificial neural networks and solvent accessibility-dependent combinations of statistical potentials to predict whether destabilizing or stabilizing mutations are disease-causing. Our predictor yields a balanced accuracy of 71% in cross validation. As expected, it has a very high positive predictive value of 89%: it predicts with high accuracy the subset of mutations that are deleterious because of stability issues, but is by construction unable to classify variants that are deleterious for other reasons. Its combination with an evolutionary-based predictor increases the balanced accuracy up to 75%, and allowed predicting more than 1/4 of the variants with 95% positive predictive value. Our method, called SNPMuSiC, can be used with both experimental and modeled structures and compares favorably with other prediction tools on several independent test sets. It constitutes a step towards interpreting variant effects at the molecular scale. SNPMuSiC is freely available at https://soft.dezyme.com/.
Affiliation(s)
- François Ancien
- Department of BioModeling, BioInformatics & BioProcesses, Université Libre de Bruxelles (ULB), CP 165/61, Roosevelt Avenue 50, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, ULB, CP 263, Triumph Bld, 1050, Brussels, Belgium
| | - Fabrizio Pucci
- Department of BioModeling, BioInformatics & BioProcesses, Université Libre de Bruxelles (ULB), CP 165/61, Roosevelt Avenue 50, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, ULB, CP 263, Triumph Bld, 1050, Brussels, Belgium
| | - Maxime Godfroid
- Department of BioModeling, BioInformatics & BioProcesses, Université Libre de Bruxelles (ULB), CP 165/61, Roosevelt Avenue 50, 1050, Brussels, Belgium
- Institute of General Microbiology, Kiel University, Am Botanischen Garten 11, 24118, Kiel, Germany
| | - Marianne Rooman
- Department of BioModeling, BioInformatics & BioProcesses, Université Libre de Bruxelles (ULB), CP 165/61, Roosevelt Avenue 50, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, ULB, CP 263, Triumph Bld, 1050, Brussels, Belgium
|
12
|
Dalkas GA, Rooman M. SEPIa, a knowledge-driven algorithm for predicting conformational B-cell epitopes from the amino acid sequence. BMC Bioinformatics 2017; 18:95. [PMID: 28183272 PMCID: PMC5301386 DOI: 10.1186/s12859-017-1528-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 10/20/2016] [Accepted: 02/06/2017] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND The identification of immunogenic regions on the surface of antigens, which are able to be recognized by antibodies and to trigger an immune response, is a major challenge for the design of new and effective vaccines. The prediction of such regions through computational immunology techniques is a challenging goal, which will ultimately lead to a drastic limitation of the experimental tests required to validate their efficiency. However, current methods are far from being sufficiently reliable and/or applicable on a large scale. RESULTS We developed SEPIa, a B-cell epitope predictor from the protein sequence, which is sufficiently fast to be applicable on a large scale. The originality of SEPIa lies in the combination of two classifiers, a naïve Bayesian and a random forest classifier, through a voting algorithm that exploits the advantages of both. It is based on 13 sequence-based features, whose values in a 9-residue sequence window are compiled to predict the epitope/non-epitope state of the central residue. The features are related to the type of amino acid, its conservation in homologous proteins, and its tendency of being exposed to the solvent, soluble, flexible, and disordered. The highest signal is obtained from statistical amino acid preferences, but all 13 features contribute non-negligibly in the predictor. SEPIa's average prediction accuracy is limited, with an AUC score (area under the receiver operating characteristic curve) that reaches 0.65 both in 10-fold cross-validation and on an independent test set. It is nevertheless slightly higher than that of other methods evaluated on the same test set. CONCLUSIONS SEPIa was applied to a test protein whose epitopes are known, human β2 adrenergic G-protein-coupled receptor, with promising results. Although the actual AUC score is rather low, many of the predicted epitopes cluster together and overlap the experimental epitope region. 
The reasons underlying the limitations of SEPIa and of all other B-cell epitope predictors are discussed.
Affiliation(s)
- Georgios A. Dalkas
- BioModeling, BioInformatics & BioProcesses (3BIO), Université Libre de Bruxelles (ULB), CP 165/61, 50 Roosevelt Ave, 1050 Brussels, Belgium
- Present address: Institute of Mechanical, Process & Energy Engineering, Heriot-Watt University, Edinburgh, EH14 4AS UK
| | - Marianne Rooman
- BioModeling, BioInformatics & BioProcesses (3BIO), Université Libre de Bruxelles (ULB), CP 165/61, 50 Roosevelt Ave, 1050 Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, CP 263, Triumph Bld, 1050 Brussels, Belgium
|
13
|
Abstract
More than two decades of research have enabled dihedral angle predictions at an accuracy that makes them an interesting alternative or supplement to secondary structure prediction, providing detailed local structure information for every residue of a protein. The evolution of dihedral angle prediction methods is closely linked to advancements in machine learning and other relevant technologies. Consequently, recent improvements in large-scale training of deep neural networks have led to the best method currently available, which achieves a mean absolute error of 19° for phi and 30° for psi. This performance opens interesting perspectives for the application of dihedral angle prediction in the comparison, prediction, and design of protein structures.
Affiliation(s)
- Olav Zimmermann
- Jülich Supercomputing Centre (JSC), Institute for Advanced Simulation (IAS), Forschungszentrum Jülich GmbH, 52425, Jülich, Germany.
|
14
|
Shin JM, Lee B, Cho KH. A New Efficient Conformational Search Method for ab initio Protein Folding Study: Window Growth Evolutionary Algorithm. B KOREAN CHEM SOC 2016. [DOI: 10.1002/bkcs.11006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Affiliation(s)
- Jae-Min Shin
- School of Systems Biomedical Science; Soongsil University; Seoul 156-743 Republic of Korea
| | - Byungkook Lee
- Laboratory of Molecular Biology, Division of Basic Sciences; National Cancer Institute, National Institutes of Health; Bethesda MD 20892-4200 USA
| | - Kwang-Hwi Cho
- School of Systems Biomedical Science; Soongsil University; Seoul 156-743 Republic of Korea
|
15
|
De Laet M, Gilis D, Rooman M. Stability strengths and weaknesses in protein structures detected by statistical potentials: Application to bovine seminal ribonuclease. Proteins 2015; 84:143-58. [DOI: 10.1002/prot.24962] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 06/15/2015] [Revised: 10/27/2015] [Accepted: 11/09/2015] [Indexed: 11/10/2022]
Affiliation(s)
- Marie De Laet
- 3BIO-BioInfo Department; Université Libre De Bruxelles; Avenue F. Roosevelt 50 CP 165/61 Brussels 1050 Belgium
| | - Dimitri Gilis
- 3BIO-BioInfo Department; Université Libre De Bruxelles; Avenue F. Roosevelt 50 CP 165/61 Brussels 1050 Belgium
| | - Marianne Rooman
- 3BIO-BioInfo Department; Université Libre De Bruxelles; Avenue F. Roosevelt 50 CP 165/61 Brussels 1050 Belgium
|
16
|
Pucci F, Bernaerts K, Teheux F, Gilis D, Rooman M. Symmetry Principles in Optimization Problems: an application to Protein Stability Prediction. IFAC-PapersOnLine 2015. [DOI: 10.1016/j.ifacol.2015.05.068] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
|
17
|
Singh H, Singh S, Raghava GPS. Evaluation of protein dihedral angle prediction methods. PLoS One 2014; 9:e105667. [PMID: 25166857 PMCID: PMC4148315 DOI: 10.1371/journal.pone.0105667] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2013] [Accepted: 07/26/2014] [Indexed: 11/30/2022] Open
Abstract
Tertiary structure prediction of a protein from its amino acid sequence is one of the major challenges in the field of bioinformatics. The hierarchical approach is one of the persuasive techniques used for predicting protein tertiary structure, especially in the absence of homologous protein structures. In the hierarchical approach, intermediate states such as secondary structure, dihedral angles, and Cα-Cα distance bounds are predicted. These intermediate states are used to restrain the protein backbone and assist its correct folding. In recent years, several methods have been developed for predicting the dihedral angles of a protein, but it is difficult to conclude which method is better than the others. In this study, we benchmarked the performance of the dihedral prediction methods ANGLOR and SPINE X on various datasets, including independent datasets. The TANGLE dihedral prediction method was not benchmarked (because no standalone version was available) and was compared with SPINE X and ANGLOR only on the ANGLOR dataset, for which TANGLE results have been reported. SPINE X performed better than ANGLOR and TANGLE, especially in predicting the dihedral angles of glycine and proline residues. The analysis suggested that angle shifting was the foremost reason for the better performance of SPINE X. We further evaluated the performance of the methods on the independent ccPDB30 dataset and observed that SPINE X performed better than ANGLOR.
Affiliation(s)
- Harinder Singh
- Bioinformatics Center, Institute of Microbial Technology, Chandigarh, India
| | - Sandeep Singh
- Bioinformatics Center, Institute of Microbial Technology, Chandigarh, India
|
18
|
Wood CW, Bruning M, Ibarra AÁ, Bartlett GJ, Thomson AR, Sessions RB, Brady RL, Woolfson DN. CCBuilder: an interactive web-based tool for building, designing and assessing coiled-coil protein assemblies. Bioinformatics 2014; 30:3029-35. [PMID: 25064570 PMCID: PMC4201159 DOI: 10.1093/bioinformatics/btu502] [Citation(s) in RCA: 93] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
Motivation: The ability to accurately model protein structures at the atomistic level underpins efforts to understand protein folding, to engineer natural proteins predictably and to design proteins de novo. Homology-based methods are well established and produce impressive results. However, these are limited to structures presented by and resolved for natural proteins. Addressing this problem more widely and deriving truly ab initio models requires mathematical descriptions for protein folds; the means to decorate these with natural, engineered or de novo sequences; and methods to score the resulting models. Results: We present CCBuilder, a web-based application that tackles the problem for a defined but large class of protein structure, the α-helical coiled coils. CCBuilder generates coiled-coil backbones, builds side chains onto these frameworks and provides a range of metrics to measure the quality of the models. Its straightforward graphical user interface provides broad functionality that allows users to build and assess models, in which helix geometry, coiled-coil architecture and topology and protein sequence can be varied rapidly. We demonstrate the utility of CCBuilder by assembling models for 653 coiled-coil structures from the PDB, which cover >96% of the known coiled-coil types, and by generating models for rarer and de novo coiled-coil structures. Availability and implementation: CCBuilder is freely available, without registration, at http://coiledcoils.chm.bris.ac.uk/app/cc_builder/ Contact: D.N.Woolfson@bristol.ac.uk or Chris.Wood@bristol.ac.uk
Affiliation(s)
- Christopher W Wood
- School of Chemistry, University of Bristol, Cantock's Close, BS8 1TS and School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, BS8 1TD, Bristol, UK
| | - Marc Bruning
- School of Chemistry, University of Bristol, Cantock's Close, BS8 1TS and School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, BS8 1TD, Bristol, UK
| | - Amaurys Á Ibarra
- School of Chemistry, University of Bristol, Cantock's Close, BS8 1TS and School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, BS8 1TD, Bristol, UK
| | - Gail J Bartlett
- School of Chemistry, University of Bristol, Cantock's Close, BS8 1TS and School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, BS8 1TD, Bristol, UK
| | - Andrew R Thomson
- School of Chemistry, University of Bristol, Cantock's Close, BS8 1TS and School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, BS8 1TD, Bristol, UK
| | - Richard B Sessions
- School of Chemistry, University of Bristol, Cantock's Close, BS8 1TS and School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, BS8 1TD, Bristol, UK
| | - R Leo Brady
- School of Chemistry, University of Bristol, Cantock's Close, BS8 1TS and School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, BS8 1TD, Bristol, UK
| | - Derek N Woolfson
- School of Chemistry, University of Bristol, Cantock's Close, BS8 1TS and School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, BS8 1TD, Bristol, UK
|
19
|
Protein thermostability prediction within homologous families using temperature-dependent statistical potentials. PLoS One 2014; 9:e91659. [PMID: 24646884 PMCID: PMC3960129 DOI: 10.1371/journal.pone.0091659] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2014] [Accepted: 02/12/2014] [Indexed: 11/28/2022] Open
Abstract
The ability to rationally modify targeted physical and biological features of a protein of interest holds promise in numerous academic and industrial applications and paves the way towards de novo protein design. In particular, bioprocesses that utilize the remarkable properties of enzymes would often benefit from mutants that remain active at temperatures that are either higher or lower than the physiological temperature, while maintaining the biological activity. Many in silico methods have been developed in recent years for predicting the thermodynamic stability of mutant proteins, but very few have focused on thermostability. To bridge this gap, we developed an algorithm for predicting the best descriptor of thermostability, namely the melting temperature Tm, from the protein's sequence and structure. Our method is applicable when the Tm values of proteins homologous to the target protein are known. It is based on the design of several temperature-dependent statistical potentials, derived from datasets consisting of either mesostable or thermostable proteins. Linear combinations of these potentials have been shown to yield an estimation of the protein folding free energies at low and high temperatures, and the difference of these energies, a prediction of the melting temperature. This particular construction, which distinguishes between the interactions that contribute more than others to the stability at high temperatures and those that are more stabilizing at low T, gives better performances compared to the standard approach based on T-independent potentials which predict the thermal resistance from the thermodynamic stability. Our method has been tested on 45 proteins of known Tm that belong to 11 homologous families. The standard deviation between experimental and predicted Tm values is equal to 13.6°C in cross validation, and decreases to 8.3°C if the 6 worst predicted proteins are excluded. Possible extensions of our approach are discussed.
|
20
|
Maurice KJ. SSThread: Template-free protein structure prediction by threading pairs of contacting secondary structures followed by assembly of overlapping pairs. J Comput Chem 2014; 35:644-56. [PMID: 24523210 DOI: 10.1002/jcc.23543] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2013] [Revised: 11/15/2013] [Accepted: 01/05/2014] [Indexed: 11/12/2022]
Abstract
Acquiring the three-dimensional structure of a protein from its amino acid sequence alone, despite a great deal of work and significant progress on the subject, is still an unsolved problem. SSThread, a new template-free algorithm is described here that consists of making several predictions of contacting pairs of α-helices and β-strands derived from a database of experimental structures using a knowledge-based potential, secondary structure prediction, and contact map prediction followed by assembly of overlapping pair predictions to create an ensemble of core structure predictions whose loops are then predicted. In a set of seven CASP10 targets SSThread outperformed the two leading methods for two targets each. The targets were all β-strand containing structures and most of them have a high relative contact order which demonstrates the advantages of SSThread. The primary bottlenecks based on sets of 74 and 21 test cases are the pair prediction and loop prediction stages.
|
21
|
Ma J, Wang S. Algorithms, Applications, and Challenges of Protein Structure Alignment. Adv Protein Chem Struct Biol 2014; 94:121-75. [DOI: 10.1016/b978-0-12-800168-4.00005-6] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]
|
22
|
Wintjens RT, Rooman MJ, Wodak SJ. Identification of Short Turn Motifs in Proteins Using Sequence and Structure Fingerprints. Isr J Chem 2013. [DOI: 10.1002/ijch.199400030] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
23
|
Mirzaie M, Sadeghi M. Delaunay-based nonlocal interactions are sufficient and accurate in protein fold recognition. Proteins 2013; 82:415-23. [DOI: 10.1002/prot.24407] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2013] [Revised: 08/12/2013] [Accepted: 08/21/2013] [Indexed: 01/05/2023]
Affiliation(s)
- Mehdi Mirzaie
- Department of Basic Sciences, Faculty of Paramedical Sciences; Shahid Beheshti University of Medical Sciences; Tehran Iran
- Department of Bioinformatics; School of Computer Science, Institute for Research in Fundamental Sciences (IPM); Tehran Iran
| | - Mehdi Sadeghi
- Department of Bioinformatics, National Institute of Genetic Engineering and Biotechnology; Tehran Iran
|
24
|
Zangooei MH, Jalili S. Protein secondary structure prediction using DWKF based on SVR-NSGAII. Neurocomputing 2012. [DOI: 10.1016/j.neucom.2012.04.015] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
|
25
|
Lu WW, Huang RB, Wei YT, Meng JZ, Du LQ, Du QS. Statistical energy potential: reduced representation of Dehouck–Gilis–Rooman function by selecting against decoy datasets. Amino Acids 2012; 42:2353-61. [DOI: 10.1007/s00726-011-0977-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2010] [Accepted: 07/06/2011] [Indexed: 11/24/2022]
|
26
|
|
27
|
Song J, Tan H, Wang M, Webb GI, Akutsu T. TANGLE: two-level support vector regression approach for protein backbone torsion angle prediction from primary sequences. PLoS One 2012; 7:e30361. [PMID: 22319565 PMCID: PMC3271071 DOI: 10.1371/journal.pone.0030361] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2011] [Accepted: 12/14/2011] [Indexed: 12/29/2022] Open
Abstract
Protein backbone torsion angles (Phi) and (Psi) involve two rotation angles rotating around the Cα-N bond (Phi) and the Cα-C bond (Psi). Due to the planarity of the linked rigid peptide bonds, these two angles can essentially determine the backbone geometry of proteins. Accordingly, the accurate prediction of protein backbone torsion angle from sequence information can assist the prediction of protein structures. In this study, we develop a new approach called TANGLE (Torsion ANGLE predictor) to predict the protein backbone torsion angles from amino acid sequences. TANGLE uses a two-level support vector regression approach to perform real-value torsion angle prediction using a variety of features derived from amino acid sequences, including the evolutionary profiles in the form of position-specific scoring matrices, predicted secondary structure, solvent accessibility and natively disordered region as well as other global sequence features. When evaluated based on a large benchmark dataset of 1,526 non-homologous proteins, the mean absolute errors (MAEs) of the Phi and Psi angle prediction are 27.8° and 44.6°, respectively, which are 1% and 3% respectively lower than that using one of the state-of-the-art prediction tools ANGLOR. Moreover, the prediction of TANGLE is significantly better than a random predictor that was built on the amino acid-specific basis, with the p-value<1.46e-147 and 7.97e-150, respectively by the Wilcoxon signed rank test. As a complementary approach to the current torsion angle prediction algorithms, TANGLE should prove useful in predicting protein structural properties and assisting protein fold recognition by applying the predicted torsion angles as useful restraints. TANGLE is freely accessible at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/TANGLE/.
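Since the abstract above defines phi and psi as rotations about backbone bonds, a short sketch of how a torsion angle is computed from four atom positions may help. It uses the standard atan2 formulation of a dihedral angle; by the usual convention phi is taken over the atoms C(i-1), N(i), Cα(i), C(i) and psi over N(i), Cα(i), C(i), N(i+1). The function names and the test coordinates are illustrative assumptions and are not taken from TANGLE.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four consecutive atoms.

    The angle is 0 for a cis arrangement and +/-180 for trans.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates chosen so that the expected angle is easy to check.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Using atan2 rather than arccos keeps the sign of the angle and avoids numerical trouble near 0° and 180°.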
Affiliation(s)
- Jiangning Song
- Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, Victoria, Australia
- National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
- * E-mail: (JS); (GIW); (TA)
| | - Hao Tan
- Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, Victoria, Australia
| | - Mingjun Wang
- National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China
| | - Geoffrey I. Webb
- Faculty of Information Technology, Monash University, Melbourne, Victoria, Australia
- * E-mail: (JS); (GIW); (TA)
| | - Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
- * E-mail: (JS); (GIW); (TA)
|
28
|
Mirzaie M, Sadeghi M. Distance-dependent atomic knowledge-based force in protein fold recognition. Proteins 2012; 80:683-90. [DOI: 10.1002/prot.24011] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2011] [Revised: 11/15/2011] [Accepted: 12/06/2011] [Indexed: 11/08/2022]
|
29
|
Abstract
Loop modeling is crucial for high-quality homology model construction outside conserved secondary structure elements. Dozens of loop modeling protocols involving a range of database and ab initio search algorithms and a variety of scoring functions have been proposed. Knowledge-based loop modeling methods are very fast and some can successfully and reliably predict loops up to about eight residues long. Several recent ab initio loop simulation methods can be used to construct accurate models of loops up to 12-13 residues long, albeit at a substantial computational cost. Major current challenges are the simulations of loops longer than 12-13 residues, the modeling of multiple interacting flexible loops, and the sensitivity of the loop predictions to the accuracy of the loop environment.
|
30
|
Zhou Y, Duan Y, Yang Y, Faraggi E, Lei H. Trends in template/fragment-free protein structure prediction. Theor Chem Acc 2011; 128:3-16. [PMID: 21423322 PMCID: PMC3030773 DOI: 10.1007/s00214-010-0799-2] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2010] [Accepted: 08/15/2010] [Indexed: 12/13/2022]
Abstract
Predicting the structure of a protein from its amino acid sequence is a long-standing unsolved problem in computational biology. Its solution would be of both fundamental and practical importance as the gap between the number of known sequences and the number of experimentally solved structures widens rapidly. Currently, the most successful approaches are based on fragment/template reassembly. Lacking progress in template-free structure prediction calls for novel ideas and approaches. This article reviews trends in the development of physical and specific knowledge-based energy functions as well as sampling techniques for fragment-free structure prediction. Recent physical- and knowledge-based studies demonstrated that it is possible to sample and predict highly accurate protein structures without borrowing native fragments from known protein structures. These emerging approaches with fully flexible sampling have the potential to move the field forward.
Affiliation(s)
- Yaoqi Zhou
- School of Informatics, Indiana Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indiana University Purdue University, 719 Indiana Ave #319, Walker Plaza Building, Indianapolis, IN 46202 USA
| | - Yong Duan
- UC Davis Genome Center and Department of Applied Science, University of California, One Shields Avenue, Davis, CA USA
- College of Physics, Huazhong University of Science and Technology, 1037 Luoyu Road, 430074 Wuhan, China
| | - Yuedong Yang
- School of Informatics, Indiana Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indiana University Purdue University, 719 Indiana Ave #319, Walker Plaza Building, Indianapolis, IN 46202 USA
| | - Eshel Faraggi
- School of Informatics, Indiana Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indiana University Purdue University, 719 Indiana Ave #319, Walker Plaza Building, Indianapolis, IN 46202 USA
| | - Hongxing Lei
- UC Davis Genome Center and Department of Applied Science, University of California, One Shields Avenue, Davis, CA USA
- Beijing Institute of Genomics, Chinese Academy of Sciences, 100029 Beijing, China
|
31
|
Hamelryck T, Borg M, Paluszewski M, Paulsen J, Frellsen J, Andreetta C, Boomsma W, Bottaro S, Ferkinghoff-Borg J. Potentials of mean force for protein structure prediction vindicated, formalized and generalized. PLoS One 2010; 5:e13714. [PMID: 21103041 PMCID: PMC2978081 DOI: 10.1371/journal.pone.0013714] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2010] [Accepted: 10/04/2010] [Indexed: 11/26/2022] Open
Abstract
Understanding protein structure is of crucial importance in science, medicine and biotechnology. For about two decades, knowledge-based potentials based on pairwise distances – so-called “potentials of mean force” (PMFs) – have been center stage in the prediction and design of protein structure and the simulation of protein folding. However, the validity, scope and limitations of these potentials are still vigorously debated and disputed, and the optimal choice of the reference state – a necessary component of these potentials – is an unsolved problem. PMFs are loosely justified by analogy to the reversible work theorem in statistical physics, or by a statistical argument based on a likelihood function. Both justifications are insightful but leave many questions unanswered. Here, we show for the first time that PMFs can be seen as approximations to quantities that do have a rigorous probabilistic justification: they naturally arise when probability distributions over different features of proteins need to be combined. We call these quantities “reference ratio distributions” deriving from the application of the “reference ratio method.” This new view is not only of theoretical relevance but leads to many insights that are of direct practical use: the reference state is uniquely defined and does not require external physical insights; the approach can be generalized beyond pairwise distances to arbitrary features of protein structure; and it becomes clear for which purposes the use of these quantities is justified. We illustrate these insights with two applications, involving the radius of gyration and hydrogen bonding. In the latter case, we also show how the reference ratio method can be iteratively applied to sculpt an energy funnel. Our results considerably increase the understanding and scope of energy functions derived from known biomolecular structures.
Affiliation(s)
- Thomas Hamelryck
- Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- * E-mail: (TH); (JFB)
| | - Mikael Borg
- Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Martin Paluszewski
- Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Jonas Paulsen
- Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Jes Frellsen
- Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Christian Andreetta
- Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Wouter Boomsma
- Biomedical Engineering, Technical University of Denmark (DTU) Elektro, Technical University of Denmark, Lyngby, Denmark
- Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - Sandro Bottaro
- Biomedical Engineering, Technical University of Denmark (DTU) Elektro, Technical University of Denmark, Lyngby, Denmark
| | - Jesper Ferkinghoff-Borg
- Biomedical Engineering, Technical University of Denmark (DTU) Elektro, Technical University of Denmark, Lyngby, Denmark
- * E-mail: (TH); (JFB)
|
32
|
Hollingsworth SA, Karplus PA. A fresh look at the Ramachandran plot and the occurrence of standard structures in proteins. Biomol Concepts 2010; 1:271-283. [PMID: 21436958 DOI: 10.1515/bmc.2010.022] [Citation(s) in RCA: 224] [Impact Index Per Article: 14.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/27/2023] Open
Abstract
The Ramachandran plot is among the most central concepts in structural biology, seen in publications and textbooks alike. However, with the increasing numbers of known protein structures and greater accuracy of ultra-high resolution protein structures, we are still learning more about the basic principles of protein structure. Here we use high fidelity conformational information to explore novel ways, such as geo-style and wrapped Ramachandran plots, to convey some of the basic aspects of the Ramachandran plot and of protein conformation. We point out the pressing need for a standard nomenclature for peptide conformation and propose such a nomenclature. Finally, we summarize some recent conceptual advances related to the building blocks of protein structure. The results for linear groups imply the need for substantive revisions in how the basics of protein structure are handled.
Affiliation(s)
- Scott A Hollingsworth
- Department of Biochemistry & Biophysics, Oregon State University, Corvallis, OR 97331
|
33
|
Figueirêdo PH, Moret MA, Coutinho S, Nogueira E. The role of stochasticity on compactness of the native state of protein peptide backbone. J Chem Phys 2010; 133:085102. [DOI: 10.1063/1.3481485] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
|
34
|
Jiang F, Han W, Wu YD. Influence of side chain conformations on local conformational features of amino acids and implication for force field development. J Phys Chem B 2010; 114:5840-50. [PMID: 20392111 DOI: 10.1021/jp909088e] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Statistical analysis of coil regions in protein structures has been used to obtain the local backbone phi, psi preferences of amino acids, which agree well with the NMR experiments of unfolded peptides and proteins. We analyzed the conformational features of amino acid residues in a restricted coil library of 4220 high-resolution protein crystal structures. In addition to Gly, Ala, and Pro, the phi, psi distribution (Ramachandran plot) of each amino acid is analyzed with respect to three side chain conformers: g+ (chi(1) approximately -60 degrees), g- (chi(1) approximately 60 degrees), and t (chi(1) approximately 180 degrees). The statistical study indicates that the effect of side chain conformations on phi, psi distributions is even greater than the effect of amino acid types. On the basis of the chi(1), phi, psi conformational preferences, the amino acids in addition to Gly, Pro, and Ala can be divided into five types: (1) ordinary amino acids, (2) Ser, (3) Asp and Asn, (4) Val and Ile, and (5) Thr, each with distinguished chi(1) rotamers. The alpha-helix, beta-sheet, and type-I beta-turn preferences of the different rotamers of various amino acid types can be captured by their intrinsic phi, psi preferences from our coil library. Molecular dynamics simulations of dipeptide Ac-X-NHMe and tetrapeptide Ac-A-X-A-NHMe models give nearly the same side chain rotamer distributions. However, for many amino acids, both OPLS-AA/L and AMBER-FF03 force fields give very different chi(1) rotamer distributions from the coil library. This may partially explain why dipeptide models sometimes cannot reproduce those of protein structures well. The current coil library analysis may be valuable in improving the force field for protein simulations.
Affiliation(s)
- Fan Jiang
- Laboratory of Chemical Genomics, Shenzhen Graduate School of Peking University, Shenzhen 518055, China
|
35
|
Faraggi E, Yang Y, Zhang S, Zhou Y. Predicting continuous local structure and the effect of its substitution for secondary structure in fragment-free protein structure prediction. Structure 2010; 17:1515-27. [PMID: 19913486 DOI: 10.1016/j.str.2009.09.006] [Citation(s) in RCA: 91] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2009] [Revised: 09/01/2009] [Accepted: 09/03/2009] [Indexed: 11/30/2022]
Abstract
Local structures predicted from protein sequences are used extensively in every aspect of modeling and prediction of protein structure and function. For more than 50 years, they have been predicted at a low-resolution coarse-grained level (e.g., three-state secondary structure). Here, we combine a two-state classifier with real-value predictor to predict local structure in continuous representation by backbone torsion angles. The accuracy of the angles predicted by this approach is close to that derived from NMR chemical shifts. Their substitution for predicted secondary structure as restraints for ab initio structure prediction doubles the success rate. This result demonstrates the potential of predicted local structure for fragment-free tertiary-structure prediction. It further implies potentially significant benefits from using predicted real-valued torsion angles as a replacement for or supplement to the secondary-structure prediction tools used almost exclusively in many computational methods ranging from sequence alignment to function prediction.
Affiliation(s)
- Eshel Faraggi
- Indiana University School of Informatics, Indiana University-Purdue University and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
|
36
|
Kountouris P, Hirst JD. Prediction of backbone dihedral angles and protein secondary structure using support vector machines. BMC Bioinformatics 2009; 10:437. [PMID: 20025785 PMCID: PMC2811710 DOI: 10.1186/1471-2105-10-437] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2009] [Accepted: 12/22/2009] [Indexed: 11/26/2022] Open
Abstract
Background The prediction of the secondary structure of a protein is a critical step in the prediction of its tertiary structure and, potentially, its function. Moreover, the backbone dihedral angles, highly correlated with secondary structures, provide crucial information about the local three-dimensional structure. Results We predict independently both the secondary structure and the backbone dihedral angles and combine the results in a loop to enhance each prediction reciprocally. Support vector machines, a state-of-the-art supervised classification technique, achieve secondary structure predictive accuracy of 80% on a non-redundant set of 513 proteins, significantly higher than other methods on the same dataset. The dihedral angle space is divided into a number of regions using two unsupervised clustering techniques in order to predict the region in which a new residue belongs. The performance of our method is comparable to, and in some cases more accurate than, other multi-class dihedral prediction methods. Conclusions We have created an accurate predictor of backbone dihedral angles and secondary structure. Our method, called DISSPred, is available online at http://comp.chem.nottingham.ac.uk/disspred/.
Affiliation(s)
- Petros Kountouris
- School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, UK.
|
37
|
Dehouck Y, Grosfils A, Folch B, Gilis D, Bogaerts P, Rooman M. Fast and accurate predictions of protein stability changes upon mutations using statistical potentials and neural networks: PoPMuSiC-2.0. Bioinformatics 2009; 25:2537-43. [PMID: 19654118 DOI: 10.1093/bioinformatics/btp445] [Citation(s) in RCA: 321] [Impact Index Per Article: 20.1] [Indexed: 01/09/2023]
Abstract
MOTIVATION The rational design of proteins with modified properties, through amino acid substitutions, is of crucial importance in a large variety of applications. Given the huge number of possible substitutions, every protein engineering project would benefit strongly from the guidance of in silico methods able to predict rapidly, and with reasonable accuracy, the stability changes resulting from all possible mutations in a protein. RESULTS We exploit newly developed statistical potentials, based on a formalism that highlights the coupling between four protein sequence and structure descriptors, and take into account the amino acid volume variation upon mutation. The stability change is expressed as a linear combination of these energy functions, whose proportionality coefficients vary with the solvent accessibility of the mutated residue and are identified with the help of a neural network. A correlation coefficient of R = 0.63 and a root mean square error of sigma(c) = 1.15 kcal/mol between measured and predicted stability changes are obtained upon cross-validation. These scores reach R = 0.79, and sigma(c) = 0.86 kcal/mol after exclusion of 10% outliers. The predictive power of our method is shown to be significantly higher than that of other programs described in the literature. AVAILABILITY http://babylone.ulb.ac.be/popmusic
Affiliation(s)
- Yves Dehouck
- Bioinformatique génomique et structurale, Université Libre de Bruxelles. Av Fr. Roosevelt 50, CP165/61, 1050 Brussels, Belgium.
|
38
|
Betancourt MR. Another look at the conditions for the extraction of protein knowledge-based potentials. Proteins 2009; 76:72-85. [PMID: 19089977 DOI: 10.1002/prot.22320] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 01/28/2023]
Abstract
Protein knowledge-based potentials are effective free energies obtained from databases of known protein structures. They are used to parameterize coarse-grained protein models in many folding simulation and structure prediction methods. Two common approaches are used in the derivation of knowledge-based potentials. One assumes that the energy parameters optimize the native structure stability. The other assumes that interaction events are related to their energies according to the Boltzmann distribution, and that they are distributed independently of other events, that is, the quasi-chemical approximation. Here, these assumptions are systematically tested by extracting contact energies from artificial databases of lattice proteins with predefined pairwise contact energies. Databases of protein sequences are designed to either satisfy the Boltzmann distribution at high or low temperatures, or to simultaneously optimize the native stability and folding kinetics. It is found that the quasi-chemical approximation, with the ideal reference state, accurately reproduces the true energies for high temperature Boltzmann distributed sequences (weakly interacting residues), but less accurately at low temperatures, where the sequences correspond to energy minima and the residues are strongly interacting. To overcome this problem, an iterative procedure for Boltzmann distributed sequences is introduced, which accounts for interacting residue correlations and eliminates the need for the quasi-chemical approximation. In this case, the energies are accurately reproduced at any ensemble temperature. However, when the database of sequences designed for optimal stability and kinetics is used, the energy correlation is less than optimal using either method, exhibiting random and systematic deviations from linearity. Therefore, the assumption that native structures are maximally stable or that sequences are determined according to the Boltzmann distribution seems to be inadequate for obtaining accurate energies. The limited number of sequences in the database and the inhomogeneous concentration of amino acids from one structure to another do not seem to be major obstacles for improving the quality of the extracted pairwise energies, with the exception of repulsive interactions.
Affiliation(s)
- Marcos R Betancourt
- Department of Physics, Indiana University Purdue University Indianapolis, Indianapolis, Indiana 46202, USA.
|
39
|
Wong M, Toth J, Haney S, Tyshenko MG, Darshan S, Krewski D, Leighton FA, Westaway D, Moore SS, Ricketts M, Cashman N. Prionet Canada: a network of centres of excellence for research into prions and prion diseases. Journal of Toxicology and Environmental Health, Part A 2009; 72:1000-1007. [PMID: 19697232 DOI: 10.1080/15287390903084108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2023]
Abstract
PrioNet Canada's strength in basic, applied, and social research is helping to solve the food, health safety, and socioeconomic problems associated with prion diseases. Prion diseases are transmissible, fatal neurodegenerative diseases of humans and animals. Examples of prion diseases include bovine spongiform encephalopathy (BSE, commonly known as "mad cow" disease), Creutzfeldt-Jakob disease in humans, and chronic wasting disease (CWD) in deer and elk. As of March 31, 2008, PrioNet's interdisciplinary network included 62 scientific members, 5 international collaborators, and more than 150 students and young professionals working in partnership with 25 different government, nongovernment, and industry partners. PrioNet's activities are developing strategies based on a sustained, rational approach that will mitigate, and ultimately control, prion diseases in Canada.
|
40
|
Dasgupta B, Chakrabarti P. pi-Turns: types, systematics and the context of their occurrence in protein structures. BMC Structural Biology 2008; 8:39. [PMID: 18808671 PMCID: PMC2559839 DOI: 10.1186/1472-6807-8-39] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 04/15/2008] [Accepted: 09/22/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND For a proper understanding of protein structure and folding it is important to know whether a polypeptide segment adopts a conformation inherent in the sequence or one that depends on the context of its flanking secondary structures. Turns of various lengths have been studied and characterized, from the three-residue gamma-turn to the six-residue pi-turn. The Schellman motif occurring at the C-terminal end of alpha-helices is a classical example of a hydrogen-bonded pi-turn involving residues at positions (i) and (i+5). Hydrogen-bonded and non-hydrogen-bonded beta- and alpha-turns have been identified previously; likewise, a systematic characterization of pi-turns would provide valuable insight into turn structures. RESULTS An analysis of protein structures indicates that at least 20% of pi-turns occur independent of the Schellman motif. The two categories of pi-turns, designated as pi-HB and SCH, have been further classified on the basis of backbone conformation and both have AAAa as the major class. They differ in the residue usage at position (i+1), the former having a large preference for Pro that is absent in the latter. As in the case of the shorter beta- and alpha-turns, pi-turns have been identified not only on the basis of the existence of a hydrogen bond, but also using the distance between terminal C alpha atoms, and this resulted in a comparable number of non-hydrogen-bonded pi-turns (pi-NHB). The presence of shorter beta- and alpha-turns within all categories of pi-turns, the subtle variations in backbone torsion angles along the turn residues, and the location of the turns in the context of tertiary structures have been studied. CONCLUSION pi-turns have been characterized, first using the hydrogen bond and the distance between C alpha atoms of the terminal residues, and then using backbone torsion angles. While the Schellman motif has a structural role in helix termination, many of the pi-HB turns, being located on surface cavities, have a functional role and also show sequence conservation.
|
41
|
Wang S, Zheng WM. CLePAPS: fast pair alignment of protein structures based on conformational letters. J Bioinform Comput Biol 2008; 6:347-66. [PMID: 18464327 DOI: 10.1142/s0219720008003461] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Received: 10/20/2007] [Revised: 11/22/2007] [Accepted: 12/05/2007] [Indexed: 11/18/2022]
Abstract
Fast, efficient, and reliable algorithms for pairwise alignment of protein structures are in ever-increasing demand for analyzing the rapidly growing data on protein structures. CLePAPS is a tool developed for this purpose. It distinguishes itself from other existing algorithms by the use of conformational letters, which are discretized states of 3D segmental structural states. A letter corresponds to a cluster of combinations of the three angles formed by Calpha pseudobonds of four contiguous residues. A substitution matrix called CLESUM is available to measure the similarity between any two such letters. CLePAPS regards an aligned fragment pair (AFP) as an ungapped string pair with a high sum of pairwise CLESUM scores. Using CLESUM scores as the similarity measure, CLePAPS searches for AFPs by simple string comparison. The transformation which best superimposes a highly similar AFP can be used to superimpose the structure pairs under comparison. A highly scored AFP which is consistent with several other AFPs determines an initial alignment. CLePAPS then joins consistent AFPs guided by their similarity scores to extend the alignment by several "zoom-in" iteration steps. A follow-up refinement produces the final alignment. CLePAPS does not implement dynamic programming. The utility of CLePAPS is tested on various protein structure pairs.
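As an illustration of the geometric input described in this abstract, the sketch below computes three angles from the Cα coordinates of four contiguous residues. It assumes that the "three angles" are the two pseudobond bend angles and the one pseudobond torsion angle; the abstract does not spell this out, so treat that reading, the function names, and the toy coordinates as assumptions. The clustering into letters and the CLESUM matrix are not reproduced here.

```python
import numpy as np

def bend_angle(a, b, c):
    """Angle in degrees at point b between the pseudobonds b->a and b->c."""
    u, v = a - b, c - b
    cos_angle = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))

def torsion_angle(p0, p1, p2, p3):
    """Signed torsion in degrees about the p1->p2 pseudobond."""
    b0, b1, b2 = p1 - p0, p2 - p1, p3 - p2
    n1, n2 = np.cross(b0, b1), np.cross(b1, b2)
    y = np.linalg.norm(b1) * np.dot(b0, n2)
    x = np.dot(n1, n2)
    return float(np.degrees(np.arctan2(y, x)))

def segment_angles(ca):
    """Return (bend1, bend2, torsion) for a 4x3 array of C-alpha coordinates.

    Hypothetical helper: the two-bends-plus-torsion reading is an assumption.
    """
    ca = np.asarray(ca, dtype=float)
    return (
        bend_angle(ca[0], ca[1], ca[2]),
        bend_angle(ca[1], ca[2], ca[3]),
        torsion_angle(ca[0], ca[1], ca[2], ca[3]),
    )

# Toy coordinates, not taken from any real structure.
toy_segment = [[1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 1]]
angles = segment_angles(toy_segment)
```

Discretizing such angle triples into a small alphabet would require a clustering step that is outside the scope of this sketch.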
Affiliation(s)
- Sheng Wang
- Institute of Theoretical Physics, Academia Sinica, Beijing 100080, China
|
42
|
Abstract
The backbone structure of a protein is largely determined by the phi and psi torsion angles. Thus, knowing these angles, even if approximately, will be very useful for protein-structure prediction. However, in a previous work, a sequence-based, real-value prediction of psi angle could only achieve a mean absolute error of 54 degrees (83 degrees, 35 degrees, 33 degrees for coil, strand, and helix residues, respectively) between predicted and actual angles. Moreover, a real-value prediction of phi angle is not yet available. This article employs a neural-network based approach to improve psi prediction by taking advantage of angle periodicity and applies the new method to the prediction of phi angles. The 10-fold-cross-validated mean absolute error for the new method is 38 degrees (58 degrees, 33 degrees, 22 degrees for coil, strand, and helix, respectively) for psi and 25 degrees (35 degrees, 22 degrees, 16 degrees for coil, strand, and helix, respectively) for phi. The accuracy of real-value prediction is comparable to or more accurate than the predictions based on multistate classification of the phi-psi map. More accurate prediction of real-value angles will likely be useful for improving the accuracy of fold recognition and ab initio protein-structure prediction. The Real-SPINE 2.0 server is available on the website http://sparks.informatics.iupui.edu.
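Because torsion angles are periodic, an error between a predicted and an actual angle is only meaningful once the difference is wrapped around the circle. The sketch below shows one common way to compute a periodic mean absolute error in degrees. It is a generic illustration, not the error definition or network encoding used by the authors, which the abstract does not specify; the function names are hypothetical.

```python
import numpy as np

def angular_abs_error(pred_deg, true_deg):
    """Absolute angular difference in degrees, wrapped into [0, 180]."""
    pred = np.asarray(pred_deg, dtype=float)
    true = np.asarray(true_deg, dtype=float)
    diff = np.abs(pred - true) % 360.0
    # Going the other way around the circle may be shorter.
    return np.minimum(diff, 360.0 - diff)

def mean_angular_abs_error(pred_deg, true_deg):
    """Mean of the wrapped absolute errors over all residues."""
    return float(np.mean(angular_abs_error(pred_deg, true_deg)))

# Toy values: 179 vs -179 degrees differ by 2 degrees, not 358.
toy_pred = [179.0, 10.0]
toy_true = [-179.0, 350.0]
toy_mae = mean_angular_abs_error(toy_pred, toy_true)
```

Without the wrapping step, the first toy pair would contribute an error of 358 degrees even though the two angles are nearly identical.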
Affiliation(s)
- Bin Xue
- Indiana University School of Informatics, Indiana University-Purdue University, Indianapolis, Indiana 46202, USA
|
43
|
Liu X, Zhao YP, Zheng WM. CLEMAPS: Multiple alignment of protein structures based on conformational letters. Proteins 2008; 71:728-36. [DOI: 10.1002/prot.21739] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/06/2022]
|
44
|
|
45
|
Hidden Markov Models for prediction of protein features. Methods in Molecular Biology (Clifton, N.J.) 2008; 413:173-98. [PMID: 18075166 DOI: 10.1007/978-1-59745-574-9_7] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 12/04/2022]
Abstract
Hidden Markov Models (HMMs) are an extremely versatile statistical representation that can be used to model any set of one-dimensional discrete symbol data. HMMs can model protein sequences in many ways, depending on what features of the protein are represented by the Markov states. For protein structure prediction, states have been chosen to represent either homologous sequence positions, local or secondary structure types, or transmembrane locality. The resulting models can be used to predict common ancestry, secondary or local structure, or membrane topology by applying one of the two standard algorithms for comparing a sequence to a model. In this chapter, we review those algorithms and discuss how HMMs have been constructed and refined for the purpose of protein structure prediction.
|
46
|
Yang Y, Liu H. Genetic algorithms for protein conformation sampling and optimization in a discrete backbone dihedral angle space. J Comput Chem 2007; 27:1593-602. [PMID: 16868993 DOI: 10.1002/jcc.20463] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/11/2022]
Abstract
We have investigated protein conformation sampling and optimization based on the genetic algorithm and discrete main chain dihedral state model. An efficient approach combining the genetic algorithm with local minimization and with a niche technique based on the sharing function is proposed. Using two different types of potential energy functions, a Go-type potential function and a knowledge-based pairwise potential energy function, and a test set containing small proteins of varying sizes and secondary structure compositions, we demonstrated the importance of local minimization and population diversity in protein conformation optimization with genetic algorithms. Some general properties of the sampled conformations such as their native-likeness and the influences of including side-chains are discussed.
Affiliation(s)
- Yuedong Yang
- Hefei National Laboratory for Physical Sciences, Key Laboratory of Structural Biology, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
|
47
|
Kwasigroch JM, Rooman M. Prelude&Fugue, predicting local protein structure, early folding regions and structural weaknesses. Bioinformatics 2006; 22:1800-2. [PMID: 16682423 DOI: 10.1093/bioinformatics/btl176] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]
Abstract
Prelude&Fugue are bioinformatics tools aiming at predicting the local 3D structure of a protein from its amino acid sequence in terms of seven backbone torsion angle domains, using database-derived potentials. Prelude(&Fugue) computes all lowest free energy conformations of a protein or protein region, ranked by increasing energy, and possibly satisfying some interresidue distance constraints specified by the user. (Prelude&)Fugue detects sequence regions whose predicted structure is significantly preferred relative to other conformations in the absence of tertiary interactions. These programs can be used for predicting secondary structure, tertiary structure of short peptides, flickering early folding sequences and peptides that adopt a preferred conformation in solution. They can also be used for detecting structural weaknesses, i.e. sequence regions that are not optimal with respect to the tertiary fold. AVAILABILITY http://babylone.ulb.ac.be/Prelude_and_Fugue.
Affiliation(s)
- Jean Marc Kwasigroch
- Unité de Bioinformatique génomique et structurale, Université Libre de Bruxelles, CP 165/61, Avenue Roosevelt 50, 1050 Bruxelles, Belgium.
|
48
|
Abstract
We propose a novel and flexible derivation scheme of statistical, database-derived potentials, which allows one to take into account simultaneously specific correlations between several sequence and structure descriptors. This scheme leads to the decomposition of the total folding free energy of a protein into a sum of lower order terms, thereby making it possible to analyze each contribution independently and clarify its significance and importance, to avoid overcounting certain contributions, and to deal more efficiently with the limited size of the database. In addition, this derivation scheme appears quite general, since many previously developed potentials can be expressed as particular cases of our formalism. We use this formalism as a framework to generate different residue-based energy functions, whose performances are assessed on the basis of their ability to discriminate genuine proteins from decoy models. The optimal potential is generated as a combination of several coupling terms, measuring correlations between residue types, backbone torsion angles, solvent accessibilities, relative positions along the sequence, and interresidue distances. This potential outperforms all tested residue-based potentials, and even several atom-based potentials. Its incorporation in algorithms aiming at predicting protein structure and stability should therefore substantially improve their performances.
Affiliation(s)
- Y Dehouck
- Unité de Bioinformatique génomique et structurale, Université Libre de Bruxelles, 1050 Brussels, Belgium.
|
49
|
Affiliation(s)
- Roger E Ison
- Department of Computer Science and Engineering, University of Colorado at Denver, 80217-3364, USA.
|
50
|
Thomas GL, Sessions RB, Parker MJ. Density guided importance sampling: application to a reduced model of protein folding. Bioinformatics 2005; 21:2839-43. [PMID: 15802285 DOI: 10.1093/bioinformatics/bti421] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022]
Abstract
MOTIVATION Monte Carlo methods are the most effective means of exploring the energy landscapes of protein folding. However, the rugged topography of folding energy landscapes causes sampling inefficiencies, particularly at low, physiological temperatures. RESULTS A hybrid Monte Carlo method, termed density guided importance sampling (DGIS), is presented that overcomes these sampling inefficiencies. The method is shown to be highly accurate and efficient in determining Boltzmann weighted structural metrics of a discrete off-lattice protein model. In comparison to the Metropolis Monte Carlo method and the hybrid Monte Carlo methods jump-walking, smart-walking and replica-exchange, the DGIS method is shown to be more efficient, requiring no parameter optimization. The method guides the simulation towards under-sampled regions of the energy spectrum and recognizes when equilibrium has been reached, avoiding arbitrary and excessively long simulation times. AVAILABILITY Fortran code available from authors upon request. CONTACT m.j.parker@leeds.ac.uk.
Affiliation(s)
- Geraint L Thomas
- Astbury Centre for Structural Molecular Biology, Department of Biochemistry and Microbiology, University of Leeds, Leeds LS2 9JT, UK
|