1
Bu J, Luo N, Shen C, Xu C, Zhu Q, Chen C, Xie Y, Liu X, Liu Y, Luo C, Zhang X. A fast and efficient virtual screening and identification strategy for helix peptide binders based on finDr webserver: A case study of bovine serum albumin (BSA). Int J Biol Macromol 2025; 306:141118. [PMID: 39993680 DOI: 10.1016/j.ijbiomac.2025.141118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2024] [Revised: 02/05/2025] [Accepted: 02/14/2025] [Indexed: 02/26/2025]
Abstract
Peptides offer unique advantages, including strong specificity, rapid action, and low side effects, making them a prominent focus in the development of new drugs and functional materials. However, the rapid and efficient screening and identification of high-affinity peptides for specific targets remains a significant challenge. In this study, we successfully screened 12-helix candidate peptides using bovine serum albumin (BSA) as the target protein, employing the computer-aided peptide virtual screening webserver finDr. Among the top five candidate peptides, we identified E4-TP2 (GVATVVARLFLL) as a peptide that binds BSA with high affinity (KD = 39.4 nM), as confirmed with an in vitro molecular interaction instrument. The interaction mode of the peptide-BSA complex was analyzed using Ligplot software, revealing that the primary interactions involved hydrophobic forces and hydrogen bonds. Molecular dynamics simulations further elucidated the molecular mechanisms underlying the high-affinity peptide interaction: the complex exhibited good conformational stability and a favorable binding free energy (MM/PBSA: -21.075 ± 5.471 kJ/mol). In conclusion, the finDr virtual screening strategy and the molecular interaction identification method employed in this study provide a robust technical approach for the rapid and efficient acquisition of high-affinity binding peptides for target proteins of interest.
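As a rough aid for reading the reported KD, the standard thermodynamic relation ΔG° = RT ln(KD / 1 M) converts a dissociation constant into a standard binding free energy. The sketch below applies it to the 39.4 nM value from the abstract. The temperature of 298.15 K is an assumption (the abstract does not state one), and the function name is hypothetical. The result is not expected to match the MM/PBSA estimate quoted above, since MM/PBSA values are generally not directly comparable to experimental free energies.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def binding_free_energy_kj(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy dG = R*T*ln(KD / 1 M), in kJ/mol.

    temperature_k defaults to 298.15 K, which is an assumption and not a
    value taken from the abstract.
    """
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_molar)


# KD reported in the abstract: 39.4 nM
dg = binding_free_energy_kj(39.4e-9)
print(f"dG at 298.15 K: {dg:.1f} kJ/mol")
```

A more negative value corresponds to tighter binding; a tenfold decrease in KD changes ΔG° by about RT ln 10, roughly 5.7 kJ/mol at this temperature.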
Affiliation(s)
- Jiarui Bu
- Jiangsu Provincial Key Construction Laboratory of Probiotics Preparation, Huaiyin Institute of Technology, Huaian 223003, China; Jiangsu Key Laboratory for Food Quality and Safety-State Key Laboratory Cultivation Base of Ministry of Science and Technology, Institute of Food Safety and Nutrition, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, China
- Na Luo
- Jiangsu Provincial Key Construction Laboratory of Probiotics Preparation, Huaiyin Institute of Technology, Huaian 223003, China; Jiangsu Key Laboratory for Food Quality and Safety-State Key Laboratory Cultivation Base of Ministry of Science and Technology, Institute of Food Safety and Nutrition, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, China
- Cheng Shen
- Jiangsu Key Laboratory for Food Quality and Safety-State Key Laboratory Cultivation Base of Ministry of Science and Technology, Institute of Food Safety and Nutrition, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, China
- Chongxin Xu
- Jiangsu Key Laboratory for Food Quality and Safety-State Key Laboratory Cultivation Base of Ministry of Science and Technology, Institute of Food Safety and Nutrition, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, China
- Qing Zhu
- Jiangsu Key Laboratory for Food Quality and Safety-State Key Laboratory Cultivation Base of Ministry of Science and Technology, Institute of Food Safety and Nutrition, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, China
- Chengyu Chen
- Jiangsu Key Laboratory for Food Quality and Safety-State Key Laboratory Cultivation Base of Ministry of Science and Technology, Institute of Food Safety and Nutrition, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, China
- Yajing Xie
- Jiangsu Key Laboratory for Food Quality and Safety-State Key Laboratory Cultivation Base of Ministry of Science and Technology, Institute of Food Safety and Nutrition, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, China
- Xianjin Liu
- Jiangsu Key Laboratory for Food Quality and Safety-State Key Laboratory Cultivation Base of Ministry of Science and Technology, Institute of Food Safety and Nutrition, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, China
- Yuan Liu
- Jiangsu Provincial Key Construction Laboratory of Probiotics Preparation, Huaiyin Institute of Technology, Huaian 223003, China; Jiangsu Key Laboratory for Food Quality and Safety-State Key Laboratory Cultivation Base of Ministry of Science and Technology, Institute of Food Safety and Nutrition, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, China
- Chuping Luo
- Jiangsu Provincial Key Construction Laboratory of Probiotics Preparation, Huaiyin Institute of Technology, Huaian 223003, China
- Xiao Zhang
- Jiangsu Provincial Key Construction Laboratory of Probiotics Preparation, Huaiyin Institute of Technology, Huaian 223003, China; Jiangsu Key Laboratory for Food Quality and Safety-State Key Laboratory Cultivation Base of Ministry of Science and Technology, Institute of Food Safety and Nutrition, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, China
2
Zalewski M, Wallner B, Kmiecik S. Protein-Peptide Docking with ESMFold Language Model. J Chem Theory Comput 2025; 21:2817-2821. [PMID: 40053869 PMCID: PMC11948316 DOI: 10.1021/acs.jctc.4c01585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2024] [Revised: 03/02/2025] [Accepted: 03/05/2025] [Indexed: 03/09/2025]
Abstract
Designing peptide therapeutics requires precise peptide docking, which remains a challenge. We assessed the ESMFold language model, originally designed for protein structure prediction, for its effectiveness in protein-peptide docking. Various docking strategies, including polyglycine linkers and sampling-enhancing modifications, were explored. The number of acceptable-quality models among top-ranking results is comparable to traditional methods and generally lower than AlphaFold-Multimer or AlphaFold 3, though ESMFold surpasses them in some cases. The combination of result quality and computational efficiency underscores ESMFold's potential value as a component in a consensus approach for high-throughput peptide design.
Affiliation(s)
- Mateusz Zalewski
- Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Björn Wallner
- Department of Physics, Chemistry and Biology, Linköping University, Linköping 58 183, Sweden
- Sebastian Kmiecik
- Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
3
Jin X, Chen Z, Yu D, Jiang Q, Chen Z, Yan B, Qin J, Liu Y, Wang J. TPepPro: a deep learning model for predicting peptide-protein interactions. Bioinformatics 2024; 41:btae708. [PMID: 39585721 PMCID: PMC11681936 DOI: 10.1093/bioinformatics/btae708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 10/23/2024] [Accepted: 11/24/2024] [Indexed: 11/26/2024] Open
Abstract
MOTIVATION Peptides and their derivatives hold potential as therapeutic agents. The rising interest in developing peptide drugs is evidenced by increasing approval rates by the US FDA. To identify the most promising peptides, the study of peptide-protein interactions (PepPIs) is a very important approach but poses considerable technical challenges. On the experimental side, the transient nature of PepPIs and the high flexibility of peptides contribute to elevated costs and inefficiency. Traditional docking and molecular dynamics simulation methods require substantial computational resources, and the predictive accuracy of their results remains unsatisfactory. RESULTS To address this gap, we proposed TPepPro, a Transformer-based model for PepPI prediction. We trained TPepPro on a dataset of 19,187 pairs of peptide-protein complexes with both sequential and structural features. TPepPro utilizes a strategy that combines local protein sequence feature extraction with global protein structure feature extraction. Moreover, TPepPro optimizes the architecture of the structural feature neural network with a BN-ReLU arrangement, which notably reduced the computing resources required for PepPI prediction. In a comparative analysis, TPepPro reached an accuracy of 0.855, an 8.1% improvement over the second-best model, TAGPPI. TPepPro achieved an AUC of 0.922, surpassing the second-best model, TAGPPI, at 0.844. Moreover, TPepPro identifies certain PepPIs that can be validated against previous experimental evidence, indicating its ability to detect high-potential PepPIs that would be helpful for amino acid drug applications. AVAILABILITY AND IMPLEMENTATION The source code of TPepPro is available at https://github.com/wanglabhku/TPepPro.
Affiliation(s)
- Xiaohong Jin
- School of Electronic Information, Guangxi University for Nationalities, Nanning 530000, China
- Zimeng Chen
- Division of Applied Oral Sciences and Community Dental Care, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
- Dan Yu
- Division of Applied Oral Sciences and Community Dental Care, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
- Qianhui Jiang
- Division of Applied Oral Sciences and Community Dental Care, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
- Zhuobin Chen
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong 518107, China
- Bin Yan
- Division of Applied Oral Sciences and Community Dental Care, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
- Jing Qin
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong 518107, China
- Yong Liu
- School of Artificial Intelligence, Guangxi University for Nationalities, Nanning 530000, China
- Junwen Wang
- Division of Applied Oral Sciences and Community Dental Care, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
- State Key Laboratory of Pharmaceutical Biotechnology, The University of Hong Kong, Hong Kong SAR, China
- HKU Shenzhen Hospital, Shenzhen 518000, China
4
Huang J, Li W, Xiao B, Zhao C, Zheng H, Li Y, Wang J. PepCA: Unveiling protein-peptide interaction sites with a multi-input neural network model. iScience 2024; 27:110850. [PMID: 39391726 PMCID: PMC11465048 DOI: 10.1016/j.isci.2024.110850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Revised: 06/13/2024] [Accepted: 08/27/2024] [Indexed: 10/12/2024] Open
Abstract
The protein-peptide interaction plays a pivotal role in fields such as drug development, yet remains underexplored experimentally and challenging to model computationally. Herein, we introduce PepCA, a sequence-based approach for predicting peptide-binding sites on proteins. A primary obstacle in predicting peptide-protein interactions is the difficulty in acquiring precise protein structures, coupled with the uncertainty of polypeptide configurations. To address this, we first encode protein sequences using the Evolutionary Scale Modeling 2 (ESM-2) pre-trained model to extract latent structural information. Additionally, we have developed a multi-input coattention mechanism to concurrently update the encoding of both peptide and protein residues. PepCA integrates this module within an encoder-decoder structure. This model's high precision in identifying binding sites significantly advances the field of computational biology, offering vital insights for peptide drug development and protein science.
Affiliation(s)
- Junxiong Huang
- iCarbonX (Zhuhai) Company Limited, Zhuhai, Guangdong, China
- iCarbonX (Shenzhen) Pharmaceutical Technology Co, Shenzhen, Guangdong, China
- Weikang Li
- iCarbonX (Zhuhai) Company Limited, Zhuhai, Guangdong, China
- iCarbonX (Shenzhen) Pharmaceutical Technology Co, Shenzhen, Guangdong, China
- Bin Xiao
- iCarbonX (Zhuhai) Company Limited, Zhuhai, Guangdong, China
- iCarbonX (Shenzhen) Pharmaceutical Technology Co, Shenzhen, Guangdong, China
- Chunqing Zhao
- iCarbonX (Zhuhai) Company Limited, Zhuhai, Guangdong, China
- iCarbonX (Shenzhen) Pharmaceutical Technology Co, Shenzhen, Guangdong, China
- Hancheng Zheng
- iCarbonX (Zhuhai) Company Limited, Zhuhai, Guangdong, China
- Shenzhen Digital Life Institute, Shenzhen, Guangdong, China
- Yingrui Li
- iCarbonX (Zhuhai) Company Limited, Zhuhai, Guangdong, China
- Faculty of Health and Medical Sciences, University of Surrey, Guildford, Surrey, UK
- Shenzhen Digital Life Institute, Shenzhen, Guangdong, China
- iCarbonX (Shenzhen) Pharmaceutical Technology Co, Shenzhen, Guangdong, China
- Jun Wang
- iCarbonX (Zhuhai) Company Limited, Zhuhai, Guangdong, China
- State Key Laboratory of Quality Research in Chinese Medicine, Macau University of Science and Technology, Taipa, Macau, China
- Shenzhen Digital Life Institute, Shenzhen, Guangdong, China
- iCarbonX (Shenzhen) Pharmaceutical Technology Co, Shenzhen, Guangdong, China
5
Aslan A, Ari Yuka S. Therapeutic peptides for coronary artery diseases: in silico methods and current perspectives. Amino Acids 2024; 56:37. [PMID: 38822212 PMCID: PMC11143054 DOI: 10.1007/s00726-024-03397-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Accepted: 05/06/2024] [Indexed: 06/02/2024]
Abstract
Many drug formulations containing small active molecules are used for the treatment of coronary artery disease, which affects a significant part of the world's population. However, the inadequate profile of these molecules in terms of therapeutic efficacy has led to the therapeutic use of protein and peptide-based biomolecules with superior properties, such as target-specific affinity and low immunogenicity, in critical diseases. Protein‒protein interactions, as a consequence of advances in molecular techniques with strategies involving the combined use of in silico methods, have enabled the design of therapeutic peptides to reach an advanced dimension. In particular, with the advantages provided by protein/peptide structural modeling, molecular docking for the study of their interactions, molecular dynamics simulations for their interactions under physiological conditions and machine learning techniques that can work in combination with all these, significant progress has been made in approaches to developing therapeutic peptides that can modulate the development and progression of coronary artery diseases. In this scope, this review discusses in silico methods for the development of peptide therapeutics for the treatment of coronary artery disease and strategies for identifying the molecular mechanisms that can be modulated by these designs and provides a comprehensive perspective for future studies.
Affiliation(s)
- Ayca Aslan
- Department of Bioengineering, Faculty of Chemical and Metallurgical Engineering, Yildiz Technical University, Esenler, Istanbul, Turkey
- Health Biotechnology Joint Research and Application Center of Excellence, Esenler, Istanbul, Turkey
- Selcen Ari Yuka
- Department of Bioengineering, Faculty of Chemical and Metallurgical Engineering, Yildiz Technical University, Esenler, Istanbul, Turkey
- Health Biotechnology Joint Research and Application Center of Excellence, Esenler, Istanbul, Turkey
6
Gurvich R, Markel G, Tanoli Z, Meirson T. Peptriever: a Bi-Encoder approach for large-scale protein-peptide binding search. Bioinformatics 2024; 40:btae303. [PMID: 38710496 PMCID: PMC11112044 DOI: 10.1093/bioinformatics/btae303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Revised: 03/31/2024] [Accepted: 05/03/2024] [Indexed: 05/08/2024] Open
Abstract
MOTIVATION Peptide therapeutics hinge on the precise interaction between a tailored peptide and its designated receptor, while mitigating interactions with alternate receptors is equally indispensable. Existing methods primarily estimate the binding score between protein and peptide pairs. However, for a specific peptide without a corresponding protein, it is challenging to identify the proteins it could bind due to the sheer number of potential candidates. RESULTS We propose a transformer-based protein embedding scheme in this study that can quickly identify and rank millions of interacting proteins. Furthermore, the proposed approach outperforms existing sequence- and structure-based methods, with a mean AUC-ROC and AUC-PR of 0.73. AVAILABILITY AND IMPLEMENTATION Training data, scripts, and fine-tuned parameters are available at https://github.com/RoniGurvich/Peptriever. The proposed method is linked with a web application available for customized prediction at https://peptriever.app/.
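To illustrate why a bi-encoder makes large-scale search cheap, the sketch below ranks precomputed protein embeddings against a single peptide embedding. It is a toy illustration of the general retrieval idea only, not the Peptriever implementation: the random vectors stand in for encoder outputs, the use of cosine similarity is an assumption (the abstract does not name the scoring function), and `cosine` and `rank_proteins` are hypothetical helper names.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_proteins(peptide_vec, protein_vecs, top_k=3):
    """Return the top_k (score, protein_index) pairs, best match first."""
    scored = [(cosine(peptide_vec, p), i) for i, p in enumerate(protein_vecs)]
    scored.sort(reverse=True)
    return scored[:top_k]


# Toy stand-ins for encoder outputs: 1000 "proteins" in a 64-dimensional space.
rng = random.Random(0)
protein_vecs = [[rng.gauss(0, 1) for _ in range(64)] for _ in range(1000)]

# A "peptide" embedding placed close to protein 42, mimicking a true binder.
peptide_vec = [x + 0.1 * rng.gauss(0, 1) for x in protein_vecs[42]]

top = rank_proteins(peptide_vec, protein_vecs)
for score, index in top:
    print(f"protein {index}: similarity {score:.3f}")
```

Because proteins and peptides are embedded independently, the protein vectors can be computed once and reused for every query, so each new peptide costs one encoder pass plus a similarity scan (or an approximate nearest-neighbour lookup at larger scale).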
Affiliation(s)
- Roni Gurvich
- Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva 49100, Israel
- Gal Markel
- Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva 49100, Israel
- Faculty of Medicine, Tel Aviv University, Tel-Aviv 6997801, Israel
- Samueli Integrative Cancer Pioneering Institute, Rabin Medical Center-Beilinson Hospital, Petah Tikva, Israel
- Ziaurrehman Tanoli
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki 00290, Finland
- Tomer Meirson
- Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva 49100, Israel
- Faculty of Medicine, Tel Aviv University, Tel-Aviv 6997801, Israel
- Samueli Integrative Cancer Pioneering Institute, Rabin Medical Center-Beilinson Hospital, Petah Tikva, Israel
7
Chang L, Mondal A, Singh B, Martínez-Noa Y, Perez A. Revolutionizing Peptide-Based Drug Discovery: Advances in the Post-AlphaFold Era. Wiley Interdisciplinary Reviews. Computational Molecular Science 2024; 14:e1693. [PMID: 38680429 PMCID: PMC11052547 DOI: 10.1002/wcms.1693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 09/18/2023] [Indexed: 05/01/2024]
Abstract
Peptide-based drugs offer high specificity, potency, and selectivity. However, their inherent flexibility and differences in conformational preferences between their free and bound states create unique challenges that have hindered progress in effective drug discovery pipelines. The emergence of AlphaFold (AF) and Artificial Intelligence (AI) presents new opportunities for enhancing peptide-based drug discovery. We explore recent advancements that facilitate a successful peptide drug discovery pipeline, considering peptides' attractive therapeutic properties and strategies to enhance their stability and bioavailability. AF enables efficient and accurate prediction of peptide-protein structures, addressing a critical requirement in computational drug discovery pipelines. In the post-AF era, we are witnessing rapid progress with the potential to revolutionize peptide-based drug discovery such as the ability to rank peptide binders or classify them as binders/non-binders and the ability to design novel peptide sequences. However, AI-based methods are struggling due to the lack of well-curated datasets, for example to accommodate modified amino acids or unconventional cyclization. Thus, physics-based methods, such as docking or molecular dynamics simulations, continue to hold a complementary role in peptide drug discovery pipelines. Moreover, MD-based tools offer valuable insights into binding mechanisms, as well as the thermodynamic and kinetic properties of complexes. As we navigate this evolving landscape, a synergistic integration of AI and physics-based methods holds the promise of reshaping the landscape of peptide-based drug discovery.
Affiliation(s)
- Liwei Chang
- Department of Chemistry, University of Florida, Gainesville, FL 32611
- Arup Mondal
- Department of Chemistry, University of Florida, Gainesville, FL 32611
- Bhumika Singh
- Department of Chemistry, University of Florida, Gainesville, FL 32611
- Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL 32611
8
Feng H, Wang F, Li N, Xu Q, Zheng G, Sun X, Hu M, Li X, Xing G, Zhang G. Use of tree-based machine learning methods to screen affinitive peptides based on docking data. Mol Inform 2023; 42:e202300143. [PMID: 37696773 DOI: 10.1002/minf.202300143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 09/03/2023] [Accepted: 09/11/2023] [Indexed: 09/13/2023]
Abstract
Screening peptides with good affinity is an important step in peptide-drug discovery. Recent advances in computer and data science have made machine learning a useful tool for accurate screening of affinitive peptides. In the current study, four different tree-based algorithms, namely classification and regression trees (CART), C5.0 decision tree (C50), bagged CART (BAG), and random forest (RF), were employed to explore the relationship between experimental peptide affinities and virtual docking data, and the performance of each model was compared in parallel. All four algorithms performed better on the dataset that was pre-scaled, centered, and PCA-transformed than on datasets pre-processed in other ways. After model rebuilding and hyperparameter optimization, the optimal C50 model (C50O) showed the best performance in terms of Accuracy, Kappa, Sensitivity, Specificity, F1, MCC, and AUC when validated on test data and when evaluated on an unknown PEDV dataset (Accuracy = 80.4%). BAG and RFO (the optimal RF), the two best models during training, did not perform as expected in the test and unknown-dataset validations. Furthermore, the high correlation of the predictions of RFO and BAG with those of C50O implied high stability and robustness of their predictions. In contrast, despite its good performance on the unknown dataset, the poor performance of CARTO in test data validation and correlation analysis indicated that it could not be used for future data prediction. To accurately evaluate peptide affinity, the current study presented the first tree-model comparison for affinitive peptide prediction using virtual docking data, which would expand the application of machine learning algorithms in studying PepPIs and benefit the development of peptide therapeutics.
Affiliation(s)
- Hua Feng
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
- Fangyu Wang
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
- Ning Li
- College of Food Science and Technology, Henan Agricultural University, Zhengzhou, China
- Qian Xu
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
- Guanming Zheng
- Public Health and Preventive Medicine Teaching and Research Center, Henan University of Chinese Medicine, Zhengzhou, Henan, China
- Xuefeng Sun
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
- Man Hu
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
- Xuewu Li
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
- Guangxu Xing
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
- Gaiping Zhang
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
- Longhu Modern Immunology Laboratory, Zhengzhou, China
- School of Advanced Agricultural Sciences, Peking University, Beijing, China
- Jiangsu Co-Innovation Center for the Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou, Jiangsu, China
9
Feng H, Wang F, Li N, Xu Q, Zheng G, Sun X, Hu M, Xing G, Zhang G. A Random Forest Model for Peptide Classification Based on Virtual Docking Data. Int J Mol Sci 2023; 24:11409. [PMID: 37511165 PMCID: PMC10380188 DOI: 10.3390/ijms241411409] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 06/06/2023] [Revised: 06/25/2023] [Accepted: 07/11/2023] [Indexed: 07/30/2023] Open
Abstract
The affinity of peptides is a crucial factor in studying peptide-protein interactions. Despite the development of various techniques to evaluate peptide-receptor affinity, the results may not always reflect the actual affinity of the peptides accurately. The current study provides a free tool to assess the actual peptide affinity based on virtual docking data. This study employed a dataset that combined actual peptide affinity information (active and inactive) and virtual peptide-receptor docking data, and different machine learning algorithms were utilized. Compared with the other algorithms, the random forest (RF) algorithm showed the best performance and was used in building three RF models using different numbers of significant features (four, three, and two). Further analysis revealed that the four-feature RF model achieved the highest Accuracy of 0.714 in classifying an independent unknown peptide dataset designed with the PEDV spike protein, and it also revealed overfitting problems in the other models. This four-feature RF model was used to evaluate peptide affinity by constructing the relationship between the actual affinity and the virtual docking scores of peptides to their receptors.
Affiliation(s)
- Hua Feng
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
- Fangyu Wang
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
- Ning Li
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
- Qian Xu
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
- Guanming Zheng
- Public Health and Preventive Medicine Teaching and Research Center, Henan University of Chinese Medicine, Zhengzhou 450046, China
- Xuefeng Sun
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
- Man Hu
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
- Guangxu Xing
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
- Gaiping Zhang
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
- Longhu Modern Immunology Laboratory, Zhengzhou 450002, China
- School of Advanced Agricultural Sciences, Peking University, Beijing 100871, China
- Jiangsu Co-Innovation Center for the Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou 225009, China
10
Trisciuzzi D, Siragusa L, Baroni M, Cruciani G, Nicolotti O. An Integrated Machine Learning Model To Spot Peptide Binding Pockets in 3D Protein Screening. J Chem Inf Model 2022; 62:6812-6824. [PMID: 36320100 DOI: 10.1021/acs.jcim.2c00583] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/06/2022]
Abstract
The prediction of peptide-protein binding sites is of utmost importance to tackle the onset of severe neurodegenerative diseases and cancer. In this work, we detail a novel machine learning model based on Linear Discriminant Analysis (LDA) that proved highly predictive in detecting the putative protein binding regions of small peptides. Starting from 439 high-quality pockets derived from peptide-protein crystallographic complexes, three sets of well-established peptide-binding regions were first selected through a Partitioning Around Medoids (PAM) clustering algorithm based on morphological and energetic 3D GRID-MIF molecular descriptors. Next, the best combination between all the putative interacting peptide pockets and related GRID-MIF scores was automatically explored by using the LDA-based protocol implemented in BioGPS. This approach proved successful in distinguishing the actual interacting peptide regions from all the other pockets of the protein (AUC = 0.86 and partial ROC enrichment at 5% of 0.48). Validated on two external collection sets, including 445 and 347 crystallographic peptide-protein complexes, our LDA-based model could be effective for running further peptide-protein virtual screening campaigns.
Affiliation(s)
- Daniela Trisciuzzi
- Department of Pharmacy-Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", 70125 Bari, Italy; Molecular Discovery Ltd., Kinetic Business Centre, Theobald Street, Elstree, Borehamwood, Hertfordshire WD6 4PJ, United Kingdom
- Lydia Siragusa
- Molecular Horizon s.r.l., Via Montelino, 30, 06084 Bettona (PG), Italy; Molecular Discovery Ltd., Kinetic Business Centre, Theobald Street, Elstree, Borehamwood, Hertfordshire WD6 4PJ, United Kingdom
- Massimo Baroni
- Molecular Discovery Ltd., Kinetic Business Centre, Theobald Street, Elstree, Borehamwood, Hertfordshire WD6 4PJ, United Kingdom
- Gabriele Cruciani
- Department of Chemistry, Biology and Biotechnology, Università degli Studi di Perugia, via Elce di Sotto, 8, 06123 Perugia (PG), Italy
- Orazio Nicolotti
- Department of Pharmacy-Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", 70125 Bari, Italy
11
Johansson-Åkhe I, Wallner B. Improving peptide-protein docking with AlphaFold-Multimer using forced sampling. Frontiers in Bioinformatics 2022; 2:959160. [PMID: 36304330 PMCID: PMC9580857 DOI: 10.3389/fbinf.2022.959160] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Received: 06/01/2022] [Accepted: 08/16/2022] [Indexed: 12/02/2022] Open
Abstract
Protein interactions are key in vital biological processes. In many cases, particularly in regulation, this interaction is between a protein and a shorter peptide fragment. Such peptides are often part of larger disordered regions in other proteins. The flexible nature of peptides enables the rapid yet specific regulation of important functions in cells, such as their life cycle. Consequently, knowledge of the molecular details of peptide-protein interactions is crucial for understanding and altering their function, and many specialized computational methods have been developed to study them. The recent release of AlphaFold and AlphaFold-Multimer has led to a leap in accuracy for the computational modeling of proteins. In this study, the ability of AlphaFold to predict which peptides and proteins interact, as well as its accuracy in modeling the resulting interaction complexes, are benchmarked against established methods. We find that AlphaFold-Multimer predicts the structure of peptide-protein complexes with acceptable or better quality (DockQ ≥0.23) for 66 of the 112 complexes investigated-25 of which were high quality (DockQ ≥0.8). This is a massive improvement on previous methods with 23 or 47 acceptable models and only four or eight high quality models, when using energy-based docking or interaction templates, respectively. In addition, AlphaFold-Multimer can be used to predict whether a peptide and a protein will interact. At 1% false positives, AlphaFold-Multimer found 26% of the possible interactions with a precision of 85%, the best among the methods benchmarked. However, the most interesting result is the possibility of improving AlphaFold by randomly perturbing the neural network weights to force the network to sample more of the conformational space. This increases the number of acceptable models from 66 to 75 and improves the median DockQ from 0.47 to 0.55 (17%) for first ranked models. 
The best possible DockQ improves from 0.58 to 0.72 (24%), indicating that selecting the best possible model is still a challenge. This scheme of generating more structures with AlphaFold should be generally useful for many applications involving multiple states, flexible regions, and disorder.
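The DockQ figures quoted in this abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not code from the cited paper: the "acceptable" (0.23) and "high" (0.8) cut-offs are taken from the abstract above, the "medium" cut-off (0.49) from the InterPepScore abstract in the next entry, and the function names and the label "incorrect" for scores below 0.23 are assumptions made for illustration.

```python
# DockQ quality bands as quoted in the abstracts of this list:
# acceptable >= 0.23, medium >= 0.49, high >= 0.80.
# The "incorrect" label for lower scores is an illustrative assumption.
def dockq_class(score: float) -> str:
    if score >= 0.80:
        return "high"
    if score >= 0.49:
        return "medium"
    if score >= 0.23:
        return "acceptable"
    return "incorrect"


def relative_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return 100.0 * (after - before) / before


# Median DockQ of first-ranked models, without and with forced sampling.
median_gain = relative_gain(0.47, 0.55)  # about 17%
# Best possible DockQ, without and with forced sampling.
best_gain = relative_gain(0.58, 0.72)    # about 24%
```

Under these bands, the reported median moves from the "acceptable" range (0.47) into the "medium" range (0.55), which is one way to read the 17% figure.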
Affiliation(s)
- Björn Wallner: Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
12.
Johansson-Åkhe I, Wallner B. InterPepScore: a deep learning score for improving the FlexPepDock refinement protocol. Bioinformatics 2022; 38:3209-3215. [PMID: 35575349 PMCID: PMC9191208 DOI: 10.1093/bioinformatics/btac325] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/25/2021] [Revised: 04/29/2022] [Accepted: 05/10/2022] [Indexed: 11/15/2022] Open
Abstract
Motivation
Interactions between peptide fragments and protein receptors are vital to cell function yet difficult to determine experimentally in structural detail. As such, many computational methods have been developed to aid in peptide-protein docking or structure prediction. One such method is Rosetta FlexPepDock, which consistently refines coarse peptide-protein models to sub-Ångström precision using Monte-Carlo simulations and statistical potentials. Deep learning has recently seen increased use in protein structure prediction, with graph neural networks used for protein model quality assessment.
Results
Here, we introduce a graph neural network, InterPepScore, as an additional scoring term to complement and improve the Rosetta FlexPepDock refinement protocol. InterPepScore is trained on simulation trajectories from FlexPepDock refinement starting from thousands of peptide-protein complexes generated by a wide variety of docking schemes. The addition of InterPepScore into the refinement protocol consistently improves the quality of models created, and on an independent benchmark on 109 peptide-protein complexes its inclusion results in an increase in the number of complexes for which the top-scoring model had a DockQ-score of 0.49 (Medium quality) or better from 14.8% to 26.1%.
Availability and implementation
InterPepScore is available online at http://wallnerlab.org/InterPepScore.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Isak Johansson-Åkhe: Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, SE-581 83 Linköping, Sweden
- Björn Wallner: Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, SE-581 83 Linköping, Sweden
13.
Yang H, Mei J, Xu W, Ma X, Sun B, Ai H. Identification of the probable structure of the sAPPα-GABABR1a complex and theoretical solutions for such cases. Phys Chem Chem Phys 2022; 24:12267-12280. [PMID: 35543350 DOI: 10.1039/d2cp00569g] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/17/2022]
Abstract
Amyloid precursor protein (APP) is the core of the pathogenesis of Alzheimer's disease (AD). Existing studies have shown that the soluble secreted APP (sAPPα) fragment obtained from the hydrolysis of APP by α-secretase has a synaptic function. In particular, a nine-residue fragment (APP9mer) of the extension domain region of sAPPα can bind directly and selectively to the N-terminal sushi1 domain (SD1) of the γ-aminobutyric acid type B receptor subunit 1a (GABABR1a) protein, which can influence synaptic transmission and plasticity by changing the GABABR1a conformation. APP9mer is a highly flexible, disordered region, and as such it is difficult to experimentally determine the optimal APP9mer-SD1 binding complex. In this study we constructed two types of APP9mer-SD1 complexes through molecular docking and molecular dynamics simulation, aiming to explore the recognition function and mechanism of the specific binding of APP9mer with SD1, from which the most probable APP9mer-SD1 model conformation is predicted. All the data from the analyses of RMSD, RMSF, PCA, DCCM and MM/PBSA binding energy as well as comparison with the experimental dissociation constant Kd suggest that 2NC is the most likely conformation to restore the crystal structure of the experimental APP9mer-SD1 complex. Of note, the key recognition residues of APP9mer are D24, D25, D27, W29 and W30, which mainly act on the 9-45 residue domain of SD1 (consisting of two loops and three short β-chains at the N-terminus of SD1). The mini-model with key residues identified establishes the molecular basis with deep insight into the interaction between APP and GABABR1a and provides a target for the development of therapeutic strategies for modulating GABABR1a-specific signaling in neurological and psychiatric disorders. More importantly, the study offers a theoretical solution for how to determine a biomolecular structure with a highly flexible, disordered fragment embedded within. 
Such a flexible fragment usually has to be omitted when a protein structure is determined by experimental methods (e.g., X-ray crystallography).
Affiliation(s)
- Huijuan Yang: School of Chemistry and Chemical Engineering, University of Jinan, Jinan 250022, P. R. China.
- Jinfei Mei: School of Chemistry and Chemical Engineering, University of Jinan, Jinan 250022, P. R. China.
- Wen Xu: School of Chemistry and Chemical Engineering, University of Jinan, Jinan 250022, P. R. China.
- Xiaohong Ma: School of Chemistry and Chemical Engineering, University of Jinan, Jinan 250022, P. R. China.
- Bo Sun: School of Chemistry and Chemical Engineering, University of Jinan, Jinan 250022, P. R. China.
- Hongqi Ai: School of Chemistry and Chemical Engineering, University of Jinan, Jinan 250022, P. R. China.
14.
Matching protein surface structural patches for high-resolution blind peptide docking. Proc Natl Acad Sci U S A 2022; 119:e2121153119. [PMID: 35482919 PMCID: PMC9170164 DOI: 10.1073/pnas.2121153119] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/12/2022] Open
Abstract
Modeling interactions between short peptides and their receptors is a challenging docking problem due to the peptide flexibility, resulting in a formidable sampling problem of peptide conformation in addition to its orientation. Alternatively, the peptide can be viewed as a piece that complements the receptor monomer structure. Here, we show that the peptide conformation can be determined based on the receptor backbone only and sampled using local structural motifs found in solved protein monomers and interfaces, independent of sequence similarity. This approach outperforms current peptide docking protocols and promotes new directions for peptide interface design. Peptide docking can be perceived as a subproblem of protein–protein docking. However, due to the short length and flexible nature of peptides, many do not adopt one defined conformation prior to binding. Therefore, to tackle a peptide docking problem, not only the relative orientation, but also the bound conformation of the peptide needs to be modeled. Traditional peptide-centered approaches use information about peptide sequences to generate representative conformer ensembles, which can then be rigid-body docked to the receptor. Alternatively, one may look at this problem from the viewpoint of the receptor, namely, that the protein surface defines the peptide-bound conformation. Here, we present PatchMAN (Patch-Motif AligNments), a global peptide-docking approach that uses structural motifs to map the receptor surface with backbone scaffolds extracted from protein structures. On a nonredundant set of protein–peptide complexes, starting from free receptor structures, PatchMAN successfully models and identifies near-native peptide–protein complexes in 58%/84% within 2.5 Å/5 Å interface backbone RMSD, with corresponding sampling in 81%/100% of the cases, outperforming other approaches. 
PatchMAN leverages the observation that structural units of peptides with their binding pocket can be found not only within interfaces, but also within monomers. We show that the bound peptide conformation is sampled based on the structural context of the receptor only, without taking into account any sequence information. Beyond peptide docking, this approach opens exciting new avenues to study principles of peptide–protein association, and to the design of new peptide binders. PatchMAN is available as a server at https://furmanlab.cs.huji.ac.il/patchman/.
15.
Delaunay M, Ha-Duong T. Computational Tools and Strategies to Develop Peptide-Based Inhibitors of Protein-Protein Interactions. Methods in Molecular Biology (Clifton, N.J.) 2022; 2405:205-230. [PMID: 35298816 DOI: 10.1007/978-1-0716-1855-4_11] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Abstract
Protein-protein interactions play crucial and subtle roles in many biological processes and modifications of their fine mechanisms generally result in severe diseases. Peptide derivatives are very promising therapeutic agents for modulating protein-protein associations with sizes and specificities between those of small compounds and antibodies. For the same reasons, rational design of peptide-based inhibitors naturally borrows and combines computational methods from both protein-ligand and protein-protein research fields. In this chapter, we aim to provide an overview of computational tools and approaches used for identifying and optimizing peptides that target protein-protein interfaces with high affinity and specificity. We hope that this review will help to implement appropriate in silico strategies for peptide-based drug design that builds on available information for the systems of interest.
Affiliation(s)
- Tâp Ha-Duong: Université Paris-Saclay, CNRS, BioCIS, Châtenay-Malabry, France.
16.
Trisciuzzi D, Siragusa L, Baroni M, Autiero I, Nicolotti O, Cruciani G. Getting Insights into Structural and Energetic Properties of Reciprocal Peptide-Protein Interactions. J Chem Inf Model 2022; 62:1113-1125. [PMID: 35148095 DOI: 10.1021/acs.jcim.1c01343] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/14/2022]
Abstract
Peptide-protein interactions play a key role for many cellular and metabolic processes involved in the onset of largely spread diseases such as cancer and neurodegenerative pathologies. Despite the progress in the structural characterization of peptide-protein interfaces, the in-depth knowledge of the molecular details behind their interactions is still a daunting task. Here, we present the first comprehensive in silico morphological and energetic study of peptide binding sites by focusing on both peptide and protein standpoints. Starting from the PixelDB database, a nonredundant benchmark collection of high-quality 3D crystallographic structures of peptide-protein complexes, a classification analysis of the most representative categories based on the nature of each cocrystallized peptide has been carried out. Several interpretable geometrical and energetic descriptors have been computed both from peptide and target protein sides in the attempt to unveil physicochemical and structural causative correlations. Finally, we investigated the most frequent peptide-protein residue pairs at the binding interface and made extensive energetic analyses, based on GRID MIFs, with the aim to study the peptide affinity-enhancing interactions to be further exploited in rational drug design strategies.
Affiliation(s)
- Daniela Trisciuzzi: Department of Pharmacy, Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", 70125 Bari, Italy; Molecular Horizon s.r.l., Via Montelino, 30, 06084 Bettona (PG), Italy
- Lydia Siragusa: Molecular Horizon s.r.l., Via Montelino, 30, 06084 Bettona (PG), Italy; Molecular Discovery Ltd., Kinetic Business Centre, Theobald Street, Elstree, Borehamwood, Hertfordshire WD6 4PJ, United Kingdom
- Massimo Baroni: Molecular Discovery Ltd., Kinetic Business Centre, Theobald Street, Elstree, Borehamwood, Hertfordshire WD6 4PJ, United Kingdom
- Ida Autiero: Molecular Horizon s.r.l., Via Montelino, 30, 06084 Bettona (PG), Italy; National Research Council, Institute of Biostructures and Bioimaging, 80138 Naples, Italy
- Orazio Nicolotti: Department of Pharmacy, Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", 70125 Bari, Italy
- Gabriele Cruciani: Department of Chemistry, Biology and Biotechnology, Università degli Studi di Perugia, via Elce di Sotto, 8, 06123 Perugia (PG), Italy
17.
Tsaban T, Varga JK, Avraham O, Ben-Aharon Z, Khramushin A, Schueler-Furman O. Harnessing protein folding neural networks for peptide-protein docking. Nat Commun 2022; 13:176. [PMID: 35013344 PMCID: PMC8748686 DOI: 10.1038/s41467-021-27838-9] [Citation(s) in RCA: 293] [Impact Index Per Article: 97.7] [Received: 08/04/2021] [Accepted: 12/10/2021] [Indexed: 12/31/2022] Open
Abstract
Highly accurate protein structure predictions by deep neural networks such as AlphaFold2 and RoseTTAFold have tremendous impact on structural biology and beyond. Here, we show that, although these deep learning approaches have originally been developed for the in silico folding of protein monomers, AlphaFold2 also enables quick and accurate modeling of peptide-protein interactions. Our simple implementation of AlphaFold2 generates peptide-protein complex models without requiring multiple sequence alignment information for the peptide partner, and can handle binding-induced conformational changes of the receptor. We explore what AlphaFold2 has memorized and learned, and describe specific examples that highlight differences compared to state-of-the-art peptide docking protocol PIPER-FlexPepDock. These results show that AlphaFold2 holds great promise for providing structural insight into a wide range of peptide-protein complexes, serving as a starting point for the detailed characterization and manipulation of these interactions.
Affiliation(s)
- Tomer Tsaban: Department of Microbiology and Molecular Genetics, Institute for Biomedical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Julia K Varga: Department of Microbiology and Molecular Genetics, Institute for Biomedical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Orly Avraham: Department of Microbiology and Molecular Genetics, Institute for Biomedical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Ziv Ben-Aharon: Department of Microbiology and Molecular Genetics, Institute for Biomedical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Alisa Khramushin: Department of Microbiology and Molecular Genetics, Institute for Biomedical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Ora Schueler-Furman: Department of Microbiology and Molecular Genetics, Institute for Biomedical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel.
18.
Xu X, Zou X. Predicting Protein-Peptide Complex Structures by Accounting for Peptide Flexibility and the Physicochemical Environment. J Chem Inf Model 2022; 62:27-39. [PMID: 34931833 PMCID: PMC9020583 DOI: 10.1021/acs.jcim.1c00836] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 01/12/2023]
Abstract
Predicting protein-peptide complex structures is crucial to the understanding of a vast variety of peptide-mediated cellular processes and to peptide-based drug development. Peptide flexibility and binding mode ranking are the two major challenges for protein-peptide complex structure prediction. Peptides are highly flexible molecules, and therefore, brute-force modeling of peptide conformations of interest in protein-peptide docking is beyond current computing power. Inspired by the fact that the protein-peptide binding process is like protein folding, we developed a novel strategy, named MDockPeP2, which tries to address these challenges using physicochemical information embedded in abundant monomeric proteins with an exhaustive search strategy, in combination with an integrated global search and a local flexible minimization method. Only the peptide sequence and the protein crystal structure are required. The method was systemically assessed using a newly constructed structural database of 89 nonredundant protein-peptide complexes with the peptide sequence length ranging from 5 to 29 in which about half of the peptides are longer than 15 residues. MDockPeP2 yielded a total success rate of 58.4% (70.8, 79.8%) for the bound docking (i.e., with the bound receptor and fully flexible peptides) and 19.0% (44.8, 70.7%) for the challenging unbound docking when top 10 (100, 1000) models were considered for each prediction. MDockPeP2 achieved significantly higher success rates on two other datasets, peptiDB and LEADS-PEP, which contain only short- and medium-size peptides (≤ 15 residues). For peptiDB, our method obtained a success rate of 62.0% for the bound docking and 35.9% for the unbound docking when the top 10 models were considered. For LEADS-PEP, MDockPeP2 achieved a success rate of 69.8% when the top 10 models were considered. The program is available at https://zougrouptoolkit.missouri.edu/mdockpep2/download.html.
19.
Roy RS, Quadir F, Soltanikazemi E, Cheng J. OUP accepted manuscript. Bioinformatics 2022; 38:1904-1910. [PMID: 35134816 PMCID: PMC8963319 DOI: 10.1093/bioinformatics/btac063] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 09/18/2021] [Revised: 01/17/2022] [Accepted: 01/31/2022] [Indexed: 11/23/2022] Open
Abstract
Motivation
Deep learning has revolutionized protein tertiary structure prediction recently. The cutting-edge deep learning methods such as AlphaFold can predict high-accuracy tertiary structures for most individual protein chains. However, the accuracy of predicting quaternary structures of protein complexes consisting of multiple chains is still relatively low due to the lack of advanced deep learning methods in the field. Because interchain residue–residue contacts can be used as distance restraints to guide quaternary structure modeling, here we develop a deep dilated convolutional residual network method (DRCon) to predict interchain residue–residue contacts in homodimers from residue–residue co-evolutionary signals derived from multiple sequence alignments of monomers, intrachain residue–residue contacts of monomers extracted from true/predicted tertiary structures or predicted by deep learning, and other sequence and structural features.
Results
Tested on three homodimer test datasets (Homo_std dataset, DeepHomo dataset and CASP-CAPRI dataset), the precision of DRCon for top L/5 interchain contact predictions (L: length of monomer in a homodimer) is 43.46%, 47.10% and 33.50% respectively at 6 Å contact threshold, which is substantially better than DeepHomo and DNCON2_inter and similar to Glinter. Moreover, our experiments demonstrate that using predicted tertiary structure or intrachain contacts of monomers in the unbound state as input, DRCon still performs well, even though its accuracy is lower than when true tertiary structures in the bound state are used as input. Finally, our case study shows that good interchain contact predictions can be used to build high-accuracy quaternary structure models of homodimers.
Availability and implementation
The source code of DRCon is available at https://github.com/jianlin-cheng/DRCon. The datasets are available at https://zenodo.org/record/5998532#.YgF70vXMKsB.
Supplementary information
Supplementary data are available at Bioinformatics online.
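The "top L/5 precision" metric reported in this abstract can be illustrated with a short sketch. This is not the DRCon evaluation code: the function name, the dictionary-of-scores input format, and the toy data are assumptions for illustration, and the set of true contacts is assumed to have been derived beforehand from the native structure at the 6 Å threshold.

```python
def top_k_precision(scores, true_contacts, length, divisor=5):
    """Precision of the top L/divisor predicted interchain contacts.

    scores: dict mapping (i, j) residue pairs to a predicted contact score.
    true_contacts: set of (i, j) pairs that are in contact in the native
        structure (assumed precomputed at the chosen distance threshold).
    length: monomer length L.
    """
    k = max(1, length // divisor)
    # Rank residue pairs by predicted score, highest first, and keep the top k.
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


# Toy example (hypothetical data): L = 10, so the top 10 // 5 = 2 pairs count.
toy_scores = {(0, 5): 0.9, (1, 6): 0.8, (2, 7): 0.1}
toy_truth = {(0, 5)}
toy_precision = top_k_precision(toy_scores, toy_truth, length=10)  # 0.5
```

Restricting the evaluation to the top L/5 pairs focuses the metric on the most confident predictions, which are the ones most likely to be used as distance restraints.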
Affiliation(s)
- Raj S Roy: Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Farhan Quadir: Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Elham Soltanikazemi: Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
20.
Johansson-Åkhe I, Mirabello C, Wallner B. InterPepRank: Assessment of Docked Peptide Conformations by a Deep Graph Network. Frontiers in Bioinformatics 2021; 1:763102. [PMID: 36303778 PMCID: PMC9581042 DOI: 10.3389/fbinf.2021.763102] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/23/2021] [Accepted: 10/05/2021] [Indexed: 11/13/2022] Open
Abstract
Peptide-protein interactions between a smaller or disordered peptide stretch and a folded receptor make up a large part of all protein-protein interactions. A common approach for modeling such interactions is to exhaustively sample the conformational space by fast-Fourier-transform docking, and then refine a top percentage of decoys. Commonly, methods capable of ranking the decoys for selection fast enough for larger scale studies rely on first-principle energy terms such as electrostatics, Van der Waals forces, or on pre-calculated statistical potentials. We present InterPepRank for peptide-protein complex scoring and ranking. InterPepRank is a machine learning-based method which encodes the structure of the complex as a graph; with physical pairwise interactions as edges and evolutionary and sequence features as nodes. The graph network is trained to predict the LRMSD of decoys by using edge-conditioned graph convolutions on a large set of peptide-protein complex decoys. InterPepRank is tested on a massive independent test set with no targets sharing CATH annotation nor 30% sequence identity with any target in training or validation data. On this set, InterPepRank has a median AUC of 0.86 for finding coarse peptide-protein complexes with LRMSD < 4Å. This is an improvement compared to other state-of-the-art ranking methods that have a median AUC between 0.65 and 0.79. When included as a selection-method for selecting decoys for refinement in a previously established peptide docking pipeline, InterPepRank improves the number of medium and high quality models produced by 80% and 40%, respectively. The InterPepRank program as well as all scripts for reproducing and retraining it are available from: http://wallnerlab.org/InterPepRank.
21.
Masoudi-Sobhanzadeh Y, Jafari B, Parvizpour S, Pourseif MM, Omidi Y. A novel multi-objective metaheuristic algorithm for protein-peptide docking and benchmarking on the LEADS-PEP dataset. Comput Biol Med 2021; 138:104896. [PMID: 34601392 DOI: 10.1016/j.compbiomed.2021.104896] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/26/2021] [Revised: 09/22/2021] [Accepted: 09/22/2021] [Indexed: 01/03/2023]
Abstract
Protein-peptide interactions have attracted the attention of many drug discovery scientists due to their possible druggability features on most key biological activities such as regulating disease-related signaling pathways and enhancing the immune system's responses. Different studies have utilized some protein-peptide-specific docking algorithms/methods to predict protein-peptide interactions. However, the existing algorithms/methods suffer from two serious limitations which make them unsuitable for protein-peptide docking problems. First, it seems that the prevalent approaches require to be modified and remodeled for weighting the unbounded forces between a protein and a peptide. Second, they do not employ state-of-the-art search algorithms for detecting the 3D pose of a peptide relative to a protein. To address these restrictions, the present study aims to introduce a novel multi-objective algorithm, which first generates some potential 3D poses of a peptide, and then, improves them through its operators. The candidate solutions are further evaluated using Multi-Objective Pareto Front (MOPF) optimization concepts. To this end, van der Waals, electrostatic, solvation, and hydrogen bond energies between the atoms of a protein and designated peptide are computed. To evaluate the algorithm, it is first applied to the LEADS-PEP dataset containing 53 protein-peptide complexes with up to 53 rotatable branches/bonds and then compared with three popular/efficient algorithms. The obtained results indicate that the MOPF-based approaches which reduce the backbone RMSD between the original and predicted states, achieve significantly better results in terms of the success rate in predicting the near-native conditions. 
Besides, a comparison between the different types of search algorithms reveals that efficient ones like the multi-objective Trader/differential evolution algorithm can predict protein-peptide interactions better than the popular algorithms such as the multi-objective genetic/particle swarm optimization algorithms.
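The Pareto-front idea mentioned in this abstract can be sketched generically. The code below is not the authors' algorithm: it only shows non-dominated filtering over candidate poses, assuming each pose is scored by a tuple of four energy terms (van der Waals, electrostatic, solvation, hydrogen bond) where lower is better. The function names and the toy energies are hypothetical.

```python
def dominates(a, b):
    """True if pose a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(poses):
    """Return the poses that no other pose dominates."""
    return [p for p in poses if not any(dominates(q, p) for q in poses)]


# Toy energies (hypothetical): (vdW, electrostatic, solvation, H-bond).
toy_poses = [
    (-5.0, -2.0, 1.0, -1.0),
    (-4.0, -3.0, 1.0, -1.0),
    (-3.0, -1.0, 2.0, 0.0),  # worse than the first pose in every term
]
front = pareto_front(toy_poses)
```

Here the first two poses trade van der Waals energy against electrostatic energy, so neither dominates the other and both stay on the front, while the third is discarded.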
Affiliation(s)
- Yosef Masoudi-Sobhanzadeh: Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Behzad Jafari: Department of Medicinal Chemistry, Faculty of Pharmacy, Urmia University of Medical Sciences, Urmia, Iran
- Sepideh Parvizpour: Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Mohammad M Pourseif: Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Yadollah Omidi: Department of Pharmaceutical Sciences, College of Pharmacy, Nova Southeastern University, Florida, 33328, USA.
22.
Wang K, Lyu N, Diao H, Jin S, Zeng T, Zhou Y, Wu R. GM-DockZn: a geometry matching-based docking algorithm for zinc proteins. Bioinformatics 2020; 36:4004-4011. [DOI: 10.1093/bioinformatics/btaa292] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/13/2020] [Revised: 04/06/2020] [Accepted: 04/27/2020] [Indexed: 12/23/2022] Open
Abstract
Motivation
Molecular docking is a widely used technique for large-scale virtual screening of the interactions between small-molecule ligands and their target proteins. However, docking methods often perform poorly for metalloproteins due to additional complexity from the three-way interactions among amino-acid residues, metal ions and ligands. This is a significant problem because zinc proteins alone comprise about 10% of all available protein structures in the Protein Data Bank. Here, we developed GM-DockZn, which is dedicated to ligand docking to zinc proteins. Unlike the existing docking methods developed specifically for zinc proteins, GM-DockZn samples ligand conformations directly using a geometric grid around the ideal zinc-coordination positions of seven discovered coordination motifs, which were found from a survey of known zinc proteins complexed with a single ligand.
Results
GM-DockZn has the best performance in sampling near-native poses with correct coordination atoms and numbers within the top 50 and top 10 predictions when compared to several state-of-the-art techniques. This is true not only for a non-redundant dataset of zinc proteins but also for a homolog set of different ligand and zinc-coordination systems for the same zinc proteins. Similar superior performance of GM-DockZn for near-native-pose sampling was also observed for docking to apo-structures and cross-docking between different ligand complex structures of the same protein. The highest success rate for sampling nearest near-native poses within top 5 and top 1 was achieved by combining GM-DockZn for conformational sampling with GOLD for ranking. The proposed geometry-based sampling technique will be useful for ligand docking to other metalloproteins.
Availability and implementation
GM-DockZn is freely available at www.qmclab.com/ for academic users.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Kai Wang: Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006; School of Agriculture and Biology, Zhongkai University of Agriculture and Engineering, Guangzhou 510000
- Nan Lyu: Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006
- Hongjuan Diao: Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006
- Shujuan Jin: Peking University Shenzhen Graduate School, Shenzhen 518055; Shenzhen Bay Laboratory, Shenzhen 518055, China
- Tao Zeng: Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006
- Yaoqi Zhou: Peking University Shenzhen Graduate School, Shenzhen 518055; Shenzhen Bay Laboratory, Shenzhen 518055, China; Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, QLD 4222, Australia
- Ruibo Wu: Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006; Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, QLD 4222, Australia
23.
Andreani J, Quignot C, Guerois R. Structural prediction of protein interactions and docking using conservation and coevolution. Wiley Interdisciplinary Reviews: Computational Molecular Science 2020. [DOI: 10.1002/wcms.1470] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/12/2022]
Affiliation(s)
- Jessica Andreani: Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
- Chloé Quignot: Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
- Raphael Guerois: Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France