1
Li M, Shi Y, Hu S, Hu S, Guo P, Wan W, Zhang LY, Pan S, Li J, Sun L, Lan X. MVSF-AB: accurate antibody-antigen binding affinity prediction via multi-view sequence feature learning. Bioinformatics 2025; 41:btae579. [PMID: 39363630 PMCID: PMC12089643 DOI: 10.1093/bioinformatics/btae579] [Received: 03/16/2024] [Revised: 08/22/2024] [Accepted: 10/02/2024] [Indexed: 10/05/2024] Open
Abstract
MOTIVATION Accurately predicting the binding affinity between antigens and antibodies is crucial for assessing therapeutic antibody effectiveness and for improving antibody engineering and vaccine design. Traditional machine learning methods have been widely used for this purpose, relying on structural information about interfacial amino acids. Nevertheless, because of technological limitations and the high cost of acquiring structural data, the structures of most antigens and antibodies are unknown, and sequence-based methods have gained attention. Existing sequence-based approaches designed for protein-protein affinity prediction show a significant drop in performance when applied directly to antibody-antigen affinity prediction. The causes are imbalanced training data and model frameworks that are not designed specifically for antibody-antigen pairs, which hinders the learning of key antibody and antigen features. Therefore, we propose MVSF-AB, a Multi-View Sequence Feature learning method for accurate Antibody-antigen Binding affinity prediction. RESULTS MVSF-AB uses a multi-view method that fuses semantic features and residue features to fully utilize antibody-antigen sequence information and predict the binding affinity. Experimental results demonstrate that MVSF-AB outperforms existing approaches in predicting unobserved natural antibody-antigen affinity and maintains its effectiveness when faced with mutant strains of antibodies. AVAILABILITY AND IMPLEMENTATION The datasets and source code are available in our public GitHub repository: https://github.com/TAI-Medical-Lab/MVSF-AB.
Affiliation(s)
- Minghui Li
  - School of Software Engineering, Huazhong University of Science and Technology, Wuhan 430000, China
- Yao Shi
  - School of Software Engineering, Huazhong University of Science and Technology, Wuhan 430000, China
- Shengqing Hu
  - Department of Nuclear Medicine, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
- Shengshan Hu
  - School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan 430000, China
- Peijin Guo
  - School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan 430000, China
- Wei Wan
  - School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan 430000, China
- Leo Yu Zhang
  - School of Information and Communication Technology, Griffith University, Queensland 4222, Australia
- Shirui Pan
  - School of Information and Communication Technology, Griffith University, Queensland 4222, Australia
- Jizhou Li
  - School of Data Science, City University of Hong Kong, Hong Kong 999077, China
- Lichao Sun
  - Department of Computer Science and Engineering, Lehigh University, Bethlehem, PA 18018, United States
- Xiaoli Lan
  - Department of Nuclear Medicine, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
2
Xie J, Zhang Y, Wang Z, Jin X, Lu X, Ge S, Min X. PPI-Graphomer: enhanced protein-protein affinity prediction using pretrained and graph transformer models. BMC Bioinformatics 2025; 26:116. [PMID: 40301762 PMCID: PMC12042501 DOI: 10.1186/s12859-025-06123-2] [Received: 11/14/2024] [Accepted: 03/28/2025] [Indexed: 05/01/2025] Open
Abstract
Protein-protein interactions (PPIs) refer to the phenomenon of protein binding through various types of bonds to execute biological functions. These interactions are critical for understanding biological mechanisms and drug research. Among these, the protein binding interface is a critical region involved in protein-protein interactions, particularly the hotspot residues on it that play a key role in protein interactions. Current deep learning methods trained on large-scale data can characterize proteins to a certain extent, but they often struggle to adequately capture information about protein binding interfaces. To address this limitation, we propose the PPI-Graphomer module, which integrates pretrained features from large-scale language models and inverse folding models. This approach enhances the characterization of protein binding interfaces by defining edge relationships and interface masks on the basis of molecular interaction information. Our model outperforms existing methods across multiple benchmark datasets and demonstrates strong generalization capabilities.
Affiliation(s)
- Jun Xie
  - Institute of Artificial Intelligence, School of Informatic, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
  - National Institute of Diagnostics and Vaccine Development in Infectious Diseases, School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
  - State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- Youli Zhang
  - National Institute of Diagnostics and Vaccine Development in Infectious Diseases, School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
  - State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- Ziyang Wang
  - Institute of Artificial Intelligence, School of Informatic, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
  - National Institute of Diagnostics and Vaccine Development in Infectious Diseases, School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
  - State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- Xiaocheng Jin
  - National Institute of Diagnostics and Vaccine Development in Infectious Diseases, School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
  - State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- Xiaoli Lu
  - Information and Networking Center, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- Shengxiang Ge
  - National Institute of Diagnostics and Vaccine Development in Infectious Diseases, School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
  - State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
- Xiaoping Min
  - Institute of Artificial Intelligence, School of Informatic, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
  - National Institute of Diagnostics and Vaccine Development in Infectious Diseases, School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
  - State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen, 361005, China
3
Dandibhotla S, Samudrala M, Kaneriya A, Dakshanamurthy S. GNNSeq: A Sequence-Based Graph Neural Network for Predicting Protein-Ligand Binding Affinity. Pharmaceuticals (Basel) 2025; 18:329. [PMID: 40143108 PMCID: PMC11945123 DOI: 10.3390/ph18030329] [Received: 01/31/2025] [Revised: 02/24/2025] [Accepted: 02/24/2025] [Indexed: 03/28/2025] Open
Abstract
Background/Objectives: Accurately predicting protein-ligand binding affinity is essential in drug discovery for identifying effective compounds. While existing sequence-based machine learning models for binding affinity prediction have shown potential, they lack accuracy and robustness in pattern recognition, which limits their generalizability across diverse and novel binding complexes. To overcome these limitations, we developed GNNSeq, a novel hybrid machine learning model that integrates a Graph Neural Network (GNN) with Random Forest (RF) and XGBoost. Methods: GNNSeq predicts ligand binding affinity by extracting molecular characteristics and sequence patterns from protein and ligand sequences. The fully optimized GNNSeq model was trained and tested on subsets of the PDBbind dataset. The novelty of GNNSeq lies in its exclusive reliance on sequence features, a hybrid GNN framework, and an optimized kernel-based context-switching design. By relying exclusively on sequence features, GNNSeq eliminates the need for pre-docked complexes or high-quality structural data, allowing for accurate binding affinity predictions even when interaction-based or structural information is unavailable. The integration of GNN, XGBoost, and RF improves GNNSeq's performance through hierarchical sequence learning, handling complex feature interactions, reducing variance, and forming a robust ensemble that improves predictions and mitigates overfitting. GNNSeq's unique kernel-based context-switching scheme optimizes model efficiency and runtime, dynamically adjusts feature weighting between sequence and basic structural information, and improves predictive accuracy and model generalization. Results: In benchmarking, GNNSeq performed comparably to several existing sequence-based models and achieved a Pearson correlation coefficient (PCC) of 0.784 on the PDBbind v.2020 refined set and 0.84 on the PDBbind v.2016 core set. During external validation with the DUDE-Z v.2023.06.20 dataset, GNNSeq attained an average area under the curve (AUC) of 0.74, demonstrating its ability to distinguish active ligands from decoys across diverse ligand-receptor pairs. To further evaluate its performance, we combined GNNSeq with two additional specialized models that integrate structural and protein-ligand interaction features. When tested on a curated set of well-characterized drug-target complexes, the hybrid models achieved an average PCC of 0.89, with the top-performing model reaching a PCC of 0.97. GNNSeq was designed with a strong emphasis on computational efficiency, training on 5000+ complexes in 1 h and 32 min, with real-time affinity predictions for test complexes. Conclusions: GNNSeq provides an efficient and scalable approach for binding affinity prediction, offering improved accuracy and generalizability while enabling large-scale virtual screening and cost-effective hit identification. GNNSeq is publicly available in a server-based graphical user interface (GUI) format.
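The AUC quoted in this abstract measures how often a model scores an active ligand above a decoy. As a minimal sketch of that idea, and not GNNSeq's own evaluation code, the rank-based (Mann-Whitney) form of the AUC can be computed as below. The function name `roc_auc` and the toy scores are hypothetical and chosen only for illustration.

```python
def roc_auc(active_scores, decoy_scores):
    """Probability that a random active outscores a random decoy (ties count 0.5)."""
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy score")
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0   # active ranked above decoy
            elif a == d:
                wins += 0.5   # tie: half credit
    return wins / (len(active_scores) * len(decoy_scores))


# Hypothetical predicted affinities (higher = stronger predicted binding).
actives = [7.9, 6.5, 8.2]
decoys = [5.1, 6.5, 4.8, 7.0]
auc = roc_auc(actives, decoys)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported average of 0.74 sits between the two.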
Affiliation(s)
- Somanath Dandibhotla
  - Department of Computer Science, College of Engineering and Computing, George Mason University, Fairfax, VA 22030, USA
- Madhav Samudrala
  - Department of Statistics, College of Arts and Sciences, The University of Virginia, Charlottesville, VA 22903, USA
- Arjun Kaneriya
  - Department of Computer Science, School of Computing, Data Sciences & Physics, College of William and Mary, Williamsburg, VA 23185, USA
- Sivanesan Dakshanamurthy
  - Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, Washington, DC 20007, USA
4
Zhou Z, Yin Y, Han H, Jia Y, Koh JH, Kong AWK, Mu Y. ProAffinity-GNN: A Novel Approach to Structure-Based Protein-Protein Binding Affinity Prediction via a Curated Data Set and Graph Neural Networks. J Chem Inf Model 2024; 64:8796-8808. [PMID: 39558674 DOI: 10.1021/acs.jcim.4c01850] [Indexed: 11/20/2024]
Abstract
Protein-protein interactions (PPIs) are crucial for understanding biological processes and disease mechanisms, contributing significantly to advances in protein engineering and drug discovery. The accurate determination of binding affinities, essential for decoding PPIs, faces challenges due to the substantial time and financial costs involved in experimental and theoretical methods. This situation underscores the urgent need for more effective and precise methodologies for predicting binding affinity. Despite the abundance of research on PPI modeling, the field of quantitative binding affinity prediction remains underexplored, mainly due to a lack of comprehensive data. This study seeks to address these needs by manually curating pairwise interaction labels on available 3D structures of protein complexes, with experimentally determined binding affinities, creating the largest data set for structure-based pairwise protein interaction with binding affinity to date. Subsequently, we introduce ProAffinity-GNN, a novel deep learning framework using protein language model and graph neural network (GNN) to improve the accuracy of prediction of structure-based protein-protein binding affinities. The evaluation results across several benchmark test sets and an additional case study demonstrate that ProAffinity-GNN not only outperforms existing models in terms of accuracy but also shows strong generalization capabilities.
Affiliation(s)
- Zhiyuan Zhou
  - School of Biological Sciences, Nanyang Technological University, 637551, Singapore
- Yueming Yin
  - Institute for Digital Molecular Analytics and Science (IDMxS), Nanyang Technological University, 636921, Singapore
- Hao Han
  - School of Biological Sciences, Nanyang Technological University, 637551, Singapore
- Yiping Jia
  - School of Pharmacy, Shanghai Jiao Tong University, 200240, Shanghai, China
- Jun Hong Koh
  - School of Biological Sciences, Nanyang Technological University, 637551, Singapore
- Adams Wai-Kin Kong
  - College of Computing and Data Science, Nanyang Technological University, 639798, Singapore
- Yuguang Mu
  - School of Biological Sciences, Nanyang Technological University, 637551, Singapore
5
Zheng F, Jiang X, Wen Y, Yang Y, Li M. Systematic investigation of machine learning on limited data: A study on predicting protein-protein binding strength. Comput Struct Biotechnol J 2024; 23:460-472. [PMID: 38235359 PMCID: PMC10792694 DOI: 10.1016/j.csbj.2023.12.018] [Received: 10/03/2023] [Revised: 12/14/2023] [Accepted: 12/16/2023] [Indexed: 01/19/2024] Open
Abstract
The application of machine learning techniques in biological research, especially when dealing with limited data availability, poses significant challenges. In this study, we leveraged advancements in method development for predicting protein-protein binding strength to conduct a systematic investigation into the application of machine learning on limited data. The binding strength, quantitatively measured as binding affinity, is vital for understanding the processes of recognition, association, and dysfunction that occur within protein complexes. By incorporating transfer learning, integrating domain knowledge, and employing both deep learning and traditional machine learning algorithms, we mitigated the impact of data limitations and made significant advancements in predicting protein-protein binding affinity. In particular, we developed over 20 models, ultimately selecting three representative best-performing ones that belong to distinct categories. The first model is structure-based, consisting of a random forest regression and thirteen handcrafted features. The second model is sequence-based, employing an architecture that combines transferred embedding features with a multilayer perceptron. Finally, we created an ensemble model by averaging the predictions of the two aforementioned models. The comparison with other predictors on three independent datasets confirms the significant improvements achieved by our models in predicting protein-protein binding affinity. The programs for running these three models are available at https://github.com/minghuilab/BindPPI.
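The ensemble described in this abstract averages the predictions of the structure-based and sequence-based models. A minimal sketch of that step is shown below, assuming an unweighted mean over per-complex predictions. The function name `ensemble_average` and the numbers are hypothetical, and this is not code from the BindPPI repository.

```python
def ensemble_average(structure_preds, sequence_preds):
    """Average two models' predictions complex by complex (equal weights assumed)."""
    if len(structure_preds) != len(sequence_preds):
        raise ValueError("both models must predict the same set of complexes")
    return [(s + q) / 2.0 for s, q in zip(structure_preds, sequence_preds)]


# Hypothetical predicted binding affinities for three complexes.
structure_preds = [-10.0, -8.5, -12.0]   # e.g. from a random forest on handcrafted features
sequence_preds = [-9.0, -9.5, -11.0]     # e.g. from an MLP on transferred embeddings
ensemble = ensemble_average(structure_preds, sequence_preds)
```

Averaging tends to help when the two models make partly independent errors, which is plausible here because one uses structural features and the other uses sequence embeddings.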
Affiliation(s)
- Feifan Zheng
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Xin Jiang
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Yuhao Wen
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Yan Yang
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Minghui Li
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
6
E U, T M, A V G, D P. A comprehensive survey of drug-target interaction analysis in allopathy and siddha medicine. Artif Intell Med 2024; 157:102986. [PMID: 39326289 DOI: 10.1016/j.artmed.2024.102986] [Received: 10/20/2023] [Revised: 08/13/2024] [Accepted: 09/18/2024] [Indexed: 09/28/2024]
Abstract
Effective drug delivery is the cornerstone of modern healthcare, ensuring therapeutic compounds reach their intended targets efficiently. This paper explores the potential of personalized and holistic healthcare, driven by the synergy between traditional and allopathic medicine systems, with a specific focus on the vast reservoir of medicinal compounds found in plants rooted in the historical legacy of traditional medicine. Motivated by the desire to unlock the therapeutic potential of medicinal plants and bridge the gap between traditional and allopathic medicine, this survey delves into in-silico computational approaches for studying Drug-Target Interactions (DTI) within the contexts of allopathy and siddha medicine. The contributions of this survey are multifaceted: it offers a comprehensive overview of in-silico methods for DTI analysis in both systems, identifies common challenges in DTI studies, provides insights into future directions to advance DTI analysis, and includes a comparative analysis of DTI in allopathy and siddha medicine. The findings of this survey highlight the pivotal role of in-silico computational approaches in advancing drug research and development in both allopathy and siddha medicine, emphasizing the importance of integrating these methods to drive the future of personalized healthcare.
Affiliation(s)
- Uma E
  - Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
- Mala T
  - Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
- Geetha A V
  - Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
- Priyanka D
  - Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
7
Galvão GDF, Trefilio LM, Salvio AL, da Silva EV, Alves-Leon SV, Fontes-Dantas FL, de Souza JM. Comprehensive analysis of Novel mutations in CCM1/KRIT1 and CCM2/MGC4607 and their clinical implications in Cerebral Cavernous malformations. J Stroke Cerebrovasc Dis 2024; 33:107947. [PMID: 39181174 DOI: 10.1016/j.jstrokecerebrovasdis.2024.107947] [Received: 05/06/2024] [Revised: 08/08/2024] [Accepted: 08/14/2024] [Indexed: 08/27/2024] Open
Abstract
BACKGROUND Cerebral Cavernous Malformations (CCM) is a genetic disease characterized by vascular abnormalities in the brain and spinal cord, affecting 0.4-0.5% of the population. We identified two novel pathogenic mutations, CCM1/KRIT1 c.811delT (p.Trp271GlyfsTer5) and CCM2/MGC4607 c.613_614insGG (p.Glu205GlyfsTer31), which disrupt crucial protein domains and potentially alter disease progression. OBJECTIVE The study aims to comprehensively analyze a Brazilian cohort of CCM patients, integrating genetic, clinical, and structural aspects. Specifically, we sought to identify novel mutations within the CCM complex and explore their potential impact on disease progression. METHODS We conducted a detailed examination of neuroradiological and clinical features in both symptomatic and asymptomatic CCM patients, performing genetic analyses through sequencing of the CCM1/KRIT1, CCM2/MGC4607, and CCM3/PDCD10 genes. In silico structural predictions were carried out using PolyPhen-2, SIFT, and Human Genomics Community tools. Protein-protein interactions and docking analyses were explored using the STRING database. RESULTS Genetic analysis identified 6 pathogenic mutations, 4 likely pathogenic mutations, 1 variant of uncertain significance, and 7 unclassified mutations, including the novel mutations CCM1 c.811delT and CCM2 c.613_614insGG. In silico structural analysis revealed significant alterations in protein structure, supporting their pathogenicity. Protein-protein interaction analysis indicated nuanced impacts on cellular processes. Clinically, we observed a broad spectrum of symptoms, including seizures and focal neurological deficits. However, no statistically significant differences were found in lesion burden, age of first symptom onset, or sex between the identified CCM1/KRIT1 and CCM2/MGC4607 mutations among all patients studied. CONCLUSION This study enhances the understanding of CCM by linking clinical variability, genetic mutations, and structural effects. The identification of these novel mutations opens new avenues for research and potential therapeutic strategies.
Affiliation(s)
- Gustavo da Fontoura Galvão
  - Universidade Federal do Estado do Rio de Janeiro, Laboratório de Neurociências Translacional, Programa de Pós-Graduação em Neurologia, Rio de Janeiro RJ, Brasil; Universidade Federal do Rio de Janeiro, Hospital Universitário Clementino Fraga Filho, Departamento de Neurocirurgia, Rio de Janeiro RJ, Brasil
- Luisa Menezes Trefilio
  - Universidade Estadual do Rio de Janeiro, Instituto de Biologia Roberto Alcântara Gomes, Departamento de Farmacologia e Psicobiologia, Rio de Janeiro RJ, Brasil; Universidade Federal do Estado do Rio de Janeiro, Instituto Biomédico, Rio de Janeiro RJ, Brasil
- Andreza Lemos Salvio
  - Universidade Federal do Estado do Rio de Janeiro, Laboratório de Neurociências Translacional, Programa de Pós-Graduação em Neurologia, Rio de Janeiro RJ, Brasil
- Elielson Veloso da Silva
  - Universidade Federal do Estado do Rio de Janeiro, Laboratório de Neurociências Translacional, Programa de Pós-Graduação em Neurologia, Rio de Janeiro RJ, Brasil
- Soniza Vieira Alves-Leon
  - Universidade Federal do Estado do Rio de Janeiro, Laboratório de Neurociências Translacional, Programa de Pós-Graduação em Neurologia, Rio de Janeiro RJ, Brasil; Universidade Federal do Rio de Janeiro, Hospital Universitário Clementino Fraga Filho, Departamento de Neurologia, Rio de Janeiro RJ, Brasil
- Fabrícia Lima Fontes-Dantas
  - Universidade Estadual do Rio de Janeiro, Instituto de Biologia Roberto Alcântara Gomes, Departamento de Farmacologia e Psicobiologia, Rio de Janeiro RJ, Brasil
- Jorge Marcondes de Souza
  - Universidade Federal do Rio de Janeiro, Hospital Universitário Clementino Fraga Filho, Departamento de Neurocirurgia, Rio de Janeiro RJ, Brasil
8
Kwon YJ, Lee J, Seo EB, Lee J, Park J, Kim SK, Yu H, Ye SK, Chang PS. Cysteine protease I29 propeptide from Calotropis procera R. Br. as a potent cathepsin L inhibitor and its suppressive activity in breast cancer metastasis. Sci Rep 2024; 14:23218. [PMID: 39368988 PMCID: PMC11457494 DOI: 10.1038/s41598-024-73578-3] [Received: 05/07/2024] [Accepted: 09/18/2024] [Indexed: 10/07/2024] Open
Abstract
Breast cancer metastasis is associated with a poor prognosis and a high rate of mortality. Cathepsin L (CTSL) is a lysosomal cysteine protease that promotes tumor metastasis by degrading the extracellular matrix. Gene set enrichment analysis revealed that CTSL expression was higher in tumorous than in non-tumorous tissues of breast cancer patients and that high-level CTSL expression correlated positively with the epithelial-mesenchymal transition. Therefore, we hypothesized that inhibiting CTSL activity in tumor cells would prevent metastasis. In this study, we characterized the inhibitory activity of SnuCalCpI15, the I29 domain of a CTSL-like cysteine protease from Calotropis procera R. Br., and revealed that the propeptide stereoselectively inhibited CTSL in a reversible slow-binding manner, with an inhibitory constant (Ki) value of 1.38 ± 0.71 nM, indicating its potency as an exogenous inhibitor in anti-cancer therapy. SnuCalCpI15 was localized intracellularly in MDA-MB-231 breast cancer cells and suppressed tumor cell migration and invasion. These results demonstrate the potential of SnuCalCpI15 as a novel agent to prevent breast cancer metastasis.
Affiliation(s)
- Yong-Jin Kwon
  - Department of Pharmacology and Biomedical Sciences, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Department of Cosmetic Science, Kyungsung University, Busan, 48434, Republic of Korea
- Juno Lee
  - Center for Agricultural Microorganism and Enzyme, Seoul National University, Seoul, 08826, Republic of Korea
  - Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Republic of Korea
- Eun-Bi Seo
  - Department of Pharmacology and Biomedical Sciences, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Biomedical Science Project (BK21PLUS), Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
- Juchan Lee
  - Department of Agricultural Biotechnology, Seoul National University College of Agricultural and Life Sciences, Seoul, 08826, Republic of Korea
- Jaehyeon Park
  - Department of Agricultural Biotechnology, Seoul National University College of Agricultural and Life Sciences, Seoul, 08826, Republic of Korea
- Seul-Ki Kim
  - Department of Pharmacology and Biomedical Sciences, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
- Hyunjong Yu
  - Major of Food Science and Biotechnology, Division of Bio-Convergence, Kyonggi University, Suwon, 16227, Republic of Korea
- Sang-Kyu Ye
  - Department of Pharmacology and Biomedical Sciences, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Biomedical Science Project (BK21PLUS), Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Ischemic/Hypoxic Disease Institute, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Neuro-Immune Information Storage Network Research Center, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Wide River Institute of Immunology, Seoul National University, Hongcheon, 25159, Republic of Korea
- Pahn-Shick Chang
  - Center for Agricultural Microorganism and Enzyme, Seoul National University, Seoul, 08826, Republic of Korea
  - Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Republic of Korea
  - Department of Agricultural Biotechnology, Seoul National University College of Agricultural and Life Sciences, Seoul, 08826, Republic of Korea
  - Center for Food and Bioconvergence, Seoul National University, Seoul, 08826, Republic of Korea
9
Bogdanova EA, Novoseletsky VN. ProBAN: Neural network algorithm for predicting binding affinity in protein-protein complexes. Proteins 2024; 92:1127-1136. [PMID: 38722047 DOI: 10.1002/prot.26700] [Received: 12/13/2023] [Revised: 03/22/2024] [Accepted: 04/26/2024] [Indexed: 08/07/2024]
Abstract
Determining binding affinities in protein-protein and protein-peptide complexes is a challenging task that directly impacts the development of peptide and protein pharmaceuticals. Although several models have been proposed to predict the value of the dissociation constant and the Gibbs free energy, they are currently not capable of making stable predictions with high accuracy, in particular for complexes consisting of more than two molecules. In this work, we present ProBAN, a new method for predicting binding affinity in protein-protein complexes based on a deep convolutional neural network. Prediction is carried out on the spatial structures of complexes, represented as a 4D tensor that includes information about the location of atoms and their ability to participate in the various types of interactions found in protein-protein and protein-peptide complexes. The effectiveness of the model was assessed both on an internal test data set containing complexes consisting of three or more molecules and on an external test set for the PPI-Affinity service. We achieved the best prediction quality on these data sets among all the analyzed models: on the internal test, Pearson correlation R = 0.6 and MAE = 1.60; on the external test, R = 0.55 and MAE = 1.75. The open-source code, the trained ProBAN model, and the collected dataset are freely available at https://github.com/EABogdanova/ProBAN.
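The quantities named in this abstract are linked by standard relations: the binding free energy follows from the dissociation constant as ΔG = RT ln(Kd), and prediction quality is summarized by the Pearson correlation and the mean absolute error (MAE). The sketch below illustrates these formulas only and is not taken from the ProBAN code. It assumes a 1 M standard state, a temperature of 298.15 K, and energies in kcal/mol (the abstract does not state the MAE units), and all function names are hypothetical.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy in kcal/mol from Kd in mol/L (1 M standard state assumed)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT * temperature_k * math.log(kd_molar)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(xs, ys):
    """Average absolute difference between predictions and reference values."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# A nanomolar binder (Kd = 1e-9 M) corresponds to roughly -12.3 kcal/mol.
dg_nanomolar = delta_g_from_kd(1e-9)
```

Under these assumptions, a tenfold change in Kd shifts ΔG by about 1.36 kcal/mol, which gives a rough scale for reading MAE values if they are expressed in kcal/mol.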
10
Biswas G, Mukherjee D, Basu S. Combining Complementarity and Binding Energetics in the Assessment of Protein Interactions: EnCPdock-A Practical Manual. J Comput Biol 2024; 31:769-781. [PMID: 38885081 DOI: 10.1089/cmb.2024.0554] [Indexed: 06/20/2024] Open
Abstract
The combined effect of shape and electrostatic complementarities (Sc, EC) at the interface of the interacting protein partners (PPI) serves as the physical basis for such associations and is a strong determinant of their binding energetics. EnCPdock (https://www.scinetmol.in/EnCPdock/) presents a comprehensive web platform for the direct conjoint comparative analyses of complementarity and binding energetics in PPIs. It elegantly interlinks the dual nature of local (Sc) and nonlocal complementarity (EC) in PPIs using the complementarity plot. It further derives an AI-based ΔGbinding with a prediction accuracy comparable to the state of the art. This book chapter presents a practical manual to conceptualize and implement EnCPdock with its various features and functionalities, collectively having the potential to serve as a valuable protein engineering tool in the design of novel protein interfaces.
Affiliation(s)
- Gargi Biswas
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Sankar Basu
  - Department of Microbiology, Asutosh College, University of Calcutta, Kolkata, India
11
|
Durojaye OA, Yekeen AA, Idris MO, Okoro NO, Odiba AS, Nwanguma BC. Investigation of the MDM2-binding potential of de novo designed peptides using enhanced sampling simulations. Int J Biol Macromol 2024; 269:131840. [PMID: 38679255 DOI: 10.1016/j.ijbiomac.2024.131840] [Received: 02/14/2024] [Revised: 04/13/2024] [Accepted: 04/22/2024] [Indexed: 05/01/2024]
Abstract
The tumor suppressor p53 plays a crucial role in cellular responses to various stresses, regulating key processes such as apoptosis, senescence, and DNA repair. Dysfunctional p53, prevalent in approximately 50 % of human cancers, contributes to tumor development and resistance to treatment. This study employed deep learning-based protein design and structure prediction methods to identify novel high-affinity peptide binders (Pep1 and Pep2) targeting MDM2, with the aim of disrupting its interaction with p53. Extensive all-atom molecular dynamics simulations highlighted the stability of the designed peptide in complex with the target, supported by several structural analyses, including RMSD, RMSF, Rg, SASA, PCA, and free energy landscapes. Using the steered molecular dynamics and umbrella sampling simulations, we elucidate the dissociation dynamics of p53, Pep1, and Pep2 from MDM2. Notable differences in interaction profiles were observed, emphasizing the distinct dissociation patterns of each peptide. In conclusion, the results of our umbrella sampling simulations suggest Pep1 as a higher-affinity MDM2 binder compared to p53 and Pep2, positioning it as a potential inhibitor of the MDM2-p53 interaction. Using state-of-the-art protein design tools and advanced MD simulations, this study provides a comprehensive framework for rational in silico design of peptide binders with therapeutic implications in disrupting MDM2-p53 interactions for anticancer interventions.
Affiliation(s)
- Olanrewaju Ayodeji Durojaye
  - MOE Key Laboratory of Membraneless Organelle and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, University of Science and Technology of China, Hefei, Anhui 230027, China; School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230027, China; Department of Chemical Sciences, Coal City University, Emene, Enugu State, Nigeria
- Abeeb Abiodun Yekeen
  - Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX 75390, United States; Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX 75390, United States
- Nkwachukwu Oziamara Okoro
  - Department of Pharmaceutical and Medicinal Chemistry, Faculty of Pharmaceutical Sciences, University of Nigeria, Nsukka 410001, Nigeria
- Arome Solomon Odiba
  - Department of Molecular Genetics and Biotechnology, University of Nigeria, Nsukka, Enugu State 410001, Nigeria; Department of Biochemistry, Faculty of Biological Sciences, University of Nigeria, Nsukka, Enugu State 410001, Nigeria
- Bennett Chima Nwanguma
  - Department of Molecular Genetics and Biotechnology, University of Nigeria, Nsukka, Enugu State 410001, Nigeria; Department of Biochemistry, Faculty of Biological Sciences, University of Nigeria, Nsukka, Enugu State 410001, Nigeria
12
|
Grassmann G, Miotto M, Desantis F, Di Rienzo L, Tartaglia GG, Pastore A, Ruocco G, Monti M, Milanetti E. Computational Approaches to Predict Protein-Protein Interactions in Crowded Cellular Environments. Chem Rev 2024; 124:3932-3977. [PMID: 38535831 PMCID: PMC11009965 DOI: 10.1021/acs.chemrev.3c00550] [Received: 07/31/2023] [Revised: 02/20/2024] [Accepted: 02/21/2024] [Indexed: 04/11/2024]
Abstract
Investigating protein-protein interactions is crucial for understanding cellular biological processes because proteins often function within molecular complexes rather than in isolation. While experimental and computational methods have provided valuable insights into these interactions, they often overlook a critical factor: the crowded cellular environment. This environment significantly impacts protein behavior, including structural stability, diffusion, and ultimately the nature of binding. In this review, we discuss theoretical and computational approaches that allow the modeling of biological systems to guide and complement experiments and can thus significantly advance the investigation, and possibly the predictions, of protein-protein interactions in the crowded environment of cell cytoplasm. We explore topics such as statistical mechanics for lattice simulations, hydrodynamic interactions, diffusion processes in high-viscosity environments, and several methods based on molecular dynamics simulations. By synergistically leveraging methods from biophysics and computational biology, we review the state of the art of computational methods to study the impact of molecular crowding on protein-protein interactions and discuss its potential revolutionizing effects on the characterization of the human interactome.
Affiliation(s)
- Greta Grassmann
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Sapienza University of Rome, Rome 00185, Italy
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Mattia Miotto
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Fausta Desantis
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
  - The Open University Affiliated Research Centre at Istituto Italiano di Tecnologia, Genoa 16163, Italy
- Lorenzo Di Rienzo
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Gian Gaetano Tartaglia
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
  - Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa 16163, Italy
  - Center for Human Technologies, Genoa 16152, Italy
- Annalisa Pastore
  - Experiment Division, European Synchrotron Radiation Facility, Grenoble 38043, France
- Giancarlo Ruocco
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
  - Department of Physics, Sapienza University, Rome 00185, Italy
- Michele Monti
  - RNA System Biology Lab, Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa 16163, Italy
- Edoardo Milanetti
  - Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
  - Department of Physics, Sapienza University, Rome 00185, Italy
13
|
Ridha F, Gromiha MM. MPA-Pred: A machine learning approach for predicting the binding affinity of membrane protein-protein complexes. Proteins 2024; 92:499-508. [PMID: 37949651 DOI: 10.1002/prot.26633] [Received: 05/05/2023] [Revised: 10/05/2023] [Accepted: 10/25/2023] [Indexed: 11/12/2023]
Abstract
Membrane protein-protein interactions are essential for several functions including cell signaling, ion transport, and enzymatic activity. These interactions are mainly dictated by their binding affinities. Although several methods are available for predicting the binding affinity of protein-protein complexes, there exists no specific method for membrane protein-protein complexes. In this work, we collected the experimental binding affinity data for a set of 114 membrane protein-protein complexes and derived several structure and sequence-based features. Our analysis on the relationship between binding affinity and the features revealed that the important factors mainly depend on the type of membrane protein and the functional class of the protein. Specifically, aromatic and charged residues at the interface, and aromatic-aromatic and electrostatic interactions are found to be important to understand the binding affinity. Further, we developed a method, MPA-Pred, for predicting the binding affinity of membrane protein-protein complexes using a machine learning approach. It showed an average correlation and mean absolute error of 0.83 and 0.91 kcal/mol, respectively, using the jack-knife test on a set of 114 complexes. We have also developed a web server and it is available at https://web.iitm.ac.in/bioinfo2/MPA-Pred/. This method can be used for predicting the affinity of membrane protein-protein complexes at a large scale and aid to improve drug design strategies.
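The jack-knife test mentioned in this abstract is a leave-one-out procedure: each complex is held out in turn, the model is fitted on the remaining complexes, and the held-out affinity is predicted. The sketch below shows the idea under stated assumptions: it uses an ordinary least-squares linear model and randomly generated features as stand-ins, since the abstract does not specify the MPA-Pred model or its features.

```python
import numpy as np


def jackknife_predictions(X, y):
    """Leave-one-out predictions from an ordinary least-squares model."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # prepend an intercept column
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # every sample except the held-out one
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ coef
    return preds


# Hypothetical data: 20 complexes, 3 features (illustrative only).
rng = np.random.default_rng(0)
features = rng.normal(size=(20, 3))
affinity = features @ np.array([1.0, -0.5, 0.2]) - 10.0
affinity = affinity + rng.normal(scale=0.3, size=20)

preds = jackknife_predictions(features, affinity)
r = np.corrcoef(affinity, preds)[0, 1]
mae = np.mean(np.abs(affinity - preds))
print(f"jack-knife R = {r:.2f}, MAE = {mae:.2f}")
```

Because every prediction comes from a model that never saw the corresponding complex, the resulting correlation and MAE are less optimistic than values computed on the training data, which matters for small data sets such as the 114 complexes used here.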
Affiliation(s)
- Fathima Ridha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
  - Department of Computer Science, National University of Singapore, Singapore, Singapore
14
|
Hoogstraten CA, Koenderink JB, van Straaten CE, Scheer-Weijers T, Smeitink JAM, Schirris TJJ, Russel FGM. Pyruvate dehydrogenase is a potential mitochondrial off-target for gentamicin based on in silico predictions and in vitro inhibition studies. Toxicol In Vitro 2024; 95:105740. [PMID: 38036072 DOI: 10.1016/j.tiv.2023.105740] [Received: 07/10/2023] [Revised: 11/08/2023] [Accepted: 11/22/2023] [Indexed: 12/02/2023]
Abstract
During the drug development process, organ toxicity leads to an estimated failure of one-third of novel chemical entities. Drug-induced toxicity is increasingly associated with mitochondrial dysfunction, but identifying the underlying molecular mechanisms remains a challenge. Computational modeling techniques have proven to be a good tool in searching for drug off-targets. Here, we aimed to identify mitochondrial off-targets of the nephrotoxic drugs tenofovir and gentamicin using different in silico approaches (KRIPO, ProBis and PDID). Dihydroorotate dehydrogenase (DHODH) and pyruvate dehydrogenase (PDH) were predicted as potential novel off-target sites for tenofovir and gentamicin, respectively. The predicted targets were evaluated in vitro, using (colorimetric) enzymatic activity measurements. Tenofovir did not inhibit DHODH activity, while gentamicin potently reduced PDH activity. In conclusion, the use of in silico methods appeared a valuable approach in predicting PDH as a mitochondrial off-target of gentamicin. Further research is required to investigate the contribution of PDH inhibition to overall renal toxicity of gentamicin.
Affiliation(s)
- Charlotte A Hoogstraten
  - Division of Pharmacology and Toxicology, Department of Pharmacy, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Radboud Center for Mitochondrial Medicine, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Jan B Koenderink
  - Division of Pharmacology and Toxicology, Department of Pharmacy, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Carolijn E van Straaten
  - Division of Pharmacology and Toxicology, Department of Pharmacy, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Tom Scheer-Weijers
  - Division of Pharmacology and Toxicology, Department of Pharmacy, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Jan A M Smeitink
  - Radboud Center for Mitochondrial Medicine, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Department of Pediatrics, Amalia Children's Hospital, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Khondrion BV, Nijmegen 6525 EX, the Netherlands
- Tom J J Schirris
  - Division of Pharmacology and Toxicology, Department of Pharmacy, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Radboud Center for Mitochondrial Medicine, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Frans G M Russel
  - Division of Pharmacology and Toxicology, Department of Pharmacy, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Radboud Center for Mitochondrial Medicine, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
15
|
Pepe A, Tito FR, Guevara MG. Antiplatelet mechanism of a subtilisin-like serine protease from Solanum tuberosum (StSBTc-3). Biochimie 2024; 218:152-161. [PMID: 37704077 DOI: 10.1016/j.biochi.2023.09.011] [Received: 12/06/2022] [Revised: 09/01/2023] [Accepted: 09/09/2023] [Indexed: 09/15/2023]
Abstract
The aims of this study are to characterize the antiplatelet activity of StSBTc-3, a potato serine protease with fibrino(geno)lytic activity, and to provide information on its mechanism of action. The results obtained show that StSBTc-3 inhibits clot retraction and prevents platelet aggregation induced by thrombin, convulxin, and A23187. Platelet aggregation inhibition occurs in a dose-dependent manner and is not affected by inactivation of StSBTc-3 with the serine protease inhibitor phenylmethylsulfonyl fluoride (PMSF). In addition, StSBTc-3 reduces fibrinogen binding onto platelets. In silico calculations show a high binding affinity between StSBTc-3 and human α2bβ3 integrin, suggesting that the antiplatelet activity of StSBTc-3 could be associated with the fibronectin type III domain present in its amino acid sequence. Binding experiments show that StSBTc-3 binds to α2bβ3, preventing the interaction between α2bβ3 and fibrinogen and, consequently, inhibiting platelet aggregation. StSBTc-3 represents a promising compound to be considered as an alternative to commercially available drugs used in cardiovascular therapies.
Affiliation(s)
- Alfonso Pepe
  - Biological Research Institute, National Scientific and Technical Research Council (CONICET) - University of Mar del Plata (UNMdP), Funes 3250, Mar del Plata, 7600, Buenos Aires, Argentina
- Florencia Rocio Tito
  - Biological Research Institute, National Scientific and Technical Research Council (CONICET) - University of Mar del Plata (UNMdP), Funes 3250, Mar del Plata, 7600, Buenos Aires, Argentina
- Maria Gabriela Guevara
  - Biological Research Institute, National Scientific and Technical Research Council (CONICET) - University of Mar del Plata (UNMdP), Funes 3250, Mar del Plata, 7600, Buenos Aires, Argentina
16
|
Kowalski A. Sequence-based prediction of the effects of histones H1 post-translational modifications: impact on the features related to the function. J Biomol Struct Dyn 2024:1-10. [PMID: 38353488 DOI: 10.1080/07391102.2024.2316773] [Received: 10/10/2023] [Accepted: 02/04/2024] [Indexed: 03/11/2025]
Abstract
Post-translational modifications modulate histone H1 activity, but their impact on protein features has not been studied so far. Therefore, this work set out to determine how the most common modifications, i.e. acetylation, methylation, phosphorylation and ubiquitination, alter the physicochemical and molecular properties of histones H1. Investigations were carried out using sequence-based predictors trained on various protein features. Because a full set of histone H1 modifications is not included in the databases of histone proteins, the survey was performed on human, animal, plant, fungal and protist sequences selected from the UniProtKB/Swiss-Prot database. Quantitative proportions of modifications were similar between the groups of organisms (CV = 0.11) but different within each group (p < 0.05). The effects of modifications were evaluated using mutated sequences obtained by substituting the modified Lys, Ser and Thr residues with the neutral residue Ala. A predominance of deleterious mutations at the sites of acetylation, methylation and ubiquitination over the sites of phosphorylation (p < 0.05) indicates that the latter modification has a more redundant character. Modifications evoke an increase in protein solubility and stability, as well as an acceleration of folding kinetics and a weakening of binding affinity. They also maintain a higher extent of intrinsic structural disorder. The obtained results prove that modifications should be regarded as relevant factors influencing the physicochemical features that determine molecular properties. Thus, histone H1 function is closely correlated with modification status.
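The mutated sequences described in this abstract are obtained by replacing a modified Lys, Ser or Thr residue with Ala. A minimal sketch of that substitution step follows; the function name, the 1-based position convention and the example sequence are assumptions made here for illustration and do not come from the study.

```python
def alanine_substitute(sequence, positions, allowed="KST"):
    """Return a copy of `sequence` with Ala at the given 1-based positions.

    Only Lys (K), Ser (S) and Thr (T) may be replaced, mirroring the
    residues named in the abstract; anything else raises an error.
    """
    residues = list(sequence)
    for pos in positions:
        original = residues[pos - 1]
        if original not in allowed:
            raise ValueError(f"position {pos} holds {original}, not one of {allowed}")
        residues[pos - 1] = "A"
    return "".join(residues)


# Hypothetical fragment and modification sites (illustrative only).
wild_type = "MSETKPAAS"
mutant = alanine_substitute(wild_type, [2, 5])
print(wild_type, "->", mutant)
```

The wild-type and mutant sequences can then be passed to the same sequence-based predictor, and the difference in predicted property serves as an estimate of the effect of the modification site.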
Affiliation(s)
- Andrzej Kowalski
  - Division of Medical Biology, Institute of Biology, Jan Kochanowski University in Kielce, Kielce, Poland
17
|
Yi C, Taylor ML, Ziebarth J, Wang Y. Predictive Models and Impact of Interfacial Contacts and Amino Acids on Protein-Protein Binding Affinity. ACS OMEGA 2024; 9:3454-3468. [PMID: 38284090 PMCID: PMC10809705 DOI: 10.1021/acsomega.3c06996] [Received: 09/13/2023] [Revised: 12/11/2023] [Accepted: 12/14/2023] [Indexed: 01/30/2024]
Abstract
Protein-protein interactions (PPIs) play a central role in nearly all cellular processes. The strength of the binding in a PPI is characterized by the binding affinity (BA) and is a key factor in controlling protein-protein complex formation and defining the structure-function relationship. Despite advancements in understanding protein-protein binding, much remains unknown about the interfacial region and its association with BA. New models are needed to predict BA with improved accuracy for therapeutic design. Here, we use machine learning approaches to examine how well different types of interfacial contacts can be used to predict experimentally determined BA and to reveal the impact of the specific amino acids at the binding interface on BA. We create a series of multivariate linear regression models incorporating different contact features at both residue and atomic levels and examine how different methods of identifying and characterizing these properties impact the performance of these models. Particularly, we introduce a new and simple approach to predict BA based on the quantities of specific amino acids at the protein-protein interface. We found that the numbers of specific amino acids at the protein-protein interface were correlated with BA. We show that the interfacial numbers of amino acids can be used to produce models with consistently good performance across different data sets, indicating the importance of the identities of interfacial amino acids in underlying BA. When trained on a diverse set of complexes from two benchmark data sets, the best performing BA model was generated with an explicit linear equation involving six amino acids. Tyrosine, in particular, was identified as the key amino acid in controlling BA, as it had the strongest correlation with BA and was consistently identified as the most important amino acid in feature importance studies. Glycine and serine were identified as the next two most important amino acids in predicting BA. The results from this study further our understanding of PPIs and can be used to make improved predictions of BA, giving them implications for drug design and screening in the pharmaceutical industry.
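The approach summarized above, a linear equation in the counts of specific interfacial amino acids, can be sketched as an ordinary least-squares fit. The abstract does not list the six amino acids or their coefficients, so the sketch below assumes only the three residues it names (Tyr, Gly, Ser) and uses hypothetical interface strings and affinities; none of the numbers come from the paper.

```python
import numpy as np

# Assumed feature set: the three residues named in the abstract.
FEATURE_RESIDUES = ("Y", "G", "S")


def count_features(interface_residues):
    """Count each feature residue in a string of interface residues."""
    return [interface_residues.count(aa) for aa in FEATURE_RESIDUES]


def fit_linear_model(interfaces, affinities):
    """Least-squares fit of BA = b0 + sum_i b_i * count_i."""
    X = np.array([count_features(s) for s in interfaces], dtype=float)
    A = np.column_stack([np.ones(len(X)), X])  # intercept plus counts
    coef, *_ = np.linalg.lstsq(A, np.asarray(affinities, dtype=float), rcond=None)
    return coef


def predict(coef, interface_residues):
    """Evaluate the fitted linear equation for one interface."""
    return float(coef[0] + np.dot(coef[1:], count_features(interface_residues)))


# Hypothetical interfaces (one-letter codes) and affinities in kcal/mol.
toy_interfaces = ["YYGS", "YGGSS", "YYYS", "GSS", "YG", "S"]
toy_affinities = [-11.2, -9.6, -12.9, -7.4, -9.1, -7.8]

coef = fit_linear_model(toy_interfaces, toy_affinities)
print("coefficients (intercept, Y, G, S):", np.round(coef, 2))
print("predicted BA for 'YYSG':", round(predict(coef, "YYSG"), 2))
```

A model of this form is easy to inspect, since each coefficient states how much one additional interfacial residue of that type shifts the predicted affinity.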
Affiliation(s)
- Carey Huang Yi
  - Department of Chemistry, The University of Memphis, Memphis, Tennessee 38152, United States
- Mitchell Lee Taylor
  - Department of Chemistry, The University of Memphis, Memphis, Tennessee 38152, United States
- Jesse Ziebarth
  - Department of Chemistry, The University of Memphis, Memphis, Tennessee 38152, United States
- Yongmei Wang
  - Department of Chemistry, The University of Memphis, Memphis, Tennessee 38152, United States
18
|
Jarończyk M. Software for Predicting Binding Free Energy of Protein-Protein Complexes and Their Mutants. Methods Mol Biol 2024; 2780:139-147. [PMID: 38987468 DOI: 10.1007/978-1-0716-3985-6_9] [Indexed: 07/12/2024]
Abstract
Protein-protein binding affinity prediction is important for understanding complex biochemical pathways and to uncover protein interaction networks. Quantitative estimation of the binding affinity changes caused by mutations can provide critical information for protein function annotation and genetic disease diagnoses. The binding free energies of protein-protein complexes can be predicted using several computational tools. This chapter is a summary of software developed for the prediction of binding free energies for protein-protein complexes and their mutants.
19
|
Iannuzo N, Welfley H, Li NC, Johnson MDL, Rojas-Quintero J, Polverino F, Guerra S, Li X, Cusanovich DA, Langlais PR, Ledford JG. CC16 drives VLA-2-dependent SPLUNC1 expression. Front Immunol 2023; 14:1277582. [PMID: 38053993 PMCID: PMC10694244 DOI: 10.3389/fimmu.2023.1277582] [Received: 08/15/2023] [Accepted: 10/30/2023] [Indexed: 12/07/2023] Open
Abstract
Rationale: CC16 (Club Cell Secretory Protein) is a protein produced by club cells and other non-ciliated epithelial cells within the lungs. CC16 has been shown to protect against the development of obstructive lung diseases and attenuate pulmonary pathogen burden. Despite recent advances in understanding CC16 effects in circulation, the biological mechanisms of CC16 in pulmonary epithelial responses have not been elucidated. Objectives: We sought to determine if CC16 deficiency impairs epithelial-driven host responses and to identify novel receptors expressed within the pulmonary epithelium through which CC16 imparts activity. Methods: We utilized mass spectrometry and quantitative proteomics to investigate how CC16 deficiency impacts apically secreted pulmonary epithelial proteins. Mouse tracheal epithelial cells (MTECs), human nasal epithelial cells (HNECs) and mice were studied in naïve conditions and after Mycoplasma pneumoniae (Mp) challenge. Measurements and main results: We identified 8 antimicrobial proteins significantly decreased by CC16-/- MTECs, 6 of which were validated by mRNA expression in Severe Asthma Research Program (SARP) cohorts. Short Palate Lung and Nasal Epithelial Clone 1 (SPLUNC1) was the most differentially expressed protein (66-fold) and was the focus of this study. Using a combination of MTECs and HNECs, we found that CC16 enhances pulmonary epithelial-driven SPLUNC1 expression via signaling through the receptor complex Very Late Antigen-2 (VLA-2) and that rCC16 given to mice enhances pulmonary SPLUNC1 production and decreases Mp burden. Likewise, rSPLUNC1 results in decreased Mp burden in mice lacking CC16. The VLA-2 integrin binding site within rCC16 is necessary for induction of SPLUNC1 and the reduction in Mp burden. Conclusion: Our findings demonstrate a novel role for CC16 in epithelial-driven host defense by up-regulating antimicrobials and define a novel epithelial receptor for CC16, VLA-2, through which signaling is necessary for enhanced SPLUNC1 production.
Affiliation(s)
- Natalie Iannuzo
  - Department of Cellular and Molecular Medicine, University of Arizona, Tucson, AZ, United States
- Holly Welfley
  - Asthma and Airway Disease Research Center, Tucson, AZ, United States
- Stefano Guerra
  - Asthma and Airway Disease Research Center, Tucson, AZ, United States
  - Department of Medicine, Division of Pulmonary, Allergy, Critical Care, and Sleep Medicine, University of Arizona, Tucson, AZ, United States
- Xingnan Li
  - Department of Medicine, Division of Genetics, Genomics, and Precision Medicine, University of Arizona, Tucson, AZ, United States
- Darren A. Cusanovich
  - Department of Cellular and Molecular Medicine, University of Arizona, Tucson, AZ, United States
  - Asthma and Airway Disease Research Center, Tucson, AZ, United States
- Paul R. Langlais
  - Department of Medicine, Division of Endocrinology, University of Arizona, Tucson, AZ, United States
- Julie G. Ledford
  - Department of Cellular and Molecular Medicine, University of Arizona, Tucson, AZ, United States
  - Asthma and Airway Disease Research Center, Tucson, AZ, United States
20
|
Nikam R, Yugandhar K, Gromiha MM. Deep learning-based method for predicting and classifying the binding affinity of protein-protein complexes. BIOCHIMICA ET BIOPHYSICA ACTA. PROTEINS AND PROTEOMICS 2023; 1871:140948. [PMID: 37567456 DOI: 10.1016/j.bbapap.2023.140948] [Received: 07/02/2023] [Revised: 08/05/2023] [Accepted: 08/08/2023] [Indexed: 08/13/2023]
Abstract
Protein-protein interactions (PPIs) play a critical role in various biological processes. Accurately estimating the binding affinity of PPIs is essential for understanding the underlying molecular recognition mechanisms. In this study, we employed a deep learning approach to predict the binding affinity (ΔG) of protein-protein complexes. To this end, we compiled a dataset of 903 protein-protein complexes, each with its corresponding experimental binding affinity, which belong to six functional classes. We extracted 8 to 20 non-redundant features from the sequence information as well as the predicted three-dimensional structures using feature selection methods for each protein functional class. Our method showed an overall mean absolute error of 1.05 kcal/mol and a correlation of 0.79 between experimental and predicted ΔG values. Additionally, we evaluated our model for discriminating high and low affinity protein-protein complexes and it achieved an accuracy of 87% with an F1 score of 0.86 using 10-fold cross-validation on the selected features. Our approach presents an efficient tool for studying PPIs and provides crucial insights into the underlying mechanisms of the molecular recognition process. The web server can be freely accessed at https://web.iitm.ac.in/bioinfo2/DeepPPAPred/index.html.
Affiliation(s)
- Rahul Nikam
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
- Kumar Yugandhar
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India; Department of Computational Biology, Cornell University, New York, USA
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India; Department of Computer Science, Tokyo Institute of Technology, Yokohama, Japan; Department of Computer Science, National University of Singapore, Singapore
21
|
Biswas G, Mukherjee D, Dutta N, Ghosh P, Basu S. EnCPdock: a web-interface for direct conjoint comparative analyses of complementarity and binding energetics in inter-protein associations. J Mol Model 2023; 29:239. [PMID: 37423912 DOI: 10.1007/s00894-023-05626-0] [Received: 03/23/2023] [Accepted: 06/20/2023] [Indexed: 07/11/2023]
Abstract
CONTEXT Protein-protein interaction (PPI) is a key component linked to virtually all cellular processes. Be it an enzyme catalysis ('classic type functions' of proteins) or a signal transduction ('non-classic'), proteins generally function involving stable or quasi-stable multi-protein associations. The physical basis for such associations is inherent in the combined effect of shape and electrostatic complementarities (Sc, EC) of the interacting protein partners at their interface, which provides indirect probabilistic estimates of the stability and affinity of the interaction. While Sc is a necessary criterion for inter-protein associations, EC can be favorable as well as disfavored (e.g., in transient interactions). Estimating equilibrium thermodynamic parameters (∆Gbinding, Kd) by experimental means is costly and time consuming, thereby opening windows for computational structural interventions. Attempts to empirically probe ∆Gbinding from coarse-grain structural descriptors (primarily, surface area based terms) have lately been overtaken by physics-based, knowledge-based and their hybrid approaches (MM/PBSA, FoldX, etc.) that directly compute ∆Gbinding without involving intermediate structural descriptors. METHODS Here, we present EnCPdock (https://www.scinetmol.in/EnCPdock/), a user-friendly web-interface for the direct conjoint comparative analyses of complementarity and binding energetics in proteins. EnCPdock returns an AI-predicted ∆Gbinding computed by combining complementarity (Sc, EC) and other high-level structural descriptors (input feature vectors), and renders a prediction accuracy comparable to the state-of-the-art. EnCPdock further locates a PPI complex in terms of its {Sc, EC} values (taken as an ordered pair) in the two-dimensional complementarity plot (CP). In addition, it also generates mobile molecular graphics of the interfacial atomic contact network for further analyses. EnCPdock also furnishes individual feature trends along with the relative probability estimates (Prfmax) of the obtained feature-scores with respect to the events of their highest observed frequencies. Together, these functionalities are of real practical use for structural tinkering and intervention as might be relevant in the design of targeted protein-interfaces. Combining all its features and applications, EnCPdock presents a unique online tool that should be beneficial to structural biologists and researchers across related fraternities.
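The two equilibrium parameters named in this abstract, ∆Gbinding and Kd, are linked by the standard thermodynamic relation ∆G = RT ln(Kd / c°), with c° = 1 M as the standard-state concentration. The short sketch below converts between them; the default temperature of 298.15 K and the function names are assumptions chosen for illustration, not settings taken from EnCPdock.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def kd_from_delta_g(delta_g_kcal, temperature_k=298.15):
    """Dissociation constant (mol/L) from a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# A nanomolar binder corresponds to roughly -12.3 kcal/mol at 298.15 K.
print(f"{delta_g_from_kd(1e-9):.2f} kcal/mol")
print(f"{kd_from_delta_g(-10.0):.2e} M")
```

Because the relation is logarithmic, an error of about 1.4 kcal/mol in predicted ∆G corresponds to roughly a tenfold error in Kd at room temperature, which gives a sense of scale for the prediction errors reported by the tools in this list.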
Affiliation(s)
- Gargi Biswas
  - Department of Chemistry and Structural Biology, Weizmann Institute of Science, 7610001, Rehovot, Israel
- Debasish Mukherjee
  - Institute of Molecular Biology gGmbH (IMB), Ackermannweg 4, 55128, Mainz, Germany
- Nalok Dutta
  - Dept of Biochemical Engineering, Faculty of Engineering Science, University College London, London, WC1E 6BT, UK
- Prithwi Ghosh
  - Department of Botany, Narajole Raj College, Vidyasagar University, Midnapore, 721211, India
- Sankar Basu
  - Department of Microbiology, Asutosh College (affiliated with University of Calcutta), 92, Shyama Prasad Mukherjee Rd, Bhowanipore, 700026, Kolkata, India
22
|
GÖKER BAGCA B, GÖDE S, TURHAL G, ÖZATEŞ NP, VERAL A, GÜNDÜZ C, AVCI ÇB. Newly identified receptor tyrosine kinase mutations in rare paranasal sinus cancers and their potential functional effects. EGE TIP DERGISI 2023. [DOI: 10.19161/etd.1262612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/11/2023] Open
Abstract
Aim: Paranasal sinus cancers are a heterogeneous group of very rare diseases. Maxillary sinus squamous cell carcinoma is the most common subtype of paranasal sinus cancer, both anatomically and histologically. Because knowledge of the genetic profile of this cancer is limited, patients cannot benefit from targeted treatment options. This study aimed to identify receptor tyrosine kinase mutations in this rare cancer and to predict their possible functional effects.
Materials and Methods: DNA was isolated from FFPE tumor tissue of 30 cases, and the mutation profile of the cases was determined by next-generation sequencing and bioinformatic evaluation. The functional effects of the identified pathogenic/likely pathogenic variants were predicted with different in silico tools.
Results: At least one pathogenic/likely pathogenic KIT, PDGFRA or RET mutation was identified in every case. Mutations in the catalytic domain of the KIT gene were predicted to increase kinase activity. The p.P567P and p.D1074D mutations in the PDGFRA gene were detected in all 30 cases and in all reads from normal tissues obtained from the SRA database.
Conclusion: The finding that receptor tyrosine kinase mutations may also play an important role in paranasal sinus cancers is significant, particularly because it could make treatment approaches that target increased kinase activity available to these patients.
Affiliation(s)
- Bakiye GÖKER BAGCA
  - Aydın Adnan Menderes University, Faculty of Medicine, Department of Medical Biology, Aydın, Türkiye
- Sercan GÖDE
  - Ege University, Faculty of Medicine, Department of Otorhinolaryngology, İzmir, Türkiye
- Göksel TURHAL
  - Ege University, Faculty of Medicine, Department of Otorhinolaryngology, İzmir, Türkiye
- Neslihan Pınar ÖZATEŞ
  - Harran University, Faculty of Medicine, Department of Medical Biology, Şanlıurfa, Türkiye
- Ali VERAL
  - Ege University, Faculty of Medicine, Department of Pathology, İzmir, Türkiye
- Cumhur GÜNDÜZ
  - Ege University, Faculty of Medicine, Department of Medical Biology, İzmir, Türkiye
- Çığır Biray AVCI
  - Ege University, Faculty of Medicine, Department of Medical Biology, İzmir, Türkiye
23
Miftahussurur M, Alfaray RI, Fauzia KA, Dewayani A, Doohan D, Waskito LA, Rezkitha YAA, Utomo DH, Somayana G, Fahrial Syam A, Lubis M, Akada J, Matsumoto T, Yamaoka Y. Low-grade intestinal metaplasia in Indonesia: Insights into the expression of proinflammatory cytokines during Helicobacter pylori infection and unique East-Asian CagA characteristics. Cytokine 2023; 163:156122. [PMID: 36640695 DOI: 10.1016/j.cyto.2022.156122] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/01/2022] [Revised: 12/28/2022] [Accepted: 12/28/2022] [Indexed: 01/15/2023]
Abstract
Helicobacter pylori infection is a major cause of intestinal metaplasia. In this study, we aimed to understand the reason underlying the low grade and incidence of intestinal metaplasia in Indonesia, based on the expression of genes encoding proinflammatory cytokines in gastric biopsy specimens. The possible reasons for the lesser virulence of the East-Asian-type CagA in Indonesia than that of the Western-type CagA, which is not common in other countries, were also investigated. The mRNA expression of cytokines was evaluated using real-time PCR. CagA characteristics were analyzed using in silico analysis. The expression of cytokines was typically not robust, among H. pylori-infected subjects in Indonesia, despite them predominantly demonstrating the East-Asian-type CagA. This might partially be explained by the characteristics of the East-Asian-type CagA in Indonesia, which showed a higher instability index and required higher energy to interact with proteins related to the cytokine induction pathway compared with the other types (p < 0.001 and p < 0.05, respectively). Taken together, besides the low prevalence of H. pylori, the low inflammatory response of the host and low CagA virulence, even among populations with high infection rates, may play an essential role in the low grade and low incidence of intestinal metaplasia in Indonesia. We believe that these findings would be relevant for better understanding of intestinal metaplasia, which is closely associated with the development of gastric cancer.
Affiliation(s)
- Muhammad Miftahussurur
  - Division of Gastroentero-Hepatology, Department of Internal Medicine, Faculty of Medicine-Dr. Soetomo Teaching Hospital, Universitas Airlangga, Jalan Mayjend Prof. Dr. Moestopo No. 6-8, Surabaya 60131, Indonesia; Helicobacter pylori and Microbiota Study Group, Institute of Tropical Disease, Universitas Airlangga, Surabaya 60115, Indonesia.
- Ricky Indra Alfaray
  - Helicobacter pylori and Microbiota Study Group, Institute of Tropical Disease, Universitas Airlangga, Surabaya 60115, Indonesia; Department of Environmental and Preventive Medicine, Oita University Faculty of Medicine, 1-1, Idaigaoka, Hasama-machi, Yufu, Oita 879-5593, Japan.
- Kartika Afrida Fauzia
  - Helicobacter pylori and Microbiota Study Group, Institute of Tropical Disease, Universitas Airlangga, Surabaya 60115, Indonesia; Department of Environmental and Preventive Medicine, Oita University Faculty of Medicine, 1-1, Idaigaoka, Hasama-machi, Yufu, Oita 879-5593, Japan; Department of Public Health and Preventive Medicine, Faculty of Medicine, Universitas Airlangga, Surabaya 60132, Indonesia.
- Astri Dewayani
  - Helicobacter pylori and Microbiota Study Group, Institute of Tropical Disease, Universitas Airlangga, Surabaya 60115, Indonesia; Department of Infectious Disease Control, Oita University Faculty of Medicine, 1-1, Idaigaoka, Hasama-machi, Yufu, Oita 879-5593, Japan; Department of Anatomy, Histology and Pharmacology, Universitas Airlangga, Surabaya 60131, Indonesia.
- Dalla Doohan
  - Helicobacter pylori and Microbiota Study Group, Institute of Tropical Disease, Universitas Airlangga, Surabaya 60115, Indonesia; Department of Anatomy, Histology and Pharmacology, Universitas Airlangga, Surabaya 60131, Indonesia.
- Langgeng Agung Waskito
  - Helicobacter pylori and Microbiota Study Group, Institute of Tropical Disease, Universitas Airlangga, Surabaya 60115, Indonesia; Department of Physiology and Medical Biochemistry, Faculty of Medicine, Universitas Airlangga, Surabaya 60132, Indonesia; Department of Internal Medicine, Faculty of Medicine, Universitas Airlangga, Surabaya 60132, Indonesia.
- Yudith Annisa Ayu Rezkitha
  - Helicobacter pylori and Microbiota Study Group, Institute of Tropical Disease, Universitas Airlangga, Surabaya 60115, Indonesia; Department of Internal Medicine, Faculty of Medicine, University of Muhammadiyah, Surabaya, Surabaya 60113, Indonesia.
- Didik Huswo Utomo
  - Research and Education Center for Bioinformatics, Indonesia Institute of Bioinformatics, Malang 65162, Indonesia.
- Gde Somayana
  - Gastroentero Hepatology Division, Department of Internal Medicine, Faculty of Medicine-Sanglah Hospital, Udayana University, Denpasar, Bali 80114, Indonesia.
- Ari Fahrial Syam
  - Division of Gastroenterology, Department of Internal Medicine, Faculty of Medicine-Cipto Mangunkusumo Teaching Hospital, University of Indonesia, Jakarta 10430, Indonesia.
- Masrul Lubis
  - Division of Gastroenterology, Department of Internal Medicine, Faculty of Medicine-Cipto Mangunkusumo Teaching Hospital, Universitas Sumatera Utara, Medan 20222, Indonesia.
- Junko Akada
  - Department of Environmental and Preventive Medicine, Oita University Faculty of Medicine, 1-1, Idaigaoka, Hasama-machi, Yufu, Oita 879-5593, Japan.
- Takashi Matsumoto
  - Department of Environmental and Preventive Medicine, Oita University Faculty of Medicine, 1-1, Idaigaoka, Hasama-machi, Yufu, Oita 879-5593, Japan.
- Yoshio Yamaoka
  - Department of Environmental and Preventive Medicine, Oita University Faculty of Medicine, 1-1, Idaigaoka, Hasama-machi, Yufu, Oita 879-5593, Japan; Department of Medicine, Gastroenterology and Hepatology Section, Baylor College of Medicine, Houston, TX 77030, USA.
24
Yan K, Lv H, Wen J, Guo Y, Xu Y, Liu B. PreTP-Stack: Prediction of Therapeutic Peptides Based on the Stacked Ensemble Learing. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:1337-1344. [PMID: 35700248 DOI: 10.1109/tcbb.2022.3183018] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Indexed: 05/04/2023]
Abstract
Therapeutic peptide prediction is critical for drug development and therapy. Researchers have developed several computational methods to identify different therapeutic peptide types. However, most of these methods focus on identifying one specific type of therapeutic peptide and fail to accurately predict all types. Moreover, it remains challenging to combine features that describe different properties when predicting therapeutic peptides. In this study, a novel stacking framework, PreTP-Stack, is proposed for predicting different types of therapeutic peptides. PreTP-Stack is constructed from ten different features and four predictors (Random Forest, Linear Discriminant Analysis, XGBoost and Support Vector Machine). The proposed method then constructs an auto-weighted multi-view learning model as the final meta-classifier to enhance the performance of the base models. Experimental results showed that the proposed method achieved better or highly comparable performance relative to state-of-the-art methods for predicting eight types of therapeutic peptides. A user-friendly web-server predictor is available at http://bliulab.net/PreTP-Stack.
25
Wang E. Prediction of antibody binding to SARS-CoV-2 RBDs. BIOINFORMATICS ADVANCES 2023; 3:vbac103. [PMID: 36698760 PMCID: PMC9868522 DOI: 10.1093/bioadv/vbac103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Revised: 12/18/2022] [Accepted: 12/31/2022] [Indexed: 01/03/2023]
Abstract
Summary The ability to predict antibody-antigen binding is essential for computational models of antibody affinity maturation and protein design. While most models aim to predict binding for arbitrary antigens and antibodies, the global impact of SARS-CoV-2 on public health and the availability of associated data suggest that a SARS-CoV-2-specific model would be highly beneficial. In this work, we present a neural network model, trained on ∼315 000 datapoints from deep mutational scanning experiments, that predicts escape fractions of SARS-CoV-2 RBDs binding to arbitrary antibodies. The antibody embeddings within the model constitute an effective sequence space, which correlates with the Hamming distance, suggesting that these embeddings may be useful for downstream tasks such as binding prediction. Indeed, the model achieves Spearman correlation coefficients of 0.46 and 0.52 on two held-out test sets. By comparison, correlation coefficients calculated using existing structure and sequence-based models do not exceed 0.28. The correlation coefficient against dissociation constants of antibodies binding to SARS-CoV-2 RBD variants is 0.46. Additionally, the residue-level escapes are highest in the antibody epitope, correlating well with experimentally measured escapes. We further study the effect of antibody chain use, embedding dimension size and feed-forward and convolutional architectures on the model results. Lastly, we find that the inference time of our model is significantly faster than previous models, suggesting that it could be a useful tool for the accurate and rapid prediction of antibodies binding to SARS-CoV-2 RBDs. Availability and implementation The model and associated code are available for download at https://github.com/ericzwang/RBD_AB. Supplementary information Supplementary data are available at Bioinformatics Advances online.
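The two quantities this abstract relies on, the Hamming distance between sequences and the Spearman correlation between predicted and measured values, can be sketched in a few lines. This is an illustrative, standard-library-only sketch and not the code from the linked repository; the function names are hypothetical.

```python
from typing import List, Sequence


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(ca != cb for ca, cb in zip(a, b))


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because Spearman correlation depends only on rank order, it rewards a model that orders antibodies correctly even when the predicted magnitudes are off.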
Affiliation(s)
- Eric Wang
  - To whom correspondence should be addressed.
26
Murakami Y, Mizuguchi K. Recent developments of sequence-based prediction of protein-protein interactions. Biophys Rev 2022; 14:1393-1411. [PMID: 36589735 PMCID: PMC9789376 DOI: 10.1007/s12551-022-01038-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Accepted: 12/08/2022] [Indexed: 12/25/2022] Open
Abstract
The identification of protein-protein interactions (PPIs) can lead to a better understanding of cellular functions and biological processes of proteins and contribute to the design of drugs to target disease-causing PPIs. In addition, targeting host-pathogen PPIs is useful for elucidating infection mechanisms. Although several experimental methods have been used to identify PPIs, these methods have yet to draw complete PPI networks. Hence, computational techniques are increasingly required for the prediction of potential PPIs that have not yet been observed experimentally. Recent high-performance sequence-based methods have contributed to the construction of PPI networks and the elucidation of pathogenetic mechanisms in specific diseases. However, the usefulness of these methods depends on the quality and quantity of the PPI training data. In this brief review, we introduce currently available PPI databases and recent sequence-based methods for predicting PPIs. We also discuss key issues in this field and present future perspectives on sequence-based PPI prediction.
Affiliation(s)
- Yoichi Murakami
  - Tokyo University of Information Sciences, 4-1 Onaridai, Wakaba-Ku, Chiba, 265-8501 Japan
- Kenji Mizuguchi
  - Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-Shi, Osaka, 565-0871 Japan
  - National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085 Japan
27
Guo Z, Yamaguchi R. Machine learning methods for protein-protein binding affinity prediction in protein design. FRONTIERS IN BIOINFORMATICS 2022; 2:1065703. [PMID: 36591334 PMCID: PMC9800603 DOI: 10.3389/fbinf.2022.1065703] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/10/2022] [Accepted: 12/01/2022] [Indexed: 12/23/2022] Open
Abstract
Protein-protein interactions govern a wide range of biological activity. A proper estimation of the protein-protein binding affinity is vital to design proteins with high specificity and binding affinity toward a target protein, which has a variety of applications including antibody design in immunotherapy, enzyme engineering for reaction optimization, and construction of biosensors. However, experimental and theoretical modelling methods are time-consuming, hinder the exploration of the entire protein space, and deter the identification of optimal proteins that meet the requirements of practical applications. In recent years, the rapid development in machine learning methods for protein-protein binding affinity prediction has revealed the potential of a paradigm shift in protein design. Here, we review the prediction methods and associated datasets and discuss the requirements and construction methods of binding affinity prediction models for protein design.
Affiliation(s)
- Zhongliang Guo
  - Division of Cancer Systems Biology, Aichi Cancer Center Research Institute, Nagoya, Aichi, Japan
- Rui Yamaguchi
  - Division of Cancer Systems Biology, Aichi Cancer Center Research Institute, Nagoya, Aichi, Japan
  - Division of Cancer Informatics, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
  - Correspondence: Rui Yamaguchi
28
Romero-Molina S, Ruiz-Blanco YB, Mieres-Perez J, Harms M, Münch J, Ehrmann M, Sanchez-Garcia E. PPI-Affinity: A Web Tool for the Prediction and Optimization of Protein-Peptide and Protein-Protein Binding Affinity. J Proteome Res 2022; 21:1829-1841. [PMID: 35654412 PMCID: PMC9361347 DOI: 10.1021/acs.jproteome.2c00020] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Indexed: 11/30/2022]
Abstract
Virtual screening of protein–protein and protein–peptide interactions is a challenging task that directly impacts the processes of hit identification and hit-to-lead optimization in drug design projects involving peptide-based pharmaceuticals. Although several screening tools designed to predict the binding affinity of protein–protein complexes have been proposed, methods specifically developed to predict protein–peptide binding affinity are comparatively scarce. Frequently, predictors trained to score the affinity of small molecules are used for peptides indistinctively, despite the larger complexity and heterogeneity of interactions rendered by peptide binders. To address this issue, we introduce PPI-Affinity, a tool that leverages support vector machine (SVM) predictors of binding affinity to screen datasets of protein–protein and protein–peptide complexes, as well as to generate and rank mutants of a given structure. The performance of the SVM models was assessed on four benchmark datasets, which include protein–protein and protein–peptide binding affinity data. In addition, we evaluated our model on a set of mutants of EPI-X4, an endogenous peptide inhibitor of the chemokine receptor CXCR4, and on complexes of the serine proteases HTRA1 and HTRA3 with peptides. PPI-Affinity is freely accessible at https://protdcal.zmb.uni-due.de/PPIAffinity.
Affiliation(s)
- Sandra Romero-Molina
  - Computational Biochemistry, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
- Yasser B Ruiz-Blanco
  - Computational Biochemistry, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
- Joel Mieres-Perez
  - Computational Biochemistry, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
- Mirja Harms
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm 89081, Germany
- Jan Münch
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm 89081, Germany
  - Core Facility Functional Peptidomics, Ulm University Medical Center, Ulm 89081, Germany
- Michael Ehrmann
  - Faculty of Biology, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
- Elsa Sanchez-Garcia
  - Computational Biochemistry, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
29
Panday S, Alexov E. Protein-Protein Binding Free Energy Predictions with the MM/PBSA Approach Complemented with the Gaussian-Based Method for Entropy Estimation. ACS OMEGA 2022; 7:11057-11067. [PMID: 35415339 PMCID: PMC8991903 DOI: 10.1021/acsomega.1c07037] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 12/13/2021] [Accepted: 03/10/2022] [Indexed: 06/14/2023]
Abstract
Here, we present a Gaussian-based method for estimation of protein-protein binding entropy to augment the molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) method for computational prediction of binding free energy (ΔG). The method is termed f5-MM/PBSA/E, where "E" stands for entropy and f5 for five adjustable parameters. The enthalpy components of ΔG (molecular mechanics, polar and non-polar solvation energies) are computed from a single implicit solvent generalized Born (GB) energy minimized structure of a protein-protein complex, while the binding entropy is computed using independently GB energy minimized unbound and bound structures. It should be emphasized that the f5-MM/PBSA/E method does not use snapshots, just energy minimized structures, and is thus very fast and computationally efficient. The method is trained and benchmarked in 5-fold validation test over a data set consisting of 46 protein-protein binding cases with experimentally determined dissociation constant Kd values. This data set has been used for benchmarking in recently published protein-protein binding studies that apply conventional MM/PBSA and MM/PBSA with an enhanced sampling method. The f5-MM/PBSA/E tested on the same data set achieves similar or better performance than these computationally demanding approaches, making it an excellent choice for high throughput protein-protein binding affinity prediction studies.
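Benchmarks of this kind compare a computed ΔG with an experimental dissociation constant, and the two are linked by the standard thermodynamic relation ΔG = RT ln(Kd / c°) with c° = 1 M. The sketch below shows that conversion only; the default temperature of 298.15 K and the function names are assumptions for illustration and are not taken from the paper.

```python
import math

# Gas constant in kcal/(mol*K).
GAS_CONSTANT_KCAL = 1.987204e-3


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Return the binding free energy in kcal/mol for a Kd given in mol/L.

    Uses dG = R * T * ln(Kd / 1 M); tighter binding (smaller Kd) gives a
    more negative value.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


def dissociation_constant(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Invert the relation: return Kd in mol/L for a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal / (GAS_CONSTANT_KCAL * temperature_k))
```

Under these assumptions a nanomolar binder (Kd = 1e-9 M) corresponds to roughly -12.3 kcal/mol, and each tenfold change in Kd shifts ΔG by about 1.36 kcal/mol.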
30
Santa-Coloma TA. Overlapping synthetic peptides as a tool to map protein-protein interactions – FSH as a model system of nonadditive interactions. Biochim Biophys Acta Gen Subj 2022; 1866:130153. [DOI: 10.1016/j.bbagen.2022.130153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2022] [Revised: 04/06/2022] [Accepted: 04/12/2022] [Indexed: 10/18/2022]
31
Munjal NS, Sapra D, Parthasarathi KTS, Goyal A, Pandey A, Banerjee M, Sharma J. Deciphering the Interactions of SARS-CoV-2 Proteins with Human Ion Channels Using Machine-Learning-Based Methods. Pathogens 2022; 11:pathogens11020259. [PMID: 35215201 PMCID: PMC8874499 DOI: 10.3390/pathogens11020259] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/30/2021] [Revised: 01/31/2022] [Accepted: 02/08/2022] [Indexed: 01/04/2023] Open
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is accountable for the protracted COVID-19 pandemic. Its high transmission rate and pathogenicity led to health emergencies and economic crisis. Recent studies pertaining to the understanding of the molecular pathogenesis of SARS-CoV-2 infection exhibited the indispensable role of ion channels in viral infection inside the host. Moreover, machine learning (ML)-based algorithms are providing a higher accuracy for host-SARS-CoV-2 protein–protein interactions (PPIs). In this study, PPIs of SARS-CoV-2 proteins with human ion channels (HICs) were trained on the PPI-MetaGO algorithm. PPI networks (PPINs) and a signaling pathway map of HICs with SARS-CoV-2 proteins were generated. Additionally, various U.S. food and drug administration (FDA)-approved drugs interacting with the potential HICs were identified. The PPIs were predicted with 82.71% accuracy, 84.09% precision, 84.09% sensitivity, 0.89 AUC-ROC, 65.17% Matthews correlation coefficient score (MCC) and 84.09% F1 score. Several host pathways were found to be altered, including calcium signaling and taste transduction pathway. Potential HICs could serve as an initial set to the experimentalists for further validation. The study also reinforces the drug repurposing approach for the development of host directed antiviral drugs that may provide a better therapeutic management strategy for infection caused by SARS-CoV-2.
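The accuracy, precision, sensitivity, F1 and MCC figures quoted above all derive from the four cells of a binary confusion matrix. A minimal sketch of those definitions follows; the function name is hypothetical, and the counts in the usage note are illustrative values, not data from the study.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall or true-positive rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # MCC uses all four cells, so it stays informative on imbalanced data.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
        "mcc": mcc,
    }
```

For example, `binary_metrics(tp=37, fp=7, tn=30, fn=7)` gives precision, sensitivity and F1 of about 0.841 each and an MCC of about 0.652. Whenever false positives and false negatives are equal in number, precision and sensitivity coincide and F1 equals both, which is the pattern seen in the reported values.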
Affiliation(s)
- Nupur S. Munjal
  - Institute of Bioinformatics, International Technology Park, Bangalore 560066, India
- Dikscha Sapra
  - Institute of Bioinformatics, International Technology Park, Bangalore 560066, India
- K. T. Shreya Parthasarathi
  - Institute of Bioinformatics, International Technology Park, Bangalore 560066, India
- Abhishek Goyal
  - Institute of Bioinformatics, International Technology Park, Bangalore 560066, India
- Akhilesh Pandey
  - Center for Molecular Medicine, National Institute of Mental Health and Neurosciences (NIMHANS), Hosur Road, Bangalore 560029, India
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA
  - Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Manidipa Banerjee
  - Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
- Jyoti Sharma
  - Institute of Bioinformatics, International Technology Park, Bangalore 560066, India
  - Manipal Academy of Higher Education (MAHE), Udupi 576104, India
32
Yang YX, Wang P, Zhu BT. Relative importance of interface and surface areas in protein-protein binding affinity prediction: A machine learning analysis based on linear regression and artificial neural network. Biophys Chem 2022; 283:106762. [DOI: 10.1016/j.bpc.2022.106762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2021] [Revised: 01/11/2022] [Accepted: 01/14/2022] [Indexed: 11/02/2022]
33
OUP accepted manuscript. Brief Funct Genomics 2022; 21:202-215. [DOI: 10.1093/bfgp/elac003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Revised: 01/29/2022] [Accepted: 02/15/2022] [Indexed: 11/14/2022] Open
34
Wang XR, Cao TT, Jia CM, Tian XM, Wang Y. Quantitative prediction model for affinity of drug-target interactions based on molecular vibrations and overall system of ligand-receptor. BMC Bioinformatics 2021; 22:497. [PMID: 34649499 PMCID: PMC8515642 DOI: 10.1186/s12859-021-04389-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/15/2021] [Accepted: 09/20/2021] [Indexed: 12/27/2022] Open
Abstract
Background The study of drug–target interaction (DTI) affinity plays an important role in safety assessment and pharmacology. Currently, quantitative structure–activity relationship (QSAR) modeling and molecular docking (MD) are the most common methods for studying DTI affinity. However, such models are often built for one specific target or a few targets, and most QSAR and MD methods are based on either the structure of the drug molecules or the structure of the receptors, which results in low accuracy and a narrow scope of application. Constructing quantitative prediction models with high accuracy and wide applicability remains a challenge. To this end, this paper screened molecular descriptors based on molecular vibrations and treated the molecule and target as a whole system to construct accurate, widely applicable prediction models based on the dissociation constant (Kd) and the concentration for 50% of maximal effect (EC50), and to provide a reference for quantifying DTI affinity. Results After comprehensive comparison, the results showed that random forest (RF) models are the optimal models for analyzing and predicting DTI affinity, with coefficients of determination (R2) all greater than 0.94. Compared with the quantitative models reported in the literature, the RF models developed in this paper have higher accuracy and wider applicability. In addition, E-state molecular descriptors associated with molecular vibrations and the normalized Moreau-Broto autocorrelation (G3), Moran autocorrelation (G4) and transition-distribution (G7) protein descriptors are of higher importance in the quantification of DTIs. Conclusion By screening molecular descriptors based on molecular vibrations and treating the molecule and target as a whole system, we obtained optimal RF-based models that are more accurate and more widely applicable. This indicates that selecting molecular descriptors associated with molecular vibrations and treating the molecule and target as a whole system are reliable ways to improve model performance, and it can provide a reference for quantifying DTI affinity. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04389-w.
Affiliation(s)
- Xian-Rui Wang
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
- Ting-Ting Cao
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
- Cong Min Jia
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
- Xue-Mei Tian
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
- Yun Wang
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China.
35
Milighetti M, Shawe-Taylor J, Chain B. Predicting T Cell Receptor Antigen Specificity From Structural Features Derived From Homology Models of Receptor-Peptide-Major Histocompatibility Complexes. Front Physiol 2021; 12:730908. [PMID: 34566692 PMCID: PMC8456106 DOI: 10.3389/fphys.2021.730908] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/25/2021] [Accepted: 08/02/2021] [Indexed: 11/13/2022] Open
Abstract
The physical interaction between the T cell receptor (TCR) and its cognate antigen causes T cells to activate and participate in the immune response. Understanding this physical interaction is important in predicting TCR binding to a target epitope, as well as potential cross-reactivity. Here, we propose a way of collecting informative features of the binding interface from homology models of T cell receptor-peptide-major histocompatibility complex (TCR-pMHC) complexes. The information collected from these structures is sufficient to discriminate binding from non-binding TCR-pMHC pairs in multiple independent datasets. The classifier is limited by the number of crystal structures available for the homology modelling and by the size of the training set. However, the classifier shows comparable performance to sequence-based classifiers requiring much larger training sets.
Collapse
Affiliation(s)
- Martina Milighetti
- Division of Infection and Immunity, University College London, London, United Kingdom
- Cancer Institute, University College London, London, United Kingdom
| | - John Shawe-Taylor
- Department of Computer Science, University College London, London, United Kingdom
| | - Benny Chain
- Division of Infection and Immunity, University College London, London, United Kingdom
- Department of Computer Science, University College London, London, United Kingdom
36
Abbasi WA, Abbas SA, Andleeb S. PANDA: Predicting the change in proteins binding affinity upon mutations by finding a signal in primary structures. J Bioinform Comput Biol 2021; 19:2150015. [PMID: 34126874 DOI: 10.1142/s0219720021500153] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
Accurately determining a change in protein binding affinity upon mutations is important to find novel therapeutics and to assist mutagenesis studies. Determination of change in binding affinity upon mutations requires sophisticated, expensive, and time-consuming wet-lab experiments that can be supported with computational methods. Most of the available computational prediction techniques depend upon protein structures, which limits their applicability to protein complexes with known 3D structures. In this work, we explore the sequence-based prediction of change in protein binding affinity upon mutation and question the effectiveness of k-fold cross-validation (CV) across mutations adopted in previous studies to assess the generalization ability of such predictors with no known mutation during training. We have used protein sequence information instead of protein structures along with machine learning techniques to accurately predict the change in protein binding affinity upon mutation. Our proposed sequence-based novel change in protein binding affinity predictor called PANDA performs comparably to the existing methods gauged through an appropriate CV scheme and an external independent test dataset. On an external test dataset, our proposed method gives a maximum Pearson correlation coefficient of 0.52 in comparison to the state-of-the-art existing protein structure-based method called MutaBind which gives a maximum Pearson correlation coefficient of 0.59. Our proposed protein sequence-based method, to predict a change in binding affinity upon mutations, has wide applicability and comparable performance in comparison to existing protein structure-based methods. We made PANDA easily accessible through a cloud-based webserver and python code available at https://sites.google.com/view/wajidarshad/software and https://github.com/wajidarshad/panda, respectively.
Affiliation(s)
- Wajid Arshad Abbasi
- Computational Biology and Data Analysis Lab., Department of Computer Sciences & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K 13100, Pakistan
- Syed Ali Abbas
- Computational Biology and Data Analysis Lab., Department of Computer Sciences & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K 13100, Pakistan
- Saiqa Andleeb
- Biotechnology Lab., Department of Zoology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K 13100, Pakistan
37
Abbasi WA, Abbas SA, Andleeb S, Ul Islam G, Ajaz SA, Arshad K, Khalil S, Anjam A, Ilyas K, Saleem M, Chughtai J, Abbas A. COVIDC: An expert system to diagnose COVID-19 and predict its severity using chest CT scans: Application in radiology. Informatics in Medicine Unlocked 2021; 23:100540. [PMID: 33644298 PMCID: PMC7901302 DOI: 10.1016/j.imu.2021.100540] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/26/2020] [Revised: 02/17/2021] [Accepted: 02/19/2021] [Indexed: 01/09/2023] Open
Abstract
Early diagnosis of Coronavirus disease 2019 (COVID-19) is significantly important, especially in the absence or inadequate provision of a specific vaccine, to stop the surge of this lethal infection by advising quarantine. This diagnosis is challenging as most of the patients having COVID-19 infection stay asymptomatic while others showing symptoms are hard to distinguish from patients having different respiratory infections such as severe flu and Pneumonia. Due to cost and time-consuming wet-lab diagnostic tests for COVID-19, there is an utmost requirement for some alternate, non-invasive, rapid, and discounted automatic screening system. A chest CT scan can effectively be used as an alternative modality to detect and diagnose the COVID-19 infection. In this study, we present an automatic COVID-19 diagnostic and severity prediction system called COVIDC (COVID-19 detection using CT scans) that uses deep feature maps from the chest CT scans for this purpose. Our newly proposed system not only detects COVID-19 but also predicts its severity by using a two-phase classification approach (COVID vs non-COVID, and COVID-19 severity) with deep feature maps and different shallow supervised classification algorithms such as SVMs and random forest to handle data scarcity. We performed a stringent COVIDC performance evaluation not only through 10-fold cross-validation and an external validation dataset but also in a real setting under the supervision of an experienced radiologist. In all the evaluation settings, COVIDC outperformed all the existing state-of-the-art methods designed to detect COVID-19 with an F1 score of 0.94 on the validation dataset and justified its use to diagnose COVID-19 effectively in the real setting by classifying correctly 9 out of 10 COVID-19 CT scans. We made COVIDC openly accessible through a cloud-based webserver and python code available at https://sites.google.com/view/wajidarshad/software and https://github.com/wajidarshad/covidc.
Affiliation(s)
- Wajid Arshad Abbasi
- Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
- Syed Ali Abbas
- Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
- Saiqa Andleeb
- Biotechnology Lab., Department of Zoology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
- Ghafoor Ul Islam
- Biotechnology Lab., Department of Zoology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
- Syeda Adin Ajaz
- Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
- Kinza Arshad
- Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
- Sadia Khalil
- Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
- Asma Anjam
- Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
- Kashif Ilyas
- Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
- Mohsib Saleem
- Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
- Jawad Chughtai
- Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan
- Ayesha Abbas
- Computational Biology and Data Analysis Lab., Department of Computer Science & Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, AJ&K, 13100, Pakistan