1
Tahir ul Qamar M, Noor F, Guo YX, Zhu XT, Chen LL. Deep-HPI-pred: An R-Shiny applet for network-based classification and prediction of Host-Pathogen protein-protein interactions. Comput Struct Biotechnol J 2024; 23:316-329. [PMID: 38192372 PMCID: PMC10772389 DOI: 10.1016/j.csbj.2023.12.010] [Received: 10/22/2023] [Revised: 12/11/2023] [Accepted: 12/12/2023] [Indexed: 01/10/2024] Open
Abstract
Host-pathogen interactions (HPIs) are vital in numerous biological activities and are intrinsically linked to the onset and progression of infectious diseases. HPIs are pivotal throughout the lifecycle of a disease: from the initial introduction of the pathogen, through the mechanisms that bypass host cellular defenses, to its subsequent proliferation inside the host. At the heart of these stages lies the interplay of proteins from both the host and the pathogen. By understanding these interlinked protein dynamics, we can gain crucial insights into how diseases progress and pave the way for stronger plant defenses and the swift formulation of countermeasures. In this study, we developed a web-based R/Shiny app, Deep-HPI-pred, that uses a network-driven feature learning method to predict as-yet unmapped interactions between pathogen and host proteins. Using citrus and CLas bacteria training datasets as a case study, we demonstrate the effectiveness of Deep-HPI-pred in identifying protein-protein interactions (PPIs) between them. Deep-HPI-pred uses Multilayer Perceptron (MLP) models for HPI prediction, a design based on a comprehensive evaluation of topological features and neural network architectures. On independent validation datasets, the models consistently surpassed a Matthews correlation coefficient (MCC) of 0.80 in host-pathogen interactions. Notably, the use of Eigenvector Centrality as the leading topological feature further enhanced this performance. Deep-HPI-pred also offers relevant gene ontology (GO) term information for each pathogen and host protein within the system. This protein annotation data adds a further layer to our understanding of the intricate dynamics of host-pathogen interactions.
In additional benchmarking studies, the Deep-HPI-pred model proved robust, delivering reliable results across different host-pathogen systems, including plant-pathogen (accuracy of 98.4% and 97.9%), human-virus (accuracy of 94.3%), and animal-bacteria (accuracy of 96.6%) interactomes. These results not only demonstrate the model's versatility but also pave the way for comprehensive insights into the molecular underpinnings of complex host-pathogen interactions. Taken together, the Deep-HPI-pred applet offers a unified web service for both identifying and illustrating interaction networks. The Deep-HPI-pred applet is freely accessible at its homepage: https://cbi.gxu.edu.cn/shiny-apps/Deep-HPI-pred/ and on GitHub: https://github.com/tahirulqamar/Deep-HPI-pred.
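For readers unfamiliar with the metric quoted above, the Matthews correlation coefficient can be computed directly from the four confusion-matrix counts. The following is a minimal sketch, not the authors' evaluation code; the function name and the convention of returning 0.0 when the denominator vanishes are assumptions made for illustration.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Return the MCC for binary predictions given confusion-matrix counts.

    Illustrative helper (not from Deep-HPI-pred). When any marginal is
    zero the coefficient is undefined; returning 0.0 is an assumed convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 90 true positives, 90 true negatives, 10 of each error.
print(matthews_corrcoef(tp=90, tn=90, fp=10, fn=10))
```

Unlike accuracy, the MCC only approaches 1 when both interacting and non-interacting pairs are classified well, which is why it is often reported for interaction datasets with skewed class ratios.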
Affiliation(s)
- Muhammad Tahir ul Qamar
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China
- Fatima Noor
  - Integrative Omics and Molecular Modeling Laboratory, Department of Bioinformatics and Biotechnology, Government College University Faisalabad (GCUF), Faisalabad 38000, Pakistan
- Yi-Xiong Guo
  - National Key Laboratory of Crop Genetic Improvement, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Xi-Tong Zhu
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China
- Ling-Ling Chen
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China
2
Chen Z, Wang R, Guo J, Wang X. The role and future prospects of artificial intelligence algorithms in peptide drug development. Biomed Pharmacother 2024; 175:116709. [PMID: 38713945 DOI: 10.1016/j.biopha.2024.116709] [Received: 03/10/2024] [Revised: 05/01/2024] [Accepted: 05/02/2024] [Indexed: 05/09/2024] Open
Abstract
Peptide medications have become better known in recent years due to their many benefits, including low side effects, high biological activity, specificity, and effectiveness. Over 100 peptide medications have been introduced to the market to treat a variety of illnesses. Most of these peptide medications are developed on the basis of endogenous or natural peptides, which frequently require expensive, time-consuming, and extensive tests to confirm. As artificial intelligence advances quickly, it is now possible to build machine learning or deep learning models that screen a large number of candidate sequences for therapeutic peptides. Therapeutic peptides, such as those with antibacterial or anticancer properties, have been developed through the application of artificial intelligence algorithms. The process of finding and developing peptide drugs is outlined in this review, along with a few related cases that were helped by AI and conventional methods. These resources will open up new avenues for peptide drug development and discovery, helping to meet the pressing needs of clinical patients for disease treatment. Although peptide drugs are a new class of biopharmaceuticals distinct from chemical and small molecule drugs, their clinical purpose and value cannot be ignored. However, traditional peptide drug research and development has a long development cycle and requires high investment; the creation of peptide medications will be substantially hastened by the AI-assisted (AI+) mode, offering a new boost for combating diseases.
Affiliation(s)
- Zhiheng Chen
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China
- Ruoxi Wang
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China
- Junqi Guo
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China
- Xiaogang Wang
  - Guangdong Provincial Key Laboratory of Bone and Joint Degenerative Diseases, The Third Affiliated Hospital of Southern Medical University, Guangzhou, Guangdong 510630, China
3
Aslan A, Ari Yuka S. Therapeutic peptides for coronary artery diseases: in silico methods and current perspectives. Amino Acids 2024; 56:37. [PMID: 38822212 PMCID: PMC11143054 DOI: 10.1007/s00726-024-03397-3] [Received: 01/25/2024] [Accepted: 05/06/2024] [Indexed: 06/02/2024]
Abstract
Many drug formulations containing small active molecules are used for the treatment of coronary artery disease, which affects a significant part of the world's population. However, the inadequate profile of these molecules in terms of therapeutic efficacy has led to the therapeutic use of protein and peptide-based biomolecules with superior properties, such as target-specific affinity and low immunogenicity, in critical diseases. Protein‒protein interactions, as a consequence of advances in molecular techniques with strategies involving the combined use of in silico methods, have enabled the design of therapeutic peptides to reach an advanced dimension. In particular, with the advantages provided by protein/peptide structural modeling, molecular docking for the study of their interactions, molecular dynamics simulations for their interactions under physiological conditions and machine learning techniques that can work in combination with all these, significant progress has been made in approaches to developing therapeutic peptides that can modulate the development and progression of coronary artery diseases. In this scope, this review discusses in silico methods for the development of peptide therapeutics for the treatment of coronary artery disease and strategies for identifying the molecular mechanisms that can be modulated by these designs and provides a comprehensive perspective for future studies.
Affiliation(s)
- Ayca Aslan
  - Department of Bioengineering, Faculty of Chemical and Metallurgical Engineering, Yildiz Technical University, Esenler, Istanbul, Turkey
  - Health Biotechnology Joint Research and Application Center of Excellence, Esenler, Istanbul, Turkey
- Selcen Ari Yuka
  - Department of Bioengineering, Faculty of Chemical and Metallurgical Engineering, Yildiz Technical University, Esenler, Istanbul, Turkey
  - Health Biotechnology Joint Research and Application Center of Excellence, Esenler, Istanbul, Turkey
4
Liang Y, Lv D, Liu K, Yang L, Shu H, Wen L, Lv C, Sun Q, Yin J, Liu H, Xu J, Liu Z, Ding N. MicroProteinDB: A database to provide knowledge on sequences, structures and function of ncRNA-derived microproteins. Comput Biol Med 2024; 177:108660. [PMID: 38820774 DOI: 10.1016/j.compbiomed.2024.108660] [Received: 03/20/2024] [Revised: 05/08/2024] [Accepted: 05/26/2024] [Indexed: 06/02/2024]
Abstract
Omics-based technologies have revolutionized our comprehension of microproteins encoded by ncRNAs, revealing their abundant presence and pivotal roles within complex functional landscapes. Here, we developed MicroProteinDB (http://bio-bigdata.hrbmu.edu.cn/MicroProteinDB), which offers and visualizes extensive knowledge to aid retrieval and analysis of computationally predicted and experimentally validated microproteins originating from various ncRNA types. Employing prediction algorithms grounded in diverse deep learning approaches, MicroProteinDB comprehensively documents the fundamental physicochemical properties, secondary and tertiary structures, interactions with functional proteins, family domains, and inter-species conservation of microproteins. With five major analytical modules, it will serve as a valuable knowledge resource for investigating ncRNA-derived microproteins.
Affiliation(s)
- Yinan Liang
  - The First Affiliated Hospital, Harbin Medical University, Harbin, 150001, China
- Dezhong Lv
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Kefan Liu
  - School of Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, 150081, China
- Liting Yang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Huan Shu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Luan Wen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Chongwen Lv
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Qisen Sun
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Jiaqi Yin
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Hui Liu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Juan Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Zhigang Liu
  - Affiliated Foshan Maternity&Child Healthcare Hospital, Southern Medical University, Guangzhou, 510000, China
- Na Ding
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
5
Kwon JJ, Pan J, Gonzalez G, Hahn WC, Zitnik M. On knowing a gene: A distributional hypothesis of gene function. Cell Syst 2024:S2405-4712(24)00123-6. [PMID: 38810640 DOI: 10.1016/j.cels.2024.04.008] [Received: 04/19/2023] [Revised: 02/25/2024] [Accepted: 04/30/2024] [Indexed: 05/31/2024]
Abstract
As words can have multiple meanings that depend on sentence context, genes can have various functions that depend on the surrounding biological system. This pleiotropic nature of gene function is limited by ontologies, which annotate gene functions without considering biological contexts. We contend that the gene function problem in genetics may be informed by recent technological leaps in natural language processing, in which representations of word semantics can be automatically learned from diverse language contexts. In contrast to efforts to model semantics as "is-a" relationships in the 1990s, modern distributional semantics represents words as vectors in a learned semantic space and fuels current advances in transformer-based models such as large language models and generative pre-trained transformers. A similar shift in thinking of gene functions as distributions over cellular contexts may enable a similar breakthrough in data-driven learning from large biological datasets to inform gene function.
Affiliation(s)
- Jason J Kwon
  - Dana-Farber Cancer Institute and Harvard Medical School, Department of Medical Oncology, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Joshua Pan
  - Dana-Farber Cancer Institute and Harvard Medical School, Department of Medical Oncology, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Guadalupe Gonzalez
  - Department of Computing, Faculty of Engineering, Imperial College, London SW7 2AZ, UK
- William C Hahn
  - Dana-Farber Cancer Institute and Harvard Medical School, Department of Medical Oncology, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Marinka Zitnik
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Harvard Medical School, Department of Biomedical Informatics, Boston, MA 02115, USA; Harvard Data Science Initiative, Harvard University, Cambridge, MA 02138, USA; Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Allston, MA 02134, USA
6
Cao MY, Zainudin S, Daud KM. Protein features fusion using attributed network embedding for predicting protein-protein interaction. BMC Genomics 2024; 25:466. [PMID: 38741045 DOI: 10.1186/s12864-024-10361-8] [Received: 01/10/2024] [Accepted: 04/29/2024] [Indexed: 05/16/2024] Open
Abstract
BACKGROUND: Protein-protein interactions (PPIs) hold significant importance in biology, with precise PPI prediction a pivotal factor in comprehending cellular processes and facilitating drug design. However, experimental determination of PPIs is laborious, time-consuming, and often constrained by technical limitations. METHODS: We introduce a new node representation method based on initial information fusion, called FFANE, which combines PPI networks and protein sequence data to enhance the precision of PPI prediction. A Gaussian kernel similarity matrix is first established from protein structural resemblances. Concurrently, protein sequence similarities are gauged using the Levenshtein distance, enabling the capture of diverse protein attributes. To construct an initial information matrix, these two feature matrices are then merged by weighted fusion, combining structural and sequence details. To learn from the fused features, a Stacked Autoencoder (SAE) is employed for encoding, yielding more representative features. Finally, classification models are trained to predict PPIs using the learned fusion features. RESULTS: In 5-fold cross-validation experiments with an SVM, our proposed method achieved average accuracies of 94.28%, 97.69%, and 84.05% on the Saccharomyces cerevisiae, Homo sapiens, and Helicobacter pylori datasets, respectively. CONCLUSION: Experimental findings across various authentic datasets validate the efficacy and superiority of this fusion feature representation approach, underscoring its potential value in bioinformatics.
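The three ingredients named in the METHODS part (a Gaussian kernel similarity, a Levenshtein-based sequence similarity, and a weighted fusion of the two matrices) can be sketched as follows. This is an illustrative reading of the abstract, not the FFANE implementation: the length normalisation of the edit distance, the kernel width `gamma`, the fusion weight `alpha`, and the choice of feature vectors fed to the kernel are all assumptions.

```python
import math


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def sequence_similarity(a, b):
    """Map edit distance to [0, 1]; this normalisation is an assumption."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def gaussian_kernel(u, v, gamma=1.0):
    """exp(-gamma * ||u - v||^2) for two equal-length feature vectors."""
    squared = sum((x - y) ** 2 for x, y in zip(u, v))
    return math.exp(-gamma * squared)


def weighted_fusion(struct_sim, seq_sim, alpha=0.5):
    """Element-wise alpha * structural + (1 - alpha) * sequence similarity."""
    return [[alpha * s + (1 - alpha) * q for s, q in zip(row_s, row_q)]
            for row_s, row_q in zip(struct_sim, seq_sim)]


# Toy example with two hypothetical proteins.
seqs = ["MKVLA", "MKILA"]
profiles = [[1.0, 0.0], [1.0, 1.0]]  # assumed per-protein feature vectors
seq_sim = [[sequence_similarity(a, b) for b in seqs] for a in seqs]
struct_sim = [[gaussian_kernel(u, v) for v in profiles] for u in profiles]
print(weighted_fusion(struct_sim, seq_sim, alpha=0.5))
```

The fused matrix would then serve as the input to the encoder described in the abstract; that stage is omitted here.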
Affiliation(s)
- Mei-Yuan Cao
  - Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia
- Suhaila Zainudin
  - Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia
- Kauthar Mohd Daud
  - Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia
7
Yin S, Mi X, Shukla D. Leveraging machine learning models for peptide-protein interaction prediction. RSC Chem Biol 2024; 5:401-417. [PMID: 38725911 PMCID: PMC11078210 DOI: 10.1039/d3cb00208j] [Received: 10/27/2023] [Accepted: 02/07/2024] [Indexed: 05/12/2024] Open
Abstract
Peptides play a pivotal role in a wide range of biological activities by participating in up to 40% of protein-protein interactions in cellular processes. They also demonstrate remarkable specificity and efficacy, making them promising candidates for drug development. However, predicting peptide-protein complexes by traditional computational approaches, such as docking and molecular dynamics simulations, remains a challenge due to high computational cost, the flexible nature of peptides, and limited structural information on peptide-protein complexes. In recent years, the surge of available biological data has given rise to an increasing number of machine learning models for predicting peptide-protein interactions. These models offer efficient solutions to the challenges associated with traditional computational approaches. Furthermore, they offer enhanced accuracy, robustness, and interpretability in their predictive outcomes. This review presents a comprehensive overview of machine learning and deep learning models that have emerged in recent years for the prediction of peptide-protein interactions.
Affiliation(s)
- Song Yin
  - Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign Urbana 61801 Illinois USA
- Xuenan Mi
  - Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign Urbana IL 61801 USA
- Diwakar Shukla
  - Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign Urbana 61801 Illinois USA
  - Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign Urbana IL 61801 USA
  - Department of Bioengineering, University of Illinois Urbana-Champaign Urbana IL 61801 USA
8
Gurvich R, Markel G, Tanoli Z, Meirson T. Peptriever: a Bi-Encoder approach for large-scale protein-peptide binding search. Bioinformatics 2024; 40:btae303. [PMID: 38710496 PMCID: PMC11112044 DOI: 10.1093/bioinformatics/btae303] [Received: 07/02/2023] [Revised: 03/31/2024] [Accepted: 05/03/2024] [Indexed: 05/08/2024] Open
Abstract
MOTIVATION: Peptide therapeutics hinge on the precise interaction between a tailored peptide and its designated receptor, while mitigating interactions with alternate receptors is equally indispensable. Existing methods primarily estimate the binding score between protein and peptide pairs. However, for a specific peptide without a corresponding protein, it is challenging to identify the proteins it could bind due to the sheer number of potential candidates. RESULTS: We propose a transformer-based protein embedding scheme in this study that can quickly identify and rank millions of interacting proteins. Furthermore, the proposed approach outperforms existing sequence- and structure-based methods, with a mean AUC-ROC and AUC-PR of 0.73. AVAILABILITY AND IMPLEMENTATION: Training data, scripts, and fine-tuned parameters are available at https://github.com/RoniGurvich/Peptriever. The proposed method is linked with a web application available for customized prediction at https://peptriever.app/.
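The retrieval step of a bi-encoder can be sketched independently of the encoders themselves: once peptides and proteins are embedded into a shared vector space, candidate proteins are ranked by their similarity to the query peptide. The snippet below is a hedged illustration only; the embeddings are made-up numbers, the protein names are hypothetical, and the use of cosine similarity is an assumption, since the abstract does not state the scoring function.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_proteins(peptide_vec, protein_vecs, top_k=3):
    """Return the top_k (name, score) pairs, highest similarity first."""
    scored = [(name, cosine(peptide_vec, vec))
              for name, vec in protein_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical precomputed protein embeddings and one peptide embedding.
protein_vecs = {"P1": [1.0, 0.0], "P2": [0.0, 1.0], "P3": [1.0, 1.0]}
peptide_vec = [1.0, 0.1]
for name, score in rank_proteins(peptide_vec, protein_vecs):
    print(name, round(score, 3))
```

Because protein embeddings can be computed once and stored, only the peptide needs to be encoded at query time, which is what makes searching very large candidate sets practical in this kind of design.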
Affiliation(s)
- Roni Gurvich
  - Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva 49100, Israel
- Gal Markel
  - Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva 49100, Israel
  - Faculty of Medicine, Tel Aviv University, Tel-Aviv 6997801, Israel
  - Samueli Integrative Cancer Pioneering Institute, Rabin Medical Center-Beilinson Hospital, Petah Tikva, Israel
- Ziaurrehman Tanoli
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki 00290, Finland
- Tomer Meirson
  - Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva 49100, Israel
  - Faculty of Medicine, Tel Aviv University, Tel-Aviv 6997801, Israel
  - Samueli Integrative Cancer Pioneering Institute, Rabin Medical Center-Beilinson Hospital, Petah Tikva, Israel
9
Zhang Y, Yu L, Yang M, Han B, Luo J, Jing R. Model fusion for predicting unconventional proteins secreted by exosomes using deep learning. Proteomics 2024:e2300184. [PMID: 38643383 DOI: 10.1002/pmic.202300184] [Received: 04/12/2023] [Revised: 03/25/2024] [Accepted: 03/26/2024] [Indexed: 04/22/2024]
Abstract
Unconventional secretory proteins (USPs) are vital for cell-to-cell communication and are necessary for proper physiological processes. Unlike classical proteins that follow the conventional secretory pathway via the Golgi apparatus, these proteins are released using unconventional pathways. The primary modes of secretion for USPs are exosomes and ectosomes, which originate from the endoplasmic reticulum. Accurate and rapid identification of exosome-mediated secretory proteins is crucial for gaining valuable insights into the regulation of non-classical protein secretion and intercellular communication, as well as for the advancement of novel therapeutic approaches. Although computational methods based on amino acid sequence prediction exist for predicting unconventional proteins secreted by exosomes (UPSEs), they suffer from significant limitations in terms of algorithmic accuracy. In this study, we propose a novel approach to predict UPSEs by combining multiple deep learning models that incorporate both protein sequences and evolutionary information. Our approach utilizes a convolutional neural network (CNN) to extract protein sequence information, while various densely connected neural networks (DNNs) are employed to capture evolutionary conservation patterns. By combining six distinct deep learning models, we have created a superior framework that surpasses previous approaches, achieving an ACC score of 77.46% and an MCC score of 0.5406 on an independent test dataset.
Affiliation(s)
- Yonglin Zhang
  - Department of Clinical Pharmacy and Pharmacy Management, Affiliated Hospital of North Sichuan Medical College, Nanchong, Sichuan, China
- Lezheng Yu
  - School of Chemistry and Materials Science, Guizhou Education University, Guiyang, Guizhou, China
- Ming Yang
  - Department of Clinical Pharmacy and Pharmacy Management, Affiliated Hospital of North Sichuan Medical College, Nanchong, Sichuan, China
- Bin Han
  - GCP Center/Institute of Drug Clinical Trials, Affiliated Hospital of North Sichuan Medical College, Nanchong, China
- Jiesi Luo
  - Basic Medical College, Southwest Medical University, Luzhou, Sichuan, China
- Runyu Jing
  - School of Cyber Science and Engineering, Sichuan University, Chengdu, Sichuan, China
10
Lin L, Li C, Zhang T, Xia C, Bai Q, Jin L, Shen Y. An in silico scheme for optimizing the enzymatic acquisition of natural biologically active peptides based on machine learning and virtual digestion. Anal Chim Acta 2024; 1298:342419. [PMID: 38462343 DOI: 10.1016/j.aca.2024.342419] [Received: 08/20/2023] [Revised: 12/23/2023] [Accepted: 02/26/2024] [Indexed: 03/12/2024]
Abstract
BACKGROUND: As potential natural active substances, natural biologically active peptides (NBAPs) have recently attracted increasing attention. Traditional proteolysis methods for obtaining effective NBAPs are cumbersome, especially since multiple proteases can be used, which hinders the exploration of available NBAPs. Although the development of virtual digestion brings some convenience, the activity of the obtained peptides remains unclear, so efficient access to NBAPs is still not possible. It is necessary to develop an efficient and accurate strategy for acquiring NBAPs. RESULTS: A new in silico scheme named SSA-LSTM-VD, which combines sparrow search algorithm-long short-term memory (SSA-LSTM) deep learning with virtual digestion, is presented to optimize the proteolytic acquisition of NBAPs. SSA-LSTM reached the highest Efficiency value (98.00%) compared with traditional machine learning algorithms and the basic LSTM algorithm. SSA-LSTM was trained to predict the activity of peptides in the virtual digestion results of proteins, obtain the percentage of target active peptides, and select the appropriate protease for the actual experiment. As an application, SSA-LSTM was employed to predict the percentage of neuroprotective peptides in the virtual digestion results of walnut protein, and trypsin was ultimately found to give the highest value (85.29%). The walnut protein was digested by trypsin (WPTrH), and the peptide sequences obtained closely matched the theoretical neuroprotective peptides. More importantly, the neuroprotective effects of WPTrH were demonstrated in nerve damage mouse models. SIGNIFICANCE: The SSA-LSTM-VD scheme proposed in this paper makes the acquisition of NBAPs efficient and accurate by combining deep learning with virtual digestion.
Utilizing the SSA-LSTM-VD based strategy holds promise for discovering and developing peptides with neuroprotective properties or other desired biological activities.
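The virtual digestion step described above can be illustrated with the commonly used simplified trypsin rule: cleave on the C-terminal side of lysine (K) or arginine (R) unless the next residue is proline (P). The sketch below is not the authors' pipeline; the sequence is invented, and `is_active` is a hypothetical stand-in for a trained activity predictor such as the SSA-LSTM model.

```python
def trypsin_digest(sequence):
    """Split a protein sequence with the simplified trypsin rule.

    Cleaves after K or R, except when the following residue is P.
    Missed cleavages are not modelled in this sketch.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def active_fraction(peptides, is_active):
    """Fraction of peptides flagged active by a caller-supplied predictor."""
    if not peptides:
        return 0.0
    return sum(1 for p in peptides if is_active(p)) / len(peptides)


# Toy input; the length-based predicate is a placeholder, not a real model.
fragments = trypsin_digest("AKRPGKLR")
print(fragments)
print(active_fraction(fragments, is_active=lambda p: len(p) == 2))
```

Repeating the same calculation with cleavage rules for other proteases and comparing the resulting fractions mirrors, in spirit, how a protease could be selected before the wet-lab experiment.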
Affiliation(s)
- Like Lin
  - Key Laboratory of Synthetic and Natural Functional Molecule of Ministry of Education, College of Chemistry and Materials Science, National Demonstration Center for Experimental Chemistry Education, Northwest University, Xi'an, Shaanxi, 710127, People's Republic of China
- Cong Li
  - Key Laboratory of Synthetic and Natural Functional Molecule of Ministry of Education, College of Chemistry and Materials Science, National Demonstration Center for Experimental Chemistry Education, Northwest University, Xi'an, Shaanxi, 710127, People's Republic of China
- Tianlong Zhang
  - Key Laboratory of Synthetic and Natural Functional Molecule of Ministry of Education, College of Chemistry and Materials Science, National Demonstration Center for Experimental Chemistry Education, Northwest University, Xi'an, Shaanxi, 710127, People's Republic of China
- Chaoshuang Xia
  - Center for Biomedical Mass Spectrometry, Boston University Chobanian and Avedisian School of Medicine, Boston, MA, 02118, United States
- Qiuhong Bai
  - Key Laboratory of Synthetic and Natural Functional Molecule of Ministry of Education, College of Chemistry and Materials Science, National Demonstration Center for Experimental Chemistry Education, Northwest University, Xi'an, Shaanxi, 710127, People's Republic of China
- Lihua Jin
  - Key Laboratory of Synthetic and Natural Functional Molecule of Ministry of Education, College of Chemistry and Materials Science, National Demonstration Center for Experimental Chemistry Education, Northwest University, Xi'an, Shaanxi, 710127, People's Republic of China
- Yehua Shen
  - Key Laboratory of Synthetic and Natural Functional Molecule of Ministry of Education, College of Chemistry and Materials Science, National Demonstration Center for Experimental Chemistry Education, Northwest University, Xi'an, Shaanxi, 710127, People's Republic of China
11
Karamifard F, Mazaheri M, Dadbinpour A. Abatement of the binding of human hexokinase II enzyme monomers by in-silico method with the design of inhibitory peptides. In Silico Pharmacol 2024; 12:30. [PMID: 38617709 PMCID: PMC11009198 DOI: 10.1007/s40203-024-00201-8] [Received: 07/23/2023] [Accepted: 03/05/2024] [Indexed: 04/16/2024] Open
Abstract
The hexokinase II enzyme is bound to the VDAC1 channel in the form of a dimer and prevents the release of cell death factors from mitochondria to the cytoplasm. Studies have shown that blocking the binding of the hexokinase II enzyme to VDAC1 leads to the initiation of apoptosis in cancer cells. No peptide has been designed so far to inhibit hexokinase II. The aim of this study was to inhibit the dimerization of the enzyme subunits in order to inhibit the formation of the VDAC1-hexokinase II complex. In this study, molecular dynamics simulations of the enzyme in monomer and dimer states were analyzed in terms of RMSF, RMSD and radius of gyration. Variable-length peptides were then extracted and designed from the interacting segments of the enzyme monomers. Using molecular dynamics simulation, the stability of the peptides was determined in terms of RMSD. Molecular docking was used to investigate the interaction between the designed peptides. Finally, the inhibitory effect of the peptides on subunit association was measured using the dynamic light scattering (DLS) technique. Our results showed that the designed peptides, which mimic common amino acids in dimerization, interrupt the bona fide form of the enzyme subunits. The result of this study provides a new way to disrupt the assembly process and thereby decrease the function of hexokinase II. Supplementary Information: The online version contains supplementary material available at 10.1007/s40203-024-00201-8.
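Two of the stability measures mentioned above, RMSD and radius of gyration, reduce to short formulas over atomic coordinates. The following is a simplified sketch rather than the analysis used in the study: it assumes the two frames are already superimposed (no fitting step is performed) and treats all atoms as having equal mass; the coordinates are invented.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-aligned coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def radius_of_gyration(coords):
    """Unweighted radius of gyration (equal atomic masses assumed)."""
    n = len(coords)
    if n == 0:
        raise ValueError("coords must be non-empty")
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    total = sum(sum((c[k] - center[k]) ** 2 for k in range(3)) for c in coords)
    return math.sqrt(total / n)


# Hypothetical three-atom frames, in arbitrary length units.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
frame = [(0.0, 0.1, 0.0), (1.0, -0.1, 0.0), (2.0, 0.1, 0.0)]
print(rmsd(reference, frame))
print(radius_of_gyration(frame))
```

In practice a trajectory analysis package would handle superposition and mass weighting; the formulas here only show what the reported quantities measure.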
Collapse
Affiliation(s)
- Faranak Karamifard
- Department of Genetics, Faculty of Medicine, Shahid Sadoughi University of Medical Sciences of Yazd, Yazd, Iran
| | - Mahta Mazaheri
- Department of Medical Genetics, Faculty of Medicine, Mother and Newborn, Health Research Center, Shahid Sadoughi University of Medical Sciences, Yazd, Iran
| | - Ali Dadbinpour
- Genetic and Environmental Adventures, Department of Genetics, Medical School, School of Abarkouh Paramedicin, Faculty of Medicine, Shahid Sadoughi University of Medical Science, Yazd, Iran
| |
|
12
|
Dekker PM, Boeren S, Saccenti E, Hettinga KA. Network analysis of the proteome and peptidome sheds light on human milk as a biological system. Sci Rep 2024; 14:7569. [PMID: 38555284 PMCID: PMC10981717 DOI: 10.1038/s41598-024-58127-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Accepted: 03/26/2024] [Indexed: 04/02/2024] Open
Abstract
Proteins and peptides found in human milk have bioactive potential to benefit the newborn and support healthy development. Research has been carried out on the health benefits of proteins and peptides, but many questions still need to be answered about the nature of these components, how they are formed, and how they end up in the milk. This study explored and elucidated the complexity of the human milk proteome and peptidome. Proteins and peptides were analyzed with non-targeted nanoLC-Orbitrap-MS/MS in a selection of 297 milk samples from the CHILD Cohort Study. Protein and peptide abundances were determined, and a network was inferred using Gaussian graphical modeling (GGM), allowing an investigation of direct associations. This study showed that signatures of (1) specific mechanisms of transport of different groups of proteins, (2) proteolytic degradation by proteases and aminopeptidases, and (3) coagulation and complement activation are present in human milk. These results show the value of an integrated approach in evaluating large-scale omics data sets and provide valuable information for studies that aim to associate protein or peptide profiles from biofluids such as milk with specific physiological characteristics.
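In a Gaussian graphical model, an edge between two variables corresponds to a nonzero partial correlation, that is, an association that remains after conditioning on the other variables. The sketch below illustrates this idea for only three variables using the first-order partial correlation formula. It is a toy illustration, not the study's workflow, which infers a network over many proteins and peptides; the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after conditioning on a single variable z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# If x and y are correlated only through z (r_xy = r_xz * r_yz),
# the partial correlation vanishes and no direct edge x-y is drawn.
indirect = partial_correlation(0.25, 0.5, 0.5)
```

With many variables the same quantity is read off the inverse covariance (precision) matrix, usually with some form of regularization.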
Affiliation(s)
- Pieter M Dekker
- Food Quality and Design Group, Wageningen University and Research, Wageningen, 6708 WE, The Netherlands
- Laboratory of Biochemistry, Wageningen University and Research, Wageningen, 6708 WE, The Netherlands
| | - Sjef Boeren
- Laboratory of Biochemistry, Wageningen University and Research, Wageningen, 6708 WE, The Netherlands
| | - Edoardo Saccenti
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, 6708 WE, The Netherlands
| | - Kasper A Hettinga
- Food Quality and Design Group, Wageningen University and Research, Wageningen, 6708 WE, The Netherlands.
| |
|
13
|
Jia P, Zhang F, Wu C, Li M. A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond. Brief Bioinform 2024; 25:bbae162. [PMID: 38739759 PMCID: PMC11089422 DOI: 10.1093/bib/bbae162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 02/17/2024] [Accepted: 03/31/2024] [Indexed: 05/16/2024] Open
Abstract
Proteins interact with diverse ligands to perform a large number of biological functions, such as gene expression and signal transduction. Accurate identification of these protein-ligand interactions is crucial to the understanding of molecular mechanisms and the development of new drugs. However, traditional biological experiments are time-consuming and expensive. With the development of high-throughput technologies, an increasing amount of protein data is available. In the past decades, many computational methods have been developed to predict protein-ligand interactions. Here, we review a comprehensive set of over 160 protein-ligand interaction predictors, which cover protein-protein, protein-nucleic acid, protein-peptide and protein-other ligands (nucleotide, heme, ion) interactions. We have carried out a comprehensive analysis of the above four types of predictors from several significant perspectives, including their inputs, feature profiles, models, availability, etc. The current methods primarily rely on protein sequences, especially utilizing evolutionary information. The significant improvement in predictions is attributed to deep learning methods. Additionally, sequence-based pretrained models and structure-based approaches are emerging as new trends.
Affiliation(s)
- Pengzhen Jia
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
| | - Fuhao Zhang
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
- College of Information Engineering, Northwest A&F University, No. 3 Taicheng Road, Yangling, Shaanxi 712100, China
| | - Chaojin Wu
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
| | - Min Li
- School of Computer Science and Engineering, Central South University, 932 Lushan Road(S), Changsha 410083, China
| |
|
14
|
Emami N, Ferdousi R. HormoNet: a deep learning approach for hormone-drug interaction prediction. BMC Bioinformatics 2024; 25:87. [PMID: 38418979 PMCID: PMC10903040 DOI: 10.1186/s12859-024-05708-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Accepted: 02/16/2024] [Indexed: 03/02/2024] Open
Abstract
Several lines of experimental evidence have shown that human endogenous hormones can interact with drugs in many ways and affect drug efficacy. Hormone-drug interactions (HDIs) matter for drug treatment and precision medicine; therefore, it is essential to understand hormone-drug associations. Here, we present HormoNet to predict HDI pairs and their risk level by integrating features derived from hormone and drug target proteins. To the best of our knowledge, this is one of the first attempts to employ a deep learning approach for HDI prediction. Amino acid composition and pseudo amino acid composition were applied to represent target information using 30 physicochemical and conformational properties of the proteins. To handle the imbalance problem in the data, we applied the synthetic minority over-sampling technique (SMOTE). Additionally, we constructed novel datasets for HDI prediction and for the risk level of the interactions. HormoNet achieved high performance on our constructed hormone-drug benchmark datasets. The results provide insights into the relationship between a hormone and a drug, and indicate the potential benefit of reducing the risk levels of interactions when designing more effective therapies for patients in drug treatment. Our benchmark datasets and the source code for HormoNet are available at https://github.com/EmamiNeda/HormoNet .
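Amino acid composition, one of the two protein encodings mentioned in this abstract, is the fraction of each of the 20 standard residues in a sequence. The following is a minimal sketch under stated assumptions, not code from the HormoNet repository: the function name is hypothetical, and non-standard residue letters are simply ignored.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue frequencies.

    Assumption: letters outside the 20 standard residues (e.g. 'X')
    are skipped and do not count toward the sequence length.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    valid = 0
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
            valid += 1
    if valid == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / valid for aa in AMINO_ACIDS]


features = amino_acid_composition("AAC")  # A -> 2/3, C -> 1/3, all others 0
```

Pseudo amino acid composition extends this vector with sequence-order terms derived from physicochemical properties, which this sketch does not attempt.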
Affiliation(s)
- Neda Emami
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran.
| | - Reza Ferdousi
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
| |
|
15
|
Rockett T, Almahyawi M, Ghimire ML, Jonnalagadda A, Tagliaferro V, Seashols-Williams SJ, Bertino MF, Caputo GA, Reiner JE. Cluster-Enhanced Nanopore Sensing of Ovarian Cancer Marker Peptides in Urine. ACS Sens 2024; 9:860-869. [PMID: 38286995 PMCID: PMC10897939 DOI: 10.1021/acssensors.3c02207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 12/20/2023] [Accepted: 01/09/2024] [Indexed: 01/31/2024]
Abstract
The development of novel methodologies that can detect biomarkers from cancer or other diseases is both a challenge and a need for clinical applications. This partly motivates efforts related to nanopore-based peptide sensing. Recent work has focused on the use of gold nanoparticles for selective detection of cysteine-containing peptides. Specifically, tiopronin-capped gold nanoparticles, trapped in the cis-side of a wild-type α-hemolysin nanopore, provide a suitable anchor for the attachment of cysteine-containing peptides. It was recently shown that the attachment of these peptides onto a nanoparticle yields unique current signatures that can be used to identify the peptide. In this article, we apply this technique to the detection of ovarian cancer marker peptides ranging in length from 8 to 23 amino acid residues. It is found that sequence variability complicates the detection of low-molecular-weight peptides (<10 amino acid residues), but higher-molecular-weight peptides yield complex, high-frequency current fluctuations. These fluctuations are characterized with chi-squared and autocorrelation analyses that yield significantly improved selectivity when compared to traditional open-pore analysis. We demonstrate that the technique is capable of detecting the only two cysteine-containing peptides from LRG-1, an emerging protein biomarker, that are uniquely present in the urine of ovarian cancer patients. We further demonstrate the detection of one of these LRG-1 peptides spiked into a sample of human female urine.
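The autocorrelation analysis mentioned in this abstract can be illustrated with a normalized autocorrelation at a given lag, which measures how strongly a current trace resembles a shifted copy of itself. This is a generic sketch rather than the authors' analysis code; the function name and the toy trace are hypothetical.

```python
def autocorrelation(signal, lag):
    """Normalized autocorrelation of a sampled signal at an integer lag.

    The sum over overlapping samples is divided by the total (lag-0)
    sum of squared deviations, so the value at lag 0 is exactly 1.
    """
    n = len(signal)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(signal)")
    mean = sum(signal) / n
    variance_sum = sum((x - mean) ** 2 for x in signal)
    if variance_sum == 0:
        raise ValueError("signal is constant; autocorrelation is undefined")
    covariance_sum = sum((signal[i] - mean) * (signal[i + lag] - mean)
                         for i in range(n - lag))
    return covariance_sum / variance_sum


# Toy trace that flips sign every sample: strongly anti-correlated at lag 1.
trace = [1.0, -1.0, 1.0, -1.0]
lag_one = autocorrelation(trace, 1)
```

High-frequency fluctuations of the kind described for the longer peptides show up as structure at short lags, which a simple mean blockade depth would not capture.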
Affiliation(s)
- Thomas W. Rockett
- Department of Physics, Virginia Commonwealth University, Richmond, Virginia 23284, United States
| | - Mohammed Almahyawi
- Department of Physics, Virginia Commonwealth University, Richmond, Virginia 23284, United States
- King Fahd Medical Research Center, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| | - Madhav L. Ghimire
- Department of Physics, Virginia Commonwealth University, Richmond, Virginia 23284, United States
| | - Aashna Jonnalagadda
- Department of Chemistry and Biochemistry, Rowan University, Glassboro, New Jersey 08028, United States
| | - Victoria Tagliaferro
- Department of Chemistry and Biochemistry, Rowan University, Glassboro, New Jersey 08028, United States
| | - Sarah J. Seashols-Williams
- Department of Forensic Sciences, Virginia Commonwealth University, Richmond, Virginia 23284, United States
| | - Massimo F. Bertino
- Department of Physics, Virginia Commonwealth University, Richmond, Virginia 23284, United States
| | - Gregory A. Caputo
- Department of Chemistry and Biochemistry, Rowan University, Glassboro, New Jersey 08028, United States
| | - Joseph E. Reiner
- Department of Physics, Virginia Commonwealth University, Richmond, Virginia 23284, United States
| |
|
16
|
Wang Z, Brand R, Adolf-Bryfogle J, Grewal J, Qi Y, Combs SA, Golovach N, Alford R, Rangwala H, Clark PM. EGGNet, a Generalizable Geometric Deep Learning Framework for Protein Complex Pose Scoring. ACS Omega 2024; 9:7471-7479. [PMID: 38405499 PMCID: PMC10882658 DOI: 10.1021/acsomega.3c04889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 01/19/2024] [Accepted: 01/23/2024] [Indexed: 02/27/2024]
Abstract
Computational prediction of molecule-protein interactions has been key for developing new molecules to interact with a target protein for therapeutics development. Previous work includes two independent streams of approaches: (1) predicting protein-protein interactions (PPIs) between naturally occurring proteins and (2) predicting binding affinities between proteins and small-molecule ligands [also known as drug-target interaction (DTI)]. Studying the two problems in isolation has limited the ability of these computational models to generalize across the PPI and DTI tasks, both of which ultimately involve noncovalent interactions with a protein target. In this work, we developed Equivariant Graph of Graphs neural Network (EGGNet), a geometric deep learning (GDL) framework, for molecule-protein binding predictions that can handle three types of molecules for interacting with a target protein: (1) small molecules, (2) synthetic peptides, and (3) natural proteins. EGGNet leverages a graph of graphs (GoG) representation constructed from the molecular structures at atomic resolution and utilizes a multiresolution equivariant graph neural network to learn from such representations. In addition, EGGNet leverages the underlying biophysics and makes use of both atom- and residue-level interactions, which improve EGGNet's ability to rank candidate poses from blind docking. EGGNet achieves competitive performance on both a public protein-small-molecule binding affinity prediction task (80.2% top 1 success rate on CASF-2016) and a synthetic protein interface prediction task (88.4% area under the precision-recall curve). We envision that the proposed GDL framework can generalize to many other protein interaction prediction problems, such as binding site prediction and molecular docking, helping accelerate protein engineering and structure-based drug development.
Affiliation(s)
- Zichen Wang
- Amazon Web Services, Amazon, Seattle, Washington 98109-5210, United States
| | - Ryan Brand
- Amazon Web Services, Amazon, Seattle, Washington 98109-5210, United States
| | - Jared Adolf-Bryfogle
- Janssen Biotherapeutics, Janssen Pharmaceutical Companies of Johnson & Johnson, Spring House, Titusville, New Jersey 08560-1504, United States
| | - Jasleen Grewal
- Amazon Web Services, Amazon, Seattle, Washington 98109-5210, United States
| | - Yanjun Qi
- Amazon Web Services, Amazon, Seattle, Washington 98109-5210, United States
| | - Steven A. Combs
- Janssen Biotherapeutics, Janssen Pharmaceutical Companies of Johnson & Johnson, Spring House, Titusville, New Jersey 08560-1504, United States
| | - Nataliya Golovach
- Janssen Biotherapeutics, Janssen Pharmaceutical Companies of Johnson & Johnson, Spring House, Titusville, New Jersey 08560-1504, United States
| | - Rebecca Alford
- Janssen Biotherapeutics, Janssen Pharmaceutical Companies of Johnson & Johnson, Spring House, Titusville, New Jersey 08560-1504, United States
| | - Huzefa Rangwala
- Amazon Web Services, Amazon, Seattle, Washington 98109-5210, United States
| | - Peter M. Clark
- Janssen Biotherapeutics, Janssen Pharmaceutical Companies of Johnson & Johnson, Spring House, Titusville, New Jersey 08560-1504, United States
| |
|
17
|
Pan X, Li Y, Huang P, Staecker H, He M. Extracellular vesicles for developing targeted hearing loss therapy. J Control Release 2024; 366:460-478. [PMID: 38182057 DOI: 10.1016/j.jconrel.2023.12.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 12/19/2023] [Accepted: 12/28/2023] [Indexed: 01/07/2024]
Abstract
Substantial efforts have been made toward local administration of small molecules or biologics for treating hearing loss caused by trauma, genetic mutations, or drug ototoxicity. Recently, extracellular vesicles (EVs) naturally secreted from cells have drawn increasing attention for attenuating hearing impairment in both preclinical and clinical studies. An emerging field that uses diverse bioengineering technologies to develop EVs as bioderived therapeutic materials, along with artificial intelligence (AI)-based targeting toolkits, sheds light on the unique properties of EVs specific to inner ear delivery. This review covers this research field, from the fundamentals of the hearing-protective functions of EVs to biotechnology advances and the potential clinical translation of functionalized EVs. Specifically, advances in assessing targeting ligands using AI algorithms are systematically discussed. The overall translational potential of EVs is reviewed in the context of the auditory sensing system for developing next-generation gene therapy.
Affiliation(s)
- Xiaoshu Pan
- Department of Pharmaceutics, College of Pharmacy, University of Florida, Gainesville, Florida 32610, United States
| | - Yanjun Li
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, Florida 32610, United States
| | - Peixin Huang
- Department of Otolaryngology, Head and Neck Surgery, University of Kansas School of Medicine, Kansas City, Kansas 66160, United States
| | - Hinrich Staecker
- Department of Otolaryngology, Head and Neck Surgery, University of Kansas School of Medicine, Kansas City, Kansas 66160, United States.
| | - Mei He
- Department of Pharmaceutics, College of Pharmacy, University of Florida, Gainesville, Florida 32610, United States.
| |
|
18
|
Chen S, Yan K, Liu B. PDB-BRE: A ligand-protein interaction binding residue extractor based on Protein Data Bank. Proteins 2024; 92:145-153. [PMID: 37750380 DOI: 10.1002/prot.26596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2023] [Revised: 08/13/2023] [Accepted: 09/11/2023] [Indexed: 09/27/2023]
Abstract
Proteins typically exert their biological functions by interacting with other biomolecules or ligands. The study of ligand-protein interactions is crucial in elucidating the biological mechanisms of proteins. Most existing studies have focused on analyzing ligand-protein interactions while ignoring the additional cases of inserted and modified residues. In addition, existing resources often support only a single ligand type and cannot obtain satisfactory results when analyzing novel complexes. It is therefore important to develop a general analytical tool that fully extracts the binding residues of ligand-protein interactions in complexes. In this study, we propose a ligand-protein interaction binding residue extractor (PDB-BRE), which can be used to automatically extract interacting ligand- or protein-binding residues from complex three-dimensional (3D) structures based on the RCSB Protein Data Bank (RCSB PDB). PDB-BRE offers a notable advantage in its comprehensive support for six distinct types of ligands: proteins, peptides, DNA, RNA, mixed DNA and RNA entities, and non-polymeric entities. Moreover, it takes into account inserted and modified residues within complexes. Compared to other state-of-the-art methods, PDB-BRE is more suitable for massively parallel batch analysis and can be directly applied to downstream tasks, such as predicting binding residues of novel complexes. PDB-BRE is freely available at http://bliulab.net/PDB-BRE.
Affiliation(s)
- Shutao Chen
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
| | - Ke Yan
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
| | - Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China
| |
|
19
|
Cong H, Liu H, Cao Y, Liang C, Chen Y. Protein-protein interaction site prediction by model ensembling with hybrid feature and self-attention. BMC Bioinformatics 2023; 24:456. [PMID: 38053020 DOI: 10.1186/s12859-023-05592-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2022] [Accepted: 11/30/2023] [Indexed: 12/07/2023] Open
Abstract
BACKGROUND Protein-protein interactions (PPIs) are crucial in various biological functions and cellular processes. Thus, many computational approaches have been proposed to predict PPI sites. Although significant progress has been made, these methods still have limitations in encoding the characteristics of each amino acid in a sequence. Many feature extraction methods rely on the sliding window technique, which simply merges the features of all residues in the window into one vector. The importance of some key residues may be weakened in the feature vector, leading to poor performance. RESULTS We propose a novel sequence-based method for PPI site prediction. The new network model, PPINet, contains multiple feature processing paths. For a residue, PPINet extracts the features of the target residue and of its context separately. These two types of features are processed by two paths in the network and combined to form a protein representation in which the two types of features are of relatively equal importance. Model ensembling is applied to make use of more features: the base models are trained with different features and then ensembled via stacking. In addition, a data balancing strategy is presented, by which our model achieves a significant improvement on highly unbalanced data. CONCLUSION The proposed method is evaluated on a fused dataset constructed from Dset_186, Dset_72, and PDBset_164, as well as on the public Dset_448 dataset. It outperforms current state-of-the-art methods. In the most important metrics, AUPRC and recall, it surpasses the second-best program on the latter dataset by 6.9% and 4.7%, respectively. We also demonstrate that the improvement is essentially due to the ensemble model and, especially, the hybrid feature. We share our code for reproducibility and future research at https://github.com/CandiceCong/StackingPPINet .
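The contrast drawn in this abstract, between a sliding window that merges all residues into one vector and a design that keeps the target residue apart from its context, can be sketched as follows. This is an illustration only, not code from the StackingPPINet repository; the function name, the zero padding at sequence ends, and the toy feature values are assumptions.

```python
def window_features(per_residue, index, half_width, pad=0.0):
    """Split a sliding window into (target, context) feature lists.

    per_residue: list of equally long feature lists, one per residue.
    index:       position of the target residue.
    half_width:  number of neighbours taken on each side.
    Positions that fall outside the sequence are filled with `pad`.
    """
    n = len(per_residue)
    dim = len(per_residue[0])
    target = list(per_residue[index])
    context = []
    for j in range(index - half_width, index + half_width + 1):
        if j == index:
            continue  # the target residue is kept out of the context
        if 0 <= j < n:
            context.extend(per_residue[j])
        else:
            context.extend([pad] * dim)
    return target, context


# Toy sequence of three residues with one feature each.
feats = [[1.0], [2.0], [3.0]]
target, context = window_features(feats, 0, 1)
```

Feeding `target` and `context` to separate network paths keeps the target residue's features from being diluted by its neighbours, which is the limitation of plain concatenation that the abstract points out.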
Affiliation(s)
- Hanhan Cong
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan, China
| | - Hong Liu
- School of Information Science and Engineering, Shandong Normal University, Jinan, China.
- Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan, China.
| | - Yi Cao
- School of Information Science and Engineering, University of Jinan, Jinan, China
- Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan, China
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Yuehui Chen
- School of Information Science and Engineering, University of Jinan, Jinan, China
- Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan, China
| |
|
20
|
Zhang Z, Verburgt J, Kagaya Y, Christoffer C, Kihara D. Improved Peptide Docking with Privileged Knowledge Distillation using Deep Learning. bioRxiv 2023:2023.12.01.569671. [PMID: 38106114 PMCID: PMC10723353 DOI: 10.1101/2023.12.01.569671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Protein-peptide interactions play a key role in biological processes. Understanding the interactions that occur within a receptor-peptide complex can help in discovering and altering their biological functions. Various computational methods for modeling the structures of receptor-peptide complexes have been developed. Recently, accurate structure prediction enabled by deep learning methods has significantly advanced the field of structural biology. AlphaFold (AF) is among the top-performing structure prediction methods and has highly accurate structure modeling performance on single-chain targets. Shortly after the release of AlphaFold, AlphaFold-Multimer (AFM) was developed in a similar fashion as AF for prediction of protein complex structures. AFM has achieved competitive performance in modeling protein-peptide interactions compared to previous computational methods; however, still further improvement is needed. Here, we present DistPepFold, which improves protein-peptide complex docking using an AFM-based architecture through a privileged knowledge distillation approach. DistPepFold leverages a teacher model that uses native interaction information during training and transfers its knowledge to a student model through a teacher-student distillation process. We evaluated DistPepFold's docking performance on two protein-peptide complex datasets and showed that DistPepFold outperforms AFM. Furthermore, we demonstrate that the student model was able to learn from the teacher model to make structural improvements based on AFM predictions.
Affiliation(s)
- Zicong Zhang
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
| | - Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
| | - Yuki Kagaya
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
| | - Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
| |
|
21
|
Xu H, Jia Z, Liu F, Li J, Huang Y, Jiang Y, Pu P, Shang T, Tang P, Zhou Y, Yang Y, Su J, Liu J. Biomarkers and experimental models for cancer immunology investigation. MedComm (Beijing) 2023; 4:e437. [PMID: 38045830 PMCID: PMC10693314 DOI: 10.1002/mco2.437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 11/01/2023] [Accepted: 11/10/2023] [Indexed: 12/05/2023] Open
Abstract
The rapid advancement of tumor immunotherapies poses challenges for the tools used in cancer immunology research, highlighting the need for highly effective biomarkers and reproducible experimental models. Current immunotherapy biomarkers encompass surface protein markers such as PD-L1, genetic features such as microsatellite instability, tumor-infiltrating lymphocytes, and biomarkers in liquid biopsy such as circulating tumor DNAs. Experimental models, ranging from 3D in vitro cultures (spheroids, submerged models, air-liquid interface models, organ-on-a-chips) to advanced 3D bioprinting techniques, have emerged as valuable platforms for cancer immunology investigations and immunotherapy biomarker research. By preserving native immune components or coculturing with exogenous immune cells, these models replicate the tumor microenvironment in vitro. Animal models like syngeneic models, genetically engineered models, and patient-derived xenografts provide opportunities to study in vivo tumor-immune interactions. Humanized animal models further enable the simulation of the human-specific tumor microenvironment. Here, we provide a comprehensive overview of the advantages, limitations, and prospects of different biomarkers and experimental models, specifically focusing on the role of biomarkers in predicting immunotherapy outcomes and the ability of experimental models to replicate the tumor microenvironment. By integrating cutting-edge biomarkers and experimental models, this review serves as a valuable resource for accessing the forefront of cancer immunology investigation.
Affiliation(s)
- Hengyi Xu
- State Key Laboratory of Molecular Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Eight-year MD Program, School of Clinical Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Ziqi Jia
- Department of Breast Surgical Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Fengshuo Liu
- Eight-year MD Program, School of Clinical Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Jiayi Li
- Eight-year MD Program, School of Clinical Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Department of Breast Surgical Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Yansong Huang
- Eight-year MD Program, School of Clinical Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Department of Breast Surgical Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Yiwen Jiang
- Eight-year MD Program, School of Clinical Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Pengming Pu
- Eight-year MD Program, School of Clinical Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Tongxuan Shang
- Eight-year MD Program, School of Clinical Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Pengrui Tang
- Eight-year MD Program, School of Clinical Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Yongxin Zhou
- Eight-year MD Program, School of Clinical Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Yufan Yang
- School of Medicine, Tsinghua University, Beijing, China
| | - Jianzhong Su
- Oujiang Laboratory, Zhejiang Lab for Regenerative Medicine, Vision, and Brain Health, Wenzhou, Zhejiang, China
| | - Jiaqi Liu
- State Key Laboratory of Molecular Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Department of Breast Surgical Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| |
|
22
|
Feng H, Wang F, Li N, Xu Q, Zheng G, Sun X, Hu M, Li X, Xing G, Zhang G. Use of tree-based machine learning methods to screen affinitive peptides based on docking data. Mol Inform 2023; 42:e202300143. [PMID: 37696773 DOI: 10.1002/minf.202300143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 09/03/2023] [Accepted: 09/11/2023] [Indexed: 09/13/2023]
Abstract
Screening peptides with good affinity is an important step in peptide-drug discovery. Recent advances in computer and data science have made machine learning a useful tool for accurate screening of affinitive peptides. In the current study, four tree-based algorithms, namely classification and regression trees (CART), C5.0 decision tree (C50), bagged CART (BAG) and random forest (RF), were employed to explore the relationship between experimental peptide affinities and virtual docking data, and the performance of the models was compared in parallel. All four algorithms performed better on the dataset pre-processed by scaling, centering and PCA than on the other pre-processed datasets. After model rebuilding and hyperparameter optimization, the optimal C50 model (C50O) showed the best performance in terms of Accuracy, Kappa, Sensitivity, Specificity, F1, MCC and AUC when validated on test data and on an unknown PEDV dataset (Accuracy = 80.4%). BAG and RFO (the optimal RF), the two best models during training, did not perform as expected in the test and unknown-dataset validations. Furthermore, the high correlation of the RFO and BAG predictions with those of C50O implied high stability and robustness of their predictions. In contrast, despite its good performance on the unknown dataset, the poor performance of CARTO in test-data validation and correlation analysis indicated that it could not be used for future data prediction. To accurately evaluate peptide affinity, the current study presents, for the first time, a tree-model comparison on affinitive peptide prediction using virtual docking data, which would expand the application of machine learning algorithms in studying PepPIs and benefit the development of peptide therapeutics.
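Two of the evaluation metrics listed in this abstract, Accuracy and MCC, follow directly from the four cells of a binary confusion matrix. The sketch below is a generic illustration, not the study's evaluation code; the function names are hypothetical, and returning 0.0 for a zero denominator in MCC is a common convention assumed here.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Assumption: 0.0 is returned when any marginal total is zero.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# A classifier that is right half the time on balanced data carries
# no information: accuracy is 0.5 but MCC is 0.
chance_level = matthews_corrcoef(tp=1, tn=1, fp=1, fn=1)
```

Unlike accuracy, MCC uses all four cells, so it stays informative when binders and non-binders are unevenly represented.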
Affiliation(s)
- Hua Feng
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Fangyu Wang
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Ning Li
- College of Food Science and Technology, Henan Agricultural University, Zhengzhou, China
| | - Qian Xu
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Guanming Zheng
- Public Health and Preventive Medicine Teaching and Research Center, Henan University of Chinese Medicine, Zhengzhou, Henan, China
| | - Xuefeng Sun
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Man Hu
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Xuewu Li
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Guangxu Xing
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Gaiping Zhang
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
- Longhu Modern Immunology Laboratory, Zhengzhou, China
- School of Advanced Agricultural Sciences, Peking University, Beijing, China
- Jiangsu Co-Innovation Center for the Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou, Jiangsu, China
| |
|
23
|
Wang M, Li W, Yu X, Luo Y, Han K, Wang C, Jin Q. AffinityVAE: A multi-objective model for protein-ligand affinity prediction and drug design. Comput Biol Chem 2023; 107:107971. [PMID: 37852036 DOI: 10.1016/j.compbiolchem.2023.107971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 09/23/2023] [Accepted: 10/08/2023] [Indexed: 10/20/2023]
Abstract
In the prediction of protein-ligand affinity, traditional methods require a large amount of computing resources and have certain limitations in predicting and simulating structural changes. Although data-driven deep learning approaches can yield favorable outcomes, they lack interpretability. Some methods may require additional structural information or domain knowledge to support the interpretation, which may limit their applicability. This paper proposes an affinity variational autoencoder (AffinityVAE) using interaction feature mapping and a variational autoencoder, which consists of a multi-objective model capable of end-to-end affinity prediction and drug discovery. In this study, the limitations of affinity prediction in terms of interpretability are tackled by proposing the concept of a protein-ligand interaction feature map. This increases the diversity and quantity of protein-ligand binding data by designing an adaptive autoencoder of target chemical properties to generate new ligands similar to known ligands and adding them to the original training set. AffinityVAE is then retrained using this extended training set to further validate the protein-ligand binding affinity prediction. Comparisons were conducted between AffinityVAE and recent methods to demonstrate the high efficiency of the proposed model. The experimental results show that AffinityVAE has very high prediction performance and the potential to enhance the diversity and amount of protein-ligand binding data, which promotes drug development.
Affiliation(s)
- Mengying Wang
- School of Computer Engineering and Science, Shanghai University, Shanghai, China.
| | - Weimin Li
- School of Computer Engineering and Science, Shanghai University, Shanghai, China.
| | - Xiao Yu
- School of Computer Engineering and Science, Shanghai University, Shanghai, China
| | - Yin Luo
- School of Life Sciences, East China Normal University, China
| | - Ke Han
- Medical and Health Center, Liaocheng People's Hospital, LiaoCheng, China.
| | - Can Wang
- School of Information and Communication Technology, Griffith University, Australia
| | - Qun Jin
- Networked Information System Laboratory, Waseda University, Tokyo, Japan
| |
|
24
|
Ali M, Park IH, Kim J, Kim G, Oh J, You JS, Kim J, Shin JS, Yoon SS. How Deep Learning in Antiviral Molecular Profiling Identified Anti-SARS-CoV-2 Inhibitors. Biomedicines 2023; 11:3134. [PMID: 38137356 PMCID: PMC10740425 DOI: 10.3390/biomedicines11123134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 11/15/2023] [Accepted: 11/22/2023] [Indexed: 12/24/2023] Open
Abstract
The integration of artificial intelligence (AI) into drug discovery has markedly advanced the search for effective therapeutics. In our study, we employed a comprehensive computational-experimental approach to identify potential anti-SARS-CoV-2 compounds. We developed a predictive model to assess the activities of compounds based on their structural features. This model screened a library of approximately 700,000 compounds, culminating in the selection of the top 100 candidates for experimental validation. In vitro assays on human intestinal epithelial cells (Caco-2) revealed that 19 of these compounds exhibited inhibitory activity. Notably, eight compounds demonstrated dose-dependent activity in Vero cell lines, with half-maximal effective concentration (EC50) values ranging from 1 μM to 7 μM. Furthermore, we utilized a clustering approach to pinpoint potential nucleoside analog inhibitors, leading to the discovery of two promising candidates: azathioprine and its metabolite, thioinosinic acid. Both compounds showed in vitro activity against SARS-CoV-2, with thioinosinic acid also significantly reducing viral loads in mouse lungs. These findings underscore the utility of AI in accelerating drug discovery processes.
Affiliation(s)
- Mohammed Ali
- Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea; (M.A.); (J.K.); (G.K.); (J.O.); (J.S.Y.); (J.K.)
- Brain Korea 21 Project for Medical Sciences, Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
| | - In Ho Park
- Department of Biomedical Science, Yonsei University College of Medicine, Seoul 03722, Republic of Korea;
- Institute of Immunology and Immunological Diseases, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
| | - Junebeom Kim
- Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea; (M.A.); (J.K.); (G.K.); (J.O.); (J.S.Y.); (J.K.)
- Brain Korea 21 Project for Medical Sciences, Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
| | - Gwanghee Kim
- Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea; (M.A.); (J.K.); (G.K.); (J.O.); (J.S.Y.); (J.K.)
- Brain Korea 21 Project for Medical Sciences, Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
| | - Jooyeon Oh
- Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea; (M.A.); (J.K.); (G.K.); (J.O.); (J.S.Y.); (J.K.)
- Brain Korea 21 Project for Medical Sciences, Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
| | - Jin Sun You
- Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea; (M.A.); (J.K.); (G.K.); (J.O.); (J.S.Y.); (J.K.)
- Brain Korea 21 Project for Medical Sciences, Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
| | - Jieun Kim
- Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea; (M.A.); (J.K.); (G.K.); (J.O.); (J.S.Y.); (J.K.)
- Institute of Immunology and Immunological Diseases, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
| | - Jeon-Soo Shin
- Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea; (M.A.); (J.K.); (G.K.); (J.O.); (J.S.Y.); (J.K.)
- Brain Korea 21 Project for Medical Sciences, Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
- Institute of Immunology and Immunological Diseases, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
| | - Sang Sun Yoon
- Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea; (M.A.); (J.K.); (G.K.); (J.O.); (J.S.Y.); (J.K.)
- Brain Korea 21 Project for Medical Sciences, Department of Microbiology and Immunology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
- Institute of Immunology and Immunological Diseases, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
- BioMe Inc., Seoul 02455, Republic of Korea
| |
|
25
|
Kusuma WA, Fadli A, Fatriani R, Sofyantoro F, Yudha DS, Lischer K, Nuringtyas TR, Putri WA, Purwestri YA, Swasono RT. Prediction of the interaction between Calloselasma rhodostoma venom-derived peptides and cancer-associated hub proteins: A computational study. Heliyon 2023; 9:e21149. [PMID: 37954374 PMCID: PMC10637925 DOI: 10.1016/j.heliyon.2023.e21149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 09/04/2023] [Accepted: 10/17/2023] [Indexed: 11/14/2023] Open
Abstract
The use of peptide drugs to treat cancer is gaining popularity because of their efficacy, fewer side effects, and several advantages over other properties. Identifying the peptides that interact with cancer proteins is crucial in drug discovery. Several approaches to predicting peptide-protein interactions have been developed. However, problems arise from the high cost in resources and time and from the small number of studies. This study predicts peptide-protein interactions using Random Forest, XGBoost, and SAE-DNN. Feature extraction is also performed on proteins and peptides using intrinsic disorder, amino acid sequences, physicochemical properties, position-specific assessment matrices, amino acid composition, and dipeptide composition. Results show that all algorithms perform equally well in predicting interactions between peptides derived from venoms and target proteins associated with cancer. However, XGBoost produces the best results with accuracy, precision, and area under the receiver operating characteristic curve of 0.859, 0.663, and 0.697, respectively. The enrichment analysis revealed that peptides from the Calloselasma rhodostoma venom targeted several proteins (ESR1, GOPC, and BRD4) related to cancer.
Affiliation(s)
- Wisnu Ananta Kusuma
- Department of Computer Science, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, 16680, Indonesia
- Tropical Biopharmaca Research Center, IPB University, Bogor, 16128, Indonesia
| | - Aulia Fadli
- Department of Computer Science, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, 16680, Indonesia
| | - Rizka Fatriani
- Tropical Biopharmaca Research Center, IPB University, Bogor, 16128, Indonesia
| | - Fajar Sofyantoro
- Faculty of Biology, Universitas Gadjah Mada, Yogyakarta, 55281, Indonesia
| | - Donan Satria Yudha
- Faculty of Biology, Universitas Gadjah Mada, Yogyakarta, 55281, Indonesia
| | - Kenny Lischer
- Faculty of Engineering, University of Indonesia, Jakarta, 16424, Indonesia
| | - Tri Rini Nuringtyas
- Faculty of Biology, Universitas Gadjah Mada, Yogyakarta, 55281, Indonesia
- Research Center for Biotechnology, Universitas Gadjah Mada, Yogyakarta, 55281, Indonesia
| | | | - Yekti Asih Purwestri
- Faculty of Biology, Universitas Gadjah Mada, Yogyakarta, 55281, Indonesia
- Research Center for Biotechnology, Universitas Gadjah Mada, Yogyakarta, 55281, Indonesia
| | - Respati Tri Swasono
- Department of Chemistry, Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Yogyakarta, 55281, Indonesia
| |
|
26
|
Feifei W, Wenrou S, Sining K, Siyu Z, Xiaolei F, Junxiang L, Congfen H, Xuhui L. A novel functional peptide, named EQ-9 (ESETRILLQ), identified by virtual screening from regenerative cell secretome and its potential anti-aging and restoration effects in topical applications. Peptides 2023; 169:171078. [PMID: 37579838 DOI: 10.1016/j.peptides.2023.171078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Revised: 08/10/2023] [Accepted: 08/11/2023] [Indexed: 08/16/2023]
Abstract
Skin aging refers to a degenerative process that can be affected and regulated by intrinsic and extrinsic factors. The mesenchymal stem cell secretome covers a considerable number of regenerative molecules with anti-aging effects in a wide variety of circumstances. However, it is complex, time-consuming, and costly to identify specific compounds from thousands of natural molecules using conventional methods. With the development of computational biology and machine learning, an efficient workflow was generated to identify novel peptides with anti-aging and skin restoration potential. One of the candidate peptides was discovered and subsequently truncated to a novel peptide named EQ-9, which showed promising anti-aging effects for topical applications at a concentration of 10 ppm, as confirmed experimentally. The above-described paradigm is expected to be further applied to the virtual screening of novel peptide molecules targeting specific biological functions from a wide variety of natural resources.
Affiliation(s)
- Wang Feifei
- Yunnan Botanee Bio-technology Group Co., Ltd., Yunnan, China; Yunnan Yunke Characteristic Plant Extraction Laboratory Co., Ltd., Yunnan, China
| | - Su Wenrou
- Yunnan Botanee Bio-technology Group Co., Ltd., Yunnan, China; Yunnan Yunke Characteristic Plant Extraction Laboratory Co., Ltd., Yunnan, China
| | - Kang Sining
- AGECODE R&D Center, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, China; Harvest Biotech (Zhejiang) Co., Ltd., Zhejiang, China
| | - Zhu Siyu
- AGECODE R&D Center, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, China; Harvest Biotech (Zhejiang) Co., Ltd., Zhejiang, China
| | - Fu Xiaolei
- AGECODE R&D Center, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, China; Harvest Biotech (Zhejiang) Co., Ltd., Zhejiang, China
| | - Li Junxiang
- AGECODE R&D Center, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, China; Harvest Biotech (Zhejiang) Co., Ltd., Zhejiang, China
| | - He Congfen
- Beijing Technology and Business University, Beijing Key Lab of Plant Resources Research and Development, Beijing, China
| | - Li Xuhui
- AGECODE R&D Center, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, China; Zhejiang Provincial Key Laboratory of Applied Enzymology, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, China.
| |
|
27
|
Sahoo BR, Bardwell JCA. SERF, a family of tiny highly conserved, highly charged proteins with enigmatic functions. FEBS J 2023; 290:4150-4162. [PMID: 35694898 DOI: 10.1111/febs.16555] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/15/2022] [Revised: 06/07/2022] [Accepted: 06/10/2022] [Indexed: 11/27/2022]
Abstract
Amyloid formation is a misfolding process that has been linked to age-related diseases, including Alzheimer's and Huntington's. Understanding how cellular factors affect this process in vivo is vital in realizing the dream of controlling this insidious process that robs so many people of their humanity. SERF (small EDRK-rich factor) was initially isolated as a factor that accelerated polyglutamine amyloid formation in a C. elegans model. SERF knockouts inhibit amyloid formation of a number of proteins that include huntingtin, α-synuclein and β-amyloid which are associated with Huntington's, Parkinson's and Alzheimer's disease, respectively, and purified SERF protein speeds their amyloid formation in vitro. SERF proteins are highly conserved, highly charged and conformationally dynamic proteins that form a fuzzy complex with amyloid precursors. They appear to act by specifically accelerating the primary step of amyloid nucleation. Brain-specific SERF knockout mice, though viable, appear to be more prone to deposition of amyloids, and show modified fibril morphology. Whole-body knockouts are perinatally lethal due to an apparently unrelated developmental issue. Recently, it was found that SERF binds RNA and is localized to nucleic acid-rich membraneless compartments. SERF-related sequences are commonly found fused to zinc finger sequences. These results point towards a nucleic acid-binding function. How this function relates to their ability to accelerate amyloid formation is currently obscure. In this review, we discuss the possible biological functions of SERF family proteins in the context of their structural fuzziness, modulation of amyloid pathway, nucleic acid binding and their fusion to folded proteins.
Affiliation(s)
- Bikash R Sahoo
- Department of Molecular, Cellular and Developmental Biology, Howard Hughes Medical Institute, University of Michigan, Ann Arbor, MI, USA
| | - James C A Bardwell
- Department of Molecular, Cellular and Developmental Biology, Howard Hughes Medical Institute, University of Michigan, Ann Arbor, MI, USA
| |
|
28
|
Zhai S, Tan Y, Zhang C, Hipolito CJ, Song L, Zhu C, Zhang Y, Duan H, Yin Y. PepScaf: Harnessing Machine Learning with In Vitro Selection toward De Novo Macrocyclic Peptides against IL-17C/IL-17RE Interaction. J Med Chem 2023; 66:11187-11200. [PMID: 37480587 DOI: 10.1021/acs.jmedchem.3c00627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/24/2023]
Abstract
The combination of library-based screening and artificial intelligence (AI) has been accelerating the discovery and optimization of hit ligands. However, the potential of AI to assist in de novo macrocyclic peptide ligand discovery has yet to be fully explored. In this study, an integrated AI framework called PepScaf was developed to extract the critical scaffold relative to bioactivity based on a vast dataset from an initial in vitro selection campaign against a model protein target, interleukin-17C (IL-17C). Taking the generated scaffold, a focused macrocyclic peptide library was rationally constructed to target IL-17C, yielding over 20 potent peptides that effectively inhibited IL-17C/IL-17RE interaction. Notably, the top two peptides displayed exceptional potency with IC50 values of 1.4 nM. This approach presents a viable methodology for more efficient macrocyclic peptide discovery, offering potential time and cost savings. Additionally, this is also the first report regarding the discovery of macrocyclic peptides against IL-17C/IL-17RE interaction.
Affiliation(s)
- Silong Zhai
- School of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, China
| | - Yahong Tan
- State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
| | - Chengyun Zhang
- School of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, China
| | - Christopher John Hipolito
- Screening & Compound Profiling, Quantitative Biosciences, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
| | - Lulu Song
- State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
| | - Cheng Zhu
- School of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, China
| | - Youming Zhang
- State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
| | - Hongliang Duan
- School of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, China
| | - Yizhen Yin
- State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
- Shandong Research Institute of Industrial Technology, Jinan 250101, China
| |
|
29
|
Liu Z, Zhang X, Wang Y, Tai Y, Yao X, Midgley AC. Emergent Peptides of the Antifibrotic Arsenal: Taking Aim at Myofibroblast Promoting Pathways. Biomolecules 2023; 13:1179. [PMID: 37627244 PMCID: PMC10452577 DOI: 10.3390/biom13081179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 07/25/2023] [Accepted: 07/26/2023] [Indexed: 08/27/2023] Open
Abstract
Myofibroblasts are the principal effector cells driving fibrosis, and their accumulation in tissues is a fundamental feature of fibrosis. Essential pathways have been identified as being central to promoting myofibroblast differentiation, revealing multiple targets for intervention. Compared with large proteins and antibodies, peptide-based therapies have transpired to serve as biocompatible and cost-effective solutions to exert biomimicry, agonistic, and antagonistic activities with a high degree of targeting specificity and selectivity. In this review, we summarize emergent antifibrotic peptides and their utilization for the targeted prevention of myofibroblasts. We then highlight recent studies on peptide inhibitors of upstream pathogenic processes that drive the formation of profibrotic cell phenotypes. We also briefly discuss peptides from non-mammalian origins that show promise as antifibrotic therapeutics. Finally, we discuss the future perspectives of peptide design and development in targeting myofibroblasts to mitigate fibrosis.
Affiliation(s)
- Zhen Liu
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Nankai University, Tianjin 300071, China
| | - Xinyan Zhang
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Nankai University, Tianjin 300071, China
| | - Yanrong Wang
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Nankai University, Tianjin 300071, China
| | - Yifan Tai
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Nankai University, Tianjin 300071, China
| | - Xiaolin Yao
- School of Food and Biological Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China
| | - Adam C. Midgley
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Nankai University, Tianjin 300071, China
| |
|
30
|
Feng H, Wang F, Li N, Xu Q, Zheng G, Sun X, Hu M, Xing G, Zhang G. A Random Forest Model for Peptide Classification Based on Virtual Docking Data. Int J Mol Sci 2023; 24:11409. [PMID: 37511165 PMCID: PMC10380188 DOI: 10.3390/ijms241411409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 06/25/2023] [Accepted: 07/11/2023] [Indexed: 07/30/2023] Open
Abstract
The affinity of peptides is a crucial factor in studying peptide-protein interactions. Despite the development of various techniques to evaluate peptide-receptor affinity, the results may not always reflect the actual affinity of the peptides accurately. The current study provides a free tool to assess the actual peptide affinity based on virtual docking data. This study employed a dataset that combined actual peptide affinity information (active and inactive) and virtual peptide-receptor docking data, and different machine learning algorithms were utilized. Compared with the other algorithms, the random forest (RF) algorithm showed the best performance and was used in building three RF models using different numbers of significant features (four, three, and two). Further analysis revealed that the four-feature RF model achieved the highest Accuracy of 0.714 in classifying an independent unknown peptide dataset designed with the PEDV spike protein, and it also revealed overfitting problems in the other models. This four-feature RF model was used to evaluate peptide affinity by constructing the relationship between the actual affinity and the virtual docking scores of peptides to their receptors.
Affiliation(s)
- Hua Feng
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
| | - Fangyu Wang
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
| | - Ning Li
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
| | - Qian Xu
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
| | - Guanming Zheng
- Public Health and Preventive Medicine Teaching and Research Center, Henan University of Chinese Medicine, Zhengzhou 450046, China
| | - Xuefeng Sun
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
| | - Man Hu
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
| | - Guangxu Xing
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
| | - Gaiping Zhang
- Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
- Longhu Modern Immunology Laboratory, Zhengzhou 450002, China
- School of Advanced Agricultural Sciences, Peking University, Beijing 100871, China
- Jiangsu Co-Innovation Center for the Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou 225009, China
| |
|
31
|
Narganes-Carlón D, Crowther DJ, Pearson ER. A publication-wide association study (PWAS), historical language models to prioritise novel therapeutic drug targets. Sci Rep 2023; 13:8366. [PMID: 37225853 DOI: 10.1038/s41598-023-35597-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Accepted: 05/20/2023] [Indexed: 05/26/2023] Open
Abstract
Most biomedical knowledge is published as text, making it challenging to analyse using traditional statistical methods. In contrast, machine-interpretable data primarily comes from structured property databases, which represent only a fraction of the knowledge present in the biomedical literature. Crucial insights and inferences can be drawn from these publications by the scientific community. We trained language models on literature from different time periods to evaluate their ranking of prospective gene-disease associations and protein-protein interactions. Using 28 distinct historical text corpora of abstracts published between 1995 and 2022, we trained independent Word2Vec models to prioritise associations that were likely to be reported in future years. This study demonstrates that biomedical knowledge can be encoded as word embeddings without the need for human labelling or supervision. Language models effectively capture drug discovery concepts such as clinical tractability, disease associations, and biochemical pathways. Additionally, these models can prioritise hypotheses years before their initial reporting. Our findings underscore the potential for extracting yet-to-be-discovered relationships through data-driven approaches, leading to generalised biomedical literature mining for potential therapeutic drug targets. The Publication-Wide Association Study (PWAS) enables the prioritisation of under-explored targets and provides a scalable system for accelerating early-stage target ranking, irrespective of the specific disease of interest.
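One plausible way to picture how word embeddings can prioritise candidate associations is to rank candidate terms by cosine similarity to a query term's vector. The sketch below does this in plain Python on made-up three-dimensional vectors; the names `cosine` and `rank_candidates`, the toy gene labels and the vectors are all hypothetical, and the abstract does not state that this is the scoring rule the authors used.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_candidates(query_vec, candidates):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy embeddings standing in for vectors learned from a text corpus.
toy_embeddings = {
    "GENE_A": [0.9, 0.1, 0.0],
    "GENE_B": [0.1, 0.9, 0.2],
    "GENE_C": [0.7, 0.3, 0.1],
}
disease_vec = [1.0, 0.2, 0.0]  # hypothetical embedding of a disease term

ranking = rank_candidates(disease_vec, toy_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a historical setup like the one described, such a ranking would be computed from a model trained only on abstracts up to a given year and then compared with associations reported in later years.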
Affiliation(s)
- David Narganes-Carlón
- Division of Population Health and Genomics, Ninewells Hospital, School of Medicine, University of Dundee, Dundee, DD1 9SY, UK.
- Exscientia Ltd, Dundee One, River Court, 5 West Victoria Dock Road, Dundee, DD1 3JT, UK.
| | - Daniel J Crowther
- Exscientia Ltd, Dundee One, River Court, 5 West Victoria Dock Road, Dundee, DD1 3JT, UK
| | - Ewan R Pearson
- Division of Population Health and Genomics, Ninewells Hospital, School of Medicine, University of Dundee, Dundee, DD1 9SY, UK
| |
|
32
|
Binothman N, Aljadani M, Alghanem B, Refai MY, Rashid M, Al Tuwaijri A, Alsubhi NH, Alrefaei GI, Khan MY, Sonbul SN, Aljoud F, Alhayyani S, Abdulal RH, Ganash M, Hashem AM. Identification of novel interacts partners of ADAR1 enzyme mediating the oncogenic process in aggressive breast cancer. Sci Rep 2023; 13:8341. [PMID: 37221310 DOI: 10.1038/s41598-023-35517-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Accepted: 05/19/2023] [Indexed: 05/25/2023] Open
Abstract
The triple-negative breast cancer (TNBC) subtype is characterized by aggressive clinical behavior and poor patient outcomes. Here, we show that ADAR1 is more abundantly expressed in infiltrating breast cancer (BC) tumors than in benign tumors. Further, ADAR1 protein expression is higher in aggressive BC cells (MDA-MB-231). Moreover, we identify a novel list of proteins that interact with ADAR1 in MDA-MB-231 cells, using an immunoprecipitation assay and mass spectrometry. Using iLoop, a protein-protein interaction prediction server based on structural features, five proteins with high iLoop scores (ranging between 0.6 and 0.8) were discovered: Histone H2A.V, Kynureninase (KYNU), 40S ribosomal protein SA, Complement C4-A, and Nebulin. In silico analysis showed that invasive ductal carcinomas had a higher level of KYNU gene expression than the other classifications (p < 0.0001). Moreover, KYNU mRNA expression was shown to be considerably higher in TNBC patients (p < 0.0001) and was associated with poor patient outcomes with a high-risk value. Importantly, we found an interaction between ADAR1 and KYNU in the more aggressive BC cells. Altogether, these results propose a new ADAR-KYNU interaction as a potential therapeutic target in aggressive BC.
Affiliation(s)
- Najat Binothman
- Department of Chemistry, College of Sciences and Arts, King Abdulaziz University, Rabigh, Saudi Arabia.
- Vaccine and Immunotherapy Unit, King Fahad Medical Research Center, King Abdulaziz University Saudi Arabia, Jeddah, Saudi Arabia.
| | - Majidah Aljadani
- Department of Chemistry, College of Sciences and Arts, King Abdulaziz University, Rabigh, Saudi Arabia
| | - Bandar Alghanem
- Medical Research Core Facility and Platforms (MRCFP), King Abdullah International Medical Research Center/King Saud bin Abdulaziz University for Health Sciences (KSAU-HS), King Abdulaziz Medical City (KAMC), National Guard Health Affairs (NGHA), Riyadh, Saudi Arabia
| | - Mohammed Y Refai
- Department of Biochemistry, College of Science, University of Jeddah, Jeddah, Saudi Arabia
| | - Mamoon Rashid
- Department of AI and Bioinformatics, King Abdullah International Medical Research Center (KAIMRC), King Saud Bin Abdulaziz University for Health Sciences (KSAU-HS), King Abdulaziz Medical City, Ministry of National Guard Health Affairs, P.O. Box 22490, Riyadh, 11426, Saudi Arabia
| | - Abeer Al Tuwaijri
- Medical Genomics Research Department, King Abdullah International Medical Research Center (KAIMRC), Ministry of National Guard Health Affairs (MNGH), Riyadh, Saudi Arabia
- Clinical Laboratory Sciences Department, College of Applied Medical Sciences, King Saud Bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
| | - Nouf H Alsubhi
- Biological Sciences Department, College of Science & Arts, King Abdulaziz University, Rabigh, 21911, Saudi Arabia
| | - Ghadeer I Alrefaei
- Department of Biology, College of Science, University of Jeddah, Jeddah, Saudi Arabia
| | - Muhammad Yasir Khan
- Vaccine and Immunotherapy Unit, King Fahad Medical Research Center, King Abdulaziz University Saudi Arabia, Jeddah, Saudi Arabia
- Department of Biology, Faculty of Science, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
| | - Sultan N Sonbul
- Biochemistry Department, Faculty of Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Experimental Biochemistry Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Fadwa Aljoud
- Department of Biology, Faculty of Science, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Regenerative Medicine Unit, King Fahd Medical Research Centre, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
| | - Sultan Alhayyani
- Department of Chemistry, College of Sciences and Arts, King Abdulaziz University, Rabigh, Saudi Arabia
| | - Rwaa H Abdulal
- Vaccine and Immunotherapy Unit, King Fahad Medical Research Center, King Abdulaziz University Saudi Arabia, Jeddah, Saudi Arabia
- Department of Biology, Faculty of Science, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
| | - Magdah Ganash
- Department of Biology, Faculty of Science, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
| | - Anwar M Hashem
- Vaccine and Immunotherapy Unit, King Fahad Medical Research Center, King Abdulaziz University Saudi Arabia, Jeddah, Saudi Arabia
- Department of Medical Microbiology and Parasitology, Faculty of Medicine, King AbdulAziz University, Jeddah, Saudi Arabia
| |
Collapse
|
33
|
Jefferson RE, Oggier A, Füglistaler A, Camviel N, Hijazi M, Villarreal AR, Arber C, Barth P. Computational design of dynamic receptor-peptide signaling complexes applied to chemotaxis. Nat Commun 2023; 14:2875. [PMID: 37208363 DOI: 10.1038/s41467-023-38491-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Accepted: 05/04/2023] [Indexed: 05/21/2023] [Open access]
Abstract
Engineering protein biosensors that sensitively respond to specific biomolecules by triggering precise cellular responses is a major goal of diagnostics and synthetic cell biology. Previous biosensor designs have largely relied on binding structurally well-defined molecules. In contrast, approaches that couple the sensing of flexible compounds to intended cellular responses would greatly expand potential biosensor applications. Here, to address these challenges, we develop a computational strategy for designing signaling complexes between conformationally dynamic proteins and peptides. To demonstrate the power of the approach, we create ultrasensitive chemotactic receptor-peptide pairs capable of eliciting potent signaling responses and strong chemotaxis in primary human T cells. Unlike traditional approaches that engineer static binding complexes, our dynamic structure design strategy optimizes contacts with multiple binding and allosteric sites accessible through dynamic conformational ensembles to achieve strongly enhanced signaling efficacy and potency. Our study suggests that a conformationally adaptable binding interface coupled to a robust allosteric transmission region is a key evolutionary determinant of peptidergic GPCR signaling systems. The approach lays a foundation for designing peptide-sensing receptors and signaling peptide ligands for basic and therapeutic applications.
Affiliation(s)
- Robert E Jefferson
  - Interfaculty Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne, CH-1015, Switzerland
  - Ludwig Institute for Cancer Research Lausanne, Lausanne, Switzerland
- Aurélien Oggier
  - Interfaculty Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne, CH-1015, Switzerland
  - Ludwig Institute for Cancer Research Lausanne, Lausanne, Switzerland
- Andreas Füglistaler
  - Interfaculty Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne, CH-1015, Switzerland
  - Ludwig Institute for Cancer Research Lausanne, Lausanne, Switzerland
- Nicolas Camviel
  - Ludwig Institute for Cancer Research Lausanne, Lausanne, Switzerland
  - Department of Oncology UNIL-CHUV, University Hospital Lausanne (CHUV), University of Lausanne (UNIL), Lausanne, Switzerland
- Mahdi Hijazi
  - Interfaculty Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne, CH-1015, Switzerland
  - Ludwig Institute for Cancer Research Lausanne, Lausanne, Switzerland
- Ana Rico Villarreal
  - Interfaculty Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne, CH-1015, Switzerland
  - Ludwig Institute for Cancer Research Lausanne, Lausanne, Switzerland
- Caroline Arber
  - Ludwig Institute for Cancer Research Lausanne, Lausanne, Switzerland
  - Department of Oncology UNIL-CHUV, University Hospital Lausanne (CHUV), University of Lausanne (UNIL), Lausanne, Switzerland
- Patrick Barth
  - Interfaculty Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne, CH-1015, Switzerland
  - Ludwig Institute for Cancer Research Lausanne, Lausanne, Switzerland
34
Jiang J, Li J, Li J, Pei H, Li M, Zou Q, Lv Z. A Machine Learning Method to Identify Umami Peptide Sequences by Using Multiplicative LSTM Embedded Features. Foods 2023; 12:1498. [PMID: 37048319 PMCID: PMC10094688 DOI: 10.3390/foods12071498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2023] [Revised: 03/24/2023] [Accepted: 03/30/2023] [Indexed: 04/05/2023] [Open access]
Abstract
Umami peptides enhance the umami taste of food and have good food processing properties, nutritional value, and numerous potential applications. Wet testing for the identification of umami peptides is a time-consuming and expensive process. Here, we report the iUmami-DRLF that uses a logistic regression (LR) method solely based on the deep learning pre-trained neural network feature extraction method, unified representation (UniRep based on multiplicative LSTM), for feature extraction from the peptide sequences. The findings demonstrate that deep learning representation learning significantly enhanced the capability of models in identifying umami peptides and predictive precision solely based on peptide sequence information. The newly validated taste sequences were also used to test the iUmami-DRLF and other predictors, and the result indicates that the iUmami-DRLF has better robustness and accuracy and remains valid at higher probability thresholds. The iUmami-DRLF method can aid further studies on enhancing the umami flavor of food for satisfying the need for an umami-flavored diet.
Affiliation(s)
- Jici Jiang
  - College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
- Jiayu Li
  - College of Life Science, Sichuan University, Chengdu 610065, China
- Junxian Li
  - College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
- Hongdi Pei
  - College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
  - Wu Yuzhang Honors College, Sichuan University, Chengdu 610065, China
- Mingxin Li
  - College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
- Zhibin Lv
  - College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
35
Han T, Zhang Y, Qi B, Chen M, Sun K, Qin X, Yang B, Yin H, Xu A, Wei X, Zhu L. Clinical features and shared mechanisms of chronic gastritis and osteoporosis. Sci Rep 2023; 13:4991. [PMID: 36973348 PMCID: PMC10042850 DOI: 10.1038/s41598-023-31541-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Accepted: 03/14/2023] [Indexed: 03/29/2023] [Open access]
Abstract
Chronic gastritis (CG) and osteoporosis (OP) are common and occult diseases in the elderly, and the relationship between these two diseases has been increasingly recognized. We aimed to explore the clinical characteristics and shared mechanisms of CG patients combined with OP. In the cross-sectional study, all participants were selected from the BEYOND study. The CG patients were included and classified into two groups, namely an OP group and a non-OP group. Univariable and multivariable logistic regression methods were used to evaluate the influencing factors. Furthermore, CG- and OP-related genes were obtained from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were identified using the GEO2R tool and the Venny platform. Protein-protein interaction information was obtained by inputting the intersection targets into the STRING database. The PPI network was constructed with Cytoscape v3.6.0 software, and the key genes were screened out according to the degree value. Gene function enrichment of DEGs was performed with the Webgestalt online tool. One hundred and thirty CG patients were finally included in this study. Univariate correlation analysis showed that age, gender, BMI and coffee were the potential influencing factors for the comorbidity (P < 0.05). The multivariate logistic regression model found that smoking history, serum PTH and serum β-CTX were positively correlated with OP in CG patients, while serum P1NP and eating fruit had a negative relationship with OP in CG patients. In studies of the shared mechanisms, a total of 76 intersection genes were identified between CG and OP, including CD163, CD14, CCR1, CYBB, CXCL10, SIGLEC1, LILRB2, IGSF6, MS4A6A and CCL8 as the core genes. The biological processes closely related to the occurrence and development of CG and OP mainly involved Ferroptosis, Toll-like receptor signaling pathway, Legionellosis and Chemokine signaling pathway. Our study is the first to identify the possible factors associated with OP in patients with CG, and it mined the core genes and related pathways that could be used as biomarkers or potential therapeutic targets to reveal the shared mechanisms.
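The abstract's degree-based screening of key genes can be illustrated with a minimal Python sketch. It assumes the PPI network is available as an undirected edge list; the function name `rank_by_degree` and the placeholder gene names are hypothetical and are not taken from the study, which performed this step in Cytoscape.

```python
from collections import defaultdict

def rank_by_degree(edges):
    """Rank nodes of an undirected network by degree (number of distinct neighbors)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # skip self-loops; they do not add a neighbor
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Sort by descending degree, breaking ties alphabetically for reproducibility.
    return sorted(((node, len(nbrs)) for node, nbrs in neighbors.items()),
                  key=lambda item: (-item[1], item[0]))

# Hypothetical toy network: placeholder names, not data from the study.
toy_edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneC", "geneD"),
    ("geneC", "geneE"),
]
ranking = rank_by_degree(toy_edges)
top_hubs = [node for node, _ in ranking[:2]]  # the two highest-degree nodes
```

Degree is only one of several centrality measures; a cut-off such as "top 10 by degree" is a design choice rather than a fixed rule.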
Affiliation(s)
- Tao Han
  - Department of Spine, Wangjing Hospital, China Academy of Chinese Medical Sciences, Huajiadi Street, Chaoyang District, Beijing, 100102, China
- Yili Zhang
  - School of Traditional Chinese Medicine & School of Integrated Chinese and Western Medicine, Nanjing University of Chinese Medicine, Nanjing, China
- Baoyu Qi
  - Department of Spine, Wangjing Hospital, China Academy of Chinese Medical Sciences, Huajiadi Street, Chaoyang District, Beijing, 100102, China
- Ming Chen
  - Department of Spine, Wangjing Hospital, China Academy of Chinese Medical Sciences, Huajiadi Street, Chaoyang District, Beijing, 100102, China
- Kai Sun
  - Department of Spine, Wangjing Hospital, China Academy of Chinese Medical Sciences, Huajiadi Street, Chaoyang District, Beijing, 100102, China
- Xiaokuan Qin
  - Department of Spine, Wangjing Hospital, China Academy of Chinese Medical Sciences, Huajiadi Street, Chaoyang District, Beijing, 100102, China
- Bowen Yang
  - Department of Spine, Wangjing Hospital, China Academy of Chinese Medical Sciences, Huajiadi Street, Chaoyang District, Beijing, 100102, China
- He Yin
  - Department of Spine, Wangjing Hospital, China Academy of Chinese Medical Sciences, Huajiadi Street, Chaoyang District, Beijing, 100102, China
- Aili Xu
  - Department of Gastroenterology, Wangjing Hospital, China Academy of Chinese Medical Sciences, Huajiadi Street, Chaoyang District, Beijing, 100102, China
- Xu Wei
  - Department of Academic Development, Wangjing Hospital, China Academy of Chinese Medical Sciences, Huajiadi Street, Chaoyang District, Beijing, 100102, China
- Liguo Zhu
  - Department of Spine, Wangjing Hospital, China Academy of Chinese Medical Sciences, Huajiadi Street, Chaoyang District, Beijing, 100102, China
36
Peng X, Lei Y, Feng P, Jia L, Ma J, Zhao D, Zeng J. Characterizing the interaction conformation between T-cell receptors and epitopes with deep learning. Nat Mach Intell 2023. [DOI: 10.1038/s42256-023-00634-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
37
Yan K, Li T, Marques JAL, Gao J, Fong SJ. A review on multimodal machine learning in medical diagnostics. Math Biosci Eng 2023; 20:8708-8726. [PMID: 37161218 DOI: 10.3934/mbe.2023382] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/11/2023]
Abstract
Nowadays, the increasing volume of medical diagnostic data and clinical data provides more complementary references for doctors when diagnosing patients. For example, with medical data such as electrocardiography (ECG), machine learning algorithms can be used to identify and diagnose heart disease to reduce the workload of doctors. However, ECG data is always exposed to various kinds of noise and interference in reality, and medical diagnostics based only on one-dimensional ECG data is not sufficiently reliable. By extracting new features from other types of medical data, we can implement enhanced recognition methods, called multimodal learning. Multimodal learning helps models to process data from a range of different sources, eliminates the requirement for training each single learning modality, and improves the robustness of models through the diversity of data. A growing number of articles in recent years have been devoted to investigating how to extract data from different sources and build accurate multimodal machine learning models or deep learning models for medical diagnostics. This paper reviews and summarizes several recent papers dealing with multimodal machine learning in disease detection, and identifies topics for future research.
Affiliation(s)
- Keyue Yan
  - Department of Computer and Information Science, University of Macau, Macau SAR, China
- Tengyue Li
  - Department of Computer and Information Science, University of Macau, Macau SAR, China
- Juntao Gao
  - Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
- Simon James Fong
  - Department of Computer and Information Science, University of Macau, Macau SAR, China
  - Institute of Artificial Intelligence, Chongqing Technology and Business University, Chongqing, China
38
Shu J, Li J, Wang S, Lin J, Wen L, Ye H, Zhou P. Systematic analysis and comparison of peptide specificity and selectivity between their cognate receptors and noncognate decoys. J Mol Recognit 2023; 36:e3006. [PMID: 36579779 DOI: 10.1002/jmr.3006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Revised: 12/07/2022] [Accepted: 12/27/2022] [Indexed: 12/30/2022]
Abstract
Protein-peptide interactions (PpIs) play an important role in cell signaling networks and have been exploited as new and attractive therapeutic targets. The affinity and specificity are two unity-of-opposite aspects of PpIs (and other biomolecular interactions); the former indicates the absolute binding strength between the peptide ligand and its cognate protein receptor in a PpI, while the latter represents the relative recognition selectivity of the peptide ligand for its cognate protein receptor in a PpI over those noncognate decoys that could be potentially encountered by the peptide in cell. Although the PpI binding affinity has been widely investigated over the past decades, the peptide recognition specificity (and selectivity) still remains largely unexplored to date. In this study, we classified PpI specificity into three types: (i) class-I specificity: peptide selectivity for its cognate wild-type protein receptor over the noncognate mutant decoys of this receptor, (ii) class-II specificity: peptide selectivity for its cognate protein receptor over other noncognate decoys that are homologous with this receptor, and (iii) class-III specificity: peptide selectivity for its cognate protein receptor over other noncognate decoys that are the cognate receptors of other peptides. We performed affinity and selectivity analysis for the three types of PpI specificity and revealed that the PpIs generally exhibit a moderate or modest specificity; peptide selectivity increases in the order: class-I < class-II < class-III. All the three types of PpI specificity were observed to have no statistically significant correlation with peptide length and hydrophobicity, but the class-I and class-II specificities can be influenced considerably by peptide secondary structures; the high specificity is preferentially associated with ordered structure types as compared to undefined structure types. 
In addition, the mutation distribution (for class-I specificity), sequence conservation (for class-II specificity), and structural similarity (for class-III specificity) also seem to affect peptide selectivity.
Affiliation(s)
- Jianping Shu
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
- Juelin Li
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
- Shaozhou Wang
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
- Jing Lin
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
- Li Wen
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
- Haiyang Ye
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
- Peng Zhou
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
39
Boone K, Cloyd AK, Derakovic E, Spencer P, Tamerler C. Designing Collagen-Binding Peptide with Enhanced Properties Using Hydropathic Free Energy Predictions. Appl Sci (Basel) 2023; 13:3342. [PMID: 38037603 PMCID: PMC10686322 DOI: 10.3390/app13053342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2023]
Abstract
Collagen is fundamental to a vast diversity of health functions and potential therapeutics. Short peptides targeting collagen are attractive for designing modular systems for site-specific delivery of bioactive agents. Characterization of peptide-protein binding involves a large number of potential interactions that require screening methods to target physiological conditions. We built a hydropathy-based free energy estimation tool which allows quick evaluation of peptides binding to collagen. Previous studies showed that pH plays a significant role in collagen structure and stability. Our design tool enables probing peptides for their collagen-binding property across multiple pH conditions. We explored binding features of currently known collagen-binding peptides, collagen type I alpha chain 2 sense peptide (TKKTLRT) and decorin LRR-10 (LRELHLNNN). Based on these analyses, we engineered a collagen-binding peptide with enhanced properties across a large pH range, in contrast to the pH dependence of LRR-10. To validate our predictions, we used a quantum-dots-based binding assay to compare the coverage of the peptides on type I collagen. The predicted peptide resulted in improved collagen binding. Hydropathy of the peptide-protein pair is a promising approach to finding compatible pairings with minimal use of computational resources, and our method allows for quick evaluation of peptides for binding to other proteins. Overall, the free-energy-based tool provides an alternative computational screening approach that impacts protein interaction search methods.
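As a rough illustration of what a hydropathy-based peptide score looks like, the Python sketch below computes the mean Kyte-Doolittle hydropathy of the two known collagen-binding peptides named in the abstract. This is an assumption-laden stand-in, not the authors' free energy estimation tool: it ignores pH and the collagen partner entirely, and the function name `mean_hydropathy` is hypothetical.

```python
# Kyte-Doolittle hydropathy index per amino acid (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(peptide):
    """Average Kyte-Doolittle hydropathy of a peptide (higher = more hydrophobic)."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    try:
        total = sum(KYTE_DOOLITTLE[residue] for residue in peptide.upper())
    except KeyError as exc:
        raise ValueError(f"unknown residue: {exc.args[0]}") from None
    return total / len(peptide)

# Sequences as given in the abstract; the scoring itself is only an illustration.
peptides = {
    "collagen type I alpha chain 2 sense peptide": "TKKTLRT",
    "decorin LRR-10": "LRELHLNNN",
}
scores = {name: mean_hydropathy(seq) for name, seq in peptides.items()}
```

Both peptides come out with negative mean values on this scale, meaning they are hydrophilic on average; a pairing score of the kind the abstract describes would additionally need the hydropathy profile of the target protein.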
Affiliation(s)
- Kyle Boone
  - Institute for Bioengineering Research, University of Kansas, 5109 Learned Hall 1530 W, 15th Street, Lawrence, KS 66045-7609, USA
  - Department of Mechanical Engineering, University of Kansas, Lawrence, KS 66045-7609, USA
- Aya Kirahm Cloyd
  - Institute for Bioengineering Research, University of Kansas, 5109 Learned Hall 1530 W, 15th Street, Lawrence, KS 66045-7609, USA
  - Bioengineering Program, University of Kansas, 1132 Learned Hall 1530 W, 15th Street, Lawrence, KS 66045-7609, USA
- Emina Derakovic
  - Department of Mechanical Engineering, University of Kansas, Lawrence, KS 66045-7609, USA
- Paulette Spencer
  - Institute for Bioengineering Research, University of Kansas, 5109 Learned Hall 1530 W, 15th Street, Lawrence, KS 66045-7609, USA
  - Department of Mechanical Engineering, University of Kansas, Lawrence, KS 66045-7609, USA
  - Bioengineering Program, University of Kansas, 1132 Learned Hall 1530 W, 15th Street, Lawrence, KS 66045-7609, USA
- Candan Tamerler
  - Institute for Bioengineering Research, University of Kansas, 5109 Learned Hall 1530 W, 15th Street, Lawrence, KS 66045-7609, USA
  - Department of Mechanical Engineering, University of Kansas, Lawrence, KS 66045-7609, USA
  - Bioengineering Program, University of Kansas, 1132 Learned Hall 1530 W, 15th Street, Lawrence, KS 66045-7609, USA
40
Lin J, Wang S, Wen L, Ye H, Shang S, Li J, Shu J, Zhou P. Targeting peptide-mediated interactions in omics. Proteomics 2023; 23:e2200175. [PMID: 36461811 DOI: 10.1002/pmic.202200175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 11/28/2022] [Accepted: 11/28/2022] [Indexed: 12/05/2022]
Abstract
Peptide-mediated interactions (PMIs) play a crucial role in cell signaling networks; they are responsible for about half of cellular protein-protein associations in the human interactome and have recently been recognized as a promising new kind of druggable target for drug development and disease therapy. In this article, we give a systematic review of the proteome-wide discovery of PMIs and the targeting of druggable PMIs (dPMIs) with chemical drugs, self-inhibitory peptides (SIPs) and protein agents, focusing particularly on their implications and applications for therapeutic purposes in omics. We also introduce computational peptidology strategies used to model, analyze, and design PMI-targeted molecular entities and further extend the concepts of protein context, direct/indirect readout, and enthalpy/entropy effect involved in PMIs. Current issues and future perspectives on this topic are discussed. There is still a long way to go before efficient therapeutic strategies to target PMIs on the omics scale are established.
Affiliation(s)
- Jing Lin
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
- Shaozhou Wang
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
- Li Wen
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
- Haiyang Ye
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
- Shuyong Shang
  - Institute of Ecological Environment Protection, Chengdu Normal University, Chengdu, China
- Juelin Li
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
- Jianping Shu
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
- Peng Zhou
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
41
Evaluation of affinity-purification coupled to mass spectrometry approaches for capture of short linear motif-based interactions. Anal Biochem 2023; 663:115017. [PMID: 36526023 DOI: 10.1016/j.ab.2022.115017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 11/29/2022] [Accepted: 12/07/2022] [Indexed: 12/15/2022]
Abstract
Low-affinity and transient protein-protein interactions, such as short linear motif (SLiM)-based interactions, require dedicated experimental tools for discovery and validation. Here, we evaluated and compared biotinylated peptide pulldown and protein interaction screen on peptide matrix (PRISMA) coupled to mass spectrometry (MS) using a set of peptides containing interaction motifs. Eight different peptide sequences that engage in interactions with three distinct protein domains (KEAP1 Kelch, MDM2 SWIB, and TSG101 UEV) with a wide range of affinities were tested. We found that peptide pulldown can be an effective approach for SLiM validation; however, parameters such as protein abundance and competitive interactions can prevent the capture of known interactors. The use of tandem peptide repeats improved the capture and preservation of some interactions. In our tests, PRISMA failed to provide comparable results for model peptides that successfully pulled down known interactors using biotinylated peptide pulldown. Overall, in our hands, biotin-peptide pulldown, albeit more laborious, was more successful in validating known interactions. Our results highlight that the tested affinity-capture MS-based methods for validation of SLiM-based interactions from cell lysates are suboptimal, and we identified parameters to consider in method development.
42
Zhang H, Saravanan KM, Wei Y, Jiao Y, Yang Y, Pan Y, Wu X, Zhang JZH. Deep Learning-Based Bioactive Therapeutic Peptide Generation and Screening. J Chem Inf Model 2023; 63:835-845. [PMID: 36724090 DOI: 10.1021/acs.jcim.2c01485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
Abstract
Many bioactive peptides have demonstrated therapeutic effects against complicated diseases, including antiviral, antibacterial, and anticancer activity. It is possible to generate a large number of potentially bioactive peptides using deep learning in a manner analogous to the generation of de novo chemical compounds, using the acquired bioactive peptides as a training set. Such generative techniques would be significant for drug development since peptides are much easier and cheaper to synthesize than compounds. Despite the limited availability of deep learning-based peptide-generating models, we have built an LSTM model (called LSTM_Pep) to generate de novo peptides and fine-tuned the model to generate de novo peptides with specific prospective therapeutic benefits. Remarkably, the Antimicrobial Peptide Database has been effectively utilized to generate various kinds of potentially active de novo peptides. We proposed a pipeline for screening those generated peptides for a given target and used the main protease of SARS-CoV-2 as a proof of concept. Moreover, we have developed a deep learning-based protein-peptide prediction model (DeepPep) for rapid screening of the generated peptides for the given targets. Together with the generating model, we have demonstrated that iterative fine-tuning, generation, and screening can yield peptides with higher predicted binding affinity. Our work sheds light on developing deep learning-based methods and pipelines to effectively generate and obtain bioactive peptides with a specific therapeutic effect and showcases how artificial intelligence can help discover de novo bioactive peptides that can bind to a particular target.
Collapse
Affiliation(s)
- Haiping Zhang
- Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
- Konda Mani Saravanan
- Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai 600073, Tamil Nadu, India
- Yanjie Wei
- Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
- Yang Jiao
- Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Yang Yang
- Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, State Key Discipline of Infectious Disease, Shenzhen Third People's Hospital, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen 518112, China
- Yi Pan
- Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China; Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Xuli Wu
- School of Medicine, Shenzhen University, Shenzhen 518060, Guangdong, China
- John Z H Zhang
- Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China; East China Normal University, Shanghai 200062, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
|
43
|
Soleymani F, Paquet E, Viktor HL, Michalowski W, Spinello D. ProtInteract: A deep learning framework for predicting protein-protein interactions. Comput Struct Biotechnol J 2023; 21:1324-1348. [PMID: 36817951 PMCID: PMC9929211 DOI: 10.1016/j.csbj.2023.01.028] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 12/05/2022] [Revised: 01/20/2023] [Accepted: 01/20/2023] [Indexed: 01/26/2023] Open
Abstract
Proteins mainly perform their functions by interacting with other proteins. Protein-protein interactions underpin various biological activities such as metabolic cycles, signal transduction, and immune response. However, due to the sheer number of proteins, experimental methods for finding interacting and non-interacting protein pairs are time-consuming and costly. We therefore developed the ProtInteract framework to predict protein-protein interaction. ProtInteract comprises two components: first, a novel autoencoder architecture that encodes each protein's primary structure to a lower-dimensional vector while preserving its underlying sequence attributes. This leads to faster training of the second network, a deep convolutional neural network (CNN) that receives encoded proteins and predicts their interaction under three different scenarios. In each scenario, the deep CNN predicts the class of a given encoded protein pair. Each class indicates different ranges of confidence scores corresponding to the probability of whether a predicted interaction occurs or not. The proposed framework features significantly low computational complexity and relatively fast response. The contributions of this work are twofold. First, ProtInteract assimilates the protein's primary structure into a pseudo-time series. Therefore, we leverage the nature of the time series of proteins and their physicochemical properties to encode a protein's amino acid sequence into a lower-dimensional vector space. This approach enables extracting highly informative sequence attributes while reducing computational complexity. Second, the ProtInteract framework utilises this information to identify protein interactions with other proteins based on its amino acid configuration. Our results suggest that the proposed framework performs with high accuracy and efficiency in predicting protein-protein interactions.
Affiliation(s)
- Farzan Soleymani
- Department of Mechanical Engineering, University of Ottawa, Ottawa, ON K1N 6N5, Canada
- Eric Paquet
- National Research Council, 1200 Montreal Road, Ottawa, ON K1A 0R6, Canada. Corresponding author.
- Herna Lydia Viktor
- School of Electrical Engineering and Computer Science, University of Ottawa, ON K1N 6N5, Canada
- Davide Spinello
- Department of Mechanical Engineering, University of Ottawa, Ottawa, ON K1N 6N5, Canada
|
44
|
Rogers JR, Nikolényi G, AlQuraishi M. Growing ecosystem of deep learning methods for modeling protein-protein interactions. Protein Eng Des Sel 2023; 36:gzad023. [PMID: 38102755 DOI: 10.1093/protein/gzad023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 12/06/2023] [Accepted: 12/07/2023] [Indexed: 12/17/2023] Open
Abstract
Numerous cellular functions rely on protein-protein interactions. Efforts to comprehensively characterize them remain challenged, however, by the diversity of molecular recognition mechanisms employed within the proteome. Deep learning has emerged as a promising approach for tackling this problem by exploiting both experimental data and basic biophysical knowledge about protein interactions. Here, we review the growing ecosystem of deep learning methods for modeling protein interactions, highlighting the diversity of these biophysically informed models and their respective trade-offs. We discuss recent successes in using representation learning to capture complex features pertinent to predicting protein interactions and interaction sites, geometric deep learning to reason over protein structures and predict complex structures, and generative modeling to design de novo protein assemblies. We also outline some of the outstanding challenges and promising new directions. Opportunities abound to discover novel interactions, elucidate their physical mechanisms, and engineer binders to modulate their functions using deep learning and, ultimately, unravel how protein interactions orchestrate complex cellular behaviors.
Affiliation(s)
- Julia R Rogers
- Department of Systems Biology, Columbia University, New York, NY 10032, USA
- Gergő Nikolényi
- Department of Systems Biology, Columbia University, New York, NY 10032, USA
|
45
|
Cai Y, Chen R, Gao S, Li W, Liu Y, Su G, Song M, Jiang M, Jiang C, Zhang X. Artificial intelligence applied in neoantigen identification facilitates personalized cancer immunotherapy. Front Oncol 2023; 12:1054231. [PMID: 36698417 PMCID: PMC9868469 DOI: 10.3389/fonc.2022.1054231] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/26/2022] [Accepted: 12/16/2022] [Indexed: 01/10/2023] Open
Abstract
The field of cancer neoantigen investigation has developed swiftly in the past decade. Predicting novel and true neoantigens from large multi-omics data has become a difficult but critical challenge. The rise of Artificial Intelligence (AI) and Machine Learning (ML) in biomedical applications has strengthened the current computational pipeline for neoantigen prediction. ML algorithms offer powerful tools to recognize the multidimensional nature of omics data and thereby extract the key neoantigen features that enable the successful discovery of new neoantigens. The present review outlines the significant technological progress of machine learning approaches, especially the new deep learning tools and pipelines, that were recently applied in neoantigen prediction. In this review article, we summarize the current state-of-the-art tools developed to predict neoantigens. The standard workflow includes calling genetic variants in paired tumor and blood samples, and rating the binding affinity between mutated peptide, MHC (I and II) and T cell receptor (TCR), followed by characterizing the immunogenicity of tumor epitopes. More specifically, we highlight the outstanding feature extraction tools and multi-layer neural network architectures in typical ML models. We note that more integrated neoantigen-predicting pipelines are being constructed with hybrid or combined ML algorithms instead of conventional machine learning models. In addition, the trends and challenges in further optimizing and integrating the existing pipelines are discussed.
Affiliation(s)
- Yu Cai
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Rui Chen
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Shenghan Gao
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Wenqing Li
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Yuru Liu
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Guodong Su
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Mingming Song
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Mengju Jiang
- School of Medicine, Northwest University, Xi’an, Shaanxi, China
- Chao Jiang
- Department of Neurology, The Second Affiliated Hospital of Xi’an Medical University, Xi’an, Shaanxi, China. *Correspondence: Chao Jiang; Xi Zhang
- Xi Zhang
- School of Medicine, Northwest University, Xi’an, Shaanxi, China. *Correspondence: Chao Jiang; Xi Zhang
|
46
|
Trisciuzzi D, Siragusa L, Baroni M, Cruciani G, Nicolotti O. An Integrated Machine Learning Model To Spot Peptide Binding Pockets in 3D Protein Screening. J Chem Inf Model 2022; 62:6812-6824. [PMID: 36320100 DOI: 10.1021/acs.jcim.2c00583] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/06/2022]
Abstract
The prediction of peptide-protein binding sites is of utmost importance to tackle the onset of severe neurodegenerative diseases and cancer. In this work, we detail a novel machine learning model based on Linear Discriminant Analysis (LDA) that proved highly predictive in detecting the putative protein binding regions of small peptides. Starting from 439 high-quality pockets derived from peptide-protein crystallographic complexes, three sets of well-established peptide-binding regions were first selected through a Partitioning Around Medoids (PAM) clustering algorithm based on morphological and energetic 3D GRID-MIF molecular descriptors. Next, the best combination between all the putative interacting peptide pockets and related GRID-MIF scores was automatically explored by using the LDA-based protocol implemented in BioGPS. This approach succeeded in distinguishing the actual interacting peptide regions (that is, AUC = 0.86 and partial ROC enrichment at 5% of 0.48) from all the other pockets of the protein. Validated on two external collection sets, including 445 and 347 crystallographic peptide-protein complexes, our LDA-based model could be effective for running further peptide-protein virtual screening campaigns.
Affiliation(s)
- Daniela Trisciuzzi
- Department of Pharmacy-Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", 70125 Bari, Italy; Molecular Discovery Ltd., Kinetic Business Centre, Theobald Street, Elstree, Borehamwood, Hertfordshire WD6 4PJ, United Kingdom
- Lydia Siragusa
- Molecular Horizon s.r.l., Via Montelino, 30, 06084 Bettona (PG), Italy; Molecular Discovery Ltd., Kinetic Business Centre, Theobald Street, Elstree, Borehamwood, Hertfordshire WD6 4PJ, United Kingdom
- Massimo Baroni
- Molecular Discovery Ltd., Kinetic Business Centre, Theobald Street, Elstree, Borehamwood, Hertfordshire WD6 4PJ, United Kingdom
- Gabriele Cruciani
- Department of Chemistry, Biology and Biotechnology, Università degli Studi di Perugia, via Elce di Sotto, 8, 06123 Perugia (PG), Italy
- Orazio Nicolotti
- Department of Pharmacy-Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", 70125 Bari, Italy
|
47
|
Tsuchiya Y, Yamamori Y, Tomii K. Protein-protein interaction prediction methods: from docking-based to AI-based approaches. Biophys Rev 2022; 14:1341-1348. [PMID: 36570321 PMCID: PMC9759050 DOI: 10.1007/s12551-022-01032-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/30/2022] [Accepted: 11/30/2022] [Indexed: 12/23/2022] Open
Abstract
Protein-protein interactions (PPIs), such as protein-protein inhibitor, antibody-antigen complex, and supercomplexes, play diverse and important roles in cells. Recent advances in structural analysis methods, including cryo-EM, for the determination of protein complex structures are remarkable. Nevertheless, much room remains for improvement and utilization of computational methods to predict PPIs because of the large number and great diversity of unresolved complex structures. This review introduces a wide array of computational methods, including our own, for estimating PPIs, including antibody-antigen interactions, offering both historical and forward-looking perspectives.
Affiliation(s)
- Yuko Tsuchiya
- Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-Ku, Tokyo, 135-0064 Japan
- Yu Yamamori
- Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-Ku, Tokyo, 135-0064 Japan
- Kentaro Tomii
- Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-Ku, Tokyo, 135-0064 Japan
|
48
|
Singh D, Roy J. A large-scale benchmark study of tools for the classification of protein-coding and non-coding RNAs. Nucleic Acids Res 2022; 50:12094-12111. [PMID: 36420898 PMCID: PMC9757047 DOI: 10.1093/nar/gkac1092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2022] [Revised: 10/22/2022] [Accepted: 10/28/2022] [Indexed: 11/27/2022] Open
Abstract
Identification of protein-coding and non-coding transcripts is paramount for understanding their biological roles. Computational approaches have been addressing this task for over a decade; however, generalized and high-performance models are still unreliable. This benchmark study assessed the performance of 24 tools producing >55 models on the datasets covering a wide range of species. We have collected 135 small and large transcriptomic datasets from existing studies for comparison and identified the potential bottlenecks hampering the performance of current tools. The key insights of this study include lack of standardized training sets, reliance on homogeneous training data, gradual changes in annotated data, lack of augmentation with homology searches, the presence of false positives and negatives in datasets and the lower performance of end-to-end deep learning models. We also derived a new dataset, RNAChallenge, from the benchmark considering hard instances that may include potential false alarms. The best and least well performing models under- and overfit the dataset, respectively, thereby serving a dual purpose. For computational approaches, it will be valuable to develop accurate and unbiased models. The identification of false alarms will be of interest for genome annotators, and experimental study of hard RNAs will help to untangle the complexity of the RNA world.
Affiliation(s)
- Dalwinder Singh
- To whom correspondence should be addressed. Tel: +91 172 5221206;
- Joy Roy
- Correspondence may also be addressed to Joy Roy.
|
49
|
Xu C, Ma D, Ding Q, Zhou Y, Zheng H. PlantPhoneDB: A manually curated pan-plant database of ligand-receptor pairs infers cell-cell communication. Plant Biotechnol J 2022; 20:2123-2134. [PMID: 35842742 PMCID: PMC9616517 DOI: 10.1111/pbi.13893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2021] [Revised: 07/10/2022] [Accepted: 07/13/2022] [Indexed: 06/15/2023]
Abstract
Ligand-receptor pairs play important roles in cell-cell communication for multicellular organisms in response to environmental cues. Recently, the emergence of single-cell RNA-sequencing (scRNA-seq) has provided unprecedented opportunities to investigate cellular communication based on ligand-receptor expression. However, so far, no reliable ligand-receptor interaction database is available for plant species. In this study, we developed PlantPhoneDB (https://jasonxu.shinyapps.io/PlantPhoneDB/), a pan-plant database comprising a large number of high-confidence ligand-receptor pairs manually curated from seven resources. We also developed a PlantPhoneDB R package, which provides four optional scoring approaches that calculate interaction scores of ligand-receptor pairs between cell types, as well as visualization functions to present analysis results. At the PlantPhoneDB web interface, the processed datasets and results can be searched, browsed, and downloaded. To uncover novel cell-cell communication events in plants, we applied the PlantPhoneDB R package to the GSE121619 dataset to infer significant cell-cell interactions of heat-shocked root cells in Arabidopsis thaliana. As a result, PlantPhoneDB predicted the actively communicating AT1G28290-AT2G14890 ligand-receptor pair in the atrichoblast-cortex cell pair in Arabidopsis thaliana. Importantly, the downstream target genes of this ligand-receptor pair were significantly enriched in the ribosome pathway, which facilitates plant adaptation to environmental changes. In conclusion, PlantPhoneDB provides researchers with integrated resources to infer cell-cell communication from scRNA-seq datasets.
Affiliation(s)
- Chaoqun Xu
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
- Dongna Ma
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
- Qiansu Ding
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
- Ying Zhou
- National Institute for Data Science in Health and Medicine, School of Medicine, Xiamen University, Xiamen, China
- Hai-Lei Zheng
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
|
50
|
Chang L, Mondal A, Perez A. Towards rational computational peptide design. Front Bioinform 2022; 2:1046493. [PMID: 36338806 PMCID: PMC9634169 DOI: 10.3389/fbinf.2022.1046493] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/16/2022] [Accepted: 10/11/2022] [Indexed: 11/16/2022] Open
Abstract
Peptides are prevalent in biology, mediating as many as 40% of protein-protein interactions, and are involved in other cellular functions such as transport and signaling. Their ability to bind with high specificity makes them promising therapeutic agents with properties intermediate between small molecules and large biologics. Beyond their biological role, peptides can be programmed to self-assemble, and they are already being used for functions as diverse as oligonucleotide delivery, tissue regeneration, or as drugs. However, the transient nature of their interactions has limited the number of structures and the knowledge of binding affinities available, and their flexible nature has limited the success of computational pipelines that predict the structures and affinities of these molecules. Fortunately, recent advances in experimental and computational pipelines are creating new opportunities for this field. We are starting to see promising predictions of complex structures and of thermodynamic and kinetic properties. We believe that in the coming years this will lead to robust rational peptide design pipelines with success similar to that of pipelines applied to small-molecule drug discovery.
Affiliation(s)
- Liwei Chang
- Department of Chemistry, University of Florida, Gainesville, FL, United States; Quantum Theory Project, University of Florida, Gainesville, FL, United States
- Arup Mondal
- Department of Chemistry, University of Florida, Gainesville, FL, United States; Quantum Theory Project, University of Florida, Gainesville, FL, United States
- Alberto Perez
- Department of Chemistry, University of Florida, Gainesville, FL, United States; Quantum Theory Project, University of Florida, Gainesville, FL, United States. *Correspondence: Alberto Perez
|