Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Shahid Malik M, Ou YY. Integrating Pre-Trained protein language model and multiple window scanning deep learning networks for accurate identification of secondary active transporters in membrane proteins. Methods 2023;220:11-20. [PMID: 37871661 DOI: 10.1016/j.ymeth.2023.10.008] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 08/07/2023] [Revised: 10/04/2023] [Accepted: 10/09/2023] [Indexed: 10/25/2023] Open

Cited by Other Article(s)

Le VT, Yuune JPT, Vu TTP, Malik MS, Ou YY. DeepCR: predicting cytokine receptor proteins through pretrained language models and deep learning networks. J Biomol Struct Dyn 2025:1-18. [PMID: 40448687 DOI: 10.1080/07391102.2025.2512448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2025] [Accepted: 05/21/2025] [Indexed: 06/02/2025]

Abstract

Cytokine receptors play a pivotal role in mediating the immune response and are critical in cytokine storms, which underlie the pathogenesis of conditions such as acute respiratory distress syndrome (ARDS) and autoimmune disorders. Identifying cytokine receptors is essential for understanding their biological functions, exploring therapeutic targets, and guiding clinical interventions. Traditional biochemical methods to identify cytokine receptors are labor-intensive, costly, and time-consuming, prompting the need for more efficient alternatives. Recent advances in computational biology have enabled the use of machine learning to classify cytokine receptor proteins. Most existing approaches have focused on homologous features and protein composition to classify cytokine families, but no dedicated studies have been conducted on cytokine receptor proteins. This gap presents an opportunity to develop a method specifically for classifying cytokine receptors among other membrane proteins. In this study, we present a novel classification framework combining pre-trained language models (PLMs) with a multi-window convolutional neural network (mCNN) architecture for the fast and accurate identification of cytokine receptor proteins. PLMs, such as ProtTrans and ESM variants, capture biochemical context directly from raw protein sequences, while mCNN efficiently extracts local and global sequence patterns using convolutional layers with varying window sizes. Our model achieved an AUC of 0.96 in training and AUCs of 0.97 and 0.93 in two independent tests, demonstrating its effectiveness in distinguishing cytokine receptors from non-cytokine receptor proteins. By eliminating the need for manual feature extraction, this approach offers a robust and scalable solution for protein classification, paving the way for its application in drug discovery and understanding cytokine-mediated diseases.
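
The multi-window idea in this abstract (convolutional filters of several window sizes run over per-residue embeddings, then pooled and concatenated) can be sketched roughly as follows. This is a minimal illustration in plain Python and not the authors' implementation: the function names, the ReLU plus global max pooling choice, the window sizes 2, 4 and 8, and the random stand-in embeddings and filters are all assumptions made for the example. A real model would take embeddings from a PLM such as ProtTrans and learn the filter weights.

```python
import random

def conv1d_max(embeddings, kernel, bias=0.0):
    """Slide one filter along the sequence; return its max ReLU activation.

    embeddings: list of L residue vectors, each of length D
    kernel:     list of w weight vectors (window size w), each of length D
    """
    window = len(kernel)
    best = 0.0  # ReLU floor: negative activations are clipped to zero
    for start in range(len(embeddings) - window + 1):
        total = bias
        for offset in range(window):
            residue = embeddings[start + offset]
            weights = kernel[offset]
            total += sum(r * w for r, w in zip(residue, weights))
        best = max(best, total)
    return best

def multi_window_features(embeddings, filters_by_window):
    """Concatenate max-pooled outputs of filters with different window sizes."""
    features = []
    for window in sorted(filters_by_window):
        for kernel in filters_by_window[window]:
            features.append(conv1d_max(embeddings, kernel))
    return features

def random_filters(window_sizes, n_filters, dim, rng):
    """Hypothetical untrained filters, used here only to exercise the code."""
    return {
        w: [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(w)]
            for _ in range(n_filters)]
        for w in window_sizes
    }

rng = random.Random(0)
seq_len, dim = 12, 4  # toy sizes; real PLM embeddings are much wider
embeddings = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(seq_len)]
filters = random_filters([2, 4, 8], n_filters=3, dim=dim, rng=rng)
features = multi_window_features(embeddings, filters)
print(len(features))  # one feature per (window size, filter) pair
```

Small windows respond to short local motifs and larger windows to longer stretches of sequence; the concatenated feature vector would then feed a classifier layer.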

Arslan N, Eggeling R, Reuter B, Van Leathem K, Pingarilho M, Gomes P, Sönnerborg A, Kaiser R, Zazzi M, Pfeifer N. HIV multidrug class resistance prediction with a time sliding anchor approach. Bioinformatics Advances 2025;5:vbaf099. [PMID: 40421422 PMCID: PMC12104520 DOI: 10.1093/bioadv/vbaf099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2025] [Revised: 04/16/2025] [Accepted: 04/25/2025] [Indexed: 05/28/2025]

Affiliation(s)

Nurhan Arslan: Methods in Medical Informatics, Department of Computer Science, University of Tuebingen, Tuebingen 72076, Germany; Institute for Bioinformatics and Medical Informatics (IBMI), University of Tuebingen, Tuebingen 72076, Germany
Ralf Eggeling: Methods in Medical Informatics, Department of Computer Science, University of Tuebingen, Tuebingen 72076, Germany; Institute for Bioinformatics and Medical Informatics (IBMI), University of Tuebingen, Tuebingen 72076, Germany
Bernhard Reuter: Methods in Medical Informatics, Department of Computer Science, University of Tuebingen, Tuebingen 72076, Germany; Institute for Bioinformatics and Medical Informatics (IBMI), University of Tuebingen, Tuebingen 72076, Germany
Kristel Van Leathem: Laboratory of Clinical and Epidemiological Virology, Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, KU Leuven, Leuven 3000, Belgium
Marta Pingarilho: Global Health and Tropical Medicine, GHTM, Associate Laboratory in Translation and Innovation towards Global Health, LA-REAL, Instituto de Higiene e Medicina Tropical, IHMT, Universidade NOVA de Lisboa, Lisbon 1349-008, Portugal
Perpétua Gomes: Laboratório de Biologia Molecular, LMCBM, SPC, Unidade Local de Saúde Lisboa Ocidental, Hospital Egas Moniz, Caparica 2829-511, Portugal; Egas Moniz Center for Interdisciplinary Research (CiiEM), Egas Moniz School of Health and Science, Lisbon, Almada 1349-019, Portugal
Anders Sönnerborg: Department of Medicine Huddinge, Karolinska University Hospital, Stockholm 14186, Sweden; Division of Infectious Diseases, Department of Clinical Microbiology, Karolinska Institutet, Stockholm 14152, Sweden
Rolf Kaiser: Institute of Virology, Faculty of Medicine, University Hospital Cologne, University of Cologne, Cologne 50935, Germany
Maurizio Zazzi: Department of Medical Biotechnology, University of Siena, Siena 53100, Italy
Nico Pfeifer: Methods in Medical Informatics, Department of Computer Science, University of Tuebingen, Tuebingen 72076, Germany; Institute for Bioinformatics and Medical Informatics (IBMI), University of Tuebingen, Tuebingen 72076, Germany

Malik M, Le VT, Ou YY. NA_mCNN: Classification of Sodium Transporters in Membrane Proteins by Integrating Multi-Window Deep Learning and ProtTrans for Their Therapeutic Potential. J Proteome Res 2025;24:2324-2335. [PMID: 40193588 PMCID: PMC12053934 DOI: 10.1021/acs.jproteome.4c00884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2024] [Revised: 01/01/2025] [Accepted: 03/19/2025] [Indexed: 04/09/2025]

Chuang CC, Liu YC, Ou YY. DeepEpiIL13: Deep Learning for Rapid and Accurate Prediction of IL-13-Inducing Epitopes Using Pretrained Language Models and Multiwindow Convolutional Neural Networks. ACS Omega 2025;10:9675-9683. [PMID: 40092768 PMCID: PMC11904640 DOI: 10.1021/acsomega.4c10960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2024] [Revised: 02/12/2025] [Accepted: 02/14/2025] [Indexed: 03/19/2025]

Abstract

Accurate prediction of interleukin-13 (IL-13)-inducing epitopes is crucial for advancing targeted therapies against allergic inflammation, the cytokine storm associated with severe COVID-19, and related disorders. Current epitope prediction methods, however, often exhibit limitations in efficiency and accuracy. To address this, we introduce DeepEpiIL13, a novel deep learning framework that uniquely synergizes pretrained language models with multiwindow convolutional neural networks (CNNs) for the rapid and accurate identification of IL-13-inducing epitopes from protein sequences. DeepEpiIL13 leverages high-dimensional embeddings generated by the pretrained language model, which capture rich contextual information from protein sequences. These embeddings are then processed by a multiwindow CNN architecture, enabling the effective exploration of both local and global sequence patterns pertinent to IL-13 induction. The proposed DeepEpiIL13 approach underwent rigorous evaluation using both benchmark data sets and an independent SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) data set. Results demonstrate that DeepEpiIL13 achieves superior performance compared with traditional methods. On the benchmark data set, DeepEpiIL13 attained a Matthews correlation coefficient (MCC) of 0.52 and an area under the receiver operating characteristic curve (AUC) of 0.86. Notably, when assessed on the independent SARS-CoV-2 data set, DeepEpiIL13 exhibited remarkable robustness, achieving an MCC of 0.63 and an AUC of 0.92. These metrics underscore the enhanced predictive capability and robust applicability of DeepEpiIL13, particularly within the context of COVID-19 research and related viral infections. This study presents DeepEpiIL13 as a powerful and efficient deep learning framework for accurate epitope prediction. By offering significant improvement in performance and robustness, DeepEpiIL13 provides new and promising avenues for the development of epitope-based vaccines and immunotherapies specifically targeting IL-13-mediated disorders. The successful and rapid identification of IL-13-inducing epitopes using DeepEpiIL13 paves the way for novel therapeutic interventions against a range of conditions, including allergic diseases, inflammatory conditions, and severe viral infections such as COVID-19, with potential for a significant impact on public health outcomes.

Chuang CC, Liu YC, Jhang WE, Wei SS, Ou YY. RAG_MCNNIL6: A Retrieval-Augmented Multi-Window Convolutional Network for Accurate Prediction of IL-6 Inducing Epitopes. J Chem Inf Model 2025;65:2685-2694. [PMID: 39967508 PMCID: PMC11898070 DOI: 10.1021/acs.jcim.4c02144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2024] [Revised: 01/20/2025] [Accepted: 02/11/2025] [Indexed: 02/20/2025]

Shah SMA, Rafi M, Malik MS, Malik SA, Ou YY. mCNN-glucose: Identifying families of glucose transporters using a deep convolutional neural network based on multiple-scanning windows. Int J Biol Macromol 2025;294:139522. [PMID: 39761890 DOI: 10.1016/j.ijbiomac.2025.139522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2024] [Revised: 01/01/2025] [Accepted: 01/03/2025] [Indexed: 01/11/2025]

Le VT, Malik MS, Lin YJ, Liu YC, Chang YY, Ou YY. ATP_mCNN: Predicting ATP binding sites through pretrained language models and multi-window neural networks. Comput Biol Med 2025;185:109541. [PMID: 39653625 DOI: 10.1016/j.compbiomed.2024.109541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2024] [Revised: 11/20/2024] [Accepted: 12/05/2024] [Indexed: 01/26/2025]

Zhang H, Wei Y, Saravanan KM. Artificial intelligence and computer-aided drug discovery: Methods development and application. Methods 2025;234:294-295. [PMID: 39826658 DOI: 10.1016/j.ymeth.2025.01.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2025] Open

Malik M, Chang YY, Liu YC, Le VT, Ou YY. MCNN_MC: Computational Prediction of Mitochondrial Carriers and Investigation of Bongkrekic Acid Toxicity Using Protein Language Models and Convolutional Neural Networks. J Chem Inf Model 2024;64:9125-9134. [PMID: 39133248 PMCID: PMC11683872 DOI: 10.1021/acs.jcim.4c00961] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Revised: 07/26/2024] [Accepted: 07/29/2024] [Indexed: 08/13/2024]

Abstract

Mitochondrial carriers (MCs) are essential proteins that transport metabolites across mitochondrial membranes and play a critical role in cellular metabolism. ADP/ATP (adenosine diphosphate/adenosine triphosphate) is one of the most important carriers as it contributes to cellular energy production and is susceptible to the powerful toxin bongkrekic acid. This toxin has claimed several lives; for example, a recent foodborne outbreak in Taipei, Taiwan, caused four deaths and sickened 30 people. Bongkrekic acid poisoning has been a long-standing problem in Indonesia, with reports as early as 1895 detailing numerous deaths from contaminated fermented coconut cakes. In bioinformatics, significant advances have been made in understanding biological processes through computational methods; however, no established computational method has been developed for identifying mitochondrial carriers. We propose a computational bioinformatics approach for predicting MCs from a broader class of secondary active transporters with a focus on the ADP/ATP carrier and its interaction with bongkrekic acid. The proposed model combines protein language models (PLMs) with multiwindow scanning convolutional neural networks (mCNNs). While PLM embeddings capture contextual information within proteins, mCNN scans multiple windows to identify potential binding sites and extract local features. Our results show 96.66% sensitivity, 95.76% specificity, 96.12% accuracy, 91.83% Matthews correlation coefficient (MCC), 94.63% F1-Score, and 98.55% area under the curve (AUC). The results demonstrate the effectiveness of the proposed approach in predicting MCs and elucidating their functions, particularly in the context of bongkrekic acid toxicity. This study presents a valuable approach for identifying novel mitochondrial complexes, characterizing their functional roles, and understanding mitochondrial toxicology mechanisms. Our findings, which use computational methods to improve our understanding of cellular processes and drug-target interactions, contribute to the development of therapeutic strategies for mitochondrial disorders and to reducing the devastating effects of bongkrekic acid poisoning.
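
For readers comparing the figures reported across these abstracts, the threshold-based metrics (sensitivity, specificity, accuracy, F1 and MCC) all follow from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and are not taken from any of the cited studies. AUC is omitted because it needs ranked prediction scores rather than counts.

```python
import math

def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0

def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion counts."""
    sensitivity = safe_div(tp, tp + fn)   # recall on the positive class
    specificity = safe_div(tn, tn + fp)   # recall on the negative class
    precision = safe_div(tp, tp + fp)
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    # MCC lies in [-1, 1]; it stays informative on imbalanced data sets
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, mcc_denominator)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }

# Hypothetical counts, chosen only to exercise the function
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Since MCC ranges from -1 to 1, a value quoted as a percentage (such as 91.83%) corresponds to 0.9183 on the decimal scale used in other entries of this list.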

Malik MS, Le VT, Shah SMA, Ou YY. MCNN-AAPT: accurate classification and functional prediction of amino acid and peptide transporters in secondary active transporters using protein language models and multi-window deep learning. J Biomol Struct Dyn 2024:1-10. [PMID: 39576667 DOI: 10.1080/07391102.2024.2431664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Accepted: 04/23/2024] [Indexed: 02/28/2025]

Abstract

Secondary active transporters play a crucial role in cellular physiology by facilitating the movement of molecules across cell membranes. Identifying the functional classes of these transporters, particularly amino acid and peptide transporters, is essential for understanding their involvement in various physiological processes and disease pathways, including cancer. This study aims to develop a robust computational framework that integrates pre-trained protein language models and deep learning techniques to classify amino acid and peptide transporters within the secondary active transporter (SAT) family and predict their functional association with solute carrier (SLC) proteins. The study leverages a comprehensive dataset of 448 secondary active transporters, including 36 solute carrier proteins, obtained from UniProt and the Transporter Classification Database (TCDB). Three state-of-the-art protein language models, ProtTrans, ESM-1b, and ESM-2, are evaluated within a deep learning neural network architecture that employs a multi-window scanning technique to capture local and global sequence patterns. The ProtTrans-based feature set demonstrates exceptional performance, achieving a classification accuracy of 98.21% with 87.32% sensitivity and 99.76% specificity for distinguishing amino acid and peptide transporters from other SATs. Furthermore, the model maintains strong predictive ability for SLC proteins, with an overall accuracy of 88.89% and a Matthews Correlation Coefficient (MCC) of 0.7750. This study showcases the power of integrating pre-trained protein language models and deep learning techniques for the functional classification of secondary active transporters and the prediction of associated solute carrier proteins. The findings have significant implications for drug development, disease research, and the broader understanding of cellular transport mechanisms.

Zhang H, Wei Y, Saravanan KM. Artificial intelligence and computer-aided drug discovery: Methods development and application. Methods 2024;231:55-56. [PMID: 39265960 DOI: 10.1016/j.ymeth.2024.09.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2024] Open

Le VT, Tseng YH, Liu YC, Malik MS, Ou YY. VesiMCNN: Using pre-trained protein language models and multiple window scanning convolutional neural networks to identify vesicular transport proteins. Int J Biol Macromol 2024;280:136048. [PMID: 39332561 DOI: 10.1016/j.ijbiomac.2024.136048] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/07/2024] [Revised: 09/16/2024] [Accepted: 09/25/2024] [Indexed: 09/29/2024]

Le VT, Malik MS, Tseng YH, Lee YC, Huang CI, Ou YY. DeepPLM_mCNN: An approach for enhancing ion channel and ion transporter recognition by multi-window CNN based on features from pre-trained language models. Comput Biol Chem 2024;110:108055. [PMID: 38555810 DOI: 10.1016/j.compbiolchem.2024.108055] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/24/2023] [Revised: 02/28/2024] [Accepted: 03/19/2024] [Indexed: 04/02/2024]