1. Hajim WI, Zainudin S, Daud KM, Alheeti K. Golden eagle optimized CONV-LSTM and non-negativity-constrained autoencoder to support spatial and temporal features in cancer drug response prediction. PeerJ Comput Sci 2024; 10:e2520. [PMID: 39896419 PMCID: PMC11784781 DOI: 10.7717/peerj-cs.2520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Accepted: 10/25/2024] [Indexed: 02/04/2025]
Abstract
Advanced machine learning (ML) and deep learning (DL) methods have recently been applied to Drug Response Prediction (DRP). These models use details from genomic profiles, such as extensive drug screening data and cell line data, to predict drug response. Of the two, DL-based approaches have learned such features better. However, prior knowledge, such as pathway data, is sometimes discarded as irrelevant because drug response datasets are multidimensional and noisy. Optimized feature learning and extraction processes are suggested to handle this problem. First, the noise and class imbalance problems must be tackled to avoid low identification accuracy, long prediction times, and poor applicability. This article applies the Non-Negativity-Constrained Auto Encoder (NNCAE) network to tackle these issues, enhance the adaptive search for the optimal sliding-window size, and ensure that deep network architectures are adept at learning the vital hidden features. The NNCAE methodology is applied after the standard pre-processing procedures to handle the noise and class imbalance problems. The resulting class-balanced, noise-removed input features are then used to train the proposed hybrid classifier. The classification model, Golden Eagle Optimization-based Convolutional Long Short-Term Memory neural networks (GEO-Conv-LSTM), is assembled by integrating Convolutional Neural Network (CNN) and LSTM models, with parameter tuning performed by the GEO algorithm. Evaluations are conducted on two large datasets from the Genomics of Drug Sensitivity in Cancer (GDSC) repository, and the proposed NNCAE-GEO-Conv-LSTM-based approach achieved accuracies of 96.99% and 97.79%, respectively, with reduced processing time and error rate for the DRP problem.
Affiliation(s)
- Wesam Ibrahim Hajim
  - Department of Applied Geology, College of Sciences, University of Tikrit, Tikrit, Salah ad Din, Iraq
  - Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
- Suhaila Zainudin
  - Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
- Kauthar Mohd Daud
  - Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
- Khattab Alheeti
  - Department of Computer Networking Systems, College of Computer Sciences and Information Technology, University of Anbar, Ramadi, Al Anbar, Iraq
2. Saranya KR, Vimina ER. DRN-CDR: A cancer drug response prediction model using multi-omics and drug features. Comput Biol Chem 2024; 112:108175. [PMID: 39191166 DOI: 10.1016/j.compbiolchem.2024.108175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 08/09/2024] [Accepted: 08/14/2024] [Indexed: 08/29/2024]
Abstract
Cancer drug response (CDR) prediction is an important area of research that aims to personalize cancer therapy, optimizing treatment plans for maximum effectiveness while minimizing potential negative effects. Despite advances in deep learning techniques, the effective integration of multi-omics data for drug response prediction remains challenging. In this paper, a regression method using a Deep ResNet for CDR (DRN-CDR) prediction is proposed. We aim to explore the potential of considering cancer genes alone in drug response prediction. Multi-omics data (gene expression, mutation, and methylation data) were integrated with the molecular structural information of drugs to predict the IC50 values of drugs. Drug features are extracted using a Uniform Graph Convolution Network, while cell line features are extracted using a combination of a Convolutional Neural Network and Fully Connected Networks. These features are then concatenated and fed into a deep ResNet to predict IC50 values for drug-cell line pairs. The proposed method yielded a higher Pearson's correlation coefficient (rp) of 0.7938 and the lowest Root Mean Squared Error (RMSE) of 0.92 when compared with the similar methods tCNNS, MOLI, DeepCDR, TGSA, NIHGCN, DeepTTA, GraTransDRP and TSGCNN. Further, when the model was extended to a classification problem to categorize drugs as sensitive or resistant, it achieved AUC and AUPR measures of 0.7623 and 0.7691, respectively. Drugs such as Tivozanib, SNX-2112, CGP-60474, PHA-665752, and Foretinib exhibited low median IC50 values and were found to be effective anti-cancer drugs. Case studies with different TCGA cancer types also revealed the effectiveness of drugs including SNX-2112, CGP-60474, Foretinib, Cisplatin, and Vinblastine. This consistent pattern strongly suggests the effectiveness of the model in predicting CDR.
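For readers unfamiliar with the two regression metrics quoted in this abstract, the following is a minimal sketch of how Pearson's correlation coefficient and RMSE are typically computed from observed and predicted values. The function names and the toy numbers are illustrative assumptions, not taken from the DRN-CDR paper or its code.

```python
import math

def pearson_r(y_true, y_pred):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical IC50-like values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]

r_value = pearson_r(observed, predicted)
error = rmse(observed, predicted)
```

In this toy case the predictions are offset from the observations by a constant 0.5, so the correlation is perfect while the RMSE is 0.5. This is why the two metrics are usually reported together: correlation alone does not reveal a systematic bias.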
Affiliation(s)
- K R Saranya
  - Department of Computer Science and IT, School of Computing, Amrita Vishwa Vidyapeetham, Kochi Campus, India
- E R Vimina
  - Department of Computer Science and IT, School of Computing, Amrita Vishwa Vidyapeetham, Kochi Campus, India
3. Jeyananthan P. Performance comparison between multi-level gene expression data in cancer subgroup classification. Pathol Res Pract 2024; 260:155419. [PMID: 38955118 DOI: 10.1016/j.prp.2024.155419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 06/06/2024] [Accepted: 06/19/2024] [Indexed: 07/04/2024]
Abstract
Cancer is a serious disease that can affect various parts of the body such as the breast, colon, lung or stomach. Each of these cancers has its own treatment-dependent historical subgroups. Hence, correctly identifying the cancer subgroup is almost as important as diagnosing the cancer in time. This is still a challenging task, and a system with the highest possible accuracy is essential. Current research is moving towards analyzing the gene expression data of cancer patients for various purposes, including biomarker identification and the study of differentially expressed genes, using gene expression data measured at a single level (selected from different gene levels including genome, transcriptome or translation). However, previous studies showed that the information carried by one level of gene expression differs from that carried by another. This shows the importance of integrating multi-level omics data in these studies. Hence, this study uses tumor gene expression data measured at various gene levels, along with the integration of those data, in the subgroup classification of nine different cancers. This is a comprehensive analysis in which four types of gene expression data (transcriptome, miRNA, methylation and proteome) are used for subgrouping, and the performance of the resulting models is compared to reveal the best model.
4. Wei L, Zou Q, Zeng X. Editorial: Artificial intelligence in drug discovery and development. Methods 2024; 226:133-137. [PMID: 38582311 DOI: 10.1016/j.ymeth.2024.04.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2024] Open
Affiliation(s)
- Leyi Wei
  - Faculty of Applied Sciences, Macao Polytechnic University, Macao 999078, China; School of Software, Shandong University, Jinan 250101, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
5. Zhang S, Tian X, Chen C, Su Y, Huang W, Lv X, Chen C, Li H. AIGO-DTI: Predicting Drug-Target Interactions Based on Improved Drug Properties Combined with Adaptive Iterative Algorithms. J Chem Inf Model 2024; 64:4373-4384. [PMID: 38743013 DOI: 10.1021/acs.jcim.4c00584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Artificial intelligence-based methods for predicting drug-target interactions (DTIs) aim to explore reliable drug candidate targets rapidly and cost-effectively to accelerate the drug development process. However, current methods are often limited by the topological regularities of drug molecules, making them difficult to generalize to a broader chemical space. Additionally, the use of similarity to measure DTI network links often introduces noise, leading to false DTI relationships and affecting the prediction accuracy. To address these issues, this study proposes an Adaptive Iterative Graph Optimization (AIGO)-DTI prediction framework. This framework integrates atomic cluster information and enhances molecular features through the design of functional group prompts and graph encoders, optimizing the construction of DTI association networks. Furthermore, the optimization of graph structure is transformed into a node similarity learning problem, utilizing multihead similarity metric functions to iteratively update the network structure to improve the quality of DTI information. Experimental results demonstrate the outstanding performance of AIGO-DTI on multiple public data sets and label reversal data sets. Case studies, molecular docking, and existing research validate its effectiveness and reliability. Overall, the method proposed in this study can construct comprehensive and reliable DTI association network information, providing new graphing and optimization strategies for DTI prediction, which contribute to efficient drug development and reduce target discovery costs.
Affiliation(s)
- Sizhe Zhang
  - College of Software, Xinjiang University, Urumqi, 830046 Xinjiang, China
- Xuecong Tian
  - College of Information Science and Engineering, Xinjiang University, Urumqi, 830046 Xinjiang, China
- Chen Chen
  - College of Information Science and Engineering, Xinjiang University, Urumqi, 830046 Xinjiang, China
- Ying Su
  - College of Information Science and Engineering, Xinjiang University, Urumqi, 830046 Xinjiang, China
- Wanhua Huang
  - College of Information Science and Engineering, Xinjiang University, Urumqi, 830046 Xinjiang, China
- Xiaoyi Lv
  - College of Software, Xinjiang University, Urumqi, 830046 Xinjiang, China
- Cheng Chen
  - College of Software, Xinjiang University, Urumqi, 830046 Xinjiang, China
- Hongyi Li
  - Xinjiang University, Urumqi, 830046 Xinjiang, China
6. Chen S, Li M, Semenov I. MFA-DTI: Drug-target interaction prediction based on multi-feature fusion adopted framework. Methods 2024; 224:79-92. [PMID: 38430967 DOI: 10.1016/j.ymeth.2024.02.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Revised: 02/16/2024] [Accepted: 02/23/2024] [Indexed: 03/05/2024] Open
Abstract
The identification of drug-target interactions (DTI) is a valuable step in the drug discovery and repositioning process. However, traditional laboratory experiments are time-consuming and expensive. Computational methods have streamlined research to determine DTIs. The application of deep learning methods has significantly improved the prediction performance for DTIs. Modern deep learning methods can leverage multiple sources of information, including sequence data that contains biological structural information, and interaction data. While useful, these methods cannot be effectively applied to each type of information individually (e.g., chemical structure and interaction network) and do not take into account the specificity of DTI data such as low- or zero-interaction biological entities. To overcome these limitations, we propose a method called MFA-DTI (Multi-feature Fusion Adopted framework for DTI). MFA-DTI consists of three modules: an interaction graph learning module that processes the interaction network to generate interaction vectors, a chemical structure learning module that extracts features from the chemical structure, and a fusion module that combines these features for the final prediction. To validate the performance of MFA-DTI, we conducted experiments on six public datasets under different settings. The results indicate that the proposed method is highly effective in various settings and outperforms state-of-the-art methods.
Affiliation(s)
- Siqi Chen
  - School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing, 400074, China
- Minghui Li
  - Beidahuang Industry Group General Hospital, Harbin, 150006, China
- Ivan Semenov
  - College of Intelligence and Computing, Tianjin University, Tianjin, 300072, China
7. Ceskoutsé RFT, Bomgni AB, Gnimpieba Zanfack DR, Agany DDM, Bouetou Bouetou T, Gnimpieba Zohim E. Sub-clustering based recommendation system for stroke patient: Identification of a specific drug class for a given patient. Comput Biol Med 2024; 171:108117. [PMID: 38335820 PMCID: PMC10981530 DOI: 10.1016/j.compbiomed.2024.108117] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/23/2023] [Revised: 01/29/2024] [Accepted: 02/04/2024] [Indexed: 02/12/2024]
Abstract
Stroke is one of the leading causes of death worldwide. Previous studies have explored machine learning techniques for early detection of stroke patients using content-based recommendation systems. However, these models often struggle with timely detection of medications, which can be critical for patient management and decision-making regarding the prescription of new drugs. In this study, we developed a content-based recommendation model using three machine learning algorithms, Gaussian Mixture Model (GMM), Affinity Propagation (AP), and K-Nearest Neighbors (KNN), to aid Healthcare Professionals (HCP) in quickly detecting medications based on the symptoms of a patient with stroke. Our model focused on three classes of drugs: antihypertensive, anticoagulant, and fibrate. Each machine learning algorithm was used to accomplish a specific task, thereby reducing the partial search space and computational cost while accurately detecting a primary drug class without loss of precision or accuracy. Our proposed model, called CRGANNC (Clustering Recommendation Gaussian Affinity Nearest Neighbors Classifier), effectively addresses the sparsity and scalability issues faced by content-based recommendation models. The CRGANNC model dynamically partitions clusters into a variable number of sub-clusters based on the group, can diagnose healthy, sick, and at-risk patients, and recommends drugs to the HCP. In addition to our analysis, we developed a semi-artificial dataset with new features such as weakness, dizziness, headache, nausea, and vomiting, using a pipeline. This dataset serves as a valuable resource for researchers in the sensitive domain of stroke, providing a starting point for building and testing models when real data is often restricted. Our work not only contributes to the development of predictive models for stroke but also establishes a framework for creating similar datasets in other sensitive domains, accelerating research efforts and improving patient care.
Our experiments were conducted on our dataset consisting of 9691 patient records, with 1206 records for stroke attacks and 8485 healthy patients. The CRGANNC model achieved an average precision of 0.98, recall of 0.95 and F1-score of 0.96 across all three drug classes. Furthermore, our model demonstrated a significant improvement in computational efficiency compared to existing content-based recommendation models, reducing the processing time by 25.80%. These results indicate the effectiveness of our model in accurately detecting medications for stroke patients based on their symptoms.
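As a quick sanity check on the figures quoted above, the sketch below recomputes the F1-score as the harmonic mean of the reported precision and recall, and the class balance from the reported record counts. The function name is an illustrative assumption. Note also that the abstract reports values averaged across three drug classes, so the harmonic mean of averaged precision and recall is only an approximation of an averaged F1-score, not a reproduction of the authors' calculation.

```python
def f1_from_precision_recall(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values reported in the abstract (averages across the three drug classes).
reported_precision = 0.98
reported_recall = 0.95
approx_f1 = f1_from_precision_recall(reported_precision, reported_recall)

# Record counts reported in the abstract.
stroke_records = 1206
healthy_records = 8485
total_records = stroke_records + healthy_records
stroke_fraction = stroke_records / total_records
```

Under these assumptions the approximate F1 rounds to the reported 0.96, and the two record counts sum to the reported 9691, with stroke records making up roughly an eighth of the dataset. That imbalance is one reason precision and recall are more informative here than plain accuracy.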
Affiliation(s)
- Ribot Fleury T Ceskoutsé
  - Ecole Nationale Supérieure Polytechnique, University of Yaounde I, P.O. Box. 8390, Yaoundé, Cameroon
- Alain Bertrand Bomgni
  - University of South Dakota, 4800 N Career Avenue, 57107, SD, USA; Department of Mathematics and Computer Science, University of Dschang, P.O. Box. 67, Dschang, Cameroon
- David R Gnimpieba Zanfack
  - Laboratory of Innovative Technologies (LTI), University of Picardie Jules Verne (UPJV), 48 Rue Raspail, 02100 Saint Quentin, France
- Diing D M Agany
  - University of South Dakota, 4800 N Career Avenue, 57107, SD, USA
- Thomas Bouetou Bouetou
  - Ecole Nationale Supérieure Polytechnique, University of Yaounde I, P.O. Box. 8390, Yaoundé, Cameroon
8. Chen S, Semenov I, Zhang F, Yang Y, Geng J, Feng X, Meng Q, Lei K. An effective framework for predicting drug-drug interactions based on molecular substructures and knowledge graph neural network. Comput Biol Med 2024; 169:107900. [PMID: 38199213 DOI: 10.1016/j.compbiomed.2023.107900] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 09/11/2023] [Revised: 11/27/2023] [Accepted: 12/23/2023] [Indexed: 01/12/2024]
Abstract
Drug-drug interactions (DDIs) play a central role in drug research, as the simultaneous administration of multiple drugs can have harmful or beneficial effects. Harmful interactions lead to adverse reactions, some of which can be life-threatening, while beneficial interactions can promote efficacy. Therefore, it is crucial for physicians, patients, and the research community to identify potential DDIs. Although many AI-based techniques have been proposed for predicting DDIs, most existing computational models primarily focus on integrating multiple data sources or combining popular embedding methods. Researchers often either overlook the valuable information within the molecular structure of drugs or consider only that structural information, neglecting the relationship or topological information between drugs and other biological objects. In this study, we propose MSKG-DDI, a two-component framework that incorporates a Drug Chemical Structure Graph-based component and a Drug Knowledge Graph-based component to capture multimodal characteristics of drugs. Subsequently, a multimodal fusion neural layer is utilized to explore the complementarity between the multimodal representations of drugs. Extensive experiments were conducted using two real-world datasets, and the results demonstrate that MSKG-DDI outperforms other state-of-the-art models in binary-class, multi-class, and multi-label prediction tasks under both transductive and inductive settings. Furthermore, the ablation analysis confirms the practical usefulness of MSKG-DDI.
Affiliation(s)
- Siqi Chen
  - School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing, 400074, China
- Ivan Semenov
  - College of Intelligence and Computing, Tianjin University, Tianjin, 300072, China
- Fengyun Zhang
  - College of Intelligence and Computing, Tianjin University, Tianjin, 300072, China
- Yang Yang
  - College of Intelligence and Computing, Tianjin University, Tianjin, 300072, China
- Jie Geng
  - TianJin Chest Hospital, Tianjin University, Tianjin, 300222, China
- Xuequan Feng
  - Tianjin First Central Hospital, Tianjin, 300192, China
- Qinghua Meng
  - Tianjin Key Laboratory of Sports Physiology and Sports Medicine, Tianjin University of Sport, Tianjin, 301617, China
- Kaiyou Lei
  - College of Computer and Information Science, Southwest University, Chongqing, 400715, China
9. Shahzad M, Tahir MA, Alhussein M, Mobin A, Shams Malick RA, Anwar MS. NeuPD-A Neural Network-Based Approach to Predict Antineoplastic Drug Response. Diagnostics (Basel) 2023; 13:2043. [PMID: 37370938 DOI: 10.3390/diagnostics13122043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 06/01/2023] [Accepted: 06/05/2023] [Indexed: 06/29/2023] Open
Abstract
With the advent of high-throughput screening, in silico-based drug response analysis has opened many research avenues in the field of personalized medicine. Over the past decade, many different techniques have been proposed for predicting antineoplastic (anti-cancer) drug response, but drug sensitivity prediction still needs improvement. The intent of this research study is to propose a framework, namely NeuPD, to validate potential anti-cancer drugs against a panel of cancer cell lines in publicly available datasets. The datasets used in this work are Genomics of Drug Sensitivity in Cancer (GDSC) and Cancer Cell Line Encyclopedia (CCLE). As not all drugs are effective on cancer cell lines, we have worked on 10 essential drugs from the GDSC dataset that achieved the best modeling results in previous studies. We also extracted 1610 essential oncogene expressions from 983 cell lines from the same dataset. From the CCLE dataset, 16,383 gene expressions from 1037 cell lines and 24 drugs have been used in our experiments. For dimensionality reduction, Pearson correlation is applied to best fit the model. We integrate the genomic features of cell lines and drugs' fingerprints to fit the neural network model. To evaluate the proposed NeuPD framework, we used repeated K-fold cross-validation with 5 repeats and K = 10, reporting performance in terms of root mean square error (RMSE) and coefficient of determination (R2). The results obtained on the GDSC dataset, measured using these metrics, show that our proposed NeuPD framework has outperformed existing approaches with an RMSE of 0.490 and R2 of 0.929.
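The evaluation protocol named in this abstract (repeated K-fold cross-validation with 5 repeats and K = 10, scored by R2) can be sketched as follows. This is a minimal standard-library illustration, not the NeuPD authors' code: the function names, the random seed, and the round-robin fold assignment are all assumptions made for the example.

```python
import random

def repeated_kfold_indices(n_samples, k=10, repeats=5, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated K-fold cross-validation.

    Each repeat reshuffles the sample indices, then deals them into k
    disjoint test folds, so every sample is tested exactly once per repeat.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        for fold in range(k):
            test_idx = indices[fold::k]  # every k-th shuffled index
            test_set = set(test_idx)
            train_idx = [i for i in indices if i not in test_set]
            yield train_idx, test_idx

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# With K = 10 and 5 repeats there are 50 train/test splits in total.
splits = list(repeated_kfold_indices(n_samples=25, k=10, repeats=5))
```

In a full evaluation, a model would be fitted on each `train_idx`, scored on the matching `test_idx`, and the 50 scores averaged. Repeating the shuffle reduces the dependence of the estimate on any single partition of the data.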
Affiliation(s)
- Muhammad Shahzad
  - FAST School of Computing, National University of Computer and Emerging Sciences (NUCES-FAST), Karachi 75030, Pakistan
- Muhammad Atif Tahir
  - FAST School of Computing, National University of Computer and Emerging Sciences (NUCES-FAST), Karachi 75030, Pakistan
- Musaed Alhussein
  - Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia
- Ansharah Mobin
  - FAST School of Computing, National University of Computer and Emerging Sciences (NUCES-FAST), Karachi 75030, Pakistan
- Rauf Ahmed Shams Malick
  - FAST School of Computing, National University of Computer and Emerging Sciences (NUCES-FAST), Karachi 75030, Pakistan
- Muhammad Shahid Anwar
  - Department of AI and Software, Gachon University, Seongnam-si 13120, Republic of Korea