1
Wu J, Lai J, Zhao X, Wang Z, Zhang Y, Wang L, Su Y, He Y, Li S, Jiang Y, Han J. DeepCCDS: Interpretable Deep Learning Framework for Predicting Cancer Cell Drug Sensitivity through Characterizing Cancer Driver Signals. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2025:e2416958. [PMID: 40397390 DOI: 10.1002/advs.202416958] [Received: 12/16/2024] [Revised: 04/18/2025] [Indexed: 05/22/2025]
Abstract
Accurate characterization of cellular states is the foundation for precise prediction of drug sensitivity in cancer cell lines, which in turn is fundamental to realizing precision oncology. However, current deep learning approaches have limitations in characterizing cellular states. They rely solely on isolated genetic markers, overlooking the complex regulatory networks and cellular mechanisms that underlie drug responses. To address this limitation, this work proposes DeepCCDS, a Deep learning framework for Cancer Cell Drug Sensitivity prediction through Characterizing Cancer Driver Signals. DeepCCDS incorporates a prior knowledge network to characterize cancer driver signals, building upon the self-supervised neural network framework. The signals can reflect key mechanisms influencing cancer cell development and drug response, enhancing the model's predictive performance and interpretability. DeepCCDS has demonstrated superior performance in predicting drug sensitivity compared to previous state-of-the-art approaches across multiple datasets. Benefiting from integrating prior knowledge, DeepCCDS exhibits powerful feature representation capabilities and interpretability. Based on these feature representations, we have identified embedding features that could potentially be used for drug screening in new indications. Further, this work demonstrates the applicability of DeepCCDS on solid tumor samples from The Cancer Genome Atlas. This work believes integrating DeepCCDS into clinical decision-making processes can potentially improve the selection of personalized treatment strategies for cancer patients.
Affiliation(s)
- Jiashuo Wu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Jiyin Lai
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Xilong Zhao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Ziyi Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Yongbao Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Liqiang Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Yinchun Su
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Yalan He
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Siyuan Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Ying Jiang
  - College of Basic Medical Science, Heilongjiang University of Chinese Medicine, Harbin, 150040, China
- Junwei Han
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
2
He Y, Li S, Lan H, Long W, Zhai S, Li M, Wen Z. A Transfer Learning Framework for Predicting and Interpreting Drug Responses via Single-Cell RNA-Seq Data. Int J Mol Sci 2025; 26:4365. [PMID: 40362602 PMCID: PMC12072357 DOI: 10.3390/ijms26094365] [Received: 04/14/2025] [Revised: 04/29/2025] [Accepted: 05/02/2025] [Indexed: 05/15/2025]
Abstract
Chemotherapy is a fundamental therapy in cancer treatment, yet its effectiveness is often undermined by drug resistance. Understanding the molecular mechanisms underlying drug response remains a major challenge due to tumor heterogeneity, complex cellular interactions, and limited access to clinical samples, which also hinder the performance and interpretability of existing predictive models. Meanwhile, single-cell RNA sequencing (scRNA-seq) has emerged as a powerful tool for uncovering resistance mechanisms, but the systematic collection and utilization of scRNA-seq drug response data remain limited. In this study, we collected scRNA-seq drug response datasets from publicly available web sources and proposed a transfer learning-based framework to align bulk and single cell sequencing data. A shared encoder was designed to project both bulk and single-cell sequencing data into a unified latent space for drug response prediction, while a sparse decoder guided by prior biological knowledge enhanced interpretability by mapping latent features to predefined pathways. The proposed model achieved superior performance across five curated scRNA-seq datasets and yielded biologically meaningful insights through integrated gradient analysis. This work demonstrates the potential of deep learning to advance drug response prediction and underscores the value of scRNA-seq data in supporting related research.
Affiliation(s)
- Yujie He
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Shenghao Li
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Hao Lan
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Wulin Long
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Shengqiu Zhai
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Menglong Li
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Zhining Wen
  - College of Chemistry, Sichuan University, Chengdu 610064, China
  - Medical Big Data Center, Sichuan University, Chengdu 610064, China
3
Shi H, Xu T, Li X, Gao Q, Xiong Z, Xia J, Yue Z. DRExplainer: Quantifiable interpretability in drug response prediction with directed graph convolutional network. Artif Intell Med 2025; 163:103101. [PMID: 40056540 DOI: 10.1016/j.artmed.2025.103101] [Received: 07/14/2024] [Revised: 01/08/2025] [Accepted: 02/23/2025] [Indexed: 03/10/2025]
Abstract
Predicting the response of a cancer cell line to a therapeutic drug is pivotal for personalized medicine. Although numerous deep learning methods have been developed for drug response prediction, integrating diverse information about biological entities and predicting the directional response remain major challenges. Here, we propose a novel interpretable predictive model, DRExplainer, which leverages a directed graph convolutional network to enhance prediction in a directed bipartite network framework. DRExplainer constructs a directed bipartite network integrating multi-omics profiles of cell lines, the chemical structure of drugs and known drug responses to achieve directed prediction. Then, DRExplainer identifies the subgraph most relevant to each prediction in this directed bipartite network by learning a mask, facilitating critical medical decision-making. Additionally, we introduce a quantifiable method for model interpretability that leverages a ground truth benchmark dataset curated from biological features. In computational experiments, DRExplainer outperforms state-of-the-art predictive methods and another graph-based explanation method under the same experimental setting. Finally, case studies further validate the interpretability and effectiveness of DRExplainer in predicting novel drug responses. Our code is available at: https://github.com/vshy-dream/DRExplainer.
Affiliation(s)
- Haoyuan Shi
  - University of Science and Technology of China, Hefei, 230026, Anhui, China; School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, Anhui, China.
- Tao Xu
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, Anhui, China.
- Xiaodi Li
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, Anhui, China.
- Qian Gao
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, Anhui, China.
- Zhiwei Xiong
  - University of Science and Technology of China, Hefei, 230026, Anhui, China.
- Junfeng Xia
  - Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230036, Anhui, China.
- Zhenyu Yue
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, Anhui, China.
4
Reza M, Qiu C, Lin X, Su K, Liu A, Zhang X, Gong Y, Luo Z, Tian Q, Nwadiugwu M, Liang S, Shen H, Deng H. An Attention-Aware Multi-Task Learning Framework Identifies Candidate Targets for Drug Repurposing in Sarcopenia. J Cachexia Sarcopenia Muscle 2025; 16:e13661. [PMID: 40045692 PMCID: PMC11883102 DOI: 10.1002/jcsm.13661] [Received: 06/11/2024] [Revised: 09/19/2024] [Accepted: 10/31/2024] [Indexed: 03/09/2025]
Abstract
BACKGROUND: Sarcopenia presents a pressing public health concern due to its association with age-related muscle mass decline, strength loss and reduced physical performance, particularly in the growing older population. Given the absence of approved pharmacological therapies for sarcopenia, the need to discover effective pharmacological interventions has become critical.
METHODS: To address this challenge and discover new therapies, we developed a novel Multi-Task Attention-aware method for Multi-Omics data (MTA-MO) to extract complex biological insights from various biomedical data sources, including transcriptome, methylome and genome data, to identify drug targets and discover new therapies. Additionally, MTA-MO integrates human protein-protein interaction (PPI) networks and drug-target networks to improve target identification. The novel method is applied to a multi-omics dataset that included 1055 participants aged 20-50 (mean (± SD) age 36.88 (± 8.64)), comprising 37.82% African-American and 62.18% Caucasian/White individuals. Physical activity levels were self-reported and categorized into three groups: ≥ 3 times/week, < 3 times/week and no regular exercise. Mean (± SD) measures for grip strength, appendicular lean mass (ALM), exercise frequency and smoking status (no/yes, n (%)) were 38.72 (± 8.93) kg, 28.65 (± 4.63) kg, 4.31 (± 1.79) and 30.81%/69.19%, respectively. Significant differences (p < 0.05) were found between groups in age, ALM, smoking, and consumption of milk, alcohol, beer and wine.
RESULTS: Using the MTA-MO method, we identified 639 gene targets, and by analysing PPIs and querying public databases, we narrowed this list down to seven potential hub genes associated with sarcopenia (ESR1, ATM, CDC42, EP300, PIK3CA, EGF and PTK2B). These findings were further validated through diverse levels of pathobiological evidence associated with sarcopenia. Gene Ontology and KEGG pathways analysis highlighted five key functions and signalling pathways relevant to skeletal muscle. The interaction network analysis identified three transcriptional factors (GATA2, JUN and FOXC1) as the key transcriptional regulators of the seven potential genes. In silico analysis of 1940 drug candidates identified canagliflozin as a promising candidate for repurposing in sarcopenia, demonstrating the strongest binding affinity to the PTK2B protein (inhibition constant 6.97 μM). This binding is stabilized by hydrophobic bonds, Van der Waals forces, pi-alkyl interactions and pi-anion interactions around PTK2B's active residues, suggesting its potential as a therapeutic option.
CONCLUSIONS: Our novel approach effectively integrates multi-omics data to identify potential treatments for sarcopenia. The findings suggest that canagliflozin could be a promising therapeutic candidate for sarcopenia.
Affiliation(s)
- Md Selim Reza
  - Deming Department of Medicine, School of Medicine, Tulane Center for Biomedical Informatics and Genomics, Tulane University, New Orleans, Louisiana, USA
- Chuan Qiu
  - Deming Department of Medicine, School of Medicine, Tulane Center for Biomedical Informatics and Genomics, Tulane University, New Orleans, Louisiana, USA
- Xu Lin
  - Shunde Hospital of Southern Medical University, Foshan, China
- Kuan‐Jui Su
  - Deming Department of Medicine, School of Medicine, Tulane Center for Biomedical Informatics and Genomics, Tulane University, New Orleans, Louisiana, USA
- Anqi Liu
  - Deming Department of Medicine, School of Medicine, Tulane Center for Biomedical Informatics and Genomics, Tulane University, New Orleans, Louisiana, USA
- Xiao Zhang
  - Deming Department of Medicine, School of Medicine, Tulane Center for Biomedical Informatics and Genomics, Tulane University, New Orleans, Louisiana, USA
- Yun Gong
  - Deming Department of Medicine, School of Medicine, Tulane Center for Biomedical Informatics and Genomics, Tulane University, New Orleans, Louisiana, USA
- Zhe Luo
  - Deming Department of Medicine, School of Medicine, Tulane Center for Biomedical Informatics and Genomics, Tulane University, New Orleans, Louisiana, USA
- Qing Tian
  - Deming Department of Medicine, School of Medicine, Tulane Center for Biomedical Informatics and Genomics, Tulane University, New Orleans, Louisiana, USA
- Martin Nwadiugwu
  - Deming Department of Medicine, School of Medicine, Tulane Center for Biomedical Informatics and Genomics, Tulane University, New Orleans, Louisiana, USA
- Hui Shen
  - Deming Department of Medicine, School of Medicine, Tulane Center for Biomedical Informatics and Genomics, Tulane University, New Orleans, Louisiana, USA
- Hong‐Wen Deng
  - Deming Department of Medicine, School of Medicine, Tulane Center for Biomedical Informatics and Genomics, Tulane University, New Orleans, Louisiana, USA
5
Miao R, Zhong BJ, Mei XY, Dong X, Ou YD, Liang Y, Yu HY, Wang Y, Dong ZH. A semi-supervised weighted SPCA- and convolution KAN-based model for drug response prediction. Front Genet 2025; 16:1532651. [PMID: 40191608 PMCID: PMC11968432 DOI: 10.3389/fgene.2025.1532651] [Received: 12/02/2024] [Accepted: 02/24/2025] [Indexed: 04/09/2025]
Abstract
Motivation: Predicting the response of cell lines to characteristic drugs based on multi-omics gene information has become the core problem of precision oncology. At present, drug response prediction using multi-omics gene data faces three main challenges: first, how to design a gene probe feature extraction model with biological interpretation and high performance; second, how to develop multi-omics weighting modules for reasonably fusing genetic data of different lengths and noise conditions; third, how to construct deep learning models that can handle small sample sizes while minimizing the risk of overfitting.
Results: We propose an innovative drug response prediction model (NMDP). First, the NMDP model introduces an interpretable semi-supervised weighted SPCA module to solve the feature extraction problem in multi-omics gene data. Next, we construct a multi-omics data fusion framework based on sample similarity networks, bimodal tests, and variance information, which solves the data fusion problem and enables the NMDP model to focus on more relevant genomic data. Finally, we combine a one-dimensional convolution method and Kolmogorov-Arnold networks (KANs) to predict the drug response. We conduct five sets of real data experiments and compare NMDP against seven advanced drug response prediction methods. The results show that NMDP achieves the best performance, with sensitivity and specificity reaching 0.92 and 0.93, respectively, an improvement of 11%-57% compared to other models. Bio-enrichment experiments strongly support the biological interpretation of the NMDP model and its ability to identify potential targets for drug activity prediction.
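For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how sensitivity and specificity are computed from binary predictions. It uses the standard confusion-matrix definitions, not code from the NMDP paper; the function name, the label convention (1 = responder, 0 = non-responder) and the toy labels are assumptions made for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed convention: 1 = drug responder, 0 = non-responder.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Sensitivity: fraction of actual responders that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of actual non-responders correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

Reporting both values matters on imbalanced drug-response data, where a model can reach high accuracy by predicting the majority class alone.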
Affiliation(s)
- Rui Miao
  - Basic Teaching Department, Zhuhai Campus of Zunyi Medical University, Zhu Hai, China
- Bing-Jie Zhong
  - Basic Teaching Department, Zhuhai Campus of Zunyi Medical University, Zhu Hai, China
- Xin-Yue Mei
  - Institute of Systems Engineering, Macau University of Science and Technology, Macau, China
- Xin Dong
  - Institute of Systems Engineering, Macau University of Science and Technology, Macau, China
- Yang-Dong Ou
  - School of Biomedical Engineering, Guangdong Medical University, Dongguan, China
- Hao-Yang Yu
  - Basic Teaching Department, Zhuhai Campus of Zunyi Medical University, Zhu Hai, China
- Ying Wang
  - Basic Teaching Department, Zhuhai Campus of Zunyi Medical University, Zhu Hai, China
- Zi-Han Dong
  - Basic Teaching Department, Zhuhai Campus of Zunyi Medical University, Zhu Hai, China
6
Narykov O, Zhu Y, Brettin T, Evrard YA, Partin A, Xia F, Shukla M, Vasanthakumari P, Doroshow JH, Stevens RL. Data imbalance in drug response prediction: multi-objective optimization approach in deep learning setting. Brief Bioinform 2025; 26:bbaf134. [PMID: 40178282 PMCID: PMC11966611 DOI: 10.1093/bib/bbaf134] [Received: 09/19/2024] [Revised: 02/07/2025] [Accepted: 02/18/2025] [Indexed: 04/05/2025]
Abstract
Drug response prediction (DRP) methods tackle the complex task of associating the effectiveness of small molecules with the specific genetic makeup of the patient. Anti-cancer DRP is a particularly challenging task requiring costly experiments, as the underlying pathogenic mechanisms are broad and associated with multiple genomic pathways. The scientific community has exerted significant efforts to generate public drug screening datasets, paving the way for various machine learning models that attempt to reason over the complex data space of small compounds and biological characteristics of tumors. However, the data depth is still lacking compared to application domains like computer vision or natural language processing, limiting current learning capabilities. To combat this issue and improve the generalizability of DRP models, we explore strategies that explicitly address the imbalance in DRP datasets. We reframe the problem as a multi-objective optimization across multiple drugs to maximize deep learning model performance. We implement this approach by constructing a Multi-Objective Optimization Regularized by Loss Entropy loss function and plugging it into a deep learning model. We demonstrate the utility of the proposed drug discovery methods and make suggestions for further potential applications of the work to achieve desirable outcomes in the healthcare field.
Affiliation(s)
- Oleksandr Narykov
  - Computing, Environment and Life Sciences, Argonne National Laboratory, 9700 S Cass Ave, Lemont, IL 60439, United States
- Yitan Zhu
  - Computing, Environment and Life Sciences, Argonne National Laboratory, 9700 S Cass Ave, Lemont, IL 60439, United States
- Thomas Brettin
  - Computing, Environment and Life Sciences, Argonne National Laboratory, 9700 S Cass Ave, Lemont, IL 60439, United States
- Yvonne A Evrard
  - Leidos Biomedical Research, Frederick National Laboratory for Cancer Research, 8560 Progress Drive, Frederick, MD 21702, United States
- Alexander Partin
  - Computing, Environment and Life Sciences, Argonne National Laboratory, 9700 S Cass Ave, Lemont, IL 60439, United States
- Fangfang Xia
  - Computing, Environment and Life Sciences, Argonne National Laboratory, 9700 S Cass Ave, Lemont, IL 60439, United States
- Maulik Shukla
  - Computing, Environment and Life Sciences, Argonne National Laboratory, 9700 S Cass Ave, Lemont, IL 60439, United States
- Priyanka Vasanthakumari
  - Computing, Environment and Life Sciences, Argonne National Laboratory, 9700 S Cass Ave, Lemont, IL 60439, United States
- James H Doroshow
  - Developmental Therapeutics Branch, National Cancer Institute, 31 Center Dr, Bethesda, MD 20892, United States
- Rick L Stevens
  - Computing, Environment and Life Sciences, Argonne National Laboratory, 9700 S Cass Ave, Lemont, IL 60439, United States
  - Department of Computer Science, The University of Chicago, 5730 S Ellis Ave, Chicago, IL 60637, United States
7
Tran D, Nguyen H, Pham VD, Nguyen P, Nguyen Luu H, Minh Phan L, Blair DeStefano C, Jim Yeung SC, Nguyen T. A comprehensive review of cancer survival prediction using multi-omics integration and clinical variables. Brief Bioinform 2025; 26:bbaf150. [PMID: 40221959 PMCID: PMC11994034 DOI: 10.1093/bib/bbaf150] [Received: 10/27/2024] [Revised: 01/29/2025] [Accepted: 03/19/2025] [Indexed: 04/15/2025]
Abstract
Cancer is an umbrella term that includes a wide spectrum of disease severity, from those that are malignant, metastatic, and aggressive to benign lesions with very low potential for progression or death. The ability to prognosticate patient outcomes would facilitate management of various malignancies: patients whose cancer is likely to advance quickly would receive necessary treatment that is commensurate with the predicted biology of the disease. Former prognostic models based on clinical variables (age, gender, cancer stage, tumor grade, etc.), though helpful, cannot account for genetic differences, molecular etiology, tumor heterogeneity, and important host biological mechanisms. Therefore, recent prognostic models have shifted toward the integration of complementary information available in both molecular data and clinical variables to better predict patient outcomes: vital status (overall survival), metastasis (metastasis-free survival), and recurrence (progression-free survival). In this article, we review 20 survival prediction approaches that integrate multi-omics and clinical data to predict patient outcomes. We discuss their strategies for modeling survival time (continuous and discrete), the incorporation of molecular measurements and clinical variables into risk models (clinical and multi-omics data), how to cope with censored patient records, the effectiveness of data integration techniques, prediction methodologies, model validation, and assessment metrics. The goal is to inform life scientists of available resources, and to provide a complete review of important building blocks in survival prediction. At the same time, we thoroughly describe the pros and cons of each methodology, and discuss in depth the outstanding challenges that need to be addressed in future method development.
Affiliation(s)
- Dao Tran
  - Department of Computer Science and Software Engineering, Auburn University, 345 W Magnolia Avenue, Auburn, AL 36849, United States
- Ha Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, 345 W Magnolia Avenue, Auburn, AL 36849, United States
- Van-Dung Pham
  - Department of Computer Science and Software Engineering, Auburn University, 345 W Magnolia Avenue, Auburn, AL 36849, United States
- Phuong Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, 345 W Magnolia Avenue, Auburn, AL 36849, United States
- Hung Nguyen Luu
  - UPMC Hillman Cancer Center, University of Pittsburgh Medical Center, 5150 Centre Avenue, Pittsburgh, PA 15232, United States
  - Department of Epidemiology, School of Public Health, University of Pittsburgh, 130 De Soto Street, Pittsburgh, PA 15261, United States
- Liem Minh Phan
  - David Grant USAF Medical Center - Clinical Investigation Facility, 60 Medical Group, Defense Health Agency, 101 Bodin Circle, Travis Air Force Base, CA 94535, United States
- Christin Blair DeStefano
  - Walter Reed National Military Medical Center, Defense Health Agency, 8901 Rockville Pike, Bethesda, MD 20889, United States
- Sai-Ching Jim Yeung
  - Department of Emergency Medicine, The University of Texas MD Anderson Cancer Center, 1400 Pressler Street, Houston, TX 77030, United States
- Tin Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, 345 W Magnolia Avenue, Auburn, AL 36849, United States
8
Spooner A, Moridani MK, Toplis B, Behary J, Safarchi A, Maher S, Vafaee F, Zekry A, Sowmya A. Benchmarking ensemble machine learning algorithms for multi-class, multi-omics data integration in clinical outcome prediction. Brief Bioinform 2025; 26:bbaf116. [PMID: 40116658 PMCID: PMC11926982 DOI: 10.1093/bib/bbaf116] [Received: 11/19/2024] [Revised: 02/06/2025] [Accepted: 02/21/2025] [Indexed: 03/23/2025]
Abstract
The complementary information found in different modalities of patient data can aid in more accurate modelling of a patient's disease state and a better understanding of the underlying biological processes of a disease. However, the analysis of multi-modal, multi-omics data presents many challenges. In this work, we compare the performance of a variety of ensemble machine learning (ML) algorithms that are capable of late integration of multi-class data from different modalities. The ensemble methods and their variations tested were (i) a voting ensemble, with hard and soft vote, (ii) a meta learner, and (iii) a multi-modal AdaBoost model using hard vote, soft vote, and meta learner to integrate the modalities on each boosting round, as well as the PB-MVBoost model and a novel application of a mixture-of-experts model. These were compared to simple concatenation. We examine these methods using data from an in-house study on hepatocellular carcinoma, plus validation datasets on studies from breast cancer and irritable bowel disease. We develop models that achieve an area under the receiver operating characteristic curve of up to 0.85 and find that two boosted methods, PB-MVBoost and AdaBoost with soft vote, were the best performing models. We also examine the stability of features selected and the size of the clinical signature. Our work shows that integrating complementary omics and data modalities with effective ensemble ML models enhances accuracy in multi-class clinical outcome predictions and produces more stable predictive features than individual modalities or simple concatenation. We provide recommendations for the integration of multi-modal multi-class data.
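The difference between the hard and soft vote mentioned in this abstract can be illustrated with a short sketch. This is a generic illustration of late integration by voting, not the authors' implementation; the function names, the tie-breaking rule and the toy probabilities (one row per hypothetical modality-specific model) are assumptions.

```python
from collections import Counter


def hard_vote(label_predictions):
    """Majority vote over one predicted class label per model."""
    counts = Counter(label_predictions)
    best = max(counts.values())
    # Ties go to the label that appears first in the input (arbitrary choice).
    for label in label_predictions:
        if counts[label] == best:
            return label


def soft_vote(probability_predictions):
    """Average the class probabilities across models, then take the arg-max."""
    n_models = len(probability_predictions)
    n_classes = len(probability_predictions[0])
    mean_probs = [
        sum(p[c] for p in probability_predictions) / n_models
        for c in range(n_classes)
    ]
    winner = max(range(n_classes), key=lambda c: mean_probs[c])
    return winner, mean_probs


# Toy class probabilities from three hypothetical modality models.
probs = [
    [0.90, 0.10, 0.00],  # model A is confident in class 0
    [0.40, 0.60, 0.00],  # model B leans towards class 1
    [0.45, 0.55, 0.00],  # model C leans towards class 1
]
labels = [max(range(3), key=lambda c: p[c]) for p in probs]  # [0, 1, 1]
hard_winner = hard_vote(labels)
soft_winner, _ = soft_vote(probs)
print(hard_winner, soft_winner)
```

In this toy case the two rules disagree: the hard vote follows the two weakly confident models, while the soft vote lets the one highly confident model dominate the average.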
Affiliation(s)
- Annette Spooner
  - School of Computer Science and Engineering, University of New South Wales, High St, Kensington, NSW 2052, Australia
- Mohammad Karimi Moridani
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, NSW 2052, Australia
- Barbra Toplis
  - St George and Sutherland Clinical Campuses, University of New South Wales, Short St, Kogarah, NSW 2217, Australia
- Jason Behary
  - St George and Sutherland Clinical Campuses, University of New South Wales, Short St, Kogarah, NSW 2217, Australia
  - Department of Gastroenterology and Hepatology, St George Hospital, Gray St, Kogarah, NSW 2217, Australia
- Azadeh Safarchi
  - Health and Biosecurity, Microbiome for One System Health, Commonwealth Scientific and Industrial Research Organisation, 160 Hawkesbury Rd, Westmead, NSW 2145, Australia
- Salim Maher
  - St George and Sutherland Clinical Campuses, University of New South Wales, Short St, Kogarah, NSW 2217, Australia
  - Department of Gastroenterology and Hepatology, St George Hospital, Gray St, Kogarah, NSW 2217, Australia
- Fatemeh Vafaee
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, NSW 2052, Australia
  - UNSW Data Science Hub, University of New South Wales, High St, Kensington, NSW 2052, Australia
- Amany Zekry
  - St George and Sutherland Clinical Campuses, University of New South Wales, Short St, Kogarah, NSW 2217, Australia
  - Department of Gastroenterology and Hepatology, St George Hospital, Gray St, Kogarah, NSW 2217, Australia
- Arcot Sowmya
  - School of Computer Science and Engineering, University of New South Wales, High St, Kensington, NSW 2052, Australia
9
Sefer E. DRGAT: Predicting Drug Responses Via Diffusion-Based Graph Attention Network. J Comput Biol 2025; 32:330-350. [PMID: 39639802 DOI: 10.1089/cmb.2024.0807] [Indexed: 12/07/2024]
Abstract
Accurately predicting drug response from a patient's genomic profile is critical for advancing personalized medicine. The rise of deep learning approaches, especially graph neural networks leveraging large-scale omics datasets, has been a key driver of research in this area. However, these biological datasets, which are typically high dimensional but have small sample sizes, present challenges such as overfitting and poor generalization in predictive models. As a complicating matter, gene expression (GE) data must capture complex inter-gene relationships, exacerbating these issues. In this article, we tackle these challenges by introducing a drug response prediction method, called drug response graph attention network (DRGAT), which combines a denoising diffusion implicit model for data augmentation with a prediction module based on a recently introduced graph attention network (GAT) with high-order neighbor propagation (HO-GATs). Our proposed approach achieved an improvement of almost 5% in the area under the receiver operating characteristic curve compared with state-of-the-art models for many of the studied drugs, indicating our method's reasonable generalization capabilities. Moreover, our experiments confirm the potential of diffusion-based generative models, a core component of our method, to mitigate the inherent limitations of omics datasets by effectively augmenting GE data.
Affiliation(s)
- Emre Sefer
  - Artificial Intelligence and Data Engineering Department, Ozyegin University, Istanbul, Turkey
10
Wu Y, Chen M, Qin Y. Anticancer drug response prediction integrating multi-omics pathway-based difference features and multiple deep learning techniques. PLoS Comput Biol 2025; 21:e1012905. [PMID: 40163555 PMCID: PMC11978092 DOI: 10.1371/journal.pcbi.1012905] [Received: 06/30/2024] [Revised: 04/08/2025] [Accepted: 02/24/2025] [Indexed: 04/02/2025]
Abstract
Individualized prediction of cancer drug sensitivity is of vital importance in precision medicine. While numerous predictive methodologies for cancer drug response have been proposed, the precise prediction of an individual patient's response to a drug and a thorough understanding of differences in drug responses among individuals continue to pose significant challenges. This study introduces a deep learning model, PASO, which integrates a transformer encoder, multi-scale convolutional networks, and attention mechanisms to predict the sensitivity of cell lines to anticancer drugs, based on the omics data of cell lines and the SMILES representations of drug molecules. First, we use statistical methods to compute the differences in gene expression, gene mutation, and gene copy number variation between genes within and outside biological pathways, and use these pathway difference values as cell line features, combined with the drugs' SMILES chemical structure information, as inputs to the model. Then the model integrates several deep learning techniques, namely multi-scale convolutional networks and a transformer encoder, to extract the properties of drug molecules from different perspectives, while an attention network learns complex interactions between the omics features of cell lines and the aforementioned properties of drug molecules. Finally, a multilayer perceptron (MLP) outputs the final predictions of drug response. Our model exhibits higher accuracy in predicting sensitivity to anticancer drugs compared with other recently proposed methods. Analysis of the drug response predictions for lung cancer cell lines found that SCLC cell lines were particularly sensitive to PARP inhibitors and Topoisomerase I inhibitors. Additionally, the model is capable of highlighting biological pathways related to cancer and accurately capturing critical parts of the drug's chemical structure. We also validated the model's clinical utility using clinical data from The Cancer Genome Atlas. In summary, the PASO model shows potential as a robust support tool in individualized cancer treatment. Our methods are implemented in Python and are freely available from GitHub (https://github.com/queryang/PASO).
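The pathway difference features described in this abstract can be illustrated with a minimal sketch. The abstract does not name the statistic used, so the difference of means below is an assumption chosen for simplicity, and the function name, gene values, and pathway names are hypothetical rather than taken from the PASO repository.

```python
from statistics import mean


def pathway_difference_features(expression, pathways):
    """Return one value per pathway: mean(in-pathway genes) - mean(other genes).

    The difference of means is an illustrative stand-in; the abstract only
    says "statistical methods" without naming the test.
    """
    features = {}
    for name, members in pathways.items():
        inside = [v for gene, v in expression.items() if gene in members]
        outside = [v for gene, v in expression.items() if gene not in members]
        if not inside or not outside:
            # No contrast can be computed when one group is empty.
            features[name] = 0.0
            continue
        features[name] = mean(inside) - mean(outside)
    return features


# Hypothetical expression values for one cell line.
expression = {"TP53": 2.0, "BRCA1": 4.0, "EGFR": 1.0, "KRAS": 3.0}
# Hypothetical pathway memberships.
pathways = {"dna_repair": {"TP53", "BRCA1"}, "growth": {"EGFR", "KRAS"}}

features = pathway_difference_features(expression, pathways)
print(features)
```

The same function could be applied separately to mutation and copy number values to produce one feature vector per omics type for each cell line.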
Affiliation(s)
- Yang Wu
- College of Information Technology, Shanghai Ocean University, Shanghai, China
- Key Laboratory of Fisheries Information Ministry of Agriculture, Shanghai, China
| | - Ming Chen
- College of Information Technology, Shanghai Ocean University, Shanghai, China
- Key Laboratory of Fisheries Information Ministry of Agriculture, Shanghai, China
| | - Yufang Qin
- College of Information Technology, Shanghai Ocean University, Shanghai, China
- Key Laboratory of Fisheries Information Ministry of Agriculture, Shanghai, China
| |
|
11
|
He Y, Liu N, Yang J, Hong Y, Ni H, Zhang Z. Comparison of artificial intelligence and logistic regression models for mortality prediction in acute respiratory distress syndrome: a systematic review and meta-analysis. Intensive Care Med Exp 2025; 13:23. [PMID: 39982531 PMCID: PMC11845658 DOI: 10.1186/s40635-024-00706-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Accepted: 12/04/2024] [Indexed: 02/22/2025]
Abstract
BACKGROUND: The application of artificial intelligence (AI) in predicting the mortality of acute respiratory distress syndrome (ARDS) has garnered significant attention. However, there is still a lack of evidence-based support for its specific diagnostic performance. Thus, this systematic review and meta-analysis was conducted to evaluate the effectiveness of AI algorithms in predicting ARDS mortality. METHOD: We conducted a comprehensive electronic search across Web of Science, Embase, PubMed, Scopus, and EBSCO databases up to April 28, 2024. The QUADAS-2 tool was used to assess the risk of bias in the included articles. A bivariate mixed-effects model was applied for the meta-analysis. Sensitivity analysis, meta-regression analysis, and tests for heterogeneity were also performed. RESULTS: Eight studies were included in the analysis. The sensitivity, specificity, and summarized receiver operating characteristic (SROC) of the AI-based model in the validation set were 0.89 (95% CI 0.79-0.95), 0.72 (95% CI 0.65-0.78), and 0.84 (95% CI 0.80-0.87), respectively. For the logistic regression (LR) model, the sensitivity, specificity, and SROC were 0.78 (95% CI 0.74-0.82), 0.68 (95% CI 0.60-0.76), and 0.81 (95% CI 0.77-0.84). The AI model demonstrated superior predictive accuracy compared to the LR model. Notably, the predictive model performed better in patients with moderate to severe ARDS (SAUC: 0.84 [95% CI 0.80-0.87] vs. 0.81 [95% CI 0.77-0.84]). CONCLUSION: The AI algorithms showed superior performance in predicting the mortality of ARDS patients and demonstrated strong potential for clinical application. Additionally, we found that for ARDS, a highly heterogeneous condition, the accuracy of the model is influenced by the severity of the disease.
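As a reminder of how the per-study inputs to such a meta-analysis are defined, the sketch below computes sensitivity and specificity from a single 2x2 confusion table. The counts are invented for illustration and are not data from any included study; the bivariate mixed-effects pooling itself is not shown.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute sensitivity and specificity from confusion-table counts."""
    sensitivity = tp / (tp + fn)  # share of deaths the model flagged
    specificity = tn / (tn + fp)  # share of survivors the model cleared
    return sensitivity, specificity


# Hypothetical counts for one study (not from the review).
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=30, fp=20)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```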
Affiliation(s)
- Yang He
- Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, 3#, East Qingchun Road, Hangzhou, 310016, China
| | - Ning Liu
- Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, 3#, East Qingchun Road, Hangzhou, 310016, China
| | - Jie Yang
- Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, 3#, East Qingchun Road, Hangzhou, 310016, China
| | - Yucai Hong
- Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, 3#, East Qingchun Road, Hangzhou, 310016, China
| | - Hongying Ni
- Department of Critical Care Medicine, Affiliated Jinhua Hospital, Zhejiang University School of Medicine, No.365 Renmin East Rd, Jinhua, 321000, China.
| | - Zhongheng Zhang
- Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, 3#, East Qingchun Road, Hangzhou, 310016, China.
- Provincial Key Laboratory of Precise Diagnosis and Treatment of Abdominal Infection, School of Medicine, Sir Run Run Shaw Hospital, Zhejiang University, Zhejiang, 310016, People's Republic of China.
- School of Medicine, Shaoxing University, Shaoxing, China.
| |
|
12
|
Alemu R, Sharew NT, Arsano YY, Ahmed M, Tekola-Ayele F, Mersha TB, Amare AT. Multi-omics approaches for understanding gene-environment interactions in noncommunicable diseases: techniques, translation, and equity issues. Hum Genomics 2025; 19:8. [PMID: 39891174 PMCID: PMC11786457 DOI: 10.1186/s40246-025-00718-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/29/2024] [Accepted: 01/16/2025] [Indexed: 02/03/2025]
Abstract
Non-communicable diseases (NCDs) such as cardiovascular diseases, chronic respiratory diseases, cancers, diabetes, and mental health disorders pose a significant global health challenge, accounting for the majority of fatalities and disability-adjusted life years worldwide. These diseases arise from the complex interactions between genetic, behavioral, and environmental factors, necessitating a thorough understanding of these dynamics to identify effective diagnostic strategies and interventions. Although recent advances in multi-omics technologies have greatly enhanced our ability to explore these interactions, several challenges remain. These challenges include the inherent complexity and heterogeneity of multi-omic datasets, limitations in analytical approaches, and severe underrepresentation of non-European genetic ancestries in most omics datasets, which restricts the generalizability of findings and exacerbates health disparities. This scoping review evaluates the global landscape of multi-omics data related to NCDs from 2000 to 2024, focusing on recent advancements in multi-omics data integration, translational applications, and equity considerations. We highlight the need for standardized protocols, harmonized data-sharing policies, and advanced approaches such as artificial intelligence/machine learning to integrate multi-omics data and study gene-environment interactions. We also explore challenges and opportunities in translating insights from gene-environment (GxE) research into precision medicine strategies. We underscore the potential of global multi-omics research in advancing our understanding of NCDs and enhancing patient outcomes across diverse and underserved populations, emphasizing the need for equity and fairness-centered research and strategic investments to build local capacities in underrepresented populations and regions.
Affiliation(s)
- Robel Alemu
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Anderson School of Management, University of California Los Angeles, Los Angeles, CA, USA.
- Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, Australia.
| | - Nigussie T Sharew
- Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, Australia
| | - Yodit Y Arsano
- Alpert Medical School, Lifespan Health Systems, Brown University, WarrenProvidence, Rhode Island, USA
| | - Muktar Ahmed
- Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, Australia
| | - Fasil Tekola-Ayele
- Epidemiology Branch, Division of Population Health Research, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
| | - Tesfaye B Mersha
- Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, USA.
| | - Azmeraw T Amare
- Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, Australia.
| |
|
13
|
Dong Y, Zhang Y, Qian Y, Zhao Y, Yang Z, Feng X. ASGCL: Adaptive Sparse Mapping-based graph contrastive learning network for cancer drug response prediction. PLoS Comput Biol 2025; 21:e1012748. [PMID: 39883719 PMCID: PMC11781687 DOI: 10.1371/journal.pcbi.1012748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Accepted: 12/23/2024] [Indexed: 02/01/2025]
Abstract
Personalized cancer drug treatment is emerging as a frontier issue in modern medical research. Considering the genomic differences among cancer patients, determining the most effective drug treatment plan is a complex and crucial task. In response to these challenges, this study introduces the Adaptive Sparse Graph Contrastive Learning Network (ASGCL), an innovative approach to unraveling latent interactions in the complex context of cancer cell lines and drugs. The core of ASGCL is the GraphMorpher module, an innovative component that enhances the input graph structure via strategic node attribute masking and topological pruning. By contrasting the augmented graph with the original input, the model delineates distinct positive and negative sample sets at both node and graph levels. This dual-level contrastive approach significantly amplifies the model's discriminatory prowess in identifying nuanced drug responses. Leveraging a synergistic combination of supervised and contrastive loss, ASGCL accomplishes end-to-end learning of feature representations, substantially outperforming existing methodologies. Comprehensive ablation studies underscore the efficacy of each component, corroborating the model's robustness. Experimental evaluations further illuminate ASGCL's proficiency in predicting drug responses, offering a potent tool for guiding clinical decision-making in cancer therapy.
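The two augmentations named in this abstract, node attribute masking and topological pruning, can be sketched as follows. This is a deliberately simple random version under stated assumptions: the abstract describes GraphMorpher as adaptive, which this sketch does not reproduce, and the function name, rates, and toy graph are hypothetical.

```python
import random


def augment_graph(features, edges, mask_rate=0.2, drop_rate=0.2, seed=0):
    """Return a randomly augmented copy of a graph.

    features: list of per-node attribute lists
    edges: list of (source, target) node index pairs
    Uniform random masking and pruning stand in for the paper's adaptive
    GraphMorpher module, whose details the abstract does not give.
    """
    rng = random.Random(seed)
    # Node attribute masking: zero out each attribute with probability mask_rate.
    masked = [
        [0.0 if rng.random() < mask_rate else value for value in row]
        for row in features
    ]
    # Topological pruning: drop each edge with probability drop_rate.
    kept = [edge for edge in edges if rng.random() >= drop_rate]
    return masked, kept


# Toy graph with three nodes and two attributes per node.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
edges = [(0, 1), (1, 2), (0, 2)]

aug_features, aug_edges = augment_graph(features, edges)
print(aug_features, aug_edges)
```

In a contrastive setup, embeddings of the same node in the original and augmented graphs would typically form a positive pair, and embeddings of different nodes a negative pair.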
Affiliation(s)
- Yunyun Dong
- School of Software, Taiyuan University of Technology, Taiyuan, China
- Institute of Big Data Science and Industry, Shanxi University, Taiyuan, China
| | - Yuanrong Zhang
- School of Software, Taiyuan University of Technology, Taiyuan, China
| | - Yuhua Qian
- Institute of Big Data Science and Industry, Shanxi University, Taiyuan, China
- School of Computer and Information Technology, Shanxi University, Taiyuan, China
| | - Yiming Zhao
- School of Software, Taiyuan University of Technology, Taiyuan, China
| | - Ziting Yang
- School of Software, Taiyuan University of Technology, Taiyuan, China
| | - Xiufang Feng
- School of Software, Taiyuan University of Technology, Taiyuan, China
| |
|
14
|
Hajim WI, Zainudin S, Daud KM, Alheeti K. Golden eagle optimized CONV-LSTM and non-negativity-constrained autoencoder to support spatial and temporal features in cancer drug response prediction. PeerJ Comput Sci 2024; 10:e2520. [PMID: 39896419 PMCID: PMC11784781 DOI: 10.7717/peerj-cs.2520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Accepted: 10/25/2024] [Indexed: 02/04/2025]
Abstract
Advanced machine learning (ML) and deep learning (DL) methods have recently been utilized in Drug Response Prediction (DRP), and these models use the details from genomic profiles, such as extensive drug screening data and cell line data, to predict the response of drugs. Comparatively, the DL-based prediction approaches provided better learning of such features. However, prior knowledge, like pathway data, is sometimes discarded as irrelevant since the drug response datasets are multidimensional and noisy. Optimized feature learning and extraction processes are suggested to handle this problem. First, the noise and class imbalance problems must be tackled to avoid low identification accuracy, long prediction times, and poor applicability. This article aims to apply the Non-Negativity-Constrained Auto Encoder (NNCAE) network to tackle these issues, enhance the adaptive search for the optimal size of sliding windows, and ensure that deep network architectures are adept at learning the vital hidden features. The NNCAE methodology is applied after the standard pre-processing procedures to handle the noise and class imbalance problems. The resulting class-balanced, noise-removed input features are then used to train the proposed hybrid classifier. The classification model, Golden Eagle Optimization-based Convolutional Long Short-Term Memory neural networks (GEO-Conv-LSTM), is assembled by integrating Convolutional Neural Network (CNN) and LSTM models, with parameter tuning performed by the GEO algorithm. Evaluations are conducted on two large datasets from the Genomics of Drug Sensitivity in Cancer (GDSC) repository, and the proposed NNCAE-GEO-Conv-LSTM-based approach achieved accuracies of 96.99% and 97.79%, respectively, with reduced processing time and error rate for the DRP problem.
Affiliation(s)
- Wesam Ibrahim Hajim
- Department of Applied Geology, College of Sciences, University of Tikrit, Tikrit, Salah ad Din, Iraq
- Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
| | - Suhaila Zainudin
- Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
| | - Kauthar Mohd Daud
- Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
| | - Khattab Alheeti
- Department of Computer Networking Systems College of Computer Sciences and Information Technology, University of Anbar, Ramadi, Al Anbar, Iraq
| |
|
15
|
Wang H, Han X, Niu S, Cheng H, Ren J, Duan Y. DFASGCNS: A prognostic model for ovarian cancer prediction based on dual fusion channels and stacked graph convolution. PLoS One 2024; 19:e0315924. [PMID: 39680618 DOI: 10.1371/journal.pone.0315924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Accepted: 12/03/2024] [Indexed: 12/18/2024]
Abstract
Ovarian cancer is a malignant tumor with different clinicopathological and molecular characteristics. Due to its nonspecific early symptoms, the majority of patients are diagnosed with local or extensive metastasis, severely affecting treatment and prognosis. The occurrence of ovarian cancer is influenced by multiple complex mechanisms spanning the genomic, transcriptomic, and proteomic levels. Integrating multiple types of omics data aids in predicting the survival rate of ovarian cancer patients. However, existing methods only fuse multi-omics data at the feature level, neglecting the shared and complementary neighborhood information among samples of multi-omics data, and failing to consider the potential interactions between different omics data at the molecular level. In this paper, we propose a prognostic model for ovarian cancer prediction named Dual Fusion Channels and Stacked Graph Convolutional Neural Network (DFASGCNS). DFASGCNS utilizes dual fusion channels to learn feature representations of different omics data and the associations between samples. A stacked graph convolutional network is used to comprehensively learn the deep and intricate correlation networks present in multi-omics data, enhancing the model's ability to represent multi-omics data. An attention mechanism is introduced to allocate different weights to important features of different omics data, optimizing the feature representation of multi-omics data. Experimental results demonstrate that, compared to existing methods, the DFASGCNS model exhibits significant advantages in ovarian cancer prognosis prediction and survival analysis. Kaplan-Meier curve analysis results indicate significant differences in the survival subgroups predicted by the DFASGCNS model, contributing to a deeper understanding of the pathogenesis of ovarian cancer and providing more reliable auxiliary diagnostic information for the prognosis assessment of ovarian cancer patients.
Affiliation(s)
- Huiqing Wang
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, China
| | - Xiao Han
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, China
| | - Shuaijun Niu
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, China
| | - Hao Cheng
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, China
| | - Jianxue Ren
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, China
| | - Yimeng Duan
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, China
| |
|
16
|
Wang FA, Li Y, Zeng T. Deep Learning of radiology-genomics integration for computational oncology: A mini review. Comput Struct Biotechnol J 2024; 23:2708-2716. [PMID: 39035833 PMCID: PMC11260400 DOI: 10.1016/j.csbj.2024.06.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 06/18/2024] [Accepted: 06/18/2024] [Indexed: 07/23/2024]
Abstract
In the field of computational oncology, patient status is often assessed using radiology-genomics, which combines two key technologies and data types: radiology and genomics. Recent advances in deep learning have facilitated the integration of radiology-genomics data, and even new omics data, significantly improving the robustness and accuracy of clinical predictions. These factors are driving artificial intelligence (AI) closer to practical clinical applications. In particular, deep learning models are crucial in identifying new radiology-genomics biomarkers and therapeutic targets, supported by explainable AI (xAI) methods. This review focuses on recent developments in deep learning for radiology-genomics integration, highlights current challenges, and outlines some urgently needed research directions in computational oncology for multimodal integration and biomarker discovery using radiology-genomics or radiology-omics data.
Affiliation(s)
- Feng-ao Wang
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
| | - Yixue Li
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
- Guangzhou National Laboratory, Guangzhou, China
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
| | - Tao Zeng
- Guangzhou National Laboratory, Guangzhou, China
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
| |
|
17
|
Tang X, Prodduturi N, Thompson K, Weinshilboum R, O’Sullivan C, Boughey J, Tizhoosh H, Klee E, Wang L, Goetz M, Suman V, Kalari K. OmicsFootPrint: a framework to integrate and interpret multi-omics data using circular images and deep neural networks. Nucleic Acids Res 2024; 52:e99. [PMID: 39445795 PMCID: PMC11602161 DOI: 10.1093/nar/gkae915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Revised: 08/14/2024] [Accepted: 10/07/2024] [Indexed: 10/25/2024]
Abstract
The OmicsFootPrint framework addresses the need for advanced multi-omics data analysis methodologies by transforming data into intuitive two-dimensional circular images and facilitating the interpretation of complex diseases. Utilizing deep neural networks and incorporating the SHapley Additive exPlanations algorithm, the framework enhances model interpretability. Tested with The Cancer Genome Atlas data, OmicsFootPrint effectively classified lung and breast cancer subtypes, achieving high area under the curve (AUC) scores: 0.98 ± 0.02 for lung cancer subtype differentiation and 0.83 ± 0.07 for breast cancer PAM50 subtypes. It also successfully distinguished between invasive lobular and ductal carcinomas in breast cancer, showcasing its robustness. It further demonstrated notable performance in predicting drug responses in cancer cell lines, with a median AUC of 0.74, surpassing nine existing methods. Furthermore, its effectiveness persists even with reduced training sample sizes. OmicsFootPrint marks an enhancement in multi-omics research, offering a novel, efficient and interpretable approach that contributes to a deeper understanding of disease mechanisms.
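One way to picture the transformation into circular images is to place each feature at an angle on a ring and give each omics layer its own radius. The sketch below does only that and is an assumption about the general idea, not the published OmicsFootPrint procedure; the function name, layer names, and radii are hypothetical.

```python
import math


def circular_layout(n_features, radius):
    """Place n_features points evenly on a circle of the given radius."""
    points = []
    for index in range(n_features):
        angle = 2 * math.pi * index / n_features
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points


# Hypothetical omics layers, each drawn as its own concentric ring.
layers = {"expression": 1.0, "copy_number": 2.0}
layout = {name: circular_layout(4, radius) for name, radius in layers.items()}

for name, points in layout.items():
    print(name, [(round(x, 2), round(y, 2)) for x, y in points])
```

Rasterizing these coordinates, with feature values mapped to pixel intensity or color, would produce an image that a convolutional network can consume.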
Affiliation(s)
- Xiaojia Tang
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
| | - Naresh Prodduturi
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
| | - Kevin J Thompson
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
| | - Richard Weinshilboum
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
| | | | - Judy C Boughey
- Department of Surgery, Mayo Clinic, Rochester, MN 55905, USA
| | - Hamid R Tizhoosh
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN 55905, USA
| | - Eric W Klee
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
| | - Liewei Wang
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
| | - Matthew P Goetz
- Department of Oncology, Mayo Clinic, Rochester, MN 55905, USA
| | - Vera Suman
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
| | - Krishna R Kalari
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
| |
|
18
|
Briscik M, Tazza G, Vidács L, Dillies MA, Déjean S. Supervised multiple kernel learning approaches for multi-omics data integration. BioData Min 2024; 17:53. [PMID: 39580456 PMCID: PMC11585117 DOI: 10.1186/s13040-024-00406-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Accepted: 11/14/2024] [Indexed: 11/25/2024]
Abstract
BACKGROUND: Advances in high-throughput technologies have produced an ever-increasing availability of omics datasets. The integration of multiple heterogeneous data sources is currently an open issue for biology and bioinformatics. Multiple kernel learning (MKL) has been shown to be a flexible and valid approach for handling the diverse nature of multi-omics inputs, despite being an underused tool in genomic data mining. RESULTS: We provide novel MKL approaches based on different kernel fusion strategies. To learn from the meta-kernel of input kernels, we adapted unsupervised integration algorithms for supervised tasks with support vector machines. We also tested deep learning architectures for kernel fusion and classification. The results show that MKL-based models can outperform more complex, state-of-the-art, supervised multi-omics integrative approaches. CONCLUSION: Multiple kernel learning offers a natural framework for predictive models in multi-omics data. It provides a fast and reliable solution that can compete with and outperform more complex architectures. Our results offer a direction for bio-data mining research, biomarker discovery and further development of methods for heterogeneous data integration.
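The simplest kernel fusion strategy is a weighted average of one kernel matrix per omics block. The sketch below builds linear kernels for two toy blocks and averages them into a meta-kernel; it is an illustration only, since the abstract does not specify which fusion rules the authors used, and the data and function names are hypothetical.

```python
def linear_kernel(samples):
    """Gram matrix of pairwise dot products between sample vectors."""
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in samples]
        for row_i in samples
    ]


def fuse_kernels(kernels, weights):
    """Weighted average of same-sized kernel matrices (a simple meta-kernel)."""
    total = sum(weights)
    size = len(kernels[0])
    return [
        [
            sum(w * kernel[i][j] for kernel, w in zip(kernels, weights)) / total
            for j in range(size)
        ]
        for i in range(size)
    ]


# Two hypothetical omics blocks measured on the same three samples.
expression = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
methylation = [[2.0], [0.0], [1.0]]

meta_kernel = fuse_kernels(
    [linear_kernel(expression), linear_kernel(methylation)],
    weights=[1.0, 1.0],
)
print(meta_kernel)
```

A meta-kernel like this can be passed to any support vector machine implementation that accepts a precomputed kernel matrix, for example scikit-learn's `SVC(kernel="precomputed")`.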
Affiliation(s)
- Mitja Briscik
- Institut de Mathématiques de Toulouse, UMR5219, CNRS, UPS, Université de Toulouse, Cedex 9, Toulouse, 31062, France.
| | - Gabriele Tazza
- Department of Computer Science, Applied Artificial Intelligence Group, University of Szeged, Szeged, 6720, Hungary.
| | - László Vidács
- Department of Computer Science, Applied Artificial Intelligence Group, University of Szeged, Szeged, 6720, Hungary
| | - Marie-Agnès Dillies
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, F-75015, Paris, France
| | - Sébastien Déjean
- Institut de Mathématiques de Toulouse, UMR5219, CNRS, UPS, Université de Toulouse, Cedex 9, Toulouse, 31062, France
| |
|
19
|
Yang K, Cheng J, Cao S, Pan X, Shen HB, Yuan Y. Predicting transcriptional changes induced by molecules with MiTCP. Brief Bioinform 2024; 26:bbaf006. [PMID: 39847444 PMCID: PMC11756340 DOI: 10.1093/bib/bbaf006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2024] [Revised: 12/05/2024] [Accepted: 01/21/2025] [Indexed: 01/24/2025]
Abstract
Studying the changes in cellular transcriptional profiles induced by small molecules can significantly advance our understanding of cellular state alterations and response mechanisms under chemical perturbations, which plays a crucial role in drug discovery and screening processes. Because experimental measurements require substantial time and cost, we developed a deep learning-based method called Molecule-induced Transcriptional Change Predictor (MiTCP) to predict changes in transcriptional profiles (CTPs) of 978 landmark genes induced by molecules. MiTCP utilizes graph neural network-based approaches to simultaneously model molecular structure representation and gene co-expression relationships, and integrates them for CTP prediction. After training on the L1000 dataset, MiTCP achieves an average Pearson correlation coefficient (PCC) of 0.482 on the test set and an average PCC of 0.801 for predicting the top 50 differentially expressed genes, outperforming other existing methods. Furthermore, we used MiTCP to predict CTPs of three cancer drugs, palbociclib, irinotecan and goserelin, and performed gene enrichment analysis on the top differentially expressed genes. The enriched pathways and Gene Ontology terms are highly relevant to the corresponding diseases, which reveals the potential of MiTCP in drug development.
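The average PCC reported here is a per-profile Pearson correlation averaged over test samples. A minimal sketch of that metric follows; the profiles are made-up toy values with four genes rather than 978, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def mean_pcc(true_profiles, predicted_profiles):
    """Average the per-profile PCC over all samples."""
    scores = [pearson(t, p) for t, p in zip(true_profiles, predicted_profiles)]
    return sum(scores) / len(scores)


# Toy measured and predicted expression changes (two samples, four genes).
true_profiles = [[0.1, 0.5, -0.3, 0.9], [1.0, 0.0, -1.0, 0.5]]
pred_profiles = [[0.2, 1.0, -0.6, 1.8], [-1.0, 0.0, 1.0, -0.5]]

score = mean_pcc(true_profiles, pred_profiles)
print(round(score, 3))
```

Restricting each profile to the indices of its top differentially expressed genes before calling `pearson` would give the second variant of the metric mentioned in the abstract.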
Affiliation(s)
- Kaiyuan Yang
- Department of Automation, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Minhang District, Shanghai 200240, China
| | - Jiabei Cheng
- Department of Automation, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Minhang District, Shanghai 200240, China
| | - Shenghao Cao
- Department of Automation, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Minhang District, Shanghai 200240, China
| | - Xiaoyong Pan
- Department of Automation, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Minhang District, Shanghai 200240, China
| | - Hong-Bin Shen
- Department of Automation, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Minhang District, Shanghai 200240, China
| | - Ye Yuan
- Department of Automation, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Minhang District, Shanghai 200240, China
- State Key Laboratory of Biopharmaceutical Preparation and Delivery, Institute of Process Engineering, Chinese Academy of Sciences, 1 North 2nd Street, Zhongguancun, Haidian District, Beijing 100190, China
| |
|
20
|
Hu X, Zhang P, Zhang J, Deng L. DeepFusionCDR: Employing Multi-Omics Integration and Molecule-Specific Transformers for Enhanced Prediction of Cancer Drug Responses. IEEE J Biomed Health Inform 2024; 28:6248-6258. [PMID: 38935469 DOI: 10.1109/jbhi.2024.3417014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024]
Abstract
Deep learning approaches have demonstrated remarkable potential in predicting cancer drug responses (CDRs), using cell line and drug features. However, existing methods predominantly rely on single-omics data of cell lines, potentially overlooking the complex biological mechanisms governing cell line responses. This paper introduces DeepFusionCDR, a novel approach employing unsupervised contrastive learning to amalgamate multi-omics features, including mutation, transcriptome, methylome, and copy number variation data, from cell lines. Furthermore, we incorporate molecular SMILES-specific transformers to derive drug features from their chemical structures. The unified multi-omics and drug signatures are combined, and a multi-layer perceptron (MLP) is applied to predict IC50 values for cell line-drug pairs. Moreover, this MLP can discern whether a cell line is resistant or sensitive to a particular drug. We assessed DeepFusionCDR's performance on the GDSC dataset and compared it against cutting-edge methods, demonstrating its superior performance in regression and classification tasks. We also conducted ablation studies and case analyses to demonstrate the effectiveness and versatility of our proposed approach. Our results underscore the potential of DeepFusionCDR to enhance CDR predictions by harnessing the power of multi-omics fusion and molecular-specific transformers. The predictions of DeepFusionCDR on TCGA patient data and the case study highlight its practical application scenarios in real-world environments.
|
21
|
Murmu A, Győrffy B. Artificial intelligence methods available for cancer research. Front Med 2024; 18:778-797. [PMID: 39115792 DOI: 10.1007/s11684-024-1085-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 05/17/2024] [Indexed: 11/01/2024]
Abstract
Cancer is a heterogeneous and multifaceted disease with a significant global footprint. Despite substantial technological advancements for battling cancer, early diagnosis and selection of effective treatment remain a challenge. With the availability of large-scale datasets including multiple levels of data, new bioinformatic tools are needed to transform this wealth of information into clinically useful decision-support tools. In this field, artificial intelligence (AI) technologies with their highly diverse applications are rapidly gaining ground. Machine learning methods, such as Bayesian networks, support vector machines, decision trees, random forests, gradient boosting, and K-nearest neighbors, as well as neural network models such as deep learning, have proven valuable in predictive, prognostic, and diagnostic studies. Researchers have recently employed large language models to tackle new dimensions of problems. However, using AI in clinical settings will require overcoming significant obstacles; a major issue is the limited use of the available reporting guidelines, which obstructs the reproducibility of published studies. In this review, we discuss the applications of AI methods and explore their benefits and limitations. We summarize the available guidelines for AI in healthcare and highlight the potential role and impact of AI models on future directions in cancer research.
Affiliation(s)
- Ankita Murmu
- Institute of Molecular Life Sciences, HUN-REN Research Centre for Natural Sciences, Budapest, 1117, Hungary
- National Laboratory for Drug Research and Development, Budapest, 1117, Hungary
- Department of Bioinformatics, Semmelweis University, Budapest, 1094, Hungary
| | - Balázs Győrffy
- Institute of Molecular Life Sciences, HUN-REN Research Centre for Natural Sciences, Budapest, 1117, Hungary.
- Department of Bioinformatics, Semmelweis University, Budapest, 1094, Hungary.
- Department of Biophysics, University of Pecs, Pecs, 7624, Hungary.
|
22
|
Saranya KR, Vimina ER. DRN-CDR: A cancer drug response prediction model using multi-omics and drug features. Comput Biol Chem 2024; 112:108175. [PMID: 39191166 DOI: 10.1016/j.compbiolchem.2024.108175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 08/09/2024] [Accepted: 08/14/2024] [Indexed: 08/29/2024]
Abstract
Cancer drug response (CDR) prediction is an important area of research that aims to personalize cancer therapy, optimizing treatment plans for maximum effectiveness while minimizing potential negative effects. Despite advances in deep learning techniques, the effective integration of multi-omics data for drug response prediction remains challenging. In this paper, a regression method using a deep ResNet for CDR prediction (DRN-CDR) is proposed. We aim to explore the potential of considering only cancer genes in drug response prediction. Multi-omics data, namely gene expression, mutation, and methylation data, were integrated with the molecular structural information of drugs to predict the IC50 values of drugs. Drug features are extracted by employing a Uniform Graph Convolution Network, while cell line features are extracted using a combination of a Convolutional Neural Network and Fully Connected Networks. These features are then concatenated and fed into a deep ResNet to predict IC50 values for drug-cell line pairs. The proposed method yielded a higher Pearson's correlation coefficient (rp) of 0.7938 and a lower Root Mean Squared Error (RMSE) of 0.92 than the similar methods tCNNS, MOLI, DeepCDR, TGSA, NIHGCN, DeepTTA, GraTransDRP and TSGCNN. Further, when the model is extended to a classification problem to categorize drugs as sensitive or resistant, we achieved AUC and AUPR measures of 0.7623 and 0.7691, respectively. Drugs such as Tivozanib, SNX-2112, CGP-60474, PHA-665752, and Foretinib exhibited low median IC50 values and were found to be effective anti-cancer drugs. The case studies with different TCGA cancer types also revealed the effectiveness of SNX-2112, CGP-60474, Foretinib, Cisplatin, Vinblastine, and others. This consistent pattern strongly suggests the effectiveness of the model in predicting CDR.
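For readers unfamiliar with the two regression metrics quoted in this abstract, the following is a minimal sketch of how Pearson's correlation coefficient and RMSE are conventionally computed from observed and predicted IC50 values. It is not taken from the DRN-CDR code; the function names and the sample values are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical IC50 values (illustrative only, not data from the paper).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]

print(pearson_r(observed, predicted))
print(rmse(observed, predicted))
```

The example shows why both metrics are usually reported together: a constant offset between predictions and observations leaves the correlation unchanged but still contributes to the RMSE.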
Affiliation(s)
- K R Saranya
- Department of Computer Science and IT, School of Computing, Amrita Vishwa Vidyapeetham, Kochi Campus, India
| | - E R Vimina
- Department of Computer Science and IT, School of Computing, Amrita Vishwa Vidyapeetham, Kochi Campus, India.
|
23
|
Acharya D, Mukhopadhyay A. A comprehensive review of machine learning techniques for multi-omics data integration: challenges and applications in precision oncology. Brief Funct Genomics 2024; 23:549-560. [PMID: 38600757 DOI: 10.1093/bfgp/elae013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 03/12/2024] [Accepted: 03/22/2024] [Indexed: 04/12/2024] Open
Abstract
Multi-omics data play a crucial role in precision medicine, mainly to understand the diverse biological interaction between different omics. Machine learning approaches have been extensively employed in this context over the years. This review aims to comprehensively summarize and categorize these advancements, focusing on the integration of multi-omics data, which includes genomics, transcriptomics, proteomics and metabolomics, alongside clinical data. We discuss various machine learning techniques and computational methodologies used for integrating distinct omics datasets and provide valuable insights into their application. The review emphasizes both the challenges and opportunities present in multi-omics data integration, precision medicine and patient stratification, offering practical recommendations for method selection in various scenarios. Recent advances in deep learning and network-based approaches are also explored, highlighting their potential to harmonize diverse biological information layers. Additionally, we present a roadmap for the integration of multi-omics data in precision oncology, outlining the advantages, challenges and implementation difficulties. Hence this review offers a thorough overview of current literature, providing researchers with insights into machine learning techniques for patient stratification, particularly in precision oncology. Contact: anirban@klyuniv.ac.in.
Affiliation(s)
- Debabrata Acharya
- Department of Computer Science & Engineering, University of Kalyani, Kalyani-741235, West Bengal, India
| | - Anirban Mukhopadhyay
- Department of Computer Science & Engineering, University of Kalyani, Kalyani-741235, West Bengal, India
|
24
|
Pak M, Bang D, Sung I, Kim S, Lee S. DGDRP: drug-specific gene selection for drug response prediction via re-ranking through propagating and learning biological network. Front Genet 2024; 15:1441558. [PMID: 39371421 PMCID: PMC11450864 DOI: 10.3389/fgene.2024.1441558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Accepted: 09/03/2024] [Indexed: 10/08/2024] Open
Abstract
Introduction: Drug response prediction, especially in terms of cell viability prediction, is a well-studied research problem with significant implications for personalized medicine. It enables the identification of the most effective drugs based on individual genetic profiles, aids in selecting potential drug candidates, and helps identify biomarkers that predict drug efficacy and toxicity. A deeper investigation on drug response prediction reveals that drugs exert their effects by targeting specific proteins, which in turn perturb related genes in cascading ways. This perturbation affects cellular pathways and regulatory networks, ultimately influencing the cellular response to the drug. Identifying which genes are perturbed and how they interact can provide critical insights into the mechanisms of drug action. Hence, the problem of predicting drug response can be framed as a dual problem involving both the prediction of drug efficacy and the selection of drug-specific genes. Identifying these drug-specific genes (biomarkers) is crucial because they serve as indicators of how the drug will affect the biological system, thereby facilitating both drug response prediction and biomarker discovery. Methods: In this study, we propose DGDRP (Drug-specific Gene selection for Drug Response Prediction), a graph neural network (GNN)-based model that uses a novel rank-and-re-rank process for drug-specific gene selection. DGDRP first ranks genes using a pathway knowledge-enhanced network propagation algorithm based on drug target information, ensuring biological relevance. It then re-ranks genes based on the similarity between gene and drug target embeddings learned from the GNN, incorporating semantic relationships. Thus, our model adaptively learns to select drug mechanism-associated genes that contribute to drug response prediction. This integrated approach not only improves drug response predictions compared to other gene selection methods but also allows for effective biomarker discovery. Discussion: As a result, our approach demonstrates improved drug response predictions compared to other gene selection methods and demonstrates comparability with state-of-the-art deep learning models. Case studies further support our method by showing alignment of selected gene sets with the mechanisms of action of input drugs. Conclusion: Overall, DGDRP represents a deep learning based re-ranking strategy, offering a robust gene selection framework for more accurate drug response prediction. The source code for DGDRP can be found at: https://github.com/minwoopak/heteronet.
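The first ranking step described in this abstract, network propagation from drug targets, can be illustrated with a generic random walk with restart. This is only a sketch of the general technique under stated assumptions: it is not the DGDRP implementation, it omits the pathway-knowledge weighting and the GNN-based re-ranking, and the toy graph, gene names, and function name are all hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-8, max_iter=1000):
    """Propagate score from seed nodes over an undirected graph.

    adjacency: dict mapping each node to a list of its neighbours.
    seeds: set of nodes (e.g. drug targets) where the walk restarts.
    Returns a dict of steady-state visiting probabilities per node.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fraction of the mass always returns to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: send its walking mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            share = (1 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a chain TARGET - A - B - C.
toy_graph = {
    "TARGET": ["A"],
    "A": ["TARGET", "B"],
    "B": ["A", "C"],
    "C": ["B"],
}
scores = random_walk_with_restart(toy_graph, seeds={"TARGET"})
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy chain, genes closer to the seed receive higher scores, which is the intuition behind using propagation to prioritize genes near a drug's targets before any learned re-ranking is applied.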
Affiliation(s)
- Minwoo Pak
- Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea
| | - Dongmin Bang
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Aigendrug Co., Ltd., Seoul, Republic of Korea
| | - Inyoung Sung
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Sun Kim
- Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, Republic of Korea
| | - Sunho Lee
- Aigendrug Co., Ltd., Seoul, Republic of Korea
|
25
|
Connell W, Garcia K, Goodarzi H, Keiser MJ. Learning chemical sensitivity reveals mechanisms of cellular response. Commun Biol 2024; 7:1149. [PMID: 39278951 PMCID: PMC11402971 DOI: 10.1038/s42003-024-06865-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Accepted: 09/06/2024] [Indexed: 09/18/2024] Open
Abstract
Chemical probes interrogate disease mechanisms at the molecular level by linking genetic changes to observable traits. However, comprehensive chemical screens in diverse biological models are impractical. To address this challenge, we develop ChemProbe, a model that predicts cellular sensitivity to hundreds of molecular probes and drugs by learning to combine transcriptomes and chemical structures. Using ChemProbe, we infer the chemical sensitivity of cancer cell lines and tumor samples and analyze how the model makes predictions. We retrospectively evaluate drug response predictions for precision breast cancer treatment and prospectively validate chemical sensitivity predictions in new cellular models, including a genetically modified cell line. Our model interpretation analysis identifies transcriptome features reflecting compound targets and protein network modules, identifying genes that drive ferroptosis. ChemProbe is an interpretable in silico screening tool that allows researchers to measure cellular response to diverse compounds, facilitating research into molecular mechanisms of chemical sensitivity.
Affiliation(s)
- William Connell
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, USA
- Institute for Neurodegenerative Diseases, University of California, San Francisco, San Francisco, CA, USA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Kristle Garcia
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Department of Urology, University of California, San Francisco, San Francisco, CA, USA
- Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, USA
| | - Hani Goodarzi
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Department of Urology, University of California, San Francisco, San Francisco, CA, USA
- Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, USA
| | - Michael J Keiser
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, USA.
- Institute for Neurodegenerative Diseases, University of California, San Francisco, San Francisco, CA, USA.
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA.
|
26
|
Ren Y, Wu C, Zhou H, Hu X, Miao Z. Dual-extraction modeling: A multi-modal deep-learning architecture for phenotypic prediction and functional gene mining of complex traits. Plant Communications 2024; 5:101002. [PMID: 38872306 DOI: 10.1016/j.xplc.2024.101002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Revised: 05/27/2024] [Accepted: 06/11/2024] [Indexed: 06/15/2024]
Abstract
Despite considerable advances in extracting crucial insights from bio-omics data to unravel the intricate mechanisms underlying complex traits, the absence of a universal multi-modal computational tool with robust interpretability for accurate phenotype prediction and identification of trait-associated genes remains a challenge. This study introduces the dual-extraction modeling (DEM) approach, a multi-modal deep-learning architecture designed to extract representative features from heterogeneous omics datasets, enabling the prediction of complex trait phenotypes. Through comprehensive benchmarking experiments, we demonstrate the efficacy of DEM in classification and regression prediction of complex traits. DEM consistently exhibits superior accuracy, robustness, generalizability, and flexibility. Notably, we establish its effectiveness in predicting pleiotropic genes that influence both flowering time and rosette leaf number, underscoring its commendable interpretability. In addition, we have developed user-friendly software to facilitate seamless utilization of DEM's functions. In summary, this study presents a state-of-the-art approach with the ability to effectively predict qualitative and quantitative traits and identify functional genes, confirming its potential as a valuable tool for exploring the genetic basis of complex traits.
Affiliation(s)
- Yanlin Ren
- State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Chenhua Wu
- State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - He Zhou
- State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Xiaona Hu
- College of Chemistry & Pharmacy, Northwest A&F University, Yangling, Shaanxi 712100, China.
| | - Zhenyan Miao
- State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi 712100, China; Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China.
|
27
|
Xu M, Zhu Z, Zhao Y, He K, Huang Q, Zhao Y. RedCDR: Dual Relation Distillation for Cancer Drug Response Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1468-1479. [PMID: 38776197 DOI: 10.1109/tcbb.2024.3404262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2024]
Abstract
Based on multi-omics data and drug information, predicting the response of cancer cell lines to drugs is a crucial area of research in modern oncology, as it can promote the development of personalized treatments. Despite the promising performance achieved by existing models, most of them overlook the variations among different omics and lack effective integration of multi-omics data. Moreover, the explicit modeling of cell line/drug attribute and cell line-drug association has not been thoroughly investigated in existing approaches. To address these issues, we propose RedCDR, a dual relation distillation model for cancer drug response (CDR) prediction. Specifically, a parallel dual-branch architecture is designed to enable both the independent learning and interactive fusion feasible for cell line/drug attribute and cell line-drug association information. To facilitate the adaptive interacting integration of multi-omics data, the proposed multi-omics encoder introduces the multiple similarity relations between cell lines and takes the importance of different omics data into account. To accomplish knowledge transfer from the two independent attribute and association branches to their fusion, a dual relation distillation mechanism consisting of representation distillation and prediction distillation is presented. Experiments conducted on the GDSC and CCLE datasets show that RedCDR outperforms previous state-of-the-art approaches in CDR prediction.
|
28
|
Huang Z, Fan Z, Shen S, Wu M, Deng L. MolMVC: Enhancing molecular representations for drug-related tasks through multi-view contrastive learning. Bioinformatics 2024; 40:ii190-ii197. [PMID: 39230706 PMCID: PMC11373324 DOI: 10.1093/bioinformatics/btae386] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/05/2024] Open
Abstract
MOTIVATION Effective molecular representation is critical in drug development. The complex nature of molecules demands comprehensive multi-view representations, considering 1D, 2D, and 3D aspects, to capture diverse perspectives. Obtaining representations that encompass these varied structures is crucial for a holistic understanding of molecules in drug-related contexts. RESULTS In this study, we introduce an innovative multi-view contrastive learning framework for molecular representation, denoted as MolMVC. Initially, we use a Transformer encoder to capture 1D sequence information and a Graph Transformer to encode the intricate 2D and 3D structural details of molecules. Our approach incorporates a novel attention-guided augmentation scheme, leveraging prior knowledge to create positive samples tailored to different molecular data views. To align multi-view molecular positive samples effectively in latent space, we introduce an adaptive multi-view contrastive loss (AMCLoss). In particular, we calculate AMCLoss at various levels within the model to effectively capture the hierarchical nature of the molecular information. Eventually, we pre-train the encoders via minimizing AMCLoss to obtain the molecular representation, which can be used for various down-stream tasks. In our experiments, we evaluate the performance of our MolMVC on multiple tasks, including molecular property prediction (MPP), drug-target binding affinity (DTA) prediction and cancer drug response (CDR) prediction. The results demonstrate that the molecular representation learned by our MolMVC can enhance the predictive accuracy on these tasks and also reduce the computational costs. Furthermore, we showcase MolMVC's efficacy in drug repositioning across a spectrum of drug-related applications. AVAILABILITY AND IMPLEMENTATION The code and pre-trained model are publicly available at https://github.com/Hhhzj-7/MolMVC.
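The idea of aligning positive samples from different views in latent space can be illustrated with a standard InfoNCE-style contrastive loss. This sketch is a generic stand-in and is not the adaptive multi-view contrastive loss (AMCLoss) defined by the authors; the function names, the temperature value, and the two-dimensional toy embeddings are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.1):
    """Mean InfoNCE loss where view_a[i] and view_b[i] form the positive pair.

    All other rows of view_b act as negatives for view_a[i].
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - logits[i]
    return total / n


# Hypothetical embeddings of two molecules under two views (e.g. 1D and 2D).
view_1d = [[1.0, 0.0], [0.0, 1.0]]
view_2d_aligned = [[0.9, 0.1], [0.1, 0.9]]
view_2d_swapped = [[0.1, 0.9], [0.9, 0.1]]

print(info_nce(view_1d, view_2d_aligned))
print(info_nce(view_1d, view_2d_swapped))
```

Minimizing a loss of this kind pulls embeddings of the same molecule together across views while pushing different molecules apart, which is why the aligned toy case yields a much smaller value than the swapped one.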
Affiliation(s)
- Zhijian Huang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Ziyu Fan
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Siyuan Shen
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Min Wu
- Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
| | - Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
|
29
|
Fu W, Lin Y, Bai M, Yao J, Huang C, Gao L, Mi N, Ma H, Tian L, Yue P, Zhang Y, Zhang J, Ren Y, Ding L, Dai L, Leung JW, Yuan J, Zhang W, Meng W. Beyond ribosomal function: RPS6 deficiency suppresses cholangiocarcinoma cell growth by disrupting alternative splicing. Acta Pharm Sin B 2024; 14:3931-3948. [PMID: 39309509 PMCID: PMC11413689 DOI: 10.1016/j.apsb.2024.06.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 05/05/2024] [Accepted: 05/23/2024] [Indexed: 09/25/2024] Open
Abstract
Cholangiocarcinoma (CCA) is a bile duct malignancy with a dismal prognosis. This study systematically investigated the role of the ribosomal protein S6 (RPS6) gene, which is dependent in CCA. We found that RPS6 upregulation in CCA tissues was correlated with a poor prognosis. Functional investigations showed that alterations in RPS6 expression, both gain- and loss-of-function, could affect the proliferation of CCA cells. In xenograft tumor models, RPS6 overexpression enhanced tumorigenicity, whereas RPS6 silencing reduced it. Integrated analysis of RNA-seq and proteomics data showed that RPS6 depletion affects downstream signaling pathways involved in the cell cycle, especially DNA replication. Immunoprecipitation followed by mass spectrometry identified numerous spliceosome complex proteins associated with RPS6. Transcriptomic profiling revealed that RPS6 affects numerous alternative splicing (AS) events, and combined with RNA immunoprecipitation sequencing, revealed that minichromosome maintenance complex component 7 (MCM7) binds to RPS6, which regulates its AS and increases oncogenic activity in CCA. Targeting RPS6 with vivo phosphorodiamidate morpholino oligomer (V-PMO) significantly inhibited the growth of CCA cells, patient-derived organoids, and subcutaneous xenograft tumors. Taken together, the data demonstrate that RPS6 is an oncogenic regulator in CCA and that RPS6-V-PMO could be repositioned as a promising strategy for treating CCA.
Affiliation(s)
- Wenkang Fu
- The First School of Clinical Medicine, Lanzhou University, Lanzhou 730030, China
| | - Yanyan Lin
- Department of General Surgery, the First Hospital of Lanzhou University, Lanzhou 730030, China
| | - Mingzhen Bai
- The First School of Clinical Medicine, Lanzhou University, Lanzhou 730030, China
| | - Jia Yao
- The First School of Clinical Medicine, Lanzhou University, Lanzhou 730030, China
- Key Laboratory of Biotherapy and Regenerative Medicine of Gansu Province, the First Hospital of Lanzhou University, Lanzhou 730000, China
| | - Chongfei Huang
- The First School of Clinical Medicine, Lanzhou University, Lanzhou 730030, China
| | - Long Gao
- The First School of Clinical Medicine, Lanzhou University, Lanzhou 730030, China
| | - Ningning Mi
- The First School of Clinical Medicine, Lanzhou University, Lanzhou 730030, China
| | - Haidong Ma
- The First School of Clinical Medicine, Lanzhou University, Lanzhou 730030, China
| | - Liang Tian
- The First School of Clinical Medicine, Lanzhou University, Lanzhou 730030, China
| | - Ping Yue
- Department of General Surgery, the First Hospital of Lanzhou University, Lanzhou 730030, China
| | - Yong Zhang
- Department of General Surgery, the First Hospital of Lanzhou University, Lanzhou 730030, China
| | - Jinduo Zhang
- Department of General Surgery, the First Hospital of Lanzhou University, Lanzhou 730030, China
| | - Yanxian Ren
- Department of General Surgery, the First Hospital of Lanzhou University, Lanzhou 730030, China
| | - Liyun Ding
- School of Physical Science and Technology, Lanzhou University, Lanzhou 730000, China
| | - Lunzhi Dai
- National Clinical Research Center for Geriatrics and Department of General Practice, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu 610041, China
| | - Joseph W. Leung
- Division of Gastroenterology, UC Davis Medical Center and Sacramento VA Medical Center, Sacramento, CA 95817, USA
| | - Jinqiu Yuan
- Clinical Research Center, Big Data Center, the Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen 518107, China
| | - Wenhua Zhang
- School of Life Sciences, Lanzhou University, Lanzhou 730000, China
| | - Wenbo Meng
- The First School of Clinical Medicine, Lanzhou University, Lanzhou 730030, China
- Department of General Surgery, the First Hospital of Lanzhou University, Lanzhou 730030, China
|
30
|
Meier TA, Refahi MS, Hearne G, Restifo DS, Munoz-Acuna R, Rosen GL, Woloszynek S. The Role and Applications of Artificial Intelligence in the Treatment of Chronic Pain. Curr Pain Headache Rep 2024; 28:769-784. [PMID: 38822995 DOI: 10.1007/s11916-024-01264-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/28/2024] [Indexed: 06/03/2024]
Abstract
PURPOSE OF REVIEW This review aims to explore the interface between artificial intelligence (AI) and chronic pain, seeking to identify areas of focus for enhancing current treatments and yielding novel therapies. RECENT FINDINGS In the United States, the prevalence of chronic pain is estimated to be upwards of 40%. Its impact extends to increased healthcare costs, reduced economic productivity, and strain on healthcare resources. Addressing this condition is particularly challenging due to its complexity and the significant variability in how patients respond to treatment. Current options often struggle to provide long-term relief, with their benefits rarely outweighing the risks, such as dependency or other side effects. Currently, AI has impacted four key areas of chronic pain treatment and research: (1) predicting outcomes based on clinical information; (2) extracting features from text, specifically clinical notes; (3) modeling 'omic data to identify meaningful patient subgroups with potential for personalized treatments and improved understanding of disease processes; and (4) disentangling complex neuronal signals responsible for pain, which current therapies attempt to modulate. As AI advances, leveraging state-of-the-art architectures will be essential for improving chronic pain treatment. Current efforts aim to extract meaningful representations from complex data, paving the way for personalized medicine. The identification of unique patient subgroups should reveal targets for tailored chronic pain treatments. Moreover, enhancing current treatment approaches is achievable by gaining a more profound understanding of patient physiology and responses. This can be realized by leveraging AI on the increasing volume of data linked to chronic pain.
Affiliation(s)
| | - Mohammad S Refahi
- Ecological and Evolutionary Signal-Processing and Informatics (EESI) Laboratory, Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA, USA
| | - Gavin Hearne
- Ecological and Evolutionary Signal-Processing and Informatics (EESI) Laboratory, Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA, USA
| | | | - Ricardo Munoz-Acuna
- Anesthesia, Critical Care, and Pain Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Gail L Rosen
- Ecological and Evolutionary Signal-Processing and Informatics (EESI) Laboratory, Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA, USA
| | - Stephen Woloszynek
- Anesthesia, Critical Care, and Pain Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA.
|
31
|
Mohammadzadeh-Vardin T, Ghareyazi A, Gharizadeh A, Abbasi K, Rabiee HR. DeepDRA: Drug repurposing using multi-omics data integration with autoencoders. PLoS One 2024; 19:e0307649. [PMID: 39058696 PMCID: PMC11280260 DOI: 10.1371/journal.pone.0307649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 07/09/2024] [Indexed: 07/28/2024] Open
Abstract
Cancer treatment has become one of the biggest challenges in the world today. Different treatments are used against cancer; drug-based treatments have shown better results. On the other hand, designing new drugs for cancer is costly and time-consuming. Some computational methods, such as machine learning and deep learning, have been suggested to solve these challenges using drug repurposing. Despite the promise of classical machine-learning methods in repurposing cancer drugs and predicting responses, deep-learning methods performed better. This study aims to develop a deep-learning model that predicts cancer drug response based on multi-omics data, drug descriptors, and drug fingerprints and facilitates the repurposing of drugs based on those responses. To reduce multi-omics data's dimensionality, we use autoencoders. As a multi-task learning model, autoencoders are connected to MLPs. We extensively tested our model using three primary datasets: GDSC, CTRP, and CCLE to determine its efficacy. In multiple experiments, our model consistently outperforms existing state-of-the-art methods. Compared to state-of-the-art models, our model achieves an impressive AUPRC of 0.99. Furthermore, in a cross-dataset evaluation, where the model is trained on GDSC and tested on CCLE, it surpasses the performance of three previous works, achieving an AUPRC of 0.72. In conclusion, we presented a deep learning model that outperforms the current state-of-the-art regarding generalization. Using this model, we could assess drug responses and explore drug repurposing, leading to the discovery of novel cancer drugs. Our study highlights the potential for advanced deep learning to advance cancer therapeutic precision.
Affiliation(s)
- Taha Mohammadzadeh-Vardin
- Department of Computer Engineering, Bioinformatics and Computational Biology Lab, Sharif University of Technology, Tehran, Iran
| | - Amin Ghareyazi
- Department of Computer Engineering, Bioinformatics and Computational Biology Lab, Sharif University of Technology, Tehran, Iran
| | - Ali Gharizadeh
- Department of Computer Engineering, Bioinformatics and Computational Biology Lab, Sharif University of Technology, Tehran, Iran
| | - Karim Abbasi
- Department of Computer Engineering, Bioinformatics and Computational Biology Lab, Sharif University of Technology, Tehran, Iran
- Faculty of Mathematics and Computer Science, Kharazmi University, Tehran, Iran
| | - Hamid R. Rabiee
- Department of Computer Engineering, Bioinformatics and Computational Biology Lab, Sharif University of Technology, Tehran, Iran
| |
|
32
|
Sohrabei S, Moghaddasi H, Hosseini A, Ehsanzadeh SJ. Investigating the effects of artificial intelligence on the personalization of breast cancer management: a systematic study. BMC Cancer 2024; 24:852. [PMID: 39026174 PMCID: PMC11256548 DOI: 10.1186/s12885-024-12575-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Accepted: 06/27/2024] [Indexed: 07/20/2024] Open
Abstract
BACKGROUND Providing appropriate specialized treatment to the right patient at the right time is considered necessary in cancer management. Targeted therapy tailored to the genetic changes of each breast cancer patient is a desirable feature of precision oncology, which can not only reduce disease progression but also potentially increase patient survival. The use of artificial intelligence alongside precision oncology can help physicians by identifying and selecting more effective treatment factors for patients. METHOD A systematic review was conducted using the PubMed, Embase, Scopus, and Web of Science databases in September 2023. We performed the search strategy with keywords, namely: Breast Cancer, Artificial Intelligence, and Precision Oncology along with their synonyms in the article titles. Descriptive, qualitative, review, and non-English studies were excluded. The quality assessment of the articles and evaluation of bias were determined based on the SJR journal and JBI indices, as well as the PRISMA2020 guideline. RESULTS Forty-six studies were selected that focused on personalized breast cancer management using artificial intelligence models. Seventeen studies using various deep learning methods achieved a satisfactory outcome in predicting treatment response and prognosis, contributing to personalized breast cancer management. Two studies utilizing neural networks and clustering provided acceptable indicators for predicting patient survival and categorizing breast tumors. One study employed transfer learning to predict treatment response. Twenty-six studies utilizing machine-learning methods demonstrated that these techniques can improve breast cancer classification, screening, diagnosis, and prognosis. The most frequent modeling techniques used were NB, SVM, RF, XGBoost, and Reinforcement Learning. The average area under the curve (AUC) for the models was 0.91. Moreover, the average values for accuracy, sensitivity, specificity, and precision were reported to be in the range of 90-96% for the models. CONCLUSION Artificial intelligence has proven to be effective in assisting physicians and researchers in managing breast cancer treatment by uncovering hidden patterns in complex omics and genetic data. Intelligent processing of omics data through protein and gene pattern classification and the utilization of deep neural patterns has the potential to significantly transform the field of complex disease management.
Affiliation(s)
- Solmaz Sohrabei
- Department of Health Information Technology and Management, Medical Informatics, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Hamid Moghaddasi
- Department of Health Information Technology and Management, Medical Informatics, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
| | - Azamossadat Hosseini
- Department of Health Information Technology and Management, Health Information Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
| | - Seyed Jafar Ehsanzadeh
- Department of English Language, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran
| |
|
33
|
Yeh SJ, Paithankar S, Chen R, Xing J, Sun M, Liu K, Zhou J, Chen B. TransCell: In Silico Characterization of Genomic Landscape and Cellular Responses by Deep Transfer Learning. Genomics, Proteomics & Bioinformatics 2024; 22:qzad008. [PMID: 39240541 PMCID: PMC11378636 DOI: 10.1093/gpbjnl/qzad008] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/27/2022] [Revised: 06/30/2023] [Accepted: 09/20/2023] [Indexed: 09/07/2024]
Abstract
Gene expression profiling of new or modified cell lines has become routine; however, obtaining comprehensive molecular characterization and cellular responses for a variety of cell lines, including those derived from underrepresented groups, is not trivial when resources are minimal. Using gene expression to predict other measurements has been actively explored; however, its predictive power across various measurements has not been systematically investigated. Here, we evaluated commonly used machine learning methods and presented TransCell, a two-step deep transfer learning framework that utilized the knowledge derived from pan-cancer tumor samples to predict molecular features and responses. Among these models, TransCell had the best performance in predicting metabolite, gene effect score (or genetic dependency), and drug sensitivity, and had comparable performance in predicting mutation, copy number variation, and protein expression. Notably, TransCell improved the performance by over 50% in drug sensitivity prediction and achieved a correlation of 0.7 in gene effect score prediction. Furthermore, predicted drug sensitivities revealed potential repurposing candidates for 100 new pediatric cancer cell lines, and predicted gene effect scores reflected BRAF resistance in melanoma cell lines. Together, we investigated the predictive power of gene expression in six molecular measurement types and developed a web portal (http://apps.octad.org/transcell/) that enables the prediction of 352,000 genomic and cellular response features solely from gene expression profiles.
Affiliation(s)
- Shan-Ju Yeh
- Department of Pediatrics and Human Development, Michigan State University, Grand Rapids, MI 49503, USA
| | - Shreya Paithankar
- Department of Pediatrics and Human Development, Michigan State University, Grand Rapids, MI 49503, USA
| | - Ruoqiao Chen
- Department of Pharmacology and Toxicology, Michigan State University, Grand Rapids, MI 49503, USA
| | - Jing Xing
- Department of Pediatrics and Human Development, Michigan State University, Grand Rapids, MI 49503, USA
| | - Mengying Sun
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Ke Liu
- Department of Pediatrics and Human Development, Michigan State University, Grand Rapids, MI 49503, USA
| | - Jiayu Zhou
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Bin Chen
- Department of Pediatrics and Human Development, Michigan State University, Grand Rapids, MI 49503, USA
- Department of Pharmacology and Toxicology, Michigan State University, Grand Rapids, MI 49503, USA
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| |
|
34
|
Feng Y, Soni A, Brightwell G, M Reis M, Wang Z, Wang J, Wu Q, Ding Y. The potential new microbial hazard monitoring tool in food safety: Integration of metabolomics and artificial intelligence. Trends Food Sci Technol 2024; 149:104555. [DOI: 10.1016/j.tifs.2024.104555] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/04/2025]
|
35
|
Chen HO, Cui YC, Lin PC, Chiang JH. An Innovative Multi-Omics Model Integrating Latent Alignment and Attention Mechanism for Drug Response Prediction. J Pers Med 2024; 14:694. [PMID: 39063948 PMCID: PMC11277895 DOI: 10.3390/jpm14070694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Revised: 06/18/2024] [Accepted: 06/24/2024] [Indexed: 07/28/2024] Open
Abstract
By using omics, we can now examine all components of biological systems simultaneously. Deep learning-based drug prediction methods have shown promise by integrating cancer-related multi-omics data. However, the complex interaction between genes poses challenges in accurately projecting multi-omics data. In this research, we present a predictive model for drug response that incorporates diverse types of omics data, comprising genetic mutation, copy number variation, methylation, and gene expression data. This study proposes latent alignment for information mismatch in integration, which is achieved through an attention module capturing interactions among diverse types of omics data. The latent alignment and attention modules significantly improve predictions, outperforming the baseline model, with MSE = 1.1333, F1-score = 0.5342, and AUROC = 0.5776. High accuracy was achieved in predicting drug responses for piplartine and tenovin-6, while the accuracy was comparatively lower for mitomycin-C and obatoclax. The latent alignment module exclusively outperforms the baseline model, enhancing the MSE by 0.2375, the F1-score by 4.84%, and the AUROC by 6.1%. Similarly, the attention module only improves these metrics by 0.1899, 2.88%, and 2.84%, respectively. In the interpretability case study, panobinostat exhibited the most effective predicted response, with a value of -4.895. We provide reliable insights for drug selection in personalized medicine by identifying crucial genetic factors influencing drug response.
Affiliation(s)
- Hui-O Chen
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 701, Taiwan
- Institute of Medical Informatics, National Cheng Kung University, Tainan 701, Taiwan
| | - Yuan-Chi Cui
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 701, Taiwan
- Institute of Medical Informatics, National Cheng Kung University, Tainan 701, Taiwan
| | - Peng-Chan Lin
- Department of Oncology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 701, Taiwan
- Department of Genomic Medicine, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 701, Taiwan
| | - Jung-Hsien Chiang
- Department of Oncology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 701, Taiwan
- Department of Genomic Medicine, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 701, Taiwan
| |
|
36
|
Lee S, Sun M, Hu Y, Wang Y, Islam MN, Goerlitz D, Lucas PC, Lee AV, Swain SM, Tang G, Wang XS. iGenSig-Rx: an integral genomic signature based white-box tool for modeling cancer therapeutic responses using multi-omics data. BMC Bioinformatics 2024; 25:220. [PMID: 38898383 PMCID: PMC11186173 DOI: 10.1186/s12859-024-05835-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Accepted: 06/10/2024] [Indexed: 06/21/2024] Open
Abstract
Multi-omics sequencing is poised to revolutionize clinical care in the coming decade. However, there is a lack of effective and interpretable genome-wide modeling methods for the rational selection of patients for personalized interventions. To address this, we present iGenSig-Rx, an integral genomic signature-based approach, as a transparent tool for modeling therapeutic response using clinical trial datasets. This method adeptly addresses challenges related to cross-dataset modeling by capitalizing on high-dimensional redundant genomic features, analogous to reinforcing building pillars with redundant steel rods. Moreover, it integrates adaptive penalization of feature redundancy on a per-sample basis to prevent score flattening and mitigate overfitting. We then developed a purpose-built R package to implement this method for modeling clinical trial datasets. When applied to genomic datasets for HER2 targeted therapies, iGenSig-Rx model demonstrates consistent and reliable predictive power across four independent clinical trials. More importantly, the iGenSig-Rx model offers the level of transparency much needed for clinical application, allowing for clear explanations as to how the predictions are produced, how the features contribute to the prediction, and what are the key underlying pathways. We anticipate that iGenSig-Rx, as an interpretable class of multi-omics modeling methods, will find broad applications in big-data based precision oncology. The R package is available: https://github.com/wangxlab/iGenSig-Rx .
Affiliation(s)
- Sanghoon Lee
- UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15206, USA
| | - Min Sun
- UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Department of Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15261, USA
| | - Yiheng Hu
- UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, USA
| | - Yue Wang
- UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, USA
| | - Md N Islam
- Genomics and Epigenomics Shared Resource (GESR), Georgetown University Medical Center, Washington, DC, 20057, USA
| | - David Goerlitz
- Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, Washington, DC, 20057, USA
| | - Peter C Lucas
- UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- National Surgical Adjuvant Breast and Bowel Project (NSABP), Pittsburgh, PA, 15213, USA
| | - Adrian V Lee
- UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Department of Pharmacology and Chemical Biology, University of Pittsburgh, Pittsburgh, PA, 15213, USA
| | - Sandra M Swain
- National Surgical Adjuvant Breast and Bowel Project (NSABP), Pittsburgh, PA, 15213, USA
| | - Gong Tang
- Department of Biostatistics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, 15261, USA
- National Surgical Adjuvant Breast and Bowel Project (NSABP), Pittsburgh, PA, 15213, USA
| | - Xiao-Song Wang
- UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, 15213, USA.
- Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, USA.
- Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15206, USA.
| |
|
37
|
Wang FA, Zhuang Z, Gao F, He R, Zhang S, Wang L, Liu J, Li Y. TMO-Net: an explainable pretrained multi-omics model for multi-task learning in oncology. Genome Biol 2024; 25:149. [PMID: 38845006 PMCID: PMC11157742 DOI: 10.1186/s13059-024-03293-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 05/29/2024] [Indexed: 06/09/2024] Open
Abstract
Cancer is a complex disease involving systemic alterations at multiple scales. In this study, we develop the Tumor Multi-Omics pre-trained Network (TMO-Net) that integrates multi-omics pan-cancer datasets for model pre-training, facilitating cross-omics interactions and enabling joint representation learning and incomplete omics inference. This model enhances multi-omics sample representation and empowers various downstream oncology tasks with incomplete multi-omics datasets. By employing interpretable learning, we characterize the contributions of distinct omics features to clinical outcomes. The TMO-Net model serves as a versatile framework for cross-modal multi-omics learning in oncology, paving the way for tumor omics-specific foundation models.
Affiliation(s)
- Feng-Ao Wang
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China
- Guangzhou National Laboratory, Guangzhou, 510005, China
| | - Zhenfeng Zhuang
- Department of Computer Science at the School of Informatics, Xiamen University, Xiamen, 361005, China
| | - Feng Gao
- Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, 510655, China
- Shanghai Artificial Intelligence Laboratory, Shanghai, 200433, China
- Biomedical Innovation Center, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, 510655, China
| | - Ruikun He
- BYHEALTH Institute of Nutrition & Health, Guangzhou, 510000, China
| | - Shaoting Zhang
- Shanghai Artificial Intelligence Laboratory, Shanghai, 200433, China
| | - Liansheng Wang
- Department of Computer Science at the School of Informatics, Xiamen University, Xiamen, 361005, China.
| | - Junwei Liu
- Guangzhou National Laboratory, Guangzhou, 510005, China.
| | - Yixue Li
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China.
- Guangzhou National Laboratory, Guangzhou, 510005, China.
- Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, 200030, China.
- GZMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Medical University, Guangzhou, 511436, China.
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China.
- Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai, 200433, China.
- Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, 200032, China.
| |
|
38
|
Abinas V, Abhinav U, Haneem EM, Vishnusankar A, Nazeer KAA. Integration of autoencoder and graph convolutional network for predicting breast cancer drug response. J Bioinform Comput Biol 2024; 22:2450013. [PMID: 39051144 DOI: 10.1142/s0219720024500136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2024]
Abstract
Background and objectives: Breast cancer is the most prevalent type of cancer among women. The effectiveness of anticancer pharmacological therapy may get adversely affected by tumor heterogeneity that includes genetic and transcriptomic features. This leads to clinical variability in patient response to therapeutic drugs. Anticancer drug design and cancer understanding require precise identification of cancer drug responses. The performance of drug response prediction models can be improved by integrating multi-omics data and drug structure data. Methods: In this paper, we propose an Autoencoder (AE) and Graph Convolutional Network (AGCN) for drug response prediction, which integrates multi-omics data and drug structure data. Specifically, we first converted the high dimensional representation of each omic data to a lower dimensional representation using an AE for each omic data set. Subsequently, these individual features are combined with drug structure data obtained using a Graph Convolutional Network and given to a Convolutional Neural Network to calculate IC50 values for every combination of cell lines and drugs. Then a threshold IC50 value is obtained for each drug by performing K-means clustering of their known IC50 values. Finally, with the help of this threshold value, cell lines are classified as either sensitive or resistant to each drug. Results: Experimental results indicate that AGCN has an accuracy of 0.82 and performs better than many existing methods. In addition to that, we have done external validation of AGCN using data taken from The Cancer Genome Atlas (TCGA) clinical database, and we got an accuracy of 0.91. Conclusion: According to the results obtained, concatenating multi-omics data with drug structure data using AGCN for drug response prediction tasks greatly improves the accuracy of the prediction task.
Affiliation(s)
- V Abinas
- Department of Computer Science and Engineering, National Institute of Technology Calicut, Calicut, Kerala, India
| | - U Abhinav
- Department of Computer Science and Engineering, National Institute of Technology Calicut, Calicut, Kerala, India
| | - E M Haneem
- Department of Computer Science and Engineering, National Institute of Technology Calicut, Calicut, Kerala, India
| | - A Vishnusankar
- Department of Computer Science and Engineering, National Institute of Technology Calicut, Calicut, Kerala, India
| | - K A Abdul Nazeer
- Department of Computer Science and Engineering, National Institute of Technology Calicut, Calicut, Kerala, India
| |
|
39
|
Eckhart L, Lenhof K, Rolli LM, Lenhof HP. A comprehensive benchmarking of machine learning algorithms and dimensionality reduction methods for drug sensitivity prediction. Brief Bioinform 2024; 25:bbae242. [PMID: 38797968 PMCID: PMC11128483 DOI: 10.1093/bib/bbae242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 04/05/2024] [Accepted: 05/06/2024] [Indexed: 05/29/2024] Open
Abstract
A major challenge of precision oncology is the identification and prioritization of suitable treatment options based on molecular biomarkers of the considered tumor. In pursuit of this goal, large cancer cell line panels have successfully been studied to elucidate the relationship between cellular features and treatment response. Due to the high dimensionality of these datasets, machine learning (ML) is commonly used for their analysis. However, choosing a suitable algorithm and set of input features can be challenging. We performed a comprehensive benchmarking of ML methods and dimension reduction (DR) techniques for predicting drug response metrics. Using the Genomics of Drug Sensitivity in Cancer cell line panel, we trained random forests, neural networks, boosting trees and elastic nets for 179 anti-cancer compounds with feature sets derived from nine DR approaches. We compare the results regarding statistical performance, runtime and interpretability. Additionally, we provide strategies for assessing model performance compared with a simple baseline model and measuring the trade-off between models of different complexity. Lastly, we show that complex ML models benefit from using an optimized DR strategy, and that standard models-even when using considerably fewer features-can still be superior in performance.
Affiliation(s)
- Lea Eckhart
- Center for Bioinformatics, Saarland Informatics Campus, Saarland University, 66123, Saarland, Germany
| | - Kerstin Lenhof
- Center for Bioinformatics, Saarland Informatics Campus, Saarland University, 66123, Saarland, Germany
| | - Lisa-Marie Rolli
- Center for Bioinformatics, Saarland Informatics Campus, Saarland University, 66123, Saarland, Germany
| | - Hans-Peter Lenhof
- Center for Bioinformatics, Saarland Informatics Campus, Saarland University, 66123, Saarland, Germany
| |
|
40
|
Liu X, Tao Y, Cai Z, Bao P, Ma H, Li K, Li M, Zhu Y, Lu ZJ. Pathformer: a biological pathway informed transformer for disease diagnosis and prognosis using multi-omics data. Bioinformatics 2024; 40:btae316. [PMID: 38741230 PMCID: PMC11139513 DOI: 10.1093/bioinformatics/btae316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 03/29/2024] [Accepted: 05/11/2024] [Indexed: 05/16/2024] Open
Abstract
MOTIVATION Multi-omics data provide a comprehensive view of gene regulation at multiple levels, which is helpful in achieving accurate diagnosis of complex diseases like cancer. However, conventional integration methods rarely utilize prior biological knowledge and lack interpretability. RESULTS To integrate various multi-omics data of tissue and liquid biopsies for disease diagnosis and prognosis, we developed a biological pathway informed Transformer, Pathformer. It embeds multi-omics input with a compacted multi-modal vector and a pathway-based sparse neural network. Pathformer also leverages criss-cross attention mechanism to capture the crosstalk between different pathways and modalities. We first benchmarked Pathformer with 18 comparable methods on multiple cancer datasets, where Pathformer outperformed all the other methods, with an average improvement of 6.3%-14.7% in F1 score for cancer survival prediction, 5.1%-12% for cancer stage prediction, and 8.1%-13.6% for cancer drug response prediction. Subsequently, for cancer prognosis prediction based on tissue multi-omics data, we used a case study to demonstrate the biological interpretability of Pathformer by identifying key pathways and their biological crosstalk. Then, for cancer early diagnosis based on liquid biopsy data, we used plasma and platelet datasets to demonstrate Pathformer's potential of clinical applications in cancer screening. Moreover, we revealed deregulation of interesting pathways (e.g. scavenger receptor pathway) and their crosstalk in cancer patients' blood, providing potential candidate targets for cancer microenvironment study. AVAILABILITY AND IMPLEMENTATION Pathformer is implemented and freely available at https://github.com/lulab/Pathformer.
Affiliation(s)
- Xiaofan Liu
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
| | - Yuhuan Tao
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
| | - Zilin Cai
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Pengfei Bao
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
| | - Hongli Ma
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
| | - Kexing Li
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Mengtao Li
- Department of Rheumatology and Clinical Immunology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases (NCRC-DID), MST State Key Laboratory of Complex Severe and Rare Diseases, MOE Key Laboratory of Rheumatology and Clinical Immunology, Beijing 100730, China
| | - Yunping Zhu
- State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
| | - Zhi John Lu
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
| |
|
41
|
Deng D, Xu X, Cui T, Xu M, Luo K, Zhang H, Wang Q, Song C, Li C, Li G, Shang D. PBAC: A pathway-based attention convolution neural network for predicting clinical drug treatment responses. J Cell Mol Med 2024; 28:e18298. [PMID: 38683133 PMCID: PMC11057419 DOI: 10.1111/jcmm.18298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 03/05/2024] [Accepted: 03/25/2024] [Indexed: 05/01/2024] Open
Abstract
Precise and personalized drug application is crucial in the clinical treatment of complex diseases. Although neural networks offer a new approach to improving drug strategies, their internal structure is difficult to interpret. Here, we propose PBAC (Pathway-Based Attention Convolution neural network), which integrates a deep learning framework and attention mechanism to address the complex biological pathway information, thereby providing a robust, biological function-based drug responsiveness prediction model. PBAC has four layers: gene-pathway layer, attention layer, convolution layer and fully connected layer. PBAC improves the performance of predicting drug responsiveness by focusing on important pathways, helping us understand the mechanism of drug action in diseases. We validated the PBAC model using data from four chemotherapy drugs (Bortezomib, Cisplatin, Docetaxel and Paclitaxel) and 11 immunotherapy datasets. In the majority of datasets, PBAC exhibits superior performance compared to traditional machine learning methods and other research approaches (area under curve = 0.81, the area under the precision-recall curve = 0.73). Using PBAC attention layer output, we identified some pathways as potential core cancer regulators, providing good interpretability for drug treatment prediction. In summary, we presented PBAC, a powerful tool to predict drug responsiveness based on the biological pathway information and explore the potential cancer-driving pathways.
Affiliation(s)
- Dexun Deng
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, China
- Hunan Provincial Key Laboratory of Multi-omics And Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan, China
- School of Computer, University of South China, Hengyang, Hunan, China
| | - Xiaoqiang Xu
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Hunan Provincial Key Laboratory of Multi‐omics And Artificial Intelligence of Cardiovascular DiseasesUniversity of South ChinaHengyangHunanChina
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Department of Cardiology, The First Affiliated Hospital, Hengyang Medical SchoolUniversity of South ChinaHengyangChina
| | - Ting Cui
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Hunan Provincial Key Laboratory of Multi‐omics And Artificial Intelligence of Cardiovascular DiseasesUniversity of South ChinaHengyangHunanChina
| | - Mingcong Xu
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Hunan Provincial Key Laboratory of Multi‐omics And Artificial Intelligence of Cardiovascular DiseasesUniversity of South ChinaHengyangHunanChina
| | - Kunpeng Luo
- Department of Gastroenterology and HepatologySecond Affiliated Hospital of Harbin Medical UniversityHarbinHeilongjiangChina
| | - Han Zhang
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Hunan Provincial Key Laboratory of Multi‐omics And Artificial Intelligence of Cardiovascular DiseasesUniversity of South ChinaHengyangHunanChina
- School of ComputerUniversity of South ChinaHengyangHunanChina
| | - Qiuyu Wang
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Hunan Provincial Key Laboratory of Multi‐omics And Artificial Intelligence of Cardiovascular DiseasesUniversity of South ChinaHengyangHunanChina
- School of ComputerUniversity of South ChinaHengyangHunanChina
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Department of Cardiology, The First Affiliated Hospital, Hengyang Medical SchoolUniversity of South ChinaHengyangChina
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Department of Cell Biology and Genetics, School of Basic Medical Sciences, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
| | - Chao Song
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Hunan Provincial Key Laboratory of Multi‐omics And Artificial Intelligence of Cardiovascular DiseasesUniversity of South ChinaHengyangHunanChina
- School of ComputerUniversity of South ChinaHengyangHunanChina
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Department of Cardiology, The First Affiliated Hospital, Hengyang Medical SchoolUniversity of South ChinaHengyangChina
| | - Chao Li
- Department of AnesthesiologyThe First Affiliated Hospital of University of South ChinaHengyangPR China
| | - Guohua Li
- Department of Pathophysiology, Key Laboratory for Arteriosclerology of Hunan Province, MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical SchoolInstitute of Cardiovascular Disease, Hunan International Scientific and Technological Cooperation Base of Arteriosclerotic Disease, University of South ChinaHengyangHunanChina
| | - Desi Shang
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Hunan Provincial Key Laboratory of Multi‐omics And Artificial Intelligence of Cardiovascular DiseasesUniversity of South ChinaHengyangHunanChina
- School of ComputerUniversity of South ChinaHengyangHunanChina
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Department of Cardiology, The First Affiliated Hospital, Hengyang Medical SchoolUniversity of South ChinaHengyangChina
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
- Department of Cell Biology and Genetics, School of Basic Medical Sciences, Hengyang Medical SchoolUniversity of South ChinaHengyangHunanChina
| |
|
42
|
Tang X, Prodduturi N, Thompson KJ, Weinshilboum RM, O'Sullivan CC, Boughey JC, Tizhoosh H, Klee EW, Wang L, Goetz MP, Suman V, Kalari KR. OmicsFootPrint: a framework to integrate and interpret multi-omics data using circular images and deep neural networks. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.21.586001. [PMID: 38585820 PMCID: PMC10996492 DOI: 10.1101/2024.03.21.586001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
The OmicsFootPrint framework addresses the need for advanced multi-omics data analysis methodologies by transforming data into intuitive two-dimensional circular images and facilitating the interpretation of complex diseases. Utilizing Deep Neural Networks and incorporating the SHapley Additive exPlanations (SHAP) algorithm, the framework enhances model interpretability. Tested with The Cancer Genome Atlas (TCGA) data, OmicsFootPrint effectively classified lung and breast cancer subtypes, achieving high Area Under Curve (AUC) scores: 0.98±0.02 for lung cancer subtype differentiation and 0.83±0.07 for breast cancer PAM50 subtypes. It also successfully distinguished between invasive lobular and ductal carcinomas in breast cancer, showcasing its robustness. It further demonstrated notable performance in predicting drug responses in cancer cell lines, with a median AUC of 0.74, surpassing existing algorithms. Furthermore, its effectiveness persists even with reduced training sample sizes. OmicsFootPrint marks an enhancement in multi-omics research, offering a novel, efficient, and interpretable approach that contributes to a deeper understanding of disease mechanisms.
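For readers unfamiliar with the AUC metric quoted in this abstract, the sketch below shows one standard way to compute it for a binary task: the fraction of (positive, negative) pairs that the model ranks correctly, with ties counted as half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 ground-truth classes
    scores: iterable of predicted scores, higher meaning "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0   # pair ranked correctly
            elif p == n:
                correct += 0.5   # tie counts as half
    return correct / (len(pos) * len(neg))

# Toy example: three of the four positive/negative pairs are ordered correctly.
example_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the values reported above should be read.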
|
43
|
Liu H, Wang F, Yu J, Pan Y, Gong C, Zhang L, Zhang L. DBDNMF: A Dual Branch Deep Neural Matrix Factorization method for drug response prediction. PLoS Comput Biol 2024; 20:e1012012. [PMID: 38574114 PMCID: PMC11020650 DOI: 10.1371/journal.pcbi.1012012] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/08/2023] [Revised: 04/16/2024] [Accepted: 03/19/2024] [Indexed: 04/06/2024] Open
Abstract
Predicting the anti-cancer response of cell lines to drugs is urgently needed for individualized medical decision-making in the era of precision medicine. Wet-lab measurements are time-consuming and expensive, which makes them almost impossible to apply at scale. Computational models that can precisely predict the responses between drugs and cell lines could provide a credible reference for further research. Existing response prediction methods based on matrix factorization or neural networks have revealed that both linear and nonlinear latent characteristics are applicable and effective for the precise prediction of drug responses. However, the majority of them consider only linear or only nonlinear relationships for drug response prediction. Herein, we propose a Dual Branch Deep Neural Matrix Factorization (DBDNMF) method to address the above-mentioned issues. DBDNMF learns the latent representation of drugs and cell lines through flexible inputs and reconstructs the partially observed matrix through a series of hidden neural network layers. Experimental results on the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (GDSC) datasets show that its prediction accuracy exceeds that of state-of-the-art drug response prediction algorithms, demonstrating its reliability and stability. The hierarchical clustering results show that drugs with similar response levels tend to target similar signaling pathways, and cell lines from the same tissue subtype tend to share the same pattern of response, which is consistent with previously published studies.
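The idea of reconstructing a partially observed drug-by-cell-line response matrix from latent factors can be illustrated with plain linear matrix factorization. The sketch below is deliberately simpler than DBDNMF (no neural layers, no dual branches) and is an assumption-laden illustration: the function names, hyperparameters, and toy response values are hypothetical.

```python
import random

def factorize(observed, n_rows, n_cols, rank=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit U (rows) and V (columns) by SGD on observed entries only.

    observed: dict mapping (row, col) -> response value; missing pairs are unobserved.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for (i, j), y in observed.items():
            err = y - sum(U[i][k] * V[j][k] for k in range(rank))
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                # Gradient step on squared error with L2 regularization.
                U[i][k] += lr * (err * v - reg * u)
                V[j][k] += lr * (err * u - reg * v)
    return U, V

def predict(U, V, i, j):
    """Predicted response is the inner product of the two latent vectors."""
    return sum(a * b for a, b in zip(U[i], V[j]))

def mean_squared_error(observed, U, V):
    return sum((y - predict(U, V, i, j)) ** 2
               for (i, j), y in observed.items()) / len(observed)

# Toy 3 cell lines x 3 drugs matrix with 6 of 9 entries observed (made-up values).
observed = {(0, 0): 0.9, (0, 1): 0.8, (1, 0): 0.2,
            (1, 2): 0.1, (2, 1): 0.7, (2, 2): 0.3}
U, V = factorize(observed, n_rows=3, n_cols=3)
filled_in = predict(U, V, 0, 2)  # estimate for an unobserved (cell line, drug) pair
```

A linear model of this kind captures only the linear latent structure; the abstract's point is that DBDNMF additionally passes the latent representations through hidden neural network layers to capture nonlinear relationships.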
Affiliation(s)
- Hui Liu
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu, China
| | - Feng Wang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu, China
| | - Jian Yu
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu, China
| | - Yong Pan
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu, China
| | - Chaoju Gong
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu, China
- Department of Ophthalmology, Xuzhou First People’s Hospital, Xuzhou, Jiangsu, China
| | - Liang Zhang
- Department of Gastrointestinal Surgery, Xuzhou Central Hospital, Xuzhou, Jiangsu, China
| | - Lin Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu, China
| |
|
44
|
Lac L, Leung CK, Hu P. Computational frameworks integrating deep learning and statistical models in mining multimodal omics data. J Biomed Inform 2024; 152:104629. [PMID: 38552994 DOI: 10.1016/j.jbi.2024.104629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 02/26/2024] [Accepted: 03/25/2024] [Indexed: 04/04/2024]
Abstract
BACKGROUND In health research, multimodal omics data analysis is widely used to address important clinical and biological questions. Traditional statistical methods rely on the strong assumptions of distribution. Statistical methods such as testing and differential expression are commonly used in omics analysis. Deep learning, on the other hand, is an advanced computer science technique that is powerful in mining high-dimensional omics data for prediction tasks. Recently, integrative frameworks or methods have been developed for omics studies that combine statistical models and deep learning algorithms. METHODS AND RESULTS The aim of these integrative frameworks is to combine the strengths of both statistical methods and deep learning algorithms to improve prediction accuracy while also providing interpretability and explainability. This review report discusses the current state-of-the-art integrative frameworks, their limitations, and potential future directions in survival and time-to-event longitudinal analysis, dimension reduction and clustering, regression and classification, feature selection, and causal and transfer learning.
Affiliation(s)
- Leann Lac
- Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada; Department of Statistics, University of Manitoba, Winnipeg, Manitoba, Canada
| | - Carson K Leung
- Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada
| | - Pingzhao Hu
- Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada; Department of Biochemistry, Western University, London, Ontario, Canada; Department of Computer Science, Western University, London, Ontario, Canada; Department of Oncology, Western University, London, Ontario, Canada; Department of Epidemiology and Biostatistics, Western University, London, Ontario, Canada; The Children's Health Research Institute, Lawson Health Research Institute, London, Ontario, Canada.
| |
|
45
|
Yan H, Weng D, Li D, Gu Y, Ma W, Liu Q. Prior knowledge-guided multilevel graph neural network for tumor risk prediction and interpretation via multi-omics data integration. Brief Bioinform 2024; 25:bbae184. [PMID: 38670157 PMCID: PMC11052635 DOI: 10.1093/bib/bbae184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 03/11/2024] [Accepted: 04/06/2024] [Indexed: 04/28/2024] Open
Abstract
The interrelation and complementary nature of multi-omics data can provide valuable insights into the intricate molecular mechanisms underlying diseases. However, challenges such as limited sample size, high data dimensionality and differences in omics modalities pose significant obstacles to fully harnessing the potential of these data. Prior knowledge such as gene regulatory networks and pathway information harbors useful gene-gene interaction and gene functional module information. To effectively integrate multi-omics data and make full use of this prior knowledge, we propose a Multilevel-graph neural network (GNN): a hierarchically designed deep learning algorithm that sequentially leverages multi-omics data, gene regulatory networks and pathway information to extract features and enhance accuracy in predicting survival risk. Our method achieved better accuracy compared with existing methods. Furthermore, key factors nonlinearly associated with tumor pathogenesis are prioritized by employing two interpretation algorithms for neural networks (i.e. GNN-Explainer and IGscore) at the gene and pathway level, respectively. The top genes and pathways exhibit strong associations with disease in survival analyses, and many of them, such as SEC61G and CYP27B1, have been previously reported in the literature.
Affiliation(s)
- Hongxi Yan
- Department of Computer Science, Beihang University, XueYuan Road, 100191, BeiJing, China
| | - Dawei Weng
- School of Biomedical Engineering, Capital Medical University, 10 You An Men WaiXi Tou Tiao, 100069, Beijing, China
| | - Dongguo Li
- School of Biomedical Engineering, Capital Medical University, 10 You An Men WaiXi Tou Tiao, 100069, Beijing, China
| | - Yu Gu
- School of Biomedical Engineering, Capital Medical University, 10 You An Men WaiXi Tou Tiao, 100069, Beijing, China
| | - Wenji Ma
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, 227 South Chongqing Road, 200025, Shanghai, China
| | - Qingjie Liu
- Department of Computer Science, Beihang University, XueYuan Road, 100191, BeiJing, China
| |
|
46
|
Hajim WI, Zainudin S, Mohd Daud K, Alheeti K. Optimized models and deep learning methods for drug response prediction in cancer treatments: a review. PeerJ Comput Sci 2024; 10:e1903. [PMID: 38660174 PMCID: PMC11042005 DOI: 10.7717/peerj-cs.1903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 01/31/2024] [Indexed: 04/26/2024]
Abstract
Recent advancements in deep learning (DL) have played a crucial role in aiding experts to develop personalized healthcare services, particularly in drug response prediction (DRP) for cancer patients. The contribution of DL techniques to this field is significant, and they have proven indispensable in the medical field. This review aims to analyze the effectiveness of various DL models in making these predictions, drawing on research published from 2017 to 2023. We utilized the VOS-Viewer 1.6.18 software to create a word cloud from the titles and abstracts of the selected studies. This study offers insights into the focus areas within DL models used for drug response. The word cloud revealed a strong link between certain keywords and grouped themes, highlighting terms such as deep learning, machine learning, precision medicine, precision oncology, drug response prediction, and personalized medicine. To advance DRP using DL, researchers need to work on enhancing the models' generalizability and interoperability. It is also crucial to develop models that not only accurately represent various architectures but also simplify these architectures, balancing complexity with predictive capability. In the future, researchers should try to combine methods that make DL models easier to understand; this will make DRP reviews more open and help doctors trust the decisions made by DL models in cancer DRP.
Affiliation(s)
- Wesam Ibrahim Hajim
- Department of Applied Geology, College of Sciences, Tikrit University, Tikrit, Salah ad Din, Iraq
- Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
| | - Suhaila Zainudin
- Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
| | - Kauthar Mohd Daud
- Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
| | - Khattab Alheeti
- Department of Computer Networking Systems, College of Computer Sciences and Information Technology, University of Anbar, Al Anbar, Ramadi, Iraq
| |
|
47
|
Lao C, Zheng P, Chen H, Liu Q, An F, Li Z. DeepAEG: a model for predicting cancer drug response based on data enhancement and edge-collaborative update strategies. BMC Bioinformatics 2024; 25:105. [PMID: 38461284 PMCID: PMC10925015 DOI: 10.1186/s12859-024-05723-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2023] [Accepted: 02/27/2024] [Indexed: 03/11/2024] Open
Abstract
MOTIVATION The prediction of cancer drug response is a challenging subject in modern personalized cancer therapy due to the uncertainty of drug efficacy and the heterogeneity of patients. It has been shown that the characteristics of the drug itself and the genomic characteristics of the patient can greatly influence the results of cancer drug response. Therefore, accurate, efficient, and comprehensive methods for drug feature extraction and genomics integration are crucial to improve the prediction accuracy. RESULTS Accurate prediction of cancer drug response is vital for guiding the design of anticancer drugs. In this study, we propose an end-to-end deep learning model named DeepAEG which is based on a complete-graph update mode to predict IC50. Specifically, we integrate an edge update mechanism on the basis of a hybrid graph convolutional network to comprehensively learn the potential high-dimensional representation of topological structures in drugs, including atomic characteristics and chemical bond information. Additionally, we present a novel approach for enhancing simplified molecular input line entry specification data by employing sequence recombination to eliminate the defect of single sequence representation of drug molecules. Our extensive experiments show that DeepAEG outperforms other existing methods across multiple evaluation parameters in multiple test sets. Furthermore, we identify several potential anticancer agents, including bortezomib, which has proven to be an effective clinical treatment option. Our results highlight the potential value of DeepAEG in guiding the design of specific cancer treatment regimens.
Affiliation(s)
- Chuanqi Lao
- Research Center for Graph Computing, Zhejiang Lab, Yuhang, Hangzhou, 311121, Zhejiang, China
| | - Pengfei Zheng
- Research Center for Graph Computing, Zhejiang Lab, Yuhang, Hangzhou, 311121, Zhejiang, China
| | - Hongyang Chen
- Research Center for Graph Computing, Zhejiang Lab, Yuhang, Hangzhou, 311121, Zhejiang, China.
| | - Qiao Liu
- Department of Statistics, Stanford University, Stanford, Palo Alto, CA, 94305, USA
| | - Feng An
- Research Center for Graph Computing, Zhejiang Lab, Yuhang, Hangzhou, 311121, Zhejiang, China
| | - Zhao Li
- Research Center for Graph Computing, Zhejiang Lab, Yuhang, Hangzhou, 311121, Zhejiang, China
| |
|
48
|
Demirbaş KC, Yıldız M, Saygılı S, Canpolat N, Kasapçopur Ö. Artificial Intelligence in Pediatrics: Learning to Walk Together. Turk Arch Pediatr 2024; 59:121-130. [PMID: 38454219 PMCID: PMC11059951 DOI: 10.5152/turkarchpediatr.2024.24002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 02/02/2024] [Indexed: 03/09/2024]
Abstract
In this era of rapidly advancing technology, artificial intelligence (AI) has emerged as a transformative force, even being called the Fourth Industrial Revolution, along with gene editing and robotics. While it has undoubtedly become an increasingly important part of our daily lives, it must be recognized that it is not an additional tool, but rather a complex concept that poses a variety of challenges. AI, with considerable potential, has found its place in both medical care and clinical research. Within the vast field of pediatrics, it stands out as a particularly promising advancement. As pediatricians, we are indeed witnessing the impactful integration of AI-based applications into our daily clinical practice and research efforts. These tools are being used for tasks ranging from simple to complex, such as diagnosing clinically challenging conditions, predicting disease outcomes, creating treatment plans, educating both patients and healthcare professionals, and generating accurate medical records or scientific papers. In conclusion, the multifaceted applications of AI in pediatrics will increase efficiency and improve the quality of healthcare and research. However, certain risks and threats accompany this advancement, including inaccuracies and biases that may contribute to health disparities. Therefore, it is crucial to recognize and address the technical, ethical, and legal challenges as well as explore the benefits in both clinical and research fields.
Affiliation(s)
- Kaan Can Demirbaş
- İstanbul University-Cerrahpaşa, Cerrahpaşa Faculty of Medicine, İstanbul, Turkey
| | - Mehmet Yıldız
- Department of Pediatric Rheumatology, İstanbul University-Cerrahpaşa, Cerrahpaşa Faculty of Medicine, İstanbul, Turkey
| | - Seha Saygılı
- Department of Pediatric Nephrology, İstanbul University-Cerrahpaşa, Cerrahpaşa Faculty of Medicine, İstanbul, Turkey
| | - Nur Canpolat
- Department of Pediatric Nephrology, İstanbul University-Cerrahpaşa, Cerrahpaşa Faculty of Medicine, İstanbul, Turkey
| | - Özgür Kasapçopur
- Department of Pediatric Rheumatology, İstanbul University-Cerrahpaşa, Cerrahpaşa Faculty of Medicine, İstanbul, Turkey
| |
|
49
|
Abbasi EY, Deng Z, Ali Q, Khan A, Shaikh A, Reshan MSA, Sulaiman A, Alshahrani H. A machine learning and deep learning-based integrated multi-omics technique for leukemia prediction. Heliyon 2024; 10:e25369. [PMID: 38352790 PMCID: PMC10862685 DOI: 10.1016/j.heliyon.2024.e25369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 12/13/2023] [Accepted: 01/25/2024] [Indexed: 02/16/2024] Open
Abstract
In recent years, scientific data on cancer has expanded, providing potential for a better understanding of malignancies and improved tailored care. Advances in Artificial Intelligence (AI) processing power and algorithmic development position Machine Learning (ML) and Deep Learning (DL) as crucial players in predicting Leukemia, a blood cancer, using integrated multi-omics technology. However, realizing these goals demands novel approaches to harness this data deluge. This study introduces a novel Leukemia diagnosis approach, analyzing multi-omics data for accuracy using ML and DL algorithms. ML techniques, including Random Forest (RF), Naive Bayes (NB), Decision Tree (DT), Logistic Regression (LR), and Gradient Boosting (GB), and DL methods such as Recurrent Neural Networks (RNN) and Feedforward Neural Networks (FNN) are compared. GB achieved 97% accuracy among the ML techniques, while RNN achieved the highest accuracy among the DL methods at 98%. This approach filters unclassified data effectively, demonstrating the significance of DL for leukemia prediction. The testing validation was based on 17 different features such as patient age, sex, mutation type, treatment methods, chromosomes, and others. Our study compares ML and DL techniques and chooses the technique that gives the best results. The study emphasizes the implications of high-throughput technology in healthcare, offering improved patient care.
Affiliation(s)
- Erum Yousef Abbasi
- State Key Laboratory of Wireless Network Positioning and Communication Engineering Integration Research, School of Electronics Engineering, Beijing University of Posts and Telecommunications, Beijing, China
| | - Zhongliang Deng
- State Key Laboratory of Wireless Network Positioning and Communication Engineering Integration Research, School of Electronics Engineering, Beijing University of Posts and Telecommunications, Beijing, China
| | - Qasim Ali
- Department of Software Engineering, Mehran University of Engineering and Technology, Jamshoro, Pakistan
| | - Adil Khan
- State Key Laboratory of Wireless Network Positioning and Communication Engineering Integration Research, School of Electronics Engineering, Beijing University of Posts and Telecommunications, Beijing, China
| | - Asadullah Shaikh
- Department of Information Systems, College of Computer Science and Information Systems, Najran University, Najran, 61441, Saudi Arabia
| | - Mana Saleh Al Reshan
- Department of Information Systems, College of Computer Science and Information Systems, Najran University, Najran, 61441, Saudi Arabia
- Scientific and Engineering Research Centre, Najran University, Najran, 61441, Saudi Arabia
| | - Adel Sulaiman
- Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran, 61441, Saudi Arabia
| | - Hani Alshahrani
- Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran, 61441, Saudi Arabia
| |
|
50
|
Liu H, Peng W, Dai W, Lin J, Fu X, Liu L, Liu L, Yu N. Improving anti-cancer drug response prediction using multi-task learning on graph convolutional networks. Methods 2024; 222:41-50. [PMID: 38157919 DOI: 10.1016/j.ymeth.2023.11.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 09/19/2023] [Accepted: 11/19/2023] [Indexed: 01/03/2024] Open
Abstract
Predicting the therapeutic effect of anti-cancer drugs on tumors based on the characteristics of tumors and patients is an important component of precision oncology. Existing computational methods regard the drug response prediction problem as either a classification or a regression task. However, few of them consider leveraging the relationship between the two tasks. In this work, we propose a Multi-task Interaction Graph Convolutional Network (MTIGCN) for anti-cancer drug response prediction. MTIGCN first utilizes a graph convolutional network-based model to produce embeddings for both cell lines and drugs. After that, the model employs multi-task learning to predict anti-cancer drug response, which involves training the model on three different tasks simultaneously: the main task of classifying drugs as sensitive or resistant and the two auxiliary tasks of regression prediction and similarity network reconstruction. By sharing parameters and optimizing the losses of different tasks simultaneously, MTIGCN enhances the feature representation and reduces overfitting. The results of the experiments on two in vitro datasets demonstrated that MTIGCN outperformed seven state-of-the-art baseline methods. Moreover, the model trained on the in vitro dataset GDSC exhibited good performance when applied to predict drug responses in the in vivo datasets PDX and TCGA. The case study confirmed the model's ability to discover unknown drug responses in cell lines.
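The three-task training objective described in this abstract can be sketched as a weighted sum of one main loss and two auxiliary losses. The code below is a hedged illustration only: the specific loss forms (binary cross-entropy, mean squared error, sigmoid inner-product reconstruction), the weights `alpha` and `beta`, and all function names are assumptions, since the abstract does not specify them.

```python
import math

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Main task: sensitive (1) versus resistant (0) classification."""
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for y, p in zip(y_true, p_pred)) / len(y_true)

def regression_loss(y_true, y_pred):
    """Auxiliary task 1: mean squared error on continuous response values."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def reconstruction_loss(similarity, embeddings):
    """Auxiliary task 2: rebuild a similarity matrix from embedding inner products."""
    n = len(similarity)
    total = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            total += (similarity[i][j] - 1.0 / (1.0 + math.exp(-dot))) ** 2
    return total / (n * n)

def multitask_loss(cls_true, cls_pred, reg_true, reg_pred,
                   similarity, embeddings, alpha=0.5, beta=0.5):
    """Weighted sum of the three losses; alpha and beta are hypothetical weights."""
    return (binary_cross_entropy(cls_true, cls_pred)
            + alpha * regression_loss(reg_true, reg_pred)
            + beta * reconstruction_loss(similarity, embeddings))

# Toy inputs: two samples and a 2 x 2 similarity network (all values are made up).
similarity = [[1.0, 0.0], [0.0, 1.0]]
embeddings = [[2.0, 0.0], [0.0, 2.0]]
loss = multitask_loss([1, 0], [0.9, 0.1], [0.5, 0.2], [0.5, 0.2],
                      similarity, embeddings)
```

Because all three terms are computed from shared embeddings, minimizing the sum pushes the shared parameters to serve every task at once, which is the mechanism the abstract credits for better feature representation and reduced overfitting.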
Affiliation(s)
- Hancheng Liu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China
| | - Wei Peng
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming 650050, China.
| | - Wei Dai
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming 650050, China.
| | - Jiangzhen Lin
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China
| | - Xiaodong Fu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming 650050, China
| | - Li Liu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming 650050, China.
| | - Lijun Liu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming 650050, China
| | - Ning Yu
- State University of New York, The College at Brockport, Department of Computing Sciences, 350 New Campus Drive, Brockport NY 14422.
| |
|