1
Dey V, Ning X. Improving Anticancer Drug Selection and Prioritization via Neural Learning to Rank. J Chem Inf Model 2024; 64:4071-4088. [PMID: 38740382 DOI: 10.1021/acs.jcim.3c01060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Personalized cancer treatment requires a thorough understanding of complex interactions between drugs and cancer cell lines in varying genetic and molecular contexts. To address this, high-throughput screening has been used to generate large-scale drug response data, facilitating data-driven computational models. Such models can capture complex drug-cell line interactions across various contexts in a fully data-driven manner. However, accurately prioritizing the most effective drugs for each cell line still remains a significant challenge. To address this, we developed multiple neural ranking approaches that leverage large-scale drug response data across multiple cell lines from diverse cancer types. Unlike existing approaches that primarily utilize regression and classification techniques for drug response prediction, we formulated the objective of drug selection and prioritization as a drug ranking problem. In this work, we proposed multiple pairwise and listwise neural ranking methods that learn latent representations of drugs and cell lines and then use those representations to score drugs in each cell line via a learnable scoring function. Specifically, we developed neural pairwise and listwise ranking methods, Pair-PushC and List-One on top of the existing methods, pLETORg and ListNet, respectively. Additionally, we proposed a novel listwise ranking method, List-All, that focuses on all the effective drugs instead of the top effective drug, unlike List-One. We also provide an exhaustive empirical evaluation with state-of-the-art regression and ranking baselines on large-scale data sets across multiple experimental settings. Our results demonstrate that our proposed ranking methods mostly outperform the best baselines with significant improvements of as much as 25.6% in terms of selecting truly effective drugs within the top 20 predicted drugs (i.e., hit@20) across 50% test cell lines. 
Furthermore, our analyses suggest that the learned latent spaces from our proposed methods demonstrate informative clustering structures and capture relevant underlying biological features. Moreover, our comprehensive evaluation provides a thorough and objective comparison of the performance of different methods (including our proposed ones).
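As an illustration of two ideas named in this abstract, the sketch below shows one plausible reading of hit@k (the number of truly effective drugs among the k highest-scored drugs for a cell line) and the top-one cross-entropy loss that ListNet is built on. The function names, the toy scores, and the exact hit@k definition are assumptions made for illustration; this is not the authors' implementation.

```python
import math


def hit_at_k(scores, effective, k=20):
    """Count the truly effective drugs among the k highest-scored drugs.

    scores: dict mapping drug name -> predicted score (higher = more effective).
    effective: set of drug names known to be effective in this cell line.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return sum(1 for drug in ranked[:k] if drug in effective)


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_top_one_loss(true_scores, predicted_scores):
    """ListNet-style loss: cross-entropy between top-one probabilities."""
    target = softmax(true_scores)
    predicted = softmax(predicted_scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# Hypothetical scores for four drugs in one cell line (assumed data).
scores = {"drug_a": 0.9, "drug_b": 0.1, "drug_c": 0.7, "drug_d": 0.4}
effective = {"drug_a", "drug_d"}
print(hit_at_k(scores, effective, k=2))  # prints 1
```

With k=2 the top-ranked drugs are drug_a and drug_c, so only one effective drug is recovered. The listwise loss is smallest when the predicted scores induce the same top-one distribution as the true scores.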
Affiliation(s)
- Vishal Dey
- Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio 43210, United States
- Xia Ning
- Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio 43210, United States
- Biomedical Informatics, The Ohio State University, Columbus, Ohio 43210, United States
- Translational Data Analytics Institute, The Ohio State University, Columbus, Ohio 43210, United States
2
Sotudian S, Paschalidis IC. ITNR: Inversion Transformer-based Neural Ranking for cancer drug recommendations. Comput Biol Med 2024; 172:108312. [PMID: 38503090 PMCID: PMC10990436 DOI: 10.1016/j.compbiomed.2024.108312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2023] [Revised: 03/09/2024] [Accepted: 03/12/2024] [Indexed: 03/21/2024]
Abstract
Personalized drug response prediction is an approach for tailoring effective therapeutic strategies for patients based on their tumors' genomic characterization. While machine learning methods are widely employed in the literature, they often struggle to capture drug-cell line relations across various cell lines. In addressing this challenge, our study introduces a novel listwise Learning-to-Rank (LTR) model named Inversion Transformer-based Neural Ranking (ITNR). ITNR utilizes genomic features and a transformer architecture to decipher functional relationships and construct models that can predict patient-specific drug responses. Our experiments were conducted on three major drug response data sets, showing that ITNR reliably and consistently outperforms state-of-the-art LTR models.
Affiliation(s)
- Shahabeddin Sotudian
- Department of Electrical and Computer Engineering, Division of Systems Engineering, Boston University, Boston, MA, USA.
- Ioannis Ch Paschalidis
- Department of Electrical and Computer Engineering, Division of Systems Engineering, Boston University, Boston, MA, USA; Department of Biomedical Engineering, and Faculty of Computing and Data Sciences, Boston University, Boston, MA, USA.
3
Zhou X, Qian Y, Ling C, He Z, Shi P, Gao Y, Sui X. An integrated framework for prognosis prediction and drug response modeling in colorectal liver metastasis drug discovery. J Transl Med 2024; 22:321. [PMID: 38555418 PMCID: PMC10981831 DOI: 10.1186/s12967-024-05127-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 03/23/2024] [Indexed: 04/02/2024] Open
Abstract
BACKGROUND Colorectal cancer (CRC) is the third most prevalent cancer globally, and liver metastasis (CRLM) is the primary cause of death. Hence, it is essential to discover novel prognostic biomarkers and therapeutic drugs for CRLM. METHODS This study developed two liver metastasis-associated prognostic signatures based on differentially expressed genes (DEGs) in CRLM. Additionally, we employed an interpretable deep learning model utilizing drug sensitivity databases to identify potential therapeutic drugs for high-risk CRLM patients. Subsequently, in vitro and in vivo experiments were performed to verify the efficacy of these compounds. RESULTS These two prognostic models exhibited superior performance compared to previously reported ones. Obatoclax, a BCL-2 inhibitor, showed significant differential responses between high and low risk groups classified by prognostic models, and demonstrated remarkable effectiveness in both Transwell assay and CT26 colorectal liver metastasis mouse model. CONCLUSIONS This study highlights the significance of developing specialized prognostication approaches and investigating effective therapeutic drugs for patients with CRLM. The application of a deep learning drug response model provides a new drug discovery strategy for translational medicine in precision oncology.
Affiliation(s)
- Xiuman Zhou
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong Province, 518107, China
- Yuzhen Qian
- School of Life Sciences, Zhengzhou University, Zhengzhou, 450001, China
- Chen Ling
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong Province, 518107, China
- Zhuoying He
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong Province, 518107, China
- Peishang Shi
- School of Life Sciences, Zhengzhou University, Zhengzhou, 450001, China
- Yanfeng Gao
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong Province, 518107, China.
- Xinghua Sui
- School of Pharmaceutical Sciences (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong Province, 518107, China.
4
Li P, Jiang Z, Liu T, Liu X, Qiao H, Yao X. Improving drug response prediction via integrating gene relationships with deep learning. Brief Bioinform 2024; 25:bbae153. [PMID: 38600666 PMCID: PMC11006795 DOI: 10.1093/bib/bbae153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 03/05/2024] [Accepted: 03/18/2024] [Indexed: 04/12/2024] Open
Abstract
Predicting the drug response of cancer cell lines is crucial for advancing personalized cancer treatment, yet remains challenging due to tumor heterogeneity and individual diversity. In this study, we present a deep learning-based framework named Deep neural network Integrating Prior Knowledge (DIPK), which adopts self-supervised techniques to integrate multiple valuable sources of information, including gene interaction relationships, gene expression profiles and molecular topologies, to enhance prediction accuracy and robustness. We demonstrated the superior performance of DIPK compared to existing methods on both known and novel cells and drugs, underscoring the importance of gene interaction relationships in drug response prediction. In addition, DIPK extends its applicability to single-cell RNA sequencing data, showcasing its capability for single-cell-level response prediction and cell identification. Further, we assess the applicability of DIPK on clinical data. DIPK accurately predicted a higher response to paclitaxel in the pathological complete response (pCR) group compared to the residual disease group, affirming the better response of the pCR group to the chemotherapy compound. We believe that the integration of DIPK into clinical decision-making processes has the potential to enhance individualized treatment strategies for cancer patients.
Affiliation(s)
- Pengyong Li
- School of Computer Science and Technology, Xidian University, 710126 Xi’an, Shaanxi, China
- State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, 519020 Macau, China
- Zhengxiang Jiang
- School of Electronic Engineering, Xidian University, 710126 Xi’an, Shaanxi, China
- Tianxiao Liu
- School of Computer Science and Technology, Xidian University, 710126 Xi’an, Shaanxi, China
- Xinyu Liu
- Beijing Laboratory of Biomedical Materials, Department of Geriatric Dentistry, Peking University School and Hospital of Stomatology, 100081 Beijing, China
- Hui Qiao
- Department of Oncology, Tai’an Municipal Hospital, 271021 Tai’an, Shandong, China
- Xiaojun Yao
- Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, 999078 Macao, China
5
Sharma R, Saghapour E, Chen JY. An NLP-based technique to extract meaningful features from drug SMILES. iScience 2024; 27:109127. [PMID: 38455979 PMCID: PMC10918220 DOI: 10.1016/j.isci.2024.109127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 09/30/2023] [Accepted: 02/01/2024] [Indexed: 03/09/2024] Open
Abstract
NLP is a well-established field in ML for developing language models that capture the sequence of words in a sentence. Similarly, drug molecule structures can also be represented as sequences using the SMILES notation. However, unlike natural language texts, special characters in drug SMILES have specific meanings and cannot be ignored. We introduce a novel NLP-based method that extracts interpretable sequences and essential features from drug SMILES notation using N-grams. Our method compares these features to Morgan fingerprint bit-vectors using UMAP-based embedding, and we validate its effectiveness through two personalized drug screening (PSD) case studies. Our NLP-based features are sparse and, when combined with gene expressions and disease phenotype features, produce better ML models for PSD. This approach provides a new way to analyze drug molecule structures represented as SMILES notation, which can help accelerate drug discovery efforts. We have also made our method accessible through a Python library.
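To make the N-gram idea in this abstract concrete, the sketch below counts overlapping character N-grams in a SMILES string while keeping the special characters. It is a minimal illustration under assumed names (`smiles_ngrams` is hypothetical) and does not reproduce the authors' Python library or their feature selection.

```python
from collections import Counter


def smiles_ngrams(smiles, n=2):
    """Count overlapping character n-grams in a SMILES string.

    Special characters such as '(', ')', '=' and '#' are kept, because
    they encode branching and bond order and cannot be ignored.
    """
    return Counter(smiles[i:i + n] for i in range(len(smiles) - n + 1))


# Acetic acid, written as the SMILES string CC(=O)O.
bigrams = smiles_ngrams("CC(=O)O", n=2)
print(sorted(bigrams.items()))
```

Character-level splitting breaks two-letter atoms such as Cl and Br into separate symbols; a chemistry-aware tokenizer would treat them as single tokens before forming N-grams.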
Affiliation(s)
- Rahul Sharma
- Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, USA
- Ehsan Saghapour
- Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, USA
- Jake Y. Chen
- Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, USA
6
Jose A, Roy R, Moreno-Andrés D, Stegmaier J. Automatic detection of cell-cycle stages using recurrent neural networks. PLoS One 2024; 19:e0297356. [PMID: 38466708 PMCID: PMC10927108 DOI: 10.1371/journal.pone.0297356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 01/02/2024] [Indexed: 03/13/2024] Open
Abstract
Mitosis is the process by which eukaryotic cells divide to produce two similar daughter cells with identical genetic material. Research into the process of mitosis is therefore of critical importance both for the basic understanding of cell biology and for the clinical approach to manifold pathologies resulting from its malfunctioning, including cancer. In this paper, we propose an approach to study mitotic progression automatically using deep learning. We used neural networks to predict different mitosis stages. We extracted video sequences of cells undergoing division and trained a Recurrent Neural Network (RNN) to extract image features. The use of RNN enabled better extraction of features. The RNN-based approach gave better performance compared to classifier based feature extraction methods which do not use time information. Evaluation of precision, recall, and F-score indicates the superiority of the proposed model compared to the baseline. To study the loss in performance due to confusion between adjacent classes, we plotted the confusion matrix as well. In addition, we visualized the feature space to understand why RNNs are better at classifying the mitosis stages than other classifier models, which indicated the formation of strong clusters for the different classes, clearly confirming the advantage of the proposed RNN-based approach.
Affiliation(s)
- Abin Jose
- Institute of Imaging and Computer Vision, RWTH Aachen University, Aachen, Germany
- Rijo Roy
- Institute of Imaging and Computer Vision, RWTH Aachen University, Aachen, Germany
- Daniel Moreno-Andrés
- Institute of Biochemistry and Molecular Cell Biology, Medical School, RWTH Aachen University, Aachen, Germany
- Johannes Stegmaier
- Institute of Imaging and Computer Vision, RWTH Aachen University, Aachen, Germany
7
Sharma A, Lysenko A, Jia S, Boroevich KA, Tsunoda T. Advances in AI and machine learning for predictive medicine. J Hum Genet 2024:10.1038/s10038-024-01231-y. [PMID: 38424184 DOI: 10.1038/s10038-024-01231-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 02/04/2024] [Accepted: 02/12/2024] [Indexed: 03/02/2024]
Abstract
The field of omics, driven by advances in high-throughput sequencing, faces a data explosion. This abundance of data offers unprecedented opportunities for predictive modeling in precision medicine, but also presents formidable challenges in data analysis and interpretation. Traditional machine learning (ML) techniques have been partly successful in generating predictive models for omics analysis but exhibit limitations in handling potential relationships within the data for more accurate prediction. This review explores a revolutionary shift in predictive modeling through the application of deep learning (DL), specifically convolutional neural networks (CNNs). Using transformation methods such as DeepInsight, omics data with independent variables in tabular (table-like, including vector) form can be turned into image-like representations, enabling CNNs to capture latent features effectively. This approach not only enhances predictive power but also leverages transfer learning, reducing computational time, and improving performance. However, integrating CNNs in predictive omics data analysis is not without challenges, including issues related to model interpretability, data heterogeneity, and data size. Addressing these challenges requires a multidisciplinary approach, involving collaborations between ML experts, bioinformatics researchers, biologists, and medical doctors. This review illuminates these complexities and charts a course for future research to unlock the full predictive potential of CNNs in omics data analysis and related fields.
Affiliation(s)
- Alok Sharma
- Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan.
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
- Institute for Integrated and Intelligent Systems, Griffith University, Queensland, Australia.
- Artem Lysenko
- Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan.
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
- Shangru Jia
- Laboratory for Medical Science Mathematics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Keith A Boroevich
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Tatsuhiko Tsunoda
- Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan.
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
- Laboratory for Medical Science Mathematics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan.
8
Li Y, Guo Z, Gao X, Wang G. MMCL-CDR: enhancing cancer drug response prediction with multi-omics and morphology images contrastive representation learning. Bioinformatics 2023; 39:btad734. [PMID: 38070154 PMCID: PMC10756335 DOI: 10.1093/bioinformatics/btad734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 11/09/2023] [Indexed: 12/30/2023] Open
Abstract
MOTIVATION Cancer is a complex disease that results in a significant number of global fatalities. Treatment strategies can vary among patients, even if they have the same type of cancer. The application of precision medicine in cancer shows promise for treating different types of cancer, reducing healthcare expenses, and improving recovery rates. To achieve personalized cancer treatment, machine learning models have been developed to predict drug responses based on tumor and drug characteristics. However, current studies either focus on constructing homogeneous networks from single data source or heterogeneous networks from multiomics data. While multiomics data have shown potential in predicting drug responses in cancer cell lines, there is still a lack of research that effectively utilizes insights from different modalities. Furthermore, effectively utilizing the multimodal knowledge of cancer cell lines poses a challenge due to the heterogeneity inherent in these modalities. RESULTS To address these challenges, we introduce MMCL-CDR (Multimodal Contrastive Learning for Cancer Drug Responses), a multimodal approach for cancer drug response prediction that integrates copy number variation, gene expression, morphology images of cell lines, and chemical structure of drugs. The objective of MMCL-CDR is to align cancer cell lines across different data modalities by learning cell line representations from omic and image data, and combined with structural drug representations to enhance the prediction of cancer drug responses (CDR). We have carried out comprehensive experiments and show that our model significantly outperforms other state-of-the-art methods in CDR prediction. The experimental results also prove that the model can learn more accurate cell line representation by integrating multiomics and morphological data from cell lines, thereby improving the accuracy of CDR prediction. 
In addition, the ablation study and qualitative analysis also confirm the effectiveness of each part of our proposed model. Last but not least, MMCL-CDR opens up a new dimension for cancer drug response prediction through multimodal contrastive learning, pioneering a novel approach that integrates multiomics and multimodal drug and cell line modeling. AVAILABILITY AND IMPLEMENTATION MMCL-CDR is available at https://github.com/catly/MMCL-CDR.
Affiliation(s)
- Yang Li
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China
- Zihou Guo
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China
- Xin Gao
- Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Guohua Wang
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150006, China
9
Pellecchia S, Viscido G, Franchini M, Gambardella G. Predicting drug response from single-cell expression profiles of tumours. BMC Med 2023; 21:476. [PMID: 38041118 PMCID: PMC10693176 DOI: 10.1186/s12916-023-03182-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 11/20/2023] [Indexed: 12/03/2023] Open
Abstract
BACKGROUND Intra-tumour heterogeneity (ITH) presents a significant obstacle in formulating effective treatment strategies in clinical practice. Single-cell RNA sequencing (scRNA-seq) has evolved as a powerful instrument for probing ITH at the transcriptional level, offering an unparalleled opportunity for therapeutic intervention. RESULTS Drug response prediction at the single-cell level is an emerging field of research that aims to improve the efficacy and precision of cancer treatments. Here, we introduce DREEP (Drug Response Estimation from single-cell Expression Profiles), a computational method that leverages publicly available pharmacogenomic screens from GDSC2, CTRP2, and PRISM and functional enrichment analysis to predict single-cell drug sensitivity from transcriptomic data. We validated DREEP extensively in vitro using several independent single-cell datasets with over 200 cancer cell lines and showed its accuracy and robustness. Additionally, we also applied DREEP to molecularly barcoded breast cancer cells and identified drugs that can selectively target specific cell populations. CONCLUSIONS DREEP provides an in silico framework to prioritize drugs from single-cell transcriptional profiles of tumours and thus helps in designing personalized treatment strategies and accelerating drug repurposing studies. DREEP is available at https://github.com/gambalab/DREEP .
Affiliation(s)
- Simona Pellecchia
- Telethon Institute of Genetics and Medicine, Naples, Italy
- Genomics and Experimental Medicine Program, Scuola Superiore Meridionale, Naples, Italy
- Gaetano Viscido
- Telethon Institute of Genetics and Medicine, Naples, Italy
- Department of Chemical, Materials and Industrial Engineering, University of Naples Federico II, Naples, Italy
- Melania Franchini
- Telethon Institute of Genetics and Medicine, Naples, Italy
- Department of Electrical Engineering and Information Technology, University of Naples Federico II, Naples, Italy
10
Das T, Bhattarai K, Rajaganapathy S, Wang L, Cerhan JR, Zong N. Leveraging multi-source to resolve inconsistency across pharmacogenomic datasets in drug sensitivity prediction. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.05.25.23290546. [PMID: 37333219 PMCID: PMC10274988 DOI: 10.1101/2023.05.25.23290546] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/20/2023]
Abstract
Pharmacogenomics datasets have been generated for various purposes, such as investigating different biomarkers. However, when studying the same cell line with the same drugs, differences in drug responses exist between studies. These variations arise from factors such as inter-tumoral heterogeneity, experimental standardization, and the complexity of cell subtypes. Consequently, drug response prediction suffers from limited generalizability. To address these challenges, we propose a computational model based on Federated Learning (FL) for drug response prediction. By leveraging three pharmacogenomics datasets (CCLE, GDSC2, and gCSI), we evaluate the performance of our model across diverse cell line-based databases. Our results demonstrate superior predictive performance compared to baseline methods and traditional FL approaches through various experimental tests. This study underscores the potential of employing FL to leverage multiple data sources, enabling the development of generalized models that account for inconsistencies among pharmacogenomics datasets. By addressing the limitations of low generalizability, our approach contributes to advancing drug response prediction in precision oncology.
Affiliation(s)
- Trisha Das
- University of Illinois Urbana-Champaign, Champaign, Illinois, United States
- Sivaraman Rajaganapathy
- Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
- Liewei Wang
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN
- James R. Cerhan
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
- Nansu Zong
- Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
11
Zhang H, Wang Z, Nan Y, Zagidullin B, Yi D, Tang J, Guan Y. Harmonizing across datasets to improve the transferability of drug combination prediction. Commun Biol 2023; 6:397. [PMID: 37041243 PMCID: PMC10090076 DOI: 10.1038/s42003-023-04783-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 03/30/2023] [Indexed: 04/13/2023] Open
Abstract
Combination treatment has multiple advantages over traditional monotherapy in clinics, thus becoming a target of interest for many high-throughput screening (HTS) studies, which enables the development of machine learning models predicting the response of new drug combinations. However, most existing models have been tested only within a single study, and these models cannot generalize across different datasets due to significantly variable experimental settings. Here, we thoroughly assessed the transferability issue of single-study-derived models on new datasets. More importantly, we propose a method to overcome the experimental variability by harmonizing dose-response curves of different studies. Our method improves the prediction performance of machine learning models by 184% and 1367% compared to the baseline models in intra-study and inter-study predictions, respectively, and shows consistent improvement in multiple cross-validation settings. Our study addresses the crucial question of the transferability in drug combination predictions, which is fundamental for such models to be extrapolated to new drug combination discovery and clinical applications that are de facto different datasets.
Affiliation(s)
- Hanrui Zhang
- Department of Computational Medicine and Bioinformatics, Michigan Medicine, University of Michigan, Ann Arbor, MI, USA
- Ziyan Wang
- Department of Electrical Engineering and Computer Science (EECS) - CSE Division, University of Michigan, Ann Arbor, MI, USA
- Yiyang Nan
- Department of Computational Medicine and Bioinformatics, Michigan Medicine, University of Michigan, Ann Arbor, MI, USA
- Bulat Zagidullin
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- Daiyao Yi
- Department of Computational Medicine and Bioinformatics, Michigan Medicine, University of Michigan, Ann Arbor, MI, USA
- Jing Tang
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
- Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, Michigan Medicine, University of Michigan, Ann Arbor, MI, USA.
- Department of Internal medicine, Michigan Medicine, University of Michigan, Ann Arbor, MI, USA.
12
Singh DP, Kaushik B. CTDN (Convolutional Temporal Based Deep- Neural Network): An Improvised Stacked Hybrid Computational Approach for Anticancer Drug Response Prediction. Comput Biol Chem 2023; 105:107868. [PMID: 37257399 DOI: 10.1016/j.compbiolchem.2023.107868] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/03/2023] [Revised: 03/31/2023] [Accepted: 04/04/2023] [Indexed: 06/02/2023]
Abstract
The characterization of drug-metabolizing enzymes is a significant problem for customized therapy. It is important to choose the right drugs for cancer patients, and the ability to forecast how those drugs will react is usually based on the available information, genetic sequence, and structural properties. To the best of our knowledge, this is the first study to evaluate optimization algorithms for selection of features and pharmacogenetics categorization using classification methods based on a successful evolutionary algorithm using datasets from the Cancer Cell Line Encyclopaedia (CCLE) and Genomics of Drug Sensitivity in Cancer (GDSC). The study proposes the use of Firefly and Grey Wolf Optimization techniques for feature extraction, while comparing the traditional Machine Learning (ML), ensemble ML and Stacking Algorithm with the proposed Convolutional Temporal Deep Neural Network or CTDN. With the potential to increase efficiency from the suggested intelligible classifier model for a suggestive chemotherapeutic drugs response prediction, our study is important in particular for selecting an acceptable feature selection method. The comparison analysis demonstrates that the proposed model not only surpasses the prior state-of-the-art methods, but also uses Grey Wolf and Firefly Optimization to lessen multicollinearity and overfitting.
Affiliation(s)
- Davinder Paul Singh
- School of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra 182320, Jammu and Kashmir, India.
- Baijnath Kaushik
- School of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra 182320, Jammu and Kashmir, India
13
|
Partin A, Brettin TS, Zhu Y, Narykov O, Clyde A, Overbeek J, Stevens RL. Deep learning methods for drug response prediction in cancer: Predominant and emerging trends. Front Med (Lausanne) 2023; 10:1086097. [PMID: 36873878 PMCID: PMC9975164 DOI: 10.3389/fmed.2023.1086097] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 11/01/2022] [Accepted: 01/23/2023] [Indexed: 02/17/2023] Open
Abstract
Cancer claims millions of lives yearly worldwide. While many therapies have been made available in recent years, by and large cancer remains unsolved. Exploiting computational predictive models to study and treat cancer holds great promise in improving drug development and personalized design of treatment plans, ultimately suppressing tumors, alleviating suffering, and prolonging lives of patients. A wave of recent papers demonstrates promising results in predicting cancer response to drug treatments while utilizing deep learning methods. These papers investigate diverse data representations, neural network architectures, learning methodologies, and evaluation schemes. However, deciphering promising predominant and emerging trends is difficult due to the variety of explored methods and the lack of a standardized framework for comparing drug response prediction models. To obtain a comprehensive landscape of deep learning methods, we conducted an extensive search and analysis of deep learning models that predict the response to single drug treatments. A total of 61 deep learning-based models have been curated, and summary plots were generated. Based on the analysis, observable patterns and prevalence of methods have been revealed. This review offers a better understanding of the current state of the field and identifies major challenges and promising solution paths.
Affiliation(s)
- Alexander Partin
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Thomas S. Brettin
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Yitan Zhu
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Oleksandr Narykov
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Austin Clyde
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Jamie Overbeek
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Rick L. Stevens
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
  - Department of Computer Science, The University of Chicago, Chicago, IL, United States

14
Shen B, Feng F, Li K, Lin P, Ma L, Li H. A systematic assessment of deep learning methods for drug response prediction: from in vitro to clinical applications. Brief Bioinform 2023; 24:6961794. [PMID: 36575826 DOI: 10.1093/bib/bbac605] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/08/2022] [Revised: 10/30/2022] [Accepted: 12/09/2022] [Indexed: 12/29/2022] Open
Abstract
Drug response prediction is an important problem in personalized cancer therapy. Among various newly developed models, significant improvement in prediction performance has been reported using deep learning methods. However, systematic comparisons of deep learning methods, especially of the transferability from preclinical models to clinical cohorts, are currently lacking. To provide a more rigorous assessment, the performance of six representative deep learning methods for drug response prediction was benchmarked in multiple application scenarios using nine evaluation metrics, including the overall prediction accuracy, predictability of each drug, potential associated factors and transferability to clinical cohorts. Most methods show promising prediction within cell line datasets, and TGSA, with its lower time cost and better performance, is recommended. Although the performance metrics decrease when applying models trained on cell lines to patients, a certain amount of power to distinguish clinical response on some drugs can be maintained using CRDNN and TGSA. With these assessments, we provide guidance for researchers to choose appropriate methods, as well as insights into future directions for the development of more effective methods in clinical scenarios.
Affiliation(s)
- Bihan Shen
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Fangyoumin Feng
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Kunshi Li
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Ping Lin
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Liangxiao Ma
  - Bio-Med Big Data Center at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Hong Li
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China

15
Partin A, Brettin T, Zhu Y, Dolezal JM, Kochanny S, Pearson AT, Shukla M, Evrard YA, Doroshow JH, Stevens RL. Data augmentation and multimodal learning for predicting drug response in patient-derived xenografts from gene expressions and histology images. Front Med (Lausanne) 2023; 10:1058919. [PMID: 36960342 PMCID: PMC10027779 DOI: 10.3389/fmed.2023.1058919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 02/10/2023] [Indexed: 03/09/2023] Open
Abstract
Patient-derived xenografts (PDXs) are an appealing platform for preclinical drug studies. A primary challenge in modeling drug response prediction (DRP) with PDXs and neural networks (NNs) is the limited number of drug response samples. We investigate a multimodal neural network (MM-Net) and data augmentation for DRP in PDXs. The MM-Net learns to predict response using drug descriptors, gene expressions (GE), and histology whole-slide images (WSIs). We explore whether combining WSIs with GE improves predictions as compared with models that use GE alone. We propose two data augmentation methods that allow us to train multimodal and unimodal NNs on a single larger dataset without changing architectures: 1) combine single-drug and drug-pair treatments by homogenizing drug representations, and 2) augment drug pairs, which doubles the sample size of all drug-pair samples. Unimodal NNs which use GE are compared to assess the contribution of data augmentation. The NN that uses the original and the augmented drug-pair treatments as well as single-drug treatments outperforms NNs that ignore either the augmented drug-pairs or the single-drug treatments. In assessing the multimodal learning based on the MCC metric, MM-Net outperforms all the baselines. Our results show that data augmentation and integration of histology images with GE can improve prediction performance of drug response in PDXs.
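The two augmentation steps named in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the details, so the sketch assumes that (a) a single-drug treatment is "homogenized" by placing the same drug descriptor in both slots of a two-drug input, and (b) a drug pair is augmented by swapping the order of its two drugs. The `Sample` type, its field names, and both helper functions are hypothetical and are not taken from the paper's code.

```python
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    """One treatment sample in a two-slot drug format (hypothetical layout)."""
    model_id: str              # illustrative PDX identifier
    drug_a: Tuple[float, ...]  # descriptor vector in the first drug slot
    drug_b: Tuple[float, ...]  # descriptor vector in the second drug slot
    response: int              # binary response label


def homogenize_single(model_id: str, drug: Tuple[float, ...], response: int) -> Sample:
    """Assumption: express a single-drug treatment by repeating the drug in both slots."""
    return Sample(model_id, drug, drug, response)


def augment_pairs(samples: List[Sample]) -> List[Sample]:
    """Assumption: add an order-swapped copy of every sample whose two slots differ."""
    augmented = list(samples)
    for s in samples:
        if s.drug_a != s.drug_b:
            augmented.append(s._replace(drug_a=s.drug_b, drug_b=s.drug_a))
    return augmented


pair = Sample("pdx1", (0.1, 0.2), (0.3, 0.4), 1)
single = homogenize_single("pdx1", (0.5, 0.6), 0)
augmented = augment_pairs([pair, single])
```

Under these assumptions every true drug pair appears twice after augmentation, while homogenized single-drug samples are left unduplicated because swapping their slots would produce an identical sample.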
Affiliation(s)
- Alexander Partin
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
  - *Correspondence: Alexander Partin
- Thomas Brettin
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Yitan Zhu
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- James M. Dolezal
  - Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, United States
- Sara Kochanny
  - Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, United States
- Alexander T. Pearson
  - Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, United States
- Maulik Shukla
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Yvonne A. Evrard
  - Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., Frederick, MD, United States
- James H. Doroshow
  - Division of Cancer Therapeutics and Diagnosis, National Cancer Institute, Bethesda, MD, United States
- Rick L. Stevens
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
  - Department of Computer Science, The University of Chicago, Chicago, IL, United States

16
Utilization of Cancer Cell Line Screening to Elucidate the Anticancer Activity and Biological Pathways Related to the Ruthenium-Based Therapeutic BOLD-100. Cancers (Basel) 2022; 15:cancers15010028. [PMID: 36612025 PMCID: PMC9817855 DOI: 10.3390/cancers15010028] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/23/2022] [Revised: 11/30/2022] [Accepted: 12/16/2022] [Indexed: 12/24/2022] Open
Abstract
BOLD-100 (sodium trans-[tetrachlorobis(1H indazole)ruthenate(III)]) is a ruthenium-based anticancer compound currently in clinical development. The identification of cancer types that show increased sensitivity towards BOLD-100 can lead to improved developmental strategies. Sensitivity profiling can also identify mechanisms of action that are pertinent for the bioactivity of complex therapeutics. Sensitivity to BOLD-100 was measured in a 319-cancer-cell line panel spanning 24 tissues. BOLD-100's sensitivity profile showed variation across the tissue lineages, including increased response in esophageal, bladder, and hematologic cancers. Multiple cancers, including esophageal, bile duct and colon cancer, had higher relative response to BOLD-100 than to cisplatin. Response to BOLD-100 showed only moderate correlation to anticancer compounds in the Genomics of Drug Sensitivity in Cancer (GDSC) database, as well as no clear theme in bioactivity of correlated hits, suggesting that BOLD-100 may have a differentiated therapeutic profile. The genomic modalities of cancer cell lines were modeled against the BOLD-100 sensitivity profile, which revealed that genes related to ribosomal processes were associated with sensitivity to BOLD-100. Machine learning modeling of the sensitivity profile to BOLD-100 and gene expression data provided moderate predictive value. These findings provide further mechanistic understanding around BOLD-100 and support its development for additional cancer types.

17
Samal BR, Loers JU, Vermeirssen V, De Preter K. Opportunities and challenges in interpretable deep learning for drug sensitivity prediction of cancer cells. FRONTIERS IN BIOINFORMATICS 2022; 2:1036963. [PMID: 36466148 PMCID: PMC9714662 DOI: 10.3389/fbinf.2022.1036963] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/05/2022] [Accepted: 11/03/2022] [Indexed: 01/02/2024] Open
Abstract
In precision oncology, therapy stratification is done based on the patients' tumor molecular profile. Modeling and prediction of the drug response for a given tumor molecular type will further improve therapeutic decision-making for cancer patients. Indeed, deep learning methods hold great potential for drug sensitivity prediction, but a major problem is that these models are black box algorithms and do not clarify the mechanisms of action. This puts a limitation on their clinical implementation. To address this concern, many recent studies attempt to overcome these issues by developing interpretable deep learning methods that facilitate the understanding of the logic behind the drug response prediction. In this review, we discuss strengths and limitations of recent approaches, and suggest future directions that could guide further improvement of interpretable deep learning in drug sensitivity prediction in cancer research.
Affiliation(s)
- Bikash Ranjan Samal
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
  - Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Jens Uwe Loers
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
  - Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium
- Vanessa Vermeirssen
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
  - Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium
- Katleen De Preter
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
  - Center for Medical Genetics Ghent (CMGG), Ghent University, Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), Ghent, Belgium

18
Stahlberg EA, Abdel-Rahman M, Aguilar B, Asadpoure A, Beckman RA, Borkon LL, Bryan JN, Cebulla CM, Chang YH, Chatterjee A, Deng J, Dolatshahi S, Gevaert O, Greenspan EJ, Hao W, Hernandez-Boussard T, Jackson PR, Kuijjer M, Lee A, Macklin P, Madhavan S, McCoy MD, Mohammad Mirzaei N, Razzaghi T, Rocha HL, Shahriyari L, Shmulevich I, Stover DG, Sun Y, Syeda-Mahmood T, Wang J, Wang Q, Zervantonakis I. Exploring approaches for predictive cancer patient digital twins: Opportunities for collaboration and innovation. Front Digit Health 2022; 4:1007784. [PMID: 36274654 PMCID: PMC9586248 DOI: 10.3389/fdgth.2022.1007784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2022] [Accepted: 08/30/2022] [Indexed: 01/26/2023] Open
Abstract
We are rapidly approaching a future in which cancer patient digital twins will reach their potential to predict cancer prevention, diagnosis, and treatment in individual patients. This will be realized based on advances in high performance computing, computational modeling, and an expanding repertoire of observational data across multiple scales and modalities. In 2020, the US National Cancer Institute, and the US Department of Energy, through a trans-disciplinary research community at the intersection of advanced computing and cancer research, initiated team science collaborative projects to explore the development and implementation of predictive Cancer Patient Digital Twins. Several diverse pilot projects were launched to provide key insights into important features of this emerging landscape and to determine the requirements for the development and adoption of cancer patient digital twins. Projects included exploring approaches to using a large cohort of digital twins to perform deep phenotyping and plan treatments at the individual level, prototyping self-learning digital twin platforms, using adaptive digital twin approaches to monitor treatment response and resistance, developing methods to integrate and fuse data and observations across multiple scales, and personalizing treatment based on cancer type. Collectively these efforts have yielded increased insights into the opportunities and challenges facing cancer patient digital twin approaches and helped define a path forward. Given the rapidly growing interest in patient digital twins, this manuscript provides a valuable early progress report of several CPDT pilot projects commenced in common, their overall aims, early progress, lessons learned and future directions that will increasingly involve the broader research community.
Affiliation(s)
- Eric A. Stahlberg
  - Cancer Data Science Initiatives, Frederick National Laboratory for Cancer Research, Frederick, MD, United States
- Mohamed Abdel-Rahman
  - Department of Ophthalmology and Visual Sciences, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH, United States
- Boris Aguilar
  - Institute for Systems Biology, Seattle, WA, United States
- Alireza Asadpoure
  - Department of Civil and Environmental Engineering, University of Massachusetts Amherst, Amherst, MA, United States
- Robert A. Beckman
  - Innovation Center for Biomedical Informatics, Georgetown University, Washington DC, United States
- Lynn L. Borkon
  - Cancer Data Science Initiatives, Frederick National Laboratory for Cancer Research, Frederick, MD, United States
- Jeffrey N. Bryan
  - Department of Veterinary Medicine and Surgery, University of Missouri, Columbia, MO, United States
- Colleen M. Cebulla
  - Department of Ophthalmology and Visual Sciences, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH, United States
- Young Hwan Chang
  - Department of Biomedical Engineering and OHSU Center for Spatial Systems Biomedicine (OCSSB), Oregon Health and Science University, Portland, OR, United States
- Ansu Chatterjee
  - School of Statistics, University of Minnesota, Minneapolis, MN, United States
- Jun Deng
  - Department of Therapeutic Radiology, Yale University School of Medicine, Yale University, New Haven, CT, United States
- Sepideh Dolatshahi
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, United States
- Olivier Gevaert
  - Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine and Department of Biomedical Data Science, Stanford University, Stanford, CA, United States
- Emily J. Greenspan
  - Center for Biomedical Informatics and Information Technology, National Cancer Institute, National Institutes of Health, Bethesda, MD, United States
- Wenrui Hao
  - Department of Mathematics, The Pennsylvania State University, University Park, PA, United States
- Tina Hernandez-Boussard
  - Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine and Department of Biomedical Data Science, Stanford University, Stanford, CA, United States
- Pamela R. Jackson
  - Mathematical NeuroOncology Lab, Precision Neurotherapeutics Innovation Program, Mayo Clinic Arizona, Phoenix, AZ, United States
- Marieke Kuijjer
  - Computational Biology and Systems Medicine Group, Centre for Molecular Medicine Norway, University of Oslo, Oslo, Norway
- Adrian Lee
  - Department of Pharmacology and Chemical Biology, University of Pittsburgh, Pittsburgh, PA, United States
- Paul Macklin
  - Department of Intelligent Systems Engineering, Indiana University, Bloomington, IN, United States
- Subha Madhavan
  - Innovation Center for Biomedical Informatics, Georgetown University, Washington DC, United States
- Matthew D. McCoy
  - Innovation Center for Biomedical Informatics, Georgetown University, Washington DC, United States
- Navid Mohammad Mirzaei
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, MA, United States
- Talayeh Razzaghi
  - School of Industrial and Systems Engineering, The University of Oklahoma, Norman, OK, United States
- Heber L. Rocha
  - Department of Intelligent Systems Engineering, Indiana University, Bloomington, IN, United States
- Leili Shahriyari
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, MA, United States
- Daniel G. Stover
  - Division of Medical Oncology and Department of Medicine, The Ohio State University Comprehensive Cancer Center, Columbus, OH, United States
- Yi Sun
  - Department of Mathematics, University of South Carolina, Columbia, SC, United States
- Jinhua Wang
  - Institute for Health Informatics and the Masonic Cancer Center, University of Minnesota, Minneapolis, MN, United States
- Qi Wang
  - Department of Mathematics, University of South Carolina, Columbia, SC, United States
- Ioannis Zervantonakis
  - Department of Bioengineering, UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, United States

19
Peng W, Liu H, Dai W, Yu N, Wang J. Predicting cancer drug response using parallel heterogeneous graph convolutional networks with neighborhood interactions. Bioinformatics 2022; 38:4546-4553. [PMID: 35997568 DOI: 10.1093/bioinformatics/btac574] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/03/2022] [Revised: 07/26/2022] [Accepted: 08/22/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: Due to cancer heterogeneity, the therapeutic effect may not be the same when a cohort of patients with the same cancer type receives the same treatment. Anticancer drug response prediction may help develop personalized therapy regimens to increase survival and reduce patients' expenses. Recently, graph neural network-based methods have aroused widespread interest and achieved impressive results on the drug response prediction task. However, most of them apply graph convolution to process cell line-drug bipartite graphs while ignoring the intrinsic differences between cell line and drug nodes. Moreover, most of these methods aggregate node-wise neighbor features but fail to consider the element-wise interaction between cell lines and drugs.
RESULTS: This work proposes a neighborhood interaction (NI)-based heterogeneous graph convolution network method, namely NIHGCN, for anticancer drug response prediction in an end-to-end way. Firstly, it constructs a heterogeneous network consisting of drugs, cell lines and the known drug response information. Cell line gene expression and drug molecular fingerprints are linearly transformed and input as node attributes into an interaction module. The interaction module consists of a parallel graph convolution network layer, which aggregates node-level features from neighbors through the graph convolution operation, and an NI layer, which considers element-level interactions with neighbors. Finally, the drug response predictions are made by calculating the linear correlation coefficients of feature representations of cell lines and drugs. We have conducted extensive experiments to assess the effectiveness of our model on the Genomics of Drug Sensitivity in Cancer (GDSC) and Cancer Cell Line Encyclopedia (CCLE) datasets. It has achieved the best performance compared with the state-of-the-art algorithms, especially in predicting drug responses for new cell lines, new drugs and targeted drugs. Furthermore, our model that was well trained on the GDSC dataset can be successfully applied to predict samples of PDX and TCGA, which verified the transferability of our model from in vitro cell lines to in vivo datasets.
AVAILABILITY AND IMPLEMENTATION: The source code can be obtained from https://github.com/weiba/NIHGCN.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
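The final prediction step described in this abstract, scoring each cell line and drug pair by the linear correlation coefficient of their feature representations, can be sketched as follows. This is a minimal illustration in plain Python that assumes the coefficient is the Pearson correlation between two equal-length embedding vectors; the function names and toy embeddings are hypothetical and are not taken from the NIHGCN repository.

```python
import math
from typing import List, Sequence


def pearson(u: Sequence[float], v: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors (undefined for constant vectors)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [x - mean_u for x in u]
    dv = [y - mean_v for y in v]
    numerator = sum(a * b for a, b in zip(du, dv))
    denominator = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    return numerator / denominator


def score_matrix(cell_emb: List[Sequence[float]],
                 drug_emb: List[Sequence[float]]) -> List[List[float]]:
    """Rows index cell lines, columns index drugs; each entry is a correlation score."""
    return [[pearson(c, d) for d in drug_emb] for c in cell_emb]


# Toy embeddings: the first cell line is perfectly correlated with the drug,
# the second is perfectly anti-correlated.
cells = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
drugs = [[2.0, 4.0, 6.0]]
scores = score_matrix(cells, drugs)
```

A correlation-based score is bounded in [-1, 1] and ignores the scale and offset of each embedding, so only the pattern across feature dimensions affects the predicted response. A constant embedding vector would cause a division by zero in this sketch and would need separate handling.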
Affiliation(s)
- Wei Peng
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, P.R. China
- Hancheng Liu
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, P.R. China
- Wei Dai
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, P.R. China
- Ning Yu
  - Department of Computing Sciences, The College at Brockport, State University of New York, Brockport, NY 14422, USA
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha 410083, P.R. China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, P.R. China

20
From single-omics to interactomics: How can ligand-induced perturbations modulate single-cell phenotypes? ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2022; 131:45-83. [PMID: 35871896 DOI: 10.1016/bs.apcsb.2022.05.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Abstract
Cells are perturbed by different stimuli, which consequently give rise to individual alterations in their profile and function that may end up affecting the tissue as a whole. This is no different if we consider the effect of a therapeutic agent on a biological system. As cells are exposed to external ligands, their profile can change at different single-omics levels. Detecting how these changes take place through different sequencing technologies is key to a better understanding of the effects of therapeutic agents. Single-cell RNA-sequencing stands out as one of the most common approaches for cell profiling and perturbation analysis. Single-cell transcriptomics data can also be integrated with other omics data sources, such as proteomics and epigenomics data, to clarify the perturbation effects and mechanism at the cell level. Appropriate computational tools are key to processing and integrating the available information. This chapter focuses on recent advances in ligand-induced perturbation and single-cell omics computational tools and algorithms, their current limitations, and how the deluge of data can be used to improve the current process of drug research and development.

21
Abeykoon V, Kamburugamuve S, Widanage C, Perera N, Uyar A, Kanewala TA, von Laszewski G, Fox G. HPTMT Parallel Operators for High Performance Data Science and Data Engineering. Front Big Data 2022; 4:756041. [PMID: 35198971 PMCID: PMC8860100 DOI: 10.3389/fdata.2021.756041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2021] [Accepted: 11/29/2021] [Indexed: 11/13/2022] Open
Abstract
Data-intensive applications are becoming commonplace in all science disciplines. They comprise a rich set of sub-domains such as data engineering, deep learning, and machine learning. These applications are built around efficient data abstractions and operators that suit the applications of different domains. Often, the lack of a clear definition of data structures and operators in the field has led to separate implementations that do not work well together. The HPTMT architecture that we proposed recently identifies a set of data structures, operators, and an execution model for creating rich data applications that link all aspects of data engineering and data science together efficiently. This paper elaborates and illustrates this architecture using an end-to-end application with deep learning and data engineering parts working together. Our analysis shows that the proposed system architecture is better suited for high performance computing environments than current big data processing systems. Furthermore, our proposed system emphasizes the importance of efficient compact data structures, such as the Apache Arrow tabular data representation defined for high performance. Thus, the system integration we propose scales a sequential computation to a distributed computation while retaining optimum performance along with a highly usable application programming interface.
Affiliation(s)
- Vibhatha Abeykoon
  - Indiana University Alumni, Bloomington, IN, United States
  - *Correspondence: Vibhatha Abeykoon
- Supun Kamburugamuve
  - Luddy School of Informatics, Computing and Engineering, Bloomington, IN, United States
- Chathura Widanage
  - Luddy School of Informatics, Computing and Engineering, Bloomington, IN, United States
- Niranda Perera
  - Luddy School of Informatics, Computing and Engineering, Bloomington, IN, United States
- Ahmet Uyar
  - Luddy School of Informatics, Computing and Engineering, Bloomington, IN, United States
- Gregor von Laszewski
  - Biocomplexity Institute and Initiative, University of Virginia, Charlottesville, VA, United States
- Geoffrey Fox
  - Biocomplexity Institute and Initiative, University of Virginia, Charlottesville, VA, United States
  - Computer Science Department, University of Virginia, Charlottesville, VA, United States

22
Feizi N, Nair SK, Smirnov P, Beri G, Eeles C, Esfahani PN, Nakano M, Tkachuk D, Mammoliti A, Gorobets E, Mer AS, Lin E, Yu Y, Martin S, Hafner M, Haibe-Kains B. PharmacoDB 2.0: improving scalability and transparency of in vitro pharmacogenomics analysis. Nucleic Acids Res 2022; 50:D1348-D1357. [PMID: 34850112 PMCID: PMC8728279 DOI: 10.1093/nar/gkab1084] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/17/2021] [Revised: 10/15/2021] [Accepted: 10/20/2021] [Indexed: 11/14/2022] Open
Abstract
Cancer pharmacogenomics studies provide valuable insights into disease progression and associations between genomic features and drug response. PharmacoDB integrates multiple cancer pharmacogenomics datasets profiling approved and investigational drugs across cell lines from diverse tissue types. The web-application enables users to efficiently navigate across datasets, view and compare drug dose-response data for a specific drug-cell line pair. In the new version of PharmacoDB (version 2.0, https://pharmacodb.ca/), we present (i) new datasets such as NCI-60, the Profiling Relative Inhibition Simultaneously in Mixtures (PRISM) dataset, as well as updated data from the Genomics of Drug Sensitivity in Cancer (GDSC) and the Genentech Cell Line Screening Initiative (gCSI); (ii) implementation of FAIR data pipelines using ORCESTRA and PharmacoDI; (iii) enhancements to drug-response analysis such as tissue distribution of dose-response metrics and biomarker analysis; and (iv) improved connectivity to drug and cell line databases in the community. The web interface has been rewritten using a modern technology stack to ensure scalability and standardization to accommodate growing pharmacogenomics datasets. PharmacoDB 2.0 is a valuable tool for mining pharmacogenomics datasets, comparing and assessing drug-response phenotypes of cancer models.
Collapse
Affiliation(s)
- Nikta Feizi
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Sisira Kadambat Nair
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Petr Smirnov
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
- Gangesh Beri
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Christopher Eeles
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Parinaz Nasr Esfahani
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Minoru Nakano
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Denis Tkachuk
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Anthony Mammoliti
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
- Evgeniya Gorobets
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 3G5, Canada
- Arvind Singh Mer
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
- Eva Lin
  - Department of Discovery Oncology, Genentech Inc, South San Francisco, CA 94080, USA
- Yihong Yu
  - Department of Discovery Oncology, Genentech Inc, South San Francisco, CA 94080, USA
- Scott Martin
  - Department of Discovery Oncology, Genentech Inc, South San Francisco, CA 94080, USA
- Marc Hafner
  - Department of Oncology Bioinformatics, Genentech Inc, South San Francisco, CA 94080, USA
- Benjamin Haibe-Kains
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON M5T 3A1, Canada
  - Ontario Institute for Cancer Research, Toronto, ON M5G 0A3, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON M5G 1M1, Canada
|
23
|
Out-of-distribution generalization from labelled and unlabelled gene expression data for drug response prediction. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00408-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/04/2023]
|
24
|
Chen Y, Zhang L. How much can deep learning improve prediction of the responses to drugs in cancer cell lines? Brief Bioinform 2021; 23:6370847. [PMID: 34529029 DOI: 10.1093/bib/bbab378] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/11/2021] [Revised: 08/21/2021] [Accepted: 08/24/2021] [Indexed: 12/24/2022] Open
Abstract
The drug response prediction problem arises from personalized medicine and drug discovery. Deep neural networks have been applied to the multi-omics data available for over 1000 cancer cell lines and tissues for better drug response prediction. We summarize and examine state-of-the-art deep learning methods that have been published recently. Although significant progress has been made in deep learning approaches to drug response prediction, deep learning methods are weak at predicting the response of a drug that does not appear in the training dataset. In particular, all five evaluated deep learning methods performed worse than the similarity-regularized matrix factorization (SRMF) method in our drug-blind test. We outline the challenges in applying deep learning to drug response prediction and suggest unique opportunities for deep learning integrated with established bioinformatics analyses to overcome some of these challenges.
Affiliation(s)
- Yurui Chen
  - Department of Mathematics and Computational Biology Programme, National University of Singapore, 119076, Singapore
- Louxin Zhang
  - Department of Mathematics and Computational Biology Programme, National University of Singapore, 119076, Singapore
|