1
Vasanthakumari P, Zhu Y, Brettin T, Partin A, Shukla M, Xia F, Narykov O, Weil MR, Stevens RL. A Comprehensive Investigation of Active Learning Strategies for Conducting Anti-Cancer Drug Screening. Cancers (Basel) 2024; 16:530. [PMID: 38339281 PMCID: PMC10854925 DOI: 10.3390/cancers16030530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 01/12/2024] [Accepted: 01/22/2024] [Indexed: 02/12/2024] Open
Abstract
It is well known that cancers of the same histology type can respond differently to a treatment. Thus, computational drug response prediction is of paramount importance for both preclinical drug screening studies and clinical treatment design. To build drug response prediction models, treatment response data need to be generated through screening experiments and used as input to train the prediction models. In this study, we investigate various active learning strategies for selecting experiments to generate response data for the purposes of (1) improving the performance of drug response prediction models built on the data and (2) identifying effective treatments. Here, we focus on constructing drug-specific response prediction models for cancer cell lines. Various approaches have been designed and applied to select cell lines for screening, including random, greedy, uncertainty, diversity, combination of greedy and uncertainty, sampling-based hybrid, and iteration-based hybrid approaches. All of these approaches are evaluated and compared using two criteria: (1) the number of identified hits, i.e., selected experiments validated to be responsive, and (2) the performance of the response prediction model trained on the data of selected experiments. The analysis was conducted for 57 drugs, and the results show a significant improvement in identifying hits using active learning approaches compared with the random and greedy sampling methods. Active learning approaches also show an improvement in response prediction performance for some of the drugs and analysis runs compared with the greedy sampling method.
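The selection strategies named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_cell_lines`, the ensemble-of-predictions input, the convention that a higher value means a stronger predicted response, and the `mean + weight * std` rule for the combined strategy are all assumptions made for illustration. The diversity and hybrid strategies are not covered.

```python
import random
from statistics import mean, pstdev


def select_cell_lines(ensemble_preds, strategy, batch_size, seed=0, weight=1.0):
    """Pick `batch_size` unscreened cell lines for the next screening round.

    ensemble_preds maps a cell-line name to the predictions of several models
    (hypothetical convention: a higher value means a stronger predicted response).
    """
    names = sorted(ensemble_preds)
    if strategy == "random":
        # Baseline: ignore the model and sample uniformly.
        return random.Random(seed).sample(names, batch_size)

    def score(name):
        preds = ensemble_preds[name]
        if strategy == "greedy":
            # Exploit: highest average predicted response.
            return mean(preds)
        if strategy == "uncertainty":
            # Explore: largest disagreement between ensemble members.
            return pstdev(preds)
        if strategy == "greedy+uncertainty":
            # One possible combination (an assumption, not the paper's rule).
            return mean(preds) + weight * pstdev(preds)
        raise ValueError(f"unknown strategy: {strategy}")

    return sorted(names, key=score, reverse=True)[:batch_size]


# Toy candidates (made-up numbers, for illustration only).
candidates = {
    "line_A": [0.90, 0.88, 0.92],  # models agree: likely responsive
    "line_B": [0.20, 0.80, 0.50],  # models disagree strongly
    "line_C": [0.10, 0.12, 0.11],  # models agree: likely unresponsive
}
greedy_pick = select_cell_lines(candidates, "greedy", 1)
uncertain_pick = select_cell_lines(candidates, "uncertainty", 1)
```

In this toy case the greedy strategy picks `line_A` and the uncertainty strategy picks `line_B`, which shows how the two criteria (finding hits versus improving the model) can pull selection in different directions.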
Affiliation(s)
- Priyanka Vasanthakumari
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL 60439, USA
- Yitan Zhu
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL 60439, USA
- Thomas Brettin
  - Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL 60439, USA
- Alexander Partin
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL 60439, USA
- Maulik Shukla
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL 60439, USA
- Fangfang Xia
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL 60439, USA
- Oleksandr Narykov
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL 60439, USA
- Michael Ryan Weil
  - Cancer Research Technology Program, Cancer Data Science Initiatives, Frederick National Laboratory for Cancer Research, Frederick, MD 21701, USA
- Rick L. Stevens
  - Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL 60439, USA
  - Department of Computer Science, The University of Chicago, Chicago, IL 60637, USA
2
Ng CW, Wong KK. Deep learning-enabled breast cancer endocrine response determination from H&E staining based on ESR1 signaling activity. Sci Rep 2023; 13:21454. [PMID: 38052873 PMCID: PMC10698147 DOI: 10.1038/s41598-023-48830-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 11/30/2023] [Indexed: 12/07/2023] Open
Abstract
Estrogen receptor (ER) positivity by immunohistochemistry has long been a main selection criterion for breast cancer patients to be treated with endocrine therapy. However, ER positivity might not directly correlate with activated ER signaling activity, which is a better predictor for endocrine therapy responsiveness. In this study, we investigated whether a deep learning method using whole-slide H&E-stained images could predict ER signaling activity. First, an ER signaling activity score was determined using RNAseq data available from each of the 1082 breast cancer samples in the TCGA Pan-Cancer dataset based on the Hallmark Estrogen Response Early gene set from the Molecular Signature Database (MSigDB). Then the processed H&E-stained images and ER signaling activity scores from a training cohort were fed into ResNet101 with three additional fully connected layers to generate a predicted ER activity score. The trained models were subsequently applied to an independent testing cohort. The results demonstrated that ER+/HER2- breast cancer patients with a higher predicted ER activity score had longer progression-free survival (p = 0.0368) than those with a lower predicted ER activity score. In conclusion, a convolutional deep neural network can predict prognosis and endocrine therapy response in breast cancer patients based on whole-slide H&E-stained images. The trained models were found to robustly predict the prognosis of ER+/HER2- patients. This information is valuable for patient management, as it does not require RNA-seq or microarray data analyses. Thus, these models can reduce the cost of the diagnosis workflow if such information is required.
Affiliation(s)
- Chun Wai Ng
  - Department of Gynecologic Oncology and Reproductive Medicine, Unit 1362, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Houston, TX, 77030, USA
- Kwong-Kwok Wong
  - Department of Gynecologic Oncology and Reproductive Medicine, Unit 1362, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Houston, TX, 77030, USA
3
Janizek JD, Dincer AB, Celik S, Chen H, Chen W, Naxerova K, Lee SI. Uncovering expression signatures of synergistic drug responses via ensembles of explainable machine-learning models. Nat Biomed Eng 2023; 7:811-829. [PMID: 37127711 DOI: 10.1038/s41551-023-01034-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/12/2021] [Accepted: 04/01/2023] [Indexed: 05/03/2023]
Abstract
Machine learning may aid the choice of optimal combinations of anticancer drugs by explaining the molecular basis of their synergy. By combining accurate models with interpretable insights, explainable machine learning promises to accelerate data-driven cancer pharmacology. However, owing to the highly correlated and high-dimensional nature of transcriptomic data, naively applying current explainable machine-learning strategies to large transcriptomic datasets leads to suboptimal outcomes. Here by using feature attribution methods, we show that the quality of the explanations can be increased by leveraging ensembles of explainable machine-learning models. We applied the approach to a dataset of 133 combinations of 46 anticancer drugs tested in ex vivo tumour samples from 285 patients with acute myeloid leukaemia and uncovered a haematopoietic-differentiation signature underlying drug combinations with therapeutic synergy. Ensembles of machine-learning models trained to predict drug combination synergies on the basis of gene-expression data may improve the feature attribution quality of complex machine-learning models.
Affiliation(s)
- Joseph D Janizek
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
  - Medical Scientist Training Program, University of Washington, Seattle, WA, USA
- Ayse B Dincer
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
- Safiye Celik
  - Recursion Pharmaceuticals, Salt Lake City, UT, USA
- Hugh Chen
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
- William Chen
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
- Kamila Naxerova
  - Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Su-In Lee
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
4
Partin A, Brettin TS, Zhu Y, Narykov O, Clyde A, Overbeek J, Stevens RL. Deep learning methods for drug response prediction in cancer: Predominant and emerging trends. Front Med (Lausanne) 2023; 10:1086097. [PMID: 36873878 PMCID: PMC9975164 DOI: 10.3389/fmed.2023.1086097] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 11/01/2022] [Accepted: 01/23/2023] [Indexed: 02/17/2023] Open
Abstract
Cancer claims millions of lives yearly worldwide. While many therapies have been made available in recent years, by and large cancer remains unsolved. Exploiting computational predictive models to study and treat cancer holds great promise in improving drug development and personalized design of treatment plans, ultimately suppressing tumors, alleviating suffering, and prolonging lives of patients. A wave of recent papers demonstrates promising results in predicting cancer response to drug treatments while utilizing deep learning methods. These papers investigate diverse data representations, neural network architectures, learning methodologies, and evaluation schemes. However, deciphering promising predominant and emerging trends is difficult due to the variety of explored methods and the lack of a standardized framework for comparing drug response prediction models. To obtain a comprehensive landscape of deep learning methods, we conducted an extensive search and analysis of deep learning models that predict the response to single drug treatments. A total of 61 deep learning-based models have been curated, and summary plots were generated. Based on the analysis, observable patterns and prevalence of methods have been revealed. This review helps readers better understand the current state of the field and identify major challenges and promising solution paths.
Affiliation(s)
- Alexander Partin
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Thomas S. Brettin
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Yitan Zhu
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Oleksandr Narykov
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Austin Clyde
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Jamie Overbeek
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Rick L. Stevens
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
  - Department of Computer Science, The University of Chicago, Chicago, IL, United States
5
Tang YC, Powell RT, Gottlieb A. Molecular pathways enhance drug response prediction using transfer learning from cell lines to tumors and patient-derived xenografts. Sci Rep 2022; 12:16109. [PMID: 36168036 PMCID: PMC9515168 DOI: 10.1038/s41598-022-20646-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Accepted: 09/16/2022] [Indexed: 11/24/2022] Open
Abstract
Computational models have been successful in predicting drug sensitivity in cancer cell line data, creating an opportunity to guide precision medicine. However, translating these models to tumors remains challenging. We propose a new transfer learning workflow that transfers drug sensitivity prediction models from large-scale cancer cell lines to both tumors and patient-derived xenografts based on molecular pathways derived from genomic features. We further compute feature importance to identify the pathways most important to drug response prediction. We obtained good performance on tumors (AUROC = 0.77) and patient-derived xenografts from triple-negative breast cancers (RMSE = 0.11). Using feature importance, we highlight the association between the ER-Golgi trafficking pathway and everolimus sensitivity within breast cancer patients and the role of class II histone deacetylases and interleukin-12 in response to drugs for triple-negative breast cancer. Pathway information supports the transfer of drug response prediction models from cell lines to tumors and can provide biological interpretation underlying the predictions, serving as a stepping stone towards use in clinical settings.
Affiliation(s)
- Yi-Ching Tang
  - Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Reid T. Powell
  - Center for Translational Cancer Research, Texas A&M University, Houston, TX 77030, USA
- Assaf Gottlieb
  - Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
6
Cai Y, Zeng R, Peng J, Liu W, He Q, Xu Z, Bai N. The downregulated drug-metabolism related ALDH6A1 serves as predictor for prognosis and therapeutic immune response in gastric cancer. Aging (Albany NY) 2022; 14:7038-7051. [PMID: 36098688 PMCID: PMC9512493 DOI: 10.18632/aging.204270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Accepted: 08/24/2022] [Indexed: 11/25/2022]
Abstract
Drug metabolism-associated genes have been shown to play a vital role in cancer cell growth and migration. Nevertheless, the correlation between drug metabolism-associated genes and gastric cancer (GC) has not been fully explored. This paper focuses on the role of aldehyde dehydrogenase 6 family member A1 (ALDH6A1), a drug metabolism-associated gene, in the immune regulation and prognosis of GC patients. Using several bioinformatics platforms and an immunohistochemistry (IHC) assay, we found that ALDH6A1 expression was significantly down-regulated in GC tissues. Moreover, higher expression of ALDH6A1 was related to better prognosis of GC patients. ALDH6A1 was also found to be involved in the regulation of several immune-associated signatures, including immunoinhibitors. In conclusion, these results suggest that aberrant expression of ALDH6A1 might serve as a promising predictor of prognosis and clinical immunotherapy response in GC patients.
Affiliation(s)
- Yuan Cai
  - Department of Pathology, Xiangya Hospital, Central South University, Changsha 410008, Hunan, China
- Rong Zeng
  - General Surgery Department, Second Xiangya Hospital, Central South University, Changsha 410008, Hunan, China
- Jinwu Peng
  - Department of Pathology, Xiangya Hospital, Central South University, Changsha 410008, Hunan, China
  - Department of Pathology, Xiangya Changde Hospital, Changde 415000, Hunan, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008, Hunan, China
- Wei Liu
  - Department of Pathology, Xiangya Hospital, Central South University, Changsha 410008, Hunan, China
  - Department of Orthopedic Surgery, The Second Hospital University of South China, Hengyang 421001, Hunan, China
- Qingchun He
  - Department of Emergency, Xiangya Hospital, Central South University, Changsha 410008, Hunan, China
  - Department of Emergency, Xiangya Changde Hospital, Changde 415000, Hunan, China
- Zhijie Xu
  - Department of Pathology, Xiangya Hospital, Central South University, Changsha 410008, Hunan, China
- Ning Bai
  - Department of General Surgery, Xiangya Hospital, Central South University, Changsha 410008, Hunan, China
7
Zhu EY, Dupuy AJ. Machine learning approach informs biology of cancer drug response. BMC Bioinformatics 2022; 23:184. [PMID: 35581546 PMCID: PMC9112473 DOI: 10.1186/s12859-022-04720-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/15/2021] [Accepted: 05/03/2022] [Indexed: 12/12/2022] Open
Abstract
Background: The mechanism of action for most cancer drugs is not clear. Large-scale pharmacogenomic cancer cell line datasets offer a rich resource to obtain this knowledge. Here, we present an analysis strategy for revealing biological pathways that contribute to drug response using publicly available pharmacogenomic cancer cell line datasets. Methods: We present a custom machine-learning based approach for identifying biological pathways involved in cancer drug response. We test the utility of our approach with a pan-cancer analysis of ML210, an inhibitor of GPX4, and a melanoma-focused analysis of inhibitors of BRAFV600. We apply our approach to reveal determinants of drug resistance to microtubule inhibitors. Results: Our method implicated lipid metabolism and Rac1/cytoskeleton signaling in the context of ML210 and BRAF inhibitor response, respectively. These findings are consistent with current knowledge of how these drugs work. For microtubule inhibitors, our approach implicated Notch and Akt signaling as pathways associated with response. Conclusions: Our results demonstrate the utility of combining informed feature selection and machine learning algorithms in understanding cancer drug response. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04720-z.
Affiliation(s)
- Eliot Y Zhu
  - Department of Anatomy and Cell Biology, The University of Iowa, Iowa City, IA, USA
  - Holden Comprehensive Cancer Center, The University of Iowa, Iowa City, IA, USA
  - Cancer Biology Graduate Program, The University of Iowa, Iowa City, IA, USA
  - The Medical Scientist Training Program, The University of Iowa, Iowa City, IA, USA
- Adam J Dupuy
  - Department of Anatomy and Cell Biology, The University of Iowa, Iowa City, IA, USA
  - Holden Comprehensive Cancer Center, The University of Iowa, Iowa City, IA, USA
8
Tang YC, Gottlieb A. SynPathy: Predicting Drug Synergy through Drug-Associated Pathways Using Deep Learning. Mol Cancer Res 2022; 20:762-769. [PMID: 35046110 DOI: 10.1158/1541-7786.mcr-21-0735] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/31/2021] [Revised: 11/01/2021] [Accepted: 01/12/2022] [Indexed: 11/16/2022]
Abstract
Drug combination therapy has become a promising therapeutic strategy for cancer treatment. While high-throughput drug combination screening is effective for identifying synergistic drug combinations, measuring all possible combinations is impractical due to the vast space of therapeutic agents and cell lines. In this study, we propose a biologically-motivated deep learning approach to identify pathway-level features from drug and cell lines' molecular data for predicting drug synergy and quantifying the interactions in synergistic drug pairs. This method obtained an MSE of 70.6 ± 6.4, significantly surpassing previous approaches while providing potential candidate pathways to explain the prediction. We further demonstrate that drug combinations tend to be more synergistic when their top contributing pathways are closer to each other on a protein interaction network, suggesting a potential strategy for combination therapy with topologically interacting pathways. Our computational approach can thus be utilized both for pre-screening of potential drug combinations and for designing new combinations based on proximity of pathways associated with drug targets and cell lines. Implications: Our computational framework may be translated in the future to clinical scenarios where synergistic drugs are tailored to the patient and additionally, drug development could benefit from designing drugs that target topologically close pathways.
Affiliation(s)
- Yi-Ching Tang
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston
- Assaf Gottlieb
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston
9
Firoozbakht F, Yousefi B, Schwikowski B. An overview of machine learning methods for monotherapy drug response prediction. Brief Bioinform 2022; 23:bbab408. [PMID: 34619752 PMCID: PMC8769705 DOI: 10.1093/bib/bbab408] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/07/2021] [Revised: 08/25/2021] [Accepted: 09/06/2021] [Indexed: 12/11/2022] Open
Abstract
For an increasing number of preclinical samples, both detailed molecular profiles and their responses to various drugs are becoming available. Efforts to understand, and predict, drug responses in a data-driven manner have led to a proliferation of machine learning (ML) methods, with the longer-term ambition of predicting clinical drug responses. Here, we provide a uniquely wide and deep systematic review of the rapidly evolving literature on monotherapy drug response prediction, with a systematic characterization and classification that comprises more than 70 ML methods in 13 subclasses, their input and output data types, modes of evaluation, and code and software availability. ML experts are provided with a fundamental understanding of the biological problem, and how ML methods are configured for it. Biologists and biomedical researchers are introduced to the basic principles of applicable ML methods, and their application to the problem of drug response prediction. We also provide systematic overviews of commonly used data sources for training and evaluation methods.
Affiliation(s)
- Farzaneh Firoozbakht
  - Systems Biology Group, Department of Computational Biology, Institut Pasteur, Paris, France
- Behnam Yousefi
  - Systems Biology Group, Department of Computational Biology, Institut Pasteur, Paris, France
  - Sorbonne Université, École Doctorale Complexite du Vivant, Paris, France
- Benno Schwikowski
  - Systems Biology Group, Department of Computational Biology, Institut Pasteur, Paris, France
10
Jin I, Nam H. HiDRA: Hierarchical Network for Drug Response Prediction with Attention. J Chem Inf Model 2021; 61:3858-3867. [PMID: 34342985 DOI: 10.1021/acs.jcim.1c00706] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/11/2022]
Abstract
Understanding differences in drug responses between patients is crucial for delivering effective cancer treatment. We describe an interpretable AI model for use in predicting drug responses in cancer cells at the gene, molecular pathway, and drug level, which we have called the hierarchical network for drug response prediction with attention. We found that the model shows better accuracy in predicting drugs having efficacy against a given cell line than other state-of-the-art methods, with a root mean squared error of 1.0064, a Pearson's correlation coefficient of 0.9307, and an R² value of 0.8647. We also confirmed that the model gives high attention to drug-target genes and cancer-related pathways when predicting a response. The validity of predicted results was proven by in vitro cytotoxicity assay. Overall, we propose that our hierarchical and interpretable AI-based model is capable of interpreting intrinsic characteristics of cancer cells and drugs for accurate prediction of cancer-drug responses.
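The three metrics quoted in this abstract (root mean squared error, Pearson's correlation coefficient, and the coefficient of determination) can be computed as in the following minimal sketch. The function name `regression_metrics` and the toy numbers are illustrative assumptions; this is not the authors' evaluation code, and the abstract does not state how its values were computed.

```python
from math import sqrt
from statistics import mean


def regression_metrics(y_true, y_pred):
    """Return RMSE, Pearson's r, and R^2 for paired observed/predicted values."""
    n = len(y_true)
    mean_true, mean_pred = mean(y_true), mean(y_pred)
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # Covariance and prediction variance terms (unnormalised) for Pearson's r.
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    return {
        "rmse": sqrt(ss_res / n),
        "pearson": cov / sqrt(ss_tot * ss_pred),
        "r2": 1.0 - ss_res / ss_tot,
    }


# Toy data: every prediction is off by exactly +1.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
metrics = regression_metrics(observed, predicted)
# rmse = 1.0, pearson = 1.0, r2 is approximately 0.2
```

The toy case shows why the three numbers are reported together: a constant offset leaves Pearson's r at 1.0 while RMSE and R² both register the error, so R² need not equal the square of Pearson's r.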
Affiliation(s)
- Iljung Jin
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Republic of Korea
- Hojung Nam
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Republic of Korea
  - AI Graduate School, Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Republic of Korea