1. Dai W, Chen G, Peng W, Chen C, Fu X, Liu L, Liu L, Yu N. Domain alignment method based on masked variational autoencoder for predicting patient anticancer drug response. Methods 2025; 238:61-73. [PMID: 40090506 DOI: 10.1016/j.ymeth.2025.03.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2024] [Revised: 02/03/2025] [Accepted: 03/14/2025] [Indexed: 03/18/2025]
Abstract
Predicting a patient's response to anticancer drugs is essential for personalized treatment plans. However, because the distributions of cell line data and patient data differ significantly, models that perform well on cell line data may perform poorly when predicting patient anticancer drug response. Some existing methods use transfer learning to align domain features between cell lines and patients and leverage knowledge from cell lines to predict patient anticancer drug responses. This study proposes MVAEDA, a domain alignment method based on masked variational autoencoders, to predict patient anticancer drug responses. The model constructs multiple variational autoencoders (VAEs) and mask predictors to extract specific and domain-invariant features of cell lines and patients. It then masks and reconstructs the gene expression matrix, using generative adversarial training to learn domain-invariant features from the cell line and patient domains. These domain-invariant features are used to train a classifier, and the trained model predicts the anticancer drug response in the target domain. Our model is evaluated experimentally on a clinical dataset and a preclinical dataset. The results show that our method performs better than other state-of-the-art methods.
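The masking-and-reconstruction step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not specify the masking scheme, so the sketch assumes random zero-masking of a fraction of entries and a mean squared error computed on the masked positions only. The names `mask_expression` and `masked_mse` are hypothetical and are not taken from MVAEDA.

```python
import random


def mask_expression(matrix, mask_ratio=0.3, seed=0):
    """Randomly zero out entries of a samples-by-genes matrix.

    Returns (masked_matrix, mask), where mask[i][j] is True for every
    entry that was hidden. Zero-masking is an assumption of this sketch.
    """
    rng = random.Random(seed)
    masked, mask = [], []
    for row in matrix:
        row_mask = [rng.random() < mask_ratio for _ in row]
        masked.append([0.0 if hidden else value
                       for value, hidden in zip(row, row_mask)])
        mask.append(row_mask)
    return masked, mask


def masked_mse(original, reconstructed, mask):
    """Mean squared error restricted to the masked positions."""
    errors = [
        (o - r) ** 2
        for o_row, r_row, m_row in zip(original, reconstructed, mask)
        for o, r, hidden in zip(o_row, r_row, m_row)
        if hidden
    ]
    return sum(errors) / len(errors) if errors else 0.0


# Toy expression matrix: 2 samples x 3 genes.
expr = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
masked, mask = mask_expression(expr, mask_ratio=0.5, seed=1)
```

In a full model, a decoder would produce the reconstruction from the masked input, and a loss of this kind would be one term alongside the adversarial and classification objectives.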
Affiliation(s)
- Wei Dai
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming 650050, China.
- Gong Chen
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China
- Wei Peng
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming 650050, China.
- Chuyue Chen
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China
- Xiaodong Fu
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming 650050, China
- Li Liu
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming 650050, China.
- Lijun Liu
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650050, China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming 650050, China
- Ning Yu
  - State University of New York, The College at Brockport, Department of Computing Sciences, 350 New Campus Drive, Brockport, NY 14422, United States.
2. Jayagopal A, Walsh RJ, Hariprasannan KK, Mariappan R, Mahapatra D, Jaynes PW, Lim D, Peng Tan DS, Tan TZ, Pitt JJ, Jeyasekharan AD, Rajan V. A multi-task domain-adapted model to predict chemotherapy response from mutations in recurrently altered cancer genes. iScience 2025; 28:111992. [PMID: 40160429 PMCID: PMC11952854 DOI: 10.1016/j.isci.2025.111992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 08/23/2024] [Accepted: 02/06/2025] [Indexed: 04/02/2025]
Abstract
Next-generation sequencing (NGS) is increasingly utilized in oncological practice; however, only a minority of patients benefit from targeted therapy. Developing drug response prediction (DRP) models is important for the "untargetable" majority. Prior DRP models typically use whole-transcriptome and whole-exome sequencing data, which are clinically unavailable. We aim to develop a DRP model toward the repurposing of chemotherapy, requiring only information from clinical-grade NGS (cNGS) panels of restricted gene sets. Data sparsity and limited patient drug response information make this challenging. We first show that existing DRP models perform equally well with whole-exome and cNGS (∼300 genes) data. Drug IDentifier (DruID) is then described, a DRP model for restricted gene sets using transfer learning, variant annotations, domain-invariant representation learning, and multi-task learning. DruID outperformed state-of-the-art DRP methods on pan-cancer data and showed robust response classification on two real-world clinical datasets, representing a step toward a clinically applicable DRP tool.
Affiliation(s)
- Aishwarya Jayagopal
  - Department of Information Systems and Analytics, School of Computing, National University of Singapore, Singapore 117417, Singapore
- Robert J. Walsh
  - Department of Haematology-Oncology, National University Cancer Institute, NUHS Tower Block, Level 7, 1E Kent Ridge Road, Singapore 119228, Singapore
- Krishna Kumar Hariprasannan
  - Department of Information Systems and Analytics, School of Computing, National University of Singapore, Singapore 117417, Singapore
- Ragunathan Mariappan
  - Department of Information Systems and Analytics, School of Computing, National University of Singapore, Singapore 117417, Singapore
- Debabrata Mahapatra
  - Department of Computer Science, School of Computing, National University of Singapore, Singapore 117417, Singapore
- Patrick William Jaynes
  - Cancer Science Institute of Singapore, National University of Singapore, Center for Translational Medicine, 14 Medical Drive, #12-01, Singapore 117599, Singapore
- Diana Lim
  - Department of Pathology, National University Health System, 1E Kent Ridge Road, Singapore 119228, Singapore
- David Shao Peng Tan
  - Department of Haematology-Oncology, National University Cancer Institute, NUHS Tower Block, Level 7, 1E Kent Ridge Road, Singapore 119228, Singapore
  - Cancer Science Institute of Singapore, National University of Singapore, Center for Translational Medicine, 14 Medical Drive, #12-01, Singapore 117599, Singapore
  - Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, 1E Kent Ridge Road, NUHS Tower Block, Level 10, Singapore 119228, Singapore
- Tuan Zea Tan
  - Cancer Science Institute of Singapore, National University of Singapore, Center for Translational Medicine, 14 Medical Drive, #12-01, Singapore 117599, Singapore
- Jason J. Pitt
  - Cancer Science Institute of Singapore, National University of Singapore, Center for Translational Medicine, 14 Medical Drive, #12-01, Singapore 117599, Singapore
- Anand D. Jeyasekharan
  - Department of Haematology-Oncology, National University Cancer Institute, NUHS Tower Block, Level 7, 1E Kent Ridge Road, Singapore 119228, Singapore
  - Cancer Science Institute of Singapore, National University of Singapore, Center for Translational Medicine, 14 Medical Drive, #12-01, Singapore 117599, Singapore
- Vaibhav Rajan
  - Department of Information Systems and Analytics, School of Computing, National University of Singapore, Singapore 117417, Singapore
3. Peng W, Chen C, Dai W, Yu N, Wang J. Predicting Clinical Anticancer Drug Response of Patients by Using Domain Alignment and Prototypical Learning. IEEE J Biomed Health Inform 2025; 29:1534-1545. [PMID: 39292588 DOI: 10.1109/jbhi.2024.3462811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/20/2024]
Abstract
Anticancer drug response prediction is crucial in developing personalized treatment plans for cancer patients. However, because high-quality patient anticancer drug response data are scarce and cell line data and patient data have different distributions, models trained solely on cell line data perform poorly. Some existing methods predict anticancer drug response by transferring knowledge from the cell line domain to the patient domain using transfer learning. However, the robustness of these classifiers is affected by anomalies in the cell line data, and they do not utilize the knowledge in the unlabeled target domain data. To this end, we propose a model called DAPL to predict patient responses to anticancer drugs. The model extracts domain-invariant features from cell lines and patients by constructing multiple VAEs and extracts drug features using GNNs. These features are then combined for prototypical learning to train a classifier, resulting in better predictions of patient anticancer drug response. We conducted experiments using the cell line datasets CCLE and GDSC as source domains and the patient datasets TCGA and PDTC as target domains. The results indicate that DAPL shows excellent performance in predicting patient anticancer drug response compared to other state-of-the-art methods.
4. Nguyen T, Campbell A, Kumar A, Amponsah E, Fiterau M, Shahriyari L. Optimal fusion of genotype and drug embeddings in predicting cancer drug response. Brief Bioinform 2024; 25:bbae227. [PMID: 38754407 PMCID: PMC11097979 DOI: 10.1093/bib/bbae227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Revised: 04/14/2024] [Accepted: 04/25/2024] [Indexed: 05/18/2024]
Abstract
Predicting cancer drug response using both genomics and drug features has shown some success compared to using genomics features alone. However, there has been limited research on how best to combine or fuse the two types of features. Using a visible neural network with two deep learning branches for gene and drug features as the base architecture, we experimented with different fusion functions and fusion points. Our experiments show that injecting multiplicative relationships between gene and drug latent features into the original concatenation-based architecture DrugCell significantly improved the overall predictive performance and outperformed other baseline models. We also show that different fusion methods respond differently to different fusion points, indicating that the relationship between drug features and different hierarchical biological levels of gene features is optimally captured using different methods. Considering both predictive performance and runtime speed, tensor product partial is the best-performing fusion function to combine late-stage representations of drug and gene features to predict cancer drug response.
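The contrast between concatenation-based and multiplicative fusion described in this abstract can be sketched with three generic fusion functions. This is an illustration under assumptions, not the paper's implementation: the function names are hypothetical, the embeddings are plain lists of floats, and the "tensor product partial" variant named in the abstract is not reproduced here because its definition is not given.

```python
def fuse_concat(gene, drug):
    """Concatenation: stacks the two embeddings, no interaction terms."""
    return list(gene) + list(drug)


def fuse_elementwise(gene, drug):
    """Element-wise product: one multiplicative term per shared dimension."""
    if len(gene) != len(drug):
        raise ValueError("element-wise fusion needs equal-length embeddings")
    return [g * d for g, d in zip(gene, drug)]


def fuse_outer(gene, drug):
    """Full tensor (outer) product, flattened row by row.

    Every gene feature is multiplied with every drug feature, so the
    output has len(gene) * len(drug) entries.
    """
    return [g * d for g in gene for d in drug]


# Toy latent vectors standing in for the gene and drug branch outputs.
gene_embedding = [1.0, 2.0]
drug_embedding = [3.0, 4.0]

concat = fuse_concat(gene_embedding, drug_embedding)
product = fuse_elementwise(gene_embedding, drug_embedding)
outer = fuse_outer(gene_embedding, drug_embedding)
```

The output size grows from the sum of the input sizes (concatenation) to their product (outer product), which is one reason runtime is weighed alongside predictive performance when choosing a fusion function.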
Affiliation(s)
- Trang Nguyen
  - Department of Computer Science, University of Massachusetts Amherst, Amherst 01002, MA, United States
- Anthony Campbell
  - Department of Computer Science, University of Massachusetts Amherst, Amherst 01002, MA, United States
- Ankit Kumar
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst 01002, MA, United States
- Edwin Amponsah
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst 01002, MA, United States
- Madalina Fiterau
  - Department of Computer Science, University of Massachusetts Amherst, Amherst 01002, MA, United States
- Leili Shahriyari
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst 01002, MA, United States
5. Kim J, Park SH, Lee H. PANCDR: precise medicine prediction using an adversarial network for cancer drug response. Brief Bioinform 2024; 25:bbae088. [PMID: 38487849 PMCID: PMC10940842 DOI: 10.1093/bib/bbae088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 01/09/2024] [Accepted: 02/16/2024] [Indexed: 03/18/2024]
Abstract
Pharmacogenomics aims to provide personalized therapy to patients based on their genetic variability. However, accurate prediction of cancer drug response (CDR) is challenging due to genetic heterogeneity. Since clinical data are limited, most studies predicting drug response use preclinical data to train models. However, such models might not be generalizable to external clinical data due to differences between the preclinical and clinical datasets. In this study, a Precision Medicine Prediction using an Adversarial Network for Cancer Drug Response (PANCDR) model is proposed. PANCDR consists of two sub-models, an adversarial model and a CDR prediction model. The adversarial model reduces the gap between the preclinical and clinical datasets, while the CDR prediction model extracts features and predicts responses. PANCDR was trained using both preclinical data and unlabeled clinical data. Subsequently, it was tested on external clinical data, including The Cancer Genome Atlas and brain tumor patients. PANCDR outperformed other machine learning models in predicting external test data. Our results demonstrate the robustness of PANCDR and its potential in precision medicine by recommending patient-specific drug candidates. The PANCDR codes and data are available at https://github.com/DMCB-GIST/PANCDR.
Affiliation(s)
- Juyeon Kim
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, 61005, Gwangju, South Korea
- Sung-Hye Park
  - Department of Pathology, Seoul National University Hospital, Seoul National University College of Medicine, 03080, Seoul, South Korea
  - Neuroscience Research Institute, Seoul National University College of Medicine, 03080, Seoul, South Korea
- Hyunju Lee
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, 61005, Gwangju, South Korea
  - Artificial Intelligence Graduate School, Gwangju Institute of Science and Technology, 61005, Gwangju, South Korea
6. Yuan S, Chen YC, Tsai CH, Chen HW, Shieh GS. Feature selection translates drug response predictors from cell lines to patients. Front Genet 2023; 14:1217414. [PMID: 37519889 PMCID: PMC10382684 DOI: 10.3389/fgene.2023.1217414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Accepted: 06/26/2023] [Indexed: 08/01/2023]
Abstract
Targeted therapies and chemotherapies are prevalent in cancer treatment. Identification of predictive markers to stratify cancer patients who will respond to these therapies remains challenging because patient drug response data are limited. As large amounts of drug response data have been generated from cell lines, methods to efficiently translate cell-line-trained predictors to human tumors will be useful in clinical practice. Here, we propose versatile feature selection procedures that can be combined with any classifier. For demonstration, we combined the feature selection procedures with a (linear) logit model and a (non-linear) K-nearest neighbor classifier and trained these on cell lines to yield LogitDA and KNNDA, respectively. We show that LogitDA/KNNDA significantly outperforms existing methods, e.g., a logistic model and a deep learning method trained on thousands of genes, in prediction AUC (0.70-1.00 for seven of the ten drugs tested) and is interpretable. This may be due to the fact that sample sizes are often limited in the area of drug response prediction. We further derive a novel adjustment on the prediction cutoff for LogitDA to yield a prediction accuracy of 0.70-0.93 for seven drugs, including erlotinib and cetuximab, whose pathways relevant to anti-cancer therapies are also uncovered. These results indicate that our methods can efficiently translate cell-line-trained predictors to tumors.
Affiliation(s)
- Shinsheng Yuan
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
  - Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
- Yen-Chou Chen
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
- Chi-Hsuan Tsai
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
- Huei-Wen Chen
  - College of Medicine, Graduate Institute of Toxicology, National Taiwan University, Taipei, Taiwan
- Grace S. Shieh
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
  - Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
  - Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, Taiwan
  - Data Science Degree Program, Academia Sinica and National Taiwan University, Taipei, Taiwan
7. Partin A, Brettin TS, Zhu Y, Narykov O, Clyde A, Overbeek J, Stevens RL. Deep learning methods for drug response prediction in cancer: Predominant and emerging trends. Front Med (Lausanne) 2023; 10:1086097. [PMID: 36873878 PMCID: PMC9975164 DOI: 10.3389/fmed.2023.1086097] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Received: 11/01/2022] [Accepted: 01/23/2023] [Indexed: 02/17/2023]
Abstract
Cancer claims millions of lives yearly worldwide. While many therapies have been made available in recent years, by and large cancer remains unsolved. Exploiting computational predictive models to study and treat cancer holds great promise in improving drug development and personalized design of treatment plans, ultimately suppressing tumors, alleviating suffering, and prolonging lives of patients. A wave of recent papers demonstrates promising results in predicting cancer response to drug treatments while utilizing deep learning methods. These papers investigate diverse data representations, neural network architectures, learning methodologies, and evaluation schemes. However, deciphering promising predominant and emerging trends is difficult due to the variety of explored methods and the lack of a standardized framework for comparing drug response prediction models. To obtain a comprehensive landscape of deep learning methods, we conducted an extensive search and analysis of deep learning models that predict the response to single drug treatments. A total of 61 deep learning-based models have been curated, and summary plots were generated. Based on the analysis, observable patterns and prevalence of methods have been revealed. This review helps readers better understand the current state of the field and identify major challenges and promising solution paths.
Affiliation(s)
- Alexander Partin
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Thomas S. Brettin
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Yitan Zhu
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Oleksandr Narykov
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Austin Clyde
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Jamie Overbeek
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Rick L. Stevens
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
  - Department of Computer Science, The University of Chicago, Chicago, IL, United States
8. Shen B, Feng F, Li K, Lin P, Ma L, Li H. A systematic assessment of deep learning methods for drug response prediction: from in vitro to clinical applications. Brief Bioinform 2023; 24:6961794. [PMID: 36575826 DOI: 10.1093/bib/bbac605] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/08/2022] [Revised: 10/30/2022] [Accepted: 12/09/2022] [Indexed: 12/29/2022]
Abstract
Drug response prediction is an important problem in personalized cancer therapy. Among various newly developed models, significant improvement in prediction performance has been reported using deep learning methods. However, systematic comparisons of deep learning methods, especially of the transferability from preclinical models to clinical cohorts, are currently lacking. To provide a more rigorous assessment, we benchmarked six representative deep learning methods for drug response prediction in multiple application scenarios using nine evaluation metrics, covering overall prediction accuracy, predictability of each drug, potential associated factors and transferability to clinical cohorts. Most methods show promising prediction within cell line datasets, and TGSA, with its lower time cost and better performance, is recommended. Although the performance metrics decrease when applying models trained on cell lines to patients, a certain amount of power to distinguish clinical response on some drugs can be maintained using CRDNN and TGSA. With these assessments, we provide guidance for researchers to choose appropriate methods, as well as insights into future directions for the development of more effective methods in clinical scenarios.
Affiliation(s)
- Bihan Shen
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Fangyoumin Feng
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Kunshi Li
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Ping Lin
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Liangxiao Ma
  - Bio-Med Big Data Center at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Hong Li
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
9. Yingtaweesittikul H, Wu J, Mongia A, Peres R, Ko K, Nagarajan N, Suphavilai C. CREAMMIST: an integrative probabilistic database for cancer drug response prediction. Nucleic Acids Res 2022; 51:D1242-D1248. [PMID: 36259664 PMCID: PMC9825458 DOI: 10.1093/nar/gkac911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2022] [Revised: 09/18/2022] [Accepted: 10/11/2022] [Indexed: 01/30/2023]
Abstract
Extensive in vitro cancer drug screening datasets have enabled scientists to identify biomarkers and develop machine learning models for predicting drug sensitivity. While most advancements have focused on omics profiles, cancer drug sensitivity scores precalculated by the original sources are often used as-is, without consideration for variability between studies. It is well known that significant inconsistencies exist between the drug sensitivity scores across datasets due to differences in experimental setups and preprocessing methods used to obtain the sensitivity scores. As a result, many studies opt to focus only on a single dataset, leading to underutilization of available data and a limited interpretation of cancer pharmacogenomics analysis. To overcome these caveats, we have developed CREAMMIST (https://creammist.mtms.dev), an integrative database that enables users to obtain an integrative dose-response curve that captures uncertainty (or high certainty when multiple datasets align well) across five widely used cancer cell-line drug-response datasets. We utilized a Bayesian framework to systematically integrate all available dose-response values across datasets (>14 million dose-response data points). CREAMMIST provides easy-to-use statistics derived from the integrative dose-response curves for various downstream analyses such as identifying biomarkers, selecting drug concentrations for experiments, and training robust machine learning models.
Affiliation(s)
- Jiaxi Wu
  - Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Aanchal Mongia
  - Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Rafael Peres
  - Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Karrie Ko
  - Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Chayaporn Suphavilai
  - To whom correspondence should be addressed. Tel: +65 86213683; Fax: +65 68088292;
10. Ogunleye AZ, Piyawajanusorn C, Gonçalves A, Ghislat G, Ballester PJ. Interpretable Machine Learning Models to Predict the Resistance of Breast Cancer Patients to Doxorubicin from Their microRNA Profiles. Adv Sci (Weinh) 2022; 9:e2201501. [PMID: 35785523 PMCID: PMC9403644 DOI: 10.1002/advs.202201501] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 03/15/2022] [Revised: 06/02/2022] [Indexed: 05/05/2023]
Abstract
Doxorubicin is a common treatment for breast cancer. However, not all patients respond to this drug, which sometimes causes life-threatening side effects. Accurately anticipating doxorubicin-resistant patients would therefore make it possible to spare them this risk while considering alternative treatments without delay. Stratifying patients based on molecular markers in their pretreatment tumors is a promising approach to advance toward this ambitious goal, but single-gene markers such as HER2 expression have not been shown to be sufficiently predictive. The recent availability of matched doxorubicin-response and diverse molecular profiles across breast cancer patients now permits analysis at a much larger scale. 16 machine learning algorithms and 8 molecular profiles are systematically evaluated on the same cohort of patients. Only 2 of the 128 resulting models are substantially predictive, showing that they can be easily missed by a standard-scale analysis. The best model is a classification and regression tree (CART) nonlinearly combining 4 selected miRNA isoforms to predict doxorubicin response (median Matthews correlation coefficient (MCC) and area under the curve (AUC) of 0.56 and 0.80, respectively). By contrast, HER2 expression is significantly less predictive (median MCC and AUC of 0.14 and 0.57, respectively). As the predictive accuracy of this CART model increases with larger training sets, its update with future data should result in even better accuracy.
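The two metrics reported in this abstract, MCC and AUC, can be computed from first principles as in the sketch below. It uses the standard definitions (MCC from the four confusion-matrix counts, AUC as the fraction of positive/negative pairs ranked correctly, with ties counted as half). The function names and the toy numbers are illustrative assumptions and are unrelated to the study's data.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive sample scores higher; ties contribute one half.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy example: predicted resistance scores for 2 resistant (positive)
# and 2 sensitive (negative) patients.
toy_auc = auc([0.9, 0.8], [0.1, 0.85])
toy_mcc = mcc(tp=5, fp=0, tn=5, fn=0)
```

MCC ranges from -1 to 1 and stays informative under class imbalance, which is why it is often reported next to AUC for small clinical cohorts.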
Affiliation(s)
- Adeolu Z. Ogunleye
  - Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F‐13009, France
  - Cancer Research Center of Marseille (CRCM), Institut Paoli‐Calmettes, Marseille F‐13009, France
  - Cancer Research Center of Marseille (CRCM), Aix‐Marseille Université, Marseille F‐13284, France
  - Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F‐13009, France
- Chayanit Piyawajanusorn
  - Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F‐13009, France
  - Cancer Research Center of Marseille (CRCM), Institut Paoli‐Calmettes, Marseille F‐13009, France
  - Cancer Research Center of Marseille (CRCM), Aix‐Marseille Université, Marseille F‐13284, France
  - Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F‐13009, France
- Anthony Gonçalves
  - Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F‐13009, France
  - Cancer Research Center of Marseille (CRCM), Institut Paoli‐Calmettes, Marseille F‐13009, France
  - Cancer Research Center of Marseille (CRCM), Aix‐Marseille Université, Marseille F‐13284, France
  - Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F‐13009, France
- Ghita Ghislat
  - Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F‐13009, France
  - Cancer Research Center of Marseille (CRCM), Institut Paoli‐Calmettes, Marseille F‐13009, France
  - Cancer Research Center of Marseille (CRCM), Aix‐Marseille Université, Marseille F‐13284, France
  - Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F‐13009, France
- Pedro J. Ballester
  - Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F‐13009, France
  - Cancer Research Center of Marseille (CRCM), Institut Paoli‐Calmettes, Marseille F‐13009, France
  - Cancer Research Center of Marseille (CRCM), Aix‐Marseille Université, Marseille F‐13284, France
  - Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F‐13009, France
  - Department of Bioengineering, Imperial College London, London SW7 2AZ, UK
11. Out-of-distribution generalization from labelled and unlabelled gene expression data for drug response prediction. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00408-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/04/2023]