1
Nguyen LC, Naulaerts S, Bruna A, Ghislat G, Ballester PJ. Predicting Cancer Drug Response In Vivo by Learning an Optimal Feature Selection of Tumour Molecular Profiles. Biomedicines 2021; 9:1319. [PMID: 34680436 PMCID: PMC8533095 DOI: 10.3390/biomedicines9101319] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/01/2021] [Revised: 09/22/2021] [Accepted: 09/23/2021] [Indexed: 12/17/2022]
Abstract
(1) Background: Inter-tumour heterogeneity is one of cancer’s most fundamental features. Patient stratification based on drug response prediction is hence needed for effective anti-cancer therapy. However, single-gene markers of response are rare and/or may fail to achieve a significant impact in the clinic. Machine Learning (ML) is emerging as a particularly promising complementary approach to precision oncology. (2) Methods: Here we leverage comprehensive Patient-Derived Xenograft (PDX) pharmacogenomic data sets with dimensionality-reducing ML algorithms for this purpose. (3) Results: Combining multiple gene alterations via ML leads to better discrimination between sensitive and resistant PDXs in 19 of the 26 analysed cases. Highly predictive ML models employing concise gene lists were found for three cases: paclitaxel (breast cancer), binimetinib (breast cancer) and cetuximab (colorectal cancer). Interestingly, each of these multi-gene ML models identifies some treatment-responsive PDXs not harbouring the best actionable mutation for that case. Thus, ML multi-gene predictors generally have far fewer false negatives than the corresponding single-gene marker. (4) Conclusions: As PDXs often recapitulate clinical outcomes, these results suggest that many more patients could benefit from precision oncology if ML algorithms were also applied to existing clinical pharmacogenomics data, especially those algorithms generating classifiers combining data-selected gene alterations.
Affiliation(s)
- Linh C. Nguyen
  - Cancer Research Center of Marseille, INSERM U1068, F-13009 Marseille, France
  - Institut Paoli-Calmettes, F-13009 Marseille, France
  - Aix-Marseille Université UM105, F-13009 Marseille, France
  - CNRS UMR7258, F-13009 Marseille, France
  - Department of Life Sciences, University of Science and Technology of Hanoi, Vietnam Academy of Science and Technology, Hanoi 100803, Vietnam
- Stefan Naulaerts
  - Ludwig Institute for Cancer Research, 1200 Brussels, Belgium
  - Duve Institute, UCLouvain, 1200 Brussels, Belgium
- Ghita Ghislat
  - Centre d’Immunologie de Marseille-Luminy, INSERM U1104, CNRS UMR7280, F-13009 Marseille, France
- Pedro J. Ballester
  - Cancer Research Center of Marseille, INSERM U1068, F-13009 Marseille, France
  - Institut Paoli-Calmettes, F-13009 Marseille, France
  - Aix-Marseille Université UM105, F-13009 Marseille, France
  - CNRS UMR7258, F-13009 Marseille, France
  - Correspondence: Tel.: +33-(0)4-8697-7201
2
Bomane A, Gonçalves A, Ballester PJ. Paclitaxel Response Can Be Predicted With Interpretable Multi-Variate Classifiers Exploiting DNA-Methylation and miRNA Data. Front Genet 2019; 10:1041. [PMID: 31708973 PMCID: PMC6823251 DOI: 10.3389/fgene.2019.01041] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 06/18/2019] [Accepted: 09/30/2019] [Indexed: 12/27/2022]
Abstract
To address the problem of resistance to paclitaxel treatment, we have investigated to what extent it is possible to predict Breast Cancer (BC) patient response to this drug. We carried out a large-scale tumor-based prediction analysis using data from the US National Cancer Institute’s Genomic Data Commons. These data sets comprise the responses of BC patients to paclitaxel along with six molecular profiles of their tumors. We assessed 10 Machine Learning (ML) algorithms on each of these profiles and evaluated the resulting 60 classifiers on the same BC patients. DNA methylation and miRNA profiles were the most informative overall. In combination with these two profiles, ML algorithms selecting the smallest subset of molecular features generated the most predictive classifiers: a complexity-optimized XGBoost classifier based on CpG island methylation extracted a subset of molecular factors relevant to predict paclitaxel response (AUC = 0.74). A CpG site methylation-based Decision Tree (DT) combining only 2 of the 22,941 considered CpG sites (AUC = 0.89) and a miRNA expression-based DT employing just 4 of the 337 analyzed mature miRNAs (AUC = 0.72) reveal the molecular types associated with paclitaxel-sensitive and resistant BC tumors. A literature review shows that features selected by these three classifiers have been individually linked to the cytotoxic-drug sensitivities and prognosis of BC patients. Our work leads to several molecular signatures, unearthed from the methylome and miRNome, able to anticipate to some extent which BC tumors respond or not to paclitaxel. These results may provide insights to optimize paclitaxel therapies in clinical practice.
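The AUC values quoted in this abstract can be read as the probability that a classifier scores a randomly chosen responder above a randomly chosen non-responder. The following is a minimal sketch of that rank-based definition, not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are assumptions made purely for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = responder, 0 = non-responder.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

In this toy case 8 of the 9 responder/non-responder pairs are ordered correctly; an AUC of 0.5 would correspond to random ranking and 1.0 to perfect separation.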
Affiliation(s)
- Alexandra Bomane
  - Cancer Research Center of Marseille, CRCM, INSERM, Institut Paoli-Calmettes, Aix-Marseille Univ, CNRS, Paris, France
- Anthony Gonçalves
  - Cancer Research Center of Marseille, CRCM, INSERM, Institut Paoli-Calmettes, Aix-Marseille Univ, CNRS, Paris, France
- Pedro J Ballester
  - Cancer Research Center of Marseille, CRCM, INSERM, Institut Paoli-Calmettes, Aix-Marseille Univ, CNRS, Paris, France
3
Nguyen L, Dang CC, Ballester PJ. Systematic assessment of multi-gene predictors of pan-cancer cell line sensitivity to drugs exploiting gene expression data. F1000Res 2016; 5. [PMID: 28299173 PMCID: PMC5310525 DOI: 10.12688/f1000research.10529.2] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Accepted: 03/10/2017] [Indexed: 12/19/2022]
Abstract
Background: Selected gene mutations are routinely used to guide the selection of cancer drugs for a given patient tumour. Large pharmacogenomic data sets, such as those by the Genomics of Drug Sensitivity in Cancer (GDSC) consortium, were introduced to discover more of these single-gene markers of drug sensitivity. Very recently, machine learning regression has been used to investigate how well cancer cell line sensitivity to drugs is predicted depending on the type of molecular profile. The latter has revealed that gene expression data is the most predictive profile in the pan-cancer setting. However, no study to date has exploited GDSC data to systematically compare the performance of machine learning models based on multi-gene expression data against that of widely-used single-gene markers based on genomics data.
Methods: Here we present this systematic comparison using Random Forest (RF) classifiers exploiting the expression levels of 13,321 genes and an average of 501 tested cell lines per drug. To account for time-dependent batch effects in IC50 measurements, we employ independent test sets generated with more recent GDSC data than that used to train the predictors and show that this is a more realistic validation than standard k-fold cross-validation.
Results and Discussion: Across 127 GDSC drugs, our results show that the single-gene markers unveiled by the MANOVA analysis tend to achieve higher precision than these RF-based multi-gene models, at the cost of generally having a poor recall (i.e. correctly detecting only a small part of the cell lines sensitive to the drug). Regarding overall classification performance, about two thirds of the drugs are better predicted by the multi-gene RF classifiers. Among the drugs with the most predictive of these models, we found pyrimethamine, sunitinib and 17-AAG.
Conclusions: Thanks to this unbiased validation, we now know that this type of model can predict in vitro tumour response to some of these drugs. These models can thus be further investigated on in vivo tumour models. R code to facilitate the construction of alternative machine learning models and their validation in the presented benchmark is available at http://ballester.marseille.inserm.fr/gdsc.transcriptomicDatav2.tar.gz.
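The precision/recall trade-off described in the Results can be made concrete with a small calculation. The sketch below is illustrative only: the helper `precision_recall` and all confusion-matrix counts are hypothetical and are not taken from the GDSC benchmark. It shows how a marker that flags few cell lines can be highly precise while missing most of the sensitive ones.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for one drug with 100 truly sensitive cell lines.
# Single-gene marker: flags only 10 lines, 9 of them correctly.
single_gene = precision_recall(tp=9, fp=1, fn=91)
# Multi-gene classifier: flags 100 lines, 60 of them correctly.
multi_gene = precision_recall(tp=60, fp=40, fn=40)

print(f"single-gene: precision={single_gene[0]:.2f}, recall={single_gene[1]:.2f}")
print(f"multi-gene:  precision={multi_gene[0]:.2f}, recall={multi_gene[1]:.2f}")
```

With these made-up counts the single-gene marker reaches a precision of 0.90 but a recall of only 0.09, whereas the multi-gene classifier has lower precision (0.60) and much higher recall (0.60), which is the pattern the abstract reports in general terms.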
Affiliation(s)
- Linh Nguyen
  - Cancer Research Center of Marseille, INSERM U1068, Marseille, France; Institut Paoli-Calmettes, Marseille, France; Aix-Marseille Université, Marseille, France; Cancer Research Center of Marseille UMR7258, Marseille, France
- Cuong C Dang
  - Cancer Research Center of Marseille, INSERM U1068, Marseille, France; Institut Paoli-Calmettes, Marseille, France; Aix-Marseille Université, Marseille, France; Cancer Research Center of Marseille UMR7258, Marseille, France
- Pedro J Ballester
  - Cancer Research Center of Marseille, INSERM U1068, Marseille, France; Institut Paoli-Calmettes, Marseille, France; Aix-Marseille Université, Marseille, France; Cancer Research Center of Marseille UMR7258, Marseille, France
4
Nguyen L, Dang CC, Ballester PJ. Systematic assessment of multi-gene predictors of pan-cancer cell line sensitivity to drugs exploiting gene expression data. F1000Res 2016; 5. [PMID: 28299173 DOI: 10.12688/f1000research.10529.1] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Accepted: 12/28/2016] [Indexed: 12/30/2022]
Abstract
Background: Selected gene mutations are routinely used to guide the selection of cancer drugs for a given patient tumour. Large pharmacogenomic data sets, such as those by the Genomics of Drug Sensitivity in Cancer (GDSC) consortium, were introduced to discover more of these single-gene markers of drug sensitivity. Very recently, machine learning regression has been used to investigate how well cancer cell line sensitivity to drugs is predicted depending on the type of molecular profile. The latter has revealed that gene expression data is the most predictive profile in the pan-cancer setting. However, no study to date has exploited GDSC data to systematically compare the performance of machine learning models based on multi-gene expression data against that of widely-used single-gene markers based on genomics data. Methods: Here we present this systematic comparison using Random Forest (RF) classifiers exploiting the expression levels of 13,321 genes and an average of 501 tested cell lines per drug. To account for time-dependent batch effects in IC50 measurements, we employ independent test sets generated with more recent GDSC data than that used to train the predictors and show that this is a more realistic validation than standard k-fold cross-validation. Results and Discussion: Across 127 GDSC drugs, our results show that the single-gene markers unveiled by the MANOVA analysis tend to achieve higher precision than these RF-based multi-gene models, at the cost of generally having a poor recall (i.e. correctly detecting only a small part of the cell lines sensitive to the drug). Regarding overall classification performance, about two thirds of the drugs are better predicted by the multi-gene RF classifiers. Among the drugs with the most predictive of these models, we found pyrimethamine, sunitinib and 17-AAG. Conclusions: Thanks to this unbiased validation, we now know that this type of model can predict in vitro tumour response to some of these drugs. These models can thus be further investigated on in vivo tumour models. R code to facilitate the construction of alternative machine learning models and their validation in the presented benchmark is available at http://ballester.marseille.inserm.fr/gdsc.transcriptomicDatav2.tar.gz.
Affiliation(s)
- Linh Nguyen
  - Cancer Research Center of Marseille, INSERM U1068, Marseille, France; Institut Paoli-Calmettes, Marseille, France; Aix-Marseille Université, Marseille, France; Cancer Research Center of Marseille UMR7258, Marseille, France
- Cuong C Dang
  - Cancer Research Center of Marseille, INSERM U1068, Marseille, France; Institut Paoli-Calmettes, Marseille, France; Aix-Marseille Université, Marseille, France; Cancer Research Center of Marseille UMR7258, Marseille, France
- Pedro J Ballester
  - Cancer Research Center of Marseille, INSERM U1068, Marseille, France; Institut Paoli-Calmettes, Marseille, France; Aix-Marseille Université, Marseille, France; Cancer Research Center of Marseille UMR7258, Marseille, France
5
Determinants of Genetic Diversity of Spontaneous Drug Resistance in Bacteria. Genetics 2016; 203:1369-80. [PMID: 27182949 DOI: 10.1534/genetics.115.185355] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/26/2015] [Accepted: 05/04/2016] [Indexed: 01/05/2023]
Abstract
Any pathogen population sufficiently large is expected to harbor spontaneous drug-resistant mutants, often responsible for disease relapse after antibiotic therapy. It is seldom appreciated, however, that while larger populations harbor more mutants, the abundance distribution of these mutants is expected to be markedly uneven. This is because a larger population size allows early mutants to expand for longer, exacerbating their predominance in the final mutant subpopulation. Here, we investigate the extent to which this reduction in evenness can constrain the genetic diversity of spontaneous drug resistance in bacteria. Combining theory and experiments, we show that even small variations in growth rate between resistant mutants and the wild type result in orders-of-magnitude differences in genetic diversity. Indeed, only a slight fitness advantage for the mutant is enough to keep diversity low and independent of population size. These results have important clinical implications. Genetic diversity at antibiotic resistance loci can determine a population's capacity to cope with future challenges (i.e., second-line therapy). We thus revealed an unanticipated way in which the fitness effects of antibiotic resistance can affect the evolvability of pathogens surviving a drug-induced bottleneck. This insight will assist in the fight against multidrug-resistant microbes, as well as contribute to theories aimed at predicting cancer evolution.
6
Mok T, Ladrera G, Srimuninnimit V, Sriuranpong V, Yu CJ, Thongprasert S, Sandoval-Tan J, Lee JS, Fuerte F, Shames DS, Klughammer B, Truman M, Perez-Moreno P, Wu YL. Tumor marker analyses from the phase III, placebo-controlled, FASTACT-2 study of intercalated erlotinib with gemcitabine/platinum in the first-line treatment of advanced non-small-cell lung cancer. Lung Cancer 2016; 98:1-8. [PMID: 27393499 DOI: 10.1016/j.lungcan.2016.04.023] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 11/30/2015] [Revised: 04/20/2016] [Accepted: 04/30/2016] [Indexed: 02/06/2023]
Abstract
OBJECTIVES: The FASTACT-2 study of intercalated erlotinib with chemotherapy in Asian patients found that EGFR mutations were the main driver behind the significant progression-free survival (PFS) benefit noted in the overall population. Further exploratory biomarker analyses were conducted to provide additional insight. MATERIALS AND METHODS: This multicenter, randomized, placebo-controlled, double-blind, phase III study investigated intercalated first-line erlotinib or placebo with gemcitabine/platinum, followed by maintenance erlotinib or placebo, for patients with stage IIIB/IV non-small cell lung cancer (NSCLC). Provision of samples for biomarker analysis was encouraged but not mandatory. The following biomarkers were analyzed (in order of priority): EGFR mutation by cobas® test, KRAS mutation by cobas® KRAS test, HER2 by immunohistochemistry (IHC), HER3 by IHC, ERCC1 by IHC, EGFR gene copy number by fluorescence in-situ hybridization (FISH) and EGFR by IHC. All subgroups were assessed for PFS (primary endpoint), overall survival (OS), non-progression rate and objective response rate. RESULTS: Overall, 256 patients provided samples for analysis. Considerable overlap was noted among biomarkers, except for EGFR and KRAS mutations, which are mutually exclusive. Other than EGFR mutations (p<0.0001), no other biomarkers were significantly predictive of outcomes in a treatment-by-biomarker interaction test, although ERCC1 IHC-positive status was predictive of improved OS for the erlotinib arm versus placebo in EGFR wild-type patients (median 18.4 vs 9.5 months; hazard ratio [HR]=0.32, 95% confidence interval [CI]: 0.14-0.69, p=0.0024). CONCLUSION: Activating EGFR mutations were predictive for improved treatment outcomes with a first-line intercalated regimen of chemotherapy and erlotinib in NSCLC. ERCC1 status may have some predictive value in EGFR wild-type disease, but requires further investigation.
Affiliation(s)
- Tony Mok
  - State Key Laboratory of Southern China, Department of Clinical Oncology, Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China
- Guia Ladrera
  - Lung Center of the Philippines, Quezon Ave., Diliman, Quezon City, Metro Manila, Philippines
- Virote Sriuranpong
  - King Chulalongkorn Memorial Hospital, 1873 Rama 4 Road, Bangkok, Pathumwan 10330, Thailand
- Chong-Jen Yu
  - National Taiwan University Hospital, No. 7, Zhongshan S Rd., Zhongzheng District, Taipei, Taiwan
- Jennifer Sandoval-Tan
  - Philippine General Hospital, Taft Avenue Ermita, Brgy 670 Zone 72, Manila, 1000 Metro Manila, Philippines
- Jin Soo Lee
  - National Cancer Center, Goyang, Republic of Korea
- Fatima Fuerte
  - Rizal Medical Center, Pasig Blvd, Pasig, Metro Manila, Philippines
- David S Shames
  - Oncology Biomarker Development, Genentech Inc., South San Francisco, CA 94080, United States
- Matt Truman
  - F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Yi-Long Wu
  - Guangdong Lung Cancer Institute, Guangdong General Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
7
Kohli M, Young CY, Tindall DJ, Nandy D, McKenzie KM, Bevan GH, Donkena KV. Whole blood defensin mRNA expression is a predictive biomarker of docetaxel response in castration-resistant prostate cancer. Onco Targets Ther 2015; 8:1915-22. [PMID: 26261420 PMCID: PMC4527520 DOI: 10.2147/ott.s86637] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/17/2022]
Abstract
This study tested the potential of circulating RNA-based signals as predictive biomarkers for docetaxel response in patients with metastatic castration-resistant prostate cancer (CRPC). RNA was analyzed in blood from six CRPC patients by whole-transcriptome sequencing (total RNA-sequencing) before and after docetaxel treatment using Illumina’s HiSeq platform. Targeted RNA capture and sequencing was performed in an independent cohort of ten patients with CRPC matching the discovery cohort to confirm differential expression of the genes. Response to docetaxel was defined on the basis of prostate-specific antigen levels and imaging criteria. Two-way analysis of variance was used to compare differential gene expression in patients classified as responders versus nonresponders before and after docetaxel treatment. Thirty-four genes with two-fold differentially expressed transcripts in responders versus nonresponders were selected from total RNA-sequencing for further validation. Targeted RNA capture and sequencing showed that 13/34 genes were differentially expressed in responders. Alpha defensin genes DEFA1, DEFA1B, and DEFA3 exhibited significantly higher expression in responder patients compared with nonresponder patients before administration of chemotherapy (fold change >2.5). In addition, docetaxel treatment significantly increased transcript levels of these defensin genes in responders (fold change >2.8). Our results reveal that patients with higher defensin RNA transcripts in blood respond well to docetaxel therapy. We suggest that monitoring DEFA1, DEFA1B, and DEFA3 RNA transcripts in blood prior to treatment will be helpful to determine which patients are better candidates to receive docetaxel chemotherapy.
Affiliation(s)
- Manish Kohli
  - Department of Oncology, Mayo Clinic, Rochester, MN, USA
- Kyle M McKenzie
  - Department of Geriatric Medicine, Mayo Clinic, Rochester, MN, USA
- Graham H Bevan
  - University of Rochester Medical Center, Rochester, NY, USA
8
Restoration of CBX7 expression increases the susceptibility of human lung carcinoma cells to irinotecan treatment. Naunyn-Schmiedeberg's Archives of Pharmacology 2015. [PMID: 26216446 DOI: 10.1007/s00210-015-1153-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Indexed: 10/23/2022]
Abstract
Lung cancer is one of the most common causes of cancer-related death worldwide in men and women, and, despite recent remarkable scientific advances, drug treatment is still unsatisfactory. Polycomb protein chromobox homolog 7 (CBX7) is involved in several biological processes, including development and cancer progression; indeed, the lack of CBX7 protein correlates with a highly malignant phenotype and a poor prognosis. However, its role in lung cancer remains unknown. Since CBX7 is drastically downregulated in human lung carcinomas, we investigated whether restoration of CBX7 expression could affect the growth properties of lung cancer cells and modulate their sensitivity to treatment with irinotecan and etoposide, two of the chemotherapy drugs most commonly used in lung cancer therapy. Here, we demonstrate that restoration of CBX7 in two human lung carcinoma cell lines (A549 and H1299), in which this protein is not detectable, leads to decreased proliferation (at least in part through a downregulation of phosphorylated ERK and phosphorylated p38) and increased apoptotic cell death after drug exposure (at least in part through the downregulation of Bcl-2, phosphorylated Akt, and phosphorylated JNK). Taken together, these results suggest that the retention of CBX7 expression may play a role in modulating the chemosensitivity of lung cancer patients to treatment with irinotecan and etoposide.
9
Serpin b3 is associated with poor survival after chemotherapy and is a potential novel predictive biomarker in advanced non-small-cell lung cancer. J Thorac Oncol 2015; 8:1502-9. [PMID: 24389432 DOI: 10.1097/jto.0000000000000016] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 01/09/2023]
Abstract
INTRODUCTION: In a previous biomarker discovery project using gene-expression profiling we identified Serpin B3 (SB3) as a predictor of resistance to platinum doublet chemotherapy (PtC) in non-small-cell lung cancer (NSCLC). This independent prospective study was designed to confirm the predictive utility of SB3. METHODS: SB3 immunohistochemistry was scored by previously validated criteria (score 0 = negative, score 1 = 1%-10% tumor cells positive, score 2 = 11%-50% tumor cells positive, and score 3 = >50% tumor cells positive) in 197 patients with stage IV NSCLC treated with PtC. This provided 80% power to detect a median survival increase from 150 days in patients with an SB3 immunohistochemistry score of 2 or more to 300 days in those with an SB3 score of 0 or 1. RESULTS: Thirty-six percent of NSCLCs stained positive for SB3. Median survival for SB3 negative/score 0 was 332 days, SB3 positive/score 1 was 268 days, and SB3 positive/score 2 or 3 was 120 days (p = 0.004). Cox proportional hazards analysis demonstrated that SB3 positivity is an independent predictor of survival (hazard ratio = 1.87; 95% confidence interval, 1.29-2.71; p = 0.001). The disease control rate was 65% for SB3 scores of 0 or 1 and 20% for scores of 2 or more (p = 0.002), with median survival of 306 days (score 0 or 1) versus 120 days (score ≥ 2; hazard ratio = 1.71; 95% confidence interval, 1.14-3.10; p = 0.002). CONCLUSIONS: An SB3-positive immunohistochemistry score of 2 or more (>10% tumor cells positive) identifies a subgroup of patients with stage IV NSCLC who have poor survival (median 120 days) when treated with PtC, similar to that estimated for untreated or chemo-refractory stage IV NSCLC. Further prospective qualification using biospecimens from randomized studies is needed, but SB3 seems to be a useful biomarker that identifies a highly resistant subgroup in whom PtC should be avoided.