1. Sun S, Sechidis K, Chen Y, Lu J, Ma C, Mirshani A, Ohlssen D, Vandemeulebroecke M, Bornkamp B. Comparing algorithms for characterizing treatment effect heterogeneity in randomized trials. Biom J 2024; 66:e2100337. [PMID: 36437036 DOI: 10.1002/bimj.202100337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2021] [Revised: 10/04/2022] [Accepted: 10/16/2022] [Indexed: 11/29/2022]
Abstract
The identification and estimation of heterogeneous treatment effects in biomedical clinical trials are challenging, because trials are typically planned to assess the treatment effect in the overall trial population. Nevertheless, the identification of how the treatment effect may vary across subgroups is of major importance for drug development. In this work, we review some existing simulation work and perform a simulation study to evaluate recent methods for identifying and estimating heterogeneous treatment effects using various metrics and scenarios relevant for drug development. Our focus is not only on a comparison of the methods in general, but on how well these methods perform in simulation scenarios that reflect real clinical trials. We provide the R package benchtm that can be used to simulate synthetic biomarker distributions based on real clinical trial data and to create interpretable scenarios to benchmark methods for identification and estimation of treatment effect heterogeneity.
Affiliation(s)
- Sophie Sun
  - Advanced Methodology and Data Science, Novartis Pharmaceuticals Corporation, East Hanover, New Jersey, USA
- Yao Chen
  - Advanced Methodology and Data Science, Novartis Pharmaceuticals Corporation, East Hanover, New Jersey, USA
- Jiarui Lu
  - Advanced Methodology and Data Science, Novartis Pharmaceuticals Corporation, East Hanover, New Jersey, USA
- Chong Ma
  - Early Development Analytics, Novartis Pharmaceuticals Corporation, Cambridge, Massachusetts, USA
- Ardalan Mirshani
  - Advanced Methodology and Data Science, Novartis Pharmaceuticals Corporation, East Hanover, New Jersey, USA
- David Ohlssen
  - Advanced Methodology and Data Science, Novartis Pharmaceuticals Corporation, East Hanover, New Jersey, USA
- Björn Bornkamp
  - Advanced Methodology and Data Science, Novartis Pharma AG, Basel, Switzerland
2. Mannion E, Ritz C, Ferrario PG. Post hoc subgroup analysis and identification-learning more from existing data. Eur J Clin Nutr 2023:10.1038/s41430-023-01297-5. [PMID: 37311869 DOI: 10.1038/s41430-023-01297-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 05/19/2023] [Accepted: 05/31/2023] [Indexed: 06/15/2023]
Affiliation(s)
- Elizabeth Mannion
  - National Institute of Public Health, University of Southern Denmark, Copenhagen, Denmark
- Christian Ritz
  - National Institute of Public Health, University of Southern Denmark, Copenhagen, Denmark
- Paola G Ferrario
  - Institut für Physiologie und Biochemie der Ernährung, Max Rubner-Institut, Karlsruhe, Germany
3. Cai H, Lu W, Marceau West R, Mehrotra DV, Huang L. CAPITAL: Optimal subgroup identification via constrained policy tree search. Stat Med 2022; 41:4227-4244. [PMID: 35799329 PMCID: PMC9544117 DOI: 10.1002/sim.9507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2021] [Revised: 05/04/2022] [Accepted: 06/06/2022] [Indexed: 11/10/2022]
Abstract
Personalized medicine, a paradigm of medicine tailored to a patient's characteristics, is an increasingly attractive field in health care. An important goal of personalized medicine is to identify a subgroup of patients, based on baseline covariates, that benefits more from the targeted treatment than other comparative treatments. Most of the current subgroup identification methods only focus on obtaining a subgroup with an enhanced treatment effect without paying attention to subgroup size. Yet, a clinically meaningful subgroup learning approach should identify the maximum number of patients who can benefit from the better treatment. In this article, we present an optimal subgroup selection rule (SSR) that maximizes the number of selected patients, and in the meantime, achieves the pre‐specified clinically meaningful mean outcome, such as the average treatment effect. We derive two equivalent theoretical forms of the optimal SSR based on the contrast function that describes the treatment‐covariates interaction in the outcome. We further propose a constrained policy tree search algorithm (CAPITAL) to find the optimal SSR within the interpretable decision tree class. The proposed method is flexible to handle multiple constraints that penalize the inclusion of patients with negative treatment effects, and to address time to event data using the restricted mean survival time as the clinically interesting mean outcome. Extensive simulations, comparison studies, and real data applications are conducted to demonstrate the validity and utility of our method.
Affiliation(s)
- Hengrui Cai
  - Department of Statistics, University of California Irvine, Irvine, California, USA
- Wenbin Lu
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
- Rachel Marceau West
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., North Wales, Pennsylvania, USA
- Devan V Mehrotra
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., North Wales, Pennsylvania, USA
- Lingkang Huang
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA
4. Li L, Levine RA, Fan J. Causal Effect Random Forest of Interaction Trees for Learning Individualized Treatment Regimes with Multiple Treatments in Observational Studies. Stat (Int Stat Inst) 2022. [DOI: 10.1002/sta4.457] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/08/2022]
Affiliation(s)
- Luo Li
  - Computational Science Research Center, San Diego State University, California, USA
- Richard A. Levine
  - Department of Mathematics and Statistics, San Diego State University, California, USA
  - Analytics Studies and Institutional Research, San Diego State University, California, USA
- Juanjuan Fan
  - Department of Mathematics and Statistics, San Diego State University, California, USA
5. Yuan A, Wang L, Tan MT. Set-regression with applications to subgroup analysis. Stat Med 2021; 41:180-193. [PMID: 34672000 DOI: 10.1002/sim.9229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2021] [Revised: 09/30/2021] [Accepted: 09/30/2021] [Indexed: 11/10/2022]
Abstract
Regression is a commonly used statistical model. It is the conditional mean of the response given covariates, μ(x) = E(Y | X = x). However, in some practical problems, the interest is the conditional mean of the response given the covariates belonging to some set A. Notably, in precision medicine and subgroup analysis in clinical trials, the aim is to identify subjects who benefit the most from the treatment, or identify an optimal set in the covariate space which manifests treatment favoritism if a subject's covariates fall in this set and the subject is classified to the favorable treatment subgroup. Existing methods for subgroup analysis achieve this indirectly by using classical regression. This motivates us to develop a new type of regression: set-regression, defined as μ(A) = E(Y | X ∈ A), which directly addresses the subgroup analysis problem. This extends not only the classical regression model but also improves recursive partitioning and support vector machine approaches, and is particularly suitable for objectives involving optimization of the regression over sets, such as subgroup analysis. We show that the new versatile set-regression identifies the subgroup with increased accuracy. It is easy to use. Simulation studies also show superior performance of the proposed method in finite samples.
Affiliation(s)
- Ao Yuan
  - Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, District of Columbia, USA
- Lida Wang
  - Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, District of Columbia, USA
- Ming T Tan
  - Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, District of Columbia, USA
6. Bunouf P, Groc M, Dmitrienko A, Lipkovich I. Data-Driven Subgroup Identification in Confirmatory Clinical Trials. Ther Innov Regul Sci 2021; 56:65-75. [PMID: 34327673 DOI: 10.1007/s43441-021-00329-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2021] [Accepted: 07/22/2021] [Indexed: 11/29/2022]
Abstract
Data-driven subgroup analysis plays an important role in clinical trials. This paper focuses on practical considerations in post-hoc subgroup investigations in the context of confirmatory clinical trials. The analysis is aimed at assessing the heterogeneity of treatment effects across the trial population and identifying patient subgroups with enhanced treatment benefit. The subgroups are defined using baseline patient characteristics, including demographic and clinical factors. Much progress has been made in the development of reliable statistical methods for subgroup investigation, including methods based on global models and recursive partitioning. The paper provides a review of principled approaches to data-driven subgroup identification and illustrates subgroup analysis strategies using a family of recursive partitioning methods known as the SIDES (subgroup identification based on differential effect search) methods. These methods are applied to a Phase III trial in patients with metastatic colorectal cancer. The paper discusses key considerations in subgroup exploration, including the role of covariate adjustment, subgroup analysis at early decision points and interpretation of subgroup search results in trials with a positive overall effect.
7. Imai K, Li ML. Experimental Evaluation of Individualized Treatment Rules. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2021.1923511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Kosuke Imai
  - Department of Government and Department of Statistics, Harvard University, Cambridge, MA
- Michael Lingzhi Li
  - Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA
8. Brnabic A, Hess LM. Systematic literature review of machine learning methods used in the analysis of real-world data for patient-provider decision making. BMC Med Inform Decis Mak 2021; 21:54. [PMID: 33588830 PMCID: PMC7885605 DOI: 10.1186/s12911-021-01403-2] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 07/07/2020] [Accepted: 01/20/2021] [Indexed: 12/20/2022]
Abstract
BACKGROUND: Machine learning is a broad term encompassing a number of methods that allow the investigator to learn from the data. These methods may permit large real-world databases to be more rapidly translated to applications to inform patient-provider decision making.
METHODS: This systematic literature review was conducted to identify published observational research that employed machine learning to inform decision making at the patient-provider level. The search strategy was implemented and studies meeting eligibility criteria were evaluated by two independent reviewers. Relevant data related to study design, statistical methods and strengths and limitations were identified; study quality was assessed using a modified version of the Luo checklist.
RESULTS: A total of 34 publications from January 2014 to September 2020 were identified and evaluated for this review. There were diverse methods, statistical packages and approaches used across identified studies. The most common methods included decision tree and random forest approaches. Most studies applied internal validation but only two conducted external validation. Most studies utilized one algorithm, and only eight studies applied multiple machine learning algorithms to the data. Seven items on the Luo checklist failed to be met by more than 50% of published studies.
CONCLUSIONS: A wide variety of approaches, algorithms, statistical software, and validation strategies were employed in the application of machine learning methods to inform patient-provider decision making. There is a need to ensure that multiple machine learning approaches are used, the model selection strategy is clearly defined, and both internal and external validation are necessary to be sure that decisions for patient care are being made with the highest quality evidence. Future work should routinely employ ensemble methods incorporating multiple machine learning algorithms.
Affiliation(s)
- Lisa M Hess
  - Eli Lilly and Company, Indianapolis, IN, USA
9. Zhang P, Ma J, Chen X, Shentu Y. A nonparametric method for value function guided subgroup identification via gradient tree boosting for censored survival data. Stat Med 2020; 39:4133-4146. [PMID: 32786155 DOI: 10.1002/sim.8714] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/09/2020] [Revised: 06/08/2020] [Accepted: 07/09/2020] [Indexed: 11/07/2022]
Abstract
In randomized clinical trials with survival outcome, there has been an increasing interest in subgroup identification based on baseline genomic, proteomic markers, or clinical characteristics. Some of the existing methods identify subgroups that benefit substantially from the experimental treatment by directly modeling outcomes or treatment effect. When the goal is to find an optimal treatment for a given patient rather than finding the right patient for a given treatment, methods under the individualized treatment regime framework estimate an individualized treatment rule that would lead to the best expected clinical outcome as measured by a value function. Connecting the concept of value function to subgroup identification, we propose a nonparametric method that searches for subgroup membership scores by maximizing a value function that directly reflects the subgroup-treatment interaction effect based on restricted mean survival time. A gradient tree boosting algorithm is proposed to search for the individual subgroup membership scores. We conduct simulation studies to evaluate the performance of the proposed method and an application to an AIDS clinical trial is performed for illustration.
Affiliation(s)
- Pingye Zhang
  - Biostatistics and Research Decision Sciences, MRL, Merck & Co., Inc., Rahway, New Jersey, USA
- Junshui Ma
  - Biostatistics and Research Decision Sciences, MRL, Merck & Co., Inc., Rahway, New Jersey, USA
- Xinqun Chen
  - Biostatistics and Research Decision Sciences, MRL, Merck & Co., Inc., Rahway, New Jersey, USA
- Yue Shentu
  - Biostatistics and Research Decision Sciences, MRL, Merck & Co., Inc., Rahway, New Jersey, USA
10. Nguyen CT, Luckett DJ, Kahkoska AR, Shearrer GE, Spruijt-Metz D, Davis JN, Kosorok MR. Estimating individualized treatment regimes from crossover designs. Biometrics 2020; 76:778-788. [PMID: 31743424 PMCID: PMC7234899 DOI: 10.1111/biom.13186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2018] [Revised: 10/03/2019] [Accepted: 10/29/2019] [Indexed: 11/27/2022]
Abstract
The field of precision medicine aims to tailor treatment based on patient-specific factors in a reproducible way. To this end, estimating an optimal individualized treatment regime (ITR) that recommends treatment decisions based on patient characteristics to maximize the mean of a prespecified outcome is of particular interest. Several methods have been proposed for estimating an optimal ITR from clinical trial data in the parallel group setting where each subject is randomized to a single intervention. However, little work has been done in the area of estimating the optimal ITR from crossover study designs. Such designs naturally lend themselves to precision medicine since they allow for observing the response to multiple treatments for each patient. In this paper, we introduce a method for estimating the optimal ITR using data from a 2 × 2 crossover study with or without carryover effects. The proposed method is similar to policy search methods such as outcome weighted learning; however, we take advantage of the crossover design by using the difference in responses under each treatment as the observed reward. We establish Fisher and global consistency, present numerical experiments, and analyze data from a feeding trial to demonstrate the improved performance of the proposed method compared to standard methods for a parallel study design.
Affiliation(s)
- Crystal T. Nguyen
  - Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, U.S.A
- Daniel J. Luckett
  - Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, U.S.A
- Anna R. Kahkoska
  - Department of Nutrition, University of North Carolina, Chapel Hill, North Carolina, U.S.A
- Grace E. Shearrer
  - Department of Nutrition, University of North Carolina, Chapel Hill, North Carolina, U.S.A
- Donna Spruijt-Metz
  - Center of Economic and Social Research, University of Southern California, Los Angeles, California, U.S.A
- Jaimie N. Davis
  - Department of Nutrition, University of Texas at Austin, Austin, Texas, U.S.A
- Michael R. Kosorok
  - Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, U.S.A
11. Chen Y, Chirikov VV, Marston XL, Yang J, Qiu H, Xie J, Sun N, Gu C, Dong P, Gao X. Machine Learning for Precision Health Economics and Outcomes Research (P-HEOR): Conceptual Review of Applications and Next Steps. J Health Econ Outcomes Res 2020; 7:35-42. [PMID: 32685596 PMCID: PMC7299485 DOI: 10.36469/jheor.2020.12698] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/07/2019] [Revised: 04/06/2020] [Accepted: 04/13/2020] [Indexed: 05/15/2023]
Abstract
Precision health economics and outcomes research (P-HEOR) integrates economic and clinical value assessment by explicitly discovering distinct clinical and health care utilization phenotypes among patients. Through a conceptualized example, the objective of this review is to highlight the capabilities and limitations of machine learning (ML) applications to P-HEOR and to contextualize the potential opportunities and challenges for the wide adoption of ML for health economics. We outline a P-HEOR conceptual framework extending the ML methodology to comparatively assess the economic value of treatment regimens. Latest methodology developments on bias and confounding control in ML applications to precision medicine are also summarized.
Affiliation(s)
- Yixi Chen
  - Pfizer Investment Co. Ltd., Beijing, China
- Viktor V. Chirikov
  - Real World Evidence, Pharmerit International, Bethesda, Maryland, United States
- Xiaocong L. Marston
  - Real World Evidence, Pharmerit International, Bethesda, Maryland, United States
  - Pharmerit (Shanghai) Company Limited, Shanghai, China
- Haibo Qiu
  - Zhongda Hospital, Southeast University, Nanjing, China
- Jianfeng Xie
  - Zhongda Hospital, Southeast University, Nanjing, China
- Ning Sun
  - Easy Visible Sky Tree Technology (Beijing) Co., Ltd., Beijing, China
- Chengming Gu
  - Sanofi (China) Investment Co. Ltd., Beijing, China
- Peng Dong
  - Pfizer Investment Co. Ltd., Beijing, China
- Xin Gao
  - Real World Evidence, Pharmerit International, Bethesda, Maryland, United States
  - Pharmerit (Shanghai) Company Limited, Shanghai, China
12. Siriwardhana C, Kulasekera KB, Datta S. Personalized treatment selection using data from crossover designs with carry-over effects. Stat Med 2019; 38:5391-5412. [PMID: 31637762 DOI: 10.1002/sim.8372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2018] [Revised: 05/29/2019] [Accepted: 08/24/2019] [Indexed: 11/07/2022]
Abstract
In this work, we propose a semiparametric method for estimating the optimal treatment for a given patient based on individual covariate information for that patient when data from a crossover design are available. Here, we assume there are carry-over effects for patients switching from one treatment to another. For the K treatment (K ≥ 2) scenario, we show that nonparametric estimation of carry-over effects can have the undesirable property that comparison of treatment means can only be done using independent outcome measurements from different groups of patients rather than using available joint measurements for each patient. To overcome this barrier, we compare probabilities of outcome variable of each treatment dominating outcome variables for all other treatments conditional on patient-specific scores constructed from patient covariates. We suggest single-index models as appropriate models connecting outcome variables to covariates and our empirical investigations show that frequencies of correct treatment assignments are highly accurate. The proposed method is also rather robust against departures from a single-index model structure. We also conduct a real data analysis to show the applicability of the proposed procedure.
Affiliation(s)
- Chathura Siriwardhana
  - Department of Quantitative Health Sciences, University of Hawaii John A. Burns School of Medicine, Honolulu, Hawaii
- K B Kulasekera
  - Department of Bioinformatics & Biostatistics, University of Louisville, Louisville, Kentucky
- Somnath Datta
  - Department of Biostatistics, University of Florida, Gainesville, Florida
13. Siriwardhana C, Datta S, Kulasekera KB. Selection of the optimal personalized treatment from multiple treatments with multivariate outcome measures. J Biopharm Stat 2019; 30:462-480. [PMID: 31691633 DOI: 10.1080/10543406.2019.1684304] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/25/2022]
Abstract
In this work, we propose a novel method for individualized treatment selection when the treatment response is multivariate. For the K treatment (K ≥ 2) scenario, we compare quantities that are suitable indexes based on outcome variables for each treatment conditional on patient-specific scores constructed from collected covariate measurements. Our method covers any number of treatments and outcome variables, and it can be applied for a broad set of models. The proposed method uses a rank aggregation technique to estimate an ordering of treatments based on ranked lists of treatment performance measures such as smooth conditional means and conditional probability of a response for one treatment dominating others. The method has the flexibility to incorporate patient and clinician preferences to the optimal treatment decision on an individual case basis. A simulation study demonstrates the performance of the proposed method in finite samples. We also present data analyses using HIV and diabetes clinical trials data to show the applicability of the proposed procedure for real data.
Affiliation(s)
- Chathura Siriwardhana
  - Department of Quantitative Health Sciences, University of Hawaii John A. Burns School of Medicine, Honolulu, HI, USA
- Somnath Datta
  - Department of Biostatistics, University of Florida, Gainesville, FL, USA
- K B Kulasekera
  - Department of Bioinformatics & Biostatistics, University of Louisville, Louisville, KY, USA
14. Sysoev O, Bartoszek K, Ekström EC, Ekholm Selling K. PSICA: Decision trees for probabilistic subgroup identification with categorical treatments. Stat Med 2019; 38:4436-4452. [PMID: 31246349 PMCID: PMC6771862 DOI: 10.1002/sim.8308] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/27/2018] [Revised: 06/11/2019] [Accepted: 06/11/2019] [Indexed: 11/09/2022]
Abstract
Personalized medicine aims at identifying best treatments for a patient with given characteristics. It has been shown in the literature that these methods can lead to great improvements in medicine compared to traditional methods prescribing the same treatment to all patients. Subgroup identification is a branch of personalized medicine, which aims at finding subgroups of the patients with similar characteristics for which some of the investigated treatments have a better effect than the other treatments. A number of approaches based on decision trees have been proposed to identify such subgroups, but most of them focus on two-arm trials (control/treatment) while a few methods consider quantitative treatments (defined by the dose). However, no subgroup identification method exists that can predict the best treatments in a scenario with a categorical set of treatments. We propose a novel method for subgroup identification in categorical treatment scenarios. This method outputs a decision tree showing the probabilities of a given treatment being the best for a given group of patients as well as labels showing the possible best treatments. The method is implemented in an R package psica available on CRAN. In addition to a simulation study, we present an analysis of a community-based nutrition intervention trial that justifies the validity of our method.
Affiliation(s)
- Oleg Sysoev
  - Department of Computer and Information Science, Linköping University, Linköping, Sweden
- Krzysztof Bartoszek
  - Department of Computer and Information Science, Linköping University, Linköping, Sweden
- Eva-Charlotte Ekström
  - Department of Women's and Children's Health, Uppsala University, Akademiska Sjukhuset, Uppsala, Sweden
- Katarina Ekholm Selling
  - Department of Women's and Children's Health, Uppsala University, Akademiska Sjukhuset, Uppsala, Sweden
15. Qiu X, Wang Y. Composite interaction tree for simultaneous learning of optimal individualized treatment rules and subgroups. Stat Med 2019; 38:2632-2651. [PMID: 30891797 PMCID: PMC8548070 DOI: 10.1002/sim.8105] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 06/10/2018] [Revised: 10/25/2018] [Accepted: 01/02/2019] [Indexed: 11/12/2022]
Abstract
Treatment response heterogeneity has long been observed in patients affected by chronic diseases. Administering an individualized treatment rule (ITR) offers an opportunity to tailor treatment strategies according to patient-specific characteristics. Overly complex machine learning methods for estimating ITRs may produce treatment rules that have higher benefit but lack transparency and interpretability. In clinical practice, it is desirable to derive a simple and interpretable ITR while maintaining certain optimality that leads to improved benefit in subgroups of patients, if not on the overall sample. In this work, we propose a tree-based robust learning method to estimate optimal piecewise linear ITRs and identify subgroups of patients with a large benefit. We achieve these goals by simultaneously identifying qualitative and quantitative interactions through a tree model, referred to as the composite interaction tree (CITree). We show that it has improved performance compared to existing methods on both the overall sample and subgroups via extensive simulation studies. Lastly, we fit CITree to the Research Evaluating the Value of Augmenting Medication with Psychotherapy trial for treating patients with major depressive disorders, where we identified both qualitative and quantitative interactions and subgroups of patients with a large benefit.
Affiliation(s)
- Xin Qiu
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York
- Yuanjia Wang
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York
  - Department of Psychiatry, Columbia University Medical Center, New York, New York
16.
Abstract
Current guidelines for treatment decision making largely rely on data from randomized controlled trials (RCTs) studying average treatment effects. They may be inadequate to make individualized treatment decisions in real-world settings. Large-scale electronic health records (EHR) provide opportunities to fulfill the goals of personalized medicine and learn individualized treatment rules (ITRs) depending on patient-specific characteristics from real-world patient data. In this work, we tackle challenges with EHRs and propose a machine learning approach based on matching (M-learning) to estimate optimal ITRs from EHRs. This new learning method performs matching instead of inverse probability weighting as commonly used in many existing methods for estimating ITRs to more accurately assess individuals' treatment responses to alternative treatments and alleviate confounding. Matching-based value functions are proposed to compare matched pairs under a unified framework, where various types of outcomes for measuring treatment response (including continuous, ordinal, and discrete outcomes) can easily be accommodated. We establish the Fisher consistency and convergence rate of M-learning. Through extensive simulation studies, we show that M-learning outperforms existing methods when propensity scores are misspecified or when unmeasured confounders are present in certain scenarios. Lastly, we apply M-learning to estimate optimal personalized second-line treatments for type 2 diabetes patients to achieve better glycemic control or reduce major complications using EHRs from New York Presbyterian Hospital.
Affiliation(s)
- Peng Wu
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032;
| | - Donglin Zeng
- Department of Biostatistics, University of North Carolina at Chapel Hill.
| | - Yuanjia Wang
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032
| |
|
17
|
Ritz C, Astrup A, Larsen TM, Hjorth MF. Weight loss at your fingertips: personalized nutrition with fasting glucose and insulin using a novel statistical approach. Eur J Clin Nutr 2019; 73:1529-1535. [DOI: 10.1038/s41430-019-0423-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 02/28/2019] [Revised: 03/26/2019] [Accepted: 03/26/2019] [Indexed: 01/09/2023]
|
18
|
Wang J, Li J, Li Y, Wong WK. A model-based multithreshold method for subgroup identification. Stat Med 2019; 38:2605-2631. [PMID: 30887552 DOI: 10.1002/sim.8136] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/09/2018] [Revised: 01/29/2019] [Accepted: 02/11/2019] [Indexed: 11/07/2022]
Abstract
The thresholding variable plays a crucial role in subgroup identification for personalized medicine. Most existing partitioning methods split the sample based on one predictor variable. In this paper, we consider setting the splitting rule from a combination of multivariate predictors, such as latent factors, principal components, and weighted sums of predictors. Such a subgrouping method may lead to more meaningful partitioning of the population than using a single variable. In addition, our method is based on a change point regression model and thus yields straightforward model-based prediction results. After choosing a particular thresholding variable form, we apply a two-stage multiple change point detection method to determine the subgroups and estimate the regression parameters. We show that our approach can produce two or more subgroups from the multiple change points and identify the true grouping with high probability. In addition, our estimation results enjoy oracle properties. We design a simulation study to compare the performance of our proposed method with existing methods and apply them to analyze data sets from a Scleroderma trial and a breast cancer study.
Affiliation(s)
- Jingli Wang
- Department of Statistics and Applied Probability, National University of Singapore, Singapore
| | - Jialiang Li
- Department of Statistics and Applied Probability, National University of Singapore, Singapore.,Duke University-NUS Graduate Medical School, Singapore.,Singapore Eye Research Institute, Singapore
| | - Yaguang Li
- University of Science and Technology of China, Hefei, China
| | - Weng Kee Wong
- Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles, Los Angeles, California
| |
|
19
|
Abstract
Precision medicine, in the sense of tailoring the choice of medical treatment to patients' pretreatment characteristics, is nowadays gaining a lot of attention. Preferably, this tailoring should be realized in an evidence-based way, with key evidence in this regard pertaining to subgroups of patients that respond differentially to treatment (i.e., to subgroups involved in treatment-subgroup interactions). Often a-priori hypotheses on subgroups involved in treatment-subgroup interactions are lacking or are incomplete at best. Therefore, methods are needed that can induce such subgroups from empirical data on treatment effectiveness in a post hoc manner. Recently, quite a few such methods have been developed. So far, however, there is little empirical experience in their usage. This may be problematic for medical statisticians and statistically minded medical researchers, as many (nontrivial) choices have to be made during the data-analytic process. The main purpose of this paper is to discuss the major concepts and considerations when using these methods. This discussion will be based on a systematic, conceptual, and technical analysis of the type of research questions at play, and of the type of data that the methods can handle along with the available software, and a review of available empirical evidence. We will illustrate all this with the analysis of a dataset comparing several anti-depressant treatments.
Affiliation(s)
- Aniek Sies
- a Faculty of Psychology and Educational Sciences , KU Leuven , Leuven , Belgium
| | | | - Iven Van Mechelen
- a Faculty of Psychology and Educational Sciences , KU Leuven , Leuven , Belgium
| |
|
20
|
Alemayehu D, Chen Y, Markatou M. A comparative study of subgroup identification methods for differential treatment effect: Performance metrics and recommendations. Stat Methods Med Res 2018; 27:3658-3678. [DOI: 10.1177/0962280217710570] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 12/23/2023]
Abstract
Subgroup identification with differential treatment effects serves as an important step towards precision medicine, as it provides evidence regarding how individuals with specific characteristics respond to a given treatment. This knowledge not only supports the tailoring of treatment strategies but also prompts the development of new treatments. This manuscript provides a brief overview of the issues associated with the methodologies aimed at identifying subgroups with differential treatment effects, and studies in depth the operational characteristics of five data-driven methods that have appeared recently in the literature. The performance of the methods under study to identify correctly the covariates affecting treatment effects is evaluated via simulation and under various conditions. Two clinical trial data sets are also used to illustrate the application of these methods. Discussion and recommendations pertaining to the use of these methods are provided, with emphasis on the relative performance of the methods under the conditions studied.
Affiliation(s)
| | - Yang Chen
- Department of Biostatistics, School of Public Health & Health Professions, SUNY Buffalo, NY, USA
| | - Marianthi Markatou
- Department of Biostatistics, School of Public Health & Health Professions, SUNY Buffalo, NY, USA
| |
|
21
|
Liang M, Ye T, Fu H. Estimating individualized optimal combination therapies through outcome weighted deep learning algorithms. Stat Med 2018; 37:3869-3886. [PMID: 30014497 DOI: 10.1002/sim.7902] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/08/2017] [Revised: 03/17/2018] [Accepted: 06/16/2018] [Indexed: 11/10/2022]
Abstract
With the advancement in drug development, multiple treatments are available for a single disease. Patients can often benefit from taking multiple treatments simultaneously. For example, patients in Clinical Practice Research Datalink with chronic diseases such as type 2 diabetes can receive multiple treatments simultaneously. Therefore, it is important to estimate which combination therapy benefits each patient the most. However, recommending the best treatment combination is not a single-label but a multilabel classification problem. In this paper, we propose a novel outcome weighted deep learning algorithm to estimate individualized optimal combination therapy. The Fisher consistency of the proposed loss function under certain conditions is also provided. In addition, we extend our method to a family of loss functions, which allows adaptive changes based on treatment interactions. We demonstrate the performance of our methods through simulations and real data analysis.
Affiliation(s)
- Muxuan Liang
- Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin
| | - Ting Ye
- Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin
| | - Haoda Fu
- Eli Lilly and Company, Indianapolis, Indiana
| |
|
22
|
Doubleday K, Zhou H, Fu H, Zhou J. An Algorithm for Generating Individualized Treatment Decision Trees and Random Forests. J Comput Graph Stat 2018; 27:849-860. [PMID: 32523325 PMCID: PMC7286561 DOI: 10.1080/10618600.2018.1451337] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 03/23/2016] [Revised: 02/19/2018] [Indexed: 10/17/2022]
Abstract
With new treatments and novel technology available, precision medicine has become a key topic in the new era of healthcare. Traditional statistical methods for precision medicine focus on subgroup discovery through identifying interactions between a few markers and treatment regimes. However, given the large scale and high dimensionality of modern data sets, it is difficult to detect the interactions between treatment and high dimensional covariates. Recently, novel approaches have emerged that seek to directly estimate individualized treatment rules (ITR) via maximizing the expected clinical reward by using, for example, support vector machines (SVM) or decision trees. The latter enjoys great popularity in clinical practice due to its interpretability. In this paper, we propose a new reward function and a novel decision tree algorithm to directly maximize rewards. We further improve a single tree decision rule by an ensemble decision tree algorithm, ITR random forests. Our final decision rule is an average over single decision trees and it is a soft probability rather than a hard choice. Depending on how strong the treatment recommendation is, physicians can make decisions based on our model along with their own judgment and experience. Performance of ITR forest and tree methods is assessed through simulations along with applications to a randomized controlled trial (RCT) of 1385 patients with diabetes and an EMR cohort of 5177 patients with diabetes. ITR forest and tree methods are implemented using statistical software R (https://github.com/kdoub5ha/ITR.Forest).
Affiliation(s)
- Kevin Doubleday
- Department of Epidemiology and Biostatistics, University of Arizona
| | - Hua Zhou
- Department of Biostatistics, University of California, Los Angeles
| | | | - Jin Zhou
- Department of Epidemiology and Biostatistics, University of Arizona
| |
|
23
|
Abstract
Precision medicine is an emerging scientific topic for disease treatment and prevention that takes into account individual patient characteristics. It is an important direction for clinical research, and many statistical methods have been proposed recently. One of the primary goals of precision medicine is to obtain an optimal individual treatment rule (ITR), which can help make decisions on treatment selection according to each patient's specific characteristics. Recently, outcome weighted learning (OWL) has been proposed to estimate such an optimal ITR in a binary treatment setting by maximizing the expected clinical outcome. However, for ordinal treatment settings, such as individualized dose finding, it is unclear how to use OWL. In this article, we propose a new technique for estimating ITR with ordinal treatments. In particular, we propose a data duplication technique with a piecewise convex loss function. We establish Fisher consistency for the resulting estimated ITR under certain conditions, and obtain the convergence and risk bound properties. Simulated examples and an application to a dataset from a type 2 diabetes mellitus observational study demonstrate the highly competitive performance of the proposed method compared to existing alternatives.
Affiliation(s)
- Jingxiang Chen
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
| | - Haoda Fu
- Eli Lilly and Company, Indianapolis, Indiana, U.S.A
| | - Xuanyao He
- Eli Lilly and Company, Indianapolis, Indiana, U.S.A
| | - Michael R Kosorok
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A.,Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
| | - Yufeng Liu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A.,Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A.,Department of Genetics, Carolina Center for Genome Sciences, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, North Carolina, U.S.A
| |
|
24
|
Logan BR, Sparapani R, McCulloch RE, Laud PW. Decision making and uncertainty quantification for individualized treatments using Bayesian Additive Regression Trees. Stat Methods Med Res 2017; 28:1079-1093. [PMID: 29254443 DOI: 10.1177/0962280217746191] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 11/17/2022]
Abstract
Individualized treatment rules can improve health outcomes by recognizing that patients may respond differently to treatment and assigning therapy with the most desirable predicted outcome for each individual. Flexible and efficient prediction models are desired as a basis for such individualized treatment rules to handle potentially complex interactions between patient factors and treatment. Modern Bayesian semiparametric and nonparametric regression models provide an attractive avenue in this regard as these allow natural posterior uncertainty quantification of patient specific treatment decisions as well as the population wide value of the prediction-based individualized treatment rule. In addition, via the use of such models, inference is also available for the value of the optimal individualized treatment rules. We propose such an approach and implement it using Bayesian Additive Regression Trees as this model has been shown to perform well in fitting nonparametric regression functions to continuous and binary responses, even with many covariates. It is also computationally efficient for use in practice. With Bayesian Additive Regression Trees, we investigate a treatment strategy which utilizes individualized predictions of patient outcomes from Bayesian Additive Regression Trees models. Posterior distributions of patient outcomes under each treatment are used to assign the treatment that maximizes the expected posterior utility. We also describe how to approximate such a treatment policy with a clinically interpretable individualized treatment rule, and quantify its expected outcome. The proposed method performs very well in extensive simulation studies in comparison with several existing methods. We illustrate the usage of the proposed method to identify an individualized choice of conditioning regimen for patients undergoing hematopoietic cell transplantation and quantify the value of this method of choice in relation to the optimal individualized treatment rule as well as non-individualized treatment strategies.
Affiliation(s)
- Brent R Logan
- 1 Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Rodney Sparapani
- 1 Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Robert E McCulloch
- 2 School of Mathematical and Statistical Sciences, Arizona State University, Tempe, AZ, USA
| | - Purushottam W Laud
- 1 Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, USA
| |
|
25
|
Zhao Y, Zheng W, Zhuo DY, Lu Y, Ma X, Liu H, Zeng Z, Laird G. Bayesian additive decision trees of biomarker by treatment interactions for predictive biomarker detection and subgroup identification. J Biopharm Stat 2017; 28:534-549. [PMID: 29020511 DOI: 10.1080/10543406.2017.1372770] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/18/2022]
Abstract
Personalized medicine, or tailored therapy, has been an active and important topic in recent medical research. Many methods have been proposed in the literature for predictive biomarker detection and subgroup identification. In this article, we propose a novel decision tree-based approach applicable in randomized clinical trials. We model the prognostic effects of the biomarkers using additive regression trees and the biomarker-by-treatment effect using a single regression tree. A Bayesian approach is used to periodically revise the split variables and the split rules of the decision trees, which provides a better overall fit. A Gibbs sampler is implemented in the MCMC procedure, which updates the prognostic trees and the interaction tree separately. We use the posterior distribution of the interaction tree to construct the predictive scores of the biomarkers and to identify the subgroup where the treatment is superior to the control. Numerical simulations show that our proposed method performs well under various settings compared to existing methods. We also demonstrate an application of our method in a real clinical trial.
Affiliation(s)
| | | | - Daisy Y Zhuo
- b Operations Research Center , Massachusetts Institute of Technology , Cambridge , MA , USA
| | | | | | - Hengchang Liu
- c Department of Computer Science , University of Science and Technology of China , Suzhou , China
| | - Zhen Zeng
- d Department of Biostatistics , University of Pittsburgh , Pittsburgh , PA , USA
| | | |
|
26
|
|
27
|
Siriwardhana C, Zhao M, Datta S, Kulasekera KB. A probability based method for selecting the optimal personalized treatment from multiple treatments. Stat Methods Med Res 2017; 28:749-760. [DOI: 10.1177/0962280217735701] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 01/01/2023]
Abstract
In this work we propose a method for optimal treatment assignment based on individual covariate information for a patient. For the K treatment ([Formula: see text]) scenario, we compare quantities that are suitable surrogates to true conditional probabilities of outcome variable of each treatment dominating outcome variables for all other treatments conditional on patient specific scores constructed from patient-specific covariates. As opposed to methods based on conditional means, our method can be applied for a broad set of models and error structures. Furthermore, the proposed method has very desirable large sample properties. We suggest Single Index Models as appropriate models connecting outcome variables to covariates and our empirical investigations show that correct treatment assignments are highly accurate. The proposed method is also rather robust against departures from a Single Index Model structure. Furthermore, selection of a treatment using the proposed metric appears to incur no losses in terms of the average reward for cases when two treatments are close in terms of this metric. We also conduct a real data analysis to show the applicability of the proposed procedure. This analysis highlights possible gains both in terms of average response and survival time if one were to use the proposed method.
Affiliation(s)
- Chathura Siriwardhana
- Department of Complementary & Integrative Medicine, John A. Burns School of Medicine, University of Hawaii, HI, USA
| | - Meng Zhao
- Department of Bioinformatics & Biostatistics, University of Louisville, Louisville, KY, USA
| | - Somnath Datta
- Department of Biostatistics, University of Florida, Gainesville, FL, USA
| | - KB Kulasekera
- Department of Bioinformatics & Biostatistics, University of Louisville, Louisville, KY, USA
| |
|
28
|
Lou Z, Shao J, Yu M. Optimal treatment assignment to maximize expected outcome with multiple treatments. Biometrics 2017; 74:506-516. [DOI: 10.1111/biom.12811] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 08/01/2016] [Revised: 07/01/2017] [Accepted: 08/01/2017] [Indexed: 11/28/2022]
Affiliation(s)
- Zhilan Lou
- School of Statistics; East China Normal University; Shanghai China
| | - Jun Shao
- School of Statistics; East China Normal University; Shanghai China
- Department of Statistics; University of Wisconsin; Madison Wisconsin U.S.A
| | - Menggang Yu
- Department of Biostatistics and Medical Informatics; University of Wisconsin; Madison Wisconsin U.S.A
| |
|
29
|
Sies A, Van Mechelen I. Comparing Four Methods for Estimating Tree-Based Treatment Regimes. Int J Biostat 2017; 13:/j/ijb.ahead-of-print/ijb-2016-0068/ijb-2016-0068.xml. [DOI: 10.1515/ijb-2016-0068] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/15/2022]
Abstract
When multiple treatment alternatives are available for a certain psychological or medical problem, an important challenge is to find an optimal treatment regime, which specifies for each patient the most effective treatment alternative given his or her pattern of pretreatment characteristics. The focus of this paper is on tree-based treatment regimes, which link an optimal treatment alternative to each leaf of a tree; as such they provide an insightful representation of the decision structure underlying the regime. This paper compares the absolute and relative performance of four methods for estimating regimes of that sort (viz., Interaction Trees, Model-based Recursive Partitioning, an approach developed by Zhang et al. and Qualitative Interaction Trees) in an extensive simulation study. The evaluation criteria were, on the one hand, the expected outcome if the entire population were subjected to the treatment regime resulting from each method under study and the proportion of clients assigned to the truly best treatment alternative, and, on the other hand, the Type I and Type II error probabilities of each method. The method of Zhang et al. was superior regarding the first two outcome measures and the Type II error probabilities, but performed worst in some conditions of the simulation study regarding Type I error probabilities.
|
30
|
Lipkovich I, Dmitrienko A, B R. Tutorial in biostatistics: data-driven subgroup identification and analysis in clinical trials. Stat Med 2016; 36:136-196. [PMID: 27488683 DOI: 10.1002/sim.7064] [Citation(s) in RCA: 150] [Impact Index Per Article: 18.8] [Received: 08/04/2015] [Revised: 06/23/2016] [Accepted: 07/05/2016] [Indexed: 02/05/2023]
Abstract
It is well known that both the direction and magnitude of the treatment effect in clinical trials are often affected by baseline patient characteristics (generally referred to as biomarkers). Characterization of treatment effect heterogeneity plays a central role in the field of personalized medicine and facilitates the development of tailored therapies. This tutorial focuses on a general class of problems arising in data-driven subgroup analysis, namely, identification of biomarkers with strong predictive properties and patient subgroups with desirable characteristics such as improved benefit and/or safety. Limitations of ad-hoc approaches to biomarker exploration and subgroup identification in clinical trials are discussed, and the ad-hoc approaches are contrasted with principled approaches to exploratory subgroup analysis based on recent advances in machine learning and data mining. A general framework for evaluating predictive biomarkers and identification of associated subgroups is introduced. The tutorial provides a review of a broad class of statistical methods used in subgroup discovery, including global outcome modeling methods, global treatment effect modeling methods, optimal treatment regimes, and local modeling methods. Commonly used subgroup identification methods are illustrated using two case studies based on clinical trials with binary and survival endpoints. Copyright © 2016 John Wiley & Sons, Ltd.
Affiliation(s)
| | | | - Ralph B
- Boston University, Boston, MA, U.S.A
| |
|