1. Qian W, Ing CK, Liu J. Adaptive Algorithm for Multi-armed Bandit Problem with High-dimensional Covariates. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2152343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]
Affiliation(s)
- Wei Qian
  - Department of Applied Economics and Statistics, University of Delaware, Newark, DE
- Ching-Kang Ing
  - Institute of Statistics, National Tsing Hua University, Hsinchu, Taiwan
- Ji Liu
  - Meta Platforms, Inc., Seattle, WA
2. Park H, Petkova E, Tarpey T, Ogden RT. A sparse additive model for treatment effect-modifier selection. Biostatistics 2022; 23:412-429. [PMID: 32808656 PMCID: PMC9308457 DOI: 10.1093/biostatistics/kxaa032] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/17/2020] [Revised: 07/05/2020] [Accepted: 07/10/2020] [Indexed: 11/26/2023] Open
Abstract
Sparse additive modeling is a class of effective methods for performing high-dimensional nonparametric regression. This article develops a sparse additive model focused on estimation of treatment effect modification with simultaneous treatment effect-modifier selection. We propose a version of the sparse additive model uniquely constrained to estimate the interaction effects between treatment and pretreatment covariates, while leaving the main effects of the pretreatment covariates unspecified. The proposed regression model can effectively identify treatment effect-modifiers that exhibit possibly nonlinear interactions with the treatment variable that are relevant for making optimal treatment decisions. A set of simulation experiments and an application to a dataset from a randomized clinical trial are presented to demonstrate the method.
Affiliation(s)
- Hyung Park
  - Division of Biostatistics, Department of Population Health, New York University, New York, NY, USA and Department of Biostatistics, Columbia University, New York, NY, USA
- Eva Petkova
  - Division of Biostatistics, Department of Population Health, New York University, New York, NY, USA and Department of Biostatistics, Columbia University, New York, NY, USA
- Thaddeus Tarpey
  - Division of Biostatistics, Department of Population Health, New York University, New York, NY, USA and Department of Biostatistics, Columbia University, New York, NY, USA
- R Todd Ogden
  - Division of Biostatistics, Department of Population Health, New York University, New York, NY, USA and Department of Biostatistics, Columbia University, New York, NY, USA
3. Kapelner A, Bleich J, Levine A, Cohen ZD, DeRubeis RJ, Berk R. Evaluating the Effectiveness of Personalized Medicine With Software. Front Big Data 2021; 4:572532. [PMID: 34085036 PMCID: PMC8167073 DOI: 10.3389/fdata.2021.572532] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/14/2020] [Accepted: 02/03/2021] [Indexed: 11/13/2022] Open
Abstract
We present methodological advances in understanding the effectiveness of personalized medicine models and supply easy-to-use open-source software. Personalized medicine involves the systematic use of individual patient characteristics to determine which treatment option is most likely to result in a better average outcome for the patient. Why is personalized medicine not done more in practice? One of many reasons is that practitioners do not have an easy way to holistically evaluate whether their personalization procedure does better than the standard of care, termed improvement. Our software, "Personalized Treatment Evaluator" (the R package PTE), provides inference for improvement out-of-sample in many clinical scenarios. We also extend current methodology by allowing evaluation of improvement in the case where the endpoint is binary or survival. In the software, the practitioner inputs 1) data from a single-stage randomized trial with one continuous, incidence or survival endpoint and 2) an educated guess of a functional form of a model for the endpoint constructed from domain knowledge. The bootstrap is then employed on data unseen during model fitting to provide confidence intervals for the improvement for the average future patient (assuming future patients are similar to the patients in the trial). One may also test against a null scenario where the hypothesized personalization is not more useful than a standard of care. We demonstrate our method's promise on simulated data as well as on data from a randomized comparative trial investigating two treatments for depression.
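As an illustration of the idea of bootstrapping an "improvement" estimate, the following Python sketch may help. It is not the PTE package or its algorithm. It assumes a two-arm trial with 1:1 randomization, a continuous endpoint where larger is better, arm 0 as the standard of care, and a treatment rule fixed in advance (so all data are unseen by the rule). The function names `improvement` and `bootstrap_ci` and the synthetic data are hypothetical.

```python
import numpy as np

def improvement(y, a, rule):
    """Mean outcome among patients whose randomized arm matches the rule,
    minus the mean outcome in the standard-of-care arm (a == 0).
    Assumes 1:1 randomization, so the matched subset is representative."""
    followed = y[a == rule].mean()
    standard = y[a == 0].mean()
    return followed - standard

def bootstrap_ci(y, a, rule, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval: resample patients with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # one bootstrap resample of patients
        stats[b] = improvement(y[idx], a[idx], rule[idx])
    alpha = 1.0 - level
    low, high = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(low), float(high)

# Synthetic trial (an assumption for illustration): arm 1 helps when x > 0.
rng = np.random.default_rng(1)
n = 400
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n)           # randomized arm, 0 or 1
y = 1.0 + (2 * a - 1) * x + rng.normal(size=n)
rule = (x > 0).astype(int)               # pre-specified personalization rule

point = improvement(y, a, rule)
ci_low, ci_high = bootstrap_ci(y, a, rule)
print(f"improvement = {point:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that excludes zero would suggest, under these assumptions, that following the rule does better on average than giving everyone the standard of care.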
Affiliation(s)
- Adam Kapelner
  - Department of Mathematics, Queens College, CUNY, Queens, NY, United States
- Justin Bleich
  - Department of Statistics, The Wharton School of the University of Pennsylvania, Philadelphia, PA, United States
- Alina Levine
  - Department of Mathematics, Queens College, CUNY, Queens, NY, United States
- Zachary D. Cohen
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States
- Robert J. DeRubeis
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States
- Richard Berk
  - Department of Statistics, The Wharton School of the University of Pennsylvania, Philadelphia, PA, United States
4. Huang Y, Cho J, Fong Y. Threshold-based subgroup testing in logistic regression models in two-phase sampling designs. J R Stat Soc Ser C Appl Stat 2021; 70:291-311. [PMID: 33840863 PMCID: PMC8032557 DOI: 10.1111/rssc.12459] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Abstract
The effect of treatment on binary disease outcome can differ across subgroups characterized by other covariates. Testing for the existence of subgroups that are associated with heterogeneous treatment effects can provide valuable insight regarding the optimal treatment recommendation in practice. Our research in this paper is motivated by the question of whether host genetics could modify a vaccine's effect on HIV acquisition risk. To answer this question, we used data from an HIV vaccine trial with a two-phase sampling design and developed a general threshold-based model framework to test for the existence of subgroups associated with the heterogeneity in disease risks, allowing for subgroups based on multivariate covariates. We developed a testing procedure based on the maximum of likelihood-ratio statistics over change planes and demonstrated its advantage over alternative methods. We further developed the testing procedure to account for biased sampling of expensive (i.e. resource-intensive to measure) covariates through the incorporation of inverse probability weighting techniques. We used the proposed method to analyze the motivating HIV vaccine trial data. Our proposed testing procedure also has broad applications in epidemiological studies for assessing heterogeneity in disease risk with respect to univariate or multivariate predictors.
Affiliation(s)
- Ying Huang
  - Biostatistics, Bioinformatics, & Epidemiology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109
- Juhee Cho
  - Biostatistics, Bioinformatics, & Epidemiology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109
- Youyi Fong
  - Biostatistics, Bioinformatics, & Epidemiology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109
5. Huang Y, Zhou XH. Identification of the optimal treatment regimen in the presence of missing covariates. Stat Med 2020; 39:353-368. [PMID: 31774192 PMCID: PMC6954309 DOI: 10.1002/sim.8407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2018] [Revised: 09/25/2019] [Accepted: 09/27/2019] [Indexed: 12/25/2022]
Abstract
Covariates associated with treatment-effect heterogeneity can potentially be used to make personalized treatment recommendations towards best clinical outcomes. Methods for treatment-selection rule development that directly maximize treatment-selection benefits have attracted much interest in recent years, due to the robustness of these methods to outcome modeling. In practice, the task of treatment-selection rule development can be further complicated by missingness in data. Here, we consider the identification of optimal treatment-selection rules for a binary disease outcome when measurements of an important covariate from study participants are partly missing. Under the missing at random assumption, we develop a robust estimator of treatment-selection rules under the direct-optimization paradigm. This estimator targets the maximum selection benefits to the population under correct specification of at least one mechanism from each of two sets: (1) the missing-data mechanism or the conditional covariate distribution, and (2) the treatment assignment or the disease outcome model. We evaluate and compare performance of the proposed estimator with alternative direct-optimization estimators through extensive simulation studies. We demonstrate the application of the proposed method through a real data example from an Alzheimer's disease study for developing covariate combinations to guide the treatment of Alzheimer's disease.
Affiliation(s)
- Ying Huang
  - Vaccine & Infectious Diseases Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Xiao-Hua Zhou
  - Department of Biostatistics, Peking University, Beijing, China
6. Xiao W, Zhang HH, Lu W. Robust regression for optimal individualized treatment rules. Stat Med 2019; 38:2059-2073. [PMID: 30740747 PMCID: PMC6449186 DOI: 10.1002/sim.8102] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 08/30/2017] [Revised: 12/06/2018] [Accepted: 12/06/2018] [Indexed: 12/27/2022]
Abstract
Because different patients may respond quite differently to the same drug or treatment, there is an increasing interest in discovering individualized treatment rules. In particular, there is an emerging need to find optimal individualized treatment rules, which would lead to the "best" clinical outcome. In this paper, we propose a new class of loss functions and estimators based on robust regression to estimate the optimal individualized treatment rules. Compared to existing estimation methods in the literature, the new estimators are novel and advantageous in the following aspects. First, they are robust against skewed, heterogeneous, heavy-tailed errors or outliers in data. Second, they are robust against a misspecification of the baseline function. Third, under some general situations, the new estimator coupled with the pinball loss approximately maximizes the outcome's conditional quantile instead of the conditional mean, which leads to a more robust optimal individualized treatment rule than the traditional mean-based estimators. Consistency and asymptotic normality of the proposed estimators are established. Their empirical performance is demonstrated via extensive simulation studies and an analysis of an AIDS data set.
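The link between the pinball loss and conditional quantiles mentioned in this abstract can be illustrated with a small Python sketch. It is not the estimator proposed in the paper; it only shows the standard fact that the constant minimizing the mean pinball loss at level tau is a sample tau-quantile. The helper names `pinball_loss` and `best_constant` and the synthetic skewed data are assumptions for illustration.

```python
import numpy as np

def pinball_loss(residual, tau):
    """Pinball (check) loss: tau * r when r >= 0, (tau - 1) * r when r < 0."""
    residual = np.asarray(residual, dtype=float)
    return np.where(residual >= 0, tau * residual, (tau - 1) * residual)

def best_constant(y, tau, grid):
    """Grid value c that minimizes the mean pinball loss of y - c."""
    losses = [pinball_loss(y - c, tau).mean() for c in grid]
    return grid[int(np.argmin(losses))]

rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0, size=5000)   # skewed synthetic outcome
grid = np.linspace(0.0, 5.0, 501)           # candidate constants, step 0.01

c_hat = best_constant(y, tau=0.5, grid=grid)
print(c_hat, np.median(y))                  # the two values should nearly agree
```

With tau = 0.5 the minimizer tracks the sample median rather than the mean, which is why a pinball-loss fit is less sensitive to heavy tails and outliers than a squared-error fit.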
Affiliation(s)
- W Xiao
  - Department of Statistics, North Carolina State University, NC, USA
- H. H. Zhang
  - Department of Mathematics, University of Arizona, AZ, USA
- W. Lu
  - Department of Statistics, North Carolina State University, NC, USA
7. Ciarleglio A, Petkova E, Ogden T, Tarpey T. Constructing treatment decision rules based on scalar and functional predictors when moderators of treatment effect are unknown. J R Stat Soc Ser C Appl Stat 2018; 67:1331-1356. [PMID: 30546161 PMCID: PMC6287762 DOI: 10.1111/rssc.12278] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/25/2022]
Abstract
Treatment response heterogeneity poses serious challenges for selecting treatment for many diseases. To better understand this heterogeneity and to help in determining the best patient-specific treatments for a given disease, many clinical trials are collecting large amounts of patient-level data prior to administering treatment in the hope that some of these data can be used to identify moderators of treatment effect. These data can range from simple scalar values to complex functional data such as curves or images. Combining these various types of baseline data to discover "biosignatures" of treatment response is crucial for advancing precision medicine. Motivated by the problem of selecting optimal treatment for subjects with depression based on clinical and neuroimaging data, we present an approach that both (1) identifies covariates associated with differential treatment effect and (2) estimates a treatment decision rule based on these covariates. We focus on settings where there is a potentially large collection of candidate biomarkers consisting of both scalar and functional data. The validity of the proposed approach is justified via extensive simulation experiments and illustrated using data from a placebo-controlled clinical trial investigating antidepressant treatment response in subjects with depression.
Affiliation(s)
- Adam Ciarleglio
  - Mailman School of Public Health, Columbia University and New York State Psychiatric Institute, New York, U.S.A.
- Eva Petkova
  - New York University School of Medicine, New York, U.S.A.
- Todd Ogden
  - Mailman School of Public Health, Columbia University, New York, U.S.A.
8. Díaz I, Savenkov O, Ballman K. Targeted learning ensembles for optimal individualized treatment rules with time-to-event outcomes. Biometrika 2018; 105:723-738. [PMID: 30799874 PMCID: PMC6374011 DOI: 10.1093/biomet/asy017] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/14/2017] [Indexed: 12/19/2022] Open
Abstract
We consider estimation of an optimal individualized treatment rule when a high-dimensional vector of baseline variables is available. Our optimality criterion is with respect to delaying the expected time to occurrence of an event of interest. We use semiparametric efficiency theory to construct estimators with properties such as double robustness. We propose two estimators of the optimal rule, which arise from considering two loss functions aimed at directly estimating the conditional treatment effect and recasting the problem in terms of weighted classification using the 0-1 loss function. Our estimated rules are ensembles that minimize the crossvalidated risk of a linear combination in a user-supplied library of candidate estimators. We prove oracle inequalities bounding the finite-sample excess risk of the estimator. The bounds depend on the excess risk of the oracle selector and a doubly robust term related to estimation of the nuisance parameters. We discuss the convergence rates of our estimator to the oracle selector, and illustrate our methods by analysis of a phase III randomized study testing the efficacy of a new therapy for the treatment of breast cancer.
Affiliation(s)
- I Díaz
  - Division of Biostatistics, Weill Cornell Medicine, 402 East 67th Street, New York, New York, U.S.A
- O Savenkov
  - Division of Biostatistics, Weill Cornell Medicine, 402 East 67th Street, New York, New York, U.S.A
- K Ballman
  - Division of Biostatistics, Weill Cornell Medicine, 402 East 67th Street, New York, New York, U.S.A
9. Reiss PT, Goldsmith J, Shang HL, Ogden RT. Methods for scalar-on-function regression. Int Stat Rev 2017; 85:228-249. [PMID: 28919663 PMCID: PMC5598560 DOI: 10.1111/insr.12163] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Received: 06/08/2015] [Accepted: 12/28/2015] [Indexed: 01/16/2023]
Abstract
Recent years have seen an explosion of activity in the field of functional data analysis (FDA), in which curves, spectra, images, etc. are considered as basic functional data units. A central problem in FDA is how to fit regression models with scalar responses and functional data points as predictors. We review some of the main approaches to this problem, categorizing the basic model types as linear, nonlinear and nonparametric. We discuss publicly available software packages, and illustrate some of the procedures by application to a functional magnetic resonance imaging dataset.
Affiliation(s)
- Philip T. Reiss
  - Department of Child and Adolescent Psychiatry and Department of Population Health, New York University School of Medicine
  - Department of Statistics, University of Haifa
- Jeff Goldsmith
  - Department of Biostatistics, Columbia University Mailman School of Public Health
- Han Lin Shang
  - Research School of Finance, Actuarial Studies and Statistics, Australian National University
- R. Todd Ogden
  - Department of Biostatistics, Columbia University Mailman School of Public Health
  - New York State Psychiatric Institute
10. Laber EB, Staicu AM. Functional feature construction for individualized treatment regimes. J Am Stat Assoc 2017; 113:1219-1227. [PMID: 30416232 PMCID: PMC6223315 DOI: 10.1080/01621459.2017.1321545] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/01/2015] [Revised: 01/01/2017] [Indexed: 10/19/2022]
Abstract
Evidence-based personalized medicine formalizes treatment selection as an individualized treatment regime that maps up-to-date patient information into the space of possible treatments. Available patient information may include static features such as race, gender, family history, genetic and genomic information, as well as longitudinal information including the emergence of comorbidities, waxing and waning of symptoms, side-effect burden, and adherence. Dynamic information measured at multiple time points before treatment assignment should be included as input to the treatment regime. However, subject longitudinal measurements are typically sparse, irregularly spaced, noisy, and vary in number across subjects. Existing estimators for treatment regimes require equal information be measured on each subject and thus standard practice is to summarize longitudinal subject information into a scalar, ad hoc summary during data pre-processing. This reduction of the longitudinal information to a scalar feature precedes estimation of a treatment regime and is therefore not informed by subject outcomes, treatments, or covariates. Furthermore, we show that this reduction requires more stringent causal assumptions for consistent estimation than are necessary. We propose a data-driven method for constructing maximally prescriptive yet interpretable features that can be used with standard methods for estimating optimal treatment regimes. In our proposed framework, we treat the subject longitudinal information as a realization of a stochastic process observed with error at discrete time points. Functionals of this latent process are then combined with outcome models to estimate an optimal treatment regime. The proposed methodology requires weaker causal assumptions than Q-learning with an ad hoc scalar summary and is consistent for the optimal treatment regime.
Affiliation(s)
- Eric B Laber
  - Department of Statistics, North Carolina State University, Raleigh, NC, 27695, U.S.A
- Ana-Maria Staicu
  - Department of Statistics, North Carolina State University, Raleigh, NC, 27695, U.S.A
11. Linn KA, Laber EB, Stefanski LA. Interactive Q-learning for Quantiles. J Am Stat Assoc 2017; 112:638-649. [PMID: 28890584 PMCID: PMC5586239 DOI: 10.1080/01621459.2016.1155993] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 07/01/2014] [Revised: 01/01/2016] [Indexed: 12/18/2022]
Abstract
A dynamic treatment regime is a sequence of decision rules, each of which recommends treatment based on features of patient medical history such as past treatments and outcomes. Existing methods for estimating optimal dynamic treatment regimes from data optimize the mean of a response variable. However, the mean may not always be the most appropriate summary of performance. We derive estimators of decision rules for optimizing probabilities and quantiles computed with respect to the response distribution for two-stage, binary treatment settings. This enables estimation of dynamic treatment regimes that optimize the cumulative distribution function of the response at a prespecified point or a prespecified quantile of the response distribution such as the median. The proposed methods perform favorably in simulation experiments. We illustrate our approach with data from a sequentially randomized trial where the primary outcome is remission of depression symptoms.
Affiliation(s)
- Kristin A Linn
  - Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA 19104
- Eric B Laber
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695
- Leonard A Stefanski
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695
12. Chen S, Tian L, Cai T, Yu M. A general statistical framework for subgroup identification and comparative treatment scoring. Biometrics 2017; 73:1199-1209. [PMID: 28211943 DOI: 10.1111/biom.12676] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 07/01/2016] [Revised: 12/01/2016] [Accepted: 01/01/2017] [Indexed: 12/28/2022]
Abstract
Many statistical methods have recently been developed for identifying subgroups of patients who may benefit from different available treatments. Compared with the traditional outcome-modeling approaches, these methods focus on modeling interactions between the treatments and covariates while bypassing or minimizing modeling of the main effects of covariates, because the subgroup identification depends only on the sign of the interaction. However, these methods are scattered and often narrow in scope. In this article, we propose a general framework, by weighting and A-learning, for subgroup identification in both randomized clinical trials and observational studies. Our framework involves minimum modeling for the relationship between the outcome and covariates pertinent to the subgroup identification. Under the proposed framework, we may also estimate the magnitude of the interaction, which leads to the construction of a scoring system measuring the individualized treatment effect. The proposed methods are quite flexible and include many recently proposed estimators as special cases. As a result, some estimators originally proposed for randomized clinical trials can be extended to observational studies, and procedures based on the weighting method can be converted to an A-learning method and vice versa. Our approaches also allow straightforward incorporation of regularization methods for high-dimensional data, as well as possible efficiency augmentation and generalization to multiple treatments. We examine the empirical performance of several procedures belonging to the proposed framework through extensive numerical studies.
Affiliation(s)
- Shuai Chen
  - Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin 53792, U.S.A
- Lu Tian
  - Department of Biomedical Data Science, Stanford University, Palo Alto, California 94305, U.S.A
- Tianxi Cai
  - Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, U.S.A
- Menggang Yu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin 53792, U.S.A
13. Petkova E, Tarpey T, Su Z, Ogden RT. Generated effect modifiers (GEM's) in randomized clinical trials. Biostatistics 2016; 18:105-118. [PMID: 27465235 DOI: 10.1093/biostatistics/kxw035] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 12/04/2015] [Revised: 06/09/2016] [Accepted: 06/12/2016] [Indexed: 01/07/2023] Open
Abstract
In a randomized clinical trial (RCT), it is often of interest not only to estimate the effect of various treatments on the outcome, but also to determine whether any patient characteristic has a different relationship with the outcome, depending on treatment. In regression models for the outcome, if there is a non-zero interaction between treatment and a predictor, that predictor is called an "effect modifier". Identification of such effect modifiers is crucial as we move towards precision medicine, that is, optimizing individual treatment assignment based on patient measurements assessed when presenting for treatment. In most settings, there will be several baseline predictor variables that could potentially modify the treatment effects. This article proposes optimal methods of constructing a composite variable (defined as a linear combination of pre-treatment patient characteristics) in order to generate an effect modifier in an RCT setting. Several criteria are considered for generating effect modifiers and their performance is studied via simulations. An example from an RCT is provided for illustration.
Affiliation(s)
- Eva Petkova
  - Department of Child and Adolescent Psychiatry, New York University, 1 Park Ave., New York, NY 10016, USA and Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Road, Orangeburg, NY 10962, USA
- Thaddeus Tarpey
  - Department of Mathematics and Statistics, Wright State University, 3640 Colonel Glenn Hwy, Dayton, OH 45435, USA and Department of Child and Adolescent Psychiatry, New York University, 1 Park Ave., New York, NY 10016, USA
- Zhe Su
  - Department of Child and Adolescent Psychiatry, New York University, 1 Park Ave., New York, NY 10016, USA
- R Todd Ogden
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, 722 West 168th St., New York, NY 10032, USA
14. Ciarleglio A, Petkova E, Tarpey T, Ogden RT. Flexible functional regression methods for estimating individualized treatment regimes. Stat (Int Stat Inst) 2016; 5:185-199. [PMID: 28845233 PMCID: PMC5568105 DOI: 10.1002/sta4.114] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 12/31/2022]
Abstract
A major focus of personalized medicine is on the development of individualized treatment rules. Good decision rules have the potential to significantly advance patient care and reduce the burden of a host of diseases. Statistical methods for developing such rules are progressing rapidly, but few methods have considered the use of pre-treatment functional data to guide in decision-making. Furthermore, those methods that do allow for the incorporation of functional pre-treatment covariates typically make strong assumptions about the relationships between the functional covariates and the response of interest. We propose two approaches for using functional data to select an optimal treatment that address some of the shortcomings of previously developed methods. Specifically, we combine the flexibility of functional additive regression models with Q-learning or A-learning in order to obtain treatment decision rules. Properties of the corresponding estimators are discussed. Our approaches are evaluated in several realistic settings using synthetic data and are applied to real data arising from a clinical trial comparing two treatments for major depressive disorder in which baseline imaging data are available for subjects who are subsequently treated.
Affiliation(s)
- Adam Ciarleglio
  - Department of Child and Adolescent Psychiatry, NYU School of Medicine, New York, NY 10016, USA
- Eva Petkova
  - Department of Child and Adolescent Psychiatry, NYU School of Medicine, New York, NY 10016, USA
  - Nathan S. Kline Institute for Psychiatric Research, Orangeburg, NY 10962, USA
- Thaddeus Tarpey
  - Department of Mathematics and Statistics, Wright State University, Dayton, OH 45435, USA
- R. Todd Ogden
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032, USA
15. Huang Y. Identifying optimal biomarker combinations for treatment selection through randomized controlled trials. Clin Trials 2015; 12:348-56. [PMID: 25948620 PMCID: PMC4506270 DOI: 10.1177/1740774515580126] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 01/05/2023]
Abstract
BACKGROUND/AIMS Biomarkers associated with treatment-effect heterogeneity can be used to make treatment recommendations that optimize individual clinical outcomes. To accomplish this, statistical methods are needed to generate marker-based treatment-selection rules that can most effectively reduce the population burden due to disease and treatment. Compared to the standard approach of risk modeling to derive treatment-selection rules, a more robust approach is to directly minimize an unbiased estimate of total disease and treatment burden among a pre-specified class of rules. This problem is one of minimizing a weighted sum of 0-1 loss functions, which is computationally challenging to solve due to the nonsmoothness of the 0-1 loss. Huang and Fong, among others, proposed a method that uses the Ramp loss to approximate the 0-1 loss and solves the minimization problem through repetitive constrained optimizations. The algorithm was shown to have comparable or better performance than other comparative estimators in various settings. Our aim in this article is to further extend the algorithm to allow for variable selection in the presence of a large number of candidate markers.
METHODS We develop an alternative method to derive marker combinations to minimize the weighted sum of Ramp loss in Huang and Fong, based on data from randomized trials. The new algorithm estimates treatment-selection rules by repetitively minimizing a smooth and differentiable objective function. Through the use of an L1 penalty, we expand the method to allow for feature selection and develop an algorithm based on the coordinate descent method to build the treatment-selection rule.
RESULTS Through extensive simulation studies, we compared performance of the proposed estimator to four existing approaches: (1) a logistic regression risk modeling approach, and three other "direct optimizing" approaches including (2) the estimator in Huang and Fong, (3) the weighted support vector machine, and (4) the weighted logistic regression. The proposed estimator performs comparably to that of Huang and Fong, and comparably or better than other estimators. Allowing for variable selection using the proposed estimator in the presence of a large number of markers further improves treatment-selection performance. The proposed estimator is also advantageous for selecting variables relevant to treatment selection compared to L1 penalized logistic regression and weighted logistic regression. We illustrate the application of the proposed methods in host-genetics data from an HIV vaccine trial.
CONCLUSION The proposed estimator is appealing considering its effectiveness and conceptual simplicity. It has significant potential to contribute to the selection and combination of biomarkers for treatment selection in clinical practice.
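To make the relationship between the 0-1 loss and a ramp-type surrogate concrete, here is a minimal Python sketch. The ramp form used here, a clipped linear function with an adjustable `width`, is one common parametrization and is an assumption; it is not necessarily the exact form used by Huang and Fong, and the margins and weights are made-up numbers.

```python
import numpy as np

def zero_one_loss(margin):
    """0-1 loss on a margin: 1 when the margin is <= 0, else 0."""
    return (np.asarray(margin, dtype=float) <= 0).astype(float)

def ramp_loss(margin, width=1.0):
    """Ramp surrogate: 1 for margin <= 0, falling linearly to 0 at margin = width.
    It is continuous, bounded in [0, 1], and never below the 0-1 loss."""
    return np.clip(1.0 - np.asarray(margin, dtype=float) / width, 0.0, 1.0)

def weighted_risk(loss_fn, margins, weights):
    """Weighted sum of per-subject losses."""
    return float(np.sum(weights * loss_fn(margins)))

# Hypothetical margins (signed rule scores) and subject weights.
margins = np.array([-2.0, -0.5, 0.0, 0.25, 0.5, 1.0, 3.0])
weights = np.array([1.0, 2.0, 1.0, 0.5, 1.5, 1.0, 2.0])

risk_01 = weighted_risk(zero_one_loss, margins, weights)
risk_ramp = weighted_risk(ramp_loss, margins, weights)
risk_ramp_narrow = weighted_risk(lambda m: ramp_loss(m, width=0.01), margins, weights)
print(risk_01, risk_ramp, risk_ramp_narrow)
```

Shrinking `width` moves the ramp toward the 0-1 loss, while a wider ramp gives a gentler objective; unlike the hinge loss, the ramp stays bounded, so a single badly misclassified subject cannot dominate the weighted sum.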
Affiliation(s)
- Ying Huang
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Department of Biostatistics, University of Washington, Seattle, WA, USA
16
Ciarleglio A, Petkova E, Ogden RT, Tarpey T. Treatment decisions based on scalar and functional baseline covariates. Biometrics 2015; 71:884-94. [PMID: 26111145 DOI: 10.1111/biom.12346] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 10/01/2014] [Revised: 03/01/2015] [Accepted: 05/01/2015] [Indexed: 01/05/2023]
Abstract
The amount and complexity of patient-level data being collected in randomized-controlled trials offer both opportunities and challenges for developing personalized rules for assigning treatment for a given disease or ailment. For example, trials examining treatments for major depressive disorder are not only collecting typical baseline data such as age, gender, or scores on various tests, but also data that measure the structure and function of the brain such as images from magnetic resonance imaging (MRI), functional MRI (fMRI), or electroencephalography (EEG). These latter types of data have an inherent structure and may be considered as functional data. We propose an approach that uses baseline covariates, both scalars and functions, to aid in the selection of an optimal treatment. In addition to providing information on which treatment should be selected for a new patient, the estimated regime has the potential to provide insight into the relationship between treatment response and the set of baseline covariates. Our approach can be viewed as an extension of "advantage learning" to include both scalar and functional covariates. We describe our method and how to implement it using existing software. Empirical performance of our method is evaluated with simulated data in a variety of settings and also applied to data arising from a study of patients with major depressive disorder from whom baseline scalar covariates as well as functional data from EEG are available.
Affiliation(s)
- Adam Ciarleglio
- Department of Child and Adolescent Psychiatry, New York University, New York, NY 10016, U.S.A.
- Eva Petkova
- Department of Child and Adolescent Psychiatry, New York University, New York, NY 10016, U.S.A.; Nathan S. Kline Institute for Psychiatric Research, Orangeburg, NY 10962, U.S.A.
- R Todd Ogden
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032, U.S.A.
- Thaddeus Tarpey
- Department of Mathematics and Statistics, Wright State University, Dayton, OH 45435, U.S.A.
17
Huang Y, Fong Y. Identifying optimal biomarker combinations for treatment selection via a robust kernel method. Biometrics 2014; 70:891-901. [PMID: 25124089 PMCID: PMC4277554 DOI: 10.1111/biom.12204] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 09/01/2013] [Revised: 05/01/2014] [Accepted: 05/01/2014] [Indexed: 01/05/2023]
Abstract
Treatment-selection markers predict an individual's response to different therapies, thus allowing for the selection of a therapy with the best predicted outcome. A good marker-based treatment-selection rule can significantly impact public health through the reduction of the disease burden in a cost-effective manner. Our goal in this article is to use data from randomized trials to identify optimal linear and nonlinear biomarker combinations for treatment selection that minimize the total burden to the population caused by either the targeted disease or its treatment. We frame this objective into a general problem of minimizing a weighted sum of 0-1 loss and propose a novel penalized minimization method that is based on the difference of convex functions algorithm (DCA). The corresponding estimator of marker combinations has a kernel property that allows flexible modeling of linear and nonlinear marker combinations. We compare the proposed methods with existing methods for optimizing treatment regimens such as the logistic regression model and the weighted support vector machine. Performances of different weight functions are also investigated. The application of the proposed method is illustrated using a real example from an HIV vaccine trial: we search for a combination of Fc receptor genes for recommending vaccination in preventing HIV infection.
Affiliation(s)
- Ying Huang
- Fred Hutchinson Cancer Research Center, Public Health Sciences, 1100 Fairview Avenue N., Seattle, WA 98109-1024; University of Washington, Department of Biostatistics, Seattle, WA 98195
- Youyi Fong
- Fred Hutchinson Cancer Research Center, Public Health Sciences, 1100 Fairview Avenue N., Seattle, WA 98109-1024; University of Washington, Department of Biostatistics, Seattle, WA 98195