1
Liang W, Jia J. Reinforcement learning using neural networks in estimating an optimal dynamic treatment regime in patients with sepsis. Comput Methods Programs Biomed 2025; 266:108754. [PMID: 40222267 DOI: 10.1016/j.cmpb.2025.108754] [Received: 12/24/2024] [Revised: 03/24/2025] [Accepted: 03/27/2025] [Indexed: 04/15/2025]
Abstract
OBJECTIVE Early fluid resuscitation is crucial in the treatment of sepsis, yet the optimal dosage remains debated. This study aims to determine the optimal multi-stage fluid resuscitation dosage for sepsis patients. METHODS We propose a reinforcement learning algorithm with neural networks (RL-NN), utilizing the flexibility of deep learning architectures to mitigate model misspecification. We use cross-validation and random search for hyperparameter tuning to further enhance model robustness and generalization. RESULTS Simulation results demonstrate that our method outperforms existing methods in terms of both the percentage of correctly classified optimal treatments and the predicted counterfactual mean outcome. Applying this method to the sepsis cohort from the Medical Information Mart for Intensive Care III (MIMIC-III), we recommend that all sepsis patients receive adequate fluid resuscitation (≥ 30 mL/kg) within the first 3 h of admission to the MICU. Our approach is expected to significantly reduce the mean SOFA score by 23.71%, enhancing patient outcomes. CONCLUSION Our RL-NN method offers an accurate, real-time approach to optimizing sepsis treatment and aligns with the 'Surviving Sepsis Campaign' guidelines. It also has the potential to be integrated with existing electronic health record (EHR) systems, guiding clinical decision-making and thereby improving patient prognosis.
Affiliation(s)
- Weijie Liang
- Department of Biostatistics, School of Public Health, Peking University, No. 38 Xueyuan Road, Beijing, 100191, China
- Jinzhu Jia
- Department of Biostatistics, School of Public Health, Peking University, No. 38 Xueyuan Road, Beijing, 100191, China; Center for Statistical Science, Peking University, No. 5 Yiheyuan Road, Haidian District, Beijing, 100871, China
2
Vadde R, Gupta MK. Machine Learning Approaches for Neuroblastoma Risk Prediction and Stratification. Crit Rev Oncog 2025; 30:15-30. [PMID: 39819432 DOI: 10.1615/critrevoncog.2024056447] [Indexed: 01/19/2025]
Abstract
Machine learning (ML) holds great promise in advancing risk prediction and stratification for neuroblastoma, a highly heterogeneous pediatric cancer. By utilizing large-scale biological and clinical data, ML models can detect complex patterns that traditional approaches often overlook, enabling more personalized treatments and better patient outcomes. Various ML techniques, such as support vector machines, random forests, and deep learning, have shown superior performance in predicting survival, relapse, and treatment responses in neuroblastoma patients compared to conventional methods. However, challenges like limited data size, model interpretability, data variability, and difficulties in clinical integration hinder broader adoption. Additionally, ethical concerns related to bias and privacy must be addressed. Future work should focus on improving data quality, enhancing model transparency, and conducting thorough clinical validation. With these advancements, ML has the potential to revolutionize neuroblastoma care by refining early diagnosis, risk assessment, and therapeutic decision-making.
Affiliation(s)
- Ramakrishna Vadde
- Department of Biotechnology & Bioinformatics, Yogi Vemana University, Kadapa - 516003, Andhra Pradesh, India
- Manoj Kumar Gupta
- Hematology, Hemostasis, Oncology and Stem Cell Transplantation, Hannover Medical School (MHH), Hannover, Germany
3
Rizi MM, Dubin JA, Wallace MP. Dynamic Treatment Regimes on Dyadic Networks. Stat Med 2024; 43:5944-5967. [PMID: 39608868 PMCID: PMC11639660 DOI: 10.1002/sim.10278] [Received: 11/12/2023] [Revised: 09/13/2024] [Accepted: 10/25/2024] [Indexed: 11/30/2024]
Abstract
Identifying interventions that are optimally tailored to each individual is of significant interest in various fields, in particular precision medicine. Dynamic treatment regimes (DTRs) employ sequences of decision rules that utilize individual patient information to recommend treatments. However, the assumption that an individual's treatment does not impact the outcomes of others, known as the no interference assumption, is often challenged in practical settings. For example, in infectious disease studies, the vaccine status of individuals in close proximity can influence the likelihood of infection. Imposing this assumption when it, in fact, does not hold, may lead to biased results and impact the validity of the resulting DTR optimization. We extend the estimation method of dynamic weighted ordinary least squares (dWOLS), a doubly robust and easily implemented approach for estimating optimal DTRs, to incorporate the presence of interference within dyads (i.e., pairs of individuals). We formalize an appropriate outcome model and describe the estimation of an optimal decision rule in the dyadic-network context. Through comprehensive simulations and analysis of the Population Assessment of Tobacco and Health (PATH) data, we demonstrate the improved performance of the proposed joint optimization strategy compared to the current state-of-the-art conditional optimization methods in estimating the optimal treatment assignments when within-dyad interference exists.
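For orientation, the standard single-stage dWOLS estimator that this work extends can be sketched as follows. This is a minimal illustration without interference, not the authors' dyadic method. The function names (`expit`, `fit_logistic`, `dwols`, `recommend`), the logistic propensity model, and the linear blip in a single covariate are all assumptions made for the example.

```python
import numpy as np

def expit(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, a, iters=25):
    """Logistic regression of binary a on X by Newton-Raphson (illustrative helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = expit(X @ beta)
        grad = X.T @ (a - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def dwols(x, a, y):
    """Single-stage dWOLS with one covariate x and binary treatment a in {0, 1}.

    Returns the estimated blip parameters (psi0, psi1) of the assumed
    treatment contrast a * (psi0 + psi1 * x).
    """
    Xp = np.column_stack([np.ones_like(x), x])
    pi_hat = expit(Xp @ fit_logistic(Xp, a))   # estimated propensity score
    w = np.abs(a - pi_hat)                     # balancing weights |A - pi(X)|
    # Assumed outcome model: linear treatment-free part plus linear blip.
    D = np.column_stack([np.ones_like(x), x, a, a * x])
    sw = np.sqrt(w)
    # Weighted least squares via row scaling by sqrt(weight).
    coef, *_ = np.linalg.lstsq(D * sw[:, None], y * sw, rcond=None)
    return coef[2:]

def recommend(psi, x):
    """Recommend treatment (1) when the estimated blip psi0 + psi1 * x is positive."""
    return (psi[0] + psi[1] * x > 0).astype(int)
```

The weights |A - pi(X)| balance covariates across treatment groups, which is what lets the blip estimates tolerate a misspecified treatment-free model when the propensity model is correct, the double robustness referred to in the abstract.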
Affiliation(s)
- Marizeh Mussavi Rizi
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Joel A. Dubin
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Michael P. Wallace
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
4
Yu W, Bondell H. Bayesian Empirical Likelihood Regression for Semiparametric Estimation of Optimal Dynamic Treatment Regimes. Stat Med 2024; 43:5461-5472. [PMID: 39449157 DOI: 10.1002/sim.10251] [Received: 11/06/2023] [Revised: 08/06/2024] [Accepted: 10/02/2024] [Indexed: 10/26/2024]
Abstract
We propose a semiparametric approach to Bayesian modeling of dynamic treatment regimes that is built on a Bayesian likelihood-based regression estimation framework. Methods based on this framework exhibit a probabilistic coherence property that leads to accurate estimation of the optimal dynamic treatment regime. Unlike most Bayesian estimation methods, our proposed method avoids strong distributional assumptions for the intermediate and final outcomes by utilizing empirical likelihoods. Our proposed method allows for either linear, or more flexible forms of mean functions for the stagewise outcomes. A variational Bayes approximation is used for computation to avoid common pitfalls associated with Markov Chain Monte Carlo approaches coupled with empirical likelihood. Through simulations and analysis of the STAR*D sequential randomized trial data, our proposed method demonstrates superior accuracy over Q-learning and parametric Bayesian likelihood-based regression estimation, particularly when the parametric assumptions of regression error distributions may be potentially violated.
Affiliation(s)
- Weichang Yu
- School of Mathematics and Statistics, University of Melbourne, Parkville, Victoria, Australia
- Howard Bondell
- School of Mathematics and Statistics, University of Melbourne, Parkville, Victoria, Australia
5
Moodie EEM, Talbot D. On "Reflections on the concept of optimality of single decision point treatment regimes". Biom J 2023; 65:e2300027. [PMID: 37797173 DOI: 10.1002/bimj.202300027] [Received: 01/25/2023] [Revised: 04/26/2023] [Accepted: 06/22/2023] [Indexed: 10/07/2023]
Abstract
This is a discussion of "Reflections on the concept of optimality of single decision point treatment regimes" by Trung Dung Tran, Ariel Alonso Abad, Geert Verbeke, Geert Molenberghs, and Iven Van Mechelen. The authors propose a thoughtful consideration of optimization targets and the implications of such targets for the resulting optimal treatment rule. However, we contest the assertion that targets of optimization have been overlooked and suggest additional considerations that researchers must contemplate as part of a complete framework for learning about optimal treatment regimes.
Affiliation(s)
- Erica E M Moodie
- Department of Epidemiology & Biostatistics, McGill University, Montreal, Quebec, Canada
- Denis Talbot
- Department of Social and Preventive Medicine, Université Laval, Quebec, Canada
6
Zhang Y, Vock DM, Patrick ME, Murray TA. Modified interactive Q-learning for attenuating the impact of model misspecification with treatment effect heterogeneity. Stat Methods Med Res 2023; 32:2240-2253. [PMID: 37859598 PMCID: PMC10683339 DOI: 10.1177/09622802231206471] [Indexed: 10/21/2023]
Abstract
A sequential multiple assignment randomized trial, which incorporates multiple stages of randomization, is a popular approach for collecting data to inform personalized and adaptive treatments. There is an extensive literature on statistical methods to analyze data collected in sequential multiple assignment randomized trials and estimate the optimal dynamic treatment regime. Q-learning with linear regression is widely used for this purpose due to its ease of implementation. However, model misspecification is a common problem with this approach, and little attention has been given to the impact of model misspecification when treatment effects are heterogeneous across subjects. This article describes the integrative impact of two possible types of model misspecification related to treatment effect heterogeneity: omitted early-stage treatment effects in late-stage main effect model, and violated linearity assumption between pseudo-outcomes and predictors despite non-linearity arising from the optimization operation. The proposed method, aiming to deal with both types of misspecification concomitantly, builds interactive models into modified parametric Q-learning with Murphy's regret function. Simulations show that the proposed method is robust to both sources of model misspecification. The proposed method is applied to a two-stage sequential multiple assignment randomized trial with embedded tailoring aimed at reducing binge drinking in first-year college students.
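As background for the baseline the abstract refers to, standard two-stage Q-learning with linear regression can be sketched as below. This is a minimal illustration of the conventional approach, not the modified interactive method the article proposes. The helper names, the single scalar state per stage, and the -1/+1 treatment coding are assumptions chosen for brevity.

```python
import numpy as np

def design(x, a):
    """Design matrix [1, x, a, a*x] for a scalar state x and treatment a in {-1, +1}."""
    return np.column_stack([np.ones_like(x), x, a, a * x])

def fit_ols(X, y):
    """Ordinary least-squares coefficients for y ~ X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def q_learning_two_stage(x1, a1, x2, a2, y):
    """Backward-induction Q-learning with linear working models at both stages."""
    # Stage 2: regress the final outcome on the stage-2 state and treatment.
    beta2 = fit_ols(design(x2, a2), y)
    # Pseudo-outcome: predicted outcome under the better stage-2 treatment.
    q_plus = design(x2, np.ones_like(x2)) @ beta2
    q_minus = design(x2, -np.ones_like(x2)) @ beta2
    pseudo = np.maximum(q_plus, q_minus)
    # Stage 1: regress the pseudo-outcome on the stage-1 state and treatment.
    # The max above makes the pseudo-outcome non-linear in its predictors,
    # so this linear working model is generally misspecified.
    beta1 = fit_ols(design(x1, a1), pseudo)
    return beta1, beta2

def recommend(beta, x):
    """Recommend +1 when the estimated treatment contrast beta[2] + beta[3]*x is positive."""
    return np.where(beta[2] + beta[3] * x > 0, 1.0, -1.0)
```

The `np.maximum` step is where the second source of misspecification described in the abstract arises: even if the stage-2 model is correct, the stage-1 regression target is no longer linear.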
Affiliation(s)
- Yuan Zhang
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- David M Vock
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Megan E Patrick
- Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
- Thomas A Murray
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
7
Weisenthal SJ, Thurston SW, Ertefaie A. Relative sparsity for medical decision problems. Stat Med 2023; 42:3067-3092. [PMID: 37315949 PMCID: PMC10524900 DOI: 10.1002/sim.9755] [Received: 10/06/2022] [Revised: 03/24/2023] [Accepted: 04/02/2023] [Indexed: 06/16/2023]
Abstract
Existing statistical methods can estimate a policy, or a mapping from covariates to decisions, which can then instruct decision makers (eg, whether to administer hypotension treatment based on covariates blood pressure and heart rate). There is great interest in using such data-driven policies in healthcare. However, it is often important to explain to the healthcare provider, and to the patient, how a new policy differs from the current standard of care. This end is facilitated if one can pinpoint the aspects of the policy (ie, the parameters for blood pressure and heart rate) that change when moving from the standard of care to the new, suggested policy. To this end, we adapt ideas from Trust Region Policy Optimization (TRPO). In our work, however, unlike in TRPO, the difference between the suggested policy and standard of care is required to be sparse, aiding with interpretability. This yields "relative sparsity," where, as a function of a tuning parameter, $\lambda$, we can approximately control the number of parameters in our suggested policy that differ from their counterparts in the standard of care (eg, heart rate only). We propose a criterion for selecting $\lambda$, perform simulations, and illustrate our method with a real, observational healthcare dataset, deriving a policy that is easy to explain in the context of the current standard of care. Our work promotes the adoption of data-driven decision aids, which have great potential to improve health outcomes.
Affiliation(s)
- Samuel J. Weisenthal
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, New York, USA
- Medical Scientist Training Program, University of Rochester School of Medicine and Dentistry, New York, USA
- Sally W. Thurston
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, New York, USA
- Ashkan Ertefaie
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, New York, USA
8
Jones J, Ertefaie A, Shortreed SM. Rejoinder to "Reader reaction to 'Outcome-adaptive Lasso: Variable selection for causal inference' by Shortreed and Ertefaie (2017)". Biometrics 2023; 79:521-525. [PMID: 35579597 PMCID: PMC9669282 DOI: 10.1111/biom.13681] [Received: 02/15/2022] [Accepted: 04/19/2022] [Indexed: 11/28/2022]
Affiliation(s)
- Jeremiah Jones
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, New York, USA
- Ashkan Ertefaie
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, New York, USA
- Susan M Shortreed
- Biostatistics Division, Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
9
Li Z, Chen J, Laber E, Liu F, Baumgartner R. Optimal Treatment Regimes: A Review and Empirical Comparison. Int Stat Rev 2023. [DOI: 10.1111/insr.12536] [Indexed: 02/24/2023]
Affiliation(s)
- Zhen Li
- Department of Statistics, North Carolina State University, Raleigh, NC 27607, USA
- Jie Chen
- Department of Biometrics, Overland Pharmaceuticals, Dover, DE 19901, USA
- Eric Laber
- Department of Statistical Science, Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA
- Fang Liu
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., Kenilworth, NJ 07033, USA
- Richard Baumgartner
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., Kenilworth, NJ 07033, USA
10
Li C, Li W, Zhu W. Penalized robust learning for optimal treatment regimes with heterogeneous individualized treatment effects. J Appl Stat 2023; 51:1151-1170. [PMID: 38628447 PMCID: PMC11018073 DOI: 10.1080/02664763.2023.2180167] [Received: 07/19/2022] [Accepted: 02/05/2023] [Indexed: 02/22/2023]
Abstract
The growing popularity of personalized medicine motivates the exploration of individualized treatment regimes tailored to patients' heterogeneous characteristics. In large-scale data analysis, however, data are collected at different times and locations, i.e., subjects usually come from a heterogeneous population, so the optimal treatment regimes also vary across patient subgroups. In this paper, we focus on estimating optimal treatment regimes for subjects from a heterogeneous population with high-dimensional data. We first remove the main effects of the covariates for each subgroup to eliminate non-ignorable residual confounding. Based on the centralized outcome, we propose a penalized robust learning method that estimates the coefficient matrix of the interactions between covariates and treatment by penalizing pairwise differences of the coefficients of any two subgroups for the same covariate, which can automatically identify the latent complex structure of the coefficient matrix with heterogeneous and homogeneous columns. At the same time, the penalized robust learning method can also select the important variables that truly contribute to the individualized treatment decisions by using a common sparsity-inducing penalty. Extensive simulation studies show that our proposed method outperforms current popular methods, and it is further illustrated in a real-data analysis of the Tamoxifen breast cancer data.
Affiliation(s)
- Canhui Li
- Key Laboratory for Applied Statistics of MOE and School of Mathematics and Statistics, Northeast Normal University, Changchun, People's Republic of China
- Weirong Li
- Key Laboratory for Applied Statistics of MOE and School of Mathematics and Statistics, Northeast Normal University, Changchun, People's Republic of China
- Wensheng Zhu
- Key Laboratory for Applied Statistics of MOE and School of Mathematics and Statistics, Northeast Normal University, Changchun, People's Republic of China
11
Talbot D, Moodie EEM, Diorio C. Double robust estimation of optimal partially adaptive treatment strategies: An application to breast cancer treatment using hormonal therapy. Stat Med 2023; 42:178-192. [PMID: 36408723 DOI: 10.1002/sim.9608] [Received: 05/18/2022] [Revised: 09/17/2022] [Accepted: 11/05/2022] [Indexed: 11/22/2022]
Abstract
Precision medicine aims to tailor treatment decisions according to patients' characteristics. G-estimation and dynamic weighted ordinary least squares are double robust methods to identify optimal adaptive treatment strategies. It is underappreciated that they require modeling all existing treatment-confounder interactions to be consistent. Identifying optimal partially adaptive treatment strategies that tailor treatments according to only a few covariates, ignoring some interactions, may be preferable in practice. Building on G-estimation and dWOLS, we propose estimators of such partially adaptive strategies and demonstrate their double robustness. We investigate these estimators in a simulation study. Using data maintained by the Centre des Maladies du Sein, we estimate a partially adaptive treatment strategy for tailoring hormonal therapy use in breast cancer patients. R software implementing our estimators is provided.
Affiliation(s)
- Denis Talbot
- Département de médecine sociale et préventive, Université Laval, Québec, Canada; Axe santé des Populations et Pratiques Optimales en Santé, Centre de Recherche du CHU de Québec - Université Laval, Québec, Canada
- Erica E M Moodie
- Department of Epidemiology, Biostatistics & Occupational Health, McGill University, Québec, Canada
- Caroline Diorio
- Département de médecine sociale et préventive, Université Laval, Québec, Canada; Axe oncologie, Centre de recherche du CHU de Québec - Université Laval, Québec, Canada
12
Oh EJ, Qian M, Cheung YK. Generalization error bounds of dynamic treatment regimes in penalized regression-based learning. Ann Stat 2022. [DOI: 10.1214/22-aos2171] [Indexed: 11/19/2022]
Affiliation(s)
- Eun Jeong Oh
- Department of Biostatistics, Columbia University
- Min Qian
- Department of Biostatistics, Columbia University
13
Mo W, Liu Y. Efficient learning of optimal individualized treatment rules for heteroscedastic or misspecified treatment‐free effect models. J R Stat Soc Series B Stat Methodol 2021. [DOI: 10.1111/rssb.12474] [Indexed: 11/28/2022]
Affiliation(s)
- Weibin Mo
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Yufeng Liu
- Department of Statistics and Operations Research, Department of Genetics, Department of Biostatistics, Carolina Center for Genome Sciences, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA