Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Lu M, Sadiq S, Feaster DJ, Ishwaran H. Estimating Individual Treatment Effect in Observational Data Using Random Forest Methods. J Comput Graph Stat 2018;27:209-219. [PMID: 29706752 DOI: 10.1080/10618600.2017.1356325] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Indexed: 10/19/2022]

Cited by Other Article(s)

Babaei H, Alemohammad S, Baraniuk RG. Covariate Balancing Methods for Randomized Controlled Trials Are Not Adversarially Robust. IEEE Trans Neural Netw Learn Syst 2024;35:5014-5026. [PMID: 37104113 DOI: 10.1109/tnnls.2023.3266429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]

Dandl S, Bender A, Hothorn T. Heterogeneous treatment effect estimation for observational data using model-based forests. Stat Methods Med Res 2024;33:392-413. [PMID: 38332489 PMCID: PMC10981193 DOI: 10.1177/09622802231224628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2024]

Esposti R. Non-monetary motivations of the EU agri-environmental policy adoption. A causal forest approach. J Environ Manage 2024;352:119992. [PMID: 38194870 DOI: 10.1016/j.jenvman.2023.119992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 12/18/2023] [Accepted: 12/28/2023] [Indexed: 01/11/2024]

Endo Y, Alaimo L, Moazzam Z, Woldesenbet S, Lima HA, Munir MM, Shaikh CF, Yang J, Azap L, Katayama E, Guglielmi A, Ruzzenente A, Aldrighetti L, Alexandrescu S, Kitago M, Poultsides G, Sasaki K, Aucejo F, Pawlik TM. Postoperative morbidity after simultaneous versus staged resection of synchronous colorectal liver metastases: Impact of hepatic tumor burden. Surgery 2024;175:432-440. [PMID: 38001013 DOI: 10.1016/j.surg.2023.10.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 10/09/2023] [Accepted: 10/25/2023] [Indexed: 11/26/2023]

Abstract

BACKGROUND

We sought to characterize the risk of postoperative complications relative to the surgical approach and overall synchronous colorectal liver metastases tumor burden score.

METHODS

Patients with synchronous colorectal liver metastases who underwent curative-intent resection between 2000 and 2020 were identified from an international multi-institutional database. Propensity score matching was employed to control for heterogeneity between the 2 groups. A virtual twins analysis was performed to identify potential subgroups of patients who might benefit more from staged versus simultaneous resection.
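The abstract does not say which matching algorithm was used. As a minimal sketch of one common variant, the following assumes greedy 1:1 nearest-neighbour matching without replacement on propensity scores that have already been estimated. The function name `greedy_match`, the caliper value, and the patient identifiers are all hypothetical.

```python
from typing import Dict, List, Tuple


def greedy_match(
    treated: Dict[str, float],
    control: Dict[str, float],
    caliper: float = 0.05,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbour matching on propensity scores.

    `treated` and `control` map a (hypothetical) patient id to an already
    estimated propensity score. Each control is used at most once, and a
    pair is kept only if the two scores differ by at most `caliper`.
    """
    available = dict(control)  # controls that have not been matched yet
    pairs: List[Tuple[str, str]] = []
    # Visit treated patients in ascending score order so results are reproducible.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control; ties are broken by id.
        c_id = min(available, key=lambda cid: (abs(available[cid] - t_score), cid))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Illustrative scores only; they are not taken from the study.
simultaneous = {"s1": 0.30, "s2": 0.55, "s3": 0.90}
staged = {"g1": 0.28, "g2": 0.33, "g3": 0.58, "g4": 0.10}
matched_pairs = greedy_match(simultaneous, staged)
```

Patients without a control inside the caliper stay unmatched, which is one reason a matched cohort can be smaller than either original group.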

RESULTS

Among 976 patients who underwent liver resection for synchronous colorectal liver metastases, 589 patients (60.3%) had a staged approach, whereas 387 (39.7%) patients underwent simultaneous resection of the primary tumor and synchronous colorectal liver metastases. After propensity score matching, 295 patients who underwent each surgical approach were analyzed. Overall, the incidence of postoperative complications was 34.1% (n = 201). Among patients with high tumor burden scores, the surgical approach was associated with a higher incidence of postoperative complications; in contrast, among patients with low or medium tumor burden scores, the likelihood of complications did not differ based on the surgical approach. Virtual twins analysis demonstrated that preoperative tumor burden score was important to identify which subgroup of patients benefited most from staged versus simultaneous resection. Simultaneous resection was associated with better outcomes among patients with a tumor burden score <9 and a node-negative right-sided primary tumor; in contrast, staged resection was associated with better outcomes among patients with node-positive left-sided primary tumors and higher tumor burden score.

CONCLUSION

Among patients with high tumor burden scores, simultaneous resection of the primary tumor and liver metastases was associated with an increased incidence of postoperative complications.

Affiliation(s)

Yutaka Endo: Department of Surgery, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH
Laura Alaimo: Department of Surgery, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH; Department of Surgery, University of Verona, Italy
Zorays Moazzam: Department of Surgery, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH
Selamawit Woldesenbet: Department of Surgery, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH
Henrique A Lima: Department of Surgery, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH
Muhammad Musaab Munir: Department of Surgery, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH
Chanza F Shaikh: Department of Surgery, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH
Jason Yang: Department of Surgery, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH
Lovette Azap: Department of Surgery, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH
Erryk Katayama: Department of Surgery, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH
Alfredo Guglielmi: Department of Surgery, University of Verona, Italy
Andrea Ruzzenente: Department of Surgery, University of Verona, Italy
Luca Aldrighetti: Department of Surgery, Ospedale San Raffaele, Milan, Italy
Sorin Alexandrescu: Department of Surgery, Fundeni Clinical Institute, Bucharest, Romania
Minoru Kitago: Department of Surgery, Keio University, Tokyo, Japan
George Poultsides: Department of Surgery, Stanford University, CA
Kazunari Sasaki: Department of Surgery, Stanford University, CA
Federico Aucejo: Department of General Surgery, Cleveland Clinic Foundation, OH
Timothy M Pawlik: Department of Surgery, The Ohio State University Wexner Medical Center and James Comprehensive Cancer Center, Columbus, OH

Post RAJ, Petkovic M, van den Heuvel IL, van den Heuvel ER. Flexible Machine Learning Estimation of Conditional Average Treatment Effects: A Blessing and a Curse. Epidemiology 2024;35:32-40. [PMID: 37889951 DOI: 10.1097/ede.0000000000001684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2023]

Ferrario PG, Gedrich K. Machine learning and personalized nutrition: a promising liaison? Eur J Clin Nutr 2024;78:74-76. [PMID: 37833568 PMCID: PMC10774117 DOI: 10.1038/s41430-023-01350-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 09/12/2023] [Accepted: 09/20/2023] [Indexed: 10/15/2023]

Hu L. A new method for clustered survival data: Estimation of treatment effect heterogeneity and variable selection. Biom J 2024;66:e2200178. [PMID: 38072661 PMCID: PMC10953775 DOI: 10.1002/bimj.202200178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 07/31/2023] [Accepted: 08/11/2023] [Indexed: 01/30/2024]

Abstract

We recently developed a new method random-intercept accelerated failure time model with Bayesian additive regression trees (riAFT-BART) to draw causal inferences about population treatment effect on patient survival from clustered and censored survival data while accounting for the multilevel data structure. The practical utility of this method goes beyond the estimation of population average treatment effect. In this work, we exposit how riAFT-BART can be used to solve two important statistical questions with clustered survival data: estimating the treatment effect heterogeneity and variable selection. Leveraging the likelihood-based machine learning, we describe a way in which we can draw posterior samples of the individual survival treatment effect from riAFT-BART model runs, and use the drawn posterior samples to perform an exploratory treatment effect heterogeneity analysis to identify subpopulations who may experience differential treatment effects than population average effects. There is sparse literature on methods for variable selection among clustered and censored survival data, particularly ones using flexible modeling techniques. We propose a permutation-based approach using the predictor's variable inclusion proportion supplied by the riAFT-BART model for variable selection. To address the missing data issue frequently encountered in health databases, we propose a strategy to combine bootstrap imputation and riAFT-BART for variable selection among incomplete clustered survival data. We conduct an expansive simulation study to examine the practical operating characteristics of our proposed methods, and provide empirical evidence that our proposed methods perform better than several existing methods across a wide range of data scenarios. Finally, we demonstrate the methods via a case study of predictors for in-hospital mortality among severe COVID-19 patients and estimating the heterogeneous treatment effects of three COVID-specific medications. 
The methods developed in this work are readily available in the R package riAFTBART.

Zhang J, Zhang P, Ma J, Shentu Y. Covariate-adjusted value-guided subgroup identification via boosting. J Biopharm Stat 2023:1-18. [PMID: 37955423 DOI: 10.1080/10543406.2023.2275757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Accepted: 10/22/2023] [Indexed: 11/14/2023]

Johnson D, Lu W, Davidian M. A general framework for subgroup detection via one-step value difference estimation. Biometrics 2023;79:2116-2126. [PMID: 35793474 PMCID: PMC10694635 DOI: 10.1111/biom.13711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2021] [Accepted: 06/15/2022] [Indexed: 11/29/2022]

Baird A, Cheng Y, Xia Y. Determinants of outpatient substance use disorder treatment length-of-stay and completion: the case of a treatment program in the southeast U.S. Sci Rep 2023;13:13961. [PMID: 37633996 PMCID: PMC10460408 DOI: 10.1038/s41598-023-41350-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Accepted: 08/24/2023] [Indexed: 08/28/2023] Open

Raja S, Rice TW, Lu M, Semple ME, Blackstone EH, Murthy SC, Ahmad U, McNamara M, Toth AJ, Hemant I. Adjuvant Therapy After Neoadjuvant Therapy for Esophageal Cancer: Who Needs It? Ann Surg 2023;278:e240-e249. [PMID: 35997269 PMCID: PMC10955553 DOI: 10.1097/sla.0000000000005679] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 11/26/2022]

Ghosh S, Feng Z, Bian J, Butler K, Prosperi M. DR-VIDAL - Doubly Robust Variational Information-theoretic Deep Adversarial Learning for Counterfactual Prediction and Treatment Effect Estimation on Real World Data. AMIA Annu Symp Proc 2023;2022:485-494. [PMID: 37128454 PMCID: PMC10148269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]

Blette BS, Granholm A, Li F, Shankar-Hari M, Lange T, Munch MW, Møller MH, Perner A, Harhay MO. Causal Bayesian machine learning to assess treatment effect heterogeneity by dexamethasone dose for patients with COVID-19 and severe hypoxemia. Sci Rep 2023;13:6570. [PMID: 37085591 PMCID: PMC10120498 DOI: 10.1038/s41598-023-33425-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/03/2023] [Accepted: 04/12/2023] [Indexed: 04/23/2023] Open

Affiliation(s)

Bryan S Blette: Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Clinical Trials Methods and Outcomes Lab, Palliative and Advanced Illness Research (PAIR) Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
Anders Granholm: Department of Intensive Care, Rigshospitalet-Copenhagen University Hospital, Copenhagen, Denmark; Collaboration for Research in Intensive Care, Copenhagen, Denmark
Fan Li: Department of Biostatistics, Yale University School of Public Health, New Haven, CT, USA; Center for Methods in Implementation and Prevention Science, Yale University School of Public Health, New Haven, CT, USA
Manu Shankar-Hari: Centre for Inflammation Research, University of Edinburgh, Edinburgh, UK
Theis Lange: Section of Biostatistics, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
Marie Warrer Munch: Department of Intensive Care, Rigshospitalet-Copenhagen University Hospital, Copenhagen, Denmark; Collaboration for Research in Intensive Care, Copenhagen, Denmark
Morten Hylander Møller: Department of Intensive Care, Rigshospitalet-Copenhagen University Hospital, Copenhagen, Denmark; Collaboration for Research in Intensive Care, Copenhagen, Denmark
Anders Perner: Department of Intensive Care, Rigshospitalet-Copenhagen University Hospital, Copenhagen, Denmark; Collaboration for Research in Intensive Care, Copenhagen, Denmark
Michael O Harhay: Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Clinical Trials Methods and Outcomes Lab, Palliative and Advanced Illness Research (PAIR) Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Division of Pulmonary and Critical Care, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, 304 Blockley Hall, 423 Guardian Drive, Philadelphia, PA, 19104-6021, USA

Rekkas A, Rijnbeek PR, Kent DM, Steyerberg EW, van Klaveren D. Estimating individualized treatment effects from randomized controlled trials: a simulation study to compare risk-based approaches. BMC Med Res Methodol 2023;23:74. [PMID: 36977990 PMCID: PMC10045909 DOI: 10.1186/s12874-023-01889-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Accepted: 03/15/2023] [Indexed: 03/30/2023] Open

Guo X, Wei W, Liu M, Cai T, Wu C, Wang J. Assessing the Most Vulnerable Subgroup to Type II Diabetes Associated with Statin Usage: Evidence from Electronic Health Record Data. J Am Stat Assoc 2023;118:1488-1499. [PMID: 38223220 PMCID: PMC10786632 DOI: 10.1080/01621459.2022.2157727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Accepted: 11/21/2022] [Indexed: 12/23/2022]

Hapfelmeier A, On BI, Mühlau M, Kirschke JS, Berthele A, Gasperi C, Mansmann U, Wuschek A, Bussas M, Boeker M, Bayas A, Senel M, Havla J, Kowarik MC, Kuhn K, Gatz I, Spengler H, Wiestler B, Grundl L, Sepp D, Hemmer B. Retrospective cohort study to devise a treatment decision score predicting adverse 24-month radiological activity in early multiple sclerosis. Ther Adv Neurol Disord 2023;16:17562864231161892. [PMID: 36993939 PMCID: PMC10041597 DOI: 10.1177/17562864231161892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Accepted: 02/19/2023] [Indexed: 03/31/2023] Open

Abstract

Background

Multiple sclerosis (MS) is a chronic neuroinflammatory disease affecting about 2.8 million people worldwide. Disease course after the most common diagnoses of relapsing-remitting multiple sclerosis (RRMS) and clinically isolated syndrome (CIS) is highly variable and cannot be reliably predicted. This impairs early personalized treatment decisions.

Objectives

The main objective of this study was to algorithmically support clinical decision-making regarding the options of early platform medication or no immediate treatment of patients with early RRMS and CIS.

Design

Retrospective monocentric cohort study within the Data Integration for Future Medicine (DIFUTURE) Consortium.

Methods

Multiple data sources of routine clinical, imaging and laboratory data derived from a large and deeply characterized cohort of patients with MS were integrated to conduct a retrospective study to create and internally validate a treatment decision score [Multiple Sclerosis Treatment Decision Score (MS-TDS)] through model-based random forests (RFs). The MS-TDS predicts the probability of no new or enlarging lesions in cerebral magnetic resonance images (cMRIs) between 6 and 24 months after the first cMRI.

Results

Data from 65 predictors collected for 475 patients between 2008 and 2017 were included. No medication and platform medication were administered to 277 (58.3%) and 198 (41.7%) patients. The MS-TDS predicted individual outcomes with a cross-validated area under the receiver operating characteristics curve (AUROC) of 0.624. The respective RF prediction model provides patient-specific MS-TDS and probabilities of treatment success. The latter may increase by 5-20% for half of the patients if the treatment considered superior by the MS-TDS is used.
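For readers unfamiliar with the AUROC reported above, the following is a minimal sketch of how the statistic can be computed from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auroc` and the toy data are assumptions for illustration; the study's cross-validation procedure is not reproduced here.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` holds 1 for a positive outcome and 0 for a negative one;
    `scores` holds the predicted probability (or any ranking score).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data, not from the study: 3 of the 4 positive/negative pairs are ordered correctly.
example = auroc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.3])
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, so the reported 0.624 sits closer to the chance end of that range.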

Conclusion

Routine clinical data from multiple sources can be successfully integrated to build prediction models to support treatment decision-making. In this study, the resulting MS-TDS estimates individualized treatment success probabilities that can identify patients who benefit from early platform medication. External validation of the MS-TDS is required, and a prospective study is currently being conducted. In addition, the clinical relevance of the MS-TDS needs to be established.

Affiliation(s)

Alexander Hapfelmeier
Begum Irmak On: Institute for Medical Information Processing, Biometry, and Epidemiology, Ludwig-Maximilians-Universität in Munich, Munich, Germany; Data Integration for Future Medicine (DIFUTURE) Consortium, Munich, Germany
Mark Mühlau: Department of Neurology, Klinikum rechts der Isar School of Medicine, Technical University of Munich, Munich, Germany
Jan S. Kirschke: Department of Diagnostic and Interventional Neuroradiology, Klinikum rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
Achim Berthele: Department of Neurology, Klinikum rechts der Isar School of Medicine, Technical University of Munich, Munich, Germany
Christiane Gasperi: Department of Neurology, Klinikum rechts der Isar School of Medicine, Technical University of Munich, Munich, Germany
Ulrich Mansmann: Institute for Medical Information Processing, Biometry, and Epidemiology, Ludwig-Maximilians-Universität in Munich, Munich, Germany; Data Integration for Future Medicine (DIFUTURE) Consortium, Munich, Germany
Alexander Wuschek: Department of Neurology, Klinikum rechts der Isar School of Medicine, Technical University of Munich, Munich, Germany
Matthias Bussas: Department of Neurology, Klinikum rechts der Isar School of Medicine, Technical University of Munich, Munich, Germany
Martin Boeker: Institute of AI and Informatics in Medicine, School of Medicine, Technical University of Munich, Munich, Germany; Data Integration for Future Medicine (DIFUTURE) Consortium, Munich, Germany
Antonios Bayas: Department of Neurology, Medical Faculty, University of Augsburg, Augsburg, Germany; Data Integration for Future Medicine (DIFUTURE) Consortium, Munich, Germany
Makbule Senel: Department of Neurology, Ulm University Hospital, Ulm, Germany; Data Integration for Future Medicine (DIFUTURE) Consortium, Munich, Germany
Joachim Havla: Institute of Clinical Neuroimmunology, LMU Hospital, Ludwig-Maximilians-Universität in Munich, Munich, Germany; Data Integration for Future Medicine (DIFUTURE) Consortium, Munich, Germany
Markus C. Kowarik: Department of Neurology & Stroke and Hertie-Institute for Clinical Brain Research, Eberhard-Karls University of Tübingen, Tübingen, Germany; Data Integration for Future Medicine (DIFUTURE) Consortium, Munich, Germany
Klaus Kuhn: Institute of AI and Informatics in Medicine, School of Medicine, Technical University of Munich, Munich, Germany; Data Integration for Future Medicine (DIFUTURE) Consortium, Munich, Germany
Ingrid Gatz: Institute of AI and Informatics in Medicine, School of Medicine, Technical University of Munich, Munich, Germany; Data Integration for Future Medicine (DIFUTURE) Consortium, Munich, Germany
Helmut Spengler: Institute of AI and Informatics in Medicine, School of Medicine, Technical University of Munich, Munich, Germany; Data Integration for Future Medicine (DIFUTURE) Consortium, Munich, Germany
Benedikt Wiestler: Department of Diagnostic and Interventional Neuroradiology, Klinikum rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
Lioba Grundl: Department of Diagnostic and Interventional Neuroradiology, Klinikum rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
Dominik Sepp: Department of Diagnostic and Interventional Neuroradiology, Klinikum rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
Bernhard Hemmer: Department of Neurology, Klinikum rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany; Data Integration for Future Medicine (DIFUTURE) Consortium, Munich, Germany; Munich Cluster for Systems Neurology (SyNergy), Munich, Germany

Xu J, Wei K, Wang C, Huang C, Xue Y, Zhang R, Qin G, Yu Y. Estimation of average treatment effect based on a multi-index propensity score. BMC Med Res Methodol 2022;22:337. [PMID: 36577950 PMCID: PMC9795597 DOI: 10.1186/s12874-022-01822-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Accepted: 12/16/2022] [Indexed: 12/29/2022] Open

Abstract

BACKGROUND

Estimating the average effect of a treatment, exposure, or intervention on health outcomes is a primary aim of many medical studies. However, unbalanced covariates between groups can lead to confounding bias when using observational data to estimate the average treatment effect (ATE). In this study, we proposed an estimator to correct confounding bias and provide multiple protection for estimation consistency.

METHODS

With reference to the kernel function-based double-index propensity score (Ker.DiPS) estimator, we proposed the artificial neural network-based multi-index propensity score (ANN.MiPS) estimator. The ANN.MiPS estimator employed the artificial neural network to estimate the MiPS that combines the information from multiple candidate models for propensity score and outcome regression. A Monte Carlo simulation study was designed to evaluate the performance of the proposed ANN.MiPS estimator. Furthermore, we applied our estimator to real data to discuss its practicability.

RESULTS

The simulation study showed the bias of the ANN.MiPS estimators is very small and the standard error is similar if any one of the candidate models is correctly specified under all evaluated sample sizes, treatment rates, and covariate types. Compared to the kernel function-based estimator, the ANN.MiPS estimator usually yields smaller standard error when the correct model is incorporated in the estimator. The empirical study indicated the point estimation for ATE and its bootstrap standard error of the ANN.MiPS estimator is stable under different model specifications.

CONCLUSIONS

The proposed estimator extended the combination of information from two models to multiple models and achieved multiply robust estimation for ATE. Extra efficiency was gained by our estimator compared to the kernel-based estimator. The proposed estimator provided a novel approach for estimating the causal effects in observational studies.

Hu L, Ji J, Liu H, Ennis R. A Flexible Approach for Assessing Heterogeneity of Causal Treatment Effects on Patient Survival Using Large Datasets with Clustered Observations. Int J Environ Res Public Health 2022;19:14903. [PMID: 36429621 PMCID: PMC9690785 DOI: 10.3390/ijerph192214903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Revised: 11/08/2022] [Accepted: 11/09/2022] [Indexed: 06/16/2023]

Baird A, Cheng Y, Xia Y. Use of machine learning to examine disparities in completion of substance use disorder treatment. PLoS One 2022;17:e0275054. [PMID: 36149868 PMCID: PMC9506659 DOI: 10.1371/journal.pone.0275054] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/01/2022] [Accepted: 09/11/2022] [Indexed: 11/19/2022] Open

Abstract

The objective of this work is to examine disparities in the completion of substance use disorder treatment in the U.S. Our data is from the Treatment Episode Dataset Discharge (TEDS-D) datasets from the U.S. Substance Abuse and Mental Health Services Administration (SAMHSA) for 2017–2019. We apply a two-stage virtual twins model (random forest + decision tree) where, in the first stage (random forest), we determine differences in treatment completion probability associated with race/ethnicity, income source, no co-occurrence of mental health disorders, gender (biological), no health insurance, veteran status, age, and primary substance (alcohol or opioid). In the second stage (decision tree), we identify subgroups associated with probability differences, where such subgroups are more or less likely to complete treatment. We find the subgroups most likely to complete substance use disorder treatment, when the subgroup represents more than 1% of the sample, are those with no mental health condition co-occurrence (4.8% more likely when discharged from an ambulatory outpatient treatment program, representing 62% of the sample; and 10% more likely for one of the more specifically defined subgroups representing 10% of the sample), an income source of job-related wages/salary (4.3% more likely when not having used in the 30 days primary to discharge and when primary substance is not alcohol only, representing 28% of the sample), and white non-Hispanics (2.7% more likely when discharged from residential long-term treatment, representing 9% of the sample). Important implications are that: 1) those without a co-occurring mental health condition are the most likely to complete treatment, 2) those with job related wages or income are more likely to complete treatment, and 3) racial/ethnicity disparities persist in favor of white non-Hispanic individuals seeking to complete treatment. Thus, additional resources may be needed to combat such disparities.
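The two-stage structure described above can be sketched in a few lines. This is not the authors' implementation: to stay dependency-free, k-nearest-neighbour averaging is swapped in for the random forest in stage one, and a single-split stump is swapped in for the decision tree in stage two. All names (`stage_one`, `stage_two`, `group`, `completed`) and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]


def knn_mean(x: Row, pool: List[Tuple[Row, float]], k: int) -> float:
    """Mean outcome of the k pool members closest to x (squared Euclidean distance)."""
    if not pool:
        raise ValueError("each group needs at least one other member")
    nearest = sorted(pool, key=lambda item: sum((a - b) ** 2 for a, b in zip(x, item[0])))[:k]
    return sum(outcome for _, outcome in nearest) / len(nearest)


def stage_one(X: List[Row], group: List[int], completed: List[float], k: int = 2) -> List[float]:
    """Stage 1: estimated outcome difference (group 1 minus group 0) per individual.

    Assumption: k-NN averaging stands in for the random forest of the abstract.
    """
    diffs = []
    for i, xi in enumerate(X):
        # Leave individual i out so it is never its own neighbour.
        pool1 = [(X[j], completed[j]) for j in range(len(X)) if j != i and group[j] == 1]
        pool0 = [(X[j], completed[j]) for j in range(len(X)) if j != i and group[j] == 0]
        diffs.append(knn_mean(xi, pool1, k) - knn_mean(xi, pool0, k))
    return diffs


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def stage_two(X: List[Row], diffs: List[float]) -> Tuple[int, float]:
    """Stage 2: covariate index and threshold of the best single split on diffs.

    Assumption: a one-split stump stands in for the decision tree.
    """
    best = None
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [d for row, d in zip(X, diffs) if row[f] <= thr]
            right = [d for row, d in zip(X, diffs) if row[f] > thr]
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, f, thr)
    if best is None:
        raise ValueError("no covariate takes more than one value")
    return best[1], best[2]


# Toy data, not from the study: the group difference exists only when covariate 0 is >= 5.
X, group, completed = [], [], []
for x in range(10):
    for g in (0, 1):
        for _ in range(3):
            X.append([float(x), float(x % 3)])
            group.append(g)
            completed.append(1.0 if (g == 1 and x >= 5) else 0.0)

split = stage_two(X, stage_one(X, group, completed))
```

The second stage only describes where the stage-one differences vary; it inherits whatever bias the first-stage estimates carry, so the resulting subgroups are exploratory.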

Rostami M, Saarela O. Targeted L₁-Regularization and Joint Modeling of Neural Networks for Causal Inference. Entropy (Basel) 2022;24:1290. [PMID: 36141175 PMCID: PMC9497603 DOI: 10.3390/e24091290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Revised: 09/07/2022] [Accepted: 09/08/2022] [Indexed: 06/16/2023]

Shi J, Norgeot B. Learning Causal Effects From Observational Data in Healthcare: A Review and Summary. Front Med (Lausanne) 2022;9:864882. [PMID: 35872797 PMCID: PMC9300826 DOI: 10.3389/fmed.2022.864882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Accepted: 06/17/2022] [Indexed: 11/29/2022] Open

Cai H, Lu W, Marceau West R, Mehrotra DV, Huang L. CAPITAL: Optimal subgroup identification via constrained policy tree search. Stat Med 2022;41:4227-4244. [PMID: 35799329 PMCID: PMC9544117 DOI: 10.1002/sim.9507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2021] [Revised: 05/04/2022] [Accepted: 06/06/2022] [Indexed: 11/10/2022]

Xu J, Guo Y, Wang F, Xu H, Lucero R, Bian J, Prosperi M. Protocol for the development of a reporting guideline for causal and counterfactual prediction models in biomedicine. BMJ Open 2022;12:e059715. [PMID: 35725267 PMCID: PMC9214357 DOI: 10.1136/bmjopen-2021-059715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022] Open

Abstract

INTRODUCTION

While there are guidelines for reporting on observational studies (eg, Strengthening the Reporting of Observational Studies in Epidemiology, Reporting of Studies Conducted Using Observational Routinely Collected Health Data Statement), on estimation of causal effects from both observational data and randomised experiments (eg, A Guideline for Reporting Mediation Analyses of Randomised Trials and Observational Studies, Consolidated Standards of Reporting Trials, PATH) and on prediction modelling (eg, Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis), none is purposely made for deriving and validating models from observational data to predict counterfactuals for individuals on one or more possible interventions, on the basis of given (or inferred) causal structures. This paper describes methods and processes that will be used to develop a Reporting Guideline for Causal and Counterfactual Prediction Models (PRECOG).

METHODS AND ANALYSIS

PRECOG will be developed following published guidance from the Enhancing the Quality and Transparency of Health Research (EQUATOR) network and will comprise five stages. Stage 1 will be meetings of a working group every other week with rotating external advisors (active until stage 5). Stage 2 will comprise a systematic review of literature on counterfactual prediction modelling for biomedical sciences (registered in Prospective Register of Systematic Reviews). In stage 3, a computer-based, real-time Delphi survey will be performed to consolidate the PRECOG checklist, involving experts in causal inference, epidemiology, statistics, machine learning, informatics and protocols/standards. Stage 4 will involve the write-up of the PRECOG guideline based on the results from the prior stages. Stage 5 will seek the peer-reviewed publication of the guideline, the scoping/systematic review and dissemination.

ETHICS AND DISSEMINATION

The study will follow the principles of the Declaration of Helsinki. The study has been registered in EQUATOR and approved by the University of Florida's Institutional Review Board (#202200495). Informed consent will be obtained from the working groups and the Delphi survey participants. The dissemination of PRECOG and its products will be done through journal publications, conferences, websites and social media.

Prosperi M, Boucher C, Bian J, Marini S. Assessing putative bias in prediction of anti-microbial resistance from real-world genotyping data under explicit causal assumptions. Artif Intell Med 2022;130:102326. [PMID: 35809965 PMCID: PMC9425730 DOI: 10.1016/j.artmed.2022.102326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2021] [Revised: 05/11/2022] [Accepted: 05/23/2022] [Indexed: 11/02/2022]

Abstract

Whole genome sequencing (WGS) is quickly becoming the customary means for identification of antimicrobial resistance (AMR) due to its ability to obtain high resolution information about the genes and mechanisms that are causing resistance and driving pathogen mobility. By contrast, traditional phenotypic (antibiogram) testing cannot easily elucidate such information. Yet development of AMR prediction tools from genotype-phenotype data can be biased, since sampling is non-randomized. Sample provenience, period of collection, and species representation can confound the association of genetic traits with AMR. Thus, prediction models can perform poorly on new data with sampling distribution shifts. In this work, under an explicit set of causal assumptions, we evaluate the effectiveness of propensity-based rebalancing and confounding adjustment on antibiotic resistance prediction using genotype-phenotype AMR data from the Pathosystems Resource Integration Center (PATRIC). We select bacterial genotypes (encoded as k-mer signatures, i.e., DNA fragments of length k), country, year, species, and AMR phenotypes for the tetracycline drug class, preparing test data with recent genomes coming from a single country. We test boosted logistic regression (BLR) and random forests (RF) with/without bias-handling. On 10,936 instances, we find evidence of species, location and year imbalance with respect to the AMR phenotype. The crude versus bias-adjusted change in effect of genetic signatures on AMR varies but only moderately (selecting the top 20,000 out of 40+ million k-mers). The area under the receiver operating characteristic (AUROC) of the RF (0.95) is comparable to that of BLR (0.94) on both out-of-bag samples from bootstrap and the external test (n = 1085), where AUROCs do not decrease. We observe a 1%-5% gain in AUROC with bias-handling compared to the sole use of genetic signatures. In conclusion, we recommend using causally-informed prediction methods for modeling real-world AMR data; however, traditional adjustment or propensity-based methods may not provide advantage in all use cases and further methodological development should be sought.

Caron A, Baio G, Manolopoulou I. Shrinkage Bayesian Causal Forests for Heterogeneous Treatment Effects Estimation. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2067549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]

Zhou N, Brook RD, Dinov ID, Wang L. Optimal dynamic treatment regime estimation using information extraction from unstructured clinical text. Biom J 2022;64:805-817. [PMID: 35112726 PMCID: PMC9185731 DOI: 10.1002/bimj.202100077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2021] [Revised: 10/18/2021] [Accepted: 10/21/2021] [Indexed: 11/10/2022]

Abstract

The wide-scale adoption of electronic health records (EHRs) provides extensive information to support precision medicine and personalized health care. In addition to structured EHRs, we leverage free-text clinical information extraction (IE) techniques to estimate optimal dynamic treatment regimes (DTRs), a sequence of decision rules that dictate how to individualize treatments to patients based on treatment and covariate history. The proposed IE of patient characteristics closely resembles "The clinical Text Analysis and Knowledge Extraction System" and employs named entity recognition, boundary detection, and negation annotation. It also utilizes regular expressions to extract numerical information. Combining the proposed IE with optimal DTR estimation, we extract derived patient characteristics and use tree-based reinforcement learning (T-RL) to estimate multistage optimal DTRs. IE significantly improved the estimation in counterfactual outcome models compared to using structured EHR data alone, which often include incomplete data, data entry errors, and other potentially unobserved risk factors. Moreover, including IE in optimal DTR estimation provides larger study cohorts and a broader pool of candidate tailoring variables. We demonstrate the performance of our proposed method via simulations and an application using clinical records to guide blood pressure control treatments among critically ill patients with severe acute hypertension. This joint estimation approach improves the accuracy of identifying the optimal treatment sequence by 14-24% compared to traditional inference without using IE, based on our simulations over various scenarios. In the blood pressure control application, we successfully extracted significant blood pressure predictors that are unobserved or partially missing from structured EHR.
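The regular-expression step mentioned in the abstract above can be illustrated with a minimal Python sketch. The pattern, the function name and the sample note are hypothetical and are not taken from the authors' pipeline; the sketch only shows how a numerical value such as a blood pressure reading might be pulled out of free text.

```python
import re
from typing import List, Tuple

# Hypothetical pattern (an assumption, not the authors' rule set): matches
# phrasings such as "BP 212/118" or "blood pressure was 165 / 95".
BP_PATTERN = re.compile(
    r"\b(?:bp|blood\s+pressure)\b\s*(?:is|of|was)?\s*[:=]?\s*"
    r"(\d{2,3})\s*/\s*(\d{2,3})",
    re.IGNORECASE,
)


def extract_blood_pressures(note: str) -> List[Tuple[int, int]]:
    """Return every (systolic, diastolic) pair found in a free-text note."""
    return [(int(sys_bp), int(dia_bp)) for sys_bp, dia_bp in BP_PATTERN.findall(note)]


if __name__ == "__main__":
    # Invented example note, for illustration only.
    note = (
        "Pt admitted with severe HTN. BP 212/118 on arrival; "
        "blood pressure was 165 / 95 after treatment."
    )
    print(extract_blood_pressures(note))
```

A real extraction system would also need negation handling and unit checks, which this sketch leaves out.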

Zhou T, Ji Y. Incorporating external data into the analysis of clinical trials via Bayesian additive regression trees. Stat Med 2021;40:6421-6442. [PMID: 34494288 DOI: 10.1002/sim.9191] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2021] [Revised: 08/18/2021] [Accepted: 08/21/2021] [Indexed: 11/06/2022]

Zhong Y, Kennedy EH, Bodnar LM, Naimi AI. AIPW: An R Package for Augmented Inverse Probability-Weighted Estimation of Average Causal Effects. Am J Epidemiol 2021;190:2690-2699. [PMID: 34268567 PMCID: PMC8796813 DOI: 10.1093/aje/kwab207] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2020] [Revised: 07/09/2021] [Accepted: 07/13/2021] [Indexed: 12/26/2022] Open

Hoogland J, IntHout J, Belias M, Rovers MM, Riley RD, Harrell FE Jr, Moons KGM, Debray TPA, Reitsma JB. A tutorial on individualized treatment effect prediction from randomized trials with a binary endpoint. Stat Med 2021;40:5961-5981. [PMID: 34402094 PMCID: PMC9291969 DOI: 10.1002/sim.9154] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2019] [Revised: 06/08/2021] [Accepted: 07/19/2021] [Indexed: 12/23/2022]

Hatch SG, Lobaina D, Doss BD. Optimizing Coaching During Web-Based Relationship Education for Low-Income Couples: Protocol for Precision Medicine Research. JMIR Res Protoc 2021;10:e33047. [PMID: 34734838 PMCID: PMC8603166 DOI: 10.2196/33047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2021] [Accepted: 09/03/2021] [Indexed: 11/29/2022] Open

Abstract

Background

In-person relationship education classes funded by the federal government tend to experience relatively high attrition rates and have only a limited effect on relationships. In contrast, low-income couples tend to report meaningful gains from web-based relationship education when provided with individualized coach contact. However, little is known about the method and intensity of practitioner contact that a couple requires to complete the web-based program and receive the intended benefit.

Objective

The aim of this study is to use within-group models to create an algorithm to assign future couples to different programs and levels of coach contact, identify the most powerful predictors of treatment adherence and gains in relationship satisfaction within 3 different levels of coaching, and examine the most powerful predictors of treatment adherence and gains in relationship satisfaction among the 3 levels of coach contact.

Methods

To accomplish these goals, this project intends to use data from a web-based Sequential Multiple Assignment Randomized Trial of the OurRelationship and web-based Prevention and Relationship Enhancement programs, in which the method and type of coach contact were randomly varied across 1248 couples (2496 individuals), with the hope of advancing theory in this area and generating accurate predictions. This study was funded by the US Department of Health and Human Services, Administration for Children and Families (grant number 90PD0309).

Results

Data collection from the Sequential Multiple Assignment Randomized Trial of the OurRelationship and web-based Prevention and Relationship Enhancement Program was completed in October of 2020.

Conclusions

Some of the direct benefits of this study include benefits to social services program administrators, tailoring of more effective relationship education, and effective delivery of evidence- and web-based relationship health interventions.

International Registered Report Identifier (IRRID)

DERR1-10.2196/33047

Raja S, Rice TW, Murthy SC, Ahmad U, Semple ME, Blackstone EH, Ishwaran H. Value of Lymphadenectomy in Patients Receiving Neoadjuvant Therapy for Esophageal Adenocarcinoma. Ann Surg 2021;274:e320-e327. [PMID: 31850981 PMCID: PMC7295683 DOI: 10.1097/sla.0000000000003598] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]

Zhang Y, Sabbaghi A. The Designed Bootstrap for Causal Inference in Big Observational Data. J Stat Theory Pract 2021. [DOI: 10.1007/s42519-021-00213-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]

Hu L, Ji J, Li F. Estimating heterogeneous survival treatment effect in observational data using machine learning. Stat Med 2021;40:4691-4713. [PMID: 34114252 PMCID: PMC9827499 DOI: 10.1002/sim.9090] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2020] [Revised: 05/16/2021] [Accepted: 05/19/2021] [Indexed: 01/12/2023]

Sun LZ, Wu C, Li X, Chen C, Schmidt EV. Independent action models and prediction of combination treatment effects for response rate, duration of response and tumor size change in oncology drug development. Contemp Clin Trials 2021;106:106434. [PMID: 34004341 DOI: 10.1016/j.cct.2021.106434] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Revised: 03/05/2021] [Accepted: 05/10/2021] [Indexed: 11/16/2022]

Prosperi M, Salemi M, Ghosh S, Lyu T, Bian J, Chen Z, Zhao J. Causal AI with Real World Data: Do Statins Protect from Alzheimer's Disease Onset? ICMHI 2021 (2021) 2021;2021:296-303. [PMID: 37954527 PMCID: PMC10636706 DOI: 10.1145/3472813.3473206] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/14/2023]

Qi W, Abu-Hanna A, van Esch TEM, de Beurs D, Liu Y, Flinterman LE, Schut MC. Explaining heterogeneity of individual treatment causal effects by subgroup discovery: An observational case study in antibiotics treatment of acute rhino-sinusitis. Artif Intell Med 2021;116:102080. [PMID: 34020753 DOI: 10.1016/j.artmed.2021.102080] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2020] [Revised: 03/09/2021] [Accepted: 04/20/2021] [Indexed: 11/19/2022]

Abstract

OBJECTIVES

Individuals may respond differently to the same treatment, and there is a need to understand such heterogeneity of causal individual treatment effects. We propose and evaluate a modelling approach to better understand this heterogeneity from observational studies by identifying patient subgroups with a markedly deviating response to treatment. We illustrate this approach in a primary care case-study of antibiotic (AB) prescription on recovery from acute rhino-sinusitis (ARS).

METHODS

Our approach consists of four stages and is applied to a large primary care dataset of 24,392 patients suspected of suffering from ARS. We first identify pre-treatment variables that either confound the relationship between treatment and outcome or are risk factors of the outcome. Second, based on the pre-treatment variables we create Synthetic Random Forest (SRF) models to compute the potential outcomes and subsequently the causal individual treatment effect (ITE) estimates. Third, we perform subgroup discovery using the ITE estimates as outcomes to identify positive and negative responders. Fourth, we evaluate the predictive performance of the identified subgroups for predicting the outcome in two ways: the likelihood ratio test, and whether the subgroups are selected via the Akaike Information Criterion (AIC) using backward stepwise variable selection. We validate the whole modelling strategy by means of 10-fold cross-validation.

RESULTS

Based on 20 pre-treatment variables, four subgroups (three for positive responders and one for negative responders) were identified. The log likelihood ratio tests showed that the subgroups were significant. Variable selection using the AIC kept two of the four subgroups, one for positive responders and one for negative responders. As for the validation of the whole modelling strategy, all reported measures (the number of pre-treatment variables associated with the outcome, number of subgroups, number of subgroups surviving variable selection and coverage) showed little variation.

CONCLUSIONS

With the proposed approach, we identified subgroups of positive and negative responders to treatment that markedly deviate from the mean response. The subgroups showed added predictive value for the outcome. The modelling strategy was shown to be robust on this dataset. Our approach was thus able to discover understandable subgroups from observational data that have predictive value and which may be considered by the clinical users to get insight into who responds positively or negatively to a proposed treatment.
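The two evaluation criteria named in the methods of this abstract, the likelihood ratio test and the AIC, follow standard formulas that can be sketched in a few lines of Python. The log-likelihood values and parameter counts below are invented for illustration, the function names are hypothetical, and the p-value helper assumes the nested models differ by exactly one parameter (for example, a single subgroup indicator).

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def likelihood_ratio_statistic(ll_reduced: float, ll_full: float) -> float:
    """LR statistic for nested models: 2 (log L_full - log L_reduced)."""
    return 2.0 * (ll_full - ll_reduced)


def lrt_p_value_df1(statistic: float) -> float:
    """Chi-square upper tail probability for 1 degree of freedom only.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)) for X ~ chi-square(1).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


if __name__ == "__main__":
    # Hypothetical fits: outcome model without and with one subgroup indicator.
    ll_reduced, k_reduced = -520.0, 5
    ll_full, k_full = -516.0, 6

    stat = likelihood_ratio_statistic(ll_reduced, ll_full)
    print("LR statistic:", stat)
    print("p-value (df = 1):", lrt_p_value_df1(stat))
    print("AIC reduced:", aic(ll_reduced, k_reduced))
    print("AIC full:", aic(ll_full, k_full))
```

With these made-up numbers the subgroup indicator would be retained under both criteria; comparisons involving more than one added parameter need a general chi-square tail function instead of the one-degree-of-freedom shortcut.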

Prosperi M, Guo Y, Bian J. Bagged random causal networks for interventional queries on observational biomedical datasets. J Biomed Inform 2021;115:103689. [PMID: 33548542 DOI: 10.1016/j.jbi.2021.103689] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2020] [Revised: 12/30/2020] [Accepted: 01/23/2021] [Indexed: 11/30/2022]

Abstract

Learning causal effects from observational data, e.g. estimating the effect of a treatment on survival by data-mining electronic health records (EHRs), can be biased due to unmeasured confounders, mediators, and colliders. When the causal dependencies among features/covariates are expressed in the form of a directed acyclic graph, using do-calculus it is possible to identify one or more adjustment sets for eliminating the bias on a given causal query under certain assumptions. However, prior knowledge of the causal structure might be only partial; algorithms for causal structure discovery often provide ambiguous solutions, and their computational complexity becomes practically intractable when the feature sets grow large. We hypothesize that the estimation of the true causal effect of a causal query on an outcome can be approximated as an ensemble of lower complexity estimators, namely bagged random causal networks. A bagged random causal network is an ensemble of subnetworks constructed by sampling the feature subspaces (with the query, the outcome, and a random number of other features), drawing conditional dependencies among the features, and inferring the corresponding adjustment sets. The causal effect can then be estimated by any regression function of the outcome by the query paired with the adjustment sets. Through simulations and a real-world clinical dataset (class III malocclusion data), we show that the bagged estimator is, in most cases, consistent with the true causal effect if the structure is known, has a good variance/bias trade-off when the structure is unknown (estimated using heuristics), has lower computational complexity than learning a full network, and outperforms boosted regression. In conclusion, the bagged random causal network is well-suited to estimate query-target causal effects from observational studies on EHR and other high-dimensional biomedical databases.

Adil SM, Elahi C, Gramer R, Spears CA, Fuller AT, Haglund MM, Dunn TW. Predicting the Individual Treatment Effect of Neurosurgery for Patients with Traumatic Brain Injury in the Low-Resource Setting: A Machine Learning Approach in Uganda. J Neurotrauma 2020;38:928-939. [PMID: 33054545 DOI: 10.1089/neu.2020.7262] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022] Open

Abstract

Traumatic brain injury (TBI) disproportionately affects low- and middle-income countries (LMICs). In these low-resource settings, effective triage of patients with TBI, including the decision of whether or not to perform neurosurgery, is critical in optimizing patient outcomes and healthcare resource utilization. Machine learning may allow for effective predictions of patient outcomes both with and without surgery. Data from patients with TBI were collected prospectively at Mulago National Referral Hospital in Kampala, Uganda, from 2016 to 2019. One linear and six non-linear machine learning models were designed to predict good versus poor outcome near hospital discharge and internally validated using nested five-fold cross-validation. The 13 predictors included clinical variables easily acquired on admission and whether or not the patient received surgery. Using an elastic-net regularized logistic regression model (GLMnet), with predictions calibrated using Platt scaling, the probability of poor outcome was calculated for each patient both with and without surgery (with the difference quantifying the "individual treatment effect," ITE). Relative ITE represents the percent reduction in chance of poor outcome, equaling this ITE divided by the probability of poor outcome with no surgery. Ultimately, 1766 patients were included. Areas under the receiver operating characteristic curve (AUROCs) ranged from 83.1% (single C5.0 ruleset) to 88.5% (random forest), with the GLMnet at 87.5%. The two variables promoting good outcomes in the GLMnet model were high Glasgow Coma Scale score and receiving surgery. For the subgroup not receiving surgery, the median relative ITE was 42.9% (interquartile range [IQR], 32.7% to 53.5%); similarly, in those receiving surgery, it was 43.2% (IQR, 32.9% to 54.3%). We provide the first machine learning-based model to predict TBI outcomes with and without surgery in LMICs, thus enabling more effective surgical decision making in the resource-limited setting. Predicted ITE similarity between surgical and non-surgical groups suggests that, currently, patients are not being chosen optimally for neurosurgical intervention. Our clinical decision aid has the potential to improve outcomes.
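The ITE and relative ITE definitions given in this abstract amount to simple arithmetic on two predicted probabilities. The Python sketch below works through one invented patient; the function names and the probabilities are hypothetical, and the sign convention (no-surgery probability minus surgery probability, so that a positive value means surgery lowers the chance of a poor outcome) is an assumption chosen to match the "percent reduction" wording.

```python
def individual_treatment_effect(p_poor_no_surgery: float, p_poor_surgery: float) -> float:
    """Absolute reduction in the predicted probability of a poor outcome.

    Assumed sign convention: positive means surgery is predicted to help.
    """
    return p_poor_no_surgery - p_poor_surgery


def relative_ite(p_poor_no_surgery: float, p_poor_surgery: float) -> float:
    """ITE divided by the predicted probability of poor outcome with no surgery."""
    if p_poor_no_surgery <= 0.0:
        raise ValueError("relative ITE is undefined when the no-surgery risk is zero")
    return individual_treatment_effect(p_poor_no_surgery, p_poor_surgery) / p_poor_no_surgery


if __name__ == "__main__":
    # Invented predictions for a single patient (not data from the study).
    p_no, p_yes = 0.40, 0.23
    print("ITE:", round(individual_treatment_effect(p_no, p_yes), 3))
    print("Relative ITE:", round(relative_ite(p_no, p_yes), 3))
```

In this made-up case the absolute risk reduction is 17 percentage points, which corresponds to a relative reduction of roughly 42.5% of the no-surgery risk.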

Zhang P, Ma J, Chen X, Shentu Y. A nonparametric method for value function guided subgroup identification via gradient tree boosting for censored survival data. Stat Med 2020;39:4133-4146. [PMID: 32786155 DOI: 10.1002/sim.8714] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2020] [Revised: 06/08/2020] [Accepted: 07/09/2020] [Indexed: 11/07/2022]

Hu L, Li L, Ji J. Machine learning to identify and understand key factors for provider-patient discussions about smoking. Prev Med Rep 2020;20:101238. [PMID: 33224719 PMCID: PMC7666379 DOI: 10.1016/j.pmedr.2020.101238] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2020] [Revised: 10/07/2020] [Accepted: 10/20/2020] [Indexed: 12/15/2022] Open

Liang M, Yu M. A Semiparametric Approach to Model Effect Modification. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1811099] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]

Garcia-Montemayor V, Martin-Malo A, Barbieri C, Bellocchio F, Soriano S, Pendon-Ruiz de Mier V, Molina IR, Aljama P, Rodriguez M. Predicting mortality in hemodialysis patients using machine learning analysis. Clin Kidney J 2020;14:1388-1395. [PMID: 34221370 PMCID: PMC8247746 DOI: 10.1093/ckj/sfaa126] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2020] [Accepted: 06/10/2020] [Indexed: 12/18/2022] Open

Prosperi M, Guo Y, Sperrin M, Koopman JS, Min JS, He X, Rich S, Wang M, Buchan IE, Bian J. Causal inference and counterfactual prediction in machine learning for actionable healthcare. Nat Mach Intell 2020;2:369-375. [DOI: 10.1038/s42256-020-0197-y] [Citation(s) in RCA: 68] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Sun M, Cai Y, Zhang K, Zhao X, Chen Z. A method to analyze the sensitivity ranking of various abiotic factors to acoustic densities of fishery resources in the surface mixed layer and bottom cold water layer of the coastal area of low latitude: a case study in the northern South China Sea. Sci Rep 2020;10:11128. [PMID: 32636512 DOI: 10.1038/s41598-020-67387-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2020] [Accepted: 06/04/2020] [Indexed: 11/09/2022] Open

Yadlowsky S, Pellegrini F, Lionetto F, Braune S, Tian L. Estimation and Validation of Ratio-based Conditional Average Treatment Effects Using Observational Data. J Am Stat Assoc 2020;116:335-352. [PMID: 33767517 PMCID: PMC7985957 DOI: 10.1080/01621459.2020.1772080] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2019] [Revised: 04/20/2020] [Accepted: 05/16/2020] [Indexed: 10/24/2022]

Sugasawa S, Noma H. Efficient screening of predictive biomarkers for individual treatment selection. Biometrics 2020;77:249-257. [PMID: 32294246 DOI: 10.1111/biom.13279] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2019] [Revised: 03/27/2020] [Accepted: 03/30/2020] [Indexed: 01/18/2023]

Wongvibulsin S, Wu KC, Zeger SL. Clinical risk prediction with random forests for survival, longitudinal, and multivariate (RF-SLAM) data analysis. BMC Med Res Methodol 2019;20:1. [PMID: 31888507 PMCID: PMC6937754 DOI: 10.1186/s12874-019-0863-0] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2019] [Accepted: 11/08/2019] [Indexed: 12/23/2022] Open

Abstract

Background

Clinical research and medical practice can be advanced through the prediction of an individual’s health state, trajectory, and responses to treatments. However, the majority of current clinical risk prediction models are based on regression approaches or machine learning algorithms that are static, rather than dynamic. To benefit from the increasing emergence of large, heterogeneous data sets, such as electronic health records (EHRs), novel tools to support improved clinical decision making through methods for individual-level risk prediction that can handle multiple variables, their interactions, and time-varying values are necessary.

Methods

We introduce a novel dynamic approach to clinical risk prediction for survival, longitudinal, and multivariate (SLAM) outcomes, called random forest for SLAM data analysis (RF-SLAM). RF-SLAM is a continuous-time, random forest method for survival analysis that combines the strengths of existing statistical and machine learning methods to produce individualized Bayes estimates of piecewise-constant hazard rates. We also present a method-agnostic approach for time-varying evaluation of model performance.

Results

We derive and illustrate the method by predicting sudden cardiac arrest (SCA) in the Left Ventricular Structural (LV) Predictors of Sudden Cardiac Death (SCD) Registry. We demonstrate superior performance relative to standard random forest methods for survival data. We illustrate the importance of the number of preceding heart failure hospitalizations as a time-dependent predictor in SCA risk assessment.

Conclusions

RF-SLAM is a novel statistical and machine learning method that improves risk prediction by incorporating time-varying information and accommodating a large number of predictors, their interactions, and missing values. RF-SLAM is designed to easily extend to simultaneous predictions of multiple, possibly competing, events and/or repeated measurements of discrete or continuous variables over time. Trial registration: LV Structural Predictors of SCD Registry (clinicaltrials.gov, NCT01076660), retrospectively registered 25 February 2010.
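To make the notion of a piecewise-constant hazard rate in this abstract concrete, the following Python sketch converts such a hazard into a survival probability through the standard identity S(t) = exp(-H(t)), where H(t) is the cumulative hazard. It does not reproduce RF-SLAM itself; the function name, interval boundaries and hazard values are hypothetical.

```python
import math
from typing import Sequence


def survival_from_piecewise_hazard(
    breaks: Sequence[float], hazards: Sequence[float], t: float
) -> float:
    """Survival probability at time t under a piecewise-constant hazard.

    breaks[k] is the start of interval k (ascending, breaks[0] == 0) and
    hazards[k] is the constant hazard on [breaks[k], breaks[k + 1]).
    The last interval is treated as open-ended.
    """
    if len(breaks) != len(hazards):
        raise ValueError("breaks and hazards must have the same length")
    cumulative_hazard = 0.0
    for k, hazard in enumerate(hazards):
        start = breaks[k]
        end = breaks[k + 1] if k + 1 < len(breaks) else math.inf
        if t <= start:
            break
        # Add hazard times the portion of this interval lived through by time t.
        cumulative_hazard += hazard * (min(t, end) - start)
    return math.exp(-cumulative_hazard)


if __name__ == "__main__":
    # Invented yearly hazards for one individual, rising over time.
    breaks = [0.0, 1.0, 2.0]
    hazards = [0.1, 0.2, 0.4]
    for time_point in (0.0, 1.5, 3.0):
        print(time_point, survival_from_piecewise_hazard(breaks, hazards, time_point))
```

Because the hazards are allowed to change from interval to interval, updated covariate values can shift an individual's predicted risk over time, which is the dynamic behaviour the abstract emphasises.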

Żegleń M, Marini E, Cabras S, Kryst Ł, Das R, Chakraborty A, Dasgupta P. The relationship among the age at menarche, anthropometric characteristics, and socio-economic factors in Bengali girls from Kolkata, India. Am J Hum Biol 2019;32:e23380. [PMID: 31875347 DOI: 10.1002/ajhb.23380] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2019] [Revised: 12/05/2019] [Accepted: 12/09/2019] [Indexed: 11/12/2022] Open

Rice TW, Lu M, Ishwaran H, Blackstone EH. Precision Surgical Therapy for Adenocarcinoma of the Esophagus and Esophagogastric Junction. J Thorac Oncol 2019;14:2164-2175. [PMID: 31442498 PMCID: PMC6876319 DOI: 10.1016/j.jtho.2019.08.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2019] [Revised: 07/31/2019] [Accepted: 08/05/2019] [Indexed: 12/12/2022]

Abstract

INTRODUCTION

To facilitate the initial clinical decision regarding whether to use esophagectomy alone or neoadjuvant therapy in surgical care for individual patients with adenocarcinoma of the esophagus and esophagogastric junction (information not available from randomized trials), a machine-learning analysis was performed using worldwide real-world data on patients undergoing different therapies for this rare adenocarcinoma.

METHODS

Using random forest technology in a sequential analysis, we (1) identified eligibility for each of four therapies among 13,365 patients: esophagectomy alone (n = 6649), neoadjuvant therapy (n = 4706), esophagectomy and adjuvant therapy (n = 998), and neoadjuvant and adjuvant therapy (n = 1022); (2) performed survival analyses incorporating interactions of patient and cancer characteristics with therapy; (3) determined optimal therapy as that predicted to maximize lifetime within 10 years (restricted mean survival time; RMST) for each patient; and (4) compared lifetime gained from optimal versus actual therapies.
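Step (3) above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each therapy yields a predicted step-function survival curve for a patient, computes RMST as the area under that curve up to 10 years, and picks the therapy with the largest value. The function name `rmst`, the event times, and the survival probabilities are all hypothetical, and only two of the four therapies are shown for brevity.

```python
def rmst(times, surv, tau=10.0):
    """Restricted mean survival time: area under a step survival curve on [0, tau].

    times: increasing times (years) at which the curve drops
    surv:  survival probability just after each time in `times`
    The curve is assumed to equal 1 before the first drop.
    """
    area, prev_t, height = 0.0, 0.0, 1.0
    for t, s in zip(times, surv):
        if t >= tau:
            break
        area += (t - prev_t) * height  # rectangle up to this drop
        prev_t, height = t, s
    return area + (tau - prev_t) * height  # final rectangle up to tau


# Hypothetical predicted survival curves for one patient (illustrative values only)
event_times = [1.0, 3.0, 6.0, 9.0]
predicted = {
    "esophagectomy alone": [0.90, 0.70, 0.55, 0.45],
    "neoadjuvant therapy": [0.95, 0.80, 0.60, 0.50],
}

rmst_by_therapy = {name: rmst(event_times, s) for name, s in predicted.items()}
optimal = max(rmst_by_therapy, key=rmst_by_therapy.get)
print(rmst_by_therapy, optimal)
```

Under these made-up curves the second therapy has the larger RMST, so it would be labeled optimal for this patient; summing each patient's RMST over a cohort gives totals of the kind used in step (4).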

RESULTS

Actual therapy was optimal in 61% of those receiving esophagectomy alone; neoadjuvant therapy was optimal for 36% receiving neoadjuvant therapy. Many patients were predicted to benefit from postoperative adjuvant therapy. Total RMST for actual therapy received was 58,825 years. Had patients received optimal therapy, total RMST was predicted to be 62,982 years, a 7% gain.
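The reported 7% gain follows directly from the two totals quoted above; as a worked check (the rounding to one decimal place is ours):

```latex
\frac{62{,}982 - 58{,}825}{58{,}825} = \frac{4{,}157}{58{,}825} \approx 0.071 \quad (\text{about } 7\%)
```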

CONCLUSIONS

Average treatment effect for adenocarcinoma of the esophagus yields only crude evidence-based therapy guidelines. However, patient response to therapy is widely variable, and survival after data-driven predicted optimal therapy often differs from actual therapy received. Therapy must address an individual patient's cancer and clinical characteristics to provide precision surgical therapy for adenocarcinoma of the esophagus and esophagogastric junction.

Grigoryan H, Schiffman C, Gunter MJ, Naccarati A, Polidoro S, Dagnino S, Dudoit S, Vineis P, Rappaport SM. Cys34 Adductomics Links Colorectal Cancer with the Gut Microbiota and Redox Biology. Cancer Res 2019;79:6024-6031. [PMID: 31641032 PMCID: PMC6891211 DOI: 10.1158/0008-5472.can-19-1529] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 05/15/2019] [Revised: 08/21/2019] [Accepted: 10/11/2019] [Indexed: 12/12/2022]

Abstract

Chronic inflammation is an established risk factor for colorectal cancer. To study reactive products of gut inflammation and redox signaling on colorectal cancer development, we used untargeted adductomics to detect adduct features in prediagnostic serum from the EPIC Italy cohort. We focused on modifications to Cys34 in human serum albumin, which is responsible for scavenging small reactive electrophiles that might initiate cancers. Employing a combination of statistical methods, we selected seven Cys34 adducts associated with colorectal cancer, as well as body mass index (BMI; a well-known risk factor). Five adducts were more abundant in colorectal cancer cases than controls and clustered with each other, suggesting a common pathway. Because two of these adducts were Cys34 modifications by methanethiol, a microbial-human cometabolite, and crotonaldehyde, a product of lipid peroxidation, these findings further implicate infiltration of gut microbes into the intestinal mucosa and the corresponding inflammatory response as causes of colorectal cancer. The other two associated adducts were Cys34 disulfides of homocysteine that were less abundant in colorectal cancer cases than controls and may implicate homocysteine metabolism as another causal pathway. The selected adducts and BMI ranked higher as potentially causal factors than variables previously associated with colorectal cancer (smoking, alcohol consumption, physical activity, and total meat consumption). Regressions of case-control differences in adduct levels on days to diagnosis showed no statistical evidence that disease progression, rather than causal factors at recruitment, contributed to the observed differences. These findings support the hypothesis that infiltration of gut microbes into the intestinal mucosa and the resulting inflammation are causal factors for colorectal cancer. 
SIGNIFICANCE: Infiltration of gut microbes into the intestinal mucosa and the resulting inflammation are causal factors for colorectal cancer.
