1
Baker WL, Moore TE, Baron E, Kittleson M, Parker WF, Jaiswal A. A systematic review of reporting and handling of missing data in observational studies using the UNOS database. J Heart Lung Transplant 2025; 44:462-468. [PMID: 39521197 DOI: 10.1016/j.healun.2024.10.023] [Received: 07/25/2024] [Revised: 10/21/2024] [Accepted: 10/28/2024] [Indexed: 11/16/2024]
Abstract
BACKGROUND Missing data decrease study power and introduce bias, thereby undermining a registry's ability to draw valid inferences. We evaluated how missing data are reported and addressed in heart transplantation (HT) studies using the United Network for Organ Sharing (UNOS) database. METHODS We conducted a systematic literature search of Medline from January 1, 2018 through August 22, 2023 and included studies that used the UNOS database to evaluate adult (≥18 years) de novo HT recipients. We collected details on the study population, timeframe, primary end-point, use of missing data, and whether and what methods were used to handle missing data. Approaches were classified as variable selection, complete case analysis (CCA), missing indicator method, single imputation, or multiple imputation. RESULTS Of the 229 included studies, 67 (29.3%) limited their cohorts to those without missing data for the outcome or key variables and 93 (40.6%) reported missing data in their final cohort. Seventy-eight (34.1%) studies reported how they handled missing data in their statistical modeling. Of these, CCA was the most used (n = 41, 52.6%), followed by multiple imputation (n = 22, 28.2%) and other methods (n = 15, 19.2%). Thirty-one (13.5%) studies reported removing covariates from their analysis because of missingness. CONCLUSIONS Only about a third of the identified UNOS database studies reported how they handled missing data in their analysis, and strategies varied. Although no single approach to handling missing data exists, methods are available that can improve upon the most commonly used approaches. Future best practices should include explicit reporting of missingness, detailed methods, and sensitivity checks.
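Three of the approach classes named in the abstract (complete case analysis, the missing indicator method, and single imputation) can be illustrated with a minimal sketch. The toy records, the variable names `age` and `creatinine`, and the helper functions are hypothetical and are not taken from the reviewed studies or from the UNOS database.

```python
from statistics import mean

# Toy records; None marks a missing value. All values are invented for illustration.
records = [
    {"age": 54, "creatinine": 1.1},
    {"age": 61, "creatinine": None},
    {"age": 47, "creatinine": 0.9},
    {"age": 66, "creatinine": None},
    {"age": 58, "creatinine": 1.6},
]


def complete_cases(rows, keys):
    """Complete case analysis: keep only rows with no missing value in `keys`."""
    return [r for r in rows if all(r[k] is not None for k in keys)]


def missing_indicator(rows, key, fill=0.0):
    """Missing indicator method: fill gaps with a constant and add a 0/1 flag column."""
    out = []
    for r in rows:
        row = dict(r)
        row[key + "_missing"] = 1 if r[key] is None else 0
        if r[key] is None:
            row[key] = fill
        out.append(row)
    return out


def mean_impute(rows, key):
    """Single imputation: replace each missing value with the mean of observed values."""
    observed_mean = mean(r[key] for r in rows if r[key] is not None)
    return [dict(r, **{key: observed_mean if r[key] is None else r[key]}) for r in rows]


cca = complete_cases(records, ["age", "creatinine"])
flagged = missing_indicator(records, "creatinine")
imputed = mean_impute(records, "creatinine")
print(len(cca), "complete cases out of", len(records))
```

Complete case analysis shrinks the sample, while the other two keep every row but treat the filled-in values as if they were known; neither property is a recommendation here, only a description of what each sketch does.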
Affiliation(s)
- William L Baker
  - Department of Pharmacy Practice, University of Connecticut School of Pharmacy, Storrs, Connecticut
- Timothy E Moore
  - Statistical Consulting Services (Center for Open Research Resources & Equipment), University of Connecticut, Storrs, Connecticut
- Eric Baron
  - Servier Pharmaceuticals, Boston, Massachusetts
- Michelle Kittleson
  - Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California
- William F Parker
  - Departments of Medicine and Public Health Sciences, University of Chicago Medicine, Chicago, Illinois
- Abhishek Jaiswal
  - Hartford HealthCare Heart and Vascular Institute, Hartford Hospital, Hartford, Connecticut
2
Weymann D, Krebs E, Regier DA. Addressing immortal time bias in precision medicine: Practical guidance and methods development. Health Serv Res 2025; 60:e14376. [PMID: 39225454 PMCID: PMC11782076 DOI: 10.1111/1475-6773.14376] [Indexed: 09/04/2024]
Abstract
OBJECTIVE To compare theoretical strengths and limitations of common immortal time adjustment methods, propose a new approach using multiple imputation (MI), and provide practical guidance for using MI in precision medicine evaluations centered on a real-world case study. STUDY SETTING AND DESIGN Methods comparison, guidance, and real-world case study based on previous literature. We compared landmark analysis, time-distribution matching, time-dependent analysis, and our proposed MI application. Guidance for MI spanned (1) selecting the imputation method; (2) specifying and applying the imputation model; and (3) conducting comparative analysis and pooling estimates. Our case study used a matched cohort design to evaluate overall survival benefits of whole-genome and transcriptome analysis, a precision medicine technology, compared to usual care for advanced cancers, and applied both time-distribution matching and MI. Bootstrap simulation characterized imputation sensitivity to varying data missingness and sample sizes. DATA SOURCES AND ANALYTIC SAMPLE Case study used population-based administrative data and single-arm precision medicine program data from British Columbia, Canada for the study period 2012 to 2015. PRINCIPAL FINDINGS While each method described can reduce immortal time bias, MI offers theoretical advantages. Compared to alternative approaches, MI minimizes information loss and better characterizes statistical uncertainty about the true length of the immortal time period, avoiding false precision. Additionally, MI explicitly considers the impacts of patient characteristics on immortal time distributions, with inclusion criteria and follow-up period definitions that do not inadvertently risk biasing evaluations. In the real-world case study, survival analysis results did not substantively differ across MI and time distribution matching, but standard errors based on MI were higher for all point estimates. 
Mean imputed immortal time was stable across simulations. CONCLUSIONS Precision medicine evaluations must employ immortal time adjustment methods for unbiased, decision-grade real-world evidence generation. MI is a promising solution to the challenge of immortal time bias.
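The third guidance step, pooling estimates across imputed datasets, is commonly done with Rubin's rules. The sketch below assumes that standard approach; the abstract does not state which pooling formula the authors used, and the function name `pool_rubin` and the example numbers are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: point estimates, one per imputed dataset (at least two)
    variances: squared standard errors, one per imputed dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)           # pooled point estimate
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


# Hypothetical log hazard ratios and variances from three imputed datasets.
pooled, pooled_se = pool_rubin([0.5, 0.6, 0.7], [0.04, 0.04, 0.04])
print(round(pooled, 3), round(pooled_se, 3))
```

Because the between-imputation term is added to the within-imputation variance, the pooled standard error is never smaller than the average single-dataset standard error, which is consistent with the abstract's observation that MI-based standard errors were higher.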
Affiliation(s)
- Deirdre Weymann
  - Cancer Control Research, BC Cancer, Vancouver, British Columbia, Canada
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
- Emanuel Krebs
  - Cancer Control Research, BC Cancer, Vancouver, British Columbia, Canada
- Dean A. Regier
  - Cancer Control Research, BC Cancer, Vancouver, British Columbia, Canada
  - School of Population and Public Health, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
3
Ren W, Liu Z, Wu Y, Zhang Z, Hong S, Liu H. Moving Beyond Medical Statistics: A Systematic Review on Missing Data Handling in Electronic Health Records. HEALTH DATA SCIENCE 2024; 4:0176. [PMID: 39635227 PMCID: PMC11615160 DOI: 10.34133/hds.0176] [Received: 04/07/2024] [Accepted: 07/23/2024] [Indexed: 12/07/2024]
Abstract
Background: Missing data in electronic health records (EHRs) present significant challenges in medical studies. Many methods have been proposed, but uncertainty exists regarding the current state of missing data methods applied to EHRs and which strategy performs better within specific contexts. Methods: All studies referencing EHR and missing data methods published from database inception until March 30, 2024 were searched via the MEDLINE, EMBASE, and Digital Bibliography and Library Project databases. The characteristics of the included studies were extracted. We also compared the performance of various methods under different missingness scenarios. Results: After screening, 46 studies published between 2010 and 2024 were included. Three missingness mechanisms were simulated when evaluating the missing data methods: missing completely at random (29/46), missing at random (20/46), and missing not at random (21/46). Multiple imputation by chained equations (MICE) was the most popular statistical method, whereas generative adversarial network-based methods and the k-nearest neighbor (KNN) classification were the common deep-learning-based or traditional machine-learning-based methods, respectively. Among the 26 articles comparing the performance of medical statistical and machine learning approaches, traditional machine learning or deep learning methods generally outperformed statistical methods. Med.KNN and context-aware time-series imputation performed better for longitudinal datasets, whereas probabilistic principal component analysis and MICE-based methods were optimal for cross-sectional datasets. Conclusions: Machine learning methods show significant promise for addressing missing data in EHRs. However, no single approach provides a universally generalizable solution. Standardized benchmarking analyses are essential to evaluate these methods across different missingness scenarios.
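The three missingness mechanisms named in the Results can be simulated on synthetic data. The following is a minimal sketch under simple assumptions: one fully observed covariate `x`, one partially observed variable `y`, and threshold-based missingness probabilities. The function name and the probability rules are hypothetical and are not drawn from any of the reviewed studies.

```python
import random


def simulate_missingness(rows, mechanism, rate=0.3, seed=0):
    """Mask y in a list of (x, y) pairs; x is always observed.

    MCAR: every y is masked with the same probability.
    MAR:  the masking probability depends only on the observed x.
    MNAR: the masking probability depends on the value of y itself.
    """
    rng = random.Random(seed)
    masked = []
    for x, y in rows:
        if mechanism == "MCAR":
            p = rate
        elif mechanism == "MAR":
            p = min(1.0, 2 * rate) if x > 0 else 0.0
        elif mechanism == "MNAR":
            p = min(1.0, 2 * rate) if y > 0 else 0.0
        else:
            raise ValueError("unknown mechanism: " + str(mechanism))
        masked.append((x, None if rng.random() < p else y))
    return masked


# Synthetic data: y is correlated with x.
data_rng = random.Random(42)
rows = []
for _ in range(200):
    x = data_rng.gauss(0, 1)
    rows.append((x, x + data_rng.gauss(0, 1)))

for mechanism in ("MCAR", "MAR", "MNAR"):
    masked = simulate_missingness(rows, mechanism)
    n_missing = sum(1 for _, y in masked if y is None)
    print(mechanism, "missing:", n_missing, "of", len(masked))
```

Benchmarks of the kind the review calls for would then apply each candidate imputation method to the masked data and compare the imputed values with the held-back truth.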
Affiliation(s)
- Wenhui Ren
  - Department of Clinical Epidemiology and Biostatistics, Peking University People’s Hospital, Beijing, China
- Zheng Liu
  - Department of Clinical Epidemiology and Biostatistics, Peking University People’s Hospital, Beijing, China
- Yanqiu Wu
  - Department of Clinical Epidemiology and Biostatistics, Peking University People’s Hospital, Beijing, China
- Zhilong Zhang
  - National Institute of Health Data Science, Peking University, Beijing, China
  - Institute of Medical Technology, Health Science Center of Peking University, Beijing, China
- Shenda Hong
  - National Institute of Health Data Science, Peking University, Beijing, China
- Huixin Liu
  - Department of Clinical Epidemiology and Biostatistics, Peking University People’s Hospital, Beijing, China
4
Hawwash NK, Sperrin M, Martin GP, Sinha R, Matthews CE, Ricceri F, Tjønneland A, Heath AK, Neuhouser ML, Joshu CE, Platz EA, Freisling H, Gunter MJ, Renehan AG. Excess weight by degree and duration and cancer risk (ABACus2 consortium): a cohort study and individual participant data meta-analysis. EClinicalMedicine 2024; 78:102921. [PMID: 39640936 PMCID: PMC11617392 DOI: 10.1016/j.eclinm.2024.102921] [Received: 03/03/2024] [Revised: 10/16/2024] [Accepted: 10/23/2024] [Indexed: 12/07/2024]
Abstract
Background Elevated body mass index (BMI) ≥25 kg/m² is a major preventable cause of cancer. A single BMI measure does not capture the degree and duration of exposure to excess BMI. We investigate associations between adulthood overweight-years, incorporating exposure time to BMI ≥25 kg/m², and cancer incidence, and compare this with single BMI. Methods In this cohort study and individual participant data meta-analysis, we obtained data from the ABACus 2 Consortium, consisting of four US cohorts: Atherosclerosis Risk in Communities (ARIC) study (1987-2015), Women's Health Initiative (WHI; 1991 to 2005 [main study], to 2010 [Extension 1], and to 2020 [Extension 2]), Prostate, Lung, Colorectal, Ovarian Cancer Screening (PLCO) Trial (1993-2009), NIH-AARP Diet and Health Study (1996-2011), and one European cohort, the European Prospective Investigation into Cancer and Nutrition (EPIC; participants enrolled in 1990 and administrative censoring was centre-specific). Participants with at least 3 BMI measurements and complete cancer follow-up data were included. We calculated overweight-years: degree of overweight (BMI ≥25 kg/m²) multiplied by the duration of overweight (years). Using random effects two-stage individual participant data meta-analyses, associations between cancer and overweight-years, single BMI, cumulative overweight degree and duration, measured at the same time and captured over a median of 41 years in men and 39 years in women, were evaluated with Cox proportional hazards models. Models were age-adjusted or multivariable (MV) adjusted for baseline age, ethnicity, alcohol, smoking and hormone replacement therapy (HRT). Harrell's C-statistics of the metrics were compared. This study is registered at PROSPERO, CRD42021238270. Findings 720,210 participants, including 312,132 men and 408,078 women, were followed up for cancer incidence over a median 9.85 years (interquartile range (IQR) 8.03, 11.67) in men and 10.80 years (IQR 6.05, 15.55) in women.
12,959 men (4.15%) and 36,509 women (8.95%) were diagnosed with obesity-related cancer. Hazard ratios for obesity-related cancers in men, per 1 standard deviation (SD) overweight-years were 1.15 (95% CI: 1.14, 1.16, I²: 0%) age-adjusted and 1.15 (95% CI: 1.13, 1.17, I²: 0%) MV-adjusted and per 1 SD increment in single BMI were 1.17 (95% CI: 1.16, 1.18, I²: 0%) age-adjusted and 1.16 (95% CI: 1.15, 1.18, I²: 0%) MV-adjusted. The HR for overweight-years in women per 1 SD increment was 1.08 (95% CI: 1.04, 1.13, I²: 82%) age-adjusted and 1.08 (95% CI: 1.04, 1.13, I²: 83%) MV-adjusted and per 1 SD increment in single BMI was 1.10 (95% CI: 1.07, 1.14, I²: 72%) age-adjusted and 1.11 (95% CI: 1.07, 1.15, I²: 79%) MV-adjusted. C-statistics for overweight-years and single BMI for obesity-related cancers were 0.612 (95% CI: 0.578, 0.646) and 0.611 (95% CI: 0.578, 0.644) respectively for men and 0.566 (95% CI: 0.534, 0.598) and 0.573 (95% CI: 0.546, 0.600) for women. Interpretation Adulthood degree and duration of excess BMI were associated with cancer risk. Both factors should be considered in cancer prevention strategies and policies. This study only focused on adulthood exposure to excess BMI, so the minimal differences in the predictive performance between adiposity metrics may be due to underestimation of cumulative excess BMI exposure. Funding Cancer Research UK, the Manchester NIHR Biomedical Research Centre, the National Cancer Institute, the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services, U.S.
Department of Health and Human Services, the Intramural Research Program of the National Cancer Institute, the International Agency for Research on Cancer, Imperial College London, European Commission (DG-SANCO), the Danish Cancer Society, Ligue Contre le Cancer, Institut Gustave-Roussy, Mutuelle Générale de l'Education Nationale, Institut National de la Santé et de la Recherche Médicale, Deutsche Krebshilfe, Deutsches Krebsforschungszentrum, German Federal Ministry of Education and Research, the Hellenic Health Foundation, Associazione Italiana per la Ricerca sul Cancro-AIRC-Italy and National Research Council, Dutch Ministry of Public Health, Welfare, and Sports, Netherlands Cancer Registry, LK Research Funds, Dutch Prevention Funds, Dutch Zorg Onderzoek Nederland, World Cancer Research Fund, Statistics Netherlands, Health Research Fund, Instituto de Salud Carlos III, regional Spanish governments of Andalucía, Asturias, Basque Country, Murcia, and Navarra, the Catalan Institute of Oncology, Swedish Cancer Society, Swedish Scientific Council, and Region Skåne and Region Västerbotten, and the Medical Research Council.
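The overweight-years metric described in the Methods (degree of overweight multiplied by duration) can be approximated from repeated BMI measurements. The sketch below is one plausible reading, not the consortium's documented algorithm: it assumes linear change between measurements and integrates the excess above 25 kg/m² with the trapezoidal rule. The function name and the example values are hypothetical.

```python
def overweight_years(ages, bmis, threshold=25.0):
    """Approximate overweight-years as the area of BMI above `threshold` over age.

    ages: ages (years) at each BMI measurement, in increasing order
    bmis: BMI (kg/m^2) at each age
    The excess is clipped at zero before integrating, so intervals in which BMI
    crosses the threshold are only approximated.
    """
    if len(ages) != len(bmis) or len(ages) < 2:
        raise ValueError("need at least two paired measurements")
    excess = [max(b - threshold, 0.0) for b in bmis]
    total = 0.0
    for i in range(1, len(ages)):
        width = ages[i] - ages[i - 1]
        total += 0.5 * (excess[i - 1] + excess[i]) * width
    return total


# Hypothetical participant measured at ages 30, 40 and 50.
print(overweight_years([30, 40, 50], [24.0, 27.0, 29.0]))  # 40.0
```

Two participants with the same final BMI can therefore differ in overweight-years if one spent longer above the threshold, which is the distinction the study sets out to test against a single BMI measure.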
Affiliation(s)
- Nadin K. Hawwash
  - Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
  - Cancer Research UK Manchester Cancer Research Centre, Manchester, United Kingdom
- Matthew Sperrin
  - Centre for Health Informatics, Division of Informatics, Imaging and Data Sciences, School of Health Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
- Glen P. Martin
  - Centre for Health Informatics, Division of Informatics, Imaging and Data Sciences, School of Health Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
- Rashmi Sinha
  - Metabolic Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Shady Grove, USA
- Charles E. Matthews
  - Metabolic Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Shady Grove, USA
- Fulvio Ricceri
  - Centre for Biostatistics, Epidemiology, and Public Health (C-BEPH), Department of Clinical and Biological Sciences, University of Turin, Regione Gonzole 10, Orbassano (TO), Italy
- Anne Tjønneland
  - Danish Cancer Institute, Strandboulevarden 49, 2100 Copenhagen O, Denmark
- Alicia K. Heath
  - Cancer Epidemiology and Prevention Research Unit, School of Public Health, Imperial College London, London, W2 1PG, United Kingdom
- Marian L. Neuhouser
  - Cancer Prevention Program, Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Corinne E. Joshu
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Elizabeth A. Platz
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Heinz Freisling
  - Nutrition and Metabolism Branch, International Agency for Research on Cancer (IARC-WHO), Lyon, France
- Marc J. Gunter
  - Cancer Epidemiology and Prevention Research Unit, School of Public Health, Imperial College London, London, W2 1PG, United Kingdom
  - Nutrition and Metabolism Branch, International Agency for Research on Cancer (IARC-WHO), Lyon, France
- Andrew G. Renehan
  - Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
  - Cancer Research UK Manchester Cancer Research Centre, Manchester, United Kingdom
  - National Institute for Health Research (NIHR) Manchester Biomedical Research Centre, Manchester, United Kingdom
5
Hu YH, Wu RY, Lin YC, Lin TY. A novel MissForest-based missing values imputation approach with recursive feature elimination in medical applications. BMC Med Res Methodol 2024; 24:269. [PMID: 39516783 PMCID: PMC11546113 DOI: 10.1186/s12874-024-02392-2] [Received: 08/11/2024] [Accepted: 10/28/2024] [Indexed: 11/16/2024]
Abstract
BACKGROUND Missing values in datasets present significant challenges for data analysis, particularly in the medical field where data accuracy is crucial for patient diagnosis and treatment. Although MissForest (MF) has demonstrated efficacy in imputation research and recursive feature elimination (RFE) has proven effective in feature selection, the potential for enhancing MF through RFE integration remains unexplored. METHODS This study introduces a novel imputation method, "recursive feature elimination-MissForest" (RFE-MF), designed to enhance imputation quality by reducing the impact of irrelevant features. A comparative analysis is conducted between RFE-MF and four classical imputation methods: mean/mode, k-nearest neighbors (kNN), multiple imputation by chained equations (MICE), and MF. The comparison is carried out across ten medical datasets containing both numerical and mixed data types. Different missing data rates, ranging from 10 to 50%, are evaluated under the missing completely at random (MCAR) mechanism. The performance of each method is assessed using two evaluation metrics: normalized root mean squared error (NRMSE) and predictive fidelity criterion (PFC). Additionally, paired samples t-tests are employed to analyze the statistical significance of differences among the outcomes. RESULTS The findings indicate that RFE-MF demonstrates superior performance across the majority of datasets when compared to four classical imputation methods (mean/mode, kNN, MICE, and MF). Notably, RFE-MF consistently outperforms the original MF, irrespective of variable type (numerical or categorical). Mean/mode imputation exhibits consistent performance across various scenarios. Conversely, the efficacy of kNN imputation fluctuates in relation to varying missing data rates. 
CONCLUSION This study demonstrates that RFE-MF holds promise as an effective imputation method for medical datasets, providing a novel approach to addressing missing data challenges in medical applications.
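The two evaluation metrics named in the Methods can be written down compactly. The sketch below uses one common definition of each: NRMSE as the root of the mean squared error divided by the variance of the true values, and PFC as the share of categorical entries imputed incorrectly. The abstract does not give the exact formulas the authors used, so the normalization choice (population variance) and the function names are assumptions.

```python
import math
from statistics import mean, pvariance


def nrmse(true_vals, imputed_vals):
    """Normalized RMSE over the entries that were masked and then imputed.

    Normalizes the mean squared error by the population variance of the true
    values; other normalizations (range, sample variance) are also in use.
    """
    mse = mean((t - i) ** 2 for t, i in zip(true_vals, imputed_vals))
    return math.sqrt(mse / pvariance(true_vals))


def pfc(true_labels, imputed_labels):
    """Share of categorical entries whose imputed label differs from the truth."""
    wrong = sum(1 for t, i in zip(true_labels, imputed_labels) if t != i)
    return wrong / len(true_labels)


# Hypothetical held-back truth versus imputed values.
print(round(nrmse([1, 2, 3, 4], [1, 2, 3, 6]), 4))
print(pfc(["a", "b", "a", "b"], ["a", "a", "a", "b"]))  # 0.25
```

For both metrics lower is better, and a value of 0 means the imputed entries reproduce the held-back values exactly.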
Affiliation(s)
- Ya-Han Hu
  - Department of Information Management, National Central University, Taoyuan City, Taiwan
- Ruei-Yan Wu
  - Department of Information Management, National Central University, Taoyuan City, Taiwan
- Yen-Cheng Lin
  - Department of Information Management, National Central University, Taoyuan City, Taiwan
- Ting-Yin Lin
  - Department of Laboratory Medicine, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City, Taiwan
6
Mainzer RM, Moreno-Betancur M, Nguyen CD, Simpson JA, Carlin JB, Lee KJ. Gaps in the usage and reporting of multiple imputation for incomplete data: findings from a scoping review of observational studies addressing causal questions. BMC Med Res Methodol 2024; 24:193. [PMID: 39232661 PMCID: PMC11373423 DOI: 10.1186/s12874-024-02302-6] [Received: 05/21/2024] [Accepted: 08/02/2024] [Indexed: 09/06/2024]
Abstract
BACKGROUND Missing data are common in observational studies and often occur in several of the variables required when estimating a causal effect, i.e. the exposure, outcome and/or variables used to control for confounding. Analyses involving multiple incomplete variables are not as straightforward as analyses with a single incomplete variable. For example, in the context of multivariable missingness, the standard missing data assumptions ("missing completely at random", "missing at random" [MAR], "missing not at random") are difficult to interpret and assess. It is not clear how the complexities that arise due to multivariable missingness are being addressed in practice. The aim of this study was to review how missing data are managed and reported in observational studies that use multiple imputation (MI) for causal effect estimation, with a particular focus on missing data summaries, missing data assumptions, primary and sensitivity analyses, and MI implementation. METHODS We searched five top general epidemiology journals for observational studies that aimed to answer a causal research question and used MI, published between January 2019 and December 2021. Article screening and data extraction were performed systematically. RESULTS Of the 130 studies included in this review, 108 (83%) derived an analysis sample by excluding individuals with missing data in specific variables (e.g., outcome) and 114 (88%) had multivariable missingness within the analysis sample. Forty-four (34%) studies provided a statement about missing data assumptions, 35 of which stated the MAR assumption, but only 11/44 (25%) studies provided a justification for these assumptions. The number of imputations, MI method and MI software were generally well-reported (71%, 75% and 88% of studies, respectively), while aspects of the imputation model specification were not clear for more than half of the studies. 
A secondary analysis that used a different approach to handle the missing data was conducted in 69/130 (53%) studies. Of these 69 studies, 68 (99%) lacked a clear justification for the secondary analysis. CONCLUSION Effort is needed to clarify the rationale for and improve the reporting of MI for estimation of causal effects from observational data. We encourage greater transparency in making and reporting analytical decisions related to missing data.
Affiliation(s)
- Rheanna M Mainzer
  - Clinical Epidemiology and Biostatistics Unit, Murdoch Children's Research Institute, Parkville, Victoria, 3052, Australia
  - Department of Paediatrics, The University of Melbourne, Parkville, Victoria, 3052, Australia
- Margarita Moreno-Betancur
  - Clinical Epidemiology and Biostatistics Unit, Murdoch Children's Research Institute, Parkville, Victoria, 3052, Australia
  - Department of Paediatrics, The University of Melbourne, Parkville, Victoria, 3052, Australia
- Cattram D Nguyen
  - Clinical Epidemiology and Biostatistics Unit, Murdoch Children's Research Institute, Parkville, Victoria, 3052, Australia
  - Department of Paediatrics, The University of Melbourne, Parkville, Victoria, 3052, Australia
- Julie A Simpson
  - Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, Victoria, 3052, Australia
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
- John B Carlin
  - Clinical Epidemiology and Biostatistics Unit, Murdoch Children's Research Institute, Parkville, Victoria, 3052, Australia
  - Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, Victoria, 3052, Australia
- Katherine J Lee
  - Clinical Epidemiology and Biostatistics Unit, Murdoch Children's Research Institute, Parkville, Victoria, 3052, Australia
  - Department of Paediatrics, The University of Melbourne, Parkville, Victoria, 3052, Australia
7
El Badisy I, Graffeo N, Khalis M, Giorgi R. Multi-metric comparison of machine learning imputation methods with application to breast cancer survival. BMC Med Res Methodol 2024; 24:191. [PMID: 39215245 PMCID: PMC11363416 DOI: 10.1186/s12874-024-02305-3] [Received: 05/13/2024] [Accepted: 08/08/2024] [Indexed: 09/04/2024]
Abstract
Handling missing data in clinical prognostic studies is an essential yet challenging task. This study aimed to provide a comprehensive assessment of the effectiveness and reliability of different machine learning (ML) imputation methods across various analytical perspectives. Specifically, it focused on three distinct classes of performance metrics used to evaluate ML imputation methods: post-imputation bias of regression estimates, post-imputation predictive accuracy, and substantive model-free metrics. As an illustration, we used data from a real-world breast cancer survival study. A simulated dataset with 30% missing at random (MAR) values was used. A number of single imputation (SI) methods (specifically KNN, missMDA, CART, missForest, missRanger, and missCforest) and multiple imputation (MI) methods (specifically miceCART and miceRF) were evaluated. The performance metrics used were Gower's distance, estimation bias, empirical standard error, coverage rate, length of confidence interval, predictive accuracy, proportion of falsely classified (PFC), normalized root mean squared error (NRMSE), AUC, and C-index scores. The analysis revealed that in terms of Gower's distance, CART and missForest were the most accurate, while missMDA and CART excelled for binary covariates; missForest and miceCART were superior for continuous covariates. When assessing bias and accuracy in regression estimates, miceCART and miceRF exhibited the least bias. Overall, the various imputation methods demonstrated greater efficiency than complete-case analysis (CCA), with MICE methods providing optimal confidence interval coverage. In terms of predictive accuracy for Cox models, missMDA and missForest had superior AUC and C-index scores.
Despite offering better predictive accuracy, the study found that SI methods introduced more bias into the regression coefficients compared to MI methods. This study underlines the importance of selecting appropriate imputation methods based on study goals and data types in time-to-event research. The varying effectiveness of methods across the different performance metrics studied highlights the value of using advanced machine learning algorithms within a multiple imputation framework to enhance research integrity and the robustness of findings.
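Several of the regression-estimate metrics listed in the abstract (estimation bias, empirical standard error, and coverage rate) are computed across simulation replicates. The sketch below shows the usual definitions under the assumption of Wald-type 95% intervals; the function name `summarize_replicates` and the numbers are hypothetical and do not reproduce the study's results.

```python
from statistics import mean, stdev


def summarize_replicates(estimates, std_errors, truth, z=1.96):
    """Summarize simulation replicates of one regression coefficient.

    estimates:  coefficient estimate from each replicate
    std_errors: model-based standard error from each replicate
    truth:      the coefficient value used to generate the data
    Returns a dict with bias, empirical SE, and confidence interval coverage.
    """
    bias = mean(estimates) - truth
    empirical_se = stdev(estimates)
    # A replicate "covers" when the truth lies inside estimate +/- z * SE.
    hits = sum(1 for e, se in zip(estimates, std_errors) if abs(e - truth) <= z * se)
    return {"bias": bias, "empirical_se": empirical_se, "coverage": hits / len(estimates)}


# Hypothetical estimates of a coefficient whose true value is 1.0.
summary = summarize_replicates([0.9, 1.0, 1.1, 1.6], [0.1, 0.1, 0.1, 0.1], truth=1.0)
print(summary)
```

Coverage well below the nominal 95% signals standard errors that are too small or estimates that are biased, which is the pattern the abstract attributes more to single imputation than to multiple imputation.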
Affiliation(s)
- Imad El Badisy
  - Mohammed VI Center For Research and Innovation, Rabat, Morocco
  - International School of Public Health, Mohammed VI University of Sciences and Health, Casablanca, Morocco
  - Aix Marseille Univ, INSERM, IRD, ISSPAM, SESSTIM, Sciences Economiques & Sociales de la Santé & Traitement de l'Information Médicale, ISSPAM, Marseille, France
- Nathalie Graffeo
  - Aix Marseille Univ, INSERM, IRD, ISSPAM, SESSTIM, Sciences Economiques & Sociales de la Santé & Traitement de l'Information Médicale, ISSPAM, Marseille, France
- Mohamed Khalis
  - Mohammed VI Center For Research and Innovation, Rabat, Morocco
  - International School of Public Health, Mohammed VI University of Sciences and Health, Casablanca, Morocco
- Roch Giorgi
  - Aix Marseille Univ, INSERM, IRD, ISSPAM, SESSTIM, Sciences Economiques & Sociales de la Santé & Traitement de l'Information Médicale, ISSPAM, Marseille, France
  - Aix Marseille Univ, APHM, INSERM, IRD, SESSTIM, Hop Timone, Biostatistique et Technologies de l'Information et de la Communication, Sciences Economiques & Sociales de la Santé & Traitement de l'Information Médicale, ISSPAM, Hop Timone, BioSTIC, Biostatistique et Technologies de l'Information et de la Communication, Marseille, France
8
Weberpals J, Raman SR, Shaw PA, Lee H, Russo M, Hammill BG, Toh S, Connolly JG, Dandreo KJ, Tian F, Liu W, Li J, Hernández-Muñoz JJ, Glynn RJ, Desai RJ. A Principled Approach to Characterize and Analyze Partially Observed Confounder Data from Electronic Health Records. Clin Epidemiol 2024; 16:329-343. [PMID: 38798915 PMCID: PMC11127690 DOI: 10.2147/clep.s436131] [Received: 08/19/2023] [Accepted: 04/09/2024] [Indexed: 05/29/2024]
Abstract
Objective Partially observed confounder data pose challenges to the statistical analysis of electronic health records (EHR) and systematic assessments of potentially underlying missingness mechanisms are lacking. We aimed to provide a principled approach to empirically characterize missing data processes and investigate performance of analytic methods. Methods Three empirical sub-cohorts of diabetic SGLT2 or DPP4-inhibitor initiators with complete information on HbA1c, BMI and smoking as confounders of interest (COI) formed the basis of data simulation under a plasmode framework. A true null treatment effect, including the COI in the outcome generation model, and four missingness mechanisms for the COI were simulated: completely at random (MCAR), at random (MAR), and two not at random (MNAR) mechanisms, where missingness was dependent on an unmeasured confounder and on the value of the COI itself. We evaluated the ability of three groups of diagnostics to differentiate between mechanisms: (1) differences in characteristics between patients with or without the observed COI (using averaged standardized mean differences [ASMD]), (2) predictive ability of the missingness indicator based on observed covariates, and (3) association of the missingness indicator with the outcome. We then compared analytic methods including "complete case", inverse probability weighting, single and multiple imputation in their ability to recover true treatment effects. Results The diagnostics successfully identified characteristic patterns of simulated missingness mechanisms. For MAR, but not MCAR, the patient characteristics showed substantial differences (median ASMD 0.20 vs 0.05) and consequently, discrimination of the prediction models for missingness was also higher (0.59 vs 0.50). For MNAR, but not MAR or MCAR, missingness was significantly associated with the outcome even in models adjusting for other observed covariates.
Comparing analytic methods, multiple imputation using a random forest algorithm resulted in the lowest root-mean-squared-error. Conclusion Principled diagnostics provided reliable insights into missingness mechanisms. When assumptions allow, multiple imputation with nonparametric models could help reduce bias.
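The first diagnostic in the abstract above compares patients with and without an observed confounder using standardized mean differences. A minimal Python sketch of that idea follows. It assumes the averaged SMD is the mean of absolute SMDs across covariates and that each SMD uses the pooled standard deviation of the two groups; the abstract does not state the exact formula, and the function names, field names, and toy records are hypothetical.

```python
from statistics import mean, variance


def abs_smd(values_observed, values_missing):
    """Absolute standardized mean difference between two groups.

    Assumption: the denominator is the square root of the average of the
    two sample variances, a common convention for SMDs.
    """
    pooled_sd = ((variance(values_observed) + variance(values_missing)) / 2) ** 0.5
    if pooled_sd == 0:
        return 0.0
    return abs(mean(values_observed) - mean(values_missing)) / pooled_sd


def averaged_smd(rows, coi_key, covariate_keys):
    """Mean absolute SMD over covariates, comparing rows where the
    confounder of interest (coi_key) is observed vs. missing (None)."""
    observed = [r for r in rows if r[coi_key] is not None]
    missing = [r for r in rows if r[coi_key] is None]
    smds = [
        abs_smd([r[k] for r in observed], [r[k] for r in missing])
        for k in covariate_keys
    ]
    return mean(smds)


# Hypothetical toy records: HbA1c is the partially observed confounder.
toy_rows = [
    {"hba1c": 7.1, "age": 50, "score": 1},
    {"hba1c": 6.8, "age": 60, "score": 2},
    {"hba1c": 8.0, "age": 70, "score": 3},
    {"hba1c": None, "age": 60, "score": 1},
    {"hba1c": None, "age": 70, "score": 2},
    {"hba1c": None, "age": 80, "score": 3},
]

# Age differs between groups (SMD 1.0), score does not (SMD 0.0).
asmd_example = averaged_smd(toy_rows, "hba1c", ["age", "score"])
```

In this toy data the averaged value is 0.5, driven entirely by age. A value near zero would be compatible with MCAR, while larger values point away from it, in line with the pattern the abstract reports.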
Affiliation(s)
- Janick Weberpals
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Sudha R Raman
  - Department of Population Health Sciences, Duke University School of Medicine, Durham, NC, USA
- Pamela A Shaw
  - Biostatistics Division, Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
- Hana Lee
  - Office of Biostatistics, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Massimiliano Russo
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Bradley G Hammill
  - Department of Population Health Sciences, Duke University School of Medicine, Durham, NC, USA
- Sengwee Toh
  - Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA
- John G Connolly
  - Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA
- Kimberly J Dandreo
  - Department of Population Medicine, Harvard Pilgrim Health Care Institute, Boston, MA, USA
- Fang Tian
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Wei Liu
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Jie Li
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- José J Hernández-Muñoz
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Robert J Glynn
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Rishi J Desai
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA

9
Weberpals J, Raman SR, Shaw PA, Lee H, Hammill BG, Toh S, Connolly JG, Dandreo KJ, Tian F, Liu W, Li J, Hernández-Muñoz JJ, Glynn RJ, Desai RJ. smdi: an R package to perform structural missing data investigations on partially observed confounders in real-world evidence studies. JAMIA Open 2024; 7:ooae008. [PMID: 38304248 PMCID: PMC10833461 DOI: 10.1093/jamiaopen/ooae008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 01/09/2024] [Accepted: 01/16/2024] [Indexed: 02/03/2024] Open
Abstract
Objectives Partially observed confounder data pose a major challenge in statistical analyses aimed to inform causal inference using electronic health records (EHRs). While analytic approaches such as imputation are available, assumptions on underlying missingness patterns and mechanisms must be verified. We aimed to develop a toolkit to streamline missing data diagnostics to guide choice of analytic approaches based on meeting necessary assumptions. Materials and methods We developed the smdi (structural missing data investigations) R package based on results of a previous simulation study which considered structural assumptions of common missing data mechanisms in EHR. Results smdi enables users to run principled missing data investigations on partially observed confounders and implement functions to visualize, describe, and infer potential missingness patterns and mechanisms based on observed data. Conclusions The smdi R package is freely available on CRAN and can provide valuable insights into underlying missingness patterns and mechanisms and thereby help improve the robustness of real-world evidence studies.
Affiliation(s)
- Janick Weberpals
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02120, United States
- Sudha R Raman
  - Department of Population Health Sciences, Duke University School of Medicine, Durham, NC 27701, United States
- Pamela A Shaw
  - Biostatistics Division, Kaiser Permanente Washington Health Research Institute, Seattle, WA 98101, United States
- Hana Lee
  - Office of Biostatistics, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD 20993, United States
- Bradley G Hammill
  - Department of Population Health Sciences, Duke University School of Medicine, Durham, NC 27701, United States
- Sengwee Toh
  - Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA 02215, United States
- John G Connolly
  - Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA 02215, United States
- Kimberly J Dandreo
  - Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA 02215, United States
- Fang Tian
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD 20993, United States
- Wei Liu
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD 20993, United States
- Jie Li
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD 20993, United States
- José J Hernández-Muñoz
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD 20993, United States
- Robert J Glynn
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02120, United States
- Rishi J Desai
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02120, United States

10
Guo F, Langworthy B, Ogino S, Wang M. Comparison between inverse-probability weighting and multiple imputation in Cox model with missing failure subtype. Stat Methods Med Res 2024; 33:344-356. [PMID: 38262434 DOI: 10.1177/09622802231226328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2024]
Abstract
Identifying and distinguishing risk factors for heterogeneous disease subtypes has been of great interest. However, missingness in disease subtypes is a common problem in those data analyses. Several methods have been proposed to deal with the missing data, including complete-case analysis, inverse-probability weighting, and multiple imputation. Although extant literature has compared these methods in missing problems, none has focused on the competing risk setting. In this paper, we discuss the assumptions required when complete-case analysis, inverse-probability weighting, and multiple imputation are used to deal with the missing failure subtype problem, focusing on how to implement these methods under various realistic scenarios in competing risk settings. Besides, we compare these three methods regarding their biases, efficiency, and robustness to model misspecifications using simulation studies. Our results show that complete-case analysis can be seriously biased when the missing completely at random assumption does not hold. Inverse-probability weighting and multiple imputation estimators are valid when we correctly specify the corresponding models for missingness and for imputation, and multiple imputation typically shows higher efficiency than inverse-probability weighting. However, in real-world studies, building imputation models for the missing subtypes can be more challenging than building missingness models. In that case, inverse-probability weighting could be preferred for its easy usage. We also propose two automated model selection procedures and demonstrate their usage in a study of the association between smoking and colorectal cancer subtypes in the Nurses' Health Study and Health Professional Follow-Up Study.
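To make the inverse-probability weighting idea in the abstract above concrete, here is a deliberately simplified Python sketch. It is not the authors' Cox-model estimator: it only shows how complete cases can be reweighted by the inverse of their estimated probability of being observed, with that probability estimated as the observed fraction within a single hypothetical stratum variable. All names and records are illustrative assumptions.

```python
from collections import defaultdict


def ipw_weights(records, stratum_key, value_key):
    """Return (record, weight) pairs for complete cases only.

    weight = 1 / P(value observed | stratum), where the probability is
    estimated as the observed fraction within each stratum (a very
    simple missingness model, assumed here for illustration).
    """
    totals = defaultdict(int)
    observed = defaultdict(int)
    for r in records:
        totals[r[stratum_key]] += 1
        if r[value_key] is not None:
            observed[r[stratum_key]] += 1
    weighted = []
    for r in records:
        if r[value_key] is not None:
            p_obs = observed[r[stratum_key]] / totals[r[stratum_key]]
            weighted.append((r, 1.0 / p_obs))
    return weighted


def weighted_mean(weighted, value_key):
    """Weighted mean of value_key over (record, weight) pairs."""
    total_weight = sum(w for _, w in weighted)
    return sum(r[value_key] * w for r, w in weighted) / total_weight


# Hypothetical records: subtype indicator is missing more often in smokers.
toy_records = [
    {"smoker": True, "subtype_a": 1},
    {"smoker": True, "subtype_a": 1},
    {"smoker": True, "subtype_a": None},
    {"smoker": True, "subtype_a": None},
    {"smoker": False, "subtype_a": 0},
    {"smoker": False, "subtype_a": 0},
    {"smoker": False, "subtype_a": 0},
    {"smoker": False, "subtype_a": 1},
]

complete = [r for r in toy_records if r["subtype_a"] is not None]
cc_estimate = sum(r["subtype_a"] for r in complete) / len(complete)
ipw_estimate = weighted_mean(ipw_weights(toy_records, "smoker", "subtype_a"), "subtype_a")
```

In this toy example the complete-case proportion is 0.5, while the weighted proportion is 0.625 because each observed smoker stands in for two smokers. The two estimates diverge whenever missingness depends on a variable that is also related to the quantity being estimated.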
Affiliation(s)
- Fuyu Guo
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Shuji Ogino
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Cancer Immunology and Cancer Epidemiology Programs, Dana-Farber Harvard Cancer Center, Boston, MA, USA
  - Program in MPE Molecular Pathological Epidemiology, Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Molin Wang
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA

11
Zhong L, Yang F, Sun S, Wang L, Yu H, Nie X, Liu A, Xu N, Zhang L, Zhang M, Qi Y, Ji H, Liu G, Zhao H, Jiang Y, Li J, Song C, Yu X, Yang L, Yu J, Feng H, Guo X, Yang F, Xue F. Predicting lung cancer survival prognosis based on the conditional survival bayesian network. BMC Med Res Methodol 2024; 24:16. [PMID: 38254038 PMCID: PMC10801949 DOI: 10.1186/s12874-023-02043-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Accepted: 09/25/2023] [Indexed: 01/24/2024] Open
Abstract
Lung cancer is a leading cause of cancer deaths and imposes an enormous economic burden on patients. It is important to develop an accurate risk assessment model to determine the appropriate treatment for patients after an initial lung cancer diagnosis. The Cox proportional hazards model is mainly employed in survival analysis. However, real-world medical data are usually incomplete, posing a great challenge to the application of this model. Commonly used imputation methods cannot achieve sufficient accuracy when data are missing, so we investigated novel methods for the development of clinical prediction models. In this article, we present a novel model for survival prediction in missing scenarios. We collected data from 5,240 patients diagnosed with lung cancer at the Weihai Municipal Hospital, China. Then, we applied a joint model that combined a BN and a Cox model to predict mortality risk in individual patients with lung cancer. The established prognostic model achieved good predictive performance in discrimination and calibration. We showed that combining the BN with the Cox proportional hazards model is highly beneficial and provides a more efficient tool for risk prediction.
Affiliation(s)
- Lu Zhong
  - Department of Epidemiology and Health Statistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
  - Hainan Center for Disease Control and Prevention, Institute for Prevention and Control of Tropical Diseases and Chronic Noninfectious Diseases, Haikou, Hainan, China
- Fan Yang
  - Department of Epidemiology and Health Statistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
  - Institute for Medical Dataology, Shandong University, Jinan, China
- Shanshan Sun
  - Department of Oncology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Lijie Wang
  - Department of Epidemiology and Health Statistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
  - Institute for Medical Dataology, Shandong University, Jinan, China
- Hong Yu
  - Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
- Xiushan Nie
  - School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China
- Ailing Liu
  - Department of Pulmonary and Critical Care Medicine, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Ning Xu
  - Department of Pulmonary and Critical Care Medicine, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Lanfang Zhang
  - Department of Oncology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Mingjuan Zhang
  - Department of Oncology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Yue Qi
  - Department of Oncology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Huaijun Ji
  - Department of Thoracic Surgery, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Guiyuan Liu
  - Department of Radiology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Huan Zhao
  - Department of Oncology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
  - The Second School of Clinical Medicine of Binzhou Medical University, Yantai, China
- Yinan Jiang
  - Department of Oncology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Jingyi Li
  - Department of Oncology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Chengcun Song
  - Department of Oncology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Xin Yu
  - Department of Oncology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Liu Yang
  - Department of Oncology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Jinchao Yu
  - Department of Radiology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Hu Feng
  - Department of Oncology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Xiaolei Guo
  - The Department for Chronic and Non-communicable Disease Control and Prevention, Shandong Center for Disease Control and Prevention, Jinan, China
- Fujun Yang
  - Department of Oncology, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, China
- Fuzhong Xue
  - Department of Epidemiology and Health Statistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
  - Institute for Medical Dataology, Shandong University, Jinan, China

12
Huchet N, Penel N, Bonvalot S, Thariat J, Ducimetière F, Giraud A, Toulmonde M, Le Cesne A, Blay JY, Bellera C. Handling missing covariates in observational studies: an illustration with the assessment of prognostic factors of survival outcomes in soft-tissue or visceral sarcomas in irradiated fields (SIF). Ther Adv Med Oncol 2024; 16:17588359231220999. [PMID: 38249328 PMCID: PMC10798078 DOI: 10.1177/17588359231220999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 11/29/2023] [Indexed: 01/23/2024] Open
Abstract
Background Missing covariates are common in observational research and can lead to bias and loss of statistical power. Limited data regarding prognostic factors of survival outcomes of sarcomas in irradiated fields (SIF) are available. Because of the long lag time between irradiation of first cancer and scarcity of SIF, missing data are a critical issue when analyzing long-term outcomes. We assessed prognostic factors of overall (OS), progression-free (PFS), and metastatic-progression-free (MPFS) survivals in SIF using three methods to account for missing covariates. Methods We relied on the NETSARC French Sarcoma Group database, Cox (OS/PFS), and competitive hazards (MPFS) survival models. Covariates investigated were age, sex, histological subtype, tumor size, depth and grade, metastasis, surgery, surgical resection, surgeon's expertise, imaging, and neo-adjuvant treatment. We first applied multiple imputation (MI): observed data were used to estimate the missing covariate. With the missing-data modality approach, a category missing was created for qualitative variables. With the complete-case (CC) approach, analysis was restricted to patients without missing covariates. Results CC subjects (N = 167; 33%) presented more often with soft-tissue sarcoma (versus visceral sarcoma) and grade I-II tumors as compared to the 504 eligible cases. With MI (N = 504), factors associated with the worst outcome included metastasis (p = 0.04) and R1/R2 resection (p < 0.001) for OS; higher grade/non-gradable tumors (p = 0.002) and R1/R2 resection (p < 0.001) for PFS; and metastasis (p = 0.01) for M-PFS. The 'missing-data modality' approach (N = 504) led to different associations, including significance reached due to variables with the modality 'missing'. The CC analysis led to different results and reduced precision. Conclusion The CC population was not representative of the eligible population, introducing bias, in addition to worst precision. 
The 'missing-data modality method' results in biased estimates in non-randomized studies, as outcomes may be related to variables with missing values. Appropriate statistical methods for missing covariates, for example, MI, should therefore be considered.
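Multiple imputation, as recommended in the abstract above, fits the analysis model once per imputed dataset and then combines the results. The Python sketch below shows the standard pooling step (Rubin's rules), assuming the per-imputation estimates and their variances are already available. The abstract does not describe the pooling step, so this is a generic illustration, and the function name and numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates with Rubin's rules.

    Returns (pooled_estimate, total_variance), where
    total_variance = W + (1 + 1/m) * B,
    W = mean within-imputation variance,
    B = between-imputation variance of the estimates.
    Requires at least two imputations.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log hazard ratios and squared standard errors
# from three imputed datasets.
pooled_estimate, total_variance = pool_rubin(
    [0.4, 0.5, 0.6],
    [0.04, 0.04, 0.04],
)
```

The total variance exceeds the average within-imputation variance whenever the estimates differ across imputations, which is how the uncertainty about the missing values enters the final standard error.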
Affiliation(s)
- Noémie Huchet
  - INSERM CIC1401, Clinical and Epidemiological Research Unit, Institut Bergonié, Comprehensive Cancer Center, Bordeaux, France
- Nicolas Penel
  - Department of Medical Oncology, Centre Oscar Lambret, Lille, France
  - Lille University, Lille, France
- Sylvie Bonvalot
  - Surgery Department, Institut Curie, Comprehensive Cancer Center, Paris, France
- Juliette Thariat
  - Centre François Baclesse, Comprehensive Cancer Center, Caen, France
  - Laboratoire de physique Corpusculaire IN2P3/ENSICAEN/CNRS UMR 6534, Normandie Université, Caen, France
- Françoise Ducimetière
  - Department of Medical Oncology, Centre Léon Bérard, Comprehensive Cancer Center, Lyon, France
- Antoine Giraud
  - INSERM CIC1401, Clinical and Epidemiological Research Unit, Institut Bergonié, Comprehensive Cancer Center, Bordeaux, France
- Maud Toulmonde
  - Department of Medical Oncology, Institut Bergonié, Comprehensive Cancer Center, Bordeaux, France
- Axel Le Cesne
  - Department of Medical Oncology, Gustave Roussy Cancer Campus, Comprehensive Cancer Center, Villejuif, France
- Jean-Yves Blay
  - Department of Medical Oncology, Centre Léon Bérard, Comprehensive Cancer Center, Lyon, France
- Carine Bellera
  - INSERM CIC1401, Clinical and Epidemiological Research Unit, Institut Bergonié, Comprehensive Cancer Center, 229 Cours de l'Argonne, Bordeaux 33076, France
  - Univ. Bordeaux, INSERM, Bordeaux Population Health Research Center, Epicene team, UMR 1219, Bordeaux, France

13
Castelo-Branco L, Pellat A, Martins-Branco D, Valachis A, Derksen JWG, Suijkerbuijk KPM, Dafni U, Dellaporta T, Vogel A, Prelaj A, Groenwold RHH, Martins H, Stahel R, Bliss J, Kather J, Ribelles N, Perrone F, Hall PS, Dienstmann R, Booth CM, Pentheroudakis G, Delaloge S, Koopman M. ESMO Guidance for Reporting Oncology real-World evidence (GROW). Ann Oncol 2023; 34:1097-1112. [PMID: 37848160 DOI: 10.1016/j.annonc.2023.10.001] [Citation(s) in RCA: 70] [Impact Index Per Article: 35.0] [Received: 09/06/2023] [Revised: 09/28/2023] [Accepted: 10/04/2023] [Indexed: 10/19/2023] Open
Affiliation(s)
- L Castelo-Branco
  - Scientific and Medical Division, European Society for Medical Oncology (ESMO), Lugano, Switzerland
- A Pellat
  - Department of Gastroenterology and Digestive Oncology, Hôpital Cochin AP-HP, Université Paris Cité, Paris; Centre d'Épidémiologie Clinique, Hôtel Dieu, Paris, France
- D Martins-Branco
  - Scientific and Medical Division, European Society for Medical Oncology (ESMO), Lugano, Switzerland; Université Libre de Bruxelles (ULB), Hôpital Universitaire de Bruxelles (HUB), Institut Jules Bordet, Academic Trials Promoting Team (ATPT), Brussels, Belgium
- A Valachis
  - Department of Oncology, Faculty of Medicine and Health, Örebro University Hospital, Örebro University, Örebro, Sweden
- J W G Derksen
  - Julius Center for Health Sciences and Primary Care, Department of Epidemiology and Health Economics, University Medical Centre Utrecht, Utrecht University, Utrecht
- K P M Suijkerbuijk
  - Department of Medical Oncology, University Medical Centre Utrecht, Utrecht University, Utrecht, The Netherlands
- U Dafni
  - Laboratory of Biostatistics, Department of Nursing, National and Kapodistrian University of Athens, Athens; Frontier Science Foundation Hellas, Athens, Greece
- T Dellaporta
  - Frontier Science Foundation Hellas, Athens, Greece
- A Vogel
  - Department of Gastroenterology, Hepatology and Endocrinology, Medical School of Hannover, Hannover, Germany; Toronto Center of Liver Disease, Toronto General Hospital, University Health Network, Toronto; Princess Margaret Cancer Centre, University of Toronto, Toronto, Canada
- A Prelaj
  - AI-ON-Lab, Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan; NEARLab, Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- R H H Groenwold
  - Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
- H Martins
  - Business Research Unit, ISCTE Business School, ISCTE-IUL, Lisbon, Portugal
- R Stahel
  - ETOP IBCSG Partners Foundation, Berne, Switzerland
- J Bliss
  - ICR-CTSU, Division of Clinical Studies, The Institute of Cancer Research, London, UK
- J Kather
  - Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden; Medical Oncology, National Center for Tumor Diseases, University Hospital Heidelberg, Heidelberg, Germany
- N Ribelles
  - Medical Oncology Intercenter Unit, Regional and Virgen de la Victoria University Hospitals, IBIMA, Málaga, Spain
- F Perrone
  - Clinical Trial Unit, Istituto Nazionale Tumori IRCCS Fondazione G. Pascale, Naples, Italy
- P S Hall
  - Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- R Dienstmann
  - Oncoclinicas Precision Medicine, Oncoclinicas Group, São Paulo, Brazil; Oncology Data Science Group, Vall d'Hebron Institute of Oncology, Barcelona, Spain
- C M Booth
  - Department of Oncology; Department of Public Health Sciences, Queen's University, Kingston, Canada
- G Pentheroudakis
  - Scientific and Medical Division, European Society for Medical Oncology (ESMO), Lugano, Switzerland
- S Delaloge
  - Department of Cancer Medicine, Gustave Roussy, Villejuif, France
- M Koopman
  - Department of Medical Oncology, University Medical Centre Utrecht, Utrecht University, Utrecht, The Netherlands

14
Vingrys K, Mathai ML, McAinch AJ, Bassett JK, de Courten M, Stojanovska L, Millar L, Giles GG, Hodge AM, Apostolopoulos V. Intake of polyphenols from cereal foods and colorectal cancer risk in the Melbourne Collaborative Cohort Study. Cancer Med 2023; 12:19188-19202. [PMID: 37702114 PMCID: PMC10557875 DOI: 10.1002/cam4.6514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 08/16/2023] [Accepted: 08/29/2023] [Indexed: 09/14/2023] Open
Abstract
BACKGROUND Cereal-derived polyphenols have demonstrated protective mechanisms in colorectal cancer (CRC) models; however, confirmation in human studies is lacking. Therefore, this study examined the association between cereal polyphenol intakes and CRC risk in the Melbourne Collaborative Cohort Study (MCCS), a prospective cohort study in Melbourne, Australia that recruited participants between 1990 and 1994 to investigate diet-disease relationships. METHODS Using food frequency questionnaire diet data matched to polyphenol data, dietary intakes of alkylresorcinols, phenolic acids, lignans, and total polyphenols from cereals were estimated. Hazard ratios (HRs) and 95% confidence intervals for CRC risk were estimated for quintiles of intake with the lowest quintile as the comparison category, using multivariable adjusted Cox proportional hazards models with age as the time axis adjusted for sex, socio-economic status, alcohol consumption, fibre intake, country of birth, total energy intake, physical activity and smoking status. RESULTS From 35,245 eligible adults, mean (SD) age 54.7 (8.6) years, mostly female (61%) and Australian-born (69%), there were 1394 incident cases of CRC (946 colon cancers and 448 rectal cancers). Results for total cereal polyphenol intake showed reduced HRs in Q2 (HR: 0.80; 95% CI, 0.68-0.95) and Q4 (HR: 0.75; 95% CI, 0.62-0.90), and similar for phenolic acids. Alkylresorcinol intake showed reduced HR in Q3 (HR: 0.80; 95% CI, 0.67-0.95) and Q4 (HR: 0.79; 95% CI, 0.66-0.95). CONCLUSIONS Overall, the present study showed little evidence of association between intakes of cereal polyphenols and CRC risk. Future investigations may be useful to understand associations between cereal-derived polyphenols and additional cancers in different populations.
Affiliation(s)
- Kristina Vingrys
  - Institute for Health and Sport, Victoria University, Melbourne, Victoria, Australia
  - VU First Year College®, Victoria University, Melbourne, Victoria, Australia
- Michael L. Mathai
  - Institute for Health and Sport, Victoria University, Melbourne, Victoria, Australia
- Andrew J. McAinch
  - Institute for Health and Sport, Victoria University, Melbourne, Victoria, Australia
  - Australian Institute for Musculoskeletal Science (AIMSS), Victoria University, Melbourne, Victoria, Australia
- Julie K. Bassett
  - Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, Victoria, Australia
- Maximilian de Courten
  - Institute for Health and Sport, Victoria University, Melbourne, Victoria, Australia
  - Mitchell Institute for Education and Health Policy, Victoria University, Melbourne, Victoria, Australia
- Lily Stojanovska
  - Institute for Health and Sport, Victoria University, Melbourne, Victoria, Australia
  - Department of Nutrition and Health, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
- Lynne Millar
  - Institute for Health and Sport, Victoria University, Melbourne, Victoria, Australia
  - Telethon Kids Institute, Nedlands, WA, Australia
- Graham G. Giles
  - Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, Victoria, Australia
  - Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, Victoria, Australia
  - Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, Victoria, Australia
- Allison M. Hodge
  - Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, Victoria, Australia
  - Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, Victoria, Australia
- Vasso Apostolopoulos
  - Institute for Health and Sport, Victoria University, Melbourne, Victoria, Australia
  - Australian Institute for Musculoskeletal Science (AIMSS), Victoria University, Melbourne, Victoria, Australia

15
Sondhi A, Weberpals J, Yerram P, Jiang C, Taylor M, Samant M, Cherng S. A systematic approach towards missing lab data in electronic health records: A case study in non-small cell lung cancer and multiple myeloma. CPT Pharmacometrics Syst Pharmacol 2023; 12:1201-1212. [PMID: 37322818 PMCID: PMC10508534 DOI: 10.1002/psp4.12998] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/03/2023] [Revised: 04/17/2023] [Accepted: 05/05/2023] [Indexed: 06/17/2023] Open
Abstract
Real-world data derived from electronic health records often exhibit high levels of missingness in variables, such as laboratory results, presenting a challenge for statistical analyses. We developed a systematic workflow for gathering evidence of different missingness mechanisms and performing subsequent statistical analyses. We quantify evidence for missing completely at random (MCAR) or missing at random (MAR), mechanisms using Hotelling's multivariate t-test, and random forest classifiers, respectively. We further illustrate how to apply sensitivity analyses using the not at random fully conditional specification procedure to examine changes in parameter estimates under missing not at random (MNAR) mechanisms. In simulation studies, we validated these diagnostics and compared analytic bias under different mechanisms. To demonstrate the application of this workflow, we applied it to two exemplary case studies with an advanced non-small cell lung cancer and a multiple myeloma cohort derived from a real-world oncology database. Here, we found strong evidence against MCAR, and some evidence of MAR, implying that imputation approaches that attempt to predict missing values by fitting a model to observed data may be suitable for use. Sensitivity analyses did not suggest meaningful departures of our analytic results under potential MNAR mechanisms; these results were also in line with results reported in clinical trials.

16
Bonneville EF, Schetelig J, Putter H, de Wreede LC. Handling missing covariate data in clinical studies in haematology. Best Pract Res Clin Haematol 2023; 36:101477. [PMID: 37353284 DOI: 10.1016/j.beha.2023.101477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Revised: 05/05/2023] [Accepted: 05/09/2023] [Indexed: 06/25/2023]
Abstract
Missing data are frequently encountered across studies in clinical haematology. Failure to handle these missing values in an appropriate manner can complicate the interpretation of a study's findings, as estimates presented may be biased and/or imprecise. In the present work, we first provide an overview of current methods for handling missing covariate data, along with their advantages and disadvantages. Furthermore, a systematic review is presented, exploring both contemporary reporting of missing values in major haematological journals, and the methods used for handling them. A principal finding was that the method of handling missing data was explicitly specified in a minority of articles (in 76 out of 195 articles reporting missing values, 39%). Among these, complete case analysis and the missing indicator method were the most common approaches to dealing with missing values, with more complex methods such as multiple imputation being extremely rare (in 7 out of 195 articles). An example analysis (with associated code) is also provided using hematopoietic stem cell transplantation data, illustrating the different approaches to handling missing values. We conclude with various recommendations regarding the reporting and handling of missing values for future studies in clinical haematology.
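The two approaches this review found most common, complete case analysis and the missing indicator method, differ only in how rows with a missing covariate are prepared before modeling. The Python sketch below illustrates that data-preparation step for a categorical covariate. It is not the example code referred to in the abstract; the function names, field names, and records are hypothetical.

```python
def complete_cases(rows, keys):
    """Complete case analysis: keep only rows where every listed
    covariate is observed (not None)."""
    return [r for r in rows if all(r[k] is not None for k in keys)]


def add_missing_category(rows, key, label="missing"):
    """Missing indicator method for a categorical covariate: keep all
    rows and recode missing values as their own category."""
    return [{**r, key: label if r[key] is None else r[key]} for r in rows]


# Hypothetical transplant records with a partially observed risk group.
toy_patients = [
    {"id": 1, "age": 54, "risk_group": "high"},
    {"id": 2, "age": 61, "risk_group": None},
    {"id": 3, "age": 47, "risk_group": "low"},
]

cc_rows = complete_cases(toy_patients, ["age", "risk_group"])
indicator_rows = add_missing_category(toy_patients, "risk_group")
```

Complete case analysis drops patient 2 and so reduces the sample, while the missing indicator coding retains all three patients but treats "missing" as if it were a genuine risk group.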
Affiliation(s)
- Edouard F Bonneville
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands.
| | - Johannes Schetelig
- Dresden University Hospital, Dresden, Germany; DKMS Clinical Trials Unit, Dresden, Germany
| | - Hein Putter
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands
| | - Liesbeth C de Wreede
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands; DKMS Clinical Trials Unit, Dresden, Germany
|
17
|
McCann ZH, Szaflarski M. Differences in county-level cardiovascular disease mortality rates due to damage caused by hurricane Matthew and the moderating effect of social capital: a natural experiment. BMC Public Health 2023; 23:60. [PMID: 36624492 PMCID: PMC9830798 DOI: 10.1186/s12889-022-14919-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Accepted: 12/21/2022] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND As the climate continues to warm, hurricanes will continue to increase in both severity and frequency. Hurricane damage is associated with cardiovascular events, but social capital may moderate this relationship. Social capital is a multidimensional concept with a rich theoretical tradition. Simply put, social capital refers to the social relationships and structures that provide individuals with material, financial, and emotional resources throughout their lives. Previous research has found an association between high levels of social capital and lower rates of cardiovascular disease (CVD) mortality. In post-disaster settings, social capital may protect against CVD mortality by improving access to life-saving resources. We examined the association between county-level hurricane damage and CVD mortality rates after Hurricane Matthew, and the moderating effect of several aspects of social capital and hurricane damage on this relationship. We hypothesized that (1) higher (vs. lower) levels of hurricane damage would be associated with increased CVD mortality rates and (2) in highly damaged counties, higher (vs. lower) levels of social capital would be associated with lower CVD mortality. METHODS Analysis used yearly (2013-2018) county-level sociodemographic and epidemiological data (n = 183). Sociodemographic data were compiled from federal surveys before and after Hurricane Matthew to construct, per prior literature, a social capital index based on four dimensions of social capital (sub-indices): family unity, informal civil society, institutional confidence, and collective efficacy. Epidemiological data comprised monthly CVD mortality rates constructed from monthly county-level CVD death counts from the CDC WONDER database and the US Census population estimates. Changes in CVD mortality based on level of hurricane damage were assessed using regression adjustment.
We used cluster robust Poisson population average models to determine the moderating effect of social capital on CVD mortality rates in both high and low-damage counties. RESULTS We found that mean levels of CVD mortality increased (before and after adjustment for sociodemographic controls) in both low-damage counties (unadjusted mean = 2.50, 95% CI [2.41, 2.59], adjusted mean = 2.50, 95% CI [2.40, 2.72]) and high-damage counties (mean = 2.44, CI [2.29, 2.46], adjusted mean = 2.51, 95% CI [2.49, 2.84]). Among the different social capital dimensions, institutional confidence was associated with reduced initial CVD mortality in low-damage counties (unadj. IRR 1.00, 95% CI [0.90, 1.11], adj. IRR 0.91 CI [0.87, 0.94]), but its association with CVD mortality trends was null. The overall effects of social capital and its sub-indices were largely nonsignificant. CONCLUSION Hurricane damage is associated with increased CVD mortality for 18 months after Hurricane Matthew. The role of social capital remains unclear. Future research should focus on improving measurement of social capital and quality of hurricane damage and CVD mortality data.
Affiliation(s)
- Zachary H McCann
- Department of Environmental Health, Rollins School of Public Health-Emory University, Atlanta, Georgia.
| | - Magdalena Szaflarski
- Department of Sociology, University of Alabama at Birmingham, Birmingham, AL, United States
|
18
|
McLernon DJ, Giardiello D, Van Calster B, Wynants L, van Geloven N, van Smeden M, Therneau T, Steyerberg EW. Assessing Performance and Clinical Usefulness in Prediction Models With Survival Outcomes: Practical Guidance for Cox Proportional Hazards Models. Ann Intern Med 2023; 176:105-114. [PMID: 36571841 DOI: 10.7326/m22-0844] [Citation(s) in RCA: 56] [Impact Index Per Article: 28.0] [Indexed: 12/27/2022] Open
Abstract
Risk prediction models need thorough validation to assess their performance. Validation of models for survival outcomes poses challenges due to the censoring of observations and the varying time horizon at which predictions can be made. This article describes measures to evaluate predictions and the potential improvement in decision making from survival models based on Cox proportional hazards regression. As a motivating case study, the authors consider the prediction of the composite outcome of recurrence or death (the "event") in patients with breast cancer after surgery. They developed a simple Cox regression model with 3 predictors, as in the Nottingham Prognostic Index, in 2982 women (1275 events over 5 years of follow-up) and externally validated this model in 686 women (285 events over 5 years). Improvement in performance was assessed after the addition of progesterone receptor as a prognostic biomarker. The model predictions can be evaluated across the full range of observed follow-up times or for the event occurring by the end of a fixed time horizon of interest. The authors first discuss recommended statistical measures that evaluate model performance in terms of discrimination, calibration, or overall performance. Further, they evaluate the potential clinical utility of the model to support clinical decision making according to a net benefit measure. They provide SAS and R code to illustrate internal and external validation. The authors recommend the proposed set of performance measures for transparent reporting of the validity of predictions from survival models.
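The net benefit measure mentioned in the abstract weighs true positives against false positives at a chosen risk threshold. The sketch below uses the standard decision-curve formula for a binary outcome at a fixed time horizon and assumes every patient's status at that horizon is known. It therefore ignores censoring, which the article's own SAS and R code has to handle; the function and variable names are hypothetical.

```python
def net_benefit(predicted_risk, had_event, threshold):
    """Net benefit of treating everyone whose predicted risk is at or above
    `threshold`: TP/n - FP/n * threshold / (1 - threshold).

    Assumes a binary outcome with no censoring (a simplification)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(predicted_risk)
    pairs = list(zip(predicted_risk, had_event))
    true_pos = sum(1 for p, e in pairs if p >= threshold and e)
    false_pos = sum(1 for p, e in pairs if p >= threshold and not e)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


# Hypothetical predicted 5-year risks and observed event status.
risk = [0.9, 0.8, 0.3, 0.1]
event = [True, False, True, False]
nb = net_benefit(risk, event, threshold=0.2)
```

With a threshold of 0.2, each false positive is weighted at 0.2 / 0.8 = 0.25 of a true positive, so two true positives and one false positive among four patients give a net benefit of 0.4375.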
Affiliation(s)
- David J McLernon
- Institute of Applied Health Sciences, University of Aberdeen, Aberdeen, United Kingdom (D.J.M.)
| | - Daniele Giardiello
- Netherlands Cancer Institute, Amsterdam, the Netherlands, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands, and Institute of Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy (D.G.)
| | - Ben Van Calster
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands, and Department of Development and Regeneration, Katholieke Universiteit Leuven, Leuven, Belgium (B.V.)
| | - Laure Wynants
- School for Public Health and Primary Care, Maastricht University, Maastricht, the Netherlands (L.W.)
| | - Nan van Geloven
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands (N.V., E.W.S.)
| | - Maarten van Smeden
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands (M.V.)
| | - Terry Therneau
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota (T.T.)
| | - Ewout W Steyerberg
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands (N.V., E.W.S.)
|
19
|
Lopacinska-Jørgensen J, Petersen PHD, Oliveira DVNP, Høgdall CK, Høgdall EV. Strategies for data normalization and missing data imputation and consequences for potential diagnostic microRNA biomarkers in epithelial ovarian cancer. PLoS One 2023; 18:e0282576. [PMID: 37141239 PMCID: PMC10159121 DOI: 10.1371/journal.pone.0282576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2022] [Accepted: 02/21/2023] [Indexed: 05/05/2023] Open
Abstract
MicroRNAs (miRNAs) are small non-coding RNA molecules regulating gene expression with diagnostic potential in different diseases, including epithelial ovarian carcinomas (EOC). As only a few studies have been published on the identification of stable endogenous miRNA in EOC, there is no consensus on which miRNAs should be used for standardization. Currently, U6-snRNA is widely adopted as a normalization control in RT-qPCR when investigating miRNAs in EOC, despite reports of its variable expression across cancers. Therefore, our goal was to compare different missing data and normalization approaches to investigate their impact on the choice of stable endogenous controls and subsequent survival analysis while performing expression analysis of miRNAs by RT-qPCR in the most frequent subtype of EOC: high-grade serous carcinoma (HGSC). 40 miRNAs were included based on their potential as stable endogenous controls or as biomarkers in EOC. Following RNA extraction from formalin-fixed paraffin embedded tissues from 63 HGSC patients, RT-qPCR was performed with a custom panel covering 40 target miRNAs and 8 controls. The raw data were analyzed by applying various strategies for choosing stable endogenous controls (geNorm, BestKeeper, NormFinder, the comparative ΔCt method and RefFinder), handling missing data (single/multiple imputation), and normalization (endogenous miRNA controls, U6-snRNA or global mean). Based on our study, we propose hsa-miR-23a-3p and hsa-miR-193a-5p, but not U6-snRNA, as endogenous controls in HGSC patients. Our findings are validated in two external cohorts retrieved from the NCBI Gene Expression Omnibus database. We show that the outcome of stability analysis depends on the histological composition of the cohort, which might suggest a unique pattern of miRNA stability profiles for each subtype of EOC.
Moreover, our data demonstrates the challenge of miRNA data analysis by presenting various outcomes from normalization and missing data imputation strategies on survival analysis.
Affiliation(s)
| | - Patrick H D Petersen
- Department of Pathology, Herlev Hospital, University of Copenhagen, Herlev, Denmark
| | | | - Claus K Høgdall
- Department of Gynaecology, Juliane Marie Centre, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Estrid V Høgdall
- Department of Pathology, Herlev Hospital, University of Copenhagen, Herlev, Denmark
|
20
|
Paul SK, Ling J, Samanta M, Montvida O. Robustness of Multiple Imputation Methods for Missing Risk Factor Data from Electronic Medical Records for Observational Studies. J Healthc Inform Res 2022; 6:385-400. [PMID: 36744084 PMCID: PMC9892403 DOI: 10.1007/s41666-022-00119-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Revised: 07/26/2022] [Accepted: 08/18/2022] [Indexed: 02/07/2023]
Abstract
Evaluating appropriate methodologies for imputation of missing outcome data from electronic medical records (EMRs) is crucial but lacking for observational studies. Using US EMR in people with type 2 diabetes treated over 12 and 24 months with dipeptidyl peptidase 4 inhibitors (DPP-4i, n = 38,483) and glucagon-like peptide 1 receptor agonists (GLP-1RA, n = 8,977), predictors of missingness of the disease biomarker (HbA1c) were explored. Robustness of multiple imputation (MI) by chained equations, two-fold MI (MI-2F) and MI with Markov chain Monte Carlo were compared to complete case analyses for drawing inferences. Compared to younger people (age quartile Q1), those in age quartile Q3 and Q4 were less likely to have missing HbA1c by 25-32% (range of OR CI: 0.55-0.88) at 6-month follow-up and by 26-39% (range of OR CI: 0.50-0.80) at 12-month follow-up. People with HbA1c ≥ 7.5% at baseline were 12% (OR CI: 0.83, 0.93) and 14% (OR CI: 0.77, 0.97) less likely to have missing data at 6-month follow-up in the DPP-4i and GLP-1RA groups, respectively. All imputation methods provided similar HbA1c distributions during follow-up as observed with complete case analyses. The clinical inferences based on absolute change in HbA1c and by proportion of people reducing HbA1c to a clinically acceptable level (≤ 7%) were also similar between imputed data and complete case analyses. The MI-2F method provided a marginally smaller mean difference between observed and imputed data with a relatively smaller standard error of difference, compared to other methods, when evaluated for consistency through artificial within-sample analyses. The established MI techniques can be reliably employed for missing outcome data imputations in large EMR-based relational databases, supporting efficient study design and robust clinical inferences in pharmaco-epidemiological studies. Supplementary Information The online version contains supplementary material available at 10.1007/s41666-022-00119-w.
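Whatever imputation engine is used, results from the imputed data sets are usually combined with Rubin's rules. The abstract does not state its pooling step, so the sketch below is an assumption about the usual practice rather than a description of the authors' analysis; the function name and the per-imputation numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets using Rubin's rules.

    Returns the pooled estimate and its total variance
    T = W + (1 + 1/m) * B, where W is the mean within-imputation variance
    and B is the between-imputation (sample) variance of the estimates."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical mean HbA1c changes and their squared standard errors
# from three imputed data sets.
est = [-1.0, -1.2, -0.8]
var = [0.04, 0.04, 0.04]
pooled_est, total_var = pool_rubin(est, var)
```

The between-imputation term is what separates multiple from single imputation: it inflates the variance to reflect uncertainty about the missing values themselves.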
Affiliation(s)
- Sanjoy K. Paul
- Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia
| | - Joanna Ling
- Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia
- Royal Melbourne Institute of Technology, Melbourne, Australia
| | - Mayukh Samanta
- Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia
| | - Olga Montvida
- Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia
|
21
|
Lin Y, Shao H, Shi L, Anderson AH, Fonseca V. Predicting incident heart failure among patients with type 2 diabetes mellitus: The DM-CURE risk score. Diabetes Obes Metab 2022; 24:2203-2211. [PMID: 35801340 DOI: 10.1111/dom.14806] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/11/2022] [Revised: 06/20/2022] [Accepted: 07/01/2022] [Indexed: 11/30/2022]
Abstract
AIM Early identification and prediction of incident heart failure (HF) is important because of severe morbidity and mortality. This study aimed to predict onset of HF among patients with diabetes. METHODS A time-varying Cox model was derived from the ACCORD clinical trial to predict the risk of incident HF, defined by hospitalization for HF (HHF). External validation was performed on patient-level data from the Harmony Outcome trial and the Chronic Renal Insufficiency Cohort (CRIC) study. The model was transformed into an integer-based scoring algorithm for 10-year risk evaluation. A stepwise algorithm identified and selected predictors from demographic characteristics, physical examination, laboratory results, medical history, medication and health care utilization, to develop a risk prediction model. The main outcome was incident HF, defined by HHF. The C statistic and Brier score were used to assess model performance. RESULTS In total, 9649 patients with diabetes free of HF were used, with median follow-up of 4 years and 299 incident HHF events. The model identified several predictors for the 10-year HF incidence risk score 'DM-CURE': socio-Demographic [education, age at type 2 diabetes (T2DM) diagnosis], Metabolic (glycated haemoglobin, systolic blood pressure, body mass index, high-density lipoproteins), diabetes-related Complications (myocardial infarction, revascularization, cardiovascular medications, neuropathy, hypertension duration, albuminuria, urine albumin-to-creatinine ratio, end-stage kidney disease), and health care Utilization (all-cause hospitalization, emergency room visits) for Risk Evaluation. Among them, the strongest impact factors for future HF were age at T2DM diagnosis, health care utilization and cardiovascular disease-related variables.
The model showed good discrimination (C statistic: 0.838, 95% CI: 0.821-0.855) and calibration (Brier score: 0.006, 95% CI: 0.006-0.007) in the ACCORD data and good performance in the validation data (Harmony: C statistic: 0.881, 95% CI: 0.863-0.899; CRIC: C statistic: 0.813, 95% CI: 0.794-0.833). The 10-year risk of incident HF increased in a graded fashion, from ≤1% in quintile 1 (score ≤14), 1%-5% in quintile 2 (score 15-23), 5%-10% in quintile 3 (score 24-27), 10%-20% in quintile 4 (score 28-33) and ≥20% in quintile 5 (score >33). CONCLUSIONS The DM-CURE model and score were useful for population risk stratification of incident HHF among patients with T2DM and can be easily applied in clinical practice.
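The score bands reported above translate directly into a lookup. The sketch below encodes only the quintile cut-offs and 10-year risk ranges stated in the abstract; it does not compute the score itself, whose point weights are not given here. The function name is hypothetical, and an integer score is assumed.

```python
def dm_cure_risk_band(score):
    """Map an integer DM-CURE score to (quintile, 10-year HF risk range),
    using the cut-offs reported in the abstract."""
    if score <= 14:
        return 1, "<=1%"
    if score <= 23:
        return 2, "1%-5%"
    if score <= 27:
        return 3, "5%-10%"
    if score <= 33:
        return 4, "10%-20%"
    return 5, ">=20%"


# Example lookups at a band edge, mid-range, and above the top cut-off.
low = dm_cure_risk_band(14)
mid = dm_cure_risk_band(25)
high = dm_cure_risk_band(34)
```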
Affiliation(s)
- Yilu Lin
- Department of Health Policy and Management, School of Public Health and Tropical Medicine, Tulane University, New Orleans, Louisiana, United States
| | - Hui Shao
- Department of Pharmaceutical Outcomes and Policy, College of Pharmacy, University of Florida, Gainesville, Florida, United States
| | - Lizheng Shi
- Department of Health Policy and Management, School of Public Health and Tropical Medicine, Tulane University, New Orleans, Louisiana, United States
| | - Amanda H Anderson
- Department of Epidemiology, School of Public Health and Tropical Medicine, Tulane University, New Orleans, Louisiana, United States
| | - Vivian Fonseca
- Department of Medicine and Pharmacology, School of Medicine, Tulane University, New Orleans, Louisiana, United States
|
22
|
Bonneville EF, Resche-Rigon M, Schetelig J, Putter H, de Wreede LC. Multiple imputation for cause-specific Cox models: Assessing methods for estimation and prediction. Stat Methods Med Res 2022; 31:1860-1880. [PMID: 35658734 PMCID: PMC9523822 DOI: 10.1177/09622802221102623] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/25/2022]
Abstract
In studies analyzing competing time-to-event outcomes, interest often lies in both estimating the effects of baseline covariates on the cause-specific hazards and predicting cumulative incidence functions. When missing values occur in these baseline covariates, they may be discarded as part of a complete-case analysis or multiply imputed. In the latter case, the imputations may be performed either compatibly with a substantive model pre-specified as a cause-specific Cox model [substantive model compatible fully conditional specification (SMC-FCS)], or approximately so [multivariate imputation by chained equations (MICE)]. In a large simulation study, we assessed the performance of these three different methods in terms of estimating cause-specific regression coefficients and predicting cumulative incidence functions. Concerning regression coefficients, results provide further support for use of SMC-FCS over MICE, particularly when covariate effects are large and the baseline hazards of the competing events are substantially different. Complete-case analysis also shows adequate performance in settings where missingness is not outcome dependent. With regard to cumulative incidence prediction, SMC-FCS and MICE performed more similarly, as also evidenced in the illustrative analysis of competing outcomes following a hematopoietic stem cell transplantation. The findings are discussed alongside recommendations for practising statisticians.
Affiliation(s)
- Edouard F Bonneville
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | - Matthieu Resche-Rigon
- Service de Biostatistique et Information Médicale, Hôpital Saint-Louis, Paris, France
- Centre de Recherche en Epidémiologie et Statistiques Sorbonne Paris Cité, Paris, France
- ECSTRRA Team, INSERM, Paris, France
| | - Johannes Schetelig
- Dresden University Hospital, Dresden, Germany
- DKMS Clinical Trials Unit, Dresden, Germany
| | - Hein Putter
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | - Liesbeth C de Wreede
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- DKMS Clinical Trials Unit, Dresden, Germany
|
23
|
Atem FD, Bluestein MA, Harrell MB, Chen B, Messiah SE, Kuk AE, Sterling KL, Spells CE, Pérez A. Precise Estimation for the Age of Initiation of Tobacco Use Among U.S. Youth: Finding from the Population Assessment of Tobacco and Health (PATH) Study, 2013-2017. Biostatistics and Biometrics Open Access Journal 2022; 11:555801. [PMID: 36777448 PMCID: PMC9912413 DOI: 10.19080/bboaj.2022.11.555801] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/24/2022]
Abstract
Context Youth tobacco use remains a prominent United States public health issue with a high economic and health burden. Method We pooled never and ever users at youth's first wave of PATH participation (waves 1-3) to estimate age of initiation for hookah, e-cigarettes, cigarettes, traditional cigars, cigarillos, and smokeless tobacco prospectively (waves 2-4). Age of initiation of each tobacco product was estimated using weighted interval-censored survival analyses. Weighted interval-censored Cox proportional hazards regression models were used to assess the association of ever use of the tobacco product at the first wave of PATH participation, sex, and race/ethnicity with the age of initiation of ever use of each tobacco product. Sensitivity analyses were performed to understand the impact of the recalled age of initiation for the left-censored participants by replacing the recalled age of initiation with a uniform "6" years lower bound. Results The proportion of those who ever used each tobacco product at the first wave of PATH participation ranged from 1.8% for traditional cigars to 10.4% for cigarettes. There was a significant increase in ever use of each tobacco product after the age of 14, with e-cigarettes and cigarettes showing the highest cumulative incidence of initiation by age 21, while smokeless tobacco and cigarillos recorded the lowest cumulative incidence by age 21. The adjusted Cox models showed boys initiated at earlier ages for all of these tobacco products except for hookah, which showed no difference. Similarly, apart from ever use of hookah, non-Hispanic White youth were more likely to initiate each tobacco product at earlier ages compared to Hispanic, non-Hispanic Black, and non-Hispanic Other youth. Conclusion The increased sample size and the inclusion of ever users yielded greater precision for age of initiation of each tobacco product than analyses limited to never users at the first wave of PATH participation.
These analyses can help elucidate population selection criteria for estimating the age of initiation of tobacco products.
Affiliation(s)
- Folefac D. Atem
- The University of Texas Health Science Center at Houston, School of Public Health, Center for Pediatric Population Health, Children's Health System of Texas and UT Health Science Center School of Public Health, School of Public Health in Dallas, 2777 N Stemmons Fwy, Dallas, TX
| | - Meagan A. Bluestein
- The University of Texas Health Science Center at Houston, School of Public Health, Michael & Susan Dell Center for Healthy Living, 1616 Guadalupe Suite 6.300 Austin TX 78701
| | - Melissa B. Harrell
- The University of Texas Health Science Center at Houston, School of Public Health, Michael & Susan Dell Center for Healthy Living, 1616 Guadalupe Suite 6.300 Austin TX 78701, Consultant with litigation involving the vaping industry, Austin TX
| | - Baojiang Chen
- The University of Texas Health Science Center at Houston, School of Public Health, Michael & Susan Dell Center for Healthy Living, 1616 Guadalupe Suite 6.300 Austin TX 78701
| | - Sarah E. Messiah
- The University of Texas Health Science Center at Houston, School of Public Health, Center for Pediatric Population Health, Children's Health System of Texas and UT Health Science Center School of Public Health, School of Public Health in Dallas, 2777 N Stemmons Fwy, Dallas, TX
| | - Arnold E. Kuk
- The University of Texas Health Science Center at Houston, School of Public Health, Michael & Susan Dell Center for Healthy Living, 1616 Guadalupe Suite 6.300 Austin TX 78701
| | - Kymberle L. Sterling
- The University of Texas Health Science Center at Houston, School of Public Health, Michael & Susan Dell Center for Healthy Living, 1616 Guadalupe Suite 6.300 Austin TX 78701
| | - Charles E. Spells
- The University of Texas Health Science Center at Houston, School of Public Health, Michael & Susan Dell Center for Healthy Living, 1616 Guadalupe Suite 6.300 Austin TX 78701
| | - Adriana Pérez
- The University of Texas Health Science Center at Houston, School of Public Health, Michael & Susan Dell Center for Healthy Living, 1616 Guadalupe Suite 6.300 Austin TX 78701
|
24
|
Mohammed YS, Abdelkader H, Pławiak P, Hammad M. A novel model to optimize multiple imputation algorithm for missing data using evolution methods. Biomed Signal Process Control 2022; 76:103661. [DOI: 10.1016/j.bspc.2022.103661] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/30/2023]
|
25
|
Yu B, Steptoe A, Chen Y. Social isolation, loneliness, and all-cause mortality: A cohort study of 35,254 Chinese older adults. J Am Geriatr Soc 2022; 70:1717-1725. [PMID: 35229887 DOI: 10.1111/jgs.17708] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 08/24/2021] [Revised: 02/06/2022] [Accepted: 02/11/2022] [Indexed: 01/21/2023]
Abstract
BACKGROUND Few studies of social isolation, loneliness and associations with all-cause mortality in older adults have been conducted in non-Western countries. The aim of this study was to conduct such an analysis in a nationally representative sample of Chinese older adults. METHODS This study used eight waves of data from the Chinese Longitudinal Healthy Longevity Survey from 1998 to 2018 and focused on participants aged ≥60 years. A total of 21,570 people died (61.2%) over a median follow-up of 4.8 years. Social isolation, loneliness, demographic, health and lifestyle factors were measured at baseline. The primary outcome was all-cause mortality. Cox proportional hazard regression models were used to examine the associations of isolation and loneliness with all-cause mortality. RESULTS This study included 35,254 participants with mean age of 86.63 ± 11.39 years. Social isolation was significantly associated with an increased mortality (adjusted HR 1.22; 95% CI 1.18-1.25; p < 0.01). The association of loneliness with mortality was nonsignificant after adjustment for health indicators and low psychological well-being (HR 1.01; 95% CI 0.98-1.04; p = 0.69). However, when stratified by age, there was a significant association of loneliness with mortality among participants aged <80 years (HR 1.15; 95% CI 1.05-1.26; p < 0.01). CONCLUSIONS Social isolation was associated with an increased all-cause mortality among the older Chinese adults. However, loneliness was associated with an increased mortality only among younger participants. Public health interventions aimed at increasing social connectedness may potentially reduce excess mortality among older adults.
Affiliation(s)
- Bin Yu
- Institute of Applied Psychology, Tianjin University, Tianjin, China
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Department of Psychiatry and Psychology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Andrew Steptoe
- Department of Behavioural Science and Health, University College London, London, UK
| | - Yongjie Chen
- Department of Epidemiology and Statistics, School of Public Health, Tianjin Medical University, Tianjin, China
|
26
|
Okpara C, Edokwe C, Ioannidis G, Papaioannou A, Adachi JD, Thabane L. The reporting and handling of missing data in longitudinal studies of older adults is suboptimal: a methodological survey of geriatric journals. BMC Med Res Methodol 2022; 22:122. [PMID: 35473665 PMCID: PMC9040343 DOI: 10.1186/s12874-022-01605-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/18/2021] [Accepted: 04/13/2022] [Indexed: 11/26/2022] Open
Abstract
Background Missing data are common in longitudinal studies, and more so in studies of older adults, who are susceptible to health and functional decline that limit completion of assessments. We assessed the extent, current reporting, and handling of missing data in longitudinal studies of older adults. Methods Medline and Embase databases were searched from 2015 to 2019 for publications on longitudinal observational studies conducted among persons ≥55 years old. The search was restricted to 10 general geriatric journals published in English. Reporting and handling of missing data were assessed using questions developed from the recommended standards. Data were summarised descriptively as frequencies and proportions. Results A total of 165 studies were included in the review from 7032 identified records. In approximately half of the studies (n = 97, 62.5%), there was either no comment on missing data or unclear descriptions. The percentage of missing data varied from 0.1 to 55%, with a 14% average among the studies that reported having missing data. Complete case analysis was the most common method for handling missing data, with nearly 75% of the studies (n = 52) excluding individual observations due to missing data, at the initial phase of study inclusion or at the analysis stage. Of the 10 studies where multiple imputation was used, only 1 (10.0%) study followed the guideline for reporting the procedure fully using online supplementary documents. Conclusion The current reporting and handling of missing data in longitudinal observational studies of older adults are inadequate. Journal endorsement and implementation of guidelines may potentially improve the quality of missing data reporting. Further, authors should be encouraged to use online supplementary files to provide additional details on how missing data were addressed, to allow for more transparency and comprehensive appraisal of studies.
Supplementary Information The online version contains supplementary material available at 10.1186/s12874-022-01605-w.
Affiliation(s)
- Chinenye Okpara
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, L8S 4L8, Canada.
| | | | - George Ioannidis
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, L8S 4L8, Canada; GERAS Centre, Hamilton Health Sciences, Hamilton, ON, Canada; Department of Medicine, McMaster University, Hamilton, ON, Canada
| | - Alexandra Papaioannou
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, L8S 4L8, Canada; GERAS Centre, Hamilton Health Sciences, Hamilton, ON, Canada; Department of Medicine, McMaster University, Hamilton, ON, Canada
| | - Jonathan D Adachi
- GERAS Centre, Hamilton Health Sciences, Hamilton, ON, Canada; Department of Medicine, McMaster University, Hamilton, ON, Canada
| | - Lehana Thabane
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, L8S 4L8, Canada; GERAS Centre, Hamilton Health Sciences, Hamilton, ON, Canada; Biostatistics Unit, Research Institute of St Joseph's Healthcare, Hamilton, ON, Canada; Faculty of Health Sciences, University of Johannesburg, Johannesburg, South Africa
| |
|
27
|
Diop M, Epstein D. Comparing methods for handling missing cost and quality of life data in the Early Endovenous Ablation in Venous Ulceration trial. Cost Eff Resour Alloc 2022; 20:18. [PMID: 35392924 PMCID: PMC8991820 DOI: 10.1186/s12962-022-00351-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Accepted: 03/18/2022] [Indexed: 11/10/2022] Open
Abstract
Objectives This study compares methods for handling missing data to conduct cost-effectiveness analysis in the context of a clinical study. Methods Patients in the Early Endovenous Ablation in Venous Ulceration (EVRA) trial had between 1 year and 5.5 years (median 3 years) of follow-up under early or deferred endovenous ablation. This study compares complete-case analysis (CCA), multiple imputation using linear regression (MILR) and using predictive mean matching (MIPMM), a Bayesian parametric approach using the R package missingHE (BPA), repeated measures fixed effect (RMFE) and repeated measures mixed model (RMM). The outcomes were total mean costs and total mean quality-adjusted life years (QALYs) at different time horizons (1 year, 3 years and 5 years). Results All methods found no statistically significant difference in cost at the 5% level in all time horizons, and all methods found statistically significantly greater mean QALY at year 1. By year 3, only BPA showed a statistically significant difference in QALY between treatments. Standard errors differed substantially between the methods employed. Conclusion CCA can be biased if data are MAR and is wasteful of the data. Hence the results for CCA are likely to be inaccurate. Other methods coincide in suggesting that early intervention is cost-effective at a threshold of £30,000 per QALY at 1, 3 and 5 years. However, the variation in the results across the methods does generate some additional methodological uncertainty, underlining the importance of conducting sensitivity analyses using alternative approaches. Supplementary Information The online version contains supplementary material available at 10.1186/s12962-022-00351-6.
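The multiple-imputation approaches named in this abstract (MILR, MIPMM) each produce several completed datasets whose estimates are then combined into one result. As a minimal sketch of that combining step only, the following applies the standard Rubin's rules to per-imputation estimates and variances. The function name `pool_rubin` and the numbers in the usage lines are hypothetical and are not taken from the EVRA analysis; the abstract does not describe how the authors implemented pooling.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)        # average of the m point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical incremental-QALY estimates from three imputed datasets.
estimate, std_error = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, std_error)
```

The between-imputation term is what separates this from complete-case analysis or single imputation: it carries the extra uncertainty due to the missing values into the pooled standard error.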
Affiliation(s)
- Modou Diop
- Department of Applied Economics, University of Granada, Campus de Cartuja, 18071, Granada, Spain.
- David Epstein
- Department of Applied Economics, University of Granada, Campus de Cartuja, 18071, Granada, Spain
|
28
|
Remigio RV, Turpin R, Raimann JG, Kotanko P, Maddux FW, Sapkota AR, Liang XZ, Puett R, He X, Sapkota A. Assessing proximate intermediates between ambient temperature, hospital admissions, and mortality in hemodialysis patients. Environmental Research 2022; 204:112127. [PMID: 34582801 PMCID: PMC8901270 DOI: 10.1016/j.envres.2021.112127] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/12/2021] [Revised: 08/19/2021] [Accepted: 09/22/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND Typical thermoregulatory responses to elevated temperatures among healthy individuals include reduced blood pressure and perspiration. Individuals with end-stage kidney disease (ESKD) are susceptible to systemic fluctuations caused by ambient temperature changes that may increase morbidity and mortality. We investigated whether pre-dialysis systolic blood pressure (preSBP) and interdialytic weight gain (IDWG) can independently mediate the association between ambient temperature, all-cause hospital admissions (ACHA), and all-cause mortality (ACM). METHODS The study population consisted of ESKD patients receiving hemodialysis treatments at Fresenius Medical Care facilities in Philadelphia County, PA, from 2011 to 2019 (n = 1981). Within a time-to-event framework, we estimated the association between daily maximum dry-bulb temperature (TMAX) and, as separate models, ACHA and ACM during warmer calendar months. Clinically measured preSBP and IDWG responses to temperature increases were estimated using linear mixed effect models. We employed the difference (c-c') method to decompose total effect models for ACHA and ACM using preSBP and IDWG as time-dependent mediators. Covariate adjustments for exposure-mediator and total and direct effect models include age, race, ethnicity, blood pressure medication use, treatment location, preSBP, and IDWG. We considered lags up to two days for exposure and 1-day lag for mediator variables (Lag 2-Lag 1) to assure temporality between exposure-outcome models. Sensitivity analyses for 2-day (Lag 2-only) and 1-day (Lag 1-only) lag structures were also conducted. RESULTS Based on Lag 2- Lag 1 temporal ordering, 1 °C increase in daily TMAX was associated with increased hazard of ACHA by 1.4% (adjusted hazard ratio (HR), 1.014; 95% confidence interval, 1.007-1.021) and ACM 7.5% (adjusted HR, 1.075, 1.050-1.100). 
Short-term lag exposures to 1 °C increase in temperature predicted mean reductions in IDWG and preSBP by 0.013-0.015% and 0.168-0.229 mmHg, respectively. Mediation analysis for ACHA identified significant indirect effects for all three studied pathways (preSBP, IDWG, and preSBP + IDWG) and significant indirect effects for IDWG and conjoined preSBP + IDWG pathways for ACM. Of note, only 1.03% of the association between temperature and ACM was mediated through preSBP. The mechanistic path for IDWG, independent of preSBP, demonstrated inconsistent mediation and, consequently, potential suppression effects in ACHA (-15.5%) and ACM (-6.3%) based on combined pathway models. Proportion mediated estimates from preSBP + IDWG pathways achieved 2.2% and 0.3% in combined pathway analysis for ACHA and ACM outcomes, respectively. Lag 2 discrete-time ACM mediation models exhibited consistent mediation for all three pathways suggesting that 2-day lag in IDWG and preSBP responses can explain 2.11% and 4.41% of total effect association between temperature and mortality, respectively. CONCLUSION We corroborated the previously reported association between ambient temperature, ACHA and ACM. Our results foster the understanding of potential physiological linkages that may explain or suppress temperature-driven hospital admissions and mortality risks. Of note, concomitant changes in preSBP and IDWG may have little intermediary effect when analyzed in combined pathway models. These findings advance our assessment of candidate interventions to reduce the impact of outdoor temperature change on ESKD patients.
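The difference (c-c') method mentioned in the methods compares the exposure coefficient from a total-effect model (c) with the coefficient from a model that also adjusts for the mediator (c'). A minimal sketch of the resulting proportion-mediated calculation follows. The function name and the coefficient values are hypothetical illustrations on an assumed log-hazard scale; they are not the study's estimates, and the abstract does not specify the scale the authors used.

```python
def proportion_mediated(total_effect, direct_effect):
    """Difference-method proportion mediated: (c - c') / c.

    total_effect:  exposure coefficient c from the model without the mediator
    direct_effect: exposure coefficient c' from the model adjusting for the mediator
    A negative result means the direct effect exceeds the total effect,
    which is the pattern described as inconsistent mediation or suppression.
    """
    if total_effect == 0:
        raise ValueError("total effect is zero; proportion mediated is undefined")
    return (total_effect - direct_effect) / total_effect


# Hypothetical coefficients, not values reported in the study.
consistent = proportion_mediated(0.10, 0.08)   # mediator explains part of the effect
suppression = proportion_mediated(0.10, 0.12)  # adjusting for the mediator enlarges the effect
print(f"{consistent:.1%}", f"{suppression:.1%}")
```

Read this way, the negative percentages reported for the IDWG pathway correspond to cases where c' is larger than c.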
Affiliation(s)
- Richard V Remigio
- Maryland Institute for Applied Environmental Health, University of Maryland-College Park, School of Public Health, College Park, MD, USA.
- Rodman Turpin
- Department of Epidemiology and Biostatistics, University of Maryland-College Park, School of Public Health, College Park, MD, USA
- Peter Kotanko
- Research Division, Renal Research Institute, New York, NY, USA; Icahn School of Medicine, Mount Sinai Hospital, New York, NY, USA
- Amy Rebecca Sapkota
- Maryland Institute for Applied Environmental Health, University of Maryland-College Park, School of Public Health, College Park, MD, USA
- Xin-Zhong Liang
- Department of Atmospheric and Oceanic Sciences, University of Maryland-College Park, College Park, MD, USA
- Robin Puett
- Maryland Institute for Applied Environmental Health, University of Maryland-College Park, School of Public Health, College Park, MD, USA
- Xin He
- Department of Epidemiology and Biostatistics, University of Maryland-College Park, School of Public Health, College Park, MD, USA
- Amir Sapkota
- Maryland Institute for Applied Environmental Health, University of Maryland-College Park, School of Public Health, College Park, MD, USA
|
29
|
Martin GP, Jenkins DA, Bull L, Sisk R, Lin L, Hulme W, Wilson A, Wang W, Barrowman M, Sammut-Powell C, Pate A, Sperrin M, Peek N. Toward a framework for the design, implementation, and reporting of methodology scoping reviews. J Clin Epidemiol 2020; 127:191-197. [PMID: 32726605 DOI: 10.1016/j.jclinepi.2020.07.014] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 02/21/2020] [Revised: 06/12/2020] [Accepted: 07/20/2020] [Indexed: 12/17/2022]
Abstract
BACKGROUND AND OBJECTIVE In view of the growth of published articles, there is an increasing need for studies that summarize scientific research. An increasingly common review is a "methodology scoping review," which provides a summary of existing analytical methods, techniques and software that have been proposed or applied in research articles to address an analytical problem or further an analytical approach. However, guidelines for their design, implementation, and reporting are limited. METHODS Drawing on the experiences of the authors, which were consolidated through a series of face-to-face workshops, we summarize the challenges inherent in conducting a methodology scoping review and offer suggestions of best practice to promote future guideline development. RESULTS We identified three challenges of conducting a methodology scoping review. First, identification of search terms; one cannot usually define the search terms a priori, and the language used for a particular method can vary across the literature. Second, the scope of the review requires careful consideration because new methodology is often not described (in full) within abstracts. Third, many new methods are motivated by a specific clinical question, where the methodology may only be documented in supplementary materials. We formulated several recommendations that build upon existing review guidelines. These recommendations ranged from an iterative approach to defining search terms through to screening and data extraction processes. CONCLUSION Although methodology scoping reviews are an important aspect of research, there is currently a lack of guidelines to standardize their design, implementation, and reporting. We recommend a wider discussion on this topic.
Affiliation(s)
- Glen P Martin
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK.
- David A Jenkins
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK; NIHR Greater Manchester Patient Safety Translational Research Centre, University of Manchester, Manchester, UK
- Lucy Bull
- Manchester Epidemiology Centre Versus Arthritis, Centre for Musculoskeletal Research, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK; Centre for Biostatistics, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK
- Rose Sisk
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Lijing Lin
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- William Hulme
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Anthony Wilson
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK; Adult Critical Care, Manchester University Hospitals NHS Foundation Trust, Manchester, UK
- Wenjuan Wang
- Department of Population Health Sciences, Faculty of Life Science and Medicine, King's College London, London, UK
- Michael Barrowman
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Camilla Sammut-Powell
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Alexander Pate
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Matthew Sperrin
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Niels Peek
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK; NIHR Greater Manchester Patient Safety Translational Research Centre, University of Manchester, Manchester, UK
|