1
Gosho M, Ohigashi T, Nagashima K, Ito Y, Maruo K. Bias in odds ratios from logistic regression methods with sparse data sets. J Epidemiol 2021. [PMID: 34565762; PMCID: PMC10165217; DOI: 10.2188/jea.je20210089] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022]
Abstract
BACKGROUND Logistic regression models are widely used to evaluate the association between a binary outcome and a set of covariates. However, when there are few study participants at the outcome and covariate levels, the models lead to bias of the odds ratio (OR) estimated using the maximum likelihood (ML) method. This bias is known as sparse data bias, and the estimated OR can yield impossibly large values because of data sparsity. However, this bias has been ignored in most epidemiological studies. METHODS We review several methods for reducing sparse data bias in logistic regression. The primary aim is to evaluate the Bayesian methods in comparison with the classical methods, such as the ML, Firth's, and exact methods using a simulation study. We also apply these methods to a real data set. RESULTS Our simulation results indicate that the bias of the OR from the ML, Firth's, and exact methods is considerable. Furthermore, the Bayesian methods with hyper-g prior modeling of the prior covariance matrix for regression coefficients reduced the bias under the null hypothesis, whereas the Bayesian methods with log F-type priors reduced the bias under the alternative hypothesis. CONCLUSION The Bayesian methods using log F-type priors and hyper-g prior are superior to the ML, Firth's, and exact methods when fitting logistic models to sparse data sets. The choice of a preferable method depends on the null and alternative hypothesis. Sensitivity analysis is important to understand the robustness of the results in sparse data analysis.
Affiliation(s)
- Masahiko Gosho
  - Department of Biostatistics, Faculty of Medicine, University of Tsukuba
- Tomohiro Ohigashi
  - Graduate School of Comprehensive Human Sciences, University of Tsukuba; Department of Biostatistics, Tsukuba Clinical Research & Development Organization, University of Tsukuba
- Kengo Nagashima
  - Research Center for Medical and Health Data Science, The Institute of Statistical Mathematics
- Yuri Ito
  - Department of Medical Statistics, Research & Development Center, Osaka Medical College
- Kazushi Maruo
  - Department of Biostatistics, Faculty of Medicine, University of Tsukuba
2
Affiliation(s)
- Jialiang Mao
  - Department of Statistical Science, Duke University, Durham, NC
- Yuhan Chen
  - Department of Statistical Science, Duke University, Durham, NC
- Li Ma
  - Department of Statistical Science, Duke University, Durham, NC
3
Heyard R, Timsit JF, Held L. Validation of discrete time-to-event prediction models in the presence of competing risks. Biom J 2019; 62:643-657. [PMID: 31368172; PMCID: PMC7217187; DOI: 10.1002/bimj.201800293] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 09/14/2018] [Revised: 06/21/2019] [Accepted: 06/28/2019] [Indexed: 11/06/2022]
Abstract
Clinical prediction models play a key role in risk stratification, therapy assignment and many other fields of medical decision making. Before they can enter clinical practice, their usefulness has to be demonstrated using systematic validation. Methods to assess their predictive performance have been proposed for continuous, binary, and time-to-event outcomes, but the literature on validation methods for discrete time-to-event models with competing risks is sparse. The present paper tries to fill this gap and proposes new methodology to quantify discrimination, calibration, and prediction error (PE) for discrete time-to-event outcomes in the presence of competing risks. In our case study, the goal was to predict the risk of ventilator-associated pneumonia (VAP) attributed to Pseudomonas aeruginosa in intensive care units (ICUs). Competing events are extubation, death, and VAP due to other bacteria. The aim of this application is to validate complex prediction models developed in previous work on more recently available validation data.
Affiliation(s)
- Rachel Heyard
  - Department of Biostatistics at the Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Hirschengraben, Switzerland
- Leonhard Held
  - Department of Biostatistics at the Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Hirschengraben, Switzerland
4
5
Affiliation(s)
- Yingbo Li
  - Department of Mathematical Sciences, Clemson University, Clemson, SC
6
Affiliation(s)
- Manuela Ott
  - Epidemiology, Biostatistics and Prevention Institute; University of Zurich; Hirschengraben 84 Zurich 8001 Switzerland
- Leonhard Held
  - Epidemiology, Biostatistics and Prevention Institute; University of Zurich; Hirschengraben 84 Zurich 8001 Switzerland
7
Heyard R, Timsit J, Essaied WI, Held L. Dynamic clinical prediction models for discrete time‐to‐event data with competing risks—A case study on the OUTCOMEREA database. Biom J 2018; 61:514-534. [DOI: 10.1002/bimj.201700259] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/15/2017] [Revised: 08/10/2018] [Accepted: 08/14/2018] [Indexed: 11/08/2022]
Affiliation(s)
- Rachel Heyard
  - Department of Biostatistics at the Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Hirschengraben 84, Zurich, Switzerland
- Leonhard Held
  - Department of Biostatistics at the Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Hirschengraben 84, Zurich, Switzerland
8
Jensen KO, Heyard R, Schmitt D, Mica L, Ossendorf C, Simmen HP, Wanner GA, Werner CML, Held L, Sprengel K. Which pre-hospital triage parameters indicate a need for immediate evaluation and treatment of severely injured patients in the resuscitation area? Eur J Trauma Emerg Surg 2017; 45:91-98. [PMID: 29238847; DOI: 10.1007/s00068-017-0889-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 07/20/2017] [Accepted: 12/08/2017] [Indexed: 10/18/2022]
Abstract
PURPOSE To find ways to reduce the rate of over-triage without drastically increasing the rate of under-triage, we applied a current guideline and identified relevant pre-hospital triage predictors that indicate the need for immediate evaluation and treatment of severely injured patients in the resuscitation area. METHODS Data for adult trauma patients admitted to our level-1 trauma centre in a one-year period were collected. Outpatients were excluded. Correct triage for trauma team activation was identified for patients with an ISS or NISS ≥ 16 or the need for ICU treatment due to trauma sequelae. In this retrospective analysis, patients were assigned to trauma team activation according to the S3 guideline of the German Trauma Society. This assignment was compared to the actual need for activation as defined above. 13 potential predictors were retained. The relevance of the predictors was assessed and 14 models of interest were considered. The performance of these potential triage models to predict the need for trauma team activation was evaluated with leave-one-out cross-validated Brier and logarithmic scores. RESULTS A total of 1934 inpatients ≥ 16 years were admitted to our trauma department (mean age 48 ± 22 years, 38% female). Sixty-nine per cent (n = 1341) were allocated to the emergency department and 31% (n = 593) were treated in the resuscitation room. The median ISS was 4 (IQR 7) points and the median NISS 4 (IQR 6) points. The mortality rate was 3.5% (n = 67) corresponding to a standardized mortality ratio of 0.73. Under-triage occurred in 1.3% (26/1934) and over-triage in 18% (349/1934). A model with eight predictors was finally selected with under-triage rate of 3.3% (63/1934) and over-triage rate of 10.8% (204/1934). CONCLUSION The trauma team activation criteria could be reduced to eight predictors without losing their predictive performance. Non-relevant parameters such as EMS provider judgement, endotracheal intubation, suspected paralysis, the presence of burned body surface of > 20% and suspected fractures of two proximal long bones could be excluded for full trauma team activation. The fact that the emergency physicians did a better job in reducing under-triage compared to our final triage model suggests that other variables not present in the S3 guideline may be relevant for prediction.
Affiliation(s)
- K O Jensen
  - Department of Trauma, University of Zurich, Raemistrasse 100, 8091, Zurich, Switzerland
- R Heyard
  - Department of Biostatistics, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- D Schmitt
  - Department of Trauma, University of Zurich, Raemistrasse 100, 8091, Zurich, Switzerland
- L Mica
  - Department of Trauma, University of Zurich, Raemistrasse 100, 8091, Zurich, Switzerland
- C Ossendorf
  - Department of Trauma, University of Zurich, Raemistrasse 100, 8091, Zurich, Switzerland
- H P Simmen
  - Department of Trauma, University of Zurich, Raemistrasse 100, 8091, Zurich, Switzerland
- G A Wanner
  - Department of Trauma, University of Zurich, Raemistrasse 100, 8091, Zurich, Switzerland
- C M L Werner
  - Department of Trauma, University of Zurich, Raemistrasse 100, 8091, Zurich, Switzerland
- L Held
  - Department of Biostatistics, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- K Sprengel
  - Department of Trauma, University of Zurich, Raemistrasse 100, 8091, Zurich, Switzerland
9
Held L. An objective Bayes perspective on p-values. Biom J 2017; 59:886-888. [DOI: 10.1002/bimj.201700068] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/21/2017] [Revised: 04/21/2017] [Accepted: 04/24/2017] [Indexed: 11/08/2022]
Affiliation(s)
- Leonhard Held
  - Department of Biostatistics, EBPI; University of Zurich; Hirschengraben 84 8001 Zurich Switzerland
10
Papathomas M. On the correspondence from Bayesian log-linear modelling to logistic regression modelling with g-priors. TEST-SPAIN 2017. [DOI: 10.1007/s11749-017-0540-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
11
Muff S, Puhan MA, Held L. Bias away from the null due to miscounted outcomes? A case study on the TORCH trial. Stat Methods Med Res 2017; 27:3151-3166. [PMID: 29298639; DOI: 10.1177/0962280217694403] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/16/2022]
Abstract
Count outcomes occur in virtually all disciplines, such as medicine, epidemiology or biology, but they often contain error. Recently, it has been shown that self-reported numbers of exacerbations of Chronic Obstructive Pulmonary Disease patients can be considerably miscounted. Motivated by this result, we reanalysed data from the Towards a Revolution in Chronic Obstructive Pulmonary Disease Health trial, a large randomized controlled trial with the self-reported number of exacerbations of Chronic Obstructive Pulmonary Disease patients as outcome. To adjust for miscounting error in the response of Poisson and (zero-inflated) negative binomial models, we introduce novel, general methodology. The key idea is to formulate a zero-inflated negative binomial model to capture the error mechanism. This parametric approach automatically circumvents drawbacks of previously suggested methodology that treats miscounted outcomes in the misclassification framework. Prior information for the response error model parameters was elicited from validation data of an external study and adaptively weighted to account for potential prior-data conflict. The results of the Bayesian hierarchical modelling approach indicated that the treatment effect has been overestimated in the original study. However, closer inspection revealed that this unexpected result was an artefact of an unaccounted time dependency of the treatment effect.
Affiliation(s)
- Stefanie Muff
  - Epidemiology, Biostatistics and Prevention Institute (EBPI), University of Zurich, Zurich, Switzerland; Department of Evolutionary Biology and Environmental Studies (IEU), University of Zurich, Zurich, Switzerland
- Milo A Puhan
  - Epidemiology, Biostatistics and Prevention Institute (EBPI), University of Zurich, Zurich, Switzerland
- Leonhard Held
  - Epidemiology, Biostatistics and Prevention Institute (EBPI), University of Zurich, Zurich, Switzerland
12
Held L, Ott M. How the Maximal Evidence of P-Values Against Point Null Hypotheses Depends on Sample Size. AM STAT 2016. [DOI: 10.1080/00031305.2016.1209128] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 10/20/2022]
Affiliation(s)
- Leonhard Held
  - Department of Biostatistics, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- Manuela Ott
  - Department of Biostatistics, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
13
Held L, Gravestock I, Sabanés Bové D. Objective Bayesian model selection for Cox regression. Stat Med 2016; 35:5376-5390. [PMID: 27580645; DOI: 10.1002/sim.7089] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 07/15/2015] [Revised: 07/05/2016] [Accepted: 08/08/2016] [Indexed: 11/12/2022]
Abstract
There is now a large literature on objective Bayesian model selection in the linear model based on the g-prior. The methodology has been recently extended to generalized linear models using test-based Bayes factors. In this paper, we show that test-based Bayes factors can also be applied to the Cox proportional hazards model. If the goal is to select a single model, then both the maximum a posteriori and the median probability model can be calculated. For clinical prediction of survival, we shrink the model-specific log hazard ratio estimates with subsequent calculation of the Breslow estimate of the cumulative baseline hazard function. A Bayesian model average can also be employed. We illustrate the proposed methodology with the analysis of survival data on primary biliary cirrhosis patients and the development of a clinical prediction model for future cardiovascular events based on data from the Second Manifestations of ARTerial disease (SMART) cohort study. Cross-validation is applied to compare the predictive performance with alternative model selection approaches based on Harrell's c-Index, the calibration slope and the integrated Brier score. Finally, a novel application of Bayesian variable selection to optimal conditional prediction via landmarking is described.
Affiliation(s)
- Leonhard Held
  - Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Hirschengraben 84, 8001, Zurich, Switzerland
- Isaac Gravestock
  - Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Hirschengraben 84, 8001, Zurich, Switzerland
14
Held L, Sauter R. Adaptive prior weighting in generalized regression. Biometrics 2016; 73:242-251. [PMID: 27192504; DOI: 10.1111/biom.12541] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/01/2015] [Revised: 04/01/2016] [Accepted: 04/01/2016] [Indexed: 11/28/2022]
Abstract
The prior distribution is a key ingredient in Bayesian inference. Prior information on regression coefficients may come from different sources and may or may not be in conflict with the observed data. Various methods have been proposed to quantify a potential prior-data conflict, such as Box's p-value. However, there are no clear recommendations on how to react to possible prior-data conflict in generalized regression models. To address this deficiency, we propose to adaptively weight a prespecified multivariate normal prior distribution on the regression coefficients. To this end, we relate empirical Bayes estimates of prior weight to Box's p-value and propose alternative fully Bayesian approaches. Prior weighting can be done for the joint prior distribution of the regression coefficients or, under prior independence, separately for prespecified blocks of regression coefficients. We outline how the proposed methodology can be implemented using integrated nested Laplace approximations (INLA) and illustrate the applicability with a Bayesian logistic regression model for data from a cross-sectional study. We also provide a simulation study that shows excellent performance of our approach in the case of prior misspecification in terms of root mean squared error and coverage. Supplementary Materials give details on software implementation and code and another application to binary longitudinal data from a randomized clinical trial using a Bayesian generalized linear mixed model.
Affiliation(s)
- Leonhard Held
  - Department of Biostatistics, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Hirschengraben 84, 8001 Zurich, Switzerland
- Rafael Sauter
  - Department of Biostatistics, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Hirschengraben 84, 8001 Zurich, Switzerland