1
Holsteen KK, Hittle M, Barad M, Nelson LM. Development and Internal Validation of a Multivariable Prediction Model for Individual Episodic Migraine Attacks Based on Daily Trigger Exposures. Headache 2020; 60:2364-2379. [PMID: 33022773 DOI: 10.1111/head.13960] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/10/2020] [Revised: 07/14/2020] [Accepted: 08/12/2020] [Indexed: 12/17/2022]
Abstract
OBJECTIVE To develop and internally validate a multivariable predictive model for days with new-onset migraine headaches based on patient self-prediction and exposure to common trigger factors. BACKGROUND Accurate real-time forecasting of one's daily risk of migraine attack could help episodic migraine patients to target preventive medications for susceptible time periods and help decrease the burden of disease. Little is known about the predictive utility of common migraine trigger factors. METHODS We recruited adults with episodic migraine through online forums to participate in a 90-day prospective daily-diary cohort study conducted through a custom research application for iPhone. Every evening, participants answered questions about migraine occurrence and potential predictors including stress, sleep, caffeine and alcohol consumption, menstruation, and self-prediction. We developed and estimated multivariable multilevel logistic regression models for the risk of a new-onset migraine day vs a healthy day and internally validated the models using repeated cross-validation. RESULTS We had 178 participants complete the study and qualify for the primary analysis which included 1870 migraine events. We found that a decrease in caffeine consumption, higher self-predicted probability of headache, a higher level of stress, and times within 2 days of the onset of menstruation were positively associated with next-day migraine risk. The multivariable model predicted migraine risk only slightly better than chance (within-person C-statistic: 0.56, 95% CI: 0.54, 0.58). CONCLUSIONS In this study, episodic migraine attacks were not predictable based on self-prediction or on self-reported exposure to common trigger factors. Improvements in accuracy and breadth of data collection are needed to build clinically useful migraine prediction models.
Affiliation(s)
- Katherine K Holsteen
  - Department of Epidemiology & Population Health, Stanford University School of Medicine, Stanford, CA, USA
- Michael Hittle
  - Department of Epidemiology & Population Health, Stanford University School of Medicine, Stanford, CA, USA
- Meredith Barad
  - Department of Anesthesia, Stanford University School of Medicine, Stanford University, Stanford, CA, USA
- Lorene M Nelson
  - Department of Epidemiology & Population Health, Stanford University School of Medicine, Stanford, CA, USA
2
Falconieri N, Van Calster B, Timmerman D, Wynants L. Developing risk models for multicenter data using standard logistic regression produced suboptimal predictions: A simulation study. Biom J 2020; 62:932-944. [PMID: 31957077 PMCID: PMC7383814 DOI: 10.1002/bimj.201900075] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/04/2019] [Revised: 09/16/2019] [Accepted: 10/15/2019] [Indexed: 11/17/2022]
Abstract
Although multicenter data are common, many prediction model studies ignore this during model development. The objective of this study is to evaluate the predictive performance of regression methods for developing clinical risk prediction models using multicenter data, and provide guidelines for practice. We compared the predictive performance of standard logistic regression, generalized estimating equations, random intercept logistic regression, and fixed effects logistic regression. First, we presented a case study on the diagnosis of ovarian cancer. Subsequently, a simulation study investigated the performance of the different models as a function of the amount of clustering, development sample size, distribution of center‐specific intercepts, the presence of a center‐predictor interaction, and the presence of a dependency between center effects and predictors. The results showed that when sample sizes were sufficiently large, conditional models yielded calibrated predictions, whereas marginal models yielded miscalibrated predictions. Small sample sizes led to overfitting and unreliable predictions. This miscalibration was worse with more heavily clustered data. Calibration of random intercept logistic regression was better than that of standard logistic regression even when center‐specific intercepts were not normally distributed, a center‐predictor interaction was present, center effects and predictors were dependent, or when the model was applied in a new center. Therefore, to make reliable predictions in a specific center, we recommend random intercept logistic regression.
Affiliation(s)
- Nora Falconieri
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium
- Ben Van Calster
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium
  - Department of Biomedical Data Sciences, Leiden University Medical Center (LUMC), Leiden, The Netherlands
- Dirk Timmerman
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium
  - Department of Obstetrics and Gynecology, University Hospitals Leuven, Leuven, Belgium
- Laure Wynants
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium
  - Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
3
Weldemariam KT, Gezae KE, Abebe HT. Reasons and multilevel factors associated with unscheduled contraceptive use discontinuation in Ethiopia: evidence from Ethiopian demographic and health survey 2016. BMC Public Health 2019; 19:1745. [PMID: 31881865 PMCID: PMC6935182 DOI: 10.1186/s12889-019-8088-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/04/2019] [Accepted: 12/15/2019] [Indexed: 11/29/2022] Open
Abstract
Background Contraceptive discontinuations for reasons other than the desire for pregnancy are a public health concern because of their negative effect on reproductive health outcomes. In Ethiopia, the contraceptive discontinuation rate is increasing; however, the associated factors are poorly understood. This study therefore aimed to assess the reasons for, and the multilevel factors associated with, unscheduled contraceptive use discontinuation. Methods This is a cross-sectional study of Ethiopian women who participated in the Ethiopian Demographic and Health Survey from January 18, 2016, to June 27, 2016. Ever having used any contraceptive within the calendar period of the survey was the inclusion criterion, under which 3835 women were found eligible. The data were analyzed using multilevel binary logistic regression in Stata version 14. Variables with a p-value less than 0.05 were considered statistically significant and were reported using adjusted odds ratios (AOR) and 95% confidence intervals. The median odds ratio and the interval odds ratio were used to quantify the magnitude of the general and specific contextual effects, respectively. The receiver operating characteristic curve and Akaike's information criterion were used for model comparison. Result The prevalence of unscheduled contraceptive use discontinuation was 46.18%, with method-related problems as the principal reason (side effects 45.3%, needing a better method 33.6%, and inconvenience 21.1%). Women heading a household (AOR = 1.281, 95% CI 1.079–1.520), women who had no work (AOR = 0.812, 95% CI 0.673–0.979) compared to professionals, living in the poorest household income category (AOR = 0.753, 95% CI 0.567–0.997) compared to middle, residing in a community with a low contraceptive utilization rate (AOR = 1.945, 95% CI 1.618–2.339), residing in a poor community (AOR = 0.763, 95% CI 0.596–0.997), having more children, and region were found to be significant predictors of unscheduled contraceptive use discontinuation.
Conclusion Method-related problems were found to contribute to more than half of contraceptive use discontinuation. Both individual- and community-level factors were found to significantly influence unscheduled contraceptive use discontinuation. The outcome was common in groups who could have more social interactions and knowledge on which myths and rumors are common. Strengthening efforts to reduce contraceptive use discontinuation and to improve the quality of contraceptive service provision could therefore be important.
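The median odds ratio named in the Methods of this abstract is commonly computed from the cluster-level variance of a random-intercept logistic model as MOR = exp(sqrt(2 × variance) × z), where z is the 75th percentile of the standard normal distribution. The sketch below only illustrates that standard formula; the function name and the variance values are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist


def median_odds_ratio(cluster_variance):
    """Median odds ratio for a random-intercept logistic model.

    cluster_variance is the between-cluster variance on the log-odds scale.
    The input values used below are illustrative, not results from the study.
    """
    if cluster_variance < 0:
        raise ValueError("variance must be non-negative")
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1), about 0.6745
    return math.exp(math.sqrt(2.0 * cluster_variance) * z75)


# With no between-cluster variance the MOR is exactly 1 (no general contextual effect).
print(median_odds_ratio(0.0))
# A hypothetical variance of 1.0 on the log-odds scale.
print(round(median_odds_ratio(1.0), 3))
```

A MOR of 1 means that moving between communities does not change the odds of the outcome; larger values indicate stronger between-community heterogeneity.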
Affiliation(s)
- Kibrom Taame Weldemariam
  - Department of Biostatistics, School of Public Health, College of Health Sciences, Aksum University, P.O.Box: 298, Axum, Ethiopia
- Kebede Embaye Gezae
  - Department of Biostatistics, School of Public Health, College of Health Sciences, Mekelle University, Mekelle, Ethiopia
- Haftom Temesgen Abebe
  - Department of Biostatistics, School of Public Health, College of Health Sciences, Mekelle University, Mekelle, Ethiopia
4
van den Broek HT, Wenker S, van de Leur R, Doevendans PA, Chamuleau SAJ, van Slochteren FJ, van Es R. 3D Myocardial Scar Prediction Model Derived from Multimodality Analysis of Electromechanical Mapping and Magnetic Resonance Imaging. J Cardiovasc Transl Res 2019; 12:517-527. [PMID: 31338795 PMCID: PMC6854049 DOI: 10.1007/s12265-019-09899-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/18/2019] [Accepted: 07/01/2019] [Indexed: 01/27/2023]
Abstract
Many cardiac catheter interventions require accurate discrimination between healthy and infarcted myocardia. The gold standard for infarct imaging is late gadolinium-enhanced MRI (LGE-MRI), but during cardiac procedures electroanatomical or electromechanical mapping (EAM or EMM, respectively) is usually employed. We aimed to improve the ability of EMM to identify myocardial infarction by combining multiple EMM parameters in a statistical model. From a porcine infarction model, 3D electromechanical maps were 3D registered to LGE-MRI. A multivariable mixed-effects logistic regression model was fitted to predict the presence of infarct based on EMM parameters. Furthermore, we correlated feature-tracking strain parameters to EMM measures of local mechanical deformation. We registered 787 EMM points from 13 animals to the corresponding MRI locations. The mean registration error was 2.5 ± 1.16 mm. Our model showed a strong ability to predict the presence of infarction (C-statistic = 0.85). Strain parameters were only weakly correlated to EMM measures. The model is accurate in discriminating infarcted from healthy myocardium. Unipolar and bipolar voltages were the strongest predictors.
Affiliation(s)
- Steven Wenker
  - Department of Cardiology, University Medical Center Utrecht, Utrecht, The Netherlands
- Rutger van de Leur
  - Department of Cardiology, University Medical Center Utrecht, Utrecht, The Netherlands
- Pieter A Doevendans
  - Department of Cardiology, University Medical Center Utrecht, Utrecht, The Netherlands
  - Netherlands Heart Institute, Utrecht, The Netherlands
  - CMH, Utrecht, Netherlands
- Steven A J Chamuleau
  - Department of Cardiology, University Medical Center Utrecht, Utrecht, The Netherlands
- René van Es
  - Department of Cardiology, University Medical Center Utrecht, Utrecht, The Netherlands
5
Wynants L, Kent DM, Timmerman D, Lundquist CM, Van Calster B. Untapped potential of multicenter studies: a review of cardiovascular risk prediction models revealed inappropriate analyses and wide variation in reporting. Diagn Progn Res 2019; 3:6. [PMID: 31093576 PMCID: PMC6460661 DOI: 10.1186/s41512-019-0046-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 09/13/2018] [Accepted: 01/03/2019] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Clinical prediction models are often constructed using multicenter databases. Such a data structure poses additional challenges for statistical analysis (clustered data) but offers opportunities for model generalizability to a broad range of centers. The purpose of this study was to describe properties, analysis, and reporting of multicenter studies in the Tufts PACE Clinical Prediction Model Registry and to illustrate consequences of common design and analysis choices. METHODS Fifty randomly selected studies that are included in the Tufts registry as multicenter and published after 2000 underwent full-text screening. Simulated examples illustrate some key concepts relevant to multicenter prediction research. RESULTS Multicenter studies differed widely in the number of participating centers (range 2 to 5473). Thirty-nine of 50 studies ignored the multicenter nature of data in the statistical analysis. In the others, clustering was resolved by developing the model on only one center, using mixed effects or stratified regression, or by using center-level characteristics as predictors. Twenty-three of 50 studies did not describe the clinical settings or type of centers from which data were obtained. Four of 50 studies discussed neither generalizability nor external validity of the developed model. CONCLUSIONS Regression methods and validation strategies tailored to multicenter studies are underutilized. Reporting on generalizability and potential external validity of the model lacks transparency. Hence, multicenter prediction research has untapped potential. REGISTRATION This review was not registered.
Affiliation(s)
- L. Wynants
  - Department of Development and Regeneration, KU Leuven, Herestraat 49, box 7003, 3000 Leuven, Belgium
  - Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, PO Box 9600, 6200 MD Maastricht, The Netherlands
- D. M. Kent
  - Predictive Analytics and Comparative Effectiveness (PACE) Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, 800 Washington St, Box 63, Boston, MA 02111 USA
- D. Timmerman
  - Department of Development and Regeneration, KU Leuven, Herestraat 49, box 7003, 3000 Leuven, Belgium
  - Department of Obstetrics and Gynecology, University Hospitals Leuven, Herestraat 49, 3000 Leuven, Belgium
- C. M. Lundquist
  - Predictive Analytics and Comparative Effectiveness (PACE) Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, 800 Washington St, Box 63, Boston, MA 02111 USA
- B. Van Calster
  - Department of Development and Regeneration, KU Leuven, Herestraat 49, box 7003, 3000 Leuven, Belgium
  - Department of Biomedical Data Sciences, Leiden University Medical Center, PO Box 9600, Leiden, 2300RC The Netherlands
6
Xiong C, Luo J, Chen L, Gao F, Liu J, Wang G, Bateman R, Morris JC. Estimating diagnostic accuracy for clustered ordinal diagnostic groups in the three-class case-Application to the early diagnosis of Alzheimer disease. Stat Methods Med Res 2017; 27:701-714. [PMID: 29182052 DOI: 10.1177/0962280217742539] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
Many medical diagnostic studies involve three ordinal diagnostic populations in which the diagnostic accuracy can be summarized by the volume or partial volume under the receiver operating characteristic surface for a diagnostic marker. When the diagnostic populations are clustered, e.g. by families, we propose to model the diagnostic marker by a general linear mixed model that takes into account the correlation of the diagnostic marker among members of the same cluster. This model then facilitates the maximum likelihood estimation and statistical inferences of the diagnostic accuracy for the diagnostic marker. This approach naturally allows the incorporation of covariates as well as missing data, when some clusters do not have subjects in all diagnostic groups, in the estimation of, and the subsequent inferences on, the diagnostic accuracy. We further study the performance of the proposed methods in a large simulation study with clustered data. Finally, we apply the proposed methodology to the data of several biomarkers collected by the Dominantly Inherited Alzheimer Network, an international family-clustered registry to study autosomal dominant Alzheimer disease, which is a rare form of Alzheimer disease caused by mutations in any of three genes: the amyloid precursor protein, presenilin 1, and presenilin 2. We estimate the accuracy of several cerebrospinal fluid and neuroimaging biomarkers in differentiating three diagnostic and genetic populations: normal non-mutation carriers, asymptomatic mutation carriers, and symptomatic mutation carriers.
Affiliation(s)
- Chengjie Xiong
  - Division of Biostatistics, Washington University in St. Louis, St. Louis, MO, USA
  - Department of Neurology, Washington University in St. Louis, St. Louis, MO, USA
- Jingqin Luo
  - Division of Public health, Department of Surgery, Washington University in St. Louis, St. Louis, MO, USA
  - Biostatistics Core, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
- Ling Chen
  - Division of Biostatistics, Washington University in St. Louis, St. Louis, MO, USA
- Feng Gao
  - Division of Public health, Department of Surgery, Washington University in St. Louis, St. Louis, MO, USA
  - Biostatistics Core, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
- Jingxia Liu
  - Division of Public health, Department of Surgery, Washington University in St. Louis, St. Louis, MO, USA
  - Biostatistics Core, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
- Guoqiao Wang
  - Division of Biostatistics, Washington University in St. Louis, St. Louis, MO, USA
  - Department of Neurology, Washington University in St. Louis, St. Louis, MO, USA
- Randall Bateman
  - Department of Neurology, Washington University in St. Louis, St. Louis, MO, USA
- John C Morris
  - Department of Neurology, Washington University in St. Louis, St. Louis, MO, USA
  - Department of Pathology and Immunology, Washington University in St. Louis, St. Louis, MO, USA
7
Van Oirbeek R, Lesaffre E. Exploring the Clustering Effect of the Frailty Survival Model by Means of the Brier Score. COMMUN STAT-SIMUL C 2016. [DOI: 10.1080/03610918.2014.936464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Affiliation(s)
- Robin Van Oirbeek
  - Interuniversity Institute for Biostatistics and Statistical Bioinformatics, KU Leuven, Leuven, Belgium
- Emmanuel Lesaffre
  - Interuniversity Institute for Biostatistics and Statistical Bioinformatics, KU Leuven, Leuven, Belgium
  - Department of Biostatistics, Erasmus Medical Center Rotterdam, Rotterdam, The Netherlands
8
Wynants L, Vergouwe Y, Van Huffel S, Timmerman D, Van Calster B. Does ignoring clustering in multicenter data influence the performance of prediction models? A simulation study. Stat Methods Med Res 2016; 27:1723-1736. [PMID: 27647815 DOI: 10.1177/0962280216668555] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Indexed: 01/20/2023]
Abstract
Clinical risk prediction models are increasingly being developed and validated on multicenter datasets. In this article, we present a comprehensive framework for the evaluation of the predictive performance of prediction models at the center level and the population level, considering population-averaged predictions, center-specific predictions, and predictions assuming an average random center effect. We demonstrated in a simulation study that calibration slopes do not only deviate from one because of over- or underfitting of patterns in the development dataset, but also as a result of the choice of the model (standard versus mixed effects logistic regression), the type of predictions (marginal versus conditional versus assuming an average random effect), and the level of model validation (center versus population). In particular, when data is heavily clustered (ICC 20%), center-specific predictions offer the best predictive performance at the population level and the center level. We recommend that models should reflect the data structure, while the level of model validation should reflect the research question.
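The three kinds of prediction distinguished in this abstract can be illustrated for a random-intercept logistic model. The sketch below is an assumption-laden illustration, not the authors' code: it assumes a normally distributed center intercept with standard deviation `sigma`, and all function names and input values are hypothetical. The population-averaged (marginal) risk is obtained by numerically averaging the center-specific risk over the intercept distribution.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def conditional_risk(linear_predictor, center_effect):
    """Center-specific prediction: uses the center's own random intercept."""
    return expit(linear_predictor + center_effect)


def average_effect_risk(linear_predictor):
    """Prediction assuming an average random center effect (intercept of 0)."""
    return expit(linear_predictor)


def marginal_risk(linear_predictor, sigma, n_steps=2000):
    """Population-averaged prediction: integrates over u ~ N(0, sigma^2).

    Uses the trapezoid rule on [-8*sigma, 8*sigma]; normality of the center
    effects is an assumption of this sketch.
    """
    if sigma == 0:
        return expit(linear_predictor)
    lo, hi = -8.0 * sigma, 8.0 * sigma
    h = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        u = lo + i * h
        density = math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n_steps) else 1.0  # trapezoid end-point weights
        total += weight * density * expit(linear_predictor + u)
    return total * h


# Hypothetical patient with linear predictor 2.0 and center SD 1.5.
print(average_effect_risk(2.0), marginal_risk(2.0, 1.5), conditional_risk(2.0, -1.0))
```

In this sketch the marginal risk is pulled toward 0.5 relative to the average-effect risk, which is one way to see why the type of prediction and the level of validation interact.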
Affiliation(s)
- L Wynants
  - KU Leuven Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
  - KU Leuven iMinds Department Medical Information Technologies, Leuven, Belgium
- Y Vergouwe
  - Center for Medical Decision Sciences, Department of Public Health, Erasmus Medical Center, Rotterdam, The Netherlands
- S Van Huffel
  - KU Leuven Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
  - KU Leuven iMinds Department Medical Information Technologies, Leuven, Belgium
- D Timmerman
  - KU Leuven Department of Development and Regeneration, Leuven, Belgium
- B Van Calster
  - Center for Medical Decision Sciences, Department of Public Health, Erasmus Medical Center, Rotterdam, The Netherlands
  - KU Leuven Department of Development and Regeneration, Leuven, Belgium
9
Merlo J, Wagner P, Ghith N, Leckie G. An Original Stepwise Multilevel Logistic Regression Analysis of Discriminatory Accuracy: The Case of Neighbourhoods and Health. PLoS One 2016; 11:e0153778. [PMID: 27120054 PMCID: PMC4847925 DOI: 10.1371/journal.pone.0153778] [Citation(s) in RCA: 139] [Impact Index Per Article: 17.4] [Received: 08/17/2015] [Accepted: 04/04/2016] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND AND AIM Many multilevel logistic regression analyses of "neighbourhood and health" focus on interpreting measures of associations (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that distinguishes between "specific" (measures of association) and "general" (measures of variance) contextual effects. Performing two empirical examples we illustrate the methodology, interpret the results and discuss the implications of this kind of analysis in public health. METHODS We analyse 43,291 individuals residing in 218 neighbourhoods in the city of Malmö, Sweden in 2006. We study two individual outcomes (psychotropic drug use and choice of private vs. public general practitioner, GP) for which the relative importance of neighbourhood as a source of individual variation differs substantially. In Step 1 of the analysis, we evaluate the OR and the area under the receiver operating characteristic (AUC) curve for individual-level covariates (i.e., age, sex and individual low income). In Step 2, we assess general contextual effects using the AUC. Finally, in Step 3 the OR for a specific neighbourhood characteristic (i.e., neighbourhood income) is interpreted jointly with the proportional change in variance (i.e., PCV) and the proportion of ORs in the opposite direction (POOR) statistics. RESULTS For both outcomes, information on individual characteristics (Step 1) provides a low discriminatory accuracy (AUC = 0.616 for psychotropic drugs; AUC = 0.600 for choosing a private GP). Accounting for neighbourhood of residence (Step 2) only improved the AUC for choosing a private GP (+0.295 units). High neighbourhood income (Step 3) was strongly associated with choosing a private GP (OR = 3.50) but the PCV was only 11% and the POOR 33%.
CONCLUSION Applying an innovative stepwise multilevel analysis, we observed that, in Malmö, the neighbourhood context per se had a negligible influence on individual use of psychotropic drugs, but appears to strongly condition individual choice of a private GP. However, the latter was only modestly explained by the socioeconomic circumstances of the neighbourhoods. Our analyses are based on real data and provide useful information for understanding neighbourhood level influences in general and on individual use of psychotropic drugs and choice of GP in particular. However, our primary aim is to illustrate how to perform and interpret a multilevel analysis of individual heterogeneity in social epidemiology and public health. Our study shows that neighbourhood "effects" are not properly quantified by reporting differences between neighbourhood averages but rather by measuring the share of the individual heterogeneity that exists at the neighbourhood level.
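The proportional change in variance (PCV) mentioned in Step 3 is usually defined as the relative reduction in the neighbourhood-level variance after a contextual variable is added to the model. A minimal sketch of that definition follows; the function name and the two variance values are hypothetical and were chosen only so that the result matches the order of magnitude reported in the abstract.

```python
def proportional_change_in_variance(var_reference, var_adjusted):
    """PCV = (V_reference - V_adjusted) / V_reference.

    var_reference: cluster-level variance before adding the contextual variable.
    var_adjusted:  cluster-level variance after adding it.
    """
    if var_reference <= 0:
        raise ValueError("reference variance must be positive")
    return (var_reference - var_adjusted) / var_reference


# Hypothetical variances: a drop from 1.00 to 0.89 gives a PCV of about 11%.
print(round(proportional_change_in_variance(1.00, 0.89), 2))
```

A small PCV alongside a large OR, as in the abstract, suggests that the contextual variable is strongly associated with the outcome yet explains little of the between-neighbourhood variance.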
Affiliation(s)
- Juan Merlo
  - Unit for Social Epidemiology, Faculty of Medicine, Lund University, Malmö, Sweden
- Philippe Wagner
  - Unit for Social Epidemiology, Faculty of Medicine, Lund University, Malmö, Sweden
  - Centre for Clinical Research Västmanland, Uppsala University, Uppsala, Sweden
- Nermin Ghith
  - Unit for Social Epidemiology, Faculty of Medicine, Lund University, Malmö, Sweden
  - Research Unit of Chronic Conditions, Bispebjerg University Hospital, Copenhagen, Denmark
- George Leckie
  - Centre for Multilevel Modelling, University of Bristol, Bristol, United Kingdom
10
Wynants L, Bouwmeester W, Moons KGM, Moerbeek M, Timmerman D, Van Huffel S, Van Calster B, Vergouwe Y. A simulation study of sample size demonstrated the importance of the number of events per variable to develop prediction models in clustered data. J Clin Epidemiol 2015; 68:1406-14. [PMID: 25817942 DOI: 10.1016/j.jclinepi.2015.02.002] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Received: 11/08/2014] [Revised: 01/27/2015] [Accepted: 02/09/2015] [Indexed: 12/23/2022]
Abstract
OBJECTIVES This study aims to investigate the influence of the amount of clustering [intraclass correlation (ICC) = 0%, 5%, or 20%], the number of events per variable (EPV) or candidate predictor (EPV = 5, 10, 20, or 50), and backward variable selection on the performance of prediction models. STUDY DESIGN AND SETTING Researchers frequently combine data from several centers to develop clinical prediction models. In our simulation study, we developed models from clustered training data using multilevel logistic regression and validated them in external data. RESULTS The amount of clustering was not meaningfully associated with the models' predictive performance. The median calibration slope of models built in samples with EPV = 5 and strong clustering (ICC = 20%) was 0.71. With EPV = 5 and ICC = 0%, it was 0.72. A higher EPV related to an increased performance: the calibration slope was 0.85 at EPV = 10 and ICC = 20% and 0.96 at EPV = 50 and ICC = 20%. Variable selection sometimes led to a substantial relative bias in the estimated predictor effects (up to 118% at EPV = 5), but this had little influence on the model's performance in our simulations. CONCLUSION We recommend at least 10 EPV to fit prediction models in clustered data using logistic regression. Up to 50 EPV may be needed when variable selection is performed.
Affiliation(s)
- L Wynants
  - KU Leuven Department of Electrical Engineering-ESAT, STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Kasteelpark Arenberg 10, Box 2446, Leuven 3001, Belgium
  - KU Leuven iMinds Medical IT Department, Kasteelpark Arenberg 10, Box 2446, Leuven 3001, Belgium
- W Bouwmeester
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands
  - Pharmerit B.V., Marten Meesweg 107, Rotterdam 3068 AV, The Netherlands
- K G M Moons
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands
- M Moerbeek
  - Department of Methodology and Statistics, Utrecht University, Padualaan 14, 3584 CH Utrecht, The Netherlands
- D Timmerman
  - KU Leuven Department of Development and Regeneration, Herestraat 49 Box 7003, Leuven 3000, Belgium
  - Department of Obstetrics and Gynaecology, University Hospitals Leuven, Herestraat 49, 3000 Leuven, Belgium
- S Van Huffel
  - KU Leuven Department of Electrical Engineering-ESAT, STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Kasteelpark Arenberg 10, Box 2446, Leuven 3001, Belgium
  - KU Leuven iMinds Medical IT Department, Kasteelpark Arenberg 10, Box 2446, Leuven 3001, Belgium
- B Van Calster
  - KU Leuven Department of Development and Regeneration, Herestraat 49 Box 7003, Leuven 3000, Belgium
  - Center for Medical Decision Sciences, Department of Public Health, Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, The Netherlands
- Y Vergouwe
  - Center for Medical Decision Sciences, Department of Public Health, Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, The Netherlands
11
Assessing discriminative ability of risk models in clustered data. BMC Med Res Methodol 2014; 14:5. [PMID: 24423445 PMCID: PMC3897966 DOI: 10.1186/1471-2288-14-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 10/07/2013] [Accepted: 01/08/2014] [Indexed: 11/24/2022] Open
Abstract
Background The discriminative ability of a risk model is often measured by Harrell’s concordance-index (c-index). The c-index estimates for two randomly chosen subjects the probability that the model predicts a higher risk for the subject with poorer outcome (concordance probability). When data are clustered, as in multicenter data, two types of concordance are distinguished: concordance in subjects from the same cluster (within-cluster concordance probability) and concordance in subjects from different clusters (between-cluster concordance probability). We argue that the within-cluster concordance probability is most relevant when a risk model supports decisions within clusters (e.g. who should be treated in a particular center). We aimed to explore different approaches to estimate the within-cluster concordance probability in clustered data. Methods We used data of the CRASH trial (2,081 patients clustered in 35 centers) to develop a risk model for mortality after traumatic brain injury. To assess the discriminative ability of the risk model within centers we first calculated cluster-specific c-indexes. We then pooled the cluster-specific c-indexes into a summary estimate with different meta-analytical techniques. We considered fixed effect meta-analysis with different weights (equal; inverse variance; number of subjects, events or pairs) and random effects meta-analysis. We reflected on pooling the estimates on the log-odds scale rather than the probability scale. Results The cluster-specific c-index varied substantially across centers (IQR = 0.70-0.81; I² = 0.76 with 95% confidence interval 0.66 to 0.82). Summary estimates resulting from fixed effect meta-analysis ranged from 0.75 (equal weights) to 0.84 (inverse variance weights). With random effects meta-analysis – accounting for the observed heterogeneity in c-indexes across clusters – we estimated a mean of 0.77, a between-cluster variance of 0.0072 and a 95% prediction interval of 0.60 to 0.95.
The normality assumptions for derivation of a prediction interval were better met on the probability than on the log-odds scale. Conclusion When assessing the discriminative ability of risk models used to support decisions at cluster level we recommend meta-analysis of cluster-specific c-indexes. Particularly, random effects meta-analysis should be considered.
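The two-step idea described in this abstract (cluster-specific c-indexes, then pooling) can be sketched for a binary outcome. This is a simplified illustration rather than the authors' implementation: it assumes a binary outcome, covers only two of the fixed-effect weightings named in the Methods (equal weights and number of pairs), and leaves out inverse-variance weighting and random effects meta-analysis. All names and data are hypothetical.

```python
from collections import defaultdict


def c_index(risks, outcomes):
    """Concordance for a binary outcome within one cluster.

    Returns (c, n_pairs): the share of (event, non-event) pairs in which the
    event has the higher predicted risk, with ties counted as 1/2.
    Returns (None, 0) when the cluster has no usable pairs.
    """
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    nonevents = [r for r, y in zip(risks, outcomes) if y == 0]
    n_pairs = len(events) * len(nonevents)
    if n_pairs == 0:
        return None, 0
    score = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / n_pairs, n_pairs


def within_cluster_c_index(risks, outcomes, clusters, weight="pairs"):
    """Weighted average of cluster-specific c-indexes ("equal" or "pairs" weights)."""
    grouped = defaultdict(lambda: ([], []))
    for r, y, c in zip(risks, outcomes, clusters):
        grouped[c][0].append(r)
        grouped[c][1].append(y)
    numerator = denominator = 0.0
    for cluster_risks, cluster_outcomes in grouped.values():
        c, n_pairs = c_index(cluster_risks, cluster_outcomes)
        if c is None:
            continue  # clusters with only events or only non-events are skipped
        w = n_pairs if weight == "pairs" else 1.0
        numerator += w * c
        denominator += w
    return numerator / denominator if denominator else None


# Hypothetical data: center A discriminates perfectly, center B is at chance.
risks = [0.9, 0.2, 0.8, 0.3, 0.1, 0.6, 0.7]
outcomes = [1, 0, 1, 0, 1, 0, 1]
clusters = ["A", "A", "A", "A", "B", "B", "B"]
print(within_cluster_c_index(risks, outcomes, clusters, weight="equal"))
print(within_cluster_c_index(risks, outcomes, clusters, weight="pairs"))
```

Only pairs of subjects from the same center are compared, so the summary reflects within-cluster concordance; the choice of weights changes the pooled value, as the abstract's results indicate.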