van der Ploeg T, Schalk R, Gobbens RJJ. External Validation of Models for Predicting Disability in Community-Dwelling Older People in the Netherlands: A Comparative Study.
Clin Interv Aging 2023;
18:1873-1882. [PMID:
38020449 PMCID:
PMC10654350 DOI:
10.2147/cia.s428036]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2023] [Accepted: 11/07/2023] [Indexed: 12/01/2023] Open
Abstract
Background
Advanced statistical modeling techniques may help predict health outcomes. However, it is not the case that these modeling techniques always outperform traditional techniques such as regression techniques. In this study, external validation was carried out for five modeling strategies for the prediction of the disability of community-dwelling older people in the Netherlands.
Methods
We analyzed data from five studies consisting of community-dwelling older people in the Netherlands. For the prediction of the total disability score as measured with the Groningen Activity Restriction Scale (GARS), we used fourteen predictors as measured with the Tilburg Frailty Indicator (TFI). Both the TFI and the GARS are self-report questionnaires. For the modeling, five statistical modeling techniques were evaluated: general linear model (GLM), support vector machine (SVM), neural net (NN), recursive partitioning (RP), and random forest (RF). Each model was developed on one of the five data sets and then applied to each of the four remaining data sets. We assessed the performance of the models with calibration characteristics, the correlation coefficient, and the root of the mean squared error.
Results
The models GLM, SVM, RP, and RF showed satisfactory performance characteristics when validated on the validation data sets. All models showed poor performance characteristics for the deviating data set both for development and validation due to the deviating baseline characteristics compared to those of the other data sets.
Conclusion
The performance of four models (GLM, SVM, RP, RF) on the development data sets was satisfactory. This was also the case for the validation data sets, except when these models were developed on the deviating data set. The NN models showed a much worse performance on the validation data sets than on the development data sets.
Collapse