1
Shamsutdinova D, Stamate D, Stahl D. Balancing accuracy and Interpretability: An R package assessing complex relationships beyond the Cox model and applications to clinical prediction. Int J Med Inform 2025; 194:105700. [PMID: 39546831 DOI: 10.1016/j.ijmedinf.2024.105700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2024] [Accepted: 11/08/2024] [Indexed: 11/17/2024]
Abstract
BACKGROUND Accurate and interpretable models are essential for clinical decision-making, where predictions can directly impact patient care. Machine learning (ML) survival methods can handle complex multidimensional data and achieve high accuracy but require post-hoc explanations. Traditional models such as the Cox Proportional Hazards Model (Cox-PH) are less flexible, but fast, stable, and intrinsically transparent. Moreover, ML does not always outperform Cox-PH in clinical settings, which warrants diligent model validation. We aimed to develop a set of R functions to help explore the limits of Cox-PH compared with tree-based and deep learning survival models for clinical prediction modelling, employing ensemble learning and nested cross-validation. METHODS We developed a set of R functions, publicly available as the package "survcompare". It supports Cox-PH and Cox-Lasso as the traditional models, Survival Random Forest (SRF) and DeepHit as the ML alternatives, and ensemble methods that integrate Cox-PH with SRF or DeepHit and are designed to isolate the marginal value of ML. The package performs repeated nested cross-validation and tests the statistical significance of ML's superiority using survival-specific performance metrics: the concordance index, time-dependent AUC-ROC and calibration slope. To gain practical insights, we applied this methodology to clinical and simulated datasets of varying complexity and size. RESULTS In simulated data with non-linearities or interactions, ML models outperformed Cox-PH at sample sizes ≥ 500. ML superiority was also observed in imaging and high-dimensional clinical data. However, for tabular clinical data, the performance gains of ML were minimal; in some cases, regularised Cox-Lasso recovered much of ML's performance advantage with significantly faster computation. Ensemble methods combining Cox-PH and ML predictions were instrumental in quantifying Cox-PH's limits and improving ML calibration. Traditional models like Cox-PH or Cox-Lasso should not be overlooked when developing clinical predictive models from tabular data or data of limited size. CONCLUSION Our package offers researchers a framework and practical tool for evaluating the accuracy-interpretability trade-off, helping them make informed decisions about model selection.
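As a rough illustration of the concordance index named among the performance metrics in this abstract, the following Python sketch computes Harrell's C for right-censored data. It is not taken from the survcompare package; the function name `concordance_index`, the handling of tied times and the toy data are assumptions made purely for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the subject with the
    shorter observed time also has the higher predicted risk.

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: pairs with tied times are skipped
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four subjects, two of them censored.
toy_times = [5, 8, 12, 20]
toy_events = [1, 0, 1, 0]
toy_risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))  # 1.0
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of the comparable pairs; established implementations treat ties and censoring more carefully than this sketch.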
Affiliation(s)
- Diana Shamsutdinova
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
- Daniel Stamate
  - Data Science and Soft Computing Lab, Computing Department, Goldsmiths University of London, United Kingdom; School of Health Sciences, University of Manchester, Manchester, United Kingdom
- Daniel Stahl
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
2
Stahl D. New horizons in prediction modelling using machine learning in older people's healthcare research. Age Ageing 2024; 53:afae201. [PMID: 39311424 PMCID: PMC11417961 DOI: 10.1093/ageing/afae201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 06/26/2024] [Indexed: 09/26/2024] Open
Abstract
Machine learning (ML) and prediction modelling have become increasingly influential in healthcare, providing critical insights and supporting clinical decisions, particularly in the age of big data. This paper serves as an introductory guide for health researchers and readers interested in prediction modelling. It explores how these technologies support clinical decisions and covers all aspects of the development, assessment and reporting of a model using ML. The paper starts with the importance of prediction modelling for precision medicine. It outlines different types of prediction and machine learning approaches, including supervised, unsupervised and semi-supervised learning, and provides an overview of popular algorithms for various outcomes and settings. It also introduces key theoretical ML concepts. The importance of data quality, preprocessing and unbiased model performance evaluation is highlighted. Concepts of apparent, internal and external validation are introduced, along with metrics for discrimination and calibration for different types of outcomes. Additionally, the paper addresses model interpretation, fairness and implementation in clinical practice. Finally, the paper provides recommendations for reporting and identifies common pitfalls in prediction modelling and machine learning. The aim of the paper is to help readers understand and critically evaluate research papers that present ML models and to serve as a first guide for developing, assessing and implementing their own.
Affiliation(s)
- Daniel Stahl
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, UK
3
Duan J, Wang M, Sam NB, Tian Q, Zheng T, Chen Y, Deng X, Liu Y. The development and validation of a nomogram-based risk prediction model for mortality among older adults. SSM Popul Health 2024; 25:101605. [PMID: 38292049 PMCID: PMC10825771 DOI: 10.1016/j.ssmph.2024.101605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Revised: 10/15/2023] [Accepted: 01/05/2024] [Indexed: 02/01/2024] Open
Abstract
Objective This research aims to construct and validate a comprehensive predictive model for all-cause mortality, based on a multifaceted array of risk factors. Methods The derivation cohort for this study was the Chinese Longitudinal Healthy Longevity Survey (CLHLS), while the Healthy Ageing and Biomarkers Cohort Study (HABCS) and the China Health and Retirement Longitudinal Study (CHARLS) were used as validation cohorts. Risk factors were filtered using lasso regression, and predictive factors were determined using net reclassification improvement. Cox proportional hazards models were employed to establish the mortality risk prediction equations, and the model's discrimination was evaluated using the concordance index (C-index). To evaluate the internal consistency of discrimination and calibration, a 10x10 cross-validation technique was employed. Calibration plots were generated to compare predicted probabilities with observed probabilities. The prediction ability of the equations was demonstrated using a nomogram. Results The CLHLS (mean age 88.08, n = 37074) recorded 28158 deaths (179683 person-years) over a follow-up period of 8-20 years. Additionally, there were 1384 deaths in the HABCS (mean age 86.74, n = 2552), and 1221 deaths in the CHARLS (mean age 72.48, n = 4794). The final all-cause mortality model incorporated demographic characteristics like age, sex, and current marital status, as well as functional status indicators including cognitive function and activities of daily living. Additionally, lifestyle factors like past smoking status and leisure activities including housework, television viewing or radio listening, and gardening work were included. The C-index for the derivation cohort was 0.728 (95% CI: 0.724-0.732), while the external validation results for the CHARLS and HABCS cohorts were 0.761 (95% CI: 0.749-0.773) and 0.713 (95% CI: 0.697-0.729), respectively. Conclusion This study introduces a reliable, validated, and acceptable mortality risk predictor for older adults in China. These predictive factors have potential applications in public health policy and clinical practice.
Affiliation(s)
- Jun Duan
  - Department of Medical Record Statistics, Peking University Shenzhen Hospital, Shenzhen, China
- MingXia Wang
  - Department of Stomatology, Luohu Hospital of Traditional Chinese Medicine, Shenzhen, China
- Napoleon Bellua Sam
  - Department of Epidemiology and Biostatistics, University for Development Studies, Tamale, Ghana
- Qin Tian
  - Scientific Research Center, The Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen, Guangdong, 518107, China
- TingTing Zheng
  - Department of Ultrasound, Peking University Shenzhen Hospital, Shenzhen Key Laboratory for Drug Addiction and Medication Safety, Institute of Ultrasound Medicine, Shenzhen-PKU-HKUST Medical Center, Shenzhen, China
- Yun Chen
  - Department of Ultrasound, Peking University Shenzhen Hospital, Shenzhen Key Laboratory for Drug Addiction and Medication Safety, Institute of Ultrasound Medicine, Shenzhen-PKU-HKUST Medical Center, Shenzhen, China
- XiaoMei Deng
  - Department of Comprehensive Ward, Peking University Shenzhen Hospital, Shenzhen, China
- Yan Liu
  - Department of Medical Record Statistics, Peking University Shenzhen Hospital, Shenzhen, China
4
Åkerla J, Nevalainen J, Pesonen JS, Pöyhönen A, Koskimäki J, Häkkinen J, Tammela TLJ, Auvinen A. Do LUTS Predict Mortality? An Analysis Using Random Forest Algorithms. Clin Interv Aging 2024; 19:237-245. [PMID: 38371602 PMCID: PMC10873145 DOI: 10.2147/cia.s432368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 01/17/2024] [Indexed: 02/20/2024] Open
Abstract
Purpose To evaluate a random forest (RF) algorithm of lower urinary tract symptoms (LUTS) as a predictor of all-cause mortality in a population-based cohort. Materials and Methods A population-based cohort of 3143 men born in 1924, 1934, and 1944 was evaluated using a mailed questionnaire including the Danish Prostatic Symptom Score (DAN-PSS-1) to assess LUTS as well as questions on medical conditions and behavioral and sociodemographic factors. Surveys were repeated in 1994, 1999, 2004, 2009 and 2015. The cohort was followed up for vital status until the end of 2018. RF uses an ensemble of classification trees for prediction with good flexibility and without overfitting. RF algorithms were developed to predict five-year mortality using LUTS, demographic, medical, and behavioral factors alone and in combinations. Results A total of 2663 men were included in the study, of whom 917 (34%) died during follow-up (median follow-up time 15.0 years). The LUTS-based RF algorithm showed an area under the curve (AUC) of 0.60 (95% CI 0.52-0.69) for five-year mortality. An expanded RF algorithm, including LUTS, medical history, and behavioral and sociodemographic factors, yielded an AUC of 0.73 (0.65-0.81), while an algorithm excluding LUTS yielded an AUC of 0.71 (0.62-0.78). Conclusion An exploratory RF algorithm using LUTS can predict all-cause mortality with acceptable discrimination at the group level. In clinical practice, it is unlikely that LUTS will improve the accuracy of mortality prediction if the patient's background is well known.
Affiliation(s)
- Jonne Åkerla
  - Department of Urology, Tampere University Hospital, Tampere, Finland
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Jori S Pesonen
  - Department of Surgery, Päijät-Häme Central Hospital, Lahti, Finland
- Antti Pöyhönen
  - Centre for Military Medicine, The Finnish Defence Forces, Riihimäki, Finland
- Juha Koskimäki
  - Department of Urology, Tampere University Hospital, Tampere, Finland
- Jukka Häkkinen
  - Department of Urology, Länsi-Pohja healthcare District, Kemi, Finland
- Teuvo L J Tammela
  - Department of Urology, Tampere University Hospital, Tampere, Finland
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Anssi Auvinen
  - Faculty of Social Sciences, Tampere University, Tampere, Finland
5
Ajnakina O, Murray R, Steptoe A, Cadar D. The long-term effects of a polygenetic predisposition to general cognition on healthy cognitive ageing: evidence from the English Longitudinal Study of Ageing. Psychol Med 2023; 53:2852-2860. [PMID: 35139938 PMCID: PMC10235650 DOI: 10.1017/s0033291721004827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2021] [Revised: 10/28/2021] [Accepted: 11/03/2021] [Indexed: 11/07/2022]
Abstract
BACKGROUND As accelerated cognitive decline frequently heralds the onset of severe neuropathological disorders, understanding the source of individual differences in withstanding the onslaught of cognitive ageing may highlight how cognitive abilities may best be retained into advanced age. METHODS Using a population-representative sample of 5088 adults aged ≥50 years from the English Longitudinal Study of Ageing, we investigated relationships of polygenic predisposition to general cognition with the rate of change in cognition during a 10-year follow-up period. Polygenic predisposition was measured with polygenic scores for general cognition (GC-PGS). Cognition was measured employing tests for verbal memory and semantic fluency. RESULTS The average baseline memory score was 11.1 (s.d. = 2.9) and executive function score was 21.5 (s.d. = 5.8). An increase in GC-PGS by one standard deviation (1-s.d.) was associated with a higher baseline verbal memory by an average of 0.27 points (95% CI 0.19-0.34, p < 0.001). Similarly, a 1-s.d. increase in GC-PGS was associated with a higher semantic fluency score at baseline in the entire sample (β = 0.45, 95% CI 0.27-0.64, p < 0.001). These associations were significant for women and men, and all age groups. Nonetheless, a 1-s.d. increase in GC-PGS was not associated with decreases in verbal memory or semantic fluency during follow-up in the entire sample, or in models stratified by sex and age. CONCLUSION Although common genetic variants associated with general cognition are additively associated with a stable surplus in cognition in adults, a polygenic predisposition to general cognition is not associated with age-related cognitive decline during a 10-year follow-up.
Affiliation(s)
- Olesya Ajnakina
  - Department of Behavioural Science and Health, Institute of Epidemiology and Health Care, University College London, 1-19 Torrington Place, London, WC1E 7HB, UK
  - Department of Biostatistics & Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, 16 De Crespigny Park, Camberwell, London, SE5 8AF, UK
- Robin Murray
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
  - Department of Psychiatry, Experimental Biomedicine and Clinical Neuroscience, University of Palermo, Palermo, Italy
- Andrew Steptoe
  - Department of Behavioural Science and Health, Institute of Epidemiology and Health Care, University College London, 1-19 Torrington Place, London, WC1E 7HB, UK
- Dorina Cadar
  - Department of Behavioural Science and Health, Institute of Epidemiology and Health Care, University College London, 1-19 Torrington Place, London, WC1E 7HB, UK
6
Genetic propensity, socioeconomic status, and trajectories of depression over a course of 14 years in older adults. Transl Psychiatry 2023; 13:68. [PMID: 36823133 PMCID: PMC9950051 DOI: 10.1038/s41398-023-02367-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/17/2022] [Revised: 02/07/2023] [Accepted: 02/13/2023] [Indexed: 02/25/2023] Open
Abstract
Depression is one of the leading causes of disability worldwide and is a major contributor to the global burden of disease among older adults. The study aimed to investigate the interplay between socio-economic markers (education and financial resources) and polygenic predisposition influencing individual differences in depressive symptoms and their change over time in older adults, which is of central relevance for preventative strategies. A sample of n = 6202 adults aged ≥50 years, with a follow-up period of 14 years, was drawn from the English Longitudinal Study of Ageing. Polygenic scores for depressive symptoms were calculated using summary statistics for (1) single-trait depressive symptoms (PGS-DSsingle), and (2) multi-trait including depressive symptoms, subjective well-being, neuroticism, loneliness, and self-rated health (PGS-DSmulti-trait). Depressive symptoms over the past week were measured using the eight-item Centre for Epidemiologic Studies Depression Scale. One standard deviation increase in each PGS was associated with a higher baseline score in depressive symptoms. Each additional year of completed schooling was associated with lower baseline depressive symptoms (β = -0.06, 95%CI = -0.07 to -0.05, p < 0.001); intermediate and lower wealth were associated with a higher baseline score in depressive symptoms. Although there was a weak interaction effect between PGS-DSs and socio-economic status in association with the baseline depressive symptoms, there were no significant relationships of PGS-DSs, socio-economic factors, and rate of change in the depressive symptoms during the 14-year follow-up period. Common genetic variants for depressive symptoms are associated with a greater number of depressive symptoms at onset but not with their rate of change in the following 14 years. Lower socio-economic status is an important factor influencing individual levels of depressive symptoms, independently from polygenic predisposition to depressive symptoms.
7
Li Z, Yang N, He L, Wang J, Ping F, Li W, Xu L, Zhang H, Li Y. Development and validation of questionnaire-based machine learning models for predicting all-cause mortality in a representative population of China. Front Public Health 2023; 11:1033070. [PMID: 36778549 PMCID: PMC9911458 DOI: 10.3389/fpubh.2023.1033070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 01/11/2023] [Indexed: 01/28/2023] Open
Abstract
Background Considering that previously developed mortality prediction models have limited applications to the Chinese population, a questionnaire-based prediction model is of great importance for its accuracy and convenience in clinical practice. Methods Two national cohorts, namely, the China Health and Nutrition Survey (8,355 individuals older than 18) and the China Health and Retirement Longitudinal Study (12,711 individuals older than 45), were used for model development and validation. One hundred and fifty-nine variables were compiled to generate predictions. The Cox regression model and six machine learning (ML) models were used to predict all-cause mortality. Finally, a simple questionnaire-based ML prediction model was developed using the best algorithm and validated. Results In the internal validation set, all the ML models performed better than the traditional Cox model in predicting 6-year mortality and the random survival forest (RSF) model performed best. The questionnaire-based ML model, which only included 20 variables, achieved a C-index of 0.86 (95%CI: 0.80-0.92). On external validation, the simple questionnaire-based model achieved a C-index of 0.82 (95%CI: 0.77-0.87), 0.77 (95%CI: 0.75-0.79), and 0.79 (95%CI: 0.77-0.81), respectively, in predicting 2-, 9-, and 11-year mortality. Conclusions In this prospective population-based study, a model based on the RSF analysis performed best among all models. Furthermore, there was no significant difference between the prediction performance of the questionnaire-based ML model, which only included 20 variables, and that of the model with all variables (including laboratory variables). The simple questionnaire-based ML prediction model, which needs to be further explored, is of great importance for its accuracy and suitability to the Chinese general population.
8
Ajnakina O, Fadilah I, Quattrone D, Arango C, Berardi D, Bernardo M, Bobes J, de Haan L, Del-Ben CM, Gayer-Anderson C, Stilo S, Jongsma HE, Lasalvia A, Tosato S, Llorca PM, Menezes PR, Rutten BP, Santos JL, Sanjuán J, Selten JP, Szöke A, Tarricone I, D’Andrea G, Tortelli A, Velthorst E, Jones PB, Romero MA, La Cascia C, Kirkbride JB, van Os J, O’Donovan M, Morgan C, di Forti M, Murray RM, Stahl D. Development and Validation of Predictive Model for a Diagnosis of First Episode Psychosis Using the Multinational EU-GEI Case-control Study and Modern Statistical Learning Methods. Schizophrenia Bulletin Open 2023; 4:sgad008. [PMID: 39145333 PMCID: PMC11207766 DOI: 10.1093/schizbullopen/sgad008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/16/2024]
Abstract
Background and Hypothesis It is argued that the availability of diagnostic models will facilitate more rapid identification of individuals who are at a higher risk of first episode psychosis (FEP). Therefore, we developed, evaluated, and validated a diagnostic risk estimation model to classify individuals with FEP and controls across six countries. Study Design We used data from a large multi-center study encompassing 2627 phenotypically well-defined participants (aged 18-64 years) recruited from six countries spanning 17 research sites, as part of the European Network of National Schizophrenia Networks Studying Gene-Environment Interactions study. To build the diagnostic model and identify the important factors for estimating an individual's risk of FEP, we applied a binary logistic model with regularization by the least absolute shrinkage and selection operator. The model was validated employing the internal-external cross-validation approach. The model performance was assessed with the area under the receiver operating characteristic curve (AUROC), calibration, sensitivity, and specificity. Study Results Having included 22 preselected predictor variables, the model was able to discriminate adults with FEP and controls with high accuracy across all six countries (AUROC range = 0.84-0.86). Specificity (range = 73.9-78.0%) and sensitivity (range = 75.6-79.3%) were equally good, cumulatively indicating excellent model accuracy; however, the calibration slope for the diagnostic model showed some overfitting when applied specifically to participants from France, the UK, and The Netherlands. Conclusions The new FEP model achieved good discrimination and good calibration across six countries with different ethnic contributions, supporting its robustness and good generalizability.
Affiliation(s)
- Olesya Ajnakina
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, University of London, London, UK
  - Department of Behavioural Science and Health, Institute of Epidemiology and Health Care, University College London, London, UK
- Ihsan Fadilah
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, University of London, London, UK
- Diego Quattrone
  - Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK
- Celso Arango
  - Child and Adolescent Psychiatry Department, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón, School of Medicine, Universidad Complutense, IiSGM, CIBERSAM, C/Doctor Esquerdo 46, 28007 Madrid, Spain
- Domenico Berardi
  - Department of Biomedical and Neuromotor Sciences, Psychiatry Unit, Alma Mater Studiorum Università di Bologna, Viale Pepoli 5, 40126 Bologna, Italy
- Miguel Bernardo
  - Department of Psychiatry, Barcelona Clinic Schizophrenia Unit, Neuroscience Institute, Hospital Clinic of Barcelona, University of Barcelona, IDIBAPS, CIBERSAM, Barcelona, Spain
- Julio Bobes
  - Faculty of Medicine and Health Sciences, Psychiatry, Universidad de Oviedo, ISPA, INEUROPA. CIBERSAM, Oviedo, Spain
- Lieuwe de Haan
  - Department of Psychiatry, Early Psychosis Section, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Cristina Marta Del-Ben
  - Neuroscience and Behavior Department, Ribeirão Preto Medical School, University of São Paulo, São Paulo, Brazil
- Charlotte Gayer-Anderson
  - Department of Health Service and Population Research, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK
- Simona Stilo
  - Department of Mental Health and Addiction Services, ASP Crotone, Crotone, Italy
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK
- Hannah E Jongsma
  - Centre for Transcultural Psychiatry Veldzicht, Balkbrug, The Netherlands
  - University Centre for Psychiatry, University Medical Centre Groningen, Groningen, The Netherlands
- Antonio Lasalvia
  - Section of Psychiatry, Department of Neuroscience, Biomedicine and Movement Sciences, University of Verona, Piazzale L.A. Scuro 10, 37134 Verona, Italy
- Sarah Tosato
  - Section of Psychiatry, Department of Neuroscience, Biomedicine and Movement Sciences, University of Verona, Piazzale L.A. Scuro 10, 37134 Verona, Italy
- Pierre-Michel Llorca
  - Université Clermont Auvergne, CMP-B CHU, CNRS, Clermont Auvergne INP, Institut Pascal, F-63000 Clermont-Ferrand, France
- Paulo Rossi Menezes
  - Department of Preventative Medicine, Faculdade de Medicina FMUSP, University of São Paulo, São Paulo, Brazil
- Bart P Rutten
  - Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, South Limburg Mental Health Research and Teaching Network, Maastricht University Medical Centre, P.O. Box 616, 6200 MD Maastricht, The Netherlands
- Jose Luis Santos
  - Department of Psychiatry, Servicio de Psiquiatría Hospital “Virgen de la Luz”, Cuenca, Spain
- Julio Sanjuán
  - Department of Psychiatry, Hospital Clínico Universitario de Valencia, INCLIVA, CIBERSAM, School of Medicine, Universidad de Valencia, Valencia, Spain
- Jean-Paul Selten
  - Rivierduinen Institute for Mental Health Care, Sandifortdreef 19, 2333 ZZ Leiden, The Netherlands
- Andrei Szöke
  - University of Paris Est Creteil, INSERM, IMRB, AP-HP, Hôpitaux Universitaires « H. Mondor », DMU IMPACT, Fondation FondaMental, F-94010 Creteil, France
- Ilaria Tarricone
  - Department of Medical and Surgical Sciences, Bologna University, Bologna, Italy
- Giuseppe D’Andrea
  - Department of Biomedical and Neuromotor Sciences, Psychiatry Unit, Alma Mater Studiorum Università di Bologna, Viale Pepoli 5, 40126 Bologna, Italy
- Eva Velthorst
  - Department of Psychiatry, Early Psychosis Section, Academic Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Peter B Jones
  - Department of Psychiatry, University of Cambridge, Herchel Smith Building for Brain and Mind Sciences, Forvie Site, Robinson Way, Cambridge, CB2 0SZ, UK
  - CAMEO Early Intervention Service, Cambridgeshire and Peterborough NHS Foundation Trust, Cambridge, CB21 5EF, UK
- Manuel Arrojo Romero
  - Department of Psychiatry, Psychiatric Genetic Group, Instituto de Investigación Sanitaria de Santiago de Compostela, Complejo Hospitalario s, Santiago de Compostela, Spain
- Caterina La Cascia
  - Department of Experimental Biomedicine and Clinical Neuroscience, University of Palermo, Via G. La Loggia 1, 90129 Palermo, Italy
- James B Kirkbride
  - Psylife Group, Division of Psychiatry, University College London, 6th Floor, Maple House, 149 Tottenham Court Road, London, W1T 7NF, UK
- Jim van Os
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK
  - Department of Psychiatry, Brain Centre Rudolf Magnus, Utrecht University Medical centre, Utrecht, The Netherlands
  - Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, South Limburg Mental Health Research and Teaching Network, Maastricht University Medical Centre, P.O. Box 616, 6200 MD Maastricht, The Netherlands
- Michael O’Donovan
  - Division of Psychological Medicine and Clinical Neurosciences, MRC Centre for Neuropsychiatric Genetics and Genomics, Cardiff University, Cardiff CF24 4HQ, UK
- Craig Morgan
  - Department of Health Service and Population Research, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK
- Marta di Forti
  - Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK
- Robin M Murray
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK
  - Department of Psychiatry, Experimental Biomedicine and Clinical Neuroscience, University of Palermo, Palermo, Italy
- Daniel Stahl
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, University of London, London, UK
9
Mizani MA, Dashtban A, Pasea L, Lai AG, Thygesen J, Tomlinson C, Handy A, Mamza JB, Morris T, Khalid S, Zaccardi F, Macleod MJ, Torabi F, Canoy D, Akbari A, Berry C, Bolton T, Nolan J, Khunti K, Denaxas S, Hemingway H, Sudlow C, Banerjee A, on behalf of the CVD-COVID-UK Consortium. Using national electronic health records for pandemic preparedness: validation of a parsimonious model for predicting excess deaths among those with COVID-19: a data-driven retrospective cohort study. J R Soc Med 2023; 116:10-20. [PMID: 36374585 PMCID: PMC9909113 DOI: 10.1177/01410768221131897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Accepted: 09/24/2022] [Indexed: 11/16/2022] Open
Abstract
OBJECTIVES To use national, pre- and post-pandemic electronic health records (EHR) to develop and validate a scenario-based model incorporating baseline mortality risk, infection rate (IR) and relative risk (RR) of death for prediction of excess deaths. DESIGN An EHR-based, retrospective cohort study. SETTING Linked EHR in Clinical Practice Research Datalink (CPRD); and linked EHR and COVID-19 data in England provided in NHS Digital Trusted Research Environment (TRE). PARTICIPANTS In the development (CPRD) and validation (TRE) cohorts, we included 3.8 million and 35.1 million individuals aged ≥30 years, respectively. MAIN OUTCOME MEASURES One-year all-cause excess deaths related to COVID-19 from March 2020 to March 2021. RESULTS From 1 March 2020 to 1 March 2021, there were 127,020 observed excess deaths. Observed RR was 4.34% (95% CI, 4.31-4.38) and IR was 6.27% (95% CI, 6.26-6.28). In the validation cohort, predicted one-year excess deaths were 100,338 compared with the observed 127,020 deaths with a ratio of predicted to observed excess deaths of 0.79. CONCLUSIONS We show that a simple, parsimonious model incorporating baseline mortality risk, one-year IR and RR of the pandemic can be used for scenario-based prediction of excess deaths in the early stages of a pandemic. Our analyses show that EHR could inform pandemic planning and surveillance, despite limited use in emergency preparedness to date. Although infection dynamics are important in the prediction of mortality, future models should take greater account of underlying conditions.
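The predicted-to-observed ratio reported in this abstract can be reproduced with a short calculation. The Python sketch below uses only the two counts quoted in the RESULTS; the helper name `predicted_to_observed_ratio` is hypothetical and is not part of the study's code.

```python
def predicted_to_observed_ratio(predicted: int, observed: int) -> float:
    """Ratio of predicted to observed counts; values below 1 mean under-prediction."""
    if observed <= 0:
        raise ValueError("observed count must be positive")
    return predicted / observed


# Counts quoted in the abstract (validation cohort, March 2020 to March 2021).
predicted_excess = 100_338
observed_excess = 127_020

ratio = predicted_to_observed_ratio(predicted_excess, observed_excess)
shortfall = observed_excess - predicted_excess  # deaths the model did not predict

print(f"ratio = {ratio:.2f}")      # ratio = 0.79
print(f"shortfall = {shortfall}")  # shortfall = 26682
```

A ratio of 0.79 means the model predicted roughly four-fifths of the observed excess deaths, which is consistent with the abstract's remark that future models should take greater account of underlying conditions.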
Affiliation(s)
- Mehrdad A Mizani
  - Institute of Health Informatics, University College London, London NW1 2DA, UK
  - BHF Data Science Centre, Health Data Research UK, London, NW1 2BE, UK
- Ashkan Dashtban
  - Institute of Health Informatics, University College London, London NW1 2DA, UK
- Laura Pasea
  - Institute of Health Informatics, University College London, London NW1 2DA, UK
- Alvina G Lai
  - Institute of Health Informatics, University College London, London NW1 2DA, UK
- Johan Thygesen
  - Institute of Health Informatics, University College London, London NW1 2DA, UK
- Chris Tomlinson
  - Institute of Health Informatics, University College London, London NW1 2DA, UK
- Alex Handy
  - Institute of Health Informatics, University College London, London NW1 2DA, UK
- Jil B Mamza
  - Medical and Scientific Affairs, BioPharmaceuticals Medical, AstraZeneca, Cambridge, CB2 0AA, UK
- Tamsin Morris
  - Medical and Scientific Affairs, BioPharmaceuticals Medical, AstraZeneca, Cambridge, CB2 0AA, UK
- Sara Khalid
  - Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7HE, UK
- Francesco Zaccardi
  - Leicester Diabetes Centre, University of Leicester, Leicester, LE5 4PW, UK
- Mary Joan Macleod
  - School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, Aberdeen, AB24 3FX, UK
- Fatemeh Torabi
  - Faculty of Medicine, Health and Life Science, Swansea University, Swansea, SA2 8QA, UK
- Dexter Canoy
  - Nuffield Department of Women’s and Reproductive Health, University of Oxford, Oxford, OX3 9DU, UK
- Ashley Akbari
  - Faculty of Medicine, Health and Life Science, Swansea University, Swansea, SA2 8QA, UK
- Colin Berry
  - Institute of Cardiovascular and Medical Sciences, University of Glasgow, Glasgow, G12 8TA, UK
- Thomas Bolton
  - BHF Data Science Centre, Health Data Research UK, London, NW1 2BE, UK
- John Nolan
  - BHF Data Science Centre, Health Data Research UK, London, NW1 2BE, UK
- Kamlesh Khunti
  - Leicester Diabetes Centre, University of Leicester, Leicester, LE5 4PW, UK
- Spiros Denaxas
  - Institute of Health Informatics, University College London, London NW1 2DA, UK
- Harry Hemingway
  - Institute of Health Informatics, University College London, London NW1 2DA, UK
- Cathie Sudlow
  - BHF Data Science Centre, Health Data Research UK, London, NW1 2BE, UK
- Amitava Banerjee
  - Institute of Health Informatics, University College London, London NW1 2DA, UK
- on behalf of the CVD-COVID-UK Consortium
  - Institute of Health Informatics, University College London, London NW1 2DA, UK
  - BHF Data Science Centre, Health Data Research UK, London, NW1 2BE, UK
  - Medical and Scientific Affairs, BioPharmaceuticals Medical, AstraZeneca, Cambridge, CB2 0AA, UK
  - Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7HE, UK
  - Leicester Diabetes Centre, University of Leicester, Leicester, LE5 4PW, UK
  - School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, Aberdeen, AB24 3FX, UK
  - Faculty of Medicine, Health and Life Science, Swansea University, Swansea, SA2 8QA, UK
  - Nuffield Department of Women’s and Reproductive Health, University of Oxford, Oxford, OX3 9DU, UK
  - Institute of Cardiovascular and Medical Sciences, University of Glasgow, Glasgow, G12 8TA, UK
|
10
|
Elnakib S, Vecino-Ortiz AI, Gibson DG, Agarwal S, Trujillo AJ, Zhu Y, Labrique A. A novel score for mobile health applications to predict and prevent mortality: Further validation and adaptation to US population using the US NHANES dataset. J Med Internet Res 2022; 24:e36787. [PMID: 35483022 PMCID: PMC9240932 DOI: 10.2196/36787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2022] [Revised: 04/14/2022] [Accepted: 04/28/2022] [Indexed: 11/28/2022] Open
Abstract
Background The C-Score, which is an individual health score, is based on a predictive model validated in the UK and US populations. It was designed to serve as an individualized point-in-time health assessment tool that could be integrated into clinical counseling or consumer-facing digital health tools to encourage lifestyle modifications that reduce the risk of premature death. Objective Our study aimed to conduct an external validation of the C-Score in the US population and expand the original score to improve its predictive capabilities in the US population. The C-Score is intended for mobile health apps on wearable devices. Methods We conducted a literature review to identify relevant variables that were missing in the original C-Score. Subsequently, we used data from the 2005 to 2014 US National Health and Nutrition Examination Survey (NHANES; N=21,015) to test the capacity of the model to predict all-cause mortality. We used NHANES III data from 1988 to 1994 (N=1440) to conduct an external validation of the test. Only participants with complete data were included in this study. Discrimination and calibration tests were conducted to assess the operational characteristics of the adapted C-Score from receiver operating curves and a design-based goodness-of-fit test. Results Higher C-Scores were associated with reduced odds of all-cause mortality (odds ratio 0.96, P<.001). We found a good fit of the C-Score for all-cause mortality with an area under the curve (AUC) of 0.72. Among participants aged between 40 and 69 years, C-Score models had a good fit for all-cause mortality and an AUC >0.72. A sensitivity analysis using NHANES III data (1988-1994) was performed, yielding similar results. The inclusion of sociodemographic and clinical variables in the basic C-Score increased the AUCs from 0.72 (95% CI 0.71-0.73) to 0.87 (95% CI 0.85-0.88). 
Conclusions Our study shows that this digital biomarker, the C-Score, has good capabilities to predict all-cause mortality in the general US population. An expanded health score can predict 87% of the mortality in the US population. This model can be used as an instrument to assess individual mortality risk and as a counseling tool to motivate behavior changes and lifestyle modifications.
Affiliation(s)
- Shatha Elnakib
  - Department of International Health, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe Street, E8620, Baltimore, US
- Andres I Vecino-Ortiz
  - Department of International Health, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe Street, E8620, Baltimore, US
- Dustin G Gibson
  - Department of International Health, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe Street, E8620, Baltimore, US
- Smisha Agarwal
  - Department of International Health, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe Street, E8620, Baltimore, US
- Antonio J Trujillo
  - Department of International Health, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe Street, E8620, Baltimore, US
- Yifan Zhu
  - Department of International Health, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe Street, E8620, Baltimore, US
- Alain Labrique
  - Department of International Health, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe Street, E8620, Baltimore, US
|
11
|
Francis ER, Cadar D, Steptoe A, Ajnakina O. Interplay between polygenic propensity for ageing-related traits and the consumption of fruits and vegetables on future dementia diagnosis. BMC Psychiatry 2022; 22:75. [PMID: 35093034 PMCID: PMC8801085 DOI: 10.1186/s12888-022-03717-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/10/2021] [Accepted: 01/21/2022] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND Understanding how polygenic scores for ageing-related traits interact with diet in determining a future diagnosis of dementia, including Alzheimer's disease (AD), would increase our understanding of mechanisms underlying dementia onset. METHODS Using 6784 population-representative adults aged ≥50 years from the English Longitudinal Study of Ageing, we employed an accelerated failure time survival model to investigate interactions between polygenic scores for AD (AD-PGS), schizophrenia (SZ-PGS) and general cognition (GC-PGS) and the baseline daily fruit and vegetable intake in association with dementia diagnosis during a 10-year follow-up. The baseline sample was obtained from waves 3-4 (2006-2009); follow-up data came from wave 5 (2010-2011) to wave 8 (2016-2017). RESULTS Consuming < 5 portions of fruit and vegetables a day was associated with a 33-37% greater risk of dementia in the following 10 years, depending on an individual's polygenic propensity. A one standard deviation (1-SD) increase in AD-PGS was associated with a 24% higher risk of dementia and a 47% higher risk of AD diagnosis. A 1-SD increase in SZ-PGS was associated with a 66% increased risk of AD diagnosis (95% CI = 1.05-2.64) in participants who consumed < 5 portions of fruit or vegetables. There was a significant additive interaction between GC-PGS and < 5 portions of the baseline daily intake of fruit and vegetables in association with AD diagnosis during the 10-year follow-up (RERI = 0.70, 95% CI = 0.09-4.82; AP = 0.36, 95% CI = 0.17-0.66). CONCLUSION A diet rich in fruit and vegetables is an important factor influencing the subsequent risk of dementia over the 10-year follow-up, especially in the context of polygenic predisposition to AD, schizophrenia, and general cognition.
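The additive-interaction measures reported here, RERI (relative excess risk due to interaction) and AP (attributable proportion), have standard textbook definitions in terms of three relative risks. The sketch below applies those definitions to made-up inputs. The abstract does not describe how the authors derived their estimates from the accelerated failure time model, so the function names and the example values are hypothetical.

```python
def reri(rr_both: float, rr_a_only: float, rr_b_only: float) -> float:
    """Relative excess risk due to interaction: RR11 - RR10 - RR01 + 1."""
    return rr_both - rr_a_only - rr_b_only + 1.0


def attributable_proportion(rr_both: float, rr_a_only: float,
                            rr_b_only: float) -> float:
    """Share of risk in the doubly exposed attributable to interaction: RERI / RR11."""
    return reri(rr_both, rr_a_only, rr_b_only) / rr_both


# Hypothetical relative risks, NOT values from the study.
example_reri = reri(2.0, 1.3, 1.2)
example_ap = attributable_proportion(2.0, 1.3, 1.2)
print(f"RERI = {example_reri:.2f}, AP = {example_ap:.2f}")  # RERI = 0.50, AP = 0.25
```

Under these definitions, RERI > 0 indicates that the joint effect exceeds the sum of the separate effects, which is the sense in which the abstract reports a positive additive interaction.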
Affiliation(s)
- Emma Ruby Francis
  - Department of Behavioural Science and Health, Institute of Epidemiology and Health Care, University College London, 1-19 Torrington Place, London, WC1E 7HB, UK
- Dorina Cadar
  - Department of Behavioural Science and Health, Institute of Epidemiology and Health Care, University College London, 1-19 Torrington Place, London, WC1E 7HB, UK
  - Brighton and Sussex Medical School, Brighton, East Sussex, UK
- Andrew Steptoe
  - Department of Behavioural Science and Health, Institute of Epidemiology and Health Care, University College London, 1-19 Torrington Place, London, WC1E 7HB, UK
- Olesya Ajnakina
  - Department of Behavioural Science and Health, Institute of Epidemiology and Health Care, University College London, 1-19 Torrington Place, London, WC1E 7HB, UK
  - Department of Biostatistics & Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
|