1
Tibble H, Sheikh A, Tsanas A. Development and validation of a machine learning risk prediction model for asthma attacks in adults in primary care. NPJ Prim Care Respir Med 2025; 35:24. [PMID: 40268974 PMCID: PMC12019439 DOI: 10.1038/s41533-025-00428-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2024] [Accepted: 04/07/2025] [Indexed: 04/25/2025] [Open access]
Abstract
Primary care consultations provide an opportunity for patients and clinicians to assess asthma attack risk. Using a data-driven risk prediction tool with routinely collected health records may be an efficient way to aid promotion of effective self-management, and support clinical decision making. Longitudinal Scottish primary care data for 21,250 asthma patients were used to predict the risk of asthma attacks in the following year. A selection of machine learning algorithms (i.e., Naïve Bayes Classifier, Logistic Regression, Random Forests, and Extreme Gradient Boosting), hyperparameters, and training data enrichment methods were explored and validated in a random unseen data partition. Our final Logistic Regression model achieved the best performance when no training data enrichment was applied. Around 1 in 3 (36.2%) predicted high-risk patients had an attack within one year of consultation, compared to approximately 1 in 16 in the predicted low-risk group (6.7%). The model was well calibrated, with a calibration slope of 1.02 and an intercept of 0.004, and the Area under the Curve was 0.75. This model has the potential to increase the efficiency of routine asthma care by creating new personalized care pathways mapped to predicted risk of asthma attacks, such as priority ranking patients for scheduled consultations and interventions. Furthermore, it could be used to educate patients about their individual risk and risk factors, and promote healthier lifestyle changes, use of self-management plans, and early emergency care seeking following rapid symptom deterioration.
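The two headline figures in this abstract, the Area under the Curve and the observed attack rate in the predicted high-risk and low-risk groups, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the risk threshold, and the toy data below are all assumptions made for illustration, and the AUC is computed with a simple pairwise comparison rather than any particular library routine.

```python
from typing import Sequence, Tuple


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen patient with an attack is scored higher than a randomly chosen
    patient without one (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def event_rate_by_group(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Observed attack rate among predicted high-risk (score >= threshold)
    and predicted low-risk (score < threshold) patients."""
    high = [y for y, s in zip(y_true, y_score) if s >= threshold]
    low = [y for y, s in zip(y_true, y_score) if s < threshold]

    def rate(ys):
        return sum(ys) / len(ys) if ys else float("nan")

    return rate(high), rate(low)


# Toy data, invented for illustration only (not from the study).
y = [0, 0, 1, 0, 1, 1, 0, 0]                               # 1 = attack within a year
p = [0.05, 0.10, 0.40, 0.20, 0.70, 0.15, 0.30, 0.08]       # predicted risk

model_auc = auc(y, p)
high_rate, low_rate = event_rate_by_group(y, p, threshold=0.25)  # hypothetical cut-off
print(f"AUC={model_auc:.3f}, high-risk rate={high_rate:.3f}, low-risk rate={low_rate:.3f}")
```

The pairwise AUC is quadratic in the number of patients, which is fine for a toy example; a rank-based computation would be the usual choice for a cohort of this size.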
Affiliation(s)
- Holly Tibble
- Usher Institute, The University of Edinburgh, Edinburgh, UK.
- Asthma UK Centre for Applied Research, Edinburgh, UK.
- Aziz Sheikh
- Usher Institute, The University of Edinburgh, Edinburgh, UK
- Asthma UK Centre for Applied Research, Edinburgh, UK
- Athanasios Tsanas
- Usher Institute, The University of Edinburgh, Edinburgh, UK
- Asthma UK Centre for Applied Research, Edinburgh, UK
2
Newby D, Taylor N, Joyce DW, Winchester LM. Optimising the use of electronic medical records for large scale research in psychiatry. Transl Psychiatry 2024; 14:232. [PMID: 38824136 PMCID: PMC11144247 DOI: 10.1038/s41398-024-02911-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 04/13/2024] [Accepted: 04/15/2024] [Indexed: 06/03/2024] [Open access]
Abstract
The explosion and abundance of digital data could facilitate large-scale research for psychiatry and mental health. Research using so-called "real world data", such as electronic medical/health records, can be resource-efficient, facilitate rapid hypothesis generation and testing, complement existing evidence (e.g. from trials and evidence-synthesis) and may enable a route to translate evidence into clinically effective, outcomes-driven care for patient populations that may be under-represented. However, the interpretation and processing of real-world data sources is complex because the clinically important "signal" is often contained in both structured and unstructured (narrative or "free-text") data. Techniques for extracting meaningful information (signal) from unstructured text exist and have advanced the re-use of routinely collected clinical data, but these techniques require cautious evaluation. In this paper, we survey the opportunities, risks and progress made in the use of electronic medical record (real-world) data for psychiatric research.
Affiliation(s)
- Danielle Newby
- Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Centre for Statistics in Medicine, University of Oxford, Oxford, UK
- Niall Taylor
- Department of Psychiatry, University of Oxford, Oxford, UK
- Dan W Joyce
- Department of Primary Care and Mental Health and Civic Health, Innovation Labs, Institute of Population Health, University of Liverpool, Liverpool, UK
3
Horne EMF, McLean S, Alsallakh MA, Davies GA, Price DB, Sheikh A, Tsanas A. Defining clinical subtypes of adult asthma using electronic health records: Analysis of a large UK primary care database with external validation. Int J Med Inform 2023; 170:104942. [PMID: 36529028 DOI: 10.1016/j.ijmedinf.2022.104942] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/24/2022] [Revised: 11/13/2022] [Accepted: 11/28/2022] [Indexed: 12/12/2022]
Abstract
INTRODUCTION: Asthma is one of the commonest chronic conditions in the world. Subtypes of asthma have been defined, typically from clinical datasets on small, well-characterised subpopulations of asthma patients. We sought to define asthma subtypes from large longitudinal primary care electronic health records (EHRs) using cluster analysis. METHODS: In this retrospective cohort study, we extracted asthma subpopulations from the Optimum Patient Care Research Database (OPCRD) to robustly train and test algorithms, and externally validated findings in the Secure Anonymised Information Linkage (SAIL) Databank. In both databases, we identified adults with an asthma diagnosis code recorded in the three years prior to an index date. Train and test datasets were selected from OPCRD using an index date of Jan 1, 2016. Two internal validation datasets were selected from OPCRD using index dates of Jan 1, 2017 and 2018. Three external validation datasets were selected from SAIL using index dates of Jan 1, 2016, 2017 and 2018. Each dataset comprised 50,000 randomly selected non-overlapping patients. Subtypes were defined by applying multiple correspondence analysis and k-means cluster analysis to the train dataset, and were validated in the internal and external validation datasets. RESULTS: We defined six asthma subtypes with clear clinical interpretability: low inhaled corticosteroid (ICS) use and low healthcare utilisation (30% of patients); low-to-medium ICS use (36%); low-to-medium ICS use and comorbidities (12%); varied ICS use and comorbid chronic obstructive pulmonary disease (4%); high (10%) and very high ICS use (7%). The subtypes were replicated with high accuracy in internal (91-92%) and external (84-86%) datasets. CONCLUSION: Asthma subtypes derived and validated in large independent EHR databases were primarily defined by level of ICS use, level of healthcare use, and presence of comorbidities. This has important clinical implications towards defining asthma subtypes, facilitating patient stratification, and developing more personalised monitoring and treatment strategies.
Affiliation(s)
- Elsie M F Horne
- Asthma UK Centre for Applied Research, Edinburgh, UK; Usher Institute, Edinburgh Medical School, College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh, UK; Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK.
- Susannah McLean
- Asthma UK Centre for Applied Research, Edinburgh, UK; Usher Institute, Edinburgh Medical School, College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh, UK
- Mohammad A Alsallakh
- Asthma UK Centre for Applied Research, Edinburgh, UK; Population Data Science, Swansea University Medical School, Swansea, UK; Health Data Research UK, Swansea and Edinburgh, UK
- Gwyneth A Davies
- Asthma UK Centre for Applied Research, Edinburgh, UK; Population Data Science, Swansea University Medical School, Swansea, UK
- David B Price
- Observational and Pragmatic Research Institute (OPRI), Singapore; Centre of Academic Primary Care, Division of Applied Health Sciences, University of Aberdeen, Aberdeen, UK
- Aziz Sheikh
- Asthma UK Centre for Applied Research, Edinburgh, UK; Usher Institute, Edinburgh Medical School, College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh, UK
- Athanasios Tsanas
- Asthma UK Centre for Applied Research, Edinburgh, UK; Usher Institute, Edinburgh Medical School, College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh, UK
4
Johnson M, Rigge L, Culliford D, Josephs L, Thomas M, Wilkinson T. Primary care risk stratification in COPD using routinely collected data: a secondary data analysis. NPJ Prim Care Respir Med 2019; 29:42. [PMID: 31797867 PMCID: PMC6892877 DOI: 10.1038/s41533-019-0154-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/30/2019] [Accepted: 11/08/2019] [Indexed: 11/28/2022] [Open access]
Abstract
Most clinical contacts with chronic obstructive pulmonary disease (COPD) patients take place in primary care, presenting opportunity for proactive clinical management. Electronic health records could be used to risk stratify diagnosed patients in this setting, but may be limited by poor data quality or completeness. We developed a risk stratification database algorithm using the DOSE index (Dyspnoea, Obstruction, Smoking and Exacerbation) with routinely collected primary care data, aiming to calculate up to three repeated risk scores per patient over five years, each separated by at least one year. Among 10,393 patients with diagnosed COPD, sufficient primary care data were present to calculate at least one risk score for 77.4%, and the maximum of three risk scores for 50.6%. Linked secondary care data revealed primary care under-recording of hospital exacerbations, which translated to a slight, non-significant cohort average risk score reduction, and an understated risk group allocation for less than 1% of patients. Algorithmic calculation of the DOSE index is possible using primary care data, and appears robust to the absence of linked secondary care data, if unavailable. The DOSE index appears a simple and practical means of incorporating risk stratification into the routine primary care of COPD patients, but further research is needed to evaluate its clinical utility in this setting. Although secondary analysis of routinely collected primary care data could benefit clinicians, patients and the health system, standardised data collection and improved data quality and completeness are also needed.
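The scheduling constraint described above (up to three repeated risk scores per patient, each separated by at least one year) can be sketched as a simple greedy selection over candidate dates. The abstract does not state how the authors chose dates, so the function name, the earliest-first rule, and the 365-day gap below are assumptions for illustration; the five-year study window is assumed to have been applied to the candidate dates beforehand.

```python
from datetime import date
from typing import Iterable, List


def select_score_dates(
    candidates: Iterable[date], max_scores: int = 3, min_gap_days: int = 365
) -> List[date]:
    """Keep the earliest candidate date, then each later date that falls at
    least `min_gap_days` after the most recently kept one, up to `max_scores`.
    A candidate date is assumed to be one on which all data needed for a
    risk score are available."""
    selected: List[date] = []
    for d in sorted(candidates):
        if len(selected) == max_scores:
            break
        if not selected or (d - selected[-1]).days >= min_gap_days:
            selected.append(d)
    return selected


# Hypothetical candidate dates for one patient (invented for illustration).
dates = [
    date(2012, 3, 1),
    date(2012, 9, 15),   # too close to the first kept date
    date(2013, 4, 1),
    date(2014, 1, 10),   # too close to the second kept date
    date(2015, 6, 1),
]
picked = select_score_dates(dates)
print(picked)
```

Taking the earliest date first maximises the chance of fitting three scores into a fixed window, which is one plausible reading of "up to three repeated risk scores"; other selection rules would also satisfy the stated constraint.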
Affiliation(s)
- Matthew Johnson
- MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, UK.
- NIHR ARC Wessex Data Science Hub, Faculty of Health Sciences, University of Southampton, Southampton, UK.
- Lucy Rigge
- NIHR ARC Wessex, University of Southampton, Southampton, UK
- NIHR Respiratory Biomedical Research Unit, Southampton General Hospital, Southampton, UK
- David Culliford
- NIHR ARC Wessex Data Science Hub, Faculty of Health Sciences, University of Southampton, Southampton, UK
- Lynn Josephs
- NIHR ARC Wessex, University of Southampton, Southampton, UK
- Department of Primary Care & Population Sciences, Faculty of Medicine, University of Southampton, Southampton, UK
- Mike Thomas
- NIHR ARC Wessex, University of Southampton, Southampton, UK
- NIHR Respiratory Biomedical Research Unit, Southampton General Hospital, Southampton, UK
- Department of Primary Care & Population Sciences, Faculty of Medicine, University of Southampton, Southampton, UK
- Tom Wilkinson
- NIHR Respiratory Biomedical Research Unit, Southampton General Hospital, Southampton, UK
- Clinical and Experimental Sciences, University of Southampton Faculty of Medicine, Southampton General Hospital, Southampton, UK
- Wessex Investigational Sciences Hub, University of Southampton Faculty of Medicine, Southampton General Hospital, Southampton, UK
5
Martin RJ, Bel EH, Pavord ID, Price D, Reddel HK. Defining severe obstructive lung disease in the biologic era: an endotype-based approach. Eur Respir J 2019; 54:1900108. [PMID: 31515397 PMCID: PMC6917363 DOI: 10.1183/13993003.00108-2019] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 01/15/2019] [Accepted: 08/19/2019] [Indexed: 11/05/2022]
Abstract
Severe obstructive lung disease, which encompasses asthma, chronic obstructive pulmonary disease (COPD) or features of both, remains a considerable global health problem and burden on healthcare resources. However, the clinical definitions of severe asthma and COPD do not reflect the heterogeneity within these diagnoses or the potential for overlap between them, which may lead to inappropriate treatment decisions. Furthermore, most studies exclude patients with diagnoses of both asthma and COPD. Clinical definitions can influence clinical trial design and are both influenced by, and influence, regulatory indications and treatment recommendations. Therefore, to ensure its relevance in the era of targeted biologic therapies, the definition of severe obstructive lung disease must be updated so that it includes all patients who could benefit from novel treatments and for whom associated costs are justified. Here, we review evolving clinical definitions of severe obstructive lung disease and evaluate how these have influenced trial design by summarising eligibility criteria and primary outcomes of phase III randomised controlled trials of biologic therapies. Based on our findings, we discuss the advantages of a phenotype- and endotype-based approach to select appropriate populations for future trials that may influence regulatory approvals and clinical practice, allowing targeted biologic therapies to benefit a greater proportion and range of patients. This calls for co-ordinated efforts between investigators, pharmaceutical developers and regulators to ensure biologic therapies reach their full potential in the management of severe obstructive lung disease.
Affiliation(s)
- Richard J Martin
- National Jewish Health and the University of Colorado, Denver, CO, USA
- Elisabeth H Bel
- Amsterdam University Medical Centre, University of Amsterdam, Amsterdam, the Netherlands
- Ian D Pavord
- Respiratory Medicine Unit and NIHR Oxford Respiratory BRC, Nuffield Dept of Medicine, University of Oxford, Oxford, UK
- David Price
- Observational and Pragmatic Research Institute, Singapore
- Centre of Academic Primary Care, University of Aberdeen, Aberdeen, UK
- Helen K Reddel
- Woolcock Institute of Medical Research, University of Sydney, Sydney, Australia
6
Franssen FME, Alter P, Bar N, Benedikter BJ, Iurato S, Maier D, Maxheim M, Roessler FK, Spruit MA, Vogelmeier CF, Wouters EFM, Schmeck B. Personalized medicine for patients with COPD: where are we? Int J Chron Obstruct Pulmon Dis 2019; 14:1465-1484. [PMID: 31371934 PMCID: PMC6636434 DOI: 10.2147/copd.s175706] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 03/10/2019] [Accepted: 06/05/2019] [Indexed: 12/19/2022] [Open access]
Abstract
Chronic airflow limitation is the common denominator of patients with chronic obstructive pulmonary disease (COPD). However, it is not possible to predict morbidity and mortality of individual patients based on the degree of lung function impairment, nor does the degree of airflow limitation allow guidance regarding therapies. Over the last decades, understanding of the factors contributing to the heterogeneity of disease trajectories, clinical presentation, and response to existing therapies has greatly advanced. Indeed, diagnostic assessment and treatment algorithms for COPD have become more personalized. In addition to the pulmonary abnormalities and inhaler therapies, extra-pulmonary features and comorbidities have been studied and are considered essential components of comprehensive disease management, including lifestyle interventions. Despite these advances, predicting and/or modifying the course of the disease remains currently impossible, and selection of patients with a beneficial response to specific interventions is unsatisfactory. Consequently, non-response to pharmacologic and non-pharmacologic treatments is common, and many patients have refractory symptoms. Thus, there is an ongoing urgency for a more targeted and holistic management of the disease, incorporating the basic principles of P4 medicine (predictive, preventive, personalized, and participatory). This review describes the current status and unmet needs regarding personalized medicine for patients with COPD. Also, it proposes a systems medicine approach, integrating genetic, environmental, (micro)biological, and clinical factors in experimental and computational models in order to decipher the multilevel complexity of COPD. Ultimately, the acquired insights will enable the development of clinical decision support systems and advance personalized medicine for patients with COPD.
Affiliation(s)
- Frits ME Franssen
- Department of Research and Education, CIRO, Horn, The Netherlands
- Department of Respiratory Medicine, Maastricht University Medical Centre, NUTRIM School of Nutrition and Translational Research in Metabolism, Maastricht, The Netherlands
- Peter Alter
- Department of Medicine, Pulmonary and Critical Care Medicine, University Medical Center Giessen and Marburg, Philipps University of Marburg (UMR), Member of the German Center for Lung Research (DZL), Marburg, Germany
- Nadav Bar
- Department of Chemical Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Birke J Benedikter
- Institute for Lung Research, Universities of Giessen and Marburg Lung Centre, Philipps-University Marburg, Member of the German Center for Lung Research (DZL), Marburg, Germany
- Department of Medical Microbiology, Maastricht University Medical Center (MUMC+), Maastricht, The Netherlands
- Michael Maxheim
- Department of Medicine, Pulmonary and Critical Care Medicine, University Medical Center Giessen and Marburg, Philipps University of Marburg (UMR), Member of the German Center for Lung Research (DZL), Marburg, Germany
- Fabienne K Roessler
- Department of Chemical Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Martijn A Spruit
- Department of Research and Education, CIRO, Horn, The Netherlands
- Department of Respiratory Medicine, Maastricht University Medical Centre, NUTRIM School of Nutrition and Translational Research in Metabolism, Maastricht, The Netherlands
- REVAL - Rehabilitation Research Center, BIOMED - Biomedical Research Institute, Faculty of Rehabilitation Sciences, Hasselt University, Diepenbeek, Belgium
- Claus F Vogelmeier
- Department of Medicine, Pulmonary and Critical Care Medicine, University Medical Center Giessen and Marburg, Philipps University of Marburg (UMR), Member of the German Center for Lung Research (DZL), Marburg, Germany
- Emiel FM Wouters
- Department of Research and Education, CIRO, Horn, The Netherlands
- Department of Respiratory Medicine, Maastricht University Medical Centre, NUTRIM School of Nutrition and Translational Research in Metabolism, Maastricht, The Netherlands
- Bernd Schmeck
- Department of Medicine, Pulmonary and Critical Care Medicine, University Medical Center Giessen and Marburg, Philipps University of Marburg (UMR), Member of the German Center for Lung Research (DZL), Marburg, Germany
- Institute for Lung Research, Universities of Giessen and Marburg Lung Centre, Philipps-University Marburg, Member of the German Center for Lung Research (DZL), Marburg, Germany
7
Kaplan A, Hardjojo A, Yu S, Price D. Asthma Across Age: Insights From Primary Care. Front Pediatr 2019; 7:162. [PMID: 31131265 PMCID: PMC6510260 DOI: 10.3389/fped.2019.00162] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 11/29/2018] [Accepted: 04/08/2019] [Indexed: 11/16/2022] [Open access]
Abstract
Asthma is a heterogeneous disease comprising multiple phenotypes and affects patients from childhood up to old age. In this review, we summarize the current knowledge on the similarities and differences in asthma across different age-groups, with emphasis on the perspective from primary care. Despite the similar disease presentation, phenotyping studies showed that there are differences in the distribution of phenotypes of asthma presenting in childhood compared to that in adulthood. Whereas asthma with an early age of onset tends to be of the atopic phenotype, the disease shifts toward the non-atopic phenotypes at later ages. Studies within primary care patients aiming to elucidate risk factors for future asthma exacerbation have shown pediatric and elderly patients to be at higher risk for future asthma attacks compared to other adult patients. Regardless, both pediatric and adult studies demonstrated that previous asthma episodes and severity, along with high blood eosinophil counts, predict subsequent asthma attacks. Differences in childhood and adult asthma are not limited to the underlying phenotypes but also extend to the challenges in the diagnosis, treatment, and management of the disease. Diagnosis of asthma is complicated by age-specific differential diagnoses such as infectious wheezing and nasal obstruction in children, and aging-related problems such as heart disease and obesity in the elderly. There are also age-related issues leading to decreased disease control, such as non-adherence, tobacco use, difficulty in using inhalers and corticosteroid-related side effects, which hinder asthma control in different patient age-groups. Several clinical guidelines are available to guide the diagnosis and drug prescription of asthma in pediatric patients. However, there are conflicting recommendations for the diagnostic tools and treatment for pediatric patients, posing additional challenges for primary care physicians in working with multiple guidelines. While tools such as spirometry and peak flow variability are often available in primary care, their usage in preschool patients is not consistently recommended. FeNO measurement may be a valuable non-invasive tool which can be adopted by primary care physicians to assist asthma diagnosis in preschool-age patients.
Affiliation(s)
- Alan Kaplan
- Department of Family and Community Medicine, University of Toronto, Toronto, ON, Canada
- Antony Hardjojo
- Observational and Pragmatic Research Institute, Singapore, Singapore
- Shaylynn Yu
- Observational and Pragmatic Research Institute, Singapore, Singapore
- David Price
- Observational and Pragmatic Research Institute, Singapore, Singapore; Division of Applied Health Sciences, Centre of Academic Primary Care, University of Aberdeen, Aberdeen, United Kingdom; Optimum Patient Care, Cambridge, United Kingdom