1
Expression of Concern: Hybrid feature selection and classification technique for early prediction and severity of diabetes type 2. PLoS One 2025; 20:e0324149. [PMID: 40338832 PMCID: PMC12061119 DOI: 10.1371/journal.pone.0324149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2025] Open
2
Wilhelm C, Steckelberg A, Rebitschek FG. Benefits and harms associated with the use of AI-related algorithmic decision-making systems by healthcare professionals: a systematic review. The Lancet Regional Health. Europe 2025; 48:101145. [PMID: 39687669 PMCID: PMC11648885 DOI: 10.1016/j.lanepe.2024.101145] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/13/2024] [Revised: 11/06/2024] [Accepted: 11/08/2024] [Indexed: 12/18/2024]
Abstract
Background: Despite notable advancements in artificial intelligence (AI) that enable complex systems to perform certain tasks more accurately than medical experts, the impact on patient-relevant outcomes remains uncertain. To address this gap, this systematic review assesses the benefits and harms associated with AI-related algorithmic decision-making (ADM) systems used by healthcare professionals, compared to standard care.
Methods: In accordance with the PRISMA guidelines, we included interventional and observational studies published as peer-reviewed full-text articles that met the following criteria: human patients; interventions involving algorithmic decision-making systems, developed with and/or utilizing machine learning (ML); and outcomes describing patient-relevant benefits and harms that directly affect health and quality of life, such as mortality and morbidity. Studies that did not undergo preregistration, lacked a standard-of-care control, or pertained to systems that assist in the execution of actions (e.g., in robotics) were excluded. We searched MEDLINE, EMBASE, IEEE Xplore, and Google Scholar for studies published in the past decade up to 31 March 2024. We assessed risk of bias using Cochrane's RoB 2 and ROBINS-I tools, and reporting transparency with CONSORT-AI and TRIPOD-AI. Two researchers independently managed the processes and resolved conflicts through discussion. This review has been registered with PROSPERO (CRD42023412156) and the study protocol has been published.
Findings: Out of 2,582 records identified after deduplication, 18 randomized controlled trials (RCTs) and one cohort study met the inclusion criteria, covering specialties such as psychiatry, oncology, and internal medicine. Collectively, the studies included a median of 243 patients (IQR 124-828), with a median of 50.5% female participants (range 12.5-79.0, IQR 43.6-53.6) across intervention and control groups. Four studies were classified as having low risk of bias, seven showed some concerns, and another seven were assessed as having high or serious risk of bias. Reporting transparency varied considerably: six studies showed high compliance, four moderate, and five low compliance with CONSORT-AI or TRIPOD-AI. Twelve studies (63%) reported patient-relevant benefits. Of those with low risk of bias, interventions reduced length of stay in hospital and intensive care unit (10.3 vs. 13.0 days, p = 0.042; 6.3 vs. 8.4 days, p = 0.030), in-hospital mortality (9.0% vs. 21.3%, p = 0.018), and depression symptoms in non-complex cases (45.1% vs. 52.3%, p = 0.03). However, harms were frequently underreported, with only eight studies (42%) documenting adverse events. No study reported an increase in adverse events as a result of the interventions.
Interpretation: The current evidence on AI-related ADM systems provides limited insights into patient-relevant outcomes. Our findings underscore the essential need for rigorous evaluations of clinical benefits, reinforced compliance with methodological standards, and balanced consideration of both benefits and harms to ensure meaningful integration into healthcare practice.
Funding: This study did not receive any funding.
Affiliation(s)
- Christoph Wilhelm
  - International Graduate Academy (InGrA), Institute of Health and Nursing Science, Medical Faculty, Martin Luther University Halle-Wittenberg, Magdeburger Str. 8, Halle (Saale) 06112, Germany
  - Harding Center for Risk Literacy, Faculty of Health Sciences Brandenburg, University of Potsdam, Virchowstr. 2, Potsdam 14482, Germany
- Anke Steckelberg
  - Institute of Health and Nursing Science, Medical Faculty, Martin Luther University Halle-Wittenberg, Magdeburger Str. 8, Halle (Saale) 06112, Germany
- Felix G. Rebitschek
  - Harding Center for Risk Literacy, Faculty of Health Sciences Brandenburg, University of Potsdam, Virchowstr. 2, Potsdam 14482, Germany
  - Max Planck Institute for Human Development, Lentzeallee 94, Berlin 14195, Germany
3
Wändell P, Carlsson AC, Wierzbicka M, Sigurdsson K, Ärnlöv J, Eriksson J, Wachtler C, Ruge T. A machine learning tool for identifying patients with newly diagnosed diabetes in primary care. Prim Care Diabetes 2024; 18:501-505. [PMID: 38944562 DOI: 10.1016/j.pcd.2024.06.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Revised: 06/24/2024] [Accepted: 06/25/2024] [Indexed: 07/01/2024]
Abstract
BACKGROUND AND AIM: Identifying a diabetes diagnosis early is crucial. The aim was to create a predictive model using machine learning (ML) to identify new cases of diabetes in primary health care (PHC).
METHODS: A case-control study utilizing data on PHC visits for sex-, age-, and PHC-matched controls. Stochastic gradient boosting was used to construct a model for predicting cases of diabetes based on diagnostic codes from PHC consultations during the year before the index (diagnosis) date and on the number of consultations. Variable importance was estimated using the normalized relative influence (NRI) score. Risks of having diabetes were calculated using odds ratios of marginal effects (ORME). Four groups by age and sex were studied: men and women aged 35-64 years and ≥ 65 years.
RESULTS: The most important predictive factors were hypertension, with NRI 21.4-29.7%, and obesity, with NRI 4.8-15.2%. The NRI for other top ten diagnoses and administrative codes generally ranged from 1.0% to 4.2%.
CONCLUSIONS: Our data confirm the known risk patterns for predicting a new diagnosis of diabetes, and the need to test blood glucose frequently. To assess the full potential of ML for risk prediction purposes in clinical practice, future studies could include clinical data on life-style patterns, laboratory tests and prescribed medication.
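The normalized relative influence (NRI) figures in this abstract express each predictor's share of the total influence in the boosted model. The following is a minimal sketch, assuming the normalization simply rescales raw per-feature influence scores so that they sum to 100%. The function name and the input values are hypothetical and are not taken from the study.

```python
def normalized_relative_influence(raw_influence):
    """Rescale raw per-feature influence scores so they sum to 100 (percent)."""
    total = sum(raw_influence.values())
    if total <= 0:
        raise ValueError("total influence must be positive")
    return {name: 100.0 * value / total for name, value in raw_influence.items()}


# Hypothetical raw scores, for illustration only (not data from the study).
raw = {"hypertension": 5.0, "obesity": 2.0, "other_codes": 13.0}
nri = normalized_relative_influence(raw)
print(nri)
```

Read this way, an NRI of 21.4% for hypertension would mean that roughly one fifth of the model's total influence is attributed to that diagnosis code.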
Affiliation(s)
- Per Wändell
  - Department of Neurobiology, Care Sciences and Society, Division of Family Medicine and Primary Care, Karolinska Institutet, Huddinge, Sweden
- Axel C Carlsson
  - Department of Neurobiology, Care Sciences and Society, Division of Family Medicine and Primary Care, Karolinska Institutet, Huddinge, Sweden; Academic Primary Health Care Centre, Region Stockholm, Stockholm, Sweden
- Marcelina Wierzbicka
  - Department of Emergency and Internal Medicine, Skånes University Hospital, Malmö, Sweden; Department of Clinical Sciences Malmö, Lund University & Department of Internal Medicine, Skåne, Sweden
- Karolina Sigurdsson
  - Department of Emergency and Internal Medicine, Skånes University Hospital, Malmö, Sweden; Department of Clinical Sciences Malmö, Lund University & Department of Internal Medicine, Skåne, Sweden
- Johan Ärnlöv
  - Department of Neurobiology, Care Sciences and Society, Division of Family Medicine and Primary Care, Karolinska Institutet, Huddinge, Sweden; School of Health and Social Studies, Dalarna University, Falun, Sweden
- Julia Eriksson
  - Division of Biostatistics, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
- Caroline Wachtler
  - Department of Neurobiology, Care Sciences and Society, Division of Family Medicine and Primary Care, Karolinska Institutet, Huddinge, Sweden
- Toralph Ruge
  - Department of Emergency and Internal Medicine, Skånes University Hospital, Malmö, Sweden; Department of Clinical Sciences Malmö, Lund University & Department of Internal Medicine, Skåne, Sweden
4
Wilhelm C, Steckelberg A, Rebitschek FG. Is artificial intelligence for medical professionals serving the patients? : Protocol for a systematic review on patient-relevant benefits and harms of algorithmic decision-making. Syst Rev 2024; 13:228. [PMID: 39242544 PMCID: PMC11378383 DOI: 10.1186/s13643-024-02646-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Accepted: 08/22/2024] [Indexed: 09/09/2024] Open
Abstract
BACKGROUND: Algorithmic decision-making (ADM) utilises algorithms to collect and process data and develop models to make or support decisions. Advances in artificial intelligence (AI) have led to the development of support systems that can be superior to medical professionals without AI support in certain tasks. However, whether patients can benefit from this remains unclear. The aim of this systematic review is to assess the current evidence on patient-relevant benefits and harms, such as improved survival rates and reduced treatment-related complications, when healthcare professionals use ADM systems (developed using or working with AI) compared to healthcare professionals without AI-related ADM (standard care), regardless of the clinical issues.
METHODS: Following the PRISMA statement, MEDLINE and PubMed (via PubMed), Embase (via Elsevier) and IEEE Xplore will be searched using English free text terms in title/abstract, Medical Subject Headings (MeSH) terms and Embase Subject Headings (Emtree fields). Additional studies will be identified by contacting authors of included studies and through reference lists of included studies. Grey literature searches will be conducted in Google Scholar. Risk of bias will be assessed by using Cochrane's RoB 2 for randomised trials and ROBINS-I for non-randomised trials. Transparent reporting of the included studies will be assessed using the CONSORT-AI extension statement. Two researchers will screen, assess and extract from the studies independently, with a third in case of conflicts that cannot be resolved by discussion.
DISCUSSION: It is expected that there will be a substantial shortage of suitable studies that compare healthcare professionals with and without ADM systems concerning patient-relevant endpoints. This can be attributed to the prioritisation of technical quality criteria and, in some cases, clinical parameters over patient-relevant endpoints in the development of study designs. Furthermore, it is anticipated that a significant portion of the identified studies will exhibit relatively poor methodological quality and provide only limited generalisable results.
SYSTEMATIC REVIEW REGISTRATION: This study is registered within PROSPERO (CRD42023412156).
Affiliation(s)
- Christoph Wilhelm
  - Institute of Health and Nursing Sciences, Martin Luther University Halle-Wittenberg, Magdeburger Str. 8, Halle, 06112, Germany
  - Harding Center for Risk Literacy, Faculty of Health Sciences Brandenburg, University of Potsdam, Virchowstr. 2, Potsdam, 14482, Germany
- Anke Steckelberg
  - Institute of Health and Nursing Sciences, Martin Luther University Halle-Wittenberg, Magdeburger Str. 8, Halle, 06112, Germany
- Felix G Rebitschek
  - Harding Center for Risk Literacy, Faculty of Health Sciences Brandenburg, University of Potsdam, Virchowstr. 2, Potsdam, 14482, Germany
  - Max Planck Institute for Human Development, Lentzeallee 94, Berlin, 14195, Germany
5
Askar M, Tafavvoghi M, Småbrekke L, Bongo LA, Svendsen K. Using machine learning methods to predict all-cause somatic hospitalizations in adults: A systematic review. PLoS One 2024; 19:e0309175. [PMID: 39178283 PMCID: PMC11343463 DOI: 10.1371/journal.pone.0309175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 08/06/2024] [Indexed: 08/25/2024] Open
Abstract
AIM: In this review, we investigated how Machine Learning (ML) was utilized to predict all-cause somatic hospital admissions and readmissions in adults.
METHODS: We searched eight databases (PubMed, Embase, Web of Science, CINAHL, ProQuest, OpenGrey, WorldCat, and MedNar) from their inception date to October 2023, and included records that predicted all-cause somatic hospital admissions and readmissions of adults using ML methodology. We used the CHARMS checklist for data extraction, PROBAST for bias and applicability assessment, and TRIPOD for reporting quality.
RESULTS: We screened 7,543 studies, of which 163 full-text records were read and 116 met the review inclusion criteria. Among these, 45 predicted admission, 70 predicted readmission, and one study predicted both. There was a substantial variety in the types of datasets, algorithms, features, data preprocessing steps, evaluation, and validation methods. The most used types of features were demographics, diagnoses, vital signs, and laboratory tests. Area Under the ROC curve (AUC) was the most used evaluation metric. Models trained using boosting tree-based algorithms often performed better than others. ML algorithms commonly outperformed traditional regression techniques. Sixteen studies used Natural language processing (NLP) of clinical notes for prediction; all of them yielded good results. The overall adherence to reporting quality was poor in the review studies. Only five percent of models were implemented in clinical practice. The most frequently inadequately addressed methodological aspects were: providing model interpretations on the individual patient level, full code availability, performing external validation, calibrating models, and handling class imbalance.
CONCLUSION: This review has identified considerable concerns regarding methodological issues and reporting quality in studies investigating ML to predict hospitalizations. To ensure the acceptability of these models in clinical settings, it is crucial to improve the quality of future studies.
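Since the abstract names the area under the ROC curve (AUC) as the most used evaluation metric, a short illustration of what it measures may help. This is a minimal sketch, assuming binary labels (1 = hospitalized, 0 = not) and model scores where higher means higher predicted risk. The function name and the toy data are hypothetical and do not come from any reviewed study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored above a
    random negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 0.75: three of the four positive/negative pairs are ranked correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for a demonstration; library implementations typically use a sort-based equivalent.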
Affiliation(s)
- Mohsen Askar
  - Faculty of Health Sciences, Department of Pharmacy, UiT-The Arctic University of Norway, Tromsø, Norway
- Masoud Tafavvoghi
  - Faculty of Science and Technology, Department of Computer Science, UiT-The Arctic University of Norway, Tromsø, Norway
- Lars Småbrekke
  - Faculty of Health Sciences, Department of Pharmacy, UiT-The Arctic University of Norway, Tromsø, Norway
- Lars Ailo Bongo
  - Faculty of Science and Technology, Department of Computer Science, UiT-The Arctic University of Norway, Tromsø, Norway
- Kristian Svendsen
  - Faculty of Health Sciences, Department of Pharmacy, UiT-The Arctic University of Norway, Tromsø, Norway
6
Kamel Rahimi A, Pienaar O, Ghadimi M, Canfell OJ, Pole JD, Shrapnel S, van der Vegt AH, Sullivan C. Implementing AI in Hospitals to Achieve a Learning Health System: Systematic Review of Current Enablers and Barriers. J Med Internet Res 2024; 26:e49655. [PMID: 39094106 PMCID: PMC11329852 DOI: 10.2196/49655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 02/08/2024] [Accepted: 05/22/2024] [Indexed: 08/04/2024] Open
Abstract
BACKGROUND: Efforts are underway to capitalize on the computational power of the data collected in electronic medical records (EMRs) to achieve a learning health system (LHS). Artificial intelligence (AI) in health care has promised to improve clinical outcomes, and many researchers are developing AI algorithms on retrospective data sets. Integrating these algorithms with real-time EMR data is rare. There is a poor understanding of the current enablers and barriers to empower this shift from data set-based use to real-time implementation of AI in health systems. Exploring these factors holds promise for uncovering actionable insights toward the successful integration of AI into clinical workflows.
OBJECTIVE: The first objective was to conduct a systematic literature review to identify the evidence of enablers and barriers regarding the real-world implementation of AI in hospital settings. The second objective was to map the identified enablers and barriers to a 3-horizon framework to enable the successful digital health transformation of hospitals to achieve an LHS.
METHODS: The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines were adhered to. PubMed, Scopus, Web of Science, and IEEE Xplore were searched for studies published between January 2010 and January 2022. Articles with case studies and guidelines on the implementation of AI analytics in hospital settings using EMR data were included. We excluded studies conducted in primary and community care settings. Quality assessment of the identified papers was conducted using the Mixed Methods Appraisal Tool and ADAPTE frameworks. We coded evidence from the included studies that related to enablers of and barriers to AI implementation. The findings were mapped to the 3-horizon framework to provide a road map for hospitals to integrate AI analytics.
RESULTS: Of the 1247 studies screened, 26 (2.09%) met the inclusion criteria. In total, 65% (17/26) of the studies implemented AI analytics for enhancing the care of hospitalized patients, whereas the remaining 35% (9/26) provided implementation guidelines. Of the final 26 papers, the quality of 21 (81%) was assessed as poor. A total of 28 enablers was identified; 8 (29%) were new in this study. A total of 18 barriers was identified; 5 (28%) were newly found. Most of these newly identified factors were related to information and technology. Actionable recommendations for the implementation of AI toward achieving an LHS were provided by mapping the findings to a 3-horizon framework.
CONCLUSIONS: Significant issues exist in implementing AI in health care. Shifting from validating data sets to working with live data is challenging. This review incorporated the identified enablers and barriers into a 3-horizon framework, offering actionable recommendations for implementing AI analytics to achieve an LHS. The findings of this study can assist hospitals in steering their strategic planning toward successful adoption of AI.
Affiliation(s)
- Amir Kamel Rahimi
  - Queensland Digital Health Centre, Faculty of Medicine, The University of Queensland, Brisbane, Australia
  - Digital Health Cooperative Research Centre, Australian Government, Sydney, Australia
- Oliver Pienaar
  - The School of Mathematics and Physics, The University of Queensland, Brisbane, Australia
- Moji Ghadimi
  - The School of Mathematics and Physics, The University of Queensland, Brisbane, Australia
- Oliver J Canfell
  - Queensland Digital Health Centre, Faculty of Medicine, The University of Queensland, Brisbane, Australia
  - Digital Health Cooperative Research Centre, Australian Government, Sydney, Australia
  - Business School, The University of Queensland, Brisbane, Australia
  - Department of Nutritional Sciences, Faculty of Life Sciences and Medicine, King's College London, London, United Kingdom
- Jason D Pole
  - Queensland Digital Health Centre, Faculty of Medicine, The University of Queensland, Brisbane, Australia
  - Dalla Lana School of Public Health, The University of Toronto, Toronto, ON, Canada
  - ICES, Toronto, ON, Canada
- Sally Shrapnel
  - Queensland Digital Health Centre, Faculty of Medicine, The University of Queensland, Brisbane, Australia
  - The School of Mathematics and Physics, The University of Queensland, Brisbane, Australia
- Anton H van der Vegt
  - Queensland Digital Health Centre, Faculty of Medicine, The University of Queensland, Brisbane, Australia
- Clair Sullivan
  - Queensland Digital Health Centre, Faculty of Medicine, The University of Queensland, Brisbane, Australia
  - Metro North Hospital and Health Service, Department of Health, Queensland Government, Brisbane, Australia
7
Vyas PK, Brandon K, Gephart SM. A Scoping Review of Studies Using Artificial Intelligence Identifying Optimal Practice Patterns for Inpatients With Type 2 Diabetes That Lead to Positive Healthcare Outcomes. Comput Inform Nurs 2024; 42:396-402. [PMID: 39248450 DOI: 10.1097/cin.0000000000001143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/10/2024]
Abstract
The objective of this scoping review was to survey the literature on the use of AI/ML applications in analyzing inpatient EHR data to identify bundles of care (groupings of interventions). If evidence suggested AI/ML models could determine bundles, the review aimed to explore whether implementing these interventions as bundles reduced practice pattern variance and positively impacted patient care outcomes for inpatients with T2DM. Six databases were searched for articles published from January 1, 2000, to January 1, 2024. Nine studies met criteria and were summarized by aims, outcome measures, clinical or practice implications, AI/ML model types, study variables, and AI/ML model outcomes. A variety of AI/ML models were used. Multiple data sources were leveraged to train the models, resulting in varying impacts on practice patterns and outcomes. Studies included aims across 4 thematic areas to address: therapeutic patterns of care, analysis of treatment pathways and their constraints, dashboard development for clinical decision support, and medication optimization and prescription pattern mining. Multiple disparate data sources (i.e., prescription payment data) were leveraged outside of those traditionally available within EHR databases. Notably missing was the use of holistic multidisciplinary data (i.e., nursing and ancillary) to train AI/ML models. AI/ML can assist in identifying the appropriateness of specific interventions to manage diabetic care and support adherence to efficacious treatment pathways if the appropriate data are incorporated into AI/ML design. Additional data sources beyond the EHR are needed to provide more complete data to develop AI/ML models that effectively discern meaningful clinical patterns. Further study is needed to better address nursing care using AI/ML to support effective inpatient diabetes management.
Affiliation(s)
- Pankaj K Vyas
  - Author Affiliations: The University of Arizona College of Nursing, Tucson, AZ (Mr Vyas, Ms Brandon, and Dr Gephart); San Antonio Regional Hospital, Upland, CA (Mr Vyas)
8
Wright AP, Embi PJ, Nelson SD, Smith JC, Turchin A, Mize DE. Development and Validation of Inpatient Hypoglycemia Models Centered Around the Insulin Ordering Process. J Diabetes Sci Technol 2024; 18:423-429. [PMID: 36047538 PMCID: PMC10973866 DOI: 10.1177/19322968221119788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
BACKGROUND: The insulin ordering process is an opportunity to provide clinicians with hypoglycemia risk predictions, but few hypoglycemia models centered around the insulin ordering process exist.
METHODS: We used data on adult patients, admitted in 2019 to non-ICU floors of a large teaching hospital, who had orders for subcutaneous insulin. Our outcome was hypoglycemia, defined as a blood glucose (BG) <70 mg/dL within 24 hours after ordering insulin. We trained and evaluated models to predict hypoglycemia at the time of placing an insulin order, using logistic regression, random forest, and extreme gradient boosting (XGBoost). We compared performance using area under the receiver operating characteristic curve (AUCs) and precision-recall curves. We determined recall at our goal precision of 0.30.
RESULTS: Of 21 052 included insulin orders, 1839 (9%) were followed by a hypoglycemic event within 24 hours. Logistic regression, random forest, and XGBoost models had AUCs of 0.81, 0.80, and 0.79, and recall of 0.44, 0.49, and 0.32, respectively. The most significant predictor was the lowest BG value in the 24 hours preceding the order. Predictors related to the insulin order being placed at the time of the prediction were useful to the model but less important than the patient's history of BG values over time.
CONCLUSIONS: Hypoglycemia within the next 24 hours can be predicted at the time an insulin order is placed, providing an opportunity to integrate decision support into the medication ordering process to make insulin therapy safer.
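The abstract reports recall at a goal precision of 0.30. One plausible way to read that metric is sketched below: sweep every candidate score threshold, keep those whose precision meets the goal, and report the highest recall among them. This is an assumption about the procedure, not the authors' published code, and the function name and toy data are hypothetical.

```python
def recall_at_precision(labels, scores, target_precision):
    """Highest recall over all score thresholds whose precision meets the target.

    Returns 0.0 if no threshold reaches the target precision.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    best_recall = 0.0
    for threshold in sorted(set(scores)):
        # Every case scored at or above the threshold is flagged as positive.
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        precision = tp / (tp + fp)  # tp + fp >= 1 because threshold is an observed score
        if precision >= target_precision:
            best_recall = max(best_recall, tp / total_pos)
    return best_recall


# Toy data, for illustration only: 1 = hypoglycemic event followed the order.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]
print(recall_at_precision(labels, scores, 0.30))
print(recall_at_precision(labels, scores, 0.80))
```

Fixing precision first reflects an alerting trade-off: the goal caps the share of false alarms clinicians would see, and recall then shows how many true events are still caught under that cap.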
Affiliation(s)
- Aileen P. Wright
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Peter J. Embi
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Scott D. Nelson
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Joshua C. Smith
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Alexander Turchin
  - Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Dara E. Mize
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
9
Haque MA, Gedara MLB, Nickel N, Turgeon M, Lix LM. The validity of electronic health data for measuring smoking status: a systematic review and meta-analysis. BMC Med Inform Decis Mak 2024; 24:33. [PMID: 38308231 PMCID: PMC10836023 DOI: 10.1186/s12911-024-02416-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Accepted: 01/03/2024] [Indexed: 02/04/2024] Open
Abstract
BACKGROUND: Smoking is a risk factor for many chronic diseases. Multiple smoking status ascertainment algorithms have been developed for population-based electronic health databases such as administrative databases and electronic medical records (EMRs). Evidence syntheses of algorithm validation studies have often focused on chronic diseases rather than risk factors. We conducted a systematic review and meta-analysis of smoking status ascertainment algorithms to describe the characteristics and validity of these algorithms.
METHODS: The Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines were followed. We searched articles published from 1990 to 2022 in EMBASE, MEDLINE, Scopus, and Web of Science with key terms such as validity, administrative data, electronic health records, smoking, and tobacco use. The extracted information, including article characteristics, algorithm characteristics, and validity measures, was descriptively analyzed. Sources of heterogeneity in validity measures were estimated using a meta-regression model. Risk of bias (ROB) in the reviewed articles was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 tool.
RESULTS: The initial search yielded 2086 articles; 57 were selected for review and 116 algorithms were identified. Almost three-quarters (71.6%) of algorithms were based on EMR data. The algorithms were primarily constructed using diagnosis codes for smoking-related conditions, although prescription medication codes for smoking treatments were also adopted. About half of the algorithms were developed using machine-learning models. The pooled estimates of positive predictive value, sensitivity, and specificity were 0.843, 0.672, and 0.918, respectively. Algorithm sensitivity and specificity were highly variable and ranged from 3 to 100% and 36 to 100%, respectively. Model-based algorithms had significantly greater sensitivity (p = 0.006) than rule-based algorithms. Algorithms for EMR data had higher sensitivity than algorithms for administrative data (p = 0.001). The ROB was low in most of the articles (76.3%) that underwent the assessment.
CONCLUSIONS: Multiple algorithms using different data sources and methods have been proposed to ascertain smoking status in electronic health data. Many algorithms had low sensitivity and positive predictive value, but the data source influenced their validity. Algorithms based on machine-learning models for multiple linked data sources have improved validity.
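The validity measures pooled in this abstract (positive predictive value, sensitivity, and specificity) all derive from a 2x2 comparison of an algorithm's output against a reference standard. The sketch below shows the standard definitions for a single algorithm. The function name and the counts are hypothetical and are not data from the review, and the meta-analytic pooling itself is not shown.

```python
def validity_measures(tp, fp, fn, tn):
    """Standard validity measures from 2x2 counts against a reference standard.

    tp: smokers correctly flagged      fp: non-smokers wrongly flagged
    fn: smokers missed                 tn: non-smokers correctly not flagged
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true smokers detected
        "specificity": tn / (tn + fp),  # share of true non-smokers cleared
        "ppv": tp / (tp + fp),          # share of flagged records that are true smokers
    }


# Hypothetical counts, for illustration only.
measures = validity_measures(tp=80, fp=20, fn=40, tn=860)
print(measures)
```

With these counts, the algorithm detects two thirds of smokers while four fifths of its positive flags are correct, the same pattern of high specificity with lower sensitivity that the pooled estimates describe.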
Affiliation(s)
- Md Ashiqul Haque
  - Department of Community Health Sciences, University of Manitoba, Winnipeg, MB, Canada
- Nathan Nickel
  - Department of Community Health Sciences, University of Manitoba, Winnipeg, MB, Canada
- Maxime Turgeon
  - Department of Statistics, University of Manitoba, Winnipeg, MB, Canada
- Lisa M Lix
  - Department of Community Health Sciences, University of Manitoba, Winnipeg, MB, Canada
10
Zrubka Z, Kertész G, Gulácsi L, Czere J, Hölgyesi Á, Nezhad HM, Mosavi A, Kovács L, Butte AJ, Péntek M. The Reporting Quality of Machine Learning Studies on Pediatric Diabetes Mellitus: Systematic Review. J Med Internet Res 2024; 26:e47430. [PMID: 38241075 PMCID: PMC10837761 DOI: 10.2196/47430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 04/29/2023] [Accepted: 11/17/2023] [Indexed: 01/23/2024] Open
Abstract
BACKGROUND: Diabetes mellitus (DM) is a major health concern among children with the widespread adoption of advanced technologies. However, concerns are growing about the transparency, replicability, biasedness, and overall validity of artificial intelligence studies in medicine.
OBJECTIVE: We aimed to systematically review the reporting quality of machine learning (ML) studies of pediatric DM using the Minimum Information About Clinical Artificial Intelligence Modelling (MI-CLAIM) checklist, a general reporting guideline for medical artificial intelligence studies.
METHODS: We searched the PubMed and Web of Science databases from 2016 to 2020. Studies were included if the use of ML was reported in children with DM aged 2 to 18 years, including studies on complications, screening studies, and in silico samples. In studies following the ML workflow of training, validation, and testing of results, reporting quality was assessed via MI-CLAIM by consensus judgments of independent reviewer pairs. Positive answers to the 17 binary items regarding sufficient reporting were qualitatively summarized and counted as a proxy measure of reporting quality. The synthesis of results included testing the association of reporting quality with publication and data type, participants (human or in silico), research goals, level of code sharing, and the scientific field of publication (medical or engineering), as well as with expert judgments of clinical impact and reproducibility.
RESULTS: After screening 1043 records, 28 studies were included. The sample size of the training cohort ranged from 5 to 561. Six studies featured only in silico patients. The reporting quality was low, with great variation among the 21 studies assessed using MI-CLAIM. The number of items with sufficient reporting ranged from 4 to 12 (mean 7.43, SD 2.62). The items on research questions and data characterization were reported adequately most often, whereas items on patient characteristics and model examination were reported adequately least often. The representativeness of the training and test cohorts to real-world settings and the adequacy of model performance evaluation were the most difficult to judge. Reporting quality improved over time (r=0.50; P=.02); it was higher than average in prognostic biomarker and risk factor studies (P=.04) and lower in noninvasive hypoglycemia detection studies (P=.006), higher in studies published in medical versus engineering journals (P=.004), and higher in studies sharing any code of the ML pipeline versus not sharing (P=.003). The association between expert judgments and MI-CLAIM ratings was not significant.
CONCLUSIONS: The reporting quality of ML studies in the pediatric population with DM was generally low. Important details for clinicians, such as patient characteristics; comparison with the state-of-the-art solution; and model examination for valid, unbiased, and robust results, were often the weak points of reporting. To assess their clinical utility, the reporting standards of ML studies must evolve, and algorithms for this challenging population must become more transparent and replicable.
Affiliation(s)
- Zsombor Zrubka
  - HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Budapest, Hungary
- Gábor Kertész
  - John von Neumann Faculty of Informatics, Óbuda University, Budapest, Hungary
- László Gulácsi
  - HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Budapest, Hungary
- János Czere
  - Doctoral School of Innovation Management, Óbuda University, Budapest, Hungary
- Áron Hölgyesi
  - HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Budapest, Hungary
  - Doctoral School of Molecular Medicine, Semmelweis University, Budapest, Hungary
- Hossein Motahari Nezhad
  - HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Budapest, Hungary
  - Doctoral School of Business and Management, Corvinus University of Budapest, Budapest, Hungary
- Amir Mosavi
  - John von Neumann Faculty of Informatics, Óbuda University, Budapest, Hungary
- Levente Kovács
  - Physiological Controls Research Center, University Research and Innovation Center, Óbuda University, Budapest, Hungary
- Atul J Butte
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, United States
- Márta Péntek
  - HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Budapest, Hungary