Reference Citation Analysis

For: Andaur Navarro CL, Damen JAA, Takada T, Nijman SWJ, Dhiman P, Ma J, Collins GS, Bajpai R, Riley RD, Moons KGM, Hooft L. Risk of bias in studies on prediction models developed using supervised machine learning techniques: systematic review. BMJ 2021;375:n2281. [PMID: 34670780 PMCID: PMC8527348 DOI: 10.1136/bmj.n2281] [Citation(s) in RCA: 88] [Impact Index Per Article: 29.3] [Accepted: 09/13/2021] [Indexed: 12/23/2022]

Cited by Other Article(s)

Sinaci AA, Gencturk M, Alvarez-Romero C, Laleci Erturkmen GB, Martinez-Garcia A, Escalona-Cuaresma MJ, Parra-Calderon CL. Privacy-preserving federated machine learning on FAIR health data: A real-world application. Comput Struct Biotechnol J 2024;24:136-145. [PMID: 38434250 PMCID: PMC10904920 DOI: 10.1016/j.csbj.2024.02.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 02/15/2024] [Accepted: 02/15/2024] [Indexed: 03/05/2024] Open

Abstract

Objective

This paper introduces a privacy-preserving federated machine learning (ML) architecture built upon Findable, Accessible, Interoperable, and Reusable (FAIR) health data. It aims to devise an architecture for executing classification algorithms in a federated manner, enabling collaborative model-building among health data owners without sharing their datasets.

Materials and methods

Using an agent-based architecture, a privacy-preserving federated ML algorithm was developed to create a global predictive model from various local models. This involved formally defining the algorithm in two steps (data preparation and federated model training on FAIR health data) and constructing the architecture with multiple components that facilitate algorithm execution. The solution was validated by five healthcare organizations using their specific health datasets.

Results

Five organizations transformed their datasets into Health Level 7 Fast Healthcare Interoperability Resources via a common FAIRification workflow and software set, thereby generating FAIR datasets. Each organization deployed a Federated ML Agent within its secure network, connected to a cloud-based Federated ML Manager. System testing was conducted on a use case aiming to predict 30-day readmission risk for chronic obstructive pulmonary disease patients, and the federated model achieved an accuracy of 87%.

Discussion

The paper demonstrated a practical application of privacy-preserving federated ML among five distinct healthcare entities, highlighting the value of FAIR health data in machine learning when utilized in a federated manner that ensures privacy protection without sharing data.

Conclusion

This solution effectively leverages FAIR datasets from multiple healthcare organizations for federated ML while safeguarding sensitive health datasets, meeting legislative privacy and security requirements.
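The abstract describes building a global model from local models without sharing datasets, but it does not say how the local models are combined. As a rough illustration only, the sketch below assumes a FedAvg-style aggregation, in which each site sends its model weights and a sample count to a central manager. The function name, the weights, and the sample counts are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def federated_average(local_weights: Sequence[Sequence[float]],
                      sample_counts: Sequence[int]) -> List[float]:
    """Combine local weight vectors into one global vector.

    Each site's weights are weighted by its number of training samples.
    This FedAvg-style rule is an assumption made for illustration; the
    abstract does not state the aggregation rule the authors used.
    """
    if not local_weights or len(local_weights) != len(sample_counts):
        raise ValueError("need one sample count per weight vector")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(local_weights[0])
    if any(len(w) != dim for w in local_weights):
        raise ValueError("all weight vectors must have the same length")
    # Only weights and counts reach this function; raw records stay local.
    return [
        sum(w[i] * n for w, n in zip(local_weights, sample_counts)) / total
        for i in range(dim)
    ]


# Hypothetical example: three sites with made-up weights and sample counts.
site_weights = [[0.2, -1.0], [0.4, -0.8], [0.6, -0.6]]
site_counts = [100, 300, 100]
global_weights = federated_average(site_weights, site_counts)
print(global_weights)
```

In a setup like this, the larger site pulls the global weights toward its own local model, which is one reason sample counts are usually exchanged alongside the weights.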

Miyagi Y, Iwashima S. Prediction Models for Intravenous Immunoglobulin Non-Responders of Kawasaki Disease Using Machine Learning. Clin Drug Investig 2024:10.1007/s40261-024-01373-z. [PMID: 38869717 DOI: 10.1007/s40261-024-01373-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/27/2024] [Indexed: 06/14/2024]

Huang J, Yang J, Qi H, Xu M, Xu X, Zhu Y. Prediction models for amputation after diabetic foot: systematic review and critical appraisal. Diabetol Metab Syndr 2024;16:126. [PMID: 38858732 PMCID: PMC11163763 DOI: 10.1186/s13098-024-01360-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 05/24/2024] [Indexed: 06/12/2024] Open

Abstract

BACKGROUND

Numerous studies have developed or validated prediction models aimed at estimating the likelihood of amputation in diabetic foot (DF) patients. However, the quality and applicability of these models in clinical practice and future research remain uncertain. This study conducts a systematic review and assessment of the risk of bias and applicability of amputation prediction models among individuals with DF.

METHODS

A comprehensive search was conducted across multiple databases, including PubMed, Web of Science, EBSCO CINAHL Plus, Embase, Cochrane Library, China National Knowledge Infrastructure (CNKI), Wanfang, Chinese Biomedical Literature Database (CBM), and Weipu (VIP) from their inception to December 24, 2023. Two investigators independently screened the literature and extracted data using the checklist for critical appraisal and data extraction for systematic reviews of prediction modeling studies. The Prediction Model Risk of Bias Assessment Tool (PROBAST) checklist was employed to evaluate both the risk of bias and applicability.

RESULTS

A total of 20 studies were included in this analysis, comprising 17 development studies and three validation studies, encompassing 20 prediction models and 11 classification systems. The incidence of amputation in patients with DF ranged from 5.9 to 58.5%. Machine learning-based methods were employed in more than half of the studies. The reported area under the curve (AUC) varied from 0.560 to 0.939. Independent predictors consistently identified by multivariate models included age, gender, HbA1c, hemoglobin, white blood cell count, low-density lipoprotein cholesterol, diabetes duration, and Wagner's Classification. All studies were found to exhibit a high risk of bias, primarily attributed to inadequate handling of outcome events and missing data, lack of model performance assessment, and overfitting.

CONCLUSIONS

The assessment using PROBAST revealed a notable risk of bias in the existing prediction models for amputation in patients with DF. It is imperative for future studies to concentrate on enhancing the robustness of current prediction models or constructing new models with stringent methodologies.

Jia Y, Cui N, Jia T, Song J. Prognostic models for patients suffering a heart failure with a preserved ejection fraction: a systematic review. ESC Heart Fail 2024;11:1341-1351. [PMID: 38318693 PMCID: PMC11098651 DOI: 10.1002/ehf2.14696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Revised: 01/02/2024] [Accepted: 01/09/2024] [Indexed: 02/07/2024] Open

Abstract

The purpose of this study was to systematically review the development, performance, and applicability of prognostic models developed for predicting adverse events in patients with heart failure with preserved ejection fraction (HFpEF). Databases including Embase, PubMed, Web of Science Core Collection, the Cochrane Library, China National Knowledge Infrastructure, Wan Fang, Wei Pu, and China Biological Medicine were queried from their respective dates of inception to 1 June 2023, to examine multivariate models for prognostic prediction in HFpEF. Both forward and backward citations of all studies were included in our analysis. Two researchers independently extracted data using the Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS) checklist and assessed the quality of the models using the Prediction model Risk Of Bias Assessment Tool (PROBAST). Among the 6897 studies screened, 16 studies derived and/or validated a total of 39 prognostic models. The sample size ranges for model development, internal validation, and external validation were 119 to 5988, 152 to 1000, and 30 to 5957, respectively. The most frequently employed modelling technique was Cox proportional hazards regression. Six studies (37.50%) conducted internal validation of models; bootstrap and k-fold cross-validation were the commonly used methods for internal validation. Ten of these models (25.64%) were validated externally, with the reported c-statistic in the external validation set ranging from 0.70 to 0.96, while the remaining models await external validation. The MEDIA echo score and the I-PRESERVE sudden cardiac death prediction model have been externally validated using multiple cohorts, and the results consistently show good predictive performance.
The most frequently used predictors identified among the models were age, N-terminal pro-brain natriuretic peptide, ejection fraction, albumin, and hospital stay in the last 5 months owing to heart failure. The predictor and outcome domains of all studies were at low risk of bias, but all prognostic models were at high or unclear risk of bias overall because of underreporting in the analysis domain. None of the studies evaluated the clinical utility of the prognostic models. Prognostic models for patients with HFpEF showed good discriminatory ability, but their utility and generalizability remain uncertain because of the risk of bias, differences in predictors between models, and the lack of clinical application studies. Future studies should improve the methodological quality of model development and conduct external validation of models.

Alrawashdeh A, Alqahtani S, Alkhatib ZI, Kheirallah K, Melhem NY, Alwidyan M, Al-Dekah AM, Alshammari T, Nehme Z. Applications and Performance of Machine Learning Algorithms in Emergency Medical Services: A Scoping Review. Prehosp Disaster Med 2024:1-11. [PMID: 38757150 DOI: 10.1017/s1049023x24000414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2024]

Abstract

OBJECTIVE

The aim of this study was to summarize the literature on the applications of machine learning (ML) and their performance in Emergency Medical Services (EMS).

METHODS

Four relevant electronic databases were searched (from inception through January 2024) for all original studies that employed EMS-guided ML algorithms to enhance the clinical and operational performance of EMS. Two reviewers screened the retrieved studies and extracted relevant data from the included studies. The characteristics of included studies, employed ML algorithms, and their performance were quantitatively described across primary domains and subdomains.

RESULTS

This review included a total of 164 studies published from 2005 through 2024. Of those, 125 were clinical domain focused and 39 were operational. The characteristics of ML algorithms such as sample size, number and type of input features, and performance varied between and within domains and subdomains of applications. Clinical applications of ML algorithms involved triage or diagnosis classification (n = 62), treatment prediction (n = 12), or clinical outcome prediction (n = 50), mainly for out-of-hospital cardiac arrest/OHCA (n = 62), cardiovascular diseases/CVDs (n = 19), and trauma (n = 24). The performance of these ML algorithms varied, with a median area under the receiver operating characteristic curve (AUC) of 85.6%, accuracy of 88.1%, sensitivity of 86.05%, and specificity of 86.5%. Within the operational studies, the operational task of most ML algorithms was ambulance allocation (n = 21), followed by ambulance detection (n = 5), ambulance deployment (n = 5), route optimization (n = 5), and quality assurance (n = 3). The performance of all operational ML algorithms varied and had a median AUC of 96.1%, accuracy of 90.0%, sensitivity of 94.4%, and specificity of 87.7%. Generally, neural network and ensemble algorithms, to some degree, out-performed other ML algorithms.

CONCLUSION

Triaging and managing different prehospital medical conditions and augmenting ambulance performance can be improved by ML algorithms. Future reports should focus on a specific clinical condition or operational task to improve the precision of the performance metrics of ML models.

Cecchi R, Haja TM, Calabrò F, Fasterholdt I, Rasmussen BSB. Artificial intelligence in healthcare: why not apply the medico-legal method starting with the Collingridge dilemma? Int J Legal Med 2024;138:1173-1178. [PMID: 38172326 DOI: 10.1007/s00414-023-03152-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 12/15/2023] [Indexed: 01/05/2024]

Qian R, Zhuang J, Xie J, Cheng H, Ou H, Lu X, Ouyang Z. Predictive value of machine learning for the severity of acute pancreatitis: A systematic review and meta-analysis. Heliyon 2024;10:e29603. [PMID: 38655348 PMCID: PMC11035062 DOI: 10.1016/j.heliyon.2024.e29603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 04/02/2024] [Accepted: 04/10/2024] [Indexed: 04/26/2024] Open

Abstract

Background

Predicting the severity of acute pancreatitis (AP) early poses a challenge in clinical practice. While there are well-established clinical scoring tools, their actual predictive performance remains uncertain. Various studies have explored the application of machine-learning methods for early AP prediction. However, a more comprehensive evidence-based assessment is needed to determine their predictive accuracy. Hence, this systematic review and meta-analysis aimed to evaluate the predictive accuracy of machine learning in assessing the severity of AP.

Methods

PubMed, EMBASE, Cochrane Library, and Web of Science were systematically searched until December 5, 2023. The risk of bias in eligible studies was assessed using the Prediction Model Risk of Bias Assessment Tool (PROBAST). Subgroup analyses, based on different machine learning types, were performed. Additionally, the predictive accuracy of mainstream scoring tools was summarized.

Results

This systematic review ultimately included 33 original studies. The pooled c-index was 0.87 (95% CI: 0.84-0.89) in the training set and 0.88 (95% CI: 0.86-0.90) in the validation set. The sensitivity was 0.81 (95% CI: 0.77-0.84) in the training set and 0.79 (95% CI: 0.71-0.85) in the validation set. The specificity was 0.84 (95% CI: 0.78-0.89) in the training set and 0.90 (95% CI: 0.86-0.93) in the validation set. The most frequently used model was logistic regression; however, its predictive accuracy was found to be inferior to that of neural networks, random forests, and XGBoost. The pooled c-indices of the APACHE II, BISAP, and Ranson scores were 0.74 (95% CI: 0.68-0.80), 0.77 (95% CI: 0.70-0.85), and 0.74 (95% CI: 0.68-0.79), respectively.

Conclusions

Machine learning demonstrates excellent accuracy in predicting the severity of AP, providing a reference for updating or developing a straightforward clinical prediction tool.
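For readers less familiar with the metrics pooled in this abstract, the sketch below shows how sensitivity, specificity, and the c-index can be computed for a single binary-outcome model. For a binary outcome the c-index equals the area under the ROC curve: the share of (event, non-event) pairs in which the event received the higher predicted risk, with ties counted as one half. The labels, scores, and the 0.5 cut-off are made-up values for illustration, not data from the review.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    return tp / (tp + fn), tn / (tn + fp)


def c_index(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Concordance index for a binary outcome (equivalent to the AUC)."""
    events = [s for t, s in zip(y_true, scores) if t == 1]
    non_events = [s for t, s in zip(y_true, scores) if t == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event ranked above non-event
            elif e == n:
                concordant += 0.5   # ties count as half
    return concordant / (len(events) * len(non_events))


# Hypothetical toy data: 1 = severe case, 0 = non-severe case.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

sens, spec = sensitivity_specificity(y_true, y_pred)
cidx = c_index(y_true, scores)
print(round(sens, 3), round(spec, 3), round(cidx, 3))
```

Sensitivity and specificity depend on the chosen cut-off, whereas the c-index summarizes ranking across all cut-offs, which is why reviews typically report them separately.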

Ahmed MS, Hasan T, Islam S, Ahmed N. Investigating Rhythmicity in App Usage to Predict Depressive Symptoms: Protocol for Personalized Framework Development and Validation Through a Countrywide Study. JMIR Res Protoc 2024;13:e51540. [PMID: 38657238 PMCID: PMC11079771 DOI: 10.2196/51540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 12/27/2023] [Accepted: 01/11/2024] [Indexed: 04/26/2024] Open

Abstract

BACKGROUND

Understanding a student's depressive symptoms could facilitate significantly more precise diagnosis and treatment. However, few studies have focused on depressive symptom prediction through unobtrusive systems, and these studies are limited by small sample sizes, low performance, and high resource requirements. In addition, research has not explored whether statistically significant rhythms exist in different app usage behavioral markers (eg, app usage sessions) that could help detect subtle differences and predict symptoms with higher accuracy, as models based on rhythms of physiological data do.

OBJECTIVE

The main objective of this study is to explore whether there exist statistically significant rhythms in resource-insensitive app usage behavioral markers and predict depressive symptoms through these marker-based rhythmic features. Another objective of this study is to understand whether there is a potential link between rhythmic features and depressive symptoms.

METHODS

Through a countrywide study, we collected 2952 students' raw app usage behavioral data and responses to the 9 depressive symptoms in the 9-item Patient Health Questionnaire (PHQ-9). The behavioral data were retrieved through our developed app, which was previously used in our pilot studies in Bangladesh on different research problems. To explore whether there is a rhythm based on app usage data, we will conduct a zero-amplitude test. In addition, we will develop a cosinor model for each participant to extract rhythmic parameters (eg, acrophase). To obtain a more comprehensive picture of the rhythms, we will also explore nonparametric rhythmic features (eg, interdaily stability). Furthermore, we will conduct regression analysis to understand the association of rhythmic features with depressive symptoms. Finally, we will develop a personalized multitask learning (MTL) framework to predict symptoms through rhythmic features.

RESULTS

After applying inclusion criteria (eg, having app usage data of at least 2 days to explore rhythmicity), we kept the data of 2902 (98.31%) students for analysis, with 24.48 million app usage events, and 7 days' app usage of 2849 (98.17%) students. The students are from all 8 divisions of Bangladesh, both public and private universities (19 different universities and 52 different departments). We are analyzing the data and will publish the findings in a peer-reviewed publication.

CONCLUSIONS

Having an in-depth understanding of app usage rhythms and their connection with depressive symptoms through a countrywide study can significantly help health care professionals and researchers better understand depressed students and may create possibilities for using app usage-based rhythms for intervention. In addition, the MTL framework based on app usage rhythmic features may more accurately predict depressive symptoms due to the rhythms' capability to find subtle differences.

INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID)

DERR1-10.2196/51540.
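The methods mention fitting a cosinor model per participant to extract rhythmic parameters such as acrophase. A minimal sketch of a single-component cosinor fit is given below, assuming a 24-hour period and the common linearization y = M + b*cos(wt) + g*sin(wt), solved by ordinary least squares. The function names and the synthetic hourly series are hypothetical; the protocol's actual implementation, period choice, and zero-amplitude test are not reproduced here.

```python
import math
from typing import List, Sequence, Tuple


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small dense system a x = b (Gaussian elimination, pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_cosinor(times_h: Sequence[float], values: Sequence[float],
                period_h: float = 24.0) -> Tuple[float, float, float]:
    """Least-squares fit of y = M + A*cos(w*t + phi).

    Returns (mesor M, amplitude A, acrophase phi in radians).
    """
    omega = 2.0 * math.pi / period_h
    rows = [[1.0, math.cos(omega * t), math.sin(omega * t)] for t in times_h]
    # Normal equations (X^T X) coef = X^T y for the three linear coefficients.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, beta, gamma = solve_linear(xtx, xty)
    amplitude = math.hypot(beta, gamma)
    acrophase = math.atan2(-gamma, beta)  # beta = A*cos(phi), gamma = -A*sin(phi)
    return mesor, amplitude, acrophase


# Synthetic example: hourly values over 2 days, peaking at 15:00 (made-up data).
omega = 2.0 * math.pi / 24.0
times = [float(t) for t in range(48)]
values = [10.0 + 3.0 * math.cos(omega * (t - 15.0)) for t in times]

mesor, amplitude, acrophase = fit_cosinor(times, values)
peak_hour = (-acrophase / omega) % 24.0  # clock time of the fitted peak
print(round(mesor, 3), round(amplitude, 3), round(peak_hour, 3))
```

Real app usage data would be noisy and irregularly sampled, so the fitted parameters would come with uncertainty that this sketch does not estimate.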

Collins GS. Making the black box more transparent: improving the reporting of artificial intelligence studies in healthcare. BMJ 2024;385:q832. [PMID: 38626954 DOI: 10.1136/bmj.q832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]

Collins GS, Moons KGM, Dhiman P, Riley RD, Beam AL, Van Calster B, Ghassemi M, Liu X, Reitsma JB, van Smeden M, Boulesteix AL, Camaradou JC, Celi LA, Denaxas S, Denniston AK, Glocker B, Golub RM, Harvey H, Heinze G, Hoffman MM, Kengne AP, Lam E, Lee N, Loder EW, Maier-Hein L, Mateen BA, McCradden MD, Oakden-Rayner L, Ordish J, Parnell R, Rose S, Singh K, Wynants L, Logullo P. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 2024;385:e078378. [PMID: 38626948 PMCID: PMC11019967 DOI: 10.1136/bmj-2023-078378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/17/2024] [Indexed: 04/19/2024]

Affiliation(s)

Gary S Collins: Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
Karel G M Moons: Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
Paula Dhiman: Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
Richard D Riley: Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK
Andrew L Beam: Department of Epidemiology, Harvard T H Chan School of Public Health, Boston, MA, USA
Ben Van Calster: Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Biomedical Data Science, Leiden University Medical Centre, Leiden, Netherlands
Marzyeh Ghassemi: Department of Electrical Engineering and Computer Science, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
Xiaoxuan Liu: Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
Johannes B Reitsma: Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
Maarten van Smeden: Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
Anne-Laure Boulesteix: Institute for Medical Information Processing, Biometry and Epidemiology, Faculty of Medicine, Ludwig-Maximilians-University of Munich and Munich Centre of Machine Learning, Germany
Jennifer Catherine Camaradou: Patient representative, Health Data Research UK patient and public involvement and engagement group; Patient representative, University of East Anglia, Faculty of Health Sciences, Norwich Research Park, Norwich, UK
Leo Anthony Celi: Beth Israel Deaconess Medical Center, Boston, MA, USA; Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Biostatistics, Harvard T H Chan School of Public Health, Boston, MA, USA
Spiros Denaxas: Institute of Health Informatics, University College London, London, UK; British Heart Foundation Data Science Centre, London, UK
Alastair K Denniston: National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK; Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
Ben Glocker: Department of Computing, Imperial College London, London, UK
Robert M Golub: Northwestern University Feinberg School of Medicine, Chicago, IL, USA
Hugh Harvey: Hardian Health, Haywards Heath, UK
Georg Heinze: Section for Clinical Biometrics, Centre for Medical Data Science, Medical University of Vienna, Vienna, Austria
Michael M Hoffman: Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada; Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; Vector Institute for Artificial Intelligence, Toronto, ON, Canada
André Pascal Kengne: Department of Medicine, University of Cape Town, Cape Town, South Africa
Emily Lam: Patient representative, Health Data Research UK patient and public involvement and engagement group
Naomi Lee: National Institute for Health and Care Excellence, London, UK
Elizabeth W Loder: The BMJ, London, UK; Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
Lena Maier-Hein: Department of Intelligent Medical Systems, German Cancer Research Centre, Heidelberg, Germany
Bilal A Mateen: Institute of Health Informatics, University College London, London, UK; Wellcome Trust, London, UK; Alan Turing Institute, London, UK
Melissa D McCradden: Department of Bioethics, Hospital for Sick Children Toronto, ON, Canada; Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
Lauren Oakden-Rayner: Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia
Johan Ordish: Medicines and Healthcare products Regulatory Agency, London, UK
Richard Parnell: Patient representative, Health Data Research UK patient and public involvement and engagement group
Sherri Rose: Department of Health Policy and Center for Health Policy, Stanford University, Stanford, CA, USA
Karandeep Singh: Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
Laure Wynants: Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
Patricia Logullo: Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK

Armoundas AA, Narayan SM, Arnett DK, Spector-Bagdady K, Bennett DA, Celi LA, Friedman PA, Gollob MH, Hall JL, Kwitek AE, Lett E, Menon BK, Sheehan KA, Al-Zaiti SS. Use of Artificial Intelligence in Improving Outcomes in Heart Disease: A Scientific Statement From the American Heart Association. Circulation 2024;149:e1028-e1050. [PMID: 38415358 PMCID: PMC11042786 DOI: 10.1161/cir.0000000000001201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/29/2024]

Huang Z, Denti P, Mistry H, Kloprogge F. Machine Learning and Artificial Intelligence in PK-PD Modeling: Fad, Friend, or Foe? Clin Pharmacol Ther 2024;115:652-654. [PMID: 38179832 PMCID: PMC11146679 DOI: 10.1002/cpt.3165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 12/22/2023] [Indexed: 01/06/2024]

Gallardo-Pizarro A, Peyrony O, Chumbita M, Monzo-Gallo P, Aiello TF, Teijon-Lumbreras C, Gras E, Mensa J, Soriano A, Garcia-Vidal C. Improving management of febrile neutropenia in oncology patients: the role of artificial intelligence and machine learning. Expert Rev Anti Infect Ther 2024;22:179-187. [PMID: 38457198 DOI: 10.1080/14787210.2024.2322445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 02/20/2024] [Indexed: 03/09/2024]

Gray M, Samala R, Liu Q, Skiles D, Xu J, Tong W, Wu L. Measurement and Mitigation of Bias in Artificial Intelligence: A Narrative Literature Review for Regulatory Science. Clin Pharmacol Ther 2024;115:687-697. [PMID: 38018360 DOI: 10.1002/cpt.3117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 11/21/2023] [Indexed: 11/30/2023]

Kwong JCC, Nickel GC, Wang SCY, Kvedar JC. Integrating artificial intelligence into healthcare systems: more than just the algorithm. NPJ Digit Med 2024;7:52. [PMID: 38429418 PMCID: PMC10907626 DOI: 10.1038/s41746-024-01066-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Accepted: 02/22/2024] [Indexed: 03/03/2024] Open

Talimtzi P, Ntolkeras A, Kostopoulos G, Bougioukas KI, Pagkalidou E, Ouranidis A, Pataka A, Haidich AB. The reporting completeness and transparency of systematic reviews of prognostic prediction models for COVID-19 was poor: a methodological overview of systematic reviews. J Clin Epidemiol 2024;167:111264. [PMID: 38266742 DOI: 10.1016/j.jclinepi.2024.111264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Revised: 01/08/2024] [Accepted: 01/13/2024] [Indexed: 01/26/2024]

Milders J, Ramspek CL, Janse RJ, Bos WJW, Rotmans JI, Dekker FW, van Diepen M. Prognostic Models in Nephrology: Where Do We Stand and Where Do We Go from Here? Mapping Out the Evidence in a Scoping Review. J Am Soc Nephrol 2024;35:367-380. [PMID: 38082484 PMCID: PMC10914213 DOI: 10.1681/asn.0000000000000285] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/05/2024] Open

Gan T, Guan H, Li P, Huang X, Li Y, Zhang R, Li T. Risk prediction models for cardiovascular events in hemodialysis patients: A systematic review. Semin Dial 2024;37:101-109. [PMID: 37743062 DOI: 10.1111/sdi.13181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 06/25/2023] [Accepted: 09/10/2023] [Indexed: 09/26/2023]

Fehr J, Citro B, Malpani R, Lippert C, Madai VI. A trustworthy AI reality-check: the lack of transparency of artificial intelligence products in healthcare. Front Digit Health 2024;6:1267290. [PMID: 38455991 PMCID: PMC10919164 DOI: 10.3389/fdgth.2024.1267290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 02/05/2024] [Indexed: 03/09/2024] Open

Barreñada L, Ledger A, Dhiman P, Collins G, Wynants L, Verbakel JY, Timmerman D, Valentin L, Van Calster B. ADNEX risk prediction model for diagnosis of ovarian cancer: systematic review and meta-analysis of external validation studies. BMJ MEDICINE 2024;3:e000817. [PMID: 38375077 PMCID: PMC10875560 DOI: 10.1136/bmjmed-2023-000817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 01/25/2024] [Indexed: 02/21/2024]

Abstract

Objectives

To conduct a systematic review of studies externally validating the ADNEX (Assessment of Different Neoplasias in the adnexa) model for diagnosis of ovarian cancer and to present a meta-analysis of its performance.

Design

Systematic review and meta-analysis of external validation studies.

Data sources

Medline, Embase, Web of Science, Scopus, and Europe PMC, from 15 October 2014 to 15 May 2023.

Eligibility criteria for selecting studies

All external validation studies of the performance of ADNEX, with any study design and any study population of patients with an adnexal mass. Two independent reviewers extracted the data. Disagreements were resolved by discussion. Reporting quality of the studies was scored with the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) reporting guideline, and methodological conduct and risk of bias with PROBAST (Prediction model Risk Of Bias Assessment Tool). Random effects meta-analysis of the area under the receiver operating characteristic curve (AUC), sensitivity and specificity at the 10% risk of malignancy threshold, and net benefit and relative utility at the 10% risk of malignancy threshold were performed.
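As an illustration of the random effects meta-analysis of the AUC described above, the following is a minimal sketch only. It assumes a DerSimonian-Laird estimator applied on the logit scale, which the abstract does not specify; the function names and the four study AUCs and standard errors are hypothetical values chosen for demonstration, not data from the review.

```python
import math


def logit(p):
    """Map a proportion in (0, 1) to the logit scale."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a logit-scale value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def random_effects_pool(estimates, variances):
    """Pool study estimates with a DerSimonian-Laird random effects model.

    Returns the pooled estimate, its standard error, and the estimated
    between-study variance (tau squared).
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed effect) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q measures heterogeneity around the fixed effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random effects weights add tau squared to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical study-level AUCs and standard errors (illustration only).
aucs = [0.90, 0.94, 0.92, 0.95]
ses = [0.020, 0.015, 0.025, 0.010]

# Delta method: var(logit(AUC)) is approximately (SE / (AUC * (1 - AUC)))^2.
y = [logit(a) for a in aucs]
v = [(s / (a * (1.0 - a))) ** 2 for a, s in zip(aucs, ses)]

pooled, se, tau2 = random_effects_pool(y, v)
summary_auc = inv_logit(pooled)
ci_low = inv_logit(pooled - 1.96 * se)
ci_high = inv_logit(pooled + 1.96 * se)
print(f"summary AUC {summary_auc:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

Pooling on the logit scale keeps the back-transformed summary and its confidence limits inside the interval (0, 1), which is one common reason for that choice when study AUCs are close to 1.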

Results

47 studies (17 007 tumours) were included, with a median study sample size of 261 (range 24-4905). On average, 61% of TRIPOD items were reported. Handling of missing data, justification of sample size, and model calibration were rarely described. 91% of validations were at high risk of bias, mainly because of the unexplained exclusion of incomplete cases, small sample size, or no assessment of calibration. The summary AUC to distinguish benign from malignant tumours in patients who underwent surgery was 0.93 (95% confidence interval 0.92 to 0.94, 95% prediction interval 0.85 to 0.98) for ADNEX with the serum biomarker, cancer antigen 125 (CA125), as a predictor (9202 tumours, 43 centres, 18 countries, and 21 studies) and 0.93 (95% confidence interval 0.91 to 0.94, 95% prediction interval 0.85 to 0.98) for ADNEX without CA125 (6309 tumours, 31 centres, 13 countries, and 12 studies). The estimated probability that the model has use clinically in a new centre was 95% (with CA125) and 91% (without CA125). When restricting analysis to studies with a low risk of bias, summary AUC values were 0.93 (with CA125) and 0.91 (without CA125), and estimated probabilities that the model has use clinically were 89% (with CA125) and 87% (without CA125).

Conclusions

The results of the meta-analysis indicated that ADNEX performed well in distinguishing between benign and malignant tumours in populations from different countries and settings, regardless of whether the serum biomarker, CA125, was used as a predictor. A key limitation was that calibration was rarely assessed.

Systematic review registration

PROSPERO CRD42022373182.

Cai Y, Cai YQ, Tang LY, Wang YH, Gong M, Jing TC, Li HJ, Li-Ling J, Hu W, Yin Z, Gong DX, Zhang GW. Artificial intelligence in the risk prediction models of cardiovascular disease and development of an independent validation screening tool: a systematic review. BMC Med 2024;22:56. [PMID: 38317226 PMCID: PMC10845808 DOI: 10.1186/s12916-024-03273-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/16/2023] [Accepted: 01/23/2024] [Indexed: 02/07/2024] Open

Abstract

BACKGROUND

A comprehensive overview of artificial intelligence (AI) for cardiovascular disease (CVD) prediction and a screening tool of AI models (AI-Ms) for independent external validation are lacking. This systematic review aims to identify, describe, and appraise AI-Ms of CVD prediction in the general and special populations and develop a new independent validation score (IVS) for AI-Ms replicability evaluation.

METHODS

PubMed, Web of Science, Embase, and IEEE library were searched up to July 2021. Data extraction and analysis were performed for the populations, distribution, predictors, algorithms, etc. The risk of bias was evaluated with the prediction risk of bias assessment tool (PROBAST). Subsequently, we designed IVS for model replicability evaluation with five steps in five items, including transparency of algorithms, performance of models, feasibility of reproduction, risk of reproduction, and clinical implication, respectively. The review is registered in PROSPERO (No. CRD42021271789).

RESULTS

In 20,887 screened references, 79 articles (82.5% in 2017-2021) were included, which contained 114 datasets (67 in Europe and North America, but 0 in Africa). We identified 486 AI-Ms, of which the majority were in development (n = 380), but none of them had undergone independent external validation. A total of 66 idiographic algorithms were found; however, 36.4% were used only once and only 39.4% over three times. A large number of different predictors (range 5-52,000, median 21) and large-span sample size (range 80-3,660,000, median 4466) were observed. All models were at high risk of bias according to PROBAST, primarily due to the incorrect use of statistical methods. IVS analysis confirmed only 10 models as "recommended"; however, 281 and 187 were "not recommended" and "warning," respectively.

CONCLUSION

AI has led the digital revolution in the field of CVD prediction, but it is still at an early stage of development owing to defects in research design, reporting, and evaluation systems. The IVS we developed may contribute to independent external validation and the development of this field.

Sebro R. Advancing Diagnostics and Patient Care: The Role of Biomarkers in Radiology. Semin Musculoskelet Radiol 2024;28:3-13. [PMID: 38330966 DOI: 10.1055/s-0043-1776426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/10/2024]

Chang RSK, Nguyen S, Chen Z, Foster E, Kwan P. Role of machine learning in the management of epilepsy: a systematic review protocol. BMJ Open 2024;14:e079785. [PMID: 38272549 PMCID: PMC10823996 DOI: 10.1136/bmjopen-2023-079785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/12/2023] [Accepted: 01/05/2024] [Indexed: 01/27/2024] Open

Tabja Bortesi JP, Ranisau J, Di S, McGillion M, Rosella L, Johnson A, Devereaux PJ, Petch J. Machine Learning Approaches for the Image-Based Identification of Surgical Wound Infections: Scoping Review. J Med Internet Res 2024;26:e52880. [PMID: 38236623 PMCID: PMC10835585 DOI: 10.2196/52880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2023] [Revised: 11/09/2023] [Accepted: 12/12/2023] [Indexed: 01/19/2024] Open

Abstract

BACKGROUND

Surgical site infections (SSIs) occur frequently and impact patients and health care systems. Remote surveillance of surgical wounds is currently limited by the need for manual assessment by clinicians. Machine learning (ML)-based methods have recently been used to address various aspects of the postoperative wound healing process and may be used to improve the scalability and cost-effectiveness of remote surgical wound assessment.

OBJECTIVE

The objective of this review was to provide an overview of the ML methods that have been used to identify surgical wound infections from images.

METHODS

We conducted a scoping review of ML approaches for visual detection of SSIs following the JBI (Joanna Briggs Institute) methodology. Reports of participants in any postoperative context focusing on identification of surgical wound infections were included. Studies that did not address SSI identification, surgical wounds, or did not use image or video data were excluded. We searched MEDLINE, Embase, CINAHL, CENTRAL, Web of Science Core Collection, IEEE Xplore, Compendex, and arXiv for relevant studies in November 2022. The records retrieved were double screened for eligibility. A data extraction tool was used to chart the relevant data, which was described narratively and presented using tables. Employment of TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) guidelines was evaluated and PROBAST (Prediction Model Risk of Bias Assessment Tool) was used to assess risk of bias (RoB).

RESULTS

In total, 10 of the 715 unique records screened met the eligibility criteria. In these studies, the clinical contexts and surgical procedures were diverse. All papers developed diagnostic models, though none performed external validation. Both traditional ML and deep learning methods were used to identify SSIs from mostly color images, and the volume of images used ranged from under 50 to thousands. Further, 10 TRIPOD items were reported in at least 4 studies, though 15 items were reported in fewer than 4 studies. PROBAST assessment led to 9 studies being identified as having an overall high RoB, with 1 study having overall unclear RoB.

CONCLUSIONS

Research on the image-based identification of surgical wound infections using ML remains novel, and there is a need for standardized reporting. Limitations related to variability in image capture, model building, and data sources should be addressed in the future.

Ciobanu-Caraus O, Aicher A, Kernbach JM, Regli L, Serra C, Staartjes VE. A critical moment in machine learning in medicine: on reproducible and interpretable learning. Acta Neurochir (Wien) 2024;166:14. [PMID: 38227273 DOI: 10.1007/s00701-024-05892-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2023] [Accepted: 12/14/2023] [Indexed: 01/17/2024]

Abstract

Over the past two decades, advances in computational power and data availability combined with increased accessibility to pre-trained models have led to an exponential rise in machine learning (ML) publications. While ML may have the potential to transform healthcare, this sharp increase in ML research output without focus on methodological rigor and standard reporting guidelines has fueled a reproducibility crisis. In addition, the rapidly growing complexity of these models compromises their interpretability, which currently impedes their successful and widespread clinical adoption. In medicine, where failure of such models may have severe implications for patients' health, the high requirements for accuracy, robustness, and interpretability confront ML researchers with a unique set of challenges. In this review, we discuss the semantics of reproducibility and interpretability, as well as related issues and challenges, and outline possible solutions to counteracting the "black box". To foster reproducibility, standard reporting guidelines need to be further developed and data or code sharing encouraged. Editors and reviewers may equally play a critical role by establishing high methodological standards and thus preventing the dissemination of low-quality ML publications. To foster interpretable learning, the use of simpler models more suitable for medical data can inform the clinician how results are generated based on input data. Model-agnostic explanation tools, sensitivity analysis, and hidden layer representations constitute further promising approaches to increase interpretability. Balancing model performance and interpretability is important to ensure clinical applicability. We have now reached a critical moment for ML in medicine, where addressing these issues and implementing appropriate solutions will be vital for the future evolution of the field.

Jung J, Dai J, Liu B, Wu Q. Artificial intelligence in fracture detection with different image modalities and data types: A systematic review and meta-analysis. PLOS DIGITAL HEALTH 2024;3:e0000438. [PMID: 38289965 PMCID: PMC10826962 DOI: 10.1371/journal.pdig.0000438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/03/2023] [Accepted: 12/25/2023] [Indexed: 02/01/2024]

Abstract

Artificial Intelligence (AI), encompassing Machine Learning and Deep Learning, has increasingly been applied to fracture detection using diverse imaging modalities and data types. This systematic review and meta-analysis aimed to assess the efficacy of AI in detecting fractures through various imaging modalities and data types (image, tabular, or both) and to synthesize the existing evidence related to AI-based fracture detection. Peer-reviewed studies developing and validating AI for fracture detection were identified through searches in multiple electronic databases without time limitations. A hierarchical meta-analysis model was used to calculate pooled sensitivity and specificity. A diagnostic accuracy quality assessment was performed to evaluate bias and applicability. Of the 66 eligible studies, 54 identified fractures using imaging-related data, nine using tabular data, and three using both. Vertebral fractures were the most common outcome (n = 20), followed by hip fractures (n = 18). Hip fractures exhibited the highest pooled sensitivity (92%; 95% CI: 87-96, p < 0.01) and specificity (90%; 95% CI: 85-93, p < 0.01). Pooled sensitivity and specificity using image data (92%; 95% CI: 90-94, p < 0.01; and 91%; 95% CI: 88-93, p < 0.01) were higher than those using tabular data (81%; 95% CI: 77-85, p < 0.01; and 83%; 95% CI: 76-88, p < 0.01), respectively. Radiographs demonstrated the highest pooled sensitivity (94%; 95% CI: 90-96, p < 0.01) and specificity (92%; 95% CI: 89-94, p < 0.01). Patient selection and reference standards were major concerns in assessing diagnostic accuracy for bias and applicability. AI displays high diagnostic accuracy for various fracture outcomes, indicating potential utility in healthcare systems for fracture diagnosis. However, enhanced transparency in reporting and adherence to standardized guidelines are necessary to improve the clinical applicability of AI. Review Registration: PROSPERO (CRD42021240359).

Stewart R, Chaturvedi J, Roberts A. Natural language processing - relevance to patient outcomes and real-world evidence. Expert Rev Pharmacoecon Outcomes Res 2024;24:5-9. [PMID: 37874661 DOI: 10.1080/14737167.2023.2275670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2023] [Accepted: 10/23/2023] [Indexed: 10/26/2023]

Sutradhar A, Al Rafi M, Shamrat FMJM, Ghosh P, Das S, Islam MA, Ahmed K, Zhou X, Azad AKM, Alyami SA, Moni MA. BOO-ST and CBCEC: two novel hybrid machine learning methods aim to reduce the mortality of heart failure patients. Sci Rep 2023;13:22874. [PMID: 38129433 PMCID: PMC10739972 DOI: 10.1038/s41598-023-48486-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2023] [Accepted: 11/27/2023] [Indexed: 12/23/2023] Open

van Royen FS, Asselbergs FW, Alfonso F, Vardas P, van Smeden M. Five critical quality criteria for artificial intelligence-based prediction models. Eur Heart J 2023;44:4831-4834. [PMID: 37897346 DOI: 10.1093/eurheartj/ehad727] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 10/30/2023] Open

Banerji CRS, Chakraborti T, Harbron C, MacArthur BD. Clinical AI tools must convey predictive uncertainty for each individual patient. Nat Med 2023;29:2996-2998. [PMID: 37821686 DOI: 10.1038/s41591-023-02562-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/13/2023]

Riley RD, Collins GS. Stability of clinical prediction models developed using statistical or machine learning methods. Biom J 2023;65:e2200302. [PMID: 37466257 PMCID: PMC10952221 DOI: 10.1002/bimj.202200302] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2022] [Revised: 04/26/2023] [Accepted: 05/02/2023] [Indexed: 07/20/2023]

Jacquemyn X, Kutty S, Manlhiot C. The Lifelong Impact of Artificial Intelligence and Clinical Prediction Models on Patients With Tetralogy of Fallot. CJC PEDIATRIC AND CONGENITAL HEART DISEASE 2023;2:440-452. [PMID: 38161675 PMCID: PMC10755786 DOI: 10.1016/j.cjcpc.2023.08.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/14/2023] [Accepted: 08/24/2023] [Indexed: 01/03/2024]

Pei J, Guo X, Tao H, Wei Y, Zhang H, Ma Y, Han L. Machine learning-based prediction models for pressure injury: A systematic review and meta-analysis. Int Wound J 2023;20:4328-4339. [PMID: 37340520 PMCID: PMC10681397 DOI: 10.1111/iwj.14280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2023] [Accepted: 06/01/2023] [Indexed: 06/22/2023] Open

Wilson A. CORR Synthesis: Can Decision Tree Learning Advance Orthopaedic Surgery Research? Clin Orthop Relat Res 2023;481:2337-2342. [PMID: 37678231 PMCID: PMC10642865 DOI: 10.1097/corr.0000000000002820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/12/2023] [Accepted: 07/20/2023] [Indexed: 09/09/2023]

Moon SJ, Lee S, Hwang J, Lee J, Kang S, Cha HS. Performances of machine learning algorithms in discriminating sacroiliitis features on MRI: a systematic review. RMD Open 2023;9:e003783. [PMID: 37996126 PMCID: PMC10668284 DOI: 10.1136/rmdopen-2023-003783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2023] [Accepted: 11/08/2023] [Indexed: 11/25/2023] Open

Carrasco-Ribelles LA, Llanes-Jurado J, Gallego-Moll C, Cabrera-Bean M, Monteagudo-Zaragoza M, Violán C, Zabaleta-del-Olmo E. Prediction models using artificial intelligence and longitudinal data from electronic health records: a systematic methodological review. J Am Med Inform Assoc 2023;30:2072-2082. [PMID: 37659105 PMCID: PMC10654870 DOI: 10.1093/jamia/ocad168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2023] [Revised: 08/02/2023] [Accepted: 08/11/2023] [Indexed: 09/04/2023] Open

Affiliation(s)

Lucía A Carrasco-Ribelles Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, 08007, Spain; Department of Signal Theory and Communications, Universitat Politècnica de Catalunya (UPC), Barcelona, 08034, Spain; Unitat de Suport a la Recerca Metropolitana Nord, Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Mataró, 08303, Spain
José Llanes-Jurado Instituto de Investigación e Innovación en Bioingeniería (i3B), Universitat Politècnica de València (UPV), València, 46022, Spain
Carlos Gallego-Moll Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, 08007, Spain; Unitat de Suport a la Recerca Metropolitana Nord, Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Mataró, 08303, Spain
Margarita Cabrera-Bean Department of Signal Theory and Communications, Universitat Politècnica de Catalunya (UPC), Barcelona, 08034, Spain
Mònica Monteagudo-Zaragoza Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, 08007, Spain
Concepción Violán Unitat de Suport a la Recerca Metropolitana Nord, Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Mataró, 08303, Spain; Direcció d’Atenció Primària Metropolitana Nord, Institut Català de Salut, Badalona, 08915, Spain; Fundació Institut d’Investigació en ciències de la salut Germans Trias i Pujol (IGTP), Badalona, 08916, Spain; Fundació UAB, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, 08193, Spain
Edurne Zabaleta-del-Olmo Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, 08007, Spain; Gerència Territorial de Barcelona, Institut Català de la Salut, Carrer de Balmes 22, Barcelona, 08007, Spain; Nursing Department, Faculty of Nursing, Universitat de Girona, Girona, 17003, Spain

Ser SE, Shear K, Snigurska UA, Prosperi M, Wu Y, Magoc T, Bjarnadottir RI, Lucero RJ. Clinical Prediction Models for Hospital-Induced Delirium Using Structured and Unstructured Electronic Health Record Data: Protocol for a Development and Validation Study. JMIR Res Protoc 2023;12:e48521. [PMID: 37943599 PMCID: PMC10667972 DOI: 10.2196/48521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2023] [Revised: 09/01/2023] [Accepted: 09/05/2023] [Indexed: 11/10/2023] Open

Abstract

BACKGROUND

Hospital-induced delirium is one of the most common and costly iatrogenic conditions, and its incidence is predicted to increase as the population of the United States ages. An academic and clinical interdisciplinary systems approach is needed to reduce the frequency and impact of hospital-induced delirium.

OBJECTIVE

The long-term goal of our research is to enhance the safety of hospitalized older adults by reducing iatrogenic conditions through an effective learning health system. In this study, we will develop models for predicting hospital-induced delirium. In order to accomplish this objective, we will create a computable phenotype for our outcome (hospital-induced delirium), design an expert-based traditional logistic regression model, leverage machine learning techniques to generate a model using structured data, and use machine learning and natural language processing to produce an integrated model with components from both structured data and text data.

METHODS

This study will explore text-based data, such as nursing notes, to improve the predictive capability of prognostic models for hospital-induced delirium. By using supervised and unsupervised text mining in addition to structured data, we will examine multiple types of information in electronic health record data to predict medical-surgical patient risk of developing delirium. Development and validation will comply with the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement.

RESULTS

Work on this project will take place through March 2024. For this study, we will use data from approximately 332,230 encounters that occurred between January 2012 and May 2021. Findings from this project will be disseminated at scientific conferences and in peer-reviewed journals.

CONCLUSIONS

Success in this study will yield a durable, high-performing research-data infrastructure that will process, extract, and analyze clinical text data in near real time. This model has the potential to be integrated into the electronic health record and provide point-of-care decision support to prevent harm and improve quality of care.

INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID)

DERR1-10.2196/48521.

Buick JE, Austin PC, Cheskes S, Ko DT, Atzema CL. Prediction models in prehospital and emergency medicine research: How to derive and internally validate a clinical prediction model. Acad Emerg Med 2023;30:1150-1160. [PMID: 37266925 DOI: 10.1111/acem.14756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2023] [Revised: 05/24/2023] [Accepted: 05/29/2023] [Indexed: 06/03/2023]

Jiang Y, Wang C, Zhou S. Artificial intelligence-based risk stratification, accurate diagnosis and treatment prediction in gynecologic oncology. Semin Cancer Biol 2023;96:82-99. [PMID: 37783319 DOI: 10.1016/j.semcancer.2023.09.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2022] [Revised: 08/27/2023] [Accepted: 09/25/2023] [Indexed: 10/04/2023]

Malorgio A, Henckert D, Schweiger G, Braun J, Zacharowski K, Raimann FJ, Piekarski F, Meybohm P, Hottenrott S, Froehlich C, Spahn DR, Noethiger CB, Tscholl DW, Roche TR. Using Visual Patient to Show Vital Sign Predictions, a Computer-Based Mixed Quantitative and Qualitative Simulation Study. Diagnostics (Basel) 2023;13:3281. [PMID: 37892102 PMCID: PMC10606017 DOI: 10.3390/diagnostics13203281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2023] [Revised: 10/16/2023] [Accepted: 10/19/2023] [Indexed: 10/29/2023] Open

Abstract

BACKGROUND

Machine learning can analyze vast amounts of data and make predictions for events in the future. Our group created machine learning models for vital sign predictions. To transport the information of these predictions without numbers and numerical values and make them easily usable for human caregivers, we aimed to integrate them into the Philips Visual-Patient-avatar, an avatar-based visualization of patient monitoring.

METHODS

We conducted a computer-based simulation study with 70 participants in 3 European university hospitals. We validated the vital sign prediction visualizations by testing their identification by anesthesiologists and intensivists. Each prediction visualization consisted of a condition (e.g., low blood pressure) and an urgency (a visual indication of the timespan in which the condition is expected to occur). To obtain qualitative user feedback, we also conducted standardized interviews and derived statements that participants later rated in an online survey.

RESULTS

The mixed logistic regression model showed 77.9% (95% CI 73.2-82.0%) correct identification of prediction visualizations (i.e., condition and urgency both correctly identified) and 93.8% (95% CI 93.7-93.8%) for conditions only (i.e., without considering urgencies). A total of 49 out of 70 participants completed the online survey. The online survey participants agreed that the prediction visualizations were fun to use (32/49, 65.3%), and that they could imagine working with them in the future (30/49, 61.2%). They also agreed that identifying the urgencies was difficult (32/49, 65.3%).

CONCLUSIONS

This study found that care providers correctly identified >90% of the conditions (i.e., without considering urgencies). The accuracy of identification decreased when considering urgencies in addition to conditions. Therefore, in future development of the technology, we will focus on either only displaying conditions (without urgencies) or improving the visualizations of urgency to enhance usability for human users.

Affiliation(s)

Amos Malorgio Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
David Henckert Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
Giovanna Schweiger Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
Julia Braun Departments of Epidemiology and Biostatistics, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, 8001 Zurich, Switzerland
Kai Zacharowski Department of Anesthesiology, Intensive Care Medicine, and Pain Therapy, University Hospital Frankfurt, Goethe University Frankfurt, 60323 Frankfurt, Germany
Florian J. Raimann Department of Anesthesiology, Intensive Care Medicine, and Pain Therapy, University Hospital Frankfurt, Goethe University Frankfurt, 60323 Frankfurt, Germany
Florian Piekarski Department of Anesthesiology, Intensive Care Medicine, and Pain Therapy, University Hospital Frankfurt, Goethe University Frankfurt, 60323 Frankfurt, Germany
Patrick Meybohm Department of Anesthesiology, Intensive Care, Emergency, and Pain Medicine, University Hospital Wuerzburg, 97070 Wuerzburg, Germany
Sebastian Hottenrott Department of Anesthesiology, Intensive Care, Emergency, and Pain Medicine, University Hospital Wuerzburg, 97070 Wuerzburg, Germany
Corinna Froehlich Department of Anesthesiology, Intensive Care, Emergency, and Pain Medicine, University Hospital Wuerzburg, 97070 Wuerzburg, Germany
Donat R. Spahn Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
Christoph B. Noethiger Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
David W. Tscholl Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
Tadzio R. Roche Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland

Chimbunde E, Sigwadhi LN, Tamuzi JL, Okango EL, Daramola O, Ngah VD, Nyasulu PS. Machine learning algorithms for predicting determinants of COVID-19 mortality in South Africa. Front Artif Intell 2023;6:1171256. [PMID: 37899965 PMCID: PMC10600470 DOI: 10.3389/frai.2023.1171256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2023] [Accepted: 08/15/2023] [Indexed: 10/31/2023] Open

Abstract

Background

COVID-19 has strained healthcare resources, necessitating efficient prognostication to triage patients effectively. This study quantified COVID-19 risk factors and predicted COVID-19 intensive care unit (ICU) mortality in South Africa based on machine learning algorithms.

Methods

Data for this study were obtained from 392 COVID-19 ICU patients enrolled between 26 March 2020 and 10 February 2021. We used an artificial neural network (ANN) and random forest (RF) to predict mortality among ICU patients and a semi-parametric logistic regression with nine covariates, including a grouping variable based on K-means clustering. Further evaluation of the algorithms was performed using sensitivity, accuracy, specificity, and Cohen's kappa statistic.
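For readers unfamiliar with the evaluation measures named above, the following minimal sketch shows how they are conventionally derived from a binary confusion matrix. The function name `binary_metrics` and the counts used are hypothetical and chosen only for illustration; they are not taken from the study, and the sketch does not reproduce the authors' analysis.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute common classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes are present and the
    model predicts both classes); no zero-division handling is included.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / n
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((fn + tn) / n) * ((fp + tn) / n)
    p_e = p_yes + p_no
    kappa = (accuracy - p_e) / (1 - p_e)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only.
m = binary_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is a harmonic mean, it always lies between precision and recall, which gives a quick consistency check when several of these measures are reported together.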

Results

From the semi-parametric logistic regression and ANN variable importance, age, gender, cluster, presence of severe symptoms, being on the ventilator, and comorbidities of asthma significantly contributed to ICU death. In particular, the odds of mortality were six times higher among asthmatic patients than non-asthmatic patients. In univariable and multivariate regression, advanced age, PF1 and 2, FiO2, severe symptoms, asthma, oxygen saturation, and cluster 4 were strongly predictive of mortality. The RF model revealed that intubation status, age, cluster, diabetes, and hypertension were the top five significant predictors of mortality. The ANN performed well with an accuracy of 71%, a precision of 83%, an F1 score of 100%, a Matthews correlation coefficient (MCC) score of 100%, and a recall of 88%. In addition, a Cohen's kappa value of 0.75 verified the strong discriminative power of the ANN. In comparison, the RF model provided a 76% recall, an 87% precision, and a 65% MCC.

Conclusion

Based on the findings, we can conclude that both ANN and RF can predict COVID-19 mortality in the ICU with accuracy. The proposed models accurately predict the prognosis of COVID-19 patients after diagnosis. The models can be used to prioritize COVID-19 patients with a high mortality risk in resource-constrained ICUs.


Ratna MB, Bhattacharya S, McLernon DJ. External validation of models for predicting cumulative live birth over multiple complete cycles of IVF treatment. Hum Reprod 2023;38:1998-2010. [PMID: 37632223 PMCID: PMC10546080 DOI: 10.1093/humrep/dead165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2022] [Revised: 07/28/2023] [Indexed: 08/27/2023] Open

Abstract

STUDY QUESTION

Can two prediction models developed using data from 1999 to 2009 accurately predict the cumulative probability of live birth per woman over multiple complete cycles of IVF in an updated UK cohort?

SUMMARY ANSWER

After being updated, the models were able to estimate individualized chances of cumulative live birth over multiple complete cycles of IVF with greater accuracy.

WHAT IS KNOWN ALREADY

The McLernon models were the first to predict cumulative live birth over multiple complete cycles of IVF. They were converted into an online calculator called OPIS (Outcome Prediction In Subfertility) which has 3000 users per month on average. A previous study externally validated the McLernon models using a Dutch prospective cohort containing data from 2011 to 2014. With changes in IVF practice over time, it is important that the McLernon models are externally validated on a more recent cohort of patients to ensure that predictions remain accurate.

STUDY DESIGN, SIZE, DURATION

A population-based cohort of 91 035 women undergoing IVF in the UK between January 2010 and December 2016 was used for external validation. Data on frozen embryo transfers associated with these complete IVF cycles conducted from 1 January 2017 to 31 December 2017 were also collected.

PARTICIPANTS/MATERIALS, SETTING, METHODS

Data on IVF treatments were obtained from the Human Fertilisation and Embryology Authority (HFEA). The predictive performances of the McLernon models were evaluated in terms of discrimination and calibration. Discrimination was assessed using the c-statistic and calibration was assessed using calibration-in-the-large, calibration slope, and calibration plots. Where any model demonstrated poor calibration in the validation cohort, the models were updated using intercept recalibration, logistic recalibration, or model revision to improve model performance.
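The performance measures described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, predicted risks must lie strictly between 0 and 1, and the logistic fits use a basic Newton-Raphson loop with a fixed number of iterations.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(z):
    return 1 / (1 + math.exp(-z))


def c_statistic(y, p):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; tied predictions count as half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for e in events:
        for n in nonevents:
            score += 1.0 if e > n else 0.5 if e == n else 0.0
    return score / (len(events) * len(nonevents))


def calibration_in_the_large(y, p, iters=25):
    """Intercept a in logit(P(y=1)) = a + logit(p), with the slope fixed at 1.
    A value of 0 means observed and predicted risks agree on average."""
    lp = [logit(pi) for pi in p]
    a = 0.0
    for _ in range(iters):
        mu = [expit(a + l) for l in lp]
        grad = sum(yi - mi for yi, mi in zip(y, mu))
        hess = sum(mi * (1 - mi) for mi in mu)
        a += grad / hess
    return a


def calibration_slope(y, p, iters=25):
    """Fit logit(P(y=1)) = a + b * logit(p) and return (a, b).
    A slope b below 1 suggests predictions that are too extreme."""
    lp = [logit(pi) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):
        mu = [expit(a + b * l) for l in lp]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * l for yi, mi, l in zip(y, mu, lp))
        w = [mi * (1 - mi) for mi in mu]
        h00 = sum(w)
        h01 = sum(wi * l for wi, l in zip(w, lp))
        h11 = sum(wi * l * l for wi, l in zip(w, lp))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b
```

In this framing, logistic recalibration amounts to replacing each original linear predictor `logit(p)` with `a + b * logit(p)`, using the fitted pair returned by `calibration_slope`, while intercept recalibration adds only the value returned by `calibration_in_the_large`.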

MAIN RESULTS AND THE ROLE OF CHANCE

Following exclusions, 91 035 women who underwent 144 734 complete cycles were included. The validation cohort had a similar age distribution to women in the development cohort. Live birth rates over all complete cycles of IVF per woman were higher in the validation cohort. After calibration assessment, both models required updating. The coefficients of the pre-treatment model were revised, and the updated model showed reasonable discrimination (c-statistic: 0.67, 95% CI: 0.66 to 0.68). After logistic recalibration, the post-treatment model showed good discrimination (c-statistic: 0.75, 95% CI: 0.74 to 0.76). As an example, in the updated pre-treatment model, a 32-year-old woman with 2 years of primary infertility has a 42% chance of having a live birth in the first complete ICSI cycle and a 77% chance over three complete cycles. In a couple with 2 years of primary male factor infertility where a 30-year-old woman has 15 oocytes collected in the first cycle, a single fresh blastocyst embryo transferred in the first cycle and spare embryos cryopreserved, the estimated chance of live birth provided by the post-treatment model is 46% in the first complete ICSI cycle and 81% over three complete cycles.

LIMITATIONS, REASONS FOR CAUTION

Two predictors from the original models, duration of infertility and previous pregnancy, which were not available in the recent HFEA dataset, were imputed using data from the older cohort used to develop the models. The HFEA dataset does not contain some other potentially important predictors, e.g. BMI, ethnicity, race, smoking and alcohol intake in women, as well as measures of ovarian reserve such as antral follicle count.

WIDER IMPLICATIONS OF THE FINDINGS

Both updated models show improved predictive ability and provide estimates which are more reflective of current practice and patient case mix. The updated OPIS tool can be used by clinicians to help shape couples' expectations by informing them of their individualized chances of live birth over a sequence of multiple complete cycles of IVF.

STUDY FUNDING/COMPETING INTEREST(S)

This study was supported by an Elphinstone scholarship scheme at the University of Aberdeen and Aberdeen Fertility Centre, University of Aberdeen. S.B. has a commitment of research funding from Merck. D.J.M. and M.B.R. declare support for the present manuscript from Elphinstone scholarship scheme at the University of Aberdeen and Assisted Reproduction Unit at Aberdeen Fertility Centre, University of Aberdeen. D.J.M. declares grants received by University of Aberdeen from NHS Grampian, The Meikle Foundation, and Chief Scientist Office in the past 3 years. D.J.M. declares receiving an honorarium for lectures from Merck. D.J.M. is Associate Editor of Human Reproduction Open and Statistical Advisor for Reproductive BioMed Online. S.B. declares royalties from Cambridge University Press for a book. S.B. declares receiving an honorarium for lectures from Merck, Organon, Ferring, Obstetric and Gynaecological Society of Singapore, and Taiwanese Society for Reproductive Medicine. S.B. has received support from Merck, ESHRE, and Ferring for attending meetings as speaker and is on the METAFOR and CAPRE Trials Data Monitoring Committee.

TRIAL REGISTRATION NUMBER

N/A.


Wang Y, Hou R, Ni B, Jiang Y, Zhang Y. Development and validation of a prediction model based on machine learning algorithms for predicting the risk of heart failure in middle-aged and older US people with prediabetes or diabetes. Clin Cardiol 2023;46:1234-1243. [PMID: 37519220 PMCID: PMC10577538 DOI: 10.1002/clc.24104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/18/2023] [Revised: 07/13/2023] [Accepted: 07/16/2023] [Indexed: 08/01/2023] Open

Lewis AE, Weiskopf N, Abrams ZB, Foraker R, Lai AM, Payne PRO, Gupta A. Electronic health record data quality assessment and tools: a systematic review. J Am Med Inform Assoc 2023;30:1730-1740. [PMID: 37390812 PMCID: PMC10531113 DOI: 10.1093/jamia/ocad120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2023] [Revised: 05/16/2023] [Accepted: 06/23/2023] [Indexed: 07/02/2023] Open

Clichet V, Lebon D, Chapuis N, Zhu J, Bardet V, Marolleau JP, Garçon L, Caulier A, Boyer T. Artificial intelligence to empower diagnosis of myelodysplastic syndromes by multiparametric flow cytometry. Haematologica 2023;108:2435-2443. [PMID: 36924240 PMCID: PMC10483367 DOI: 10.3324/haematol.2022.282370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2022] [Accepted: 03/07/2023] [Indexed: 03/18/2023] Open

Okada Y, Mertens M, Liu N, Lam SSW, Ong MEH. AI and machine learning in resuscitation: Ongoing research, new concepts, and key challenges. Resusc Plus 2023;15:100435. [PMID: 37547540 PMCID: PMC10400904 DOI: 10.1016/j.resplu.2023.100435] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/08/2023] Open

Vasey B, Collins GS. Invited Commentary: Transparent reporting of artificial intelligence models development and evaluation in surgery: The TRIPOD and DECIDE-AI checklists. Surgery 2023;174:727-729. [PMID: 37244769 DOI: 10.1016/j.surg.2023.04.037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2023] [Accepted: 04/27/2023] [Indexed: 05/29/2023]

Dhiman P, Ma J, Qi C, Bullock G, Sergeant JC, Riley RD, Collins GS. Sample size requirements are not being considered in studies developing prediction models for binary outcomes: a systematic review. BMC Med Res Methodol 2023;23:188. [PMID: 37598153 PMCID: PMC10439652 DOI: 10.1186/s12874-023-02008-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2023] [Accepted: 08/04/2023] [Indexed: 08/21/2023] Open

Abstract

BACKGROUND

Having an appropriate sample size is important when developing a clinical prediction model. We aimed to review how sample size is considered in studies developing a prediction model for a binary outcome.

METHODS

We searched PubMed for studies published between 01/07/2020 and 30/07/2020 and reviewed the sample size calculations used to develop the prediction models. Using the available information, we calculated the minimum sample size that would be needed to estimate overall risk and minimise overfitting in each study and summarised the difference between the calculated and used sample size.

RESULTS

A total of 119 studies were included, of which nine studies provided sample size justification (8%). The recommended minimum sample size could be calculated for 94 studies: 73% (95% CI: 63-82%) used sample sizes lower than required to estimate overall risk and minimise overfitting, including 26% of studies that used sample sizes lower than required to estimate overall risk only. A similar number of studies did not meet the ≥ 10 EPV criteria (75%, 95% CI: 66-84%). The median deficit of the number of events used to develop a model was 75 [IQR: 234 lower to 7 higher], which reduced to 63 if the total available data (before any data splitting) was used [IQR: 225 lower to 7 higher]. Studies that met the minimum required sample size had a median c-statistic of 0.84 (IQR: 0.80 to 0.9) and studies where the minimum sample size was not met had a median c-statistic of 0.83 (IQR: 0.75 to 0.9). Studies that met the ≥ 10 EPP criteria had a median c-statistic of 0.80 (IQR: 0.73 to 0.84).
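The abstract does not state the formulas behind these criteria, so the following Python sketch is an assumption rather than the authors' calculation. It shows events per variable (EPV) together with two widely used closed-form criteria for binary outcomes: one for estimating the overall risk within a chosen margin of error, and one for limiting overfitting to a target shrinkage factor given an anticipated Cox-Snell R-squared. All function names, defaults, and example inputs are hypothetical.

```python
import math


def events_per_variable(n_events, n_parameters):
    """Events per candidate predictor parameter (EPV)."""
    return n_events / n_parameters


def n_for_overall_risk(outcome_proportion, margin=0.05):
    """Sample size so that a 95% confidence interval for the overall risk
    has half-width `margin` (assumed default: 0.05)."""
    return (1.96 / margin) ** 2 * outcome_proportion * (1 - outcome_proportion)


def n_for_shrinkage(n_parameters, r2_cox_snell, shrinkage=0.9):
    """Sample size targeting an expected uniform shrinkage factor
    (assumed default: 0.9) for a given anticipated Cox-Snell R-squared."""
    return n_parameters / (
        (shrinkage - 1) * math.log(1 - r2_cox_snell / shrinkage)
    )


def minimum_sample_size(n_parameters, outcome_proportion, r2_cox_snell):
    """Largest of the two criteria above, rounded up to a whole participant."""
    return math.ceil(
        max(
            n_for_overall_risk(outcome_proportion),
            n_for_shrinkage(n_parameters, r2_cox_snell),
        )
    )
```

Under these assumptions, a model with many candidate parameters and a low anticipated R-squared can need far more participants than the 10 EPV rule of thumb would suggest.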

CONCLUSIONS

Prediction models are often developed with no sample size calculation; as a consequence, many are too small to precisely estimate the overall risk. We encourage researchers to justify, perform and report sample size calculations when developing a prediction model.


Lapi F, Nuti L, Marconi E, Medea G, Cricelli I, Papi M, Gorini M, Fiorani M, Piccinocchi G, Cricelli C. To predict the risk of chronic kidney disease (CKD) using Generalized Additive2 Models (GA2M). J Am Med Inform Assoc 2023;30:1494-1502. [PMID: 37330672 PMCID: PMC10436146 DOI: 10.1093/jamia/ocad097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2023] [Revised: 03/13/2023] [Accepted: 05/27/2023] [Indexed: 06/19/2023] Open

Ahmed MS, Ahmed N. A Fast and Minimal System to Identify Depression Using Smartphones: Explainable Machine Learning-Based Approach. JMIR Form Res 2023;7:e28848. [PMID: 37561568 PMCID: PMC10450542 DOI: 10.2196/28848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2022] [Revised: 03/17/2023] [Accepted: 03/19/2023] [Indexed: 08/11/2023] Open

Abstract

BACKGROUND

Existing robust, pervasive device-based systems developed in recent years to detect depression require data collected over a long period and may not be effective in cases where early detection is crucial. Additionally, due to the requirement of running systems in the background for prolonged periods, existing systems can be resource inefficient. As a result, these systems can be infeasible in low-resource settings.

OBJECTIVE

Our main objective was to develop a minimalistic system to identify depression using data retrieved in the fastest possible time. Another objective was to explain the machine learning (ML) models that were best for identifying depression.

METHODS

We developed a fast tool that retrieves the past 7 days' app usage data in 1 second (mean 0.31, SD 1.10 seconds). A total of 100 students from Bangladesh participated in our study, and our tool collected their app usage data and responses to the Patient Health Questionnaire-9. To identify depressed and nondepressed students, we developed a diverse set of ML models: linear, tree-based, and neural network-based models. We selected important features using the stable approach, along with 3 main types of feature selection (FS) approaches: filter, wrapper, and embedded methods. We developed and validated the models using the nested cross-validation method. Additionally, we explained the best ML models through the Shapley additive explanations (SHAP) method.
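The nested cross-validation mentioned above can be sketched in plain Python as follows. This is a structural illustration only: the k-nearest-neighbour classifier on a single feature, the candidate values of k, and the fold counts are stand-ins chosen for brevity, not the models or settings used in the study.

```python
def knn_predict(train, x, k):
    """Majority label among the k training rows nearest to x (1-D feature)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return int(2 * sum(label for _, label in nearest) > len(nearest))


def accuracy(train, test, k):
    """Share of test rows whose label is predicted correctly."""
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)


def folds(rows, n):
    """Split rows into n disjoint folds by striding through the list."""
    return [rows[i::n] for i in range(n)]


def nested_cv(rows, ks=(1, 3, 5), outer=5, inner=3):
    """Return one held-out accuracy per outer fold.

    The inner loop tunes k using only the outer training rows, so the
    outer test fold never influences hyperparameter selection.
    """
    scores = []
    outer_folds = folds(rows, outer)
    for i, test in enumerate(outer_folds):
        train = [r for j, f in enumerate(outer_folds) if j != i for r in f]
        inner_folds = folds(train, inner)

        def inner_score(k):
            total = 0.0
            for m, val in enumerate(inner_folds):
                fit = [r for j, f in enumerate(inner_folds) if j != m for r in f]
                total += accuracy(fit, val, k)
            return total / inner

        best_k = max(ks, key=inner_score)                # tuned on inner folds
        scores.append(accuracy(train, test, best_k))     # scored on outer fold
    return scores
```

Keeping tuning inside the inner loop matters most in small samples such as this one, where selecting hyperparameters on the same folds used for evaluation tends to give optimistic performance estimates.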

RESULTS

Leveraging only the app usage data retrieved in 1 second, our light gradient boosting machine model used the important features selected by the stable FS approach and correctly identified 82.4% (n=42) of depressed students (precision=75%, F1-score=78.5%). Moreover, after comprehensive exploration, we presented a parsimonious stacking model where around 5 features selected by the all-relevant FS approach Boruta were used in each iteration of validation and showed a maximum precision of 77.4% (balanced accuracy=77.9%). Feature importance analysis suggested app usage behavioral markers containing diurnal usage patterns as being more important than aggregated data-based markers. In addition, a SHAP analysis of our best models presented behavioral markers that were related to depression. For instance, students who were not depressed spent more time on education apps on weekdays, whereas those who were depressed used a higher number of photo and video apps and also had a higher deviation in using photo and video apps over the morning, afternoon, evening, and night time periods of the weekend.

CONCLUSIONS

Due to our system's fast and minimalistic nature, it may make a worthwhile contribution to identifying depression in underdeveloped and developing regions. In addition, our detailed discussion about the implication of our findings can facilitate the development of less resource-intensive systems to better understand students who are depressed and take steps for intervention.
