1
Sinaci AA, Gencturk M, Alvarez-Romero C, Laleci Erturkmen GB, Martinez-Garcia A, Escalona-Cuaresma MJ, Parra-Calderon CL. Privacy-preserving federated machine learning on FAIR health data: A real-world application. Comput Struct Biotechnol J 2024; 24:136-145. [PMID: 38434250 PMCID: PMC10904920 DOI: 10.1016/j.csbj.2024.02.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 02/15/2024] [Accepted: 02/15/2024] [Indexed: 03/05/2024] Open
Abstract
Objective: This paper introduces a privacy-preserving federated machine learning (ML) architecture built upon Findable, Accessible, Interoperable, and Reusable (FAIR) health data. It aims to devise an architecture for executing classification algorithms in a federated manner, enabling collaborative model-building among health data owners without sharing their datasets. Materials and methods: Utilizing an agent-based architecture, a privacy-preserving federated ML algorithm was developed to create a global predictive model from various local models. This involved formally defining the algorithm in two steps (data preparation and federated model training on FAIR health data) and constructing the architecture with multiple components facilitating algorithm execution. The solution was validated by five healthcare organizations using their specific health datasets. Results: Five organizations transformed their datasets into Health Level 7 Fast Healthcare Interoperability Resources via a common FAIRification workflow and software set, thereby generating FAIR datasets. Each organization deployed a Federated ML Agent within its secure network, connected to a cloud-based Federated ML Manager. System testing was conducted on a use case aiming to predict 30-day readmission risk for chronic obstructive pulmonary disease patients, and the federated model achieved an accuracy of 87%. Discussion: The paper demonstrated a practical application of privacy-preserving federated ML among five distinct healthcare entities, highlighting the value of FAIR health data in machine learning when utilized in a federated manner that ensures privacy protection without sharing data. Conclusion: This solution effectively leverages FAIR datasets from multiple healthcare organizations for federated ML while safeguarding sensitive health datasets, meeting legislative privacy and security requirements.
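The idea of building a global model from local models without moving data can be illustrated with a minimal sketch. The abstract does not say how the local models are combined, so the sample-weighted averaging of logistic-regression weights below is an assumption chosen for illustration, not the authors' algorithm. The function names, the learning-rate and epoch settings, and the synthetic five-site data are all hypothetical.

```python
import numpy as np

def train_local(X, y, w, lr=0.1, epochs=50):
    """Logistic-regression gradient descent on one site's private data."""
    w = w.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probabilities
        w -= lr * (X.T @ (p - y)) / len(y)   # gradient step on the log-loss
    return w

def weighted_average(updates):
    """Combine (weights, n_samples) pairs into one sample-weighted model."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total

def federated_round(sites, w_global):
    """One round: each site trains locally; only weights and counts are shared."""
    updates = [(train_local(X, y, w_global), len(y)) for X, y in sites]
    return weighted_average(updates)

# Synthetic stand-in for five organizations' datasets (assumption, not real data).
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0, 0.5])
sites = []
for n in (80, 120, 60, 200, 100):
    X = rng.normal(size=(n, 3))
    y = (X @ true_w + rng.normal(scale=0.5, size=n) > 0).astype(float)
    sites.append((X, y))

w = np.zeros(3)
for _ in range(10):            # ten communication rounds
    w = federated_round(sites, w)
```

In this sketch the raw feature matrices never leave the loop body that stands in for each site; only the weight vector and the sample count are passed to the aggregation step.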
Affiliation(s)
- A. Anil Sinaci
  - SRDC Software Research Development and Consultancy Corporation, Ankara, Turkey
- Mert Gencturk
  - SRDC Software Research Development and Consultancy Corporation, Ankara, Turkey
  - Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
- Celia Alvarez-Romero
  - Group of Research and Innovation in Biomedical Informatics, Biomedical Engineering and Health Economy, Institute of Biomedicine of Seville, IBiS / Virgen del Rocío University Hospital / CSIC / University of Seville, Seville, Spain
- Alicia Martinez-Garcia
  - Group of Research and Innovation in Biomedical Informatics, Biomedical Engineering and Health Economy, Institute of Biomedicine of Seville, IBiS / Virgen del Rocío University Hospital / CSIC / University of Seville, Seville, Spain
- Carlos Luis Parra-Calderon
  - Group of Research and Innovation in Biomedical Informatics, Biomedical Engineering and Health Economy, Institute of Biomedicine of Seville, IBiS / Virgen del Rocío University Hospital / CSIC / University of Seville, Seville, Spain
2
Miyagi Y, Iwashima S. Prediction Models for Intravenous Immunoglobulin Non-Responders of Kawasaki Disease Using Machine Learning. Clin Drug Investig 2024:10.1007/s40261-024-01373-z. [PMID: 38869717 DOI: 10.1007/s40261-024-01373-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/27/2024] [Indexed: 06/14/2024]
Abstract
BACKGROUND AND OBJECTIVE: Intravenous immunoglobulin (IVIG) is a prominent therapeutic agent for Kawasaki disease (KD) that significantly reduces the incidence of coronary artery anomalies. Various methodologies, including machine learning, have been employed to develop IVIG non-responder prediction models; however, their validation and reproducibility remain unverified. This study aimed to develop a predictive scoring system for identifying IVIG non-responders and rigorously test the accuracy and reliability of this system. METHODS: The study included an exposure group of 228 IVIG non-responders and a control group of 997 IVIG responders. Subsequently, a predictive machine learning model was constructed. The Shizuoka score, including variables such as the "initial treatment date" (cutoff: < 4 days), sodium level (cutoff: < 133 mEq/L), total bilirubin level (cutoff: ≥ 0.5 mg/dL), and neutrophil-to-lymphocyte ratio (cutoff: ≥ 2.6), was established. Patients meeting two or more of these criteria were grouped as high-risk IVIG non-responders. Using the Shizuoka score to stratify IVIG responders, propensity score matching was used to analyze 85 patients each for IVIG and IVIG-added prednisolone treatment in the high-risk group. In the IVIG plus prednisolone group, the IVIG non-responder count significantly decreased (p < 0.001), with an odds ratio of 0.192 (95% confidence interval 0.078-0.441). CONCLUSIONS: Intravenous immunoglobulin non-responders were predicted using machine learning models and validated using propensity score matching. The initiation of initial IVIG-added prednisolone treatment in the high-risk group identified by the Shizuoka score, crafted using machine learning models, appears useful for predicting IVIG non-responders.
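The scoring rule described in the abstract (four cutoffs, high risk when two or more are met) can be written as a short sketch. The cutoffs are taken from the abstract; the field names are hypothetical, equal weighting of the four criteria is assumed, and "initial treatment date" is interpreted here as the day of illness on which treatment began, which the abstract does not spell out.

```python
from dataclasses import dataclass

@dataclass
class Patient:
    treatment_day: float          # day of illness at initial treatment (assumed meaning)
    sodium_meq_l: float           # serum sodium, mEq/L
    total_bilirubin_mg_dl: float  # total bilirubin, mg/dL
    nlr: float                    # neutrophil-to-lymphocyte ratio

def criteria_met(p: Patient) -> int:
    """Count how many of the four cutoffs quoted in the abstract are met."""
    return sum([
        p.treatment_day < 4,
        p.sodium_meq_l < 133,
        p.total_bilirubin_mg_dl >= 0.5,
        p.nlr >= 2.6,
    ])

def is_high_risk(p: Patient) -> bool:
    """High-risk IVIG non-responder group: two or more criteria met."""
    return criteria_met(p) >= 2

early_low_sodium = Patient(treatment_day=3, sodium_meq_l=130,
                           total_bilirubin_mg_dl=0.4, nlr=2.0)
bilirubin_only = Patient(treatment_day=5, sodium_meq_l=135,
                         total_bilirubin_mg_dl=0.5, nlr=2.0)
```

This is an illustration of the thresholding logic only and is not intended for clinical use.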
Affiliation(s)
- Yoshifumi Miyagi
  - Department of Pediatrics, Haibara Hospital, Makinohara City, Shizuoka, Japan
- Satoru Iwashima
  - The Shizuoka Kawasaki Disease Study Group, Shizuoka, Japan
  - Department of Pediatrics, Chutoen General Medical Center, 1-1 Shobugaike, Kakegawa, Shizuoka, 436-0040, Japan
3
Huang J, Yang J, Qi H, Xu M, Xu X, Zhu Y. Prediction models for amputation after diabetic foot: systematic review and critical appraisal. Diabetol Metab Syndr 2024; 16:126. [PMID: 38858732 PMCID: PMC11163763 DOI: 10.1186/s13098-024-01360-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 05/24/2024] [Indexed: 06/12/2024] Open
Abstract
BACKGROUND Numerous studies have developed or validated prediction models aimed at estimating the likelihood of amputation in diabetic foot (DF) patients. However, the quality and applicability of these models in clinical practice and future research remain uncertain. This study conducts a systematic review and assessment of the risk of bias and applicability of amputation prediction models among individuals with DF. METHODS A comprehensive search was conducted across multiple databases, including PubMed, Web of Science, EBSCO CINAHL Plus, Embase, Cochrane Library, China National Knowledge Infrastructure (CNKI), Wanfang, Chinese Biomedical Literature Database (CBM), and Weipu (VIP) from their inception to December 24, 2023. Two investigators independently screened the literature and extracted data using the checklist for critical appraisal and data extraction for systematic reviews of prediction modeling studies. The Prediction Model Risk of Bias Assessment Tool (PROBAST) checklist was employed to evaluate both the risk of bias and applicability. RESULTS A total of 20 studies were included in this analysis, comprising 17 development studies and three validation studies, encompassing 20 prediction models and 11 classification systems. The incidence of amputation in patients with DF ranged from 5.9 to 58.5%. Machine learning-based methods were employed in more than half of the studies. The reported area under the curve (AUC) varied from 0.560 to 0.939. Independent predictors consistently identified by multivariate models included age, gender, HbA1c, hemoglobin, white blood cell count, low-density lipoprotein cholesterol, diabetes duration, and Wagner's Classification. All studies were found to exhibit a high risk of bias, primarily attributed to inadequate handling of outcome events and missing data, lack of model performance assessment, and overfitting. 
CONCLUSIONS The assessment using PROBAST revealed a notable risk of bias in the existing prediction models for amputation in patients with DF. It is imperative for future studies to concentrate on enhancing the robustness of current prediction models or constructing new models with stringent methodologies.
Affiliation(s)
- Jingying Huang
  - Postanesthesia Care Unit, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Jin Yang
  - Nursing Department, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Haiou Qi
  - Nursing Department, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Miaomiao Xu
  - Orthopedics Department, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Xin Xu
  - Operating Room, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Yiting Zhu
  - Postanesthesia Care Unit, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
4
Jia Y, Cui N, Jia T, Song J. Prognostic models for patients suffering a heart failure with a preserved ejection fraction: a systematic review. ESC Heart Fail 2024; 11:1341-1351. [PMID: 38318693 PMCID: PMC11098651 DOI: 10.1002/ehf2.14696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Revised: 01/02/2024] [Accepted: 01/09/2024] [Indexed: 02/07/2024] Open
Abstract
The purpose of this study was to systematically review the development, performance, and applicability of prognostic models developed for predicting poor events in patients with heart failure with preserved ejection fraction (HFpEF). Databases including Embase, PubMed, Web of Science Core Collection, the Cochrane Library, China National Knowledge Infrastructure, Wan Fang, Wei Pu, and China Biological Medicine were queried from their respective dates of inception to 1 June 2023, to examine multivariate models for prognostic prediction in HFpEF. Both forward and backward citations of all studies were included in our analysis. Two researchers independently used the Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS) checklist to extract data and assessed the quality of the models using the Prediction Model Risk of Bias Assessment Tool (PROBAST). Among the 6897 studies screened, 16 studies derived and/or validated a total of 39 prognostic models. The sample sizes for model development, internal validation, and external validation ranged from 119 to 5988, 152 to 1000, and 30 to 5957, respectively. The most frequently employed modelling technique was Cox proportional hazards regression. Six studies (37.50%) conducted internal validation of models; bootstrap and k-fold cross-validation were the commonly used methods for internal validation. Ten of these models (25.64%) were validated externally, with the reported c-statistic in the external validation set ranging from 0.70 to 0.96, while the remaining models await external validation. The MEDIA echo score and the I-PRESERVE sudden cardiac death prediction model have been externally validated using multiple cohorts, and the results consistently show good predictive performance. The most frequently used predictors identified among the models were age, N-terminal pro-brain natriuretic peptide, ejection fraction, albumin, and hospital stay in the last 5 months owing to heart failure. The predictor and outcome domains of all studies were at low risk of bias, but all prognostic models were at high or unclear risk of bias owing to underreporting in the analysis domain. None of the studies evaluated the clinical utility of the prognostic models. Models for predicting prognostic outcomes in patients with HFpEF showed good discriminatory ability, but their utility and generalization remain uncertain due to the risk of bias, differences in predictors between models, and the lack of clinical application studies. Future studies should improve the methodological quality of model development and conduct external validation of models.
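For readers unfamiliar with the c-statistic reported above, the following sketch computes it for a binary outcome as the fraction of (event, non-event) pairs in which the event case received the higher predicted risk, counting ties as one half. This is a simplification: the reviewed models are mostly Cox regressions, for which a censoring-aware concordance index would be used instead. The function name and the example values are hypothetical.

```python
def c_statistic(risks, outcomes):
    """Concordance for a binary outcome: P(risk of event case > risk of non-event case)."""
    events = [r for r, o in zip(risks, outcomes) if o == 1]
    non_events = [r for r, o in zip(risks, outcomes) if o == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # correctly ordered pair
            elif e == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))

# Hypothetical predicted risks and observed outcomes (1 = event).
example = c_statistic([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect discrimination, which is the scale on which the 0.70 to 0.96 range above should be read.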
Affiliation(s)
- Ying‐Ying Jia
  - Department of Nursing, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
  - Department of Nursing, Zhejiang University School of Medicine, Hangzhou, China
- Nian‐Qi Cui
  - School of Nursing, Kunming Medical University, Kunming, China
- Ting‐Ting Jia
  - Department of General Surgery, Gansu Provincial People's Hospital, Cadre Ward, Lanzhou, China
- Jian‐Ping Song
  - Department of Nursing, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
5
Alrawashdeh A, Alqahtani S, Alkhatib ZI, Kheirallah K, Melhem NY, Alwidyan M, Al-Dekah AM, Alshammari T, Nehme Z. Applications and Performance of Machine Learning Algorithms in Emergency Medical Services: A Scoping Review. Prehosp Disaster Med 2024:1-11. [PMID: 38757150 DOI: 10.1017/s1049023x24000414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2024]
Abstract
OBJECTIVE The aim of this study was to summarize the literature on the applications of machine learning (ML) and their performance in Emergency Medical Services (EMS). METHODS Four relevant electronic databases were searched (from inception through January 2024) for all original studies that employed EMS-guided ML algorithms to enhance the clinical and operational performance of EMS. Two reviewers screened the retrieved studies and extracted relevant data from the included studies. The characteristics of included studies, employed ML algorithms, and their performance were quantitatively described across primary domains and subdomains. RESULTS This review included a total of 164 studies published from 2005 through 2024. Of those, 125 were clinical domain focused and 39 were operational. The characteristics of ML algorithms such as sample size, number and type of input features, and performance varied between and within domains and subdomains of applications. Clinical applications of ML algorithms involved triage or diagnosis classification (n = 62), treatment prediction (n = 12), or clinical outcome prediction (n = 50), mainly for out-of-hospital cardiac arrest/OHCA (n = 62), cardiovascular diseases/CVDs (n = 19), and trauma (n = 24). The performance of these ML algorithms varied, with a median area under the receiver operating characteristic curve (AUC) of 85.6%, accuracy of 88.1%, sensitivity of 86.05%, and specificity of 86.5%. Within the operational studies, the operational task of most ML algorithms was ambulance allocation (n = 21), followed by ambulance detection (n = 5), ambulance deployment (n = 5), route optimization (n = 5), and quality assurance (n = 3). The performance of all operational ML algorithms varied and had a median AUC of 96.1%, accuracy of 90.0%, sensitivity of 94.4%, and specificity of 87.7%. Generally, neural network and ensemble algorithms, to some degree, out-performed other ML algorithms.
CONCLUSION Triaging and managing different prehospital medical conditions and augmenting ambulance performance can be improved by ML algorithms. Future reports should focus on a specific clinical condition or operational task to improve the precision of the performance metrics of ML models.
Affiliation(s)
- Ahmad Alrawashdeh
  - Department of Allied Medical Sciences, Jordan University of Science and Technology, Irbid, Jordan
- Saeed Alqahtani
  - Department of Emergency Medical Services, Prince Sultan Military College for Health Sciences, Dhahran, Saudi Arabia
- Zaid I Alkhatib
  - Department of Allied Medical Sciences, Jordan University of Science and Technology, Irbid, Jordan
- Khalid Kheirallah
  - Department of Public Health and Family Medicine, Faculty of Medicine, Jordan University of Science and Technology, Irbid, Jordan
- Nebras Y Melhem
  - Department of Anatomy, Physiology and Biochemistry, Faculty of Medicine, The Hashemite University, Zarqa, Jordan
- Mahmoud Alwidyan
  - Department of Allied Medical Sciences, Jordan University of Science and Technology, Irbid, Jordan
- Talal Alshammari
  - Department of Emergency Medical Care, College of Applied Medical Sciences, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
- Ziad Nehme
  - Ambulance Victoria, Doncaster, Victoria, Australia
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
6
Cecchi R, Haja TM, Calabrò F, Fasterholdt I, Rasmussen BSB. Artificial intelligence in healthcare: why not apply the medico-legal method starting with the Collingridge dilemma? Int J Legal Med 2024; 138:1173-1178. [PMID: 38172326 DOI: 10.1007/s00414-023-03152-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 12/15/2023] [Indexed: 01/05/2024]
Abstract
Technology has greatly influenced and radically changed human life, from communication to creativity and from productivity to entertainment. The authors, starting from considerations concerning the implementation of new technologies with a strong impact on people's everyday lives, take up Collingridge's dilemma and relate it to the application of AI in healthcare. Collingridge's dilemma is an ethical and epistemological problem concerning the relationship between technology and society which involves two approaches. The proactive approach and socio-technological experimentation taken into account in the dilemma are discussed, the former taking health technology assessment (HTA) processes as a reference and the latter the AI studies conducted so far. As a possible prevention of the critical issues raised, the use of the medico-legal method is proposed, which classically lies between the prevention of possible adverse events and the reconstruction of how these occurred. The authors believe that this methodology, adopted as a European guideline in the medico-legal field for the assessment of medical liability, can be adapted to AI applied to the healthcare scenario and used for the assessment of liability issues. The topic deserves further investigation and will certainly be taken into consideration as a possible key to future scenarios.
Affiliation(s)
- Rossana Cecchi
  - Laboratory of Forensic Medicine, Department of Medicine and Surgery, University of Parma, Parma, Italy
- Tudor Mihai Haja
  - Laboratory of Forensic Medicine, Department of Medicine and Surgery, University of Parma, Parma, Italy
- Francesco Calabrò
  - Laboratory of Forensic Medicine, Department of Medicine and Surgery, University of Parma, Parma, Italy
- Iben Fasterholdt
  - CIMT - Centre for Innovative Medical Technology, Odense University Hospital, Odense, Denmark
  - Program for Health System and Technology Evaluation, Toronto General Hospital Research Institute, University Health Network, Toronto, Canada
- Benjamin S B Rasmussen
  - Department of Radiology & CAI-X - Centre for Clinical Artificial Intelligence, Odense University Hospital, Odense, Denmark
7
Qian R, Zhuang J, Xie J, Cheng H, Ou H, Lu X, Ouyang Z. Predictive value of machine learning for the severity of acute pancreatitis: A systematic review and meta-analysis. Heliyon 2024; 10:e29603. [PMID: 38655348 PMCID: PMC11035062 DOI: 10.1016/j.heliyon.2024.e29603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 04/02/2024] [Accepted: 04/10/2024] [Indexed: 04/26/2024] Open
Abstract
Background Predicting the severity of acute pancreatitis (AP) early poses a challenge in clinical practice. While there are well-established clinical scoring tools, their actual predictive performance remains uncertain. Various studies have explored the application of machine-learning methods for early AP prediction. However, a more comprehensive evidence-based assessment is needed to determine their predictive accuracy. Hence, this systematic review and meta-analysis aimed to evaluate the predictive accuracy of machine learning in assessing the severity of AP. Methods PubMed, EMBASE, Cochrane Library, and Web of Science were systematically searched until December 5, 2023. The risk of bias in eligible studies was assessed using the Prediction Model Risk of Bias Assessment Tool (PROBAST). Subgroup analyses, based on different machine learning types, were performed. Additionally, the predictive accuracy of mainstream scoring tools was summarized. Results This systematic review ultimately included 33 original studies. The pooled c-index was 0.87 (95 % CI: 0.84-0.89) in the training set and 0.88 (95 % CI: 0.86-0.90) in the validation set. The sensitivity in the training set was 0.81 (95 % CI: 0.77-0.84), and in the validation set, it was 0.79 (95 % CI: 0.71-0.85). The specificity in the training set was 0.84 (95 % CI: 0.78-0.89), and in the validation set, it was 0.90 (95 % CI: 0.86-0.93). The primary model incorporated was logistic regression; however, its predictive accuracy was found to be inferior to that of neural networks, random forests, and XGBoost. The pooled c-indices of the APACHE II, BISAP, and Ranson scores were 0.74 (95 % CI: 0.68-0.80), 0.77 (95 % CI: 0.70-0.85), and 0.74 (95 % CI: 0.68-0.79), respectively. Conclusions Machine learning demonstrates excellent accuracy in predicting the severity of AP, providing a reference for updating or developing a straightforward clinical prediction tool.
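The sensitivity and specificity figures quoted above are per-study quantities that the meta-analysis then pools. As a reminder of how a single study's values are obtained, the sketch below derives both from true and predicted labels. The function name and the example labels are hypothetical, and the pooling step itself (for which the abstract gives no method) is not shown.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = severe AP."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # severe, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # severe, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-severe, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-severe, flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels for eight patients.
sens, spec = sensitivity_specificity(
    [1, 1, 1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 1, 0, 1],
)
```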
Affiliation(s)
- Rui Qian
  - Department of Gastroenterology, Shenzhen Bao'an Chinese Medicine Hospital, Guangzhou University of Chinese Medicine, Shenzhen 518000, China
- Jiamei Zhuang
  - The Fourth Clinical Medical College of Guangzhou University of Chinese Medicine, Shenzhen, 518033, China
- Jianjun Xie
  - Department of Gastroenterology, Shenzhen Bao'an Chinese Medicine Hospital, Guangzhou University of Chinese Medicine, Shenzhen 518000, China
- Honghui Cheng
  - Department of Gastroenterology, Shenzhen Bao'an Chinese Medicine Hospital, Guangzhou University of Chinese Medicine, Shenzhen 518000, China
- Haiya Ou
  - Department of Gastroenterology, Shenzhen Bao'an Chinese Medicine Hospital, Guangzhou University of Chinese Medicine, Shenzhen 518000, China
- Xiang Lu
  - Department of Pulmonary and Critical Care Medicine, Shenzhen Bao'an Chinese Medicine Hospital, Guangzhou University of Chinese Medicine, Shenzhen 518000, China
- Zichen Ouyang
  - Department of Hepatology, Shenzhen Bao'an Chinese Medicine Hospital, Guangzhou University of Chinese Medicine, Shenzhen 518000, China
8
Ahmed MS, Hasan T, Islam S, Ahmed N. Investigating Rhythmicity in App Usage to Predict Depressive Symptoms: Protocol for Personalized Framework Development and Validation Through a Countrywide Study. JMIR Res Protoc 2024; 13:e51540. [PMID: 38657238 PMCID: PMC11079771 DOI: 10.2196/51540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 12/27/2023] [Accepted: 01/11/2024] [Indexed: 04/26/2024] Open
Abstract
BACKGROUND Understanding a student's depressive symptoms could facilitate significantly more precise diagnosis and treatment. However, few studies have focused on depressive symptom prediction through unobtrusive systems, and these studies are limited by small sample sizes, low performance, and the requirement for higher resources. In addition, research has not explored whether statistically significant rhythms based on different app usage behavioral markers (eg, app usage sessions) exist that could be useful in finding subtle differences to predict with higher accuracy like the models based on rhythms of physiological data. OBJECTIVE The main objective of this study is to explore whether there exist statistically significant rhythms in resource-insensitive app usage behavioral markers and predict depressive symptoms through these marker-based rhythmic features. Another objective of this study is to understand whether there is a potential link between rhythmic features and depressive symptoms. METHODS Through a countrywide study, we collected 2952 students' raw app usage behavioral data and responses to the 9 depressive symptoms in the 9-item Patient Health Questionnaire (PHQ-9). The behavioral data were retrieved through our developed app, which was previously used in our pilot studies in Bangladesh on different research problems. To explore whether there is a rhythm based on app usage data, we will conduct a zero-amplitude test. In addition, we will develop a cosinor model for each participant to extract rhythmic parameters (eg, acrophase). In addition, to obtain a comprehensive picture of the rhythms, we will explore nonparametric rhythmic features (eg, interdaily stability). Furthermore, we will conduct regression analysis to understand the association of rhythmic features with depressive symptoms. Finally, we will develop a personalized multitask learning (MTL) framework to predict symptoms through rhythmic features. 
RESULTS After applying inclusion criteria (eg, having app usage data of at least 2 days to explore rhythmicity), we kept the data of 2902 (98.31%) students for analysis, with 24.48 million app usage events, and 7 days' app usage of 2849 (98.17%) students. The students are from all 8 divisions of Bangladesh, both public and private universities (19 different universities and 52 different departments). We are analyzing the data and will publish the findings in a peer-reviewed publication. CONCLUSIONS Having an in-depth understanding of app usage rhythms and their connection with depressive symptoms through a countrywide study can significantly help health care professionals and researchers better understand depressed students and may create possibilities for using app usage-based rhythms for intervention. In addition, the MTL framework based on app usage rhythmic features may more accurately predict depressive symptoms due to the rhythms' capability to find subtle differences. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) DERR1-10.2196/51540.
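The cosinor step mentioned in the methods can be sketched as a linear least-squares fit. A single-component cosinor models a signal as y(t) = M + A cos(2πt/τ + φ), where M is the mesor, A the amplitude, and φ the acrophase; expanding the cosine gives a model that is linear in cos(2πt/τ) and sin(2πt/τ). The 24-hour period, the hourly sampling, and the synthetic usage series below are assumptions for illustration; the protocol does not specify them, and the zero-amplitude test is not shown.

```python
import numpy as np

def fit_cosinor(t_hours, y, period=24.0):
    """Least-squares fit of y = M + A*cos(2*pi*t/period + phi); returns (M, A, phi)."""
    t = np.asarray(t_hours, dtype=float)
    omega = 2.0 * np.pi / period
    # A*cos(wt + phi) = beta*cos(wt) + gamma*sin(wt), with beta = A*cos(phi), gamma = -A*sin(phi)
    X = np.column_stack([np.ones_like(t), np.cos(omega * t), np.sin(omega * t)])
    (mesor, beta, gamma), *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    amplitude = np.hypot(beta, gamma)
    acrophase = np.arctan2(-gamma, beta)   # radians; peak occurs where wt + phi = 0
    return mesor, amplitude, acrophase

# Synthetic hourly app-usage signal over two days, peaking at 06:00 (hypothetical data).
t = np.arange(0.0, 48.0, 1.0)
y = 10.0 + 3.0 * np.cos(2.0 * np.pi * t / 24.0 - np.pi / 2.0)

mesor, amplitude, acrophase = fit_cosinor(t, y)
peak_hour = (-acrophase / (2.0 * np.pi / 24.0)) % 24.0
```

Fitting one such model per participant yields the rhythmic parameters (mesor, amplitude, acrophase) that the protocol proposes to use as features.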
Affiliation(s)
- Md Sabbir Ahmed
  - Design Inclusion and Access Lab, North South University, Dhaka, Bangladesh
- Tanvir Hasan
  - Design Inclusion and Access Lab, North South University, Dhaka, Bangladesh
- Salekul Islam
  - Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
- Nova Ahmed
  - Design Inclusion and Access Lab, North South University, Dhaka, Bangladesh
9
Collins GS. Making the black box more transparent: improving the reporting of artificial intelligence studies in healthcare. BMJ 2024; 385:q832. [PMID: 38626954 DOI: 10.1136/bmj.q832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]
Affiliation(s)
- Gary S Collins
  - Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
10
Collins GS, Moons KGM, Dhiman P, Riley RD, Beam AL, Van Calster B, Ghassemi M, Liu X, Reitsma JB, van Smeden M, Boulesteix AL, Camaradou JC, Celi LA, Denaxas S, Denniston AK, Glocker B, Golub RM, Harvey H, Heinze G, Hoffman MM, Kengne AP, Lam E, Lee N, Loder EW, Maier-Hein L, Mateen BA, McCradden MD, Oakden-Rayner L, Ordish J, Parnell R, Rose S, Singh K, Wynants L, Logullo P. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 2024; 385:e078378. [PMID: 38626948 PMCID: PMC11019967 DOI: 10.1136/bmj-2023-078378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/17/2024] [Indexed: 04/19/2024]
Affiliation(s)
- Gary S Collins
- Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
| | - Karel G M Moons
- Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
| | - Paula Dhiman
- Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
| | - Richard D Riley
- Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK
| | - Andrew L Beam
- Department of Epidemiology, Harvard T H Chan School of Public Health, Boston, MA, USA
| | - Ben Van Calster
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium
- Department of Biomedical Data Science, Leiden University Medical Centre, Leiden, Netherlands
| | - Marzyeh Ghassemi
- Department of Electrical Engineering and Computer Science, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Xiaoxuan Liu
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
| | - Johannes B Reitsma
- Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
| | - Maarten van Smeden
- Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
| | - Anne-Laure Boulesteix
- Institute for Medical Information Processing, Biometry and Epidemiology, Faculty of Medicine, Ludwig-Maximilians-University of Munich and Munich Centre of Machine Learning, Germany
| | - Jennifer Catherine Camaradou
- Patient representative, Health Data Research UK patient and public involvement and engagement group
- Patient representative, University of East Anglia, Faculty of Health Sciences, Norwich Research Park, Norwich, UK
| | - Leo Anthony Celi
- Beth Israel Deaconess Medical Center, Boston, MA, USA
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Biostatistics, Harvard T H Chan School of Public Health, Boston, MA, USA
| | - Spiros Denaxas
- Institute of Health Informatics, University College London, London, UK
- British Heart Foundation Data Science Centre, London, UK
| | - Alastair K Denniston
- National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
| | - Ben Glocker
- Department of Computing, Imperial College London, London, UK
| | - Robert M Golub
- Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | | | - Georg Heinze
- Section for Clinical Biometrics, Centre for Medical Data Science, Medical University of Vienna, Vienna, Austria
| | - Michael M Hoffman
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Vector Institute for Artificial Intelligence, Toronto, ON, Canada
| | | | - Emily Lam
- Patient representative, Health Data Research UK patient and public involvement and engagement group
| | - Naomi Lee
- National Institute for Health and Care Excellence, London, UK
| | - Elizabeth W Loder
- The BMJ, London, UK
- Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Lena Maier-Hein
- Department of Intelligent Medical Systems, German Cancer Research Centre, Heidelberg, Germany
| | - Bilal A Mateen
- Institute of Health Informatics, University College London, London, UK
- Wellcome Trust, London, UK
- Alan Turing Institute, London, UK
| | - Melissa D McCradden
- Department of Bioethics, Hospital for Sick Children, Toronto, ON, Canada
- Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
| | - Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia
| | - Johan Ordish
- Medicines and Healthcare products Regulatory Agency, London, UK
| | - Richard Parnell
- Patient representative, Health Data Research UK patient and public involvement and engagement group
| | - Sherri Rose
- Department of Health Policy and Center for Health Policy, Stanford University, Stanford, CA, USA
| | - Karandeep Singh
- Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
| | - Laure Wynants
- Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
| | - Patricia Logullo
- Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
|
11
|
Armoundas AA, Narayan SM, Arnett DK, Spector-Bagdady K, Bennett DA, Celi LA, Friedman PA, Gollob MH, Hall JL, Kwitek AE, Lett E, Menon BK, Sheehan KA, Al-Zaiti SS. Use of Artificial Intelligence in Improving Outcomes in Heart Disease: A Scientific Statement From the American Heart Association. Circulation 2024; 149:e1028-e1050. [PMID: 38415358 PMCID: PMC11042786 DOI: 10.1161/cir.0000000000001201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/29/2024]
Abstract
A major focus of academia, industry, and global governmental agencies is to develop and apply artificial intelligence and other advanced analytical tools to transform health care delivery. The American Heart Association supports the creation of tools and services that would further the science and practice of precision medicine by enabling more precise approaches to cardiovascular and stroke research, prevention, and care of individuals and populations. Nevertheless, several challenges exist, and few artificial intelligence tools have been shown to improve cardiovascular and stroke care sufficiently to be widely adopted. This scientific statement outlines the current state of the art on the use of artificial intelligence algorithms and data science in the diagnosis, classification, and treatment of cardiovascular disease. It also sets out to advance this mission, focusing on how digital tools and, in particular, artificial intelligence may provide clinical and mechanistic insights, address bias in clinical studies, and facilitate education and implementation science to improve cardiovascular and stroke outcomes. Last, a key objective of this scientific statement is to further the field by identifying best practices, gaps, and challenges for interested stakeholders.
|
12
|
Huang Z, Denti P, Mistry H, Kloprogge F. Machine Learning and Artificial Intelligence in PK-PD Modeling: Fad, Friend, or Foe? Clin Pharmacol Ther 2024; 115:652-654. [PMID: 38179832 PMCID: PMC11146679 DOI: 10.1002/cpt.3165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 12/22/2023] [Indexed: 01/06/2024]
Affiliation(s)
- Zhonghui Huang
- Great Ormond Street Institute of Child Health, University College London, London, UK
| | - Paolo Denti
- Division of Clinical Pharmacology, Department of Medicine, University of Cape Town, Cape Town, South Africa
| | - Hitesh Mistry
- Division of Pharmacy, University of Manchester, Manchester, UK
| | - Frank Kloprogge
- Institute for Global Health, University College London, London, UK
|
13
|
Gallardo-Pizarro A, Peyrony O, Chumbita M, Monzo-Gallo P, Aiello TF, Teijon-Lumbreras C, Gras E, Mensa J, Soriano A, Garcia-Vidal C. Improving management of febrile neutropenia in oncology patients: the role of artificial intelligence and machine learning. Expert Rev Anti Infect Ther 2024; 22:179-187. [PMID: 38457198 DOI: 10.1080/14787210.2024.2322445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 02/20/2024] [Indexed: 03/09/2024]
Abstract
INTRODUCTION Artificial intelligence (AI) and machine learning (ML) have the potential to revolutionize the management of febrile neutropenia (FN) and drive progress toward personalized medicine. AREAS COVERED In this review, we detail how the collection of a large number of high-quality data can be used to conduct precise mathematical studies with ML and AI. We explain the foundations of these techniques, covering the fundamentals of supervised and unsupervised learning, as well as the most important challenges, e.g. data quality, 'black box' model interpretation and overfitting. To conclude, we provide detailed examples of how AI and ML have been used to enhance predictions of chemotherapy-induced FN, detection of bloodstream infections (BSIs) and multidrug-resistant (MDR) bacteria, and anticipation of severe complications and mortality. EXPERT OPINION There is promising potential of implementing accurate AI and ML models whilst managing FN. However, their integration as viable clinical tools poses challenges, including technical and implementation barriers. Improving global accessibility, fostering interdisciplinary collaboration, and addressing ethical and security considerations are essential. By overcoming these challenges, we could transform personalized care for patients with FN.
Affiliation(s)
| | - Olivier Peyrony
- Hospital Clinic of Barcelona-IDIBAPS, University of Barcelona, Barcelona, Spain
| | - Mariana Chumbita
- Hospital Clinic of Barcelona-IDIBAPS, University of Barcelona, Barcelona, Spain
| | | | | | | | - Emmanuelle Gras
- Hospital Clinic of Barcelona-IDIBAPS, University of Barcelona, Barcelona, Spain
| | - Josep Mensa
- Hospital Clinic of Barcelona-IDIBAPS, University of Barcelona, Barcelona, Spain
| | - Alex Soriano
- Hospital Clinic of Barcelona-IDIBAPS, University of Barcelona, Barcelona, Spain
|
14
|
Gray M, Samala R, Liu Q, Skiles D, Xu J, Tong W, Wu L. Measurement and Mitigation of Bias in Artificial Intelligence: A Narrative Literature Review for Regulatory Science. Clin Pharmacol Ther 2024; 115:687-697. [PMID: 38018360 DOI: 10.1002/cpt.3117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 11/21/2023] [Indexed: 11/30/2023]
Abstract
Artificial intelligence (AI) is increasingly being used in decision making across various industries, including the public health arena. Bias in any decision-making process can significantly skew outcomes, and AI systems have been shown to exhibit biases at times. The potential for AI systems to perpetuate and even amplify biases is a growing concern. Bias, as used in this paper, refers to the tendency toward a particular characteristic or behavior, and thus, a biased AI system is one that shows biased associations between entities. In this literature review, we examine the current state of research on AI bias, including its sources, as well as the methods for measuring, benchmarking, and mitigating it. We also examine the biases and methods of mitigation specifically relevant to the healthcare field and offer a perspective on bias measurement and mitigation in regulatory science decision making.
Affiliation(s)
- Magnus Gray
- Division of Bioinformatics & Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas, USA
| | - Ravi Samala
- Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, US Food and Drug Administration Center for Devices and Radiological Health, Silver Spring, Maryland, USA
| | - Qi Liu
- Office of Clinical Pharmacology, Office of Translational Sciences, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland, USA
| | - Denny Skiles
- Office of Management, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas, USA
| | - Joshua Xu
- Division of Bioinformatics & Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas, USA
| | - Weida Tong
- Division of Bioinformatics & Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas, USA
| | - Leihong Wu
- Division of Bioinformatics & Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas, USA
|
15
|
Kwong JCC, Nickel GC, Wang SCY, Kvedar JC. Integrating artificial intelligence into healthcare systems: more than just the algorithm. NPJ Digit Med 2024; 7:52. [PMID: 38429418 PMCID: PMC10907626 DOI: 10.1038/s41746-024-01066-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Accepted: 02/22/2024] [Indexed: 03/03/2024] Open
Affiliation(s)
- Jethro C C Kwong
- Division of Urology, Department of Surgery, University of Toronto, Toronto, ON, Canada.
- Temerty Centre for AI Research and Education in Medicine, University of Toronto, Toronto, ON, Canada.
|
16
|
Talimtzi P, Ntolkeras A, Kostopoulos G, Bougioukas KI, Pagkalidou E, Ouranidis A, Pataka A, Haidich AB. The reporting completeness and transparency of systematic reviews of prognostic prediction models for COVID-19 was poor: a methodological overview of systematic reviews. J Clin Epidemiol 2024; 167:111264. [PMID: 38266742 DOI: 10.1016/j.jclinepi.2024.111264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Revised: 01/08/2024] [Accepted: 01/13/2024] [Indexed: 01/26/2024]
Abstract
OBJECTIVES To conduct a methodological overview of reviews to evaluate the reporting completeness and transparency of systematic reviews (SRs) of prognostic prediction models (PPMs) for COVID-19. STUDY DESIGN AND SETTING MEDLINE, Scopus, Cochrane Database of Systematic Reviews, and Epistemonikos (epistemonikos.org) were searched for SRs of PPMs for COVID-19 until December 31, 2022. The risk of bias in systematic reviews tool was used to assess the risk of bias. The protocol for this overview was uploaded in the Open Science Framework (https://osf.io/7y94c). RESULTS Ten SRs were retrieved; none of them synthesized the results in a meta-analysis. For most of the studies, there was absence of a predefined protocol and missing information on study selection, data collection process, and reporting of primary studies and models included, while only one SR had its data publicly available. In addition, for the majority of the SRs, the overall risk of bias was judged as being high. The overall corrected covered area was 6.3% showing a small amount of overlapping among the SRs. CONCLUSION The reporting completeness and transparency of SRs of PPMs for COVID-19 was poor. Guidance is urgently required, with increased awareness and education of minimum reporting standards and quality criteria. Specific focus is needed in predefined protocol, information on study selection and data collection process, and in the reporting of findings to improve the quality of SRs of PPMs for COVID-19.
Affiliation(s)
- Persefoni Talimtzi
- Department of Hygiene, Social-Preventive Medicine and Medical Statistics, School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki, University Campus, 54124, Thessaloniki, Greece
| | - Antonios Ntolkeras
- School of Biology, Aristotle University of Thessaloniki, University Campus, 54636, Thessaloniki, Greece
| | | | - Konstantinos I Bougioukas
- Department of Hygiene, Social-Preventive Medicine and Medical Statistics, School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki, University Campus, 54124, Thessaloniki, Greece
| | - Eirini Pagkalidou
- Department of Hygiene, Social-Preventive Medicine and Medical Statistics, School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki, University Campus, 54124, Thessaloniki, Greece
| | - Andreas Ouranidis
- Department of Pharmaceutical Technology, School of Pharmacy, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece
| | - Athanasia Pataka
- Department of Respiratory Deficiency, School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki, University Campus, 54124, Thessaloniki, Greece
| | - Anna-Bettina Haidich
- Department of Hygiene, Social-Preventive Medicine and Medical Statistics, School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki, University Campus, 54124, Thessaloniki, Greece.
|
17
|
Milders J, Ramspek CL, Janse RJ, Bos WJW, Rotmans JI, Dekker FW, van Diepen M. Prognostic Models in Nephrology: Where Do We Stand and Where Do We Go from Here? Mapping Out the Evidence in a Scoping Review. J Am Soc Nephrol 2024; 35:367-380. [PMID: 38082484 PMCID: PMC10914213 DOI: 10.1681/asn.0000000000000285] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/05/2024] Open
Abstract
Prognostic models can strongly support individualized care provision and well-informed shared decision making. There has been an upsurge of prognostic research in the field of nephrology, but the uptake of prognostic models in clinical practice remains limited. Therefore, we map out the research field of prognostic models for kidney patients and provide directions on how to proceed from here. We performed a scoping review of studies developing, validating, or updating a prognostic model for patients with CKD. We searched all published models in PubMed and Embase and report predicted outcomes, methodological quality, and validation and/or updating efforts. We found 602 studies, of which 30.1% concerned CKD populations, 31.6% dialysis populations, and 38.4% kidney transplantation populations. The most frequently predicted outcomes were mortality ( n =129), kidney disease progression ( n =75), and kidney graft survival ( n =54). Most studies provided discrimination measures (80.4%), but much less showed calibration results (43.4%). Of the 415 development studies, 28.0% did not perform any validation and 57.6% performed only internal validation. Moreover, only 111 models (26.7%) were externally validated either in the development study itself or in an independent external validation study. Finally, in 45.8% of development studies no useable version of the model was reported. To conclude, many prognostic models have been developed for patients with CKD, mainly for outcomes related to kidney disease progression and patient/graft survival. To bridge the gap between prediction research and kidney patient care, patient-reported outcomes, methodological rigor, complete reporting of prognostic models, external validation, updating, and impact assessment urgently need more attention.
Affiliation(s)
- Jet Milders
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Chava L. Ramspek
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Roemer J. Janse
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Willem Jan W. Bos
- Department of Internal Medicine, Leiden University Medical Center, Leiden, The Netherlands
- Santeon, Utrecht, The Netherlands
- Department of Internal Medicine, St. Antonius Hospital, Nieuwegein, The Netherlands
| | - Joris I. Rotmans
- Department of Internal Medicine, Leiden University Medical Center, Leiden, The Netherlands
| | - Friedo W. Dekker
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Merel van Diepen
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
|
18
|
Gan T, Guan H, Li P, Huang X, Li Y, Zhang R, Li T. Risk prediction models for cardiovascular events in hemodialysis patients: A systematic review. Semin Dial 2024; 37:101-109. [PMID: 37743062 DOI: 10.1111/sdi.13181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 06/25/2023] [Accepted: 09/10/2023] [Indexed: 09/26/2023]
Abstract
OBJECTIVE To perform a systematic review of risk prediction models for cardiovascular (CV) events in hemodialysis (HD) patients, and provide a reference for the application and optimization of related prediction models. METHODS PubMed, The Cochrane Library, Web of Science, and Embase databases were searched from inception to 1 February 2023. Two authors independently conducted the literature search, selection, and screening. The Prediction model Risk Of Bias Assessment Tool (PROBAST) was applied to evaluate the risk of bias and applicability of the included literature. RESULTS A total of nine studies containing 12 models were included, with performance measured by the area under the receiver operating characteristic curve (AUC) lying between 0.70 and 0.88. Age, diabetes mellitus (DM), C-reactive protein (CRP), and albumin (ALB) were the most commonly identified predictors of CV events in HD patients. While the included models demonstrated good applicability, there were still certain risks of bias, primarily related to inadequate handling of missing data and transformation of continuous variables, as well as a lack of model performance validation. CONCLUSION The included models showed good overall predictive performance and can assist healthcare professionals in the early identification of high-risk individuals for CV events in HD patients. In the future, the modeling methods should be improved, or the existing models should undergo external validation to provide better guidance for clinical practice.
Affiliation(s)
- Tiantian Gan
- School of Nursing, Chengdu University of Traditional Chinese Medicine, Chengdu, China
| | - Hua Guan
- Health Management Center, Sichuan Academy of Medical Sciences·Sichuan People's Hospital, Chengdu, China
| | - Pengli Li
- Department of Nephrology, Sichuan Academy of Medical Sciences·Sichuan People's Hospital, Chengdu, China
| | - Xinping Huang
- School of Nursing, Chengdu University of Traditional Chinese Medicine, Chengdu, China
| | - Yue Li
- Health Management Center, Sichuan Academy of Medical Sciences·Sichuan People's Hospital, Chengdu, China
| | - Rui Zhang
- Health Management Center, Sichuan Academy of Medical Sciences·Sichuan People's Hospital, Chengdu, China
| | - Tingxin Li
- Health Management Center, Sichuan Academy of Medical Sciences·Sichuan People's Hospital, Chengdu, China
|
19
|
Fehr J, Citro B, Malpani R, Lippert C, Madai VI. A trustworthy AI reality-check: the lack of transparency of artificial intelligence products in healthcare. Front Digit Health 2024; 6:1267290. [PMID: 38455991 PMCID: PMC10919164 DOI: 10.3389/fdgth.2024.1267290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 02/05/2024] [Indexed: 03/09/2024] Open
Abstract
Trustworthy medical AI requires transparency about the development and testing of underlying algorithms to identify biases and communicate potential risks of harm. Abundant guidance exists on how to achieve transparency for medical AI products, but it is unclear whether publicly available information adequately informs about their risks. To assess this, we retrieved public documentation on the 14 available CE-certified AI-based radiology products of the II b risk category in the EU from vendor websites, scientific publications, and the European EUDAMED database. Using a self-designed survey, we reported on their development, validation, ethical considerations, and deployment caveats, according to trustworthy AI guidelines. We scored each question with either 0, 0.5, or 1, to rate if the required information was "unavailable", "partially available," or "fully available." The transparency of each product was calculated relative to all 55 questions. Transparency scores ranged from 6.4% to 60.9%, with a median of 29.1%. Major transparency gaps included missing documentation on training data, ethical considerations, and limitations for deployment. Ethical aspects like consent, safety monitoring, and GDPR-compliance were rarely documented. Furthermore, deployment caveats for different demographics and medical settings were scarce. In conclusion, public documentation of authorized medical AI products in Europe lacks sufficient public transparency to inform about safety and risks. We call on lawmakers and regulators to establish legally mandated requirements for public and substantive transparency to fulfill the promise of trustworthy AI for health.
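The scoring arithmetic described in this abstract can be sketched as follows. This is a minimal illustration, assuming the transparency score is simply the sum of per-question points (0, 0.5, or 1) divided by the 55 questions and expressed as a percentage. The function name `transparency_score` and the example answer sheets are hypothetical; the abstract does not report per-product point totals, so the totals used here (3.5, 16, and 33.5 points) were chosen only because they reproduce the reported minimum, median, and maximum.

```python
from typing import Sequence

# Point values the abstract describes: unavailable, partially available, fully available.
ALLOWED_POINTS = (0.0, 0.5, 1.0)


def transparency_score(points: Sequence[float], n_questions: int = 55) -> float:
    """Return the percentage of attainable points, rounded to one decimal.

    Assumes one score per survey question and equal weighting of questions.
    """
    if len(points) != n_questions:
        raise ValueError(f"expected {n_questions} scores, got {len(points)}")
    if any(p not in ALLOWED_POINTS for p in points):
        raise ValueError("each score must be 0, 0.5, or 1")
    return round(100.0 * sum(points) / n_questions, 1)


# Hypothetical answer sheets (not taken from the study's data).
lowest = [1.0] * 3 + [0.5] + [0.0] * 51    # 3.5 points in total
median = [1.0] * 16 + [0.0] * 39           # 16 points in total
highest = [1.0] * 33 + [0.5] + [0.0] * 21  # 33.5 points in total

for label, sheet in (("lowest", lowest), ("median", median), ("highest", highest)):
    print(label, transparency_score(sheet))
```

Under this reading, a product documenting 16 of 55 items in full would land at the reported median of 29.1%.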
Affiliation(s)
- Jana Fehr
- Digital Health & Machine Learning, Hasso Plattner Institute, Potsdam, Germany
- Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
- QUEST Center for Responsible Research, Berlin Institute of Health (BIH), Charité Universitätsmedizin Berlin, Berlin, Germany
| | - Brian Citro
- Independent Researcher, Chicago, IL, United States
| | | | - Christoph Lippert
- Digital Health & Machine Learning, Hasso Plattner Institute, Potsdam, Germany
- Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Vince I. Madai
- QUEST Center for Responsible Research, Berlin Institute of Health (BIH), Charité Universitätsmedizin Berlin, Berlin, Germany
- Faculty of Computing, Engineering and the Built Environment, School of Computing and Digital Technology, Birmingham City University, Birmingham, United Kingdom
|
20
|
Barreñada L, Ledger A, Dhiman P, Collins G, Wynants L, Verbakel JY, Timmerman D, Valentin L, Van Calster B. ADNEX risk prediction model for diagnosis of ovarian cancer: systematic review and meta-analysis of external validation studies. BMJ MEDICINE 2024; 3:e000817. [PMID: 38375077 PMCID: PMC10875560 DOI: 10.1136/bmjmed-2023-000817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 01/25/2024] [Indexed: 02/21/2024]
Abstract
Objectives To conduct a systematic review of studies externally validating the ADNEX (Assessment of Different Neoplasias in the adnexa) model for diagnosis of ovarian cancer and to present a meta-analysis of its performance. Design Systematic review and meta-analysis of external validation studies. Data sources Medline, Embase, Web of Science, Scopus, and Europe PMC, from 15 October 2014 to 15 May 2023. Eligibility criteria for selecting studies All external validation studies of the performance of ADNEX, with any study design and any study population of patients with an adnexal mass. Two independent reviewers extracted the data. Disagreements were resolved by discussion. Reporting quality of the studies was scored with the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) reporting guideline, and methodological conduct and risk of bias with PROBAST (Prediction model Risk Of Bias Assessment Tool). Random effects meta-analysis of the area under the receiver operating characteristic curve (AUC), sensitivity and specificity at the 10% risk of malignancy threshold, and net benefit and relative utility at the 10% risk of malignancy threshold were performed. Results 47 studies (17 007 tumours) were included, with a median study sample size of 261 (range 24-4905). On average, 61% of TRIPOD items were reported. Handling of missing data, justification of sample size, and model calibration were rarely described. 91% of validations were at high risk of bias, mainly because of the unexplained exclusion of incomplete cases, small sample size, or no assessment of calibration. 
The summary AUC to distinguish benign from malignant tumours in patients who underwent surgery was 0.93 (95% confidence interval 0.92 to 0.94, 95% prediction interval 0.85 to 0.98) for ADNEX with the serum biomarker, cancer antigen 125 (CA125), as a predictor (9202 tumours, 43 centres, 18 countries, and 21 studies) and 0.93 (95% confidence interval 0.91 to 0.94, 95% prediction interval 0.85 to 0.98) for ADNEX without CA125 (6309 tumours, 31 centres, 13 countries, and 12 studies). The estimated probability that the model has use clinically in a new centre was 95% (with CA125) and 91% (without CA125). When restricting analysis to studies with a low risk of bias, summary AUC values were 0.93 (with CA125) and 0.91 (without CA125), and estimated probabilities that the model has use clinically were 89% (with CA125) and 87% (without CA125). Conclusions The results of the meta-analysis indicated that ADNEX performed well in distinguishing between benign and malignant tumours in populations from different countries and settings, regardless of whether the serum biomarker, CA125, was used as a predictor. A key limitation was that calibration was rarely assessed. Systematic review registration PROSPERO CRD42022373182.
Affiliation(s)
- Lasai Barreñada
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium
| | - Ashleigh Ledger
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium
| | - Paula Dhiman
- Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford Centre for Statistics in Medicine, Oxford, UK
| | - Gary Collins
- Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford Centre for Statistics in Medicine, Oxford, UK
| | - Laure Wynants
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium
- Department of Epidemiology, Universiteit Maastricht Care and Public Health Research Institute, Maastricht, Netherlands
| | - Jan Y Verbakel
- Department of Public Health and Primary care, KU Leuven, Leuven, Belgium
- Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
- Leuven Unit for Health Technology Assessment Research (LUHTAR), KU Leuven, Leuven, Belgium
| | - Dirk Timmerman
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium
- Department of Obstetrics and Gynaecology, UZ Leuven campus Gasthuisberg Dienst gynaecologie en verloskunde, Leuven, Belgium
| | - Lil Valentin
- Department of Obstetrics and Gynaecology, Skåne University Hospital, Malmo, Sweden
- Department of Clinical Sciences Malmö, Lund University, Lund, Sweden
| | - Ben Van Calster
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium
- Leuven Unit for Health Technology Assessment Research (LUHTAR), KU Leuven, Leuven, Belgium
- Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, Netherlands
|
21
|
Cai Y, Cai YQ, Tang LY, Wang YH, Gong M, Jing TC, Li HJ, Li-Ling J, Hu W, Yin Z, Gong DX, Zhang GW. Artificial intelligence in the risk prediction models of cardiovascular disease and development of an independent validation screening tool: a systematic review. BMC Med 2024; 22:56. [PMID: 38317226 PMCID: PMC10845808 DOI: 10.1186/s12916-024-03273-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2023] [Accepted: 01/23/2024] [Indexed: 02/07/2024] Open
Abstract
BACKGROUND A comprehensive overview of artificial intelligence (AI) for cardiovascular disease (CVD) prediction and a screening tool of AI models (AI-Ms) for independent external validation are lacking. This systematic review aims to identify, describe, and appraise AI-Ms of CVD prediction in the general and special populations and develop a new independent validation score (IVS) for AI-Ms replicability evaluation. METHODS PubMed, Web of Science, Embase, and IEEE library were searched up to July 2021. Data extraction and analysis were performed for the populations, distribution, predictors, algorithms, etc. The risk of bias was evaluated with the prediction risk of bias assessment tool (PROBAST). Subsequently, we designed IVS for model replicability evaluation with five steps in five items, including transparency of algorithms, performance of models, feasibility of reproduction, risk of reproduction, and clinical implication, respectively. The review is registered in PROSPERO (No. CRD42021271789). RESULTS In 20,887 screened references, 79 articles (82.5% in 2017-2021) were included, which contained 114 datasets (67 in Europe and North America, but 0 in Africa). We identified 486 AI-Ms, of which the majority were in development (n = 380), but none of them had undergone independent external validation. A total of 66 idiographic algorithms were found; however, 36.4% were used only once and only 39.4% over three times. A large number of different predictors (range 5-52,000, median 21) and large-span sample size (range 80-3,660,000, median 4466) were observed. All models were at high risk of bias according to PROBAST, primarily due to the incorrect use of statistical methods. IVS analysis confirmed only 10 models as "recommended"; however, 281 and 187 were "not recommended" and "warning," respectively. 
CONCLUSION AI has led the digital revolution in the field of CVD prediction, but is still in the early stage of development owing to defects in research design, reporting, and evaluation systems. The IVS we developed may contribute to independent external validation and the development of this field.
Affiliation(s)
- Yue Cai
- China Medical University, Shenyang, 110122, China
| | - Yu-Qing Cai
- China Medical University, Shenyang, 110122, China
| | - Li-Ying Tang
- China Medical University, Shenyang, 110122, China
| | - Yi-Han Wang
- China Medical University, Shenyang, 110122, China
| | - Mengchun Gong
- Digital Health China Co. Ltd, Beijing, 100089, China
| | - Tian-Ci Jing
- Smart Hospital Management Department, the First Hospital of China Medical University, Shenyang, 110001, China
| | - Hui-Jun Li
- Shenyang Medical & Film Science and Technology Co. Ltd., Shenyang, 110001, China
- Enduring Medicine Smart Innovation Research Institute, Shenyang, 110001, China
| | - Jesse Li-Ling
- Institute of Genetic Medicine, School of Life Science, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, 610065, China
| | - Wei Hu
- Bayi Orthopedic Hospital, Chengdu, 610017, China
| | - Zhihua Yin
- Department of Epidemiology, School of Public Health, China Medical University, Shenyang, 110122, China.
| | - Da-Xin Gong
- Smart Hospital Management Department, the First Hospital of China Medical University, Shenyang, 110001, China.
- The Internet Hospital Branch of the Chinese Research Hospital Association, Beijing, 100006, China.
| | - Guang-Wei Zhang
- Smart Hospital Management Department, the First Hospital of China Medical University, Shenyang, 110001, China.
- The Internet Hospital Branch of the Chinese Research Hospital Association, Beijing, 100006, China.
|
22
|
Sebro R. Advancing Diagnostics and Patient Care: The Role of Biomarkers in Radiology. Semin Musculoskelet Radiol 2024; 28:3-13. [PMID: 38330966 DOI: 10.1055/s-0043-1776426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2024]
Abstract
The integration of biomarkers into medical practice has revolutionized the field of radiology, allowing for enhanced diagnostic accuracy, personalized treatment strategies, and improved patient care outcomes. This review offers radiologists a comprehensive understanding of the diverse applications of biomarkers in medicine. By elucidating the fundamental concepts, challenges, and recent advancements in biomarker utilization, it will serve as a bridge between the disciplines of radiology and epidemiology. Through an exploration of various biomarker types, such as imaging biomarkers, molecular biomarkers, and genetic markers, I outline their roles in disease detection, prognosis prediction, and therapeutic monitoring. I also discuss the significance of robust study designs, blinding, power and sample size calculations, performance metrics, and statistical methodologies in biomarker research. By fostering collaboration between radiologists, statisticians, and epidemiologists, I hope to accelerate the translation of biomarker discoveries into clinical practice, ultimately leading to improved patient care.
Affiliation(s)
- Ronnie Sebro
- Department of Radiology, Center for Augmented Intelligence, Mayo Clinic, Jacksonville, Florida
- Department of Biostatistics, Center for Quantitative Health Sciences, Mayo Clinic, Jacksonville, Florida
- Department of Orthopedic Surgery, Mayo Clinic, Jacksonville, Florida
|
23
|
Chang RSK, Nguyen S, Chen Z, Foster E, Kwan P. Role of machine learning in the management of epilepsy: a systematic review protocol. BMJ Open 2024; 14:e079785. [PMID: 38272549 PMCID: PMC10823996 DOI: 10.1136/bmjopen-2023-079785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 01/05/2024] [Indexed: 01/27/2024] Open
Abstract
INTRODUCTION Machine learning is a rapidly expanding field and is already incorporated into many aspects of medicine including diagnostics, prognostication and clinical decision-support tools. Epilepsy is a common and disabling neurological disorder, however, management remains challenging in many cases, despite expanding therapeutic options. We present a systematic review protocol to explore the role of machine learning in the management of epilepsy. METHODS AND ANALYSIS This protocol has been drafted with reference to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) for Protocols. A literature search will be conducted in databases including MEDLINE, Embase, Scopus and Web of Science. A PRISMA flow chart will be constructed to summarise the study workflow. As the scope of this review is the clinical application of machine learning, the selection of papers will be focused on studies directly related to clinical decision-making in management of epilepsy, specifically the prediction of response to antiseizure medications, development of drug-resistant epilepsy, and epilepsy surgery and neuromodulation outcomes. Data will be extracted following the CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies checklist. Prediction model Risk Of Bias ASsessment Tool will be used for the quality assessment of the included studies. Syntheses of quantitative data will be presented in narrative format. ETHICS AND DISSEMINATION As this study is a systematic review which does not involve patients or animals, ethics approval is not required. The results of the systematic review will be submitted to peer-review journals for publication and presented in academic conferences. PROSPERO REGISTRATION NUMBER CRD42023442156.
Affiliation(s)
- Richard Shek-Kwan Chang
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
| | - Shani Nguyen
- Monash University Faculty of Medicine Nursing and Health Sciences, Melbourne, Victoria, Australia
| | - Zhibin Chen
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
| | - Emma Foster
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
| | - Patrick Kwan
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
|
24
|
Tabja Bortesi JP, Ranisau J, Di S, McGillion M, Rosella L, Johnson A, Devereaux PJ, Petch J. Machine Learning Approaches for the Image-Based Identification of Surgical Wound Infections: Scoping Review. J Med Internet Res 2024; 26:e52880. [PMID: 38236623 PMCID: PMC10835585 DOI: 10.2196/52880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 11/09/2023] [Accepted: 12/12/2023] [Indexed: 01/19/2024] Open
Abstract
BACKGROUND Surgical site infections (SSIs) occur frequently and impact patients and health care systems. Remote surveillance of surgical wounds is currently limited by the need for manual assessment by clinicians. Machine learning (ML)-based methods have recently been used to address various aspects of the postoperative wound healing process and may be used to improve the scalability and cost-effectiveness of remote surgical wound assessment. OBJECTIVE The objective of this review was to provide an overview of the ML methods that have been used to identify surgical wound infections from images. METHODS We conducted a scoping review of ML approaches for visual detection of SSIs following the JBI (Joanna Briggs Institute) methodology. Reports of participants in any postoperative context focusing on identification of surgical wound infections were included. Studies that did not address SSI identification, surgical wounds, or did not use image or video data were excluded. We searched MEDLINE, Embase, CINAHL, CENTRAL, Web of Science Core Collection, IEEE Xplore, Compendex, and arXiv for relevant studies in November 2022. The records retrieved were double screened for eligibility. A data extraction tool was used to chart the relevant data, which was described narratively and presented using tables. Employment of TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) guidelines was evaluated and PROBAST (Prediction Model Risk of Bias Assessment Tool) was used to assess risk of bias (RoB). RESULTS In total, 10 of the 715 unique records screened met the eligibility criteria. In these studies, the clinical contexts and surgical procedures were diverse. All papers developed diagnostic models, though none performed external validation. Both traditional ML and deep learning methods were used to identify SSIs from mostly color images, and the volume of images used ranged from under 50 to thousands. 
Further, 10 TRIPOD items were reported in at least 4 studies, though 15 items were reported in fewer than 4 studies. PROBAST assessment led to 9 studies being identified as having an overall high RoB, with 1 study having overall unclear RoB. CONCLUSIONS Research on the image-based identification of surgical wound infections using ML remains novel, and there is a need for standardized reporting. Limitations related to variability in image capture, model building, and data sources should be addressed in the future.
Affiliation(s)
| | - Jonathan Ranisau
- Centre for Data Science and Digital Health, Hamilton Health Sciences, Hamilton, ON, Canada
| | - Shuang Di
- Centre for Data Science and Digital Health, Hamilton Health Sciences, Hamilton, ON, Canada
- Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
| | | | - Laura Rosella
- Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
| | | | - P J Devereaux
- Population Health Research Institute, Hamilton, ON, Canada
| | - Jeremy Petch
- Centre for Data Science and Digital Health, Hamilton Health Sciences, Hamilton, ON, Canada
- Population Health Research Institute, Hamilton, ON, Canada
- Institute for Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada
- Division of Cardiology, McMaster University, Hamilton, ON, Canada
|
25
|
Ciobanu-Caraus O, Aicher A, Kernbach JM, Regli L, Serra C, Staartjes VE. A critical moment in machine learning in medicine: on reproducible and interpretable learning. Acta Neurochir (Wien) 2024; 166:14. [PMID: 38227273 DOI: 10.1007/s00701-024-05892-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 12/14/2023] [Indexed: 01/17/2024]
Abstract
Over the past two decades, advances in computational power and data availability combined with increased accessibility to pre-trained models have led to an exponential rise in machine learning (ML) publications. While ML may have the potential to transform healthcare, this sharp increase in ML research output without focus on methodological rigor and standard reporting guidelines has fueled a reproducibility crisis. In addition, the rapidly growing complexity of these models compromises their interpretability, which currently impedes their successful and widespread clinical adoption. In medicine, where failure of such models may have severe implications for patients' health, the high requirements for accuracy, robustness, and interpretability confront ML researchers with a unique set of challenges. In this review, we discuss the semantics of reproducibility and interpretability, as well as related issues and challenges, and outline possible solutions to counteracting the "black box". To foster reproducibility, standard reporting guidelines need to be further developed and data or code sharing encouraged. Editors and reviewers may equally play a critical role by establishing high methodological standards and thus preventing the dissemination of low-quality ML publications. To foster interpretable learning, the use of simpler models more suitable for medical data can inform the clinician how results are generated based on input data. Model-agnostic explanation tools, sensitivity analysis, and hidden layer representations constitute further promising approaches to increase interpretability. Balancing model performance and interpretability are important to ensure clinical applicability. We have now reached a critical moment for ML in medicine, where addressing these issues and implementing appropriate solutions will be vital for the future evolution of the field.
Affiliation(s)
- Olga Ciobanu-Caraus
- Machine Intelligence in Clinical Neuroscience & Microsurgical Neuroanatomy (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Anatol Aicher
- Machine Intelligence in Clinical Neuroscience & Microsurgical Neuroanatomy (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Julius M Kernbach
- Department of Neuroradiology, University Hospital Heidelberg, Heidelberg, Germany
| | - Luca Regli
- Machine Intelligence in Clinical Neuroscience & Microsurgical Neuroanatomy (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Carlo Serra
- Machine Intelligence in Clinical Neuroscience & Microsurgical Neuroanatomy (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Victor E Staartjes
- Machine Intelligence in Clinical Neuroscience & Microsurgical Neuroanatomy (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zurich, University of Zurich, Zurich, Switzerland.
|
26
|
Jung J, Dai J, Liu B, Wu Q. Artificial intelligence in fracture detection with different image modalities and data types: A systematic review and meta-analysis. PLOS DIGITAL HEALTH 2024; 3:e0000438. [PMID: 38289965 PMCID: PMC10826962 DOI: 10.1371/journal.pdig.0000438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Accepted: 12/25/2023] [Indexed: 02/01/2024]
Abstract
Artificial Intelligence (AI), encompassing Machine Learning and Deep Learning, has increasingly been applied to fracture detection using diverse imaging modalities and data types. This systematic review and meta-analysis aimed to assess the efficacy of AI in detecting fractures through various imaging modalities and data types (image, tabular, or both) and to synthesize the existing evidence related to AI-based fracture detection. Peer-reviewed studies developing and validating AI for fracture detection were identified through searches in multiple electronic databases without time limitations. A hierarchical meta-analysis model was used to calculate pooled sensitivity and specificity. A diagnostic accuracy quality assessment was performed to evaluate bias and applicability. Of the 66 eligible studies, 54 identified fractures using imaging-related data, nine using tabular data, and three using both. Vertebral fractures were the most common outcome (n = 20), followed by hip fractures (n = 18). Hip fractures exhibited the highest pooled sensitivity (92%; 95% CI: 87-96, p< 0.01) and specificity (90%; 95% CI: 85-93, p< 0.01). Pooled sensitivity and specificity using image data (92%; 95% CI: 90-94, p< 0.01; and 91%; 95% CI: 88-93, p < 0.01) were higher than those using tabular data (81%; 95% CI: 77-85, p< 0.01; and 83%; 95% CI: 76-88, p < 0.01), respectively. Radiographs demonstrated the highest pooled sensitivity (94%; 95% CI: 90-96, p < 0.01) and specificity (92%; 95% CI: 89-94, p< 0.01). Patient selection and reference standards were major concerns in assessing diagnostic accuracy for bias and applicability. AI displays high diagnostic accuracy for various fracture outcomes, indicating potential utility in healthcare systems for fracture diagnosis. However, enhanced transparency in reporting and adherence to standardized guidelines are necessary to improve the clinical applicability of AI. Review Registration: PROSPERO (CRD42021240359).
Affiliation(s)
- Jongyun Jung
- Department of Biomedical Informatics (Dr. Qing Wu, Jongyun Jung, and Jingyuan Dai), College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
| | - Jingyuan Dai
- Department of Biomedical Informatics (Dr. Qing Wu, Jongyun Jung, and Jingyuan Dai), College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
| | - Bowen Liu
- Department of Mathematics and Statistics, Division of Computing, Analytics, and Mathematics, School of Science and Engineering (Bowen Liu), University of Missouri-Kansas City, Kansas City, Missouri, United States of America
| | - Qing Wu
- Department of Biomedical Informatics (Dr. Qing Wu, Jongyun Jung, and Jingyuan Dai), College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
|
27
|
Stewart R, Chaturvedi J, Roberts A. Natural language processing - relevance to patient outcomes and real-world evidence. Expert Rev Pharmacoecon Outcomes Res 2024; 24:5-9. [PMID: 37874661 DOI: 10.1080/14737167.2023.2275670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 10/23/2023] [Indexed: 10/26/2023]
Affiliation(s)
- Robert Stewart
- King's College London, Institute of Psychiatry, Psychology and Neuroscience, London, UK
- South London and Maudsley NHS Foundation Trust, London, UK
| | - Jaya Chaturvedi
- King's College London, Institute of Psychiatry, Psychology and Neuroscience, London, UK
| | - Angus Roberts
- King's College London, Institute of Psychiatry, Psychology and Neuroscience, London, UK
|
28
|
Sutradhar A, Al Rafi M, Shamrat FMJM, Ghosh P, Das S, Islam MA, Ahmed K, Zhou X, Azad AKM, Alyami SA, Moni MA. BOO-ST and CBCEC: two novel hybrid machine learning methods aim to reduce the mortality of heart failure patients. Sci Rep 2023; 13:22874. [PMID: 38129433 PMCID: PMC10739972 DOI: 10.1038/s41598-023-48486-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 11/27/2023] [Indexed: 12/23/2023] Open
Abstract
Heart failure (HF) is a leading cause of mortality worldwide. Machine learning (ML) approaches have shown potential as an early detection tool for improving patient outcomes. Enhancing the effectiveness and clinical applicability of the ML model necessitates training an efficient classifier with a diverse set of high-quality datasets. Hence, we proposed two novel hybrid ML methods ((a) consisting of Boosting, SMOTE, and Tomek links (BOO-ST); (b) combining the best-performing conventional classifier with ensemble classifiers (CBCEC)) to serve as an efficient early warning system for HF mortality. The BOO-ST was introduced to tackle the challenge of class imbalance, while CBCEC was responsible for training on the processed and selected features derived from the Feature Importance (FI) and Information Gain (IG) feature selection techniques. We also conducted an explicit and intuitive analysis to explore the impact of potential characteristics correlating with the fatality cases of HF. The experimental results demonstrated that the proposed CBCEC classifier achieves an accuracy of 93.67% in the early forecasting of HF mortality. Therefore, our proposed methods (BOO-ST and CBCEC) could play a crucial role in reducing the death rate of HF and reducing stress in the healthcare sector.
Affiliation(s)
- Ananda Sutradhar
- Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka, 1216, Bangladesh
| | - Mustahsin Al Rafi
- Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka, 1216, Bangladesh
| | - F M Javed Mehedi Shamrat
- Department of Computer System and Technology, University of Malaya, 50603, Kuala Lumpur, Malaysia
| | - Pronab Ghosh
- Department of Computer Science, Lakehead University, 955 Oliver Rd, Thunder Bay, ON, P7B 5E1, Canada
| | - Subrata Das
- Department of Computer Science, Lakehead University, 955 Oliver Rd, Thunder Bay, ON, P7B 5E1, Canada
| | - Md Anaytul Islam
- Department of Computer Science, Lakehead University, 955 Oliver Rd, Thunder Bay, ON, P7B 5E1, Canada
| | - Kawsar Ahmed
- Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
- Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Santosh, Tangail, 1902, Bangladesh
- Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh
| | - Xujuan Zhou
- School of Business, University of Southern Queensland, Toowoomba, Australia
| | - A K M Azad
- Department of Mathematics and Statistics, Faculty of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), 13318, Riyadh, Saudi Arabia
| | - Salem A Alyami
- Department of Mathematics and Statistics, Faculty of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), 13318, Riyadh, Saudi Arabia
| | - Mohammad Ali Moni
- Centre for AI & Digital Health Technology, Artificial Intelligence & Cyber Future Institute, Charles Sturt University, Bathurst, NSW, 2795, Australia.
|
29
|
van Royen FS, Asselbergs FW, Alfonso F, Vardas P, van Smeden M. Five critical quality criteria for artificial intelligence-based prediction models. Eur Heart J 2023; 44:4831-4834. [PMID: 37897346 DOI: 10.1093/eurheartj/ehad727] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 10/30/2023] Open
Abstract
To raise the quality of clinical artificial intelligence (AI) prediction modelling studies in the cardiovascular health domain and thereby improve their impact and relevancy, the editors for digital health, innovation, and quality standards of the European Heart Journal propose five minimal quality criteria for AI-based prediction model development and validation studies: complete reporting, carefully defined intended use of the model, rigorous validation, large enough sample size, and openness of code and software.
Affiliation(s)
- Florien S van Royen
- Department of General Practice & Nursing Science, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Folkert W Asselbergs
- Department of Cardiology, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, The Netherlands
- Health Data Research UK and Institute of Health Informatics, University College London, London, UK
| | - Fernando Alfonso
- Department of Cardiology, Hospital Universitario de la Princesa, Universidad Autónoma de Madrid, IIS-IP. CIVER-CV, Madrid, Spain
| | - Panos Vardas
- Biomedical Research Foundation Academy of Athens (BRFAA) and Hygeia Hospitals Group, Athens, Greece
| | - Maarten van Smeden
- Department of Epidemiology & Health Economics, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, 3584 CG Utrecht, Netherlands
- Department of Data Science & Biostatistics, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, 3584 CG Utrecht, The Netherlands
|
30
|
Banerji CRS, Chakraborti T, Harbron C, MacArthur BD. Clinical AI tools must convey predictive uncertainty for each individual patient. Nat Med 2023; 29:2996-2998. [PMID: 37821686 DOI: 10.1038/s41591-023-02562-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 10/13/2023]
Affiliation(s)
- Christopher R S Banerji
- The Alan Turing Institute, London, UK.
- University College London Hospitals, NHS Foundation Trust, London, UK.
- UCL Cancer Institute, Faculty of Medical Sciences, University College London, London, UK.
| | - Tapabrata Chakraborti
- The Alan Turing Institute, London, UK
- UCL Cancer Institute, Faculty of Medical Sciences, University College London, London, UK
| | | | - Ben D MacArthur
- The Alan Turing Institute, London, UK.
- Faculty of Medicine, University of Southampton, Southampton, UK.
- Mathematical Sciences, University of Southampton, Southampton, UK.
|
31
|
Riley RD, Collins GS. Stability of clinical prediction models developed using statistical or machine learning methods. Biom J 2023; 65:e2200302. [PMID: 37466257 PMCID: PMC10952221 DOI: 10.1002/bimj.202200302] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 11/02/2022] [Revised: 04/26/2023] [Accepted: 05/02/2023] [Indexed: 07/20/2023]
Abstract
Clinical prediction models estimate an individual's risk of a particular health outcome. A developed model is a consequence of the development dataset and model-building strategy, including the sample size, number of predictors, and analysis method (e.g., regression or machine learning). We raise the concern that many models are developed using small datasets that lead to instability in the model and its predictions (estimated risks). We define four levels of model stability in estimated risks moving from the overall mean to the individual level. Through simulation and case studies of statistical and machine learning approaches, we show instability in a model's estimated risks is often considerable, and ultimately manifests itself as miscalibration of predictions in new data. Therefore, we recommend researchers always examine instability at the model development stage and propose instability plots and measures to do so. This entails repeating the model-building steps (those used to develop the original prediction model) in each of multiple (e.g., 1000) bootstrap samples, to produce multiple bootstrap models, and deriving (i) a prediction instability plot of bootstrap model versus original model predictions; (ii) the mean absolute prediction error (mean absolute difference between individuals' original and bootstrap model predictions), and (iii) calibration, classification, and decision curve instability plots of bootstrap models applied in the original sample. A case study illustrates how these instability assessments help reassure (or not) whether model predictions are likely to be reliable (or not), while informing a model's critical appraisal (risk of bias rating), fairness, and further validation requirements.
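The bootstrap procedure described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the one-predictor logistic regression fitted by gradient ascent, the synthetic data, and the names `fit_logistic`, `simulate_data`, and `prediction_instability` are all assumptions made for illustration, and far fewer bootstrap samples are used than the 1000 the abstract mentions. The sketch repeats the model-building step in each bootstrap sample, applies every bootstrap model to the original individuals, and reports the mean absolute prediction error (the mean absolute difference between original and bootstrap model predictions).

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, steps=300, lr=0.5):
    """Hypothetical model-building step: one-predictor logistic
    regression fitted by plain gradient ascent on the log-likelihood."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            residual = y - sigmoid(a + b * x)
            grad_a += residual
            grad_b += residual * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def predict(model, x):
    a, b = model
    return sigmoid(a + b * x)


def simulate_data(n, seed=0):
    """Synthetic development dataset (illustrative assumption only)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1 if rng.random() < sigmoid(-0.5 + x) else 0 for x in xs]
    return xs, ys


def prediction_instability(xs, ys, n_boot=200, seed=0):
    """Refit the model in each bootstrap sample and compare its
    predictions for the original individuals with the original model's.

    Returns (original predictions, per-individual mean absolute
    difference, overall mean absolute prediction error)."""
    rng = random.Random(seed)
    n = len(xs)
    original = fit_logistic(xs, ys)
    orig_pred = [predict(original, x) for x in xs]
    abs_diff_sum = [0.0] * n
    for _ in range(n_boot):
        # Sample n individuals with replacement, then repeat model building.
        idx = [rng.randrange(n) for _ in range(n)]
        boot = fit_logistic([xs[i] for i in idx], [ys[i] for i in idx])
        for i, x in enumerate(xs):
            abs_diff_sum[i] += abs(predict(boot, x) - orig_pred[i])
    per_person = [s / n_boot for s in abs_diff_sum]
    mape = sum(per_person) / n
    return orig_pred, per_person, mape


if __name__ == "__main__":
    xs, ys = simulate_data(50, seed=1)
    _, _, mape = prediction_instability(xs, ys, n_boot=100, seed=2)
    print(f"mean absolute prediction error: {mape:.3f}")
```

Plotting each bootstrap model's predictions against `orig_pred` would give the kind of prediction instability plot the abstract refers to; the sketch stops at the numeric summary to stay dependency-free.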
Affiliation(s)
- Richard D. Riley
- Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
| | - Gary S. Collins
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
|
32
|
Jacquemyn X, Kutty S, Manlhiot C. The Lifelong Impact of Artificial Intelligence and Clinical Prediction Models on Patients With Tetralogy of Fallot. CJC PEDIATRIC AND CONGENITAL HEART DISEASE 2023; 2:440-452. [PMID: 38161675 PMCID: PMC10755786 DOI: 10.1016/j.cjcpc.2023.08.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/14/2023] [Accepted: 08/24/2023] [Indexed: 01/03/2024]
Abstract
Medical advancements in the diagnosis, surgical techniques, perioperative care, and continued care throughout childhood have transformed the outlook for individuals with tetralogy of Fallot (TOF), improving survival and shifting the perspective towards lifelong care. However, with a growing population of survivors, longstanding challenges have been accentuated, and new challenges have surfaced, necessitating a re-evaluation of TOF care. Availability of prenatal diagnostics, insufficient information from traditional imaging techniques, previously unforeseen medical complications, and debates surrounding optimal timing and indications for reintervention are among the emerging issues. To address these challenges, the integration of artificial intelligence and machine learning holds great promise as they have the potential to revolutionize patient management and positively impact lifelong outcomes for individuals with TOF. Innovative applications of artificial intelligence and machine learning have spanned across multiple domains of TOF care, including screening and diagnosis, automated image processing and interpretation, clinical risk stratification, and planning and performing cardiac interventions. By embracing these advancements and incorporating them into routine clinical practice, personalized medicine could be delivered, leading to the best possible outcomes for patients. In this review, we provide an overview of these evolving applications and emphasize the challenges, limitations, and future potential for integrating them into clinical care.
Affiliation(s)
- Xander Jacquemyn
- Blalock-Taussig-Thomas Pediatric and Congenital Heart Center, Department of Pediatrics, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
- Department of Cardiovascular Sciences, KU Leuven, Leuven, Belgium
| | - Shelby Kutty
- Blalock-Taussig-Thomas Pediatric and Congenital Heart Center, Department of Pediatrics, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
| | - Cedric Manlhiot
- Blalock-Taussig-Thomas Pediatric and Congenital Heart Center, Department of Pediatrics, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
|
33
|
Pei J, Guo X, Tao H, Wei Y, Zhang H, Ma Y, Han L. Machine learning-based prediction models for pressure injury: A systematic review and meta-analysis. Int Wound J 2023; 20:4328-4339. [PMID: 37340520 PMCID: PMC10681397 DOI: 10.1111/iwj.14280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2023] [Accepted: 06/01/2023] [Indexed: 06/22/2023] Open
Abstract
Despite the fact that machine learning (ML) algorithms to construct predictive models for pressure injury development are widely reported, the performance of the model remains unknown. The goal of the review was to systematically appraise the performance of ML models in predicting pressure injury. PubMed, Embase, Cochrane Library, Web of Science, CINAHL, Grey literature and other databases were systematically searched. Original journal papers were included which met the inclusion criteria. The methodological quality was assessed independently by two reviewers using the Prediction Model Risk of Bias Assessment Tool (PROBAST). Meta-analysis was performed with Metadisc software, with the area under the receiver operating characteristic curve, sensitivity and specificity as effect measures. Chi-squared and I2 tests were used to assess the heterogeneity. A total of 18 studies were included for the narrative review, and 14 of them were eligible for meta-analysis. The models achieved excellent pooled AUC of 0.94, sensitivity of 0.79 (95% CI [0.78-0.80]) and specificity of 0.87 (95% CI [0.88-0.87]). Meta-regressions did not provide evidence that model performance varied by data or model types. The present findings indicate that ML models show an outstanding performance in predicting pressure injury. However, good-quality studies should be conducted to verify our results and confirm the clinical value of ML in pressure injury development.
Affiliation(s)
- Juhong Pei
  - The First Clinical Medical College, School of Nursing, Lanzhou University, Lanzhou, China
- Hongxia Tao
  - The First Clinical Medical College, School of Nursing, Lanzhou University, Lanzhou, China
- Yuting Wei
  - School of Nursing, Lanzhou University, Lanzhou, China
- Hongyan Zhang
  - Department of Nursing, Gansu Provincial Hospital, Lanzhou, China
- Yuxia Ma
  - School of Nursing, Lanzhou University, Lanzhou, China
- Lin Han
  - The First Clinical Medical College, School of Nursing, Lanzhou University, Lanzhou, China
  - Department of Nursing, Gansu Provincial Hospital, Lanzhou, China
34
Wilson A. CORR Synthesis: Can Decision Tree Learning Advance Orthopaedic Surgery Research? Clin Orthop Relat Res 2023; 481:2337-2342. [PMID: 37678231 PMCID: PMC10642865 DOI: 10.1097/corr.0000000000002820] [Received: 05/12/2023] [Accepted: 07/20/2023] [Indexed: 09/09/2023]
Affiliation(s)
- Andrew Wilson
  - Research Coordinator, Department of Orthopaedic Surgery, University of Tennessee College of Medicine Chattanooga, Chattanooga, TN, USA
35
Moon SJ, Lee S, Hwang J, Lee J, Kang S, Cha HS. Performances of machine learning algorithms in discriminating sacroiliitis features on MRI: a systematic review. RMD Open 2023; 9:e003783. [PMID: 37996126 PMCID: PMC10668284 DOI: 10.1136/rmdopen-2023-003783] [Received: 10/03/2023] [Accepted: 11/08/2023] [Indexed: 11/25/2023] Open
Abstract
OBJECTIVES Summarise the evidence of the performance of the machine learning algorithm in discriminating sacroiliitis features on MRI and compare it with the accuracy of human physicians. METHODS MEDLINE, EMBASE, CINAHL, Web of Science, IEEE, American College of Rheumatology and European Alliance of Associations for Rheumatology abstract archives were searched for studies published between 2008 and 4 June 2023. Two authors independently screened and extracted the variables, and the results are presented using tables and forest plots. RESULTS Ten studies were selected from 2381. Over half of the studies used deep learning models, using Assessment of Spondyloarthritis International Society sacroiliitis criteria as the ground truth, and manually extracted the regions of interest. All studies reported the area under the curve as a performance index, ranging from 0.76 to 0.99. Sensitivity and specificity were the second-most commonly reported indices, with sensitivity ranging from 0.56 to 1.00 and specificity ranging from 0.67 to 1.00; these results are comparable to a radiologist's sensitivity of 0.67-1.00 and specificity of 0.78-1.00 in the same cohort. More than half of the studies showed a high risk of bias in the analysis domain of quality appraisal owing to the small sample size or overfitting issues. CONCLUSION The performance of machine learning algorithms in discriminating sacroiliitis features on MRI varied owing to the high heterogeneity between studies and the small sample sizes, overfitting, and under-reporting issues of individual studies. Further well-designed and transparent studies are required.
Affiliation(s)
- Sun Jae Moon
  - Department of Medicine, Santa Marie 24 Clinic, Seongnam-si, Korea (the Republic of)
- Seulkee Lee
  - Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea (the Republic of)
- Jinseub Hwang
  - Department of Data Science, Daegu University, Gyeongsan-si, Korea (the Republic of)
- Jaejoon Lee
  - Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea (the Republic of)
- Seonyoung Kang
  - Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea (the Republic of)
- Hoon-Suk Cha
  - Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea (the Republic of)
36
Carrasco-Ribelles LA, Llanes-Jurado J, Gallego-Moll C, Cabrera-Bean M, Monteagudo-Zaragoza M, Violán C, Zabaleta-del-Olmo E. Prediction models using artificial intelligence and longitudinal data from electronic health records: a systematic methodological review. J Am Med Inform Assoc 2023; 30:2072-2082. [PMID: 37659105 PMCID: PMC10654870 DOI: 10.1093/jamia/ocad168] [Received: 05/05/2023] [Revised: 08/02/2023] [Accepted: 08/11/2023] [Indexed: 09/04/2023] Open
Abstract
OBJECTIVE To describe and appraise the use of artificial intelligence (AI) techniques that can cope with longitudinal data from electronic health records (EHRs) to predict health-related outcomes. METHODS This review included studies in any language that used EHRs as at least one of the data sources, collected longitudinal data, used an AI technique capable of handling longitudinal data, and predicted any health-related outcome. We searched MEDLINE, Scopus, Web of Science, and IEEE Xplore from inception to January 3, 2022. Information on the dataset, prediction task, data preprocessing, feature selection, method, validation, performance, and implementation was extracted and summarized using descriptive statistics. Risk of bias and completeness of reporting were assessed using a short form of PROBAST and TRIPOD, respectively. RESULTS Eighty-one studies were included. Follow-up time and number of registers per patient varied greatly, and most predicted disease development or next event based on diagnoses and drug treatments. Architectures generally were based on Recurrent Neural Networks-like layers, though in recent years combining different layers or transformers has become more popular. About half of the included studies performed hyperparameter tuning and used attention mechanisms. Most performed a single train-test partition and could not correctly assess the variability of the model's performance. Reporting quality was poor, and a third of the studies were at high risk of bias. CONCLUSIONS AI models are increasingly using longitudinal data. However, the heterogeneity in reporting methodology and results, and the lack of public EHR datasets and code sharing, complicate the possibility of replication. REGISTRATION PROSPERO database (CRD42022331388).
Affiliation(s)
- Lucía A Carrasco-Ribelles
  - Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, 08007, Spain
  - Department of Signal Theory and Communications, Universitat Politècnica de Catalunya (UPC), Barcelona, 08034, Spain
  - Unitat de Suport a la Recerca Metropolitana Nord, Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Mataró, 08303, Spain
- José Llanes-Jurado
  - Instituto de Investigación e Innovación en Bioingeniería (i3B), Universitat Politècnica de València (UPV), València, 46022, Spain
- Carlos Gallego-Moll
  - Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, 08007, Spain
  - Unitat de Suport a la Recerca Metropolitana Nord, Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Mataró, 08303, Spain
- Margarita Cabrera-Bean
  - Department of Signal Theory and Communications, Universitat Politècnica de Catalunya (UPC), Barcelona, 08034, Spain
- Mònica Monteagudo-Zaragoza
  - Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, 08007, Spain
- Concepción Violán
  - Unitat de Suport a la Recerca Metropolitana Nord, Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Mataró, 08303, Spain
  - Direcció d’Atenció Primària Metropolitana Nord, Institut Català de Salut, Badalona, 08915, Spain
  - Fundació Institut d’Investigació en ciències de la salut Germans Trias i Pujol (IGTP), Badalona, 08916, Spain
  - Fundació UAB, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, 08193, Spain
- Edurne Zabaleta-del-Olmo
  - Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, 08007, Spain
  - Gerència Territorial de Barcelona, Institut Català de la Salut, Carrer de Balmes 22, Barcelona, 08007, Spain
  - Nursing Department, Faculty of Nursing, Universitat de Girona, Girona, 17003, Spain
37
Ser SE, Shear K, Snigurska UA, Prosperi M, Wu Y, Magoc T, Bjarnadottir RI, Lucero RJ. Clinical Prediction Models for Hospital-Induced Delirium Using Structured and Unstructured Electronic Health Record Data: Protocol for a Development and Validation Study. JMIR Res Protoc 2023; 12:e48521. [PMID: 37943599 PMCID: PMC10667972 DOI: 10.2196/48521] [Received: 04/26/2023] [Revised: 09/01/2023] [Accepted: 09/05/2023] [Indexed: 11/10/2023] Open
Abstract
BACKGROUND Hospital-induced delirium is one of the most common and costly iatrogenic conditions, and its incidence is predicted to increase as the population of the United States ages. An academic and clinical interdisciplinary systems approach is needed to reduce the frequency and impact of hospital-induced delirium. OBJECTIVE The long-term goal of our research is to enhance the safety of hospitalized older adults by reducing iatrogenic conditions through an effective learning health system. In this study, we will develop models for predicting hospital-induced delirium. In order to accomplish this objective, we will create a computable phenotype for our outcome (hospital-induced delirium), design an expert-based traditional logistic regression model, leverage machine learning techniques to generate a model using structured data, and use machine learning and natural language processing to produce an integrated model with components from both structured data and text data. METHODS This study will explore text-based data, such as nursing notes, to improve the predictive capability of prognostic models for hospital-induced delirium. By using supervised and unsupervised text mining in addition to structured data, we will examine multiple types of information in electronic health record data to predict medical-surgical patient risk of developing delirium. Development and validation will be compliant with the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement. RESULTS Work on this project will take place through March 2024. For this study, we will use data from approximately 332,230 encounters that occurred between January 2012 and May 2021. Findings from this project will be disseminated at scientific conferences and in peer-reviewed journals.
CONCLUSIONS Success in this study will yield a durable, high-performing research-data infrastructure that will process, extract, and analyze clinical text data in near real time. This model has the potential to be integrated into the electronic health record and provide point-of-care decision support to prevent harm and improve quality of care. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) DERR1-10.2196/48521.
Affiliation(s)
- Sarah E Ser
  - Department of Epidemiology, College of Public Health and Health Professions and College of Medicine, University of Florida, Gainesville, FL, United States
- Kristen Shear
  - Department of Family, Community, and Health Systems Science, College of Nursing, University of Florida, Gainesville, FL, United States
- Urszula A Snigurska
  - Department of Family, Community, and Health Systems Science, College of Nursing, University of Florida, Gainesville, FL, United States
- Mattia Prosperi
  - Department of Epidemiology, College of Public Health and Health Professions and College of Medicine, University of Florida, Gainesville, FL, United States
- Yonghui Wu
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Tanja Magoc
  - Integrated Data Repository Research Services, University of Florida, Gainesville, FL, United States
- Ragnhildur I Bjarnadottir
  - Department of Family, Community, and Health Systems Science, College of Nursing, University of Florida, Gainesville, FL, United States
- Robert J Lucero
  - Department of Family, Community, and Health Systems Science, College of Nursing, University of Florida, Gainesville, FL, United States
  - School of Nursing, University of California Los Angeles, Los Angeles, CA, United States
38
Buick JE, Austin PC, Cheskes S, Ko DT, Atzema CL. Prediction models in prehospital and emergency medicine research: How to derive and internally validate a clinical prediction model. Acad Emerg Med 2023; 30:1150-1160. [PMID: 37266925 DOI: 10.1111/acem.14756] [Received: 03/04/2023] [Revised: 05/24/2023] [Accepted: 05/29/2023] [Indexed: 06/03/2023]
Abstract
Clinical prediction models are created to help clinicians with medical decision making, aid in risk stratification, and improve diagnosis and/or prognosis. With growing availability of both prehospital and in-hospital observational registries and electronic health records, there is an opportunity to develop, validate, and incorporate prediction models into clinical practice. However, many prediction models have high risk of bias due to poor methodology. Given that there are no methodological standards aimed at developing prediction models specifically in the prehospital setting, the objective of this paper is to describe the appropriate methodology for the derivation and validation of clinical prediction models in this setting. What follows can also be applied to the emergency medicine (EM) setting. There are eight steps that should be followed when developing and internally validating a prediction model: (1) problem definition, (2) coding of predictors, (3) addressing missing data, (4) ensuring adequate sample size, (5) variable selection, (6) evaluating model performance, (7) internal validation, and (8) model presentation. Subsequent steps include external validation, assessment of impact, and cost-effectiveness. By following these steps, researchers can develop a prediction model with the methodological rigor and quality required for prehospital and EM research.
Affiliation(s)
- Jason E Buick
  - Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
- Peter C Austin
  - Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
  - ICES, Toronto, Ontario, Canada
  - Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
- Sheldon Cheskes
  - Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
  - Division of Emergency Medicine, Department of Family and Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Dennis T Ko
  - Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
  - ICES, Toronto, Ontario, Canada
  - Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
  - Department of Medicine, University of Toronto, Toronto, Ontario, Canada
- Clare L Atzema
  - Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
  - ICES, Toronto, Ontario, Canada
  - Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
  - Division of Emergency Medicine, Department of Medicine, University of Toronto, Toronto, Ontario, Canada
39
Jiang Y, Wang C, Zhou S. Artificial intelligence-based risk stratification, accurate diagnosis and treatment prediction in gynecologic oncology. Semin Cancer Biol 2023; 96:82-99. [PMID: 37783319 DOI: 10.1016/j.semcancer.2023.09.005] [Received: 12/17/2022] [Revised: 08/27/2023] [Accepted: 09/25/2023] [Indexed: 10/04/2023]
Abstract
As a data-driven science, artificial intelligence (AI) has paved a promising path toward an evolving health system teeming with thrilling opportunities for precision oncology. Notwithstanding the tremendous success of oncological AI in such fields as lung carcinoma, breast tumor and brain malignancy, less attention has been devoted to investigating the influence of AI on gynecologic oncology. This review therefore sheds light on the ever-increasing contribution of state-of-the-art AI techniques to the refined risk stratification and whole-course management of patients with gynecologic tumors, in particular cervical, ovarian and endometrial cancer, centering on information and features extracted from clinical data (electronic health records), cancer imaging (radiological imaging, colposcopic images, and cytological and histopathological digital images), and molecular profiling (genomics, transcriptomics, metabolomics and so forth). However, there are still noteworthy challenges beyond performance validation. Thus, this work further describes the limitations and challenges faced in the real-world implementation of AI models, as well as potential solutions to address these issues.
Affiliation(s)
- Yuting Jiang
  - Department of Obstetrics and Gynecology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE and State Key Laboratory of Biotherapy, West China Second Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan 610041, China; Department of Pulmonary and Critical Care Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Chengdi Wang
  - Department of Obstetrics and Gynecology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE and State Key Laboratory of Biotherapy, West China Second Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan 610041, China; Department of Pulmonary and Critical Care Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Shengtao Zhou
  - Department of Obstetrics and Gynecology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE and State Key Laboratory of Biotherapy, West China Second Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan 610041, China; Department of Pulmonary and Critical Care Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
40
Malorgio A, Henckert D, Schweiger G, Braun J, Zacharowski K, Raimann FJ, Piekarski F, Meybohm P, Hottenrott S, Froehlich C, Spahn DR, Noethiger CB, Tscholl DW, Roche TR. Using Visual Patient to Show Vital Sign Predictions, a Computer-Based Mixed Quantitative and Qualitative Simulation Study. Diagnostics (Basel) 2023; 13:3281. [PMID: 37892102 PMCID: PMC10606017 DOI: 10.3390/diagnostics13203281] [Received: 09/18/2023] [Revised: 10/16/2023] [Accepted: 10/19/2023] [Indexed: 10/29/2023] Open
Abstract
BACKGROUND Machine learning can analyze vast amounts of data and make predictions for events in the future. Our group created machine learning models for vital sign predictions. To transport the information of these predictions without numbers and numerical values and make them easily usable for human caregivers, we aimed to integrate them into the Philips Visual-Patient-avatar, an avatar-based visualization of patient monitoring. METHODS We conducted a computer-based simulation study with 70 participants in 3 European university hospitals. We validated the vital sign prediction visualizations by testing their identification by anesthesiologists and intensivists. Each prediction visualization consisted of a condition (e.g., low blood pressure) and an urgency (a visual indication of the timespan in which the condition is expected to occur). To obtain qualitative user feedback, we also conducted standardized interviews and derived statements that participants later rated in an online survey. RESULTS The mixed logistic regression model showed 77.9% (95% CI 73.2-82.0%) correct identification of prediction visualizations (i.e., condition and urgency both correctly identified) and 93.8% (95% CI 93.7-93.8%) for conditions only (i.e., without considering urgencies). A total of 49 out of 70 participants completed the online survey. The online survey participants agreed that the prediction visualizations were fun to use (32/49, 65.3%), and that they could imagine working with them in the future (30/49, 61.2%). They also agreed that identifying the urgencies was difficult (32/49, 65.3%). CONCLUSIONS This study found that care providers correctly identified >90% of the conditions (i.e., without considering urgencies). The accuracy of identification decreased when considering urgencies in addition to conditions. 
Therefore, in future development of the technology, we will focus on either only displaying conditions (without urgencies) or improving the visualizations of urgency to enhance usability for human users.
Affiliation(s)
- Amos Malorgio
  - Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
- David Henckert
  - Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
- Giovanna Schweiger
  - Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
- Julia Braun
  - Departments of Epidemiology and Biostatistics, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, 8001 Zurich, Switzerland
- Kai Zacharowski
  - Department of Anesthesiology, Intensive Care Medicine, and Pain Therapy, University Hospital Frankfurt, Goethe University Frankfurt, 60323 Frankfurt, Germany
- Florian J. Raimann
  - Department of Anesthesiology, Intensive Care Medicine, and Pain Therapy, University Hospital Frankfurt, Goethe University Frankfurt, 60323 Frankfurt, Germany
- Florian Piekarski
  - Department of Anesthesiology, Intensive Care Medicine, and Pain Therapy, University Hospital Frankfurt, Goethe University Frankfurt, 60323 Frankfurt, Germany
- Patrick Meybohm
  - Department of Anesthesiology, Intensive Care, Emergency, and Pain Medicine, University Hospital Wuerzburg, 97070 Wuerzburg, Germany
- Sebastian Hottenrott
  - Department of Anesthesiology, Intensive Care, Emergency, and Pain Medicine, University Hospital Wuerzburg, 97070 Wuerzburg, Germany
- Corinna Froehlich
  - Department of Anesthesiology, Intensive Care, Emergency, and Pain Medicine, University Hospital Wuerzburg, 97070 Wuerzburg, Germany
- Donat R. Spahn
  - Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
- Christoph B. Noethiger
  - Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
- David W. Tscholl
  - Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
- Tadzio R. Roche
  - Institute of Anesthesiology, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
41
Chimbunde E, Sigwadhi LN, Tamuzi JL, Okango EL, Daramola O, Ngah VD, Nyasulu PS. Machine learning algorithms for predicting determinants of COVID-19 mortality in South Africa. Front Artif Intell 2023; 6:1171256. [PMID: 37899965 PMCID: PMC10600470 DOI: 10.3389/frai.2023.1171256] [Received: 03/21/2023] [Accepted: 08/15/2023] [Indexed: 10/31/2023] Open
Abstract
Background COVID-19 has strained healthcare resources, necessitating efficient prognostication to triage patients effectively. This study quantified COVID-19 risk factors and predicted COVID-19 intensive care unit (ICU) mortality in South Africa based on machine learning algorithms. Methods Data for this study were obtained from 392 COVID-19 ICU patients enrolled between 26 March 2020 and 10 February 2021. We used an artificial neural network (ANN) and random forest (RF) to predict mortality among ICU patients and a semi-parametric logistic regression with nine covariates, including a grouping variable based on K-means clustering. Further evaluation of the algorithms was performed using sensitivity, accuracy, specificity, and Cohen's kappa statistic. Results From the semi-parametric logistic regression and ANN variable importance, age, gender, cluster, presence of severe symptoms, being on the ventilator, and comorbidities of asthma significantly contributed to ICU death. In particular, the odds of mortality were six times higher among asthmatic patients than non-asthmatic patients. In univariable and multivariate regression, advanced age, PF1 and 2, FiO2, severe symptoms, asthma, oxygen saturation, and cluster 4 were strongly predictive of mortality. The RF model revealed that intubation status, age, cluster, diabetes, and hypertension were the top five significant predictors of mortality. The ANN performed well with an accuracy of 71%, a precision of 83%, an F1 score of 100%, Matthews correlation coefficient (MCC) score of 100%, and a recall of 88%. In addition, Cohen's kappa value of 0.75 verified the most extreme discriminative power of the ANN. In comparison, the RF model provided a 76% recall, an 87% precision, and a 65% MCC. Conclusion Based on the findings, we can conclude that both ANN and RF can predict COVID-19 mortality in the ICU with accuracy. The proposed models accurately predict the prognosis of COVID-19 patients after diagnosis.
The models can be used to prioritize COVID-19 patients with a high mortality risk in resource-constrained ICUs.
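The evaluation measures named in this abstract (accuracy, precision, recall, specificity, F1 score, MCC, and Cohen's kappa) can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes are present and
    both labels are predicted at least once).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total**2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical counts, chosen only to illustrate the calculation.
example = binary_metrics(tp=40, fp=10, fn=5, tn=45)
```

Because F1 is a harmonic mean, it always lies between precision and recall, which is a useful sanity check when reading reported values.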
Affiliation(s)
- Emmanuel Chimbunde
  - Division of Epidemiology and Biostatistics, Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Lovemore N. Sigwadhi
  - Division of Epidemiology and Biostatistics, Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Jacques L. Tamuzi
  - Division of Epidemiology and Biostatistics, Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Olawande Daramola
  - Department of Information Technology, Faculty of Informatics and Design, Cape Peninsula University of Technology, Cape Town, South Africa
- Veranyuy D. Ngah
  - Division of Epidemiology and Biostatistics, Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Peter S. Nyasulu
  - Division of Epidemiology and Biostatistics, Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - Division of Epidemiology and Biostatistics, School of Public Health, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
42
Ratna MB, Bhattacharya S, McLernon DJ. External validation of models for predicting cumulative live birth over multiple complete cycles of IVF treatment. Hum Reprod 2023; 38:1998-2010. [PMID: 37632223 PMCID: PMC10546080 DOI: 10.1093/humrep/dead165] [Received: 10/03/2022] [Revised: 07/28/2023] [Indexed: 08/27/2023] Open
Abstract
STUDY QUESTION Can two prediction models developed using data from 1999 to 2009 accurately predict the cumulative probability of live birth per woman over multiple complete cycles of IVF in an updated UK cohort? SUMMARY ANSWER After being updated, the models were able to estimate individualized chances of cumulative live birth over multiple complete cycles of IVF with greater accuracy. WHAT IS KNOWN ALREADY The McLernon models were the first to predict cumulative live birth over multiple complete cycles of IVF. They were converted into an online calculator called OPIS (Outcome Prediction In Subfertility) which has 3000 users per month on average. A previous study externally validated the McLernon models using a Dutch prospective cohort containing data from 2011 to 2014. With changes in IVF practice over time, it is important that the McLernon models are externally validated on a more recent cohort of patients to ensure that predictions remain accurate. STUDY DESIGN, SIZE, DURATION A population-based cohort of 91 035 women undergoing IVF in the UK between January 2010 and December 2016 was used for external validation. Data on frozen embryo transfers associated with these complete IVF cycles conducted from 1 January 2017 to 31 December 2017 were also collected. PARTICIPANTS/MATERIALS, SETTING, METHODS Data on IVF treatments were obtained from the Human Fertilisation and Embryology Authority (HFEA). The predictive performances of the McLernon models were evaluated in terms of discrimination and calibration. Discrimination was assessed using the c-statistic and calibration was assessed using calibration-in-the-large, calibration slope, and calibration plots. Where any model demonstrated poor calibration in the validation cohort, the models were updated using intercept recalibration, logistic recalibration, or model revision to improve model performance. 
MAIN RESULTS AND THE ROLE OF CHANCE Following exclusions, 91 035 women who underwent 144 734 complete cycles were included. The validation cohort had a similar distribution age profile to women in the development cohort. Live birth rates over all complete cycles of IVF per woman were higher in the validation cohort. After calibration assessment, both models required updating. The coefficients of the pre-treatment model were revised, and the updated model showed reasonable discrimination (c-statistic: 0.67, 95% CI: 0.66 to 0.68). After logistic recalibration, the post-treatment model showed good discrimination (c-statistic: 0.75, 95% CI: 0.74 to 0.76). As an example, in the updated pre-treatment model, a 32-year-old woman with 2 years of primary infertility has a 42% chance of having a live birth in the first complete ICSI cycle and a 77% chance over three complete cycles. In a couple with 2 years of primary male factor infertility where a 30-year-old woman has 15 oocytes collected in the first cycle, a single fresh blastocyst embryo transferred in the first cycle and spare embryos cryopreserved, the estimated chance of live birth provided by the post-treatment model is 46% in the first complete ICSI cycle and 81% over three complete cycles. LIMITATIONS, REASONS FOR CAUTION Two predictors from the original models, duration of infertility and previous pregnancy, which were not available in the recent HFEA dataset, were imputed using data from the older cohort used to develop the models. The HFEA dataset does not contain some other potentially important predictors, e.g. BMI, ethnicity, race, smoking and alcohol intake in women, as well as measures of ovarian reserve such as antral follicle count. WIDER IMPLICATIONS OF THE FINDINGS Both updated models show improved predictive ability and provide estimates which are more reflective of current practice and patient case mix. 
The updated OPIS tool can be used by clinicians to help shape couples' expectations by informing them of their individualized chances of live birth over a sequence of multiple complete cycles of IVF. STUDY FUNDING/COMPETING INTEREST(S) This study was supported by an Elphinstone scholarship scheme at the University of Aberdeen and Aberdeen Fertility Centre, University of Aberdeen. S.B. has a commitment of research funding from Merck. D.J.M. and M.B.R. declare support for the present manuscript from Elphinstone scholarship scheme at the University of Aberdeen and Assisted Reproduction Unit at Aberdeen Fertility Centre, University of Aberdeen. D.J.M. declares grants received by University of Aberdeen from NHS Grampian, The Meikle Foundation, and Chief Scientist Office in the past 3 years. D.J.M. declares receiving an honorarium for lectures from Merck. D.J.M. is Associate Editor of Human Reproduction Open and Statistical Advisor for Reproductive BioMed Online. S.B. declares royalties from Cambridge University Press for a book. S.B. declares receiving an honorarium for lectures from Merck, Organon, Ferring, Obstetric and Gynaecological Society of Singapore, and Taiwanese Society for Reproductive Medicine. S.B. has received support from Merck, ESHRE, and Ferring for attending meetings as speaker and is on the METAFOR and CAPRE Trials Data Monitoring Committee. TRIAL REGISTRATION NUMBER N/A.
Affiliation(s)
- Mariam B Ratna
- Institute of Applied Health Sciences, School of Medicine, Medical Sciences & Nutrition, University of Aberdeen, Aberdeen, UK
- Clinical Trials Unit, Warwick Medical School, University of Warwick, Warwick, UK
- David J McLernon
- Institute of Applied Health Sciences, School of Medicine, Medical Sciences & Nutrition, University of Aberdeen, Aberdeen, UK
43
|
Wang Y, Hou R, Ni B, Jiang Y, Zhang Y. Development and validation of a prediction model based on machine learning algorithms for predicting the risk of heart failure in middle-aged and older US people with prediabetes or diabetes. Clin Cardiol 2023; 46:1234-1243. [PMID: 37519220 PMCID: PMC10577538 DOI: 10.1002/clc.24104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 07/13/2023] [Accepted: 07/16/2023] [Indexed: 08/01/2023] Open
Abstract
BACKGROUND The purpose of this study was to develop and validate a machine learning (ML) based prediction model for the risk of heart failure (HF) in patients with prediabetes or diabetes. METHODS We used 3527 subjects aged 40 years and older with a prior diagnosis of prediabetes or diabetes from the National Health and Nutrition Examination Survey (NHANES) from 2007 to 2018. The search for independent risk variables linked to HF was conducted using univariate and multivariable logistic regression analysis. The 3527 subjects were randomly divided into training set and validation set in a 7:3 ratio. Five ML models were built on the training set using five ML algorithms, including random forest (RF), and then validated on the validation set. Receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis and Bootstrap resampling method were used to measure the predictive performance of the five ML models. RESULTS Multivariate logistic regression analysis showed that age, poverty-to-income ratio, myocardial infarction condition, coronary heart disease condition, chest pain condition, and glucose-lowering medication use were independent predictors of HF. By comparing the performance of the five ML models, the RF model (AUC = 0.978) was the best prediction model. CONCLUSIONS The risk of HF in middle-aged and elderly patients with prediabetes or diabetes can be accurately predicted using ML models. The best prediction performance is presented by RF model, which can assist doctors in making clinical decisions.
Affiliation(s)
- Yicheng Wang
- Department of Cardiovascular Medicine, Affiliated Fuzhou First Hospital of Fujian Medical University, Fuzhou, Fujian, China
- The Third Clinical Medical College, Fujian Medical University, Fuzhou, Fujian, China
- Cardiovascular Disease Research Institute of Fuzhou City, Fuzhou, Fujian, China
- Riting Hou
- Department of Cardiovascular Medicine, Affiliated Fuzhou First Hospital of Fujian Medical University, Fuzhou, Fujian, China
- The Third Clinical Medical College, Fujian Medical University, Fuzhou, Fujian, China
- Cardiovascular Disease Research Institute of Fuzhou City, Fuzhou, Fujian, China
- Binghang Ni
- Department of Cardiovascular Medicine, Affiliated Fuzhou First Hospital of Fujian Medical University, Fuzhou, Fujian, China
- The Third Clinical Medical College, Fujian Medical University, Fuzhou, Fujian, China
- Cardiovascular Disease Research Institute of Fuzhou City, Fuzhou, Fujian, China
- Yu Jiang
- Department of Cardiovascular Medicine, Affiliated Fuzhou First Hospital of Fujian Medical University, Fuzhou, Fujian, China
- The Third Clinical Medical College, Fujian Medical University, Fuzhou, Fujian, China
- Cardiovascular Disease Research Institute of Fuzhou City, Fuzhou, Fujian, China
- Yan Zhang
- Department of Cardiovascular Medicine, Affiliated Fuzhou First Hospital of Fujian Medical University, Fuzhou, Fujian, China
- The Third Clinical Medical College, Fujian Medical University, Fuzhou, Fujian, China
- Cardiovascular Disease Research Institute of Fuzhou City, Fuzhou, Fujian, China
44
|
Lewis AE, Weiskopf N, Abrams ZB, Foraker R, Lai AM, Payne PRO, Gupta A. Electronic health record data quality assessment and tools: a systematic review. J Am Med Inform Assoc 2023; 30:1730-1740. [PMID: 37390812 PMCID: PMC10531113 DOI: 10.1093/jamia/ocad120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Revised: 05/16/2023] [Accepted: 06/23/2023] [Indexed: 07/02/2023] Open
Abstract
OBJECTIVE We extended a 2013 literature review on electronic health record (EHR) data quality assessment approaches and tools to determine recent improvements or changes in EHR data quality assessment methodologies. MATERIALS AND METHODS We completed a systematic review of PubMed articles from 2013 to April 2023 that discussed the quality assessment of EHR data. We screened and reviewed papers for the dimensions and methods defined in the original 2013 manuscript. We categorized papers as data quality outcomes of interest, tools, or opinion pieces. We abstracted and defined additional themes and methods through an iterative review process. RESULTS We included 103 papers in the review, of which 73 were data quality outcomes of interest papers, 22 were tools, and 8 were opinion pieces. The most common dimension of data quality assessed was completeness, followed by correctness, concordance, plausibility, and currency. We abstracted conformance and bias as 2 additional dimensions of data quality and structural agreement as an additional methodology. DISCUSSION There has been an increase in EHR data quality assessment publications since the original 2013 review. Consistent dimensions of EHR data quality continue to be assessed across applications. Despite consistent patterns of assessment, there is still no standard approach for assessing EHR data quality. CONCLUSION Guidelines are needed for EHR data quality assessment to improve the efficiency, transparency, comparability, and interoperability of data quality assessment. These guidelines must be both scalable and flexible. Automation could be helpful in generalizing this process.
Affiliation(s)
- Abigail E Lewis
- Division of Computational and Data Sciences, Washington University in St. Louis, St. Louis, Missouri, USA
- Institute for Informatics, Data Science and Biostatistics, Washington University in St. Louis, St. Louis, Missouri, USA
- Nicole Weiskopf
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
- Zachary B Abrams
- Institute for Informatics, Data Science and Biostatistics, Washington University in St. Louis, St. Louis, Missouri, USA
- Randi Foraker
- Institute for Informatics, Data Science and Biostatistics, Washington University in St. Louis, St. Louis, Missouri, USA
- Albert M Lai
- Institute for Informatics, Data Science and Biostatistics, Washington University in St. Louis, St. Louis, Missouri, USA
- Philip R O Payne
- Institute for Informatics, Data Science and Biostatistics, Washington University in St. Louis, St. Louis, Missouri, USA
- Aditi Gupta
- Institute for Informatics, Data Science and Biostatistics, Washington University in St. Louis, St. Louis, Missouri, USA
45
|
Clichet V, Lebon D, Chapuis N, Zhu J, Bardet V, Marolleau JP, Garçon L, Caulier A, Boyer T. Artificial intelligence to empower diagnosis of myelodysplastic syndromes by multiparametric flow cytometry. Haematologica 2023; 108:2435-2443. [PMID: 36924240 PMCID: PMC10483367 DOI: 10.3324/haematol.2022.282370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 03/07/2023] [Indexed: 03/18/2023] Open
Abstract
The diagnosis of myelodysplastic syndromes (MDS) can be challenging and relies on the convergence of cytological, cytogenetic, and molecular factors. Multiparametric flow cytometry (MFC) helps diagnose MDS, especially when other features do not contribute to the decision-making process, but its usefulness remains underestimated, mostly due to a lack of standardization of cytometers. We present here an innovative model integrating artificial intelligence (AI) with MFC to improve the diagnosis and the classification of MDS. We developed a machine learning model using an elasticnet algorithm on a cohort of 191 patients, based only on flow cytometry parameters selected by the Boruta algorithm, to build a simple but reliable prediction score with five parameters. Our AI-assisted MDS prediction score greatly improves the sensitivity of the Ogata score while keeping an excellent specificity validated on an external cohort of 89 patients with an Area Under the Curve of 0.935. This model allows the diagnosis of both high- and low-risk MDS with 91.8% sensitivity and 92.5% specificity. Interestingly, it highlights a progressive evolution of the score from clonal hematopoiesis of indeterminate potential (CHIP) to high-risk MDS, suggesting a linear evolution between these different stages. By significantly decreasing the overall misclassification of 52% for patients with MDS and of 31.3% for those without MDS (P=0.02), our AI-assisted prediction score outperforms the Ogata score and positions itself as a reliable tool to help diagnose MDS.
Affiliation(s)
- Valentin Clichet
- Service d’Hématologie Biologique, CHU Amiens-Picardie, Amiens, France
- Delphine Lebon
- Service d’Hématologie Clinique et de Thérapie Cellulaire, CHU Amiens-Picardie, Amiens, France
- HEMATIM, EA 4666, Université Picardie Jules Verne, Amiens, France
- Nicolas Chapuis
- Assistance Publique-Hôpitaux de Paris, Centre-Université Paris Cité, Service d’Hématologie Biologique, Hôpital Cochin, Paris, France
- Jaja Zhu
- Service d’Hématologie-Immunologie-Transfusion, CHU Ambroise Paré, INSERM UMR 1184, AP-HP, Université Paris Saclay, 92100 Boulogne Billancourt, France
- Valérie Bardet
- Service d’Hématologie-Immunologie-Transfusion, CHU Ambroise Paré, INSERM UMR 1184, AP-HP, Université Paris Saclay, 92100 Boulogne Billancourt, France
- Jean-Pierre Marolleau
- Service d’Hématologie Clinique et de Thérapie Cellulaire, CHU Amiens-Picardie, Amiens, France
- HEMATIM, EA 4666, Université Picardie Jules Verne, Amiens, France
- Loïc Garçon
- Service d’Hématologie Biologique, CHU Amiens-Picardie, Amiens, France
- HEMATIM, EA 4666, Université Picardie Jules Verne, Amiens, France
- Alexis Caulier
- Service d’Hématologie Clinique et de Thérapie Cellulaire, CHU Amiens-Picardie, Amiens, France
- HEMATIM, EA 4666, Université Picardie Jules Verne, Amiens, France
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Hematology/Oncology, Boston Children’s Hospital, Harvard Medical School, Cambridge, MA, USA
- Thomas Boyer
- Service d’Hématologie Biologique, CHU Amiens-Picardie, Amiens, France
- HEMATIM, EA 4666, Université Picardie Jules Verne, Amiens, France
46
|
Okada Y, Mertens M, Liu N, Lam SSW, Ong MEH. AI and machine learning in resuscitation: Ongoing research, new concepts, and key challenges. Resusc Plus 2023; 15:100435. [PMID: 37547540 PMCID: PMC10400904 DOI: 10.1016/j.resplu.2023.100435] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 08/08/2023] Open
Abstract
Aim Artificial intelligence (AI) and machine learning (ML) are important areas of computer science that have recently attracted attention for their application to medicine. However, as techniques continue to advance and become more complex, it is increasingly challenging for clinicians to stay abreast of the latest research. This overview aims to translate research concepts and potential concerns to healthcare professionals interested in applying AI and ML to resuscitation research but who are not experts in the field. Main text We present various research including prediction models using structured and unstructured data, exploring treatment heterogeneity, reinforcement learning, language processing, and large-scale language models. These studies potentially offer valuable insights for optimizing treatment strategies and clinical workflows. However, implementing AI and ML in clinical settings presents its own set of challenges. The availability of high-quality and reliable data is crucial for developing accurate ML models. A rigorous validation process and the integration of ML into clinical practice is essential for practical implementation. We furthermore highlight the potential risks associated with self-fulfilling prophecies and feedback loops, emphasizing the importance of transparency, interpretability, and trustworthiness in AI and ML models. These issues need to be addressed in order to establish reliable and trustworthy AI and ML models. Conclusion In this article, we overview concepts and examples of AI and ML research in the resuscitation field. Moving forward, appropriate understanding of ML and collaboration with relevant experts will be essential for researchers and clinicians to overcome the challenges and harness the full potential of AI and ML in resuscitation.
Affiliation(s)
- Yohei Okada
- Duke-NUS Medical School, National University of Singapore, Singapore
- Preventive Services, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Mayli Mertens
- Antwerp Center for Responsible AI, Antwerp University, Belgium
- Centre for Ethics, Department of Philosophy, Antwerp University, Belgium
- Nan Liu
- Duke-NUS Medical School, National University of Singapore, Singapore
- Sean Shao Wei Lam
- Duke-NUS Medical School, National University of Singapore, Singapore
- Marcus Eng Hock Ong
- Duke-NUS Medical School, National University of Singapore, Singapore
- Department of Emergency Medicine, Singapore General Hospital
47
|
Vasey B, Collins GS. Invited Commentary: Transparent reporting of artificial intelligence models development and evaluation in surgery: The TRIPOD and DECIDE-AI checklists. Surgery 2023; 174:727-729. [PMID: 37244769 DOI: 10.1016/j.surg.2023.04.037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 04/27/2023] [Indexed: 05/29/2023]
Affiliation(s)
- Baptiste Vasey
- Nuffield Department of Surgical Sciences, University of Oxford, UK; Department of Surgery, Geneva University Hospital, Switzerland.
- Gary S Collins
- Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, UK. http://www.twitter.com/GSCollins
48
|
Dhiman P, Ma J, Qi C, Bullock G, Sergeant JC, Riley RD, Collins GS. Sample size requirements are not being considered in studies developing prediction models for binary outcomes: a systematic review. BMC Med Res Methodol 2023; 23:188. [PMID: 37598153 PMCID: PMC10439652 DOI: 10.1186/s12874-023-02008-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Accepted: 08/04/2023] [Indexed: 08/21/2023] Open
Abstract
BACKGROUND Having an appropriate sample size is important when developing a clinical prediction model. We aimed to review how sample size is considered in studies developing a prediction model for a binary outcome. METHODS We searched PubMed for studies published between 01/07/2020 and 30/07/2020 and reviewed the sample size calculations used to develop the prediction models. Using the available information, we calculated the minimum sample size that would be needed to estimate overall risk and minimise overfitting in each study and summarised the difference between the calculated and used sample size. RESULTS A total of 119 studies were included, of which nine (8%) provided sample size justification. The recommended minimum sample size could be calculated for 94 studies: 73% (95% CI: 63-82%) used sample sizes lower than required to estimate overall risk and minimise overfitting, including 26% of studies that used sample sizes lower than required to estimate overall risk only. A similar number of studies did not meet the ≥ 10 EPV criteria (75%, 95% CI: 66-84%). The median deficit of the number of events used to develop a model was 75 (IQR: 234 lower to 7 higher), which reduced to 63 (IQR: 225 lower to 7 higher) if the total available data (before any data splitting) were used. Studies that met the minimum required sample size had a median c-statistic of 0.84 (IQR: 0.80 to 0.9) and studies where the minimum sample size was not met had a median c-statistic of 0.83 (IQR: 0.75 to 0.9). Studies that met the ≥ 10 EPP criteria had a median c-statistic of 0.80 (IQR: 0.73 to 0.84). CONCLUSIONS Prediction models are often developed with no sample size calculation; as a consequence, many are too small to precisely estimate the overall risk. We encourage researchers to justify, perform and report sample size calculations when developing a prediction model.
Affiliation(s)
- Paula Dhiman
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK.
- Jie Ma
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK
- Cathy Qi
- Population Data Science, Faculty of Medicine, Health and Life Science, Swansea University Medical School, Swansea University, Singleton Park, Swansea, SA2 8PP, UK
- Garrett Bullock
- Department of Orthopaedic Surgery, Wake Forest School of Medicine, Winston-Salem, NC, USA
- Centre for Sport, Exercise and Osteoarthritis Research Versus Arthritis, University of Oxford, Oxford, UK
- Jamie C Sergeant
- Centre for Biostatistics, University of Manchester, Manchester Academic Health Science Centre, Manchester, M13 9PL, UK
- Centre for Epidemiology Versus Arthritis, Centre for Musculoskeletal Research, University of Manchester, Manchester Academic Health Science Centre, Manchester, M13 9PT, UK
- Richard D Riley
- Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, B15 2TT, Birmingham, UK
- Gary S Collins
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK
49
|
Lapi F, Nuti L, Marconi E, Medea G, Cricelli I, Papi M, Gorini M, Fiorani M, Piccinocchi G, Cricelli C. To predict the risk of chronic kidney disease (CKD) using Generalized Additive2 Models (GA2M). J Am Med Inform Assoc 2023; 30:1494-1502. [PMID: 37330672 PMCID: PMC10436146 DOI: 10.1093/jamia/ocad097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 03/13/2023] [Accepted: 05/27/2023] [Indexed: 06/19/2023] Open
Abstract
OBJECTIVE To train and test a model predicting chronic kidney disease (CKD) using the Generalized Additive2 Model (GA2M), and compare it with other models obtained with traditional or machine learning approaches. MATERIALS We adopted the Health Search Database (HSD), which is a representative longitudinal database containing electronic healthcare records of approximately 2 million adults. METHODS We selected all patients aged 15 years or older who were active in HSD between January 1, 2018 and December 31, 2020 with no prior diagnosis of CKD. The following models were trained and tested using 20 candidate determinants for incident CKD: logistic regression, Random Forest, Gradient Boosting Machines (GBMs), GAM, and GA2M. Their prediction performances were compared by calculating Area Under Curve (AUC) and Average Precision (AP). RESULTS Comparing the predictive performances of the 7 models, GBM and GA2M showed the highest AUC (88.9% and 88.8%, respectively) and AP (21.8% and 21.1%, respectively). These 2 models outperformed the others, including logistic regression. In contrast to GBMs, GA2M kept the interpretability of variable combinations, including interactions and nonlinearities assessment. DISCUSSION Although GA2M performed slightly worse than light GBM, it is not a "black-box" algorithm and can be readily interpreted using shape and heatmap functions. This evidence supports the view that machine learning techniques should be adopted in case of complex algorithms such as those predicting the risk of CKD. CONCLUSION The GA2M performed reliably in predicting CKD in primary care. A related decision support system might therefore be implemented.
Affiliation(s)
- Francesco Lapi
- Health Search, Italian College of General Practitioners and Primary Care, Florence, Italy
- Ettore Marconi
- Health Search, Italian College of General Practitioners and Primary Care, Florence, Italy
- Gerardo Medea
- Italian College of General Practitioners and Primary Care, Florence, Italy
- Claudio Cricelli
- Italian College of General Practitioners and Primary Care, Florence, Italy
50
|
Ahmed MS, Ahmed N. A Fast and Minimal System to Identify Depression Using Smartphones: Explainable Machine Learning-Based Approach. JMIR Form Res 2023; 7:e28848. [PMID: 37561568 PMCID: PMC10450542 DOI: 10.2196/28848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 03/17/2023] [Accepted: 03/19/2023] [Indexed: 08/11/2023] Open
Abstract
BACKGROUND Existing robust, pervasive device-based systems developed in recent years to detect depression require data collected over a long period and may not be effective in cases where early detection is crucial. Additionally, due to the requirement of running systems in the background for prolonged periods, existing systems can be resource inefficient. As a result, these systems can be infeasible in low-resource settings. OBJECTIVE Our main objective was to develop a minimalistic system to identify depression using data retrieved in the fastest possible time. Another objective was to explain the machine learning (ML) models that were best for identifying depression. METHODS We developed a fast tool that retrieves the past 7 days' app usage data in 1 second (mean 0.31, SD 1.10 seconds). A total of 100 students from Bangladesh participated in our study, and our tool collected their app usage data and responses to the Patient Health Questionnaire-9. To identify depressed and nondepressed students, we developed a diverse set of ML models: linear, tree-based, and neural network-based models. We selected important features using the stable approach, along with 3 main types of feature selection (FS) approaches: filter, wrapper, and embedded methods. We developed and validated the models using the nested cross-validation method. Additionally, we explained the best ML models through the Shapley additive explanations (SHAP) method. RESULTS Leveraging only the app usage data retrieved in 1 second, our light gradient boosting machine model used the important features selected by the stable FS approach and correctly identified 82.4% (n=42) of depressed students (precision=75%, F1-score=78.5%). Moreover, after comprehensive exploration, we presented a parsimonious stacking model where around 5 features selected by the all-relevant FS approach Boruta were used in each iteration of validation and showed a maximum precision of 77.4% (balanced accuracy=77.9%). 
Feature importance analysis suggested app usage behavioral markers containing diurnal usage patterns as being more important than aggregated data-based markers. In addition, a SHAP analysis of our best models presented behavioral markers that were related to depression. For instance, students who were not depressed spent more time on education apps on weekdays, whereas those who were depressed used a higher number of photo and video apps and also had a higher deviation in using photo and video apps over the morning, afternoon, evening, and night time periods of the weekend. CONCLUSIONS Due to our system's fast and minimalistic nature, it may make a worthwhile contribution to identifying depression in underdeveloped and developing regions. In addition, our detailed discussion about the implication of our findings can facilitate the development of less resource-intensive systems to better understand students who are depressed and take steps for intervention.
Affiliation(s)
- Md Sabbir Ahmed
- Design Inclusion and Access Lab, North South University, Dhaka, Bangladesh
- Nova Ahmed
- Design Inclusion and Access Lab, North South University, Dhaka, Bangladesh