1
Shim S, Kim MS, Yae CG, Kang YK, Do JR, Kim HK, Yang HL. Development and validation of a multi-stage self-supervised learning model for optical coherence tomography image classification. J Am Med Inform Assoc 2025; 32:800-810. [PMID: 40037789 DOI: 10.1093/jamia/ocaf021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 11/03/2024] [Accepted: 01/23/2025] [Indexed: 03/06/2025]
Abstract
OBJECTIVE This study aimed to develop a novel multi-stage self-supervised learning model tailored for the accurate classification of optical coherence tomography (OCT) images in ophthalmology, reducing reliance on costly labeled datasets while maintaining high diagnostic accuracy. MATERIALS AND METHODS A private dataset of 2719 OCT images from 493 patients was employed, along with 3 public datasets comprising 84 484 images from 4686 patients, 3231 images from 45 patients, and 572 images. Extensive internal, external, and clinical validations were performed to assess model performance. Grad-CAM was employed for qualitative analysis to interpret the model's decisions by highlighting relevant areas. Subsampling analyses evaluated the model's robustness with varying labeled data availability. RESULTS The proposed model outperformed conventional supervised or self-supervised learning-based models, achieving state-of-the-art results across 3 public datasets. In a clinical validation, the model exhibited up to 17.50% higher accuracy and 17.53% higher macro F-1 score than a supervised learning-based model under limited training data. DISCUSSION The model's robustness in OCT image classification underscores the potential of multi-stage self-supervised learning to address challenges associated with limited labeled data. The availability of source codes and pre-trained models promotes the use of this model in a variety of clinical settings, facilitating broader adoption. CONCLUSION This model offers a promising solution for advancing OCT image classification, achieving high accuracy while reducing the cost of extensive expert annotation and potentially streamlining clinical workflows, thereby supporting more efficient patient management.
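The abstract above reports a macro F-1 score alongside accuracy. As a reading aid, the following minimal sketch shows how a macro F-1 score is usually computed: the per-class F-1 scores are averaged with equal weight, so rare classes count as much as common ones. The class names and labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of the per-class F-1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    per_class = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F-1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical four-class labels, for illustration only.
y_true = ["A", "A", "B", "C", "D", "D"]
y_pred = ["A", "B", "B", "C", "D", "A"]
score = macro_f1(y_true, y_pred)
print(f"macro F-1 = {score:.4f}")
```

Because each class contributes equally, a model that ignores a small class is penalized more heavily by macro F-1 than by plain accuracy.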
Affiliation(s)
- Sungho Shim
  - Department of Electrical Engineering and Computer Science, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu 42988, Republic of Korea
- Min-Soo Kim
  - School of Computing, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- Che Gyem Yae
  - Department of Ophthalmology, School of Medicine, Kyungpook National University, Daegu 41944, Republic of Korea
- Yong Koo Kang
  - Department of Ophthalmology, School of Medicine, Kyungpook National University, Daegu 41944, Republic of Korea
- Jae Rock Do
  - Department of Ophthalmology, School of Medicine, Kyungpook National University, Daegu 41944, Republic of Korea
- Hong Kyun Kim
  - Department of Ophthalmology, School of Medicine, Kyungpook National University, Daegu 41944, Republic of Korea
- Hyun-Lim Yang
  - Office of Hospital Information, Seoul National University Hospital, Seoul 03080, Republic of Korea
  - Innovative Medical Technology Research Institute, Seoul National University Hospital, Seoul 03080, Republic of Korea
  - Department of Medicine, College of Medicine, Seoul National University, Seoul 03080, Republic of Korea
2
Melia R, Musacchio Schafer K, Rogers ML, Wilson-Lemoine E, Joiner TE. The Application of AI to Ecological Momentary Assessment Data in Suicide Research: Systematic Review. J Med Internet Res 2025; 27:e63192. [PMID: 40245396 DOI: 10.2196/63192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Revised: 10/24/2024] [Accepted: 02/11/2025] [Indexed: 04/19/2025]
Abstract
BACKGROUND Ecological momentary assessment (EMA) captures dynamic processes suitable to the study of suicidal ideation and behaviors. Artificial intelligence (AI) has increasingly been applied to EMA data in the study of suicidal processes. OBJECTIVE This review aims to (1) synthesize empirical research applying AI strategies to EMA data in the study of suicidal ideation and behaviors; (2) identify methodologies and data collection procedures used, suicide outcomes studied, AI applied, and results reported; and (3) develop a standardized reporting framework for researchers applying AI to EMA data in the future. METHODS PsycINFO, PubMed, Scopus, and Embase were searched for published articles applying AI to EMA data in the investigation of suicide outcomes. The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines were used to identify studies while minimizing bias. Quality appraisal was performed using CREMAS (adapted STROBE [Strengthening the Reporting of Observational Studies in Epidemiology] Checklist for Reporting Ecological Momentary Assessment Studies). RESULTS In total, 1201 records were identified across databases. After a full-text review, 12 (1%) articles, comprising 4398 participants, were included. In the application of AI to EMA data to predict suicidal ideation, studies reported mean area under the curve (0.74-0.86), sensitivity (0.64-0.81), specificity (0.73-0.86), and positive predictive values (0.72-0.77). Studies met between 4 and 13 of the 16 recommended CREMAS reporting standards, with an average of 7 items met across studies. Studies performed poorly in reporting EMA training procedures and treatment of missing data. CONCLUSIONS Findings indicate the promise of AI applied to self-report EMA in the prediction of near-term suicidal ideation. The application of AI to EMA data within suicide research is a burgeoning area hampered by variations in data collection and reporting procedures. 
The development of an adapted reporting framework by the research team aims to address this. TRIAL REGISTRATION Open Science Framework (OSF); https://doi.org/10.17605/OSF.IO/NZWUJ and PROSPERO CRD42023440218; https://www.crd.york.ac.uk/PROSPERO/view/CRD42023440218.
Affiliation(s)
- Ruth Melia
  - Health Research Institute, University of Limerick, Limerick, Ireland
  - Psychology Department, Florida State University, Tallahassee, FL, United States
- Megan L Rogers
  - Department of Psychology, Texas State University, San Marcos, TX, United States
- Emma Wilson-Lemoine
  - Department of Psychological Medicine, King's College London, London, United Kingdom
  - Department of Psychology, University of Virginia, Austin, TX, United States
- Thomas Ellis Joiner
  - Psychology Department, Florida State University, Tallahassee, FL, United States
3
Shoham G, Zuckerman T, Fliss E, Govrin O, Zaretski A, Singolda R, Kedar DJ, Leshem D, Madah E, Arad E, Barnea Y. Utilizing Artificial Intelligence for Predicting Postoperative Complications in Breast Reduction Surgery: A Comprehensive Retrospective Analysis of Predictive Features and Outcomes. Aesthet Surg J 2025; 45:536-541. [PMID: 39899336 DOI: 10.1093/asj/sjaf021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2024] [Revised: 01/19/2025] [Accepted: 01/30/2025] [Indexed: 02/04/2025]
Abstract
BACKGROUND Breast reduction is a common procedure with growing rates in the United States of America, aimed at alleviating the physical and psychological burdens of macromastia. Despite high success rates, it carries a risk of complications, with incidence rates ranging from 6.2% to 43%. OBJECTIVES The authors developed a machine learning model using gradient-boosting decision trees to predict severe breast reduction complications up to 30 days following surgery requiring inpatient care. METHODS This retrospective study included 322 cases of breast reduction surgery performed at the Tel Aviv Medical Center from 2017 to 2024. Model performance was evaluated using 5-fold cross-validation, and key metrics such as area under the receiver operating characteristic curve, accuracy, sensitivity, and specificity were reported. An interpretability tool was also created to visualize complication risks based on clinical features. RESULTS Severe complications occurred in 7.4% of cases. Key predictive factors included specimen weight, SN-N distance, and liposuction volume. The model achieved an area under the receiver operating characteristic curve (AUC-ROC) of 0.83, with an accuracy of 0.93 and a negative predictive value (NPV) of 0.95. The interpretability tool clearly visualized complication risks, aiding preoperative counseling. CONCLUSIONS This is the first study to use artificial intelligence (AI) to predict severe complications in breast reduction surgery. In this study, the AI model, with an AUC-ROC of 0.83 and NPV of 0.95, offers a reliable tool for surgical planning and patient education. Further validation across diverse populations is recommended to confirm its clinical utility. LEVEL OF EVIDENCE: 4 (RISK)
4
Nayak UU, Shanbhag S, Panakkal NC, J V, Mohapatra S. Predictive modeling of presenteeism among radiographers: a secondary analysis of comprehensive data using Bayesian neural network. International Journal of Occupational Safety and Ergonomics 2025:1-11. [PMID: 40178048 DOI: 10.1080/10803548.2025.2480934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/05/2025]
Abstract
TRIAL REGISTRATION Clinical Trials Registry - India identifier: CTRI/2021/09/036992.
Affiliation(s)
- Ullas U Nayak
  - Division of Anatomy, Department of Basic Medical Sciences, Manipal Academy of Higher Education, Manipal, India
- Shivanath Shanbhag
  - Department of Medical Imaging Technology, Manipal College of Health Professions, Manipal Academy of Higher Education, Manipal, India
- Nitika C Panakkal
  - Department of Medical Imaging Technology, Manipal College of Health Professions, Manipal Academy of Higher Education, Manipal, India
- Vennila J
  - Statistics, Manipal College of Health Professions, Manipal Academy of Higher Education, Manipal, India
- Sidhiprada Mohapatra
  - Centre for Comprehensive Rehabilitation, Department of Physiotherapy, Manipal College of Health Professions, Manipal Academy of Higher Education, Manipal, India
5
Park C, Han C, Jang SK, Kim H, Kim S, Kang BH, Jung K, Yoon D. Development and Validation of a Machine Learning Model for Early Prediction of Delirium in Intensive Care Units Using Continuous Physiological Data: Retrospective Study. J Med Internet Res 2025; 27:e59520. [PMID: 40173433 PMCID: PMC12004028 DOI: 10.2196/59520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 08/08/2024] [Accepted: 02/17/2025] [Indexed: 04/04/2025]
Abstract
BACKGROUND Delirium in intensive care unit (ICU) patients poses a significant challenge, affecting patient outcomes and health care efficiency. Developing an accurate, real-time prediction model for delirium represents an advancement in critical care, addressing needs for timely intervention and resource optimization in ICUs. OBJECTIVE We aimed to create a novel machine learning model for delirium prediction in ICU patients using only continuous physiological data. METHODS We developed models integrating routinely available clinical data, such as age, sex, and patient monitoring device outputs, to ensure practicality and adaptability in diverse clinical settings. To confirm the reliability of delirium determination records, we prospectively collected results of Confusion Assessment Method for the ICU (CAM-ICU) evaluations performed by qualified investigators from May 17, 2021, to December 23, 2022, determining Cohen κ coefficients. Participants were included in the study if they were aged ≥18 years at ICU admission, had delirium evaluations using the CAM-ICU, and had data collected for at least 4 hours before delirium diagnosis or nondiagnosis. The development cohort from Yongin Severance Hospital (March 1, 2020, to January 12, 2022) comprised 5478 records: 5129 (93.62%) records from 651 patients for training and 349 (6.37%) records from 163 patients for internal validation. For temporal validation, we used 4438 records from the same hospital (January 28, 2022, to December 31, 2022) to reflect potential seasonal variations. External validation was performed using data from 670 patients at Ajou University Hospital (March 2022 to September 2022). We evaluated machine learning algorithms (random forest [RF], extra-trees classifier, and light gradient boosting machine) and selected the RF model as the final model based on its performance. To confirm clinical utility, a decision curve analysis and temporal pattern for model prediction during the ICU stay were performed. 
RESULTS The κ coefficient between labels generated by ICU nurses and prospectively verified by qualified researchers was 0.81, indicating reliable CAM-ICU results. Our final model showed robust performance in internal validation (area under the receiver operating characteristic curve [AUROC]: 0.82; area under the precision-recall curve [AUPRC]: 0.62) and maintained its accuracy in temporal validation (AUROC: 0.73; AUPRC: 0.85). External validation supported its effectiveness (AUROC: 0.84; AUPRC: 0.77). Decision curve analysis showed a positive net benefit at all thresholds, and the temporal pattern analysis showed a gradual increase in the model scores as the actual delirium diagnosis time approached. CONCLUSIONS We developed a machine learning model for delirium prediction in ICU patients using routinely measured variables, including physiological waveforms. Our study demonstrates the potential of the RF model in predicting delirium, with consistent performance across various validation scenarios. The model uses noninvasive variables, making it applicable to a wide range of ICU patients, with minimal additional risk.
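The abstract above cites a decision curve analysis with positive net benefit. As a reading aid, here is a minimal sketch of the standard net benefit calculation at a single threshold probability, NB = TP/n - (FP/n) * pt / (1 - pt), compared with a "treat all" strategy. The labels and scores are hypothetical and do not come from the study; the function name is also an assumption made for illustration.

```python
def net_benefit(y_true, scores, threshold):
    """Net benefit at one threshold: TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(y_true)
    # A case is flagged when its predicted risk reaches the threshold.
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical outcomes (1 = delirium) and model scores, for illustration only.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.1, 0.3, 0.6, 0.05]

nb_model = net_benefit(y_true, scores, threshold=0.5)
# "Treat all" flags every patient, which is the same as scoring everyone 1.0.
nb_treat_all = net_benefit(y_true, [1.0] * len(y_true), threshold=0.5)
print(f"model: {nb_model:.3f}, treat all: {nb_treat_all:.3f}")
```

A decision curve repeats this calculation over a range of thresholds; a model is considered useful at a threshold when its net benefit exceeds both "treat all" and "treat none" (net benefit 0).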
Affiliation(s)
- Chanmin Park
  - Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
- Changho Han
  - Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
- Sora Kim
  - Ajou University Hospital Gyeonggi South Regional Trauma Center, Suwon, Republic of Korea
- Byung Hee Kang
  - Department of Surgery, Division of Trauma Surgery, Ajou University School of Medicine, Suwon, Republic of Korea
- Kyoungwon Jung
  - Department of Surgery, Division of Trauma Surgery, Ajou University School of Medicine, Suwon, Republic of Korea
- Dukyong Yoon
  - Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
6
Ogwel B, Mzazi VH, Nyawanda BO, Otieno G, Tickell KD, Omore R. A machine learning approach to predicting inpatient mortality among pediatric acute gastroenteritis patients in Kenya. Learn Health Syst 2025; 9:e10478. [PMID: 40247897 PMCID: PMC12000769 DOI: 10.1002/lrh2.10478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Revised: 10/22/2024] [Accepted: 12/10/2024] [Indexed: 04/19/2025]
Abstract
Background Mortality prediction scores for children admitted with diarrhea are unavailable, and early identification of at-risk patients for proper management remains a challenge. This study utilizes machine learning (ML) to develop a highly sensitive model for timelier identification of at-risk children admitted with acute gastroenteritis (AGE) for better management. Methods We used seven ML algorithms to build prognostic models for the prediction of mortality using de-identified data collected from children aged <5 years hospitalized with AGE at Siaya County Referral Hospital (SCRH), Kenya, from 2010 through 2020. Potential predictors included demographic, medical history, and clinical examination data collected at admission to hospital. We conducted split-sampling and employed tenfold cross-validation in the model development. We evaluated the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the area under the curve (AUC) for each of the models. Results During the study period, 12 546 children aged <5 years admitted at SCRH were enrolled in the inpatient disease surveillance, of whom 2271 (18.1%) had AGE and 164 (7.2%) subsequently died. The following features were identified as predictors of mortality in decreasing order: AVPU scale, Vesikari score, dehydration, sunken eyes, skin pinch, maximum number of vomits, unconsciousness, wasting, vomiting, pulse, fever, sunken fontanelle, restless, nasal flaring, diarrhea days, stridor, <90% oxygen saturation, chest indrawing, malaria, and stunting. The sensitivity ranged from 46.3% to 78.0% across models, while the specificity and AUC ranged from 71.7% to 78.7% and 56.5% to 82.6%, respectively. The random forest model emerged as the champion model, achieving 78.0%, 76.6%, 20.6%, 97.8%, and 82.6% for sensitivity, specificity, PPV, NPV, and AUC, respectively.
Conclusions This study demonstrates promising predictive performance of the proposed algorithm for identifying patients at risk of mortality in resource-limited settings. However, further validation in real-world clinical settings is needed to assess its feasibility and potential impact on patient outcomes.
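The low PPV (20.6%) next to the high NPV (97.8%) reported above follows largely from the low mortality rate. As a reading aid, this minimal sketch applies Bayes' rule to the reported sensitivity, specificity, and overall mortality (164 of 2271). It assumes the evaluation data had roughly the same prevalence as the full cohort, which the abstract does not state, so the close agreement is illustrative rather than a reproduction of the authors' calculation.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Figures reported in the abstract for the random forest model.
sensitivity = 0.780
specificity = 0.766
# Assumption: prevalence in the evaluation data matches the whole cohort.
prevalence = 164 / 2271

ppv, npv = predictive_values(sensitivity, specificity, prevalence)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With an outcome this rare, even a specificity near 77% produces several false alarms for every true positive, which is why PPV stays low while NPV stays high.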
Affiliation(s)
- Billy Ogwel
  - Kenya Medical Research Institute-Center for Global Health Research (KEMRI-CGHR), Kisumu, Kenya
  - Department of Information Systems, University of South Africa, Pretoria, South Africa
- Vincent H. Mzazi
  - Department of Information Systems, University of South Africa, Pretoria, South Africa
- Bryan O. Nyawanda
  - Kenya Medical Research Institute-Center for Global Health Research (KEMRI-CGHR), Kisumu, Kenya
- Gabriel Otieno
  - Department of Computing, United States International University, Nairobi, Kenya
- Kirkby D. Tickell
  - Department of Global Health, University of Washington, Seattle, Washington, USA
- Richard Omore
  - Kenya Medical Research Institute-Center for Global Health Research (KEMRI-CGHR), Kisumu, Kenya
7
Grzenda A, Kraguljac NV, McDonald WM, Nemeroff C, Torous J, Alpert JE, Rodriguez CI, Widge AS. Evaluating the Machine Learning Literature: A Primer and User's Guide for Psychiatrists. Focus (American Psychiatric Publishing) 2025; 23:270-284. [PMID: 40235606 PMCID: PMC11995911 DOI: 10.1176/appi.focus.25023011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2025]
Affiliation(s)
- Adrienne Grzenda
  - Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California, Los Angeles, and Olive View-UCLA Medical Center, Sylmar
- Nina V Kraguljac
  - Department of Psychiatry and Behavioral Neurobiology, University of Alabama at Birmingham
- William M McDonald
  - Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta
- Charles Nemeroff
  - Department of Psychiatry, University of Texas Dell Medical School, Austin
- John Torous
  - Department of Psychiatry, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston
- Jonathan E Alpert
  - Department of Psychiatry and Behavioral Sciences, Albert Einstein School of Medicine, Bronx, N.Y.
- Carolyn I Rodriguez
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, Calif., and Veterans Affairs Palo Alto Health Care System, Palo Alto, Calif.
- Alik S Widge
  - Department of Psychiatry and Behavioral Sciences, University of Minnesota, Minneapolis
8
van Spanning SH, Verweij LPE, Hendrickx LAM, Allaart LJH, Athwal GS, Lafosse T, Lafosse L, Doornberg JN, Oosterhoff JHF, van den Bekerom MPJ, Buijze GA. Methodology and development of a machine learning probability calculator: Data heterogeneity limits ability to predict recurrence after arthroscopic Bankart repair. Knee Surg Sports Traumatol Arthrosc 2025; 33:1488-1499. [PMID: 39324357 PMCID: PMC11948171 DOI: 10.1002/ksa.12443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Revised: 08/02/2024] [Accepted: 08/02/2024] [Indexed: 09/27/2024]
Abstract
PURPOSE The aim of this study was to develop and train a machine learning (ML) algorithm to create a clinical decision support tool (i.e., ML-driven probability calculator) to be used in clinical practice to estimate recurrence rates following an arthroscopic Bankart repair (ABR). METHODS Data from 14 previously published studies were collected. Inclusion criteria were (1) patients treated with ABR without remplissage for traumatic anterior shoulder instability and (2) a minimum of 2 years follow-up. Risk factors associated with recurrence were identified using bivariate logistic regression analysis. Subsequently, four ML algorithms were developed and internally validated. The predictive performance was assessed using discrimination, calibration and the Brier score. RESULTS In total, 5591 patients underwent ABR with a recurrence rate of 15.4% (n = 862). Age <35 years, participation in contact and collision sports, bony Bankart lesions and full-thickness rotator cuff tears increased the risk of recurrence (all p < 0.05). A single shoulder dislocation (compared to multiple dislocations) lowered the risk of recurrence (p < 0.05). Due to the unavailability of certain variables in some patients, a portion of the patient data had to be excluded before pooling the data set to create the algorithm. A total of 797 patients were included providing information on risk factors associated with recurrence. The discrimination (area under the receiver operating characteristic curve) ranged between 0.54 and 0.57 for prediction of recurrence. CONCLUSION ML was not able to predict the recurrence following ABR with the currently available predictors. Despite a global coordinated effort, the heterogeneity of clinical data limited the predictive capabilities of the algorithm, emphasizing the need for standardized data collection methods in future studies. LEVEL OF EVIDENCE Level IV, retrospective cohort study.
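The abstract above names the Brier score as one of its performance measures. As a reading aid, this minimal sketch computes the Brier score (the mean squared difference between predicted probability and observed outcome) for an uninformative model that predicts the overall recurrence rate of 15.4% for every patient. The cohort size of 1000 is hypothetical, and the recurrence rate in the 797 patients actually used for modeling may differ from the overall figure.

```python
def brier_score(y_true, probs):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


# Hypothetical cohort of 1000 patients with the reported 15.4% recurrence rate.
n_patients = 1000
n_events = 154
y_true = [1] * n_events + [0] * (n_patients - n_events)

# Uninformative baseline: the same predicted risk for everyone.
baseline_probs = [n_events / n_patients] * n_patients
baseline = brier_score(y_true, baseline_probs)
print(f"baseline Brier score = {baseline:.4f}")
```

For a constant prediction equal to the event rate r, the Brier score reduces to r(1 - r), here about 0.130. A model with discrimination close to 0.5 would be expected to score near this baseline, and lower values indicate better probabilistic predictions.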
Affiliation(s)
- Sanne H. van Spanning
- Alps Surgery Institute, Hand, Upper Limb, Peripheral Nerve, Brachial Plexus and Microsurgery Unit, Clinique GénéraleAnnecyFrance
- Amsterdam Shoulder and Elbow Centre of Expertise (ASECE), Amsterdam, The Netherlands
- Department of Human Movement Sciences, Faculty of Behavioural and Movement Sciences, Vrije Universiteit Amsterdam, Amsterdam Movement Sciences, Amsterdam, The Netherlands
- Department of Orthopedic Surgery, OLVG, Shoulder and Elbow Unit, Amsterdam, The Netherlands
| | - Lukas P. E. Verweij
- Department of Human Movement Sciences, Faculty of Behavioural and Movement Sciences, Vrije Universiteit Amsterdam, Amsterdam Movement Sciences, Amsterdam, The Netherlands
- Amsterdam Movement Sciences, Musculoskeletal Health Program, Amsterdam, The Netherlands
- Department of Amsterdam UMC, Department of Orthopedic Surgery and Sports Medicine, Location AMC, University of Amsterdam, Amsterdam, The Netherlands
| | - Laurent A. M. Hendrickx
- Department of Amsterdam UMC, Department of Orthopedic Surgery and Sports Medicine, Location AMC, University of Amsterdam, Amsterdam, The Netherlands
- Department of Orthopaedic & Trauma Surgery, Flinders Medical Centre, Flinders University, Adelaide, South Australia, Australia
| | - Laurens J. H. Allaart
- Alps Surgery Institute, Hand, Upper Limb, Peripheral Nerve, Brachial Plexus and Microsurgery Unit, Clinique Générale, Annecy, France
- Amsterdam Shoulder and Elbow Centre of Expertise (ASECE), Amsterdam, The Netherlands
- Department of Human Movement Sciences, Faculty of Behavioural and Movement Sciences, Vrije Universiteit Amsterdam, Amsterdam Movement Sciences, Amsterdam, The Netherlands
| | - George S. Athwal
- Roth McFarlane Hand and Upper Limb Centre, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
| | - Thibault Lafosse
- Alps Surgery Institute, Hand, Upper Limb, Peripheral Nerve, Brachial Plexus and Microsurgery Unit, Clinique Générale, Annecy, France
| | - Laurent Lafosse
- Alps Surgery Institute, Hand, Upper Limb, Peripheral Nerve, Brachial Plexus and Microsurgery Unit, Clinique Générale, Annecy, France
| | - Job N. Doornberg
- Department of Orthopaedic & Trauma Surgery, Flinders Medical Centre, Flinders University, Adelaide, South Australia, Australia
- Department of Orthopaedic and Trauma Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Jacobien H. F. Oosterhoff
- Department of Engineering Systems and Services, Faculty Technology Policy and Management, Delft University of Technology, Delft, The Netherlands
| | - Michel P. J. van den Bekerom
- Department of Human Movement Sciences, Faculty of Behavioural and Movement Sciences, Vrije Universiteit Amsterdam, Amsterdam Movement Sciences, Amsterdam, The Netherlands
- Department of Orthopedic Surgery, OLVG, Shoulder and Elbow Unit, Amsterdam, The Netherlands
- Amsterdam Movement Sciences, Musculoskeletal Health Program, Amsterdam, The Netherlands
| | - Geert Alexander Buijze
- Alps Surgery Institute, Hand, Upper Limb, Peripheral Nerve, Brachial Plexus and Microsurgery Unit, Clinique Générale, Annecy, France
- Department of Amsterdam UMC, Department of Orthopedic Surgery and Sports Medicine, Location AMC, University of Amsterdam, Amsterdam, The Netherlands
- Department of Orthopedic Surgery, Montpellier University Medical Centre, Lapeyronie Hospital, University of Montpellier, Montpellier, France
|
9
|
Kierner S, Kierner P, Kucharski J. Combining machine learning models and rule engines in clinical decision systems: Exploring optimal aggregation methods for vaccine hesitancy prediction. Comput Biol Med 2025; 188:109749. [PMID: 39983355 DOI: 10.1016/j.compbiomed.2025.109749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2024] [Revised: 01/21/2025] [Accepted: 01/22/2025] [Indexed: 02/23/2025]
Abstract
BACKGROUND With the increasing application of artificial intelligence (AI) technologies in the healthcare sector and the emergence of new solutions, such as large language models, there is a growing need to combine medical knowledge, often expressed as clinical rules, with advances in machine learning (ML) offering higher prediction accuracy at the expense of decision-making transparency. PURPOSE This study investigates the efficacy of various aggregation methods combining the decisions of an AI model and a clinical rule-based (RB) engine in predicting vaccine hesitancy to maximize the effectiveness of patient incentive programs. This is the first study of parallel ensemble of rules and machine learning in clinical context proposing RB confidence-led fusion of ML and RB inference. METHODS A clinical decision system for predicting hesitation to vaccinate is developed based on a differentially private set of longitudinal health records of 974,000 US patients and clinical rules obtained from the present literature. Various approaches based on possibility theory have been explored to maximize classification accuracy, capture and hurdle rates while ensuring trustworthiness in clinical interventions. RESULTS Our findings reveal that the hybrid approach outperforms the individual models and RB systems when transparency and accuracy are critical. A RB confidence-led approach emerged as the most effective method. The aggregation of mismatched classes relies on RB results when the RB engine has high confidence (expressed as more than the median degree of membership to the vaccination hesitation output function) and on ML predictions when the RB engine exhibits lower confidence. CONCLUSIONS Implementing such an aggregation method preserves the accuracy and capture rates of a clinical decision system, while potentially improving acceptance among healthcare providers.
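The aggregation rule described in the RESULTS above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `fuse_predictions`, the binary label coding, the toy inputs, and the choice to compute the median over the batch being fused are all assumptions made for illustration. The abstract only states that mismatched classes follow the RB result when RB confidence exceeds the median degree of membership, and the ML prediction otherwise.

```python
from statistics import median


def fuse_predictions(ml_labels, rb_labels, rb_confidences):
    """Fuse ML and rule-based (RB) labels with an RB confidence-led rule.

    Where the two sources agree, the shared label is kept. Where they
    disagree, the RB label wins if its confidence exceeds the median RB
    confidence; otherwise the ML label wins.
    """
    if not (len(ml_labels) == len(rb_labels) == len(rb_confidences)):
        raise ValueError("all inputs must have the same length")

    # Assumption: the median is taken over the batch being fused.
    threshold = median(rb_confidences)

    fused = []
    for ml, rb, conf in zip(ml_labels, rb_labels, rb_confidences):
        if ml == rb:
            fused.append(ml)   # no conflict to resolve
        elif conf > threshold:
            fused.append(rb)   # RB engine is confident: trust the rules
        else:
            fused.append(ml)   # RB engine is unsure: fall back to ML
    return fused


# Hypothetical toy batch: 1 = hesitant, 0 = not hesitant.
ml = [1, 0, 1, 0]
rb = [1, 1, 0, 0]
conf = [0.9, 0.8, 0.2, 0.4]
print(fuse_predictions(ml, rb, conf))
```

In this toy batch the second patient follows the RB label because its confidence (0.8) is above the batch median, while the third follows the ML label because the RB confidence (0.2) is below it.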
Affiliation(s)
- Slawomir Kierner
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
| | - Piotr Kierner
- Department of Genetics - Blavatnik Institute, Sinclair Lab, Harvard Medical School, D 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Jacek Kucharski
- Faculty of Electrical, Electronic, Computer and Control Engineering, Lodz University of Technology, 18/22 Stefanowskiego St., Łódź 90-924, Poland
|
10
|
Lu Y, Jurgensmeier K, Lamba A, Yang L, Hevesi M, Camp CL, Krych AJ, Stuart MJ. Posttraumatic Arthritis After Anterior Cruciate Ligament Injury: Machine Learning Comparison Between Surgery and Nonoperative Management. Am J Sports Med 2025; 53:1050-1060. [PMID: 40079334 DOI: 10.1177/03635465251322803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/15/2025]
Abstract
BACKGROUND Nonoperative and operative management techniques after anterior cruciate ligament (ACL) injury are both appropriate treatment options for selected patients. However, the subsequent development of posttraumatic knee osteoarthritis (PTOA) remains an area of active study. PURPOSE To compare the risk of PTOA between patients treated without surgery and with ACL reconstruction (ACLR) after primary ACL disruption using a machine learning causal inference model. STUDY DESIGN Cohort study; Level of evidence, 3. METHODS A geographic database identified patients undergoing ACLR between 1990 and 2016 with minimum 7.5-year follow-up. Variables collected include age, sex, body mass index, activity level, occupation, relevant comorbid diagnoses, radiographic findings, injury characteristics, and clinical course. Treatment effects of ACLR on the development of PTOA and progression to total knee arthroplasty (TKA) were analyzed with machine learning models (MLMs) in a causal inference estimator (targeted maximum likelihood estimation, TMLE), while controlling for confounders. RESULTS The study included 1194 patients with a minimum follow-up of 7.5 years, among whom 974 underwent primary reconstruction and 220 underwent nonoperative treatment. A total of 215 (22%) patients developed symptomatic PTOA in the ACLR group compared with 140 (64%) in the nonoperative treatment group (P < .001), whereas 25 (3%) patients underwent TKA in the ACLR group compared with 50 (23%) in the nonoperative treatment group (P < .001). Patients in the ACLR group had delayed TKA compared with patients in the nonoperative treatment group (193.4 vs 166.0 months, respectively; P = .02). TMLE evaluation revealed that reconstruction decreased the risk of PTOA by 11% (95% CI, 8%-13%; P < .001) compared with nonoperative treatment but did not demonstrate a significant effect on the rate of progression to TKA. 
Survival analysis with random forest algorithm demonstrated significant delay to the onset of PTOA as well as time to progression of TKA in patients undergoing ACLR. Additional risk factors for the development of PTOA, irrespective of treatment, included older age at injury, greater body mass index, total number of arthroscopic knee surgeries, and residual laxity at follow-up. CONCLUSION MLMs in a causal inference estimator found ACLR to exert a significant treatment effect in reducing the rate of development of PTOA by 11% compared with nonoperative treatment. ACLR also delayed the onset of PTOA and progression to TKA.
Affiliation(s)
- Yining Lu
- Department of Orthopedic Surgery, Mayo Clinic, Rochester, Minnesota, USA
- Orthopedic Surgery Artificial Intelligence Laboratory, Mayo Clinic, Rochester, Minnesota, USA
| | - Kevin Jurgensmeier
- Department of Orthopedic Surgery, Mayo Clinic, Rochester, Minnesota, USA
| | - Abhinav Lamba
- Department of Orthopedic Surgery, Mayo Clinic, Rochester, Minnesota, USA
| | - Linjun Yang
- Department of Orthopedic Surgery, Mayo Clinic, Rochester, Minnesota, USA
| | - Mario Hevesi
- Department of Orthopedic Surgery, Mayo Clinic, Rochester, Minnesota, USA
| | - Christopher L Camp
- Department of Orthopedic Surgery, Mayo Clinic, Rochester, Minnesota, USA
| | - Aaron J Krych
- Department of Orthopedic Surgery, Mayo Clinic, Rochester, Minnesota, USA
| | - Michael J Stuart
- Department of Orthopedic Surgery, Mayo Clinic, Rochester, Minnesota, USA
|
11
|
Oh MY, Kim HS, Jung YM, Lee HC, Lee SB, Lee SM. Machine Learning-Based Explainable Automated Nonlinear Computation Scoring System for Health Score and an Application for Prediction of Perioperative Stroke: Retrospective Study. J Med Internet Res 2025; 27:e58021. [PMID: 40106818 PMCID: PMC11966079 DOI: 10.2196/58021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2024] [Revised: 03/24/2024] [Accepted: 10/30/2024] [Indexed: 03/22/2025] Open
Abstract
BACKGROUND Machine learning (ML) has the potential to enhance performance by capturing nonlinear interactions. However, ML-based models have some limitations in terms of interpretability. OBJECTIVE This study aimed to develop and validate a more comprehensible and efficient ML-based scoring system using SHapley Additive exPlanations (SHAP) values. METHODS We developed and validated the Explainable Automated nonlinear Computation scoring system for Health (EACH) framework score. We developed a CatBoost-based prediction model, identified key features, and automatically detected the top 5 steepest slope change points based on SHAP plots. Subsequently, we developed a scoring system (EACH) and normalized the score. Finally, the EACH score was used to predict perioperative stroke. We developed the EACH score using data from the Seoul National University Hospital cohort and validated it using data from the Boramae Medical Center, which was geographically and temporally different from the development set. RESULTS When applied for perioperative stroke prediction among 38,737 patients undergoing noncardiac surgery, the EACH score achieved an area under the curve (AUC) of 0.829 (95% CI 0.753-0.892). In the external validation, the EACH score demonstrated superior predictive performance with an AUC of 0.784 (95% CI 0.694-0.871) compared with a traditional score (AUC=0.528, 95% CI 0.457-0.619) and another ML-based scoring generator (AUC=0.564, 95% CI 0.516-0.612). CONCLUSIONS The EACH score is a more precise, explainable ML-based risk tool, proven effective in real-world data. The EACH score outperformed the traditional scoring system and other prediction models based on different ML techniques in predicting perioperative stroke.
Affiliation(s)
- Mi-Young Oh
- Department of Neurology, Sejong General Hospital, Bucheon-si, Republic of Korea
| | - Hee-Soo Kim
- Department of Medical Informatics, School of Medicine, Keimyung University, Daegu, Republic of Korea
| | - Young Mi Jung
- Department of Obstetrics and Gynecology, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
| | - Hyung-Chul Lee
- Department of Anesthesiology and Pain Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Department of Anesthesiology and Pain Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Seung-Bo Lee
- Department of Medical Informatics, School of Medicine, Keimyung University, Daegu, Republic of Korea
| | - Seung Mi Lee
- Department of Obstetrics and Gynecology, College of Medicine, Seoul National University, Seoul, Republic of Korea
- Department of Obstetrics and Gynecology, Seoul National University Hospital, Seoul, Republic of Korea
- Innovative Medical Technology Research Institute, Seoul National University Hospital, Seoul, Republic of Korea
- Institute of Reproductive Medicine and Population & Medical Big Data Research Center, Seoul National University, Seoul, Republic of Korea
|
12
|
Lu Z, Dong B, Cai H, Tian T, Wang J, Fu L, Wang B, Zhang W, Lin S, Tuo X, Wang J, Yang T, Huang X, Zheng Z, Xue H, Xu S, Liu S, Sun P, Zou H. Identifying Data-Driven Clinical Subgroups for Cervical Cancer Prevention With Machine Learning: Population-Based, External, and Diagnostic Validation Study. JMIR Public Health Surveill 2025; 11:e67840. [PMID: 40106366 PMCID: PMC11939026 DOI: 10.2196/67840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2024] [Revised: 01/29/2025] [Accepted: 01/29/2025] [Indexed: 03/22/2025] Open
Abstract
Background Cervical cancer remains a major global health issue. Personalized, data-driven cervical cancer prevention (CCP) strategies tailored to phenotypic profiles may improve prevention and reduce disease burden. Objective This study aimed to identify subgroups with differential cervical precancer or cancer risks using machine learning, validate subgroup predictions across datasets, and propose a computational phenomapping strategy to enhance global CCP efforts. Methods We explored the data-driven CCP subgroups by applying unsupervised machine learning to a deeply phenotyped, population-based discovery cohort. We extracted CCP-specific risks of cervical intraepithelial neoplasia (CIN) and cervical cancer through weighted logistic regression analyses providing odds ratio (OR) estimates and 95% CIs. We trained a supervised machine learning model and developed pathways to classify individuals before evaluating its diagnostic validity and usability on an external cohort. Results This study included 551,934 women (median age, 49 years) in the discovery cohort and 47,130 women (median age, 37 years) in the external cohort. Phenotyping identified 5 CCP subgroups, with CCP4 showing the highest carcinoma prevalence. CCP2-4 had significantly higher risks of CIN2+ (CCP2: OR 2.07 [95% CI: 2.03-2.12], CCP3: 3.88 [3.78-3.97], and CCP4: 4.47 [4.33-4.63]) and CIN3+ (CCP2: 2.10 [2.05-2.14], CCP3: 3.92 [3.82-4.02], and CCP4: 4.45 [4.31-4.61]) compared to CCP1 (P<.001), consistent with the direction of results observed in the external cohort. The proposed triple strategy was validated as clinically relevant, prioritizing high-risk subgroups (CCP3-4) for colposcopies and scaling human papillomavirus screening for CCP1-2. Conclusions This study underscores the potential of leveraging machine learning algorithms and large-scale routine electronic health records to enhance CCP strategies. 
By identifying key determinants of CIN2+/CIN3+ risk and classifying 5 distinct subgroups, our study provides a robust, data-driven foundation for the proposed triple strategy. This approach prioritizes tailored prevention efforts for subgroups with varying risks, offering a novel and scalable tool to complement existing cervical cancer screening guidelines. Future work should focus on independent external and prospective validation to maximize the global impact of this strategy.
Affiliation(s)
- Zhen Lu
- School of Public Health (Shenzhen), Sun Yat-sen University, Shenzhen, China
| | - Binhua Dong
- Department of Gynecology, Laboratory of Gynecologic Oncology, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, China
- Fujian Key Laboratory of Women and Children’s Critical Diseases Research, Fuzhou, China
| | - Hongning Cai
- Department of Gynecology, Maternal and Child Health Hospital of Hubei Province (Women and Children's Hospital of Hubei Province), Wuhan, China
| | - Tian Tian
- School of Public Health, Xinjiang Medical University, Urumqi, China
| | - Junfeng Wang
- Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Utrecht, Netherlands
| | - Leiwen Fu
- Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Research Institute, Beijing, China
| | - Bingyi Wang
- Institute for HIV/AIDS Control and Prevention, Guangdong Provincial Center for Disease Control and Prevention, Guangzhou, China
- Department of HIV/AIDS Control and Prevention, Guangdong Provincial Academy of Preventive Medicine, Guangzhou, China
| | - Weijie Zhang
- School of Public Health (Shenzhen), Sun Yat-sen University, Shenzhen, China
| | - Shaomei Lin
- Department of Gynecology, Shunde Women's and Children's Hospital of Guangdong Medical University, Foshan, China
| | - Xunyuan Tuo
- Department of Gynecology, Gansu Provincial Maternity and Child-care Hospital, Lanzhou, China
| | - Juntao Wang
- Department of Gynecology, Guiyang Maternal and Child Health Care Hospital, Guiyang, China
| | - Tianjie Yang
- Department of Gynecology, Shenzhen Maternity & Child Healthcare Hospital, Shenzhen, China
| | - Xinxin Huang
- The Ministry of Health, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, China
| | - Zheng Zheng
- Department of Gynecology, Shenzhen Maternity & Child Healthcare Hospital, Shenzhen, China
| | - Huifeng Xue
- Center for Cervical Disease Diagnosis and Treatment, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, China
| | - Shuxia Xu
- Department of Pathology, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, China
| | - Siyang Liu
- School of Public Health (Shenzhen), Sun Yat-sen University, Shenzhen, China
| | - Pengming Sun
- Department of Gynecology, Laboratory of Gynecologic Oncology, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, China
- Fujian Key Laboratory of Women and Children’s Critical Diseases Research, Fuzhou, China
- School of Group Medicine and Public Health, Peking Union Medical College, Beijing, China
| | - Huachun Zou
- School of Public Health, Fudan University, Shanghai, China
- Shenzhen Campus, Sun Yat-sen University, Shenzhen, China
- Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics and Gynecology and Pediatrics, Fujian Medical University, Fuzhou, China
|
13
|
Cho NJ, Jeong I, Ahn SJ, Gil HW, Kim Y, Park JH, Kang S, Lee H. Machine Learning to Assist in Managing Acute Kidney Injury in General Wards: Multicenter Retrospective Study. J Med Internet Res 2025; 27:e66568. [PMID: 40101226 PMCID: PMC11962325 DOI: 10.2196/66568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2024] [Revised: 01/10/2025] [Accepted: 02/14/2025] [Indexed: 03/20/2025] Open
Abstract
BACKGROUND Most artificial intelligence-based research on acute kidney injury (AKI) prediction has focused on intensive care unit settings, limiting their generalizability to general wards. The lack of standardized AKI definitions and reliance on intensive care units further hinder the clinical applicability of these models. OBJECTIVE This study aims to develop and validate a machine learning-based framework to assist in managing AKI and acute kidney disease (AKD) in general ward patients, using a refined operational definition of AKI to improve predictive performance and clinical relevance. METHODS This retrospective multicenter cohort study analyzed electronic health record data from 3 hospitals in South Korea. AKI and AKD were defined using a refined version of the Kidney Disease: Improving Global Outcomes criteria, which included adjustments to baseline serum creatinine estimation and a stricter minimum increase threshold to reduce misclassification due to transient fluctuations. The primary outcome was the development of machine learning models for early prediction of AKI (within 3 days before onset) and AKD (nonrecovery within 7 days after AKI). RESULTS The final analysis included 135,068 patients. A total of 7658 (8%) patients in the internal cohort and 2898 (7.3%) patients in the external cohort developed AKI. Among the 5429 patients in the internal cohort and 1998 patients in the external cohort for whom AKD progression could be assessed, 896 (16.5%) patients and 287 (14.4%) patients, respectively, progressed to AKD. Using the refined criteria, 2898 cases of AKI were identified, whereas applying the standard Kidney Disease: Improving Global Outcomes criteria resulted in the identification of 5407 cases. Among the 2509 patients who were not classified as having AKI under the refined criteria, 2242 had a baseline serum creatinine level below 0.6 mg/dL, while the remaining 267 experienced a decrease in serum creatinine before the onset of AKI. 
The final selected early prediction model for AKI achieved an area under the receiver operating characteristic curve of 0.9053 in the internal cohort and 0.8860 in the external cohort. The early prediction model for AKD achieved an area under the receiver operating characteristic curve of 0.8202 in the internal cohort and 0.7833 in the external cohort. CONCLUSIONS The proposed machine learning framework successfully predicted AKI and AKD in general ward patients with high accuracy. The refined AKI definition significantly reduced the classification of patients with transient serum creatinine fluctuations as AKI cases compared to the previous criteria. These findings suggest that integrating this machine learning framework into hospital workflows could enable earlier interventions, optimize resource allocation, and improve patient outcomes.
Affiliation(s)
- Nam-Jun Cho
- Department of Internal Medicine, Soonchunhyang University Cheonan Hospital, Cheonan, Republic of Korea
| | - Inyong Jeong
- Department of Biomedical Informatics, Korea University College of Medicine, Seoul, Republic of Korea
| | - Se-Jin Ahn
- Department of Biomedical Informatics, Korea University College of Medicine, Seoul, Republic of Korea
| | - Hyo-Wook Gil
- Department of Internal Medicine, Soonchunhyang University Cheonan Hospital, Cheonan, Republic of Korea
| | - Yeongmin Kim
- Department of Biomedical Informatics, Korea University College of Medicine, Seoul, Republic of Korea
| | - Jin-Hyun Park
- Department of Biomedical Informatics, Korea University College of Medicine, Seoul, Republic of Korea
| | - Sanghee Kang
- Department of Surgery, Korea University Guro Hospital, Seoul, Republic of Korea
| | - Hwamin Lee
- Department of Biomedical Informatics, Korea University College of Medicine, Seoul, Republic of Korea
|
14
|
Wang X, Zhu Z, Xu X, Sun J, Jia L, Huang Y, Chen Q, Yang Z, Zhao P, Huang X, Grzegorzek M, Liu Y, Lv H, Zong F, Wang Z. Construction of brain age models based on structural and white matter information. Brain Res 2025; 1851:149458. [PMID: 39826624 DOI: 10.1016/j.brainres.2025.149458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2024] [Revised: 01/07/2025] [Accepted: 01/15/2025] [Indexed: 01/22/2025]
Abstract
Brain aging is an inevitable process in adulthood, yet there is a lack of objective measures to accurately assess its extent. This study aims to develop a brain age prediction model using magnetic resonance imaging (MRI), which includes structural information of gray matter and integrity information of white matter microstructure. Multiparameter MRI was performed on two population cohorts. We collected structural MRI data from T1- and T2-sequences, including gray matter volume, surface area, and thickness in different areas. For diffusion tensor imaging (DTI), we derived four white matter parameters: fractional anisotropy, mean diffusivity, axial diffusivity, and radial diffusivity. To achieve reliable brain age prediction based on structure and white matter integrity, we employed LASSO regression. We successfully constructed a brain age prediction model based on multiparameter brain MRI (mean absolute error of 3.87). Using structural and diffusion metrics, we identified and visualized which brain areas were notably involved in brain aging. Simultaneously, we discovered that lateralization during brain aging is a significant factor in brain aging models. We have successfully developed a brain age estimation model utilizing white matter and gray matter metrics, which exhibits minimal errors and is suitable for adults.
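For readers unfamiliar with the two technical ingredients named in this abstract, LASSO regression and mean absolute error, the following pure-Python sketch shows both on a toy problem. It is an illustration only, not the study's pipeline: the coordinate-descent solver, the penalty value `alpha`, the assumption of centered features and target (so no intercept is fitted), and the four-row toy data set are all hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_fit(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by coordinate descent.

    Assumes features and target are already centered, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: the target depends on feature 0 only.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [3, 3, -3, -3]

w = lasso_fit(X, y, alpha=1.0)
preds = [sum(xi * wi for xi, wi in zip(row, w)) for row in X]
print(w, mean_absolute_error(y, preds))
```

The L1 penalty drives the coefficient of the uninformative second feature to exactly zero and shrinks the first, which is the property that makes LASSO useful for selecting among many regional MRI metrics.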
Affiliation(s)
- Xinghao Wang
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing 100050, People's Republic of China; Institute for Medical Informatics, University of Luebeck, Luebeck, Germany; German Research Center for Artificial Intelligence (DFKI), Luebeck, Germany
| | - Zaimin Zhu
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, People's Republic of China
| | - Xinyuan Xu
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, People's Republic of China
| | - Jing Sun
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing 100050, People's Republic of China
| | - Li Jia
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing 100050, People's Republic of China
| | - Yan Huang
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing 100050, People's Republic of China
| | - Qian Chen
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing 100050, People's Republic of China
| | - Zhenghan Yang
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing 100050, People's Republic of China
| | - Pengfei Zhao
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing 100050, People's Republic of China
| | - Xinyu Huang
- Institute for Medical Informatics, University of Luebeck, Luebeck, Germany; German Research Center for Artificial Intelligence (DFKI), Luebeck, Germany
| | - Marcin Grzegorzek
- Institute for Medical Informatics, University of Luebeck, Luebeck, Germany; German Research Center for Artificial Intelligence (DFKI), Luebeck, Germany
| | - Yong Liu
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, People's Republic of China
| | - Han Lv
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing 100050, People's Republic of China.
| | - Fangrong Zong
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, People's Republic of China.
| | - Zhenchang Wang
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, No. 95 YongAn Road, Beijing 100050, People's Republic of China.
|
15
|
Assink N, Gonzalez-Perrino MP, Santana-Trejo R, Doornberg JN, Hoekstra H, Kraeima J, IJpma FFA. Development of Machine Learning-based Algorithms to Predict the 2- and 5-year Risk of TKA After Tibial Plateau Fracture Treatment. Clin Orthop Relat Res 2025:00003086-990000000-01948. [PMID: 40106382 DOI: 10.1097/corr.0000000000003442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2024] [Accepted: 02/07/2025] [Indexed: 03/22/2025]
Abstract
BACKGROUND When faced with a severe intraarticular injury like a tibial plateau fracture, patients count on surgeons to make an accurate estimation of prognosis. Unfortunately, there are few tools available that enable precise, personalized prognosis estimation tailored to each patient's unique circumstances, including their individual and fracture-specific characteristics. In this study, we developed and validated a clinical prediction model using machine-learning algorithms for the 2- and 5-year risk of TKA after tibial plateau fractures. QUESTIONS/PURPOSES Can machine learning-based probability calculators estimate the probability of 2- and 5-year risk of conversion to TKA in patients with a tibial plateau fracture? METHODS A multicenter, cross-sectional study was performed in six hospitals in patients treated for a tibial plateau fracture between 2003 and 2019. In total, 2057 patients were eligible for inclusion and were sent informed consent and a questionnaire to inquire whether they underwent conversion to TKA. For 56% (1160 of 2057), status of conversion to TKA was accounted for at a minimum of 2 years, and 53% (1082 of 2057) were accounted for at a minimum of 5 years. The mean follow-up among responders was 6 ± 4 years after injury. An analysis of nonresponders found that responders were slightly older than nonresponders (53 ± 16 years versus 51 ± 17 years; p = 0.001), they were more often women (68% [788 of 1160] versus 58% [523 of 897]; p = 0.001), they were treated nonoperatively less often (30% [346 of 1160] versus 43% [387 of 897]; p = 0.001), and they had larger fracture gaps (6.4 ± 6.3 mm versus 4.2 ± 5.2 mm; p < 0.001) and step-offs (6.3 ± 5.7 mm versus 4.5 ± 4.7 mm; p < 0.001). AO Foundation/Orthopaedic Trauma Association (AO/OTA) fracture classification did not differ between nonresponders and responders (B1 11% versus 15%, B2 16% versus 19%, B3 45% versus 39%, C2 6% versus 8%, C3 22% versus 17%; p = 0.26). 
A total of 70% (814 of 1160) of patients were treated with open reduction and internal fixation, whereas 30% (346 of 1160) of patients were treated nonoperatively with a cast. Most fractures (80% [930 of 1160]) were AO/OTA type B fractures, and 20% (230 of 1160) were type C. Of these patients, 7% (79 of 1160) and 10% (109 of 1082) underwent conversion to a TKA at 2- and 5-year follow-up, respectively. Patient characteristics were retrieved from electronic patient records, and imaging data were shared with the initiating center from which fracture characteristics were determined. Obtained features derived from follow-up questionnaires, electronic patient records, and radiographic assessments were eligible for development of the prediction model. The first step consisted of data cleaning and included simple type formatting and standardization of numerical columns. Subsequent feature selection consisted of a review of the published evidence and expert opinion. This was followed by bivariate analysis of the identified features. The features for the models included: age, gender, BMI, AO/OTA fracture classification, fracture displacement (gap, step-off), medial proximal tibial alignment, and posterior proximal tibial alignment. The data set was used to train three models: logistic regression, random forest, and XGBoost. Logistic regression models linear relationships, random forest handles nonlinear complexities with decision trees, and XGBoost excels with sequential error correction and regularization. The models were tested using a sixfold validation approach by training the model on data from five (of six) respective medical centers and validating it against the remaining center that was left out for training. Performance was assessed by the area under the receiver operating characteristic curve (AUC), which measures a model's ability to distinguish between classes. AUC varies between 0 and 1, with values closer to 1 indicating better performance. 
To ensure robust and reliable results, we used bootstrapping as a resampling technique. In addition, calibration curves were plotted, and calibration was assessed with the calibration slope and intercept. The calibration plot compares the estimated probabilities with the observed probabilities for the primary outcome. Calibration slope evaluates alignment between predicted probabilities and observed outcomes (1 = perfect, < 1 = overfit, > 1 = underfit). Calibration intercept indicates bias (0 = perfect, negative = underestimation, positive = overestimation). Last, the Brier score, measuring the mean squared error of predicted probabilities (0 = perfect), was calculated. RESULTS There were no differences among the models in terms of sensitivity and specificity; the AUCs for each overlapped broadly and ranged from 0.76 to 0.83. Calibration was best in logistic regression for both 2- and 5-year models, with slopes of 0.82 (random forest 0.60, XGBoost 0.26) and 0.95 (random forest 0.85, XGBoost 0.48) and intercepts of 0.01 for both (random forest 0.01 to 0.02; XGBoost 0.05 to 0.07). Brier score was similar between models, ranging from 0.06 to 0.09. Given that its performance metrics were highest, we chose the logistic regression algorithm as the final prediction model. The web application providing the prediction tool is freely available and can be accessed through: https://3dtrauma.shinyapps.io/tka_prediction/. CONCLUSION In this study, a personalized risk assessment tool was developed to support clinical decision-making and patient counseling. Our findings demonstrate that machine-learning algorithms, particularly logistic regression, can provide accurate and reliable predictions of TKA conversion at 2 and 5 years after a tibial plateau fracture. 
In addition, it provides a useful prognostic tool for surgeons who perform fracture surgery that can be used quickly and easily with patients in the clinic or emergency department once it complies with medical device regulations. External validation is needed to assess performance in other institutions and countries; to account for patient and surgeon preferences, resources, and cultures; and to further strengthen its clinical applicability. LEVEL OF EVIDENCE Level III, therapeutic study.
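The validation scheme and two of the metrics described in this abstract can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' code: the function names (`leave_one_center_out`, `auc`, `brier_score`) and the toy inputs are assumptions introduced here, and the calibration slope and intercept are not covered.

```python
from collections import defaultdict


def leave_one_center_out(centers):
    """Yield (held_out_center, train_indices, test_indices) once per distinct center."""
    by_center = defaultdict(list)
    for idx, center in enumerate(centers):
        by_center[center].append(idx)
    for held_out in sorted(by_center):
        test_idx = by_center[held_out]
        train_idx = sorted(
            i for center, idxs in by_center.items() if center != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def auc(y_true, y_prob):
    """AUC as the chance that a random positive is ranked above a random negative.

    Tied scores count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 = perfect)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)
```

With six centers, `leave_one_center_out` produces six folds, which matches the sixfold scheme described above; each fold would be used to fit on the training indices and score on the held-out center.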
Affiliation(s)
- Nick Assink
  - Department of Trauma Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
  - 3D Lab, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Maria P Gonzalez-Perrino
  - Department of Trauma Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Raul Santana-Trejo
  - Department of Trauma Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Job N Doornberg
  - Department of Trauma Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Harm Hoekstra
  - Department of Traumatology, KU Leuven University Hospitals Leuven Gasthuisberg Campus, Leuven, Belgium
- Joep Kraeima
  - 3D Lab, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Frank F A IJpma
  - Department of Trauma Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
|
16
|
Woll S, Birkenmaier D, Biri G, Nissen R, Lutz L, Schroth M, Ebner-Priemer UW, Giurgiu M. Applying AI in the Context of the Association Between Device-Based Assessment of Physical Activity and Mental Health: Systematic Review. JMIR Mhealth Uhealth 2025; 13:e59660. [PMID: 40053765 PMCID: PMC11926455 DOI: 10.2196/59660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 11/29/2024] [Accepted: 02/06/2025] [Indexed: 03/09/2025] Open
Abstract
BACKGROUND Wearable technology is used by consumers worldwide for continuous activity monitoring in daily life but more recently also for classifying or predicting mental health parameters like stress or depression levels. Previous studies identified, based on traditional approaches, that physical activity is a relevant factor in the prevention or management of mental health. However, upcoming artificial intelligence methods have not yet been fully established in the research field of physical activity and mental health. OBJECTIVE This systematic review aims to provide a comprehensive overview of studies that integrated passive monitoring of physical activity data measured via wearable technology in machine learning algorithms for the detection, prediction, or classification of mental health states and traits. METHODS We conducted a review of studies processing wearable data to gain insights into mental health parameters. Eligibility criteria were (1) the study uses wearables or smartphones to acquire physical behavior and optionally other sensor measurement data, (2) the study must use machine learning to process the acquired data, and (3) the study had to be published in a peer-reviewed English language journal. Studies were identified via a systematic search in 5 electronic databases. RESULTS Of 11,057 unique search results, 49 published papers between 2016 and 2023 were included. Most studies examined the connection between wearable sensor data and stress (n=15, 31%) or depression (n=14, 29%). In total, 71% (n=35) of the studies had less than 100 participants, and 47% (n=23) had less than 14 days of data recording. More than half of the studies (n=27, 55%) used step count as movement measurement, and 44% (n=21) used raw accelerometer values. The quality of the studies was assessed, scoring between 0 and 18 points in 9 categories (maximum 2 points per category). On average, studies were rated 6.47 (SD 3.1) points. 
CONCLUSIONS The use of wearable technology for the detection, prediction, or classification of mental health states and traits is promising and offers a variety of applications across different settings and target groups. However, based on the current state of the literature, the application of artificial intelligence cannot yet realize its full potential, mostly because of methodological shortcomings and limited data availability. Future research may improve the quality of new applications in this context in 4 ways: first, by using raw data instead of already preprocessed data; second, by using only relevant data based on empirical evidence, in particular by crafting optimal feature sets in consultation with in-field professionals rather than using many individual detached features; third, by validating and replicating existing approaches (ie, applying the model to unseen data); and fourth, depending on the research aim (ie, generalization vs personalization), by maximizing the sample size or the duration over which data are collected.
Affiliation(s)
- Simon Woll
  - Mental mHealth Lab, Institute of Sports and Sports Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Dennis Birkenmaier
  - Department of Embedded Systems and Sensors Engineering, Research Center for Information Technology, Karlsruhe, Germany
- Gergely Biri
  - Department of Embedded Systems and Sensors Engineering, Research Center for Information Technology, Karlsruhe, Germany
- Rebecca Nissen
  - Mental mHealth Lab, Institute of Sports and Sports Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Luisa Lutz
  - Mental mHealth Lab, Institute of Sports and Sports Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Marc Schroth
  - Department of Embedded Systems and Sensors Engineering, Research Center for Information Technology, Karlsruhe, Germany
- Ulrich W Ebner-Priemer
  - Mental mHealth Lab, Institute of Sports and Sports Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
  - German Center for Mental Health, Mannheim, Germany
- Marco Giurgiu
  - Mental mHealth Lab, Institute of Sports and Sports Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
|
17
|
Shen L, Jin Y, Pan AX, Wang K, Ye R, Lin Y, Anwar S, Xia W, Zhou M, Guo X. Machine learning-based predictive models for perioperative major adverse cardiovascular events in patients with stable coronary artery disease undergoing noncardiac surgery. Computer Methods and Programs in Biomedicine 2025; 260:108561. [PMID: 39708562 DOI: 10.1016/j.cmpb.2024.108561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Revised: 11/17/2024] [Accepted: 12/07/2024] [Indexed: 12/23/2024]
Abstract
BACKGROUND AND OBJECTIVE Accurate prediction of perioperative major adverse cardiovascular events (MACEs) is crucial: it not only helps clinicians comprehensively assess patients' surgical risks and tailor personalized surgical and perioperative management plans, but also supports information-based shared decision-making with patients and efficient allocation of medical resources. This study developed and validated a machine learning (ML) model using accessible preoperative clinical data to predict perioperative MACEs in stable coronary artery disease (SCAD) patients undergoing noncardiac surgery (NCS). METHODS We collected data from 9171 adult SCAD patients who underwent NCS and extracted 64 preoperative variables. First, data imputation, resampling, and feature selection methods were compared, and the optimal ones were selected to deal with missing values and class imbalance. Then, nine independent machine learning models (logistic regression (LR), support vector machine, Gaussian Naive Bayes (GNB), random forest, gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), light gradient boosting machine, categorical boosting (CatBoost), and deep neural network) and a stacking ensemble model were constructed and compared with the validated Revised Cardiac Risk Index (RCRI) for predictive performance, which was evaluated using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), calibration curve, and decision curve analysis (DCA). To reduce overfitting and enhance robustness, we performed hyperparameter tuning and 5-fold cross-validation. Finally, the Shapley additive explanations (SHAP) method and a partial dependence plot (PDP) were used to interpret the optimal ML model. RESULTS Of the 9171 patients, 514 (5.6%) developed MACEs. Twenty-four significant preoperative features were selected for model development and evaluation.
All ML models performed well, with AUROC above 0.88 and AUPRC above 0.39, outperforming the AUROC (0.716) and AUPRC (0.185) of RCRI (P < 0.001). The best independent model was XGBoost (AUROC = 0.898, AUPRC = 0.479). The calibration curve accurately predicted the risk of MACEs (Brier score = 0.040), and the DCA results showed that XGBoost had a high net benefit for predicting MACEs. The top-ranked stacking ensemble model, consisting of CatBoost, GBDT, GNB, and LR, proved to be the best (AUROC 0.894, AUPRC 0.485). We identified the top 20 most important features using the mean absolute SHAP values and depicted their effects on model predictions using PDP. CONCLUSIONS This study combined missing-value imputation, feature screening, unbalanced data processing, and advanced machine learning methods to successfully develop and verify the first ML-based perioperative MACEs prediction model for patients with SCAD, which is more accurate than RCRI and enables effective identification of high-risk patients and implementation of targeted interventions to reduce the incidence of MACEs.
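Because only 5.6% of patients had an event, the AUPRC values above are best read against the prevalence, which is the AUPRC a non-informative ranking would be expected to achieve. The sketch below computes average precision, a common estimator of AUPRC, in plain Python. It is a minimal illustration, not the authors' pipeline: the function name `average_precision` and the toy scores are assumptions, and tied scores are not given special handling.

```python
def average_precision(y_true, y_score):
    """Average precision: mean of the precision measured at the rank of each positive case.

    Assumes scores are distinct; ties are ranked in arbitrary (stable-sort) order.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive case")
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` predictions
    return total / n_pos


# Prevalence reported in the abstract: 514 events among 9171 patients.
prevalence = 514 / 9171
```

Against a baseline of roughly 0.056, an AUPRC of 0.479 and the RCRI value of 0.185 are both above chance, which the raw numbers alone do not make obvious.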
Affiliation(s)
- Liang Shen
  - Department of Information Technology, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China
- YunPeng Jin
  - Department of Cardiovascular Medicine, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China
- AXiang Pan
  - Department of Information Technology, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China
- Kai Wang
  - Department of Cardiovascular Medicine, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China
- RunZe Ye
  - Department of Cardiovascular Medicine, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China
- YangKai Lin
  - Department of Cardiovascular Medicine, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China
- Safraz Anwar
  - Department of Cardiovascular Medicine, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China
- WeiCong Xia
  - Department of Cardiovascular Medicine, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China
- Min Zhou
  - Department of Information Technology, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China.
- XiaoGang Guo
  - Department of Cardiovascular Medicine, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China.
|
18
|
Ritter D, Denard PJ, Raiss P, Wijdicks CA, Werner BC, Bedi A, Müller PE, Bachmaier S. Machine learning models can define clinically relevant bone density subgroups based on patient-specific calibrated computed tomography scans in patients undergoing reverse shoulder arthroplasty. J Shoulder Elbow Surg 2025; 34:e141-e151. [PMID: 39154849 DOI: 10.1016/j.jse.2024.07.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Revised: 06/13/2024] [Accepted: 07/04/2024] [Indexed: 08/20/2024]
Abstract
BACKGROUND Reduced bone density is recognized as a predictor for potential complications in reverse shoulder arthroplasty (RSA). While humeral and glenoid planning based on preoperative computed tomography (CT) scans assists in implant selection and positioning, reproducible methods for quantifying patients' bone density are currently not available. The purpose of this study was to perform bone density analyses including patient-specific calibration in an RSA cohort based on preoperative CT imaging. It was hypothesized that preoperative CT bone density measures would provide objective quantification of the patients' humeral bone quality. METHODS This study consisted of 3 parts: (1) analysis of a patient-specific calibration method in cadaveric CT scans, (2) retrospective application in a clinical RSA cohort, and (3) clustering and classification with machine learning (ML) models. Forty cadaveric shoulders were scanned in a clinical CT, and calibration with density phantoms was compared with calibration based on air, muscle, and fat (patient-specific) and with standard Hounsfield units. Postscan patient-specific calibration was used to improve the extraction of 3-dimensional regions of interest for retrospective bone density analysis in a clinical RSA cohort (n = 345). ML models were used to improve the clustering (Hierarchical Ward) and classification (support vector machine) of low bone densities in the respective patients. RESULTS The patient-specific calibration method demonstrated improved accuracy with excellent intraclass correlation coefficients for cylindrical cancellous bone densities (intraclass correlation coefficient >0.75). Clustering partitioned the training data set into a high-density subgroup consisting of 96 patients and a low-density subgroup consisting of 146 patients, showing significant differences between these groups.
The support vector machine showed optimized prediction accuracy of low and high bone densities compared to conventional statistics in the training (accuracy = 91.2%; area under curve = 0.967) and testing (accuracy = 90.5%; area under curve = 0.958) data set. CONCLUSION Preoperative CT scans can be used to quantify the proximal humeral bone quality in patients undergoing RSA. The use of ML models and patient-specific calibration on bone mineral density demonstrated that multiple three-dimensional bone density scores improved the accuracy of objective preoperative bone quality assessment. The trained model could provide preoperative information to surgeons treating patients with potentially poor bone quality.
Affiliation(s)
- Daniel Ritter
  - Department of Orthopedic Research, Arthrex, Munich, Germany; Department of Orthopaedics and Trauma Surgery, Musculoskeletal University Center Munich (MUM), University Hospital, LMU, Munich, Germany.
- Brian C Werner
  - Department of Orthopaedic Surgery, University of Virginia Health System, Charlottesville, VA, USA
- Asheesh Bedi
  - Department of Orthopaedic Surgery, University of Michigan, Ann Arbor, MI, USA
- Peter E Müller
  - Department of Orthopaedics and Trauma Surgery, Musculoskeletal University Center Munich (MUM), University Hospital, LMU, Munich, Germany
|
19
|
Chen Y, Rivier CA, Mora SA, Torres Lopez V, Payabvash S, Sheth KN, Harloff A, Falcone GJ, Rosand J, Mayerhofer E, Anderson CD. Deep learning survival model predicts outcome after intracerebral hemorrhage from initial CT scan. Eur Stroke J 2025; 10:225-235. [PMID: 38880882 PMCID: PMC11569453 DOI: 10.1177/23969873241260154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Accepted: 05/21/2024] [Indexed: 06/18/2024] Open
Abstract
BACKGROUND Predicting functional impairment after intracerebral hemorrhage (ICH) provides valuable information for planning of patient care and rehabilitation strategies. Current prognostic tools are limited in making long-term predictions and require multiple expert-defined inputs and interpretation that make their clinical implementation challenging. This study aimed to predict long-term functional impairment of ICH patients from admission non-contrast CT scans, leveraging deep learning models in a survival analysis framework. METHODS We used the admission non-contrast CT scans from 882 patients from the Massachusetts General Hospital ICH Study for training, hyperparameter optimization, and model selection, and 146 patients from the Yale New Haven ICH Study for external validation of a deep learning model predicting functional outcome. Disability (modified Rankin scale [mRS] > 2), severe disability (mRS > 4), and dependent living status were assessed via telephone interviews after 6, 12, and 24 months. The prediction methods were evaluated by the c-index and compared with ICH score and FUNC score. RESULTS Using non-contrast CT, our deep learning model achieved higher prediction accuracy of post-ICH dependent living, disability, and severe disability by 6, 12, and 24 months (c-index 0.742 [95% CI 0.700 to 0.778], 0.712 [95% CI 0.674 to 0.752], 0.779 [95% CI 0.733 to 0.832], respectively) compared with the ICH score (c-index 0.673 [95% CI 0.662 to 0.688], 0.647 [95% CI 0.637 to 0.661], and 0.697 [95% CI 0.675 to 0.717]) and FUNC score (c-index 0.701 [95% CI 0.698 to 0.723], 0.668 [95% CI 0.657 to 0.680], and 0.727 [95% CI 0.708 to 0.753]). In the external independent Yale-ICH cohort, similar performance metrics were obtained for disability and severe disability (c-index 0.725 [95% CI 0.673 to 0.781] and 0.747 [95% CI 0.676 to 0.807], respectively).
Similar AUC of predicting each outcome at 6 months, 1 and 2 years after ICH was achieved compared with ICH score and FUNC score. CONCLUSION We developed a generalizable deep learning model to predict onset of dependent living and disability after ICH, which could help to guide treatment decisions, advise relatives in the acute setting, optimize rehabilitation strategies, and anticipate long-term care needs.
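The c-index used to compare the models above can be illustrated with a small pure-Python version of Harrell's concordance index. This is a sketch under stated assumptions rather than the study's evaluation code: the function name `concordance_index` and the toy follow-up data are hypothetical, and pairs with identical follow-up times are simply skipped.

```python
def concordance_index(times, events, risks):
    """Harrell's C: among comparable pairs, the share where the higher predicted
    risk belongs to the subject whose event occurred first.

    times  -- follow-up time per subject
    events -- 1 if the outcome was observed, 0 if the subject was censored
    risks  -- predicted risk score (higher means earlier expected outcome)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only comparable if the earlier subject had an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported values around 0.71 to 0.78 indicate moderate discrimination.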
Affiliation(s)
- Yutong Chen
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, MA, USA
- Cyprien A Rivier
  - Department of Neurology, Yale School of Medicine, New Haven, CT, USA
  - Yale Center for Brain and Mind Health, New Haven, CT, USA
- Samantha A Mora
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, MA, USA
- Victor Torres Lopez
  - Department of Neurology, Yale School of Medicine, New Haven, CT, USA
  - Yale Center for Brain and Mind Health, New Haven, CT, USA
- Sam Payabvash
  - Department of Neurology, Yale School of Medicine, New Haven, CT, USA
  - Yale Center for Brain and Mind Health, New Haven, CT, USA
- Kevin N Sheth
  - Department of Neurology, Yale School of Medicine, New Haven, CT, USA
  - Yale Center for Brain and Mind Health, New Haven, CT, USA
- Andreas Harloff
  - Department of Neurology, University of Freiburg, Freiburg, Germany
- Guido J Falcone
  - Department of Neurology, Yale School of Medicine, New Haven, CT, USA
  - Yale Center for Brain and Mind Health, New Haven, CT, USA
- Jonathan Rosand
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, MA, USA
- Ernst Mayerhofer
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, MA, USA
- Christopher D Anderson
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, MA, USA
  - Department of Neurology, Brigham and Women’s Hospital, Boston, MA, USA
|
20
|
Lu SC, Chen GY, Liu AS, Sun JT, Gao JW, Huang CH, Tsai CL, Fu LC. Deep Learning-Based Electrocardiogram Model (EIANet) to Predict Emergency Department Cardiac Arrest: Development and External Validation Study. J Med Internet Res 2025; 27:e67576. [PMID: 40053733 PMCID: PMC11928069 DOI: 10.2196/67576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2024] [Revised: 01/03/2025] [Accepted: 02/04/2025] [Indexed: 03/09/2025] Open
Abstract
BACKGROUND In-hospital cardiac arrest (IHCA) is a severe and sudden medical emergency that is characterized by the abrupt cessation of circulatory function, leading to death or irreversible organ damage if not addressed immediately. Emergency department (ED)-based IHCA (EDCA) accounts for 10% to 20% of all IHCA cases. Early detection of EDCA is crucial, yet identifying subtle signs of cardiac deterioration is challenging. Traditional EDCA prediction methods primarily rely on structured vital signs or electrocardiogram (ECG) signals, which require additional preprocessing or specialized devices. This study introduces a novel approach using image-based 12-lead ECG data obtained at ED triage, leveraging the inherent richness of visual ECG patterns to enhance prediction and integration into clinical workflows. OBJECTIVE This study aims to address the challenge of early detection of EDCA by developing an innovative deep learning model, the ECG-Image-Aware Network (EIANet), which uses 12-lead ECG images for early prediction of EDCA. By focusing on readily available triage ECG images, this research seeks to create a practical and accessible solution that seamlessly integrates into real-world ED workflows. METHODS For adult patients with EDCA (cases), 12-lead ECG images at ED triage were obtained from 2 independent data sets: National Taiwan University Hospital (NTUH) and Far Eastern Memorial Hospital (FEMH). Control ECGs were randomly selected from adult ED patients without cardiac arrest during the same study period. In EIANet, ECG images were first converted to binary form, followed by noise reduction, connected component analysis, and morphological opening. A spatial attention module was incorporated into the ResNet50 architecture to enhance feature extraction, and a custom binary recall loss (BRLoss) was used to balance precision and recall, addressing slight data set imbalance. 
The model was developed and internally validated on the NTUH-ECG data set and was externally validated on an independent FEMH-ECG data set. The model performance was evaluated using the F1-score, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPRC). RESULTS There were 571 case ECGs and 826 control ECGs in the NTUH data set and 378 case ECGs and 713 control ECGs in the FEMH data set. The novel EIANet model achieved an F1-score of 0.805, AUROC of 0.896, and AUPRC of 0.842 on the NTUH-ECG data set with a 40% positive sample ratio. It achieved an F1-score of 0.650, AUROC of 0.803, and AUPRC of 0.678 on the FEMH-ECG data set with a 34.6% positive sample ratio. The feature map showed that the region of interest in the ECG was the ST segment. CONCLUSIONS EIANet demonstrates promising potential for accurately predicting EDCA using triage ECG images, offering an effective solution for early detection of high-risk cases in emergency settings. This approach may enhance the ability of health care professionals to make timely decisions, with the potential to improve patient outcomes by enabling earlier interventions for EDCA.
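Two of the preprocessing steps named in the METHODS, binarization and morphological opening, can be sketched on a tiny grayscale grid without any imaging library. This is an illustration of the general technique only: the threshold of 128, the 3x3 square structuring element, the treatment of out-of-image pixels as background, and all function names are assumptions, not details taken from EIANet.

```python
def binarize(image, threshold=128):
    """Mark pixels darker than the threshold as foreground (1), the rest as background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def _window(img, r, c):
    """Values in the 3x3 neighbourhood of (r, c); pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    return [
        img[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0
        for rr in range(r - 1, r + 2)
        for cc in range(c - 1, c + 2)
    ]


def erode(img):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return [[1 if all(_window(img, r, c)) else 0 for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return [[1 if any(_window(img, r, c)) else 0 for c in range(len(img[0]))]
            for r in range(len(img))]


def opening(img):
    """Morphological opening: erosion followed by dilation, which removes small specks."""
    return dilate(erode(img))
```

Opening removes foreground regions smaller than the structuring element while leaving larger shapes roughly intact. A real ECG trace is thin, so the element size would need to be chosen with care to avoid erasing the signal itself.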
Affiliation(s)
- Shao-Chi Lu
  - Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
- Guang-Yuan Chen
  - Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
- An-Sheng Liu
  - Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
- Jen-Tang Sun
  - Department of Emergency Medicine, Far Eastern Memorial Hospital, Taipei, Taiwan
- Jun-Wan Gao
  - Department of Emergency Medicine, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan
- Chien-Hua Huang
  - Department of Emergency Medicine, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan
- Chu-Lin Tsai
  - Department of Emergency Medicine, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan
- Li-Chen Fu
  - Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
|
21
|
Campagner A, Agnello L, Carobene A, Padoan A, Del Ben F, Locatelli M, Plebani M, Ognibene A, Lorubbio M, De Vecchi E, Cortegiani A, Piva E, Poz D, Curcio F, Cabitza F, Ciaccio M. Complete Blood Count and Monocyte Distribution Width-Based Machine Learning Algorithms for Sepsis Detection: Multicentric Development and External Validation Study. J Med Internet Res 2025; 27:e55492. [PMID: 40009841 PMCID: PMC11904381 DOI: 10.2196/55492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 05/04/2024] [Accepted: 09/09/2024] [Indexed: 02/28/2025] Open
Abstract
BACKGROUND Sepsis is an organ dysfunction caused by a dysregulated host response to infection. Early detection is fundamental to improving the patient outcome. Laboratory medicine can play a crucial role by providing biomarkers whose alteration can be detected before the onset of clinical signs and symptoms. In particular, the relevance of monocyte distribution width (MDW) as a sepsis biomarker has emerged in the previous decade. However, despite encouraging results, MDW has poor sensitivity and positive predictive value when compared to other biomarkers. OBJECTIVE This study aims to investigate the use of machine learning (ML) to overcome the limitations mentioned earlier by combining different parameters and therefore improving sepsis detection. However, making ML models function in clinical practice may be problematic, as their performance may suffer when deployed in contexts other than the research environment. In fact, even widely used commercially available models have been demonstrated to generalize poorly in out-of-distribution scenarios. METHODS In this multicentric study, we developed ML models whose intended use is the early detection of sepsis on the basis of MDW and complete blood count parameters. In total, data from 6 patient cohorts (encompassing 5344 patients) collected at 5 different Italian hospitals were used to train and externally validate ML models. The models were trained on a patient cohort encompassing patients enrolled at the emergency department, and it was externally validated on 5 different cohorts encompassing patients enrolled at both the emergency department and the intensive care unit. The cohorts were selected to exhibit a variety of data distribution shifts compared to the training set, including label, covariate, and missing data shifts, enabling a conservative validation of the developed models. 
To improve generalizability and robustness to different types of distribution shifts, the developed ML models combine traditional methodologies with advanced techniques inspired by controllable artificial intelligence (AI), namely cautious classification, which gives the ML models the ability to abstain from making predictions, and explainable AI, which provides health operators with useful information about the models' functioning. RESULTS The developed models achieved good performance on the internal validation (area under the receiver operating characteristic curve between 0.91 and 0.98), as well as consistent generalization performance across the external validation datasets (area under the receiver operating characteristic curve between 0.75 and 0.95), outperforming baseline biomarkers and state-of-the-art ML models for sepsis detection. Controllable AI techniques were further able to improve performance and were used to derive an interpretable set of diagnostic rules. CONCLUSIONS Our findings demonstrate how controllable AI approaches based on complete blood count and MDW may be used for the early detection of sepsis while also demonstrating how the proposed methodology can be used to develop ML models that are more resistant to different types of data distribution shifts.
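Cautious classification, as described above, lets a model abstain instead of committing to a label. One simple way to express the idea is a pair of probability thresholds with an abstention band between them. The sketch below is only an illustration of that idea: the thresholds 0.3 and 0.7, the function names, and the toy probabilities are assumptions, and the study's actual abstention rule is not specified in the abstract.

```python
def cautious_predict(prob, lower=0.3, upper=0.7):
    """Return 1 (sepsis), 0 (no sepsis), or None to abstain when the probability is ambiguous."""
    if not 0.0 <= lower <= upper <= 1.0:
        raise ValueError("need 0 <= lower <= upper <= 1")
    if prob >= upper:
        return 1
    if prob <= lower:
        return 0
    return None  # inside the abstention band: defer to the clinician


def coverage_and_accuracy(probs, labels, lower=0.3, upper=0.7):
    """Share of cases the model answers, and its accuracy on the answered cases only."""
    decisions = [cautious_predict(p, lower, upper) for p in probs]
    answered = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    coverage = len(answered) / len(probs)
    accuracy = (
        sum(d == y for d, y in answered) / len(answered) if answered else float("nan")
    )
    return coverage, accuracy
```

Widening the band usually raises accuracy on the answered cases at the cost of lower coverage, so the two numbers are best reported together.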
Affiliation(s)
- Anna Carobene
  - IRCCS San Raffaele Scientific Institute, Milano, Italy
- Andrea Padoan
  - Department of Medicine, University of Padova, Padova, Italy
  - Laboratory Medicine Unit, University-Hospital of Padova, Padova, Italy
- Fabio Del Ben
  - IRCCS Centro Di Riferimento Oncologico Aviano, Aviano, Italy
- Mario Plebani
  - Department of Medicine, University of Padova, Padova, Italy
  - Laboratory Medicine Unit, University-Hospital of Padova, Padova, Italy
- Andrea Cortegiani
  - University of Palermo, Palermo, Italy
  - University Hospital Policlinico Paolo Giaccone, Palermo, Italy
- Elisa Piva
  - Azienda Socio Sanitaria Territoriale di Mantova, Mantova, Italy
- Federico Cabitza
  - IRCCS Ospedale Galeazzi Sant'Ambrogio, Milan, Italy
  - Department of Computer Science, Systems and Communication, University of Milano-Bicocca, Milano, Italy
- Marcello Ciaccio
  - University of Palermo, Palermo, Italy
  - University Hospital Policlinico Paolo Giaccone, Palermo, Italy
|
22
|
Li J, Tian Z, Fang Y, He Z, Xu Y, Xu H, Zhu Z, Qiu Y, Liu Z. Determining the risk factors for postoperative mechanical complication in degenerative scoliosis: a machine learning approach based on musculoskeletal metrics. European Spine Journal: Official Publication of the European Spine Society, the European Spinal Deformity Society, and the European Section of the Cervical Spine Research Society 2025:10.1007/s00586-025-08742-y. [PMID: 39988612 DOI: 10.1007/s00586-025-08742-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Revised: 01/06/2025] [Accepted: 02/12/2025] [Indexed: 02/25/2025]
Abstract
OBJECTIVE To determine the risk factors for mechanical complications (MC) following corrective surgery for degenerative scoliosis through a machine learning (ML) algorithm. METHODS Patients with degenerative scoliosis who received corrective surgery were enrolled. A total of 213 cases were ultimately included and randomized into the training set (70%) and test set (30%) to develop the machine learning-based algorithm. The demographic data, comorbidities, regional and global radiographic parameters, paraspinal muscle (PSM) fat infiltration rate (FI%), and vertebral bone quality (VBQ) score were analyzed. RESULTS A total of 101 patients (47.4%) had MC, including 46 patients with proximal junctional kyphosis or failure (PJK/PJF), 7 patients with distal junctional kyphosis or failure (DJK/DJF), and 25 patients with rod or screw breakage. In the testing set, Gaussian Naive Bayes (GNB) exhibited the highest AUC at 0.77, while Random Forest (RF) exhibited the highest PRC at 0.63. GNB, RF, and Logistic Regression (LR) models all achieved an accuracy of 0.69, while RF exhibited the highest sensitivity at 0.60 and lowest Brier score of 0.20. Shapley Additive Explanation (SHAP) analysis identified higher FI% of PSM, elevated VBQ score, higher preoperative T1-pelvic angle (T1PA), and postoperative lordosis maldistribution as major risk factors for MC. Based on RF model, local interpretable model-agnostic explanations (LIME) visualization was successfully developed for individual risk calculation. CONCLUSION The RF and GNB models showed the best overall performance. Both RF and GNB models identified top-ranked/major risk factors including higher paraspinal muscle fat infiltration, elevated VBQ score, higher preoperative T1PA angle, and postoperative lordosis maldistribution providing valuable insights for surgical decision-making.
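The SHAP analysis mentioned above attributes a prediction to individual features using Shapley values. For a handful of features these can be computed exactly from the definition by averaging each feature's marginal contribution over all feature orderings. The sketch below does that in plain Python. It is a definition-level illustration, not the estimators used by the SHAP library or by the authors: the function name `shapley_values`, the use of a single baseline vector to represent "absent" features, and the toy linear model are all assumptions.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance by enumerating all feature orderings.

    predict  -- function mapping a feature list to a number
    x        -- the instance to explain
    baseline -- reference values standing in for "absent" features (an assumption)
    Cost grows factorially with the number of features, so this only suits tiny examples.
    """
    n = len(x)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]           # switch feature i from baseline to its actual value
            value = predict(current)
            totals[i] += value - previous  # marginal contribution of feature i in this ordering
            previous = value
    return [t / len(orderings) for t in totals]


def toy_risk(v):
    """Hypothetical linear score, used only to demonstrate the calculation."""
    return 2.0 * v[0] - 1.0 * v[1] + 0.5 * v[2]
```

For a linear model each feature's Shapley value reduces to its coefficient times its distance from the baseline, and the values always sum to the difference between the prediction for `x` and the prediction for the baseline.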
Affiliation(s)
- Jie Li
- Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Zhongshan Road 321, Nanjing, 210008, China
| | - Zhen Tian
- Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Zhongshan Road 321, Nanjing, 210008, China
| | - Yinyu Fang
- Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Zhongshan Road 321, Nanjing, 210008, China
| | - Zhong He
- Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Zhongshan Road 321, Nanjing, 210008, China
| | - Yanjie Xu
- Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Zhongshan Road 321, Nanjing, 210008, China
| | - Hui Xu
- Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Zhongshan Road 321, Nanjing, 210008, China
| | - Zezhang Zhu
- Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Zhongshan Road 321, Nanjing, 210008, China
| | - Yong Qiu
- Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Zhongshan Road 321, Nanjing, 210008, China.
| | - Zhen Liu
- Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Zhongshan Road 321, Nanjing, 210008, China.
| |
|
23
|
Pruinelli L, Balakrishnan K, Ma S, Li Z, Wall A, Lai JC, Schold JD, Pruett T, Simon G. Transforming liver transplant allocation with artificial intelligence and machine learning: a systematic review. BMC Med Inform Decis Mak 2025; 25:98. [PMID: 39994720 PMCID: PMC11852809 DOI: 10.1186/s12911-025-02890-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2024] [Accepted: 01/22/2025] [Indexed: 02/26/2025] Open
Abstract
BACKGROUND The principles of urgency, utility, and benefit are fundamental concepts guiding the ethical and practical decision-making process for organ allocation; however, liver transplant (LT) allocation still follows an urgency model. AIM To identify and analyze the data elements and data sources used in Machine Learning (ML) and Artificial Intelligence (AI) methods, and their focus on urgency, utility, or benefit in LT. METHODS A comprehensive search across Ovid Medline and Scopus was conducted for studies published from 2002 to June 2023. Inclusion criteria targeted quantitative studies using ML/AI for candidates, donors, or recipients. Two reviewers assessed eligibility and extracted data, following PRISMA guidelines. RESULTS A total of 20 papers were included, and their results were synthesized into five major categories. Eight studies were led by a Spanish team, focusing on donor-recipient matching and proposing ML models to predict post-LT survival. Other international studies addressed organ supply-demand issues and developed predictive models to optimize LT outcomes. The studies highlight the potential of ML/AI to enhance LT allocation and outcomes. Despite advancements, limitations included the lack of both robust transplant-related benefit models and improvements in urgency models over MELD. DISCUSSION This review highlighted the potential of AI and ML to enhance LT allocation and outcomes. Significant advancements were noted, but limitations such as the need for better urgency models and the absence of a transplant-related benefit model remain. Most studies emphasized utility, focusing on survival outcomes. Future research should address the interpretability and generalizability of these models to improve organ allocation and post-LT survival predictions.
Affiliation(s)
- Lisiane Pruinelli
- Department of Family, Community and Health Systems Science, University of Florida, Gainesville, Florida, US.
- Department of Surgery, University of Florida, Gainesville, Florida, US.
| | - Kiruthika Balakrishnan
- Department of Family, Community and Health Systems Science, University of Florida, Gainesville, Florida, US
| | - Sisi Ma
- Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA
- Division of General Internal Medicine, University of Minnesota, Minneapolis, Minnesota, USA
| | - Zhigang Li
- Department of Biostatistics, University of Florida, Gainesville, Florida, USA
| | - Anji Wall
- Baylor University Medical Center in Dallas, Dallas, Texas, USA
| | - Jennifer C Lai
- Department of Medicine, University of California, San Francisco, California, USA
| | - Jesse D Schold
- Departments of Surgery and Epidemiology, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
| | - Timothy Pruett
- Department of Surgery, University of Minnesota, Minneapolis, Minnesota, US
| | - Gyorgy Simon
- Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA
| |
|
24
|
Ahmed KS, Issaka SM, Marcinak CT, Virani SS, Jaraczewski T, Afshar M, Mayampurath A, Churpek MM, Mathew J, Zafar SN. Machine Learning-Driven Modeling to Predict Postdischarge Venous Thromboembolism After Pancreatectomy for Pancreas Cancer. Ann Surg Oncol 2025:10.1245/s10434-025-17032-2. [PMID: 39979688 DOI: 10.1245/s10434-025-17032-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2024] [Accepted: 02/03/2025] [Indexed: 02/22/2025]
Abstract
BACKGROUND Postdischarge venous thromboembolism (pdVTE) is a life-threatening complication following resection for pancreatic cancer (PC). While national guidelines recommend extended chemoprophylaxis for all, adherence is low and ranges from 1.5 to 44%. Predicting a patient's pdVTE risk would enable a more tailored approach to extended chemoprophylaxis, better balancing the cost and risks of overtreatment. We aimed to demonstrate the feasibility of using machine learning models to predict pdVTE. PATIENTS AND METHODS We analyzed data from patients undergoing pancreatectomy for PC using the National Surgical Quality Improvement Program database between 2014 and 2020. Predictive classification models were trained and independently tested on features available at the time of discharge including demographics, clinical, laboratory, cancer, and surgery-specific variables. We developed and compared logistic regression (LR), decision tree (DT), random forest (RF), and gradient boosting (GB) models to predict the development of pdVTE. Model performance and feature importance were evaluated. RESULTS The study included a total of 51,916 patients, with 743 (1.4%) experiencing pdVTE. The best-performing GB, RF, and DT models achieved area under the curve (AUC) scores of 0.83, 0.80, and 0.80, respectively, demonstrating superior performance compared with the traditional LR (AUC = 0.72) model. The GB model achieved a specificity of 99%, sensitivity of 0.40%, and area under the precision recall curve of 0.34. The most important variables were intraoperative antibiotic use, blood transfusion, length of stay, and postoperative infections. CONCLUSIONS Machine learning models can reliably identify patients who are at high risk for pdVTE. Such models should be used to inform prescription of extended VTE prophylaxis.
Affiliation(s)
- Kaleem S Ahmed
- Division of Surgical Oncology, Department of Surgery, UW Carbone Cancer Center, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
| | - Sheriff M Issaka
- Division of Surgical Oncology, Department of Surgery, UW Carbone Cancer Center, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
| | - Clayton T Marcinak
- Division of Surgical Oncology, Department of Surgery, UW Carbone Cancer Center, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
| | - Sehar S Virani
- Department of Surgery, Aga Khan University, Karachi, Pakistan
| | | | - Majid Afshar
- Department of Medicine, University of Wisconsin-Madison, Madison, WI, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
| | - Anoop Mayampurath
- Department of Medicine, University of Wisconsin-Madison, Madison, WI, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
| | - Matthew M Churpek
- Department of Medicine, University of Wisconsin-Madison, Madison, WI, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
| | - Jomol Mathew
- Department of Informatics and Information Technology, University of Wisconsin-Madison, Madison, WI, USA
| | - Syed Nabeel Zafar
- Division of Surgical Oncology, Department of Surgery, UW Carbone Cancer Center, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA.
| |
|
25
|
Safarzadeh S, Ardabili NS, Farashah MV, Roozbeh N, Darsareh F. Predicting mother and newborn skin-to-skin contact using a machine learning approach. BMC Pregnancy Childbirth 2025; 25:182. [PMID: 39966775 PMCID: PMC11837404 DOI: 10.1186/s12884-025-07313-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Accepted: 02/10/2025] [Indexed: 02/20/2025] Open
Abstract
BACKGROUND Despite the known benefits of skin-to-skin contact (SSC), limited data exist on its implementation, especially its influencing factors. The current study was designed to use machine learning (ML) to identify the predictors of SSC. METHODS This study implemented predictive SSC approaches based on data obtained from the "Iranian Maternal and Neonatal Network (IMaN Net)" from January 2020 to January 2022. Predictive models were built using nine statistical learning methods (linear regression, logistic regression, decision tree classification, random forest classification, deep learning feedforward, extreme gradient boost model, light gradient boost model, support vector machine, and permutation feature classification with k-nearest neighbors). Demographic, obstetric, and maternal and neonatal clinical factors were considered as potential predictors and were extracted from the patients' medical records. The area under the receiver operating characteristic curve (AUROC), accuracy, precision, recall, and F1 score were measured to evaluate predictive performance. RESULTS Of 8031 eligible mothers, 3759 (46.8%) experienced SSC. The algorithms created by deep learning (AUROC: 0.81, accuracy: 0.75, precision: 0.67, recall: 0.77, and F1 score: 0.73) and linear regression (AUROC: 0.80, accuracy: 0.75, precision: 0.66, recall: 0.75, and F1 score: 0.71) had the highest performance in predicting SSC. Doula support, neonatal weight, gestational age, attending childbirth classes, and maternal age were the critical predictors of SSC based on the two best-performing algorithms. CONCLUSIONS Although this study found that the ML models performed well in predicting SSC, more research is needed to draw a firmer conclusion about their performance.
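The F1 score reported in this abstract is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded precision and recall values quoted above; the helper name `f1_score` is an assumption introduced for illustration. Recomputing from two-decimal inputs gives roughly 0.72 and 0.70 rather than the reported 0.73 and 0.71, a gap that is plausibly explained by the authors working from unrounded values.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision/recall as quoted in the abstract.
deep_learning_f1 = f1_score(0.67, 0.77)
linear_regression_f1 = f1_score(0.66, 0.75)

# Share of eligible mothers who experienced SSC (3759 of 8031).
ssc_rate_percent = 100 * 3759 / 8031
```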
Affiliation(s)
- Sanaz Safarzadeh
- Mother and Child Welfare Research Center, Hormozgan University of Medical Sciences, Bandar Abbas, Iran
- Student research committee, Department of midwifery, School of Nursing and Midwifery, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | | | | | - Nasibeh Roozbeh
- Mother and Child Welfare Research Center, Hormozgan University of Medical Sciences, Bandar Abbas, Iran
| | - Fatemeh Darsareh
- Mother and Child Welfare Research Center, Hormozgan University of Medical Sciences, Bandar Abbas, Iran.
| |
|
26
|
Mari T, Ali SH, Pacinotti L, Powsey S, Fallon N. Machine learning classification of active viewing of pain and non-pain images using EEG does not exceed chance in external validation samples. Cogn Affect Behav Neurosci 2025:10.3758/s13415-025-01268-2. [PMID: 39966304 DOI: 10.3758/s13415-025-01268-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/18/2025] [Indexed: 02/20/2025]
Abstract
Previous research has demonstrated that machine learning (ML) could not effectively decode passive observation of neutral versus pain photographs by using electroencephalogram (EEG) data. Consequently, the present study explored whether active viewing, i.e., requiring participant engagement in a task, of neutral and pain stimuli improves ML performance. Random forest (RF) models were trained on cortical event-related potentials (ERPs) during a two-alternative forced choice paradigm, whereby participants determined the presence or absence of pain in photographs of facial expressions and action scenes. Sixty-two participants were recruited for the model development sample. Moreover, a within-subject temporal validation sample was collected, consisting of 27 subjects. In line with our previous research, three RF models were developed to classify images into faces and scenes, neutral and pain scenes, and neutral and pain expressions. The results demonstrated that the RF successfully classified discrete categories of visual stimuli (faces and scenes) with accuracies of 78% and 66% on cross-validation and external validation, respectively. However, despite promising cross-validation results of 61% and 67% for the classification of neutral and pain scenes and neutral and pain faces, respectively, the RF models failed to exceed chance performance on the external validation dataset on both empathy classification attempts. These results align with previous research, highlighting the challenges of classifying complex states, such as pain empathy using ERPs. Moreover, the results suggest that active observation fails to enhance ML performance beyond previous passive studies. Future research should prioritise improving model performance to obtain levels exceeding chance, which would demonstrate increased utility.
Affiliation(s)
- Tyler Mari
- Department of Psychology, Institute of Population Health, Faculty of Health and Life Sciences, University of Liverpool, Bedford Street South, Liverpool, L69 7ZA, UK.
| | - S Hasan Ali
- Department of Psychology, Institute of Population Health, Faculty of Health and Life Sciences, University of Liverpool, Bedford Street South, Liverpool, L69 7ZA, UK
| | - Lucrezia Pacinotti
- Department of Psychology, Institute of Population Health, Faculty of Health and Life Sciences, University of Liverpool, Bedford Street South, Liverpool, L69 7ZA, UK
| | - Sarah Powsey
- Department of Psychology, Institute of Population Health, Faculty of Health and Life Sciences, University of Liverpool, Bedford Street South, Liverpool, L69 7ZA, UK
| | - Nicholas Fallon
- Department of Psychology, Institute of Population Health, Faculty of Health and Life Sciences, University of Liverpool, Bedford Street South, Liverpool, L69 7ZA, UK
| |
|
27
|
Liu Z, Shu W, Liu H, Zhang X, Chong W. Development and validation of interpretable machine learning models for triage patients admitted to the intensive care unit. PLoS One 2025; 20:e0317819. [PMID: 39964993 PMCID: PMC11835250 DOI: 10.1371/journal.pone.0317819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2024] [Accepted: 01/06/2025] [Indexed: 02/20/2025] Open
Abstract
OBJECTIVES To develop and validate interpretable machine learning (ML) models for predicting whether triaged patients need to be admitted to the intensive care unit (ICU). MEASURES The study analyzed 189,167 emergency patients from the Medical Information Mart for Intensive Care IV database, with the outcome being ICU admission. Three models were compared: Model 1 based on the Emergency Severity Index (ESI), Model 2 on vital signs, and Model 3 on vital signs, demographic characteristics, medical history, and chief complaints. Nine ML algorithms were employed. The area under the receiver operating characteristic curve (AUC), F1 score, positive predictive value, negative predictive value, Brier score, calibration curves, and decision curve analysis were used to evaluate the performance of the models. SHapley Additive exPlanations was used to explain the ML models. RESULTS The AUC of Model 3 was superior to that of Model 1 and Model 2. In Model 3, the four algorithms with the highest AUC were Gradient Boosting (0.81), Logistic Regression (0.81), naive Bayes (0.80), and Random Forest (0.80). Upon further comparison of the four algorithms, Gradient Boosting was slightly superior to Random Forest and Logistic Regression, while naive Bayes performed the worst. CONCLUSIONS This study developed an interpretable ML triage model using vital signs, demographics, medical history, and chief complaints, which proved more effective than traditional models in predicting ICU admission. Interpretable ML aids clinical decisions during triage.
Affiliation(s)
- Zheng Liu
- Department of Emergency, The First Hospital of China Medical University, Shenyang, China
| | - Wenqi Shu
- Department of Emergency, The First Hospital of China Medical University, Shenyang, China
| | - Hongyan Liu
- Department of Emergency, The First Hospital of China Medical University, Shenyang, China
| | - Xuan Zhang
- Department of Emergency, The First Hospital of China Medical University, Shenyang, China
| | - Wei Chong
- Department of Emergency, The First Hospital of China Medical University, Shenyang, China
| |
|
28
|
Chen YH, Lin CH, Fan CH, Long AJ, Scholl J, Kao YP, Iqbal U, Li YCJ. Machine Learning Approach to Identifying Wrong-Site Surgeries Using Centers for Medicare and Medicaid Services Dataset: Development and Validation Study. JMIR Form Res 2025; 9:e68436. [PMID: 39946709 PMCID: PMC11888080 DOI: 10.2196/68436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2024] [Revised: 12/20/2024] [Accepted: 01/12/2025] [Indexed: 03/10/2025] Open
Abstract
BACKGROUND Wrong-site surgery (WSS) is a critical but preventable medical error, often resulting in severe patient harm and substantial financial costs. While protocols exist to reduce WSS, underreporting and inconsistent documentation continue to contribute to its persistence. Machine learning (ML) models, which have shown success in detecting medication errors, may offer a solution by identifying unusual procedure-diagnosis combinations. This study investigated whether an ML approach can be effectively adapted to detect surgical errors. OBJECTIVE This study aimed to evaluate the transferability and effectiveness of an ML-based model for detecting inconsistencies within surgical documentation, particularly focusing on laterality discrepancies. METHODS We used claims data from the Centers for Medicare and Medicaid Services Limited Data Set (CMS-LDS) from 2017 to 2020, focusing on surgical procedures with documented laterality. We developed an adapted Association Outlier Pattern (AOP) ML model to identify uncommon procedure-diagnosis combinations, specifically targeting discrepancies in laterality. The model was trained on data from 2017 to 2019 and tested on 2020 orthopedic procedures, using ICD-10-PCS (International Classification of Diseases, Tenth Revision, Procedure Coding System) codes to distinguish body part and laterality. Test cases were classified based on alignment between procedural and diagnostic laterality, with 2 key subgroups (right-left and left-right mismatches) identified for evaluation. Model performance was assessed by comparing precision-recall curves and accuracy against rule-based methods. RESULTS The analysis included 346,382 claims, of which 2170 showed significant laterality discrepancies between procedures and diagnoses. Among cases with left-side procedures and right-side diagnoses, 54.5% (603/1106) were confirmed as errors after clinical review. For right-side procedures with left-side diagnoses, 50.8% (541/1064) were classified as errors. The AOP model identified 697 and 655 potentially unusual combinations in the left-right and right-left subgroups, respectively, with over 80% of these cases confirmed as errors following clinical review. Most confirmed errors involved discrepancies in laterality for the same body part, while nonerror cases typically involved general diagnoses without specified laterality. CONCLUSIONS This investigation showed that the AOP model effectively detects inconsistencies between surgical procedures and diagnoses using CMS-LDS data. The AOP model outperformed traditional rule-based methods, offering higher accuracy in identifying errors. Moreover, the model's transferability from medication-disease associations to procedure-diagnosis verification highlights its broad applicability. By improving the precision of identifying laterality discrepancies, the AOP model can reduce surgical errors, particularly in orthopedic care. These findings suggest that the model enhances patient safety and has the potential to improve clinical decision-making and outcomes.
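The subgroup counts in this abstract can be cross-checked with simple arithmetic: the two mismatch subgroups should add up to the 2170 flagged claims, and the confirmed-error fractions should reproduce the quoted percentages. The sketch below does only that; the variable names are assumptions chosen for illustration and do not come from the study's code.

```python
# (confirmed errors, subgroup size) as quoted in the abstract.
left_procedure_right_diagnosis = (603, 1106)
right_procedure_left_diagnosis = (541, 1064)


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage."""
    return 100 * numerator / denominator


left_error_percent = percent(*left_procedure_right_diagnosis)
right_error_percent = percent(*right_procedure_left_diagnosis)

# The two subgroup sizes together should equal the number of flagged claims.
total_flagged = left_procedure_right_diagnosis[1] + right_procedure_left_diagnosis[1]

# Flagged claims as a share of all 346,382 claims analyzed.
flagged_share_percent = percent(total_flagged, 346_382)
```

Under these figures, flagged claims make up well under 1% of all claims analyzed, which is consistent with laterality mismatches being rare events.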
Affiliation(s)
- Yuan-Hsin Chen
- Department of Surgery, Massachusetts General Hospital, Boston, MA, United States
| | - Ching-Hsuan Lin
- Center for the Evaluation of Value and Risk in Health, Tufts Medical Center, Boston, MA, United States
| | | | | | | | - Yen-Pin Kao
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, New Taipei City, Taiwan
| | - Usman Iqbal
- School of Population Health, Faculty of Medicine and Health, University of New South Wales, Sydney, Australia
| | - Yu-Chuan Jack Li
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, New Taipei City, Taiwan
- International Center for Health Information and Technology, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Department of Dermatology, Wan Fang Hospital, Taipei, Taiwan
- Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei, Taiwan
| |
|
29
|
Saito R, Tsugawa S. Understanding Citizens' Response to Social Activities on Twitter in US Metropolises During the COVID-19 Recovery Phase Using a Fine-Tuned Large Language Model: Application of AI. J Med Internet Res 2025; 27:e63824. [PMID: 39932775 PMCID: PMC11862765 DOI: 10.2196/63824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Revised: 12/08/2024] [Accepted: 12/24/2024] [Indexed: 02/13/2025] Open
Abstract
BACKGROUND The COVID-19 pandemic continues to hold an important place in the collective memory as of 2024. As of March 2024, >676 million cases, 6 million deaths, and 13 billion vaccine doses have been reported. To understand the effects of the COVID-19 pandemic, it is crucial to evaluate sociopsychological impacts as well as public health indicators such as these. OBJECTIVE This study aimed to explore the sentiments of residents of major US cities toward restrictions on social activities in 2022 during the transitional phase of the COVID-19 pandemic, from the peak of the pandemic to its gradual decline. By illuminating people's susceptibility to COVID-19, we provide insights into the general sentiment trends during the recovery phase of the pandemic. METHODS To analyze these trends, we collected posts (N=119,437) on the social media platform Twitter (now X) created from December 2021 to December 2022 by people living in New York City, Los Angeles, and Chicago, cities that were impacted by the COVID-19 pandemic in similar ways. A total of 47,111 unique users authored these posts. For privacy reasons, any identifiable information, such as author IDs and usernames, was excluded, retaining only the text for analysis. We then developed a sentiment estimation model by fine-tuning a large language model on the collected data and used it to analyze how citizens' sentiments evolved throughout the pandemic. RESULTS In the evaluation of models, GPT-3.5 Turbo with fine-tuning outperformed GPT-3.5 Turbo without fine-tuning and Robustly Optimized Bidirectional Encoder Representations from Transformers Pretraining Approach (RoBERTa)-large with fine-tuning, achieving accuracy (0.80), recall (0.79), precision (0.79), and F1-score (0.79). The findings using GPT-3.5 Turbo with fine-tuning reveal a significant relationship between sentiment levels and actual cases in all 3 cities. Specifically, the correlation coefficient for New York City is 0.89 (95% CI 0.81-0.93), for Los Angeles is 0.39 (95% CI 0.14-0.60), and for Chicago is 0.65 (95% CI 0.47-0.78). Furthermore, feature word analysis showed that COVID-19-related keywords were replaced with non-COVID-19-related keywords in New York City and Los Angeles from January 2022 onward and in Chicago from March 2022 onward. CONCLUSIONS The results show a gradual decline in sentiment and interest in restrictions across all 3 cities as the pandemic approached its conclusion. These results are supported by a sentiment estimation model fine-tuned on actual Twitter posts. This study represents the first attempt from a macro perspective to depict sentiment using a classification model created with actual data from the period when COVID-19 was prevalent. This approach can be applied to the spread of other infectious diseases by adjusting search keywords for observational data.
Affiliation(s)
- Ryuichi Saito
- Institute of Systems and Information Engineering, University of Tsukuba, Tsukuba, Japan
| | - Sho Tsugawa
- Institute of Systems and Information Engineering, University of Tsukuba, Tsukuba, Japan
| |
|
30
|
Zhang SY, Zhang YD, Li H, Wang QY, Ye QF, Wang XM, Xia TH, He YE, Rong X, Wu TT, Wu RZ. Explainable machine learning model for predicting decline in platelet count after interventional closure in children with patent ductus arteriosus. Front Pediatr 2025; 13:1519002. [PMID: 39981204 PMCID: PMC11839778 DOI: 10.3389/fped.2025.1519002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2024] [Accepted: 01/20/2025] [Indexed: 02/22/2025] Open
Abstract
Background This study aimed to apply four machine learning (ML) algorithms to develop the optimal model to predict decline in platelet count (DPC) after interventional closure in children with patent ductus arteriosus (PDA). Methods Data from children with PDA who underwent successful transcatheter closure at the Second Affiliated Hospital of Wenzhou Medical University and Yuying Children's Hospital from January 2016 to December 2022 were collected. The cohort data were split into training and testing sets. DPC following the intervention was defined as a percentage DPC ≥25% [(baseline platelet count - nadir platelet count)/baseline platelet count]. The extra tree algorithm was used for feature selection, and four ML models [random forest (RF), adaptive boosting, extreme gradient boosting, and logistic regression] were built. Moreover, SHapley Additive exPlanation (SHAP) was used to explain the importance of features and the ML models. Results This study included 330 children who underwent successful transcatheter closure of PDA, of whom 113 (34.2%) experienced DPC. After 62 clinical features were considered, the extra tree algorithm selected six clinical features to build the ML models. Among the four ML algorithms, the RF model achieved the greatest AUC. SHAP analysis revealed that pulmonary artery systolic pressure, size of defect, and weight were the three most important clinical features in the RF model. Furthermore, clinical descriptions of two children with PDA whose outcomes were accurately predicted were provided, together with explanations of the prediction results. Conclusion In this study, an ML model (RF) capable of predicting post-intervention DPC in children with PDA undergoing transcatheter closure was established.
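The outcome definition in this abstract is a simple ratio: the drop from baseline to nadir platelet count, divided by the baseline, with a 25% threshold. A minimal sketch of that labeling rule follows. The function names, the default threshold argument, and the example platelet counts are hypothetical and only illustrate the formula quoted above; they are not the authors' implementation.

```python
def percentage_dpc(baseline, nadir):
    """Fractional decline: (baseline - nadir) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline platelet count must be positive")
    return (baseline - nadir) / baseline


def has_dpc(baseline, nadir, threshold=0.25):
    """True when the fractional decline meets or exceeds the threshold (25% by default)."""
    return percentage_dpc(baseline, nadir) >= threshold


# Hypothetical platelet counts in the same unit for baseline and nadir.
case_with_decline = has_dpc(baseline=300, nadir=210)      # 30% decline
case_without_decline = has_dpc(baseline=300, nadir=240)   # 20% decline
case_at_threshold = has_dpc(baseline=400, nadir=300)      # exactly 25% decline

# Incidence quoted in the abstract: 113 of 330 children.
dpc_incidence_percent = 100 * 113 / 330
```

Because the definition uses "≥25%", a decline of exactly one quarter counts as DPC in this sketch.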
Affiliation(s)
- Song-Yue Zhang
- Children's Heart Center, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
| | - Yi-Dong Zhang
- Children's Heart Center, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
| | - Hao Li
- Children's Heart Center, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
| | - Qiao-Yu Wang
- Children's Heart Center, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
| | | | - Xun-Min Wang
- Children's Heart Center, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
| | - Tian-He Xia
- Children's Heart Center, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
| | - Yue-E He
- Children's Heart Center, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
| | - Xing Rong
- Children's Heart Center, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
| | - Ting-Ting Wu
- Children's Heart Center, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
| | - Rong-Zhou Wu
- Children's Heart Center, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
| |
|
31
|
Langford DJ, Reichel JF, Zhong H, Basseri BH, Koch MP, Kolady R, Liu J, Sideris A, Dworkin RH, Poeran J, Wu CL. Machine learning research methods to predict postoperative pain and opioid use: a narrative review. Reg Anesth Pain Med 2025; 50:102-109. [PMID: 39909542 DOI: 10.1136/rapm-2024-105603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2024] [Accepted: 10/20/2024] [Indexed: 02/07/2025]
Abstract
The use of machine learning to predict postoperative pain and opioid use has likely been catalyzed by the availability of complex patient-level data, computational and statistical advancements, the prevalence and impact of chronic postsurgical pain, and the persistence of the opioid crisis. The objectives of this narrative review were to identify and characterize methodological aspects of studies that have developed and/or tested machine learning algorithms to predict acute, subacute, or chronic pain or opioid use after any surgery and to propose considerations for future machine learning studies. Pairs of independent reviewers screened titles and abstracts of 280 PubMed-indexed articles and ultimately extracted data from 61 studies that met entry criteria. We observed a marked increase in the number of relevant publications over time. Studies most commonly focused on machine learning algorithms to predict chronic postsurgical pain or opioid use, using real-world data from patients undergoing orthopedic surgery. We identified variability in sample size, number and type of predictors, and how outcome variables were defined. Patient-reported predictors were highlighted as particularly informative and important to include in such machine learning algorithms, where possible. We hope that findings from this review might inform future applications of machine learning that improve the performance and clinical utility of resultant machine learning algorithms.
Affiliation(s)
- Dale J Langford
- Pain Prevention Research Center, Department of Anesthesiology, Critical Care & Pain Management, Hospital for Special Surgery, New York, New York, USA
- Department of Anesthesiology, Weill Cornell Medicine, New York, New York, USA
| | - Julia F Reichel
- Department of Anesthesiology, Critical Care & Pain Management, Hospital for Special Surgery, New York, New York, USA
| | - Haoyan Zhong
- Department of Anesthesiology, Critical Care & Pain Management, Hospital for Special Surgery, New York, New York, USA
| | - Benjamin H Basseri
- Department of Anesthesiology, Critical Care & Pain Management, Hospital for Special Surgery, New York, New York, USA
| | - Marc P Koch
- Davidson College, Davidson, North Carolina, USA
| | | | - Jiabin Liu
- Pain Prevention Research Center, Department of Anesthesiology, Critical Care & Pain Management, Hospital for Special Surgery, New York, New York, USA
- Department of Anesthesiology, Weill Cornell Medicine, New York, New York, USA
| | - Alexandra Sideris
- Pain Prevention Research Center, Department of Anesthesiology, Critical Care & Pain Management, Hospital for Special Surgery, New York, New York, USA
- Department of Anesthesiology, Weill Cornell Medicine, New York, New York, USA
- Hospital for Special Surgery Research Institute, New York, New York, USA
| | - Robert H Dworkin
- Pain Prevention Research Center, Department of Anesthesiology, Critical Care & Pain Management, Hospital for Special Surgery, New York, New York, USA
- Hospital for Special Surgery Research Institute, New York, New York, USA
- Anesthesiology and Perioperative Medicine, University of Rochester, Rochester, New York, USA
| | - Jashvant Poeran
- Pain Prevention Research Center, Department of Anesthesiology, Critical Care & Pain Management, Hospital for Special Surgery, New York, New York, USA
- Hospital for Special Surgery Research Institute, New York, New York, USA
| | - Christopher L Wu
- Pain Prevention Research Center, Department of Anesthesiology, Critical Care & Pain Management, Hospital for Special Surgery, New York, New York, USA
- Department of Anesthesiology, Weill Cornell Medicine, New York, New York, USA
| |
|
32
|
Marigi EM, Oeding JF, Nieboer M, Marigi IM, Wahlig B, Barlow JD, Sanchez-Sotelo J, Sperling JW. The relationship between design-based lateralization, humeral bearing design, polyethylene angle, and patient-related factors on surgical complications after reverse shoulder arthroplasty: a machine learning analysis. J Shoulder Elbow Surg 2025; 34:462-472. [PMID: 38852709 DOI: 10.1016/j.jse.2024.04.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 04/12/2024] [Accepted: 04/18/2024] [Indexed: 06/11/2024]
Abstract
BACKGROUND Technological advancements in implant design and surgical technique have focused on diminishing complications and optimizing performance of reverse shoulder arthroplasty (rTSA). Despite this, there remains a paucity of literature correlating prosthetic features and clinical outcomes. This investigation utilized a machine learning approach to evaluate the effect of select implant design features and patient-related factors on surgical complications after rTSA. METHODS Over a 16-year period (2004-2020), all primary rTSA performed at a single institution for elective and traumatic indications with a minimum follow-up of 2 years were identified. Parameters related to implant design evaluated in this study included inlay vs. onlay humeral bearing design, glenoid lateralization (medialized or lateralized), humeral lateralization (medialized, minimally lateralized, or lateralized), global lateralization (medialized, minimally lateralized, lateralized, highly lateralized, or very highly lateralized), stem to metallic bearing neck shaft angle, and polyethylene neck shaft angle. Machine learning models predicting surgical complications were constructed for each patient and Shapley additive explanation values were calculated to quantify feature importance. RESULTS A total of 3837 rTSA were identified, of which 472 (12.3%) experienced a surgical complication. Those experiencing a surgical complication were more likely to be current smokers (odds ratio [OR] = 1.71; P = .003), have prior surgery (OR = 1.60; P < .001), have an underlying diagnosis of sequelae of instability (OR = 4.59; P < .001) or nonunion (OR = 3.09; P < .001), and required longer OR times (98 vs. 86 minutes; P < .001).
Notable implant design features associated with increased odds of complications included an inlay humeral component (OR = 1.67; P < .001), medialized glenoid (OR = 1.43; P = .001), medialized humerus (OR = 1.48; P = .004), a minimally lateralized global construct (OR = 1.51; P < .001), and glenohumeral constructs consisting of a medialized glenoid and minimally lateralized humerus (OR = 1.59; P < .001), and a lateralized glenoid and medialized humerus (OR = 2.68; P < .001). Based on patient- and implant-specific features, the machine learning model predicted complications after rTSA with an area under the receiver operating characteristic curve of 0.61. CONCLUSIONS This study demonstrated that patient-specific risk factors had a more substantial effect than implant design configurations on the ability of a machine learning model to predict surgical complications after rTSA. However, certain implant features appeared to be associated with higher odds of surgical complications.
Affiliation(s)
- Erick M Marigi
- Department of Orthopedic Surgery, Mayo Clinic, Jacksonville, FL, USA
| | - Jacob F Oeding
- School of Medicine, Mayo Clinic Alix School of Medicine, Rochester, MN, USA
| | - Micah Nieboer
- Department of Orthopedic Surgery, Mayo Clinic, Rochester, MN, USA
| | - Ian M Marigi
- Washington University Medical School, St. Louis, MO, USA
| | - Brian Wahlig
- Department of Orthopedic Surgery, Mayo Clinic, Rochester, MN, USA
| | | | | | - John W Sperling
- Department of Orthopedic Surgery, Mayo Clinic, Rochester, MN, USA.
| |
|
33
|
Shiferaw KB, Roloff M, Balaur I, Welter D, Waltemath D, Zeleke AA. Guidelines and standard frameworks for artificial intelligence in medicine: a systematic review. JAMIA Open 2025; 8:ooae155. [PMID: 39759773 PMCID: PMC11700560 DOI: 10.1093/jamiaopen/ooae155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2024] [Revised: 12/12/2024] [Accepted: 12/20/2024] [Indexed: 01/07/2025] Open
Abstract
Objectives The continuous integration of artificial intelligence (AI) into clinical settings requires the development of up-to-date and robust guidelines and standard frameworks that consider the evolving challenges of AI implementation in medicine. This review evaluates the quality of these guidelines and summarizes ethical frameworks, best practices, and recommendations. Materials and Methods The Appraisal of Guidelines, Research, and Evaluation II tool was used to assess the quality of guidelines based on 6 domains: scope and purpose, stakeholder involvement, rigor of development, clarity of presentation, applicability, and editorial independence. The protocol of this review, including the eligibility criteria, the search strategy, and the data extraction sheet and methods, was published prior to the actual review with International Registered Report Identifier DERR1-10.2196/47105. Results The initial search resulted in 4975 studies from 2 databases and 7 studies from manual search. Eleven articles were selected for data extraction based on the eligibility criteria. We found that while guidelines generally excel in scope, purpose, and editorial independence, there is significant variability in applicability and the rigor of guideline development. Well-established initiatives such as TRIPOD+AI, DECIDE-AI, SPIRIT-AI, and CONSORT-AI have shown high quality, particularly in terms of stakeholder involvement. However, applicability remains a prominent challenge among the guidelines. The results also showed that the reproducibility, ethical, and environmental aspects of AI in medicine still need attention from both medical and AI communities. Discussion Our work highlights the need for working toward the development of integrated and comprehensive reporting guidelines that adhere to the principles of Findability, Accessibility, Interoperability and Reusability.
This alignment is essential for fostering a cultural shift toward transparency and open science, which are pivotal milestones for sustainable digital health research. Conclusion This review evaluates the current reporting guidelines, discussing their advantages as well as challenges and limitations.
Affiliation(s)
- Kirubel Biruk Shiferaw
- Department of Medical Informatics, Institute for Community Medicine, University Medicine Greifswald, Greifswald D-17475, Germany
| | - Moritz Roloff
- Department of Medical Informatics, Institute for Community Medicine, University Medicine Greifswald, Greifswald D-17475, Germany
| | - Irina Balaur
- Luxembourg Centre for Systems Biology, University of Luxembourg, Belvaux L-4367, Luxembourg
| | - Danielle Welter
- Luxembourg National Data Service, Esch-sur-Alzette L-4362, Luxembourg
| | - Dagmar Waltemath
- Department of Medical Informatics, Institute for Community Medicine, University Medicine Greifswald, Greifswald D-17475, Germany
| | - Atinkut Alamirrew Zeleke
- Department of Medical Informatics, Institute for Community Medicine, University Medicine Greifswald, Greifswald D-17475, Germany
| |
|
34
|
Koole D, Shen O, Lans A, de Groot TM, Verlaan JJ, Schwab JH. Development of Machine Learning Algorithms for Identifying Patients With Limited Health Literacy. J Eval Clin Pract 2025; 31:e14248. [PMID: 39574338 PMCID: PMC11582738 DOI: 10.1111/jep.14248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 07/21/2024] [Accepted: 10/17/2024] [Indexed: 11/24/2024]
Abstract
RATIONALE Limited health literacy (HL) leads to poor health outcomes, psychological stress, and misutilization of medical resources. Although interventions aimed at improving HL may be effective, identifying patients at risk of limited HL in the clinical workflow is challenging. With machine learning (ML) algorithms based on readily available data, healthcare professionals would be enabled to incorporate HL screening without the need for administering in-person HL screening tools. AIMS AND OBJECTIVES Develop ML algorithms to identify patients at risk for limited HL in spine patients. METHODS Between December 2021 and February 2023, consecutive English-speaking patients over the age of 18 and new to an urban academic outpatient spine clinic were approached for participation in a cross-sectional survey study. HL was assessed using the Newest Vital Sign and the scores were divided into limited (0-3) and adequate (4-6) HL. Additional patient characteristics were extracted through a sociodemographic survey and electronic health records. Subsequently, feature selection was performed by random forest algorithms with recursive feature selection and five ML models (stochastic gradient boosting, random forest, Bayes point machine, elastic-net penalized logistic regression, support vector machine) were developed to predict limited HL. RESULTS Seven hundred and fifty-three patients were included for model development, of whom 259 (34.4%) had limited HL. Variables identified for predicting limited HL were age, Area Deprivation Index-national, Social Vulnerability Index, insurance category, Body Mass Index, race, college education, and employment status. The Elastic-Net Penalized Logistic Regression algorithm achieved the best performance with a c-statistic of 0.766, calibration slope/intercept of 1.044/-0.037, and Brier score of 0.179. 
CONCLUSION Elastic-Net Penalized Logistic Regression had the best performance when compared with other ML algorithms with a c-statistic of 0.766, calibration slope/intercept of 1.044/-0.037, and a Brier score of 0.179. Over one-third of patients presenting to an outpatient spine center were found to have limited HL. While this algorithm is far from being used in clinical practice, ML algorithms offer a potential opportunity for identifying patients at risk for limited HL without administering in-person HL assessments. This could possibly enable screening and early intervention to mitigate the potential negative consequences of limited HL without taxing the existing clinical workflow.
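For readers unfamiliar with the metrics quoted in this abstract, the following is a minimal sketch of how a Brier score and a c-statistic can be computed from predicted probabilities. It is not the authors' code: the function names and the toy labels and probabilities are hypothetical, chosen only for illustration, and the calibration slope and intercept (which require fitting a logistic recalibration model) are not covered.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (0 is perfect)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def c_statistic(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive case is scored above a random negative case.

    Ties count as half. For a binary outcome this equals the area under the ROC curve.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 1 = limited HL, 0 = adequate HL.
labels = [1, 0, 1, 0, 0, 1]
probs = [0.8, 0.3, 0.6, 0.4, 0.7, 0.5]

print(f"Brier score: {brier_score(labels, probs):.3f}")
print(f"c-statistic: {c_statistic(labels, probs):.3f}")
```

A lower Brier score indicates better overall accuracy of the predicted probabilities, while a c-statistic of 0.5 corresponds to chance-level ranking and 1.0 to perfect discrimination.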
Affiliation(s)
- Dylan Koole
- Department of Orthopaedic Surgery, Orthopaedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Orthopaedic Surgery, Leiden University Medical Center, Leiden University, Leiden, The Netherlands
| | - Oscar Shen
- Department of Orthopaedic Surgery, Orthopaedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
| | - Amanda Lans
- Department of Orthopaedic Surgery, Orthopaedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Orthopaedic Surgery, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Tom M. de Groot
- Department of Orthopaedic Surgery, Orthopaedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Orthopaedic Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - J. J. Verlaan
- Department of Orthopaedic Surgery, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - J. H. Schwab
- Department of Orthopaedic Surgery, Orthopaedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
| |
|
35
|
Cata JP, Soni B, Bhavsar S, Pillai PS, Rypinski TA, Deva A, Siewerdsen JH, Soliz JM. Forecasting intraoperative hypotension during hepatobiliary surgery. J Clin Monit Comput 2025; 39:107-118. [PMID: 39317921 PMCID: PMC11821686 DOI: 10.1007/s10877-024-01223-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2024] [Accepted: 09/13/2024] [Indexed: 09/26/2024]
Abstract
Prediction and avoidance of intraoperative hypotension (IOH) can lead to less postoperative morbidity. Machine learning (ML) is increasingly being applied to predict IOH. We hypothesize that incorporating demographic and physiological features in an ML model will improve the performance of IOH prediction. In addition, we added a "dial" feature to alter prediction performance. An ML prediction model was built based on a multivariate random forest (RF) trained algorithm using 13 physiologic time series and patient demographic data (age, sex, and BMI) for adult patients undergoing hepatobiliary surgery. A novel implementation was developed with an adjustable, multi-model voting (MMV) approach to improve performance in the challenging context of a dynamic, sliding window in which most of the data are normal (negative for IOH). The study cohort included 85% of subjects exhibiting at least one IOH event. Males constituted 70% of the cohort, median age was 55.8 years, and median BMI was 27.7. The multivariate model yielded average AUC = 0.97 in the static context of a single prediction made up to 8 min before a possible IOH event, and it outperformed a univariate model based on MAP-only (average AUC = 0.83). The MMV model demonstrated AUC = 0.96, PPV = 0.89, and NPV = 0.98 within the challenging context of a dynamic sliding window across 40 min prior to a possible IOH event. We present a novel ML model to predict IOH with a distinctive "dial" on sensitivity and specificity to predict the first IOH episode during liver resection surgeries.
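The abstract does not specify how the MMV "dial" is implemented. As a generic illustration only, the sketch below shows one way a voting rule over several models can trade sensitivity against specificity: an alarm is raised when at least `min_votes` models exceed a probability threshold. All names (`vote_alarm`, `sweep_dial`, `WINDOWS`) and the toy probabilities are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence, Tuple


def vote_alarm(model_probs: Sequence[float], threshold: float, min_votes: int) -> bool:
    """Return True when at least `min_votes` models score at or above `threshold`."""
    votes = sum(1 for p in model_probs if p >= threshold)
    return votes >= min_votes


def sensitivity_specificity(preds: Sequence[bool], truth: Sequence[bool]) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) from paired predictions and labels."""
    tp = sum(1 for p, t in zip(preds, truth) if p and t)
    fn = sum(1 for p, t in zip(preds, truth) if not p and t)
    tn = sum(1 for p, t in zip(preds, truth) if not p and not t)
    fp = sum(1 for p, t in zip(preds, truth) if p and not t)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: each window holds three model probabilities
# and whether a hypotension event actually followed.
WINDOWS: List[Tuple[List[float], bool]] = [
    ([0.9, 0.8, 0.7], True),
    ([0.6, 0.4, 0.7], True),
    ([0.6, 0.3, 0.2], True),
    ([0.7, 0.2, 0.1], False),
    ([0.2, 0.1, 0.3], False),
    ([0.4, 0.6, 0.3], False),
]


def sweep_dial(threshold: float = 0.5) -> Dict[int, Tuple[float, float]]:
    """Evaluate every setting of the vote-count dial on the toy windows."""
    truth = [label for _, label in WINDOWS]
    results = {}
    for min_votes in (1, 2, 3):
        preds = [vote_alarm(probs, threshold, min_votes) for probs, _ in WINDOWS]
        results[min_votes] = sensitivity_specificity(preds, truth)
    return results


for min_votes, (sens, spec) in sweep_dial().items():
    print(f"min_votes={min_votes}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy setup, requiring more votes lowers sensitivity and raises specificity, which is the kind of trade-off an adjustable dial is meant to expose.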
Affiliation(s)
- Juan P Cata
- Department of Anaesthesiology and Perioperative Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Anesthesiology and Surgical Oncology Research Group (ASORG), Houston, TX, USA
| | - Bhavin Soni
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Surgical Data Science Program, Institute for Data Science in Oncology (IDSO), The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Shreyas Bhavsar
- Department of Anaesthesiology and Perioperative Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Parvathy Sudhir Pillai
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Tatiana A Rypinski
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Anshuj Deva
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Jeffrey H Siewerdsen
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Surgical Data Science Program, Institute for Data Science in Oncology (IDSO), The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Jose M Soliz
- Department of Anaesthesiology and Perioperative Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
- Anesthesiology and Surgical Oncology Research Group (ASORG), Houston, TX, USA.
- Surgical Data Science Program, Institute for Data Science in Oncology (IDSO), The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
| |
|
36
|
Todd E, Orr R, Gamage E, West E, Jabeen T, McGuinness AJ, George V, Phuong-Nguyen K, Voglsanger LM, Jennings L, Angwenyi L, Taylor S, Khosravi A, Jacka F, Dawson SL. Lifestyle factors and other predictors of common mental disorders in diagnostic machine learning studies: A systematic review. Comput Biol Med 2025; 185:109521. [PMID: 39667056 DOI: 10.1016/j.compbiomed.2024.109521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Revised: 11/28/2024] [Accepted: 12/02/2024] [Indexed: 12/14/2024]
Abstract
BACKGROUND Machine Learning (ML) models have been used to predict common mental disorders (CMDs) and may provide insights into the key modifiable factors that can identify and predict CMD risk and be targeted through interventions. This systematic review aimed to synthesise evidence from ML studies predicting CMDs, evaluate their performance, and establish the potential benefit of incorporating lifestyle data in ML models alongside biological and/or demographic-environmental factors. METHODS This systematic review adheres to the PRISMA statement (Prospero CRD42023401194). Databases searched included MEDLINE, EMBASE, PsycInfo, IEEE Xplore, Engineering Village, Web of Science, and Scopus from database inception to 28/08/24. Included studies used ML methods with feature importance to predict CMDs in adults. Risk of bias (ROB) was assessed using PROBAST. Model performance metrics were compared. The ten most important variables reported by each study were assigned to broader categories to evaluate their frequency across studies. RESULTS 117 studies were included (111 model development-only, 16 development and validation). Deep learning methods showed the best accuracy for predicting CMD cases. Studies commonly incorporated features from multiple categories (n = 56), and frequently identified demographic-environmental predictors in their top ten most important variables (63/69 models). These tended to be in combination with psycho-social and biological variables (n = 15). Lifestyle data were infrequently examined as sole predictors of CMDs across included studies (4.27%). Studies commonly had high heterogeneity and ROB ratings. CONCLUSION This review is the first to evaluate the utility of diagnostic ML for CMDs, assess their ROB, and evaluate predictor types. CMDs could be predicted; however, studies had high ROB and lifestyle data were underutilised, precluding full identification of a robust predictor set.
Affiliation(s)
- Emma Todd
- Deakin University, Food & Mood Centre, Institute for Mental and Physical Health and Clinical Translation (IMPACT), Health Education and Research Building, Ryrie Street, Geelong, Victoria, Australia; Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia
| | - Rebecca Orr
- Deakin University, Food & Mood Centre, Institute for Mental and Physical Health and Clinical Translation (IMPACT), Health Education and Research Building, Ryrie Street, Geelong, Victoria, Australia; Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia
| | - Elizabeth Gamage
- Deakin University, Food & Mood Centre, Institute for Mental and Physical Health and Clinical Translation (IMPACT), Health Education and Research Building, Ryrie Street, Geelong, Victoria, Australia; Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia
| | - Emma West
- Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia
| | - Tabinda Jabeen
- Deakin University, Food & Mood Centre, Institute for Mental and Physical Health and Clinical Translation (IMPACT), Health Education and Research Building, Ryrie Street, Geelong, Victoria, Australia; Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia
| | - Amelia J McGuinness
- Deakin University, Food & Mood Centre, Institute for Mental and Physical Health and Clinical Translation (IMPACT), Health Education and Research Building, Ryrie Street, Geelong, Victoria, Australia; Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia
| | - Victoria George
- Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia; University of Copenhagen, Novo Nordisk Foundation, Centre for Basic Metabolic Research, Blegdamsvej 3A, 2200, København, Denmark
| | - Kate Phuong-Nguyen
- Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia
| | - Lara M Voglsanger
- Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia
| | - Laura Jennings
- Deakin University, Food & Mood Centre, Institute for Mental and Physical Health and Clinical Translation (IMPACT), Health Education and Research Building, Ryrie Street, Geelong, Victoria, Australia; Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia
| | - Lisa Angwenyi
- Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia
| | - Sabine Taylor
- Macquarie University, Balaclava Rd, Macquarie Park, Sydney, NSW, Australia
| | - Abbas Khosravi
- Deakin University, Institute for Intelligent Systems Research and Innovation, 75 Pigdons Rd, Waurn Ponds, Australia
| | - Felice Jacka
- Deakin University, Food & Mood Centre, Institute for Mental and Physical Health and Clinical Translation (IMPACT), Health Education and Research Building, Ryrie Street, Geelong, Victoria, Australia; Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia
| | - Samantha L Dawson
- Deakin University, Food & Mood Centre, Institute for Mental and Physical Health and Clinical Translation (IMPACT), Health Education and Research Building, Ryrie Street, Geelong, Victoria, Australia; Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), 75 Pigdons Rd, Waurn Ponds, Victoria, Australia.
| |
|
37
|
Fountzilas E, Pearce T, Baysal MA, Chakraborty A, Tsimberidou AM. Convergence of evolving artificial intelligence and machine learning techniques in precision oncology. NPJ Digit Med 2025; 8:75. [PMID: 39890986 PMCID: PMC11785769 DOI: 10.1038/s41746-025-01471-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Accepted: 01/19/2025] [Indexed: 02/03/2025] Open
Abstract
The confluence of new technologies with artificial intelligence (AI) and machine learning (ML) analytical techniques is rapidly advancing the field of precision oncology, promising to improve diagnostic approaches and therapeutic strategies for patients with cancer. By analyzing multi-dimensional, multiomic, spatial pathology, and radiomic data, these technologies enable a deeper understanding of the intricate molecular pathways, aiding in the identification of critical nodes within the tumor's biology to optimize treatment selection. The applications of AI/ML in precision oncology are extensive and include the generation of synthetic data, e.g., digital twins, in order to provide the necessary information to design or expedite the conduct of clinical trials. Currently, many operational and technical challenges exist related to data technology, engineering, and storage; algorithm development and structures; quality and quantity of the data and the analytical pipeline; data sharing and generalizability; and the incorporation of these technologies into the current clinical workflow and reimbursement models.
Affiliation(s)
- Elena Fountzilas
- Department of Medical Oncology, St Luke's Clinic, Panorama, Thessaloniki, Greece
| | | | - Mehmet A Baysal
- Department of Investigational Cancer Therapeutics, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd., Houston, TX, USA
| | - Abhijit Chakraborty
- Department of Investigational Cancer Therapeutics, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd., Houston, TX, USA
| | - Apostolia M Tsimberidou
- Department of Investigational Cancer Therapeutics, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd., Houston, TX, USA.
| |
|
38
|
Kumar D, Haag D, Blechert J, Niebauer J, Smeddinck JD. Feature Selection for Physical Activity Prediction Using Ecological Momentary Assessments to Personalize Intervention Timing: Longitudinal Observational Study. JMIR Mhealth Uhealth 2025; 13:e57255. [PMID: 39865572 PMCID: PMC11785349 DOI: 10.2196/57255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 10/24/2024] [Accepted: 11/12/2024] [Indexed: 01/28/2025] Open
Abstract
Background There has been a surge in the development of apps that aim to improve health, physical activity (PA), and well-being through behavior change. These apps often focus on creating a long-term and sustainable impact on the user. Just-in-time adaptive interventions (JITAIs) that are based on passive sensing of the user's current context (eg, via smartphones and wearables) have been devised to enhance the effectiveness of these apps and foster PA. JITAIs aim to provide personalized support and interventions such as encouraging messages in a context-aware manner. However, the limited range of passive sensing capabilities often makes it challenging to determine the timing and context for delivering well-accepted and effective interventions. Ecological momentary assessment (EMA) can provide personal context by directly capturing user assessments (eg, moods and emotions). Thus, EMA might be a useful complement to passive sensing in determining when JITAIs are triggered. However, extensive EMA schedules need to be scrutinized, as they can increase user burden. Objective The aim of the study was to use machine learning to balance the feature set size of EMA questions with the prediction accuracy regarding PA enactment. Methods A total of 43 healthy participants (aged 19-67 years) completed 4 EMA surveys daily over 3 weeks. These surveys prospectively assessed various states, including both motivational and volitional variables related to PA preparation (eg, intrinsic motivation, self-efficacy, and perceived barriers) alongside stress and mood or emotions. PA enactment was assessed retrospectively via EMA and served as the outcome variable. Results The best-performing machine learning models predicted PA engagement with a mean area under the curve score of 0.87 (SD 0.02) in 5-fold cross-validation and 0.87 on the test set.
Particularly strong predictors included self-efficacy, stress, planning, and perceived barriers, indicating that a small set of EMA predictors can yield accurate PA prediction for these participants. Conclusions A small set of EMA-based features like self-efficacy, stress, planning, and perceived barriers can be enough to predict PA reasonably well and can thus be used to meaningfully tailor JITAIs such as sending well-timed and context-aware support messages.
Affiliation(s)
- Devender Kumar
- Ludwig Boltzmann Institute for Digital Health and Prevention, Salzburg, Austria
| | - David Haag
- Ludwig Boltzmann Institute for Digital Health and Prevention, Salzburg, Austria
- Department of Psychology, Paris Lodron University of Salzburg, Salzburg, Austria
- Digital Health Information Systems, Center for Health & Bioresources, AIT Austrian Institute of Technology GmbH, Graz, Austria
| | - Jens Blechert
- Department of Psychology, Paris Lodron University of Salzburg, Salzburg, Austria
- Centre for Cognitive Neuroscience, Paris Lodron University of Salzburg, Salzburg, Austria
| | - Josef Niebauer
- Ludwig Boltzmann Institute for Digital Health and Prevention, Salzburg, Austria
- University Institute of Sports Medicine, Prevention and Rehabilitation, Paracelsus Medical University, Salzburg, Austria
| | - Jan David Smeddinck
- Ludwig Boltzmann Institute for Digital Health and Prevention, Salzburg, Austria
| |
|
39
|
van Boven MR, Bennis FC, Onland W, Aarnoudse-Moens CSH, Frings M, Tran K, Katz TA, Romijn M, Hoogendoorn M, van Kaam AH, Leemhuis AG, Oosterlaan J, Königs M. Machine learning models for neurocognitive outcome prediction in preterm born infants. Pediatr Res 2025:10.1038/s41390-025-03815-6. [PMID: 39827255 DOI: 10.1038/s41390-025-03815-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Revised: 11/24/2024] [Accepted: 12/09/2024] [Indexed: 01/22/2025]
Abstract
BACKGROUND Outcome prediction after preterm birth is important for long-term neonatal care, but has proven notoriously challenging for neurocognitive outcome. This study investigated the potential of machine learning to improve neurocognitive outcome prediction at two and five years of corrected age in preterm infants, using readily available predictors from the neonatal setting. METHODS Predictors originating from the antenatal and neonatal period of preterm infants born <30 weeks gestation were used to predict adverse neurocognitive outcome on the Bayley Scale and Wechsler Preschool and Primary Scale of Intelligence. Machine learning models were compared to conventional logistic regression and validated using internal cross-validation. RESULTS Best performing models were a random forest (two-year outcome) and a support vector machine (five-year outcome) with an area under the receiver operating characteristic curve (AUC) of 0.682 and 0.695 respectively, reaching high negative predictive values (95% and 91%, respectively). These models performed significantly better than the conventional models. CONCLUSIONS The models reached moderate overall predictive performance, yet with promising potential for early identification of children without adverse neurocognitive outcome. Machine learning modestly improved neurocognitive outcome prediction. Future research may harvest the predictive potential of a wider variety of routine (clinical) data, such as vital sign time series. IMPACT Early prediction of neurocognitive outcome in preterm infants will enable targeted follow-up and deployment of early (preventative) interventions to improve outcome. Neurocognitive outcome prediction remains notoriously challenging using conventional models, while existing machine learning models depend on advanced MRI-derived predictors with limited potential for implementation into daily clinical practice.
This study developed machine learning models for neurocognitive outcome prediction using predictors that are readily available in neonatal settings. Neurocognitive outcome prediction remains challenging due to low AUC and PPV, however, the models demonstrate high NPV, indicating potential for identifying children at low risk for adverse outcome.
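The combination of modest AUC, low PPV, and high NPV reported above is typical when the adverse outcome is uncommon. The following is a minimal sketch of how PPV and NPV are computed from confusion-matrix counts; the function name and the counts are hypothetical and chosen for illustration only, not taken from the study.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from confusion-matrix counts.

    PPV = TP / (TP + FP): share of predicted positives that are true positives.
    NPV = TN / (TN + FN): share of predicted negatives that are true negatives.
    """
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical counts (NOT from the study): 200 children, 20 with adverse outcome.
ppv, npv = predictive_values(tp=12, fp=28, tn=152, fn=8)
print(f"PPV={ppv:.2f}, NPV={npv:.2f}")
```

With a 10% prevalence in this toy example, most negative predictions are correct even though the positive predictions are often wrong, which is why NPV can be high while PPV stays low.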
Affiliation(s)
- Menne R van Boven
- Emma Children's Hospital Amsterdam UMC, location University of Amsterdam, Department of Neonatology, Meibergdreef 9, Amsterdam, The Netherlands
- Emma Children's Hospital Amsterdam UMC, location University of Amsterdam, Follow-Me program & Emma Neuroscience group, Meibergdreef 9, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development research institute, Amsterdam, The Netherlands
- Frank C Bennis
- Emma Children's Hospital Amsterdam UMC, location University of Amsterdam, Follow-Me program & Emma Neuroscience group, Meibergdreef 9, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development research institute, Amsterdam, The Netherlands
- Vrije Universiteit Amsterdam, Faculty of Science, Department Computer Science, Quantitative Data Analytics Group, Amsterdam, The Netherlands
- Wes Onland
- Emma Children's Hospital Amsterdam UMC, location University of Amsterdam, Department of Neonatology, Meibergdreef 9, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development research institute, Amsterdam, The Netherlands
- Cornelieke S H Aarnoudse-Moens
- Emma Children's Hospital Amsterdam UMC, location University of Amsterdam, Department of Neonatology, Meibergdreef 9, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development research institute, Amsterdam, The Netherlands
- Emma Children's Hospital Amsterdam UMC, location University of Amsterdam, Psychosocial Department, Meibergdreef 9, Amsterdam, The Netherlands
- Max Frings
- University of Amsterdam, Faculty of Science, Data science, Amsterdam, The Netherlands
- Kevin Tran
- University of Amsterdam, Faculty of Science, Data science, Amsterdam, The Netherlands
- Trixie A Katz
- Emma Children's Hospital Amsterdam UMC, location University of Amsterdam, Department of Neonatology, Meibergdreef 9, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development research institute, Amsterdam, The Netherlands
- Michelle Romijn
- Emma Children's Hospital Amsterdam UMC, location University of Amsterdam, Department of Neonatology, Meibergdreef 9, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development research institute, Amsterdam, The Netherlands
- Mark Hoogendoorn
- Vrije Universiteit Amsterdam, Faculty of Science, Department Computer Science, Quantitative Data Analytics Group, Amsterdam, The Netherlands
- Anton H van Kaam
- Emma Children's Hospital Amsterdam UMC, location University of Amsterdam, Department of Neonatology, Meibergdreef 9, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development research institute, Amsterdam, The Netherlands
- Aleid G Leemhuis
- Emma Children's Hospital Amsterdam UMC, location University of Amsterdam, Department of Neonatology, Meibergdreef 9, Amsterdam, The Netherlands
- Jaap Oosterlaan
- Emma Children's Hospital Amsterdam UMC, location University of Amsterdam, Follow-Me program & Emma Neuroscience group, Meibergdreef 9, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development research institute, Amsterdam, The Netherlands
- Marsh Königs
- Emma Children's Hospital Amsterdam UMC, location University of Amsterdam, Follow-Me program & Emma Neuroscience group, Meibergdreef 9, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development research institute, Amsterdam, The Netherlands
|
40
|
Lucero-Garófano Á, Aliena-Valero A, Vielba-Gómez I, Escudero-Martínez I, Morales-Caba L, Aparici-Robles F, Tarruella Hernández DL, Fortea G, Tembl JI, Salom JB, Manjón JV. Automatic etiological classification of stroke thrombus digital photographs using a deep learning model. Front Neurol 2025; 16:1534845. [PMID: 39897943 PMCID: PMC11782041 DOI: 10.3389/fneur.2025.1534845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2024] [Accepted: 01/03/2025] [Indexed: 02/04/2025] Open
Abstract
Background Etiological classification of ischemic stroke is fundamental for secondary prevention, but frequently results in undetermined cause. We aimed to develop a Deep Learning (DL)-based model for automatic etiological classification of ischemic stroke using digital images of thrombi retrieved by mechanical thrombectomy. Methods Patients with large vessel occlusion stroke subjected to mechanical thrombectomy between April 2016 and January 2023 at La Fe University and Polytechnic Hospital in Valencia were included. Thrombus digital images were obtained and clinical characteristics, including TOAST etiological classification as reference standard, were retrieved. Statistical analysis was performed to compare clinical characteristics between atherothrombotic and cardioembolic strokes. A DL method was designed based on two deep neural networks for: (1) image segmentation and (2) image classification including clinical characteristics. The metrics used were DICE coefficient for the segmentation network, and accuracy, precision, sensitivity, specificity and area under the curve (AUC) for the predictions of the classification network. Results A total of 166 patients (mean age 69 [SD, 13], 67 female) were included. TOAST classification was: 31 atherothrombotic, 87 cardioembolic, and 48 cryptogenic. The segmentation network achieved an average DICE coefficient of 0.96 [SD, 0.13]. The optimal fused imaging and clinical classification network had a 0.968 accuracy [95% CI, 0.935-0.994], and AUC of 0.947 [95% CI, 0.870-1]. Cryptogenic thrombi were classified as cardioembolic (96%) or atherothrombotic (4%). Conclusion Two convolutional neural networks perform the automatic segmentation of thrombus images and, combined with selected clinical characteristics, their accurate and precise classification into atherothrombotic or cardioembolic etiology in patients with acute ischemic stroke.
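The segmentation metric named above, the DICE coefficient, measures the overlap between a predicted mask and a reference mask. The following is a minimal sketch, assuming binary masks flattened to sequences of 0/1; the function name and the toy masks are hypothetical and are not the authors' implementation.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two flat binary masks (0/1 values)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks for illustration only: 2 overlapping pixels, 3 foreground pixels each.
predicted = [1, 1, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0]
dice = dice_coefficient(predicted, reference)
print(f"Dice = {dice:.3f}")
```

A value of 1 means the masks coincide exactly and 0 means they share no foreground pixels.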
Affiliation(s)
- Álvaro Lucero-Garófano
- Unidad Mixta de Investigación Cerebrovascular, Instituto de Investigación Sanitaria La Fe, Valencia, Spain
- Instituto de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas (ITACA), Universitat Politècnica de València, Valencia, Spain
- Alicia Aliena-Valero
- Unidad Mixta de Investigación Cerebrovascular, Instituto de Investigación Sanitaria La Fe, Valencia, Spain
- Isabel Vielba-Gómez
- Unidad Mixta de Investigación Cerebrovascular, Instituto de Investigación Sanitaria La Fe, Valencia, Spain
- Unidad de Ictus, Servicio de Neurología, Hospital Universitario y Politécnico La Fe, Valencia, Spain
- Irene Escudero-Martínez
- Unidad Mixta de Investigación Cerebrovascular, Instituto de Investigación Sanitaria La Fe, Valencia, Spain
- Unidad de Ictus, Servicio de Neurología, Hospital Universitario y Politécnico La Fe, Valencia, Spain
- Lluís Morales-Caba
- Unidad Mixta de Investigación Cerebrovascular, Instituto de Investigación Sanitaria La Fe, Valencia, Spain
- Unidad de Ictus, Servicio de Neurología, Hospital Universitario y Politécnico La Fe, Valencia, Spain
- Fernando Aparici-Robles
- Unidad Mixta de Investigación Cerebrovascular, Instituto de Investigación Sanitaria La Fe, Valencia, Spain
- Servicio de Radiología, Hospital Universitario y Politécnico La Fe, Valencia, Spain
- Diana L. Tarruella Hernández
- Unidad Mixta de Investigación Cerebrovascular, Instituto de Investigación Sanitaria La Fe, Valencia, Spain
- Unidad de Ictus, Servicio de Neurología, Hospital Universitario y Politécnico La Fe, Valencia, Spain
- Gerardo Fortea
- Unidad Mixta de Investigación Cerebrovascular, Instituto de Investigación Sanitaria La Fe, Valencia, Spain
- Unidad de Ictus, Servicio de Neurología, Hospital Universitario y Politécnico La Fe, Valencia, Spain
- José I. Tembl
- Unidad Mixta de Investigación Cerebrovascular, Instituto de Investigación Sanitaria La Fe, Valencia, Spain
- Unidad de Ictus, Servicio de Neurología, Hospital Universitario y Politécnico La Fe, Valencia, Spain
- Juan B. Salom
- Unidad Mixta de Investigación Cerebrovascular, Instituto de Investigación Sanitaria La Fe, Valencia, Spain
- Departamento de Fisiología, Universitat de València, Valencia, Spain
- José V. Manjón
- Instituto de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas (ITACA), Universitat Politècnica de València, Valencia, Spain
|
41
|
Kelly BS, Duignan S, Mathur P, Dillon H, Lee EH, Yeom KW, Keane PA, Lawlor A, Killeen RP. Can ChatGPT4-vision identify radiologic progression of multiple sclerosis on brain MRI? Eur Radiol Exp 2025; 9:9. [PMID: 39812885 PMCID: PMC11735712 DOI: 10.1186/s41747-024-00547-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2024] [Accepted: 12/16/2024] [Indexed: 01/16/2025] Open
Abstract
BACKGROUND The large language model ChatGPT can now accept image input with the GPT4-vision (GPT4V) version. We aimed to compare the performance of GPT4V to pretrained U-Net and vision transformer (ViT) models for the identification of the progression of multiple sclerosis (MS) on magnetic resonance imaging (MRI). METHODS Paired coregistered MR images with and without progression were provided as input to ChatGPT4V in a zero-shot experiment to identify radiologic progression. Its performance was compared to pretrained U-Net and ViT models. Accuracy was the primary evaluation metric and 95% confidence intervals (CIs) were calculated by bootstrapping. We included 170 patients with MS (50 males, 120 females), aged 21-74 years (mean 42.3), imaged at a single institution from 2019 to 2021, each with 2-5 MRI studies (496 in total). RESULTS One hundred seventy patients were included, 110 for training, 30 for tuning, and 30 for testing; 100 unseen paired images were randomly selected from the test set for evaluation. Both U-Net and ViT had 94% (95% CI: 89-98%) accuracy while GPT4V had 85% (77-91%). GPT4V gave cautious nonanswers in six cases. GPT4V had precision (specificity), recall (sensitivity), and F1 score of 89% (75-93%), 92% (82-98%), and 91% (82-97%), compared to 100% (100-100%), 88% (78-96%), and 94% (88-98%) for U-Net and 94% (87-100%), 94% (88-100%), and 94% (89-98%) for ViT. CONCLUSION The performance of GPT4V combined with its accessibility suggests it has the potential to impact AI radiology research. However, misclassified cases and overly cautious nonanswers confirm that it is not yet ready for clinical use. RELEVANCE STATEMENT GPT4V can identify the radiologic progression of MS in a simplified experimental setting. However, GPT4V is not a medical device, and its widespread availability highlights the need for caution and education for lay users, especially those with limited access to expert healthcare.
KEY POINTS Without fine-tuning or the need for prior coding experience, GPT4V can perform a zero-shot radiologic change detection task with reasonable accuracy. However, in absolute terms, in a simplified "spot the difference" medical imaging task, GPT4V was inferior to state-of-the-art computer vision methods. GPT4V's performance metrics were more similar to the ViT than the U-Net. This is an exploratory experimental study and GPT4V is not intended for use as a medical device.
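The abstract states that 95% CIs were calculated by bootstrapping. The following is a minimal sketch of a percentile bootstrap for accuracy, assuming per-case correctness is stored as 0/1; the function name, the number of resamples, and the toy data (85 correct out of 100) are illustrative assumptions and not the authors' code.

```python
import random


def bootstrap_accuracy_ci(correct, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for accuracy.

    correct: list of 0/1 flags, one per evaluated case (1 = correct prediction).
    Returns (accuracy, lower, upper).
    """
    rng = random.Random(seed)
    n = len(correct)
    # Resample cases with replacement and recompute accuracy each time.
    stats = sorted(sum(rng.choices(correct, k=n)) / n for _ in range(n_boot))
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return sum(correct) / n, lower, upper


# Toy data for illustration only: 85 correct answers out of 100 paired images.
flags = [1] * 85 + [0] * 15
acc, lo, hi = bootstrap_accuracy_ci(flags)
print(f"accuracy={acc:.2f}, 95% CI=({lo:.2f}, {hi:.2f})")
```

The exact bounds depend on the random seed and the number of resamples, so they are not expected to reproduce the published interval.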
Affiliation(s)
- Brendan S Kelly
- St Vincent's University Hospital, Dublin, Ireland
- Insight Centre for Data Analytics, UCD, Dublin, Ireland
- Wellcome Trust-HRB, Irish Clinical Academic Training, Dublin, Ireland
- School of Medicine, University College Dublin, Dublin, Ireland
- Henry Dillon
- St Vincent's University Hospital, Dublin, Ireland
- Edward H Lee
- Lucille Packard Children's Hospital at Stanford, Stanford, CA, USA
- Kristen W Yeom
- Lucille Packard Children's Hospital at Stanford, Stanford, CA, USA
- Ronan P Killeen
- St Vincent's University Hospital, Dublin, Ireland
- School of Medicine, University College Dublin, Dublin, Ireland
|
42
|
Mao A, Su J, Ren M, Chen S, Zhang H. Risk prediction models for falls in hospitalized older patients: a systematic review and meta-analysis. BMC Geriatr 2025; 25:29. [PMID: 39810076 PMCID: PMC11730783 DOI: 10.1186/s12877-025-05688-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Accepted: 01/07/2025] [Indexed: 01/16/2025] Open
Abstract
BACKGROUND Existing fall risk assessment tools in clinical settings often lack accuracy. Although an increasing number of fall risk prediction models have been developed for hospitalized older patients in recent years, it remains unclear how useful these models are for clinical practice and future research. OBJECTIVES To systematically review published studies of fall risk prediction models for hospitalized older adults. METHODS The Web of Science, PubMed, Cochrane Library, CINAHL, MEDLINE, and Embase databases were searched from their inception until January 11, 2024, to retrieve studies of predictive models related to falls in hospitalized older adults. Data extracted from the included studies covered study design, data sources, sample size, predictors, and model development and performance. Risk of bias and applicability were assessed using the Prediction Model Risk of Bias Assessment Tool (PROBAST) checklist. RESULTS A total of 8086 studies were retrieved, and after screening, 13 prediction models from 13 studies were included. Four models were externally validated. Eight models reported discrimination metrics and two models reported calibration metrics. The most common predictors of falls were mobility, fall history, medications, and psychiatric disorders. All studies indicated a high risk of bias, primarily due to inadequate study design and methodological flaws. The AUC values of 8 models ranged from 0.630 to 0.851. CONCLUSIONS In the present study, all included studies had a high risk of bias, primarily due to the lack of prospective study design, inappropriate data analysis, and the absence of robust external validation. Future studies should prioritize the use of rigorous methodologies for the external validation of fall risk prediction models in hospitalized older adults. TRIAL REGISTRATION The study was registered in the International Database of Prospectively Registered Systematic Reviews (PROSPERO) CRD42024503718.
Affiliation(s)
- Anli Mao
- Department of Nursing, the Fourth Affiliated Hospital of School of Medicine, and International School of Medicine, International Institutes of Medicine, Zhejiang University, Yiwu, 322000, China
- Jie Su
- Department of Nursing, the Fourth Affiliated Hospital of School of Medicine, and International School of Medicine, International Institutes of Medicine, Zhejiang University, Yiwu, 322000, China
- Mingzhu Ren
- Department of Nursing, the Fourth Affiliated Hospital of School of Medicine, and International School of Medicine, International Institutes of Medicine, Zhejiang University, Yiwu, 322000, China
- Shuying Chen
- Department of Nursing, the Fourth Affiliated Hospital of School of Medicine, and International School of Medicine, International Institutes of Medicine, Zhejiang University, Yiwu, 322000, China
- Huafang Zhang
- Department of Nursing, the Fourth Affiliated Hospital of School of Medicine, and International School of Medicine, International Institutes of Medicine, Zhejiang University, Yiwu, 322000, China
|
43
|
Li X, Shu Q, Kong C, Wang J, Li G, Fang X, Lou X, Yu G. An Intelligent System for Classifying Patient Complaints Using Machine Learning and Natural Language Processing: Development and Validation Study. J Med Internet Res 2025; 27:e55721. [PMID: 39778195 PMCID: PMC11754990 DOI: 10.2196/55721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 04/28/2024] [Accepted: 11/04/2024] [Indexed: 01/11/2025] Open
Abstract
BACKGROUND Accurate classification of patient complaints is crucial for enhancing patient satisfaction management in health care settings. Traditional manual methods for categorizing complaints often lack efficiency and precision. Thus, there is a growing demand for advanced and automated approaches to streamline the classification process. OBJECTIVE This study aimed to develop and validate an intelligent system for automatically classifying patient complaints using machine learning (ML) and natural language processing (NLP) techniques. METHODS An ML-based NLP technology was proposed to extract frequently occurring dissatisfactory words related to departments, staff, and key treatment procedures. A dataset containing 1465 complaint records from 2019 to 2023 was used for training and validation, with an additional 376 complaints from Hangzhou Cancer Hospital serving as an external test set. Complaints were categorized into 4 types: communication problems, diagnosis and treatment issues, management problems, and sense of responsibility concerns. The imbalanced data were balanced using the Synthetic Minority Oversampling Technique (SMOTE) algorithm to ensure equal representation across all categories. A total of 3 ML algorithms (Multifactor Logistic Regression, Multinomial Naive Bayes, and Support Vector Machines [SVM]) were used for model training and validation. The best-performing model was tested using a 5-fold cross-validation on external data. RESULTS The original dataset consisted of 719, 376, 260, and 86 records for communication problems, diagnosis and treatment issues, management problems, and sense of responsibility concerns, respectively. The Multifactor Logistic Regression and SVM models achieved weighted average accuracies of 0.89 and 0.93 in the training set, and 0.83 and 0.87 in the internal test set, respectively.
Ngram-level term frequency-inverse document frequency (n=2) did not significantly improve classification performance, yielding only a marginal 1% increase in precision, recall, and F1-score (from 0.91 to 0.92). The SVM algorithm performed best in prediction, achieving an average accuracy of 0.91 on the external test set with a 95% CI of 0.87-0.97. CONCLUSIONS The NLP-driven SVM algorithm demonstrates effective classification performance in automatically categorizing patient complaint texts. It showed superior performance in both internal and external test sets for communication and management problems. However, caution is advised when using it for classifying sense of responsibility complaints. This approach holds promise for implementation in medical institutions with high complaint volumes and limited resources for addressing patient feedback.
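The class balancing step described above relies on SMOTE, which creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The following is a simplified sketch of that idea on toy 2-D points; the function name, parameters, and data are hypothetical, and the code is neither the authors' pipeline nor a reference SMOTE implementation.

```python
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples.

    minority: list of feature tuples (at least two points).
    k: number of nearest minority neighbours considered for each base point.
    """
    rng = random.Random(seed)

    def dist2(a, b):
        # Squared Euclidean distance between two feature tuples.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist2(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Toy minority class for illustration only (real inputs would be text feature vectors).
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority_points, n_new=6)
print(len(new_points), "synthetic samples")
```

Because every synthetic point lies on a segment between two existing minority points, oversampling is typically applied to the training split only, so that evaluation data remain unmodified.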
Affiliation(s)
- Xiadong Li
- Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center For Child Health, Hang Zhou, China
- Qiang Shu
- Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center For Child Health, Hang Zhou, China
- Canhong Kong
- Patient Service Surveillance Office, Medical Information Department, Hangzhou Red Cross Hospital, Hang Zhou, China
- Jinhu Wang
- Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center For Child Health, Hang Zhou, China
- Gang Li
- Department of Radiation Oncology, Zhe Jiang Xiaoshan hospital, Hangzhou Normal University, Hang Zhou, China
- Xin Fang
- Hospital Management Office, Hangzhou Cancer Hospital, Hang Zhou, China
- Xiaomin Lou
- Patient Service Surveillance Office, Hangzhou Red Cross Hospital, Hang Zhou, China
- Gang Yu
- Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center For Child Health, Hang Zhou, China
|
44
|
Yang X, Li Z, Lei L, Shi X, Zhang D, Zhou F, Li W, Xu T, Liu X, Wang S, Yuan Q, Yang J, Wang X, Zhong Y, Yu L. Noninvasive Oral Hyperspectral Imaging-Driven Digital Diagnosis of Heart Failure With Preserved Ejection Fraction: Model Development and Validation Study. J Med Internet Res 2025; 27:e67256. [PMID: 39773415 PMCID: PMC11751651 DOI: 10.2196/67256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2024] [Revised: 12/04/2024] [Accepted: 12/20/2024] [Indexed: 01/11/2025] Open
Abstract
BACKGROUND Oral microenvironmental disorders are associated with an increased risk of heart failure with preserved ejection fraction (HFpEF). Hyperspectral imaging (HSI) technology enables the detection of substances that are visually indistinguishable to the human eye, providing a noninvasive approach with extensive applications in medical diagnostics. OBJECTIVE The objective of this study is to develop and validate a digital, noninvasive oral diagnostic model for patients with HFpEF using HSI combined with various machine learning algorithms. METHODS Between April 2023 and August 2023, a total of 140 patients were recruited from Renmin Hospital of Wuhan University to serve as the training and internal testing groups for this study. Subsequently, from August 2024 to September 2024, an additional 35 patients were enrolled from Three Gorges University and Yichang Central People's Hospital to constitute the external testing group. After preprocessing to ensure image quality, spectral and textural features were extracted from the images. We extracted 25 spectral bands from each patient image and obtained 8 corresponding texture features to evaluate the performance of 28 machine learning algorithms for their ability to distinguish control participants from participants with HFpEF. The model demonstrating the optimal performance in both internal and external testing groups was selected to construct the HFpEF diagnostic model. Hyperspectral bands significant for identifying participants with HFpEF were identified for further interpretative analysis. The Shapley Additive Explanations (SHAP) model was used to provide analytical insights into feature importance. RESULTS Participants were divided into a training group (n=105), internal testing group (n=35), and external testing group (n=35), with consistent baseline characteristics across groups. 
Among the 28 algorithms tested, the random forest algorithm demonstrated superior performance with an area under the receiver operating characteristic curve (AUC) of 0.884 and an accuracy of 82.9% in the internal testing group, as well as an AUC of 0.812 and an accuracy of 85.7% in the external testing group. For model interpretation, we used the top 25 features identified by the random forest algorithm. The SHAP analysis revealed discernible distinctions between control participants and participants with HFpEF, thereby validating the diagnostic model's capacity to accurately identify participants with HFpEF. CONCLUSIONS This noninvasive and efficient model facilitates the identification of individuals with HFpEF, thereby promoting early detection, diagnosis, and treatment. Our research presents a clinically advanced diagnostic framework for HFpEF, validated using independent data sets and demonstrating significant potential to enhance patient care. TRIAL REGISTRATION China Clinical Trial Registry ChiCTR2300078855; https://www.chictr.org.cn/showproj.html?proj=207133.
Affiliation(s)
- Xiaomeng Yang
- Cardiovascular Hospital, Renmin Hospital of Wuhan University, Wuhan, China
- Hubei Key Laboratory of Autonomic Nervous System Modulation, Wuhan University, Wuhan, China
- Cardiac Autonomic Nervous System Research Center, Wuhan University, Wuhan, China
- Zeyan Li
- Cardiovascular Hospital, Renmin Hospital of Wuhan University, Wuhan, China
- Hubei Key Laboratory of Autonomic Nervous System Modulation, Wuhan University, Wuhan, China
- Cardiac Autonomic Nervous System Research Center, Wuhan University, Wuhan, China
- Lei Lei
- State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China
- Xiaoyu Shi
- Cardiovascular Hospital, Renmin Hospital of Wuhan University, Wuhan, China
- Hubei Key Laboratory of Autonomic Nervous System Modulation, Wuhan University, Wuhan, China
- Cardiac Autonomic Nervous System Research Center, Wuhan University, Wuhan, China
- Dingming Zhang
- College of Geomatics, Xi'an University of Science and Technology, Xi'an, China
- Fei Zhou
- Department of Cardiology, The First College of Clinical Medical Science, Yichang Central People's Hospital, Yichang, China
- Hubei Key Laboratory of Ischemic Cardiovascular Disease, China Three Gorges University, Yichang, China
- Wenjing Li
- Department of Cardiology, The First College of Clinical Medical Science, Yichang Central People's Hospital, Yichang, China
- Hubei Key Laboratory of Ischemic Cardiovascular Disease, China Three Gorges University, Yichang, China
- Tianyou Xu
- Cardiovascular Hospital, Renmin Hospital of Wuhan University, Wuhan, China
- Hubei Key Laboratory of Autonomic Nervous System Modulation, Wuhan University, Wuhan, China
- Cardiac Autonomic Nervous System Research Center, Wuhan University, Wuhan, China
- Xinyu Liu
- Cardiovascular Hospital, Renmin Hospital of Wuhan University, Wuhan, China
- Hubei Key Laboratory of Autonomic Nervous System Modulation, Wuhan University, Wuhan, China
- Cardiac Autonomic Nervous System Research Center, Wuhan University, Wuhan, China
- Songyun Wang
- Cardiovascular Hospital, Renmin Hospital of Wuhan University, Wuhan, China
- Hubei Key Laboratory of Autonomic Nervous System Modulation, Wuhan University, Wuhan, China
- Cardiac Autonomic Nervous System Research Center, Wuhan University, Wuhan, China
- Medical Remote Sensing Information Cross-Institute, Wuhan University, Wuhan, China
- Quan Yuan
- College of Chemistry and Molecular Sciences, Key Laboratory of Biomedical Polymers of Ministry of Education, Wuhan University, Wuhan, China
- Institute of Molecular Medicine, Renmin Hospital of Wuhan University, Wuhan, China
- Jian Yang
- Department of Cardiology, The First College of Clinical Medical Science, Yichang Central People's Hospital, Yichang, China
- Hubei Key Laboratory of Ischemic Cardiovascular Disease, China Three Gorges University, Yichang, China
- Xinyu Wang
- Medical Remote Sensing Information Cross-Institute, Wuhan University, Wuhan, China
- School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China
- Yanfei Zhong
- State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China
- Medical Remote Sensing Information Cross-Institute, Wuhan University, Wuhan, China
- Lilei Yu
- Cardiovascular Hospital, Renmin Hospital of Wuhan University, Wuhan, China
- Hubei Key Laboratory of Autonomic Nervous System Modulation, Wuhan University, Wuhan, China
- Cardiac Autonomic Nervous System Research Center, Wuhan University, Wuhan, China
- Medical Remote Sensing Information Cross-Institute, Wuhan University, Wuhan, China
|
45
|
Clark SL, Hartwell EE, Choi DS, Krystal JH, Messing RO, Ferguson LB. Next-generation biomarkers for alcohol consumption and alcohol use disorder diagnosis, prognosis, and treatment: A critical review. Alcohol, Clinical & Experimental Research 2025; 49:5-24. [PMID: 39532676 PMCID: PMC11747793 DOI: 10.1111/acer.15476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Revised: 10/04/2024] [Accepted: 10/14/2024] [Indexed: 11/16/2024]
Abstract
This critical review summarizes the current state of omics-based biomarkers in the alcohol research field. We first provide definitions and background information on alcohol and alcohol use disorder (AUD), biomarkers, and "omic" technologies. We next summarize the use of (1) genetic information as risk/prognostic biomarkers for the onset of alcohol-related problems and the progression from regular drinking to problematic drinking (including AUD), (2) epigenetic information as diagnostic biomarkers for AUD and risk biomarkers for alcohol consumption, (3) transcriptomic information as diagnostic biomarkers for AUD and risk biomarkers for alcohol consumption, and (4) metabolomic information as diagnostic biomarkers for AUD, risk biomarkers for alcohol consumption, and predictive biomarkers for response to acamprosate in subjects with AUD. In the final section, the clinical implications of the findings are discussed, and recommendations are made for future research.
Affiliation(s)
- Shaunna L. Clark
- Department of Psychiatry & Behavioral Sciences, Texas A&M University, College Station, TX, USA
- Emily E. Hartwell
- Mental Illness Research, Education and Clinical Center, Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
- Center for Studies of Addiction, Department of Psychiatry, Perelman School of Medicine of the University of Pennsylvania, Philadelphia, PA, USA
- Doo-Sup Choi
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic College of Medicine and Science, Rochester, MN, USA
- Department of Psychiatry and Psychology, Mayo Clinic College of Medicine and Science, Rochester, MN, USA
- Neuroscience Program, Mayo Clinic College of Medicine and Science, Rochester, MN, USA
- John H. Krystal
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
- Robert O. Messing
- Waggoner Center for Alcohol and Addiction Research, University of Texas at Austin, Austin, Texas, USA
- Department of Neurology, Dell Medical School, University of Texas at Austin, Austin, Texas, USA
- Department of Neuroscience, University of Texas at Austin, Austin, Texas, USA
- Laura B. Ferguson
- Waggoner Center for Alcohol and Addiction Research, University of Texas at Austin, Austin, Texas, USA
- Department of Neurology, Dell Medical School, University of Texas at Austin, Austin, Texas, USA
- Department of Neuroscience, University of Texas at Austin, Austin, Texas, USA
|
46
|
Song W, Frakes D, Dasi LP. Active Machine Learning for Pre-procedural Prediction of Time-Varying Boundary Condition After Fontan Procedure Using Generative Adversarial Networks. Ann Biomed Eng 2025; 53:217-229. [PMID: 39480609 DOI: 10.1007/s10439-024-03640-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Accepted: 10/15/2024] [Indexed: 11/02/2024]
Abstract
The Fontan procedure is the definitive palliation for pediatric patients born with single ventricles. Surgical planning for the Fontan procedure has emerged as a promising vehicle toward optimizing outcomes, where pre-operative measurements are used prospectively as post-operative boundary conditions for simulation. Nevertheless, actual post-operative measurements can be very different from pre-operative states, which raises questions about the accuracy of surgical planning. The goal of this study is to apply machine learning techniques to describe pre-operative and post-operative vena caval flow conditions in Fontan patients in order to develop predictions of post-operative boundary conditions to be used in surgical planning. Based on a virtual cohort synthesized by lumped-parameter models, we proposed a novel diversity-aware generative adversarial active learning framework to train predictive deep neural networks on the very limited number of cases generally available in cardiovascular studies. Results of 14 groups of experiments uniquely combining different data query strategies, metrics, and data augmentation options with generative adversarial networks demonstrated that the highest overall prediction accuracy and coefficient of determination were exhibited by the proposed method. This framework serves as a first step toward deep learning for cardiovascular flow prediction/regression with reduced labeling requirements and augmented learning space.
Affiliation(s)
- Wenyuan Song
- School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Lakshmi Prasad Dasi
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
47
|
Kwon S, Chung S, Lee SR, Kim K, Kim J, Baek D, Yang HL, Choi EK, Oh S. Prediction of reduced left ventricular ejection fraction using atrial fibrillation or flutter electrocardiograms: A machine-learning study. Digit Health 2025; 11:20552076241311460. [PMID: 39839953 PMCID: PMC11748079 DOI: 10.1177/20552076241311460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2024] [Accepted: 12/16/2024] [Indexed: 01/23/2025] Open
Abstract
Objective Although the evaluation of left ventricular ejection fraction (LVEF) in patients with atrial fibrillation (AF) or atrial flutter (AFL) is crucial for appropriate medical management, the prediction of reduced LVEF (<50%) with AF/AFL electrocardiograms (ECGs) lacks evidence. This study aimed to investigate deep-learning approaches to predict reduced LVEF (<50%) in patients with AF/AFL ECGs and easily obtainable clinical information. Methods Patients with 12-lead ECGs of AF/AFL and echocardiography were divided into those with LVEF <50% and ≥50%. A convolutional neural network-based model customized to the study (AFibEFNet) and other deep-learning models were investigated. Electrocardiogram signals, ECG features, and clinical features (demographic information, comorbidities, blood cell counts, and blood test results) were collected for training. A hold-out test dataset was constructed using a different recruitment period. Five-fold cross-validation and calibration plots were used to evaluate performance. Results A total of 15,683 patients were analyzed (mean age, 70.0 ± 11.7 years; 61.2% men), with 82.2% having LVEF ≥50% and 17.8% having LVEF <50%. Among the learning models, the AFibEFNet outperformed other models regarding area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), and F1-score. Using ECG signals alone, the AFibEFNet model predicted reduced LVEF with AUROC of 0.798 (95% confidence interval [CI], 0.767-0.829) and AUPRC of 0.508 (95% CI, 0.434-0.564). For the AFibEFNet model, additional training with ECG and clinical features significantly improved AUROC (0.816 vs. 0.798, p = 0.04) and AUPRC (0.547 vs. 0.508, p < 0.001). The AFibEFNet model primarily focused on the R-wave, QRS onset and offset, and T-wave in ECG signals. Conclusions In patients with AF/AFL, machine learning may predict reduced LVEF from 12-lead ECGs recorded during AF/AFL.
Affiliation(s)
- Soonil Kwon
- Division of Cardiology, Department of Internal Medicine, SMG–SNU Boramae Medical Center, Seoul, Republic of Korea
- SooMin Chung
- Interdisciplinary Program in Bioengineering, Seoul National University, Seoul, Republic of Korea
- So-Ryoung Lee
- Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Department of Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Kwangsoo Kim
- Department of Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Department of Transdisciplinary Medicine, Institute of Convergence Medicine with Innovative Technology, Seoul National University Hospital, Seoul, Republic of Korea
- Junmo Kim
- Interdisciplinary Program in Bioengineering, Seoul National University, Seoul, Republic of Korea
- Dahyeon Baek
- Industrial and Management Engineering, POSTECH, Pohang, Republic of Korea
- Hyun-Lim Yang
- Office of Hospital Information, Seoul National University Hospital, Seoul, Republic of Korea
- Eue-Keun Choi
- Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Department of Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Seil Oh
- Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Department of Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
48
|
Zhang Z, Zhu M, Jiang W. Risk Factors Analysis of Cutaneous Adverse Drug Reactions Caused by Targeted Therapy and Immunotherapy Drugs for Oncology and Establishment of a Prediction Model. Clin Transl Sci 2025; 18:e70118. [PMID: 39757364 DOI: 10.1111/cts.70118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2024] [Revised: 12/12/2024] [Accepted: 12/15/2024] [Indexed: 01/07/2025] Open
Abstract
Targeted therapy and immunotherapy drugs for oncology have greater efficacy and tolerability than cytotoxic chemotherapeutic drugs. However, the cutaneous adverse drug reactions associated with these newer therapies are more common and remain poorly predicted. An effective prediction model is urgently needed. This retrospective study included 1052 patients, divided into a training set, a test set, and an external validation set. As a data-driven study, a total of 76 variables were collected. Univariate logistic analysis, least absolute shrinkage and selection operator regression, and stepwise logistic regression were utilized for feature screening. Finally, nine machine-learning models were constructed and compared, and grid search was performed to adjust the parameters. Model performance was evaluated using calibration curves and the area under the receiver operating characteristic curve (AUROC). Nine risk factors were eventually identified: age, treatment modality, cancer type, history of allergies, age-corrected Charlson comorbidity index, percentage of eosinophils, absolute number of monocytes, Eastern Cooperative Oncology Group Performance Status, and C-reactive protein. Among the models, the logistic model performed best, demonstrating strong performance in the test set (AUROC = 0.734) and the external validation set (AUROC = 0.817). This study identified nine significant risk factors and developed a nomogram prediction model. These findings have important implications for optimizing therapeutic efficacy and maintaining the quality of life of patients from the perspective of managing cutaneous adverse drug reactions. Trial Registration: ChiCTR2400088422.
Affiliation(s)
- Zimin Zhang
- Department of Pharmacy, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- College of Pharmacy, Chongqing Medical University, Chongqing, China
- Mingyang Zhu
- Department of Pharmacy, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Weiwei Jiang
- Department of Pharmacy, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
49
|
Zhou Z, Wang D, Sun J, Zhu M, Teng L. A Machine Learning-Based Prediction Model for the Probability of Fall Risk Among Chinese Community-Dwelling Older Adults. Comput Inform Nurs 2024; 42:913-921. [PMID: 39356834 DOI: 10.1097/cin.0000000000001202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2024]
Abstract
Falls are a common adverse event among older adults. This study aimed to identify essential fall factors and develop a machine learning-based prediction model to predict the fall risk category among community-dwelling older adults, leading to earlier intervention and better outcomes. Three prediction models (logistic regression, random forest, and naive Bayes) were constructed and evaluated. A total of 459 people were involved, including 156 participants (34.0%) with high fall risk. Seven independent predictors (frail status, age, smoking, heart attack, cerebrovascular disease, arthritis, and osteoporosis) were selected to develop the models. Among the three machine learning models, the logistic regression model had the best model fit, with the highest area under the curve (0.856), accuracy (0.797), and sensitivity (0.735) in the test set. The logistic regression model had excellent discrimination, calibration, and clinical decision-making ability, which could aid in accurately identifying high-risk groups and enabling early intervention.
Affiliation(s)
- Zhou Zhou
- Author Affiliations: Wuxi School of Medicine, Jiangnan University, Jiangsu (Mr Zhou; Mss Wang, Sun, and Zhu; and Dr Teng); Traditional Chinese Medicine Hospital of Qinghai Province, Xining, Qinghai (Ms Wang), China
50
|
Chrysafi P, Lam B, Carton S, Patell R. From Code to Clots: Applying Machine Learning to Clinical Aspects of Venous Thromboembolism Prevention, Diagnosis, and Management. Hamostaseologie 2024; 44:429-445. [PMID: 39657652 DOI: 10.1055/a-2415-8408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2024] Open
Abstract
The high incidence of venous thromboembolism (VTE) globally and the morbidity and mortality burden associated with the disease make it a pressing issue. Machine learning (ML) can improve VTE prevention, detection, and treatment. The ability of this novel technology to process large amounts of high-dimensional data can help identify new risk factors and better risk stratify patients for thromboprophylaxis. Applications of ML for VTE include systems that interpret medical imaging, assess the severity of the VTE, tailor treatment according to individual patient needs, and identify VTE cases to facilitate surveillance. Generative artificial intelligence may be leveraged to design new molecules such as new anticoagulants, generate synthetic data to expand datasets, and reduce clinical burden by assisting in generating clinical notes. Potential challenges in the applications of these novel technologies include the availability of multidimensional large datasets, prospective studies and clinical trials to ensure safety and efficacy, continuous quality assessment to maintain algorithm accuracy, mitigation of unwanted bias, and regulatory and legal guardrails to protect patients and providers. We propose a practical approach for clinicians to integrate ML into research, from choosing appropriate problems to integrating ML into clinical workflows. ML offers much promise and opportunity for clinicians and researchers in VTE to translate this technology into the clinic and directly benefit the patients.
Affiliation(s)
- Pavlina Chrysafi
- Department of Medicine, Mount Auburn Hospital, Harvard Medical School, Cambridge, Massachusetts, United States
- Barbara Lam
- Division of Hemostasis and Thrombosis, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States
- Division of Clinical Informatics, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States
- Samuel Carton
- Department of Computer Science, College of Engineering and Physical Sciences, University of New Hampshire, Durham, New Hampshire, United States
- Rushad Patell
- Division of Hemostasis and Thrombosis, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States