1
Velez T, Ibrahim Z, Duru K, Velez D, Triantafyllou M, McKinley K, Saif P, Kratimenos P, Clark A, Koutroulis I. Predicting hospital admissions, ICU utilization, and prolonged length of stay among febrile pediatric emergency department patients using incomplete and imbalanced electronic health record (EHR) data strategies. Int J Med Inform 2025; 200:105905. [PMID: 40203463 DOI: 10.1016/j.ijmedinf.2025.105905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2024] [Revised: 03/09/2025] [Accepted: 03/30/2025] [Indexed: 04/11/2025]
Abstract
OBJECTIVE Determine how commonly used approaches to handling missing and/or imbalanced electronic health record (EHR) data affect the performance of predictive models targeting risk of admission, intensive care unit (ICU) use, or prolonged length of stay (PLOS) among febrile pediatric patients presenting to the emergency department (ED). MATERIALS AND METHODS Historical ED EHR data were used to train a series of XGBoost (XGB) and logistic regression (LR) classifiers. Data handling strategies included imputation methods (multiple imputation (MI), median imputation, complete case (CC) analysis) and imbalanced data corrections (minority oversampling, stratified sub-group analysis). Model performance was evaluated using discriminative metrics (AUC, AUPRC) and calibration metrics (Brier score, Z-scores, p-values). RESULTS Among the study population, 34% were admitted, 2% utilized the ICU, and 7% had a PLOS. Significant data missingness was observed and determined to be missing not at random (MNAR). In predicting admissions using data recorded within the first two hours of presentation, LR trained on the full cohort with median imputation was comparable to MI, yielding well-calibrated admissions models with an AUC/AUPRC of 0.82/0.73, while CC analysis yielded an AUC/AUPRC of 0.76/0.78. XGB, trained with unimputed data, produced a well-calibrated admissions classifier with an AUC/AUPRC of 0.85/0.78. In contrast, imbalanced data correction techniques, including synthetic minority oversampling (SMOTE), risk stratification, or the use of XGB, did not significantly improve the poor AUPRC and calibration performance of LR models predicting ICU use and PLOS. CONCLUSION Both XGB and LR with median imputation demonstrated robust performance in predicting admissions in the presence of missing data. However, deriving clinically useful models for rare outcomes, such as ICU use or PLOS, remains a challenge due to poor precision/recall and calibration performance. Further research is needed to improve the prediction of rare outcomes in this population.
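To make the data handling strategies named in this abstract concrete, the following is a minimal sketch of median imputation, complete case filtering, and the Brier score, written with the Python standard library only. It is not the authors' pipeline: the function names, the toy rows, and the predicted probabilities are hypothetical, and the sketch assumes every column has at least one observed value.

```python
from statistics import median


def median_impute(rows):
    """Replace None in each column with the median of that column's observed values."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        medians.append(median(observed))  # assumes at least one observed value
    return [[medians[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def complete_cases(rows, labels):
    """Keep only the rows (and their labels) that have no missing value."""
    kept = [(r, y) for r, y in zip(rows, labels) if None not in r]
    return [r for r, _ in kept], [y for _, y in kept]


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data: columns are (temperature in Celsius, heart rate).
rows = [[39.1, 150], [38.4, None], [None, 120], [40.0, 170]]
labels = [1, 0, 0, 1]

imputed = median_impute(rows)
cc_rows, cc_labels = complete_cases(rows, labels)
toy_brier = brier_score([1, 0], [0.8, 0.3])  # hypothetical predictions
```

In this toy example, complete case analysis discards half of the rows and keeps only admitted patients, which hints at how dropping incomplete records can distort a cohort when missingness is not random.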
Affiliation(s)
- Tom Velez
- Computer Technology Associates, Cardiff, CA, United States
- Zara Ibrahim
- Department of Pediatrics, Children's National Hospital, Washington, DC, United States
- Kanayo Duru
- Department of Pediatrics, Children's National Hospital, Washington, DC, United States; Brown University, Providence, RI, United States
- Dante Velez
- Department of Pediatrics, Children's National Hospital, Washington, DC, United States
- Maria Triantafyllou
- Center for Genetic Medicine Research, Children's National Research Institute, Washington, DC, United States
- Kenneth McKinley
- Department of Pediatrics, Children's National Hospital, Washington, DC, United States; George Washington University School of Medicine and Health Sciences, Washington, DC, United States
- Pasha Saif
- Virginia Tech Carilion School of Medicine, Roanoke, VA, United States
- Panagiotis Kratimenos
- Department of Pediatrics, Children's National Hospital, Washington, DC, United States; George Washington University School of Medicine and Health Sciences, Washington, DC, United States
- Andy Clark
- Computer Technology Associates, Cardiff, CA, United States
- Ioannis Koutroulis
- Department of Pediatrics, Children's National Hospital, Washington, DC, United States; Center for Genetic Medicine Research, Children's National Research Institute, Washington, DC, United States; George Washington University School of Medicine and Health Sciences, Washington, DC, United States.
2
Azenkot T, Rivera DR, Stewart MD, Patel SP. Artificial Intelligence and Machine Learning Innovations to Improve Design and Representativeness in Oncology Clinical Trials. Am Soc Clin Oncol Educ Book 2025; 45:e473590. [PMID: 40403202 DOI: 10.1200/edbk-25-473590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2025]
Abstract
The integration of artificial intelligence (AI) and machine learning (ML) in oncology clinical trials is rapidly evolving alongside the broader field. For example, AI-driven adaptive trial designs may allow for real-time modifications based on emerging safety and efficacy signals, enabling more responsive and efficient trials. AI-powered diagnostic tools, including radiomics, computational pathology, and spatial omics, can improve trial patient selection and response assessments. ML-based patient outcome simulations can similarly enhance patient stratification strategies and statistical power. Application of AI can also improve the accessibility of real-world data, including opportunities to enhance data extraction, standardization, and harmonization of data from routine clinical practice. Data generated from digital health technologies (eg, wearable devices, electronic sensors, computing platforms, software applications) may enable a more comprehensive understanding of patient populations to support clinical trials from enrollment to assessment. Automation of trial operations and data management can also improve data fidelity and decrease investigator burden, which has the potential to streamline trial execution and increase potential use of decentralization. There are ongoing efforts to enhance regulatory clarity, mitigate bias, and uphold ethical use of these novel technologies. In this article, we review use cases of AI and ML in oncology clinical trials, including their role in patient recruitment, trial design and operations, data management, and diagnostics. Although these technologies can have applications across all phases of drug development including early discovery, we focus on phase II and III trials, where AI and ML may have a pronounced ability to enhance trial efficiency, patient stratification, and regulatory decision making. 
By integrating AI and ML, clinical trials can become more adaptive, data-driven, and inclusive in the pursuit of improving patient outcomes.
Affiliation(s)
- Tali Azenkot
- University of California at San Diego Moores Cancer Center, La Jolla, CA
- Donna R Rivera
- Oncology Center of Excellence, US Food and Drug Administration, Silver Spring, MD
- Sandip P Patel
- University of California at San Diego Moores Cancer Center, La Jolla, CA
3
Xie XY, Huang LY, Liu D, Cheng GR, Hu FF, Zhou J, Zhang JJ, Han GB, Geng JW, Liu XC, Wang JY, Zeng DY, Liu J, Nie QQ, Song D, Li SY, Cai C, Cui YY, Xu L, Ou YM, Chen XX, Zhou YL, Chen YS, Li JQ, Wei Z, Wu Q, Mei YF, Song SJ, Tan W, Zhao QH, Ding D, Zeng Y. Predicting Progression to Dementia Using Auditory Verbal Learning Test in Community-Dwelling Older Adults Based On Machine Learning. Am J Geriatr Psychiatry 2025; 33:487-499. [PMID: 39645504 DOI: 10.1016/j.jagp.2024.10.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 10/22/2024] [Accepted: 10/30/2024] [Indexed: 12/09/2024]
Abstract
BACKGROUND Primary healthcare institutions find identifying individuals with dementia particularly challenging. This study aimed to develop machine learning models that identify features predicting progression to dementia in older adults with normal cognition. METHODS We developed four machine learning models (logistic regression, decision tree, random forest, and gradient-boosted trees) to predict dementia in 1,162 older adults with normal cognition at baseline from the Hubei Memory and Aging Cohort Study. All relevant variables collected were included in the models. The Shanghai Aging Study was selected as a replication cohort (n = 1,370) to validate the performance of models that included the key features retained by a wrapper feature selection technique. Both cohorts adopted diagnostic criteria for dementia comparable to those of most previous cohort studies. RESULTS The random forest model exhibited slightly better predictive power using a series of auditory verbal learning test measures, education, and follow-up time, as measured by overall accuracy (93%) and area under the curve (AUC) (mean [standard error]: 0.88 [0.07]). When assessed in the external validation cohort, its performance was deemed acceptable, with an AUC of 0.81 (0.15). Conversely, the logistic regression model showed better results in the external validation set, attaining an AUC of 0.88 (0.20). CONCLUSION Our machine learning framework offers a viable strategy for predicting dementia using only memory tests in primary healthcare settings. This model can track cognitive changes and provide valuable insights for early intervention.
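The abstract mentions a wrapper feature selection technique without naming the variant. As one common form, the sketch below shows greedy forward selection, which repeatedly adds the feature that most improves a model score. This is an assumption for illustration, not the authors' procedure: the feature names, the gain values, and `toy_score` are hypothetical, and in practice the scoring function would be something like a cross-validated AUC.

```python
def forward_select(candidates, score_fn, max_features):
    """Greedy forward wrapper selection.

    At each step, try adding every unused feature, keep the one that gives
    the highest score, and stop when no feature improves the score.
    """
    selected = []
    best_score = score_fn(selected)
    while len(selected) < max_features:
        trials = [(score_fn(selected + [f]), f)
                  for f in candidates if f not in selected]
        if not trials:
            break
        top_score, top_feature = max(trials)
        if top_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        best_score = top_score
    return selected, best_score


# Hypothetical per-feature gains standing in for a real model evaluation.
GAINS = {
    "avlt_delayed_recall": 0.20,
    "education": 0.08,
    "follow_up_time": 0.05,
    "shoe_size": 0.0,
}


def toy_score(features):
    """Toy additive score; a real wrapper would refit and evaluate a model."""
    return 0.5 + sum(GAINS[f] for f in features)


chosen, final_score = forward_select(list(GAINS), toy_score, max_features=4)
```

The uninformative feature is never selected because adding it does not raise the score, which is the behavior a wrapper method relies on to keep a model compact.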
Affiliation(s)
- Xin-Yan Xie
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan; Geriatric Hospital Affiliated to Wuhan University of Science and Technology (XYX, DL, GRC, FFH, LX, YMO, XXC, YLZ, JQL, QW, YFM, WT, YZ), Wuhan; School of Public Health (XYX, DL, LX, YMO, YSC, JQL, ZW, YZ), Wuhan University of Science and Technology, Wuhan
- Lin-Ya Huang
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Dan Liu
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan; Geriatric Hospital Affiliated to Wuhan University of Science and Technology (XYX, DL, GRC, FFH, LX, YMO, XXC, YLZ, JQL, QW, YFM, WT, YZ), Wuhan; School of Public Health (XYX, DL, LX, YMO, YSC, JQL, ZW, YZ), Wuhan University of Science and Technology, Wuhan
- Gui-Rong Cheng
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan; Geriatric Hospital Affiliated to Wuhan University of Science and Technology (XYX, DL, GRC, FFH, LX, YMO, XXC, YLZ, JQL, QW, YFM, WT, YZ), Wuhan; School of Public Health (XYX, DL, LX, YMO, YSC, JQL, ZW, YZ), Wuhan University of Science and Technology, Wuhan
- Fei-Fei Hu
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan; Geriatric Hospital Affiliated to Wuhan University of Science and Technology (XYX, DL, GRC, FFH, LX, YMO, XXC, YLZ, JQL, QW, YFM, WT, YZ), Wuhan; School of Public Health (XYX, DL, LX, YMO, YSC, JQL, ZW, YZ), Wuhan University of Science and Technology, Wuhan
- Juan Zhou
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Jing-Jing Zhang
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Gang-Bin Han
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Jing-Wen Geng
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Xiao-Chang Liu
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Jun-Yi Wang
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- De-Yang Zeng
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Jing Liu
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Qian-Qian Nie
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Dan Song
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Shi-Yue Li
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Cheng Cai
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Yu-Yang Cui
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Lang Xu
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan; Geriatric Hospital Affiliated to Wuhan University of Science and Technology (XYX, DL, GRC, FFH, LX, YMO, XXC, YLZ, JQL, QW, YFM, WT, YZ), Wuhan; School of Public Health (XYX, DL, LX, YMO, YSC, JQL, ZW, YZ), Wuhan University of Science and Technology, Wuhan
- Yang-Ming Ou
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan; Geriatric Hospital Affiliated to Wuhan University of Science and Technology (XYX, DL, GRC, FFH, LX, YMO, XXC, YLZ, JQL, QW, YFM, WT, YZ), Wuhan; School of Public Health (XYX, DL, LX, YMO, YSC, JQL, ZW, YZ), Wuhan University of Science and Technology, Wuhan
- Xing-Xing Chen
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan; Geriatric Hospital Affiliated to Wuhan University of Science and Technology (XYX, DL, GRC, FFH, LX, YMO, XXC, YLZ, JQL, QW, YFM, WT, YZ), Wuhan; School of Public Health (XYX, DL, LX, YMO, YSC, JQL, ZW, YZ), Wuhan University of Science and Technology, Wuhan
- Yan-Ling Zhou
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan; Geriatric Hospital Affiliated to Wuhan University of Science and Technology (XYX, DL, GRC, FFH, LX, YMO, XXC, YLZ, JQL, QW, YFM, WT, YZ), Wuhan
- Yu-Shan Chen
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Jin-Quan Li
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan; Geriatric Hospital Affiliated to Wuhan University of Science and Technology (XYX, DL, GRC, FFH, LX, YMO, XXC, YLZ, JQL, QW, YFM, WT, YZ), Wuhan; School of Public Health (XYX, DL, LX, YMO, YSC, JQL, ZW, YZ), Wuhan University of Science and Technology, Wuhan
- Zhen Wei
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan
- Qiong Wu
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan; School of Public Health (XYX, DL, LX, YMO, YSC, JQL, ZW, YZ), Wuhan University of Science and Technology, Wuhan
- Yu-Fei Mei
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan; Geriatric Hospital Affiliated to Wuhan University of Science and Technology (XYX, DL, GRC, FFH, LX, YMO, XXC, YLZ, JQL, QW, YFM, WT, YZ), Wuhan; School of Public Health (XYX, DL, LX, YMO, YSC, JQL, ZW, YZ), Wuhan University of Science and Technology, Wuhan
- Shao-Jun Song
- Reproductive Medicine Center (SJS), Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan
- Wei Tan
- Geriatric Hospital Affiliated to Wuhan University of Science and Technology (XYX, DL, GRC, FFH, LX, YMO, XXC, YLZ, JQL, QW, YFM, WT, YZ), Wuhan; School of Public Health (XYX, DL, LX, YMO, YSC, JQL, ZW, YZ), Wuhan University of Science and Technology, Wuhan
- Qian-Hua Zhao
- Department of Neurology (QHZ, DD), Huashan Hospital, Fudan University, Shanghai; National Center for Neurological Disorders (QHZ, DD), Huashan Hospital, Fudan University, Shanghai; National Clinical Research Center for Aging and Medicine (QHZ, DD), Huashan Hospital, Fudan University, Shanghai
- Ding Ding
- Department of Neurology (QHZ, DD), Huashan Hospital, Fudan University, Shanghai; National Center for Neurological Disorders (QHZ, DD), Huashan Hospital, Fudan University, Shanghai; National Clinical Research Center for Aging and Medicine (QHZ, DD), Huashan Hospital, Fudan University, Shanghai.
- Yan Zeng
- Hubei Provincial Clinical Research Center for Alzheimer's Disease (XYX, LYH, DL, GRC, FFH, JZ, JJZ, GBH, JWG, XCL, JYW, DYZ, JL, QQN, DS, SYL, CC, YYC, LX, YMO, XXC, YLZ, YSC, JQL, ZW, QW, YFM, YZ), Tian You Hospital Affiliated to Wuhan University of Science and Technology, Wuhan; Geriatric Hospital Affiliated to Wuhan University of Science and Technology (XYX, DL, GRC, FFH, LX, YMO, XXC, YLZ, JQL, QW, YFM, WT, YZ), Wuhan; School of Public Health (XYX, DL, LX, YMO, YSC, JQL, ZW, YZ), Wuhan University of Science and Technology, Wuhan.
4
Sola J, Arderiu A, Almeida TP, Fallet S, Yazdani S, Haddad S, Perruchoud D, Grossenbacher O, Shah J. The quest for blood pressure markers in photoplethysmography and its applications in digital health. Front Digit Health 2025; 7:1518322. [PMID: 40370706 PMCID: PMC12075524 DOI: 10.3389/fdgth.2025.1518322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2024] [Accepted: 04/14/2025] [Indexed: 05/16/2025] Open
Abstract
Introduction Photoplethysmography (PPG) sensors, capturing optical signals from arterial pulses, are debated for their potential in blood pressure (BP) measurement. This study employed the largest dataset to date of paired PPG and cuff BP readings to explore PPG signals for BP estimation. Methods 32,152 European residents (age 55.9 ± 11.8 years, 24% female, BMI 27.7 ± 4.6) voluntarily acquired and used a cuffless BP monitor (Aktiia SA, Switzerland) between March 2021 and March 2023. Systolic and diastolic BP (SBP, DBP) from an upper arm oscillometric cuff were collected simultaneously with wrist PPG (668,080 paired measurements). Six different machine learning models were developed to predict BP using cuff BP readings as reference (75%|15%|15% training|validation|testing): four baseline models [heart rate (HR), Age, Demography (DEM: Age + Gender + BMI), DEM + HR], and two models relying on the analysis of the PPG waveforms (PPG, PPG + DEM). Performance of each model was evaluated on the 4,823 subjects from the testing set using as metrics Pearson's correlation (r) between the estimated and the reference BP values, the area under the receiver operating characteristic (AUROC) curve, and true positive and true negative rates (TPR, TNR) for the detection of high BP (reference SBP ≥ 140 or DBP ≥ 90 mmHg, applying a ± 8 mmHg exclusion zone to account for cuff measurement uncertainty). Results Baseline models showed low correlation with cuff data and poor high BP detection (r < 0.35; AUROC < 0.65, TPR < 0.65, TNR < 0.58). PPG-based models excelled in correlating with cuff BP (SBP: r = 0.53 for PPG, r = 0.63 for PPG + DEM; DBP: r = 0.58 for PPG, r = 0.67 for PPG + DEM) and high BP detection (SBP: AUROC = 0.84, TPR = TNR = 0.75; DBP: AUROC = 0.89, TPR = TNR = 0.81 for PPG; SBP: AUROC = 0.89, TPR = TNR = 0.80; DBP: AUROC = 0.93, TPR = TNR = 0.86 for PPG + DEM). Discussion This study demonstrated that PPG signals contain reliable markers of BP, and that BP values can be estimated using only markers found within PPG's optical pulsatility signals, outperforming models based solely on demographic data. These findings hold the potential to radically transform hypertension screening and global healthcare delivery, paving the way for innovative approaches in patient diagnosis, monitoring and treatment methodologies.
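The evaluation metrics described in this abstract can be sketched as follows: Pearson's correlation between estimated and reference values, and TPR/TNR for high SBP detection with an exclusion zone around the cutoff. This is an illustrative reading, not the authors' code. It assumes the exclusion zone drops pairs whose reference reading lies within 8 mmHg of the 140 mmHg cutoff and that the same cutoff is applied to the estimate; the paper may instead pick an operating point that equalizes TPR and TNR. The readings are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def high_bp_rates(reference, estimated, threshold=140.0, zone=8.0):
    """Return (TPR, TNR) for detecting reference >= threshold.

    Pairs whose reference lies within `zone` of the threshold are skipped
    (assumed interpretation of the exclusion zone). Assumes at least one
    positive and one negative pair remain.
    """
    tp = fn = tn = fp = 0
    for ref, est in zip(reference, estimated):
        if abs(ref - threshold) < zone:
            continue  # borderline reference, excluded from evaluation
        predicted_high = est >= threshold
        if ref >= threshold:
            tp += predicted_high
            fn += not predicted_high
        else:
            tn += not predicted_high
            fp += predicted_high
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical paired SBP readings in mmHg (cuff reference, PPG estimate).
reference_sbp = [120, 150, 138, 160, 125, 155]
estimated_sbp = [118, 152, 145, 135, 142, 149]

r = pearson_r(reference_sbp, estimated_sbp)
tpr, tnr = high_bp_rates(reference_sbp, estimated_sbp)
```

Here the pair with a reference of 138 mmHg is excluded, so the rates are computed on the five remaining pairs.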
5
Lv S, Sun N, Hao C, Li J, Li Y. Development and validation of machine learning models for predicting post-cesarean pain and individualized pain management strategies: a multicenter study. BMC Anesthesiol 2025; 25:170. [PMID: 40211131 PMCID: PMC11983914 DOI: 10.1186/s12871-025-03034-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2024] [Accepted: 03/28/2025] [Indexed: 04/12/2025] Open
Abstract
BACKGROUND Effective management of postoperative pain remains a significant challenge in obstetric care due to the variability in pain perception and response influenced by physical, medical, and psychosocial factors. Current standardized pain management protocols often fail to accommodate this variability, necessitating more tailored approaches. OBJECTIVE This study aims to improve postoperative pain management following cesarean sections by developing personalized protocols using machine learning (ML) models. METHOD The study analyzed the efficacy of eight ML models, including XGBoost, Random Forest, and Neural Networks, using data from two distinct hospital cohorts. Performance metrics such as Root Mean Squared Error (RMSE) and Coefficient of Determination (R²) were evaluated through internal and external validations. SHAP value analysis was used to identify key predictors influencing pain management outcomes. RESULTS The XGBoost model demonstrated superior performance, achieving the lowest RMSE and highest R². Key factors impacting pain management included esketamine use, anesthesia method, and anesthetic drug type, with esketamine significantly delaying the first activation of patient-controlled intravenous analgesia (PCIA). CONCLUSIONS The study highlights the potential of machine learning to refine postoperative pain management strategies in obstetric care, suggesting that personalized approaches, particularly incorporating esketamine and specific anesthesia methods, could enhance patient outcomes. TRIAL REGISTRATION Not applicable.
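For readers less familiar with the two regression metrics named in this abstract, the sketch below computes RMSE and the coefficient of determination from their standard definitions. The values are hypothetical (for example, hours until first PCIA activation) and are not taken from the study.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 8.0]

toy_rmse = rmse(observed, predicted)
toy_r2 = r_squared(observed, predicted)
```

A lower RMSE and a higher R² both indicate a closer fit, which is why the abstract reports them together when ranking models.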
Affiliation(s)
- Shenjuan Lv
- Department of Anesthesiology, Jinan Second Maternal and Child Health Hospital, Shandong, China
- Ning Sun
- Ultrasound Department, Jinan Second Maternal and Child Health Hospital, Shandong, China
- Chunhui Hao
- Department of Anesthesiology, Jinan Second Maternal and Child Health Hospital, Shandong, China.
- Junqing Li
- Ultrasound Department, Jinan Second Maternal and Child Health Hospital, Shandong, China
- Yun Li
- Department of Pain Management, Provincial Hospital Affiliated to Shandong First Medical University, Shandong, China
6
Johnson M, Tao P, Burcu M, Kang J, Baumgartner R, Ma J, Svetnik V. Creating a Proxy for Baseline Eastern Cooperative Oncology Group Performance Status in Electronic Health Records for Comparative Effectiveness Research in Advanced Non-Small Cell Lung Cancer. JCO Clin Cancer Inform 2025; 9:e2400185. [PMID: 40179336 DOI: 10.1200/cci-24-00185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2024] [Revised: 12/20/2024] [Accepted: 02/12/2025] [Indexed: 04/05/2025] Open
Abstract
PURPOSE Eastern Cooperative Oncology Group performance status (ECOG PS) is a key confounder in comparative effectiveness research, predicting treatment and survival, but is often incomplete in electronic health records (EHRs). Imputation on the basis of classification metrics alone may introduce differences in survival between patients with known and imputed ECOG PS, complicating comparative effectiveness research. We developed an approach to impute ECOG PS so that those with known and imputed ECOG PS are indistinguishable in their survival, reducing potential biases introduced by the imputation. METHODS We analyzed deidentified data from an EHR-derived database for patients with advanced non-small cell lung cancer (aNSCLC) at their first line of treatment. Our novel imputation method involved (1) sample-splitting patients with known ECOG PS into modeling and thresholding data sets, (2) developing a predictive model of ECOG PS, (3) determining an optimal threshold aligning clinical outcomes, where a choice of outcome metric may depend on the use case, and (4) applying the model and threshold to impute missing ECOG PS. We evaluated the approach using binary classification metrics and alignment of survival metrics between observed and imputed ECOG PS. RESULTS Of 62,101 patients, 13,297 (21%) had missing ECOG PS at the start of their first treatment. Our method achieved similar or better performance in accuracy (73.3%), sensitivity (42.4%), and specificity (81%) compared with other techniques, with smaller survival metric differences between observed and imputed ECOG PS, with differences of 0.07 in hazard ratio, -0.36 months in median survival for good ECOG PS (<2), and -0.39 months for poor ECOG PS (≥2). CONCLUSION Our imputed ECOG PS aligning clinical outcomes enhanced the use of real-world EHR data of patients with aNSCLC for comparative effectiveness research.
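Step (3) of the method, choosing a threshold that aligns clinical outcomes, can be illustrated with a small sketch. It assumes one possible alignment criterion: pick the score cutoff that minimizes the summed absolute difference in median survival between the observed and imputed groups. The abstract says the outcome metric may depend on the use case, so this criterion, the function names, and the toy data are all hypothetical, and the sketch ignores censoring, which a real survival analysis would handle (for example with Kaplan-Meier estimates).

```python
from statistics import median


def survival_gap(scores, observed_poor, survival, threshold):
    """Summed absolute median-survival gap between observed and imputed groups."""
    imputed_poor = [s >= threshold for s in scores]
    gap = 0.0
    for group in (True, False):  # poor ECOG PS, then good ECOG PS
        obs = [t for t, flag in zip(survival, observed_poor) if flag == group]
        imp = [t for t, flag in zip(survival, imputed_poor) if flag == group]
        if not obs or not imp:
            return float("inf")  # this threshold leaves a group empty
        gap += abs(median(obs) - median(imp))
    return gap


def choose_threshold(scores, observed_poor, survival, candidates):
    """Pick the candidate cutoff with the smallest survival gap."""
    return min(candidates,
               key=lambda c: survival_gap(scores, observed_poor, survival, c))


# Hypothetical thresholding set: model scores, known status, survival in months.
scores = [0.1, 0.2, 0.4, 0.6, 0.7, 0.9]
observed_poor = [False, False, False, True, True, True]
survival = [20, 18, 16, 8, 6, 4]

best = choose_threshold(scores, observed_poor, survival, [0.3, 0.5, 0.8])
```

The chosen cutoff would then be applied, together with the fitted model, to patients whose ECOG PS is missing, which corresponds to step (4) in the abstract.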
Affiliation(s)
- Michael Johnson
- Biostatistics and Research Decision Sciences, Merck & Co, Inc, Rahway, NJ
- Peining Tao
- Biostatistics and Research Decision Sciences, Merck & Co, Inc, Rahway, NJ
- Mehmet Burcu
- Biostatistics and Research Decision Sciences, Merck & Co, Inc, Rahway, NJ
- Epidemiology, Merck & Co, Inc, Rahway, NJ
- John Kang
- Biostatistics and Research Decision Sciences, Merck & Co, Inc, Rahway, NJ
- Junshui Ma
- Biostatistics and Research Decision Sciences, Merck & Co, Inc, Rahway, NJ
- Vladimir Svetnik
- Biostatistics and Research Decision Sciences, Merck & Co, Inc, Rahway, NJ
7
Winicki NM, Radomski SN, Ciftci Y, Johnston FM, Greer JB. Predicting Postoperative Infection After Cytoreductive Surgery and Hyperthermic Intraperitoneal Chemotherapy with Splenectomy. Ann Surg Oncol 2025; 32:2903-2911. [PMID: 39841336 DOI: 10.1245/s10434-024-16728-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/26/2024] [Accepted: 12/05/2024] [Indexed: 01/23/2025]
Abstract
BACKGROUND Hematologic changes after splenectomy and hyperthermic intraperitoneal chemotherapy (HIPEC) can complicate postoperative assessment of infection. This study aimed to develop a machine-learning model to predict postoperative infection after cytoreductive surgery (CRS) and HIPEC with splenectomy. METHODS The study enrolled patients in the national TriNetX database and at the Johns Hopkins Hospital (JHH) who underwent splenectomy during CRS/HIPEC from 2010 to 2024. Demographics, comorbidities, vital signs, daily laboratory values, and documented infections were collected. The patients were divided into infected and non-infected cohorts within 14 days postoperatively. Extreme gradient boosting (XGBoost) was used to predict postoperative infection. An initial model was generated using the TriNetX dataset and externally validated in the JHH cohort. RESULTS From TriNetX, 1016 patients were included: 802 in the non-infected group (79%) and 214 (21%) in the postoperative infection group. The mean age was 61 ± 13 years, and 597 (56%) of the patients were female. Most of the patients underwent CRS/HIPEC with splenectomy for appendiceal cancer (n = 590, 56%), followed by colorectal malignancy (n = 299, 29%). The remainder (n = 127, 15%) underwent CRS/HIPEC with splenectomy for gastric, pancreatic, ovarian, and small bowel malignancies or peritoneal mesothelioma. In detecting any infection, XGBoost exhibited excellent prediction accuracy (area under the receiver operating characteristic curve [AUC], 0.910 ± 0.073; F1 score, 0.915 ± 0.040) and retained high accuracy upon external validation with 96 demographically similar JHH patients (AUC, 0.823 ± 0.08; F1 score, 0.864 ± 0.03). CONCLUSION A novel machine-learning algorithm was developed to predict postoperative infection after CRS/HIPEC with splenectomy that could aid in the early diagnosis and initiation of treatment.
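The two headline metrics in this abstract, AUC and F1 score, can be computed from their definitions as in the sketch below. The AUC here uses the pairwise (Mann-Whitney) formulation, which is quadratic in sample size and meant for illustration only. The labels, scores, and the 0.45 cutoff are hypothetical and unrelated to the study data.

```python
def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall, from confusion counts."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical infection labels and model scores.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
predictions = [1 if s >= 0.45 else 0 for s in scores]  # hypothetical cutoff

toy_auc = auc(labels, scores)
toy_f1 = f1_score(labels, predictions)
```

AUC summarizes ranking quality across all cutoffs, while F1 depends on the single cutoff chosen, so the two can diverge on the same model.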
Affiliation(s)
- Nolan M Winicki
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Shannon N Radomski
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Yusuf Ciftci
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Fabian M Johnston
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Jonathan B Greer
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Division of Gastrointestinal Surgical Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
8
|
Duan M, Geng Z, Gao L, Zhao Y, Li Z, Chen L, Kuosmanen P, Qi G, Gong F, Yu G. An interpretable machine learning-assisted diagnostic model for Kawasaki disease in children. Sci Rep 2025; 15:7927. [PMID: 40050685 PMCID: PMC11885592 DOI: 10.1038/s41598-025-92277-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2024] [Accepted: 02/26/2025] [Indexed: 03/09/2025]
Abstract
Kawasaki disease (KD) is a syndrome of acute systemic vasculitis commonly observed in children. Due to its unclear pathogenesis and the lack of specific diagnostic markers, it is prone to being confused with other diseases that exhibit similar symptoms, making early and accurate diagnosis challenging. This study aimed to develop an interpretable machine learning (ML) diagnostic model for KD. We collected demographic and laboratory data from 3650 patients (2299 with KD, 1351 with similar symptoms but different diseases) and employed 10 ML algorithms to construct the diagnostic model. Diagnostic performance was evaluated using several metrics, including area under the receiver-operating characteristic curve (AUC). Additionally, the Shapley additive explanations (SHAP) method was employed to select important features and explain the final model. Using the Streamlit framework, we converted the model into a user-friendly web application to enhance its practicality in clinical settings. Among the 10 ML algorithms, XGBoost demonstrated the best diagnostic performance, achieving an AUC of 0.9833. SHAP analysis revealed that features including age in months, fibrinogen, and human interferon gamma were important for diagnosis. When relying on the top 10 most important features, the model's AUC remained at 0.9757. The proposed model can assist clinicians in making early and accurate diagnoses of KD. Furthermore, its interpretability enhances model transparency, facilitating clinicians' understanding of prediction reliability.
Affiliation(s)
- Mengyu Duan
- National Clinical Research Center for Child Health, National Children's Regional Medical Center, Children's Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Sino-Finland Joint AI Laboratory for Child Health of Zhejiang Province, Hangzhou, China
| | - Zhimin Geng
- National Clinical Research Center for Child Health, National Children's Regional Medical Center, Children's Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Lichao Gao
- National Clinical Research Center for Child Health, National Children's Regional Medical Center, Children's Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Yonggen Zhao
- National Clinical Research Center for Child Health, National Children's Regional Medical Center, Children's Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Sino-Finland Joint AI Laboratory for Child Health of Zhejiang Province, Hangzhou, China
| | - Zheming Li
- National Clinical Research Center for Child Health, National Children's Regional Medical Center, Children's Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Sino-Finland Joint AI Laboratory for Child Health of Zhejiang Province, Hangzhou, China
| | - Lindong Chen
- National Clinical Research Center for Child Health, National Children's Regional Medical Center, Children's Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Sino-Finland Joint AI Laboratory for Child Health of Zhejiang Province, Hangzhou, China
| | - Pekka Kuosmanen
- Sino-Finland Joint AI Laboratory for Child Health of Zhejiang Province, Hangzhou, China
- Avaintec Oy, Helsinki, Finland
| | - Guoqiang Qi
- National Clinical Research Center for Child Health, National Children's Regional Medical Center, Children's Hospital, Zhejiang University School of Medicine, Hangzhou, China.
- Sino-Finland Joint AI Laboratory for Child Health of Zhejiang Province, Hangzhou, China.
| | - Fangqi Gong
- National Clinical Research Center for Child Health, National Children's Regional Medical Center, Children's Hospital, Zhejiang University School of Medicine, Hangzhou, China.
| | - Gang Yu
- National Clinical Research Center for Child Health, National Children's Regional Medical Center, Children's Hospital, Zhejiang University School of Medicine, Hangzhou, China.
- Sino-Finland Joint AI Laboratory for Child Health of Zhejiang Province, Hangzhou, China.
9
|
Lin B, Liu J, Li K, Zhong X. Predicting the Risk of HIV Infection and Sexually Transmitted Diseases Among Men Who Have Sex With Men: Cross-Sectional Study Using Multiple Machine Learning Approaches. J Med Internet Res 2025; 27:e59101. [PMID: 39977856 PMCID: PMC11888048 DOI: 10.2196/59101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Revised: 12/12/2024] [Accepted: 01/02/2025] [Indexed: 02/22/2025]
Abstract
BACKGROUND Men who have sex with men (MSM) are at high risk for HIV infection and sexually transmitted diseases (STDs). However, there is a lack of accurate and convenient tools to assess this risk. OBJECTIVE This study aimed to develop machine learning models and tools to predict and assess the risk of HIV infection and STDs among MSM. METHODS We conducted a cross-sectional study that collected individual characteristics of 1999 MSM with negative or unknown HIV serostatus in Western China from 2013 to 2023. MSM self-reported their STD history and were tested for HIV. We compared the accuracy of 6 machine learning methods in predicting the risk of HIV infection and STDs using 7 parameters for a comprehensive assessment, ranking the methods according to their performance in each parameter. We selected data from the Sichuan MSM for external validation. RESULTS Of the 1999 MSM, 72 (3.6%) tested positive for HIV and 146 (7.3%) self-reported a history of previous STD infection. After taking the results of the intersection of the 3 feature screening methods, a total of 7 and 5 predictors were screened for predicting HIV infection and STDs, respectively, and multiple machine learning prediction models were constructed. Extreme gradient boost models performed optimally in predicting the risk of HIV infection and STDs, with area under the curve values of 0.777 (95% CI 0.639-0.915) and 0.637 (95% CI 0.541-0.732), respectively, demonstrating stable performance in both internal and external validation. The highest combined predictive performance scores of HIV and STD models were 33 and 39, respectively. Interpretability analysis showed that nonadherence to condom use, low HIV knowledge, multiple male partners, and internet dating were risk factors for HIV infection. Low degree of education, internet dating, and multiple male and female partners were risk factors for STDs. 
The risk stratification analysis showed that the optimal model effectively distinguished between high- and low-risk MSM. MSM were classified into HIV (predicted risk score <0.506 and ≥0.506) and STD (predicted risk score <0.479 and ≥0.479) risk groups. In total, 22.8% (114/500) were in the HIV high-risk group, and 43% (215/500) were in the STD high-risk group. HIV infection and STDs were significantly higher in the high-risk groups (P<.001 and P=.05, respectively), with higher predicted probabilities (P<.001 for both). The prediction results of the optimal model were displayed in web applications for probability estimation and interactive computation. CONCLUSIONS Machine learning methods have demonstrated strengths in predicting the risk of HIV infection and STDs among MSM. Risk stratification models and web applications can facilitate clinicians in accurately assessing the risk of infection in individuals with high risk, especially MSM with concealed behaviors, and help them to self-monitor their risk for targeted, timely diagnosis and interventions to reduce new infections.
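The risk stratification described in this abstract amounts to thresholding a predicted risk score. A minimal sketch follows, using the two cut-offs quoted in the abstract (0.506 for HIV, 0.479 for STDs); the function names and the example scores are hypothetical and this is not the authors' implementation.

```python
# Cut-offs quoted in the abstract; everything else here is illustrative.
HIV_THRESHOLD = 0.506
STD_THRESHOLD = 0.479


def risk_group(score, threshold):
    """Label a predicted risk score as 'high' (>= threshold) or 'low'."""
    return "high" if score >= threshold else "low"


def high_risk_share(scores, threshold):
    """Fraction of individuals whose score falls in the high-risk group."""
    if not scores:
        raise ValueError("scores must not be empty")
    flagged = sum(1 for s in scores if s >= threshold)
    return flagged / len(scores)


# Hypothetical predicted HIV risk scores for four individuals.
example_scores = [0.10, 0.60, 0.50, 0.70]
example_groups = [risk_group(s, HIV_THRESHOLD) for s in example_scores]
example_share = high_risk_share(example_scores, HIV_THRESHOLD)
```

The group shares reported in the abstract follow the same arithmetic: 114/500 is 22.8% and 215/500 is 43%.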
Affiliation(s)
- Bing Lin
- School of Public Health, Chongqing Medical University, Chongqing, China
- Research Center for Medicine and Social Development, Chongqing Medical University, Chongqing, China
| | - Jiaxiu Liu
- School of Medical Informatics, Chongqing Medical University, Chongqing, China
| | - Kangjie Li
- School of Public Health, Chongqing Medical University, Chongqing, China
- Research Center for Medicine and Social Development, Chongqing Medical University, Chongqing, China
| | - Xiaoni Zhong
- School of Public Health, Chongqing Medical University, Chongqing, China
- Research Center for Medicine and Social Development, Chongqing Medical University, Chongqing, China
10
|
Warraich HJ, Tazbaz T, Califf RM. FDA Perspective on the Regulation of Artificial Intelligence in Health Care and Biomedicine. JAMA 2025; 333:241-247. [PMID: 39405330 DOI: 10.1001/jama.2024.21451] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Indexed: 01/22/2025]
Abstract
Importance Advances in artificial intelligence (AI) must be matched by efforts to better understand and evaluate how AI performs across health care and biomedicine as well as develop appropriate regulatory frameworks. This Special Communication reviews the history of the US Food and Drug Administration's (FDA) regulation of AI; presents potential uses of AI in medical product development, clinical research, and clinical care; and presents concepts that merit consideration as the regulatory system adapts to AI's unique challenges. Observations The FDA has authorized almost 1000 AI-enabled medical devices and has received hundreds of regulatory submissions for drugs that used AI in their discovery and development. Health AI regulation needs to be coordinated across all regulated industries, the US government, and with international organizations. Regulators will need to advance flexible mechanisms to keep up with the pace of change in AI across biomedicine and health care. Sponsors need to be transparent about and regulators need proficiency in evaluating the use of AI in premarket development. A life cycle management approach incorporating recurrent local postmarket performance monitoring should be central to health AI development. Special mechanisms to evaluate large language models and their uses are needed. Approaches are necessary to balance the needs of the entire spectrum of health ecosystem interests, from large firms to start-ups. The evaluation and regulatory system will need to focus on patient health outcomes to balance the use of AI for financial optimization for developers, payers, and health systems. Conclusions and Relevance Strong oversight by the FDA protects the long-term success of industries by focusing on evaluation to advance regulated technologies that improve health. The FDA will continue to play a central role in ensuring safe, effective, and trustworthy AI tools to improve the lives of patients and clinicians alike. 
However, all involved entities will need to attend to AI with the rigor this transformative technology merits.
Affiliation(s)
| | - Troy Tazbaz
- US Food and Drug Administration, Silver Spring, Maryland
11
|
卢 梓, 黄 方, 蔡 光, 刘 继, 甄 鑫. A multi-constraint representation learning model for identification of ovarian cancer with missing laboratory indicators. NAN FANG YI KE DA XUE XUE BAO = JOURNAL OF SOUTHERN MEDICAL UNIVERSITY 2025; 45:170-178. [PMID: 39819725 PMCID: PMC11744287 DOI: 10.12122/j.issn.1673-4254.2025.01.20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2024] [Indexed: 01/19/2025]
Abstract
OBJECTIVES To evaluate the performance of a multi-constraint representation learning classification model for identifying ovarian cancer with missing laboratory indicators. METHODS Tabular data with missing laboratory indicators were collected from 393 patients with ovarian cancer and 1951 control patients. The missing ovarian cancer laboratory indicator features were projected to the latent space to obtain a classification model using the representational learning classification model based on discriminative learning and mutual information coupled with feature projection significance score consistency and missing location estimation. The proposed constraint term was ablated experimentally to assess the feasibility and validity of the constraint term by accuracy, area under the ROC curve (AUC), sensitivity, and specificity. Cross-validation methods and accuracy, AUC, sensitivity and specificity were also used to evaluate the discriminative performance of this classification model in comparison with other interpolation methods for processing of the missing data. RESULTS The results of the ablation experiments showed good compatibility among the constraints, and each constraint had good robustness. The cross-validation experiment showed that for identification of ovarian cancer with missing laboratory indicators, the AUC, accuracy, sensitivity and specificity of the proposed multi-constraint representation learning classification model were 0.915, 0.888, 0.774, and 0.910, respectively, and its AUC and sensitivity were superior to those of other interpolation methods. CONCLUSIONS The proposed model has excellent discriminatory ability with better performance than other missing data interpolation methods for identification of ovarian cancer with missing laboratory indicators.
12
|
Magan D, Yadav RK, Aneja J, Pandey S. Association Between BMI and Neurocognitive Functions Among Middle-aged Obese Adults: Preliminary Findings Using Machine-learning (ML)-based Approach. Ann Neurosci 2025:09727531241307462. [PMID: 39834557 PMCID: PMC11742150 DOI: 10.1177/09727531241307462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2024] [Revised: 11/21/2024] [Accepted: 11/28/2024] [Indexed: 01/22/2025]
Abstract
Background Studies suggest that obesity predisposes individuals to developing cognitive dysfunction and an increased risk of dementia, but the nature of the relationship remains largely unexplored for better prognostic predictors. Purpose This study, the first of its kind in Indian participants with obesity, was intended to explore the use of quantification of different neurocognitive indices with increasing body mass index (BMI) among middle-aged participants with obesity. Additionally, machine-learning models were used to analyse the predictive performance of BMI for different cognitive functions. Methods In the cross-sectional analytical study, a total of 137 participants were included. Of these, 107 healthy obese participants (BMI = 23.0-30.0 kg/m²; aged 36 to 55 years, of both genders) were recruited from the out-patient department of the Department of Endocrinology and General Medicine, and 30 participants were recruited as the control group, between March 2023 and February 2024. The participants underwent neuropsychological assessments, including mini-mental state examination (MMSE), Montreal cognitive assessment (MoCA) and serum levels of brain-derived neurotrophic factor (BDNF). Results Significant (p < .05) differences were observed for neurocognitive functions for the obese group versus the control group. According to the correlation heatmaps, BMI was significantly (p < .05) negatively associated with BDNF. Multivariate linear regression analysis revealed a substantial (p < .05) decline in BDNF with a change in BMI, accenting its significant impact on cognitive ageing. Additionally, consistent decreasing trends were observed across the MoCA and MMSE, confirming the robustness of the findings across diverse analytical methodologies. Furthermore, the linear regression model and support vector machine model contributed additional evidence to the consistency of the trends in cognitive decline linked to BMI variations. 
Conclusion The preliminary results of the present study support that increased BMI is an important physiological indicator that influences neurocognition and neuroplasticity in individuals with obesity.
Affiliation(s)
- Dipti Magan
- Department of Physiology, All India Institute of Medical Sciences, Bathinda, Punjab, India
| | - Raj Kumar Yadav
- Department of Physiology, All India Institute of Medical Sciences, New Delhi, Delhi, India
| | - Jitender Aneja
- Department of Psychiatry, All India Institute of Medical Sciences, Bathinda, Punjab, India
| | - Shivam Pandey
- Department of Biostatistics, All India Institute of Medical Sciences, New Delhi, Delhi, India
13
|
Fan Z, Song W, Ke Y, Jia L, Li S, Li JJ, Zhang Y, Lin J, Wang B. XGBoost-SHAP-based interpretable diagnostic framework for knee osteoarthritis: a population-based retrospective cohort study. Arthritis Res Ther 2024; 26:213. [PMID: 39696605 DOI: 10.1186/s13075-024-03450-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2024] [Accepted: 12/01/2024] [Indexed: 12/20/2024]
Abstract
OBJECTIVE To use routine demographic and clinical data to develop an interpretable individual-level machine learning (ML) model to diagnose knee osteoarthritis (KOA) and to identify highly ranked features. METHODS In this retrospective, population-based cohort study, anonymized questionnaire data were retrieved from the Wu Chuan KOA Study, Inner Mongolia, China. After feature selection, participants were divided in a 7:3 ratio into training and test sets. Class balancing was applied to the training set for data augmentation. Four ML classifiers were compared by cross-validation within the training set and their performance was further analyzed with an unseen test set. Classifications were evaluated using sensitivity, specificity, positive predictive value, negative predictive value, accuracy, area under the curve (AUC), G-means, and F1 scores. The best model was explained using Shapley values to extract highly ranked features. RESULTS A total of 1188 participants were investigated in this study, among whom 26.3% were diagnosed with KOA. Comparatively, XGBoost with Boruta exhibited the highest classification performance among the four models, with an AUC of 0.758, G-means of 0.800, and F1 scores of 0.703. The SHAP method revealed the top 17 features of KOA according to the importance ranking, and the average experience of joint pain was recognized as the most important feature. CONCLUSIONS Our study highlights the usefulness of machine learning in unveiling important factors that influence the diagnosis of KOA to guide new prevention strategies. Further work is needed to validate this approach.
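Most of the evaluation metrics listed in this abstract can be derived from a single confusion matrix. The sketch below shows the usual formulas; it assumes the common definition of the G-mean as the geometric mean of sensitivity and specificity (the abstract does not define it), and the function name and the counts in the example are hypothetical.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Derive common binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Assumes every denominator below is non-zero.
    """
    sensitivity = tp / (tp + fn)           # recall, true-positive rate
    specificity = tn / (tn + fp)           # true-negative rate
    ppv = tp / (tp + fp)                   # positive predictive value (precision)
    npv = tn / (tn + fn)                   # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    g_mean = math.sqrt(sensitivity * specificity)  # assumed definition
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
        "g_mean": g_mean,
    }


# Hypothetical counts, not taken from the study.
example_metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
```

Unlike accuracy, the G-mean drops sharply when either class is poorly recognised, which is why it is often reported for imbalanced data such as a 26.3% prevalence.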
Affiliation(s)
- Zijuan Fan
- Department of Orthopaedic Surgery, The First Affiliated Hospital, Zhejiang University School of Medicine, Qingchun Road No. 79, Hangzhou, China
- Department of Health Statistics, School of Public Health, Sun Yat-sen University, Guangzhou, China
| | - Wenzhu Song
- Department of Big Data in Health Science School of Public Health, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
| | - Yan Ke
- Arthritis Clinic & Research Center, Peking University People's Hospital, Beijing, China
| | - Ligan Jia
- School of Computer Science and Technology, Xinjiang University, Urumchi, China
| | - Songyan Li
- Department of Orthopaedic Surgery, The First Affiliated Hospital, Zhejiang University School of Medicine, Qingchun Road No. 79, Hangzhou, China
| | - Jiao Jiao Li
- School of Biomedical Engineering, Faculty of Engineering and IT, University of Technology Sydney, Sydney, Australia
| | - Yuqing Zhang
- Harvard Medical School, Boston Massachusetts, USA
| | - Jianhao Lin
- Arthritis Clinic & Research Center, Peking University People's Hospital, Beijing, China.
| | - Bin Wang
- Department of Orthopaedic Surgery, The First Affiliated Hospital, Zhejiang University School of Medicine, Qingchun Road No. 79, Hangzhou, China.
14
|
Marko B, Palmowski L, Nowak H, Witowski A, Koos B, Rump K, Bergmann L, Bandow J, Eisenacher M, Günther P, Adamzik M, Sitek B, Rahmel T. Employing artificial intelligence for optimising antibiotic dosages in sepsis on intensive care unit: a study protocol for a prospective observational study (KI.SEP). BMJ Open 2024; 14:e086094. [PMID: 39672586 PMCID: PMC11647398 DOI: 10.1136/bmjopen-2024-086094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 11/22/2024] [Indexed: 12/15/2024]
Abstract
INTRODUCTION In sepsis treatment, achieving and maintaining effective antibiotic therapy is crucial. However, optimal antibiotic dosing faces challenges due to significant variability among patients with sepsis. Therapeutic drug monitoring (TDM), the current gold standard, lacks initial dosage adjustments and global availability. Even with daily TDM, antibiotic serum concentrations (ASCs) often deviate from the therapeutic range. This study addresses these challenges by developing machine learning (ML)-based ASC prediction models capable of handling variable data input and encompassing diverse clinical, laboratory, microbiological and proteomic parameters without the need for daily TDM. METHODS This prospective observational study is conducted in a German university hospital intensive care unit. Eligible sepsis patients receive continuous antibiotic therapy with piperacillin/tazobactam (n=100) or meropenem (n=100) within 24 hours. Exclusion criteria include refusal, pregnancy, lactation and severe anaemia (haemoglobin <8 g/dL). Blood samples for TDM are collected from patients, along with clinical and laboratory parameters on days 1-8 and day 30 or on discharge. ML models predicting ASC between day 1 and day 8 serve as primary and key secondary endpoints. We will use the collected data to develop multifaceted ML-based algorithms aimed at optimising antibiotic dosing in sepsis. Our two-way approach involves creating two distinct algorithms: the first focuses on predictive accuracy and generalisability using routine clinical parameters, while the second leverages an extended dataset including a plethora of factors that are currently insufficiently explored and not available in standard clinical practice but that may help to enhance precision. Ultimately, these models are envisioned for integration into clinical decision support systems within patient data management systems, facilitating automated, personalised treatment recommendations for sepsis. 
ETHICS AND DISSEMINATION The study received approval from the Ethics Committee of the Medical Faculty of Ruhr-University Bochum (No. 23-7905). Findings will be disseminated through open-access publication in a peer-reviewed journal and social media channels. TRIAL REGISTRATION NUMBER DRKS00032970.
Affiliation(s)
- Britta Marko
- Klinik für Anästhesiologie, Intensivmedizin und Schmerztherapie, Universitätsklinikum Knappschaftskrankenhaus Bochum GmbH, Bochum, Germany
| | - Lars Palmowski
- Klinik für Anästhesiologie, Intensivmedizin und Schmerztherapie, Universitätsklinikum Knappschaftskrankenhaus Bochum GmbH, Bochum, Germany
| | - Hartmuth Nowak
- Klinik für Anästhesiologie, Intensivmedizin und Schmerztherapie, Universitätsklinikum Knappschaftskrankenhaus Bochum GmbH, Bochum, Germany
- Zentrum für Künstliche Intelligenz, Medizininformatik und Datenwissenschaften, Universitätsklinikum Knappschaftskrankenhaus Bochum GmbH, Bochum, Germany
| | - Andrea Witowski
- Klinik für Anästhesiologie, Intensivmedizin und Schmerztherapie, Universitätsklinikum Knappschaftskrankenhaus Bochum GmbH, Bochum, Germany
| | - Björn Koos
- Klinik für Anästhesiologie, Intensivmedizin und Schmerztherapie, Universitätsklinikum Knappschaftskrankenhaus Bochum GmbH, Bochum, Germany
| | - Katharina Rump
- Klinik für Anästhesiologie, Intensivmedizin und Schmerztherapie, Universitätsklinikum Knappschaftskrankenhaus Bochum GmbH, Bochum, Germany
| | - Lars Bergmann
- Klinik für Anästhesiologie, Intensivmedizin und Schmerztherapie, Universitätsklinikum Knappschaftskrankenhaus Bochum GmbH, Bochum, Germany
| | - Julia Bandow
- Lehrstuhl für Angewandte Mikrobiologie, Ruhr-Universitat Bochum, Bochum, Germany
- Center für systembasierte Antibiotikaforschung (CESAR), Ruhr-Universitat Bochum, Bochum, Germany
| | - Martin Eisenacher
- Medizinisches Proteom-Center, Ruhr-Universitat Bochum Medizinische Fakultat, Bochum, Germany
- Zentrum für Proteindiagnostik (PRODI), Ruhr-Universitat Bochum, Bochum, Germany
| | | | - Michael Adamzik
- Klinik für Anästhesiologie, Intensivmedizin und Schmerztherapie, Universitätsklinikum Knappschaftskrankenhaus Bochum GmbH, Bochum, Germany
| | - Barbara Sitek
- Klinik für Anästhesiologie, Intensivmedizin und Schmerztherapie, Universitätsklinikum Knappschaftskrankenhaus Bochum GmbH, Bochum, Germany
- Medizinisches Proteom-Center, Ruhr-Universitat Bochum Medizinische Fakultat, Bochum, Germany
| | - Tim Rahmel
- Klinik für Anästhesiologie, Intensivmedizin und Schmerztherapie, Universitätsklinikum Knappschaftskrankenhaus Bochum GmbH, Bochum, Germany
15
|
Huang YC, Liu TC, Lu CJ. Establishing a machine learning dementia progression prediction model with multiple integrated data. BMC Med Res Methodol 2024; 24:288. [PMID: 39578765 PMCID: PMC11583646 DOI: 10.1186/s12874-024-02411-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Accepted: 11/08/2024] [Indexed: 11/24/2024]
Abstract
OBJECTIVE Dementia is a significant medical and social issue in most developed countries. Practical tools for predicting the progression of degenerative dementia are highly valuable. Machine learning (ML) methods facilitate the construction of effective models using real-world data, which may include missing values and various integrated datasets. METHOD This retrospective study analyzed data from 679 patients diagnosed with degenerative dementia at Fu Jen Catholic University Hospital, who were evaluated by neurologists and psychologists and followed for over two years. Predictive variables were categorized into demographic (D), clinical dementia rating (CDR), mini-mental state examination (MMSE), and laboratory data value (LV) groups. These categories were further integrated into three subgroups (D-CDR, D-CDR-MMSE, and D-CDR-MMSE-LV). We utilized the extreme gradient boosting (XGB) model to rank the importance of variables and identify the most effective feature combination via a step-wise approach. RESULT The D-CDR-MMSE-LV model combination showed robust performance with an excellent area under the receiver operating characteristic curve (AUC) and the highest sensitivity value (84.66). Employing both demographic and neuropsychiatric variables, our prediction model achieved an AUC of 83.74. By incorporating additional clinical information from laboratory data and applying our proposed feature selection strategy, we constructed a model based on eight variables that achieved an AUC of 85.12 using the XGB technique. CONCLUSION We established a machine-learning model to monitor the progression of dementia using a limited, real-world clinical dataset. The XGB technique identified eight critical variables across our integrated datasets, potentially providing clinicians with valuable guidance.
Affiliation(s)
- Yung-Chuan Huang
- Department of Neurology, Fu Jen Catholic University Hospital, Fu Jen Catholic University, New Taipei City, Taiwan
| | - Tzu-Chi Liu
- Graduate Institute of Business Administration, College of Management, Fu Jen Catholic University, New Taipei City, Taiwan
| | - Chi-Jie Lu
- Graduate Institute of Business Administration, College of Management, Fu Jen Catholic University, New Taipei City, Taiwan.
- Artificial Intelligence Development Center, Fu Jen Catholic University, New Taipei City, Taiwan.
- Department of Information Management, Fu Jen Catholic University, New Taipei City, Taiwan.
- Graduate Institute of Business Administration, Fu Jen Catholic University, No.510, Zhongzheng Rd., Xinzhuang Dist, New Taipei City, 242062, Taiwan.
16
|
Pham MK, Mai TT, Crane M, Ebiele M, Brennan R, Ward ME, Geary U, McDonald N, Bezbradica M. Forecasting Patient Early Readmission from Irish Hospital Discharge Records Using Conventional Machine Learning Models. Diagnostics (Basel) 2024; 14:2405. [PMID: 39518372 PMCID: PMC11545812 DOI: 10.3390/diagnostics14212405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2024] [Revised: 09/27/2024] [Accepted: 10/23/2024] [Indexed: 11/16/2024]
Abstract
BACKGROUND/OBJECTIVES Predicting patient readmission is an important task for healthcare risk management, as it can help prevent adverse events, reduce costs, and improve patient outcomes. In this paper, we compare various conventional machine learning models and deep learning models on a multimodal dataset of electronic discharge records from an Irish acute hospital. METHODS We evaluate the effectiveness of several widely used machine learning models that leverage patient demographics, historical hospitalization records, and clinical diagnosis codes to forecast future clinical risks. Our work focuses on addressing two key challenges in the medical fields, data imbalance and the variety of data types, in order to boost the performance of machine learning algorithms. Furthermore, we also employ SHapley Additive Explanations (SHAP) value visualization to interpret the model predictions and identify both the key data features and disease codes associated with readmission risks, identifying a specific set of diagnosis codes that are significant predictors of readmission within 30 days. RESULTS Through extensive benchmarking and the application of a variety of feature engineering techniques, we successfully improved the area under the curve (AUROC) score from 0.628 to 0.7 across our models on the test dataset. We also revealed that specific diagnoses, including cancer, COPD, and certain social factors, are significant predictors of 30-day readmission risk. Conversely, bacterial carrier status appeared to have minimal impact due to lower case frequencies. CONCLUSIONS Our study demonstrates how we effectively utilize routinely collected hospital data to forecast patient readmission through the use of conventional machine learning while applying explainable AI techniques to explore the correlation between data features and patient readmission rate.
Affiliation(s)
- Minh-Khoi Pham
- ADAPT Centre, D02 PN40 Dublin, Ireland; (T.T.M.); (M.C.); (M.E.); (R.B.); (M.B.)
- School of Computing, Dublin City University, D09 Y074 Dublin, Ireland
| | - Tai Tan Mai
- ADAPT Centre, D02 PN40 Dublin, Ireland; (T.T.M.); (M.C.); (M.E.); (R.B.); (M.B.)
- School of Computing, Dublin City University, D09 Y074 Dublin, Ireland
| | - Martin Crane
- ADAPT Centre, D02 PN40 Dublin, Ireland; (T.T.M.); (M.C.); (M.E.); (R.B.); (M.B.)
- School of Computing, Dublin City University, D09 Y074 Dublin, Ireland
| | - Malick Ebiele
- ADAPT Centre, D02 PN40 Dublin, Ireland; (T.T.M.); (M.C.); (M.E.); (R.B.); (M.B.)
- School of Computer Science, University College Dublin, D04 V1W8 Dublin, Ireland
| | - Rob Brennan
- ADAPT Centre, D02 PN40 Dublin, Ireland; (T.T.M.); (M.C.); (M.E.); (R.B.); (M.B.)
- School of Computer Science, University College Dublin, D04 V1W8 Dublin, Ireland
| | - Marie E. Ward
- St James’s Hospital, D08 NHY1 Dublin, Ireland; (M.E.W.); (U.G.)
| | - Una Geary
- St James’s Hospital, D08 NHY1 Dublin, Ireland; (M.E.W.); (U.G.)
| | - Nick McDonald
- School of Psychology, Trinity College Dublin, D02 F6N2 Dublin, Ireland
| | - Marija Bezbradica
- ADAPT Centre, D02 PN40 Dublin, Ireland; (T.T.M.); (M.C.); (M.E.); (R.B.); (M.B.)
- School of Computing, Dublin City University, D09 Y074 Dublin, Ireland
| |
|
17
|
Halvorson BD, Ward AD, Murrell D, Lacefield JC, Wiseman RW, Goldman D, Frisbee JC. Regulation of Skeletal Muscle Resistance Arteriolar Tone: Temporal Variability in Vascular Responses. J Vasc Res 2024; 61:269-297. [PMID: 39362208 DOI: 10.1159/000541169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Accepted: 08/25/2024] [Indexed: 10/05/2024] Open
Abstract
INTRODUCTION A full understanding of the integration of the mechanisms of vascular tone regulation requires an interrogation of the temporal behavior of arterioles across vasoactive challenges. Building on previous work, the purpose of the present study was to start to interrogate the temporal nature of arteriolar tone regulation with physiological stimuli. METHODS We determined the response rate of ex vivo proximal and in situ distal resistance arterioles when challenged by one-, two-, and three-parameter combinations of five major physiological stimuli (norepinephrine, intravascular pressure, oxygen, adenosine [metabolism], and intralumenal flow). Predictive machine learning models determined which factors were most influential in controlling the rate of arteriolar responses. RESULTS Results indicate that vascular response rate is dependent on the intensity of the stimulus used and can be severely hindered by altered environments, caused by application of secondary or tertiary stimuli. Advanced analytics suggest that adrenergic influences were dominant in predicting proximal arteriolar response rate compared to metabolic influences in distal arterioles. CONCLUSION These data suggest that the vascular response rate to physiologic stimuli can be strongly influenced by the local environment. Translating how these effects impact vascular networks is imperative for understanding how the microcirculation appropriately perfuses tissue across conditions.
Affiliation(s)
- Brayden D Halvorson
- Departments of Medical Biophysics, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada
| | - Aaron D Ward
- Departments of Medical Biophysics, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada
- Departments of Oncology, University of Western Ontario, London, Ontario, Canada
| | - Donna Murrell
- Departments of Medical Biophysics, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada
- Departments of Oncology, University of Western Ontario, London, Ontario, Canada
| | - James C Lacefield
- Departments of Medical Biophysics, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada
- School of Biomedical Engineering, University of Western Ontario, London, Ontario, Canada
| | - Robert W Wiseman
- Departments of Physiology and Radiology, Michigan State University, East Lansing, Michigan, USA
| | - Daniel Goldman
- Departments of Medical Biophysics, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada
| | - Jefferson C Frisbee
- Departments of Medical Biophysics, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada
| |
|
18
|
Kim KA, Kim H, Ha EJ, Yoon BC, Kim DJ. Artificial Intelligence-Enhanced Neurocritical Care for Traumatic Brain Injury : Past, Present and Future. J Korean Neurosurg Soc 2024; 67:493-509. [PMID: 38186369 PMCID: PMC11375068 DOI: 10.3340/jkns.2023.0195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 12/18/2023] [Accepted: 01/04/2024] [Indexed: 01/09/2024] Open
Abstract
In neurointensive care units (NICUs), particularly in cases involving traumatic brain injury (TBI), swift and accurate decision-making is critical because of rapidly changing patient conditions and the risk of secondary brain injury. The use of artificial intelligence (AI) in NICU can enhance clinical decision support and provide valuable assistance in these complex scenarios. This article aims to provide a comprehensive review of the current status and future prospects of AI utilization in the NICU, along with the challenges that must be overcome to realize this. Presently, the primary application of AI in NICU is outcome prediction through the analysis of preadmission and high-resolution data during admission. Recent applications include augmented neuromonitoring via signal quality control and real-time event prediction. In addition, AI can integrate data gathered from various measures and support minimally invasive neuromonitoring to increase patient safety. However, despite the recent surge in AI adoption within the NICU, the majority of AI applications have been limited to simple classification tasks, thus leaving the true potential of AI largely untapped. Emerging AI technologies, such as generalist medical AI and digital twins, harbor immense potential for enhancing advanced neurocritical care through broader AI applications. If challenges such as acquiring high-quality data and ethical issues are overcome, these new AI technologies can be clinically utilized in the actual NICU environment. Emphasizing the need for continuous research and development to maximize the potential of AI in the NICU, we anticipate that this will further enhance the efficiency and accuracy of TBI treatment within the NICU.
Affiliation(s)
- Kyung Ah Kim
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea
| | - Hakseung Kim
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea
| | - Eun Jin Ha
- Department of Critical Care Medicine, Seoul National University Hospital, Seoul, Korea
| | - Byung C. Yoon
- Department of Radiology, Stanford University School of Medicine, VA Palo Alto Health Care System, Palo Alto, CA, USA
| | - Dong-Joo Kim
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea
- Department of Neurology, Korea University College of Medicine, Seoul, Korea
| |
|
19
|
Suárez M, Martínez-Blanco P, Gil-Rojas S, Torres AM, Torralba-González M, Mateo J. Assessment of Albumin-Incorporating Scores at Hepatocellular Carcinoma Diagnosis Using Machine Learning Techniques: An Evaluation of Prognostic Relevance. Bioengineering (Basel) 2024; 11:762. [PMID: 39199720 PMCID: PMC11351615 DOI: 10.3390/bioengineering11080762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Revised: 07/19/2024] [Accepted: 07/25/2024] [Indexed: 09/01/2024] Open
Abstract
Hepatocellular carcinoma (HCC) presents high mortality rates worldwide, with limited evidence on prognostic factors at diagnosis. This study evaluates the utility of common scores incorporating albumin as predictors of mortality at HCC diagnosis using Machine Learning techniques. They are also compared to other scores and variables commonly used. A retrospective cohort study was conducted with 191 patients from Virgen de la Luz Hospital of Cuenca and University Hospital of Guadalajara. Demographic, analytical, and tumor-specific variables were included. Various Machine Learning algorithms were implemented, with eXtreme Gradient Boosting (XGB) as the reference method. In the predictive model developed, the Barcelona Clinic Liver Cancer score was the best predictor of mortality, closely followed by the Platelet-Albumin-Bilirubin and Albumin-Bilirubin scores. Albumin levels alone also showed high relevance. Other scores, such as C-Reactive Protein/albumin and Child-Pugh performed less effectively. XGB proved to be the most accurate method across the metrics analyzed, outperforming other ML algorithms. In conclusion, the Barcelona Clinic Liver Cancer, Platelet-Albumin-Bilirubin and Albumin-Bilirubin scores are highly reliable for assessing survival at HCC diagnosis. The XGB-developed model proved to be the most reliable for this purpose compared to the other proposed methods.
Affiliation(s)
- Miguel Suárez
- Gastroenterology Department, Virgen de la Luz Hospital, 16002 Cuenca, Spain
- Medical Analysis Expert Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
- Medical Analysis Expert Group, Institute of Technology, Universidad de Castilla-La Mancha, 16071 Cuenca, Spain
| | - Pablo Martínez-Blanco
- Gastroenterology Department, Virgen de la Luz Hospital, 16002 Cuenca, Spain
- Medical Analysis Expert Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
| | - Sergio Gil-Rojas
- Gastroenterology Department, Virgen de la Luz Hospital, 16002 Cuenca, Spain
- Medical Analysis Expert Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
| | - Ana M. Torres
- Medical Analysis Expert Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
- Medical Analysis Expert Group, Institute of Technology, Universidad de Castilla-La Mancha, 16071 Cuenca, Spain
| | - Miguel Torralba-González
- Internal Medicine Unit, University Hospital of Guadalajara, 19002 Guadalajara, Spain
- Faculty of Medicine, Universidad de Alcalá de Henares, 28801 Alcalá de Henares, Spain
- Translational Research Group in Cellular Immunology (GITIC), Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
| | - Jorge Mateo
- Medical Analysis Expert Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
- Medical Analysis Expert Group, Institute of Technology, Universidad de Castilla-La Mancha, 16071 Cuenca, Spain
| |
|
20
|
Schaffert D, Bibi I, Blauth M, Lull C, von Ahnen JA, Gross G, Schulze-Hagen T, Knitza J, Kuhn S, Benecke J, Schmieder A, Leipe J, Olsavszky V. Using Automated Machine Learning to Predict Necessary Upcoming Therapy Changes in Patients With Psoriasis Vulgaris and Psoriatic Arthritis and Uncover New Influences on Disease Progression: Retrospective Study. JMIR Form Res 2024; 8:e55855. [PMID: 38738977 PMCID: PMC11240079 DOI: 10.2196/55855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 02/27/2024] [Accepted: 04/11/2024] [Indexed: 05/14/2024] Open
Abstract
BACKGROUND Psoriasis vulgaris (PsV) and psoriatic arthritis (PsA) are complex, multifactorial diseases significantly impacting health and quality of life. Predicting treatment response and disease progression is crucial for optimizing therapeutic interventions, yet challenging. Automated machine learning (AutoML) technology shows promise for rapidly creating accurate predictive models based on patient features and treatment data. OBJECTIVE This study aims to develop highly accurate machine learning (ML) models using AutoML to address key clinical questions for PsV and PsA patients, including predicting therapy changes, identifying reasons for therapy changes, and factors influencing skin lesion progression or an abnormal Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) score. METHODS Clinical study data from 309 PsV and PsA patients were extensively prepared and analyzed using AutoML to build and select the most accurate predictive models for each variable of interest. RESULTS Therapy change at 24 weeks follow-up was modeled using the extreme gradient boosted trees classifier with early stopping (area under the receiver operating characteristic curve [AUC] of 0.9078 and logarithmic loss [LogLoss] of 0.3955 for the holdout partition). Key influencing factors included the initial systemic therapeutic agent, the Classification Criteria for Psoriatic Arthritis score at baseline, and changes in quality of life. An average blender incorporating three models (gradient boosted trees classifier, ExtraTrees classifier, and Eureqa generalized additive model classifier) with an AUC of 0.8750 and LogLoss of 0.4603 was used to predict therapy changes for 2 hypothetical patients, highlighting the significance of these factors. Treatments such as methotrexate or specific biologicals showed a lower propensity for change. 
An average blender of a random forest classifier, an extreme gradient boosted trees classifier, and a Eureqa classifier (AUC of 0.9241 and LogLoss of 0.4498) was used to estimate PASI (Psoriasis Area and Severity Index) change after 24 weeks. Primary predictors included the initial PASI score, change in pruritus levels, and change in therapy. A lower initial PASI score and consistently low pruritus were associated with better outcomes. BASDAI classification at onset was analyzed using an average blender of a Eureqa generalized additive model classifier, an extreme gradient boosted trees classifier with early stopping, and a dropout additive regression trees classifier with an AUC of 0.8274 and LogLoss of 0.5037. Influential factors included initial pain, disease activity, and Hospital Anxiety and Depression Scale scores for depression and anxiety. Increased pain, disease activity, and psychological distress generally led to higher BASDAI scores. CONCLUSIONS The practical implications of these models for clinical decision-making in PsV and PsA can guide early investigation and treatment, contributing to improved patient outcomes.
Affiliation(s)
- Daniel Schaffert
- Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, University of Heidelberg, and Center of Excellence in Dermatology, Mannheim, Germany
| | - Igor Bibi
- Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, University of Heidelberg, and Center of Excellence in Dermatology, Mannheim, Germany
| | - Mara Blauth
- Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, University of Heidelberg, and Center of Excellence in Dermatology, Mannheim, Germany
| | - Christian Lull
- Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, University of Heidelberg, and Center of Excellence in Dermatology, Mannheim, Germany
| | - Jan Alwin von Ahnen
- Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, University of Heidelberg, and Center of Excellence in Dermatology, Mannheim, Germany
| | - Georg Gross
- Department of Medicine V, Division of Rheumatology, University Medical Center and Medical Faculty Mannheim, Mannheim, Germany
| | - Theresa Schulze-Hagen
- Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, University of Heidelberg, and Center of Excellence in Dermatology, Mannheim, Germany
| | - Johannes Knitza
- Institute of Digital Medicine, Philipps-University Marburg and University Hospital of Giessen and Marburg, Marburg, Germany
| | - Sebastian Kuhn
- Institute of Digital Medicine, Philipps-University Marburg and University Hospital of Giessen and Marburg, Marburg, Germany
| | - Johannes Benecke
- Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, University of Heidelberg, and Center of Excellence in Dermatology, Mannheim, Germany
| | - Astrid Schmieder
- Department of Dermatology, Venereology, and Allergology, University Hospital Würzburg, Würzburg, Germany
| | - Jan Leipe
- Department of Medicine V, Division of Rheumatology, University Medical Center and Medical Faculty Mannheim, Mannheim, Germany
| | - Victor Olsavszky
- Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, University of Heidelberg, and Center of Excellence in Dermatology, Mannheim, Germany
| |
|
21
|
Uno M, Nakamaru Y, Yamashita F. Application of machine learning techniques in population pharmacokinetics/pharmacodynamics modeling. Drug Metab Pharmacokinet 2024; 56:101004. [PMID: 38795660 DOI: 10.1016/j.dmpk.2024.101004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 01/22/2024] [Accepted: 02/10/2024] [Indexed: 05/28/2024]
Abstract
Population pharmacokinetics/pharmacodynamics (pop-PK/PD) consolidates pharmacokinetic and pharmacodynamic data from many subjects to understand inter- and intra-individual variability due to patient backgrounds, including disease state and genetics. The typical workflow in pop-PK/PD analysis involves the determination of the structural model, selection of the error model, analysis based on the base model, covariate modeling, and validation of the final model. Machine learning is gaining considerable attention in medicine and various other fields because, in contrast to traditional modeling, which often assumes linear or predefined relationships, machine learning modeling learns directly from data and accommodates complex patterns. Machine learning has demonstrated excellent capabilities for prescreening covariates and developing predictive models. This review introduces various applications of machine learning techniques in pop-PK/PD research.
Affiliation(s)
- Mizuki Uno
- Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University, 46-29 Yoshidashimoadachi-cho, Sakyo-ku, Kyoto, 606-8501, Japan
| | - Yuta Nakamaru
- Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University, 46-29 Yoshidashimoadachi-cho, Sakyo-ku, Kyoto, 606-8501, Japan
| | - Fumiyoshi Yamashita
- Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University, 46-29 Yoshidashimoadachi-cho, Sakyo-ku, Kyoto, 606-8501, Japan.
| |
|
22
|
Winicki NM, Radomski SN, Ciftci Y, Sabit AH, Johnston FM, Greer JB. Mortality risk prediction for primary appendiceal cancer. Surgery 2024; 175:1489-1495. [PMID: 38494390 DOI: 10.1016/j.surg.2024.02.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 02/10/2024] [Accepted: 02/13/2024] [Indexed: 03/19/2024]
Abstract
BACKGROUND Accurately predicting survival in patients with cancer is crucial for both clinical decision-making and patient counseling. The primary aim of this study was to generate the first machine-learning algorithm to predict the risk of mortality following the diagnosis of an appendiceal neoplasm. METHODS Patients with primary appendiceal cancer in the Surveillance, Epidemiology, and End Results database from 2000 to 2019 were included. Patient demographics, tumor characteristics, and survival data were extracted from the Surveillance, Epidemiology, and End Results database. Extreme gradient boost, random forest, neural network, and logistic regression machine learning models were employed to predict 1-, 5-, and 10-year mortality. After algorithm validation, the best-performance model was used to develop a patient-specific web-based risk prediction model. RESULTS A total of 16,579 patients were included in the study, with 13,262 in the training group (80%) and 3,317 in the validation group (20%). Extreme gradient boost exhibited the highest prediction accuracy for 1-, 5-, and 10-year mortality, with the 10-year model exhibiting the maximum area under the curve (0.909 [±0.006]) after 10-fold cross-validation. Variables that significantly influenced the predictive ability of the model were disease grade, malignant carcinoid histology, incidence of positive regional lymph nodes, number of nodes harvested, and presence of distant disease. CONCLUSION Here, we report the development and validation of a novel prognostic prediction model for patients with appendiceal neoplasms of numerous histologic subtypes that incorporate a vast array of patient, surgical, and pathologic variables. By using machine learning, we achieved an excellent predictive accuracy that was superior to that of previous nomograms.
Affiliation(s)
- Nolan M Winicki
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD
| | - Shannon N Radomski
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD
| | - Yusuf Ciftci
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD
| | - Ahmed H Sabit
- Department of Biostatistics, Johns Hopkins University School of Medicine, Baltimore, MD
| | - Fabian M Johnston
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD
| | - Jonathan B Greer
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD.
| |
|
23
|
Yan C, Zhang Z, Nyemba S, Li Z. Generating Synthetic Electronic Health Record Data Using Generative Adversarial Networks: Tutorial. JMIR AI 2024; 3:e52615. [PMID: 38875595 PMCID: PMC11074891 DOI: 10.2196/52615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Revised: 01/24/2024] [Accepted: 03/07/2024] [Indexed: 06/16/2024]
Abstract
Synthetic electronic health record (EHR) data generation has been increasingly recognized as an important solution to expand the accessibility and maximize the value of private health data on a large scale. Recent advances in machine learning have facilitated more accurate modeling for complex and high-dimensional data, thereby greatly enhancing the data quality of synthetic EHR data. Among various approaches, generative adversarial networks (GANs) have become the main technical path in the literature due to their ability to capture the statistical characteristics of real data. However, there is a scarcity of detailed guidance within the domain regarding the development procedures of synthetic EHR data. The objective of this tutorial is to present a transparent and reproducible process for generating structured synthetic EHR data using a publicly accessible EHR data set as an example. We cover the topics of GAN architecture, EHR data types and representation, data preprocessing, GAN training, synthetic data generation and postprocessing, and data quality evaluation. We conclude this tutorial by discussing multiple important issues and future opportunities in this domain. The source code of the entire process has been made publicly available.
Affiliation(s)
- Chao Yan
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Ziqi Zhang
- Department of Computer Science, Vanderbilt University, Nashville, TN, United States
| | - Steve Nyemba
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Zhuohang Li
- Department of Computer Science, Vanderbilt University, Nashville, TN, United States
| |
|
24
|
Kawashima A, Furukawa T, Imaizumi T, Morohashi A, Hara M, Yamada S, Hama M, Kawaguchi A, Sato K. Predictive Models for Palliative Care Needs of Advanced Cancer Patients Receiving Chemotherapy. J Pain Symptom Manage 2024; 67:306-316.e6. [PMID: 38218414 DOI: 10.1016/j.jpainsymman.2024.01.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 12/22/2023] [Accepted: 01/03/2024] [Indexed: 01/15/2024]
Abstract
CONTEXT Early palliative care is recommended within eight weeks of diagnosing advanced cancer. Although guidelines suggest routine screening to identify cancer patients who could benefit from palliative care, implementing screening can be challenging due to understaffing and time constraints. OBJECTIVES To develop and evaluate machine learning models for predicting specialist palliative care needs in advanced cancer patients undergoing chemotherapy, and to investigate whether predictive models could substitute for screening tools. METHODS We conducted a retrospective cohort study using supervised machine learning. The study included patients aged 18 or older, diagnosed with metastatic or stage IV cancer, who underwent chemotherapy and distress screening at a designated cancer hospital in Japan from April 1, 2018, to March 31, 2023. Specialist palliative care needs were assessed based on distress screening scores and expert evaluations. Data sources were the hospital's cancer registry, health claims database, and nursing admission records. The predictive model was developed using XGBoost, a machine learning algorithm. RESULTS Out of the 1878 included patients, 561 were analyzed. Among them, 114 (20.3%) exhibited needs for specialist palliative care. After under-sampling to address data imbalance, the models achieved an Area Under the Curve (AUC) of 0.89 with a sensitivity of 95.8% and a specificity of 71.9%. After feature selection, the model retained five variables, including the patient-reported pain score, and achieved an AUC of 0.82. CONCLUSION Our models could forecast specialist palliative care needs for advanced cancer patients on chemotherapy. Using five variables as predictors could replace screening tools and has the potential to contribute to earlier palliative care.
Affiliation(s)
- Arisa Kawashima
- Division of Integrated Health Sciences (A.K., K.S.), Department of Nursing for Advanced Practice, Nagoya University Graduate School of Medicine, Nagoya, Japan; Department of Social Science (A.K.), Center for Gerontology and Social Science, Research Institute, National Center for Geriatrics and Gerontology, Obu, Japan
| | - Taiki Furukawa
- Medical IT Center (T.F.), Nagoya University Hospital, Nagoya, Japan; Department of Respiratory Medicine (T.F.), Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Takahiro Imaizumi
- Department of Advanced Medicine (T.I., A.M.), Nagoya University Hospital, Nagoya, Japan
| | - Akemi Morohashi
- Department of Advanced Medicine (T.I., A.M.), Nagoya University Hospital, Nagoya, Japan
| | - Mariko Hara
- Department of Clinical Oncology and Chemotherapy (M.H., S.Y., M.H., A.K.), Nagoya University Hospital, Nagoya, Japan
| | - Satomi Yamada
- Department of Clinical Oncology and Chemotherapy (M.H., S.Y., M.H., A.K.), Nagoya University Hospital, Nagoya, Japan
| | - Masayo Hama
- Department of Clinical Oncology and Chemotherapy (M.H., S.Y., M.H., A.K.), Nagoya University Hospital, Nagoya, Japan
| | - Aya Kawaguchi
- Department of Clinical Oncology and Chemotherapy (M.H., S.Y., M.H., A.K.), Nagoya University Hospital, Nagoya, Japan
| | - Kazuki Sato
- Division of Integrated Health Sciences (A.K., K.S.), Department of Nursing for Advanced Practice, Nagoya University Graduate School of Medicine, Nagoya, Japan
| |
|
25
|
Wang P, Wu S, Tian M, Liu K, Cong J, Zhang W, Wei B. A conformal regressor for predicting negative conversion time of Omicron patients. Med Biol Eng Comput 2024:10.1007/s11517-024-03029-8. [PMID: 38363486 DOI: 10.1007/s11517-024-03029-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2023] [Accepted: 01/09/2024] [Indexed: 02/17/2024]
Abstract
In light of the epidemic situation and the characteristics of Omicron, the country has continuously optimized its rules for the prevention and control of COVID-19. The global epidemic is still spreading, and new cases of infection continue to emerge in China. To help infected people estimate the course of their infection, this article proposes a model for predicting negative conversion time. The clinical features of Omicron-infected patients in Shandong Province in the first half of 2022 were retrospectively studied. These features are grouped by disease diagnosis result, clinical signs, traditional Chinese medicine symptoms, and drug use. They are input to the eXtreme Gradient Boosting (XGBoost) model, which outputs the predicted number of days to negative conversion. XGBoost is also used as the underlying algorithm of the conformal prediction (CP) framework, which provides interval estimates with a controllable error rate. The results show that the proposed model has a mean absolute error of 3.54 days and produces the shortest prediction intervals. This indicates that the method can carry more decision-making information and, to a certain extent, help people better understand the disease and self-estimate its course.
Affiliation(s)
- Pingping Wang
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
| | - Shenjing Wu
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
| | - Mei Tian
- Affiliated Hospital of Shandong University of Chinese Medicine, Jinan, 250011, China
| | - Kunmeng Liu
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
| | - Jinyu Cong
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
| | - Wei Zhang
- Affiliated Hospital of Shandong University of Chinese Medicine, Jinan, 250011, China.
| | - Benzheng Wei
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China.
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China.
| |
Collapse
|
26
|
Manav-Demir N, Gelgor HB, Oz E, Ilhan F, Ulucan-Altuntas K, Tiwary A, Debik E. Effluent parameters prediction of a biological nutrient removal (BNR) process using different machine learning methods: A case study. J Environ Manage 2024; 351:119899. [PMID: 38159310 DOI: 10.1016/j.jenvman.2023.119899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Revised: 12/16/2023] [Accepted: 12/19/2023] [Indexed: 01/03/2024]
Abstract
This paper proposes a novel targeted blend of machine learning (ML) based approaches for controlling wastewater treatment plant (WWTP) operation by predicting distributions of key effluent parameters of a biological nutrient removal (BNR) process. Two years of data were collected from Plajyolu wastewater treatment plant in Kocaeli, Türkiye, and the effluent parameters were predicted using six machine learning algorithms to compare their performance. Based on the mean absolute percentage error (MAPE) metric alone, support vector regression machine (SVRM) with a linear kernel showed good agreement for COD and BOD5, with MAPE values of about 9% and 0.9%, respectively. Random Forest (RF) and EXtreme Gradient Boosting (XGBoost) regression were found to be the best algorithms for TN and TP effluent parameters, with MAPE values of about 34% and 27%, respectively. Further, when the results were evaluated together according to all the performance metrics, RF, SVRM (with both linear and RBF kernels), and Hybrid Regression algorithms generally made more successful predictions than Light GBM and XGBoost algorithms for all the parameters. Through this case study we demonstrated that selective application of ML algorithms can predict different effluent parameters more effectively. Wider implementation of this approach can potentially reduce the resource demands of actively monitoring the environmental performance of WWTPs.
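The MAPE metric used to rank the algorithms above can be sketched in a few lines. This is the standard definition, not code from the study; the function name and the COD values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes both sequences have the same length and no actual value is zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

# Hypothetical effluent COD measurements (mg/L) and model predictions.
cod_actual = [40.0, 50.0, 25.0, 80.0]
cod_predicted = [36.0, 55.0, 25.0, 72.0]
cod_mape = mape(cod_actual, cod_predicted)
```

Because each error is divided by the measured value, MAPE is undefined when a measurement is zero and grows quickly for parameters with small concentrations, which is worth keeping in mind when comparing values across COD, BOD5, TN, and TP.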
Affiliation(s)
- Neslihan Manav-Demir
- Yildiz Technical University, Environmental Engineering Department, Esenler, Istanbul, 34220, Turkey
- Huseyin Baran Gelgor
- Yildiz Technical University, Environmental Engineering Department, Esenler, Istanbul, 34220, Turkey
- Ersoy Oz
- Yildiz Technical University, Statistics Department, Esenler, Istanbul, 34220, Turkey
- Fatih Ilhan
- Yildiz Technical University, Environmental Engineering Department, Esenler, Istanbul, 34220, Turkey
- Kubra Ulucan-Altuntas
- Istanbul Technical University, Environmental Engineering Department, Maslak, Istanbul, 34469, Turkey
- Abhishek Tiwary
- De Montfort University, School of Engineering and Sustainable Development, The Gateway, Leicester, LE1 9BH, United Kingdom
- Eyup Debik
- Yildiz Technical University, Environmental Engineering Department, Esenler, Istanbul, 34220, Turkey
27
Ben-Haim G, Yosef M, Rowand E, Ben-Yosef J, Berman A, Sina S, Halabi N, Grossbard E, Marziano Y, Segal G. Combination of machine learning algorithms with natural language processing may increase the probability of bacteremia detection in the emergency department: A retrospective, big-data analysis of 94,482 patients. Digit Health 2024; 10:20552076241277673. [PMID: 39291149 PMCID: PMC11406632 DOI: 10.1177/20552076241277673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Accepted: 08/07/2024] [Indexed: 09/19/2024] Open
Abstract
Background Prompt diagnosis of bacteremia in the emergency department (ED) is of utmost importance. Nevertheless, the average time to the first clinical laboratory finding ranges from 1 to 3 days. Alongside a myriad of scoring systems for occult bacteremia prediction, efforts to apply artificial intelligence (AI) in this realm are still preliminary. In the current study we combined an AI algorithm with a Natural Language Processing (NLP) algorithm that would potentially increase the yield extracted from clinical ED data. Methods This study involved adult patients who visited our emergency department and had at least one blood culture taken to rule out bacteremia. Using both tabular and free text data, we built an ensemble model that leverages XGBoost for structured data, and logistic regression (LR) on a word-analysis technique called bag-of-words (BOW) Term Frequency-Inverse Document Frequency (TF-IDF), for textual data. All algorithms were designed to predict the risk of bacteremia in ED patients whose blood cultures were sent to the laboratory. Results The study cohort comprised 94,482 individuals, of whom 52% were males. The prevalence of bacteremia in the entire cohort was 9.7%. The model trained on the tabular data yielded an area under the curve (AUC) of 73.7% for XGBoost, while the LR that was trained on the free text achieved an AUC of 71.3%. After checking a range of weights, the best combination was 55% weight on the XGBoost prediction and 45% weight on the LR prediction. The final model prediction yielded an AUC of 75.6%. Conclusion Applying artificial intelligence to bacteremia surveillance in the ED setting through a combination of free text and tabular data analysis improved predictive performance compared to using tabular data alone. We recommend that future AI applications based on our findings should be assimilated into the clinical routines of ED physicians.
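The weight search described in the Results can be sketched as a grid search over a convex blend of two probability vectors, scored by AUC. This is an illustration under assumptions, not the authors' pipeline: the helper names, the 0.05 grid step, and the validation labels and probabilities are all hypothetical, and AUC is computed by direct pairwise comparison to keep the example dependency-free.

```python
def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def blend(p_tabular, p_text, w):
    """Weighted average: w on the tabular model, (1 - w) on the text model."""
    return [w * a + (1 - w) * b for a, b in zip(p_tabular, p_text)]

def best_weight(labels, p_tabular, p_text, n_steps=20):
    """Try weights 0, 1/n_steps, ..., 1 and keep the one with the highest AUC."""
    candidates = [i / n_steps for i in range(n_steps + 1)]
    return max(candidates, key=lambda w: auc(labels, blend(p_tabular, p_text, w)))

# Hypothetical validation labels and per-model predicted probabilities.
labels = [1, 0, 1, 0, 1, 0, 0, 1]
p_tabular = [0.9, 0.2, 0.4, 0.5, 0.7, 0.1, 0.6, 0.3]  # e.g. XGBoost on structured data
p_text = [0.6, 0.3, 0.8, 0.2, 0.5, 0.4, 0.1, 0.7]     # e.g. LR on TF-IDF features

w = best_weight(labels, p_tabular, p_text)
blended_auc = auc(labels, blend(p_tabular, p_text, w))
```

Since the grid includes the weights 0 and 1, the selected blend can never score below either single model on the data used for the search; the weight should therefore be chosen on a validation split rather than on the test set that is used to report the final AUC.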
Affiliation(s)
- Gal Ben-Haim
- Emergency Department, Chaim Sheba Medical Center, Ramat-Gan, Israel
- The Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
- ARC, Innovation Center, Chaim Sheba Medical Center, Ramat Gan, Israel
- Mika Yosef
- ARC, Innovation Center, Chaim Sheba Medical Center, Ramat Gan, Israel
- Eyade Rowand
- The Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
- Education Authority, Chaim Sheba Medical Center, Ramat-Gan, Israel
- Aya Berman
- Dan Petah-Tikvah District, Clalit Health Services, Dan, Israel
- Sigal Sina
- ARC, Innovation Center, Chaim Sheba Medical Center, Ramat Gan, Israel
- Nitsan Halabi
- ARC, Innovation Center, Chaim Sheba Medical Center, Ramat Gan, Israel
- Eitan Grossbard
- Kaplan Medical Center, St George's University of London, program delivered by University of Nicosia at the Chaim Sheba Medical Center, Ramat-Gan, Israel
- Yehonatan Marziano
- Barzilai Medical Center. St George's University of London, program delivered by University of Nicosia at the Chaim Sheba Medical Center, Ramat-Gan, Israel
- Gad Segal
- The Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
- Education Authority, Chaim Sheba Medical Center, Ramat-Gan, Israel
28
Malashin IP, Tynchenko VS, Nelyub VA, Borodulin AS, Gantimurov AP. Estimation and Prediction of the Polymers' Physical Characteristics Using the Machine Learning Models. Polymers (Basel) 2023; 16:115. [PMID: 38201778 PMCID: PMC10780762 DOI: 10.3390/polym16010115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2023] [Revised: 12/23/2023] [Accepted: 12/27/2023] [Indexed: 01/12/2024] Open
Abstract
This article investigates the utility of machine learning (ML) methods for predicting and analyzing the diverse physical characteristics of polymers. Leveraging a rich dataset of polymers' characteristics, the study encompasses an extensive range of polymer properties, spanning compressive and tensile strength to thermal and electrical behaviors. Using various regression methods like Ensemble, Tree-based, Regularization, and Distance-based, the research undergoes thorough evaluation using the most common quality metrics. As a result of a series of experimental studies on the selection of effective model parameters, those that provide a high-quality solution to the stated problem were found. The best results were achieved by Random Forest with the highest R2 scores of 0.71, 0.73, and 0.88 for glass transition, thermal decomposition, and melting temperatures, respectively. The outcomes are intricately compared, providing valuable insights into the efficiency of distinct ML approaches in predicting polymer properties. Unknown values for each characteristic were predicted, and a method validation was performed by training on the predicted values, comparing the results with the specified variance values of each characteristic. The research not only advances our comprehension of polymer physics but also contributes to informed model selection and optimization for materials science applications.
Affiliation(s)
- Ivan Pavlovich Malashin
- Artificial Intelligence Technology Scientific and Education Center, Bauman Moscow State Technical University, 105005 Moscow, Russia; (V.A.N.); (A.S.B.); (A.P.G.)
- Vadim Sergeevich Tynchenko
- Artificial Intelligence Technology Scientific and Education Center, Bauman Moscow State Technical University, 105005 Moscow, Russia; (V.A.N.); (A.S.B.); (A.P.G.)
- Information-Control Systems Department, Institute of Computer Science and Telecommunications, Reshetnev Siberian State University of Science and Technology, 660037 Krasnoyarsk, Russia
- Department of Technological Machines and Equipment of Oil and Gas Complex, School of Petroleum and Natural Gas Engineering, Siberian Federal University, 660041 Krasnoyarsk, Russia
- Vladimir Aleksandrovich Nelyub
- Artificial Intelligence Technology Scientific and Education Center, Bauman Moscow State Technical University, 105005 Moscow, Russia; (V.A.N.); (A.S.B.); (A.P.G.)
- Aleksei Sergeevich Borodulin
- Artificial Intelligence Technology Scientific and Education Center, Bauman Moscow State Technical University, 105005 Moscow, Russia; (V.A.N.); (A.S.B.); (A.P.G.)
- Andrei Pavlovich Gantimurov
- Artificial Intelligence Technology Scientific and Education Center, Bauman Moscow State Technical University, 105005 Moscow, Russia; (V.A.N.); (A.S.B.); (A.P.G.)
29
Bakasa W, Viriri S. Stacked ensemble deep learning for pancreas cancer classification using extreme gradient boosting. Front Artif Intell 2023; 6:1232640. [PMID: 37876961 PMCID: PMC10591225 DOI: 10.3389/frai.2023.1232640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 09/04/2023] [Indexed: 10/26/2023] Open
Abstract
Ensemble learning aims to improve prediction performance by combining several models or forecasts. However, how much and which ensemble learning techniques are useful in deep learning-based pipelines for pancreas computed tomography (CT) image classification is a challenge. Ensemble approaches are the most advanced solution to many machine learning problems. These techniques entail training multiple models and combining their predictions to improve the predictive performance of a single model. This article introduces the idea of Stacked Ensemble Deep Learning (SEDL), a pipeline for classifying pancreas CT medical images. The weak learners are Inception V3, VGG16, and ResNet34, and we employed a stacking ensemble. By combining the first-level predictions, an input train set for XGBoost, the ensemble model at the second level of prediction, is created. Extreme Gradient Boosting (XGBoost), employed as a strong learner, will make the final classification. Our findings showed that SEDL performed better, with a 98.8% ensemble accuracy, after some adjustments to the hyperparameters. The Cancer Imaging Archive (TCIA) public access dataset consists of 80 pancreas CT scans with a resolution of 512 * 512 pixels, from 53 male and 27 female subjects. A sample of two hundred and twenty-two images was used for training and testing data. We concluded that implementing the SEDL technique is an effective way to strengthen the robustness and increase the performance of the pipeline for classifying pancreas CT medical images. Interestingly, grouping like-minded or talented learners does not make a difference.
Affiliation(s)
- Serestina Viriri
- School of Mathematics Statistics & Computer Science, College of Agriculture, Engineering and Science, University of KwaZulu-Natal, Durban, South Africa
30
Halvorson BD, Bao Y, Ward AD, Goldman D, Frisbee JC. Regulation of Skeletal Muscle Resistance Arteriolar Tone: Integration of Multiple Mechanisms. J Vasc Res 2023; 60:245-272. [PMID: 37769627 DOI: 10.1159/000533316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 07/27/2023] [Indexed: 10/03/2023] Open
Abstract
INTRODUCTION Physiological system complexity represents an imposing challenge to gaining insight into how arteriolar behavior emerges. Further, mechanistic complexity in arteriolar tone regulation requires that a systematic determination of how these processes interact to alter vascular diameter be undertaken. METHODS The present study evaluated the reactivity of ex vivo proximal and in situ distal resistance arterioles in skeletal muscle with challenges across the full range of multiple physiologically relevant stimuli and determined the stability of responses over progressive alterations to each other parameter. The five parameters chosen for examination were (1) metabolism (adenosine concentration), (2) adrenergic activation (norepinephrine concentration), (3) myogenic activation (intravascular pressure), (4) oxygen (superfusate PO2), and (5) wall shear rate (altered intraluminal flow). Vasomotor tone of both arteriole groups following challenge with individual parameters was determined; subsequently, responses were determined following all two- and three-parameter combinations to gain deeper insight into how stimuli integrate to change arteriolar tone. A hierarchical ranking of stimulus significance for establishing arteriolar tone was performed using mathematical and statistical analyses in conjunction with machine learning methods. RESULTS Results were consistent across methods and indicated that metabolic and adrenergic influences were most robust and stable across all conditions. While the other parameters individually impact arteriolar tone, their impact can be readily overridden by the two dominant contributors. CONCLUSION These data suggest that mechanisms regulating arteriolar tone are strongly affected by acute changes to the local environment and that ongoing investigation into how microvessels integrate stimuli regulating tone will provide a more thorough understanding of arteriolar behavior emergence across physiological and pathological states.
Affiliation(s)
- Brayden D Halvorson
- Department of Medical Biophysics, Schulich School of Medicine and Dentistry, London, Ontario, Canada
- Yuki Bao
- Department of Biomedical Engineering, University of Western Ontario, London, Ontario, Canada
- Aaron D Ward
- Department of Medical Biophysics, Schulich School of Medicine and Dentistry, London, Ontario, Canada
- Lawson Health Research Institute, London, Ontario, Canada
- Daniel Goldman
- Department of Medical Biophysics, Schulich School of Medicine and Dentistry, London, Ontario, Canada
- Department of Biomedical Engineering, University of Western Ontario, London, Ontario, Canada
- Jefferson C Frisbee
- Department of Medical Biophysics, Schulich School of Medicine and Dentistry, London, Ontario, Canada
31
Ma J, Yu Z, Chen T, Li P, Liu Y, Chen J, Lyu C, Hao X, Zhang J, Wang S, Gao F, Zhang J, Bu S. The effect of Shengmai injection in patients with coronary heart disease in real world and its personalized medicine research using machine learning techniques. Front Pharmacol 2023; 14:1208621. [PMID: 37781710 PMCID: PMC10537936 DOI: 10.3389/fphar.2023.1208621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Accepted: 08/30/2023] [Indexed: 10/03/2023] Open
Abstract
Objective: Shengmai injection is a common treatment for coronary heart disease. The accurate dose regimen is important to maximize effectiveness and minimize adverse reactions. We aim to explore the effect of Shengmai injection in patients with coronary heart disease based on real-world data and establish a personalized medicine model using machine learning and deep learning techniques. Methods: 211 patients were enrolled. The length of hospital stay was used to explore the effect of Shengmai injection in a case-control study. We applied propensity score matching to reduce bias and Wilcoxon rank sum test to compare results between the experimental group and the control group. Important variables influencing the dose regimen of Shengmai injection were screened by XGBoost. A personalized medicine model of Shengmai injection was established by XGBoost selected from nine algorithm models. SHapley Additive exPlanations and confusion matrix were used to interpret the results clinically. Results: Patients using Shengmai injection had shorter length of hospital stay than those not using Shengmai injection (median 10.00 days vs. 11.00 days, p = 0.006). The personalized medicine model established via XGBoost shows accuracy = 0.81 and AUC = 0.87 in test cohort and accuracy = 0.84 and AUC = 0.84 in external verification. The important variables influencing the dose regimen of Shengmai injection include lipid-lowering drugs, platelet-lowering drugs, levels of GGT, hemoglobin, prealbumin, and cholesterol at admission. Finally, the personalized model shows precision = 75%, recall rate = 83% and F1-score = 79% for predicting 40 mg of Shengmai injection; and precision = 86%, recall rate = 79% and F1-score = 83% for predicting 60 mg of Shengmai injection. Conclusion: This study provides evidence supporting the clinical effectiveness of Shengmai injection, and established its personalized medicine model, which may help clinicians make better decisions.
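The per-dose F1-scores quoted in the Results follow from the reported precision and recall through the harmonic mean. The sketch below uses the standard formula with the rounded percentages from the abstract as inputs; the function and variable names are illustrative, not taken from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded values as reported in the abstract for the two dose classes.
f1_40mg = f1_score(0.75, 0.83)
f1_60mg = f1_score(0.86, 0.79)
```

With these inputs the 40 mg class reproduces the reported 79% after rounding. The 60 mg class evaluates to about 0.82 against the reported 83%, a one-point gap that is consistent with the published precision and recall having been rounded before being reported.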
Affiliation(s)
- Jing Ma
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Ze Yu
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Ting Chen
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Ping Li
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Yan Liu
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Jihui Chen
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Chunming Lyu
- Experiment Center for Science and Technology, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Xin Hao
- Dalian Medicinovo Technology Co., Ltd., Dalian, China
- Jinyuan Zhang
- Beijing Medicinovo Technology Co., Ltd., Beijing, China
- Shuang Wang
- Dalian Medicinovo Technology Co., Ltd., Dalian, China
- Fei Gao
- Beijing Medicinovo Technology Co., Ltd., Beijing, China
- Jian Zhang
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Shuhong Bu
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
32
Niazi SK. The Coming of Age of AI/ML in Drug Discovery, Development, Clinical Testing, and Manufacturing: The FDA Perspectives. Drug Des Devel Ther 2023; 17:2691-2725. [PMID: 37701048 PMCID: PMC10493153 DOI: 10.2147/dddt.s424991] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 06/28/2023] [Accepted: 08/24/2023] [Indexed: 09/14/2023] Open
Abstract
Artificial intelligence (AI) and machine learning (ML) represent significant advancements in computing, building on technologies that humanity has developed over millions of years-from the abacus to quantum computers. These tools have reached a pivotal moment in their development. In 2021 alone, the U.S. Food and Drug Administration (FDA) received over 100 product registration submissions that heavily relied on AI/ML for applications such as monitoring and improving human performance in compiling dossiers. To ensure the safe and effective use of AI/ML in drug discovery and manufacturing, the FDA and numerous other U.S. federal agencies have issued continuously updated, stringent guidelines. Intriguingly, these guidelines are often generated or updated with the aid of AI/ML tools themselves. The overarching goal is to expedite drug discovery, enhance the safety profiles of existing drugs, introduce novel treatment modalities, and improve manufacturing compliance and robustness. Recent FDA publications offer an encouraging outlook on the potential of these tools, emphasizing the need for their careful deployment. This has expanded market opportunities for retraining personnel handling these technologies and enabled innovative applications in emerging therapies such as gene editing, CRISPR-Cas9, CAR-T cells, mRNA-based treatments, and personalized medicine. In summary, the maturation of AI/ML technologies is a testament to human ingenuity. Far from being autonomous entities, these are tools created by and for humans designed to solve complex problems now and in the future. This paper aims to present the status of these technologies, along with examples of their present and future applications.
33
Kazijevs M, Samad MD. Deep imputation of missing values in time series health data: A review with benchmarking. J Biomed Inform 2023; 144:104440. [PMID: 37429511 PMCID: PMC10529422 DOI: 10.1016/j.jbi.2023.104440] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/10/2023] [Revised: 06/26/2023] [Accepted: 07/05/2023] [Indexed: 07/12/2023]
Abstract
The imputation of missing values in multivariate time series (MTS) data is critical in ensuring data quality and producing reliable data-driven predictive models. Apart from many statistical approaches, a few recent studies have proposed state-of-the-art deep learning methods to impute missing values in MTS data. However, the evaluation of these deep methods is limited to one or two data sets, low missing rates, and completely random missing value types. This survey performs six data-centric experiments to benchmark state-of-the-art deep imputation methods on five time series health data sets. Our extensive analysis reveals that no single imputation method outperforms the others on all five data sets. The imputation performance depends on data types, individual variable statistics, missing value rates, and types. Deep learning methods that jointly perform cross-sectional (across variables) and longitudinal (across time) imputations of missing values in time series data yield statistically better data quality than traditional imputation methods. Although computationally expensive, deep learning methods are practical given the current availability of high-performance computing resources, especially when data quality and sample size are of paramount importance in healthcare informatics. Our findings highlight the importance of data-centric selection of imputation methods to optimize data-driven predictive models.
Affiliation(s)
- Maksims Kazijevs
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, United States
- Manar D Samad
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, United States
34
Yan X, Yue T, Winkler DA, Yin Y, Zhu H, Jiang G, Yan B. Converting Nanotoxicity Data to Information Using Artificial Intelligence and Simulation. Chem Rev 2023. [PMID: 37262026 DOI: 10.1021/acs.chemrev.3c00070] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Indexed: 06/03/2023]
Abstract
Decades of nanotoxicology research have generated extensive and diverse data sets. However, data is not equal to information. The question is how to extract critical information buried in vast data streams. Here we show that artificial intelligence (AI) and molecular simulation play key roles in transforming nanotoxicity data into critical information, i.e., constructing the quantitative nanostructure (physicochemical properties)-toxicity relationships, and elucidating the toxicity-related molecular mechanisms. For AI and molecular simulation to realize their full impacts in this mission, several obstacles must be overcome. These include the paucity of high-quality nanomaterials (NMs) and standardized nanotoxicity data, the lack of model-friendly databases, the scarcity of specific and universal nanodescriptors, and the inability to simulate NMs at realistic spatial and temporal scales. This review provides a comprehensive and representative, but not exhaustive, summary of the current capability gaps and tools required to fill these formidable gaps. Specifically, we discuss the applications of AI and molecular simulation, which can address the large-scale data challenge for nanotoxicology research. The need for model-friendly nanotoxicity databases, powerful nanodescriptors, new modeling approaches, molecular mechanism analysis, and design of the next-generation NMs are also critically discussed. Finally, we provide a perspective on future trends and challenges.
Affiliation(s)
- Xiliang Yan
- Institute of Environmental Research at the Greater Bay Area, Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Guangzhou University, Guangzhou 510006, China
- Tongtao Yue
- Key Laboratory of Marine Environment and Ecology, Ministry of Education, Institute of Coastal Environmental Pollution Control, Ocean University of China, Qingdao 266100, China
- David A Winkler
- Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Victoria 3052, Australia
- School of Pharmacy, University of Nottingham, Nottingham NG7 2QL, U.K
- Department of Biochemistry and Chemistry, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria 3086, Australia
- Yongguang Yin
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- Hao Zhu
- Department of Chemistry and Biochemistry, Rowan University, Glassboro, New Jersey 08028, United States
- Guibin Jiang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- Bing Yan
- Institute of Environmental Research at the Greater Bay Area, Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Guangzhou University, Guangzhou 510006, China
35
Wang X, Wang T, Zheng Y, Yin X. Recognition of liver tumors by predicted hyperspectral features based on patient's Computed Tomography radiomics features. Photodiagnosis Photodyn Ther 2023:103638. [PMID: 37247798 DOI: 10.1016/j.pdpdt.2023.103638] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/25/2023] [Revised: 05/24/2023] [Accepted: 05/26/2023] [Indexed: 05/31/2023]
Abstract
BACKGROUND Primary liver tumors have posed a serious threat to human life and health, and their early diagnosis is urgent. Therefore, enhancing the accuracy of non-invasive early detection of liver tumors is imperative. METHODS Firstly, image enhancement was applied to augment the dataset, resulting in a total of 464 samples after employing seven data augmentation methods. Subsequently, the XGBoost model was utilized to construct and learn the mapping relationship between Computed Tomography (CT) and corresponding hyperspectral imaging (HSI) data. This model enables the prediction of HSI features corresponding to CT features, thereby enriching CT with more comprehensive hyperspectral information. RESULTS Four classifiers were employed to discern the presence of tumors in patients. The results demonstrated exceptional performance, with a classification accuracy exceeding 90%. CONCLUSIONS This study proposes an artificial intelligence-based methodology that utilizes early CT radiomics features to predict HSI features. Subsequently, the results are utilized for non-invasive tumor prediction and early screening, thereby enhancing the accuracy of non-invasive liver tumor detection.
Affiliation(s)
- Xuehu Wang
- College of Electronic and Information Engineering, Hebei University, Baoding 071000, China; Research Center of Machine Vision Engineering & Technology of Hebei Province, Baoding 071000, China; Key Laboratory of Digital Medical Engineering of Hebei Province, Baoding 071000, China
- Tianqi Wang
- College of Electronic and Information Engineering, Hebei University, Baoding 071000, China; Research Center of Machine Vision Engineering & Technology of Hebei Province, Baoding 071000, China; Key Laboratory of Digital Medical Engineering of Hebei Province, Baoding 071000, China
- Yongchang Zheng
- Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, 100010, P. R. China
- Xiaoping Yin
- Affiliated Hospital of Hebei University, Baoding 071000, China
36
Hauptman A, Balasubramaniam GM, Arnon S. Machine Learning Diffuse Optical Tomography Using Extreme Gradient Boosting and Genetic Programming. Bioengineering (Basel) 2023; 10:382. [PMID: 36978773 PMCID: PMC10045273 DOI: 10.3390/bioengineering10030382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 03/18/2023] [Accepted: 03/20/2023] [Indexed: 03/30/2023] Open
Abstract
Diffuse optical tomography (DOT) is a non-invasive method for detecting breast cancer; however, it struggles to produce high-quality images due to the complexity of scattered light and the limitations of traditional image reconstruction algorithms. These algorithms can be affected by boundary conditions and have a low imaging accuracy, a shallow imaging depth, a long computation time, and a high signal-to-noise ratio. However, machine learning can potentially improve the performance of DOT by being better equipped to solve inverse problems, perform regression, classify medical images, and reconstruct biomedical images. In this study, we utilized a machine learning model called "XGBoost" to detect tumors in inhomogeneous breasts and applied a post-processing technique based on genetic programming to improve accuracy. The proposed algorithm was tested using simulated DOT measurements from complex inhomogeneous breasts and evaluated using the cosine similarity metrics and root mean square error loss. The results showed that the use of XGBoost and genetic programming in DOT could lead to more accurate and non-invasive detection of tumors in inhomogeneous breasts compared to traditional methods, with the reconstructed breasts having an average cosine similarity of more than 0.97 ± 0.07 and average root mean square error of around 0.1270 ± 0.0031 compared to the ground truth.
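The two evaluation metrics named in the abstract, cosine similarity and root mean square error between a reconstruction and its ground truth, can be sketched as follows. The flattened four-pixel "images" are invented for illustration and the function names are assumptions; real DOT reconstructions would be much larger arrays.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rmse(a, b):
    """Root mean square error between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical flattened ground-truth image and reconstruction.
ground_truth = [0.0, 0.0, 1.0, 1.0]
reconstruction = [0.1, 0.0, 0.9, 1.0]

similarity = cosine_similarity(ground_truth, reconstruction)
error = rmse(ground_truth, reconstruction)
```

The two metrics are complementary: cosine similarity ignores overall scale and captures whether the spatial pattern matches, while RMSE penalizes differences in absolute value at each pixel.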
Affiliation(s)
- Ami Hauptman
- Department of Computer Science, Sapir Academic College, Sderot 7915600, Israel
- Ganesh M Balasubramaniam
- Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Be'er Sheva 8441405, Israel
- Shlomi Arnon
- Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Be'er Sheva 8441405, Israel
37
Salem H, Huynh T, Topolski N, Mwangi B, Trivedi MH, Soares JC, Rush AJ, Selvaraj S. Temporal multi-step predictive modeling of remission in major depressive disorder using early stage treatment data; STAR*D based machine learning approach. J Affect Disord 2023; 324:286-293. [PMID: 36584711 PMCID: PMC9863277 DOI: 10.1016/j.jad.2022.12.076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Revised: 12/10/2022] [Accepted: 12/18/2022] [Indexed: 12/29/2022]
Abstract
BACKGROUND Artificial intelligence is currently being used to facilitate early disease detection, better understand disease progression, optimize medication/treatment dosages, and uncover promising novel treatments and potential outcomes. METHODS Utilizing the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) dataset, we built a machine learning model to predict depression remission rates using the same clinical data as features for each of the first three antidepressant treatment steps in STAR*D. We only used early treatment data (baseline and first follow-up) in each STAR*D step to temporally analyze predictive features of remission at the end of the step. RESULTS Our model showed significant prediction performance across the three treatment steps. At step 1, model accuracy was 66 %, sensitivity 65 %, specificity 67 %, positive predictive value (PPV) 65.5 %, and negative predictive value (NPV) 66.6 %. At step 2, model accuracy was 71.3 %, sensitivity 74.3 %, specificity 69 %, PPV 64.5 %, and NPV 77.9 %. At step 3, accuracy reached 84.6 %, sensitivity 69 %, specificity 88.8 %, PPV 67 %, and NPV 91.1 %. Across all three steps, the early Quick Inventory of Depressive Symptomatology-Self-Report (QIDS-SR) scores were key elements in predicting the final treatment outcome. The model also identified key sociodemographic factors that predicted treatment remission at different steps. LIMITATIONS Limitations include the retrospective design, the lack of replication in an independent dataset, and the use of a "complete case analysis" model. CONCLUSIONS This proof-of-concept study showed that using early treatment data, multi-step temporal prediction of depressive symptom remission results in clinically useful accuracy rates. Whether these predictive models are generalizable deserves further study.
Affiliation(s)
- Haitham Salem
- Department of Psychiatry and Human Behavior (DPHB), Warren Alpert School of Medicine, Brown University, Providence, RI, USA
- Tung Huynh
- Louis Faillace Department of Psychiatry and Behavioral Science, McGovern Medical School, University of Texas Health Science Center, Houston, TX, USA
- Natasha Topolski
- Louis Faillace Department of Psychiatry and Behavioral Science, McGovern Medical School, University of Texas Health Science Center, Houston, TX, USA
- Benson Mwangi
- Louis Faillace Department of Psychiatry and Behavioral Science, McGovern Medical School, University of Texas Health Science Center, Houston, TX, USA
- Madhukar H Trivedi
- Department of Psychiatry, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Jair C Soares
- Louis Faillace Department of Psychiatry and Behavioral Science, McGovern Medical School, University of Texas Health Science Center, Houston, TX, USA
- A John Rush
- Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, USA; Professor Emeritus, Duke-National University of Singapore, Singapore, Singapore
- Sudhakar Selvaraj
- Louis Faillace Department of Psychiatry and Behavioral Science, McGovern Medical School, University of Texas Health Science Center, Houston, TX, USA
38
Burns CM, Pung L, Witt D, Gao M, Sendak M, Balu S, Krakower D, Marcus JL, Okeke NL, Clement ME. Development of a Human Immunodeficiency Virus Risk Prediction Model Using Electronic Health Record Data From an Academic Health System in the Southern United States. Clin Infect Dis 2023; 76:299-306. [PMID: 36125084 PMCID: PMC10202432 DOI: 10.1093/cid/ciac775] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/22/2022] [Revised: 09/03/2022] [Accepted: 09/14/2022] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND Human immunodeficiency virus (HIV) pre-exposure prophylaxis (PrEP) is underutilized in the southern United States. Rapid identification of individuals vulnerable to diagnosis of HIV using electronic health record (EHR)-based tools may augment PrEP uptake in the region. METHODS Using machine learning, we developed EHR-based models to predict incident HIV diagnosis as a surrogate for PrEP candidacy. We included patients from a southern medical system with encounters between October 2014 and August 2016, training the model to predict incident HIV diagnosis between September 2016 and August 2018. We obtained 74 EHR variables as potential predictors. We compared Extreme Gradient Boosting (XGBoost) versus least absolute shrinkage selection operator (LASSO) logistic regression models, and assessed performance, overall and among women, using area under the receiver operating characteristic curve (AUROC) and area under precision recall curve (AUPRC). RESULTS Of 998 787 eligible patients, 162 had an incident HIV diagnosis, of whom 49 were women. The XGBoost model outperformed the LASSO model for the total cohort, achieving an AUROC of 0.89 and AUPRC of 0.01. The female-only cohort XGBoost model resulted in an AUROC of 0.78 and AUPRC of 0.00025. The most predictive variables for the overall cohort were race, sex, and male partner. The strongest positive predictors for the female-only cohort were history of pelvic inflammatory disease, drug use, and tobacco use. CONCLUSIONS Our machine-learning models were able to effectively predict incident HIV diagnoses including among women. This study establishes feasibility of using these models to identify persons most suitable for PrEP in the South.
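For context on the AUPRC values above: a classifier that ranks patients at random has an expected AUPRC roughly equal to the outcome prevalence, so AUPRC is best read against that baseline. The sketch below is a minimal arithmetic check using only the counts given in the abstract. The variable names and the "lift" ratio are illustrative assumptions, not quantities the authors report.

```python
# Counts taken from the abstract (total cohort).
eligible_patients = 998_787
incident_hiv_diagnoses = 162

# For a random ranking, the expected AUPRC is approximately the prevalence.
prevalence = incident_hiv_diagnoses / eligible_patients

# AUPRC reported for the XGBoost model on the total cohort.
reported_auprc = 0.01

# Illustrative ratio (our framing): how far the model sits above chance level.
lift_over_chance = reported_auprc / prevalence

print(f"chance-level AUPRC (prevalence): {prevalence:.6f}")
print(f"reported AUPRC / prevalence:     {lift_over_chance:.1f}")
```

The same comparison cannot be made for the female-only cohort from the abstract alone, because the number of eligible women is not stated.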
Affiliation(s)
- Charles M Burns
- Division of Infectious Diseases, Duke University Medical Center, Durham, North Carolina, USA
- Leland Pung
- School of Medicine, Duke University, Durham, North Carolina, USA
- Duke Institute for Health Innovation, Durham, North Carolina, USA
- Daniel Witt
- Duke Institute for Health Innovation, Durham, North Carolina, USA
- Michael Gao
- Duke Institute for Health Innovation, Durham, North Carolina, USA
- Mark Sendak
- Duke Institute for Health Innovation, Durham, North Carolina, USA
- Suresh Balu
- Duke Institute for Health Innovation, Durham, North Carolina, USA
- Douglas Krakower
- Division of Infectious Disease, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
- Department of Population Medicine, Harvard Medical School, Boston, Massachusetts, USA
- Julia L Marcus
- Department of Population Medicine, Harvard Medical School, Boston, Massachusetts, USA
- Nwora Lance Okeke
- Division of Infectious Diseases, Duke University Medical Center, Durham, North Carolina, USA
- Meredith E Clement
- Division of Infectious Diseases, Louisiana State University Health Sciences Center, New Orleans, Louisiana, USA
39
Song L, Huang CR, Pan SZ, Zhu JG, Cheng ZQ, Yu X, Xue L, Xia F, Zhang JY, Wu DP, Miao LY. A model based on machine learning for the prediction of cyclosporin A trough concentration in Chinese allo-HSCT patients. Expert Rev Clin Pharmacol 2023; 16:83-91. [PMID: 36373407 DOI: 10.1080/17512433.2023.2142561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
BACKGROUND Cyclosporin A is a calcineurin inhibitor which has a narrow therapeutic window and high interindividual variability. Various population pharmacokinetic models have been reported; however, professional software and technical personnel were needed and the variables of the models were limited. Therefore, the aim of this study was to establish a model based on machine learning to predict CsA trough concentrations in Chinese allo-HSCT patients. METHODS A total of 7874 cases of CsA therapeutic drug monitoring data from 2069 allo-HSCT patients were retrospectively included. Sequential forward selection was used to select variable subsets, and eight different algorithms were applied to establish the prediction model. RESULTS XGBoost exhibited the highest prediction ability. Except for the variables that were identified by previous studies, some rarely reported variables were found, such as norethindrone, WBC, PAB, and hCRP. The prediction accuracy within ±30% of the actual trough concentration was above 0.80, and the predictive ability of the models was demonstrated to be effective in external validation. CONCLUSION In this study, models based on machine learning technology were established to predict CsA levels 3-4 days in advance during the early inpatient phase after HSCT. A new perspective for CsA clinical application is provided.
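The "prediction accuracy within ±30% of the actual trough concentration" can be read as the fraction of predictions whose absolute error is at most 30% of the measured value. The sketch below follows that reading. The abstract does not spell out the exact formula, so the relative-to-actual convention, the function name, and the sample concentrations are all assumptions made for illustration.

```python
def within_tolerance_rate(actual, predicted, tolerance=0.30):
    """Fraction of predictions within +/- tolerance (relative) of the actual value."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(abs(p - a) <= tolerance * a for a, p in zip(actual, predicted))
    return hits / len(actual)


# Hypothetical trough concentrations (ng/mL); not data from the study.
actual = [150.0, 200.0, 100.0, 250.0]
predicted = [160.0, 270.0, 95.0, 180.0]

rate = within_tolerance_rate(actual, predicted)
print(f"share of predictions within +/-30%: {rate:.2f}")
```

In this toy input only the second prediction (270 against 200, a 35% error) falls outside the band.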
Affiliation(s)
- Lin Song
- Department of Pharmacy, the First Affiliated Hospital of Soochow University, Suzhou, China; College of Pharmaceutical Sciences, Soochow University, Suzhou, China
- Chen-Rong Huang
- Department of Pharmacy, the First Affiliated Hospital of Soochow University, Suzhou, China
- Shi-Zheng Pan
- Department of Pharmacy, the First Affiliated Hospital of Soochow University, Suzhou, China; College of Pharmaceutical Sciences, Soochow University, Suzhou, China
- Jian-Guo Zhu
- Department of Pharmacy, the First Affiliated Hospital of Soochow University, Suzhou, China
- Zong-Qi Cheng
- Department of Pharmacy, the First Affiliated Hospital of Soochow University, Suzhou, China
- Xun Yu
- Department of Pharmacy, the First Affiliated Hospital of Soochow University, Suzhou, China
- Ling Xue
- Department of Pharmacy, the First Affiliated Hospital of Soochow University, Suzhou, China
- Fan Xia
- Department of Pharmacy, the First Affiliated Hospital of Soochow University, Suzhou, China
- De-Pei Wu
- National Clinical Research Center for Hematologic Diseases, Jiangsu Institute of Hematology, The First Affiliated Hospital of Soochow University, Suzhou, China
- Li-Yan Miao
- Department of Pharmacy, the First Affiliated Hospital of Soochow University, Suzhou, China; College of Pharmaceutical Sciences, Soochow University, Suzhou, China; National Clinical Research Center for Hematologic Diseases, The First Affiliated Hospital of Soochow University, Suzhou, China
40
Mercier JA, Ferguson TW, Tangri N. A Machine Learning Model to Predict Diuretic Resistance. KIDNEY360 2023; 4:15-22. [PMID: 36700900 PMCID: PMC10101605 DOI: 10.34067/kid.0005562022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Accepted: 11/01/2022] [Indexed: 11/11/2022]
Abstract
BACKGROUND Volume overload is a common complication encountered in hospitalized patients, and the mainstay of therapy is diuresis. Unfortunately, the diuretic response in some individuals is inadequate despite a typical dose of loop diuretics, a phenomenon called diuretic resistance. An accurate prediction model that predicts diuretic resistance using predosing variables could inform the right diuretic dose for a prospective patient. METHODS Two large, deidentified, publicly available, and independent intensive care unit (ICU) databases from the United States were used-the Medical Information Mart for Intensive Care III (MIMIC) and the Philips eICU databases. Loop diuretic resistance was defined as <1400 ml of urine per 40 mg of diuretic dose in 24 hours. Using 24-hour windows throughout admission, commonly accessible variables were obtained and incorporated into the model. Data imputation was performed using a highly accurate machine learning method. Using XGBoost, several models were created using train and test datasets from the eICU database. These were then combined into an ensemble model optimized for increased specificity and then externally validated on the MIMIC database. RESULTS The final ensemble model was composed of four separate models, each using 21 commonly available variables. The ensemble model outperformed individual models during validation. Higher serum creatinine, lower systolic blood pressure, lower serum chloride, higher age, and female sex were the most important predictors of diuretic resistance (in that order). The specificity of the model on external validation was 92%, yielding a positive likelihood ratio of 3.46 while maintaining overall discrimination (C-statistic 0.69). CONCLUSIONS A diuretic resistance prediction model was created using machine learning and was externally validated in ICU populations. 
The model is easy to use, would provide actionable information at the bedside, and would be ready for implementation in existing electronic medical records. This study also provides a framework for the development of future machine learning models.
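Two quantities in this abstract lend themselves to a short worked example: the resistance definition (<1400 ml of urine per 40 mg of diuretic in 24 hours) and the positive likelihood ratio, which by definition is sensitivity / (1 - specificity). The sketch below is illustrative only. The function names are hypothetical, the dose is assumed to be already expressed in the abstract's 40 mg units, and the implied sensitivity is a back-calculation from rounded figures, not a value the authors report.

```python
def is_diuretic_resistant(urine_ml_24h, dose_mg_24h, threshold_ml_per_40mg=1400.0):
    """Apply the abstract's definition: <1400 ml urine per 40 mg of loop diuretic in 24 h."""
    if dose_mg_24h <= 0:
        raise ValueError("dose must be positive")
    urine_per_40mg = urine_ml_24h / dose_mg_24h * 40.0
    return urine_per_40mg < threshold_ml_per_40mg


def implied_sensitivity(positive_likelihood_ratio, specificity):
    """Rearrange LR+ = sensitivity / (1 - specificity) to solve for sensitivity."""
    return positive_likelihood_ratio * (1.0 - specificity)


# Hypothetical patients: 2000 ml and 3000 ml of urine on 80 mg in 24 h.
print(is_diuretic_resistant(2000.0, 80.0))  # 1000 ml per 40 mg
print(is_diuretic_resistant(3000.0, 80.0))  # 1500 ml per 40 mg

# Figures from the abstract: specificity 92%, LR+ 3.46.
sensitivity = implied_sensitivity(3.46, 0.92)
print(f"implied sensitivity: {sensitivity:.3f}")
```

Under these rounded inputs the implied sensitivity is a little under 0.28, consistent with a model tuned for specificity over sensitivity, as the abstract describes.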
Affiliation(s)
- Joey A. Mercier
- Department of Internal Medicine, University of Manitoba, Winnipeg, Manitoba, Canada
- Thomas W. Ferguson
- Department of Internal Medicine, University of Manitoba, Winnipeg, Manitoba, Canada
- Seven Oaks Hospital Chronic Disease Innovation Centre, Seven Oaks General Hospital, Winnipeg, Manitoba, Canada
- Navdeep Tangri
- Department of Internal Medicine, University of Manitoba, Winnipeg, Manitoba, Canada
- Seven Oaks Hospital Chronic Disease Innovation Centre, Seven Oaks General Hospital, Winnipeg, Manitoba, Canada
41
Yu J, Liu X, Zhu Z, Yang Z, He J, Zhang L, Lu H. Prediction models for cardiovascular disease risk among people living with HIV: A systematic review and meta-analysis. Front Cardiovasc Med 2023; 10:1138234. [PMID: 37034346 PMCID: PMC10077152 DOI: 10.3389/fcvm.2023.1138234] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/05/2023] [Accepted: 03/08/2023] [Indexed: 04/11/2023] Open
Abstract
Background HIV continues to be a major global health issue. The relative risk of cardiovascular disease (CVD) among people living with HIV (PLWH) was 2.16 compared to non-HIV-infections. The prediction of CVD is becoming an important issue in current HIV management. However, there is no consensus on optional CVD risk models for PLWH. Therefore, we aimed to systematically summarize and compare prediction models for CVD risk among PLWH. Methods Longitudinal studies that developed or validated prediction models for CVD risk among PLWH were systematically searched. Five databases were searched up to January 2022. The quality of the included articles was evaluated by using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). We applied meta-analysis to pool the logit-transformed C-statistics for discrimination performance. Results Thirteen articles describing 17 models were included. All the included studies had a high risk of bias. In the meta-analysis, the pooled estimated C-statistic was 0.76 (95% CI: 0.72-0.81, I² = 84.8%) for the Data collection on Adverse Effects of Anti-HIV Drugs Study risk equation (D:A:D) (2010), 0.75 (95% CI: 0.70-0.79, I² = 82.4%) for the D:A:D (2010) 10-year risk version, 0.77 (95% CI: 0.74-0.80, I² = 82.2%) for the full D:A:D (2016) model, 0.74 (95% CI: 0.68-0.79, I² = 86.2%) for the reduced D:A:D (2016) model, 0.71 (95% CI: 0.61-0.79, I² = 87.9%) for the Framingham Risk Score (FRS) for coronary heart disease (CHD) (1998), 0.74 (95% CI: 0.70-0.78, I² = 87.8%) for the FRS CVD model (2008), 0.72 (95% CI: 0.67-0.76, I² = 75.0%) for the pooled cohort equations of the American Heart Society/ American score (PCE), and 0.67 (95% CI: 0.56-0.77, I² = 51.3%) for the Systematic COronary Risk Evaluation (SCORE). In the subgroup analysis, the discrimination of PCE was significantly better in the group aged ≤40 years than in the group aged 40-45 years (P = 0.024) and the group aged ≥45 years (P = 0.010). No models were developed or validated in Sub-Saharan Africa and the Asia region. Conclusions The full D:A:D (2016) model performed the best in terms of discrimination, followed by the D:A:D (2010) and PCE. However, there were no significant differences between any of the model pairings. Specific CVD risk models for older PLWH and for PLWH in Sub-Saharan Africa and the Asia region should be established. Systematic Review Registration: PROSPERO CRD42022322024.
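Pooling logit-transformed C-statistics, as the Methods describe, means averaging on the logit scale and back-transforming the result. The sketch below shows one simple variant: fixed-effect inverse-variance weighting, with I² computed from Cochran's Q. This is an assumption made for brevity. The abstract does not say which pooling model was used, and the high I² values reported would normally call for a random-effects model. The input C-statistics and standard errors are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def pool_c_statistics(c_stats, logit_ses):
    """Fixed-effect inverse-variance pooling on the logit scale.

    c_stats: C-statistics from individual validations.
    logit_ses: standard errors on the logit scale (by the delta method,
        se_logit = se_c / (c * (1 - c))).
    Returns (pooled C-statistic, I-squared in percent).
    """
    if len(c_stats) != len(logit_ses) or len(c_stats) < 2:
        raise ValueError("need at least two studies with matching standard errors")
    y = [logit(c) for c in c_stats]
    w = [1.0 / se ** 2 for se in logit_ses]
    pooled = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q and the derived I-squared heterogeneity statistic.
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return inv_logit(pooled), i_squared


# Hypothetical validations of one risk model; not data from the review.
pooled_c, i2 = pool_c_statistics([0.72, 0.78, 0.75], [0.10, 0.10, 0.10])
print(f"pooled C-statistic: {pooled_c:.3f}, I-squared: {i2:.1f}%")
```

Working on the logit scale keeps the pooled estimate and its confidence limits inside the (0, 1) range, which is the usual reason for the transformation.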
Affiliation(s)
- Junwen Yu
- School of Nursing, Fudan University, Shanghai, China
- Xiaoning Liu
- Department of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Shenzhen Third People's Hospital, Guangdong, China
- National Heart & Lung Institute, Faculty of Medicine, Imperial College London, London, United Kingdom
- Zheng Zhu
- School of Nursing, Fudan University, Shanghai, China
- Fudan University Centre for Evidence-Based Nursing: A Joanna Briggs Institute Centre of Excellence, Shanghai, China
- NYU Rory Meyers College of Nursing, New York University, New York City, NY, United States
- Correspondence: Zheng Zhu, Hongzhou Lu
- Zhongfang Yang
- School of Nursing, Fudan University, Shanghai, China
- Fudan University Centre for Evidence-Based Nursing: A Joanna Briggs Institute Centre of Excellence, Shanghai, China
- Shanghai Institute of Infectious Disease and Biosecurity, Fudan University, Shanghai, China
- Jiamin He
- School of Nursing, Fudan University, Shanghai, China
- Lin Zhang
- Shanghai Public Health Clinical Center, Fudan University, Shanghai, China
- Hongzhou Lu
- Department of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Shenzhen Third People's Hospital, Guangdong, China
- Correspondence: Zheng Zhu, Hongzhou Lu
42
Fatahi R, Nasiri H, Homafar A, Khosravi R, Siavoshi H, Chehreh Chelgani S. Modeling operational cement rotary kiln variables with explainable artificial intelligence methods – a “conscious lab” development. PARTICULATE SCIENCE AND TECHNOLOGY 2022. [DOI: 10.1080/02726351.2022.2135470] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 10/24/2022]
Affiliation(s)
- Rasoul Fatahi
- School of Mining Engineering, College of Engineering, University of Tehran, Tehran, Iran
- Hamid Nasiri
- Department of Computer Engineering, Amirkabir University of Technology, Tehran, Iran
- Arman Homafar
- Electrical and Computer Engineering Department, Semnan University, Semnan, Iran
- Rasoul Khosravi
- Department of Mining, Faculty of Engineering, Lorestan University, Khorramabad, Iran
- Hossein Siavoshi
- Department of Mining and Geological Engineering, University of Arizona, Tucson, USA
- Saeed Chehreh Chelgani
- Minerals and Metallurgical Engineering, Department of Civil, Environmental and Natural Resources Engineering, Luleå University of Technology, Sweden
43
Yang Q, Gao S, Lin J, Lyu K, Wu Z, Chen Y, Qiu Y, Zhao Y, Wang W, Lin T, Pan H, Chen M. A machine learning-based data mining in medical examination data: a biological features-based biological age prediction model. BMC Bioinformatics 2022; 23:411. [PMID: 36192681 PMCID: PMC9528174 DOI: 10.1186/s12859-022-04966-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2022] [Accepted: 09/26/2022] [Indexed: 11/11/2022] Open
Abstract
Background Biological age (BA) has been recognized as a more accurate indicator of aging than chronological age (CA). However, the current limitations include: insufficient attention to the incompleteness of medical data for constructing BA; lack of machine learning-based BA (ML-BA) on the Chinese population; and neglect of the influence of model overfitting degree on the stability of the association results. Methods and results Based on the medical examination data of the Chinese population (45–90 years), we first evaluated the most suitable missing-data interpolation method, then constructed 14 ML-BAs based on biomarkers, and finally explored the associations between ML-BAs and health statuses (healthy risk indicators and disease). We found that round-robin linear regression interpolation performed best, while AutoEncoder showed the highest interpolation stability. We further illustrated the potential overfitting problem in ML-BAs, which affected the stability of ML-BAs’ associations with health statuses. We then proposed a composite ML-BA based on the Stacking method with a simple meta-model (STK-BA), which overcame the overfitting problem, and associated more strongly with CA (r = 0.66, P < 0.001), healthy risk indicators, disease counts, and six types of disease. Conclusion We provided an improved aging measurement method for middle-aged and elderly groups in China, which can more stably capture aging characteristics other than CA, supporting the emerging application potential of machine learning in aging research. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04966-7.
Affiliation(s)
- Qing Yang
- Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, 310051, China
- Sunan Gao
- College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou, 310058, China
- Junfen Lin
- Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, 310051, China
- Ke Lyu
- College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Zexu Wu
- College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Yuhao Chen
- College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Yinwei Qiu
- Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, 310051, China
- Yanrong Zhao
- Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, 310051, China
- Wei Wang
- Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, 310051, China
- Tianxiang Lin
- Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, 310051, China
- Huiyun Pan
- The First Affiliated Hospital of School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Ming Chen
- College of Life Sciences, Zhejiang University, Hangzhou, 310058, China; The First Affiliated Hospital of School of Medicine, Zhejiang University, Hangzhou, 310058, China
44
Chang YW, Natali L, Jamialahmadi O, Romeo S, Pereira JB, Volpe G. Neural Network Training with Highly Incomplete Medical Datasets. MACHINE LEARNING: SCIENCE AND TECHNOLOGY 2022. [DOI: 10.1088/2632-2153/ac7b69] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022] Open
Abstract
Neural network training and validation rely on the availability of large high-quality datasets. However, in many cases only incomplete datasets are available, particularly in health care applications, where each patient typically undergoes different clinical procedures or can drop out of a study. Since the data to train the neural networks need to be complete, most studies discard the incomplete datapoints, which reduces the size of the training data, or impute the missing features, which can lead to artefacts. Alas, both approaches are inadequate when a large portion of the data is missing. Here, we introduce GapNet, an alternative deep-learning training approach that can use highly incomplete datasets without overfitting or introducing artefacts. First, the dataset is split into subsets of samples containing all values for a certain cluster of features. Then, these subsets are used to train individual neural networks. Finally, this ensemble of neural networks is combined into a single neural network whose training is fine-tuned using all complete datapoints. Using two highly incomplete real-world medical datasets, we show that GapNet improves the identification of patients with underlying Alzheimer’s disease pathology and of patients at risk of hospitalization due to Covid-19. Compared to commonly used imputation methods, this improvement suggests that GapNet can become a general tool to handle incomplete medical datasets.
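The first GapNet step described above, splitting the dataset into subsets of samples that contain all values for a given cluster of features, can be sketched in a few lines. This is only an illustration of that partitioning idea and is not the authors' implementation. The record layout, the feature-cluster names, and the use of `None` for a missing value are assumptions. Network training and the final fine-tuning on complete datapoints are not shown.

```python
def complete_subsets(rows, feature_clusters):
    """Map each feature cluster to the indices of rows with no missing value in it."""
    subsets = {}
    for cluster_name, features in feature_clusters.items():
        subsets[cluster_name] = [
            index
            for index, row in enumerate(rows)
            if all(row.get(feature) is not None for feature in features)
        ]
    return subsets


# Hypothetical patient records; None marks a value that was never measured.
rows = [
    {"age": 71, "mmse": 24, "amyloid_pet": 1.3},
    {"age": 65, "mmse": None, "amyloid_pet": 1.1},
    {"age": 80, "mmse": 19, "amyloid_pet": None},
]
# Hypothetical clusters of features that tend to be measured together.
feature_clusters = {
    "clinical": ["age", "mmse"],
    "imaging": ["age", "amyloid_pet"],
}

subsets = complete_subsets(rows, feature_clusters)
# Rows complete for every cluster would be the ones used for final fine-tuning.
fully_complete = sorted(set.intersection(*(set(ix) for ix in subsets.values())))
print(subsets)
print(fully_complete)
```

Each per-cluster subset is larger than the fully complete set, which is the point of the approach: every individual network sees more data than a complete-case analysis would allow.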
45
How statistical modeling and machine learning could help in the calibration of numerical simulation and fluid mechanics models? Application to the calibration of models reproducing the vibratory behavior of an overhead line conductor. ARRAY 2022. [DOI: 10.1016/j.array.2022.100187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022] Open
46
Fahmy AS, Csecs I, Arafati A, Assana S, Yankama TT, Al-Otaibi T, Rodriguez J, Chen YY, Ngo LH, Manning WJ, Kwong RY, Nezafat R. An Explainable Machine Learning Approach Reveals Prognostic Significance of Right Ventricular Dysfunction in Nonischemic Cardiomyopathy. JACC Cardiovasc Imaging 2022; 15:766-779. [PMID: 35033500 DOI: 10.1016/j.jcmg.2021.11.029] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 08/02/2021] [Revised: 10/25/2021] [Accepted: 11/18/2021] [Indexed: 11/19/2022]
Abstract
OBJECTIVES The authors implemented an explainable machine learning (ML) model to gain insight into the association between cardiac magnetic resonance markers and adverse outcomes of cardiovascular hospitalization and all-cause death (composite endpoint) in patients with nonischemic dilated cardiomyopathy (NICM). BACKGROUND Risk stratification of patients with NICM remains challenging. An explainable ML model has the potential to provide insight into the contributions of different risk markers in the prediction model. METHODS An explainable ML model based on extreme gradient boosting (XGBoost) machines was developed using cardiac magnetic resonance and clinical parameters. The study cohorts consist of patients with NICM from 2 academic medical centers: Beth Israel Deaconess Medical Center (BIDMC) and Brigham and Women's Hospital (BWH), with 328 and 214 patients, respectively. XGBoost was trained on 70% of patients from the BIDMC cohort and evaluated based on the other 30% as internal validation. The model was externally validated using the BWH cohort. To investigate the contribution of different features in our risk prediction model, we used Shapley additive explanations (SHAP) analysis. RESULTS During a mean follow-up duration of 40 months, 34 patients from BIDMC and 33 patients from BWH experienced the composite endpoint. The area under the curve for predicting the composite endpoint was 0.71 for the internal BIDMC validation and 0.69 for the BWH cohort. SHAP analysis identified parameters associated with right ventricular (RV) dysfunction and remodeling as primary markers of adverse outcomes. High risk thresholds were identified by SHAP analysis and thus provided thresholds for top predictive continuous clinical variables. CONCLUSIONS An explainable ML-based risk prediction model has the potential to identify patients with NICM at risk for cardiovascular hospitalization and all-cause death. 
RV ejection fraction, end-systolic and end-diastolic volumes (as indicators of RV dysfunction and remodeling) were determined to be major risk markers.
Affiliation(s)
- Ahmed S Fahmy
- Cardiovascular Division, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA; Harvard Medical School, Boston, Massachusetts, USA
- Ibolya Csecs
- Cardiovascular Division, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA; Harvard Medical School, Boston, Massachusetts, USA; Department of Medicine, Jacobi Medical Center/Albert Einstein College of Medicine, Bronx, New York, USA
- Arghavan Arafati
- Cardiovascular Division, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA; Harvard Medical School, Boston, Massachusetts, USA
- Salah Assana
- Cardiovascular Division, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA; Harvard Medical School, Boston, Massachusetts, USA
- Tuyen T Yankama
- Cardiovascular Division, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA; Harvard Medical School, Boston, Massachusetts, USA
- Talal Al-Otaibi
- Cardiovascular Division, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA; Harvard Medical School, Boston, Massachusetts, USA
- Jennifer Rodriguez
- Cardiovascular Division, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA; Harvard Medical School, Boston, Massachusetts, USA
- Yi-Yun Chen
- Harvard Medical School, Boston, Massachusetts, USA; Cardiovascular Division, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA
- Long H Ngo
- Cardiovascular Division, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA; Harvard Medical School, Boston, Massachusetts, USA
- Warren J Manning
- Cardiovascular Division, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA; Harvard Medical School, Boston, Massachusetts, USA; Department of Medicine, Jacobi Medical Center/Albert Einstein College of Medicine, Bronx, New York, USA
- Raymond Y Kwong
- Harvard Medical School, Boston, Massachusetts, USA; Cardiovascular Division, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA
- Reza Nezafat
- Cardiovascular Division, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA; Harvard Medical School, Boston, Massachusetts, USA
47
Luo Y. Evaluating the state of the art in missing data imputation for clinical data. Brief Bioinform 2022; 23:bbab489. [PMID: 34882223 PMCID: PMC8769894 DOI: 10.1093/bib/bbab489] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 09/13/2021] [Revised: 10/14/2021] [Accepted: 10/24/2021] [Indexed: 11/14/2022] Open
Abstract
Clinical data are increasingly being mined to derive new medical knowledge with a goal of enabling greater diagnostic precision, better-personalized therapeutic regimens, improved clinical outcomes and more efficient utilization of health-care resources. However, clinical data are often only available at irregular intervals that vary between patients and type of data, with entries often being unmeasured or unknown. As a result, missing data often represent one of the major impediments to optimal knowledge derivation from clinical data. The Data Analytics Challenge on Missing data Imputation (DACMI) presented a shared clinical dataset with ground truth for evaluating and advancing the state of the art in imputing missing data for clinical time series. We extracted 13 commonly measured blood laboratory tests. To evaluate the imputation performance, we randomly removed one recorded result per laboratory test per patient admission and used them as the ground truth. DACMI is the first shared-task challenge on clinical time series imputation to our best knowledge. The challenge attracted 12 international teams spanning three continents across multiple industries and academia. The evaluation outcome suggests that competitive machine learning and statistical models (e.g. LightGBM, MICE and XGBoost) coupled with carefully engineered temporal and cross-sectional features can achieve strong imputation performance. However, care needs to be taken to prevent overblown model complexity. The challenge participating systems collectively experimented with a wide range of machine learning and probabilistic algorithms to combine temporal imputation and cross-sectional imputation, and their design principles will inform future efforts to better model clinical missing data.
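The evaluation design described above, randomly removing one recorded result per laboratory test per patient admission and keeping it as ground truth, can be sketched as follows. This is a minimal illustration under stated assumptions and is not the challenge's own tooling. The data layout (a dict of per-test value lists with `None` for unmeasured entries), the function name, and the sample values are hypothetical.

```python
import random


def mask_one_per_test(admission, rng):
    """Hide one recorded value per lab test; return (masked copy, ground truth).

    admission: dict mapping test name -> list of values over time,
        with None where the test was not measured.
    ground truth: dict mapping (test name, time index) -> hidden value.
    """
    masked = {test: list(values) for test, values in admission.items()}
    ground_truth = {}
    for test, values in admission.items():
        recorded = [i for i, value in enumerate(values) if value is not None]
        if not recorded:
            continue  # nothing to hide for a test that was never measured
        hidden_index = rng.choice(recorded)
        ground_truth[(test, hidden_index)] = values[hidden_index]
        masked[test][hidden_index] = None
    return masked, ground_truth


# Hypothetical admission with two lab tests sampled at four time points.
admission = {
    "sodium": [138.0, None, 141.0, 139.0],
    "creatinine": [None, 1.1, 1.3, None],
}
masked, ground_truth = mask_one_per_test(admission, random.Random(0))
print(masked)
print(ground_truth)
```

An imputation method can then be scored by comparing its filled-in values at the hidden positions with `ground_truth`, for example with a normalized root mean square error per test.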
Affiliation(s)
- Yuan Luo
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
48
Ronzio L, Cabitza F, Barbaro A, Banfi G. Has the Flood Entered the Basement? A Systematic Literature Review about Machine Learning in Laboratory Medicine. Diagnostics (Basel) 2021; 11:372. [PMID: 33671623 PMCID: PMC7926482 DOI: 10.3390/diagnostics11020372] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 11/30/2020] [Revised: 02/08/2021] [Accepted: 02/18/2021] [Indexed: 02/08/2023] Open
Abstract
This article presents a systematic literature review that expands and updates a previous review on the application of machine learning to laboratory medicine. We used Scopus and PubMed to collect, select and analyse the papers published from 2017 to the present in order to highlight the main studies that have applied machine learning techniques to haematochemical parameters and to review their diagnostic and prognostic performance. In doing so, we aim to address the question we asked three years ago about the potential of these techniques in laboratory medicine and the need to leverage a tool that was still under-utilised at that time.
Affiliation(s)
- Luca Ronzio
- Department of Informatics, University of Milano-Bicocca, 20126 Milan, Italy
- Federico Cabitza
- Department of Informatics, University of Milano-Bicocca, 20126 Milan, Italy
- Alessandro Barbaro
- IRCCS Istituto Ortopedico Galeazzi, Via Riccardo Galeazzi, 4, 20161 Milan, Italy
- Giuseppe Banfi
- IRCCS Istituto Ortopedico Galeazzi, Via Riccardo Galeazzi, 4, 20161 Milan, Italy
- School of Medicine, University Vita-Salute San Raffaele, Via Olgettina, 58, 20132 Milan, Italy