1
|
Xue B, Lan J, Chen S, Wang L, Xin E, Xie J, Zheng X, Wang LG, Tang K. Explainable PET-based intratumoral and peritumoral machine learning model for predicting visceral pleural invasion in clinical-stage IA non-small cell lung cancer: A two-center study. Clin Radiol 2025; 85:106903. [PMID: 40253896 DOI: 10.1016/j.crad.2025.106903] [Received: 08/29/2024] [Revised: 02/22/2025] [Accepted: 03/15/2025] [Indexed: 04/22/2025]
Abstract
AIM The aim of this study was to develop a PET-based machine learning model for predicting visceral pleural invasion (VPI) in patients with clinical stage IA non-small cell lung cancer. MATERIALS AND METHODS A total of 294 patients and 69 patients from two institutions who underwent 18F-FDG PET were retrospectively analyzed. We extracted PET-based radiomics features from the gross tumor volume (GTV) and from the gross tumor volume plus peritumoral regions of 4, 8, and 12 mm (GPTV4, GPTV8, GPTV12). Four models, one per region, were then established using machine learning algorithms. Model performance was assessed with receiver operating characteristic (ROC) curve analysis and decision curve analysis (DCA). Shapley additive explanations (SHAP) were used to explain the machine learning (ML) models and visualize variable predictions. RESULTS Compared with the GTV, GPTV4, and GPTV12 radiomics models, the GPTV8 radiomics model, built with random forest (RF) on 10 features, demonstrated better prediction performance, with AUCs of 0.879, 0.846, and 0.745 in the training, internal validation, and external validation sets, respectively. SHAP analysis showed that the GLRLM_ShortRunLowGreyLevelEmphasis feature was the most important factor in predicting VPI. At the patient level, SHAP force plots explained individual VPI predictions. CONCLUSION The PET-based intratumoral and peritumoral model based on machine learning offers an innovative tool for preoperative prediction of VPI in patients with lung adenocarcinoma. By employing the SHAP method, clinicians may gain clearer insight into the factors contributing to VPI, which could support clinical decision-making and prognosis assessment.
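The SHAP attributions mentioned in this abstract are based on Shapley values. As a rough illustration only (not the authors' pipeline, and not the SHAP library itself, which uses model-specific estimation algorithms), the sketch below computes exact Shapley values by enumerating feature coalitions. The toy model, the sample, and the all-zero baseline are hypothetical assumptions chosen so the result can be checked by hand.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function over feature indices 0..n-1."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                gain = value_fn(coalition | {i}) - value_fn(coalition)
                phi[i] += weight * gain
    return phi


# Hypothetical toy setup: three features, an all-zero baseline, one sample.
BASELINE = [0.0, 0.0, 0.0]
SAMPLE = [1.0, 2.0, 3.0]


def toy_model(x):
    # Linear terms plus one interaction between features 0 and 1.
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[2] + x[0] * x[1]


def value_fn(coalition):
    # Features outside the coalition are replaced by their baseline value.
    x = [SAMPLE[j] if j in coalition else BASELINE[j] for j in range(3)]
    return toy_model(x)


phi = shapley_values(value_fn, 3)
prediction_gap = toy_model(SAMPLE) - toy_model(BASELINE)
print(phi, sum(phi), prediction_gap)
```

The attributions sum to the difference between the sample prediction and the baseline prediction, which is the additivity property that force plots visualize. Exhaustive enumeration grows exponentially with the number of features, so it is only practical for very small feature sets.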
Affiliation(s)
- B Xue
- Department of Radiology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - J Lan
- Department of Radiology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - S Chen
- Department of Radiology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - L Wang
- Department of Radiology, Wenzhou Central Hospital, China
| | - E Xin
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd, Shanghai, China
| | - J Xie
- Department of Radiology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - X Zheng
- Department of Radiology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - L G Wang
- Division of Pulmonary Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou Key Laboratory of Interdiscipline and Translational Medicine, Wenzhou Key Laboratory of Heart and Lung, Wenzhou, China
| | - K Tang
- Department of Nuclear Medicine, The First Affiliated Hospital of Wenzhou Medical University, Key Laboratory of Novel Nuclide Technologies on Precision Diagnosis and Treatment & Clinical Transformation of Wenzhou City, Wenzhou, China.
| |
|
2
|
Lu X, Kou H, Li C, Zhan R, Guo R, Liu S, Shen P, Shen M, Du T, Lu J, Shen X. Development and validation of an interpretable machine learning model for predicting hyperuricemia risk: Based on environmental chemical exposure. ECOTOXICOLOGY AND ENVIRONMENTAL SAFETY 2025; 299:118392. [PMID: 40403686 DOI: 10.1016/j.ecoenv.2025.118392] [Received: 01/27/2025] [Revised: 04/30/2025] [Accepted: 05/19/2025] [Indexed: 05/24/2025]
Abstract
Hyperuricemia is a global health concern, with environmental chemicals as risk factors. This study used data on multiple environmental chemical exposures from the 2011-2012 cycle of the National Health and Nutrition Examination Survey (NHANES) to develop an interpretable machine learning model for hyperuricemia risk prediction. The least absolute shrinkage and selection operator (LASSO) regression method was employed to select relevant variables. The dataset was split into training (80%) and test (20%) sets, and six machine learning models were constructed: Random Forest (RF), Gaussian Naive Bayes (GNB), Light Gradient Boosting (LGB), Extreme Gradient Boosting (XGB), Adaptive Boosting Classifier (AB), and Support Vector Machine (SVM). Our study identified a hyperuricemia prevalence of 20.58% in the 2011-2012 NHANES cycle, which was consistent with previous studies. The XGB model exhibited optimal performance, achieving the highest AUC (0.806; 95% CI: 0.768-0.845), balanced accuracy (0.762; 95% CI: 0.721-0.802), and F1 value (0.585; 95% CI: 0.535-0.635), as well as the lowest Brier score (0.133; 95% CI: 0.122-0.144). Estimated glomerular filtration rate (eGFR), body mass index (BMI), cobalt (Co), mono-(2-ethyl)-hexyl phthalate (MEHP), mono-(3-carboxypropyl) phthalate (MCPP), mono-(2-ethyl-5-hydroxyhexyl) phthalate (MEHHP), and 2-hydroxynaphthalene (OHNa2) were identified as the key factors contributing to the predictive model. The results of Shapley additive explanations and partial dependence plots indicated that hyperuricemia was positively associated with MCPP, MEHHP, and OHNa2, and negatively associated with Co and MEHP. This study is the first to predict the risk of hyperuricemia based on multiple environmental chemical exposures using a machine learning model.
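For readers unfamiliar with the metrics reported in this abstract, the following sketch shows how balanced accuracy, the F1 value, and the Brier score are defined. It is a minimal illustration on made-up labels and probabilities, not the study's data or code; the 0.5 decision threshold is an assumption.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return tp, fp, tn, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Made-up example: 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1, 0.7, 0.05]
THRESHOLD = 0.5  # assumed cut-off for turning probabilities into labels
y_pred = [1 if p >= THRESHOLD else 0 for p in y_prob]

bal_acc = balanced_accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
brier = brier_score(y_true, y_prob)
print(round(bal_acc, 4), round(f1, 4), round(brier, 4))
```

Balanced accuracy and F1 depend on the chosen threshold, whereas the Brier score is computed directly from the predicted probabilities, which is why a lower Brier score is read as better calibration and accuracy combined.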
Affiliation(s)
- Xiaochuan Lu
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China.
| | - Huawei Kou
- Medical Affairs Department of Cancer Hospital, General Hospital of Ningxia Medical University, Yinchuan 750004, China.
| | - Cong Li
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China.
| | | | - Rongrong Guo
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China.
| | - Shengnan Liu
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China.
| | - Peixuan Shen
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China.
| | - Meiyue Shen
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China.
| | - Tingwei Du
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China.
| | - Jiaqi Lu
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China.
| | - Xiaoli Shen
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China.
| |
|
3
|
Chhibbar P, Das J. Machine learning approaches enable the discovery of therapeutics across domains. Mol Ther 2025; 33:2269-2278. [PMID: 40186352 DOI: 10.1016/j.ymthe.2025.04.001] [Received: 02/13/2025] [Revised: 03/21/2025] [Accepted: 04/01/2025] [Indexed: 04/07/2025] Open
Abstract
Multi-modal datasets have grown exponentially in the last decade. This has created an enormous demand for machine learning models that can predict complex outcomes by leveraging cellular, molecular, and humoral profiles. Corresponding inference of mechanisms can help to uncover new therapeutic targets. Here, we discuss how biological principles guide the design of predictive models and how interpretable machine learning can lead to novel mechanistic insights. We provide descriptions of multiple learning techniques and how suited they are to domain adaptations. Finally, we talk about broad learning capabilities of foundation models on large datasets and whether they can be used to provide meaningful inference about biological datasets.
Affiliation(s)
- Prabal Chhibbar
- Centre for Systems Immunology, Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Integrative Systems Biology PhD Program, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA.
| | - Jishnu Das
- Centre for Systems Immunology, Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA.
| |
|
4
|
Jo Y, Shin MY, Kim S. Assessing the association of multi-environmental chemical exposures on metabolic syndrome: A machine learning approach. ENVIRONMENT INTERNATIONAL 2025; 199:109481. [PMID: 40279688 DOI: 10.1016/j.envint.2025.109481] [Received: 01/21/2025] [Revised: 04/08/2025] [Accepted: 04/16/2025] [Indexed: 04/27/2025]
Abstract
Metabolic syndrome (MetS) is a major global public health concern due to its rising prevalence and association with increased risks of cardiovascular disease and type 2 diabetes. Emerging evidence suggests that environmental chemical exposures may play a significant role in the development of MetS by disrupting metabolic pathways. This study used data from 2,960 participants in the Korean National Environmental Health Survey (KoNEHS) cycle 4 (2018-2020) to examine associations between environmental exposures and MetS risk through machine learning (ML) approaches. Eight ML algorithms were applied, with the multilayer perceptron (MLP) and random forest (RF) models identified as optimal predictors. The MLP achieved an AUC of 0.79, and the RF achieved the highest F1 score of 0.82. Both models highlighted PFOA and PFOS, alongside age and BMI, as key predictors. SHapley Additive exPlanations (SHAP) and partial dependence plots (PDP) revealed both linear and nonlinear exposure-response patterns, suggesting threshold effects for key chemicals. These findings underscore the importance of incorporating environmental exposures into MetS risk assessments. The ML models provided robust predictive performance and novel insights into chemical and metabolic interactions, advocating for regulatory measures to reduce harmful exposures and integrate environmental factors into MetS prevention strategies.
Affiliation(s)
- Yehoon Jo
- Institute of Health and Environment, Graduate School of Public Health, Seoul National University, Seoul, Republic of Korea; Department of Environmental Health Sciences, Graduate School of Public Health, Seoul National University, Seoul, Republic of Korea
| | - Mi-Yeon Shin
- Toxicological Centre, University of Antwerp, Wilrijk, Belgium.
| | - Sungkyoon Kim
- Institute of Health and Environment, Graduate School of Public Health, Seoul National University, Seoul, Republic of Korea; Department of Environmental Health Sciences, Graduate School of Public Health, Seoul National University, Seoul, Republic of Korea.
| |
|
5
|
Xu J, Li J, Wang T, Luo X, Zhu Z, Wang Y, Wang Y, Zhang Z, Song R, Yang LZ, Wang H, Wong STC, Li H. Predicting treatment response and prognosis of immune checkpoint inhibitors-based combination therapy in advanced hepatocellular carcinoma using a longitudinal CT-based radiomics model: a multicenter study. BMC Cancer 2025; 25:602. [PMID: 40181337 PMCID: PMC11967134 DOI: 10.1186/s12885-025-13978-4] [Received: 02/07/2024] [Accepted: 03/19/2025] [Indexed: 04/05/2025] Open
Abstract
BACKGROUND Identifying effective predictive strategies to assess the response of immune checkpoint inhibitors (ICIs)-based combination therapy in advanced hepatocellular carcinoma (HCC) is crucial. This study presents a new longitudinal CT-based radiomics model to predict treatment response and prognosis in advanced HCC patients undergoing ICIs-based combination therapy. METHODS Longitudinal CT images were collected before and during the treatment for HCC patients across three institutions from January 2019 to April 2022. A total of 1316 radiomic features were extracted from arterial and portal venous phase abdominal CT images for each patient. A model called Longitudinal Whole-liver CT-based Radiomics (LWCTR) was developed to categorize patients into responders or non-responders using radiomic features and clinical information through support vector machine (SVM) classifiers. The area under the curve (AUC) was used as the performance metric and subsequently applied for risk stratification and prognostic assessment. The Shapley Additive explanations (SHAP) method was used to calculate the Shapley value, which explains the contribution of each feature in the SVM model to the prediction. RESULTS This study included 395 eligible participants, with a median age of 57 years (IQR 51-66), comprising 344 males and 51 females. The LWCTR model performed well in predicting treatment response, achieving an AUC of 0.883 (95% confidence interval [CI] 0.881-0.888) in the training cohort, 0.876 (0.858-0.895) in the internal validation cohort, and 0.875 (0.860-0.887) in the external test cohort. The Rad-Nomo model, integrating the LWCTR model's prediction score (Rad-score) with the modified Response Evaluation Criteria in Solid Tumors (mRECIST), demonstrated strong prognostic performance. 
It achieved time-dependent AUC values of 0.902, 0.823, and 0.850 at 1, 2, and 3 years in the internal validation cohort and 0.893, 0.848, and 0.762 at the same intervals in the external test cohort. CONCLUSION The proposed LWCTR model performs well in predicting treatment response and prognosis in patients with HCC receiving ICIs-based combination therapy, potentially contributing to personalized and timely treatment decisions.
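The AUC used as the performance metric here can be read as the probability that a randomly chosen responder receives a higher score than a randomly chosen non-responder. The sketch below computes that quantity by direct pairwise comparison on made-up scores. It is an illustration of the plain binary AUC only; the time-dependent AUC reported for the Rad-Nomo model is a survival-analysis extension that this sketch does not cover.

```python
def roc_auc(y_true, scores):
    """Binary ROC AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    correct = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                correct += 1.0
            elif pos == neg:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Made-up example: 1 = responder, 0 = non-responder.
labels = [1, 1, 1, 0, 0, 0]
model_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

auc = roc_auc(labels, model_scores)
print(round(auc, 4))
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; rank-based implementations give the same value more efficiently on large cohorts.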
Affiliation(s)
- Jun Xu
- Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, People's Republic of China
- University of Science and Technology of China, Hefei, 230026, People's Republic of China
- Department of Intervention, The First Affiliated Hospital of University of Science and Technology of China, Hefei, 230001, People's Republic of China
- Department of Oncology, Hefei Cancer Hospital, Chinese Academy of Sciences, Hefei, 230031, People's Republic of China
| | - Junjun Li
- Department of Radiology, The First Affiliated Hospital of University of Science and Technology of China, Hefei, 230001, People's Republic of China
| | - Tengfei Wang
- Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, People's Republic of China.
- University of Science and Technology of China, Hefei, 230026, People's Republic of China.
- Department of Oncology, Hefei Cancer Hospital, Chinese Academy of Sciences, Hefei, 230031, People's Republic of China.
| | - Xin Luo
- Yangtze Delta Region Institute (Huzhou) & School of Resources and Environment, University of Electronic Science and Technology of China, Huzhou, Chengdu, 313099, 611731, China
| | - Zhangxiang Zhu
- Department of Radiology, The First Affiliated Hospital of Anhui Medical University, Hefei, 230022, People's Republic of China
| | - Yimou Wang
- Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, People's Republic of China
- University of Science and Technology of China, Hefei, 230026, People's Republic of China
| | - Yong Wang
- Department of Radiology, The First Affiliated Hospital of Wannan Medical College (Yijishan Hospital of Wannan Medical College), Wuhu, 241001, People's Republic of China
| | - Zhenglin Zhang
- Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, People's Republic of China
- University of Science and Technology of China, Hefei, 230026, People's Republic of China
| | - Ruipeng Song
- Department of Hepatobiliary Surgery, Division of Life Sciences and Medicine, Anhui Province Key Laboratory of Hepatopancreatobiliary Surgery, Anhui Provincial Clinical Research Center for Hepatobiliary Diseases, The First Affiliated Hospital of USTC, the University of Science and Technology of China, Hefei, 230001, People's Republic of China
| | - Li-Zhuang Yang
- Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, People's Republic of China
- University of Science and Technology of China, Hefei, 230026, People's Republic of China
- Department of Oncology, Hefei Cancer Hospital, Chinese Academy of Sciences, Hefei, 230031, People's Republic of China
| | - Hongzhi Wang
- Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, People's Republic of China
- University of Science and Technology of China, Hefei, 230026, People's Republic of China
- Department of Oncology, Hefei Cancer Hospital, Chinese Academy of Sciences, Hefei, 230031, People's Republic of China
| | - Stephen T C Wong
- Department of Systems Medicine and Bioengineering, Houston Methodist Cancer Center, Houston Methodist Hospital, Houston, TX, 77030, USA
- Department of Radiology, Weill Cornell Medical College, New York, NY, 10065, United States
| | - Hai Li
- Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, People's Republic of China.
- University of Science and Technology of China, Hefei, 230026, People's Republic of China.
- Department of Oncology, Hefei Cancer Hospital, Chinese Academy of Sciences, Hefei, 230031, People's Republic of China.
| |
|
6
|
Oh K, Heo DW, Mulyadi AW, Jung W, Kang E, Lee KH, Suk HI. A quantitatively interpretable model for Alzheimer's disease prediction using deep counterfactuals. Neuroimage 2025; 309:121077. [PMID: 39954872 DOI: 10.1016/j.neuroimage.2025.121077] [Received: 11/02/2024] [Revised: 01/19/2025] [Accepted: 02/05/2025] [Indexed: 02/17/2025] Open
Abstract
Deep learning (DL) for predicting Alzheimer's disease (AD) has enabled timely intervention in disease progression, yet it still demands careful interpretability to explain how such models reach their decisions. Counterfactual reasoning has recently gained increasing attention in medical research because of its ability to provide a refined visual explanatory map. However, such maps are insufficient when judged by visual inspection alone; their medical or neuroscientific validity needs to be demonstrated via quantitative features. In this study, we synthesize counterfactual-labeled structural MRIs using our proposed framework and transform them into gray matter density maps to measure volumetric changes over each parcellated region of interest (ROI). We also devise a lightweight linear classifier that boosts the effectiveness of the constructed ROIs, promotes quantitative interpretation, and achieves predictive performance comparable to DL methods. Through this process, our framework produces an "AD-relatedness index" for each ROI. It offers an intuitive understanding of brain status for an individual patient and across patient groups concerning AD progression.
Affiliation(s)
- Kwanseok Oh
- Department of Artificial Intelligence, Korea University, Seoul 02841, Republic of Korea
| | - Da-Woon Heo
- Department of Artificial Intelligence, Korea University, Seoul 02841, Republic of Korea
| | - Ahmad Wisnu Mulyadi
- Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
| | - Wonsik Jung
- Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
| | - Eunsong Kang
- Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
| | - Kun Ho Lee
- Gwangju Alzheimer's & Related Dementia Cohort Research Center, Chosun University, Gwangju 61452, Republic of Korea; Department of Biomedical Science and Gwangju Alzheimer's & Related Dementia Cohort Research Center, Chosun University, Gwangju 61452, Republic of Korea; Korea Brain Research Institute, Daegu 41062, Republic of Korea.
| | - Heung-Il Suk
- Department of Artificial Intelligence, Korea University, Seoul 02841, Republic of Korea.
| |
|
7
|
Zhang R, Wang J, Zhai X, Guo Y, Zhou L, Hao X, Yang L, Xing R, Hu J, Gao J, Wang F, Yang J, Liu J. Targeted Detection of 76 Carnitine Indicators Combined with a Machine Learning Algorithm Based on HPLC-MS/MS in the Diagnosis of Rheumatoid Arthritis. Metabolites 2025; 15:205. [PMID: 40137169 PMCID: PMC11944147 DOI: 10.3390/metabo15030205] [Received: 02/25/2025] [Revised: 03/11/2025] [Accepted: 03/12/2025] [Indexed: 03/27/2025] Open
Abstract
BACKGROUND/OBJECTIVES Early diagnosis and treatment of rheumatoid arthritis (RA) are essential to reducing disability. However, the diagnostic criteria remain unclear, relying on clinical symptoms and blood markers. METHODS Using high-performance liquid chromatography-mass spectrometry (HPLC-MS/MS) targeted detection, we evaluated 76 carnitine indicators (55 carnitines and 21 corresponding ratios) in the serum of patients with RA to investigate the role of carnitine in RA. A total of 359 participants (207 patients with RA and 152 healthy controls) were included in the study. Screening involved three methods and integrated 76 carnitine indicators and 128 clinical indicators to identify candidate markers and establish a theoretical basis for RA diagnosis and new therapeutic targets. The diagnostic model derived from the screened markers was validated using three machine learning algorithms. RESULTS The model was refined using eight candidate indicators (C0, C10:1, LYMPH, platelet distribution width, anti-keratin antibody, glucose, urobilinogen, and erythrocyte sedimentation rate (ESR)). The area under the receiver operating characteristic curve, sensitivity, specificity, and accuracy of the V8 model obtained from the training set were >0.948, 79.46%, 92.99%, and 89.18%, whereas those of the test set were >0.925, 78.89%, 89.22%, and 85.87%, respectively. Twenty-four carnitines were identified as risk factors for RA, with three significantly correlating with ESR, four with anti-cyclic citrullinated peptide antibody activity, two with C-reactive protein, five with immunoglobulin-G, eight with immunoglobulin-A levels, and eleven with immunoglobulin-M levels. CONCLUSIONS Carnitine is integral in the progression of RA. The diagnostic model developed shows excellent diagnostic capacity, improving early detection and enabling timely intervention to minimize disability associated with RA.
Affiliation(s)
- Rui Zhang
- Department of Clinical Laboratory Medicine, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi’an 710032, China; (R.Z.); (J.W.); (X.Z.); (Y.G.); (L.Z.); (X.H.); (L.Y.); (R.X.); (J.H.); (J.G.); (F.W.)
| | - Juan Wang
- Department of Clinical Laboratory Medicine, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi’an 710032, China; (R.Z.); (J.W.); (X.Z.); (Y.G.); (L.Z.); (X.H.); (L.Y.); (R.X.); (J.H.); (J.G.); (F.W.)
| | - Xiaonan Zhai
- Department of Clinical Laboratory Medicine, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi’an 710032, China; (R.Z.); (J.W.); (X.Z.); (Y.G.); (L.Z.); (X.H.); (L.Y.); (R.X.); (J.H.); (J.G.); (F.W.)
| | - Yuanbing Guo
- Department of Clinical Laboratory Medicine, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi’an 710032, China; (R.Z.); (J.W.); (X.Z.); (Y.G.); (L.Z.); (X.H.); (L.Y.); (R.X.); (J.H.); (J.G.); (F.W.)
| | - Lei Zhou
- Department of Clinical Laboratory Medicine, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi’an 710032, China; (R.Z.); (J.W.); (X.Z.); (Y.G.); (L.Z.); (X.H.); (L.Y.); (R.X.); (J.H.); (J.G.); (F.W.)
| | - Xiaoyan Hao
- Department of Clinical Laboratory Medicine, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi’an 710032, China; (R.Z.); (J.W.); (X.Z.); (Y.G.); (L.Z.); (X.H.); (L.Y.); (R.X.); (J.H.); (J.G.); (F.W.)
| | - Liu Yang
- Department of Clinical Laboratory Medicine, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi’an 710032, China; (R.Z.); (J.W.); (X.Z.); (Y.G.); (L.Z.); (X.H.); (L.Y.); (R.X.); (J.H.); (J.G.); (F.W.)
| | - Ruiqing Xing
- Department of Clinical Laboratory Medicine, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi’an 710032, China; (R.Z.); (J.W.); (X.Z.); (Y.G.); (L.Z.); (X.H.); (L.Y.); (R.X.); (J.H.); (J.G.); (F.W.)
| | - Juanjuan Hu
- Department of Clinical Laboratory Medicine, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi’an 710032, China; (R.Z.); (J.W.); (X.Z.); (Y.G.); (L.Z.); (X.H.); (L.Y.); (R.X.); (J.H.); (J.G.); (F.W.)
| | - Jiawei Gao
- Department of Clinical Laboratory Medicine, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi’an 710032, China; (R.Z.); (J.W.); (X.Z.); (Y.G.); (L.Z.); (X.H.); (L.Y.); (R.X.); (J.H.); (J.G.); (F.W.)
| | - Fengjuan Wang
- Department of Clinical Laboratory Medicine, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi’an 710032, China; (R.Z.); (J.W.); (X.Z.); (Y.G.); (L.Z.); (X.H.); (L.Y.); (R.X.); (J.H.); (J.G.); (F.W.)
| | - Jun Yang
- Comprehensive Cancer Center, Department of Entomology and Nematology, University of California, Davis, CA 95616, USA
| | - Jiayun Liu
- Department of Clinical Laboratory Medicine, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi’an 710032, China; (R.Z.); (J.W.); (X.Z.); (Y.G.); (L.Z.); (X.H.); (L.Y.); (R.X.); (J.H.); (J.G.); (F.W.)
| |
|
8
|
Du S, Wu Y, Tao J, Shu L, Yan T, Xiao B, Lv S, Ye M, Gong Y, Zhu X, Hu P, Wu M. Development and Validation of Machine Learning Models for Outcome Prediction in Patients with Poor-Grade Aneurysmal Subarachnoid Hemorrhage Following Endovascular Treatment. Ther Clin Risk Manag 2025; 21:293-307. [PMID: 40071129 PMCID: PMC11895686 DOI: 10.2147/tcrm.s504745] [Received: 11/14/2024] [Accepted: 02/25/2025] [Indexed: 03/14/2025] Open
Abstract
Background Endovascular treatment (EVT) has been recommended as a superior modality for the treatment of intracranial aneurysm. However, a considerable proportion of patients with poor-grade aneurysmal subarachnoid hemorrhage (aSAH) undergoing EVT still have a poor functional outcome. It is therefore urgent to investigate the risk factors and develop a clinical decision model for this subgroup of patients. Methods We extracted the target variables from an ongoing registry cohort study, PROSAH-MPC, which was conducted in multiple centers in China. We randomly assigned these patients to training and validation cohorts at a ratio of 7:3. Univariate and multivariate logistic regressions were performed to find the potential factors, and then nine machine learning models and a stack ensemble model were developed with optimized variables. The performance of these models was evaluated through several indicators, including the area under the receiver operating characteristic curve (AUC-ROC). We further used Shapley Additive Explanations (SHAP) to visualize feature contributions in the optimal models. Results A total of 226 eligible patients with poor-grade aSAH undergoing EVT were enrolled, of whom 89 (39.4%) had a poor 12-month outcome. Age (adjusted OR [aOR], 1.08; 95% CI: 1.03-1.13; p = 0.002), subarachnoid hemorrhage volume (aOR, 1.02; 95% CI: 1.00-1.05; p = 0.033), World Federation of Neurosurgical Societies (WFNS) grade (aOR, 2.03; 95% CI: 1.05-3.93; p = 0.035), and Hunt-Hess grade (aOR, 2.36; 95% CI: 1.13-4.93; p = 0.022) were identified as independent risk factors for poor outcome. Among the prediction models, the LightGBM algorithm performed best, with an AUC-ROC of 0.842 in the validation cohort, and the SHAP results showed that age was the most important risk factor affecting functional outcomes. 
Conclusion The LightGBM model holds immense potential in facilitating risk stratification for poor-grade aSAH patients undergoing endovascular treatment who are at risk of adverse outcomes, thereby enhancing clinical decision-making processes. Trial Registration PROSAH-MPC. NCT05738083. Registered 16 November 2022 - Retrospectively registered, https://clinicaltrials.gov/study/NCT05738083.
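To make the reported adjusted odds ratios easier to read, the sketch below shows how a per-unit odds ratio from a logistic regression compounds over several units. It assumes, since the abstract does not state it, that the age aOR of 1.08 is per one-year increase and that the effect is log-linear. The 30% baseline probability is a hypothetical value for illustration, not a figure from the study.

```python
import math


def scale_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units`, given a per-unit odds ratio.

    In logistic regression the log-odds is linear in the predictor,
    so per-unit odds ratios multiply: OR_k = OR_1 ** k.
    """
    return math.exp(math.log(or_per_unit) * units)


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Convert a baseline probability to the probability after applying an OR."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


AGE_AOR_PER_YEAR = 1.08   # from the abstract; per-year unit is an assumption
BASELINE_PROB = 0.30      # hypothetical baseline risk, for illustration only

or_10_years = scale_odds_ratio(AGE_AOR_PER_YEAR, 10)
prob_after_10_years = apply_odds_ratio(BASELINE_PROB, or_10_years)
print(round(or_10_years, 3), round(prob_after_10_years, 3))
```

Under these assumptions, a ten-year age difference roughly doubles the odds of a poor outcome, which helps explain why a seemingly small per-unit odds ratio can still rank age as the most influential feature.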
Affiliation(s)
- Senlin Du
- Department of Neurosurgery, The second Affiliated Hospital, Jiangxi Medical College of Nanchang University, Nanchang, 330006, People’s Republic of China
- Jiangxi Key Laboratory of Neurological Tumors and Cerebrovascular Diseases, Nanchang, 330006, People’s Republic of China
- Jiangxi Health Commission Key Laboratory of Neurological Medicine, Nanchang, 330006, People’s Republic of China
- Institute of Neuroscience, Nanchang University, Nanchang, 330006, People’s Republic of China
| | - Yanze Wu
- Department of Neurosurgery, The second Affiliated Hospital, Jiangxi Medical College of Nanchang University, Nanchang, 330006, People’s Republic of China
- Jiangxi Key Laboratory of Neurological Tumors and Cerebrovascular Diseases, Nanchang, 330006, People’s Republic of China
- Jiangxi Health Commission Key Laboratory of Neurological Medicine, Nanchang, 330006, People’s Republic of China
- Institute of Neuroscience, Nanchang University, Nanchang, 330006, People’s Republic of China
| | - Jiarong Tao
- Department of Neurosurgery, The second Affiliated Hospital, Jiangxi Medical College of Nanchang University, Nanchang, 330006, People’s Republic of China
| | - Lei Shu
- Department of Neurosurgery, The second Affiliated Hospital, Jiangxi Medical College of Nanchang University, Nanchang, 330006, People’s Republic of China
- Jiangxi Key Laboratory of Neurological Tumors and Cerebrovascular Diseases, Nanchang, 330006, People’s Republic of China
- Jiangxi Health Commission Key Laboratory of Neurological Medicine, Nanchang, 330006, People’s Republic of China
- Institute of Neuroscience, Nanchang University, Nanchang, 330006, People’s Republic of China
| | - Tengfeng Yan
- Department of Neurosurgery, The second Affiliated Hospital, Jiangxi Medical College of Nanchang University, Nanchang, 330006, People’s Republic of China
- Jiangxi Key Laboratory of Neurological Tumors and Cerebrovascular Diseases, Nanchang, 330006, People’s Republic of China
- Jiangxi Health Commission Key Laboratory of Neurological Medicine, Nanchang, 330006, People’s Republic of China
- Institute of Neuroscience, Nanchang University, Nanchang, 330006, People’s Republic of China
| | - Bing Xiao
- Department of Neurosurgery, The second Affiliated Hospital, Jiangxi Medical College of Nanchang University, Nanchang, 330006, People’s Republic of China
| | - Shigang Lv
- Department of Neurosurgery, The second Affiliated Hospital, Jiangxi Medical College of Nanchang University, Nanchang, 330006, People’s Republic of China
| | - Minhua Ye
- Department of Neurosurgery, The second Affiliated Hospital, Jiangxi Medical College of Nanchang University, Nanchang, 330006, People’s Republic of China
| | - Yanyan Gong
- Department of Neurosurgery, The second Affiliated Hospital, Jiangxi Medical College of Nanchang University, Nanchang, 330006, People’s Republic of China
| | - Xingen Zhu
- Department of Neurosurgery, The second Affiliated Hospital, Jiangxi Medical College of Nanchang University, Nanchang, 330006, People’s Republic of China
- Jiangxi Key Laboratory of Neurological Tumors and Cerebrovascular Diseases, Nanchang, 330006, People’s Republic of China
- Jiangxi Health Commission Key Laboratory of Neurological Medicine, Nanchang, 330006, People’s Republic of China
- Institute of Neuroscience, Nanchang University, Nanchang, 330006, People’s Republic of China
| | - Ping Hu
- Department of Neurosurgery, The second Affiliated Hospital, Jiangxi Medical College of Nanchang University, Nanchang, 330006, People’s Republic of China
- Department of Neurosurgery, Panzhihua Central Hospital, The second Clinical Medical College of Panzhihua University, Panzhihua, 617067, People’s Republic of China
| | - Miaojing Wu
- Department of Neurosurgery, The second Affiliated Hospital, Jiangxi Medical College of Nanchang University, Nanchang, 330006, People’s Republic of China
| |
|
9
|
Lamens A, Bajorath J. Comparing Explanations of Molecular Machine Learning Models Generated with Different Methods for the Calculation of Shapley Values. Mol Inform 2025; 44:e202500067. [PMID: 40112199 PMCID: PMC11925390 DOI: 10.1002/minf.202500067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2025] [Revised: 03/04/2025] [Accepted: 03/06/2025] [Indexed: 03/22/2025]
Abstract
Feature attribution methods from explainable artificial intelligence (XAI) provide explanations of machine learning models by quantifying feature importance for predictions of test instances. While features determining individual predictions have frequently been identified in machine learning applications, the consistency of feature importance-based explanations of machine learning models using different attribution methods has not been thoroughly investigated. We have systematically compared model explanations in molecular machine learning. Therefore, a test system of highly accurate compound activity predictions for different targets using different machine learning methods was generated. For these predictions, explanations were computed using methodological variants of the Shapley value formalism, a popular feature attribution approach in machine learning adapted from game theory. Predictions of each model were assessed using a model-agnostic and model-specific Shapley value-based method. The resulting feature importance distributions were characterized and compared by a global statistical analysis using diverse measures. Unexpectedly, methodological variants for Shapley value calculations yielded distinct feature importance distributions for highly accurate predictions. There was only little agreement between alternative model explanations. Our findings suggest that feature importance-based explanations of machine learning predictions should include an assessment of consistency using alternative methods.
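As a rough illustration of the Shapley value formalism the abstract refers to, the sketch below computes exact Shapley values by enumerating all feature coalitions. It is not the authors' method: the toy model, the all-zero baseline, and the choice to "remove" a feature by replacing it with its baseline value are all assumptions made for this example. Practical variants differ mainly in how they approximate or define this value function, which is one reason different methods can disagree.

```python
from itertools import combinations
from math import factorial


def shapley_values(n_features, value):
    """Exact Shapley values for a set function value(frozenset) -> float."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(x):
    # Hypothetical model with one interaction term (features 0 and 2).
    return 2.0 * x[0] + 1.0 * x[1] + 3.0 * x[0] * x[2]


def make_value(model, x, baseline):
    # Assumption: features outside the coalition take their baseline value.
    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
        return model(z)

    return value


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(3, make_value(toy_model, x, baseline))
print(phi)
```

In this toy case the attributions sum to the difference between the model output at `x` and at the baseline, and the interaction term is split evenly between features 0 and 2. Enumeration scales exponentially with the number of features, so it is only practical for very small inputs.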
Affiliation(s)
- Alec Lamens
- Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
| | - Jürgen Bajorath
- Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
- Lamarr Institute for Machine Learning and Artificial Intelligence, Rheinische Friedrich-Wilhelms-Universität Bonn, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
| |
|
10
|
Kim TJ, Suh J, Park SH, Kim Y, Ko SB. System for Predicting Neurological Outcomes Following Cardiac Arrest Based on Clinical Predictors Using a Machine Learning Method: The Neurological Outcomes After Cardiac Arrest Method. Neurocrit Care 2025:10.1007/s12028-025-02222-3. [PMID: 39979708 DOI: 10.1007/s12028-025-02222-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/17/2024] [Accepted: 01/21/2025] [Indexed: 02/22/2025]
Abstract
BACKGROUND A multimodal approach may prove effective for predicting clinical outcomes following cardiac arrest (CA). We aimed to develop a practical predictive model that incorporates clinical factors related to CA and multiple prognostic tests using machine learning methods. METHODS The neurological outcomes after CA (NOCA) method for predicting poor outcomes was developed using data from 390 patients with CA between May 2018 and June 2023. The outcome was poor neurological outcome, defined as a Cerebral Performance Category score of 3-5 at discharge. We analyzed 31 variables describing the circumstances at CA, demographics, comorbidities, and prognostic studies. The prognostic method was developed based on an extreme gradient-boosting algorithm with threefold cross-validation and hyperparameter optimization. The performance of the predictive model was evaluated using receiver operating characteristic curve analysis and by calculating the area under the curve (AUC). RESULTS Of the 390 total patients (mean age 64.2 years; 71.3% male), 235 (60.3%) experienced poor outcomes at discharge. We selected variables to predict poor neurological outcomes using least absolute shrinkage and selection operator regression. The Glasgow Coma Scale-M (best motor response), electroencephalographic features, the neurological pupil index, time from CA to return of spontaneous circulation, and brain imaging were found to be key parameters in the NOCA score. The AUC of the NOCA method was 0.965 (95% confidence interval 0.941-0.976). CONCLUSIONS The NOCA score represents a simple method for predicting neurological outcomes, with good performance in patients with CA, using a machine learning analysis that incorporates widely available variables.
Affiliation(s)
- Tae Jung Kim
- Department of Neurology, Seoul National University College of Medicine, Seoul, Korea
- Department of Critical Care Medicine, Seoul National University Hospital, Seoul, Korea
| | - Jungyo Suh
- Department of Urology, Asan Medical Center, Ulsan University College of Medicine, Seoul, Korea
| | - Soo-Hyun Park
- Department of Neurology, Soonchunhyang University Hospital Seoul, Seoul, Korea
| | - Youngjoon Kim
- Department of Neurology, Seoul National University College of Medicine, Seoul, Korea
- Department of Critical Care Medicine, Seoul National University Hospital, Seoul, Korea
| | - Sang-Bae Ko
- Department of Neurology, Seoul National University College of Medicine, Seoul, Korea.
- Department of Critical Care Medicine, Seoul National University Hospital, Seoul, Korea.
| |
|
11
|
Zhang P, Nde J, Eliaz Y, Jennings N, Cieplak P, Cheung MS. CaXML: Chemistry-informed machine learning explains mutual changes between protein conformations and calcium ions in calcium-binding proteins using structural and topological features. Protein Sci 2025; 34:e70023. [PMID: 39865355 PMCID: PMC11761698 DOI: 10.1002/pro.70023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2024] [Revised: 11/28/2024] [Accepted: 12/16/2024] [Indexed: 01/28/2025]
Abstract
Proteins' flexibility is a feature in communicating changes in cell signaling instigated by binding with secondary messengers, such as calcium ions, associated with the coordination of muscle contraction, neurotransmitter release, and gene expression. When binding with the disordered parts of a protein, calcium ions must balance their charge states with the shape of calcium-binding proteins and their versatile pool of partners depending on the circumstances they transmit. Accurately determining the ionic charges of those ions is essential for understanding their role in such processes. However, it is unclear whether the limited experimental data available can be effectively used to train models to accurately predict the charges of calcium-binding protein variants. Here, we developed a chemistry-informed, machine-learning algorithm that implements a game theoretic approach to explain the output of a machine-learning model without the prerequisite of an excessively large database for high-performance prediction of atomic charges. We used the ab initio electronic structure data representing calcium ions and the structures of the disordered segments of calcium-binding peptides with surrounding water molecules to train several explainable models. Network theory was used to extract the topological features of atomic interactions in the structurally complex data dictated by the coordination chemistry of a calcium ion, a potent indicator of its charge state in protein. Our design created a computational tool of CaXML, which provided a framework of explainable machine learning model to annotate ionic charges of calcium ions in calcium-binding proteins in response to the chemical changes in an environment. Our framework will provide new insights into protein design for engineering functionality based on the limited size of scientific data in a genome space.
Affiliation(s)
- Pengzhi Zhang
- Center for Bioinformatics and Computational Biology, Houston Methodist Research Institute, Houston, Texas, USA
| | - Jules Nde
- Department of Physics, University of Washington, Seattle, Washington, USA
| | - Yossi Eliaz
- Department of Physics, University of Houston, Houston, Texas, USA
- Computer Science Department, HIT Holon Institute of Technology, Holon, Israel
| | | | - Piotr Cieplak
- Bioinformatics and Systems Biology, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, California, USA
| | - Margaret S. Cheung
- Department of Physics, University of Washington, Seattle, Washington, USA
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Seattle, Washington, USA
| |
|
12
|
Young J, Inamo J, Caterer Z, Krishna R, Zhang F. CellPhenoX: An eXplainable Cell-specific machine learning method to predict clinical Phenotypes using single-cell multi-omics. bioRxiv 2025:2025.01.24.634132. [PMID: 39975336 PMCID: PMC11838219 DOI: 10.1101/2025.01.24.634132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2025]
Abstract
Single-cell technologies have enhanced our knowledge of molecular and cellular heterogeneity underlying disease. As the scale of single-cell datasets expands, linking cell-level phenotypic alterations with clinical outcomes becomes increasingly challenging. To address this, we introduce CellPhenoX, an eXplainable machine learning method to identify cell-specific phenotypes that influence clinical outcomes. CellPhenoX integrates classification models, explainable AI techniques, and a statistical framework to generate interpretable, cell-specific scores that uncover cell populations associated with relevant clinical phenotypes and interaction effects. We demonstrated the performance of CellPhenoX across diverse single-cell designs, including simulations, binary disease-control comparisons, and multi-class studies. Notably, CellPhenoX identified an activated monocyte phenotype in COVID-19, with expansion correlated with disease severity after adjusting for covariates and interactive effects. It also uncovered an inflammation-associated gradient in fibroblasts from ulcerative colitis. We anticipate that CellPhenoX holds the potential to detect clinically relevant phenotypic changes in single-cell data with multiple sources of variation, paving the way for translating single-cell findings into clinical impact.
Affiliation(s)
- Jade Young
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, USA
| | - Jun Inamo
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, USA
- Department of Medicine Rheumatology, University of Colorado School of Medicine, Aurora, CO, USA
| | - Zachary Caterer
- Interdisciplinary Quantitative Biology PhD Program, BioFrontiers Institute, University of Colorado, Boulder, CO, USA
| | - Revanth Krishna
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, USA
- Department of Medicine Rheumatology, University of Colorado School of Medicine, Aurora, CO, USA
| | - Fan Zhang
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, USA
- Department of Medicine Rheumatology, University of Colorado School of Medicine, Aurora, CO, USA
| |
|
13
|
Vaiyapuri T. Optimizing Hydrogen Production in the Co-Gasification Process: Comparison of Explainable Regression Models Using Shapley Additive Explanations. Entropy (Basel) 2025; 27:83. [PMID: 39851702 PMCID: PMC11765325 DOI: 10.3390/e27010083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2024] [Revised: 01/15/2025] [Accepted: 01/15/2025] [Indexed: 01/26/2025]
Abstract
The co-gasification of biomass and plastic waste offers a promising solution for producing hydrogen-rich syngas, addressing the rising demand for cleaner energy. However, optimizing this complex process to maximize hydrogen yield remains challenging, particularly when balancing diverse feedstocks and improving process efficiency. While machine learning (ML) has shown significant potential in simulating and optimizing such processes, there is no clear consensus on the most effective regression models for co-gasification, especially with limited experimental data. Additionally, the interpretability of these models is a key concern. This study aims to bridge these gaps through two primary objectives: (1) modeling the co-gasification process using seven different ML algorithms, and (2) developing a framework for evaluating model interpretability, ultimately identifying the most suitable model for process optimization. A comprehensive set of experiments was conducted across three key dimensions, generalization ability, predictive accuracy, and interpretability, to thoroughly assess the models. Support Vector Regression (SVR) exhibited superior performance, achieving the highest coefficient of determination (R2) of 0.86. SVR outperformed other models in capturing non-linear dependencies and demonstrated effective overfitting mitigation. This study further highlights the limitations of other ML models, emphasizing the importance of regularization and hyperparameter tuning in improving model stability. By integrating Shapley Additive Explanations (SHAP) into model evaluation, this work is the first to provide detailed insights into feature importance and demonstrate the operational feasibility of ML models for industrial-scale hydrogen production in the co-gasification process. 
The findings contribute to the development of a robust framework for optimizing co-gasification, supporting the advancement of sustainable energy technologies and the reduction of greenhouse gas (GHG) emissions.
Affiliation(s)
- Thavavel Vaiyapuri
- College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al-Kharj 11942, Saudi Arabia
| |
|
14
|
Zhao T, He J, Zhang L, Li H, Duan Q. A multimodal deep-learning model based on multichannel CT radiomics for predicting pathological grade of bladder cancer. Abdom Radiol (NY) 2024:10.1007/s00261-024-04748-0. [PMID: 39690281 DOI: 10.1007/s00261-024-04748-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2024] [Revised: 12/02/2024] [Accepted: 12/03/2024] [Indexed: 12/19/2024]
Abstract
OBJECTIVE To construct a predictive model using deep-learning radiomics and clinical risk factors for assessing the preoperative histopathological grade of bladder cancer from computed tomography (CT) images. METHODS A retrospective analysis was conducted involving 201 bladder cancer patients with definite pathological grading results after surgical excision at the organization between January 2019 and June 2023. The cohort was divided into a training set of 120 cases and a test set of 81 cases. Hand-crafted radiomics (HCR) features and deep-learning (DL) features were extracted from the CT images. A prediction model was built using 12 machine-learning classifiers that integrate HCR features, DL features, and clinical data. Model performance was assessed using decision-curve analysis (DCA), the area under the curve (AUC), and calibration curves. RESULTS Among the classifiers tested, the logistic regression model that combined DL and HCR features performed best, with AUC values of 0.912 (training set) and 0.777 (test set). The AUC values of the clinical model were 0.850 (training set) and 0.804 (test set). The AUC values of the combined model were 0.933 (training set) and 0.824 (test set), outperforming both the clinical and HCR-only models. CONCLUSION The CT-based combined model demonstrated considerable diagnostic capability in differentiating high-grade from low-grade bladder cancer, serving as a valuable noninvasive instrument for preoperative pathological evaluation.
Affiliation(s)
- Ting Zhao
- Department of Radiology, The Affiliated Hospital of Guizhou Medical University, Guizhou, China
- College of Medical Imaging, Guizhou Medical University, Guizhou, China
| | - Jian He
- Department of Radiology, The Affiliated Hospital of Guizhou Medical University, Guizhou, China
- College of Medical Imaging, Guizhou Medical University, Guizhou, China
| | - Licui Zhang
- Department of Radiology, The Affiliated Hospital of Guizhou Medical University, Guizhou, China
- College of Medical Imaging, Guizhou Medical University, Guizhou, China
| | - Hongyang Li
- College of Medical Imaging, Guizhou Medical University, Guizhou, China
| | - Qinghong Duan
- College of Medical Imaging, Guizhou Medical University, Guizhou, China.
- Department of Radiology, The Affiliated Cancer Hospital of Guizhou Medical University, Guizhou, China.
| |
|
15
|
Moguilner S, Baez S, Hernandez H, Migeot J, Legaz A, Gonzalez-Gomez R, Farina FR, Prado P, Cuadros J, Tagliazucchi E, Altschuler F, Maito MA, Godoy ME, Cruzat J, Valdes-Sosa PA, Lopera F, Ochoa-Gómez JF, Hernandez AG, Bonilla-Santos J, Gonzalez-Montealegre RA, Anghinah R, d'Almeida Manfrinati LE, Fittipaldi S, Medel V, Olivares D, Yener GG, Escudero J, Babiloni C, Whelan R, Güntekin B, Yırıkoğulları H, Santamaria-Garcia H, Lucas AF, Huepe D, Di Caterina G, Soto-Añari M, Birba A, Sainz-Ballesteros A, Coronel-Oliveros C, Yigezu A, Herrera E, Abasolo D, Kilborn K, Rubido N, Clark RA, Herzog R, Yerlikaya D, Hu K, Parra MA, Reyes P, García AM, Matallana DL, Avila-Funes JA, Slachevsky A, Behrens MI, Custodio N, Cardona JF, Barttfeld P, Brusco IL, Bruno MA, Sosa Ortiz AL, Pina-Escudero SD, Takada LT, Resende E, Possin KL, de Oliveira MO, Lopez-Valdes A, Lawlor B, Robertson IH, Kosik KS, Duran-Aniotz C, Valcour V, Yokoyama JS, Miller B, Ibanez A. Brain clocks capture diversity and disparities in aging and dementia across geographically diverse populations. Nat Med 2024; 30:3646-3657. [PMID: 39187698 PMCID: PMC11645278 DOI: 10.1038/s41591-024-03209-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 03/22/2024] [Accepted: 07/22/2024] [Indexed: 08/28/2024]
Abstract
Brain clocks, which quantify discrepancies between brain age and chronological age, hold promise for understanding brain health and disease. However, the impact of diversity (including geographical, socioeconomic, sociodemographic, sex and neurodegeneration) on the brain-age gap is unknown. We analyzed datasets from 5,306 participants across 15 countries (7 Latin American and Caribbean countries (LAC) and 8 non-LAC countries). Based on higher-order interactions, we developed a brain-age gap deep learning architecture for functional magnetic resonance imaging (2,953) and electroencephalography (2,353). The datasets comprised healthy controls and individuals with mild cognitive impairment, Alzheimer disease and behavioral variant frontotemporal dementia. LAC models evidenced older brain ages (functional magnetic resonance imaging: mean directional error = 5.60, root mean square error (r.m.s.e.) = 11.91; electroencephalography: mean directional error = 5.34, r.m.s.e. = 9.82) associated with frontoposterior networks compared with non-LAC models. Structural socioeconomic inequality, pollution and health disparities were influential predictors of increased brain-age gaps, especially in LAC (R² = 0.37, F² = 0.59, r.m.s.e. = 6.9). An ascending brain-age gap from healthy controls to mild cognitive impairment to Alzheimer disease was found. In LAC, we observed larger brain-age gaps in females in control and Alzheimer disease groups compared with the respective males. The results were not explained by variations in signal quality, demographics or acquisition methods. These findings provide a quantitative framework capturing the diversity of accelerated brain aging.
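The two error measures quoted in the abstract can be illustrated with a short sketch. It assumes the brain-age gap is defined as predicted brain age minus chronological age, that the mean directional error is the mean of those signed gaps, and that the r.m.s.e. is the root of the mean squared gap. The ages below are invented for illustration and are not data from the study.

```python
from math import sqrt


def brain_age_gap_metrics(chronological, predicted):
    """Return (mean directional error, r.m.s.e.) of predicted vs. actual age."""
    if len(chronological) != len(predicted) or not chronological:
        raise ValueError("inputs must be non-empty and of equal length")
    # Signed gap: positive values mean the brain looks older than the person.
    gaps = [p - c for p, c in zip(predicted, chronological)]
    mde = sum(gaps) / len(gaps)
    rmse = sqrt(sum(g * g for g in gaps) / len(gaps))
    return mde, rmse


# Hypothetical ages in years, for illustration only.
chronological = [60.0, 70.0, 80.0]
predicted = [66.0, 73.0, 86.0]
mde, rmse = brain_age_gap_metrics(chronological, predicted)
print(f"mean directional error = {mde:.2f}, r.m.s.e. = {rmse:.2f}")
```

A positive mean directional error indicates a systematic shift toward older predicted brain ages, while the r.m.s.e. also reflects spread, so it is never smaller than the absolute value of the mean directional error.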
Grants
- R01 AG075775 NIA NIH HHS
- R01AG083799 John E. Fogarty Foundation for Persons with Intellectual and Developmental Disabilities
- 75N95022C00031 NIDA NIH HHS
- P01 AG019724 NIA NIH HHS
- SG-20-725707 Alzheimer's Association
- R01 AG057234 NIA NIH HHS
- R01 AG083799 NIA NIH HHS
- U.S. Department of Health & Human Services | NIH | Fogarty International Center (FIC)
- Latin American Brain Health Institute (BrainLat) # BL-SRGP2020-02 ReDLat [National Institutes of Health and the Fogarty International Center (FIC), National Institutes of Aging (R01 AG057234, R01 AG075775, AG021051, R01AG083799, CARDS-NIH 75N95022C00031), Alzheimer's Association (SG-20-725707), Rainwater Charitable Foundation, The Bluefield project to cure FTD, and Global Brain Health Institute)], ANID/FONDECYT Regular (1210195, 1210176 and 1220995); and ANID/FONDAP/15150012
- National Institute on Aging of the National Institutes of Health (R01AG075775, R01AG083799, 2P01AG019724); ANID (FONDECYT Regular 1210176, 1210195); and DICYT-USACH (032351G_DAS)
Affiliation(s)
- Sebastian Moguilner
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
- Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Sandra Baez
- Universidad de los Andes, Bogota, Colombia
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
| | - Hernan Hernandez
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
| | - Joaquín Migeot
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
| | - Agustina Legaz
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
- Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
| | - Raul Gonzalez-Gomez
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
| | - Francesca R Farina
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
- The University of California Santa Barbara (UCSB), Santa Barbara, CA, USA
| | - Pavel Prado
- Escuela de Fonoaudiología, Universidad San Sebastián, Santiago de Chile, Chile
| | - Jhosmary Cuadros
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
- Grupo de Bioingeniería, Decanato de Investigación, Universidad Nacional Experimental del Táchira, San Cristóbal, Venezuela
- Advanced Center for Electrical and Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
| | - Enzo Tagliazucchi
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
- University of Buenos Aires, Buenos Aires, Argentina
| | - Florencia Altschuler
- Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
| | - Marcelo Adrián Maito
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
- Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
| | - María E Godoy
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
- Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
| | - Josephine Cruzat
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
| | - Pedro A Valdes-Sosa
- The Clinical Hospital of Chengdu Brain Sciences Institute, University of Electronic Sciences and Technology of China, Chengdu, China
- Cuban Neuroscience Center, La Habana, Cuba
| | - Francisco Lopera
- Grupo de Neurociencias de Antioquia (GNA), University of Antioquia, Medellín, Colombia
| | | | - Alfredis Gonzalez Hernandez
- Department of Psychology, Master Program of Clinical Neuropsychology, Universidad Surcolombiana Neiva, Neiva, Colombia
| | | | | | - Renato Anghinah
- Reference Center of Behavioural Disturbances and Dementia, School of Medicine, University of Sao Paulo, Sao Paulo, Brazil
- Traumatic Brain Injury Cognitive Rehabilitation Out-Patient Center, University of Sao Paulo, Sao Paulo, Brazil
| | - Luís E d'Almeida Manfrinati
- Reference Center of Behavioural Disturbances and Dementia, School of Medicine, University of Sao Paulo, Sao Paulo, Brazil
- Traumatic Brain Injury Cognitive Rehabilitation Out-Patient Center, University of Sao Paulo, Sao Paulo, Brazil
| | - Sol Fittipaldi
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
| | - Vicente Medel
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
| | - Daniela Olivares
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
- Center for Social and Cognitive Neuroscience, School of Psychology, Universidad Adolfo Ibáñez, Santiago, Chile
- Neuropsychology and Clinical Neuroscience Laboratory (LANNEC), Physiopathology Program-Institute of Biomedical Sciences (ICBM), Neuroscience and East Neuroscience Departments, University of Chile, Santiago, Chile
- Centro de Neuropsicología Clínica (CNC), Santiago, Chile
| | - Görsev G Yener
- Faculty of Medicine, Izmir University of Economics, Izmir, Turkey
- Brain Dynamics Multidisciplinary Research Center, Dokuz Eylul University, Izmir, Turkey
- Izmir Biomedicine and Genome Center, Izmir, Turkey
| | - Javier Escudero
- School of Engineering, Institute for Imaging, Data and Communications, University of Edinburgh, Edinburgh, UK
| | - Claudio Babiloni
- Department of Physiology and Pharmacology 'V. Erspamer', Sapienza University of Rome, Rome, Italy
- Hospital San Raffaele Cassino, Cassino, Italy
| | - Robert Whelan
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
- School of Psychology, Trinity College Dublin, Dublin, Ireland
| | - Bahar Güntekin
- Department of Neurosciences, Health Sciences Institute, Istanbul Medipol University, İstanbul, Turkey
- Health Sciences and Technology Research Institute (SABITA), Istanbul Medipol University, Istanbul, Turkey
- Department of Biophysics, School of Medicine, Istanbul Medipol University, Istanbul, Turkey
| | - Harun Yırıkoğulları
- Department of Neurosciences, Health Sciences Institute, Istanbul Medipol University, İstanbul, Turkey
- Health Sciences and Technology Research Institute (SABITA), Istanbul Medipol University, Istanbul, Turkey
| | - Hernando Santamaria-Garcia
- Pontificia Universidad Javeriana (PhD Program in Neuroscience), Bogotá, Colombia
- Center of Memory and Cognition Intellectus, Hospital Universitario San Ignacio Bogotá, San Ignacio, Colombia
| | - Alberto Fernández Lucas
- Departamento de Medicina Legal, Psiquiatría y Patología, Universidad Complutense de Madrid, Madrid, Spain
| | - David Huepe
- Center for Social and Cognitive Neuroscience, School of Psychology, Universidad Adolfo Ibáñez, Santiago, Chile
| | - Gaetano Di Caterina
- Department of Electronic and Electrical Engineering, University of Strathclyde, Glasgow, UK
| | | | - Agustina Birba
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
| | | | - Carlos Coronel-Oliveros
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
- Centro Interdisciplinario de Neurociencia de Valparaíso (CINV), Universidad de Valparaíso, Valparaíso, Chile
| | - Amanuel Yigezu
- The University of California Santa Barbara (UCSB), Santa Barbara, CA, USA
| | - Eduar Herrera
- Departamento de Estudios Psicológicos, Universidad ICESI, Cali, Colombia
| | - Daniel Abasolo
- Centre for Biomedical Engineering, School of Mechanical Engineering Sciences, University of Surrey, Guildford, UK
| | - Kerry Kilborn
- School of Psychology, University of Glasgow, Glasgow, UK
| | - Nicolás Rubido
- Institute for Complex Systems and Mathematical Biology, University of Aberdeen, Aberdeen, UK
| | - Ruaridh A Clark
- Centre for Signal and Image Processing, Department of Electronic and Electrical Engineering, University of Strathclyde, Strathclyde, UK
| | - Ruben Herzog
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, Inserm, CNRS, Paris, France
| | - Deniz Yerlikaya
- Department of Neurosciences, Health Sciences Institute, Dokuz Eylül University, Izmir, Turkey
| | - Kun Hu
- Harvard Medical School, Boston, MA, USA
| | - Mario A Parra
- Department of Psychological Sciences and Health, University of Strathclyde, Glasgow, UK
- BrainLat, Universidad Adolfo Ibáñez, Santiago, Chile
| | - Pablo Reyes
- Pontificia Universidad Javeriana (PhD Program in Neuroscience), Bogotá, Colombia
- Center of Memory and Cognition Intellectus, Hospital Universitario San Ignacio Bogotá, San Ignacio, Colombia
| | - Adolfo M García
- Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
- Departamento de Lingüística y Literatura, Universidad de Santiago de Chile, Santiago, Chile
| | - Diana L Matallana
- Pontificia Universidad Javeriana (PhD Program in Neuroscience), Bogotá, Colombia
- Center of Memory and Cognition Intellectus, Hospital Universitario San Ignacio Bogotá, San Ignacio, Colombia
- Mental Health Department, Hospital Universitario Fundación Santa Fe, Bogota, Colombia
| | - José Alberto Avila-Funes
- Department of Geriatrics, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico
| | - Andrea Slachevsky
- Memory and Neuropsychiatric Center (CMYN), Neurology Department, Hospital del Salvador and Faculty of Medicine, University of Chile, Santiago, Chile
- Geroscience Center for Brain Health and Metabolism (GERO), Santiago, Chile
- Neuropsychology and Clinical Neuroscience Laboratory (LANNEC), Physiopathology Program - Institute of Biomedical Sciences (ICBM), Neuroscience and East Neuroscience Departments, University of Chile, Santiago, Chile
| | - María I Behrens
- Neurology and Psychiatry Department, Clínica Alemana-Universidad Desarrollo, Santiago, Chile
- Centro de Investigación Clínica Avanzada (CICA), Universidad de Chile, Santiago, Chile
- Departamento de Neurología y Neurocirugía, Hospital Clínico de la Universidad de Chile, Santiago, Chile
- Departamento de Neurociencia, Universidad de Chile, Santiago, Chile
| | - Nilton Custodio
- Servicio de Neurología, Instituto Peruano de Neurociencias, Lima, Perú
| | - Juan F Cardona
- Facultad de Psicología, Universidad del Valle, Cali, Colombia
| | - Pablo Barttfeld
- Cognitive Science Group, Instituto de Investigaciones Psicológicas (IIPsi), CONICET UNC, Universidad Nacional de Córdoba, Córdoba, Argentina
| | - Ignacio L Brusco
- Centro de Neuropsiquiatría y Neurología de la Conducta (CENECON), Universidad de Buenos Aires (UBA), Buenos Aires, Argentina
| | - Martín A Bruno
- Instituto de Ciencias Biomédicas (ICBM), Universidad Católica de Cuyo, San Juan, Argentina
| | - Ana L Sosa Ortiz
- Instituto Nacional de Neurologia y Neurocirugia MVS, Universidad Nacional Autonoma de Mexico, Mexico City, Mexico
| | - Stefanie D Pina-Escudero
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
- Memory and Aging Center, Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
| | - Leonel T Takada
- Cognitive and Behavioral Neurology Unit, Hospital das Clinicas, University of São Paulo Medical School, São Paulo, Brazil
| | - Elisa Resende
- Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
| | - Katherine L Possin
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
- Memory and Aging Center, Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
| | - Maira Okada de Oliveira
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
- Cognitive and Behavioral Neurology Unit, Hospital das Clinicas, University of São Paulo Medical School, São Paulo, Brazil
| | - Alejandro Lopez-Valdes
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
- School of Engineering, Department of Electrical and Electronic Engineering, Trinity College Dublin, Dublin, Ireland
- Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
- Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
| | - Brian Lawlor
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
| | - Ian H Robertson
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
- Memory and Aging Center, Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
| | - Kenneth S Kosik
- Division of the Biological Sciences, The University of Chicago, Chicago, IL, USA
| | - Claudia Duran-Aniotz
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile
| | - Victor Valcour
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
- Memory and Aging Center, Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
| | - Jennifer S Yokoyama
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
- Memory and Aging Center, Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
| | - Bruce Miller
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland
- Memory and Aging Center, Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
| | - Agustin Ibanez
- Latin American Brain Health Institute, Universidad Adolfo Ibañez, Santiago de Chile, Chile.
- Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina.
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA.
- Global Brain Health Institute (GBHI), Trinity College Dublin, Dublin, Ireland.
|
16
|
Arslan AK, Yagin FH, Algarni A, Karaaslan E, Al-Hashem F, Ardigò LP. Enhancing type 2 diabetes mellitus prediction by integrating metabolomics and tree-based boosting approaches. Front Endocrinol (Lausanne) 2024; 15:1444282. [PMID: 39588339 PMCID: PMC11586166 DOI: 10.3389/fendo.2024.1444282] [Received: 06/05/2024] [Accepted: 10/28/2024] [Indexed: 11/27/2024]
Abstract
Background Type 2 diabetes mellitus (T2DM) is a global health problem characterized by insulin resistance and hyperglycemia. Early detection and accurate prediction of T2DM are crucial for effective management and prevention. This study explores the integration of machine learning (ML) and explainable artificial intelligence (XAI) approaches based on metabolomics panel data to identify biomarkers and develop predictive models for T2DM. Methods Metabolomics data from T2DM (n = 31) and healthy controls (n = 34) were analyzed for biomarker discovery (mostly amino acids, fatty acids, and purines) and T2DM prediction. Feature selection was performed using the least absolute shrinkage and selection operator (LASSO) regression to enhance the model's accuracy and interpretability. Three advanced tree-based ML algorithms (KTBoost: Kernel-Tree Boosting; XGBoost: eXtreme Gradient Boosting; NGBoost: Natural Gradient Boosting) were employed to predict T2DM using these biomarkers. The SHapley Additive exPlanations (SHAP) method was used to explain the effects of metabolomics biomarkers on the prediction of the model. Results The study identified multiple metabolites associated with T2DM, where LASSO feature selection highlighted important biomarkers. KTBoost [Accuracy: 0.938; CI: (0.880-0.997), Sensitivity: 0.971; CI: (0.847-0.999), Area under the Curve (AUC): 0.965; CI: (0.937-0.994)] demonstrated its effectiveness in using complex metabolomics data for T2DM prediction and achieved better performance than other models. According to KTBoost's SHAP, high levels of phenyllactate (pla) and taurine metabolites, as well as low concentrations of cysteine, L-aspartate, and L-cysteate, are strongly associated with the presence of T2DM. Conclusion The integration of metabolomics profiling and XAI offers a promising approach to predicting T2DM. The use of tree-based algorithms, in particular KTBoost, provides a robust framework for analyzing complex datasets and improves the prediction accuracy of T2DM onset. Future research should focus on validating these biomarkers and models in larger, more diverse populations to solidify their clinical utility.
Affiliation(s)
- Ahmet Kadir Arslan
- Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya, Türkiye
| | - Fatma Hilal Yagin
- Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya, Türkiye
| | | | - Erol Karaaslan
- Department of Anesthesiology and Reanimation, Faculty of Medicine, Inonu University, Malatya, Türkiye
| | - Fahaid Al-Hashem
- Department of Physiology, College of Medicine, King Khalid University, Abha, Saudi Arabia
| | - Luca Paolo Ardigò
- Department of Teacher Education, NLA University College, Oslo, Norway
|
17
|
Xiao H, Liang X, Li H, Chen X, Li Y. Trends in the prevalence of osteoporosis and effects of heavy metal exposure using interpretable machine learning. Ecotoxicology and Environmental Safety 2024; 286:117238. [PMID: 39490102 DOI: 10.1016/j.ecoenv.2024.117238] [Received: 03/08/2024] [Revised: 09/30/2024] [Accepted: 10/19/2024] [Indexed: 11/05/2024]
Abstract
There is limited evidence that heavy metals exposure contributes to osteoporosis. Multi-parameter scoring machine learning (ML) techniques were developed using National Health and Nutrition Examination Survey data to predict osteoporosis based on heavy metal exposure levels. To generate an optimal predictive model for osteoporosis, 12 ML models were used. Identification was carried out using the model that performed the best. For interpretation of models, Shapley additive explanation (SHAP) methods and partial dependence plots (PDP) were incorporated into the ML pipeline. By regressing osteoporosis on survey cycles, logistic regression was used to evaluate linear trends in osteoporosis over time. For training and validating the predictive models, 5745 eligible participants were randomly assigned to training and testing sets. The gradient boosting decision tree model performed the best among the predictive models, achieving an accuracy of 89.40 % in the testing set. Based on the model results, the area under the curve and F1 score were 0.88 and 0.39, respectively. In the SHAP analysis, urinary Co, urinary Tu, blood Cd, and urinary Hg levels were identified as the most influential factors for osteoporosis. Urinary Co (0.20-6.10 μg/mg creatinine), urinary Tu (0.06-1.93 μg/mg creatinine), blood Cd (0.07-0.50 μg/L), and urinary Hg (0.06-0.75 μg/mg creatinine) levels displayed a distinctive upward trend with risk of osteoporosis as values increased. Our analysis revealed that urinary Co, urinary Tu, blood Cd, and urinary Hg played a significant role in predictability.
Affiliation(s)
- Hewei Xiao
- Department of Scientific Research, Guangxi Academy of Medical Sciences and the People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, Guangxi, China
| | - Xueyan Liang
- Phase 1 Clinical Trial Laboratory, Guangxi Academy of Medical Sciences and the People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, Guangxi, China
| | - Huijuan Li
- Phase 1 Clinical Trial Laboratory, Guangxi Academy of Medical Sciences and the People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, Guangxi, China; Department of Pharmacy, Guangxi Academy of Medical Sciences and the People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, Guangxi, China
| | - Xiaoyu Chen
- Phase 1 Clinical Trial Laboratory, Guangxi Academy of Medical Sciences and the People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, Guangxi, China; Department of Pharmacy, Guangxi Academy of Medical Sciences and the People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, Guangxi, China.
| | - Yan Li
- Department of Pharmacy, Guangxi Academy of Medical Sciences and the People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, Guangxi, China.
|
18
|
Kuo CH, Liu GT, Lee CE, Wu J, Casimo K, Weaver KE, Lo YC, Chen YY, Huang WC, Ojemann JG. Decoding micro-electrocorticographic signals by using explainable 3D convolutional neural network to predict finger movements. J Neurosci Methods 2024; 411:110251. [PMID: 39151656 DOI: 10.1016/j.jneumeth.2024.110251] [Received: 06/11/2024] [Revised: 08/05/2024] [Accepted: 08/13/2024] [Indexed: 08/19/2024]
Abstract
BACKGROUND Electroencephalography (EEG) and electrocorticography (ECoG) recordings have been used to decode finger movements by analyzing brain activity. Traditional methods focused on single bandpass power changes for movement decoding, utilizing machine learning models requiring manual feature extraction. NEW METHOD This study introduces a 3D convolutional neural network (3D-CNN) model to decode finger movements using ECoG data. The model employs adaptive, explainable AI (xAI) techniques to interpret the physiological relevance of brain signals. ECoG signals from epilepsy patients during awake craniotomy were processed to extract power spectral density across multiple frequency bands. These data formed a 3D matrix used to train the 3D-CNN to predict finger trajectories. RESULTS The 3D-CNN model showed significant accuracy in predicting finger movements, with root-mean-square error (RMSE) values of 0.26-0.38 for single finger movements and 0.20-0.24 for combined movements. Explainable AI techniques, Grad-CAM and SHAP, identified the high gamma (HG) band as crucial for movement prediction, showing specific cortical regions involved in different finger movements. These findings highlighted the physiological significance of the HG band in motor control. COMPARISON WITH EXISTING METHODS The 3D-CNN model outperformed traditional machine learning approaches by effectively capturing spatial and temporal patterns in ECoG data. The use of xAI techniques provided clearer insights into the model's decision-making process, unlike the "black box" nature of standard deep learning models. CONCLUSIONS The proposed 3D-CNN model, combined with xAI methods, enhances the decoding accuracy of finger movements from ECoG data. This approach offers a more efficient and interpretable solution for brain-computer interface (BCI) applications, emphasizing the HG band's role in motor control.
Affiliation(s)
- Chao-Hung Kuo
- Department of Neurosurgery, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan; School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan; Department of Biomedical Engineering, National Yang Ming Chiao Tung University, Taipei, Taiwan; Department of Neurological Surgery, University of Washington, Seattle, WA, USA; The Ph.D. Program for Neural Regenerative Medicine, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan.
| | - Guan-Tze Liu
- Department of Neurosurgery, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan; School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
| | - Chi-En Lee
- Department of Biomedical Engineering, National Yang Ming Chiao Tung University, Taipei, Taiwan
| | - Jing Wu
- Department of Bioengineering, University of Washington, Seattle, WA, USA; Center for Neurotechnology, University of Washington, Seattle, WA, USA
| | - Kaitlyn Casimo
- Center for Neurotechnology, University of Washington, Seattle, WA, USA
| | - Kurt E Weaver
- Center for Neurotechnology, University of Washington, Seattle, WA, USA; Department of Radiology, and Integrated Brain Imaging Center, University of Washington, Seattle, WA, USA
| | - Yu-Chun Lo
- The Ph.D. Program for Neural Regenerative Medicine, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
| | - You-Yin Chen
- Department of Biomedical Engineering, National Yang Ming Chiao Tung University, Taipei, Taiwan; The Ph.D. Program for Neural Regenerative Medicine, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan.
| | - Wen-Cheng Huang
- Department of Neurosurgery, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan; School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
| | - Jeffrey G Ojemann
- Department of Neurological Surgery, University of Washington, Seattle, WA, USA; Center for Neurotechnology, University of Washington, Seattle, WA, USA; Departments of Surgery, Seattle Children's Hospital, Seattle, WA, USA
|
19
|
Zhou H, Lin S, Watson M, Bernadt CT, Zhang O, Liao L, Govindan R, Cote RJ, Yang C. Length-scale study in deep learning prediction for non-small cell lung cancer brain metastasis. Sci Rep 2024; 14:22328. [PMID: 39333630 PMCID: PMC11436900 DOI: 10.1038/s41598-024-73428-2] [Received: 07/01/2024] [Accepted: 09/17/2024] [Indexed: 09/29/2024]
Abstract
Deep learning-assisted digital pathology has demonstrated the potential to profoundly impact clinical practice, even surpassing human pathologists in performance. However, as deep neural network (DNN) architectures grow in size and complexity, their explainability decreases, posing challenges in interpreting pathology features for broader clinical insights into physiological diseases. To better assess the interpretability of digital microscopic images and guide future microscopic system design, we developed a novel method to study the predictive feature length-scale that underpins a DNN's predictive power. We applied this method to analyze a DNN's capability in predicting brain metastasis from early-stage non-small-cell lung cancer biopsy slides. This study quantifies DNN's attention for brain metastasis prediction, targeting features at both the cellular scale and tissue scale in H&E-stained histological whole slide images. At the cellular scale, the predictive power of DNNs progressively increases with higher resolution and significantly decreases when the resolvable feature length exceeds 5 microns. Additionally, DNN uses more macro-scale features associated with tissue architecture and is optimized when assessing visual fields greater than 41 microns. Our study computes the length-scale requirements for optimal DNN learning on digital whole-slide microscopic images, holding the promise to guide future optical microscope designs in pathology applications and facilitating downstream deep learning analysis.
Affiliation(s)
- Haowen Zhou
- Department of Electrical Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
| | - Siyu Lin
- Department of Electrical Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
| | - Mark Watson
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO, 63110, USA
| | - Cory T Bernadt
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO, 63110, USA
| | - Oumeng Zhang
- Department of Electrical Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
| | - Ling Liao
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO, 63110, USA
| | - Ramaswamy Govindan
- Department of Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA
| | - Richard J Cote
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO, 63110, USA
| | - Changhuei Yang
- Department of Electrical Engineering, California Institute of Technology, Pasadena, CA, 91125, USA.
|
20
|
Sasse A, Chikina M, Mostafavi S. Quick and effective approximation of in silico saturation mutagenesis experiments with first-order Taylor expansion. iScience 2024; 27:110807. [PMID: 39286491 PMCID: PMC11404212 DOI: 10.1016/j.isci.2024.110807] [Received: 07/19/2024] [Revised: 08/08/2024] [Accepted: 08/20/2024] [Indexed: 09/19/2024]
Abstract
To understand the decision process of genomic sequence-to-function models, explainable AI algorithms determine the importance of each nucleotide in a given input sequence to the model's predictions and enable discovery of cis-regulatory motifs for gene regulation. The most commonly applied method is in silico saturation mutagenesis (ISM) because its per-nucleotide importance scores can be intuitively understood as the computational counterpart to in vivo saturation mutagenesis experiments. While ISM is highly interpretable, it is computationally challenging to perform for many sequences, and becomes prohibitive as the length of the input sequences and size of the model grows. Here, we use the first-order Taylor approximation to approximate ISM values from the model's gradient, which reduces its computation cost to a single forward pass for an input sequence. We show that the Taylor ISM (TISM) approximation is robust across different model ablations, random initializations, training parameters, and dataset sizes.
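The relationship between ISM and its first-order Taylor approximation can be illustrated with a minimal sketch. This is not the authors' implementation: the toy model (`tanh` of a weighted sum), the weight matrix, and all function names are hypothetical and chosen only so the gradient can be written analytically. The idea shown is that the effect of swapping the reference base at position i for base b is estimated as the gradient entry for (i, b) minus the gradient entry for (i, reference base).

```python
import math

ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element one-hot rows."""
    return [[1.0 if base == b else 0.0 for b in ALPHABET] for base in seq]


def weighted_sum(x, weights):
    """Sum of element-wise products between input and weight matrix."""
    return sum(w * v for wrow, xrow in zip(weights, x) for w, v in zip(wrow, xrow))


def model(x, weights):
    """Hypothetical sequence-to-function model: tanh of a weighted sum."""
    return math.tanh(weighted_sum(x, weights))


def gradient(x, weights):
    """Analytic gradient of the toy model with respect to every input entry."""
    scale = 1.0 - math.tanh(weighted_sum(x, weights)) ** 2
    return [[scale * w for w in wrow] for wrow in weights]


def ism(seq, weights):
    """Exact ISM: one forward pass per single-nucleotide substitution."""
    reference = model(one_hot(seq), weights)
    scores = []
    for i in range(len(seq)):
        row = []
        for b in ALPHABET:
            mutated = seq[:i] + b + seq[i + 1:]
            row.append(model(one_hot(mutated), weights) - reference)
        scores.append(row)
    return scores


def tism(seq, weights):
    """Taylor estimate of ISM from a single gradient evaluation."""
    grad = gradient(one_hot(seq), weights)
    return [
        [grad[i][j] - grad[i][ALPHABET.index(seq[i])] for j in range(len(ALPHABET))]
        for i in range(len(seq))
    ]


if __name__ == "__main__":
    # Small hypothetical weights keep the model close to its linear regime.
    toy_weights = [[0.005 * (i + 1) * (j + 1) for j in range(4)] for i in range(4)]
    for exact_row, approx_row in zip(ism("ACGT", toy_weights), tism("ACGT", toy_weights)):
        print([round(v, 4) for v in exact_row], [round(v, 4) for v in approx_row])
```

The exact loop needs one model evaluation per position and base, while the estimate reuses one gradient for all substitutions. The two agree closely only where the model is nearly linear around the input, which is the regime this toy setup deliberately uses.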
Affiliation(s)
- Alexander Sasse
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, USA
| | - Maria Chikina
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 16354, USA
| | - Sara Mostafavi
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, USA
- Canadian Institute for Advanced Research, Toronto, ON MG5 1ZB, Canada
|
21
|
Chai C, Gibson J, Li P, Pampari A, Patel A, Kundaje A, Wang B. Flexible use of conserved motif vocabularies constrains genome access in cell type evolution. bioRxiv 2024:2024.09.03.611027. [PMID: 39282369 PMCID: PMC11398382 DOI: 10.1101/2024.09.03.611027] [Indexed: 09/25/2024]
Abstract
Cell types evolve into a hierarchy with related types grouped into families. How cell type diversification is constrained by the stable separation between families over vast evolutionary times remains unknown. Here, integrating single-nucleus multiomic sequencing and deep learning, we show that hundreds of sequence features (motifs) divide into distinct sets associated with accessible genomes of specific cell type families. This division is conserved across highly divergent, early-branching animals including flatworms and cnidarians. While specific interactions between motifs delineate cell type relationships within families, surprisingly, these interactions are not conserved between species. Consistently, while deep learning models trained on one species can predict accessibility of other species' sequences, their predictions frequently rely on distinct, but synonymous, motif combinations. We propose that long-term stability of cell type families is maintained through genome access specified by conserved motif sets, or 'vocabularies', whereas cell types diversify through flexible use of motifs within each set.
Affiliation(s)
- Chew Chai
- Department of Bioengineering, Stanford University, Stanford, USA
| | - Jesse Gibson
- Department of Bioengineering, Stanford University, Stanford, USA
| | - Pengyang Li
- Department of Bioengineering, Stanford University, Stanford, USA
| | - Anusri Pampari
- Department of Computer Science, Stanford University, Stanford, USA
| | - Aman Patel
- Department of Computer Science, Stanford University, Stanford, USA
| | - Anshul Kundaje
- Department of Computer Science, Stanford University, Stanford, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, USA
| | - Bo Wang
- Department of Bioengineering, Stanford University, Stanford, USA
- Department of Developmental Biology, Stanford University School of Medicine, Stanford, USA
|
22
|
Lu W, Zhao L, Wang S, Zhang H, Jiang K, Ji J, Chen S, Wang C, Wei C, Zhou R, Wang Z, Li X, Wang F, Wei X, Hou W. Explainable and visualizable machine learning models to predict biochemical recurrence of prostate cancer. Clin Transl Oncol 2024; 26:2369-2379. [PMID: 38602643 DOI: 10.1007/s12094-024-03480-x] [Received: 01/09/2024] [Accepted: 03/23/2024] [Indexed: 04/12/2024]
Abstract
PURPOSE Machine learning (ML) models have shown excellent performance in prognosis prediction. However, the black box characteristic of ML models has limited their clinical application. Here, we aimed to establish explainable and visualizable ML models to predict biochemical recurrence (BCR) of prostate cancer (PCa). MATERIALS AND METHODS A total of 647 PCa patients were retrospectively evaluated. Clinical parameters were identified using LASSO regression. Then, the cohort was split into training and validation datasets with a ratio of 0.75:0.25, and BCR-related features were included in Cox regression and five ML algorithms to construct BCR prediction models. The clinical utility of each model was evaluated by concordance index (C-index) values and decision curve analyses (DCA). In addition, Shapley Additive Explanation (SHAP) values were used to explain the features in the models. RESULTS We identified 11 BCR-related features using LASSO regression and then established five ML-based models, namely random survival forest (RSF), survival support vector machine (SSVM), survival Tree (sTree), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGBoost), together with a Cox regression model; their C-index values were 0.846 (95%CI 0.796-0.894), 0.774 (95%CI 0.712-0.834), 0.757 (95%CI 0.694-0.818), 0.820 (95%CI 0.765-0.869), 0.793 (95%CI 0.735-0.852), and 0.807 (95%CI 0.753-0.858), respectively. The DCA showed that the RSF model had significant advantages over all other models. Regarding the interpretability of the ML models, the SHAP values demonstrated the tangible contribution of each feature in the RSF model. CONCLUSIONS Our score system provides a reference for the identification of BCR and for the crafting of a framework for making therapeutic decisions for PCa on a personalized basis.
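For readers unfamiliar with the concordance index reported above, the following is a minimal sketch of Harrell's C-index in plain Python. It is not the authors' code; the function name and the toy inputs are hypothetical. A pair of patients is treated as comparable when the one with the shorter follow-up time actually had the event, and the pair is concordant when that patient also received the higher predicted risk. Pairs with tied times are skipped in this simplified version.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (simplified sketch).

    times  -- follow-up time per patient
    events -- 1 if the event (e.g. recurrence) was observed, 0 if censored
    risks  -- predicted risk score per patient (higher means earlier event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Patient i must fail strictly before patient j's time,
            # and the failure must be observed, not censored.
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical toy cohort: four patients, the third one censored.
    print(concordance_index([2, 5, 6, 9], [1, 1, 0, 1], [0.9, 0.4, 0.6, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ordering of patients by risk, which is the scale on which the values in the abstract should be read.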
Affiliation(s)
- Wenhao Lu
- Collaborative Innovation Centre of Regenerative Medicine and Medical BioResource Development and Application Co-Constructed By the Province and Ministry, Guangxi Medical University, No. 22, Shuangyong Road, Qingxiu District, Nanning City, 530021, Guangxi Zhuang Autonomous Region, People's Republic of China
- Center for Genomic and Personalized Medicine, Guangxi Key Laboratory for Genomic and Personalized Medicine, Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
- Department of Urology, the First Affiliated Hospital of Guangxi Medical University, Guangxi Medical University, Guangxi, 530021, People's Republic of China
- School of Life Sciences, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
| | - Lin Zhao
- Department of Urology, Shanghai Changhai Hospital, Second Military Medical University, Shanghai, 200433, People's Republic of China
| | - Shenfan Wang
- Department of Urology, Shanghai Changhai Hospital, Second Military Medical University, Shanghai, 200433, People's Republic of China
| | - Huiyong Zhang
- Center for Genomic and Personalized Medicine, Guangxi Key Laboratory for Genomic and Personalized Medicine, Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
- School of Life Sciences, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
| | - Kangxian Jiang
- Department of Urology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, 362000, People's Republic of China
| | - Jin Ji
- Department of Urology, Shanghai Changhai Hospital, Second Military Medical University, Shanghai, 200433, People's Republic of China
| | - Shaohua Chen
- Center for Genomic and Personalized Medicine, Guangxi Key Laboratory for Genomic and Personalized Medicine, Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
- Department of Urology, the First Affiliated Hospital of Guangxi Medical University, Guangxi Medical University, Guangxi, 530021, People's Republic of China
| | - Chengbang Wang
- Center for Genomic and Personalized Medicine, Guangxi Key Laboratory for Genomic and Personalized Medicine, Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
| | - Chunmeng Wei
- Center for Genomic and Personalized Medicine, Guangxi Key Laboratory for Genomic and Personalized Medicine, Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
- Department of Urology, the First Affiliated Hospital of Guangxi Medical University, Guangxi Medical University, Guangxi, 530021, People's Republic of China
- School of Life Sciences, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
| | - Rongbin Zhou
- Collaborative Innovation Centre of Regenerative Medicine and Medical BioResource Development and Application Co-Constructed By the Province and Ministry, Guangxi Medical University, No. 22, Shuangyong Road, Qingxiu District, Nanning City, 530021, Guangxi Zhuang Autonomous Region, People's Republic of China
- Center for Genomic and Personalized Medicine, Guangxi Key Laboratory for Genomic and Personalized Medicine, Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
- School of Life Sciences, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
- Zuheng Wang
- Center for Genomic and Personalized Medicine, Guangxi Key Laboratory for Genomic and Personalized Medicine, Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
- Department of Urology, the First Affiliated Hospital of Guangxi Medical University, Guangxi Medical University, Guangxi, 530021, People's Republic of China
- Xiao Li
- Collaborative Innovation Centre of Regenerative Medicine and Medical BioResource Development and Application Co-Constructed By the Province and Ministry, Guangxi Medical University, No. 22, Shuangyong Road, Qingxiu District, Nanning City, 530021, Guangxi Zhuang Autonomous Region, People's Republic of China
- Center for Genomic and Personalized Medicine, Guangxi Key Laboratory for Genomic and Personalized Medicine, Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
- School of Life Sciences, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China
- Fubo Wang
- Collaborative Innovation Centre of Regenerative Medicine and Medical BioResource Development and Application Co-Constructed By the Province and Ministry, Guangxi Medical University, No. 22, Shuangyong Road, Qingxiu District, Nanning City, 530021, Guangxi Zhuang Autonomous Region, People's Republic of China.
- Center for Genomic and Personalized Medicine, Guangxi Key Laboratory for Genomic and Personalized Medicine, Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China.
- Department of Urology, the First Affiliated Hospital of Guangxi Medical University, Guangxi Medical University, Guangxi, 530021, People's Republic of China.
- School of Life Sciences, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China.
- Xuedong Wei
- Department of Urology, the First Affiliated Hospital of Soochow University, Suzhou, 210000, Jiangsu, People's Republic of China.
- Wenlei Hou
- Information Technology School of Guangxi Police College, Nanning, 530021, Guangxi, People's Republic of China.
23
Banerjee A, Sharma A, Kamble P, Garg P. Prediction of Mycobacterium tuberculosis cell wall permeability using machine learning methods. Mol Divers 2024; 28:2317-2329. [PMID: 39133353 DOI: 10.1007/s11030-024-10952-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Accepted: 07/26/2024] [Indexed: 08/13/2024]
Abstract
Tuberculosis (TB), caused by the bacterium Mycobacterium tuberculosis (M. tb), continues to pose a significant worldwide health threat. The advent of drug-resistant strains of the disease highlights the critical need for novel treatments. The unique cell wall of M. tb provides an extra layer of protection for the bacteria, and hence only compounds that can penetrate this barrier can reach their targets within the bacterial cell wall. This work presents a reliable machine learning (ML) model to predict the mycobacterial cell wall permeability of small molecules. Four ML algorithms, including Random Forest, Support Vector Machines (SVM), k-nearest Neighbour (k-NN), and Logistic Regression, were trained on a dataset of 5368 compounds. RDKit and Mordred toolkits were used to calculate features. To determine the most effective model, various performance metrics were used, such as accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve. The best-performing model was further refined with hyperparameter tuning and tenfold cross-validation. The SVM model with filtering outperformed the other machine learning models and demonstrated 80.26% and 81.13% accuracy on the test and validation datasets, respectively. The study also provided insights into the molecular descriptors that play the most important role in predicting the ability of a molecule to pass the M. tb cell wall, which could guide future compound design. The model is available at https://github.com/PGlab-NIPER/MTB_Permeability.
Affiliation(s)
- Aritra Banerjee
- Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, S. A. S. Nagar, Punjab, 160 062, India
- Anju Sharma
- Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, S. A. S. Nagar, Punjab, 160 062, India
- Pradnya Kamble
- Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, S. A. S. Nagar, Punjab, 160 062, India
- Prabha Garg
- Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, S. A. S. Nagar, Punjab, 160 062, India.
24
Islam MT, Xing L. Deciphering the Feature Representation of Deep Neural Networks for High-Performance AI. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:5273-5287. [PMID: 38373137 PMCID: PMC11296119 DOI: 10.1109/tpami.2024.3363642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
AI driven by deep learning is transforming many aspects of science and technology. The enormous success of deep learning stems from its unique capability of extracting essential features from Big Data for decision-making. However, the feature extraction and hidden representations in deep neural networks (DNNs) remain inexplicable, primarily because of a lack of technical tools to comprehend and interrogate the feature space data. The main hurdle here is that the feature data are often noisy in nature, complex in structure, and huge in size and dimensionality, making it intractable for existing techniques to analyze the data reliably. In this work, we develop a computational framework named contrastive feature analysis (CFA) to facilitate the exploration of the DNN feature space and improve the performance of AI. By utilizing the interaction relations among the features and incorporating a novel data-driven kernel formation strategy into the feature analysis pipeline, CFA mitigates the limitations of traditional approaches and provides an urgently needed solution for the analysis of feature space data. The technique allows feature data exploration in unsupervised, semi-supervised, and supervised formats to address different needs of downstream applications. The potential of CFA and its applications for pruning of neural network architectures are demonstrated using several state-of-the-art networks and well-annotated datasets across different disciplines.
25
Li L, Lu C, Winiwarter W, Tian H, Canadell JG, Ito A, Jain AK, Kou-Giesbrecht S, Pan S, Pan N, Shi H, Sun Q, Vuichard N, Ye S, Zaehle S, Zhu Q. Enhanced nitrous oxide emission factors due to climate change increase the mitigation challenge in the agricultural sector. GLOBAL CHANGE BIOLOGY 2024; 30:e17472. [PMID: 39158113 DOI: 10.1111/gcb.17472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2024] [Revised: 07/17/2024] [Accepted: 07/18/2024] [Indexed: 08/20/2024]
Abstract
Effective nitrogen fertilizer management is crucial for reducing nitrous oxide (N2O) emissions while ensuring food security within planetary boundaries. However, climate change might also interact with management practices to alter N2O emission and emission factors (EFs), adding further uncertainties to estimating mitigation potentials. Here, we developed a new hybrid modeling framework that integrates a machine learning model with an ensemble of eight process-based models to project EFs under different climate and nitrogen policy scenarios. Our findings reveal that EFs are dynamically modulated by environmental changes, including climate, soil properties, and nitrogen management practices. Under low-ambition nitrogen regulation policies, EF would increase from 1.18%-1.22% in 2010 to 1.27%-1.34% by 2050, representing a relative increase of 4.4%-11.4% and exceeding the IPCC tier-1 EF of 1%. This trend is particularly pronounced in tropical and subtropical regions with high nitrogen inputs, where EFs could increase by 0.14%-0.35% (relative increase of 11.9%-17%). In contrast, high-ambition policies have the potential to mitigate the increases in EF caused by climate change, possibly leading to slight decreases in EFs. Furthermore, our results demonstrate that global EFs are expected to continue rising due to warming and regional drying-wetting cycles, even in the absence of changes in nitrogen management practices. This asymmetrical influence of nitrogen fertilizers on EFs, driven by climate change, underscores the urgent need for immediate N2O emission reductions and further assessments of mitigation potentials. This hybrid modeling framework offers a computationally efficient approach to projecting future N2O emissions across various climate, soil, and nitrogen management scenarios, facilitating socio-economic assessments and policy-making efforts.
Affiliation(s)
- Linchao Li
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, Iowa, USA
- Chaoqun Lu
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, Iowa, USA
- Wilfried Winiwarter
- International Institute for Applied Systems Analysis, Laxenburg, Austria
- Institute of Environmental Engineering, University of Zielona Góra, Zielona Góra, Poland
- Hanqin Tian
- Center for Earth System Science and Global Sustainability, Schiller Institute for Integrated Science and Society, Boston College, Chestnut Hill, Massachusetts, USA
- Department of Earth and Environmental Sciences, Boston College, Chestnut Hill, Massachusetts, USA
- Josep G Canadell
- CSIRO Environment, Canberra, Australian Capital Territory, Australia
- Akihiko Ito
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, 113-8657, Japan
- Earth System Division, National Institute for Environmental Studies, Tsukuba, Japan
- Atul K Jain
- Department of Climate, Meteorology, and Atmospheric Sciences, University of Illinois, Urbana-Champaign, Urbana, USA
- Sian Kou-Giesbrecht
- Department of Earth and Environmental Sciences, Dalhousie University, Halifax, Nova Scotia, Canada
- Shufen Pan
- Center for Earth System Science and Global Sustainability, Schiller Institute for Integrated Science and Society, Boston College, Chestnut Hill, Massachusetts, USA
- Department of Engineering and Environmental Studies Program, Boston College, Chestnut Hill, Massachusetts, USA
- Naiqing Pan
- Center for Earth System Science and Global Sustainability, Schiller Institute for Integrated Science and Society, Boston College, Chestnut Hill, Massachusetts, USA
- Hao Shi
- Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, China
- Qing Sun
- Climate and Environmental Physics, Physics Institute and Oeschger Centre for Climate Change Research, University of Bern, Bern, Switzerland
- Nicolas Vuichard
- Laboratoire des Sciences du Climat et de l'Environnement, LSCE, CEA CNRS, UVSQ UPSACLAY, Gif sur Yvette, France
- Shuchao Ye
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, Iowa, USA
- Sönke Zaehle
- Max Planck Institute for Biogeochemistry, Jena, Germany
- Qing Zhu
- Climate and Ecosystem Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
26
Zhu L, Wang Y, Chavas D, Johncox M, Yung YL. Leading role of Saharan dust on tropical cyclone rainfall in the Atlantic Basin. SCIENCE ADVANCES 2024; 10:eadn6106. [PMID: 39047098 PMCID: PMC11268405 DOI: 10.1126/sciadv.adn6106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 06/20/2024] [Indexed: 07/27/2024]
Abstract
Tropical cyclone rainfall (TCR) extensively affects coastal communities, primarily through inland flooding. The impact of global climate changes on TCR is complex and debatable. This study uses an XGBoost machine learning model with 19 years of meteorological data and hourly satellite precipitation observations to predict TCR for individual storms. The model identifies dust optical depth (DOD) as a key predictor that clearly enhances performance. The model also uncovers a nonlinear, boomerang-shaped relationship between Saharan dust and TCR, with a TCR peak at 0.06 DOD and a sharp decrease thereafter. This indicates a shift from microphysical enhancement to radiative suppression at high dust concentrations. The model also highlights meaningful correlations between TCR and meteorological factors like sea surface temperature and equivalent potential temperature near storm cores. These findings illustrate the effectiveness of machine learning in predicting TCR and understanding its driving factors and physical mechanisms.
Affiliation(s)
- Laiyin Zhu
- School of Environment, Geography, and Sustainability, Western Michigan University, Kalamazoo, MI, USA
- Yuan Wang
- Department of Earth System Science, Stanford University, Stanford, CA, USA
- Dan Chavas
- Department of Earth, Atmospheric, and Planetary Sciences, Purdue University, West Lafayette, IN, USA
- Max Johncox
- Department of Atmospheric Science, University of Utah, Salt Lake City, UT, USA
- Yuk L. Yung
- Division of Geological and Planetary Sciences, California Institute of Technology, Pasadena, CA, USA
27
Zhang P, Nde J, Eliaz Y, Jennings N, Cieplak P, Cheung MS. Chemistry-informed Machine Learning Explains Calcium-binding Proteins' Fuzzy Shape for Communicating Changes in the Atomic States of Calcium Ions. ARXIV 2024:arXiv:2407.17017v1. [PMID: 39108291 PMCID: PMC11302678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]
Abstract
Protein fuzziness is a feature for communicating changes in cell signaling instigated by binding with secondary messengers, such as calcium ions, associated with the coordination of muscle contraction, neurotransmitter release, and gene expression. Binding with the disordered parts of a protein, calcium ions must balance their charge states with the shape of calcium-binding proteins and their versatile pool of partners depending on the circumstances they transmit, but it is unclear whether the limited experimental data available can be used to train models to accurately predict the charges of calcium-binding protein variants. Here, we developed a chemistry-informed machine-learning algorithm that implements a game-theoretic approach to explain the output of a machine-learning model without the prerequisite of an excessively large database for high-performance prediction of atomic charges. We used the ab initio electronic structure data representing calcium ions and the structures of the disordered segments of calcium-binding peptides with surrounding water molecules to train several explainable models. Network theory was used to extract the topological features of atomic interactions in the structurally complex data dictated by the coordination chemistry of a calcium ion, a potent indicator of its charge state in a protein. With our designs, we provided an explainable machine learning framework to annotate atomic charges of calcium ions in calcium-binding proteins with domain knowledge in response to the chemical changes in an environment based on the limited size of scientific data in a genome space.
Affiliation(s)
- Pengzhi Zhang
- Center for Bioinformatics and Computational Biology, Houston Methodist Research Institute, Houston, TX, USA
- Jules Nde
- Department of Physics, University of Washington, Seattle, WA, USA
- Yossi Eliaz
- Department of Physics, University of Houston, Houston, TX, USA
- Computer Science Department, HIT Holon Institute of Technology, Holon, Israel
- Piotr Cieplak
- Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Margaret S Cheung
- Department of Physics, University of Washington, Seattle, WA, USA
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, USA
28
Machiraju G, Derry A, Desai A, Guha N, Karimi AH, Zou J, Altman RB, Ré C, Mallick P. Prospector Heads: Generalized Feature Attribution for Large Models & Data. ARXIV 2024:arXiv:2402.11729v2. [PMID: 38947933 PMCID: PMC11213143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Feature attribution, the ability to localize regions of the input data that are relevant for classification, is an important capability for ML models in scientific and biomedical domains. Current methods for feature attribution, which rely on "explaining" the predictions of end-to-end classifiers, suffer from imprecise feature localization and are inadequate for use with small sample sizes and high-dimensional datasets due to computational challenges. We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods that can be applied to any encoder and any data modality. Prospector heads generalize across modalities through experiments on sequences (text), images (pathology), and graphs (protein structures), outperforming baseline attribution methods by up to 26.3 points in mean localization AUPRC. We also demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data. Through their high performance, flexibility, and generalizability, prospectors provide a framework for improving trust and transparency for ML models in complex domains.
Affiliation(s)
- Neel Guha
- Department of Computer Science, Stanford University
- James Zou
- Department of Biomedical Data Science, Stanford University
- Russ B. Altman
- Department of Biomedical Data Science, Stanford University
29
An TF, Zhang ZP, Xue JT, Luo WM, Li Y, Fang ZZ, Zong GW. Interpretable machine learning identifies metabolites associated with glomerular filtration rate in type 2 diabetes patients. Front Endocrinol (Lausanne) 2024; 15:1279034. [PMID: 38915893 PMCID: PMC11194401 DOI: 10.3389/fendo.2024.1279034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 01/17/2024] [Indexed: 06/26/2024]
Abstract
Objective The co-occurrence of kidney disease in patients with type 2 diabetes (T2D) is a major public health challenge. Although early detection and intervention can prevent or slow down the progression, the commonly used estimated glomerular filtration rate (eGFR) based on serum creatinine may be influenced by factors unrelated to kidney function. Therefore, there is a need to identify novel biomarkers that can more accurately assess renal function in T2D patients. In this study, we employed an interpretable machine-learning framework to identify plasma metabolomic features associated with GFR in T2D patients. Methods We retrieved 1626 patients with T2D in Liaoning Medical University First Affiliated Hospital (LMUFAH) as a development cohort and 716 T2D patients in Second Affiliated Hospital of Dalian Medical University (SAHDMU) as an external validation cohort. The metabolite features were screened by orthogonal partial least squares discriminant analysis (OPLS-DA). We compared machine learning prediction methods, including logistic regression (LR), support vector machine (SVM), random forest (RF), and eXtreme Gradient Boosting (XGBoost). Shapley Additive exPlanations (SHAP) were used to explain the optimal model. Results For T2D patients, compared with the normal or elevated eGFR group, glutarylcarnitine (C5DC) and decanoylcarnitine (C10) were significantly elevated in the GFR mild reduction group, and citrulline and 9 acylcarnitines were also elevated significantly (FDR < 0.05, FC > 1.2 and VIP > 1) in the moderate or severe reduction group. The XGBoost model with metabolites had the best performance in the internal validation dataset (AUROC=0.90, AUPRC=0.65, BS=0.064) and the external validation cohort (AUROC=0.970, AUPRC=0.857, BS=0.046). Through the SHAP method, we found that C5DC higher than 0.1 μmol/L, Cit higher than 26 μmol/L, triglyceride higher than 2 mmol/L, age greater than 65 years old, and duration of T2D more than 10 years were associated with reduced GFR. Conclusion Elevated plasma levels of citrulline and a panel of acylcarnitines were associated with reduced GFR in T2D patients, independent of other conventional risk factors.
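Note: several abstracts in this list rely on Shapley additive explanations (SHAP). As a rough illustration of the underlying idea only, and not the pipeline of any cited study, the sketch below computes exact Shapley values for one instance by averaging each feature's marginal contribution over all feature orderings. The function `shapley_values`, the `toy_model` score, and all numeric values are hypothetical and chosen purely for illustration; the SHAP library itself uses faster approximations and model-specific algorithms rather than this brute-force enumeration.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features not yet "revealed" in an ordering keep their baseline value.
    Cost grows factorially with the number of features, so this is only
    practical for a handful of features.
    """
    n = len(instance)
    totals = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = instance[i]       # reveal feature i
            value = predict(current)
            totals[i] += value - previous  # marginal contribution of i
            previous = value
    return [t / factorial(n) for t in totals]


# Hypothetical toy score with one interaction term (illustration only).
def toy_model(x):
    a, b, c = x
    return 2.0 * a + 1.0 * b + 0.5 * a * c


instance = [1.0, 3.0, 2.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)  # [2.5, 3.0, 0.5]
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline (6.0 here), and the interaction term between the first and third features is split equally between them.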
Affiliation(s)
- Tian-Feng An
- Department of Toxicology and Health Inspection and Quarantine, School of Public Health, Tianjin Medical University, Tianjin, China
- Zhi-Peng Zhang
- Department of Toxicology and Health Inspection and Quarantine, School of Public Health, Tianjin Medical University, Tianjin, China
- Jun-Tang Xue
- Department of Surgery, Peking University Third Hospital, Beijing, China
- Wei-Ming Luo
- Department of Toxicology and Health Inspection and Quarantine, School of Public Health, Tianjin Medical University, Tianjin, China
- Yang Li
- Department of Toxicology and Health Inspection and Quarantine, School of Public Health, Tianjin Medical University, Tianjin, China
- Zhong-Ze Fang
- Department of Toxicology and Health Inspection and Quarantine, School of Public Health, Tianjin Medical University, Tianjin, China
- Tianjin Key Laboratory of Environment, Nutrition and Public Health, Tianjin, China
- Guo-Wei Zong
- Department of Mathematics, School of Public Health, Tianjin Medical University, Tianjin, China
30
Zheng L, Cao S, Ding T, Tian J, Sun J. Research on Active Safety Situation of Road Passenger Transportation Enterprises: Evaluation, Prediction, and Analysis. ENTROPY (BASEL, SWITZERLAND) 2024; 26:434. [PMID: 38920443 PMCID: PMC11203358 DOI: 10.3390/e26060434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Revised: 05/16/2024] [Accepted: 05/16/2024] [Indexed: 06/27/2024]
Abstract
The road passenger transportation enterprise is a complex system, requiring a clear understanding of its active safety situation (ASS), trends, and influencing factors. This helps transportation authorities promptly receive signals and take effective measures. Through exploratory factor analysis and confirmatory factor analysis, we delved into potential factors for evaluating ASS and extracted an ASS index. To predict obtaining a higher ASS information rate, we compared multiple time series models, including GRU (gated recurrent unit), LSTM (long short-term memory), ARIMA, Prophet, Conv_LSTM, and TCN (temporal convolutional network). This paper proposed the WDA-DBN (water drop algorithm-Deep Belief Network) model and employed DEEPSHAP to identify factors with higher ASS information content. TCN and GRU performed well in the prediction. Compared to the other models, WDA-DBN exhibited the best performance in terms of MSE and MAE. Overall, deep learning models outperform econometric models in terms of information processing. The total time spent processing alarms positively influences ASS, while variables such as fatigue driving occurrences, abnormal driving occurrences, and nighttime driving alarm occurrences have a negative impact on ASS.
Affiliation(s)
- Lili Zheng
- Transportation College, Jilin University, Changchun 130022, China; (L.Z.); (S.C.); (J.S.)
- Shiyu Cao
- Transportation College, Jilin University, Changchun 130022, China; (L.Z.); (S.C.); (J.S.)
- Tongqiang Ding
- Transportation College, Jilin University, Changchun 130022, China; (L.Z.); (S.C.); (J.S.)
- Jian Tian
- China Academy of Transportation Sciences, Beijing 100029, China;
- Jinghang Sun
- Transportation College, Jilin University, Changchun 130022, China; (L.Z.); (S.C.); (J.S.)
31
Kim H, Son Y, Lee H, Kang J, Hammoodi A, Choi Y, Kim HJ, Lee H, Fond G, Boyer L, Kwon R, Woo S, Yon DK. Machine Learning-Based Prediction of Suicidal Thinking in Adolescents by Derivation and Validation in 3 Independent Worldwide Cohorts: Algorithm Development and Validation Study. J Med Internet Res 2024; 26:e55913. [PMID: 38758578 PMCID: PMC11143390 DOI: 10.2196/55913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 03/24/2024] [Accepted: 03/25/2024] [Indexed: 05/18/2024]
Abstract
BACKGROUND Suicide is the second-leading cause of death among adolescents and is associated with clusters of suicides. Despite numerous studies on this preventable cause of death, the focus has primarily been on single nations and traditional statistical methods. OBJECTIVE This study aims to develop a predictive model for adolescent suicidal thinking using multinational data sets and machine learning (ML). METHODS We used data from the Korea Youth Risk Behavior Web-based Survey with 566,875 adolescents aged between 13 and 18 years and conducted external validation using the Youth Risk Behavior Survey with 103,874 adolescents and Norway's University National General Survey with 19,574 adolescents. Several tree-based ML models were developed, and feature importance and Shapley additive explanations values were analyzed to identify risk factors for adolescent suicidal thinking. RESULTS When trained on the Korea Youth Risk Behavior Web-based Survey data from South Korea with a 95% CI, the XGBoost model reported an area under the receiver operating characteristic (AUROC) curve of 90.06% (95% CI 89.97-90.16), displaying superior performance compared to other models. For external validation using the Youth Risk Behavior Survey data from the United States and the University National General Survey from Norway, the XGBoost model achieved AUROCs of 83.09% and 81.27%, respectively. Across all data sets, XGBoost consistently outperformed the other models with the highest AUROC score, and was selected as the optimal model. In terms of predictors of suicidal thinking, feelings of sadness and despair were the most influential, accounting for 57.4% of the impact, followed by stress status at 19.8%. This was followed by age (5.7%), household income (4%), academic achievement (3.4%), sex (2.1%), and others, which contributed less than 2% each. CONCLUSIONS This study used ML by integrating diverse data sets from 3 countries to address adolescent suicide. 
The findings highlight the important role of emotional health indicators in predicting suicidal thinking among adolescents. Specifically, sadness and despair were identified as the most significant predictors, followed by stressful conditions and age. These findings emphasize the critical need for early diagnosis and prevention of mental health issues during adolescence.
Affiliation(s)
- Hyejun Kim
- Center for Digital Health, Medical Science Research Institute, Kyung Hee University College of Medicine, Seoul, Republic of Korea
- Department of Applied Information Engineering, Yonsei University, Seoul, Republic of Korea
- Yejun Son
- Center for Digital Health, Medical Science Research Institute, Kyung Hee University College of Medicine, Seoul, Republic of Korea
- Department of Precision Medicine, Kyung Hee University College of Medicine, Seoul, Republic of Korea
- Hojae Lee
- Center for Digital Health, Medical Science Research Institute, Kyung Hee University College of Medicine, Seoul, Republic of Korea
- Department of Regulatory Science, Kyung Hee University, Seoul, Republic of Korea
- Jiseung Kang
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, United States
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, MA, United States
- Ahmed Hammoodi
- Department of Business Administration, Kyung Hee University School of Management, Seoul, Republic of Korea
- Yujin Choi
- Center for Digital Health, Medical Science Research Institute, Kyung Hee University College of Medicine, Seoul, Republic of Korea
- Department of Korean Medicine, Kyung Hee University College of Korean Medicine, Seoul, Republic of Korea
- Hyeon Jin Kim
- Center for Digital Health, Medical Science Research Institute, Kyung Hee University College of Medicine, Seoul, Republic of Korea
- Department of Regulatory Science, Kyung Hee University, Seoul, Republic of Korea
- Hayeon Lee
- Center for Digital Health, Medical Science Research Institute, Kyung Hee University College of Medicine, Seoul, Republic of Korea
- Guillaume Fond
- Assistance Publique-Hôpitaux de Marseille (APHM), CEReSS-Health Service Research and Quality of Life Center, Aix-Marseille University, Marseille, France
- Laurent Boyer
- Assistance Publique-Hôpitaux de Marseille (APHM), CEReSS-Health Service Research and Quality of Life Center, Aix-Marseille University, Marseille, France
- Rosie Kwon
- Center for Digital Health, Medical Science Research Institute, Kyung Hee University College of Medicine, Seoul, Republic of Korea
- Selin Woo
- Center for Digital Health, Medical Science Research Institute, Kyung Hee University College of Medicine, Seoul, Republic of Korea
- Dong Keon Yon
- Center for Digital Health, Medical Science Research Institute, Kyung Hee University College of Medicine, Seoul, Republic of Korea
- Department of Precision Medicine, Kyung Hee University College of Medicine, Seoul, Republic of Korea
- Department of Regulatory Science, Kyung Hee University, Seoul, Republic of Korea
- Department of Pediatrics, Kyung Hee University Medical Center, Kyung Hee University College of Medicine, Seoul, Republic of Korea
32
Burton RJ, Raffray L, Moet LM, Cuff SM, White DA, Baker SE, Moser B, O’Donnell VB, Ghazal P, Morgan MP, Artemiou A, Eberl M. Conventional and unconventional T-cell responses contribute to the prediction of clinical outcome and causative bacterial pathogen in sepsis patients. Clin Exp Immunol 2024; 216:293-306. [PMID: 38430552 PMCID: PMC11097916 DOI: 10.1093/cei/uxae019] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/08/2023] [Revised: 02/12/2024] [Accepted: 02/28/2024] [Indexed: 03/04/2024]
Abstract
Sepsis is characterized by a dysfunctional host response to infection culminating in life-threatening organ failure that requires complex patient management and rapid intervention. Timely diagnosis of the underlying cause of sepsis is crucial, and identifying those at risk of complications and death is imperative for triaging treatment and resource allocation. Here, we explored the potential of explainable machine learning models to predict mortality and causative pathogen in sepsis patients. By using a modelling pipeline employing multiple feature selection algorithms, we demonstrate the feasibility of identifying integrative patterns from clinical parameters, plasma biomarkers, and extensive phenotyping of blood immune cells. While no single variable had sufficient predictive power, models that combined five and more features showed a macro area under the curve (AUC) of 0.85 to predict 90-day mortality after sepsis diagnosis, and a macro AUC of 0.86 to discriminate between Gram-positive and Gram-negative bacterial infections. Parameters associated with the cellular immune response contributed the most to models predictive of 90-day mortality, most notably, the proportion of T cells among PBMCs, together with expression of CXCR3 by CD4+ T cells and CD25 by mucosal-associated invariant T (MAIT) cells. Frequencies of Vδ2+ γδ T cells had the most profound impact on the prediction of Gram-negative infections, alongside other T-cell-related variables and total neutrophil count. Overall, our findings highlight the added value of measuring the proportion and activation patterns of conventional and unconventional T cells in the blood of sepsis patients in combination with other immunological, biochemical, and clinical parameters.
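The macro AUC reported above can be read as an unweighted mean of per-class, one-vs-rest AUCs. The following is a minimal sketch under that assumption; it is not the authors' pipeline, and the function names `binary_auc` and `macro_auc` as well as the toy inputs are hypothetical.

```python
from typing import Sequence


def binary_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(prob_rows, labels, classes) -> float:
    """Unweighted mean of one-vs-rest AUCs, one per class.

    prob_rows[i][k] is the predicted probability of classes[k] for sample i.
    """
    aucs = []
    for k, cls in enumerate(classes):
        scores = [row[k] for row in prob_rows]
        one_vs_rest = [1 if y == cls else 0 for y in labels]
        aucs.append(binary_auc(scores, one_vs_rest))
    return sum(aucs) / len(aucs)


# Toy example (hypothetical data, not taken from the study).
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
truth = ["a", "b", "b", "a"]
toy_macro_auc = macro_auc(probs, truth, ["a", "b"])
```

Macro averaging gives every class the same weight regardless of how many samples it has, which matters when outcome classes are imbalanced.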
Affiliation(s)
- Ross J Burton
- Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff, UK
- Adult Critical Care, University Hospital of Wales, Cardiff and Vale University Health Board, Cardiff, UK
| | - Loïc Raffray
- Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff, UK
- Department of Internal Medicine, Félix Guyon University Hospital of La Réunion, Saint Denis, Réunion Island, France
| | - Linda M Moet
- Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff, UK
| | - Simone M Cuff
- Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff, UK
| | - Daniel A White
- Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff, UK
| | - Sarah E Baker
- Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff, UK
| | - Bernhard Moser
- Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff, UK
- Systems Immunity Research Institute, Cardiff University, Cardiff, UK
| | - Valerie B O’Donnell
- Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff, UK
- Systems Immunity Research Institute, Cardiff University, Cardiff, UK
| | - Peter Ghazal
- Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff, UK
- Systems Immunity Research Institute, Cardiff University, Cardiff, UK
| | - Matt P Morgan
- Adult Critical Care, University Hospital of Wales, Cardiff and Vale University Health Board, Cardiff, UK
| | - Andreas Artemiou
- School of Mathematics, Cardiff University, Cardiff, UK
- Department of Information Technologies, University of Limassol, 3025 Limassol, Cyprus
| | - Matthias Eberl
- Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff, UK
- Systems Immunity Research Institute, Cardiff University, Cardiff, UK
|
33
|
Niemczak CE, Montagnese B, Levy J, Fellows AM, Gui J, Leigh SM, Magohe A, Massawe ER, Buckey JC. Machine learning for predicting cognitive deficits using auditory and demographic factors. PLoS One 2024; 19:e0302902. [PMID: 38743715 PMCID: PMC11093307 DOI: 10.1371/journal.pone.0302902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 04/15/2024] [Indexed: 05/16/2024] Open
Abstract
IMPORTANCE Predicting neurocognitive deficits using complex auditory assessments could change how cognitive dysfunction is identified and monitored over time. Detecting cognitive impairment in people living with HIV (PLWH) is important for early intervention, especially in low- to middle-income countries where most cases exist. Auditory tests relate to neurocognitive test results, but the incremental predictive capability beyond demographic factors is unknown. OBJECTIVE Use machine learning to predict neurocognitive deficits, using auditory tests and demographic factors. SETTING The Infectious Disease Center in Dar es Salaam, Tanzania. PARTICIPANTS Participants were 939 Tanzanian individuals from Dar es Salaam living with and without HIV who were part of a longitudinal study. Patients who had only one visit, a positive history of ear drainage, concussion, significant noise or chemical exposure, neurological disease, mental illness, or exposure to ototoxic antibiotics (e.g., gentamicin) or chemotherapy were excluded. This provided 478 participants (349 PLWH, 129 HIV-negative). Participant data were randomized to training and test sets for machine learning. MAIN OUTCOME(S) AND MEASURE(S) The main outcome was whether auditory variables combined with relevant demographic variables could predict neurocognitive dysfunction (defined as a score of <26 on the Kiswahili Montreal Cognitive Assessment) better than demographic factors alone. The performance of predictive machine learning algorithms was primarily evaluated using the area under the receiver operating characteristic curve. Secondary metrics for evaluation included F1 scores, accuracies, and Youden's indices for the algorithms. RESULTS The percentage of individuals with cognitive deficits was 36.2% (139 PLWH and 34 HIV-negative). The Gaussian and kernel naïve Bayes classifiers were the most predictive algorithms for neurocognitive impairment.
Algorithms trained with auditory variables had average area under the curve values of 0.91 and 0.87, F1 scores (metric for precision and recall) of 0.81 and 0.76, and average accuracies of 86.3% and 81.9% respectively. Algorithms trained without auditory variables as features were statistically worse (p < .001) in both the primary measure of area under the curve (0.82/0.78) and the secondary measure of accuracy (72.3%/74.5%) for the Gaussian and kernel algorithms respectively. CONCLUSIONS AND RELEVANCE Auditory variables improved the prediction of cognitive function. Since auditory tests are easy-to-administer and often naturalistic tasks, they may offer objective measures or predictors of neurocognitive performance suitable for many global settings. Further research and development into using machine learning algorithms for predicting cognitive outcomes should be pursued.
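The secondary metrics named in this abstract (accuracy, F1 score, Youden's index) all follow from a binary confusion matrix. A minimal sketch of the standard definitions is given below; the function name `summary_metrics` and the counts in the usage line are hypothetical and are not taken from the study.

```python
def summary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, F1 score, and Youden's J from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. there is at least one
    actual positive, one actual negative, and one predicted positive.
    """
    precision = tp / (tp + fp)      # share of predicted positives that are correct
    recall = tp / (tp + fn)         # sensitivity
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * precision * recall / (precision + recall),
        "youden_j": recall + specificity - 1,
    }


# Hypothetical counts for illustration only.
example_metrics = summary_metrics(tp=40, fp=10, tn=40, fn=10)
```

Youden's J ranges from -1 to 1, with 0 corresponding to a classifier that is no better than chance at a given threshold.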
Affiliation(s)
- Christopher E. Niemczak
- Geisel School of Medicine at Dartmouth, Space Medicine Innovations Laboratory, Lebanon, NH, United States of America
- Dartmouth Health, Department of Medicine, Division of Hyperbaric Medicine, Lebanon, NH, United States of America
| | - Basile Montagnese
- Geisel School of Medicine at Dartmouth, Space Medicine Innovations Laboratory, Lebanon, NH, United States of America
| | - Joshua Levy
- Dartmouth Health, Department of Pathology and Laboratory Medicine, Lebanon, NH, United States of America
- Dartmouth Health, Department of Dermatology, Lebanon, NH, United States of America
- Geisel School of Medicine at Dartmouth, Epidemiology, Lebanon, NH, United States of America
- Geisel School of Medicine at Dartmouth, Program in Quantitative Biomedical Sciences, Lebanon, NH, United States of America
| | - Abigail M. Fellows
- Dartmouth Health, Department of Medicine, Division of Hyperbaric Medicine, Lebanon, NH, United States of America
| | - Jiang Gui
- Geisel School of Medicine at Dartmouth, Program in Quantitative Biomedical Sciences, Lebanon, NH, United States of America
- Geisel School of Medicine at Dartmouth, Biomedical Data Science, Lebanon, NH, United States of America
| | - Samantha M. Leigh
- Dartmouth Health, Department of Medicine, Division of Hyperbaric Medicine, Lebanon, NH, United States of America
| | - Albert Magohe
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
| | - Enica R. Massawe
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
| | - Jay C. Buckey
- Geisel School of Medicine at Dartmouth, Space Medicine Innovations Laboratory, Lebanon, NH, United States of America
- Dartmouth Health, Department of Medicine, Division of Hyperbaric Medicine, Lebanon, NH, United States of America
|
34
|
Li Y, Yang AY, Marelli A, Li Y. MixEHR-SurG: A joint proportional hazard and guided topic model for inferring mortality-associated topics from electronic health records. J Biomed Inform 2024; 153:104638. [PMID: 38631461 DOI: 10.1016/j.jbi.2024.104638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 03/07/2024] [Accepted: 04/03/2024] [Indexed: 04/19/2024]
Abstract
Survival models can help medical practitioners to evaluate the prognostic importance of clinical variables to patient outcomes such as mortality or hospital readmission and subsequently design personalized treatment regimes. Electronic Health Records (EHRs) hold promise for large-scale survival analysis based on systematically recorded clinical features for each patient. However, existing survival models either do not scale to high-dimensional and multi-modal EHR data or are difficult to interpret. In this study, we present a supervised topic model called MixEHR-SurG to simultaneously integrate heterogeneous EHR data and model survival hazard. Our contributions are threefold: (1) integrating EHR topic inference with Cox proportional hazards likelihood; (2) integrating patient-specific topic hyperparameters using the PheCode concepts such that each topic can be identified with exactly one PheCode-associated phenotype; (3) multi-modal survival topic inference. This leads to a highly interpretable survival topic model that can infer PheCode-specific phenotype topics associated with patient mortality. We evaluated MixEHR-SurG using a simulated dataset and two real-world EHR datasets: the Quebec Congenital Heart Disease (CHD) data consisting of 8211 subjects with 75,187 outpatient claim records of 1767 unique ICD codes, and the MIMIC-III data consisting of 1458 subjects with multi-modal EHR records. Compared to the baselines, MixEHR-SurG achieved a superior dynamic AUROC for mortality prediction, with a mean AUROC score of 0.89 in the simulation dataset and a mean AUROC of 0.645 on the CHD dataset. Qualitatively, MixEHR-SurG associates severe cardiac conditions with high mortality risk among the CHD patients after the first heart failure hospitalization and critical brain injuries with increased mortality among the MIMIC-III patients after their ICU discharge.
Together, the integration of the Cox proportional hazards model and EHR topic inference in MixEHR-SurG not only leads to competitive mortality prediction but also meaningful phenotype topics for in-depth survival analysis. The software is available at GitHub: https://github.com/li-lab-mcgill/MixEHR-SurG.
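For readers unfamiliar with the Cox proportional hazards likelihood mentioned in this abstract, the sketch below computes the standard Cox partial log-likelihood (Breslow form for ties). It is a generic illustration and not the MixEHR-SurG code: the function name `cox_partial_loglik` is hypothetical, and the covariate rows stand in for whatever patient-level representation a model uses (in a topic model, plausibly the per-patient topic proportions, which is an assumption here).

```python
import math


def cox_partial_loglik(times, events, covariates, beta) -> float:
    """Cox partial log-likelihood with Breslow handling of tied event times.

    times[i]      : follow-up time of subject i
    events[i]     : 1 if the event (e.g. death) was observed, 0 if censored
    covariates[i] : list of covariate values for subject i
    beta          : list of regression coefficients, same length as a covariate row
    """
    # Linear predictor x_i . beta for every subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loglik += eta[i] - math.log(risk)
    return loglik


# Toy data (hypothetical): three subjects, one covariate, the last one censored.
toy_loglik = cox_partial_loglik(
    times=[1.0, 2.0, 3.0],
    events=[1, 1, 0],
    covariates=[[0.0], [1.0], [0.0]],
    beta=[0.0],
)
```

With all coefficients at zero every subject has the same hazard, so each event contributes minus the log of its risk-set size; for the toy data that is -(log 3 + log 2).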
Affiliation(s)
- Yixuan Li
- Department of Mathematics and Statistics, McGill University, Montreal, Canada; Mila - Quebec AI institute, Montreal, Canada
| | - Archer Y Yang
- Department of Mathematics and Statistics, McGill University, Montreal, Canada; Mila - Quebec AI institute, Montreal, Canada; School of Computer Science, McGill University, Montreal, Canada.
| | - Ariane Marelli
- McGill Adult Unit for Congenital Heart Disease (MAUDE Unit), McGill University of Health Centre, Montreal, Canada.
| | - Yue Li
- Mila - Quebec AI institute, Montreal, Canada; School of Computer Science, McGill University, Montreal, Canada.
|
35
|
Shannon CP, Lee AH, Tebbutt SJ, Singh A. A Commentary on Multi-omics Data Integration in Systems Vaccinology. J Mol Biol 2024; 436:168522. [PMID: 38458605 DOI: 10.1016/j.jmb.2024.168522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 03/04/2024] [Accepted: 03/04/2024] [Indexed: 03/10/2024]
Affiliation(s)
| | - Amy Hy Lee
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, Canada
| | - Scott J Tebbutt
- PROOF Centre of Excellence, Vancouver, Canada; Department of Medicine, The University of British Columbia, Vancouver, Canada; Centre for Heart Lung Innovation, Vancouver, Canada
| | - Amrit Singh
- Centre for Heart Lung Innovation, Vancouver, Canada; Department of Anesthesiology, Pharmacology and Therapeutics, The University of British Columbia, Vancouver, Canada.
|
36
|
Wang TH, Kao CC, Chang TH. Ensemble Machine Learning for Predicting 90-Day Outcomes and Analyzing Risk Factors in Acute Kidney Injury Requiring Dialysis. J Multidiscip Healthc 2024; 17:1589-1602. [PMID: 38628614 PMCID: PMC11020304 DOI: 10.2147/jmdh.s448004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 03/24/2024] [Indexed: 04/19/2024] Open
Abstract
Purpose Our objectives were to (1) employ ensemble machine learning algorithms utilizing real-world clinical data to predict 90-day prognosis, including dialysis dependence and mortality, following the first hospitalized dialysis and (2) identify the significant factors associated with overall outcomes. Patients and Methods We identified hospitalized patients with acute kidney injury requiring dialysis (AKI-D) from a dataset of the Taipei Medical University Clinical Research Database (TMUCRD) from January 2008 to December 2020. The extracted data comprise demographics, comorbidities, medications, and laboratory parameters. Ensemble machine learning models were developed utilizing real-world clinical data through the Google Cloud Platform. Results The study analyzed 1080 patients in the dialysis-dependent module, of whom 616 received regular dialysis after 90 days. Our ensemble model, consisting of 25 feedforward neural network models, demonstrated the best performance with an AUROC of 0.846. We identified the baseline creatinine value, assessed at least 90 days before the initial dialysis, as the most crucial factor. We selected 2358 patients, 984 of whom were deceased after 90 days, for the survival module. The ensemble model, comprising 15 feedforward neural network models and 10 gradient-boosted decision tree models, achieved superior performance with an AUROC of 0.865. The pre-dialysis creatinine value, tested within 90 days prior to the initial dialysis, was identified as the most significant factor. Conclusion Ensemble machine learning models outperform logistic regression models in predicting outcomes of AKI-D, compared to existing literature. Our study, which includes a large sample size from three different hospitals, supports the significance of the creatinine value tested before the first hospitalized dialysis in determining overall prognosis.
Healthcare providers could benefit from utilizing our validated prediction model to improve clinical decision-making and enhance patient care for the high-risk population.
Affiliation(s)
- Tzu-Hao Wang
- Division of General Medicine, Department of Medical Education, Shuang-Ho Hospital, Taipei Medical University, New Taipei City, Taiwan, Republic of China
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan, Republic of China
| | - Chih-Chin Kao
- Division of Nephrology, Department of Internal Medicine, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan, Republic of China
- Division of Nephrology, Department of Internal Medicine, Taipei Medical University Hospital, Taipei, Taiwan, Republic of China
- Taipei Medical University-Research Center of Urology and Kidney (TMU-RCUK), Taipei Medical University, Taipei, Taiwan, Republic of China
| | - Tzu-Hao Chang
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan, Republic of China
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei City, Taiwan, Republic of China
|
37
|
Lin PJ, Li W, Zhai X, Li Z, Sun J, Xu Q, Pan Y, Ji L, Li C. Explainable Deep-Learning Prediction for Brain-Computer Interfaces Supported Lower Extremity Motor Gains Based on Multistate Fusion. IEEE Trans Neural Syst Rehabil Eng 2024; 32:1546-1555. [PMID: 38578854 DOI: 10.1109/tnsre.2024.3384498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2024]
Abstract
Predicting the potential for recovery of motor function in stroke patients who undergo specific rehabilitation treatments is an important and major challenge. Recently, electroencephalography (EEG) has shown potential in helping to determine the relationship between cortical neural activity and motor recovery. EEG recorded in different states could more accurately predict motor recovery than single-state recordings. Here, we design a multi-state (combining eyes closed, EC, and eyes open, EO) fusion neural network for predicting the motor recovery of patients with stroke after EEG-brain-computer-interface (BCI) rehabilitation training and use an explainable deep learning method to identify the most important features of EEG power spectral density and functional connectivity contributing to prediction. The prediction accuracy of the multi-state fusion network was 82%, significantly improved compared with a single-state model. The neural network explanation result demonstrated the important regions and frequency oscillation bands. Specifically, in both states, the power spectral density and functional connectivity features related to motor recovery were located in the frontal, central, and occipital regions. For power spectral density, the delta and alpha bands were related to motor recovery. For functional connectivity, the delta, theta, and alpha bands in the EC state and the delta, theta, and mid-beta bands in the EO state were related to motor recovery. Multi-state fusion neural networks, which combine multiple states of EEG signals into a single network, can increase the accuracy of predicting motor recovery after BCI training, and reveal the underlying mechanisms of motor recovery in brain activity.
|
38
|
Saez-Matia A, Ibarluzea MG, M-Alicante S, Muguruza-Montero A, Nuñez E, Ramis R, Ballesteros OR, Lasa-Goicuria D, Fons C, Gallego M, Casis O, Leonardo A, Bergara A, Villarroel A. MLe-KCNQ2: An Artificial Intelligence Model for the Prognosis of Missense KCNQ2 Gene Variants. Int J Mol Sci 2024; 25:2910. [PMID: 38474157 DOI: 10.3390/ijms25052910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 02/27/2024] [Accepted: 02/29/2024] [Indexed: 03/14/2024] Open
Abstract
Despite the increasing availability of genomic data and enhanced data analysis procedures, predicting the severity of associated diseases remains elusive in the absence of clinical descriptors. To address this challenge, we have focused on the KV7.2 voltage-gated potassium channel gene (KCNQ2), known for its link to developmental delays and various epilepsies, including self-limited benign familial neonatal epilepsy and epileptic encephalopathy. Genome-wide tools often exhibit a tendency to overestimate deleterious mutations, frequently overlooking tolerated variants, and lack the capacity to discriminate variant severity. This study introduces a novel approach by evaluating multiple machine learning (ML) protocols and descriptors. The combination of genomic information with a novel Variant Frequency Index (VFI) builds a robust foundation for constructing reliable gene-specific ML models. The ensemble model, MLe-KCNQ2, formed through logistic regression, support vector machine, random forest and gradient boosting algorithms, achieves specificity and sensitivity values surpassing 0.95 (AUC-ROC > 0.98). The ensemble MLe-KCNQ2 model also categorizes pathogenic mutations as benign or severe, with an area under the receiver operating characteristic curve (AUC-ROC) above 0.67. This study not only presents a transferable methodology for accurately classifying KCNQ2 missense variants, but also provides valuable insights for clinical counseling and aids in the determination of variant severity. The research context emphasizes the necessity of precise variant classification, especially for genes like KCNQ2, contributing to the broader understanding of gene-specific challenges in the field of genomic research. The MLe-KCNQ2 model stands as a promising tool for enhancing clinical decision making and prognosis in the realm of KCNQ2-related pathologies.
Affiliation(s)
| | - Markel G Ibarluzea
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
- Donostia International Physics Center, 20018 Donostia, Spain
| | - Sara M-Alicante
- Instituto Biofisika, CSIC-UPV/EHU, 48940 Leioa, Spain
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
| | | | - Eider Nuñez
- Instituto Biofisika, CSIC-UPV/EHU, 48940 Leioa, Spain
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
| | - Rafael Ramis
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
- Donostia International Physics Center, 20018 Donostia, Spain
| | - Oscar R Ballesteros
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
- Centro de Física de Materiales CFM, CSIC-UPV/EHU, 20018 Donostia, Spain
| | | | - Carmen Fons
- Pediatric Neurology Department, Sant Joan de Déu Hospital, Institut de Recerca Sant Joan de Déu, Barcelona University, 08950 Barcelona, Spain
| | - Mónica Gallego
- Departamento de Fisiología, Universidad del País Vasco, UPV/EHU, 01006 Vitoria-Gasteiz, Spain
| | - Oscar Casis
- Departamento de Fisiología, Universidad del País Vasco, UPV/EHU, 01006 Vitoria-Gasteiz, Spain
| | - Aritz Leonardo
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
- Donostia International Physics Center, 20018 Donostia, Spain
| | - Aitor Bergara
- Physics Department, Universidad del País Vasco, UPV/EHU, 48940 Leioa, Spain
- Donostia International Physics Center, 20018 Donostia, Spain
- Centro de Física de Materiales CFM, CSIC-UPV/EHU, 20018 Donostia, Spain
|
39
|
Hussain I, Jany R. Interpreting Stroke-Impaired Electromyography Patterns through Explainable Artificial Intelligence. Sensors (Basel) 2024; 24:1392. [PMID: 38474928 DOI: 10.3390/s24051392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Revised: 02/17/2024] [Accepted: 02/19/2024] [Indexed: 03/14/2024]
Abstract
Electromyography (EMG) provides invaluable myoelectric manifestations for identifying neuromuscular alterations resulting from ischemic strokes, serving as a potential marker for diagnostics of gait impairments caused by ischemia. This study aims to develop an interpretable machine learning (ML) framework capable of distinguishing between the myoelectric patterns of stroke patients and those of healthy individuals through Explainable Artificial Intelligence (XAI) techniques. The research included 48 stroke patients (average age 70.6 years, 65% male) undergoing treatment at a rehabilitation center, alongside 75 healthy adults (average age 76.3 years, 32% male) as the control group. EMG signals were recorded from wearable devices positioned on the biceps femoris and lateral gastrocnemius muscles of both lower limbs during indoor ground walking in a gait laboratory. Boosting ML techniques were deployed to identify stroke-related gait impairments using EMG gait features. Furthermore, we employed XAI techniques, such as Shapley Additive Explanations (SHAP), Local Interpretable Model-Agnostic Explanations (LIME), and Anchors to interpret the role of EMG variables in the stroke-prediction models. Among the ML models assessed, the GBoost model demonstrated the highest classification performance (AUROC: 0.94) during cross-validation with the training dataset, and it also performed well (AUROC: 0.92, accuracy: 85.26%) when evaluated using the testing EMG dataset. Through SHAP and LIME analyses, the study identified that EMG spectral features contributing to distinguishing the stroke group from the control group were associated with the right biceps femoris and lateral gastrocnemius muscles. This interpretable EMG-based stroke prediction model holds promise as an objective tool for predicting post-stroke gait impairments.
Its potential application could greatly assist in managing post-stroke rehabilitation by providing reliable EMG biomarkers and addressing potential gait impairment in individuals recovering from ischemic stroke.
Affiliation(s)
- Iqram Hussain
- Department of Anesthesiology, Weill Cornell Medicine, Cornell University, New York, NY 10065, USA
| | - Rafsan Jany
- Department of Computer Science and Engineering, Islamic University and Technology (IUT), Gazipur 1704, Bangladesh
|
40
|
Li W, Huang G, Tang N, Lu P, Jiang L, Lv J, Qin Y, Lin Y, Xu F, Lei D. Association between co-exposure to phenols, phthalates, and polycyclic aromatic hydrocarbons with the risk of frailty. Environ Sci Pollut Res Int 2023; 30:105181-105193. [PMID: 37713077 DOI: 10.1007/s11356-023-29887-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Accepted: 09/11/2023] [Indexed: 09/16/2023]
Abstract
The phenomenon of population aging has brought forth the challenge of frailty. Nevertheless, the contribution of environmental exposure to frailty remains ambiguous. Our objective was to investigate the association of phenols, phthalates (PAEs), and polycyclic aromatic hydrocarbons (PAHs) with frailty. We constructed a 48-item frailty index using data from the National Health and Nutrition Examination Survey (NHANES). The exposure levels of 20 organic contaminants were obtained from the survey cycles between 2005 and 2016. The association between individual organic contaminants and the frailty index was assessed using negative binomial regression models. The combined effect of organic contaminants was examined using weighted quantile sum (WQS) regression. Dose-response patterns were modeled using generalized additive models (GAMs). Additionally, an interpretable machine learning approach was employed to develop a predictive model for the frailty index. A total of 1566 participants were included in the analysis. Positive associations were observed between exposure to MIB, P02, ECP, MBP, MHH, MOH, MZP, MC1, and P01 and the frailty index. WQS regression analysis revealed a significant increase in the frailty index with higher levels of the mixture of organic contaminants (aOR, 1.12; 95% CI, 1.05-1.20; p < 0.001), with MIB, ECP, COP, MBP, P02, and P01 identified as the major contributors. Dose-response relationships were observed between MIB, ECP, MBP, P02, and P01 exposure and an increased risk of frailty (all with p < 0.05). The developed predictive model based on organic contaminants exposure demonstrated high performance, with an R² of 0.9634 and 0.9611 in the training and testing sets, respectively. Furthermore, the predictive model suggested potential synergistic effects in the MIB-MBP and P01-P02 pairs. Taken together, these findings suggest a significant association of exposure to phthalates and PAHs with an increased susceptibility to frailty.
Affiliation(s)
- Wenxiang Li
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences, Nanning, 530021, People's Republic of China
| | - Guangyi Huang
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences, Nanning, 530021, People's Republic of China
| | - Ningning Tang
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences, Nanning, 530021, People's Republic of China
| | - Peng Lu
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences, Nanning, 530021, People's Republic of China
| | - Li Jiang
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences, Nanning, 530021, People's Republic of China
| | - Jian Lv
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences, Nanning, 530021, People's Republic of China
| | - Yuanjun Qin
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences, Nanning, 530021, People's Republic of China
| | - Yunru Lin
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences, Nanning, 530021, People's Republic of China
| | - Fan Xu
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences, Nanning, 530021, People's Republic of China
| | - Daizai Lei
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences, Nanning, 530021, People's Republic of China.
- Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region, 6 Taoyuan Road, Qingxiu District, Nanning, 530000, China.
|
41
|
D’Elia D, Truu J, Lahti L, Berland M, Papoutsoglou G, Ceci M, Zomer A, Lopes MB, Ibrahimi E, Gruca A, Nechyporenko A, Frohme M, Klammsteiner T, Pau ECDS, Marcos-Zambrano LJ, Hron K, Pio G, Simeon A, Suharoschi R, Moreno-Indias I, Temko A, Nedyalkova M, Apostol ES, Truică CO, Shigdel R, Telalović JH, Bongcam-Rudloff E, Przymus P, Jordamović NB, Falquet L, Tarazona S, Sampri A, Isola G, Pérez-Serrano D, Trajkovik V, Klucar L, Loncar-Turukalo T, Havulinna AS, Jansen C, Bertelsen RJ, Claesson MJ. Advancing microbiome research with machine learning: key findings from the ML4Microbiome COST action. Front Microbiol 2023; 14:1257002. [PMID: 37808321 PMCID: PMC10558209 DOI: 10.3389/fmicb.2023.1257002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 09/05/2023] [Indexed: 10/10/2023]
Abstract
The rapid development of machine learning (ML) techniques has opened up the data-dense field of microbiome research for novel therapeutic, diagnostic, and prognostic applications targeting a wide range of disorders, which could substantially improve healthcare practices in the era of precision medicine. However, several challenges must be addressed to fully exploit the benefits of ML in this field. In particular, there is a need to establish "gold standard" protocols for conducting ML analysis experiments and to improve interactions between microbiome researchers and ML experts. The Machine Learning Techniques in Human Microbiome Studies (ML4Microbiome) COST Action CA18131 is a European network established in 2019 to promote collaboration between discovery-oriented microbiome researchers and data-driven ML experts to optimize and standardize ML approaches for microbiome analysis. This perspective paper presents the key achievements of ML4Microbiome, which include identifying predictive and discriminatory 'omics' features, improving repeatability and comparability, developing automation procedures, and defining priority areas for the novel development of ML methods targeting the microbiome. The insights gained from ML4Microbiome will help to maximize the potential of ML in microbiome research and pave the way for new and improved healthcare practices.
Affiliation(s)
- Domenica D’Elia
- Department of Biomedical Sciences, National Research Council, Institute for Biomedical Technologies, Bari, Italy
| | - Jaak Truu
- Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | - Leo Lahti
- Department of Computing, University of Turku, Turku, Finland
| | - Magali Berland
- Université Paris-Saclay, INRAE, MetaGenoPolis, Jouy-en-Josas, France
| | - Georgios Papoutsoglou
- JADBio Gnosis DA S.A., Science and Technology Park of Crete, Heraklion, Greece
- Department of Computer Science, University of Crete, Heraklion, Greece
| | - Michelangelo Ceci
- Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
| | - Aldert Zomer
- Department of Biomolecular Health Sciences (Infectious Diseases and Immunology), Faculty of Veterinary Medicine, Utrecht University, Utrecht, Netherlands
| | - Marta B. Lopes
- Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology, Caparica, Portugal
- UNIDEMI, Department of Mechanical and Industrial Engineering, NOVA School of Science and Technology, Caparica, Portugal
| | - Eliana Ibrahimi
- Department of Biology, University of Tirana, Tirana, Albania
| | - Aleksandra Gruca
- Department of Computer Networks and Systems, Silesian University of Technology, Gliwice, Poland
| | - Alina Nechyporenko
- Systems Engineering Department, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine
- Department of Molecular Biotechnology and Functional Genomics, Technical University of Applied Sciences Wildau, Wildau, Germany
| | - Marcus Frohme
- Department of Molecular Biotechnology and Functional Genomics, Technical University of Applied Sciences Wildau, Wildau, Germany
| | - Thomas Klammsteiner
- Department of Microbiology, Universität Innsbruck, Innsbruck, Austria
- Department of Ecology, Universität Innsbruck, Innsbruck, Austria
| | - Enrique Carrillo-de Santa Pau
- Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, CEI UAM+CSIC, Madrid, Spain
| | - Laura Judith Marcos-Zambrano
- Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, CEI UAM+CSIC, Madrid, Spain
| | - Karel Hron
- Department of Mathematical Analysis and Applications of Mathematics, Faculty of Science, Palacký University, Olomouc, Czechia
| | - Gianvito Pio
- Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
| | - Andrea Simeon
- BioSense Institute, University of Novi Sad, Novi Sad, Serbia
| | - Ramona Suharoschi
- Molecular Nutrition and Proteomics Research Laboratory, Department of Food Science, University of Agricultural Sciences and Veterinary Medicine of Cluj-Napoca, Cluj-Napoca, Romania
| | - Isabel Moreno-Indias
- Department of Endocrinology and Nutrition, Virgen de la Victoria University Hospital, the Biomedical Research Institute of Malaga and Platform in Nanomedicine (IBIMA-BIONAND Platform), University of Malaga, Malaga, Spain
| | - Andriy Temko
- Department of Electrical and Electronic Engineering, University College Cork, Cork, Ireland
| | | | - Elena-Simona Apostol
- Computer Science and Engineering Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, Bucharest, Romania
| | - Ciprian-Octavian Truică
- Computer Science and Engineering Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, Bucharest, Romania
| | - Rajesh Shigdel
- Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Jasminka Hasić Telalović
- Department of Computer Science, University Sarajevo School of Science and Technology, Sarajevo, Bosnia and Herzegovina
| | - Erik Bongcam-Rudloff
- Swedish University of Agricultural Sciences, Department of Animal Breeding and Genetics, Uppsala, Sweden
| | | | - Naida Babić Jordamović
- Computational Biology, International Centre for Genetic Engineering and Biotechnology, Trieste, Italy
- Verlab Research Institute for Biomedical Engineering, Medical Devices and Artificial Intelligence, Sarajevo, Bosnia and Herzegovina
| | - Laurent Falquet
- University of Fribourg and Swiss Institute of Bioinformatics, Fribourg, Switzerland
| | - Sonia Tarazona
- Department of Applied Statistics and Operations Research and Quality, Universitat Politècnica de València, València, Spain
| | - Alexia Sampri
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, United Kingdom
| | - Gaetano Isola
- Department of General Surgery and Surgical-Medical Specialties, School of Dentistry, University of Catania, Catania, Italy
| | - David Pérez-Serrano
- Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, CEI UAM+CSIC, Madrid, Spain
| | | | - Lubos Klucar
- Institute of Molecular Biology, Slovak Academy of Sciences, Bratislava, Slovakia
| | | | - Aki S. Havulinna
- Finnish Institute for Health and Welfare, Helsinki, Finland
- Institute for Molecular Medicine Finland, FIMM-HiLIFE, Helsinki, Finland
| | - Christian Jansen
- Biome Diagnostics GmbH, Vienna, Austria
- Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria
|
42
|
Li W, Huang G, Tang N, Lu P, Jiang L, Lv J, Qin Y, Lin Y, Xu F, Lei D. Effects of heavy metal exposure on hypertension: A machine learning modeling approach. Chemosphere 2023; 337:139435. [PMID: 37422210 DOI: 10.1016/j.chemosphere.2023.139435] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/11/2023] [Revised: 07/04/2023] [Accepted: 07/05/2023] [Indexed: 07/10/2023]
Abstract
Heavy metal exposure is a common risk factor for hypertension. To develop an interpretable predictive machine learning (ML) model for hypertension based on levels of heavy metal exposure, data from the NHANES (2003-2016) were employed. Random forest (RF), support vector machine (SVM), decision tree (DT), multilayer perceptron (MLP), ridge regression (RR), AdaBoost (AB), gradient boosting decision tree (GBDT), voting classifier (VC), and K-nearest neighbour (KNN) algorithms were utilized to generate an optimal predictive model for hypertension. Three interpretable methods, the permutation feature importance analysis, partial dependence plot (PDP), and Shapley additive explanations (SHAP) methods, were integrated into a pipeline and embedded in ML for model interpretation. A total of 9005 eligible individuals were randomly allocated into two distinct sets for predictive model training and validation. The results showed that among the predictive models, the RF model demonstrated the highest performance, achieving an accuracy rate of 77.40% in the validation set. The AUC and F1 score for the model were 0.84 and 0.76, respectively. Blood Pb, urinary Cd, urinary Tl, and urinary Co levels were identified as the main influencers of hypertension, and their contribution weights were 0.0504 ± 0.0482, 0.0389 ± 0.0256, 0.0307 ± 0.0179, and 0.0296 ± 0.0162, respectively. Blood Pb (0.55-2.93 μg/dL) and urinary Cd (0.06-0.15 μg/L) levels exhibited the most pronounced upwards trend with the risk of hypertension within a specific value range, while urinary Tl (0.06-0.26 μg/L) and urinary Co (0.02-0.32 μg/L) levels demonstrated a declining trend with hypertension. The findings on the synergistic effects indicated that Pb and Cd were the primary determinants of hypertension. Our findings underscore the predictive value of heavy metals for hypertension. By utilizing interpretable methods, we discerned that Pb, Cd, Tl, and Co emerged as noteworthy contributors within the predictive model.
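The abstract above reports permutation feature importance as a mean ± spread per feature. As a rough illustration of how such figures can arise, here is a minimal pure-Python sketch. It is not the authors' pipeline: the toy data, the `toy_model` rule, and the helper names `accuracy` and `permutation_importance` are all assumptions made up for this example. Each feature column is shuffled several times, and the mean and standard deviation of the resulting drop in accuracy are recorded.

```python
import random
from statistics import mean, stdev


def accuracy(predict, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Return one (mean drop, std of drop) pair per feature column.

    The drop is baseline accuracy minus accuracy after shuffling that column.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    result = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild every row with only this column replaced by a shuffled value.
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        result.append((mean(drops), stdev(drops)))
    return result


# Hypothetical toy data: feature 0 determines the label, feature 1 is pure noise.
data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [int(r[0] > 0.5) for r in rows]


def toy_model(row):
    """Stand-in for a fitted classifier; it only looks at feature 0."""
    return int(row[0] > 0.5)


importances = permutation_importance(toy_model, rows, labels)
```

In this toy setting the informative feature shows a large mean drop, while the noise feature shows none, because the stand-in model never reads it. A real analysis would typically use a fitted model and a held-out set rather than a hand-written rule.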
Affiliation(s)
- Wenxiang Li
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, 530021, China.
| | - Guangyi Huang
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, 530021, China
| | - Ningning Tang
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, 530021, China
| | - Peng Lu
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, 530021, China
| | - Li Jiang
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, 530021, China
| | - Jian Lv
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, 530021, China
| | - Yuanjun Qin
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, 530021, China
| | - Yunru Lin
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, 530021, China
| | - Fan Xu
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, 530021, China.
| | - Daizai Lei
- Department of Ophthalmology, the People's Hospital of Guangxi Zhuang Autonomous Region & Institute of Ophthalmic Diseases, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, 530021, China.
|
43
|
Bhat M, Rabindranath M, Chara BS, Simonetto DA. Artificial intelligence, machine learning, and deep learning in liver transplantation. J Hepatol 2023; 78:1216-1233. [PMID: 37208107 DOI: 10.1016/j.jhep.2023.01.006] [Citation(s) in RCA: 71] [Impact Index Per Article: 35.5] [Received: 10/15/2022] [Revised: 01/11/2023] [Accepted: 01/16/2023] [Indexed: 05/21/2023]
Abstract
Liver transplantation (LT) is a life-saving treatment for individuals with end-stage liver disease. The management of LT recipients is complex, predominantly because of the need to consider demographic, clinical, laboratory, pathology, imaging, and omics data in the development of an appropriate treatment plan. Current methods to collate clinical information are susceptible to some degree of subjectivity; thus, clinical decision-making in LT could benefit from the data-driven approach offered by artificial intelligence (AI). Machine learning and deep learning could be applied in both the pre- and post-LT settings. Some examples of AI applications pre-transplant include optimising transplant candidacy decision-making and donor-recipient matching to reduce waitlist mortality and improve post-transplant outcomes. In the post-LT setting, AI could help guide the management of LT recipients, particularly by predicting patient and graft survival, along with identifying risk factors for disease recurrence and other associated complications. Although AI shows promise in medicine, there are limitations to its clinical deployment which include dataset imbalances for model training, data privacy issues, and a lack of available research practices to benchmark model performance in the real world. Overall, AI tools have the potential to enhance personalised clinical decision-making, especially in the context of liver transplant medicine.
Affiliation(s)
- Mamatha Bhat
- Ajmera Transplant Program, University Health Network, Toronto, ON, Canada; Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada; Institute of Medical Science, University of Toronto, Toronto, ON, Canada; Division of Gastroenterology & Hepatology, Department of Medicine, University of Toronto, Toronto, ON, Canada.
| | - Madhumitha Rabindranath
- Ajmera Transplant Program, University Health Network, Toronto, ON, Canada; Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada; Institute of Medical Science, University of Toronto, Toronto, ON, Canada
| | - Beatriz Sordi Chara
- Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota, USA
| | - Douglas A Simonetto
- Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota, USA
|
44
|
Wang K, Gu L, Liu W, Xu C, Yin C, Liu H, Rong L, Li W, Wei X. The predictors of death within 1 year in acute ischemic stroke patients based on machine learning. Front Neurol 2023; 14:1092534. [PMID: 36908612 PMCID: PMC9998042 DOI: 10.3389/fneur.2023.1092534] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/20/2022] [Accepted: 02/02/2023] [Indexed: 02/25/2023]
Abstract
OBJECTIVE To explore the predictors of death in acute ischemic stroke (AIS) patients within 1 year based on machine learning (ML) algorithms. METHODS This study retrospectively analyzed the clinical data of patients hospitalized and diagnosed with AIS in the Second Affiliated Hospital of Xuzhou Medical University between August 2017 and July 2019. The patients were randomly divided into training and validation sets at a ratio of 7:3, and the clinical characteristic variables of the patients were screened using univariate and multivariate logistic regression. Six ML algorithms, including logistic regression (LR), gradient boosting machine (GBM), extreme gradient boosting (XGB), random forest (RF), decision tree (DT), and naive Bayes classifier (NBC), were applied to develop models to predict death in AIS patients within 1 year. During training, a 10-fold cross-validation approach was used to validate the training set internally, and the models were interpreted using importance ranking and the SHapley Additive exPlanations (SHAP) principle. The validation set was used to externally validate the models. Ultimately, the highest-performing model was selected to build a web-based calculator. RESULTS Multivariate logistic regression analysis revealed that C-reactive protein (CRP), homocysteine (HCY) levels, stroke severity (SS), and the number of stroke lesions (NOS) were independent risk factors for death within 1 year in patients with AIS. The area under the curve value of the XGB model was 0.846, which was the highest among the six ML algorithms. Therefore, we built an ML network calculator (https://mlmedicine-de-stroke-de-stroke-m5pijk.streamlitapp.com/) based on XGB to predict death in AIS patients within 1 year. CONCLUSIONS The network calculator based on the XGB model developed in this study can help clinicians make more personalized and rational clinical decisions.
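The data partitioning described in the abstract above (a random 7:3 split followed by 10-fold cross-validation within the training set) can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions: the cohort size of 1000 is hypothetical (the abstract does not give it), and the helper names `train_validation_split` and `k_fold` are invented for this example rather than taken from the study.

```python
import random


def train_validation_split(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and validation lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(n_samples * train_fraction)
    return indices[:cut], indices[cut:]


def k_fold(indices, k=10):
    """Yield (fit, held_out) index lists; every index is held out exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, held_out


# Hypothetical cohort size, chosen only to make the arithmetic easy to follow.
train_idx, val_idx = train_validation_split(n_samples=1000)
cv_folds = list(k_fold(train_idx, k=10))
```

Each model would be fitted on the `fit` indices and scored on the `held_out` indices of every fold, with `val_idx` kept aside until a final model has been chosen.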
Affiliation(s)
- Kai Wang
- Department of Neurology, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
- Key Laboratory of Neurological Diseases, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
| | - Longyuan Gu
- Department of Neurosurgery, The Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
| | - Wencai Liu
- Department of Orthopaedic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, China
| | - Chan Xu
- Department of Dermatology, Xianyang Central Hospital, Xianyang, China
| | - Chengliang Yin
- Faculty of Medicine, Macau University of Science and Technology, Taipa, Macao SAR, China
| | - Haiyan Liu
- Department of Neurology, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
- Key Laboratory of Neurological Diseases, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
| | - Liangqun Rong
- Department of Neurology, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
- Key Laboratory of Neurological Diseases, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
| | - Wenle Li
- Key Laboratory of Neurological Diseases, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
- The State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics and Center for Molecular Imaging and Translational Medicine, School of Public Health, Xiamen University, Xiamen, China
| | - Xiu'e Wei
- Department of Neurology, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
- Key Laboratory of Neurological Diseases, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
|
45
|
Korfmann K, Gaggiotti OE, Fumagalli M. Deep Learning in Population Genetics. Genome Biol Evol 2023; 15:evad008. [PMID: 36683406 PMCID: PMC9897193 DOI: 10.1093/gbe/evad008] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 10/13/2022] [Revised: 12/19/2022] [Accepted: 01/16/2023] [Indexed: 01/24/2023]
Abstract
Population genetics is transitioning into a data-driven discipline thanks to the availability of large-scale genomic data and the need to study increasingly complex evolutionary scenarios. With likelihood and Bayesian approaches becoming either intractable or computationally unfeasible, machine learning, and in particular deep learning, algorithms are emerging as popular techniques for population genetic inferences. These approaches rely on algorithms that learn non-linear relationships between the input data and the model parameters being estimated through representation learning from training data sets. Deep learning algorithms currently employed in the field comprise discriminative and generative models with fully connected, convolutional, or recurrent layers. Additionally, a wide range of powerful simulators to generate training data under complex scenarios are now available. The application of deep learning to empirical data sets mostly replicates previous findings of demography reconstruction and signals of natural selection in model organisms. To showcase the feasibility of deep learning to tackle new challenges, we designed a branched architecture to detect signals of recent balancing selection from temporal haplotypic data, which exhibited good predictive performance on simulated data. Investigations on the interpretability of neural networks, their robustness to uncertain training data, and creative representation of population genetic data, will provide further opportunities for technological advancements in the field.
Affiliation(s)
- Kevin Korfmann
- Professorship for Population Genetics, Department of Life Science Systems, Technical University of Munich, Germany
| | - Oscar E Gaggiotti
- Centre for Biological Diversity, Sir Harold Mitchell Building, University of St Andrews, Fife KY16 9TF, UK
| | - Matteo Fumagalli
- Department of Biological and Behavioural Sciences, Queen Mary University of London, UK
|
46
|
Kalyakulina A, Yusipov I, Bacalini MG, Franceschi C, Vedunova M, Ivanchenko M. Disease classification for whole-blood DNA methylation: Meta-analysis, missing values imputation, and XAI. Gigascience 2022; 11:giac097. [PMID: 36259657 PMCID: PMC9718659 DOI: 10.1093/gigascience/giac097] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/23/2022] [Revised: 08/01/2022] [Accepted: 09/15/2022] [Indexed: 07/25/2023]
Abstract
BACKGROUND DNA methylation has a significant effect on gene expression and can be associated with various diseases. Meta-analysis of available DNA methylation datasets requires development of a specific workflow for joint data processing. RESULTS We propose a comprehensive approach that combines DNA methylation datasets to classify controls and patients. The solution includes data harmonization, construction of machine learning classification models, dimensionality reduction of models, imputation of missing values, and explanation of model predictions by explainable artificial intelligence (XAI) algorithms. We show that harmonization can improve classification accuracy by up to 20% when preprocessing methods of the training and test datasets are different. The best accuracy results were obtained with tree ensembles, reaching above 95% for Parkinson's disease. Dimensionality reduction can substantially decrease the number of features, without detriment to the classification accuracy. The best imputation methods achieve almost the same classification accuracy for data with missing values as for the original data. XAI approaches have allowed us to explain model predictions from both population-level and individual perspectives. CONCLUSIONS We propose a methodologically valid and comprehensive approach to the classification of healthy individuals and patients with various diseases based on whole-blood DNA methylation data using Parkinson's disease and schizophrenia as examples. The proposed algorithm works better for the former pathology, characterized by a complex set of symptoms. It makes it possible to solve data harmonization problems for meta-analysis of many different datasets, impute missing values, and build classification models of small dimensionality.
Affiliation(s)
- Alena Kalyakulina
- Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
| | - Igor Yusipov
- Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
| | | | - Claudio Franceschi
- Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
| | - Maria Vedunova
- Institute of Biology and Biomedicine, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
| | - Mikhail Ivanchenko
- Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
|