1
Choi JH, Choi Y, Lee KS, Ahn KH, Jang WY. Explainable Model Using Shapley Additive Explanations Approach on Wound Infection after Wide Soft Tissue Sarcoma Resection: "Big Data" Analysis Based on Health Insurance Review and Assessment Service Hub. Medicina (Kaunas, Lithuania) 2024; 60:327. [PMID: 38399614 PMCID: PMC10890019 DOI: 10.3390/medicina60020327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 02/04/2024] [Accepted: 02/12/2024] [Indexed: 02/25/2024]
Abstract
Background and Objectives: Soft tissue sarcomas are a heterogeneous group of malignant mesenchymal tumors. Despite their low prevalence, they present clinical challenges for orthopedic surgeons owing to their aggressive nature and the risk of perioperative wound infection. Their low prevalence has also hindered large-scale studies. This study aimed to analyze wound infections after wide resection in patients with soft tissue sarcomas using big data analytics from the Hub of the Health Insurance Review and Assessment Service (HIRA). Materials and Methods: Patients who underwent wide excision of soft tissue sarcomas between 2010 and 2021 were included. Data were collected from the HIRA database, which holds information on approximately 50 million individuals in the Republic of Korea, and included demographic information, diagnoses, prescribed medications, and surgical procedures. A random forest was used to analyze the major associated determinants. A total of 10,906 observations with complete data were divided into training and validation sets in an 80:20 ratio (8773 vs. 2193 cases). Random forest permutation importance was used to identify the major predictors of infection, and Shapley Additive Explanations (SHAP) values were derived to analyze the directions of the associations with the predictors. Results: A total of 10,969 patients who underwent wide excision of soft tissue sarcomas were included. Of these, 886 (8.08%) had postoperative infections requiring surgery. The overall transfusion rate for wide excision was 20.67% (2267 patients). Comorbidities were analyzed as risk factors for wound infection, and dependence plots of individual features were visualized. The transfusion dependence plot shows a distinctive pattern: SHAP values trend negative for patients who did not receive blood transfusions and positive for those who did, underscoring the substantial impact of blood transfusion on the likelihood of wound infection. Conclusions: Based on the random forest model and SHAP values, perioperative transfusion, male sex, old age, and low socioeconomic status (SES) were important predictors of wound infection in patients with soft tissue sarcoma.
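Permutation importance, as named in the methods above, can be illustrated with a minimal sketch: shuffle one feature column at a time and record how much the model's accuracy drops. This is not the authors' pipeline. The `toy_model` function, the two binary features, and the synthetic data are hypothetical stand-ins for a fitted random forest and the HIRA records, and only the Python standard library is used.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return hits / len(rows)


def permutation_importance(model, rows, labels, n_features, n_repeats=20, seed=0):
    """Mean drop in accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other column untouched.
            column = [row[j] for row in rows]
            rng.shuffle(column)
            shuffled = [row[:j] + (v,) + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical classifier: predicts 1 when feature 0 is 1, ignores feature 1."""
    return 1 if row[0] == 1 else 0


# Synthetic data (an assumption for illustration): two binary features per row.
data_rng = random.Random(42)
rows = [(data_rng.randint(0, 1), data_rng.randint(0, 1)) for _ in range(200)]
labels = [toy_model(row) for row in rows]

importances = permutation_importance(toy_model, rows, labels, n_features=2)
print(importances)
```

Because the toy model ignores feature 1, shuffling it leaves accuracy unchanged and its importance is zero, while shuffling feature 0 breaks the link to the label and yields a positive importance. Permutation importance ranks predictors by magnitude only, which is why the study pairs it with SHAP values to obtain the direction of each association.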
Affiliation(s)
- Ji-Hye Choi
  - Department of Orthopedic Surgery, Anam Hospital, Korea University College of Medicine, 73 Goryeodae-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
  - Anam Hospital Bloodless Medicine Center, Korea University College of Medicine, Seoul 02841, Republic of Korea
- Yumin Choi
  - School of Mechanical Engineering, Korea University College of Medicine, 73 Goryeodae-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
- Kwang-Sig Lee
  - AI Center, Anam Hospital, Korea University College of Medicine, 73 Goryeodae-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
- Ki-Hoon Ahn
  - Anam Hospital Bloodless Medicine Center, Korea University College of Medicine, Seoul 02841, Republic of Korea
  - Department of Obstetrics and Gynecology, Anam Hospital, Korea University College of Medicine, Seoul 02841, Republic of Korea
- Woo Young Jang
  - Department of Orthopedic Surgery, Anam Hospital, Korea University College of Medicine, 73 Goryeodae-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
  - Anam Hospital Bloodless Medicine Center, Korea University College of Medicine, Seoul 02841, Republic of Korea
2
Lee KS, Song IS, Kim ES, Kim J, Jung S, Nam S, Ahn KH. Machine learning analysis with population data for the associations of preterm birth with temporomandibular disorder and gastrointestinal diseases. PLoS One 2024; 19:e0296329. [PMID: 38165877 PMCID: PMC10760735 DOI: 10.1371/journal.pone.0296329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Accepted: 12/11/2023] [Indexed: 01/04/2024]
Abstract
This study uses machine learning analysis of population data to examine the associations of preterm birth (PTB) with temporomandibular disorder (TMD) and gastrointestinal diseases. The population-based retrospective cohort was drawn from Korea National Health Insurance claims for 489,893 primiparous women who delivered at the age of 25-40 in 2017. The dependent variable was PTB in 2017. Twenty-one predictors were included, i.e., demographic, socioeconomic, disease and medication information during 2002-2016. Random forest variable importance was derived to find important predictors of PTB and to evaluate its associations with the predictors, including TMD and gastroesophageal reflux disease (GERD). Shapley Additive Explanation (SHAP) values were calculated to analyze the directions of these associations. The random forest with oversampling registered a much higher area under the receiver-operating-characteristic curve than logistic regression with oversampling, i.e., 79.3% vs. 53.1%. According to random forest variable importance values and rankings, PTB has strong associations with low socioeconomic status, GERD, age, infertility, irritable bowel syndrome, diabetes, TMD, salivary gland disease, hypertension, tricyclic antidepressants and benzodiazepines. In terms of max SHAP values, these associations were positive, e.g., low socioeconomic status (0.29), age (0.21), GERD (0.27) and TMD (0.23). That is, including low socioeconomic status, age, GERD or TMD in the random forest increases the predicted probability of PTB by up to 0.29, 0.21, 0.27 or 0.23, respectively. This explainable artificial intelligence approach highlights the strong associations of preterm birth with temporomandibular disorder, gastrointestinal diseases and antidepressant medication. Close surveillance of pregnant women is needed with regard to these multiple concurrent risks.
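The SHAP values quoted above are additive attributions: each predictor receives a share of the gap between the model's output with all predictors present and its baseline output. The sketch below computes exact Shapley values by enumerating feature subsets for a tiny hypothetical risk function. The function `toy_risk`, its three features, and its coefficients are invented for illustration and are unrelated to the study's data. For real random forests the shap library relies on tree-specific algorithms rather than this brute-force enumeration, which scales exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a set function value(frozenset) -> float."""
    features = range(n_features)
    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(len(others) + 1):
            # Weight of a coalition of this size that excludes feature i.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


BASE_RISK = 0.05  # hypothetical baseline predicted probability


def toy_risk(present):
    """Hypothetical model output given the set of features that are 'present'."""
    risk = BASE_RISK
    if 0 in present:
        risk += 0.10
    if 1 in present:
        risk += 0.06
    if 0 in present and 1 in present:
        risk += 0.04  # interaction term, split equally between features 0 and 1
    return risk  # feature 2 never affects the output


phi = shapley_values(toy_risk, n_features=3)
print(phi)
```

In this toy case the attributions are approximately 0.12, 0.08 and 0 for features 0, 1 and 2, and they sum to the difference between the full-model output (0.25) and the baseline (0.05). A positive value means the feature pushes the predicted probability upward, which is the sense in which the abstract reads positive SHAP values as positive associations.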
Affiliation(s)
- Kwang-Sig Lee
  - AI Center, Korea University College of Medicine, Anam Hospital, Seoul, Korea
- In-Seok Song
  - Department of Oral and Maxillofacial Surgery, Korea University Anam Hospital, Seoul, Korea
- Eun Sun Kim
  - Department of Gastroenterology, Korea University College of Medicine, Anam Hospital, Seoul, Korea
- Jisu Kim
  - AI Center, Korea University College of Medicine, Anam Hospital, Seoul, Korea
  - Department of Statistics, Korea University College of Political Science and Economics, Seoul, Korea
  - Department of Obstetrics and Gynecology, Korea University College of Medicine, Anam Hospital, Seoul, Korea
- Sohee Jung
  - AI Center, Korea University College of Medicine, Anam Hospital, Seoul, Korea
  - Department of Statistics, Korea University College of Political Science and Economics, Seoul, Korea
  - Department of Obstetrics and Gynecology, Korea University College of Medicine, Anam Hospital, Seoul, Korea
- Sunwoo Nam
  - AI Center, Korea University College of Medicine, Anam Hospital, Seoul, Korea
  - Department of Obstetrics and Gynecology, Korea University College of Medicine, Anam Hospital, Seoul, Korea
- Ki Hoon Ahn
  - Department of Obstetrics and Gynecology, Korea University College of Medicine, Anam Hospital, Seoul, Korea
3
Shah J, Siddiquee MMR, Krell-Roesch J, Syrjanen JA, Kremers WK, Vassilaki M, Forzani E, Wu T, Geda YE. Neuropsychiatric Symptoms and Commonly Used Biomarkers of Alzheimer's Disease: A Literature Review from a Machine Learning Perspective. J Alzheimers Dis 2023; 92:1131-1146. [PMID: 36872783 PMCID: PMC11102734 DOI: 10.3233/jad-221261] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/06/2023]
Abstract
There is a growing interest in the application of machine learning (ML) in Alzheimer's disease (AD) research. However, neuropsychiatric symptoms (NPS), frequent in subjects with AD, mild cognitive impairment (MCI), and other related dementias have not been analyzed sufficiently using ML methods. To portray the landscape and potential of ML research in AD and NPS studies, we present a comprehensive literature review of existing ML approaches and commonly studied AD biomarkers. We conducted PubMed searches with keywords related to NPS, AD biomarkers, machine learning, and cognition. We included a total of 38 articles in this review after excluding some irrelevant studies from the search results and including 6 articles based on a snowball search from the bibliography of the relevant studies. We found a limited number of studies focused on NPS with or without AD biomarkers. In contrast, multiple statistical machine learning and deep learning methods have been used to build predictive diagnostic models using commonly known AD biomarkers. These mainly included multiple imaging biomarkers, cognitive scores, and various omics biomarkers. Deep learning approaches that combine these biomarkers or multi-modality datasets typically outperform single-modality datasets. We conclude ML may be leveraged to untangle the complex relationships of NPS and AD biomarkers with cognition. This may potentially help to predict the progression of MCI or dementia and develop more targeted early intervention approaches based on NPS.
Affiliation(s)
- Jay Shah
  - School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA
  - ASU-Mayo Center for Innovative Imaging, Tempe, AZ, USA
- Md Mahfuzur Rahman Siddiquee
  - School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA
  - ASU-Mayo Center for Innovative Imaging, Tempe, AZ, USA
- Janina Krell-Roesch
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
  - Institute of Sports and Sports Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Jeremy A. Syrjanen
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
- Walter K. Kremers
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
- Maria Vassilaki
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
- Erica Forzani
  - Biodesign Institute, Arizona State University, Tempe, AZ, USA
- Teresa Wu
  - School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA
  - ASU-Mayo Center for Innovative Imaging, Tempe, AZ, USA
- Yonas E. Geda
  - Department of Neurology and the Franke Global Neuroscience Education Center, Barrow Neurological Institute, Phoenix, AZ, USA