Reference Citation Analysis

For: Kalagotla SK, Gangashetty SV, Giridhar K. A novel stacking technique for prediction of diabetes. Comput Biol Med 2021;135:104554. [PMID: 34139440 DOI: 10.1016/j.compbiomed.2021.104554] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 02/24/2021] [Revised: 06/02/2021] [Accepted: 06/02/2021] [Indexed: 11/29/2022]

Cited by Other Article(s)

Wang H, Jia Q, Wang Y, Xue W, Jiang Q, Ning F, Wang J, Zhu Z, Tian L. Stacking learning based on micro-CT radiomics for outcome prediction in the early-stage of silica-induced pulmonary fibrosis model. Heliyon 2024;10:e30651. [PMID: 38765063 PMCID: PMC11098827 DOI: 10.1016/j.heliyon.2024.e30651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 02/28/2024] [Accepted: 05/01/2024] [Indexed: 05/21/2024] Open

Affiliation(s)

Hongwei Wang Department of Occupational and Environmental Health, School of Public Health, Capital Medical University, Beijing, 100069, China Beijing Key Laboratory of Environmental Toxicology, Capital Medical University, Beijing, 100069, China
Qiyue Jia Department of Occupational and Environmental Health, School of Public Health, Capital Medical University, Beijing, 100069, China Beijing Key Laboratory of Environmental Toxicology, Capital Medical University, Beijing, 100069, China
Yan Wang Department of Occupational and Environmental Health, School of Public Health, Capital Medical University, Beijing, 100069, China Beijing Key Laboratory of Environmental Toxicology, Capital Medical University, Beijing, 100069, China
Wenming Xue Department of Occupational and Environmental Health, School of Public Health, Capital Medical University, Beijing, 100069, China Beijing Key Laboratory of Environmental Toxicology, Capital Medical University, Beijing, 100069, China
Qiyue Jiang Department of Occupational and Environmental Health, School of Public Health, Capital Medical University, Beijing, 100069, China Beijing Key Laboratory of Environmental Toxicology, Capital Medical University, Beijing, 100069, China
Fuao Ning Department of Occupational and Environmental Health, School of Public Health, Capital Medical University, Beijing, 100069, China Beijing Key Laboratory of Environmental Toxicology, Capital Medical University, Beijing, 100069, China
Jiaxin Wang Department of Occupational and Environmental Health, School of Public Health, Capital Medical University, Beijing, 100069, China Beijing Key Laboratory of Environmental Toxicology, Capital Medical University, Beijing, 100069, China
Zhonghui Zhu Department of Occupational and Environmental Health, School of Public Health, Capital Medical University, Beijing, 100069, China Beijing Key Laboratory of Environmental Toxicology, Capital Medical University, Beijing, 100069, China
Lin Tian Department of Occupational and Environmental Health, School of Public Health, Capital Medical University, Beijing, 100069, China Beijing Key Laboratory of Environmental Toxicology, Capital Medical University, Beijing, 100069, China

Kuo DP, Chen YC, Li YT, Cheng SJ, Hsieh KLC, Kuo PC, Ou CY, Chen CY. Estimating the volume of penumbra in rodents using DTI and stack-based ensemble machine learning framework. Eur Radiol Exp 2024;8:59. [PMID: 38744784 PMCID: PMC11093947 DOI: 10.1186/s41747-024-00455-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 03/05/2024] [Indexed: 05/16/2024] Open

Abstract

BACKGROUND

This study investigates the potential of diffusion tensor imaging (DTI) in identifying penumbral volume (PV) compared to the standard gadolinium-required perfusion-diffusion mismatch (PDM), utilizing a stack-based ensemble machine learning (ML) approach with enhanced explainability.

METHODS

Sixteen male rats were subjected to middle cerebral artery occlusion. The penumbra was identified using PDM at 30 and 90 min after occlusion. We used 11 DTI-derived metrics and 14 distance-based features to train five voxel-wise ML models. The model predictions were integrated using stack-based ensemble techniques. ML-estimated and PDM-defined PVs were compared to evaluate model performance through volume similarity assessment, the Pearson correlation analysis, and Bland-Altman analysis. Feature importance was determined for explainability.

RESULTS

In the test rats, the ML-estimated median PV was 106.4 mL (interquartile range 44.6-157.3 mL), whereas the PDM-defined median PV was 102.0 mL (52.1-144.9 mL). These PVs had a volume similarity of 0.88 (0.79-0.96), a Pearson correlation coefficient of 0.93 (p < 0.001), and a Bland-Altman bias of 2.5 mL (2.4% of the mean PDM-defined PV), with 95% limits of agreement ranging from -44.9 to 49.9 mL. Among the features used for PV prediction, the mean diffusivity was the most important feature.

CONCLUSIONS

Our study confirmed that PV can be estimated using DTI metrics with a stack-based ensemble ML approach, yielding results comparable to the volume defined by the standard PDM. The model explainability enhanced its clinical relevance. Human studies are warranted to validate our findings.

RELEVANCE STATEMENT

The proposed DTI-based ML model can estimate PV without the need for contrast agent administration, offering a valuable option for patients with kidney dysfunction. It also can serve as an alternative if perfusion map interpretation fails in the clinical setting.

KEY POINTS

• Penumbral volume can be estimated by DTI combined with stack-based ensemble ML.
• Mean diffusivity was the most important feature used for predicting penumbral volume.
• The proposed approach can be beneficial for patients with kidney dysfunction.
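
The Bland-Altman comparison reported in the abstract above can be illustrated with a minimal sketch. It assumes the conventional definition (bias is the mean of the paired differences, and the 95% limits of agreement are the bias plus or minus 1.96 standard deviations of those differences); the abstract does not state which variant the authors used. The function name and the sample volumes are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(estimated, reference):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumes the conventional 95% limits of agreement:
    bias +/- 1.96 * sample standard deviation of the differences.
    """
    if len(estimated) != len(reference) or len(estimated) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [e - r for e, r in zip(estimated, reference)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical paired volumes, for illustration only.
ml_pv = [100.0, 120.0, 90.0, 150.0]
pdm_pv = [98.0, 115.0, 95.0, 140.0]
bias, lower, upper = bland_altman(ml_pv, pdm_pv)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

Under this definition the limits are symmetric around the bias, which is consistent with the reported interval of -44.9 to 49.9 around a bias of 2.5.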

Affiliation(s)

Duen-Pang Kuo Department of Medical Imaging, Taipei Medical University Hospital, No.250, Wu Hsing Street, Taipei, Taiwan Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan Department of Radiology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
Yung-Chieh Chen Department of Medical Imaging, Taipei Medical University Hospital, No.250, Wu Hsing Street, Taipei, Taiwan. Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan. Department of Radiology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan.
Yi-Tien Li Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan Research Center for Neuroscience, Taipei Medical University, Taipei, Taiwan Ph.D. Program in Medical Neuroscience, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
Sho-Jen Cheng Department of Medical Imaging, Taipei Medical University Hospital, No.250, Wu Hsing Street, Taipei, Taiwan Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan
Kevin Li-Chun Hsieh Department of Medical Imaging, Taipei Medical University Hospital, No.250, Wu Hsing Street, Taipei, Taiwan Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan Department of Radiology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
Po-Chih Kuo Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
Chen-Yin Ou Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan
Cheng-Yu Chen Department of Medical Imaging, Taipei Medical University Hospital, No.250, Wu Hsing Street, Taipei, Taiwan Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei, Taiwan Department of Radiology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan Department of Radiology, National Defense Medical Center, Taipei, Taiwan

Idris NF, Ismail MA, Jaya MIM, Ibrahim AO, Abulfaraj AW, Binzagr F. Stacking with Recursive Feature Elimination-Isolation Forest for classification of diabetes mellitus. PLoS One 2024;19:e0302595. [PMID: 38718024 PMCID: PMC11078423 DOI: 10.1371/journal.pone.0302595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Accepted: 04/08/2024] [Indexed: 05/12/2024] Open

Abstract

Diabetes Mellitus is one of the oldest diseases known to humankind, dating back to ancient Egypt. The disease is a chronic metabolic disorder that heavily burdens healthcare providers worldwide because the number of patients increases steadily every year. Worryingly, diabetes affects not only the aging population but also children. It is crucial to control this problem, as diabetes can lead to many health complications. As technology has evolved, humankind has started integrating computer technology with the healthcare system. Artificial intelligence helps healthcare become more efficient in diagnosing diabetes patients, improves healthcare delivery, and makes care more patient-centric. Among the advanced data mining techniques in artificial intelligence, stacking is among the most prominent methods applied in the diabetes domain. Hence, this study investigates the potential of stacking ensembles. The aim of this study is to reduce the high complexity inherent in stacking, which contributes to longer training time, and to reduce the outliers in the diabetes data in order to improve classification performance. To address this concern, a novel machine learning method called the Stacking Recursive Feature Elimination-Isolation Forest was introduced for diabetes prediction. Stacking is combined with Recursive Feature Elimination to design an efficient model for diabetes diagnosis that uses fewer features. The method also uses Isolation Forest for outlier removal. The study uses accuracy, precision, recall, F1 measure, training time, and standard deviation to assess classification performance. The proposed method achieved an accuracy of 79.077% on the PIMA Indians Diabetes dataset and 97.446% on the Diabetes Prediction dataset, outperforming many existing methods and demonstrating effectiveness in the diabetes domain.
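
The four classification metrics named in the abstract above (accuracy, precision, recall, F1) can be computed from confusion-matrix counts. The sketch below uses the standard binary definitions; the function name and the label vectors are hypothetical and do not come from the PIMA or Diabetes Prediction datasets.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels (1 = diabetic, 0 = non-diabetic), for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```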

Gu J, Cao Y, Chai L, Xu E, Liu K, Chong Z, Zhang Y, Zou D, Xu Y, Wang J, Müller O, Cao J, Zhu G, Lu G. Delayed care-seeking in international migrant workers with imported malaria in China. J Travel Med 2024;31:taae021. [PMID: 38335249 DOI: 10.1093/jtm/taae021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 12/12/2023] [Accepted: 02/08/2024] [Indexed: 02/12/2024]

Abstract

BACKGROUND

Imported malaria cases continue to pose major challenges in China as well as in other countries that have achieved elimination. Early diagnosis and treatment of each imported malaria case is key to maintaining malaria elimination. This study aimed to build an easy-to-use predictive nomogram to predict and intervene against delayed care-seeking among international migrant workers with imported malaria.

METHODS

A prediction model was built based on cases with imported malaria from 2012 to 2019 in Jiangsu Province, China. Routine surveillance information (e.g. sex, age, symptoms, origin country and length of stay abroad), data on the place of initial care-seeking and the gross domestic product (GDP) of the destination city were extracted. Multivariate logistic regression was performed to identify independent predictors and a nomogram was established to predict the risk of delayed care-seeking. The discrimination and calibration of the nomogram were assessed using the area under the curve and calibration plots. In addition, four machine learning models were used for comparison.

RESULTS

Of 2255 patients with imported malaria, 636 (28.2%) sought care within 24 h after symptom onset, and 577 (25.6%) sought care 3 days after symptom onset. Development of symptoms before entry into China, initial care-seeking from superior healthcare facilities and a higher GDP level of the destination city were significantly associated with delayed care-seeking among migrant workers with imported malaria. Based on these independent risk factors, an easy-to-use and intuitive nomogram was established. The calibration curves of the nomogram showed good consistency.

CONCLUSIONS

The tool provides public health practitioners with a method for the early detection of delayed care-seeking risk among international migrant workers with imported malaria, which may be of significance in improving post-travel healthcare for labour migrants, reducing the risk of severe malaria, preventing malaria reintroduction and sustaining achievements in malaria elimination.
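
The abstract above assesses discrimination with the area under the ROC curve. A minimal sketch of that quantity follows, using the pairwise (Mann-Whitney) formulation: the AUC is the fraction of positive-negative pairs in which the positive case receives the higher predicted risk, with ties counted as one half. The function name, labels and risk scores are hypothetical; the study's own data and software are not given in the abstract.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the event (e.g. delayed care-seeking), 0 otherwise.
    scores: predicted risk for each case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks, for illustration only.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The quadratic pairwise loop is chosen for clarity; a rank-based computation gives the same value more efficiently on large samples.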

Affiliation(s)

Jiyue Gu Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Yangzhou University, Yangzhou University, Yangzhou, Jiangsu Province, 225009, China
Yuanyuan Cao National Health Commission Key Laboratory of Parasitic Disease Control and Prevention, Jiangsu Provincial Key Laboratory on Parasite and Vector Control Technology, Jiangsu Institute of Parasitic Diseases, Wuxi, Jiangsu Province, 214064, China Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu Province, 211166, China
Liying Chai Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Yangzhou University, Yangzhou University, Yangzhou, Jiangsu Province, 225009, China
Enyu Xu Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Yangzhou University, Yangzhou University, Yangzhou, Jiangsu Province, 225009, China
Kaixuan Liu Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Yangzhou University, Yangzhou University, Yangzhou, Jiangsu Province, 225009, China
Zeyin Chong Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Yangzhou University, Yangzhou University, Yangzhou, Jiangsu Province, 225009, China
Yuying Zhang Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Yangzhou University, Yangzhou University, Yangzhou, Jiangsu Province, 225009, China
Dandan Zou Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Yangzhou University, Yangzhou University, Yangzhou, Jiangsu Province, 225009, China
Yuhui Xu Center for Disease Control and Prevention, Yangzhou, Jiangsu Province, 225007, China
Jian Wang Yangzhou Schistosomiasis and Parasitic Disease Control Office, Yangzhou, Jiangsu Province, 225007, China
Olaf Müller Institute of Global Health, Medical School, Ruprecht-Karls-University Heidelberg, Heidelberg, 69117, Germany
Jun Cao National Health Commission Key Laboratory of Parasitic Disease Control and Prevention, Jiangsu Provincial Key Laboratory on Parasite and Vector Control Technology, Jiangsu Institute of Parasitic Diseases, Wuxi, Jiangsu Province, 214064, China Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu Province, 211166, China
Guoding Zhu National Health Commission Key Laboratory of Parasitic Disease Control and Prevention, Jiangsu Provincial Key Laboratory on Parasite and Vector Control Technology, Jiangsu Institute of Parasitic Diseases, Wuxi, Jiangsu Province, 214064, China Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu Province, 211166, China
Guangyu Lu Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Yangzhou University, Yangzhou University, Yangzhou, Jiangsu Province, 225009, China Jiangsu Key Laboratory of Zoonosis, Yangzhou, 225009, China

Xing M, Zhao Y, Li Z, Zhang L, Yu Q, Zhou W, Huang R, Lv X, Ma Y, Li W. Development and validation of a stacking ensemble model for death prediction in the Chinese Longitudinal Healthy Longevity Survey (CLHLS). Maturitas 2024;182:107919. [PMID: 38290423 DOI: 10.1016/j.maturitas.2024.107919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Revised: 11/12/2023] [Accepted: 01/15/2024] [Indexed: 02/01/2024]

Affiliation(s)

Muqi Xing Department of Big Data in Health Science School of Public Health, Center of Clinical Big Data and Analytics of The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
Yunfeng Zhao School of Public Health, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
Zihan Li Department of Big Data in Health Science School of Public Health, Center of Clinical Big Data and Analytics of The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
Lingzhi Zhang Department of Big Data in Health Science School of Public Health, Center of Clinical Big Data and Analytics of The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
Qi Yu Department of Big Data in Health Science School of Public Health, Center of Clinical Big Data and Analytics of The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
Wenhui Zhou Department of Biostatistics and Epidemiology, School of Public Health, China Medical University, Shenyang 110122, China
Rong Huang Department of Biostatistics and Epidemiology, School of Public Health, China Medical University, Shenyang 110122, China
Xiaozhen Lv Peking University Institute of Mental Health (Sixth Hospital), National Clinical Research Center for Mental Disorders, NHC Key Laboratory of Mental Health, Peking University, 51 Huayuan North Road, Haidian District, Beijing 100191, China.
Yanan Ma Department of Biostatistics and Epidemiology, School of Public Health, China Medical University, Shenyang 110122, China.
Wenyuan Li Department of Big Data in Health Science School of Public Health, Center of Clinical Big Data and Analytics of The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China.

Li J, Wu YJ, Liu MF, Li N, Dang LH, An GS, Lu XJ, Wang LL, Du QX, Cao J, Sun JH. Multi-omics integration strategy in the post-mortem interval of forensic science. Talanta 2024;268:125249. [PMID: 37839320 DOI: 10.1016/j.talanta.2023.125249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 08/13/2023] [Accepted: 09/25/2023] [Indexed: 10/17/2023]

Affiliation(s)

Jian Li School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong City, Shanxi Province, 030604, PR China; Shanxi Key Laboratory of Forensic Medicine, Jinzhong, 030600, Shanxi, China
Yan-Juan Wu School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong City, Shanxi Province, 030604, PR China; Shanxi Key Laboratory of Forensic Medicine, Jinzhong, 030600, Shanxi, China
Ming-Feng Liu School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong City, Shanxi Province, 030604, PR China; Shanxi Key Laboratory of Forensic Medicine, Jinzhong, 030600, Shanxi, China
Na Li School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong City, Shanxi Province, 030604, PR China; Shanxi Key Laboratory of Forensic Medicine, Jinzhong, 030600, Shanxi, China
Li-Hong Dang School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong City, Shanxi Province, 030604, PR China; Shanxi Key Laboratory of Forensic Medicine, Jinzhong, 030600, Shanxi, China
Guo-Shuai An School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong City, Shanxi Province, 030604, PR China; Shanxi Key Laboratory of Forensic Medicine, Jinzhong, 030600, Shanxi, China
Xiao-Jun Lu Criminal Investigation Detachment, Baotou City Public Security Bureau, No. 191, Jianshe Road, Qingshan District, Baotou City, Inner Mongolia Autonomous Region, 014030, PR China
Liang-Liang Wang School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong City, Shanxi Province, 030604, PR China; Shanxi Key Laboratory of Forensic Medicine, Jinzhong, 030600, Shanxi, China
Qiu-Xiang Du School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong City, Shanxi Province, 030604, PR China; Shanxi Key Laboratory of Forensic Medicine, Jinzhong, 030600, Shanxi, China
Jie Cao School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong City, Shanxi Province, 030604, PR China; Shanxi Key Laboratory of Forensic Medicine, Jinzhong, 030600, Shanxi, China.
Jun-Hong Sun School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong City, Shanxi Province, 030604, PR China; Shanxi Key Laboratory of Forensic Medicine, Jinzhong, 030600, Shanxi, China.

Arukonda S, Cheruku R. Nested genetic algorithm-based classifier selection and placement in multi-level ensemble framework for effective disease diagnosis. Comput Methods Biomech Biomed Engin 2023:1-24. [PMID: 38126276 DOI: 10.1080/10255842.2023.2294264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Accepted: 12/05/2023] [Indexed: 12/23/2023]

Abstract

Effective disease diagnosis is a critical unmet need on a global scale. The intricacies of the numerous disease mechanisms and underlying symptoms make developing a model for early diagnosis and effective treatment extremely difficult. Machine learning (ML) can help to solve some of these issues. Recently, various ensemble-based ML models have benefited clinicians in early diagnosis. However, one of the most difficult challenges in multi-level ensemble approaches is selecting the classifiers and placing them in the ensemble framework, as both choices affect overall performance. Selecting $m$ classifiers from $n$ candidates can be done in $\binom{n}{m}$ ways, and each selection can be arranged in $m!$ ways. Finding the best $m$ classifiers and their positions among the total $\binom{n}{m}\,m!$ possibilities is a hard problem. To address this challenge, a dynamic three-level ensemble framework is proposed. A nested Genetic Algorithm (GA) and an ensemble-based fitness function are employed to optimize classifier selection and placement in the three-level ensemble framework. Our approach used eleven classifiers and chose seven by maximizing the fitness function. The proposed model was evaluated on 12 disease datasets. On the Chronic Kidney Disease (CKD) dataset, it achieved an accuracy, F1, and G-measure of 0.987, 0.988, and 0.989, respectively. It achieved an AUC of 0.998 on the Heart disease dataset (HDD) and a recall of 0.988 on the Hypothyroid disease dataset (HyDD). In addition, the superiority of the proposed model was statistically evaluated with the Wilcoxon-Signed-Rank (WSR) test against other ensemble models, namely random forest (RF), bagging classifier (BC), XGBoost (XGB), and gradient boost classifier (GBC). With a probability value of p < 0.05, the results show that all of these traditional ensemble models differ from the proposed model. The effect size was also evaluated using the matched-pairs rank biserial correlation coefficient wc; it is large for RF and BC and medium for XGB and GBC. The proposed model outperformed State-Of-The-Art (SOTA) ensemble and non-ensemble models. Further, the proposed model outperformed in terms of the ROC curve in the majority of the disease datasets. The results suggest the usage of the proposed model for disease diagnosis applications.
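
To give a sense of the search space described in this abstract, the short sketch below counts the ordered selections of m classifiers out of n, that is, the binomial coefficient times m factorial. It plugs in the abstract's own figures (eleven candidates, seven chosen) and assumes every chosen classifier occupies a distinct position, as the abstract's counting implies. The function name is hypothetical.

```python
import math


def ensemble_search_space(n, m):
    """Number of ways to choose m of n classifiers and order them."""
    return math.comb(n, m) * math.factorial(m)


n_candidates, n_selected = 11, 7
selections = math.comb(n_candidates, n_selected)           # unordered choices
total = ensemble_search_space(n_candidates, n_selected)    # ordered placements

# The product equals the number of m-permutations of n items.
assert total == math.perm(n_candidates, n_selected)
print(selections, total)
```

For these figures there are 330 unordered selections and 1,663,200 ordered placements, which illustrates why exhaustive search is impractical and a heuristic such as a genetic algorithm is attractive.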

Zheng J, Zhang Z, Wang J, Zhao R, Liu S, Yang G, Liu Z, Deng Z. Metabolic syndrome prediction model using Bayesian optimization and XGBoost based on traditional Chinese medicine features. Heliyon 2023;9:e22727. [PMID: 38125549 PMCID: PMC10730568 DOI: 10.1016/j.heliyon.2023.e22727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 11/16/2023] [Accepted: 11/17/2023] [Indexed: 12/23/2023] Open

Chellappan D, Rajaguru H. Enhancement of Classifier Performance Using Swarm Intelligence in Detection of Diabetes from Pancreatic Microarray Gene Data. Biomimetics (Basel) 2023;8:503. [PMID: 37887634 PMCID: PMC10604158 DOI: 10.3390/biomimetics8060503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 10/08/2023] [Accepted: 10/20/2023] [Indexed: 10/28/2023] Open

Jiang L, Xia Z, Zhu R, Gong H, Wang J, Li J, Wang L. Diabetes risk prediction model based on community follow-up data using machine learning. Prev Med Rep 2023;35:102358. [PMID: 37654514 PMCID: PMC10465943 DOI: 10.1016/j.pmedr.2023.102358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 07/31/2023] [Accepted: 08/01/2023] [Indexed: 09/02/2023] Open

Zhou H, Xin Y, Li S. A diabetes prediction model based on Boruta feature selection and ensemble learning. BMC Bioinformatics 2023;24:224. [PMID: 37264332 DOI: 10.1186/s12859-023-05300-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Accepted: 04/21/2023] [Indexed: 06/03/2023] Open

Liu C, Yao Z, Liu P, Tu Y, Chen H, Cheng H, Xie L, Xiao K. Early prediction of MODS interventions in the intensive care unit using machine learning. J Big Data 2023;10:55. [PMID: 37193361 PMCID: PMC10158675 DOI: 10.1186/s40537-023-00719-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2022] [Accepted: 03/21/2023] [Indexed: 05/18/2023]

Abstract

Background

Multiple organ dysfunction syndrome (MODS) is one of the leading causes of death in critically ill patients. MODS is the result of a dysregulated inflammatory response that can be triggered by various causes. Owing to the lack of an effective treatment for patients with MODS, early identification and intervention are the most effective strategies. Therefore, we have developed a variety of early warning models whose prediction results can be interpreted by Kernel SHapley Additive exPlanations (Kernel-SHAP) and reversed by diverse counterfactual explanations (DiCE). So we can predict the probability of MODS 12 h in advance, quantify the risk factors, and automatically recommend relevant interventions.

Methods

We used various machine learning algorithms to complete the early risk assessment of MODS, and used a stacked ensemble to improve the prediction performance. The Kernel-SHAP algorithm was used to quantify the positive and negative factors corresponding to the individual prediction results, and finally, the DiCE method was used to automatically recommend interventions. We completed the model training and testing based on the MIMIC-III and MIMIC-IV databases, in which the sample features in the model training included the patients' vital signs, laboratory test results, test reports, and data related to the use of ventilators.

Results

The customizable model called SuperLearner, which integrated multiple machine learning algorithms, had the highest authenticity of screening, and its Youden index (YI), sensitivity, accuracy, and utility_score on the MIMIC-IV test set were 0.813, 0.884, 0.893, and 0.763, respectively, which were all the maximum values among the eleven models. The area under the curve of the deep-wide neural network (DWNN) model on the MIMIC-IV test set was 0.960, and the specificity was 0.935, which were both the maximum values of all these models. The Kernel-SHAP algorithm combined with SuperLearner was used to determine that the minimum value of the Glasgow Coma Scale (GCS) in the current hour (OR = 0.609, 95% CI 0.606-0.612), the maximum value of the MODS score corresponding to GCS in the past 24 h (OR = 2.632, 95% CI 2.588-2.676), and the maximum MODS score corresponding to creatinine in the past 24 h (OR = 3.281, 95% CI 3.267-3.295) were generally the most influential factors.

Conclusion

The MODS early warning model based on machine learning algorithms has considerable application value, and the prediction efficiency of SuperLearner is superior to those of SubSuperLearner, DWNN, and eight other common machine learning models. Considering that the attribution analysis of Kernel-SHAP is a static analysis of the prediction results, we introduce the DiCE algorithm to automatically recommend counterfactuals to reverse the prediction results, which will be an important step towards the practical application of automatic MODS early intervention.

Supplementary Information

The online version contains supplementary material available at 10.1186/s40537-023-00719-2.
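
The YI reported in the Results of this abstract is taken here to be the Youden index, conventionally defined as sensitivity plus specificity minus one. Under that assumption, the short sketch below shows the relationship and derives the specificity implied by the reported YI (0.813) and sensitivity (0.884). The implied value is only an arithmetic consequence of the definition and is not a figure stated in the abstract; both function names are hypothetical.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def specificity_from_youden(youden, sensitivity):
    """Rearrange the definition to recover specificity."""
    return youden - sensitivity + 1.0


# Reported SuperLearner values on the MIMIC-IV test set (from the abstract).
reported_yi = 0.813
reported_sensitivity = 0.884

# Implied by the definition above; not reported in the abstract.
implied_specificity = specificity_from_youden(reported_yi, reported_sensitivity)
print(round(implied_specificity, 3))
```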

A Novel Proposal for Deep Learning-Based Diabetes Prediction: Converting Clinical Data to Image Data. Diagnostics (Basel) 2023;13:diagnostics13040796. [PMID: 36832284 PMCID: PMC9955314 DOI: 10.3390/diagnostics13040796] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/24/2023] [Revised: 02/14/2023] [Accepted: 02/15/2023] [Indexed: 02/22/2023] Open

Novel Prediction Method Applied to Wound Age Estimation: Developing a Stacking Ensemble Model to Improve Predictive Performance Based on Multi-mRNA. Diagnostics (Basel) 2023;13:diagnostics13030395. [PMID: 36766500 PMCID: PMC9914838 DOI: 10.3390/diagnostics13030395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 01/13/2023] [Accepted: 01/17/2023] [Indexed: 01/24/2023] Open

Joseph LP, Joseph EA, Prasad R. Explainable diabetes classification using hybrid Bayesian-optimized TabNet architecture. Comput Biol Med 2022;151:106178. [PMID: 36306578 DOI: 10.1016/j.compbiomed.2022.106178] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 04/13/2022] [Revised: 09/23/2022] [Accepted: 10/01/2022] [Indexed: 12/27/2022]

Zhu X, Zhang M, Wen Y, Shang D. Machine learning advances the integration of covariates in population pharmacokinetic models: Valproic acid as an example. Front Pharmacol 2022;13:994665. [PMID: 36324679 PMCID: PMC9621318 DOI: 10.3389/fphar.2022.994665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Accepted: 10/03/2022] [Indexed: 11/24/2022] Open

Abstract

Background and Aim: Many studies associated with the combination of machine learning (ML) and pharmacometrics have appeared in recent years. ML can be used as an initial step for fast screening of covariates in population pharmacokinetic (popPK) models. The present study aimed to integrate covariates derived from different popPK models using ML.

Methods: Two published popPK models of valproic acid (VPA) in Chinese epileptic patients were used, where the population parameters were influenced by some covariates. Based on the covariates and a one-compartment model that describes the pharmacokinetics of VPA, a dataset was constructed using Monte Carlo simulation, to develop an XGBoost model to estimate the steady-state concentrations (Css) of VPA. We utilized SHapley Additive exPlanation (SHAP) values to interpret the prediction model, and calculated estimates of VPA exposure in four assumed scenarios involving different combinations of CYP2C19 genotypes and co-administered antiepileptic drugs. To develop an easy-to-use model in the clinic, we built a simplified model by using CYP2C19 genotypes and some noninvasive clinical parameters, and omitting several features that were infrequently measured or whose clinically available values were inaccurate, and verified it on our independent external dataset.
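The simulation step described above can be illustrated with a minimal sketch. It assumes the standard average steady-state relation for a one-compartment model with linear elimination, Css,avg = F × daily dose / (CL × 24 h), and log-normal between-subject variability on clearance. The function names, the typical clearance, and the variability value are hypothetical placeholders, not parameters from the two published popPK models, and the study itself may have simulated time-dependent concentrations rather than the average.

```python
import math
import random


def css_average(daily_dose_mg, cl_l_per_h, bioavailability=1.0):
    """Average steady-state concentration (mg/L), one-compartment model."""
    return bioavailability * daily_dose_mg / (cl_l_per_h * 24.0)


def simulate_css(n, daily_dose_mg, typical_cl, omega_cl, seed=0):
    """Monte Carlo draw of Css values with log-normal variability on CL.

    typical_cl and omega_cl are illustrative assumptions, not published values.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        cl = typical_cl * math.exp(rng.gauss(0.0, omega_cl))
        samples.append(css_average(daily_dose_mg, cl))
    return samples


# Hypothetical virtual population: 1000 mg/day, typical CL of 0.5 L/h.
samples = simulate_css(n=1000, daily_dose_mg=1000, typical_cl=0.5, omega_cl=0.3)
print(min(samples), max(samples))
```

In a covariate-aware version, the typical clearance would be computed per virtual patient from covariates such as genotype or co-medication before the random effect is applied.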

Results: After data preprocessing, the final combined dataset was divided into a derivation cohort and a validation cohort (8:2). The XGBoost model was developed in the derivation cohort and yielded excellent performance in the validation cohort, with a mean absolute error of 2.4 mg/L, a root-mean-squared error of 3.3 mg/L, a mean relative error of 0%, and 98.85% of predictions within ±20% of the actual values. The SHAP analysis revealed that daily dose, time, CYP2C19*2 and/or *3 variants, albumin, body weight, single dose, and CYP2C19*1/*1 genotype were the top seven confounding factors influencing the Css of VPA. Under the simulated dosage regimen of 500 mg bid, VPA exposure in patients who had CYP2C19*2 and/or *3 variants and received no carbamazepine, phenytoin, or phenobarbital was approximately 1.74-fold that in patients with the CYP2C19*1/*1 genotype who were co-administered carbamazepine + phenytoin + phenobarbital. The feasibility of the simplified model was fully illustrated by its performance in our external dataset.
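For readers unfamiliar with the error measures quoted above, here is a minimal sketch of how they could be computed. The function name `regression_report` and the sample values are hypothetical, and the mean relative error is assumed to be the signed average of (predicted − actual) / actual, which the abstract does not spell out.

```python
import math


def regression_report(actual, predicted, tolerance=0.20):
    """Compute MAE, RMSE, signed mean relative error, and within-tolerance rate."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Assumption: relative error is signed, so over- and under-predictions cancel.
    mre = sum(e / a for e, a in zip(errors, actual)) / n
    within = sum(abs(e) <= tolerance * abs(a) for e, a in zip(errors, actual)) / n
    return {"mae": mae, "rmse": rmse, "mre": mre, "within": within}


# Hypothetical concentrations in mg/L, not data from the study.
report = regression_report(
    actual=[50.0, 60.0, 80.0, 100.0],
    predicted=[52.0, 57.0, 80.0, 125.0],
)
print(report)
```

A signed mean relative error near 0% indicates little systematic bias, while the within-tolerance rate describes how often individual predictions are clinically close.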

Conclusion: This study highlighted the bridging role of ML in big data and pharmacometrics, by integrating covariates derived from different popPK models.


Zhu X, Hu J, Xiao T, Huang S, Wen Y, Shang D. An interpretable stacking ensemble learning framework based on multi-dimensional data for real-time prediction of drug concentration: The example of olanzapine. Front Pharmacol 2022;13:975855. [PMID: 36238557 PMCID: PMC9552071 DOI: 10.3389/fphar.2022.975855] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/22/2022] [Accepted: 09/05/2022] [Indexed: 11/13/2022] Open

Abstract

Background and Aim: Therapeutic drug monitoring (TDM) has evolved over the years as an important tool for personalized medicine. Nevertheless, some limitations are associated with traditional TDM. Emerging data-driven model forecasting [e.g., through machine learning (ML)-based approaches] has been used for individualized therapy. This study proposes an interpretable stacking-based ML framework to predict concentrations in real time after olanzapine (OLZ) treatment.

Methods: The TDM-OLZ dataset, consisting of 2,142 OLZ measurements and 472 features, was formed by collecting electronic health records during the TDM of 927 patients who had received OLZ treatment. We compared the performance of ML algorithms by using 10-fold cross-validation and the mean absolute error (MAE). The optimal subset of features was analyzed by a random forest-based sequential forward feature selection method in the context of the top five heterogeneous regressors as base models to develop a stacked ensemble regressor, which was then optimized via the grid search method. Its predictions were explained by using local interpretable model-agnostic explanations (LIME) and partial dependence plots (PDPs).
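The stacking idea described above can be sketched in a few lines of standard-library Python. This is a deliberately simplified illustration, not the authors' pipeline: the two base learners (`fit_mean`, `fit_line`) are toy stand-ins for the tree-based regressors, the meta-learner is restricted to a single convex weight chosen on out-of-fold predictions by MAE, and the data are synthetic. All names are hypothetical.

```python
import random
from statistics import mean


def fit_mean(X, y):
    """Baseline learner: always predicts the training mean."""
    m = mean(y)
    return lambda row: m


def fit_line(X, y):
    """Least-squares line on the first feature only."""
    xs = [row[0] for row in X]
    mx, my = mean(xs), mean(y)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (t - my) for x, t in zip(xs, y)) / sxx if sxx else 0.0
    return lambda row: my + slope * (row[0] - mx)


def out_of_fold(fit, X, y, k=10):
    """Predict each sample with a model trained on the other k-1 folds."""
    preds = [0.0] * len(y)
    for fold in range(k):
        train = [i for i in range(len(y)) if i % k != fold]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in range(fold, len(y), k):
            preds[i] = model(X[i])
    return preds


def mae(y, preds):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y, preds))


def blend(w, a, b):
    """Convex combination of two prediction lists."""
    return [w * p + (1 - w) * q for p, q in zip(a, b)]


def fit_stack(X, y, k=10):
    """Learn a blending weight on out-of-fold predictions, then refit bases."""
    oof_a = out_of_fold(fit_mean, X, y, k)
    oof_b = out_of_fold(fit_line, X, y, k)
    grid = [i / 20 for i in range(21)]  # candidate weights 0.0 .. 1.0
    w = min(grid, key=lambda c: mae(y, blend(c, oof_a, oof_b)))
    model_a, model_b = fit_mean(X, y), fit_line(X, y)
    stacked = lambda row: w * model_a(row) + (1 - w) * model_b(row)
    return w, stacked, oof_a, oof_b


# Synthetic data for illustration only.
rng = random.Random(0)
X = [[rng.uniform(0, 10)] for _ in range(100)]
y = [2.0 * row[0] + rng.gauss(0, 1) for row in X]
w, stacked, oof_a, oof_b = fit_stack(X, y)
stacked_mae = mae(y, blend(w, oof_a, oof_b))
print(w, stacked_mae)
```

Training the meta-learner on out-of-fold predictions, rather than on in-sample fits, is what keeps the stack from simply favoring whichever base model overfits the most.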

Results: A state-of-the-art stacking ensemble learning framework that integrates optimized extra trees, XGBoost, random forest, bagging, and gradient-boosting regressors was developed for nine selected features [i.e., daily dose (OLZ), gender_male, age, valproic acid_yes, ALT, K, BW, MONO#, and time of blood sampling after first administration]. It outperformed the base regressors that were considered, with an MAE of 0.064, R-square value of 0.5355, mean squared error of 0.0089, mean relative error of 13%, and ideal rate (the percentage of predicted TDM values within ±30% of actual TDM values) of 63.40%. Predictions at the individual level were illustrated by LIME plots, whereas the global interpretation of associations between features and outcomes was illustrated by PDPs.

Conclusion: This study highlights the feasibility of the real-time estimation of drug concentrations by using stacking-based ML strategies without losing interpretability, thus facilitating model-informed precision dosing.


Study of Multidimensional and High-Precision Height Model of Youth Based on Multilayer Perceptron. Computational Intelligence and Neuroscience 2022;2022:7843455. [PMID: 35761869 PMCID: PMC9233609 DOI: 10.1155/2022/7843455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2022] [Revised: 03/14/2022] [Accepted: 05/13/2022] [Indexed: 11/17/2022]

Gollapalli M, Alansari A, Alkhorasani H, Alsubaii M, Sakloua R, Alzahrani R, Taha Al-Hariri M, Nasser Alfares M, AlKhafaji D, Jaafar Al Argan R, Albaker W. A novel stacking ensemble for detecting three types of diabetes mellitus using a Saudi Arabian dataset: Pre-diabetes, T1DM, and T2DM. Comput Biol Med 2022;147:105757. [DOI: 10.1016/j.compbiomed.2022.105757] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/19/2022] [Revised: 05/27/2022] [Accepted: 06/18/2022] [Indexed: 11/29/2022]

Machine learning models for classification and identification of significant attributes to detect type 2 diabetes. Health Inf Sci Syst 2022;10:2. [PMID: 35178244 PMCID: PMC8828812 DOI: 10.1007/s13755-021-00168-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/27/2021] [Accepted: 10/27/2021] [Indexed: 12/15/2022] Open

Abstract

Type 2 Diabetes (T2D) is a chronic disease characterized by abnormally high blood glucose levels due to insulin resistance and reduced pancreatic insulin production. The challenge of this work is to identify T2D-associated features that can distinguish T2D sub-types for prognosis and treatment purposes. We thus employed machine learning (ML) techniques to categorize T2D patients using data from the Pima Indian Diabetes Dataset from the Kaggle ML repository. After data preprocessing, several feature selection techniques were used to extract feature subsets, and a range of classification techniques were used to analyze these. We then compared the derived classification results to identify the best classifiers by considering accuracy, kappa statistics, area under the receiver operating characteristic (AUROC), sensitivity, specificity, and logarithmic loss (logloss). To evaluate the performance of different classifiers, we investigated their outcomes using the summary statistics with a resampling distribution. Generalized Boosted Regression modeling showed the highest accuracy (90.91%), along with a kappa statistic of 78.77% and a specificity of 85.19%. In addition, Sparse Distance Weighted Discrimination, Generalized Additive Model using LOESS, and Boosted Generalized Additive Models gave the maximum sensitivity (100%), highest AUROC (95.26%), and lowest logarithmic loss (30.98%), respectively. Notably, the Generalized Additive Model using LOESS was the top-ranked algorithm according to non-parametric Friedman testing. Of the features identified by these machine learning models, glucose levels, body mass index, diabetes pedigree function, and age were consistently identified as the best and most frequently accurate outcome predictors. These results indicate the utility of ML methods in constructing improved prediction models for T2D and in identifying outcome predictors for this Pima Indian population.
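Two of the less familiar criteria named above, the kappa statistic and logarithmic loss, can be illustrated with a minimal sketch. The function names and the counts are hypothetical; the formulas are the standard definitions of Cohen's kappa for a 2×2 confusion matrix and of binary log loss.

```python
import math


def cohen_kappa(tp, fp, tn, fn):
    """Cohen's kappa: agreement beyond chance for a binary classifier."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predictions and labels.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected)


def log_loss(labels, probs, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


# Hypothetical values for illustration only.
kappa = cohen_kappa(tp=40, fp=10, tn=45, fn=5)
loss = log_loss([1, 0], [0.5, 0.5])
print(kappa, loss)
```

Kappa discounts the agreement a classifier would reach by chance given the class balance, while log loss penalizes confident wrong probabilities, so the two can favor different models than raw accuracy does.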


Fregoso-Aparicio L, Noguez J, Montesinos L, García-García JA. Machine learning and deep learning predictive models for type 2 diabetes: a systematic review. Diabetol Metab Syndr 2021;13:148. [PMID: 34930452 PMCID: PMC8686642 DOI: 10.1186/s13098-021-00767-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 07/06/2021] [Accepted: 12/07/2021] [Indexed: 12/12/2022] Open