1
Hossain MM, Ahmed MM, Rakib MRH, Zia MO, Hasan R, Islam MR, Islam MS, Alam MS, Islam MK. Optimizing Stroke Risk Prediction: A Primary Dataset-Driven Ensemble Classifier With Explainable Artificial Intelligence. Health Sci Rep 2025; 8:e70799. [PMID: 40330769 PMCID: PMC12052519 DOI: 10.1002/hsr2.70799] [Received: 10/14/2024] [Revised: 04/10/2025] [Accepted: 04/16/2025] [Indexed: 05/08/2025] Open
Abstract
Background and Aims: Stroke remains a leading cause of mortality and long-term disability worldwide, presenting a significant global health challenge. Effective early prediction models are essential for reducing its impact. This study introduces a novel ensemble method for predicting stroke using two datasets: a primary dataset collected from a hospital, containing medical histories and clinical parameters, and a secondary dataset. Methods: We applied several preprocessing techniques, including outlier detection, data normalization, k-means clustering, and missing value detection, to refine the datasets. A novel ensemble classifier was developed, combining AdaBoost, Gradient Boosting Machine (GBM), Multilayer Perceptron (MLP), and Random Forest (RF) algorithms to enhance predictive accuracy. Additionally, Explainable Artificial Intelligence (XAI) techniques such as SHAP and LIME were integrated to elucidate key features influencing stroke prediction. Results: The proposed ensemble classifier achieved an accuracy of 95% for the secondary dataset and 80.36% for the primary dataset. Comparative analysis with other machine learning models highlighted the superior performance of the ensemble approach. The integration of XAI further provided insights into the critical indicators influencing stroke classification, improving model interpretability and decision-making. Conclusion: Our study demonstrates that the novel ensemble classifier, supported by effective preprocessing and XAI techniques, is a powerful tool for stroke prediction. The high accuracy rates achieved validate its effectiveness and potential for practical clinical application. Future work will focus on incorporating deep learning techniques and medical imaging to further improve classification accuracy and model performance.
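The abstract names the four base learners but does not say how their outputs are combined. As one possible reading, the sketch below assumes soft voting: each base model emits a stroke probability and the ensemble thresholds the mean. The function name `soft_vote`, the threshold, and the probabilities are all hypothetical.

```python
from statistics import mean


def soft_vote(probabilities, threshold=0.5):
    """Average per-model stroke probabilities and threshold the mean.

    Soft voting is an assumption here; the source does not state the
    combining rule used by the ensemble.
    """
    avg = mean(probabilities.values())
    return avg, int(avg >= threshold)


# Hypothetical outputs of the four base learners for a single patient.
base_outputs = {"adaboost": 0.62, "gbm": 0.71, "mlp": 0.48, "rf": 0.67}

avg_prob, label = soft_vote(base_outputs)
print(f"mean probability = {avg_prob:.2f}, predicted label = {label}")
```

Here one model (the MLP) votes against stroke on its own, but the averaged probability still crosses the threshold, which is the kind of smoothing an ensemble is meant to provide.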
Affiliation(s)
- Md. Maruf Hossain
- Department of Biomedical Engineering, Islamic University, Kushtia, Bangladesh
- Bio‐Imaging Research Laboratory, BME, Islamic University, Kushtia, Bangladesh
- Md. Mahfuz Ahmed
- Department of Biomedical Engineering, Islamic University, Kushtia, Bangladesh
- Bio‐Imaging Research Laboratory, BME, Islamic University, Kushtia, Bangladesh
- Rakib Hasan
- Department of Biomedical Engineering, Islamic University, Kushtia, Bangladesh
- Md. Rakibul Islam
- Bio‐Imaging Research Laboratory, BME, Islamic University, Kushtia, Bangladesh
- Department of Computer Science and Engineering, Northern University Bangladesh, Dhaka, Bangladesh
- Md Shahariar Alam
- Department of Information and Communication Technology, Islamic University, Kushtia, Bangladesh
- Md. Khairul Islam
- Department of Biomedical Engineering, Islamic University, Kushtia, Bangladesh
- Bio‐Imaging Research Laboratory, BME, Islamic University, Kushtia, Bangladesh
2
Thatha VN, Chalichalamala S, Pamula U, Krishna DP, Chinthakunta M, Mantena SV, Vahiduddin S, Vatambeti R. Optimized machine learning mechanism for big data healthcare system to predict disease risk factor. Sci Rep 2025; 15:14327. [PMID: 40274987 PMCID: PMC12022254 DOI: 10.1038/s41598-025-98721-6] [Received: 10/01/2024] [Accepted: 04/14/2025] [Indexed: 04/26/2025] Open
Abstract
Heart disease is becoming increasingly common in modern society because of factors such as stress and inadequate diet. Early identification of heart disease risk factors is essential, as it allows for treatment plans that may reduce the risk of severe consequences and enhance patient outcomes. Predictive methods have been used to estimate the risk factor, but they often have drawbacks such as improper feature selection and overfitting. To overcome this, a novel Deep Red Fox belief prediction system (DRFBPS) has been introduced and implemented in Python. Initially, the data was collected and preprocessed to enhance its quality, and the relevant features were selected using red fox optimization. The selected features are used to analyze the risk factors, and DRFBPS makes the prediction. The effectiveness of the DRFBPS model is validated using accuracy, F score, precision, AUC, recall, and error rate. The findings demonstrate the use of DRFBPS as a practical tool in healthcare analytics by showing the rate at which it produces accurate and reliable predictions. Additionally, its application in healthcare systems, including clinical decisions and remote patient monitoring, proves its real-world applicability in enhancing early diagnosis and preventive care measures. The results prove DRFBPS to be a potential tool in healthcare analytics, providing a strong framework for predictive modeling in heart disease risk prediction.
Affiliation(s)
- Silpa Chalichalamala
- Department of Artificial Intelligence and Data Science, GITAM School of Technology, GITAM University-Bengaluru Campus, Bengaluru, India
- Udayaraju Pamula
- Department of Computer Science and Engineering, School of Engineering and Sciences, SRM University, Amaravati, AP, India
- D Pramodh Krishna
- Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, India
- Manjunath Chinthakunta
- Department of Computer Science and Engineering (AI & ML), Vidyavardhaka College of Engineering, Mysore, India
- Srihari Varma Mantena
- Department of Computer Science and Engineering, SRKR Engineering College, Bhimavaram, 534204, India
- Shariff Vahiduddin
- Department of Computer Science and Engineering, Sir C R Reddy College of Engineering, Eluru, India
- Ramesh Vatambeti
- School of Computer Science and Engineering, VIT-AP University, Vijayawada, 522237, India.
3
Heseltine-Carp W, Courtman M, Browning D, Kasabe A, Allen M, Streeter A, Ifeachor E, James M, Mullin S. Machine learning to predict stroke risk from routine hospital data: A systematic review. Int J Med Inform 2025; 196:105811. [PMID: 39908727 DOI: 10.1016/j.ijmedinf.2025.105811] [Received: 12/26/2024] [Revised: 01/20/2025] [Accepted: 01/23/2025] [Indexed: 02/07/2025]
Abstract
PURPOSE Stroke remains a leading cause of morbidity and mortality. Despite this, current risk stratification tools such as CHA2DS2-VASc and QRISK3 are of limited accuracy, particularly in those without a diagnosis of atrial fibrillation. Hence, there is a need for more accurate stroke risk prediction models. Machine learning (ML) may provide a solution by leveraging existing routine hospital databases to build accurate stroke risk prediction models and identify novel risk factors for stroke. AIMS In this systematic review we appraise current research using ML to predict stroke risk from routine hospital data. Based on these findings we then highlight common methodological limitations and recommendations for future research. METHODS In this review we identify 49 original research articles (38 in the general population and 11 in AF-specific populations) from the PubMed database, published from January 2013 to December 2024, that use ML and routine hospital data to predict the risk of stroke. RESULTS ML models were able to accurately predict stroke risk in both AF-specific and general populations, with AUCs ranging from 0.64 to 0.99. Where tested, ML also consistently outperformed traditional risk stratification tools such as CHA2DS2-VASc. ML also appeared useful in identifying several novel risk factors from electrocardiogram, laboratory test and echocardiography data. However, the quality of datasets was often limited, there was a high suspicion of overfitting, and models often lacked calibration, external validation and explainability analysis. CONCLUSION Whilst ML has shown great potential in stroke prediction and in identifying novel risk factors for stroke, improvements in study methodology are required prior to integration of ML into routine healthcare. Future research should adhere to the EQUATOR guidance on prediction models and encourage interdisciplinary collaboration between computer scientists and clinicians. Further prospective RCTs are also required to validate models in the clinical setting and to identify barriers to integrating ML into routine healthcare.
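For readers unfamiliar with the baseline tool the review compares ML against, the sketch below computes a CHA2DS2-VASc score using the standard point assignments (1 point each for congestive heart failure, hypertension, diabetes, vascular disease, age 65 to 74, and female sex; 2 points each for age 75 or over and prior stroke/TIA/thromboembolism). The function name, argument names, and the example patient are illustrative and not taken from the review.

```python
def cha2ds2_vasc(age, female, chf, hypertension, diabetes,
                 prior_stroke_tia, vascular_disease):
    """Return the CHA2DS2-VASc score (0 to 9) from its standard components."""
    score = 0
    score += 1 if chf else 0               # C: congestive heart failure
    score += 1 if hypertension else 0      # H: hypertension
    if age >= 75:                          # A2: age 75 or over
        score += 2
    elif age >= 65:                        # A: age 65 to 74
        score += 1
    score += 1 if diabetes else 0          # D: diabetes mellitus
    score += 2 if prior_stroke_tia else 0  # S2: prior stroke/TIA/thromboembolism
    score += 1 if vascular_disease else 0  # V: vascular disease
    score += 1 if female else 0            # Sc: sex category (female)
    return score


# Hypothetical patient: 72-year-old woman with hypertension and diabetes.
example_score = cha2ds2_vasc(age=72, female=True, chf=False, hypertension=True,
                             diabetes=True, prior_stroke_tia=False,
                             vascular_disease=False)
print(example_score)  # 4
```

A score like this uses seven hand-picked inputs with fixed integer weights, which helps explain why models trained on richer routine hospital data can outperform it.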
Affiliation(s)
- William Heseltine-Carp
- University of Plymouth, Room N6, ITTC Building, Plymouth Science Park, Plymouth PL6 8BX, UK.
- Megan Courtman
- University of Plymouth, Room N6, ITTC Building, Plymouth Science Park, Plymouth PL6 8BX, UK; University of Plymouth, Plymouth PL4 8AA, UK.
- Daniel Browning
- University of Plymouth, Room N6, ITTC Building, Plymouth Science Park, Plymouth PL6 8BX, UK.
- Aishwarya Kasabe
- University of Plymouth, Room N6, ITTC Building, Plymouth Science Park, Plymouth PL6 8BX, UK.
- Michael Allen
- University of Exeter, Medical School, St Lukes Campus, Heavitree Road, SC 2.30, Exeter EX4 4QJ, UK.
- Adam Streeter
- University of Plymouth, N15, ITTC1, Plymouth Science Park, Plymouth PL6 8BX, UK.
- Emmanuel Ifeachor
- University of Plymouth, N15, ITTC1, Plymouth Science Park, Plymouth PL6 8BX, UK; School of Engineering, Computing and Mathematics, University of Plymouth, Plymouth PL4 8AA, UK.
- Martin James
- University of Exeter, Academic Department of Healthcare for Older People, Royal Devon & Exeter Hospital, Exeter EX2 5DW, UK.
- Stephen Mullin
- University of Plymouth, Room N6, ITTC Building, Plymouth Science Park, Plymouth PL6 8BX, UK.
4
Abousaber I. A Novel Explainable Attention-Based Meta-Learning Framework for Imbalanced Brain Stroke Prediction. Sensors (Basel, Switzerland) 2025; 25:1739. [PMID: 40292890 PMCID: PMC11945820 DOI: 10.3390/s25061739] [Received: 01/07/2025] [Revised: 03/01/2025] [Accepted: 03/07/2025] [Indexed: 04/30/2025]
Abstract
The accurate prediction of brain stroke is critical for effective diagnosis and management, yet the imbalanced nature of medical datasets often hampers the performance of conventional machine learning models. To address this challenge, we propose a novel meta-learning framework that integrates advanced hybrid resampling techniques, ensemble-based classifiers, and explainable artificial intelligence (XAI) to enhance predictive performance and interpretability. The framework employs SMOTE and SMOTEENN for handling class imbalance, dynamic feature selection to reduce noise, and a meta-learning approach combining predictions from Random Forest and LightGBM, and further refined by a deep learning-based meta-classifier. The model uses SHAP (Shapley Additive Explanations) to provide transparent insights into feature contributions, increasing trust in its predictions. Evaluated on three datasets, DF-1, DF-2, and DF-3, the proposed framework consistently outperformed state-of-the-art methods, achieving accuracy and F1-Score of 0.992189 and 0.992579 on DF-1, 0.980297 and 0.981916 on DF-2, and 0.981901 and 0.983365 on DF-3. These results validate the robustness and effectiveness of the approach, significantly improving the detection of minority-class instances while maintaining overall performance. This work establishes a reliable solution for stroke prediction and provides a foundation for applying meta-learning and explainable AI to other imbalanced medical prediction tasks.
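The resampling step can be illustrated with a minimal, dependency-free sketch of the core SMOTE idea: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is not the paper's pipeline; it omits the ENN cleaning step of SMOTEENN, and the function name `smote_sketch`, the value of `k`, and the toy data are assumptions.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority points by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, excluding x itself.
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Toy minority class with two features per sample (hypothetical values).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
new_points = smote_sketch(minority, n_new=3)
print(new_points)
```

Because every synthetic point lies between two real minority samples, the new data stays inside the region the minority class already occupies rather than duplicating existing rows.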
Affiliation(s)
- Inam Abousaber
- Department of Information Technology, Faculty of Computers and Information Technology, University of Tabuk, Tabuk 47912, Saudi Arabia
5
Haimerl M, Reich C. Risk-based evaluation of machine learning-based classification methods used for medical devices. BMC Med Inform Decis Mak 2025; 25:126. [PMID: 40069689 PMCID: PMC11895222 DOI: 10.1186/s12911-025-02909-9] [Received: 09/01/2023] [Accepted: 01/31/2025] [Indexed: 03/15/2025] Open
Abstract
BACKGROUND In the future, more medical devices will be based on machine learning (ML) methods. In general, the consideration of risks is a crucial aspect of evaluating medical devices. Accordingly, risks and their associated costs should be taken into account when assessing the performance of ML-based medical devices. This paper addresses three research questions towards a risk-based evaluation, with a focus on ML-based classification models. METHODS First, we analyzed how often risk-based metrics are currently utilized in the context of ML-based classification models. This was performed using a literature search based on a sample of recent scientific publications. Second, we introduce an approach for evaluating such models in which expected risks and associated costs are integrated into the corresponding performance metrics. Additionally, we analyze the impact of different risk ratios on the resulting overall performance. Third, we elaborate how such risk-based approaches relate to regulatory requirements in the field of medical devices. A set of use case scenarios was used to demonstrate necessities and practical implications in this regard. RESULTS First, it was shown that currently most scientific publications do not include risk-based approaches for measuring performance. Second, it was demonstrated that risk-based considerations have a substantial impact on the outcome. The relative increase of the resulting overall risks can go up to 196% when the ratio between different types of risks (false negatives vs. false positives) changes by a factor of 10.0. Third, we elaborated that risk-based considerations need to be included in the assessment of ML-based medical devices, according to the relevant EU regulations and standards. In particular, this applies when there is a substantial impact on the clinical outcome or on the risk-benefit relationship.
CONCLUSION In summary, we demonstrated the necessity of a risk-based approach for the evaluation of medical devices that include ML-based classification methods. We showed that currently many scientific papers in this area do not include risk considerations. We developed basic steps towards a risk-based assessment of ML-based classifiers and described the consequences that could occur when these steps are neglected. Finally, we demonstrated the consistency of our approach with current regulatory requirements in the EU.
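The effect of the false-negative to false-positive risk ratio can be seen in a small worked example. The sketch below assumes a simple linear cost model (cost-weighted error counts divided by the number of cases); the paper's actual metric may differ, and the two models and their error counts are invented for illustration.

```python
def expected_risk(fn, fp, n, cost_fn, cost_fp=1.0):
    """Average cost per case under a linear cost model (an assumption)."""
    return (cost_fn * fn + cost_fp * fp) / n


# Hypothetical error counts for two classifiers on the same 1000 cases.
model_a = {"fn": 10, "fp": 60}   # few missed cases, many false alarms
model_b = {"fn": 30, "fp": 20}   # fewer errors overall, more missed cases

for ratio in (1.0, 10.0):
    risk_a = expected_risk(model_a["fn"], model_a["fp"], 1000, cost_fn=ratio)
    risk_b = expected_risk(model_b["fn"], model_b["fp"], 1000, cost_fn=ratio)
    better = "A" if risk_a < risk_b else "B"
    print(f"FN:FP cost ratio {ratio:>4}: A={risk_a:.3f}  B={risk_b:.3f}  -> {better}")
```

With equal costs, model B looks better because it makes fewer errors in total. Once a false negative is weighted ten times as heavily as a false positive, model A has the lower expected risk, so the ranking of the two models flips without any change in their predictions.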
Affiliation(s)
- Martin Haimerl
- Furtwangen University of Applied Sciences, Furtwangen, Germany.
- Christoph Reich
- Furtwangen University of Applied Sciences, Furtwangen, Germany
6
Zhu J, Lin L, Si L, Zhao H, Song H, Xu X. Urban and rural disparities in stroke prediction using machine learning among Chinese older adults. Sci Rep 2025; 15:6779. [PMID: 40000818 PMCID: PMC11861258 DOI: 10.1038/s41598-025-91157-y] [Received: 05/13/2024] [Accepted: 02/18/2025] [Indexed: 02/27/2025] Open
Abstract
Stroke is a significant health concern in China. Differences in stroke risk between rural and urban areas have been highlighted in prior research. However, there is a scarcity of studies on urban-rural differences in predicting stroke. This study aimed to develop stroke prediction models, and urban-rural subgroup analyses were conducted to explore disparities in determinants among middle-aged and older adults. We employed nine machine learning algorithms, namely logistic regression (LR), adaptive boosting classifier, support vector machines, extreme gradient boosting, random forest, Gaussian naive Bayes (GNB), gradient boosting machine, light gradient boosting decision machine, and K Nearest Neighbours, using data derived from 9,413 individuals aged 45 years and above obtained from the China Health and Retirement Longitudinal Study (CHARLS) conducted in 2011 to build stroke prediction models and analyze urban-rural subgroups. In the total population, GNB (AUC = 0.76) was the best model for predicting strokes, and the ten most important variables were the time taken for repeated chair stands, the chair height from floor to seat, knee height, creatinine, complete repeated chair stands, mean corpuscular volume, platelet, uric acid, body mass index, and white blood cell. In the rural subgroup, LR and GNB (AUC = 0.76) were the best, and the ten most important variables were the time taken for repeated chair stands, creatinine, platelet, the chair height from floor to seat, knee height, complete repeated chair stands, pulse, white blood cell, maintaining semi-tandem balance statically, and uric acid. In the urban subgroup, LR (AUC = 0.67) was the best, and the ten most important variables were the time taken for repeated chair stands, mean corpuscular volume, maintaining semi-tandem balance statically, uric acid, right-hand grip strength, age, blood urea nitrogen, use of trunk, arms, legs for semi-tandem balance, number of marriages, and night sleep duration.
The time taken for repeated chair stands was more critical in the stroke risk model for rural individuals. Uric acid and maintaining semi-tandem balance statically were more critical in the stroke risk model for urban individuals. Our results revealed the importance of knee height and physical function predictors for stroke and highlighted the differences in determinants between urban and rural individuals, proposing targeted stroke prevention and control strategies in different populations in terms of physical function.
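The AUC values used to rank the models have a direct probabilistic reading: the chance that a randomly chosen stroke case receives a higher predicted score than a randomly chosen non-case. The sketch below computes AUC from that pairwise definition, counting ties as half. The scores and labels are invented and do not come from CHARLS.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks for three stroke cases and three non-cases.
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
labels = [1, 1, 1, 0, 0, 0]

print(f"AUC = {auc(scores, labels):.3f}")  # 7 of 9 pairs correct -> 0.778
```

On this reading, the gap between 0.76 (rural) and 0.67 (urban) means the urban model orders a case above a non-case noticeably less often, with 0.5 corresponding to random ordering.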
Affiliation(s)
- Jingjing Zhu
- School of Public Health, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Luotao Lin
- Department of Individual, Family, and Community Education, University of New Mexico, Albuquerque, USA
- Lei Si
- School of Health Sciences, Western Sydney University, Penrith, Australia
- Hailei Zhao
- School of Public Health, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Hualing Song
- School of Public Health, Shanghai University of Traditional Chinese Medicine, Shanghai, China.
- Xianglong Xu
- School of Public Health, Shanghai University of Traditional Chinese Medicine, Shanghai, China.
- School of Translational Medicine, Monash University, Melbourne, VIC, Australia.
7
Mochurad L, Babii V, Boliubash Y, Mochurad Y. Improving stroke risk prediction by integrating XGBoost, optimized principal component analysis, and explainable artificial intelligence. BMC Med Inform Decis Mak 2025; 25:63. [PMID: 39920691 PMCID: PMC11806876 DOI: 10.1186/s12911-025-02894-z] [Received: 10/23/2024] [Accepted: 01/28/2025] [Indexed: 02/09/2025] Open
Abstract
This study is motivated by the growing number of diseases of the cerebrovascular system, in particular stroke, which is one of the leading causes of disability and mortality in the world. To improve stroke risk prediction models in terms of efficiency and interpretability, we propose to integrate modern machine learning algorithms and data dimensionality reduction methods, in particular XGBoost and optimized principal component analysis (PCA), which provide data structuring and increase processing speed, especially for large datasets. For the first time, explainable artificial intelligence (XAI) is integrated into the PCA process, which increases transparency and interpretability, providing a better understanding of risk factors for medical professionals. The proposed approach was tested on two datasets, with accuracy of 95% and 98%. Cross-validation yielded an average value of 0.99, and high values of the Matthews correlation coefficient (MCC) of 0.96 and Cohen's Kappa (CK) of 0.96 confirmed the generalizability and reliability of the model. The processing speed is increased threefold due to OpenMP parallelization, which makes it possible to apply it in practice. Thus, the proposed method is innovative and can potentially improve forecasting systems in the healthcare industry.
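The two agreement metrics quoted above can both be computed from a binary confusion matrix. The sketch below uses the standard formulas for the Matthews correlation coefficient and Cohen's kappa; the confusion-matrix counts are hypothetical and are not the paper's results.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom


def cohen_kappa(tp, tn, fp, fn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + tn + fp + fn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical counts on 100 test cases.
counts = {"tp": 48, "tn": 50, "fp": 1, "fn": 1}
print(f"MCC   = {mcc(**counts):.3f}")
print(f"kappa = {cohen_kappa(**counts):.3f}")
```

Unlike plain accuracy, both metrics discount agreement that would be expected by chance, which matters when stroke cases are a small minority of the dataset.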
Affiliation(s)
- Lesia Mochurad
- Artificial Intelligence Department, Lviv Polytechnic National University, 12 S. Bandery St, Lviv, 79013, Ukraine.
- Viktoriia Babii
- Artificial Intelligence Department, Lviv Polytechnic National University, 12 S. Bandery St, Lviv, 79013, Ukraine
- Yuliia Boliubash
- Artificial Intelligence Department, Lviv Polytechnic National University, 12 S. Bandery St, Lviv, 79013, Ukraine
- Yulianna Mochurad
- Danylo Halytsky Lviv National Medical University, 69 Pekarska Street, Lviv, 79010, Ukraine
8
Xie S, Peng S, Zhao L, Yang B, Qu Y, Tang X. A comprehensive analysis of stroke risk factors and development of a predictive model using machine learning approaches. Mol Genet Genomics 2025; 300:18. [PMID: 39853452 PMCID: PMC11762205 DOI: 10.1007/s00438-024-02217-3] [Received: 10/08/2024] [Accepted: 12/15/2024] [Indexed: 01/26/2025]
Abstract
Stroke is a leading cause of death and disability globally, particularly in China. Identifying risk factors for stroke at an early stage is critical to improving patient outcomes and reducing the overall disease burden. However, the complexity of stroke risk factors requires advanced approaches for accurate prediction. The objective of this study is to identify key risk factors for stroke and develop a predictive model using machine learning techniques to enhance early detection and improve clinical decision-making. Data from the China Health and Retirement Longitudinal Study (2011-2020) were analyzed, classifying participants based on baseline characteristics. We evaluated correlations among 12 chronic diseases and applied machine learning algorithms to identify stroke-associated parameters. A dose-response relationship between these parameters and stroke was assessed using restricted cubic splines with Cox proportional hazards models. A refined predictive model, incorporating age, sex, and key risk factors, was developed. Stroke patients were significantly older (average age 69.03 years) and had a higher proportion of women (53%) compared to non-stroke individuals. Additionally, stroke patients were more likely to reside in rural areas, be unmarried, smoke, and suffer from various diseases. While the 12 chronic diseases were correlated (p < 0.05), the correlation coefficients were generally weak (r < 0.5). Machine learning identified nine parameters significantly associated with stroke risk: TyG-WC, WHtR, TyG-BMI, TyG, TMO, CysC, CREA, SBP, and HDL-C. Of these, TyG-WC, WHtR, TyG-BMI, TyG, CysC, CREA, and SBP exhibited a positive dose-response relationship with stroke risk. In contrast, TMO and HDL-C were associated with reduced stroke risk. 
In the fully adjusted model, elevated CysC (HR = 2.606, 95% CI 1.869-3.635), CREA (HR = 1.819, 95% CI 1.240-2.668), and SBP (HR = 1.008, 95% CI 1.003-1.012) were significantly associated with increased stroke risk, while higher HDL-C (HR = 0.989, 95% CI 0.984-0.995) and TMO (HR = 0.99995, 95% CI 0.99994-0.99997) were protective. A nomogram model incorporating age, sex, and the identified parameters demonstrated superior predictive accuracy, with a significantly higher Harrell's C-index compared to individual predictors. This study identifies several significant stroke risk factors and presents a predictive model that can enhance early detection of high-risk individuals. Among them, CREA, CysC, SBP, TyG-BMI, TyG, TyG-WC, and WHtR were positively associated with stroke risk, whereas TMO and HDL-C were inversely associated with it. The model serves as a valuable decision-support resource for clinicians, facilitating more effective prevention and treatment strategies and ultimately improving patient outcomes.
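The abstract does not define the composite indices or the units behind the hazard ratios. The sketch below uses the definitions commonly found in the literature, which are an assumption here: TyG = ln(fasting triglycerides [mg/dL] × fasting glucose [mg/dL] / 2), TyG-BMI = TyG × BMI, TyG-WC = TyG × waist circumference, and WHtR = waist / height. It also shows why an HR as close to 1 as the reported 1.008 for SBP can still matter, assuming it is expressed per 1 mmHg in a log-linear Cox model. All patient values are invented.

```python
import math


def tyg_index(triglycerides_mg_dl, glucose_mg_dl):
    """Triglyceride-glucose index (commonly used definition; an assumption)."""
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2)


def rescale_hazard_ratio(hr_per_unit, units):
    """In a log-linear Cox model, the HR for a k-unit increase is HR ** k."""
    return hr_per_unit ** units


# Hypothetical patient measurements.
tyg = tyg_index(triglycerides_mg_dl=150.0, glucose_mg_dl=100.0)
tyg_bmi = tyg * 25.0   # BMI in kg/m^2
tyg_wc = tyg * 90.0    # waist circumference in cm
whtr = 90.0 / 170.0    # waist-to-height ratio

# Reported SBP hazard ratio, assumed here to be per 1 mmHg.
sbp_hr_per_10 = rescale_hazard_ratio(1.008, 10)

print(f"TyG = {tyg:.3f}, TyG-BMI = {tyg_bmi:.1f}, "
      f"TyG-WC = {tyg_wc:.1f}, WHtR = {whtr:.3f}")
print(f"SBP hazard ratio per 10 mmHg = {sbp_hr_per_10:.3f}")
```

Under these assumptions a 10 mmHg rise in SBP corresponds to roughly an 8% higher hazard, which is why per-unit hazard ratios for continuous variables should be read together with the variable's scale.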
Affiliation(s)
- Songquan Xie
- Neurosurgery Department of North Sichuan Medical College Affiliated Hospital, NanChong, 637000, Sichuan, China
- Shuting Peng
- Neurosurgery Department of North Sichuan Medical College Affiliated Hospital, NanChong, 637000, Sichuan, China
- Long Zhao
- Neurosurgery Department of North Sichuan Medical College Affiliated Hospital, NanChong, 637000, Sichuan, China
- Binbin Yang
- Neurosurgery Department of North Sichuan Medical College Affiliated Hospital, NanChong, 637000, Sichuan, China
- Yukun Qu
- Neurosurgery Department of North Sichuan Medical College Affiliated Hospital, NanChong, 637000, Sichuan, China
- Xiaoping Tang
- Neurosurgery Department of North Sichuan Medical College Affiliated Hospital, NanChong, 637000, Sichuan, China.
9
Shen M, Zhang Y, Zhan R, Du T, Shen P, Lu X, Liu S, Guo R, Shen X. Predicting the risk of cardiovascular disease in adults exposed to heavy metals: Interpretable machine learning. Ecotoxicology and Environmental Safety 2025; 290:117570. [PMID: 39721423 DOI: 10.1016/j.ecoenv.2024.117570] [Received: 09/05/2024] [Revised: 12/16/2024] [Accepted: 12/17/2024] [Indexed: 12/28/2024]
Abstract
Machine learning exhibits excellent performance in terms of predictive power. We aimed to construct an interpretable machine learning model utilizing National Health and Nutrition Examination Survey data to investigate the relationship between heavy metal exposure and cardiovascular disease (CVD). A total of 4600 adults were included in the analysis. The Least Absolute Shrinkage and Selection Operator regression method was employed to select relevant feature variables. Subsequently, six machine learning models were constructed: random forest, decision tree, gradient boosting decision tree, k-nearest neighbor, support vector machine, and AdaBoost. Feature importance analysis, partial dependence plots, and Shapley additive explanations were integrated to enhance the interpretability of the CVD prediction model. Among all models, the random forest exhibited the best performance, with an accuracy of 90%, an area under the curve of 0.85, and an F1 score of 0.86. Urine cadmium (Cd), blood lead (Pb), urine thallium (Tl), and urine tungsten (W) were identified as the most significant predictors of CVD, with importance scores of 0.062, 0.057, 0.051, and 0.050, respectively. At the overall level, higher levels of urine Cd, blood Pb, and urine W were associated with an increased risk of CVD, whereas a lower level of urine Tl was linked to a reduced CVD risk. Additionally, the analysis of synergistic effects revealed that Cd was the predominant determinant of CVD risk. The random forest-based CVD prediction model demonstrated excellent predictive power and provided valuable insights for personalized patient care and optimal resource allocation in populations exposed to heavy metals.
Affiliation(s)
- Meiyue Shen
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China
- Yine Zhang
- Ningxia Center for Disease Control and Prevention, Yinchuan, China
- Tingwei Du
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China
- Peixuan Shen
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China
- Xiaochuan Lu
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China
- Shengnan Liu
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China; Ningxia Center for Disease Control and Prevention, Yinchuan, China; Qingdao Haici Hospital, Qingdao 266033, China
- Rongrong Guo
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China
- Xiaoli Shen
- Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China.
10
Sorayaie Azar A, Samimi T, Tavassoli G, Naemi A, Rahimi B, Hadianfard Z, Wiil UK, Nazarbaghi S, Bagherzadeh Mohasefi J, Lotfnezhad Afshar H. Predicting stroke severity of patients using interpretable machine learning algorithms. Eur J Med Res 2024; 29:547. [PMID: 39538301 PMCID: PMC11562860 DOI: 10.1186/s40001-024-02147-1] [Received: 03/25/2024] [Accepted: 11/05/2024] [Indexed: 11/16/2024] Open
Abstract
BACKGROUND Stroke is a significant global health concern, ranking as the second leading cause of death and placing a substantial financial burden on healthcare systems, particularly in low- and middle-income countries. Timely evaluation of stroke severity is crucial for predicting clinical outcomes, with standard assessment tools being the Rapid Arterial Occlusion Evaluation (RACE) and the National Institutes of Health Stroke Scale (NIHSS). This study aims to utilize Machine Learning (ML) algorithms to predict stroke severity using these two distinct scales. METHODS We conducted this study using two datasets collected from hospitals in Urmia, Iran, corresponding to stroke severity assessments based on RACE and NIHSS. Seven ML algorithms were applied, including K-Nearest Neighbor (KNN), Decision Tree (DT), Random Forest (RF), Adaptive Boosting (AdaBoost), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), and Artificial Neural Network (ANN). Hyperparameter tuning was performed using grid search to optimize model performance, and SHapley Additive Explanations (SHAP) were used to interpret the contribution of individual features. RESULTS Among the models, the RF achieved the highest performance, with accuracies of 92.68% for the RACE dataset and 91.19% for the NIHSS dataset. The Area Under the Curve (AUC) was 92.02% and 97.86% for the RACE and NIHSS datasets, respectively. The SHAP analysis identified triglyceride levels, length of hospital stay, and age as critical predictors of stroke severity. CONCLUSIONS This study is the first to apply ML models to the RACE and NIHSS scales for predicting stroke severity. The use of SHAP enhances the interpretability of the models, increasing clinicians' trust in these ML algorithms. The best-performing ML model can be a valuable tool for assisting medical professionals in predicting stroke severity in clinical settings.
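SHAP attributions are built on Shapley values, which average each feature's marginal contribution over every order in which features can be added. The sketch below computes exact Shapley values for a three-feature toy value function. The feature names echo the predictors highlighted in the abstract, but the value function and its numbers are invented, and the SHAP library itself uses faster approximations rather than this brute-force enumeration.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley values by averaging marginal gains over all orderings."""
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value(frozenset(present))
            present.add(f)
            contrib[f] += value(frozenset(present)) - before
    return {f: c / len(orderings) for f, c in contrib.items()}


def toy_value(subset):
    """Hypothetical model output given only the features in `subset`."""
    score = 0.0
    if "triglycerides" in subset:
        score += 0.30
    if "length_of_stay" in subset:
        score += 0.20
    if "age" in subset:
        score += 0.10
    if {"triglycerides", "age"} <= subset:
        score += 0.06  # interaction term, split evenly between the two
    return score


features = ["triglycerides", "length_of_stay", "age"]
phi = shapley_values(features, toy_value)
for name, v in phi.items():
    print(f"{name:>15}: {v:.2f}")
```

The attributions sum to the difference between the full-model output and the empty-set output, and the interaction between triglycerides and age is shared equally between those two features.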
Affiliation(s)
- Amir Sorayaie Azar
- SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Department of Computer Engineering, Urmia University, Urmia, Iran
- Tahereh Samimi
- Department of Health Information Technology, Urmia University of Medical Sciences, Urmia, Iran
- Health and Biomedical Informatics Research Center, Urmia University of Medical Sciences, Urmia, Iran
- Ghanbar Tavassoli
- Department of Health Information Technology, Urmia University of Medical Sciences, Urmia, Iran
- Health and Biomedical Informatics Research Center, Urmia University of Medical Sciences, Urmia, Iran
- Department of Computer Engineering, Urmia Branch, Islamic Azad University, Urmia, Iran
- Amin Naemi
- SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Bahlol Rahimi
- Department of Health Information Technology, Urmia University of Medical Sciences, Urmia, Iran
- Health and Biomedical Informatics Research Center, Urmia University of Medical Sciences, Urmia, Iran
- Zahra Hadianfard
- Department of Health Information Technology, Urmia University of Medical Sciences, Urmia, Iran
- Uffe Kock Wiil
- SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Surena Nazarbaghi
- Department of Neurology, School of Medicine, Urmia University of Medical Sciences, Urmia, Iran
- Jamshid Bagherzadeh Mohasefi
- SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Department of Computer Engineering, Urmia University, Urmia, Iran
- Hadi Lotfnezhad Afshar
- Department of Health Information Technology, Urmia University of Medical Sciences, Urmia, Iran
- Health and Biomedical Informatics Research Center, Urmia University of Medical Sciences, Urmia, Iran
11
Wei Z, Li M, Zhang C, Miao J, Wang W, Fan H. Machine learning-based predictive model for post-stroke dementia. BMC Med Inform Decis Mak 2024; 24:334. [PMID: 39529118 PMCID: PMC11555950 DOI: 10.1186/s12911-024-02752-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Accepted: 11/04/2024] [Indexed: 11/16/2024] Open
Abstract
BACKGROUND Post-stroke dementia (PSD), a common complication, diminishes rehabilitation efficacy and affects disease prognosis in stroke patients. Many factors may be related to PSD, including demographic characteristics, comorbidities, and examination findings. However, most existing methods are qualitative evaluations of independent factors, which ignore the interactions among these factors. Therefore, the purpose of this study is to explore the applicability of machine learning (ML) methods for predicting PSD. METHODS Nine features were selected using Spearman correlation analysis and the Boruta algorithm. We developed and evaluated eight ML models: logistic regression, elastic net, k-nearest neighbors, decision tree, extreme gradient boosting, support vector machine, random forest, and multilayer perceptron. RESULTS A total of 539 stroke patients were included in this study. Among the eight models used to predict PSD, extreme gradient boosting and random forest showed the highest area under the receiver operating characteristic curve (AUC), with values of 0.7287 and 0.7285, respectively. The most important features for predicting PSD included age, high-sensitivity C-reactive protein, stroke side and location, and the occurrence of cerebral hemorrhage. CONCLUSION Our findings suggest that ML models, especially extreme gradient boosting, can best predict the risk of PSD.
Affiliation(s)
- Zemin Wei
- Department of Geriatrics, Shaoxing People's Hospital, Shaoxing, Zhejiang, P. R. China
- Mengqi Li
- School of Medicine, Shaoxing University, Shaoxing, Zhejiang, P. R. China
- Chenghui Zhang
- School of Medicine, Shaoxing University, Shaoxing, Zhejiang, P. R. China
- Jinli Miao
- The Yangtze River Delta Biological Medicine Research and Development Center of Zhejiang Province, Yangtze Delta Region Institution of Tsinghua University, Hangzhou, 314006, Zhejiang, P.R. China
- Wenmin Wang
- The Yangtze River Delta Biological Medicine Research and Development Center of Zhejiang Province, Yangtze Delta Region Institution of Tsinghua University, Hangzhou, 314006, Zhejiang, P.R. China
- Hong Fan
- Department of Geriatrics, Shaoxing People's Hospital, Shaoxing, Zhejiang, P. R. China
12
Zhang D, Yu N, Yang X, De Marinis Y, Liu ZP, Gao R. SRPNet: stroke risk prediction based on two-level feature selection and deep fusion network. Front Physiol 2024; 15:1357123. [PMID: 39588269 PMCID: PMC11586342 DOI: 10.3389/fphys.2024.1357123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 10/23/2024] [Indexed: 11/27/2024] Open
Abstract
Background Stroke is one of the major chronic non-communicable diseases (NCDs), with high morbidity, disability, and mortality. The key to preventing stroke lies in controlling risk factors. However, screening risk factors and quantifying stroke risk levels remain challenging. Methods A novel stroke risk prediction model based on two-level feature selection and a deep fusion network (SRPNet) is proposed to address these challenges. First, the two-level feature selection method is used to screen comprehensive features related to stroke risk, enabling accurate identification of significant risk factors while eliminating redundant information. Next, a deep fusion network integrating a Transformer and a fully connected neural network (FCN) is used to build the risk prediction model SRPNet for stroke patients. Results We evaluate the performance of SRPNet using screening data from the China Stroke Data Center (CSDC), and further validate its effectiveness with census data on stroke collected at the Affiliated Hospital of Jining Medical University. The experimental results demonstrate that the SRPNet model selects features closely related to stroke and achieves superior risk prediction performance over benchmark methods. Conclusions SRPNet can rapidly identify high-quality stroke risk factors, improve the accuracy of stroke prediction, and provide a powerful tool for clinical diagnosis.
Affiliation(s)
- Daoliang Zhang
- School of Control Science and Engineering, Shandong University, Jinan, China
- Na Yu
- School of Control Science and Engineering, Shandong University, Jinan, China
- Xiaodan Yang
- Department of Rehabilitation Medicine, Affiliated Hospital of Jining Medical University, Jining, China
- Yang De Marinis
- School of Control Science and Engineering, Shandong University, Jinan, China
- Department of Clinical Sciences, Lund University, Malmö, Sweden
- Zhi-Ping Liu
- School of Control Science and Engineering, Shandong University, Jinan, China
- Rui Gao
- School of Control Science and Engineering, Shandong University, Jinan, China
13
Chakraborty P, Bandyopadhyay A, Sahu PP, Burman A, Mallik S, Alsubaie N, Abbas M, Alqahtani MS, Soufiene BO. Predicting stroke occurrences: a stacked machine learning approach with feature selection and data preprocessing. BMC Bioinformatics 2024; 25:329. [PMID: 39407112 PMCID: PMC11476080 DOI: 10.1186/s12859-024-05866-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Accepted: 07/10/2024] [Indexed: 10/20/2024] Open
Abstract
Stroke prediction remains a critical area of research in healthcare, aiming to enhance early intervention and patient care strategies. This study investigates the efficacy of machine learning techniques, particularly principal component analysis (PCA) and a stacking ensemble method, for predicting stroke occurrences based on demographic, clinical, and lifestyle factors. We systematically varied the number of PCA components and implemented a stacking model comprising random forest, decision tree, and K-nearest neighbors (KNN). Our findings demonstrate that setting the number of PCA components to 16 gave the best predictive accuracy, 98.6%. Evaluation metrics underscored the robustness of our approach in handling class imbalance and improving model performance. Comparative analyses against traditional machine learning algorithms such as SVM, logistic regression, and Naive Bayes also highlighted the superiority of our proposed method.
Affiliation(s)
- Pritam Chakraborty
- School of computer engineering, KIIT University, Patia, Bhubaneswar, Odisha, 751024, India
- Anjan Bandyopadhyay
- School of computer engineering, KIIT University, Patia, Bhubaneswar, Odisha, 751024, India
- Preeti Padma Sahu
- School of computer engineering, KIIT University, Patia, Bhubaneswar, Odisha, 751024, India
- Aniket Burman
- School of computer engineering, KIIT University, Patia, Bhubaneswar, Odisha, 751024, India
- Saurav Mallik
- Department of Environmental Health, Harvard T H Chan School of public Health, 677 Harrington Avenue, Boston, MA, 02115, USA
- Najah Alsubaie
- Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, 11671, Riyadh, Saudi Arabia
- Mohamed Abbas
- Electrical Engineering Department, College of Engineering, King Khalid University, 61421, Abha, Saudi Arabia
- Mohammed S Alqahtani
- Radiological Sciences Department, College of Applied Medical Sciences, King Khalid University, 61421, Abha, Saudi Arabia
- BioImaging Unit, Space Research Centre, University of Leicester, Michael Atiyah Building, Leicester, LE1 7RH, UK
- Ben Othman Soufiene
- PRINCE Laboratory Research, ISITcom, Hammam Sousse, University of Sousse, Sousse, Tunisia
14
Saleem MA, Javeed A, Akarathanawat W, Chutinet A, Suwanwela NC, Kaewplung P, Chaitusaney S, Deelertpaiboon S, Srisiri W, Benjapolakul W. An intelligent learning system based on electronic health records for unbiased stroke prediction. Sci Rep 2024; 14:23052. [PMID: 39367027 PMCID: PMC11452373 DOI: 10.1038/s41598-024-73570-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 09/18/2024] [Indexed: 10/06/2024] Open
Abstract
Stroke has a negative impact on people's lives and is one of the leading causes of death and disability worldwide. Early detection of symptoms can significantly help predict stroke and promote a healthy lifestyle. Researchers have developed several methods to predict strokes using machine learning (ML) techniques. However, the proposed systems suffer from two main problems. The first is that the ML models are biased due to the uneven distribution of classes in the dataset. Recent research has not adequately addressed this problem, and no preventive measures have been taken. We used the Synthetic Minority Oversampling Technique (SMOTE) to remove this bias and balance the training of the proposed ML model. The second problem is the low classification accuracy of ML models. We propose a learning system that combines an autoencoder with a linear discriminant analysis (LDA) model to increase the accuracy of stroke prediction. Relevant features are extracted from the feature space using the autoencoder, and the extracted subset is then fed into the LDA model for stroke classification. The hyperparameters of the LDA model are found using a grid search strategy. Because the conventional accuracy metric alone does not truly reflect the performance of ML models, we evaluated the proposed model's accuracy, sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve (AUC). The experimental results show that the proposed model achieves a sensitivity and specificity of 98.51% and 97.56%, respectively, with an accuracy of 99.24% and a balanced accuracy of 98.00%.
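The abstract's point that plain accuracy can mislead on imbalanced data can be illustrated with a minimal sketch. The confusion-matrix counts below are hypothetical and are not taken from the paper; the function name `binary_metrics` is likewise an illustrative assumption. Balanced accuracy is computed here as the mean of sensitivity and specificity, which is the usual definition for binary classification.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall on stroke cases)
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical imbalanced test set: 50 stroke cases, 950 non-stroke cases.
# The classifier misses most stroke cases, yet plain accuracy still looks high.
metrics = binary_metrics(tp=5, fn=45, tn=940, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example the accuracy is 0.945 even though only 10% of stroke cases are detected, which is why sensitivity, specificity, and balanced accuracy are reported alongside it.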
Affiliation(s)
- Muhammad Asim Saleem
- Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
- Ashir Javeed
- Aging Research Center, Karolinska Institutet, 171 65, Stockholm, Sweden
- Wasan Akarathanawat
- Division of Neurology, Department of Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, 10330, Thailand
- Chulalongkorn Stroke Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, 10330, Thailand
- Chula Neuroscience Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, 10330, Thailand
- Aurauma Chutinet
- Division of Neurology, Department of Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, 10330, Thailand
- Chulalongkorn Stroke Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, 10330, Thailand
- Chula Neuroscience Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, 10330, Thailand
- Nijasri Charnnarong Suwanwela
- Division of Neurology, Department of Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, 10330, Thailand
- Chulalongkorn Stroke Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, 10330, Thailand
- Chula Neuroscience Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, 10330, Thailand
- Pasu Kaewplung
- Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
- Surachai Chaitusaney
- Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
- Sunchai Deelertpaiboon
- Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
- Wattanasak Srisiri
- Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
- Watit Benjapolakul
- Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
15
Asadi F, Rahimi M, Daeechini AH, Paghe A. The most efficient machine learning algorithms in stroke prediction: A systematic review. Health Sci Rep 2024; 7:e70062. [PMID: 39355095 PMCID: PMC11443322 DOI: 10.1002/hsr2.70062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2024] [Revised: 08/17/2024] [Accepted: 08/23/2024] [Indexed: 10/03/2024] Open
Abstract
Background and Aims Stroke is one of the most common causes of death worldwide, leading to numerous complications and significantly diminishing the quality of life for those affected. The purpose of this study is to systematically review papers on stroke prediction using machine learning algorithms published between 2019 and August 2023, identify the most efficient algorithms, and compare their performance. Methods The authors conducted a systematic search in PubMed, Scopus, Web of Science, and IEEE using the keywords "Artificial Intelligence," "Predictive Modeling," "Machine Learning," "Stroke," and "Cerebrovascular Accident" from 2019 to August 2023. Results Twenty articles were included based on the inclusion criteria. The Random Forest (RF) algorithm was reported as the best and most efficient ML algorithm for stroke prediction in 25% of the articles (n = 5). In the other articles, Support Vector Machines (SVM), Stacking and XGBOOST, DSGD, COX & GBT, ANN, NB, and RXLM algorithms were reported as the best and most efficient ML algorithms for stroke prediction. Conclusion This research has shown a rapid increase in the use of ML algorithms to predict stroke, with significant improvements in model accuracy in recent years. However, no model has reached 100% accuracy or is entirely error-free. Variations in algorithm efficiency and accuracy stem from differences in sample sizes, datasets, and data types. Further studies should focus on consistent datasets, sample sizes, and data types for more reliable outcomes.
Affiliation(s)
- Farkhondeh Asadi
- Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Milad Rahimi
- Department of Health Information Technology, Urmia University of Medical Sciences, Urmia, Iran
- Amir Hossein Daeechini
- Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Atefeh Paghe
- Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
16
Chaiter Y, Fink DL, Machluf Y. Vascular medicine in the 21st century: Embracing comprehensive vasculature evaluation and multidisciplinary treatment. World J Clin Cases 2024; 12:6032-6044. [PMID: 39328850 PMCID: PMC11326099 DOI: 10.12998/wjcc.v12.i27.6032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 06/25/2024] [Accepted: 07/10/2024] [Indexed: 07/29/2024] Open
Abstract
The field of vascular medicine has undergone a profound transformation in the 21st century, changing our approach to assessment and treatment. Atherosclerosis, a complex inflammatory disease that affects medium and large arteries, presents a major challenge for researchers and healthcare professionals. This condition, characterized by arterial plaque formation and narrowing, poses substantial challenges to vascular health at individual, national, and global scales. Its repercussions are far-reaching, with clinical outcomes including ischemic heart disease, ischemic stroke, and peripheral arterial disease, all conditions with escalating global prevalence. Early detection of vascular changes caused by atherosclerosis is crucial in preventing these conditions, reducing morbidity, and averting mortality. This article underscores the imperative of adopting a holistic approach to the intricacies, trajectories, and ramifications of atherosclerosis. It stresses the need for a thorough evaluation of the vasculature and the implementation of a multidisciplinary treatment approach. By considering the entire vascular system, healthcare providers can explore avenues for prevention, early detection, and effective management of this condition, ultimately leading to improved patient outcomes. We discuss current practices and propose new directions made possible by emerging diagnostic modalities and treatment strategies. Additionally, we consider healthcare expenditure, resource allocation, and the transformative potential of innovative treatments and technologies.
Affiliation(s)
- Yoram Chaiter
- The Israeli Center for Emerging Technologies in Hospitals and Hospital-Based Health Technology Assessment, Shamir (Assaf Harofeh) Medical Center, Zerifin 7030100, Israel
- Daniel Lyon Fink
- Department of Pediatric Cardiology Unit, HaEmek Medical Center, Afula 1834111, Israel
- Yossy Machluf
- Shamir Research Institute, University of Haifa, Kazerin 1290000, Israel
17
Zhang M, Zheng Y, Maidaiti X, Liang B, Wei Y, Sun F. Integrating Machine Learning into Statistical Methods in Disease Risk Prediction Modeling: A Systematic Review. HEALTH DATA SCIENCE 2024; 4:0165. [PMID: 39050273 PMCID: PMC11266123 DOI: 10.34133/hds.0165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 06/20/2024] [Indexed: 07/27/2024]
Abstract
Background: Disease prediction models often use statistical methods or machine learning, each with its own application scenarios, which raises the risk of errors when either is used alone. Integrating machine learning into statistical methods may yield robust prediction models. This systematic review aims to comprehensively assess the current development of disease prediction integration models worldwide. Methods: PubMed, EMbase, Web of Science, CNKI, VIP, WanFang, and SinoMed databases were searched to collect studies on prediction models integrating machine learning into statistical methods from database inception to May 1, 2023. Information including basic characteristics of studies, integrating approaches, application scenarios, modeling details, and model performance was extracted. Results: A total of 20 eligible studies in English and 1 in Chinese were included. Five studies concentrated on diagnostic models, while 16 studies concentrated on predicting disease occurrence or prognosis. Integrating strategies for classification models included majority voting, weighted voting, stacking, and model selection (when statistical methods and machine learning disagreed). Regression models adopted strategies including simple statistics, weighted statistics, and stacking. The AUROC of integration models surpassed 0.75, and integration models performed better than statistical methods and machine learning alone in most studies. Stacking was used in situations with >100 predictors and needed a relatively larger amount of training data. Conclusion: Research on integrating machine learning into statistical methods in prediction models remains limited, but some studies have shown great potential for integration models to outperform single models. This study provides insights for the selection of integration methods for different scenarios. Future research could emphasize the improvement and validation of integrating strategies.
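Two of the classification integration strategies named in the abstract, majority voting and weighted voting, can be sketched as follows. This is a minimal illustration, not the implementation used by any reviewed study; the function names, the three example models, and the weights are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most models (ties go to the label seen first)."""
    return Counter(predictions).most_common(1)[0][0]


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight across models."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


# Hypothetical predictions for one patient from three models, for example
# a logistic regression, a random forest, and an SVM (1 = disease, 0 = no disease).
predictions = [1, 0, 0]
# Hypothetical weights, e.g. proportional to each model's validation performance.
weights = [0.6, 0.25, 0.15]

print(majority_vote(predictions))           # two of three models predict 0
print(weighted_vote(predictions, weights))  # the single heavily weighted model wins
```

The example shows how the two strategies can disagree: the unweighted majority favors class 0, while weighting lets one trusted model override the other two.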
Affiliation(s)
- Meng Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
- Yongqi Zheng
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
- Baosheng Liang
- Department of Biostatistics, School of Public Health, Peking University, Beijing, China
- Yongyue Wei
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
- Feng Sun
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
18
Wijaya R, Saeed F, Samimi P, Albarrak AM, Qasem SN. An Ensemble Machine Learning and Data Mining Approach to Enhance Stroke Prediction. Bioengineering (Basel) 2024; 11:672. [PMID: 39061754 PMCID: PMC11274138 DOI: 10.3390/bioengineering11070672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2024] [Revised: 06/20/2024] [Accepted: 06/20/2024] [Indexed: 07/28/2024] Open
Abstract
Stroke poses a significant health threat, affecting millions annually. Early and precise prediction is crucial to providing effective preventive healthcare interventions. This study applied an ensemble machine learning and data mining approach to enhance the effectiveness of stroke prediction. Following the cross-industry standard process for data mining (CRISP-DM) methodology, various techniques, including random forest, ExtraTrees, XGBoost, artificial neural network (ANN), and genetic algorithm with ANN (GANN), were applied to two benchmark datasets to predict stroke based on several parameters, such as gender, age, various diseases, smoking status, BMI, HighCol, physical activity, hypertension, heart disease, lifestyle, and others. Because the datasets were imbalanced, the Synthetic Minority Oversampling Technique (SMOTE) was applied to them. Hyperparameters were tuned via grid search and randomized search cross-validation. The evaluation metrics included accuracy, precision, recall, F1-score, and area under the curve (AUC). The experimental results show that the ensemble ExtraTrees classifier achieved the highest accuracy (98.24%) and AUC (98.24%). Random forest also performed well, achieving 98.03% in both accuracy and AUC. Comparisons with state-of-the-art stroke prediction methods revealed that the proposed approach demonstrates superior performance, indicating its potential as a promising method for stroke prediction and offering substantial benefits to healthcare.
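The core idea of SMOTE mentioned in the abstract is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbors. The sketch below is a simplified pure-Python illustration of that idea, not the study's pipeline; the function names, the toy data, and the parameter values are hypothetical, and a maintained library implementation would normally be used in practice.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: squared_distance(x, p))[:k]
        neighbor = rng.choice(neighbors)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([xi + lam * (ni - xi) for xi, ni in zip(x, neighbor)])
    return synthetic


# Hypothetical minority-class (stroke) samples with two numeric features.
minority = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 1.8]]
new_points = smote(minority, n_new=6, k=2)
for point in new_points:
    print(point)
```

Because each synthetic point lies on the line segment between two existing minority samples, the new points stay inside the region already occupied by the minority class. Oversampling of this kind is normally applied to the training split only, so that the evaluation data remain untouched.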
Affiliation(s)
- Richard Wijaya
- College of Computing and Digital Technology, Birmingham City University, Birmingham B4 7XG, UK
- Faisal Saeed
- College of Computing and Digital Technology, Birmingham City University, Birmingham B4 7XG, UK
- Parnia Samimi
- College of Computing and Digital Technology, Birmingham City University, Birmingham B4 7XG, UK
- Abdullah M. Albarrak
- Computer Science Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
- Sultan Noman Qasem
- Computer Science Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
19
Vu T, Kokubo Y, Inoue M, Yamamoto M, Mohsen A, Martin-Morales A, Inoué T, Dawadi R, Araki M. Machine Learning Approaches for Stroke Risk Prediction: Findings from the Suita Study. J Cardiovasc Dev Dis 2024; 11:207. [PMID: 39057627 PMCID: PMC11276746 DOI: 10.3390/jcdd11070207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Revised: 06/12/2024] [Accepted: 06/27/2024] [Indexed: 07/28/2024] Open
Abstract
Stroke constitutes a significant public health concern due to its impact on mortality and morbidity. This study investigates the utility of machine learning algorithms in predicting stroke and identifying key risk factors using data from the Suita study, comprising 7389 participants and 53 variables. Initially, unsupervised k-prototype clustering categorized participants into risk clusters, while five supervised models, namely Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosted Machine (LightGBM), were employed to predict stroke outcomes. Stroke incidence differed substantially among the risk clusters identified by k-prototype clustering. Among the supervised models, RF was the preferable option because of its higher performance metrics. The Shapley Additive Explanations (SHAP) method identified age, systolic blood pressure, hypertension, estimated glomerular filtration rate, metabolic syndrome, and blood glucose level as key predictors of stroke, aligning with findings from the unsupervised clustering approach in high-risk groups. Additionally, previously unidentified risk factors such as elbow joint thickness, fructosamine, hemoglobin, and calcium level demonstrate potential for stroke prediction. In conclusion, machine learning facilitated accurate stroke risk predictions and highlighted potential biomarkers, offering a data-driven framework for risk assessment and biomarker discovery.
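k-prototype clustering handles mixed numeric and categorical variables by combining two dissimilarity terms: squared Euclidean distance on the numeric features plus a weighted count of mismatches on the categorical features. The sketch below shows only that dissimilarity and a nearest-prototype assignment; the feature layout, the prototypes, and the weight `gamma` are hypothetical and do not reflect the study's actual configuration.

```python
def k_prototypes_distance(record, prototype, numeric_idx, categorical_idx, gamma=1.0):
    """Mixed dissimilarity: squared Euclidean distance on numeric features
    plus gamma times the number of categorical mismatches."""
    numeric_part = sum((record[i] - prototype[i]) ** 2 for i in numeric_idx)
    categorical_part = sum(record[i] != prototype[i] for i in categorical_idx)
    return numeric_part + gamma * categorical_part


def assign_cluster(record, prototypes, numeric_idx, categorical_idx, gamma=1.0):
    """Return the index of the prototype closest to the record."""
    distances = [
        k_prototypes_distance(record, p, numeric_idx, categorical_idx, gamma)
        for p in prototypes
    ]
    return distances.index(min(distances))


# Hypothetical layout: standardized age, standardized systolic blood pressure,
# hypertension (yes/no), current smoker (yes/no).
numeric_idx = [0, 1]
categorical_idx = [2, 3]
prototypes = [
    [-0.5, -0.4, "no", "no"],   # hypothetical lower-risk prototype
    [0.0, 1.0, "yes", "yes"],   # hypothetical higher-risk prototype
]
participant = [0.5, 1.2, "yes", "no"]

distance = k_prototypes_distance(participant, prototypes[1], numeric_idx, categorical_idx, gamma=0.5)
cluster = assign_cluster(participant, prototypes, numeric_idx, categorical_idx, gamma=0.5)
print(distance, cluster)
```

The weight `gamma` controls how much a categorical mismatch counts relative to numeric distance, so numeric features are usually standardized before clustering.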
Affiliation(s)
- Thien Vu
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-Shinmachi, Settsu 566-0002, Japan
- National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
- Department of Cardiac Surgery, Cardiovascular Center, Cho Ray Hospital, Ho Chi Minh City 72713, Vietnam
- Yoshihiro Kokubo
- National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
- Mai Inoue
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-Shinmachi, Settsu 566-0002, Japan
- National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
- Masaki Yamamoto
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-Shinmachi, Settsu 566-0002, Japan
- National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
- Attayeb Mohsen
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-Shinmachi, Settsu 566-0002, Japan
- Agustin Martin-Morales
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-Shinmachi, Settsu 566-0002, Japan
- National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
- Takao Inoué
- Faculty of Informatics, Yamato University, 2-5-1 Katayama, Suita 564-0082, Japan
- Research Dawadi
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-Shinmachi, Settsu 566-0002, Japan
- National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
- Michihiro Araki
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-Shinmachi, Settsu 566-0002, Japan
- National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
- Graduate School of Medicine, Kyoto University, 54 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan
- Graduate School of Science Technology and Innovation, Kobe University, 1-1 Rokkodai Nada-ku, Kobe 657-8501, Japan
| |
|
20
|
Leung R, Wang B, Gottbrecht M, Doerr A, Marya N, Soni A, McManus DD, Lin H. Association between deep neural network-derived electrocardiographic-age and incident stroke. Front Cardiovasc Med 2024; 11:1368094. [PMID: 39006167 PMCID: PMC11239432 DOI: 10.3389/fcvm.2024.1368094] [Received: 01/09/2024] [Accepted: 06/13/2024] [Indexed: 07/16/2024]
Abstract
Background Stroke continues to be a leading cause of death and disability worldwide despite improvements in prevention and treatment. Traditional stroke risk calculators are biased and imprecise. Novel stroke predictors need to be identified. Recently, deep neural networks (DNNs) have been used to determine age from ECGs, otherwise known as the electrocardiographic-age (ECG-age), which predicts clinical outcomes. However, the relationship between ECG-age and stroke has not been well studied. We hypothesized that ECG-age is associated with incident stroke. Methods This study included UK Biobank participants with available ECGs (from 2014 or later). ECG-age was estimated using a DNN applied to raw ECG waveforms. We calculated the Δage (ECG-age minus chronological age) and classified individuals as having normal, accelerated, or decelerated aging if Δage was within, higher, or lower than the mean absolute error of the model, respectively. Multivariable Cox proportional hazards regression models adjusted for age, sex, and clinical factors were used to assess the association between Δage and incident stroke. Results The study population included 67,757 UK Biobank participants (mean age 65 ± 8 years; 48.3% male). Every 10-year increase in Δage was associated with a 22% increase in incident stroke [HR, 1.22 (95% CI, 1.00-1.49)] in the multivariable-adjusted model. Accelerated aging was associated with a 42% increase in incident stroke [HR, 1.42 (95% CI, 1.12-1.80)] compared to normal aging. In addition, Δage was associated with prevalent stroke [OR, 1.28 (95% CI, 1.11-1.49)]. Conclusions DNN-estimated ECG-age was associated with incident and prevalent stroke in the UK Biobank. Further investigation is required to determine if ECG-age can be used as a reliable biomarker of stroke risk.
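The Δage classification rule described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `classify_ecg_aging` is hypothetical, the mean absolute error (MAE) value must come from the fitted model, and reading "within" the MAE as |Δage| ≤ MAE is an assumption.

```python
def classify_ecg_aging(ecg_age, chronological_age, mae):
    """Return (delta_age, label) for one participant.

    delta_age = ECG-age minus chronological age, as defined in the abstract.
    Assumption: "within the MAE" is read as |delta_age| <= mae.
    """
    delta_age = ecg_age - chronological_age
    if delta_age > mae:
        label = "accelerated"   # ECG looks older than the person
    elif delta_age < -mae:
        label = "decelerated"   # ECG looks younger than the person
    else:
        label = "normal"
    return delta_age, label


if __name__ == "__main__":
    # Hypothetical values; an MAE of 6 years is illustrative only.
    print(classify_ecg_aging(72.0, 60.0, 6.0))
    print(classify_ecg_aging(63.0, 60.0, 6.0))
```

The resulting label would then enter a survival model as a categorical covariate, with "normal" as the reference group.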
Affiliation(s)
- Robert Leung
- Program in Digital Medicine, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
| | - Biqi Wang
- Program in Digital Medicine, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
- Division of Health Systems Science, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
| | - Matthew Gottbrecht
- Division of Cardiology, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
| | - Adam Doerr
- Program in Digital Medicine, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
| | - Neil Marya
- Program in Digital Medicine, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
- Division of Gastroenterology, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
| | - Apurv Soni
- Program in Digital Medicine, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
- Division of Health Systems Science, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
| | - David D. McManus
- Program in Digital Medicine, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
- Division of Health Systems Science, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
- Division of Cardiology, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
| | - Honghuang Lin
- Program in Digital Medicine, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
- Division of Health Systems Science, Department of Medicine, UMass Chan Medical School, Worcester, MA, United States
| |
|
21
|
Zuo W, Yang X. A machine learning model predicts stroke associated with blood cadmium level. Sci Rep 2024; 14:14739. [PMID: 38926494 PMCID: PMC11208606 DOI: 10.1038/s41598-024-65633-w] [Received: 03/12/2024] [Accepted: 06/21/2024] [Indexed: 06/28/2024]
Abstract
Stroke is the leading cause of death and disability worldwide. Cadmium is a prevalent environmental toxicant that may contribute to cardiovascular disease, including stroke. We aimed to build an effective and interpretable machine learning (ML) model that links blood cadmium to the identification of stroke. Our data on the association between blood cadmium and stroke came from the National Health and Nutrition Examination Survey (NHANES, 2013-2014). In total, 2664 participants were eligible for this study. We divided these data into a training set (80%) and a test set (20%). To analyze the relationship between blood cadmium and stroke, a multivariate logistic regression analysis was performed. We constructed and tested five ML algorithms: K-nearest neighbor (KNN), decision tree (DT), logistic regression (LR), multilayer perceptron (MLP), and random forest (RF). The best-performing model was selected to identify stroke in US adults. Finally, the features were interpreted using the Shapley Additive exPlanations (SHAP) tool. In the total population, participants in the second, third, and fourth quartiles of blood cadmium had odds ratios for stroke of 1.32 (95% CI 0.55, 3.14), 1.65 (95% CI 0.71, 3.83), and 2.67 (95% CI 1.10, 6.49), respectively, compared with the lowest (reference) quartile. The blood cadmium-based LR approach demonstrated the greatest performance in identifying stroke (area under the receiver operating characteristic curve: 0.800, accuracy: 0.966). Employing interpretable methods, we found blood cadmium to be a notable contributor to the predictive model. We found that blood cadmium was positively correlated with stroke risk and that stroke risk from cadmium exposure could be effectively predicted by using ML modeling.
Affiliation(s)
- Wenwei Zuo
- School of Gongli Hospital Medical Technology, University of Shanghai for Science and Technology, No. 516, Jungong Road, Yangpu Area, Shanghai, 200093, China
| | - Xuelian Yang
- Department of Neurology, Shanghai Pudong New Area Gongli Hospital, No. 219 Miaopu Road, Pudong New Area, Shanghai, 200135, China.
| |
|
22
|
Carvalho Macruz FBD, Dias ALMP, Andrade CS, Nucci MP, Rimkus CDM, Lucato LT, Rocha AJD, Kitamura FC. The new era of artificial intelligence in neuroradiology: current research and promising tools. Arquivos de Neuro-Psiquiatria 2024; 82:1-12. [PMID: 38565188 PMCID: PMC10987255 DOI: 10.1055/s-0044-1779486] [Received: 10/18/2023] [Accepted: 12/13/2023] [Indexed: 04/04/2024]
Abstract
Radiology has a number of characteristics that make it an especially suitable medical discipline for early artificial intelligence (AI) adoption. These include having a well-established digital workflow, standardized protocols for image storage, and numerous well-defined interpretive activities. The more than 200 commercial radiologic AI-based products recently approved by the Food and Drug Administration (FDA) to assist radiologists in a number of narrow image-analysis tasks such as image enhancement, workflow triage, and quantification, corroborate this observation. However, in order to leverage AI to boost efficacy and efficiency, and to overcome substantial obstacles to widespread successful clinical use of these products, radiologists should become familiarized with the emerging applications in their particular areas of expertise. In light of this, in this article we survey the existing literature on the application of AI-based techniques in neuroradiology, focusing on conditions such as vascular diseases, epilepsy, and demyelinating and neurodegenerative conditions. We also introduce some of the algorithms behind the applications, briefly discuss a few of the challenges of generalization in the use of AI models in neuroradiology, and skate over the most relevant commercially available solutions adopted in clinical practice. If well designed, AI algorithms have the potential to radically improve radiology, strengthening image analysis, enhancing the value of quantitative imaging techniques, and mitigating diagnostic errors.
Affiliation(s)
- Fabíola Bezerra de Carvalho Macruz
- Universidade de São Paulo, Hospital das Clínicas, Departamento de Radiologia e Oncologia, Seção de Neurorradiologia, Faculdade de Medicina, São Paulo SP, Brazil.
- Rede D'Or São Luiz, Departamento de Radiologia e Diagnóstico por Imagem, São Paulo SP, Brazil.
- Universidade de São Paulo, Laboratório de Investigação Médica em Ressonância Magnética (LIM 44), São Paulo SP, Brazil.
- Academia Nacional de Medicina, Rio de Janeiro RJ, Brazil.
| | | | | | - Mariana Penteado Nucci
- Universidade de São Paulo, Laboratório de Investigação Médica em Ressonância Magnética (LIM 44), São Paulo SP, Brazil.
| | - Carolina de Medeiros Rimkus
- Universidade de São Paulo, Hospital das Clínicas, Departamento de Radiologia e Oncologia, Seção de Neurorradiologia, Faculdade de Medicina, São Paulo SP, Brazil.
- Rede D'Or São Luiz, Departamento de Radiologia e Diagnóstico por Imagem, São Paulo SP, Brazil.
- Universidade de São Paulo, Laboratório de Investigação Médica em Ressonância Magnética (LIM 44), São Paulo SP, Brazil.
| | - Leandro Tavares Lucato
- Universidade de São Paulo, Hospital das Clínicas, Departamento de Radiologia e Oncologia, Seção de Neurorradiologia, Faculdade de Medicina, São Paulo SP, Brazil.
- Diagnósticos da América SA, São Paulo SP, Brazil.
| | | | - Felipe Campos Kitamura
- Diagnósticos da América SA, São Paulo SP, Brazil.
- Universidade Federal de São Paulo, São Paulo SP, Brazil.
| |
|
23
|
Zingaro A, Ahmad Z, Kholmovski E, Sakata K, Dede' L, Morris AK, Quarteroni A, Trayanova NA. A comprehensive stroke risk assessment by combining atrial computational fluid dynamics simulations and functional patient data. Sci Rep 2024; 14:9515. [PMID: 38664464 PMCID: PMC11045804 DOI: 10.1038/s41598-024-59997-2] [Received: 01/26/2024] [Accepted: 04/17/2024] [Indexed: 04/28/2024]
Abstract
Stroke, a major global health concern often rooted in cardiac dynamics, demands precise risk evaluation for targeted intervention. Current risk models, like the CHA2DS2-VASc score, often lack the granularity required for personalized predictions. In this study, we present a nuanced and thorough stroke risk assessment by integrating functional insights from cardiac magnetic resonance (CMR) with patient-specific computational fluid dynamics (CFD) simulations. Our cohort, evenly split between control and stroke groups, comprises eight patients. Utilizing CINE CMR, we compute kinematic features, revealing smaller left atrial volumes for stroke patients. The incorporation of patient-specific atrial displacement into our hemodynamic simulations unveils the influence of atrial compliance on the flow fields, emphasizing the importance of LA motion in CFD simulations and challenging the conventional rigid wall assumption in hemodynamics models. Standardizing hemodynamic features with functional metrics enhances the differentiation between stroke and control cases. While standalone assessments provide limited clarity, the synergistic fusion of CMR-derived functional data and patient-informed CFD simulations offers a personalized and mechanistic understanding, distinctly segregating stroke from control cases. Specifically, our investigation reveals a crucial clinical insight: normalizing hemodynamic features based on ejection fraction fails to differentiate between stroke and control patients. Differently, when normalized with stroke volume, a clear and clinically significant distinction emerges and this holds true for both the left atrium and its appendage, providing valuable implications for precise stroke risk assessment in clinical settings.
This work introduces a novel framework for seamlessly integrating hemodynamic and functional metrics, laying the groundwork for improved predictive models, and highlighting the significance of motion-informed, personalized risk assessments.
Affiliation(s)
- Alberto Zingaro
- ADVANCE, Alliance for Cardiovascular Diagnostic and Treatment Innovation, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD, 21218, USA.
- MOX, Laboratory of Modeling and Scientific Computing, Dipartimento di Matematica, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133, Milan, Italy.
- ELEM Biotech S.L., Pier07, Via Laietana, 26, 08003, Barcelona, Spain.
| | - Zan Ahmad
- ADVANCE, Alliance for Cardiovascular Diagnostic and Treatment Innovation, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD, 21218, USA
- Department of Applied Mathematics and Statistics, Johns Hopkins University, 100 Wyman Park Dr, Baltimore, MD, 21211, USA
| | - Eugene Kholmovski
- ADVANCE, Alliance for Cardiovascular Diagnostic and Treatment Innovation, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD, 21218, USA
- Department of Radiology, University of Utah, 30 N Mario Capecchi Dr., Salt Lake City, UT, 84112, USA
| | - Kensuke Sakata
- ADVANCE, Alliance for Cardiovascular Diagnostic and Treatment Innovation, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD, 21218, USA
| | - Luca Dede'
- MOX, Laboratory of Modeling and Scientific Computing, Dipartimento di Matematica, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133, Milan, Italy
| | - Alan K Morris
- Scientific Computing and Imaging Institute, University of Utah, 72 Central Campus Dr., Salt Lake City, UT, 84112, USA
| | - Alfio Quarteroni
- MOX, Laboratory of Modeling and Scientific Computing, Dipartimento di Matematica, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133, Milan, Italy
- Institute of Mathematics, École Polytechnique Fédérale de Lausanne, Station 8, Av. Piccard, 1015, Lausanne, Switzerland
| | - Natalia A Trayanova
- ADVANCE, Alliance for Cardiovascular Diagnostic and Treatment Innovation, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD, 21218, USA
| |
|
24
|
Das S, Nayak SP, Sahoo B, Nayak SC. Machine Learning in Healthcare Analytics: A State-of-the-Art Review. Archives of Computational Methods in Engineering 2024. [DOI: 10.1007/s11831-024-10098-3] [Received: 12/13/2023] [Accepted: 02/23/2024] [Indexed: 01/06/2025]
|
25
|
Sahriar S, Akther S, Mauya J, Amin R, Mia MS, Ruhi S, Reza MS. Unlocking stroke prediction: Harnessing projection-based statistical feature extraction with ML algorithms. Heliyon 2024; 10:e27411. [PMID: 38495193 PMCID: PMC10943390 DOI: 10.1016/j.heliyon.2024.e27411] [Received: 10/09/2023] [Revised: 02/26/2024] [Accepted: 02/28/2024] [Indexed: 03/19/2024]
Abstract
Non-communicable diseases, such as cardiovascular disease, cancer, chronic respiratory diseases, and diabetes, are responsible for approximately 71% of all deaths worldwide. Stroke, a cerebrovascular disorder, is one of the leading contributors to this burden and ranks among the top three causes of death. Early recognition of symptoms can encourage a balanced lifestyle and provide essential information for stroke prediction. Machine learning (ML) is a key tool for helping physicians identify stroke patients and risk factors. Because features are measured on different scales and follow different probability distributions, ML-based algorithms struggle to detect risk factors. Furthermore, when the risk factors form a high-dimensional feature set, learning algorithms struggle with complexity. In this study, rigorous statistical tests are used to identify risk factors, and PCA-FA (Integration of Principal Components and Factors) and FPCA (Factor Based PCA) approaches are proposed for projecting suitable feature representations that improve the performance of learning algorithms. The study dataset contains 5110 patient records with clinical, lifestyle, and genetic attributes, allowing a comprehensive analysis of potential risk factors associated with stroke. Using significance tests (P-value < 0.05), the chi-square test and independent-samples t-test identified age, heart_disease, hypertension, work_type, ever_married, bmi, and smoking_status as risk factors for stroke. Among the predictive models built with the proposed feature extraction techniques, the random forest approach provides the best results when used with the PCA-FA method. The best accuracy for this approach is 92.55%, while the AUC score is 98.15%. Prediction accuracy improved by 2.19% to 19.03% compared to existing work. Additionally, the prediction results are shown to be robust and reproducible with a stacking ensemble-based classification algorithm.
We also developed a web-based application, based on the findings of this study, that doctors could use as an additional tool when diagnosing stroke risk.
Affiliation(s)
- Saad Sahriar
- Deep Statistical Learning and Research Lab, Department of Statistics, Pabna University of Science & Technology, Pabna, 6600, Bangladesh
| | - Sanjida Akther
- Deep Statistical Learning and Research Lab, Department of Statistics, Pabna University of Science & Technology, Pabna, 6600, Bangladesh
| | - Jannatul Mauya
- Deep Statistical Learning and Research Lab, Department of Statistics, Pabna University of Science & Technology, Pabna, 6600, Bangladesh
| | - Ruhul Amin
- Deep Statistical Learning and Research Lab, Department of Statistics, Pabna University of Science & Technology, Pabna, 6600, Bangladesh
| | - Md Shahajada Mia
- Department of Statistics, Pabna University of Science & Technology, Pabna, 6600, Bangladesh
| | - Sabba Ruhi
- Department of Statistics, Pabna University of Science & Technology, Pabna, 6600, Bangladesh
| | - Md Shamim Reza
- Deep Statistical Learning and Research Lab, Department of Statistics, Pabna University of Science & Technology, Pabna, 6600, Bangladesh
- Department of Statistics, Pabna University of Science & Technology, Pabna, 6600, Bangladesh
| |
|
26
|
Zhang Y, Yu M, Tong C, Zhao Y, Han J. CA-UNet Segmentation Makes a Good Ischemic Stroke Risk Prediction. Interdiscip Sci 2024; 16:58-72. [PMID: 37626263 DOI: 10.1007/s12539-023-00583-x] [Received: 04/10/2023] [Revised: 07/13/2023] [Accepted: 07/19/2023] [Indexed: 08/27/2023]
Abstract
Stroke remains the world's second leading cause of death and the third leading cause of death and disability combined. For ischemic stroke, one type of stroke, early detection and treatment are the keys to prevention. However, due to privacy protection constraints and labeling difficulties, there are only a few studies on the intelligent automatic diagnosis of stroke or ischemic stroke, and the results are unsatisfactory. Therefore, we collect data and propose a 3D carotid Computed Tomography Angiography (CTA) image segmentation model called CA-UNet for fully automated extraction of carotid arteries. We explore the number of down-sampling steps suitable for carotid segmentation and design a multi-scale loss function to counter the loss of detailed features during down-sampling. Moreover, based on CA-UNet, we propose an ischemic stroke risk prediction model that predicts patient risk from 3D CTA images, electronic medical records, and medical history. We have validated the efficacy of our segmentation model and prediction model through comparison tests. Our method can provide reliable diagnoses and results that benefit patients and medical professionals.
Affiliation(s)
- Yuqi Zhang
- School of Computer Science and Engineering, Beihang University, Beijing, China
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
| | - Mengbo Yu
- School of Computer Science and Engineering, Beihang University, Beijing, China
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
| | - Chao Tong
- School of Computer Science and Engineering, Beihang University, Beijing, China.
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China.
| | - Yanqing Zhao
- Department of Interventional Radiology and Vascular Surgery, Peking University Third Hospital, Beijing, China
| | - Jintao Han
- Department of Interventional Radiology and Vascular Surgery, Peking University Third Hospital, Beijing, China
| |
|
27
|
Mitu M, Hasan SMM, Uddin MP, Mamun MA, Rajinikanth V, Kadry S. A stroke prediction framework using explainable ensemble learning. Comput Methods Biomech Biomed Engin 2024:1-20. [PMID: 38384147 DOI: 10.1080/10255842.2024.2316877] [Received: 11/29/2023] [Accepted: 01/23/2024] [Indexed: 02/23/2024]
Abstract
The death of brain cells occurs when blood flow to a particular area of the brain is abruptly cut off, resulting in a stroke. Early recognition of stroke symptoms is essential to prevent strokes and promote a healthy lifestyle. FAST tests (looking for abnormalities in the face, arms, and speech) have limitations in reliability and accuracy for diagnosing strokes. This research employs machine learning (ML) techniques to develop and assess multiple ML models to establish a robust stroke risk prediction framework. It uses a stacking-based ensemble method to select the best three ML models and combine their collective intelligence. An empirical evaluation on a publicly available stroke prediction dataset demonstrates the superior performance of the proposed stacking-based ensemble model, with only one misclassification. The experimental results reveal that the proposed stacking model surpasses other state-of-the-art research, achieving accuracy, precision, and F1-score of 99.99%, recall of 100%, and receiver operating characteristic (ROC), Matthews correlation coefficient (MCC), and Kappa scores of 1.0. Furthermore, SHapley Additive exPlanations (SHAP) are employed to analyze the predictions of the black-box ML models. The findings highlight that age, BMI, and glucose level are the most significant risk factors for stroke prediction. These findings contribute to the development of more efficient techniques for stroke prediction, potentially saving many lives.
Affiliation(s)
- Mostarina Mitu
- Department of Computer Science and Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
| | - S M Mahedy Hasan
- Department of Computer Science and Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
| | - Md Palash Uddin
- Department of Computer Science and Engineering, Hajee Mohammad Danesh Science and Technology University, Dinajpur, Bangladesh
- School of Information Technology, Deakin University, Melbourne, Australia
| | - Md Al Mamun
- Department of Computer Science and Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
| | - Venkatesan Rajinikanth
- Division of Research and Innovation, Saveetha School of Engineering, SIMATS, Chennai, India
| | - Seifedine Kadry
- Department of Applied Data Science, Noroff University College, Kristiansand, Norway
- Artificial Intelligence Research Center (AIRC), Ajman University, Ajman, United Arab Emirates
- Department of Electrical and Computer Engineering, Lebanese American University, Byblos, Lebanon
- MEU Research Unit, Middle East University, Amman, Jordan
| |
|
28
|
Zingaro A, Ahmad Z, Kholmovski E, Sakata K, Dede’ L, Morris AK, Quarteroni A, Trayanova NA. A comprehensive stroke risk assessment by combining atrial computational fluid dynamics simulations and functional patient data. bioRxiv: the preprint server for biology 2024:2024.01.11.575156. [PMID: 38293150 PMCID: PMC10827064 DOI: 10.1101/2024.01.11.575156] [Indexed: 02/01/2024]
Abstract
Stroke, a major global health concern often rooted in cardiac dynamics, demands precise risk evaluation for targeted intervention. Current risk models, like the CHA2DS2-VASc score, often lack the granularity required for personalized predictions. In this study, we present a nuanced and thorough stroke risk assessment by integrating functional insights from cardiac magnetic resonance (CMR) with patient-specific computational fluid dynamics (CFD) simulations. Our cohort, evenly split between control and stroke groups, comprises eight patients. Utilizing CINE CMR, we compute kinematic features, revealing smaller left atrial volumes for stroke patients. The incorporation of patient-specific atrial displacement into our hemodynamic simulations unveils the influence of atrial compliance on the flow fields, emphasizing the importance of LA motion in CFD simulations and challenging the conventional rigid wall assumption in hemodynamics models. Standardizing hemodynamic features with functional metrics enhances the differentiation between stroke and control cases. While standalone assessments provide limited clarity, the synergistic fusion of CMR-derived functional data and patient-informed CFD simulations offers a personalized and mechanistic understanding, distinctly segregating stroke from control cases. Specifically, our investigation reveals a crucial clinical insight: normalizing hemodynamic features based on ejection fraction fails to differentiate between stroke and control patients. Differently, when normalized with stroke volume, a clear and clinically significant distinction emerges and this holds true for both the left atrium and its appendage, providing valuable implications for precise stroke risk assessment in clinical settings. 
This work introduces a novel framework for seamlessly integrating hemodynamic and functional metrics, laying the groundwork for improved predictive models, and highlighting the significance of motion-informed, personalized risk assessments.
Affiliation(s)
- Alberto Zingaro
- ADVANCE, Alliance for Cardiovascular Diagnostic and Treatment Innovation, Johns Hopkins University, 3400 N. Charles St., 21218, Baltimore, MD, USA
- MOX, Laboratory of Modeling and Scientific Computing, Dipartimento di Matematica, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133, Milano, Italy
- ELEM Biotech S.L., Pier07, Via Laietana, 26, 08003, Barcelona, Spain
| | - Zan Ahmad
- ADVANCE, Alliance for Cardiovascular Diagnostic and Treatment Innovation, Johns Hopkins University, 3400 N. Charles St., 21218, Baltimore, MD, USA
- Department of Applied Mathematics and Statistics, Johns Hopkins University, 100 Wyman Park Dr, 21211, Baltimore, MD, USA
| | - Eugene Kholmovski
- ADVANCE, Alliance for Cardiovascular Diagnostic and Treatment Innovation, Johns Hopkins University, 3400 N. Charles St., 21218, Baltimore, MD, USA
- Department of Radiology, University of Utah, 30 N Mario Capecchi Dr., 84112, Salt Lake City, UT, USA
| | - Kensuke Sakata
- ADVANCE, Alliance for Cardiovascular Diagnostic and Treatment Innovation, Johns Hopkins University, 3400 N. Charles St., 21218, Baltimore, MD, USA
| | - Luca Dede’
- MOX, Laboratory of Modeling and Scientific Computing, Dipartimento di Matematica, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133, Milano, Italy
| | - Alan K. Morris
- Scientific Computing and Imaging Institute, University of Utah, 72 Central Campus Dr., 84112, Salt Lake City, UT, USA
| | - Alfio Quarteroni
- MOX, Laboratory of Modeling and Scientific Computing, Dipartimento di Matematica, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133, Milano, Italy
- Institute of Mathematics, École Polytechnique Fédérale de Lausanne, Station 8, Av. Piccard, CH-1015 Lausanne, Switzerland (Professor Emeritus)
| | - Natalia A. Trayanova
- ADVANCE, Alliance for Cardiovascular Diagnostic and Treatment Innovation, Johns Hopkins University, 3400 N. Charles St., 21218, Baltimore, MD, USA
| |
|
29
|
Sun Y, Chen X. Epileptic EEG Signal Detection Using Variational Modal Decomposition and Improved Grey Wolf Algorithm. Sensors (Basel, Switzerland) 2023; 23:8078. [PMID: 37836909 PMCID: PMC10575143 DOI: 10.3390/s23198078] [Received: 08/29/2023] [Revised: 09/16/2023] [Accepted: 09/18/2023] [Indexed: 10/15/2023]
Abstract
Epilepsy causes serious harm to the human body and, in severe cases, threatens life. Therefore, research focused on the diagnosis and treatment of epilepsy holds paramount clinical significance. In this paper, we utilized variational modal decomposition (VMD) and an enhanced grey wolf algorithm to detect epileptic electroencephalogram (EEG) signals. Data were extracted from each patient's preseizure period and seizure period, 200 s each, with every 2 s forming a segment, so that 100 data points were obtained for each patient's health period and 100 data points for each patient's epilepsy period. VMD was used to obtain the corresponding intrinsic modal function (VMF) of the data. Then, the differential entropy (DE) and high frequency detection (HFD) of each VMF were extracted as features. The improved grey wolf algorithm is adopted for a selected channel to improve the maximum value of the channel. Finally, the EEG signal samples were classified using a support vector machine (SVM) classifier to achieve accurate detection of epileptic EEG signals. Experimental results show that the accuracy, sensitivity and specificity of the proposed method reach 98.3%, 98.9% and 98.5%, respectively. The algorithm proposed in this paper can be used as an index to detect epileptic seizures and offers guidance for the early diagnosis and effective treatment of epileptic patients.
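The segmentation arithmetic in the abstract above (a 200 s recording cut into 2 s segments, giving 100 segments per period) can be sketched as follows. This is an illustration only: the function name `segment_signal` and the 256 Hz sampling rate are assumptions, since the abstract does not state a sampling rate or give any code.

```python
def segment_signal(samples, sampling_rate_hz, segment_seconds=2):
    """Split a 1-D list of samples into consecutive fixed-length segments.

    Any trailing samples that do not fill a whole segment are dropped.
    """
    segment_length = int(sampling_rate_hz * segment_seconds)
    n_segments = len(samples) // segment_length
    return [
        samples[i * segment_length:(i + 1) * segment_length]
        for i in range(n_segments)
    ]


if __name__ == "__main__":
    sampling_rate_hz = 256                        # assumed, not from the abstract
    recording = [0.0] * (200 * sampling_rate_hz)  # placeholder for 200 s of EEG
    segments = segment_signal(recording, sampling_rate_hz)
    print(len(segments), len(segments[0]))
```

Each segment would then be passed to the decomposition and feature extraction steps that the abstract describes.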
Affiliation(s)
- Yongxin Sun
- College of Electronic Information Engineering, Changchun University of Science and Technology, Changchun 130022, China
- College of Physics and Electronic Information, Baicheng Normal University, Baicheng 137099, China
| | - Xiaojuan Chen
- College of Electronic Information Engineering, Changchun University of Science and Technology, Changchun 130022, China
| |
|
30
|
An L, Qin J, Jiang W, Luo P, Luo X, Lai Y, Jin M. Non-invasive and accurate risk evaluation of cerebrovascular disease using retinal fundus photo based on deep learning. Front Neurol 2023; 14:1257388. [PMID: 37745652 PMCID: PMC10513168 DOI: 10.3389/fneur.2023.1257388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 08/25/2023] [Indexed: 09/26/2023] Open
Abstract
Background Cerebrovascular disease (CeVD) is a prominent contributor to global mortality and profound disability. Extensive research has unveiled a connection between CeVD and retinal microvascular abnormalities. Nonetheless, manual analysis of fundus images remains a laborious and time-consuming task. Consequently, our objective is to develop a risk prediction model that uses retinal fundus photos to noninvasively and accurately assess cerebrovascular risks. Materials and methods To leverage retinal fundus photos for CeVD risk evaluation, we proposed a novel model called Efficient-Attention, which combines a convolutional neural network with an attention mechanism. This combination aims to reinforce the salient features present in fundus photos, consequently improving the accuracy and effectiveness of cerebrovascular risk assessment. Result Our proposed model demonstrates notable advancements compared to the conventional ResNet and Efficient-Net architectures. The accuracy (ACC) of our model is 0.834 ± 0.03, surpassing Efficient-Net by a margin of 3.6%. Additionally, our model exhibits an improved area under the receiver operating characteristic curve (AUC) of 0.904 ± 0.02, surpassing other methods by a margin of 2.2%. Conclusion This paper provides compelling evidence that Efficient-Attention methods can serve as an effective and accurate tool for cerebrovascular risk assessment. The results of the study strongly support the notion that retinal fundus photos hold great potential as a reliable predictor of CeVD, offering a noninvasive, convenient and low-cost solution for large-scale screening of CeVD.
Affiliation(s)
- Lin An
- Guangdong Weiren Meditech Co., Ltd, Foshan, Guangdong, China
| | - Jia Qin
- Guangdong Weiren Meditech Co., Ltd, Foshan, Guangdong, China
| | - Weili Jiang
- Foshan Weizhi Meditech Co., Ltd, Foshan, Guangdong, China
| | - Penghao Luo
- Foshan Weizhi Meditech Co., Ltd, Foshan, Guangdong, China
| | - Xiaoyan Luo
- Department of Ophthalmology, Guangdong Provincial Hospital of Integrated Traditional Chinese and Western Medicine, Foshan, Guangdong, China
| | - Yuzheng Lai
- Department of Neurology, Guangdong Provincial Hospital of Integrated Traditional Chinese and Western Medicine, Foshan, Guangdong, China
| | - Mei Jin
- Department of Ophthalmology, Guangdong Provincial Hospital of Integrated Traditional Chinese and Western Medicine, Foshan, Guangdong, China
| |
|
31
|
Abegaz TM, Baljoon A, Kilanko O, Sherbeny F, Ali AA. Machine learning algorithms to predict major adverse cardiovascular events in patients with diabetes. Comput Biol Med 2023; 164:107289. [PMID: 37557056 DOI: 10.1016/j.compbiomed.2023.107289] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/19/2023] [Revised: 07/01/2023] [Accepted: 07/28/2023] [Indexed: 08/11/2023]
Abstract
BACKGROUND Major Adverse Cardiovascular Events (MACE) are common complications of type 2 diabetes mellitus (T2DM) that include myocardial infarction (MI), stroke, and heart failure (HF). The objective of the current study was to predict MACE among T2DM patients. METHODS Type 2 diabetes mellitus patients above 18 years old were recruited for the study from the All of Us Research Program. Eligible participants were those who took sodium-glucose cotransporter 2 inhibitors. Several machine learning algorithms were employed, including Random Forest (RF), XGBoost, logistic regression (LR), and a weighted ensemble model (WEM). Clinical attributes, electrolytes and biomarkers were explored in predicting MACE. The feature importance was determined using mean decrease accuracy. RESULTS Overall, 9,059 subjects were included in the analyses, of which 5197 (57.4%) were females. The XGBoost model demonstrated a prediction accuracy of 0.80 [0.78-0.82], higher than that of the RF (0.78 [0.76-0.80]), the LR model (0.65 [0.62-0.67]), and the WEM (0.75 [0.73-0.76]). The classification accuracy of the models for stroke was more than 95%, which was higher than the prediction accuracy for MI (∼85%) and HF (∼80%). Phosphate, blood urea nitrogen and troponin levels were the major predictors of MACE. CONCLUSION The ML models showed acceptable performance in predicting MACE in T2DM patients, except the LR model. Phosphate, blood urea nitrogen, and other electrolytes were important predictors of MACE, which is consistent between the individual components of MACE, such as stroke, MI, and HF. These parameters can be calibrated as prognostic parameters of MACE events in T2DM patients.
Affiliation(s)
- Tadesse M Abegaz
- Economic, Social and Administrative Pharmacy (ESAP), College of Pharmacy and Pharmaceutical Sciences, Institute of Public Health, Florida A&M University, Tallahassee, FL, 32307, USA
| | - Ahmead Baljoon
- Economic, Social and Administrative Pharmacy (ESAP), College of Pharmacy and Pharmaceutical Sciences, Institute of Public Health, Florida A&M University, Tallahassee, FL, 32307, USA
| | - Oluwaseun Kilanko
- Economic, Social and Administrative Pharmacy (ESAP), College of Pharmacy and Pharmaceutical Sciences, Institute of Public Health, Florida A&M University, Tallahassee, FL, 32307, USA
| | - Fatimah Sherbeny
- Economic, Social and Administrative Pharmacy (ESAP), College of Pharmacy and Pharmaceutical Sciences, Institute of Public Health, Florida A&M University, Tallahassee, FL, 32307, USA
| | - Askal Ayalew Ali
- Economic, Social and Administrative Pharmacy (ESAP), College of Pharmacy and Pharmaceutical Sciences, Institute of Public Health, Florida A&M University, Tallahassee, FL, 32307, USA
| |
|
32
|
Ahmed S, Irfan S, Kiran N, Masood N, Anjum N, Ramzan N. Remote Health Monitoring Systems for Elderly People: A Survey. SENSORS (BASEL, SWITZERLAND) 2023; 23:7095. [PMID: 37631632 PMCID: PMC10458487 DOI: 10.3390/s23167095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Revised: 08/03/2023] [Accepted: 08/03/2023] [Indexed: 08/27/2023]
Abstract
This paper addresses the growing demand for healthcare systems, particularly among the elderly population. The need for these systems arises from the desire to enable patients and seniors to live independently in their homes without relying heavily on their families or caretakers. To achieve substantial improvements in healthcare, it is essential to ensure the continuous development and availability of information technologies tailored explicitly for patients and elderly individuals. The primary objective of this study is to comprehensively review the latest remote health monitoring systems, with a specific focus on those designed for older adults. To facilitate a comprehensive understanding, we categorize these remote monitoring systems and provide an overview of their general architectures. Additionally, we emphasize the standards utilized in their development and highlight the challenges encountered throughout the developmental processes. Moreover, this paper identifies several potential areas for future research, which promise further advancements in remote health monitoring systems. Addressing these research gaps can drive progress and innovation, ultimately enhancing the quality of healthcare services available to elderly individuals. This, in turn, empowers them to lead more independent and fulfilling lives while enjoying the comforts and familiarity of their own homes. By acknowledging the importance of healthcare systems for the elderly and recognizing the role of information technologies, we can address the evolving needs of this population. Through ongoing research and development, we can continue to enhance remote health monitoring systems, ensuring they remain effective, efficient, and responsive to the unique requirements of elderly individuals.
Affiliation(s)
- Salman Ahmed
- Department of Computer Science, Capital University of Science and Technology, Islamabad 44000, Pakistan; (N.M.); (N.A.)
| | - Saad Irfan
- Department of Information Engineering Technology, National Skills University, Islamabad 44000, Pakistan;
| | - Nasira Kiran
- School of Computing, Engineering and Physical Sciences, University of the West of Scotland, Paisley PA1 2BE, UK; (N.K.); (N.R.)
| | - Nayyer Masood
- Department of Computer Science, Capital University of Science and Technology, Islamabad 44000, Pakistan; (N.M.); (N.A.)
| | - Nadeem Anjum
- Department of Computer Science, Capital University of Science and Technology, Islamabad 44000, Pakistan; (N.M.); (N.A.)
| | - Naeem Ramzan
- School of Computing, Engineering and Physical Sciences, University of the West of Scotland, Paisley PA1 2BE, UK; (N.K.); (N.R.)
| |
|
33
|
Alruily M, El-Ghany SA, Mostafa AM, Ezz M, El-Aziz AAA. A-Tuning Ensemble Machine Learning Technique for Cerebral Stroke Prediction. APPLIED SCIENCES 2023; 13:5047. [DOI: 10.3390/app13085047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
Abstract
A cerebral stroke is a medical problem that occurs when the blood flowing to a section of the brain is suddenly cut off, causing damage to the brain. Brain cells gradually die because of interruptions in blood supply and other nutrients to the brain, resulting in disabilities, depending on the affected region. Early recognition and detection of symptoms can aid in the rapid treatment of strokes and result in better health by reducing the severity of a stroke episode. In this paper, the Random Forest (RF), Extreme Gradient Boosting (XGBoost), and light gradient-boosting machine (LightGBM) were used as machine learning (ML) algorithms for predicting the likelihood of a cerebral stroke by applying an open-access stroke prediction dataset. The stroke prediction dataset was pre-processed by handling missing values using the KNN imputer technique, eliminating outliers, applying the one-hot encoding method, and normalizing the features with different ranges of values. After data splitting, synthetic minority oversampling (SMO) was applied to balance the stroke samples and no-stroke classes. Furthermore, to fine-tune the hyper-parameters of the ML algorithm, we employed a random search technique that could achieve the best parameter values. After applying the tuning process, we stacked the parameters to a tuning ensemble RXLM that was analyzed and compared with traditional classifiers. The performance metrics after tuning the hyper-parameters achieved promising results with all ML algorithms.
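The class-balancing step mentioned in this abstract (oversampling the minority class after the data split) can be sketched in plain Python. This is an illustrative, simplified SMOTE-style interpolation under stated assumptions, not the authors' pipeline: the function name `smote`, the toy feature vectors, and the choice of `k` are all hypothetical.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating a randomly chosen
    minority sample toward one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical TRAINING split only; the test split is left untouched.
train_majority = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0], [0.2, 0.3]]
train_minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]

n_needed = len(train_majority) - len(train_minority)
new_points = smote(train_minority, n_new=n_needed, k=2)
balanced_minority = train_minority + new_points
```

Oversampling only the training split, as the abstract describes, keeps synthetic points derived from training samples out of the evaluation data.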
Affiliation(s)
- Meshrif Alruily
- Computer Science Department, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Aljouf, Saudi Arabia
| | - Sameh Abd El-Ghany
- Information Systems Department, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Aljouf, Saudi Arabia
| | - Ayman Mohamed Mostafa
- Information Systems Department, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Aljouf, Saudi Arabia
| | - Mohamed Ezz
- Computer Science Department, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Aljouf, Saudi Arabia
| | - A. A. Abd El-Aziz
- Information Systems Department, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Aljouf, Saudi Arabia
| |
|
34
|
Clua-Espuny JL, Molto-Balado P, Lucas-Noll J, Panisello-Tafalla A, Muria-Subirats E, Clua-Queralt J, Queralt-Tomas L, Reverté-Villarroya S. Early Diagnosis of Atrial Fibrillation and Stroke Incidence in Primary Care: Translating Measurements into Actions-A Retrospective Cohort Study. Biomedicines 2023; 11:1116. [PMID: 37189734 PMCID: PMC10135492 DOI: 10.3390/biomedicines11041116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 03/08/2023] [Accepted: 03/27/2023] [Indexed: 05/17/2023] Open
Abstract
(1) Background: AF-related strokes will triple by 2060, are associated with an increased risk of cognitive decline, and alone or in combination, will be one of the main health and economic burdens on the European population. The main goal of this paper is to describe the incidence of new AF associated with stroke, cognitive decline and mortality among people at high risk for AF. (2) Methods: Multicenter, observational, retrospective, community-based studies were conducted from 1 January 2015 to 31 December 2021. The setting was primary care centers. A total of 40,297 people aged ≥65 years without previous AF or stroke were stratified by AF risk at 5 years. The main measurements were the overall incidence density/1000 person-years (CI95%) of AF and stroke, prevalence of cognitive decline, and Kaplan-Meier curve. (3) Results: The cohort (46.4% women; mean age 77.65 ± 8.46 years) showed an AF incidence of 9.9/10³/year (CI95% 9.5-10.3), associated with a four-fold higher risk of stroke (CI95% 3.4-4.7), cognitive impairment (OR 1.34 (CI95% 1.1-1.5)), and all-cause mortality (OR 1.14 (CI95% 1.0-1.2)), but there was no significant difference in ischemic heart disease, chronic kidney disease, or peripheral arteriopathy. Unknown AF was diagnosed in 9.4% and of these patients, 21.1% were diagnosed with new stroke. (4) Conclusions: The patients at high AF risk (Q4th) already had an increased cardiovascular risk before they were diagnosed with AF.
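The incidence density used as the main measurement here is the number of new events divided by the person-time at risk, scaled to 1000 person-years. A minimal sketch follows; the event count and person-years are hypothetical round numbers chosen for illustration, not the study's data, and the interval uses a simple normal approximation to the Poisson count, which may differ from the method the authors applied.

```python
import math

def incidence_density(events, person_years, per=1000):
    """Return (rate, lower, upper) per `per` person-years.

    The 95% interval uses the normal approximation to a Poisson count:
    rate +/- 1.96 * sqrt(events) / person_years * per.
    """
    rate = events / person_years * per
    half_width = 1.96 * math.sqrt(events) / person_years * per
    return rate, rate - half_width, rate + half_width

# Hypothetical follow-up: 99 new cases observed over 10,000 person-years.
rate, low, high = incidence_density(99, 10_000)
print(f"{rate:.1f} per 1000 person-years (95% CI {low:.1f} to {high:.1f})")
```

The interval narrows as the number of observed events grows, which is why a cohort of roughly 40,000 people can report a tight interval around its rate.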
Affiliation(s)
- Josep-Lluis Clua-Espuny
- Primary Health-Care Centre, Institut Català de la Salut, Primary Care Service (SAP), EAP Tortosa-Est, Plaça Carrilet s/núm, 43500 Tortosa, Spain
- Research Support Unit Terres de l’Ebre, Institut Universitari d’Investigació en Atenció Primària Jordi Gol (IDIAP JGol), USR Terres de l’Ebre, 43500 Tortosa, Spain
| | - Pedro Molto-Balado
- Primary Health-Care Centre, Institut Català de la Salut, Primary Care Service (SAP) Terres de l’Ebre, UUDDTortosa-Terres de l’Ebre, 43500 Tortosa, Spain
| | - Jorgina Lucas-Noll
- Health Department, Management CatSalut Terres de l’Ebre, 43500 Tortosa, Spain
| | - Anna Panisello-Tafalla
- Primary Health-Care Centre, Institut Català de la Salut, Primary Care Service (SAP), EAP Tortosa-Est, Plaça Carrilet s/núm, 43500 Tortosa, Spain
| | - Eulalia Muria-Subirats
- Primary Health-Care Centre, Institut Català de la Salut, Primary Care Service (SAP) Terres de l’Ebre, EAP Amposta, C/Sebastià Juan Arbó, 139, 43870 Amposta, Spain
| | - Josep Clua-Queralt
- Research Support Unit Terres de l’Ebre, Institut Universitari d’Investigació en Atenció Primària Jordi Gol (IDIAP JGol), USR Terres de l’Ebre, 43500 Tortosa, Spain
| | - Lluïsa Queralt-Tomas
- Primary Health-Care Centre, Institut Català de la Salut, Primary Care Service (SAP), EAP Tortosa-Oest, Avda Cristobal Colon, 16, 43500 Tortosa, Spain
| | - Silvia Reverté-Villarroya
- Nursing Department, Campus Terres de l’Ebre, University Rovira i Virgili, Av Remolins, 13, 43500 Tortosa, Spain
- Advanced Nursing Research Group, Medicine and Health Sciences, University Rovira i Virgili, 43002 Tarragona, Spain
| | | |
|
35
|
[Organization and costs of stroke care in outpatient settings: Systematic review]. Aten Primaria 2023; 55:102578. [PMID: 36773416 PMCID: PMC9941369 DOI: 10.1016/j.aprim.2023.102578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Revised: 01/09/2023] [Accepted: 01/09/2023] [Indexed: 02/11/2023] Open
Abstract
OBJECTIVE To review the literature on stroke costs (ICD-10 code I63) in the field of primary care. DESIGN Systematic review. DATA SOURCES PubMed/Medline, ClinicalTrials.gov, Cochrane Reviews, EconLit, and Ovid/Embase between 01/01/2012-12/31/2021 with descriptors included in Medical Subject Heading (MeSH). SELECTION OF STUDIES Those with a description of the costs of activities carried out in the out-of-hospital setting. Included were systematic reviews; prospective and retrospective observational studies; and analyses of databases and of total or partial costs of stroke as a disease (COI). Articles were added using the snowball method. Studies were excluded if they were: a) not specifically related to stroke; b) in editorial or commentary format; c) irrelevant after review of the title and abstract; or d) gray literature or non-academic studies. DATA EXTRACTION They were assigned a level of evidence according to the GRADE levels. Direct and indirect cost data were collected. RESULTS AND CONCLUSIONS Thirty studies were included, of which 14 (46.6%) were related to post-stroke costs and 12 (40%) to cardiovascular prevention costs. The results show that most of them are retrospective analyses of different databases of short-term hospital care, and do not allow a detailed analysis of the costs by different segments of services. The possibilities for improvement are centered on primary and secondary prevention, selection and pre-hospital transfer, early discharge with support, and social and health care.
|
36
|
Trigka M, Dritsas E. Long-Term Coronary Artery Disease Risk Prediction with Machine Learning Models. SENSORS (BASEL, SWITZERLAND) 2023; 23:1193. [PMID: 36772237 PMCID: PMC9920214 DOI: 10.3390/s23031193] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/28/2022] [Revised: 01/17/2023] [Accepted: 01/18/2023] [Indexed: 06/18/2023]
Abstract
The heart is the most vital organ of the human body; thus, its improper functioning has a significant impact on human life. Coronary artery disease (CAD) is a disease of the coronary arteries through which the heart is nourished and oxygenated. It is due to the formation of atherosclerotic plaques on the wall of the epicardial coronary arteries, resulting in the narrowing of their lumen and the obstruction of blood flow through them. Coronary artery disease can be delayed or even prevented with lifestyle changes and medical intervention. Long-term risk prediction of coronary artery disease is the area of interest in this work. In this research paper, we experimented with various machine learning (ML) models with and without the synthetic minority oversampling technique (SMOTE), evaluating and comparing them in terms of accuracy, precision, recall and area under the curve (AUC). The results showed that the stacking ensemble model after SMOTE with 10-fold cross-validation prevailed over the other models, achieving an accuracy of 90.9%, a precision of 96.7%, a recall of 87.6% and an AUC equal to 96.1%.
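The accuracy, precision and recall reported in abstracts like this one all derive from the four cells of a binary confusion matrix. The sketch below shows the arithmetic on hypothetical labels; the function name `binary_metrics` and the toy data are assumptions for illustration and do not reproduce the study's figures.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F-measure for labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0   # of predicted positives, how many are real
    recall = tp / (tp + fn) if tp + fn else 0.0      # of real positives, how many are found
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

# Hypothetical ground truth and predictions (1 = disease, 0 = no disease).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Precision and recall can diverge sharply on imbalanced data, which is one reason these abstracts report them alongside accuracy.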
Affiliation(s)
- Maria Trigka
- Department of Computer Engineering and Informatics, University of Patras, 26504 Patras, Greece
| | | |
|
37
|
Dritsas E, Trigka M. Efficient Data-Driven Machine Learning Models for Cardiovascular Diseases Risk Prediction. SENSORS (BASEL, SWITZERLAND) 2023; 23:1161. [PMID: 36772201 PMCID: PMC9921621 DOI: 10.3390/s23031161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 01/15/2023] [Accepted: 01/17/2023] [Indexed: 06/18/2023]
Abstract
Cardiovascular diseases (CVDs) are now the leading cause of death, as the quality of life and human habits have changed significantly. CVDs are accompanied by various complications, including all pathological changes involving the heart and/or blood vessels. The list of pathological changes includes hypertension, coronary heart disease, heart failure, angina, myocardial infarction and stroke. Hence, prevention and early diagnosis could limit the onset or progression of the disease. Nowadays, machine learning (ML) techniques have gained a significant role in disease prediction and are an essential tool in medicine. In this study, a supervised ML-based methodology is presented through which we aim to design efficient prediction models for CVD manifestation, highlighting the SMOTE technique's superiority. Detailed analysis and understanding of risk factors are shown to explore their importance and contribution to CVD prediction. These factors are fed as input features to a plethora of ML models, which are trained and tested to identify the most appropriate for our objective under a binary classification problem with a uniform class probability distribution. Various ML models were evaluated with and without the Synthetic Minority Oversampling Technique (SMOTE) and compared in terms of Accuracy, Recall, Precision and Area Under the Curve (AUC). The experimental results showed that the Stacking ensemble model after SMOTE with 10-fold cross-validation prevailed over the other ones, achieving an Accuracy of 87.8%, Recall of 88.3%, Precision of 88% and an AUC equal to 98.2%.
|
38
|
Dritsas E, Trigka M. Supervised Machine Learning Models to Identify Early-Stage Symptoms of SARS-CoV-2. SENSORS (BASEL, SWITZERLAND) 2022; 23:40. [PMID: 36616638 PMCID: PMC9824026 DOI: 10.3390/s23010040] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/16/2022] [Revised: 12/07/2022] [Accepted: 12/16/2022] [Indexed: 06/12/2023]
Abstract
The coronavirus disease (COVID-19) pandemic was caused by the SARS-CoV-2 virus and began in December 2019. The virus was first reported in the Wuhan region of China. It is a new strain of coronavirus that until then had not been isolated in humans. In severe cases, pneumonia, acute respiratory distress syndrome, multiple organ failure or even death may occur. Now, the existence of vaccines, antiviral drugs and the appropriate treatment are allies in the confrontation of the disease. In the present research work, we utilized supervised Machine Learning (ML) models to determine early-stage symptoms of SARS-CoV-2 occurrence. For this purpose, we experimented with several ML models, and the results showed that the ensemble model, namely Stacking, outperformed the others, achieving an Accuracy, Precision, Recall and F-Measure equal to 90.9% and an Area Under Curve (AUC) of 96.4%.
|
39
|
Hunter E, Kelleher JD. A review of risk concepts and models for predicting the risk of primary stroke. Front Neuroinform 2022; 16:883762. [DOI: 10.3389/fninf.2022.883762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Accepted: 10/31/2022] [Indexed: 11/17/2022] Open
Abstract
Predicting an individual's risk of primary stroke is an important tool that can help to lower the burden of stroke for both the individual and society. There are a number of risk models and risk scores in existence but no review or classification designed to help the reader better understand how models differ and the reasoning behind these differences. In this paper we review the existing literature on primary stroke risk prediction models. From our literature review we identify key similarities and differences in the existing models. We find that models can differ in a number of ways, including the event type, the type of analysis, the model type and the time horizon. Based on these similarities and differences we have created a set of questions and a system to help answer those questions that modelers and readers alike can use to help classify and better understand the existing models as well as help to make necessary decisions when creating a new model.
|
40
|
Padhee S, Johnson M, Yi H, Banerjee T, Yang Z. Machine Learning for Aiding Blood Flow Velocity Estimation Based on Angiography. Bioengineering (Basel) 2022; 9:622. [PMID: 36354533 PMCID: PMC9687909 DOI: 10.3390/bioengineering9110622] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/28/2022] [Revised: 10/24/2022] [Accepted: 10/25/2022] [Indexed: 06/28/2024] Open
Abstract
Computational fluid dynamics (CFD) is widely employed to predict hemodynamic characteristics in arterial models, but it is not friendly to clinical applications due to the complexity of numerical simulations. As an alternative, this work proposed a framework to estimate hemodynamics in vessels based on angiography images using machine learning (ML) algorithms. First, the iodine contrast perfusion in blood was mimicked by a flow of dye diffusing into water in the experimentally validated CFD modeling. The generated projective images from simulations imitated the counterpart of light passing through the flow field as an analogy of X-ray imaging. Thus, the CFD simulation provides both the ground truth velocity field and projective images of dye flow patterns. The rough velocity field was estimated using the optical flow method (OFM) based on 53 projective images. ML training with the least absolute shrinkage and selection operator (LASSO) and a convolutional neural network was conducted with CFD velocity data as the ground truth and OFM velocity estimation as the input. The performance of each model was evaluated based on mean absolute error and mean squared error, where all models achieved or surpassed the criteria of 3 × 10⁻³ and 5 × 10⁻⁷ m/s, respectively, with a standard deviation less than 1 × 10⁻⁶ m/s. Finally, the interpretable regression and ML models were validated with over 613 image sets. The validation results showed that the employed ML model significantly reduced the error rate from 53.5% to 2.5% on average for the v-velocity estimation in comparison with CFD. The ML framework provided an alternative pathway to support clinical diagnosis by predicting hemodynamic information with high efficiency and accuracy.
Affiliation(s)
- Swati Padhee
- Department of Computer Science and Engineering, Wright State University, Dayton, OH 45435, USA
| | - Mark Johnson
- Department of Mechanical and Materials Engineering, Wright State University, Dayton, OH 45435, USA
| | - Hang Yi
- Department of Mechanical and Materials Engineering, Wright State University, Dayton, OH 45435, USA
| | - Tanvi Banerjee
- Department of Computer Science and Engineering, Wright State University, Dayton, OH 45435, USA
| | - Zifeng Yang
- Department of Mechanical and Materials Engineering, Wright State University, Dayton, OH 45435, USA
| |
|
41
|
Dritsas E, Trigka M. Machine Learning Methods for Hypercholesterolemia Long-Term Risk Prediction. SENSORS (BASEL, SWITZERLAND) 2022; 22:5365. [PMID: 35891045 PMCID: PMC9322993 DOI: 10.3390/s22145365] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/19/2022] [Revised: 07/12/2022] [Accepted: 07/16/2022] [Indexed: 06/12/2023]
Abstract
Cholesterol is a waxy substance found in blood lipids. Its role in the human body is helpful in the process of producing new cells as long as it is at a healthy level. When cholesterol exceeds the permissible limits, it works the opposite, causing serious heart health problems. When a person has high cholesterol (hypercholesterolemia), the blood vessels are blocked by fats, and thus, circulation through the arteries becomes difficult. The heart does not receive the oxygen it needs, and the risk of heart attack increases. Nowadays, machine learning (ML) has gained special interest from physicians, medical centers and healthcare providers due to its key capabilities in health-related issues, such as risk prediction, prognosis, treatment and management of various conditions. In this article, a supervised ML methodology is outlined whose main objective is to create risk prediction tools with high efficiency for hypercholesterolemia occurrence. Specifically, a data understanding analysis is conducted to explore the features association and importance to hypercholesterolemia. These factors are utilized to train and test several ML models to find the most efficient for our purpose. For the evaluation of the ML models, precision, recall, accuracy, F-measure, and AUC metrics have been taken into consideration. The derived results highlighted Soft Voting with Rotation and Random Forest trees as base models, which achieved better performance in comparison to the other models with an AUC of 94.5%, precision of 92%, recall of 91.8%, F-measure of 91.7% and an accuracy equal to 91.75%.
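Soft voting, the ensemble rule highlighted in this abstract, averages the class probabilities produced by the base models and predicts the class with the highest average. The sketch below illustrates the rule only; the function name `soft_vote`, the optional weights, and the probability values are hypothetical and not taken from the paper.

```python
def soft_vote(prob_lists, weights=None):
    """Average per-class probabilities across base models and pick the argmax.

    prob_lists: one list of class probabilities per base model.
    weights:    optional per-model weights (defaults to equal weighting).
    """
    n_models = len(prob_lists)
    weights = weights or [1.0] * n_models
    total = sum(weights)
    n_classes = len(prob_lists[0])
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]
    return averaged.index(max(averaged)), averaged

# Hypothetical outputs of two base models for one patient: [P(no risk), P(risk)].
model_a = [0.6, 0.4]   # leans toward class 0
model_b = [0.2, 0.8]   # confidently predicts class 1
label, averaged = soft_vote([model_a, model_b])
```

Here a hard (majority) vote would tie at one vote each, whereas soft voting lets the more confident model decide, which is the usual motivation for averaging probabilities.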
|
42
|
Dritsas E, Trigka M. Data-Driven Machine-Learning Methods for Diabetes Risk Prediction. SENSORS (BASEL, SWITZERLAND) 2022; 22:5304. [PMID: 35890983 PMCID: PMC9318204 DOI: 10.3390/s22145304] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/26/2022] [Revised: 07/10/2022] [Accepted: 07/13/2022] [Indexed: 01/11/2023]
Abstract
Diabetes mellitus is a chronic condition characterized by a disturbance in the metabolism of carbohydrates, fats and proteins. The most characteristic disorder in all forms of diabetes is hyperglycemia, i.e., elevated blood sugar levels. The modern way of life has significantly increased the incidence of diabetes. Therefore, early diagnosis of the disease is a necessity. Machine Learning (ML) has gained great popularity among healthcare providers and physicians due to its high potential in developing efficient tools for risk prediction, prognosis, treatment and the management of various conditions. In this study, a supervised learning methodology is described that aims to create risk prediction tools with high efficiency for type 2 diabetes occurrence. A features analysis is conducted to evaluate their importance and explore their association with diabetes. These features are the most common symptoms that often develop slowly with diabetes, and they are utilized to train and test several ML models. Various ML models are evaluated in terms of the Precision, Recall, F-Measure, Accuracy and AUC metrics and compared under 10-fold cross-validation and data splitting. Both validation methods highlighted Random Forest and K-NN as the best performing models in comparison to the other models.
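The 10-fold cross-validation mentioned in this abstract partitions the samples into ten folds and uses each fold once for testing while training on the other nine. A minimal index-splitting sketch follows; the helper name `k_fold_indices` and the sample count are assumptions, and this version does not shuffle or stratify, which a real experiment would normally do.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

# Hypothetical dataset of 25 samples split into 10 folds.
folds = list(k_fold_indices(25, k=10))
for train_idx, test_idx in folds:
    # A model would be fit on train_idx and scored on test_idx here.
    pass
```

Averaging the per-fold scores gives a less split-dependent estimate than a single train/test partition, which is why the abstract reports both validation schemes.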
Affiliation(s)
| | - Maria Trigka
- Department of Computer Engineering and Informatics, University of Patras, 26504 Patras, Greece;
| |
|
43
|
Cluster Analysis of US COVID-19 Infected States for Vaccine Distribution. Healthcare (Basel) 2022; 10:1235. [PMID: 35885762 PMCID: PMC9323689 DOI: 10.3390/healthcare10071235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Revised: 06/30/2022] [Accepted: 07/01/2022] [Indexed: 11/16/2022] Open
Abstract
Since December 2019, COVID-19 has been raging worldwide. To prevent the spread of COVID-19 infection, many countries have proposed epidemic prevention policies and quickly administered vaccines. However, facing a shortage of vaccines, the United States did not put forward effective epidemic prevention policies in time to prevent the infection from expanding, and the epidemic in the United States became increasingly serious. Using “The COVID Tracking Project”, this study collects medical indicators for each state in the United States from 2020 to 2021, and after feature selection, each state is clustered according to the epidemic’s severity. Furthermore, the confusion matrix of a classifier is used to verify the accuracy of the cluster analysis, and the study results show that the Cascade K-means cluster analysis has the highest accuracy. This study also labeled the three clusters of the cluster analysis results as high, medium, and low infection levels. Policymakers could more objectively decide which states should prioritize vaccine allocation in a vaccine shortage to prevent the epidemic from continuing to expand. It is hoped that if there is a similar epidemic in the future, relevant policymakers can use the analysis procedure of this study to determine the allocation of relevant medical resources for epidemic prevention according to the severity of infection in each state to prevent the spread of infection.
|