1
Yakovyna V, Shakhovska N, Szpakowska A. A novel hybrid supervised and unsupervised hierarchical ensemble for COVID-19 cases and mortality prediction. Sci Rep 2024; 14:9782. [PMID: 38684770 PMCID: PMC11059164 DOI: 10.1038/s41598-024-60637-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Accepted: 04/25/2024] [Indexed: 05/02/2024] Open
Abstract
Though COVID-19 is no longer a pandemic but rather an endemic, the epidemiological situation related to the SARS-CoV-2 virus is developing at an alarming rate, impacting every corner of the world. The rapid escalation of the coronavirus has led to the scientific community engagement, continually seeking solutions to ensure the comfort and safety of society. Understanding the joint impact of medical and non-medical interventions on COVID-19 spread is essential for making public health decisions that control the pandemic. This paper introduces two novel hybrid machine-learning ensembles that combine supervised and unsupervised learning for COVID-19 data classification and regression. The study utilizes publicly available COVID-19 outbreak and potential predictive features in the USA dataset, which provides information related to the outbreak of COVID-19 disease in the US, including data from each of 3142 US counties from the beginning of the epidemic (January 2020) until June 2021. The developed hybrid hierarchical classifiers outperform single classification algorithms. The best-achieved performance metrics for the classification task were Accuracy = 0.912, ROC-AUC = 0.916, and F1-score = 0.916. The proposed hybrid hierarchical ensemble combining both supervised and unsupervised learning allows us to increase the accuracy of the regression task by 11% in terms of MSE, 29% in terms of the area under the ROC, and 43% in terms of the MPP metric. Thus, using the proposed approach, it is possible to predict the number of COVID-19 cases and deaths based on demographic, geographic, climatic, traffic, public health, social-distancing-policy adherence, and political characteristics with sufficiently high accuracy. The study reveals that virus pressure is the most important feature in COVID-19 spread for classification and regression analysis. Five other significant features were identified to have the most influence on COVID-19 spread. 
The combined ensembling approach introduced in this study can help policymakers design prevention and control measures to avoid or minimize public health threats in the future.
Affiliation(s)
- Vitaliy Yakovyna
  - Faculty of Mathematics and Computer Science, University of Warmia and Mazury in Olsztyn, Ul. Oczapowskiego 2, 10-719, Olsztyn, Poland
  - Artificial Intelligence Department, Lviv Polytechnic National University, 12 S. Bandery St, Lviv, 79013, Ukraine
- Nataliya Shakhovska
  - Artificial Intelligence Department, Lviv Polytechnic National University, 12 S. Bandery St, Lviv, 79013, Ukraine
  - Universytet Rolniczy, 31120, Kraków, Poland
- Aleksandra Szpakowska
  - Faculty of Mathematics and Computer Science, University of Warmia and Mazury in Olsztyn, Ul. Oczapowskiego 2, 10-719, Olsztyn, Poland
2
Barreto TDO, Veras NVR, Cardoso PH, Fernandes FRDS, Medeiros LPDS, Bezerra MV, de Andrade FMQ, Pinheiro CDO, Sánchez-Gendriz I, Silva GJPC, Rodrigues LF, de Morais AHF, dos Santos JPQ, Paiva JC, de Andrade IGM, Valentim RADM. Artificial intelligence applied to analyzes during the pandemic: COVID-19 beds occupancy in the state of Rio Grande do Norte, Brazil. Front Artif Intell 2023; 6:1290022. [PMID: 38145230 PMCID: PMC10748397 DOI: 10.3389/frai.2023.1290022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Accepted: 11/17/2023] [Indexed: 12/26/2023] Open
Abstract
The COVID-19 pandemic is already considered one of the biggest global health crises. In Rio Grande do Norte, a Brazilian state, the RegulaRN platform was the health information system used to regulate beds for patients with COVID-19. This article explored machine learning and deep learning techniques with RegulaRN data in order to identify the best models and parameters to predict the outcome of a hospitalized patient. A total of 25,366 bed regulations for COVID-19 patients were analyzed. The data analyzed comes from the RegulaRN Platform database from April 2020 to August 2022. From these data, the nine most pertinent characteristics were selected from the twenty available, and blank or inconclusive data were excluded. This was followed by the following steps: data pre-processing, database balancing, training, and test. The results showed better performance in terms of accuracy (84.01%), precision (79.57%), and F1-score (81.00%) for the Multilayer Perceptron model with Stochastic Gradient Descent optimizer. The best results for recall (84.67%), specificity (84.67%), and ROC-AUC (91.6%) were achieved by Root Mean Squared Propagation. This study compared different computational methods of machine and deep learning whose objective was to classify bed regulation data for patients with COVID-19 from the RegulaRN Platform. The results have made it possible to identify the best model to help health professionals during the process of regulating beds for patients with COVID-19. The scientific findings of this article demonstrate that the computational methods used applied through a digital health solution, can assist in the decision-making of medical regulators and government institutions in situations of public health crisis.
Affiliation(s)
- Tiago de Oliveira Barreto
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Nícolas Vinícius Rodrigues Veras
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Pablo Holanda Cardoso
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Felipe Ricardo dos Santos Fernandes
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Maria Valéria Bezerra
  - Secretary of Public Health of Rio Grande do Norte, Natal, Rio Grande do Norte, Brazil
- Ignacio Sánchez-Gendriz
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Gleyson José Pinheiro Caldeira Silva
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Leandro Farias Rodrigues
  - Brazilian Company of Hospital Services (EBSERH), University Hospital of Pelotas, Federal University of Pelotas (UFPel), Pelotas, Rio Grande do Sul, Brazil
- Antonio Higor Freire de Morais
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- João Paulo Queiroz dos Santos
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Jailton Carlos Paiva
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Ion Garcia Mascarenhas de Andrade
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
3
Nopour R. Prediction of five-year survival among esophageal cancer patients using machine learning. Heliyon 2023; 9:e22654. [PMID: 38125437 PMCID: PMC10730993 DOI: 10.1016/j.heliyon.2023.e22654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 11/16/2023] [Accepted: 11/16/2023] [Indexed: 12/23/2023] Open
Abstract
Background and aim: Considering the silent progression of esophageal cancer (EC), the survival prediction of this disease is crucial in enhancing the quality of life of these patients globally. So far, no prediction solution has been introduced for the survival of EC in Iran based on the machine learning (ML) approach. So, this study aims to develop a prediction model for the five-year survival of EC based on the ML approach to promote clinical outcomes and various treatment and preventive plans. Material and methods: In this retrospective study, we investigated the 1656 cases of survived and non-survived EC patients belonging to Imam Khomeini Hospital in Sari City from 2013 to 2020. The multivariable regression analysis was used to select the best predictors of five-year survival. We leveraged random forest, eXtreme Gradient Boosting, support vector machine, artificial neural networks, Bayesian networks, J-48 decision tree, and K-nearest neighborhood to develop the prediction models. To get the best model for predicting the five-year survival of EC, we compared them using the area under the receiver operator characteristics. Results: The age at diagnosis, body mass index, smoking, obstruction, dysphagia, weight loss, lymphadenopathy, chemotherapy, radiotherapy, family history of EC, tumor stage, type of appearance, histological type, grade of differentiation, tumor location, tumor size, lymphatic invasion, vascular invasion, and platelet albumin ratio were considered as the best predictors associated with the five-year survival of EC based on the regression analysis. In this respect, the random forest with the area under the receiver operator characteristics of 0.95 was identified as a superior model. Conclusion: The experimental results of the current study showed that the random forest could have a significant role in enhancing the quality of care in EC patients by increasing the effectiveness of follow-up and treatment measures introduced by care providers.
Affiliation(s)
- Raoof Nopour
  - Department of Health Information Management, Student Research Committee, School of Health Management and Information Sciences Branch, Iran University of Medical Sciences, Tehran, Iran
4
Wang Y, Liu L, Wang C. Trends in using deep learning algorithms in biomedical prediction systems. Front Neurosci 2023; 17:1256351. [PMID: 38027475 PMCID: PMC10665494 DOI: 10.3389/fnins.2023.1256351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 09/25/2023] [Indexed: 12/01/2023] Open
Abstract
In the domain of using DL-based methods in medical and healthcare prediction systems, the utilization of state-of-the-art deep learning (DL) methodologies assumes paramount significance. DL has attained remarkable achievements across diverse domains, rendering its efficacy particularly noteworthy in this context. The integration of DL with health and medical prediction systems enables real-time analysis of vast and intricate datasets, yielding insights that significantly enhance healthcare outcomes and operational efficiency in the industry. This comprehensive literature review systematically investigates the latest DL solutions for the challenges encountered in medical healthcare, with a specific emphasis on DL applications in the medical domain. By categorizing cutting-edge DL approaches into distinct categories, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), generative adversarial networks (GANs), long short-term memory (LSTM) models, support vector machine (SVM), and hybrid models, this study delves into their underlying principles, merits, limitations, methodologies, simulation environments, and datasets. Notably, the majority of the scrutinized articles were published in 2022, underscoring the contemporaneous nature of the research. Moreover, this review accentuates the forefront advancements in DL techniques and their practical applications within the realm of medical prediction systems, while simultaneously addressing the challenges that hinder the widespread implementation of DL in image segmentation within the medical healthcare domains. These discerned insights serve as compelling impetuses for future studies aimed at the progressive advancement of using DL-based methods in medical and health prediction systems. The evaluation metrics employed across the reviewed articles encompass a broad spectrum of features, encompassing accuracy, precision, specificity, F-score, adoptability, adaptability, and scalability.
Affiliation(s)
- Yanbu Wang
  - School of Strength and Conditioning, Beijing Sport University, Beijing, China
- Linqing Liu
  - Department of Physical Education, Peking University, Beijing, China
- Chao Wang
  - Institute of Competitive Sports, Beijing Sport University, Beijing, China
5
Shearah Z, Ullah Z, Fakieh B. Intelligent Framework for Early Detection of Severe Pediatric Diseases from Mild Symptoms. Diagnostics (Basel) 2023; 13:3204. [PMID: 37892025 PMCID: PMC10606417 DOI: 10.3390/diagnostics13203204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 10/05/2023] [Accepted: 10/11/2023] [Indexed: 10/29/2023] Open
Abstract
Children's health is one of the most significant fields in medicine. Most diseases that result in children's death or long-term morbidity are caused by preventable and treatable etiologies, and they appear in the child at the early stages as mild symptoms. This research aims to develop a machine learning (ML) framework to detect the severity of disease in children. The proposed framework helps in discriminating children's urgent/severe conditions and notifying parents whether a child needs to visit the emergency room immediately or not. The model considers several variables to detect the severity of cases, which are the symptoms, risk factors (e.g., age), and the child's medical history. The framework is implemented by using nine ML methods. The results achieved show the high performance of the proposed framework in identifying serious pediatric diseases, where decision tree and random forest outperformed the other methods with an accuracy rate of 94%. This shows the reliability of the proposed framework to be used as a pediatric decision-making system for detecting serious pediatric illnesses. The results are promising when compared to recent state-of-the-art studies. The main contribution of this research is to propose a framework that is viable for use by parents when their child suffers from any commonly developed symptoms.
Affiliation(s)
- Zelal Shearah
  - Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
6
John Joseph S, Gandhi Raj R. Hybrid optimized feature selection and deep learning based COVID-19 disease prediction. Comput Methods Biomech Biomed Engin 2023; 26:2070-2088. [PMID: 37018029 DOI: 10.1080/10255842.2023.2194476] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/02/2022] [Revised: 03/07/2023] [Accepted: 03/19/2023] [Indexed: 04/06/2023]
Abstract
The COVID-19 virus has affected many people around the globe. It caused a worldwide pandemic and more than one million deaths, and countries around the globe had to announce complete lockdowns once community spread of the coronavirus began. The real-time Polymerase Chain Reaction (RT-PCR) test conducted to detect COVID-19 is not sufficiently effective or sensitive. Hence, this research presents the proposed Caviar-MFFO-assisted Deep LSTM scheme for COVID-19 detection, which operates on COVID-19 case data. The method extracts various technical indicators that improve the efficiency of COVID-19 detection, and the significant features suited to COVID-19 detection are selected using the proposed mayfly with fruit fly optimization (MFFO). COVID-19 is then detected by Deep Long Short Term Memory (Deep LSTM), with the Conditional Autoregressive Value at Risk MFFO (Caviar-MFFO) modeled to train the weights of the Deep LSTM. The experimental analysis reveals that the proposed Caviar-MFFO-assisted Deep LSTM method performed efficiently in terms of Mean Squared Error (MSE) and Root Mean Squared Error (RMSE): it achieved minimal values of 1.438 and 1.199 for recovered cases, 4.582 and 2.140 for death cases, and 6.127 and 2.475 for infected cases, for MSE and RMSE respectively.
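Because RMSE is by definition the square root of MSE, the paired figures quoted in this abstract can be checked against each other. The short Python sketch below does only that arithmetic; the dictionary keys and the helper name `rmse_from_mse` are illustrative choices, not anything taken from the paper.

```python
import math

# (MSE, RMSE) pairs exactly as quoted in the abstract above.
reported = {
    "recovered": (1.438, 1.199),
    "death": (4.582, 2.140),
    "infected": (6.127, 2.475),
}


def rmse_from_mse(mse: float) -> float:
    """RMSE is the square root of MSE by definition."""
    return math.sqrt(mse)


# Largest absolute gap between sqrt(MSE) and the reported RMSE.
max_gap = max(abs(rmse_from_mse(mse) - rmse) for mse, rmse in reported.values())

for label, (mse, rmse) in reported.items():
    print(f"{label}: sqrt({mse}) = {rmse_from_mse(mse):.4f}, reported RMSE = {rmse}")
```

For all three pairs the recomputed value agrees with the reported RMSE to within about 0.001, which is consistent with rounding to three decimals.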
Affiliation(s)
- S John Joseph
  - Department of Computer Science and Engineering, Sudharsan Engineering College, Pudukkottai, Tamilnadu, India
- R Gandhi Raj
  - Department of Electrical and Electronics Engineering, University College of Engineering (BIT Campus), Anna University, Tiruchirappalli, Tamilnadu, India
7
Abnoosian K, Farnoosh R, Behzadi MH. Prediction of diabetes disease using an ensemble of machine learning multi-classifier models. BMC Bioinformatics 2023; 24:337. [PMID: 37697283 PMCID: PMC10496262 DOI: 10.1186/s12859-023-05465-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Accepted: 09/04/2023] [Indexed: 09/13/2023] Open
Abstract
BACKGROUND AND OBJECTIVE: Diabetes is a life-threatening chronic disease with a growing global prevalence, necessitating early diagnosis and treatment to prevent severe complications. Machine learning has emerged as a promising approach for diabetes diagnosis, but challenges such as limited labeled data, frequent missing values, and dataset imbalance hinder the development of accurate prediction models. Therefore, a novel framework is required to address these challenges and improve performance. METHODS: In this study, we propose an innovative pipeline-based multi-classification framework to predict diabetes in three classes: diabetic, non-diabetic, and prediabetes, using the imbalanced Iraqi Patient Dataset of Diabetes. Our framework incorporates various pre-processing techniques, including duplicate sample removal, attribute conversion, missing value imputation, data normalization and standardization, feature selection, and k-fold cross-validation. Furthermore, we implement multiple machine learning models, such as k-NN, SVM, DT, RF, AdaBoost, and GNB, and introduce a weighted ensemble approach based on the Area Under the Receiver Operating Characteristic Curve (AUC) to address dataset imbalance. Performance optimization is achieved through grid search and Bayesian optimization for hyper-parameter tuning. RESULTS: Our proposed model outperforms other machine learning models, including k-NN, SVM, DT, RF, AdaBoost, and GNB, in predicting diabetes. The model achieves high average accuracy, precision, recall, F1-score, and AUC values of 0.9887, 0.9861, 0.9792, 0.9851, and 0.999, respectively. CONCLUSION: Our pipeline-based multi-classification framework demonstrates promising results in accurately predicting diabetes using an imbalanced dataset of Iraqi diabetic patients. The proposed framework addresses the challenges associated with limited labeled data, missing values, and dataset imbalance, leading to improved prediction performance.
This study highlights the potential of machine learning techniques in diabetes diagnosis and management, and the proposed framework can serve as a valuable tool for accurate prediction and improved patient care. Further research can build upon our work to refine and optimize the framework and explore its applicability in diverse datasets and populations.
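The abstract mentions a weighted ensemble based on AUC but does not spell out the weighting rule. As a rough illustration only, the Python sketch below assumes the simplest variant: each model's class probabilities are weighted in proportion to its validation AUC and then summed. The function name `auc_weighted_vote`, the toy probabilities, and the AUC values are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def auc_weighted_vote(
    probas: Sequence[Sequence[Sequence[float]]],
    aucs: Sequence[float],
) -> Tuple[List[List[float]], List[int]]:
    """Combine per-model class probabilities with weights proportional to AUC.

    probas[m][i][c] is model m's probability of class c for sample i.
    This proportional weighting is an assumption, not the paper's stated rule.
    """
    total = sum(aucs)
    weights = [a / total for a in aucs]  # normalise so the weights sum to 1
    n_samples = len(probas[0])
    n_classes = len(probas[0][0])
    combined = [[0.0] * n_classes for _ in range(n_samples)]
    for w, model_probs in zip(weights, probas):
        for i, row in enumerate(model_probs):
            for c, p in enumerate(row):
                combined[i][c] += w * p
    # Predicted label is the class with the highest combined probability.
    labels = [max(range(n_classes), key=row.__getitem__) for row in combined]
    return combined, labels


# Toy data: two models, two samples, three classes
# (0 = non-diabetic, 1 = prediabetes, 2 = diabetic).
model_a = [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]
model_b = [[0.5, 0.3, 0.2], [0.2, 0.7, 0.1]]
combined, labels = auc_weighted_vote([model_a, model_b], aucs=[0.9, 0.6])
print(labels)
```

With these toy numbers the weights are 0.6 and 0.4, so the higher-AUC model dominates the first sample while the second model tips the second sample toward class 1.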
Affiliation(s)
- Karlo Abnoosian
  - Department of Statistics, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Rahman Farnoosh
  - School of Mathematics, Iran University of Science and Technology, Tehran, Iran
- Mohammad Hassan Behzadi
  - Department of Statistics, Science and Research Branch, Islamic Azad University, Tehran, Iran
8
Kumar R, Maheshwari S, Sharma A, Linda S, Kumar S, Chatterjee I. Ensemble learning-based early detection of influenza disease. Multimedia Tools and Applications 2023:1-21. [PMID: 37362719 PMCID: PMC10199437 DOI: 10.1007/s11042-023-15848-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Revised: 08/16/2022] [Accepted: 05/15/2023] [Indexed: 06/28/2023]
Abstract
Across the world, the seasonal disease influenza is a respiratory illness that impacts all age groups in many ways. Its symptoms are fever, chills, aches, pains, headaches, fatigue, cough, and weakness. Seasonal influenza can cause mild to severe illness and lead to death at times. The task of early detection of influenza is an important research area these days. Various studies show that machine learning techniques have attracted many researchers' attention to the early detection of influenza disease. In this paper, early detection of Influenza disease among all age groups is done using various machine learning techniques. Influenza Research Database and the Human Surveillance Records data sets are used. Data analysis is undertaken, and ensemble-based stacked algorithms are implemented on the whole data set. The performance of different models has been evaluated using different performance metrics. Overall, the study proposes efficient machine learning models that can be implemented to provide a cheaper and quicker diagnostic tool for detecting influenza.
Affiliation(s)
- Ranjan Kumar
  - Department of Computer Science, Aryabhatta College, University of Delhi, Delhi, 110021 India
- Sajal Maheshwari
  - Department of Computer Science, Aryabhatta College, University of Delhi, Delhi, 110021 India
- Anushka Sharma
  - Department of Computer Science, Aryabhatta College, University of Delhi, Delhi, 110021 India
- Sonal Linda
  - Department of Computer Science, Aryabhatta College, University of Delhi, Delhi, 110021 India
- Subhash Kumar
  - Department of Physics, Acharya Narendra Dev College, University of Delhi, Delhi, 110019 India
- Indranath Chatterjee
  - Department of Computer Engineering, Tongmyong University, Busan, 48520 South Korea
  - School of Technology, Woxsen University, Hyderabad, Telangana 500033 India
9
Aslam H, Biswas S. Analysis of COVID-19 Death Cases Using Machine Learning. SN Computer Science 2023; 4:403. [PMID: 37220559 PMCID: PMC10191086 DOI: 10.1007/s42979-023-01835-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Accepted: 12/21/2022] [Indexed: 05/25/2023]
Abstract
COVID-19 has threatened the existence of human life for more than the last 2 years. More than 460 million confirmed cases and 6 million deaths have been reported worldwide due to COVID-19. To measure the severity of the COVID-19, the mortality rate plays an important role. Understanding the nature of COVID-19 and forecasting the death cases of COVID-19 require more investigation of the real effect for different risk factors. In this work, various regression machine learning models are proposed to extract the relationship between different factors and the death rate of COVID-19. The optimal regression tree algorithm employed in this work estimates the impact of essential causal variables that significantly affect the mortality rates. We have generated a real-time forecast for the death case of COVID-19 using machine learning techniques. The analysis is evaluated with the well-known regression models XGBoost, Random Forest, and SVM on the data sets of the US, India, Italy, and three continents Asia, Europe, and North America. The results show that the models can be used to forecast the death cases for the near future in case of an epidemic like Novel Coronavirus.
Affiliation(s)
- Humaira Aslam
  - Department of Mathematics, Adamas University, Barasat-Barrackpore Road, Jagannathpur, Kolkata, West Bengal 700126 India
- Santanu Biswas
  - Department of Mathematics, Adamas University, Barasat-Barrackpore Road, Jagannathpur, Kolkata, West Bengal 700126 India
  - Department of Mathematics, Jadavpur University, Raja Subodh Chandra Mallick Road, Kolkata, West Bengal 700032 India
10
Zhang Y, Tang S, Yu G. An interpretable hybrid predictive model of COVID-19 cases using autoregressive model and LSTM. Sci Rep 2023; 13:6708. [PMID: 37185289 PMCID: PMC10126574 DOI: 10.1038/s41598-023-33685-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/11/2022] [Accepted: 04/17/2023] [Indexed: 05/17/2023] Open
Abstract
The Coronavirus Disease 2019 (COVID-19) has had a profound impact on global health and economy, making it crucial to build accurate and interpretable data-driven predictive models for COVID-19 cases to improve public policy making. The extremely large scale of the pandemic and the intrinsically changing transmission characteristics pose a great challenge for effectively predicting COVID-19 cases. To address this challenge, we propose a novel hybrid model in which the interpretability of the Autoregressive model (AR) and the predictive power of the long short-term memory neural networks (LSTM) join forces. The proposed hybrid model is formalized as a neural network with an architecture that connects two composing model blocks, of which the relative contribution is decided data-adaptively in the training procedure. We demonstrate the favorable performance of the hybrid model over its two single composing models as well as other popular predictive models through comprehensive numerical studies on two data sources under multiple evaluation metrics. Specifically, in county-level data of 8 California counties, our hybrid model achieves 4.173% MAPE, outperforming the composing AR (5.629%) and LSTM (4.934%) alone on average. In country-level datasets, our hybrid model outperforms the widely-used predictive models such as AR, LSTM, Support Vector Machines, Gradient Boosting, and Random Forest, in predicting the COVID-19 cases in Japan, Canada, Brazil, Argentina, Singapore, Italy, and the United Kingdom. In addition to the predictive performance, we illustrate the interpretability of our proposed hybrid model using the estimated AR component, which is a key feature that is not shared by most black-box predictive models for COVID-19 cases. 
Our study provides a new and promising direction for building effective and interpretable data-driven models for COVID-19 cases, which could have significant implications for public health policy making and control of the current COVID-19 and potential future pandemics.
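The comparison in this abstract is stated in terms of MAPE (mean absolute percentage error). As a small illustration of what those figures mean, the Python sketch below defines MAPE in its standard form and computes how much the hybrid model's reported 4.173% reduces the reported AR (5.629%) and LSTM (4.934%) errors in relative terms. The function names and the toy series are illustrative assumptions; only the three percentages come from the abstract.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` lowers the `baseline` error."""
    return 100.0 * (baseline - improved) / baseline


# Toy series (hypothetical), only to show the metric itself.
toy_mape = mape([100.0, 200.0], [110.0, 180.0])

# MAPE figures as reported in the abstract (in percent).
hybrid, ar_only, lstm_only = 4.173, 5.629, 4.934
vs_ar = relative_reduction(ar_only, hybrid)
vs_lstm = relative_reduction(lstm_only, hybrid)
print(f"toy MAPE = {toy_mape:.2f}%")
print(f"reduction vs AR = {vs_ar:.1f}%, vs LSTM = {vs_lstm:.1f}%")
```

On the reported numbers this works out to roughly a 26% lower error than AR alone and roughly 15% lower than LSTM alone, which is a relative reading of the same averages rather than a new result.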
Affiliation(s)
- Yangyi Zhang
  - Department of Mathematics, University of California Santa Barbara, Santa Barbara, CA, 93106, USA
- Sui Tang
  - Department of Mathematics, University of California Santa Barbara, Santa Barbara, CA, 93106, USA
- Guo Yu
  - Department of Statistics and Applied Probability, University of California Santa Barbara, Santa Barbara, CA, 93106, USA
11
An Q, Rahman S, Zhou J, Kang JJ. A Comprehensive Review on Machine Learning in Healthcare Industry: Classification, Restrictions, Opportunities and Challenges. Sensors (Basel, Switzerland) 2023; 23:s23094178. [PMID: 37177382 PMCID: PMC10180678 DOI: 10.3390/s23094178] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 03/07/2023] [Revised: 04/16/2023] [Accepted: 04/18/2023] [Indexed: 05/15/2023]
Abstract
Recently, various sophisticated methods, including machine learning and artificial intelligence, have been employed to examine health-related data. Medical professionals are acquiring enhanced diagnostic and treatment abilities by utilizing machine learning applications in the healthcare domain. Medical data have been used by many researchers to detect diseases and identify patterns. In the current literature, there are very few studies that address machine learning algorithms to improve healthcare data accuracy and efficiency. We examined the effectiveness of machine learning algorithms in improving time series healthcare metrics for heart rate data transmission (accuracy and efficiency). In this paper, we reviewed several machine learning algorithms in healthcare applications. After a comprehensive overview and investigation of supervised and unsupervised machine learning algorithms, we also demonstrated time series tasks based on past values (along with reviewing their feasibility for both small and large datasets).
Affiliation(s)
- Qi An
  - School of Information Technology, Faculty of Science, Engineering and Built Environment, Deakin University, Geelong, VIC 3216, Australia
- Saifur Rahman
  - School of Information Technology, Faculty of Science, Engineering and Built Environment, Deakin University, Geelong, VIC 3216, Australia
- Jingwen Zhou
  - School of Information Technology, Faculty of Science, Engineering and Built Environment, Deakin University, Geelong, VIC 3216, Australia
- James Jin Kang
  - Computing and Security, School of Science, Edith Cowan University, Joondalup, WA 6027, Australia
12
Muneer R, Hashmet MR, Pourafshary P, Shakeel M. Unlocking the Power of Artificial Intelligence: Accurate Zeta Potential Prediction Using Machine Learning. Nanomaterials (Basel, Switzerland) 2023; 13:1209. [PMID: 37049303 PMCID: PMC10096557 DOI: 10.3390/nano13071209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Revised: 03/16/2023] [Accepted: 03/22/2023] [Indexed: 06/19/2023]
Abstract
Nanoparticles have gained significance in modern science due to their unique characteristics and diverse applications in various fields. Zeta potential is critical in assessing the stability of nanofluids and colloidal systems but measuring it can be time-consuming and challenging. The current research proposes the use of cutting-edge machine learning techniques, including multiple regression analyses (MRAs), support vector machines (SVM), and artificial neural networks (ANNs), to simulate the zeta potential of silica nanofluids and colloidal systems, while accounting for affecting parameters such as nanoparticle size, concentration, pH, temperature, brine salinity, monovalent ion type, and the presence of sand, limestone, or nano-sized fine particles. Zeta potential data from different literature sources were used to develop and train the models using machine learning techniques. Performance indicators were employed to evaluate the models' predictive capabilities. The correlation coefficient (r) for the ANN, SVM, and MRA models was found to be 0.982, 0.997, and 0.68, respectively. The mean absolute percentage error for the ANN model was 5%, whereas, for the MRA and SVM models, it was greater than 25%. ANN models were more accurate than SVM and MRA models at predicting zeta potential, and the trained ANN model achieved an accuracy of over 97% in zeta potential predictions. ANN models are more accurate and faster at predicting zeta potential than conventional methods. The model developed in this research is the first ever to predict the zeta potential of silica nanofluids, dispersed kaolinite, sand-brine system, and coal dispersions considering several influencing parameters. This approach eliminates the need for time-consuming experimentation and provides a highly accurate and rapid prediction method with broad applications across different fields.
Affiliation(s)
- Rizwan Muneer
  - School of Mining and Geosciences, Nazarbayev University, Astana 010000, Kazakhstan
- Muhammad Rehan Hashmet
  - Department of Chemical and Petroleum Engineering, United Arab Emirates University, Al Ain 15551, United Arab Emirates
- Peyman Pourafshary
  - School of Mining and Geosciences, Nazarbayev University, Astana 010000, Kazakhstan
- Mariam Shakeel
  - School of Mining and Geosciences, Nazarbayev University, Astana 010000, Kazakhstan

13
Gürsoy E, Kaya Y. An overview of deep learning techniques for COVID-19 detection: methods, challenges, and future works. MULTIMEDIA SYSTEMS 2023; 29:1603-1627. [PMID: 37261262 PMCID: PMC10039775 DOI: 10.1007/s00530-023-01083-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2022] [Accepted: 03/20/2023] [Indexed: 06/02/2023]
Abstract
The World Health Organization (WHO) declared a pandemic in response to the coronavirus COVID-19 in 2020, which resulted in numerous deaths worldwide. Although the disease appears to have lost its impact, millions of people have been affected by this virus, and new infections still occur. Identifying COVID-19 requires a reverse transcription-polymerase chain reaction test (RT-PCR) or analysis of medical data. Due to the high cost and time required to scan and analyze medical data, researchers are focusing on using automated computer-aided methods. This review examines the applications of deep learning (DL) and machine learning (ML) in detecting COVID-19 using medical data such as CT scans, X-rays, cough sounds, MRIs, ultrasound, and clinical markers. First, the data preprocessing, the features used, and the current COVID-19 detection methods are divided into two subsections, and the studies are discussed. Second, the reported publicly available datasets, their characteristics, and the potential comparison materials mentioned in the literature are presented. Third, a comprehensive comparison is made by contrasting the similar and different aspects of the studies. Finally, the results, gaps, and limitations are summarized to stimulate the improvement of COVID-19 detection methods, and the study concludes by listing some future research directions for COVID-19 classification.
Affiliation(s)
- Ercan Gürsoy
  - Department of Computer Engineering, Adana Alparslan Turkes Science and Technology University, 01250 Adana, Turkey
- Yasin Kaya
  - Department of Computer Engineering, Adana Alparslan Turkes Science and Technology University, 01250 Adana, Turkey

14
Surianarayanan C, Lawrence JJ, Chelliah PR, Prakash E, Hewage C. Convergence of Artificial Intelligence and Neuroscience towards the Diagnosis of Neurological Disorders-A Scoping Review. SENSORS (BASEL, SWITZERLAND) 2023; 23:3062. [PMID: 36991773 PMCID: PMC10053494 DOI: 10.3390/s23063062] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/25/2023] [Revised: 03/09/2023] [Accepted: 03/09/2023] [Indexed: 06/19/2023]
Abstract
Artificial intelligence (AI) is a field of computer science that deals with the simulation of human intelligence using machines so that such machines gain problem-solving and decision-making capabilities similar to those of the human brain. Neuroscience is the scientific study of the structure and cognitive functions of the brain. Neuroscience and AI are mutually interrelated. These two fields help each other in their advancements. The theory of neuroscience has brought many distinct improvisations into the AI field. The biological neural network has led to the realization of complex deep neural network architectures that are used to develop versatile applications, such as text processing, speech recognition, object detection, etc. Additionally, neuroscience helps to validate the existing AI-based models. Reinforcement learning in humans and animals has inspired computer scientists to develop algorithms for reinforcement learning in artificial systems, which enables those systems to learn complex strategies without explicit instruction. Such learning helps in building complex applications, like robot-based surgery, autonomous vehicles, gaming applications, etc. In turn, with its ability to intelligently analyze complex data and extract hidden patterns, AI is a natural choice for analyzing neuroscience data, which are very complex. Large-scale AI-based simulations help neuroscientists test their hypotheses. Through an interface with the brain, an AI-based system can extract the brain signals and commands that are generated according to the signals. These commands are fed into devices, such as a robotic arm, which helps in the movement of paralyzed muscles or other human parts. AI has several use cases in analyzing neuroimaging data and reducing the workload of radiologists. The study of neuroscience helps in the early detection and diagnosis of neurological disorders. In the same way, AI can effectively be applied to the prediction and detection of neurological disorders. Thus, in this paper, a scoping review has been carried out on the mutual relationship between AI and neuroscience, emphasizing the convergence between AI and neuroscience in order to detect and predict various neurological disorders.
Affiliation(s)
- Edmond Prakash
  - Research Center for Creative Arts, University for the Creative Arts (UCA), Farnham GU9 7DS, UK
- Chaminda Hewage
  - Cardiff School of Technologies, Cardiff Metropolitan University, Cardiff CF5 2YB, UK

15
Paul SG, Saha A, Biswas AA, Zulfiker MS, Arefin MS, Rahman MM, Reza AW. Combating Covid-19 using machine learning and deep learning: Applications, challenges, and future perspectives. ARRAY 2023; 17:100271. [PMID: 36530931 PMCID: PMC9737520 DOI: 10.1016/j.array.2022.100271] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/28/2022] [Revised: 12/05/2022] [Accepted: 12/07/2022] [Indexed: 12/14/2022] Open
Abstract
COVID-19 is a worldwide pandemic that has affected many people, and thousands of individuals have died from it during the last two years. Due to the benefits of Artificial Intelligence (AI) in X-ray image interpretation, sound analysis, diagnosis, patient monitoring, and CT image identification, it has been further researched in the area of medical science during the period of COVID-19. This study has assessed the performance of, and investigated, different machine learning (ML), deep learning (DL), and combined ML, DL, and AI approaches that have been employed in recent studies with diverse data formats to combat the problems that have arisen due to the COVID-19 pandemic. Finally, this study compares stand-alone ML- and DL-based research on COVID-19 issues with research that combines ML, DL, and AI. After in-depth analysis and comparison, this study responds to the proposed research questions and presents the future research directions in this context. This review work will guide different research groups to develop viable applications based on ML, DL, and AI models, and will also guide healthcare institutes, researchers, and governments by showing them how these techniques can ease the process of tackling COVID-19.
Affiliation(s)
- Showmick Guha Paul
  - Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
- Arpa Saha
  - Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
- Al Amin Biswas
  - Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
  - Corresponding author
- Md. Sabab Zulfiker
  - Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
- Mohammad Shamsul Arefin
  - Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
  - Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, Bangladesh
- Md. Mahfujur Rahman
  - Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
- Ahmed Wasif Reza
  - Department of Computer Science and Engineering, East West University, Dhaka, Bangladesh

16
Das P, Mazumder DH. An extensive survey on the use of supervised machine learning techniques in the past two decades for prediction of drug side effects. Artif Intell Rev 2023; 56:1-28. [PMID: 36819660 PMCID: PMC9930028 DOI: 10.1007/s10462-023-10413-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/01/2023] [Indexed: 02/19/2023]
Abstract
Drugs approved for sale must be effective and safe, implying that the drug's advantages outweigh its known harmful side effects. Side effects (SE) of drugs are one of the common reasons for drug failure that may halt the whole drug discovery pipeline. The side effects might vary from minor concerns like a runny nose to potentially life-threatening issues like liver damage, heart attack, and death. Therefore, predicting the side effects of a drug is vital in drug development, discovery, and design. Supervised machine learning-based side effect prediction has recently received much attention since it reduces time, chemical waste, design complexity, risk of failure, and cost. Supervised learning approaches for predicting side effects have emerged as essential computational tools. Supervised machine learning provides early information on drug side effects, based on drug properties, to help develop an effective drug. Still, there are several challenges to predicting drug side effects. Thus, a near-exhaustive survey is carried out in this paper on the use of supervised machine learning approaches employed in drug side effect prediction tasks in the past two decades. In addition, this paper summarizes the drug descriptors required for the side effect prediction task, commonly utilized sources of drug properties, computational models, and their performances. Finally, the research gaps, open problems, and challenges for further supervised learning-based side effect prediction are discussed.
Affiliation(s)
- Pranab Das
  - Department of Computer Science and Engineering, National Institute of Technology Nagaland, Chumukedima, Dimapur, Nagaland 797103 India
- Dilwar Hussain Mazumder
  - Department of Computer Science and Engineering, National Institute of Technology Nagaland, Chumukedima, Dimapur, Nagaland 797103 India

17
Penn MJ, Donnelly CA. Asymptotic Analysis of Optimal Vaccination Policies. Bull Math Biol 2023; 85:15. [PMID: 36662446 PMCID: PMC9859927 DOI: 10.1007/s11538-022-01114-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Accepted: 12/24/2022] [Indexed: 01/21/2023]
Abstract
Targeted vaccination policies can have a significant impact on the number of infections and deaths in an epidemic. However, optimising such policies is complicated, and the resultant solution may be difficult to explain to policy-makers and to the public. The key novelty of this paper is a derivation of the leading-order optimal vaccination policy under multi-group susceptible-infected-recovered dynamics in two different cases. Firstly, it considers the case of a small vulnerable subgroup in a population and shows that (in the asymptotic limit) it is optimal to vaccinate this group first, regardless of the properties of the other groups. Then, it considers the case of a small vaccine supply and transforms the optimal vaccination problem into a simple knapsack problem by linearising the final size equations. Both of these cases are then explored further through numerical examples, which show that these solutions are also directly useful for realistic parameter values. Moreover, the findings of this paper give some general principles for optimal vaccination policies which will help policy-makers and the public to understand the reasoning behind optimal vaccination programs in more generic cases.
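The small-supply case described in this abstract reduces vaccine allocation to a knapsack problem. A minimal sketch of how such a problem can be solved greedily, assuming the linearised objective gives each group a constant benefit per dose and that doses can be split freely across groups (a continuous knapsack). The function name, group names, sizes, and benefit values are hypothetical and are not taken from the paper:

```python
def allocate_doses(groups, supply):
    """Greedy solution of a continuous knapsack problem.

    groups: list of (name, group_size, benefit_per_dose) tuples (hypothetical values).
    supply: total number of doses available.
    Returns a dict mapping each group name to the doses it receives.
    """
    allocation = {name: 0.0 for name, _, _ in groups}
    remaining = float(supply)
    # Serve groups in decreasing order of benefit per dose.
    for name, size, _benefit in sorted(groups, key=lambda g: g[2], reverse=True):
        if remaining <= 0:
            break
        doses = min(float(size), remaining)  # cannot vaccinate more people than the group holds
        allocation[name] = doses
        remaining -= doses
    return allocation


# Illustrative numbers only.
example_groups = [("elderly", 100, 0.9), ("adults", 500, 0.2), ("children", 300, 0.1)]
example_allocation = allocate_doses(example_groups, supply=250)
```

With these made-up inputs the highest-benefit group is fully covered first and the remaining doses go to the next group in the ranking, which mirrors the "vaccinate the most valuable group first" intuition of a linear objective.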
Affiliation(s)
- Matthew J. Penn
  - Department of Statistics, University of Oxford, St Giles’, Oxford, OX1 3LB UK
- Christl A. Donnelly
  - Department of Statistics, University of Oxford, St Giles’, Oxford, OX1 3LB UK
  - Department of Infectious Disease Epidemiology, Imperial College London, South Kensington Campus, London, SW7 2AZ UK

18
Hasan MM, Islam MU, Sadeq MJ, Fung WK, Uddin J. Review on the Evaluation and Development of Artificial Intelligence for COVID-19 Containment. SENSORS (BASEL, SWITZERLAND) 2023; 23:527. [PMID: 36617124 PMCID: PMC9824505 DOI: 10.3390/s23010527] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 11/28/2022] [Revised: 12/23/2022] [Accepted: 12/29/2022] [Indexed: 06/17/2023]
Abstract
Artificial intelligence has significantly enhanced the research paradigm and spectrum with a substantiated promise of continuous applicability in the real world domain. Artificial intelligence, the driving force of the current technological revolution, has been used in many frontiers, including education, security, gaming, finance, robotics, autonomous systems, entertainment, and most importantly the healthcare sector. With the rise of the COVID-19 pandemic, several prediction and detection methods using artificial intelligence have been employed to understand, forecast, handle, and curtail the ensuing threats. In this study, the most recent related publications, methodologies and medical reports were investigated with the purpose of studying artificial intelligence's role in the pandemic. This study presents a comprehensive review of artificial intelligence with specific attention to machine learning, deep learning, image processing, object detection, image segmentation, and few-shot learning studies that were utilized in several tasks related to COVID-19. In particular, genetic analysis, medical image analysis, clinical data analysis, sound analysis, biomedical data classification, socio-demographic data analysis, anomaly detection, health monitoring, personal protective equipment (PPE) observation, social control, and COVID-19 patients' mortality risk approaches were used in this study to forecast the threatening factors of COVID-19. This study demonstrates that artificial-intelligence-based algorithms integrated into Internet of Things wearable devices were quite effective and efficient in COVID-19 detection and forecasting insights which were actionable through wide usage. The results produced by the study prove that artificial intelligence is a promising arena of research that can be applied for disease prognosis, disease forecasting, drug discovery, and to the development of the healthcare sector on a global scale. We prove that artificial intelligence indeed played a significantly important role in helping to fight against COVID-19, and the insightful knowledge provided here could be extremely beneficial for practitioners and research experts in the healthcare domain to implement the artificial-intelligence-based systems in curbing the next pandemic or healthcare disaster.
Affiliation(s)
- Md. Mahadi Hasan
  - Department of Computer Science and Engineering, Asian University of Bangladesh, Ashulia 1349, Bangladesh
- Muhammad Usama Islam
  - School of Computing and Informatics, University of Louisiana at Lafayette, Lafayette, LA 70504, USA
- Muhammad Jafar Sadeq
  - Department of Computer Science and Engineering, Asian University of Bangladesh, Ashulia 1349, Bangladesh
- Wai-Keung Fung
  - Department of Applied Computing and Engineering, Cardiff School of Technologies, Cardiff Metropolitan University, Cardiff CF5 2YB, UK
- Jasim Uddin
  - Department of Applied Computing and Engineering, Cardiff School of Technologies, Cardiff Metropolitan University, Cardiff CF5 2YB, UK

19
Jeyananthan P. SARS-CoV-2 Diagnosis Using Transcriptome Data: A Machine Learning Approach. SN COMPUTER SCIENCE 2023; 4:218. [PMID: 36844504 PMCID: PMC9936926 DOI: 10.1007/s42979-023-01703-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Accepted: 01/24/2023] [Indexed: 05/02/2023]
Abstract
The SARS-CoV-2 pandemic is a major issue for the whole world right now. The health community is struggling to protect the public and countries from this spread, which resurges from time to time in different waves. Even vaccination does not seem to prevent this spread. Timely and accurate identification of infected people is essential to control the spread. So far, polymerase chain reaction (PCR) and rapid antigen tests are widely used for this identification, despite their drawbacks. False negative cases are the main threat in this scenario. To avoid these problems, this study uses machine learning techniques to build a classification model with higher accuracy to separate COVID-19 cases from non-COVID individuals. Transcriptome data of SARS-CoV-2 patients and controls are used in this stratification, with three different feature selection algorithms and seven classification models. Differentially expressed genes (DEGs) between these two groups are also studied and used in this classification. Results show that mutual information (or DEGs) along with naïve Bayes (or SVM) gives the best accuracy (0.98 ± 0.04) among these methods. Supplementary Information: The online version contains supplementary material available at 10.1007/s42979-023-01703-6.

20
Ibrahim Z, Tulay P, Abdullahi J. Multi-region machine learning-based novel ensemble approaches for predicting COVID-19 pandemic in Africa. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:3621-3643. [PMID: 35948797 PMCID: PMC9365685 DOI: 10.1007/s11356-022-22373-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/09/2022] [Accepted: 07/30/2022] [Indexed: 06/15/2023]
Abstract
Coronavirus disease 2019 (COVID-19) has produced a global pandemic, which has devastating effects on health, the economy and social interactions. Despite the lower contraction and spread of COVID-19 in Africa compared to some other continents, Africa remains amongst the most vulnerable regions due to limited technology and ill-equipped or poor health systems. Recent events showed that COVID-19 may stay for years owing to the discovery of new variants (such as Omicron) and new waves of infection in several countries. Therefore, accurate prediction of new cases is vital for making informed decisions and evaluating the measures that should be implemented. Studies on COVID-19 prediction are limited in Africa despite the risks and dangers that the virus poses. Hence, this study was performed to predict daily COVID-19 cases in 10 African countries spread across north, south, east, west and central Africa, considering countries with both few and large numbers of daily COVID-19 cases. Machine learning (ML) models were employed for this purpose due to their nonlinearity and accurate prediction capabilities, including artificial neural network (ANN), adaptive neuro-fuzzy inference system (ANFIS), support vector machine (SVM) and conventional multiple linear regression (MLR) models. Like any other natural process, the COVID-19 pandemic may contain both linear and nonlinear aspects. In such circumstances, neither nonlinear (ML) nor linear (MLR) models alone may be sufficient; hence, combining both ML and MLR models may produce better accuracy. Consequently, to improve the prediction efficiency of the ML models, novel ensemble approaches including ANN-E and SVM-E were employed. The advantage of using ensemble approaches is that they provide the collective benefits of all the standalone models, thereby reducing their weaknesses and enhancing their prediction capabilities. The obtained results showed that ANFIS led to better prediction performance with MAD = 0.0106, MSE = 0.0003, RMSE = 0.0185 and R2 = 0.9059 in the validation step. The results of the proposed ensemble approaches demonstrated very high improvements in predicting the COVID-19 pandemic in Africa with MAD = 0.0073, MSE = 0.0002, RMSE = 0.0155 and R2 = 0.9616. The ANN-E improved the standalone models' performance in the validation step by up to 10%, 14%, 42%, 6%, 83%, 11%, 7%, 5%, 7% and 31% for Morocco, Sudan, Namibia, South Africa, Uganda, Rwanda, Nigeria, Senegal, Gabon and Cameroon, respectively. The results of this study offer a solid foundation for the application of ensemble approaches to predicting the COVID-19 pandemic across all regions and countries in the world.
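For illustration, the error metrics quoted in this abstract (MAD, MSE, RMSE, R2) can be computed as in the sketch below. It combines standalone model outputs with a plain average, which is a simpler stand-in and not the ANN-E or SVM-E ensembles named in the abstract. MAD is assumed here to mean the mean absolute deviation between observed and predicted values, and all numbers and function names are made up for the example:

```python
import math


def average_ensemble(model_predictions):
    """Combine several models by averaging their predictions point by point."""
    n_models = len(model_predictions)
    return [sum(values) / n_models for values in zip(*model_predictions)]


def regression_metrics(observed, predicted):
    """Return MAD, MSE, RMSE and R2 for paired observed/predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mad = sum(abs(e) for e in errors) / n            # mean absolute deviation
    mse = sum(e * e for e in errors) / n             # mean squared error
    mean_observed = sum(observed) / n
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    ss_residual = sum(e * e for e in errors)
    r2 = 1.0 - ss_residual / ss_total                # coefficient of determination
    return {"MAD": mad, "MSE": mse, "RMSE": math.sqrt(mse), "R2": r2}


# Hypothetical data: model A over-predicts and model B under-predicts by 0.5.
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.5, 2.5, 3.5, 4.5]
model_b = [0.5, 1.5, 2.5, 3.5]
ensemble = average_ensemble([model_a, model_b])

metrics_a = regression_metrics(observed, model_a)
metrics_ensemble = regression_metrics(observed, ensemble)
```

In this contrived case the two models' errors cancel exactly, so the averaged prediction scores better than either model alone. Real standalone models rarely cancel this cleanly, which is one reason learned combiners are used instead of a fixed average.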
Affiliation(s)
- Zurki Ibrahim
  - Department of Medical Genetics, Near East University, Mersin 10, Lefkosa, Turkey
- Pinar Tulay
  - Department of Medical Genetics, Near East University, Mersin 10, Lefkosa, Turkey
- Jazuli Abdullahi
  - Department of Civil Engineering, Faculty of Engineering, Baze University, Abuja, Nigeria.

21
Daramola O, Kavu TD, Kotze MJ, Kamati O, Emjedi Z, Kabaso B, Moser T, Stroetmann K, Fwemba I, Daramola F, Nyirenda M, van Rensburg SJ, Nyasulu PS, Marnewick JL. Detecting the most critical clinical variables of COVID-19 breakthrough infection in vaccinated persons using machine learning. Digit Health 2023; 9:20552076231207593. [PMID: 37936960 PMCID: PMC10627023 DOI: 10.1177/20552076231207593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 09/28/2023] [Indexed: 11/09/2023] Open
Abstract
Background: COVID-19 vaccines offer different levels of immune protection but do not provide 100% protection. Vaccinated persons with pre-existing comorbidities may be at an increased risk of SARS-CoV-2 breakthrough infection or reinfection. The aim of this study is to identify the critical variables associated with a higher probability of SARS-CoV-2 breakthrough infection using machine learning. Methods: A dataset comprising symptoms and feedback from 257 persons, of whom 203 were vaccinated and 54 unvaccinated, was used for the investigation. Three machine learning algorithms - Deep Multilayer Perceptron (Deep MLP), XGBoost, and Logistic Regression - were trained with the original (imbalanced) dataset and the balanced dataset created by using the Random Oversampling Technique (ROT), and the Synthetic Minority Oversampling Technique (SMOTE). We compared the performance of the classification algorithms when the features highly correlated with breakthrough infection were used and when all features in the dataset were used. Results: The results show that when highly correlated features were considered as predictors, with Random Oversampling to address data imbalance, the XGBoost classifier has the best performance (F1 = 0.96; accuracy = 0.96; AUC = 0.98; G-Mean = 0.98; MCC = 0.88). The Deep MLP had the second best performance (F1 = 0.94; accuracy = 0.94; AUC = 0.92; G-Mean = 0.70; MCC = 0.42), while Logistic Regression had less accurate performance (F1 = 0.89; accuracy = 0.88; AUC = 0.89; G-Mean = 0.89; MCC = 0.68). We also used Shapley Additive Explanations (SHAP) to investigate the interpretability of the models. We found that body temperature, total cholesterol, glucose level, blood pressure, waist circumference, body weight, body mass index (BMI), haemoglobin level, and physical activity per week are the most critical variables indicating a higher risk of breakthrough infection. Conclusion: These results, evident from our unique data source derived from apparently healthy volunteers with cardiovascular risk factors, follow the expected pattern of positive or negative correlations previously reported in the literature. This information strengthens the body of knowledge currently applied in public health guidelines and may also be used by medical practitioners in the future to reduce the risk of SARS-CoV-2 breakthrough infection.
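As a rough illustration of two ideas named in this abstract, the sketch below shows random oversampling of minority classes and the G-Mean and MCC scores computed from a binary confusion matrix. It is a standard-library sketch under stated assumptions, not the study's pipeline: the function names, the seed, and the toy labels are hypothetical, and the confusion-matrix counts are invented.

```python
import math
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        pool = [s for s, lab in zip(samples, labels) if lab == cls]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))  # sampling with replacement
            out_labels.append(cls)
    return out_samples, out_labels


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


def mcc(tp, fn, tn, fp):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# Toy imbalanced data: 8 samples of class 1, 2 samples of class 0.
toy_samples = list(range(10))
toy_labels = [1] * 8 + [0] * 2
balanced_samples, balanced_labels = random_oversample(toy_samples, toy_labels)
```

Oversampling is normally applied to the training split only; duplicating rows before the train/test split would let copies of the same record appear on both sides and inflate the scores.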
Affiliation(s)
- Olawande Daramola
  - Department of Information Technology, Faculty of Informatics and Design, Cape Peninsula University of Technology, Cape Town, South Africa
- Tatenda Duncan Kavu
  - Department of Information Technology, Faculty of Informatics and Design, Cape Peninsula University of Technology, Cape Town, South Africa
- Maritha J Kotze
  - Division of Chemical Pathology, Department of Pathology, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - Division of Chemical Pathology, Department of Pathology, National Health Laboratory Service, Tygerberg Hospital, Cape Town, South Africa
- Oiva Kamati
  - Applied Microbial and Health Biotechnology Institute (AMHBI), Cape Peninsula University of Technology, Cape Town, South Africa
  - Department of Biomedical Sciences, Faculty of Health and Wellness Sciences, Cape Peninsula University of Technology, Cape Town, South Africa
- Zaakiyah Emjedi
  - Applied Microbial and Health Biotechnology Institute (AMHBI), Cape Peninsula University of Technology, Cape Town, South Africa
- Boniface Kabaso
  - Department of Information Technology, Faculty of Informatics and Design, Cape Peninsula University of Technology, Cape Town, South Africa
- Thomas Moser
  - St. Pölten University of Applied Sciences, St. Pölten, Austria
- Karl Stroetmann
  - School of Health Information Science, University of Victoria, Victoria, BC, Canada
- Isaac Fwemba
  - Division of Epidemiology and Biostatistics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Fisayo Daramola
  - Division of Epidemiology and Biostatistics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Martha Nyirenda
  - Division of Epidemiology and Biostatistics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Susan J van Rensburg
  - Division of Chemical Pathology, Department of Pathology, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Peter S Nyasulu
  - Division of Epidemiology and Biostatistics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Jeanine L Marnewick
  - Applied Microbial and Health Biotechnology Institute (AMHBI), Cape Peninsula University of Technology, Cape Town, South Africa

22
Cabrero-Holgueras J, Pastrana S. Towards realistic privacy-preserving deep learning over encrypted medical data. Front Cardiovasc Med 2023; 10:1117360. [PMID: 37187785 PMCID: PMC10175772 DOI: 10.3389/fcvm.2023.1117360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 03/31/2023] [Indexed: 05/17/2023] Open
Abstract
Cardiovascular disease represents a substantial fraction of the load on healthcare systems. The invisible nature of these pathologies demands solutions that enable remote monitoring and tracking. Deep Learning (DL) has arisen as a solution in many fields, and in healthcare, multiple successful applications exist for image enhancement and health outside hospitals. However, the computational requirements and the need for large-scale datasets limit DL. Thus, we often offload computation onto server infrastructure, and various Machine-Learning-as-a-Service (MLaaS) platforms emerged from this need. These enable heavy computations to be run in a cloud infrastructure, usually equipped with high-performance computing servers. Unfortunately, technical barriers persist in healthcare ecosystems, since sending sensitive data (e.g., medical records or personally identifiable information) to third-party servers involves privacy and security concerns with legal and ethical implications. In the scope of Deep Learning for Healthcare to improve cardiovascular health, Homomorphic Encryption (HE) is a promising tool to enable secure, private, and legal health outside hospitals. Homomorphic Encryption allows for privacy-preserving computations over encrypted data, thus preserving the privacy of the processed information. Efficient HE requires structural optimizations to perform the complex computation of the internal layers. One such optimization is Packed Homomorphic Encryption (PHE), which encodes multiple elements on a single ciphertext, allowing for efficient Single Instruction over Multiple Data (SIMD) operations. However, using PHE in DL circuits is not straightforward, and it demands new algorithms and data encoding, which existing literature has not adequately addressed. To fill this gap, in this work, we elaborate on novel algorithms to adapt the linear algebra operations of DL layers to PHE. Concretely, we focus on Convolutional Neural Networks. We provide detailed descriptions and insights into the different algorithms and efficient inter-layer data format conversion mechanisms. We formally analyze the complexity of the algorithms in terms of performance metrics and provide guidelines and recommendations for adapting architectures that deal with private data. Furthermore, we confirm the theoretical analysis with practical experimentation. Among other conclusions, we prove that our new algorithms speed up the processing of convolutional layers compared to the existing proposals.
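To make the SIMD constraint of packed encryption more concrete, here is a plain-Python simulation with no encryption involved. It assumes a packed ciphertext behaves like a fixed-length vector of slots that supports only slot-wise addition, slot-wise multiplication, and cyclic rotation, and it computes a dot product with the common rotate-and-sum pattern. This is a generic illustration and not one of the paper's algorithms; all function names are hypothetical.

```python
def rotate(slots, k):
    """Cyclically rotate the slot vector left by k positions."""
    k %= len(slots)
    return slots[k:] + slots[:k]


def slot_add(a, b):
    """Slot-wise addition, standing in for a homomorphic add."""
    return [x + y for x, y in zip(a, b)]


def slot_mul(a, b):
    """Slot-wise multiplication, standing in for a homomorphic multiply."""
    return [x * y for x, y in zip(a, b)]


def packed_dot(a, b):
    """Dot product using only slot-wise operations and rotations.

    Requires equal lengths that are a power of two. After log2(n) rotate-and-add
    steps every slot holds the full dot product.
    """
    n = len(a)
    if n == 0 or n != len(b) or n & (n - 1):
        raise ValueError("inputs must have the same power-of-two length")
    acc = slot_mul(a, b)
    shift = 1
    while shift < n:
        acc = slot_add(acc, rotate(acc, shift))
        shift *= 2
    return acc


dot_slots = packed_dot([1, 2, 3, 4], [5, 6, 7, 8])
```

Individual slots cannot be read or indexed on encrypted data, so sums across slots have to be expressed through rotations like this. That restriction is the reason layer algorithms and data layouts need to be redesigned for packed schemes.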
Affiliation(s)
- José Cabrero-Holgueras
  - Innovation, IT Department, CERN, Geneva, Switzerland
  - Computer Science Department, Universidad Carlos III de Madrid, Madrid, Spain
  - Correspondence: José Cabrero-Holgueras
- Sergio Pastrana
  - Computer Science Department, Universidad Carlos III de Madrid, Madrid, Spain
|
23
|
Owokotomo OE, Manda S, Cleasen J, Kasim A, Sengupta R, Shome R, Subhra Paria S, Reddy T, Shkedy Z. Modeling the positive testing rate of COVID-19 in South Africa using a semi-parametric smoother for binomial data. Front Public Health 2023; 11:979230. [PMID: 36908419 PMCID: PMC9992730 DOI: 10.3389/fpubh.2023.979230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Accepted: 01/13/2023] [Indexed: 02/24/2023] Open
Abstract
Identification and isolation of COVID-19 infected persons play a significant role in the control of the COVID-19 pandemic. A country's COVID-19 positive testing rate is useful in understanding and monitoring disease transmission and spread for the planning of intervention policy. Using publicly available data collected between March 5th, 2020 and May 31st, 2021, we proposed to estimate both the positive testing rate and its daily rate of change in South Africa with a flexible semi-parametric smoothing model for discrete data. There was a gradual increase in the positive testing rate up to a first peak in July 2020, then a decrease before another peak between mid-December 2020 and mid-January 2021. The proposed semi-parametric smoothing model provides data-driven estimates for both the positive testing rate and its change. We provide an online R dashboard that can be used to estimate the positive rate in any country of interest based on publicly available data. We believe this is a useful tool for both researchers and policymakers for planning intervention and understanding the COVID-19 spread.
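To make the two estimated quantities concrete, here is a minimal sketch that computes a smoothed positive testing rate and its day-to-day change from daily counts. It swaps in a simple pooled moving-window estimate for the paper's semi-parametric smoother, so it only illustrates the idea; the function names and the counts are hypothetical.

```python
# A pooled moving-window estimate of the positive testing rate.
# This is NOT the semi-parametric model from the abstract; it is a
# simple stand-in, and the daily counts below are invented.

def smoothed_positive_rate(positives, tests, half_window=1):
    """Pool positives and tests over a centred window and return their ratio."""
    n = len(tests)
    rates = []
    for t in range(n):
        lo = max(0, t - half_window)
        hi = min(n, t + half_window + 1)
        pos = sum(positives[lo:hi])
        tot = sum(tests[lo:hi])
        rates.append(pos / tot if tot else float("nan"))
    return rates

def daily_change(rates):
    """First differences of the smoothed rate (change per day)."""
    return [b - a for a, b in zip(rates, rates[1:])]

positives = [10, 20, 30, 20]     # hypothetical daily positive tests
tests = [100, 100, 100, 100]     # hypothetical daily tests performed
rates = smoothed_positive_rate(positives, tests)
changes = daily_change(rates)
```

Pooling counts inside the window, rather than averaging daily ratios, weights each day by the number of tests performed, which suits binomial data with uneven daily testing volume.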
Affiliation(s)
- Samuel Manda
  - Department of Statistics, University of Pretoria, Pretoria, South Africa
- Jürgen Cleasen
  - Center for Statistics, Data Science Institute, I-BioStat, Hasselt University, Hasselt, Belgium
- Adetayo Kasim
  - Department of Anthropology, Durham Research Methods Centre, Durham University, Durham, United Kingdom
- Rudradev Sengupta
  - The Janssen Pharmaceutical Companies of Johnson & Johnson, Beerse, Belgium
- Rahul Shome
  - Department of Computer Science, Rice University, Houston, TX, United States
- Soumya Subhra Paria
  - School of Mathematics and Statistics, The Open University, Milton Keynes, United Kingdom
- Tarylee Reddy
  - Biostatistics Research Unit, South African Medical Research Council, Cape Town, South Africa
- Ziv Shkedy
  - Center for Statistics, Data Science Institute, I-BioStat, Hasselt University, Hasselt, Belgium
|
24
|
Subash Chandra Bose S, Vinoth Kumar A, Premkumar A, Deepika M, Gokilavani M. Biserial targeted feature projection based radial kernel regressive deep belief neural learning for covid-19 prediction. Soft Comput 2023; 27:1651-1662. [PMID: 35378723 PMCID: PMC8968782 DOI: 10.1007/s00500-022-06943-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/21/2022] [Indexed: 01/31/2023]
Abstract
Coronavirus disease 2019 (COVID-19) is a highly infectious viral disease caused by the novel SARS-CoV-2 virus. Different prediction techniques have been developed to predict the presence of coronavirus disease in patients. However, prediction accuracy was not improved and time consumption was not minimized. To address these problems, a novel technique called Biserial Targeted Feature Projection-based Radial Kernel Regressive Deep Belief Neural Learning (BTFP-RKRDBNL) is introduced to perform accurate disease prediction with less time consumption. The BTFP-RKRDBNL technique performs disease prediction with the help of different layers: two visible layers, namely the input and output layers, and two hidden layers. Initially, the features and data are collected from the dataset and transmitted to the input layer. The Point Biserial Correlative Target feature projection is used to select relevant features, and irrelevant features are removed to minimize the disease prediction time. Then the relevant features are sent to hidden layer 2. Next, Radial Kernel Regression is applied to analyze the training features and testing disease features to identify the disease with higher accuracy and a lower false positive rate. Experimental analysis is carried out to measure the prediction accuracy, sensitivity, specificity, and prediction time for different numbers of patients. The results illustrate that the method increases the prediction accuracy, sensitivity, and specificity by 10%, 6%, and 21%, respectively, and reduces the prediction time by 10% compared to state-of-the-art works.
Affiliation(s)
- S. Subash Chandra Bose
  - Department of Information Technology, Guru Nanak College, Velachery, Chennai, Tamil Nadu, India
- A. Vinoth Kumar
  - Department of Electronics and Communication Engineering, Dr MGR Educational and Research Institute, Chennai, Tamil Nadu, India
- Anitha Premkumar
  - Department of Computer Science, Presidency University, Bangalore 560064, India
- M. Deepika
  - Computer Science and Engineering, ASIET, Kalady, Kerala, India
- M. Gokilavani
  - Computer Science and Engineering, KL University, Guntur, Andhra Pradesh, India
|
25
|
Erol G, Uzbaş B, Yücelbaş C, Yücelbaş Ş. Analyzing the effect of data preprocessing techniques using machine learning algorithms on the diagnosis of COVID-19. Concurrency and Computation: Practice & Experience 2022; 34:e7393. [PMID: 36714180 PMCID: PMC9874401 DOI: 10.1002/cpe.7393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2022] [Revised: 07/18/2022] [Accepted: 09/08/2022] [Indexed: 06/18/2023]
Abstract
Real-time polymerase chain reaction (RT-PCR), known as the swab test, is a diagnostic test that can diagnose COVID-19 disease through respiratory samples in the laboratory. Due to the rapid spread of the coronavirus around the world, the RT-PCR test has become insufficient to get fast results. For this reason, the need for diagnostic methods to fill this gap has arisen and machine learning studies have started in this area. On the other hand, studying medical data is challenging because the data are inconsistent, incomplete, difficult to scale, and very large. Additionally, some poor clinical decisions, irrelevant parameters, and limited medical data adversely affect the accuracy of the studies performed. Therefore, since datasets containing COVID-19 blood parameters are fewer in number than other medical datasets today, this study aims to improve these existing datasets. To obtain more consistent results in COVID-19 machine learning studies, the effect of data preprocessing techniques on the classification of COVID-19 data was investigated. First, categorical feature encoding and feature scaling were applied to a dataset with 15 features that contains blood data of 279 patients, including gender and age information. Then, missing values in the dataset were imputed using both the K-nearest neighbor (KNN) algorithm and multiple imputation by chained equations (MICE). Data were balanced with the synthetic minority oversampling technique (SMOTE). The effect of data preprocessing techniques on the ensemble learning algorithms bagging, AdaBoost, and random forest, and on the popular classifiers KNN, support vector machine, logistic regression, artificial neural network, and decision tree, was analyzed.
The highest accuracies obtained with the bagging classifier were 83.42% and 83.74% with KNN and MICE imputations by applying SMOTE, respectively. On the other hand, the highest accuracy reached with the same classifier without SMOTE was 83.91% for the KNN imputation. In conclusion, certain data preprocessing techniques are compared, their effect on classification success is presented, and the importance of the right combination of data preprocessing steps is demonstrated experimentally.
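The balancing step mentioned in this abstract can be pictured with a minimal SMOTE-style sketch: each synthetic minority sample is an interpolation between a minority point and one of its nearest minority neighbours. This is a simplified illustration rather than the study's pipeline or a library implementation; the function name `smote_like`, its parameters, and the toy data are assumptions.

```python
import random

# Simplified SMOTE-style oversampling: interpolate between a minority sample
# and one of its k nearest minority neighbours. Illustrative only; the
# function name, parameters, and toy data are assumptions.

def smote_like(minority, n_new, k=2, seed=0):
    """Return n_new synthetic samples (needs at least two minority samples)."""
    rng = random.Random(seed)

    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, excluding x itself
        neighbours = sorted(
            (p for p in minority if p is not x), key=lambda p: dist2(x, p)
        )[:k]
        nb = rng.choice(neighbours)
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + u * (b - a) for a, b in zip(x, nb)])
    return synthetic

minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # hypothetical minority class
synthetic = smote_like(minority, n_new=5)
```

Because every synthetic point lies on a segment between two real minority samples, oversampling of this kind should be applied to the training split only, so that no interpolated information leaks into the evaluation data.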
Affiliation(s)
- Gizemnur Erol
  - Konya Technical University, Software Engineering Department, Konya, Turkey
- Betül Uzbaş
  - Konya Technical University, Computer Engineering Department, Konya, Turkey
- Cüneyt Yücelbaş
  - Tarsus University, Electronics and Automation Department, Mersin-Tarsus OIZ Vocational School of Technical Sciences, Mersin, Turkey
- Şule Yücelbaş
  - Tarsus University, Computer Engineering Department, Mersin, Turkey
|
26
|
Muhammad LJ, Haruna AA, Sharif US, Mohammed MB. CNN-LSTM deep learning based forecasting model for COVID-19 infection cases in Nigeria, South Africa and Botswana. Health and Technology 2022; 12:1259-1276. [PMCID: PMC9663291 DOI: 10.1007/s12553-022-00711-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2022] [Accepted: 10/24/2022] [Indexed: 11/17/2022]
Abstract
Background The COVID-19 pandemic has plunged the global community, especially African countries, into an alarmingly difficult situation, culminating in numerous catastrophes such as economic recession, political instability and loss of jobs. The pandemic spreads exponentially and causes loss of lives. Following the outbreak of the omicron variant of concern, forecasting and identification of COVID-19 infection cases is vital for government at various levels. With knowledge of the spread at a particular point in time, government at various levels can take swift action to formulate new policies and modalities that minimize the consequences of the COVID-19 pandemic for both the public health and economic sectors. Methods A combination of the Convolutional Neural Network (CNN) learning algorithm and the Long Short Term Memory (LSTM) learning algorithm is proposed in this work to produce a hybrid deep learning algorithm, Convolutional Neural Network - Long Short Term Memory (CNN-LSTM), for forecasting COVID-19 infection cases in Nigeria, South Africa and Botswana. Forecasting models for COVID-19 infection cases in Nigeria, South Africa and Botswana were developed for 10 days using three deep learning-based approaches, namely CNN, LSTM and CNN-LSTM. Results The models were evaluated on the basis of four standard performance evaluation metrics: accuracy, MSE, MAE and RMSE. The CNN-LSTM deep learning-based forecasting model achieved the best accuracy of 98.30%, 97.60%, and 97.74% for Nigeria, South Africa and Botswana, respectively, and likewise achieved lower MSE, MAE and RMSE values than the models developed with CNN and LSTM.
Conclusions Taken together, the CNN-LSTM deep learning-based forecasting model for COVID-19 infection cases in Nigeria, South Africa and Botswana surpasses the two other DL-based forecasting models (CNN and LSTM), not only in terms of the best accuracy (98.30%, 97.60%, and 97.74%) but also in terms of lower MSE, MAE and RMSE.
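For readers unfamiliar with the error metrics used to compare the forecasting models, the following minimal sketch computes MSE, MAE and RMSE for a short forecast. The case counts are invented for illustration and do not come from the study; the abstract's accuracy metric is not reproduced because its definition is not given here.

```python
import math

# MSE, MAE and RMSE for a forecast. The case counts are hypothetical and
# the function name is an assumption made for this example.

def forecast_errors(actual, predicted):
    """Return the three error metrics as a dictionary."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"MSE": mse, "MAE": mae, "RMSE": math.sqrt(mse)}

actual = [100, 120, 130, 150]      # hypothetical observed daily cases
predicted = [110, 115, 130, 140]   # hypothetical model forecasts
metrics = forecast_errors(actual, predicted)
```

RMSE is expressed in the same unit as the case counts, while MSE penalises large misses more heavily than MAE, which is why the three are usually reported together.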
Affiliation(s)
- L. J. Muhammad
  - Department of Computer Science, Federal University of Kashere, P.M.B. 0182, Gombe State, Nigeria
- Ahmed Abba Haruna
  - Department of Computer Science, University of Hafr Al Batin, Al Jamiah, Hafr Al Batin, Saudi Arabia
- Usman Sani Sharif
  - Department of Biological Sciences, Faculty of Science, Federal University of Kashere, P.M.B. 0182, Gombe, Nigeria
- Mohammed Bappah Mohammed
  - Department of Mathematics, Faculty of Science, Federal University of Kashere, P.M.B. 0182, Gombe, Nigeria
|
27
|
Moradi Khaniabadi P, Bouchareb Y, Al-Dhuhli H, Shiri I, Al-Kindi F, Moradi Khaniabadi B, Zaidi H, Rahmim A. Two-step machine learning to diagnose and predict involvement of lungs in COVID-19 and pneumonia using CT radiomics. Comput Biol Med 2022; 150:106165. [PMID: 36215849 PMCID: PMC9533634 DOI: 10.1016/j.compbiomed.2022.106165] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/05/2022] [Revised: 09/18/2022] [Accepted: 10/01/2022] [Indexed: 11/26/2022]
Abstract
OBJECTIVE To develop a two-step machine learning (ML) based model to diagnose and predict involvement of lungs in COVID-19 and non-COVID-19 pneumonia patients using CT chest radiomic features. METHODS Three hundred CT scans (3 classes: 100 COVID-19, 100 pneumonia, and 100 healthy subjects) were enrolled in this study. The diagnostic task was a 3-class classification. The severity prediction score for COVID-19 and pneumonia was graded as mild (0-25%), moderate (26-50%), and severe (>50%). Whole lungs were segmented utilizing deep learning-based segmentation. Altogether, 107 features including shape, first-order histogram, and second- and high-order texture features were extracted. The Pearson correlation coefficient (PCC≥90%) followed by different feature selection algorithms was employed. ML-based supervised algorithms (Naïve Bayes, Support Vector Machine, Bagging, Random Forest, K-nearest neighbors, Decision Tree and Ensemble Meta voting) were utilized. The optimal model was selected based on precision, recall and area-under-curve (AUC) by randomizing the training/validation, followed by testing using the test set. RESULTS Nine pertinent features (2 shape, 1 first-order, and 6 second-order) were obtained after feature selection for both phases. In the diagnostic task, the performance of 3-class classification using Random Forest was 0.909±0.026, 0.907±0.056, 0.902±0.044, 0.939±0.031, and 0.982±0.010 for precision, recall, F1-score, accuracy, and AUC, respectively. The severity prediction task using Random Forest achieved 0.868±0.123 precision, 0.865±0.121 recall, 0.853±0.139 F1-score, 0.934±0.024 accuracy, and 0.969±0.022 AUC. CONCLUSION The two-phase ML-based model accurately classified COVID-19 and pneumonia patients using CT radiomics, and adequately predicted severity of lung involvement. This two-step model showed great potential in assessing COVID-19 CT images towards improved management of patients.
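A correlation-based redundancy filter of the kind named in the methods can be sketched as follows. The abstract only states that a Pearson threshold of 90% was used; the greedy keep-or-drop rule, the function names, and the feature values below are assumptions for illustration, not the authors' procedure.

```python
import math

# Greedy Pearson-correlation filter: keep a feature only if its absolute
# correlation with every already-kept feature is below the threshold.
# The rule, names, and values are assumptions; features are assumed to be
# non-constant so the standard deviations are non-zero.

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(features, threshold=0.90):
    """Return the names of features kept after removing redundant ones."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

features = {
    "volume": [1.0, 2.0, 3.0, 4.0],    # hypothetical shape feature
    "surface": [2.1, 3.9, 6.2, 8.0],   # nearly proportional to volume
    "entropy": [0.5, 0.1, 0.4, 0.2],   # hypothetical texture feature
}
selected = drop_correlated(features)
```

In this toy input, "surface" is dropped because it is almost perfectly correlated with "volume", while "entropy" is kept. Note that the result of a greedy filter depends on the order in which features are visited.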
Affiliation(s)
- Pegah Moradi Khaniabadi
  - Department of Radiology and Molecular Imaging, College of Medicine and Health Sciences, Sultan Qaboos University, PO. Box 35, PC123, Al Khoud, Muscat, Oman
- Yassine Bouchareb
  - Department of Radiology and Molecular Imaging, College of Medicine and Health Sciences, Sultan Qaboos University, PO. Box 35, PC123, Al Khoud, Muscat, Oman
- Humoud Al-Dhuhli
  - Department of Radiology and Molecular Imaging, College of Medicine and Health Sciences, Sultan Qaboos University, PO. Box 35, PC123, Al Khoud, Muscat, Oman
- Isaac Shiri
  - Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, CH-1211 Geneva 4, Switzerland
- Bita Moradi Khaniabadi
  - Child Growth and Development Research Center, Research Institute for Primordial Prevention of Non-communicable Disease, Isfahan University of Medical Sciences, Isfahan, Iran
- Habib Zaidi
  - Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, CH-1211 Geneva 4, Switzerland; Geneva University Neurocenter, Geneva University, Geneva, Switzerland; Department of Nuclear Medicine and Molecular Imaging, University of Groningen, University Medical Center Groningen, Groningen, Netherlands; Department of Nuclear Medicine, University of Southern Denmark, Odense, Denmark
- Arman Rahmim
  - Departments of Radiology and Physics, University of British Columbia, Vancouver, BC, Canada; Department of Integrative Oncology, BC Cancer Research Institute, Vancouver, BC, Canada
|
28
|
Johnson DP, Lulla V. Predicting COVID-19 community infection relative risk with a Dynamic Bayesian Network. Front Public Health 2022; 10:876691. [PMID: 36388264 PMCID: PMC9650227 DOI: 10.3389/fpubh.2022.876691] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/15/2022] [Accepted: 10/10/2022] [Indexed: 01/21/2023] Open
Abstract
As COVID-19 continues to impact the United States and the world at large, it is becoming increasingly necessary to develop methods that predict local-scale spread of the disease. This is especially important as newer variants of the virus are likely to emerge and threaten community spread. We develop a Dynamic Bayesian Network (DBN) to predict community-level relative risk of COVID-19 infection at the census tract scale in the U.S. state of Indiana. The model incorporates measures of social and environmental vulnerability, including environmental determinants of COVID-19 infection, into a spatial-temporal prediction of infection relative risk 1 month into the future. The DBN significantly outperforms five other modeling techniques that were used for comparison and are typically applied in spatial epidemiological applications. The logic behind the DBN also makes it very well-suited for spatial-temporal prediction and for "what-if" analysis. The research results also highlight the need for further research using DBN-type approaches that incorporate methods of artificial intelligence into modeling dynamic processes, especially those prominent within spatial epidemiologic applications.
Affiliation(s)
- Daniel P. Johnson
  - Department of Geography, Indiana University – Purdue University at Indianapolis, Indianapolis, IN, United States
  - Correspondence: Daniel P. Johnson
- Vijay Lulla
  - Center for Complex Networks and Systems Research, Indiana University, Bloomington, IN, United States
|
29
|
Peng HY, Lin YK, Nguyen PA, Hsu JC, Chou CL, Chang CC, Lin CC, Lam C, Chen CI, Wang KH, Lu CY. Determinants of coronavirus disease 2019 infection by artificial intelligence technology: A study of 28 countries. PLoS One 2022; 17:e0272546. [PMID: 36018862 PMCID: PMC9417026 DOI: 10.1371/journal.pone.0272546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2021] [Accepted: 07/05/2022] [Indexed: 12/02/2022] Open
Abstract
Objectives The coronavirus disease 2019 pandemic has affected countries around the world since 2020, and an increasing number of people are being infected. The purpose of this research was to use big data and artificial intelligence technology to find key factors associated with coronavirus disease 2019 infection. The results can be used as a reference for disease prevention in practice. Methods This study obtained data from the "Imperial College London YouGov Covid-19 Behaviour Tracker Open Data Hub", covering a total of 291,780 questionnaire results from 28 countries (April 1 to August 31, 2020). Data included basic characteristics, lifestyle habits, disease history, and symptoms of each subject. Four types of machine learning classification models were used, including logistic regression, random forest, support vector machine, and artificial neural network, to build prediction modules. The performance of each module is presented as the area under the receiver operating characteristic curve. Then, this study further processed important factors selected by each module to obtain an overall ranking of determinants. Results This study found that the areas under the receiver operating characteristic curve of the prediction modules established by the four machine learning methods were all >0.95, and the random forest had the highest performance (area under the receiver operating characteristic curve of 0.988). The top ten factors associated with coronavirus disease 2019 infection were identified in order of importance: whether the family had been tested, having no symptoms, loss of smell, loss of taste, a history of epilepsy, acquired immune deficiency syndrome, cystic fibrosis, sleeping alone, country, and the number of times leaving home in a day. Conclusions This study used big data from 28 countries and artificial intelligence methods to determine the predictors of coronavirus disease 2019 infection.
The findings provide important insights for coronavirus disease 2019 infection prevention strategies.
Affiliation(s)
- Hsiao-Ya Peng
  - International PhD Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
- Yen-Kuang Lin
  - Biostatistics Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
- Phung-Anh Nguyen
  - Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
  - Research Center of Health Care Industry Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
  - Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
  - Department of Healthcare Information & Management, Ming Chuan University, Taoyuan, Taiwan
- Jason C. Hsu
  - International PhD Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
  - Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
  - Research Center of Health Care Industry Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
  - Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
  - * E-mail:
- Chun-Liang Chou
  - Department of Thoracic Medicine, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
- Chih-Cheng Chang
  - Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
- Chia-Chi Lin
  - International PhD Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
- Carlos Lam
  - Emergency Department, Department of Emergency and Critical Care Medicine, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan
  - Department of Emergency, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
  - Graduate Institute of Injury Prevention and Control, College of Public Health, Taipei Medical University, Taipei, Taiwan
- Chang-I Chen
  - Department of Healthcare Administration, School of Management, Taipei Medical University, Taipei, Taiwan
- Kai-Hsun Wang
  - Graduate Institute of Business Administration, College of Management, Fu Jen Catholic University, New Taipei City, Taiwan
- Christine Y. Lu
  - Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, United States of America
|
30
|
Zhang W, Liu S, Osgood N, Zhu H, Qian Y, Jia P. Using simulation modelling and systems science to help contain COVID-19: A systematic review. Systems Research and Behavioral Science 2022; 40:SRES2897. [PMID: 36245570 PMCID: PMC9538520 DOI: 10.1002/sres.2897] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/27/2021] [Revised: 05/23/2022] [Accepted: 08/03/2022] [Indexed: 06/16/2023]
Abstract
This study systematically reviews applications of three simulation approaches, that is, system dynamics model (SDM), agent-based model (ABM) and discrete event simulation (DES), and their hybrids in COVID-19 research and identifies theoretical and application innovations in public health. Among the 372 eligible papers, 72 focused on COVID-19 transmission dynamics, 204 evaluated both pharmaceutical and non-pharmaceutical interventions, 29 focused on the prediction of the pandemic and 67 investigated the impacts of COVID-19. ABM was used in 275 papers, followed by 54 SDM papers, 32 DES papers and 11 hybrid model papers. Evaluation and design of intervention scenarios are the most widely addressed area accounting for 55% of the four main categories, that is, the transmission of COVID-19, prediction of the pandemic, evaluation and design of intervention scenarios and societal impact assessment. The complexities in impact evaluation and intervention design demand hybrid simulation models that can simultaneously capture micro and macro aspects of the socio-economic systems involved.
Affiliation(s)
- Weiwei Zhang
  - Research Institute of Economics and Management, Southwestern University of Finance and Economics, Chengdu, China
- Shiyong Liu
  - Institute of Advanced Studies in Humanities and Social Sciences, Beijing Normal University at Zhuhai, Zhuhai, China
- Nathaniel Osgood
  - Department of Computer Science, University of Saskatchewan, Saskatoon, Canada
  - Department of Community Health and Epidemiology, University of Saskatchewan, Saskatoon, Canada
- Hongli Zhu
  - Research Institute of Economics and Management, Southwestern University of Finance and Economics, Chengdu, China
- Ying Qian
  - Business School, University of Shanghai for Science and Technology, Shanghai, China
- Peng Jia
  - School of Resource and Environmental Sciences, Wuhan University, Wuhan, Hubei, China
  - International Institute of Spatial Lifecourse Health, Wuhan University, Wuhan, Hubei, China
|
31
|
Symptom-Based COVID-19 Prognosis through AI-Based IoT: A Bioinformatics Approach. BioMed Research International 2022; 2022:3113119. [PMID: 35915793 PMCID: PMC9338856 DOI: 10.1155/2022/3113119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Accepted: 06/17/2022] [Indexed: 11/29/2022]
Abstract
Objective The Internet of Things (IoT) integrates several technologies in which devices learn from each other's experience, thereby reducing errors likely to arise from human intervention. Modern technologies like IoT and machine learning enable the transition from a conventional to a patient-specific approach in healthcare. In the conventional approach, the biggest challenges faced by healthcare professionals are predicting a disease by observing the symptoms, monitoring patients in remote areas, and attending to the patient at all times after hospitalisation. IoT provides real-time data, makes decision-making smarter, and provides far superior analytics, all of which help improve the quality of healthcare. The main objective of the work was to create an IoT-based automated system using machine learning models for symptom-based COVID-19 prognosis. Methods A comparative analysis of predictive microbiology of COVID-19 from case symptoms using various machine learning classifiers, namely logistic regression, k-nearest neighbor, support vector machine, random forest, decision trees, Naïve Bayes, and gradient boosting, is reported here. For validation and verification of the models, the performance of each model on the retrieved cloud-stored data was measured in terms of accuracy. Results From the accuracy plot, it was concluded that k-NN was the most accurate (97.97%), followed by decision tree (97.79%), support vector machine (97.42%), logistic regression (96.50%), random forest (90.66%), gradient boosting classifier (87.77%), and Naïve Bayes (73.50%) in COVID-19 prognosis. Conclusion The paper presents a health monitoring IoT framework with high clinical significance in real-time and remote healthcare monitoring. The findings reported here and the lessons learnt should enable healthcare systems worldwide to counter not only the ongoing COVID-19 pandemic but also other global pandemics that humanity may face in the future.
|
32
|
Ragab M, Choudhry H, H. Asseri A, Binyamin SS, Al-Rabia MW. Enhanced Gravitational Search Optimization with Hybrid Deep Learning Model for COVID-19 Diagnosis on Epidemiology Data. Healthcare (Basel) 2022; 10:healthcare10071339. [PMID: 35885865 PMCID: PMC9317045 DOI: 10.3390/healthcare10071339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Revised: 07/13/2022] [Accepted: 07/14/2022] [Indexed: 11/16/2022] Open
Abstract
Effective screening provides efficient and quick diagnoses of COVID-19 and could alleviate related problems in the health care system. A prediction model that combines multiple features to assess contamination risks was established in the hope of supporting healthcare workers worldwide in triaging patients, particularly in situations with limited health care resources. Furthermore, a lack of diagnosis kits and asymptomatic cases can lead to missed or delayed diagnoses, exposing visitors, medical staff, and patients to 2019-nCoV contamination. Non-clinical techniques including data mining, expert systems, machine learning, and other artificial intelligence technologies have a crucial role to play in containment and diagnosis in the COVID-19 outbreak. This study developed Enhanced Gravitational Search Optimization with a Hybrid Deep Learning Model (EGSO-HDLM) for COVID-19 diagnoses using epidemiology data. The major aim of designing the EGSO-HDLM model was the identification and classification of COVID-19 using epidemiology data. To examine the epidemiology data, the EGSO-HDLM model employed a hybrid convolutional neural network with a gated recurrent unit based fusion (HCNN-GRUF) model. In addition, the hyperparameter optimization of the HCNN-GRUF model was improved by the use of the EGSO algorithm, which was derived by combining the concept of the cat map with the traditional GSO algorithm. The design of the EGSO algorithm helps in reducing the ergodic problem, avoiding premature convergence, and enhancing algorithm efficiency. To demonstrate the better performance of the EGSO-HDLM model, experimental validation on a benchmark dataset was performed. The simulation results confirmed the enhanced performance of the EGSO-HDLM model over recent approaches.
Affiliation(s)
- Mahmoud Ragab
  - Information Technology Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
  - Centre for Artificial Intelligence in Precision Medicines, King Abdulaziz University, Jeddah 21589, Saudi Arabia
  - Mathematics Department, Faculty of Science, Al-Azhar University, Nasr City, Cairo 11884, Egypt
  - Correspondence:
- Hani Choudhry
  - Centre for Artificial Intelligence in Precision Medicines, King Abdulaziz University, Jeddah 21589, Saudi Arabia
  - Biochemistry Department, Faculty of Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Amer H. Asseri
  - Centre for Artificial Intelligence in Precision Medicines, King Abdulaziz University, Jeddah 21589, Saudi Arabia
  - Biochemistry Department, Faculty of Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Sami Saeed Binyamin
  - Computer and Information Technology Department, The Applied College, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Mohammed W. Al-Rabia
  - Department of Medical Microbiology and Parasitology, Faculty of Medicine, King Abdulaziz University, Jeddah 21589, Saudi Arabia
  - Health Promotion Center, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| |
Collapse
|
33
|
Bahache AN, Chikouche N, Mezrag F. Authentication Schemes for Healthcare Applications Using Wireless Medical Sensor Networks: A Survey. SN COMPUTER SCIENCE 2022; 3:382. [PMID: 35873706 PMCID: PMC9289661 DOI: 10.1007/s42979-022-01300-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2022] [Accepted: 07/05/2022] [Indexed: 11/26/2022]
Abstract
Many applications have been developed with the rapid emergence of the Internet of Things (IoT) and wireless sensor networks (WSNs) in the health sector. Healthcare applications that use wireless medical sensor networks (WMSNs) provide effective communication solutions for improving people's lives. WMSNs rely on highly sensitive, resource-constrained devices, called sensors, that sense patients’ vital signs and then send them through open channels via gateways to specialists. However, without data security, the data transmitted from WMSNs can be manipulated by adversaries, with serious consequences. In light of this, efficient security solutions and authentication schemes are needed. Recently, researchers have focused heavily on authentication for WMSNs, and many schemes have been proposed to meet privacy and security requirements. These schemes face many security and performance issues because of the constrained devices used. This paper presents a new classification of authentication schemes in WMSNs based on their architecture; as far as we know, it is the first of its kind. It also provides a comprehensive study of the existing authentication schemes in terms of security and performance. The performance evaluation is based on experimental results. Moreover, it identifies some future research directions and recommendations for designing authentication schemes in WMSNs.
Affiliation(s)
- Anwar Noureddine Bahache
  - Department of Computer Science, University of M’sila, BP. 166 Ichebilia 28000 M’sila, Algeria
  - Laboratoire d’Analyse des Signaux et Systèmes, Université Mohamed Boudiaf - M’Sila, M’sila, Algeria
- Noureddine Chikouche
  - Laboratory of Informatics and its Applications of M’sila, University of M’sila, BP. 166 Ichebilia 28000 M’sila, Algeria
- Fares Mezrag
  - Laboratory of Informatics and its Applications of M’sila, University of M’sila, BP. 166 Ichebilia 28000 M’sila, Algeria
|
34
|
Supervised Learning Models for the Preliminary Detection of COVID-19 in Patients Using Demographic and Epidemiological Parameters. INFORMATION 2022. [DOI: 10.3390/info13070330] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/16/2022]
Abstract
The World Health Organization labelled the new COVID-19 outbreak a public health crisis of worldwide concern on 30 January 2020, and it was named the new global pandemic in March 2020. It has had catastrophic consequences for the world economy and the well-being of people and has put a tremendous strain on already-stretched healthcare systems globally, particularly in underdeveloped countries. Over 11 billion vaccine doses have already been administered worldwide, and the benefits of these vaccinations will take some time to appear. Today, the only practical approach to diagnosing COVID-19 is through the RT-PCR and RAT tests, which have sometimes been known to give unreliable results. Timely diagnosis and implementation of precautionary measures will likely improve survival outcomes and decrease fatality rates. In this study, we propose an innovative way to predict COVID-19 with the help of alternative non-clinical methods, such as supervised machine learning models, to identify the patients at risk based on their characteristic parameters and underlying comorbidities. Medical records of patients from Mexico admitted between 23 January 2020 and 26 March 2022 were chosen for this purpose. Among several supervised machine learning approaches tested, the XGBoost model achieved the best results with an accuracy of 92%. It is an easy, non-invasive, inexpensive, instant and accurate way of forecasting those at risk of contracting the virus. However, it is too early to conclude that this method can be used as an alternative in the clinical diagnosis of coronavirus cases.
|
35
|
Mannepalli DP, Namdeo V. An effective detection of COVID-19 using adaptive dual-stage horse herd bidirectional long short-term memory framework. INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY 2022; 32:1049-1067. [PMID: 35937036 PMCID: PMC9347606 DOI: 10.1002/ima.22747] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/17/2021] [Revised: 04/05/2022] [Accepted: 04/22/2022] [Indexed: 05/08/2023]
Abstract
COVID-19 is a rapidly spreading, severe viral disease that affects human beings as well as animals. The increasing number of infections and deaths due to COVID-19 calls for timely detection. This work presents an innovative deep learning methodology for detecting COVID-19 in patients from chest x-ray images. Chest x-ray is the most effective imaging technique for predicting lung-associated diseases. An adaptive dual-stage horse herd bidirectional LSTM model is presented for the classification of images into normal, lung opacity, viral pneumonia, and COVID-19. Initially, the input images are preprocessed using a modified histogram equalization approach, which improves the contrast of the images by changing low-resolution images into high-resolution images. Subsequently, an extended dual tree complex wavelet with trigonometric transform is introduced to extract the high-density features and decrease the complexity of the features. Moreover, adaptive beetle antennae search optimization is used to reduce the dimensionality of the features. This approach enhances the performance of disease classification by reducing the computational complexity. Finally, the adaptive dual-stage horse herd bidirectional LSTM model is used to classify the images into normal, viral pneumonia, lung opacity, and COVID-19. The work is implemented in Python. The performance of the presented approach is demonstrated by comparison with existing approaches in accuracy (99.07%), sensitivity (97.6%), F-measure (97.1%), specificity (99.36%), kappa coefficient (97.7%), precision (98.56%), and area under the receiver operating characteristic curve (99%) for the COVID-19 chest x-ray database.
Affiliation(s)
- Durga Prasad Mannepalli
  - Department of Computer Science and Engineering, Sarvepalli Radhakrishna University, Bhopal, Madhya Pradesh, India
- Varsha Namdeo
  - Department of Computer Science and Engineering, Sarvepalli Radhakrishna University, Bhopal, Madhya Pradesh, India
|
36
|
Fang L, Liang X. ISW-LM: An intensive symptom weight learning mechanism for early COVID-19 diagnosis. Comput Biol Med 2022; 146:105615. [PMID: 35605484 PMCID: PMC9112616 DOI: 10.1016/j.compbiomed.2022.105615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Revised: 05/09/2022] [Accepted: 05/11/2022] [Indexed: 12/16/2022]
Abstract
The novel coronavirus disease 2019 (COVID-19) pandemic has severely impacted the world. The early diagnosis of COVID-19 and self-isolation can help curb the spread of the virus. Besides, a simple and accurate diagnostic method can help in making rapid decisions for the treatment and isolation of patients. The analysis of patient characteristics, case trajectory, comorbidities, symptoms, diagnosis, and outcomes will be performed in the model. In this paper, a symptom-based machine learning (ML) model with a new learning mechanism called Intensive Symptom Weight Learning Mechanism (ISW-LM) is proposed. The proposed model designs three new symptoms' weight functions to identify the most relevant symptoms used to diagnose and classify COVID-19. To verify the efficiency of the proposed model, multiple laboratory and clinical datasets containing epidemiological symptoms and blood tests are used. Experiments indicate that the importance of COVID-19 infection symptoms varies between countries and regions. In most datasets, the most frequent and significant predictive symptoms for diagnosing COVID-19 are fever, sore throat, and cough. The experiment also compares the state-of-the-art methods with the proposed method, which shows that the proposed model has a high accuracy rate of up to 97.1711%. The positive results indicate that the proposed learning mechanism can help clinicians quickly diagnose and screen patients for COVID-19 at an early stage.
|
37
|
Ramírez-del Real T, Martínez-García M, Márquez MF, López-Trejo L, Gutiérrez-Esparza G, Hernández-Lemus E. Individual Factors Associated With COVID-19 Infection: A Machine Learning Study. Front Public Health 2022; 10:912099. [PMID: 35844896 PMCID: PMC9279686 DOI: 10.3389/fpubh.2022.912099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Accepted: 05/24/2022] [Indexed: 11/13/2022]
Abstract
The fast, exponential increase of COVID-19 infections and their catastrophic effects on patients' health have required the development of tools that support health systems in the quick and efficient diagnosis and prognosis of this disease. In this context, the present study aims to identify the potential factors associated with COVID-19 infections by applying machine learning techniques, particularly random forest, chi-squared, xgboost, and rpart, for feature selection; ROSE and SMOTE were used as resampling methods because of class imbalance. Similarly, machine and deep learning algorithms such as support vector machines, C4.5, random forest, rpart, and deep neural networks were explored during the train/test phase to select the best prediction model. The dataset used in this study contains clinical data, anthropometric measurements, and other health parameters related to smoking habits, alcohol consumption, quality of sleep, physical activity, and health status during confinement due to the pandemic associated with COVID-19. The results showed that the XGBoost model identified the best features associated with COVID-19 infection, and random forest gave the best predictive model, with a balanced accuracy of 90.41% using SMOTE as a resampling technique. The model with the best performance provides a tool to help prevent contracting SARS-CoV-2, since the variables with the highest risk factor are detected and some of them are, to a certain extent, controllable.
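The abstract reports balanced accuracy rather than plain accuracy because the classes are imbalanced. As a minimal sketch of why that matters, the function below computes balanced accuracy as the mean of sensitivity and specificity for a binary problem; the toy labels are made up for illustration and are not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2.0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Made-up imbalanced sample: 90 negatives, 10 positives,
# scored against a classifier that always predicts "negative".
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100
print("accuracy:", accuracy(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

The always-negative classifier looks strong on plain accuracy but is no better than chance on balanced accuracy, which is the gap that resampling methods such as SMOTE are meant to address.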
Affiliation(s)
- Tania Ramírez-del Real
  - Cátedras Conacyt, National Council on Science and Technology, Mexico City, Mexico
  - Center for Research in Geospatial Information Sciences, Mexico City, Mexico
- Mireya Martínez-García
  - Clinical Research Division, National Institute of Cardiology “Ignacio Chávez”, Mexico City, Mexico
- Manlio F. Márquez
  - Clinical Research Division, National Institute of Cardiology “Ignacio Chávez”, Mexico City, Mexico
- Laura López-Trejo
  - Institute for Security and Social Services of State Workers, Mexico City, Mexico
- Guadalupe Gutiérrez-Esparza
  - Cátedras Conacyt, National Council on Science and Technology, Mexico City, Mexico
  - Clinical Research Division, National Institute of Cardiology “Ignacio Chávez”, Mexico City, Mexico
- Enrique Hernández-Lemus
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
|
38
|
Novel Insights in Spatial Epidemiology Utilizing Explainable AI (XAI) and Remote Sensing. REMOTE SENSING 2022. [DOI: 10.3390/rs14133074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/08/2023]
Abstract
The COVID-19 pandemic has affected many aspects of human life around the world because of its severe effects on public health and socio-economic activities. Policy makers have tried to develop efficient responses based on technologies and advanced pandemic control methodologies to limit the spread of the virus in urban areas. However, techniques such as social isolation and lockdown are short-term solutions that minimize the spread of the pandemic in cities and do not reverse the long-term issues deriving from climate change, air pollution and urban planning challenges that enhance its ability to spread. Thus, it seems crucial to understand which factors assist or prevent the spread of the virus. Although AI frameworks have a very efficient predictive ability as data-driven procedures, they often struggle to identify strong correlations among multidimensional data and provide robust explanations. In this paper, we propose the fusion of a heterogeneous, spatio-temporal dataset that combines data from eight European cities spanning from 1 January 2020 to 31 December 2021 and describes atmospheric, socio-economic, health, mobility and environmental factors, all related to potential links with COVID-19. Remote sensing data are the key to monitoring the availability of public green spaces across cities in the study period. We therefore use the NIR and RED bands of satellite images to calculate the NDVI and determine the percentage of vegetation cover in each city for each week of our 2-year study. This novel dataset is evaluated by a tree-based machine learning algorithm that utilizes ensemble learning and is trained to make robust predictions of daily cases and deaths. Comparisons with other machine learning techniques confirm its robustness on the regression metrics RMSE and MAE. 
Furthermore, the explainable frameworks SHAP and LIME are utilized to locate potential positive or negative influence of the factors on global and local level, with respect to our model’s predictive ability. A variation of SHAP, namely treeSHAP, is utilized for our tree-based algorithm to make fast and accurate explanations.
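The abstract mentions computing the NDVI from the NIR and RED bands and deriving a vegetation-cover percentage per city. A minimal sketch of that calculation follows, using the standard definition NDVI = (NIR - RED) / (NIR + RED). The 0.3 threshold for counting a pixel as vegetated, the function names, and the tiny reflectance grids are assumptions for illustration; the paper's own threshold and preprocessing are not given in the abstract.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red) / total


def vegetation_cover_percent(nir_band, red_band, threshold=0.3):
    """Percentage of pixels whose NDVI exceeds the (assumed) threshold.

    nir_band and red_band are equally shaped lists of rows of reflectances.
    """
    values = [
        ndvi(n, r)
        for nir_row, red_row in zip(nir_band, red_band)
        for n, r in zip(nir_row, red_row)
    ]
    vegetated = sum(1 for v in values if v > threshold)
    return 100.0 * vegetated / len(values)


# Made-up 2x2 reflectance grids for a single scene.
nir_band = [[0.5, 0.2], [0.6, 0.1]]
red_band = [[0.1, 0.2], [0.1, 0.1]]
print(vegetation_cover_percent(nir_band, red_band))
```

Repeating this per city and per week would give the kind of vegetation-cover feature the abstract describes feeding into the tree-based model.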
|
39
|
Clinical and Laboratory Approach to Diagnose COVID-19 Using Machine Learning. Interdiscip Sci 2022; 14:452-470. [PMID: 35133633 PMCID: PMC8846962 DOI: 10.1007/s12539-021-00499-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/27/2021] [Revised: 12/17/2021] [Accepted: 12/23/2021] [Indexed: 12/18/2022]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes COVID-19, has had a significant influence on both the economy and health infrastructure worldwide. Infection with this novel virus is conventionally diagnosed with the RT-PCR (Reverse Transcription Polymerase Chain Reaction) test. This approach, however, produces many false-negative and erroneous outcomes. According to recent studies, COVID-19 can also be diagnosed using X-rays, CT scans, blood tests and cough sounds. In this article, we use blood tests and machine learning to predict the diagnosis of this deadly virus. We also present an extensive review of various existing machine-learning applications that diagnose COVID-19 from clinical and laboratory markers. Four different classifiers, along with the Synthetic Minority Oversampling Technique (SMOTE), were used for classification. The Shapley Additive Explanations (SHAP) method was used to calculate the importance of each feature, and it was found that eosinophils, monocytes, leukocytes and platelets were the most critical blood parameters that distinguished COVID-19 infection for our dataset. These classifiers can be used in conjunction with RT-PCR tests to improve sensitivity, and in emergency situations such as a pandemic outbreak that might happen due to new strains of the virus. The positive results indicate the prospective use of an automated framework that could help clinicians and medical personnel diagnose and screen patients.
|
40
|
Struyf T, Deeks JJ, Dinnes J, Takwoingi Y, Davenport C, Leeflang MM, Spijker R, Hooft L, Emperador D, Domen J, Tans A, Janssens S, Wickramasinghe D, Lannoy V, Horn SRA, Van den Bruel A. Signs and symptoms to determine if a patient presenting in primary care or hospital outpatient settings has COVID-19. Cochrane Database Syst Rev 2022; 5:CD013665. [PMID: 35593186 PMCID: PMC9121352 DOI: 10.1002/14651858.cd013665.pub3] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Indexed: 01/26/2023]
Abstract
BACKGROUND COVID-19 illness is highly variable, ranging from infection with no symptoms through to pneumonia and life-threatening consequences. Symptoms such as fever, cough, or loss of sense of smell (anosmia) or taste (ageusia), can help flag early on if the disease is present. Such information could be used either to rule out COVID-19 disease, or to identify people who need to go for COVID-19 diagnostic tests. This is the second update of this review, which was first published in 2020. OBJECTIVES To assess the diagnostic accuracy of signs and symptoms to determine if a person presenting in primary care or to hospital outpatient settings, such as the emergency department or dedicated COVID-19 clinics, has COVID-19. SEARCH METHODS We undertook electronic searches up to 10 June 2021 in the University of Bern living search database. In addition, we checked repositories of COVID-19 publications. We used artificial intelligence text analysis to conduct an initial classification of documents. We did not apply any language restrictions. SELECTION CRITERIA Studies were eligible if they included people with clinically suspected COVID-19, or recruited known cases with COVID-19 and also controls without COVID-19 from a single-gate cohort. Studies were eligible when they recruited people presenting to primary care or hospital outpatient settings. Studies that included people who contracted SARS-CoV-2 infection while admitted to hospital were not eligible. The minimum eligible sample size of studies was 10 participants. All signs and symptoms were eligible for this review, including individual signs and symptoms or combinations. We accepted a range of reference standards. DATA COLLECTION AND ANALYSIS Pairs of review authors independently selected all studies, at both title and abstract, and full-text stage. They resolved any disagreements by discussion with a third review author. 
Two review authors independently extracted data and assessed risk of bias using the QUADAS-2 checklist, and resolved disagreements by discussion with a third review author. Analyses were restricted to prospective studies only. We presented sensitivity and specificity in paired forest plots, in receiver operating characteristic (ROC) space and in dumbbell plots. We estimated summary parameters using a bivariate random-effects meta-analysis whenever five or more primary prospective studies were available, and whenever heterogeneity across studies was deemed acceptable. MAIN RESULTS We identified 90 studies; for this update we focused on the results of 42 prospective studies with 52,608 participants. Prevalence of COVID-19 disease varied from 3.7% to 60.6% with a median of 27.4%. Thirty-five studies were set in emergency departments or outpatient test centres (46,878 participants), three in primary care settings (1230 participants), two in a mixed population of in- and outpatients in a paediatric hospital setting (493 participants), and two overlapping studies in nursing homes (4007 participants). The studies did not clearly distinguish mild COVID-19 disease from COVID-19 pneumonia, so we present the results for both conditions together. Twelve studies had a high risk of bias for selection of participants because they used a high level of preselection to decide whether reverse transcription polymerase chain reaction (RT-PCR) testing was needed, or because they enrolled a non-consecutive sample, or because they excluded individuals while they were part of the study base. We rated 36 of the 42 studies as high risk of bias for the index tests because there was little or no detail on how, by whom and when, the symptoms were measured. 
For most studies, eligibility for testing was dependent on the local case definition and testing criteria that were in effect at the time of the study, meaning most people who were included in studies had already been referred to health services based on the symptoms that we are evaluating in this review. The applicability of the results of this review iteration improved in comparison with the previous reviews. This version has more studies of people presenting to ambulatory settings, which is where the majority of assessments for COVID-19 take place. Only three studies presented any data on children separately, and only one focused specifically on older adults. We found data on 96 symptoms or combinations of signs and symptoms. Evidence on individual signs as diagnostic tests was rarely reported, so this review reports mainly on the diagnostic value of symptoms. Results were highly variable across studies. Most had very low sensitivity and high specificity. RT-PCR was the most often used reference standard (40/42 studies). Only cough (11 studies) had a summary sensitivity above 50% (62.4%, 95% CI 50.6% to 72.9%); its specificity was low (45.4%, 95% CI 33.5% to 57.9%). Presence of fever had a sensitivity of 37.6% (95% CI 23.4% to 54.3%) and a specificity of 75.2% (95% CI 56.3% to 87.8%). The summary positive likelihood ratio of cough was 1.14 (95% CI 1.04 to 1.25) and that of fever 1.52 (95% CI 1.10 to 2.10). Sore throat had a summary positive likelihood ratio of 0.814 (95% CI 0.714 to 0.929), which means that its presence increases the probability of having an infectious disease other than COVID-19. Dyspnoea (12 studies) and fatigue (8 studies) had a sensitivity of 23.3% (95% CI 16.4% to 31.9%) and 40.2% (95% CI 19.4% to 65.1%) respectively. Their specificity was 75.7% (95% CI 65.2% to 83.9%) and 73.6% (95% CI 48.4% to 89.3%). 
The summary positive likelihood ratio of dyspnoea was 0.96 (95% CI 0.83 to 1.11) and that of fatigue 1.52 (95% CI 1.21 to 1.91), which means that the presence of fatigue slightly increases the probability of having COVID-19. Anosmia alone (7 studies), ageusia alone (5 studies), and anosmia or ageusia (6 studies) had summary sensitivities below 50% but summary specificities over 90%. Anosmia had a summary sensitivity of 26.4% (95% CI 13.8% to 44.6%) and a specificity of 94.2% (95% CI 90.6% to 96.5%). Ageusia had a summary sensitivity of 23.2% (95% CI 10.6% to 43.3%) and a specificity of 92.6% (95% CI 83.1% to 97.0%). Anosmia or ageusia had a summary sensitivity of 39.2% (95% CI 26.5% to 53.6%) and a specificity of 92.1% (95% CI 84.5% to 96.2%). The summary positive likelihood ratios of anosmia alone and anosmia or ageusia were 4.55 (95% CI 3.46 to 5.97) and 4.99 (95% CI 3.22 to 7.75) respectively, which is just below our arbitrary definition of a 'red flag', that is, a positive likelihood ratio of at least 5. The summary positive likelihood ratio of ageusia alone was 3.14 (95% CI 1.79 to 5.51). Twenty-four studies assessed combinations of different signs and symptoms, mostly combining olfactory symptoms. By combining symptoms with other information such as contact or travel history, age, gender, and a local recent case detection rate, some multivariable prediction scores reached a sensitivity as high as 90%. AUTHORS' CONCLUSIONS Most individual symptoms included in this review have poor diagnostic accuracy. Neither absence nor presence of symptoms are accurate enough to rule in or rule out the disease. The presence of anosmia or ageusia may be useful as a red flag for the presence of COVID-19. The presence of cough also supports further testing. There is currently no evidence to support further testing with PCR in any individuals presenting only with upper respiratory symptoms such as sore throat, coryza or rhinorrhoea. 
Combinations of symptoms with other readily available information such as contact or travel history, or the local recent case detection rate may prove more useful and should be further investigated in an unselected population presenting to primary care or hospital outpatient settings. The diagnostic accuracy of symptoms for COVID-19 is moderate to low and any testing strategy using symptoms as selection mechanism will result in both large numbers of missed cases and large numbers of people requiring testing. Which one of these is minimised, is determined by the goal of COVID-19 testing strategies, that is, controlling the epidemic by isolating every possible case versus identifying those with clinically important disease so that they can be monitored or treated to optimise their prognosis. The former will require a testing strategy that uses very few symptoms as entry criterion for testing, the latter could focus on more specific symptoms such as fever and anosmia.
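The review reports sensitivity, specificity, and positive likelihood ratios for each symptom. As a worked illustration of how these quantities relate, the sketch below uses the textbook formulas LR+ = sensitivity / (1 - specificity) and post-test odds = pre-test odds × LR. The review's summary ratios come from a bivariate random-effects meta-analysis, so this plain ratio is only an approximation of them; the function names are chosen here for illustration, and the inputs are the summary point estimates and median prevalence quoted in the abstract.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ : how much a positive finding multiplies the odds of disease."""
    return sensitivity / (1.0 - specificity)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability through odds using a likelihood ratio."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Summary point estimates quoted in the abstract (sensitivity, specificity).
symptoms = {
    "cough": (0.624, 0.454),
    "fever": (0.376, 0.752),
    "anosmia": (0.264, 0.942),
}
median_prevalence = 0.274  # median prevalence across the included studies

for name, (sens, spec) in symptoms.items():
    lr = positive_likelihood_ratio(sens, spec)
    post = post_test_probability(median_prevalence, lr)
    print(f"{name}: LR+ = {lr:.2f}, post-test probability = {post:.3f}")
```

For these three symptoms the plain ratio agrees with the reported summary values to two decimals, and it makes the review's point concrete: cough barely shifts the probability of COVID-19, whereas anosmia moves it substantially.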
Affiliation(s)
- Thomas Struyf
  - Department of Public Health and Primary Care, KU Leuven, Leuven, Belgium
- Jonathan J Deeks
  - Test Evaluation Research Group, Institute of Applied Health Research, University of Birmingham, Birmingham, UK
  - NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust and University of Birmingham, Birmingham, UK
- Jacqueline Dinnes
  - Test Evaluation Research Group, Institute of Applied Health Research, University of Birmingham, Birmingham, UK
  - NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust and University of Birmingham, Birmingham, UK
- Yemisi Takwoingi
  - Test Evaluation Research Group, Institute of Applied Health Research, University of Birmingham, Birmingham, UK
  - NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust and University of Birmingham, Birmingham, UK
- Clare Davenport
  - Test Evaluation Research Group, Institute of Applied Health Research, University of Birmingham, Birmingham, UK
  - NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust and University of Birmingham, Birmingham, UK
- Mariska Mg Leeflang
  - Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, Netherlands
- René Spijker
  - Medical Library, Amsterdam UMC, University of Amsterdam, Amsterdam Public Health, Amsterdam, Netherlands
  - Cochrane Netherlands, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Lotty Hooft
  - Cochrane Netherlands, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Julie Domen
  - Department of Primary Care, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Anouk Tans
  - Department of Public Health and Primary Care, KU Leuven, Leuven, Belgium
- Sebastiaan R A Horn
  - Department of Primary Care, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Ann Van den Bruel
  - Department of Public Health and Primary Care, KU Leuven, Leuven, Belgium
|
41
|
Meraihi Y, Gabis AB, Mirjalili S, Ramdane-Cherif A, Alsaadi FE. Machine Learning-Based Research for COVID-19 Detection, Diagnosis, and Prediction: A Survey. SN COMPUTER SCIENCE 2022; 3:286. [PMID: 35578678 PMCID: PMC9096341 DOI: 10.1007/s42979-022-01184-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/09/2022] [Accepted: 04/30/2022] [Indexed: 12/12/2022]
Abstract
The year 2020 experienced an unprecedented pandemic called COVID-19, which impacted the whole world. The absence of treatment has motivated research in all fields to deal with it. In Computer Science, contributions mainly include the development of methods for the diagnosis, detection, and prediction of COVID-19 cases. Data science and Machine Learning (ML) are the most widely used techniques in this area. This paper presents an overview of more than 160 ML-based approaches developed to combat COVID-19. They come from various sources such as Elsevier, Springer, ArXiv, MedRxiv, and IEEE Xplore. They are analyzed and classified into two categories: Supervised Learning-based approaches and Deep Learning-based ones. In each category, the ML algorithm employed is specified and a number of the parameters used are given. The parameters set for each of the algorithms are gathered in different tables. They include the type of problem addressed (detection, diagnosis, or prediction), the type of data analyzed (text data, X-ray images, CT images, time series, clinical data, ...) and the metrics evaluated (accuracy, precision, sensitivity, specificity, F1-Score, and AUC). The study discusses the collected information and provides a number of statistics that draw a picture of the state of the art. Results show that Deep Learning is used in 79% of cases, of which 65% are based on the Convolutional Neural Network (CNN) and 17% use a Specialized CNN. Supervised learning, for its part, is found in only 16% of the reviewed approaches, and only Random Forest, Support Vector Machine (SVM) and Regression algorithms are employed.
Affiliation(s)
- Yassine Meraihi
  - LIST Laboratory, University of M'Hamed Bougara Boumerdes, Avenue of Independence, 35000 Boumerdes, Algeria
- Asma Benmessaoud Gabis
  - Ecole nationale Supérieure d'Informatique, Laboratoire des Méthodes de Conception des Systèmes, BP 68 M, 16309 Oued-Smar, Alger, Algeria
- Seyedali Mirjalili
  - Centre for Artificial Intelligence Research and Optimisation, Torrens University Australia, Fortitude Valley, Brisbane, QLD 4006, Australia
  - Yonsei Frontier Lab, Yonsei University, Seoul, Korea
- Amar Ramdane-Cherif
  - LISV Laboratory, University of Versailles St-Quentin-en-Yvelines, 10-12 Avenue of Europe, 78140 Velizy, France
- Fawaz E Alsaadi
  - Information Technology Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
|
42
|
Muñoz-Organero M, Queipo-Álvarez P. Deep Spatiotemporal Model for COVID-19 Forecasting. SENSORS 2022; 22:s22093519. [PMID: 35591208 PMCID: PMC9101138 DOI: 10.3390/s22093519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Revised: 04/29/2022] [Accepted: 05/03/2022] [Indexed: 11/24/2022]
Abstract
COVID-19 has caused millions of infections and deaths over the last 2 years. Machine learning models have been proposed as an alternative to conventional epidemiologic models in an effort to improve short- and medium-term forecasts that help health authorities optimize the use of policies and resources to tackle the spread of the SARS-CoV-2 virus. Although previous machine learning models based on time pattern analysis of COVID-19 sensed data have shown promising results, the spread of the virus has both spatial and temporal components. This manuscript proposes a new deep learning model in which a Convolutional Neural Network (CNN) first performs a spatial analysis of a sequence of COVID-19 incidence images, and a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) then extracts time patterns from the result. The model has been validated with data from the 286 primary health care centers in the Comunidad de Madrid (Madrid region, Spain). The results show improved scores in terms of both root mean square error (RMSE) and explained variance (EV) when compared with previous models that have mainly focused on temporal patterns and dependencies.
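The two scores named in this abstract, RMSE and explained variance, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names and the toy `observed`/`forecast` values are assumptions, and explained variance is taken in its usual form, 1 - Var(residuals) / Var(observed).

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def variance(values):
    """Population variance (divides by n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def explained_variance(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true); assumes y_true is not constant."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - variance(residuals) / variance(y_true)


# Hypothetical incidence values; every forecast is off by exactly +1.
observed = [10.0, 12.0, 15.0, 11.0]
forecast = [11.0, 13.0, 16.0, 12.0]

print(rmse(observed, forecast))                # 1.0
print(explained_variance(observed, forecast))  # 1.0
```

The toy data shows why the two scores are usually reported together: a constant bias is penalized by RMSE but leaves explained variance untouched, because the residuals have zero variance.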
|
43
|
Kuo KM, Talley PC, Chang CS. The Accuracy of Machine Learning Approaches Using Non-image Data for the Prediction of COVID-19: A Meta-Analysis. Int J Med Inform 2022; 164:104791. [PMID: 35594810 PMCID: PMC9098530 DOI: 10.1016/j.ijmedinf.2022.104791] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/23/2021] [Revised: 04/08/2022] [Accepted: 05/09/2022] [Indexed: 12/12/2022]
Abstract
Objective: COVID-19 is a novel, severely contagious disease with an enormous negative impact on humanity as well as the world economy. An expeditious, feasible tool for detecting COVID-19 remains elusive. Recently, there has been a surge of interest in applying machine learning techniques to predict COVID-19 using non-image data. We have therefore undertaken a meta-analysis to quantify the diagnostic performance of machine learning models for the prediction of COVID-19. Materials and methods: A comprehensive electronic database search for the period between January 1st, 2021 and December 3rd, 2021 was undertaken in order to identify eligible studies relevant to this meta-analysis. Summary sensitivity, specificity, and the area under the receiver operating characteristic curve were used to assess diagnostic accuracy. Risk of bias was assessed by means of a revised Quality Assessment of Diagnostic Studies. Results: A total of 30 studies, including 34 models, met all of the inclusion criteria. Summary sensitivity, specificity, and area under the receiver operating characteristic curve were 0.86, 0.86, and 0.91, respectively. The purpose of the machine learning models, class imbalance, and feature selection are significant covariates that help explain the between-study heterogeneity in terms of both sensitivity and specificity. Conclusions: Our findings show that non-image data can be used to predict COVID-19 with acceptable performance. Further, we suggest that class imbalance and feature selection be addressed whenever building models for the prediction of COVID-19, so as to further improve diagnostic performance.
|
44
|
Feature Importance Analysis by Nowcasting Perspective to Predict COVID-19. MOBILE NETWORKS AND APPLICATIONS 2022. [PMCID: PMC9033308 DOI: 10.1007/s11036-022-01966-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
Abstract
The present work investigates prediction and feature importance for estimating COVID-19 infection using a Machine Learning approach. Our work analyzed the inclusion of climatic features, mobility, government actions, and the number of cases per health sub-territory in an existing model. The Random Forest with Permutation Importance method was used to assess importance and to list the thirty most relevant features for the probability of infection. Among all features, the most important were: i) the variables per health region, ii) the period between the date of notification and symptom onset, iii) symptom features such as fever, cough, and sore throat, iv) traffic flow and mobility variables, and v) weather features. The model was validated and reached an average accuracy of 81.82%, with sensitivity and specificity of 87.52% and 78.67%, respectively, in the infection estimate. The proposed investigation therefore offers an alternative to guide authorities in understanding aspects related to the disease.
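Permutation importance, as named in this abstract, measures how much a fitted model's score drops when one feature column is shuffled. The sketch below is an illustration only: it replaces the Random Forest with a hypothetical rule-based `toy_model`, uses invented toy data, and scores with plain accuracy, so none of the names or numbers come from the paper.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction equals the label."""
    hits = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return hits / len(rows)


def permutation_importance(model, rows, labels, n_features, seed=0):
    """Drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)  # break the link between feature j and the label
        shuffled = [row[:j] + [value] + row[j + 1:]
                    for row, value in zip(rows, column)]
        importances.append(baseline - accuracy(model, shuffled, labels))
    return importances


def toy_model(row):
    """Hypothetical stand-in for a trained classifier: reads feature 0 only."""
    return row[0]


# Toy data: feature 0 determines the label, feature 1 is unrelated noise.
rows = [[i % 2, (i * 7) % 5] for i in range(40)]
labels = [row[0] for row in rows]

scores = permutation_importance(toy_model, rows, labels, n_features=2)
print(scores)
```

Feature 1 receives an importance of exactly zero because the model never reads it, while shuffling feature 0 destroys the predictions. Ranking features by these drops is what yields a list such as the thirty most relevant features mentioned above.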
|
45
|
Hao F, Zheng K. Online Disease Identification and Diagnosis and Treatment Based on Machine Learning Technology. JOURNAL OF HEALTHCARE ENGINEERING 2022; 2022:6736249. [PMID: 35449857 PMCID: PMC9018189 DOI: 10.1155/2022/6736249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 03/02/2022] [Accepted: 03/12/2022] [Indexed: 11/18/2022]
Abstract
The article uses machine learning algorithms to extract disease symptom keyword vectors and deep learning technology to design a disease symptom classification model. We apply this model to an online disease consultation recommendation system. The system integrates machine learning algorithms and knowledge graph technology to help patients conduct online consultations. The system analyses the misclassification data of different departments through high-frequency word analysis. The study found that the accuracy of our machine learning model in identifying entities in electronic medical records reached 96.29%. This type of model can effectively screen out the most important pathogenic features.
Affiliation(s)
- Feng Hao
  - Jilin Medical University, Jilin 132013, China
- Kai Zheng
  - Jilin Medical University, Jilin 132013, China
|
46
|
Gada V, Shegaonkar M, Inamdar M, Dinesh S, Sapariya D, Konde V, Warang M, Mehendale N. Data Analysis of COVID-19 Hospital Records Using Contextual Patient Classification System. ANNALS OF DATA SCIENCE 2022; 9:945-965. [PMID: 38624787 PMCID: PMC8939496 DOI: 10.1007/s40745-022-00378-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/14/2021] [Revised: 11/01/2021] [Accepted: 02/19/2022] [Indexed: 12/18/2022]
Abstract
Humanity today is suffering from one of the most dangerous pandemics in history, the Coronavirus Disease of 2019 (COVID-19). Despite immense advances in medicine and the latest technology, the COVID-19 pandemic has affected us severely. The virus is spreading rapidly, resulting in an escalation in the number of patients admitted. We propose a contextual patient classification system for better analysis of the data from the discharge summaries available from the research hospital. The classification was done using the Knuth-Morris-Pratt algorithm. We have also analyzed the data of COVID-19 and non-COVID-19 patients. The analysis covered medicines, medical services and tests, pulse count, body temperature, and the overall effect of age and gender. The death versus survival ratio for COVID-19-positive patients has also been studied. The contextual patient classification system achieved a classification accuracy of 97.4%. The combination of data analysis and contextual patient classification will help all sectors to be better prepared for any future waves of the COVID-19 pandemic.
Affiliation(s)
- Vrushabh Gada
  - K. J. Somaiya College of Engineering, Mumbai, Maharashtra 400077, India
- Madhura Inamdar
  - K. J. Somaiya College of Engineering, Mumbai, Maharashtra 400077, India
- Sharath Dinesh
  - K. J. Somaiya College of Engineering, Mumbai, Maharashtra 400077, India
- Darshan Sapariya
  - K. J. Somaiya College of Engineering, Mumbai, Maharashtra 400077, India
- Vedant Konde
  - K. J. Somaiya College of Engineering, Mumbai, Maharashtra 400077, India
- Mahesh Warang
  - K. J. Somaiya College of Engineering, Mumbai, Maharashtra 400077, India
- Ninad Mehendale
  - K. J. Somaiya College of Engineering, Mumbai, Maharashtra 400077, India
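The Knuth-Morris-Pratt algorithm named in the abstract above finds every occurrence of a pattern in a text in linear time by reusing a precomputed failure table instead of re-scanning the text. The sketch below is a generic implementation; how the authors applied it to discharge summaries is not described here, so the sample `summary` string and the `keyword` are purely hypothetical.

```python
def build_failure_table(pattern):
    """table[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]  # fall back to the next shorter border
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_search(text, pattern):
    """Return the start index of every (possibly overlapping) match."""
    if not pattern:
        return []
    table = build_failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = table[k - 1]  # keep going to allow overlapping matches
    return matches


# Hypothetical discharge-summary text and keyword, for illustration only.
summary = "patient tested positive for covid-19; covid-19 pneumonia noted"
keyword = "covid-19"
positions = kmp_search(summary, keyword)
print(positions)
```

A classifier built on this idea could, for example, assign a record to a category whenever one of that category's keywords is found, though that rule is an assumption and not something the abstract states.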
|
47
|
Malhotra P, Gupta S, Koundal D, Zaguia A, Enbeyle W. Deep Neural Networks for Medical Image Segmentation. JOURNAL OF HEALTHCARE ENGINEERING 2022; 2022:9580991. [PMID: 35310182 PMCID: PMC8930223 DOI: 10.1155/2022/9580991] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 10/21/2021] [Revised: 01/06/2022] [Accepted: 01/10/2022] [Indexed: 12/31/2022]
Abstract
Image segmentation is a branch of digital image processing with numerous applications in image analysis, augmented reality, machine vision, and many other fields. Medical image analysis is a growing field, and the segmentation of organs, diseases, or abnormalities in medical images is in increasing demand. Segmenting medical images helps in monitoring the growth of diseases such as tumours and in controlling the dosage of medicine and of radiation exposure. Medical image segmentation is a challenging task because of the various artefacts present in the images. Recently, deep neural models have been applied to various image segmentation tasks, a growth driven by the achievements and high performance of deep learning strategies. This work presents a review of the literature on medical image segmentation employing deep convolutional neural networks. The paper examines the widely used medical image datasets, the different metrics used for evaluating segmentation tasks, and the performance of different CNN-based networks. In contrast to existing review and survey papers, the present work also discusses the various challenges in medical image segmentation and the state-of-the-art solutions available in the literature.
Affiliation(s)
- Priyanka Malhotra
  - Chitkara University Institute of Engineering and Technology, Chitkara University, Chandigarh, Punjab, India
- Sheifali Gupta
  - Chitkara University Institute of Engineering and Technology, Chitkara University, Chandigarh, Punjab, India
- Deepika Koundal
  - Department of Systemics, School of Computer Science, University of Petroleum and Energy Studies, Dehradun, India
- Atef Zaguia
  - Department of Computer Science, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
|
48
|
Zare S, Meidani Z, Ouhadian M, Akbari H, Zand F, Fakharian E, Sharifian R. Identification of data elements for blood gas analysis dataset: a base for developing registries and artificial intelligence-based systems. BMC Health Serv Res 2022; 22:317. [PMID: 35260155 PMCID: PMC8902269 DOI: 10.1186/s12913-022-07706-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/27/2021] [Accepted: 03/01/2022] [Indexed: 11/23/2022] Open
Abstract
Background: One of the challenging decision-making tasks in healthcare centers is the interpretation of blood gas tests. Artificial intelligence (AI)-based decision support systems can be among the most effective aids for interpreting blood gas analysis (BGA). A primary step in developing intelligent systems is to determine the information requirements and the automated data input for the secondary analyses. Datasets can support automated data input from dispersed information systems. Therefore, the current study aimed to identify the data elements required for supporting BGA as a dataset. Materials and methods: This cross-sectional descriptive study was conducted in Nemazee Hospital, Shiraz, Iran. A combination of literature review, experts' consensus, and the Delphi technique was used to develop the dataset. A review of the literature was performed on electronic databases to find the dataset for BGA. An expert panel was formed to discuss, add, or remove the data elements extracted from the literature search. The Delphi technique was used to reach consensus and validate the draft dataset. Results: The data elements of the BGA dataset were grouped into ten categories, namely personal information, admission details, present illnesses, past medical history, social status, physical examination, paraclinical investigation, blood gas parameter, sequential organ failure assessment (SOFA) score, and sampling technique errors. Overall, 313 data elements, including 172 mandatory and 141 optional data elements, were confirmed by the experts for inclusion in the dataset. Conclusions: We proposed a dataset as a base for registries and AI-based systems to assist BGA. It supports the storage of accurate and comprehensive data, as well as their integration with other information systems. As a result, high-quality care is provided and clinical decision-making is improved.
Affiliation(s)
- Sahar Zare
  - Health Information Management Research Center (HIMRC), Kashan University of Medical Sciences, Kashan, Iran
- Zahra Meidani
  - Health Information Management Research Center (HIMRC), Kashan University of Medical Sciences, Kashan, Iran
  - Department of Health Information Management & Technology, School of Allied Health Professions, Kashan University of Medical Sciences, Kashan, Iran
- Maryam Ouhadian
  - Anesthesiology and Critical Care Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Hosein Akbari
  - Department of Epidemiology and Biostatistics, School of Health, Kashan University of Medical Sciences, Kashan, Iran
- Farid Zand
  - Anesthesiology and Critical Care Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
  - Department of Anesthesia and Critical Care Medicine, Shiraz University of Medical Sciences, Shiraz, Iran
- Esmaeil Fakharian
  - Department of Neurosurgery, Trauma Research Center, Kashan University of Medical Sciences, Kashan, Iran
- Roxana Sharifian
  - Health Human Resources Research Center, Department of Health Information Management and Technology, Shiraz University of Medical Sciences, Shiraz, Iran
|
49
|
Alyasseri ZAA, Al‐Betar MA, Doush IA, Awadallah MA, Abasi AK, Makhadmeh SN, Alomari OA, Abdulkareem KH, Adam A, Damasevicius R, Mohammed MA, Zitar RA. Review on COVID-19 diagnosis models based on machine learning and deep learning approaches. EXPERT SYSTEMS 2022; 39:e12759. [PMID: 34511689 PMCID: PMC8420483 DOI: 10.1111/exsy.12759] [Citation(s) in RCA: 58] [Impact Index Per Article: 29.0] [Received: 01/29/2021] [Revised: 05/17/2021] [Accepted: 06/07/2021] [Indexed: 05/02/2023]
Abstract
COVID-19 is the disease caused by a new strain of coronavirus called the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). COVID-19 has become a pandemic, infecting more than 152 million people in over 216 countries and territories. The exponential increase in the number of infections has rendered traditional diagnosis techniques inefficient. Many researchers have therefore developed intelligent techniques, such as deep learning (DL) and machine learning (ML), which can assist the healthcare sector in providing quick and precise COVID-19 diagnosis. This paper provides a comprehensive review of the most recent DL and ML techniques for COVID-19 diagnosis. The studies were published from December 2019 until April 2021. In general, this paper includes more than 200 studies that have been carefully selected from several publishers, such as IEEE, Springer and Elsevier. We classify the research tracks into two categories, DL and ML, and present COVID-19 public datasets established and extracted from different countries. The measures used to evaluate diagnosis methods are comparatively analysed and discussed. In conclusion, for COVID-19 diagnosis and outbreak prediction, SVM is the most widely used machine learning mechanism, and CNN is the most widely used deep learning mechanism. Accuracy, sensitivity, and specificity are the most widely used measurements in previous studies. Finally, this review paper will guide the research community on the upcoming development of ML and DL for COVID-19 and inspire their future work.
Affiliation(s)
- Zaid Abdi Alkareem Alyasseri
  - Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
  - ECE Department‐Faculty of Engineering, University of Kufa, Najaf, Iraq
- Mohammed Azmi Al‐Betar
  - Artificial Intelligence Research Center (AIRC), Ajman University, Ajman, United Arab Emirates
  - Department of Information Technology, Al‐Huson University College, Al‐Balqa Applied University, Irbid, Jordan
- Iyad Abu Doush
  - Computing Department, College of Engineering and Applied Sciences, American University of Kuwait, Salmiya, Kuwait
  - Computer Science Department, Yarmouk University, Irbid, Jordan
- Mohammed A. Awadallah
  - Artificial Intelligence Research Center (AIRC), Ajman University, Ajman, United Arab Emirates
  - Department of Computer Science, Al‐Aqsa University, Gaza, Palestine
- Ammar Kamal Abasi
  - Artificial Intelligence Research Center (AIRC), Ajman University, Ajman, United Arab Emirates
  - School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia
- Sharif Naser Makhadmeh
  - Artificial Intelligence Research Center (AIRC), Ajman University, Ajman, United Arab Emirates
  - Faculty of Information Technology, Middle East University, Amman, Jordan
- Afzan Adam
  - Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
- Mazin Abed Mohammed
  - College of Computer Science and Information Technology, University of Anbar, Anbar, Iraq
- Raed Abu Zitar
  - Sorbonne Center of Artificial Intelligence, Sorbonne University‐Abu Dhabi, Abu Dhabi, United Arab Emirates
|
50
|
Bhuyan HK, Chakraborty C, Shelke Y, Pani SK. COVID-19 diagnosis system by deep learning approaches. EXPERT SYSTEMS 2022; 39:e12776. [PMID: 34511691 PMCID: PMC8420221 DOI: 10.1111/exsy.12776] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/15/2021] [Revised: 05/26/2021] [Accepted: 07/01/2021] [Indexed: 05/15/2023]
Abstract
The novel coronavirus disease 2019 (COVID-19) is a severe health issue that affects the respiratory system and spreads very quickly from person to person across all countries. The limited diagnostic techniques used to identify COVID-19 patients and control the disease are not effective. These circumstances call for immediate detection of suspected COVID-19 patients from routine techniques such as chest X-ray or CT scan analysis, using computerized diagnosis systems that perform mass detection, segmentation, and classification. In this paper, regional deep learning approaches are used to detect the areas of the lungs infected by the coronavirus. For mass segmentation of the infected region, a deep Convolutional Neural Network (CNN) is used to identify the specific infected area and classify it as COVID-19 or non-COVID-19 with a full-resolution convolutional network (FrCN). The proposed model is evaluated on detection, segmentation, and classification using a trained and tested COVID-19 patient dataset. The evaluation results are generated using a fourfold cross-validation test with metrics such as Sensitivity, Specificity, Jaccard (Jac.), Dice (F1-score), Matthews correlation coefficient (MCC), and Overall accuracy. The comparative classification accuracy is evaluated on the validated test dataset both with and without mass segmentation.
Affiliation(s)
- Hemanta Kumar Bhuyan
  - Department of Information Technology, Vignan's Foundation for Science, Technology & Research (VFSTR), Guntur, India
- Chinmay Chakraborty
  - Electronics & Communication Engineering, Birla Institute of Technology, Mesra, Jharkhand, India
- Yogesh Shelke
  - Medical Professional and with Aranca Technology Research & Advisory, Mumbai, India
- Subhendu Kumar Pani
  - Department of Computer Science & Engineering, Krupajal Computer Academy, Bhubaneswar, Odisha, India
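The evaluation metrics listed in the abstract above (Sensitivity, Specificity, Jaccard, Dice, MCC, and overall accuracy) all derive from the four confusion-matrix counts. The sketch below computes them for a flattened binary mask. The function names and the toy masks are assumptions for illustration, and the code assumes both classes occur in the ground truth and the prediction, so that no denominator is zero.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def segmentation_metrics(y_true, y_pred):
    """Standard binary metrics; assumes every denominator is non-zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "jaccard": tp / (tp + fp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),  # equals F1 for binary masks
        "mcc": (tp * tn - fp * fn) / mcc_den,
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical flattened masks: 1 = infected pixel, 0 = background.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
prediction = [1, 1, 0, 0, 0, 1, 0, 1]

metrics = segmentation_metrics(truth, prediction)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For these toy masks the counts are TP = 3, TN = 3, FP = 1, FN = 1, giving a Jaccard index of 0.6, a Dice score of 0.75, and an MCC of 0.5. Jaccard is always at most Dice, which is one reason segmentation studies tend to report both.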
|