1
Zwiers LC, Grobbee DE, Uijl A, Ong DSY. Federated learning as a smart tool for research on infectious diseases. BMC Infect Dis 2024; 24:1327. [PMID: 39573994 PMCID: PMC11580691 DOI: 10.1186/s12879-024-10230-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 11/14/2024] [Indexed: 11/25/2024]
Abstract
BACKGROUND The use of real-world data has become increasingly popular, also in the field of infectious disease (ID), particularly since the COVID-19 pandemic emerged. While much useful data for research is being collected, these data are generally stored across different sources. Privacy concerns limit the possibility to store the data centrally, thereby also limiting the possibility of fully leveraging the potential power of combined data. Federated learning (FL) has been suggested to overcome privacy issues by making it possible to perform research on data from various sources without those data leaving local servers. In this review, we discuss existing applications of FL in ID research, as well as the most relevant opportunities and challenges of this method. METHODS References for this review were identified through searches of MEDLINE/PubMed, Google Scholar, Embase and Scopus until July 2023. We searched for studies using FL in different applications related to ID. RESULTS Thirty references were included and divided into four sub-topics: disease screening, prediction of clinical outcomes, infection epidemiology, and vaccine research. Most research was related to COVID-19. In all studies, FL achieved good accuracy when predicting diseases and outcomes, also in comparison to non-federated methods. However, most studies did not make use of real-world federated data, but rather showed the potential of FL by using data that was manually partitioned. CONCLUSIONS FL is a promising methodology which allows using data from several sources, potentially generating stronger and more generalisable results. However, further exploration of FL application possibilities in ID research is needed.
Affiliation(s)
- Laura C Zwiers
  - Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  - Julius Clinical, Zeist, The Netherlands
- Diederick E Grobbee
  - Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  - Julius Clinical, Zeist, The Netherlands
- Alicia Uijl
  - Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  - Department of Cardiology, Amsterdam University Medical Centers, Amsterdam Cardiovascular Sciences, University of Amsterdam, Amsterdam, The Netherlands
  - Division of Cardiology, Department of Medicine, Karolinska Institutet, Stockholm, Sweden
- David S Y Ong
  - Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  - Julius Clinical, Zeist, The Netherlands
  - Department of Medical Microbiology and Infection Control, Franciscus Gasthuis & Vlietland, Rotterdam, The Netherlands
2
Shiri I, Salimi Y, Sirjani N, Razeghi B, Bagherieh S, Pakbin M, Mansouri Z, Hajianfar G, Avval AH, Askari D, Ghasemian M, Sandoughdaran S, Sohrabi A, Sadati E, Livani S, Iranpour P, Kolahi S, Khosravi B, Bijari S, Sayfollahi S, Atashzar MR, Hasanian M, Shahhamzeh A, Teimouri A, Goharpey N, Shirzad-Aski H, Karimi J, Radmard AR, Rezaei-Kalantari K, Oghli MG, Oveisi M, Vafaei Sadr A, Voloshynovskiy S, Zaidi H. Differential privacy preserved federated learning for prognostic modeling in COVID-19 patients using large multi-institutional chest CT dataset. Med Phys 2024; 51:4736-4747. [PMID: 38335175 DOI: 10.1002/mp.16964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 01/10/2024] [Accepted: 01/21/2024] [Indexed: 02/12/2024]
Abstract
BACKGROUND Notwithstanding the encouraging results of previous studies reporting on the efficiency of deep learning (DL) in COVID-19 prognostication, clinical adoption of the developed methodology still needs to be improved. To overcome this limitation, we set out to predict the prognosis of a large multi-institutional cohort of patients with COVID-19 using a DL-based model. PURPOSE This study aimed to evaluate the performance of deep privacy-preserving federated learning (DPFL) in predicting COVID-19 outcomes using chest CT images. METHODS After applying inclusion and exclusion criteria, 3055 patients from 19 centers, including 1599 alive and 1456 deceased, were enrolled in this study. Data from all centers were split (randomly with stratification respective to each center and class) into a training/validation set (70%/10%) and a hold-out test set (20%). For the DL model, feature extraction was performed on 2D slices, and averaging was performed at the final layer to construct a 3D model for each scan. The DenseNet model was used for feature extraction. The model was developed using centralized and FL approaches. For FL, we employed DPFL approaches. Membership inference attack was also evaluated in the FL strategy. For model evaluation, different metrics were reported in the hold-out test sets. In addition, models trained in two scenarios, centralized and FL, were compared using the DeLong test for statistical differences. RESULTS The centralized model achieved an accuracy of 0.76, while the DPFL model had an accuracy of 0.75. Both the centralized and DPFL models achieved a specificity of 0.77. The centralized model achieved a sensitivity of 0.74, while the DPFL model had a sensitivity of 0.73. Mean AUCs of 0.82 (95% CI: 0.79-0.85) and 0.81 (95% CI: 0.77-0.84) were achieved by the centralized model and the DPFL model, respectively. The DeLong test did not show a statistically significant difference between the two models (p-value = 0.98). The AUC values for the inference attacks fluctuated between 0.49 and 0.51, with an average of 0.50 ± 0.003 and a 95% CI for the mean AUC of 0.500 to 0.501. CONCLUSION The performance of the proposed model was comparable to centralized models while operating on large and heterogeneous multi-institutional datasets. In addition, the model was resistant to inference attacks, ensuring the privacy of shared data during the training process.
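The abstract does not spell out how privacy was added to the federated training. As a rough illustration only, one widely used recipe for differentially private federated averaging is to clip each client's update to a maximum L2 norm and add Gaussian noise to the average. The sketch below assumes that recipe; the function names, `clip_norm`, `noise_multiplier`, and the noise scaling are hypothetical and are not taken from the paper.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale an update vector so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= clip_norm or norm == 0.0:
        return list(update)
    scale = clip_norm / norm
    return [x * scale for x in update]


def dp_federated_average(client_updates, clip_norm=1.0, noise_multiplier=0.5, rng=None):
    """Average clipped client updates and add Gaussian noise per coordinate.

    Assumed noise scale (illustrative, not from the paper):
    sigma = noise_multiplier * clip_norm / number_of_clients.
    """
    rng = rng or random.Random(0)
    n = len(client_updates)
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    dim = len(clipped[0])
    sigma = noise_multiplier * clip_norm / n
    return [
        sum(u[i] for u in clipped) / n + rng.gauss(0.0, sigma)
        for i in range(dim)
    ]
```

Clipping bounds how much any single center can move the shared model, and the noise masks what remains of an individual contribution. Larger values of `noise_multiplier` trade accuracy for stronger privacy.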
Affiliation(s)
- Isaac Shiri
  - Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
- Yazdan Salimi
  - Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
- Nasim Sirjani
  - Research and Development Department, Med Fanavarn Plus Co, Karaj, Iran
- Behrooz Razeghi
  - Department of Computer Science, University of Geneva, Geneva, Switzerland
- Sara Bagherieh
  - School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Masoumeh Pakbin
  - Imaging Department, Qom University of Medical Sciences, Qom, Iran
- Zahra Mansouri
  - Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
- Ghasem Hajianfar
  - Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
- Dariush Askari
  - Department of Radiology Technology, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mohammadreza Ghasemian
  - Department of Radiology, Shahid Beheshti Hospital, Qom University of Medical Sciences, Qom, Iran
- Saleh Sandoughdaran
  - Department of Clinical Oncology, Royal Surrey County Hospital, Guildford, UK
- Ahmad Sohrabi
  - Radin Makian Azma Mehr Ltd., Radinmehr Veterinary Laboratory, Iran University of Medical Sciences, Gorgan, Iran
- Elham Sadati
  - Department of Medical Physics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
- Somayeh Livani
  - Clinical Research Development Unit (CRDU), Sayad Shirazi Hospital, Golestan University of Medical Sciences, Gorgan, Iran
- Pooya Iranpour
  - Medical Imaging Research Center, Department of Radiology, Shiraz University of Medical Sciences, Shiraz, Iran
- Shahriar Kolahi
  - Department of Radiology, School of Medicine, Advanced Diagnostic and Interventional Radiology Research Center (ADIR), Imam Khomeini Hospital, Tehran University of Medical Sciences, Tehran, Iran
- Bardia Khosravi
  - Digestive Diseases Research Center, Digestive Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran
- Salar Bijari
  - Department of Medical Physics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
- Sahar Sayfollahi
  - Department of Neurosurgery, Faculty of Medical Sciences, Iran University of Medical Sciences, Tehran, Iran
- Mohammad Reza Atashzar
  - Department of Immunology, School of Medicine, Fasa University of Medical Sciences, Fasa, Iran
- Mohammad Hasanian
  - Department of Radiology, Arak University of Medical Sciences, Arak, Iran
- Alireza Shahhamzeh
  - Clinical Research Development Center, Qom University of Medical Sciences, Qom, Iran
- Arash Teimouri
  - Medical Imaging Research Center, Department of Radiology, Shiraz University of Medical Sciences, Shiraz, Iran
- Neda Goharpey
  - Department of Radiation Oncology, Shohada-e Tajrish Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Jalal Karimi
  - Department of Infectious Disease, School of Medicine, Fasa University of Medical Sciences, Fasa, Iran
- Amir Reza Radmard
  - Department of Radiology, Shariati Hospital, Tehran University of Medical Sciences, Tehran, Iran
- Kiara Rezaei-Kalantari
  - Rajaie Cardiovascular, Medical & Research Center, Iran University of Medical Science, Tehran, Iran
- Mehrdad Oveisi
  - Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
- Alireza Vafaei Sadr
  - Department of Public Health Sciences, College of Medicine, Pennsylvania State University, Hershey, Pennsylvania, USA
- Habib Zaidi
  - Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
  - Department of Nuclear Medicine and Molecular Imaging, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
  - Department of Nuclear Medicine, University of Southern Denmark, Odense, Denmark
  - University Research and Innovation Center, Óbuda University, Budapest, Hungary
3
Ahmed R, Maddikunta PKR, Gadekallu TR, Alshammari NK, Hendaoui FA. Efficient differential privacy enabled federated learning model for detecting COVID-19 disease using chest X-ray images. Front Med (Lausanne) 2024; 11:1409314. [PMID: 38912338 PMCID: PMC11193384 DOI: 10.3389/fmed.2024.1409314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Accepted: 05/15/2024] [Indexed: 06/25/2024]
Abstract
The rapid spread of the COVID-19 pandemic across the world has not only disturbed the global economy but also raised the demand for accurate disease detection models. Although many studies have proposed effective solutions for the early detection and prediction of COVID-19 with machine learning (ML) and deep learning (DL) based techniques, these models remain vulnerable to data privacy and security breaches. To overcome the challenges of existing systems, we introduce an Adaptive Differential Privacy-based Federated Learning (DPFL) model for predicting COVID-19 disease from chest X-ray images. The model uses an adaptive mechanism that dynamically adjusts privacy levels based on real-time data sensitivity analysis, improving the practical applicability of Federated Learning (FL) in diverse healthcare environments. We compare and analyze the performance of this distributed learning model with a traditional centralized model. Moreover, we enhance the model by integrating an FL approach with an early stopping mechanism to achieve efficient COVID-19 prediction with minimal communication overhead. To ensure privacy without compromising model utility and accuracy, we evaluate the proposed model under various noise scales. Finally, we discuss strategies for increasing the model's accuracy while maintaining robustness as well as privacy.
Affiliation(s)
- Rawia Ahmed
  - Computer Science Department, Applied College, University of Ha’il, Ha’il, Saudi Arabia
- Praveen Kumar Reddy Maddikunta
  - School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, Tamil Nadu, India
- Thippa Reddy Gadekallu
  - The College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou, China
  - Division of Research and Development, Lovely Professional University, Phagwara, India
  - Center of Research Impact and Outcome, Chitkara University, Rajpura, India
- Naif Khalaf Alshammari
  - Mechanical Engineering Department, Engineering College, University of Ha’il, Ha’il, Saudi Arabia
- Fatma Ali Hendaoui
  - Computer Science Department, Applied College, University of Ha’il, Ha’il, Saudi Arabia
4
Darzi E, Sijtsema NM, van Ooijen PMA. A comparative study of federated learning methods for COVID-19 detection. Sci Rep 2024; 14:3944. [PMID: 38365940 PMCID: PMC10873416 DOI: 10.1038/s41598-024-54323-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Accepted: 02/11/2024] [Indexed: 02/18/2024]
Abstract
Deep learning has proven to be highly effective in diagnosing COVID-19; however, its efficacy is contingent upon the availability of extensive data for model training. The data sharing among hospitals, which is crucial for training robust models, is often restricted by privacy regulations. Federated learning (FL) emerges as a solution by enabling model training across multiple hospitals while preserving data privacy. However, the deployment of FL can be resource-intensive, necessitating efficient utilization of computational and network resources. In this study, we evaluate the performance and resource efficiency of five FL algorithms in the context of COVID-19 detection using Convolutional Neural Networks (CNNs) in a decentralized setting. The evaluation involves varying the number of participating entities, the number of federated rounds, and the selection algorithms. Our findings indicate that the Cyclic Weight Transfer algorithm exhibits superior performance, particularly when the number of participating hospitals is limited. These insights hold practical implications for the deployment of FL algorithms in COVID-19 detection and broader medical image analysis.
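To make the difference between aggregation strategies concrete, the toy sketch below contrasts federated averaging with cyclic weight transfer on a one-parameter model `y = w * x`. It is an illustration under stated assumptions, not the study's setup: the hospital data, learning rate, step counts, and all function names are hypothetical, and the study itself used CNNs on imaging data.

```python
def local_train(w, data, lr=0.05, steps=5):
    """Run a few gradient-descent steps on mean squared error for y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def fedavg(clients, rounds=20, w=0.0):
    """Federated averaging: every client trains from the same global weight,
    and the server averages the results weighted by local sample count."""
    total = sum(len(d) for d in clients)
    for _ in range(rounds):
        local = [local_train(w, d) for d in clients]
        w = sum(len(d) * lw for d, lw in zip(clients, local)) / total
    return w


def cyclic_weight_transfer(clients, rounds=20, w=0.0):
    """Cyclic weight transfer: the model visits clients one after another,
    each continuing training from the weight left by the previous client."""
    for _ in range(rounds):
        for d in clients:
            w = local_train(w, d)
    return w


# Hypothetical, noiseless data: every hospital follows y = 2 * x.
hospitals = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]
w_avg = fedavg(hospitals)
w_cwt = cyclic_weight_transfer(hospitals)
```

On this noiseless toy data both strategies recover a slope close to 2. The practical difference lies in communication pattern: federated averaging trains clients in parallel and needs a central aggregator, whereas cyclic weight transfer is sequential and passes a single model from site to site.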
Affiliation(s)
- Erfan Darzi
  - Harvard Medical School, Harvard University, 300 Longwood Avenue, Boston, United States
- Nanna M Sijtsema
  - Department of Radiotherapy, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, The Netherlands
  - Machine Learning Lab, Data Science Center in Health (DASH), University Medical Groningen, University of Groningen, Hanzeplein 1, Groningen, The Netherlands
- P M A van Ooijen
  - Department of Radiotherapy, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, The Netherlands
  - Machine Learning Lab, Data Science Center in Health (DASH), University Medical Groningen, University of Groningen, Hanzeplein 1, Groningen, The Netherlands
5
Sandhu SS, Gorji HT, Tavakolian P, Tavakolian K, Akhbardeh A. Medical Imaging Applications of Federated Learning. Diagnostics (Basel) 2023; 13:3140. [PMID: 37835883 PMCID: PMC10572559 DOI: 10.3390/diagnostics13193140] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/16/2023] [Revised: 10/03/2023] [Accepted: 10/03/2023] [Indexed: 10/15/2023]
Abstract
Since its introduction in 2016, researchers have applied the idea of Federated Learning (FL) to several domains ranging from edge computing to banking. The technique's inherent security benefits, privacy-preserving capabilities, ease of scalability, and ability to transcend data biases have motivated researchers to use this tool on healthcare datasets. While several reviews exist detailing FL and its applications, this review focuses solely on the different applications of FL to medical imaging datasets, grouping applications by diseases, modality, and/or part of the body. This systematic literature review was conducted by querying and consolidating results from arXiv, IEEE Xplore, and PubMed. Furthermore, we provide a detailed description of FL architecture, models, descriptions of the performance achieved by FL models, and how results compare with traditional Machine Learning (ML) models. Additionally, we discuss the security benefits, highlighting two primary forms of privacy-preserving techniques: homomorphic encryption and differential privacy. Finally, we provide some background information and context regarding where the contributions lie. The background information is organized into the following categories: architecture/setup type, data-related topics, security, and learning types. While progress has been made within the field of FL and medical imaging, much room for improvement and understanding remains, with security and data issues remaining the primary concerns for researchers. Therefore, improvements are constantly pushing the field forward. Finally, we highlight the challenges in deploying FL in medical imaging applications and provide recommendations for future directions.
Affiliation(s)
- Sukhveer Singh Sandhu
  - Biomedical Engineering Program, University of North Dakota, Grand Forks, ND 58202, USA
- Hamed Taheri Gorji
  - Biomedical Engineering Program, University of North Dakota, Grand Forks, ND 58202, USA
  - SafetySpect Inc., 4200 James Ray Dr., Grand Forks, ND 58202, USA
- Pantea Tavakolian
  - Biomedical Engineering Program, University of North Dakota, Grand Forks, ND 58202, USA
- Kouhyar Tavakolian
  - Biomedical Engineering Program, University of North Dakota, Grand Forks, ND 58202, USA
6
Gu X, Sabrina F, Fan Z, Sohail S. A Review of Privacy Enhancement Methods for Federated Learning in Healthcare Systems. International Journal of Environmental Research and Public Health 2023; 20:6539. [PMID: 37569079 PMCID: PMC10418741 DOI: 10.3390/ijerph20156539] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/26/2023] [Revised: 07/11/2023] [Accepted: 08/04/2023] [Indexed: 08/13/2023]
Abstract
Federated learning (FL) provides a distributed machine learning system that enables participants to train on local data to create a shared model, eliminating the requirement of data sharing. In healthcare systems, FL allows models to be trained locally on data from Medical Internet of Things (MIoT) devices and electronic health records (EHRs) without sending patients' data to the central server. This allows healthcare decisions and diagnoses based on datasets from all participants, as well as streamlining other healthcare processes. In terms of user data privacy, this technology allows collaborative training without the need to share the local data with the central server. However, there are privacy challenges in FL arising from the fact that the model updates are shared between the client and the server, and these updates can be used to regenerate the client's data, breaching the privacy requirements of applications in domains like healthcare. In this paper, we have conducted a review of the literature to analyse the existing privacy and security enhancement methods proposed for FL in healthcare systems. It has been identified that the research in the domain focuses on seven techniques: Differential Privacy, Homomorphic Encryption, Blockchain, Hierarchical Approaches, Peer to Peer Sharing, Intelligence on the Edge Device, and Mixed, Hybrid and Miscellaneous Approaches. The strengths, limitations, and trade-offs of each technique were discussed, and the possible future for these seven privacy enhancement techniques for healthcare FL systems was identified.
Affiliation(s)
- Xin Gu
  - School of Information Technology, King’s Own Institute, Sydney, NSW 2000, Australia
- Fariza Sabrina
  - School of Engineering and Technology, Central Queensland University, Sydney, NSW 2000, Australia
- Zongwen Fan
  - College of Computer Science and Technology, Huaqiao University, Xiamen 361021, China
- Shaleeza Sohail
  - College of Engineering, Science and Environment, The University of Newcastle, Callaghan, NSW 2308, Australia
7
Chen S, Jie Z, Wang G, Li KC, Yang J, Liu X. A new federated learning-based wireless communication and client scheduling solution for combating COVID-19. Computer Communications 2023; 206:101-109. [PMID: 37197298 PMCID: PMC10162846 DOI: 10.1016/j.comcom.2023.04.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Revised: 03/21/2023] [Accepted: 04/22/2023] [Indexed: 05/19/2023]
Abstract
Federated learning is a machine learning method that can break down data islands. Its inherent privacy-preserving property plays an important role in training medical image models. However, federated learning requires frequent communication, which incurs high communication costs. Moreover, the data is heterogeneous due to different users' preferences, which may degrade the performance of models. To address the problem of statistical heterogeneity, we propose FedUC, an algorithm to control the uploaded updates for federated learning, in which clients are scheduled on the basis of weight divergence, update increment, and loss. We also balance the local data of the clients by image augmentation to mitigate the impact of non-independent and identically distributed (non-IID) data. The server assigns compression thresholds to the clients based on the weight divergence and update increment of the models for gradient compression to reduce the wireless communication costs. Finally, based on the weight divergence, update increment and accuracy, the server dynamically assigns weights to the model parameters for the aggregation. We compare our approach with existing federated learning methods through simulation and analysis on a publicly available chest disease dataset that includes COVID-19. Experimental results show that our proposed strategy has better training performance in improving model accuracy and reducing wireless communication costs.
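The abstract says the server assigns compression thresholds to clients but does not give the compression rule. One simple rule, assumed here purely for illustration and not claimed to be FedUC's, is magnitude thresholding: a client uploads only those update entries whose absolute value reaches the threshold, and the server treats the rest as zero. The function names and example values below are hypothetical.

```python
def compress_update(update, threshold):
    """Keep only entries with |value| >= threshold, as a sparse {index: value} dict."""
    return {i: v for i, v in enumerate(update) if abs(v) >= threshold}


def decompress_update(sparse, dim):
    """Rebuild a dense vector of length dim, filling dropped entries with zeros."""
    return [sparse.get(i, 0.0) for i in range(dim)]


# Hypothetical client update and server-assigned threshold.
update = [0.02, -0.5, 0.0, 0.31, -0.04]
sparse = compress_update(update, threshold=0.1)
restored = decompress_update(sparse, len(update))
```

Here only two of the five entries are uploaded. A higher threshold lowers communication cost further but discards more of the update, which is the trade-off a per-client threshold is meant to tune.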
Affiliation(s)
- Shuhong Chen
  - School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, Guangdong Province, China
- Zhiyong Jie
  - School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, Guangdong Province, China
- Guojun Wang
  - School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, Guangdong Province, China
- Kuan-Ching Li
  - Department of Computer Science and Information Engineering, Providence University, Taizhong, Taiwan Province, China
- Jiawei Yang
  - School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, Guangdong Province, China
- Xulang Liu
  - School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, Guangdong Province, China
8
Nazir S, Kaleem M. Federated Learning for Medical Image Analysis with Deep Neural Networks. Diagnostics (Basel) 2023; 13:1532. [PMID: 37174925 PMCID: PMC10177193 DOI: 10.3390/diagnostics13091532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2023] [Revised: 04/14/2023] [Accepted: 04/21/2023] [Indexed: 05/15/2023]
Abstract
Medical image analysis using deep neural networks (DNN) has demonstrated state-of-the-art performance in image classification and segmentation tasks, aiding disease diagnosis. The accuracy of the DNN is largely governed by the quality and quantity of the data used to train the model. However, for medical images, the critical security and privacy concerns regarding sharing of local medical data across medical establishments preclude exploiting the full DNN potential for clinical diagnosis. The federated learning (FL) approach enables the use of local models' parameters to train a global model, while ensuring data privacy and security. In this paper, we review the federated learning applications in medical image analysis with DNNs, highlight the security concerns, cover some efforts to improve FL model performance, and describe the challenges and future research directions.
Affiliation(s)
- Sajid Nazir
  - Department of Computing, Glasgow Caledonian University, Glasgow G4 0BA, UK
- Mohammad Kaleem
  - Department of Electrical and Computer Engineering, COMSATS University Islamabad, Islamabad 45550, Pakistan
9
Mavrogiorgou A, Kiourtis A, Kleftakis S, Mavrogiorgos K, Zafeiropoulos N, Kyriazis D. A Catalogue of Machine Learning Algorithms for Healthcare Risk Predictions. Sensors (Basel, Switzerland) 2022; 22:8615. [PMID: 36433212 PMCID: PMC9695983 DOI: 10.3390/s22228615] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/01/2022] [Revised: 11/04/2022] [Accepted: 11/04/2022] [Indexed: 05/27/2023]
Abstract
Extracting useful knowledge from proper data analysis is a very challenging task for efficient and timely decision-making. To achieve this, a plethora of machine learning (ML) algorithms exists; in healthcare especially, the complexity increases due to the domain's requirements for analytics-based risk predictions. This manuscript proposes a data analysis mechanism, evaluated in diverse healthcare scenarios, towards constructing a catalogue of the most efficient ML algorithms to be used depending on the healthcare scenario's requirements and datasets, for efficiently predicting the onset of a disease. In this context, seven (7) different ML algorithms (Naïve Bayes, K-Nearest Neighbors, Decision Tree, Logistic Regression, Random Forest, Neural Networks, Stochastic Gradient Descent) have been executed on top of diverse healthcare scenarios (stroke, COVID-19, diabetes, breast cancer, kidney disease, heart failure). Based on a variety of performance metrics (accuracy, recall, precision, F1-score, specificity, confusion matrix), it has been identified that a subset of ML algorithms is more efficient for timely predictions under specific healthcare scenarios, which is why the envisioned ML catalogue prioritizes the ML algorithms to be used, depending on the scenarios' nature and needed metrics. Further evaluation must be performed considering additional scenarios, involving state-of-the-art techniques (e.g., cloud deployment, federated ML) for improving the mechanism's efficiency.
Affiliation(s)
- Argyro Mavrogiorgou
  - Department of Digital Systems, University of Piraeus, 185 34 Piraeus, Greece
10
A Comprehensive Analysis of Privacy-Preserving Solutions Developed for Online Social Networks. Electronics 2022. [DOI: 10.3390/electronics11131931] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/05/2023]
Abstract
Owing to the massive growth in internet connectivity, smartphone technology, and digital tools, the use of various online social networks (OSNs) has significantly increased. On the one hand, the use of OSNs enables people to share their experiences and information. On the other hand, this ever-growing use of OSNs enables adversaries to launch various privacy attacks to compromise users’ accounts as well as to steal other sensitive information via statistical matching. In general, a privacy attack is carried out by linking personal data available on the OSN site with social graphs (or statistics) published by the OSN service providers. Securing users' personal information to mitigate privacy attacks in OSN environments is a challenging research problem. Recently, many privacy-preserving solutions have been proposed to secure users’ data available over OSNs from prying eyes. However, a systematic overview of the research dynamics of OSN privacy, and findings of the latest privacy-preserving approaches from a broader perspective, remain unexplored in the current literature. Furthermore, the significance of artificial intelligence (AI) techniques in the OSN privacy area has not been highlighted by previous research. To cover this gap, we present a comprehensive analysis of the state-of-the-art solutions that have been proposed to address privacy issues in OSNs. Specifically, we classify the existing privacy-preserving solutions into two main categories: privacy-preserving graph publishing (PPGP) and privacy preservation in application-specific scenarios of the OSNs. Then, we introduce a high-level taxonomy that encompasses common as well as AI-based privacy-preserving approaches that have proposed ways to combat the privacy issues in PPGP. In line with these works, we discuss many state-of-the-art privacy-preserving solutions that have been proposed for application-specific scenarios (e.g., information diffusion, community clustering, influence analysis, friend recommendation, etc.) of OSNs. In addition, we discuss the latest de-anonymization methods (common and AI-based) that have been developed to infer either identity or sensitive information of OSN users from the published graph. Finally, some challenges of preserving the privacy of OSNs (i.e., social graph data) from malevolent adversaries are presented, and promising avenues for future research are suggested.