1
Bostanci E, Kocak E, Unal M, Guzel MS, Acici K, Asuroglu T. Machine Learning Analysis of RNA-seq Data for Diagnostic and Prognostic Prediction of Colon Cancer. Sensors (Basel) 2023; 23:3080. [PMID: 36991790 PMCID: PMC10052105 DOI: 10.3390/s23063080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Revised: 03/09/2023] [Accepted: 03/11/2023] [Indexed: 06/19/2023]
Abstract
Data from omics studies have been used for prediction and classification of various diseases in biomedical and bioinformatics research. In recent years, Machine Learning (ML) algorithms have been used in many fields related to healthcare systems, especially for disease prediction and classification tasks. Integrating molecular omics data with ML algorithms offers a great opportunity to evaluate clinical data. RNA sequencing (RNA-seq) analysis has emerged as the gold standard for transcriptomics analysis and is now widely used in clinical research. In the present work, RNA-seq data of extracellular vesicles (EV) from healthy individuals and colon cancer patients are analyzed. Our aim is to develop models for prediction and classification of colon cancer stages. Five canonical ML classifiers and three Deep Learning (DL) classifiers are used to predict colon cancer in an individual from processed RNA-seq data. Class labels are formed in two ways: by colon cancer stage and by cancer presence (healthy or cancer). The canonical ML classifiers, which are k-Nearest Neighbor (kNN), Logistic Model Tree (LMT), Random Tree (RT), Random Committee (RC), and Random Forest (RF), are tested with both forms of the data. In addition, to compare performance with the canonical ML models, One-Dimensional Convolutional Neural Network (1-D CNN), Long Short-Term Memory (LSTM), and Bidirectional LSTM (BiLSTM) DL models are utilized. The hyper-parameters of the DL models are optimized with a genetic algorithm (GA), a meta-heuristic optimization method. Among the canonical ML algorithms, RC, LMT, and RF give the best cancer prediction accuracy, 97.33%, while RT and kNN reach 95.33%. In cancer stage classification, RF achieves the best accuracy, 97.33%, followed by LMT, RC, kNN, and RT with 96.33%, 96%, 94.66%, and 94%, respectively.
Among the DL algorithms, 1-D CNN gives the best cancer prediction accuracy, 97.67%; BiLSTM and LSTM reach 94.33% and 93.67%, respectively. In cancer stage classification, BiLSTM achieves the best accuracy, 98%; 1-D CNN and LSTM reach 97% and 94.33%, respectively. The results show that either canonical ML or DL models may perform better, depending on the number of features.
Affiliation(s)
- Erkan Bostanci
  - Department of Computer Engineering, Faculty of Engineering, Ankara University, 06830 Ankara, Turkey
- Engin Kocak
  - Department of Analytical Chemistry, Faculty of Gülhane Pharmacy, University of Health Sciences, 06018 Ankara, Turkey
- Metehan Unal
  - Department of Computer Engineering, Faculty of Engineering, Ankara University, 06830 Ankara, Turkey
- Mehmet Serdar Guzel
  - Department of Computer Engineering, Faculty of Engineering, Ankara University, 06830 Ankara, Turkey
- Koray Acici
  - Department of Artificial Intelligence and Data Engineering, Faculty of Engineering, Ankara University, 06830 Ankara, Turkey
- Tunc Asuroglu
  - Faculty of Medicine and Health Technology, Tampere University, 33720 Tampere, Finland
2
Abstract
Federated learning (FL) refers to a system in which a central aggregator coordinates the efforts of several clients to solve machine learning problems. In this setting the training data remain dispersed across devices, which protects the privacy of each device. This paper provides an overview of federated learning systems, with a focus on healthcare. FL is reviewed in terms of its frameworks, architectures and applications. It is shown that FL addresses these privacy issues by training a shared global deep learning (DL) model via a central aggregator server. Motivated by the rapid growth of FL research, this paper examines recent developments and provides a comprehensive list of unresolved issues. Several privacy methods, including secure multiparty computation, homomorphic encryption, differential privacy and stochastic gradient descent, are described in the context of FL. Moreover, different classes of FL are reviewed, such as horizontal FL, vertical FL and federated transfer learning. FL has applications in wireless communication, service recommendation, intelligent medical diagnosis systems and healthcare, which we review in this paper. We also present a comprehensive review of existing FL challenges, for example privacy protection, communication cost, systems heterogeneity and unreliable model upload, followed by future research directions.
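To make the aggregator and client roles concrete, here is a minimal sketch of one common FL scheme, weighted federated averaging. The abstract does not name a specific aggregation rule, so the averaging rule, the linear least-squares model, the function names (`local_update`, `federated_average`, `run_rounds`) and the toy data are all illustrative assumptions rather than anything taken from the paper.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """Client step: a few gradient-descent passes on the local least-squares loss."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(data)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

def federated_average(client_weights, client_sizes):
    """Server step: average client models, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]

def run_rounds(clients, dim, rounds=100):
    """Alternate local training and aggregation; only weights reach the server."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = federated_average(updates, [len(d) for d in clients])
    return global_w

# Toy setup (hypothetical): three clients hold private samples of y = 2*x0 - 1*x1.
rng = random.Random(0)
true_w = [2.0, -1.0]
clients = []
for n in (30, 50, 20):
    xs = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    clients.append([(x, sum(t * xi for t, xi in zip(true_w, x))) for x in xs])

learned_w = run_rounds(clients, dim=2)
```

In this sketch the server only ever sees weight vectors, never the clients' `(x, y)` samples, which is the data-locality property the abstract describes. Sharing weights alone is not a privacy guarantee, which is why the review also covers methods such as differential privacy and homomorphic encryption.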
Affiliation(s)
- Subrato Bharati
  - Institute of Information and Communication Technology, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
- M. Rubaiyat Hossain Mondal
  - Institute of Information and Communication Technology, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
- Prajoy Podder
  - Institute of Information and Communication Technology, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
- V.B. Surya Prasath
  - Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, OH, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Department of Biomedical Informatics, College of Medicine, University of Cincinnati, OH, USA
  - Department of Electrical Engineering and Computer Science, University of Cincinnati, OH, USA
3
Mehrotra R, Agrawal R, Ansari MA. Diagnosis of hypercritical chronic pulmonary disorders using dense convolutional network through chest radiography. Multimed Tools Appl 2022; 81:7625-7649. [PMID: 35125924 PMCID: PMC8798313 DOI: 10.1007/s11042-021-11748-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/19/2021] [Revised: 08/30/2021] [Accepted: 11/22/2021] [Indexed: 06/14/2023]
Abstract
Lung-related ailments are prevalent all over the world; they chiefly include asthma, chronic obstructive pulmonary disease (COPD), tuberculosis, pneumonia, and fibrosis, and COVID-19 has now been added to this list. COVID-19 infection causes respiratory complications along with other symptoms such as cough, high fever, and pneumonia. The WHO has identified lung cancer as a fatal cancer type, among others, so timely detection of such cancer is pivotal for an individual's health. Because elementary convolutional neural networks have not performed well in identifying atypical image types, we propose a novel, completely automated framework with a deep learning approach for the recognition and classification of chronic pulmonary disorders (CPD) and COVID-pneumonia using thoracic or chest X-ray (CXR) images. The three-step approach first extracts the region of interest from CXR images for preprocessing; the preprocessed images are then used to separate infected lung X-rays from Normal ones. Thereafter, the infected lung images are further classified into COVID-pneumonia, pneumonia, and other chronic pulmonary disorders (OCPD). Finally, the regions of the CXR that are indicative of severe chronic pulmonary disorders such as COVID-19 and pneumonia are highlighted. A detailed investigation of various pivotal parameters, based on several experimental outcomes, is made here. The presented approach distinguishes Normal lung X-rays from infected ones and further classifies the infected images into COVID-pneumonia, pneumonia, and OCPD, with a maximum accuracy of 96.8%. Several other collective performance measurements validate the superiority of the presented model.
The proposed framework shows effective results in classifying lung images into Normal, COVID-pneumonia, pneumonia, and other chronic pulmonary disorders (OCPD). This framework can be effectively utilized in the current pandemic scenario to help radiologists substantiate their diagnoses and start timely treatment of these deadly lung diseases.
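The control flow of the screening and subtyping stages can be sketched as a simple cascade. This is only an illustration of the ordering described in the abstract: the function `classify_cxr`, its callable parameters, and the dictionary-based demo inputs are hypothetical stand-ins for the trained dense convolutional models, and the final region-highlighting step is omitted.

```python
# Labels named in the abstract for the second-stage classifier.
DISEASE_LABELS = ("COVID-pneumonia", "pneumonia", "OCPD")

def classify_cxr(image, extract_roi, is_infected, disease_scores):
    """Cascade: crop the lung region, screen Normal vs infected, then subtype."""
    roi = extract_roi(image)
    if not is_infected(roi):
        return "Normal"
    scores = disease_scores(roi)  # one score per entry in DISEASE_LABELS
    best = max(range(len(DISEASE_LABELS)), key=lambda i: scores[i])
    return DISEASE_LABELS[best]

# Hypothetical stand-ins for the trained models, only to show the control flow.
def demo_roi(image):
    return image["lungs"]

def demo_screen(roi):
    return roi["opacity"] > 0.5

def demo_scores(roi):
    return roi["scores"]

healthy = {"lungs": {"opacity": 0.1, "scores": [0.0, 0.0, 0.0]}}
sick = {"lungs": {"opacity": 0.9, "scores": [0.2, 0.7, 0.1]}}

result_healthy = classify_cxr(healthy, demo_roi, demo_screen, demo_scores)
result_sick = classify_cxr(sick, demo_roi, demo_screen, demo_scores)
```

One consequence of this ordering is that the subtype classifier only ever sees images already flagged as infected, so it does not need a Normal class of its own.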
Affiliation(s)
- Rajat Mehrotra
  - Department of Electrical & Electronics Engineering, GL Bajaj Institute of Technology & Management, Gr. Noida, India
- Rajeev Agrawal
  - Department of Electronics & Communication Engineering, GL Bajaj Institute of Technology & Management, Gr. Noida, India
- M. A. Ansari
  - Department of Electrical Engineering, School of Engineering, Gautam Buddha University, Gr. Noida, India
4
Khamparia A, Bharati S, Podder P, Gupta D, Khanna A, Phung TK, Thanh DNH. Diagnosis of breast cancer based on modern mammography using hybrid transfer learning. Multidimens Syst Signal Process 2021; 32:747-765. [PMID: 33456204 PMCID: PMC7798373 DOI: 10.1007/s11045-020-00756-7] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 05/24/2020] [Revised: 12/09/2020] [Accepted: 12/19/2020] [Indexed: 02/06/2023]
Abstract
Breast cancer is a common cancer in women. Early detection of breast cancer in particular, and of cancer in general, can considerably increase the survival rate of women, and it can be much more effective. This paper focuses on the transfer learning process to detect breast cancer. A Modified VGG (MVGG) is proposed and implemented on datasets of 2D and 3D mammogram images. Experimental results show that the proposed hybrid transfer learning model (a fusion of MVGG and ImageNet) provides an accuracy of 94.3%. By contrast, the proposed MVGG architecture alone provides an accuracy of 89.8%. Thus, the proposed hybrid pre-trained network outperforms the other Convolutional Neural Networks it was compared with. The proposed architecture can be considered an effective tool for radiologists to decrease the false negative and false positive rates, and thereby improve the efficiency of mammography analysis.
Affiliation(s)
- Aditya Khamparia
  - School of Computer Science and Engineering, Lovely Professional University, Phagwara, Punjab, India
- Subrato Bharati
  - Institute of Information and Communication Technology, Bangladesh University of Engineering and Technology, Dhaka, 1205 Bangladesh
- Prajoy Podder
  - Institute of Information and Communication Technology, Bangladesh University of Engineering and Technology, Dhaka, 1205 Bangladesh
- Deepak Gupta
  - Maharaja Agrasen Institute of Technology, Delhi, India
- Ashish Khanna
  - Maharaja Agrasen Institute of Technology, Delhi, India
- Thai Kim Phung
  - School of Business Information Technology, University of Economics Ho Chi Minh City, Ho Chi Minh City, Vietnam
- Dang N. H. Thanh
  - School of Business Information Technology, University of Economics Ho Chi Minh City, Ho Chi Minh City, Vietnam
5
Bharati S, Podder P, Mondal MRH. Hybrid deep learning for detecting lung diseases from X-ray images. Inform Med Unlocked 2020; 20:100391. [PMID: 32835077 PMCID: PMC7341954 DOI: 10.1016/j.imu.2020.100391] [Citation(s) in RCA: 71] [Impact Index Per Article: 17.8] [Received: 01/20/2020] [Revised: 06/29/2020] [Accepted: 06/30/2020] [Indexed: 02/08/2023]
Abstract
Lung diseases are common throughout the world. They include chronic obstructive pulmonary disease, pneumonia, asthma, tuberculosis, and fibrosis. Timely diagnosis of lung disease is essential, and many image processing and machine learning models have been developed for this purpose. Different forms of existing deep learning techniques, including convolutional neural network (CNN), vanilla neural network, visual geometry group based neural network (VGG), and capsule network, are applied for lung disease prediction. The basic CNN performs poorly on rotated, tilted, or otherwise abnormally oriented images. Therefore, we propose a new hybrid deep learning framework that combines VGG, data augmentation and a spatial transformer network (STN) with CNN. This new hybrid method is termed VGG Data STN with CNN (VDSNet). Jupyter Notebook, Tensorflow, and Keras are used as implementation tools. The new model is applied to the NIH chest X-ray image dataset collected from the Kaggle repository. Full and sample versions of the dataset are considered. For both full and sample datasets, VDSNet outperforms existing methods in terms of a number of metrics, including precision, recall, F0.5 score and validation accuracy. For the full dataset, VDSNet exhibits a validation accuracy of 73%, while vanilla gray, vanilla RGB, hybrid CNN and VGG, and modified capsule network have accuracy values of 67.8%, 69%, 69.5% and 63.8%, respectively. When the sample dataset is used rather than the full dataset, VDSNet requires much less training time at the expense of a slightly lower validation accuracy. Hence, the proposed VDSNet framework will simplify the detection of lung disease for experts as well as for doctors.
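For readers unfamiliar with the F0.5 score mentioned among the metrics, the sketch below computes the general F-beta score, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), where P is precision and R is recall. With beta = 0.5 precision is weighted more heavily than recall. The helper names and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def f_beta(precision, recall, beta):
    """F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical counts for illustration only (not results from the paper).
p, r = precision_recall(tp=80, fp=20, fn=40)
f_half = f_beta(p, r, beta=0.5)  # leans toward precision
f_one = f_beta(p, r, beta=1.0)   # balanced F1 for comparison
```

Because precision exceeds recall in this toy example, the F0.5 value comes out higher than F1, which shows how the choice of beta shifts the emphasis between the two error types.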
Affiliation(s)
- Subrato Bharati
  - Institute of Information and Communication Technology, Bangladesh University of Engineering and Technology, Dhaka, 1205, Bangladesh
- Prajoy Podder
  - Institute of Information and Communication Technology, Bangladesh University of Engineering and Technology, Dhaka, 1205, Bangladesh
- M Rubaiyat Hossain Mondal
  - Institute of Information and Communication Technology, Bangladesh University of Engineering and Technology, Dhaka, 1205, Bangladesh