1
Lin L, Liu Y, Gao M, Rezaeipanah A. Improving hepatocellular carcinoma diagnosis using an ensemble classification approach based on Harris Hawks Optimization. Heliyon 2024; 10:e23497. [PMID: 38169861 PMCID: PMC10758797 DOI: 10.1016/j.heliyon.2023.e23497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Revised: 09/20/2023] [Accepted: 12/05/2023] [Indexed: 01/05/2024]
Abstract
Hepato-Cellular Carcinoma (HCC) is the most common type of liver cancer that often occurs in people with chronic liver diseases such as cirrhosis. Although HCC is known as a fatal disease, early detection can lead to successful treatment and improve survival chances. In recent years, the development of computer recognition systems using machine learning approaches has been emphasized by researchers. The effective performance of these approaches for the diagnosis of HCC has been proven in a wide range of applications. With this motivation, this paper proposes a hybrid machine learning approach including effective feature selection and ensemble classification for HCC detection, which is developed based on the Harris Hawks Optimization (HHO) algorithm. The proposed ensemble classifier is based on the bagging technique and is configured based on the decision tree method. Meanwhile, HHO as an emerging meta-heuristic algorithm can select a subset of the most suitable features related to HCC for classification. In addition, the proposed method is equipped with several strategies for handling missing values and data normalization. The simulations are based on the HCC dataset collected by the Coimbra Hospital and University Center (CHUC). The results of the experiments prove the acceptable performance of the proposed method. Specifically, the proposed method with an accuracy of 97.13 % is superior in comparison with the equivalent methods such as LASSO and DTPSO.
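As a rough illustration of the bagging idea mentioned in the abstract above, the following minimal sketch trains several tree-like base learners on bootstrap resamples and combines them by majority vote. It is not the authors' implementation: the base learner is simplified to a depth-1 decision stump, the HHO feature selection step is omitted, and the function names (`fit_stump`, `fit_bagging`, `predict_bagging`) and the toy data are hypothetical.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Fit a depth-1 decision tree: the (feature, threshold) split with the fewest training errors."""
    overall = Counter(y).most_common(1)[0][0]  # fallback label for an empty branch
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            left_label = Counter(left).most_common(1)[0][0] if left else overall
            right_label = Counter(right).most_common(1)[0][0] if right else overall
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, left_label, right_label)
    _, j, t, left_label, right_label = best
    return lambda row: left_label if row[j] <= t else right_label

def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement) of the data."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagging(models, row):
    """Combine the base learners by majority vote."""
    return Counter(m(row) for m in models).most_common(1)[0][0]

# Toy data (hypothetical): the first feature separates the two classes.
toy_X = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [8.0, 5.0], [9.0, 3.0], [10.0, 4.0]]
toy_y = [0, 0, 0, 1, 1, 1]
models = fit_bagging(toy_X, toy_y)
```

Averaging votes over resampled training sets is what reduces the variance of a single tree; a fuller version would grow deeper trees and restrict them to the selected feature subset.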
Affiliation(s)
- LiuRen Lin
  - Department of Pharmacy and Machinery, Qujing Second People's Hospital, Yunnan, Qujing, 655000, China
- YunKuan Liu
  - Yunnan University of Chinese Medicine, Yunnan Key Laboratory of External Drug Delivery System and Preparation Technology in Universities, Yunnan, Kunming, 650500, China
- Min Gao
  - Faculty of Life Science and Technology, Kunming University of Science and Technology, Yunnan, Kunming, 650500, China
- Amin Rezaeipanah
  - Department of Computer Engineering, Persian Gulf University, Bushehr, Iran
2
Shi X, Yue C, Quan M, Li Y, Nashwan Sam H. A semi-supervised ensemble clustering algorithm for discovering relationships between different diseases by extracting cell-to-cell biological communications. J Cancer Res Clin Oncol 2024; 150:3. [PMID: 38168012 DOI: 10.1007/s00432-023-05559-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 11/01/2023] [Indexed: 01/05/2024]
Abstract
INTRODUCTION In recent decades, many theories have been proposed about the cause of hereditary diseases such as cancer. However, most studies state genetic and environmental factors as the most important parameters. It has been shown that gene expression data are valuable information about hereditary diseases and their analysis can identify the relationships between these diseases. OBJECTIVE Identification of damaged genes from various diseases can be done through the discovery of cell-to-cell biological communications. Also, extraction of intercellular communications can identify relationships between different diseases. For example, gene disorders that cause damage to the same cells in both breast and blood cancers. Hence, the purpose is to discover cell-to-cell biological communications in gene expression data. METHODOLOGY The identification of cell-to-cell biological communications for various cancer diseases has been widely performed by clustering algorithms. However, this field remains open due to the abundance of unprocessed gene expression data. Accordingly, this paper focuses on the development of a semi-supervised ensemble clustering algorithm that can discover relationships between different diseases through the extraction of cell-to-cell biological communications. The proposed clustering framework includes a stratified feature sampling mechanism and a novel similarity metric to deal with high-dimensional data and improve the diversity of primary partitions. RESULTS The performance of the proposed clustering algorithm is verified with several datasets from the UCI machine learning repository and then applied to the FANTOM5 dataset to extract cell-to-cell biological communications. The used version of this dataset contains 108 cells and 86,427 promoters from 702 samples. The strength of communication between two similar cells from different diseases indicates the relationship of those diseases. Here, the strength of communication is determined by promoter, so we found the highest cell-to-cell biological communication between "basophils" and "ciliary.epithelial.cells" with 62,809 promoters. CONCLUSION The maximum cell-to-cell biological similarity in each cluster can be used to detect the relationship between different diseases such as cancer.
Affiliation(s)
- Xiuchao Shi
  - College of Environment and Life Sciences, Weinan Normal University, Weinan, 714099, Shaanxi, China
- Chunxiao Yue
  - Weinan Junior Middle School, Weinan, 714000, Shaanxi, China
- Meiping Quan
  - College of Environment and Life Sciences, Weinan Normal University, Weinan, 714099, Shaanxi, China
- Yalin Li
  - College of Environment and Life Sciences, Weinan Normal University, Weinan, 714099, Shaanxi, China
- Hiba Nashwan Sam
  - Department of Radiology and Sonar Techniques, Al-Noor University College, Nineveh, Iraq
3
Camargo-Marín L, Guzmán-Huerta M, Piña-Ramirez O, Perez-Gonzalez J. Multimodal Early Birth Weight Prediction Using Multiple Kernel Learning. Sensors (Basel, Switzerland) 2023; 24:2. [PMID: 38202864 PMCID: PMC10780741 DOI: 10.3390/s24010002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 12/08/2023] [Accepted: 12/14/2023] [Indexed: 01/12/2024]
Abstract
In this work, a novel multimodal learning approach for early prediction of birth weight is presented. Fetal weight is one of the most relevant indicators in the assessment of fetal health status. The aim is to predict early birth weight using multimodal maternal-fetal variables from the first trimester of gestation (Anthropometric data, as well as metrics obtained from Fetal Biometry, Doppler and Maternal Ultrasound). The proposed methodology starts with the optimal selection of a subset of multimodal features using an ensemble-based approach of feature selectors. Subsequently, the selected variables feed the nonparametric Multiple Kernel Learning regression algorithm. At this stage, a set of kernels is selected and weighted to maximize performance in birth weight prediction. The proposed methodology is validated and compared with other computational learning algorithms reported in the state of the art. The obtained results (absolute error of 234 g) suggest that the proposed methodology can be useful as a tool for the early evaluation and monitoring of fetal health status through indicators such as birth weight.
Affiliation(s)
- Lisbeth Camargo-Marín
  - Departamento de Medicina Traslacional, Instituto Nacional de Perinatología Isidro Espinosa de los Reyes, Montes Urales 800, Lomas de Virreyes, Miguel Hidalgo, Mexico City 11000, Mexico
- Mario Guzmán-Huerta
  - Departamento de Medicina Traslacional, Instituto Nacional de Perinatología Isidro Espinosa de los Reyes, Montes Urales 800, Lomas de Virreyes, Miguel Hidalgo, Mexico City 11000, Mexico
- Omar Piña-Ramirez
  - Departamento de Bioinformática y Análisis Estadístico, Instituto Nacional de Perinatología Isidro Espinosa de los Reyes, Montes Urales 800, Lomas de Virreyes, Miguel Hidalgo, Mexico City 11000, Mexico
- Jorge Perez-Gonzalez
  - Unidad Académica del Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Km 4.5 Carretera Mérida-Tetiz, Municipio de Ucú, Yucatán 97357, Mexico
4
Wang D. Toward improving the performance of learning by joining feature selection and ensemble classification techniques: an application for cancer diagnosis. J Cancer Res Clin Oncol 2023; 149:16993-17006. [PMID: 37740767 DOI: 10.1007/s00432-023-05422-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Accepted: 09/12/2023] [Indexed: 09/25/2023]
Abstract
INTRODUCTION Breast cancer is known as the most common type of cancer in women, and this has raised the importance of its diagnosis in medical science as one of the most important issues. In addition to reducing costs, the diagnosis of benign or malignant breast cancer is very important in determining the treatment method. OBJECTIVE The purpose of this paper is to present a model based on data mining techniques including feature selection and ensemble classification that can accurately predict breast cancer patients in the early stages. METHODOLOGY The proposed breast cancer detection model is developed by joining Adaptive Differential Evolution (ADE) algorithm for feature selection and Learning Vector Quantization (LVQ) neural network for classification. Our proposed model as ADE-LVQ has the ability to automatically and quickly diagnose breast cancer patients into two classes, benign and malignant. As a new evolutionary approach, ADE performs optimal configuration for LVQ neural network in addition to selecting effective features from breast cancer data. Meanwhile, we configure an ensemble classification technique based on LVQ, which significantly improves the prediction performance. RESULTS ADE-LVQ has been analyzed from different perspectives on different datasets from Wisconsin breast cancer database. We apply different approaches to handle missing values and improve data quality on this database. The results of the simulations showed that the ADE-LVQ model is more successful than the equivalent and state-of-the-art models in diagnosing breast cancer patients. Also, ADE-LVQ provides better performance with less complexity, considering feature selection and ensemble learning. In particular, ADE-LVQ improves accuracy (up to 3.4%) and runtime (up to 2.3%) on average compared to the existing best method. CONCLUSION Combined methods based on data mining techniques for breast cancer diagnosis can help doctors in making better decisions for disease treatment.
Affiliation(s)
- Dan Wang
  - Zaozhuang Hospital of Traditional Chinese Medicine, Zaozhuang, 277000, Shandong, China
5
Yang J, Hussein Kadir D. Data mining techniques in breast cancer diagnosis at the cellular-molecular level. J Cancer Res Clin Oncol 2023; 149:12605-12620. [PMID: 37442866 DOI: 10.1007/s00432-023-05090-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2023] [Accepted: 06/30/2023] [Indexed: 07/15/2023]
Abstract
INTRODUCTION Studies in the field of better diagnosis of breast cancer using machine learning and data mining techniques have always been promising. A new diagnostic method can detect the characteristics of breast cancer in the early stages and help in better treatment. The aim of this study is to provide a method for early detection of breast cancer by reducing human errors based on data mining techniques in medicine using accurate and rapid screening. METHODOLOGY The proposed method includes data pre-processing and image quality improvement in the first step. The second step consists of separating cancer cells from healthy breast tissue and removing outliers using image segmentation. Finally, a classification model is configured by combining deep neural networks in the third phase. The proposed ensemble classification model uses several effective features extracted from images and is based on majority vote. This model can be used as a screening system to diagnose the grade of invasive ductal carcinoma of the breast. RESULTS Evaluations have been done using two histopathological microscopic datasets including patients with invasive ductal carcinoma of the breast. With extracting high-level features with average accuracies of 92.65% and 93.34% in these two datasets, the proposed method has succeeded in quickly diagnosing and classifying breast cancer with high performance. CONCLUSION By combining deep neural networks and extracting features affecting breast cancer, the ability to diagnose with the highest accuracy is provided, and this is a step toward helping specialists and increasing the chances of patients' survival.
Affiliation(s)
- Jian Yang
  - General Office of China Science and Technology Development Center for Chinese Medicine, Chaoyang District, Beijing, 100020, China
- Dler Hussein Kadir
  - Department of Statistics and Informatics, College of Administration and Economics, Salahaddin University, Erbil, Iraq
  - Department of Business Administration, Cihan University-Erbil, Erbil, Iraq
6
Zheng D, Tang P, Lu D, Han L, Saberi S. A structured combination of ensemble classifier and filter-based feature selection to improve breast cancer diagnosis. J Cancer Res Clin Oncol 2023; 149:14519-14534. [PMID: 37567985 DOI: 10.1007/s00432-023-05238-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 07/31/2023] [Indexed: 08/13/2023]
Abstract
INTRODUCTION Advances in technology have led to the emergence of computerized diagnostic systems as intelligent medical assistants. Machine learning approaches cannot replace professional humans, but they can change the treatment of diseases such as cancer and be used as medical assistants. BACKGROUND Breast cancer treatment can be very effective, especially when the disease is detected in the early stages. Feature selection and classification are common data mining techniques in machine learning that can provide breast cancer diagnosis with high speed, low cost and high precision. METHODOLOGY This paper proposes a new intelligent approach using an integrated filter-evolutionary search-based feature selection and an optimized ensemble classifier for breast cancer diagnosis. The selected features mainly relate to the viable solution as the selected features are successfully used in the breast cancer disease classification process. The proposed feature selection method selects the most informative features from the original feature set by integrating adaptive thresholder information gain-based feature selection and evolutionary gravity-search-based feature selection. Meanwhile, classification model is done by proposing a new intelligent multi-layer perceptron neural network-based ensemble classifier. RESULTS The simulation results show that the proposed method provides better performance compared to the state-of-the-art algorithms in terms of various criteria such as accuracy, sensitivity and specificity. Specifically, the proposed method achieves an average accuracy of 99.42% on WBCD, WDBC and WPBC datasets from Wisconsin database with only 56.7% of features. CONCLUSION Systems based on intelligent medical assistants configured with machine learning approaches are an important step toward helping doctors to detect breast cancer early.
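To make the filter step in the abstract above more concrete, here is a minimal sketch of information-gain-based feature selection. It rests on several assumptions not stated in the abstract: features are treated as discrete, the cut-off is a fixed `threshold` rather than the adaptive one the authors describe, the evolutionary gravity-search stage is left out, and the function names and toy columns are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def select_features(columns, labels, threshold):
    """Keep the names of features whose information gain reaches the threshold."""
    return [name for name, values in columns.items()
            if information_gain(values, labels) >= threshold]

# Toy data (hypothetical): "a" determines the label, "b" carries no information.
toy_labels = [0, 0, 1, 1]
toy_columns = {"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]}
selected = select_features(toy_columns, toy_labels, threshold=0.5)
```

A filter like this scores each feature independently of any classifier, which is why it is cheap; continuous features would first need to be binned or scored with a different criterion.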
Affiliation(s)
- Dengru Zheng
  - Cancer Center, Foshan Fuxing Chancheng Hospital, Foshan, 528000, Guangdong, China
- Ping Tang
  - Cancer Center, Foshan Fuxing Chancheng Hospital, Foshan, 528000, Guangdong, China
- Danping Lu
  - Cancer Center, Foshan Fuxing Chancheng Hospital, Foshan, 528000, Guangdong, China
- Liangfu Han
  - Cancer Center, Foshan Fuxing Chancheng Hospital, Foshan, 528000, Guangdong, China
- Sajjad Saberi
  - Department of Computer Science, Khayyam University, Mashhad, Iran
7
Tuerhong A, Silamujiang M, Xianmuxiding Y, Wu L, Mojarad M. An ensemble classifier method based on teaching-learning-based optimization for breast cancer diagnosis. J Cancer Res Clin Oncol 2023:10.1007/s00432-023-04861-5. [PMID: 37202580 DOI: 10.1007/s00432-023-04861-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 05/13/2023] [Indexed: 05/20/2023]
Abstract
INTRODUCTION Epidemiological studies show that breast cancer is the most common cancer in women in the world. Breast cancer treatment can be very effective, especially when the disease is detected in the early stages. The goal can be achieved by using large-scale breast cancer data with machine learning models. METHODS This paper proposes a new intelligent approach using an optimized ensemble classifier for breast cancer diagnosis. The classification is done by proposing a new intelligent Group Method of Data Handling (GMDH) neural network-based ensemble classifier. This method improves the performance of the machine learning technique by using a Teaching-Learning-Based Optimization (TLBO) algorithm to optimize the hyperparameters of the classifier. Meanwhile, we use TLBO as an evolutionary method to address the problem of appropriate feature selection in breast cancer data. RESULTS The simulation results show that the proposed method has a better accuracy between 7 and 26% compared to the best results of the existing equivalent algorithms. CONCLUSION According to the obtained results, we suggest the proposed algorithm as an intelligent medical assistant system for breast cancer diagnosis.
Affiliation(s)
- Adila Tuerhong
  - Department of Cardio-Oncology, Affiliated Tumor Hospital of Xinjiang Medical University, Urumqi, 830011, Xinjiang, China
- Mutalipu Silamujiang
  - Department of Traumatic Orthopedic, The Sixth Affiliated Hospital of Xinjiang Medical University, Urumqi, 830002, Xinjiang, China
- Yilixiati Xianmuxiding
  - Department of Emergency, Affiliated Tumor Hospital of Xinjiang Medical University, Urumqi, 830011, Xinjiang, China
- Li Wu
  - Department of Cardio-Oncology, Affiliated Tumor Hospital of Xinjiang Medical University, Urumqi, 830011, Xinjiang, China
- Musa Mojarad
  - Department of Computer Engineering, Firoozabad Branch, Islamic Azad University, Firoozabad, Iran