1
Oukdach Y, Garbaz A, Kerkaou Z, Ansari ME, Koutti L, Ouafdi AFE, Salihoun M. InCoLoTransNet: An Involution-Convolution and Locality Attention-Aware Transformer for Precise Colorectal Polyp Segmentation in GI Images. Journal of Imaging Informatics in Medicine 2025:10.1007/s10278-025-01389-7. [PMID: 39825142 DOI: 10.1007/s10278-025-01389-7] [Received: 08/12/2024] [Revised: 12/18/2024] [Accepted: 12/19/2024] [Indexed: 01/20/2025]
Abstract
Gastrointestinal (GI) disease examination presents significant challenges to doctors due to the intricate structure of the human digestive system. Colonoscopy and wireless capsule endoscopy are the most commonly used tools for GI examination. However, the large amount of data generated by these technologies requires the expertise and intervention of doctors for disease identification, making manual analysis a very time-consuming task. Thus, the development of a computer-assisted system is highly desirable to assist clinical professionals in making decisions in a low-cost and effective way. In this paper, we introduce a novel framework called InCoLoTransNet, designed for polyp segmentation. The study is based on a transformer and convolution-involution neural network, following the encoder-decoder architecture. We employed the vision transformer in the encoder section to focus on the global context, while the decoder involves a convolution-involution collaboration for resampling the polyp features. Involution enhances the model's ability to adaptively capture spatial and contextual information, while convolution focuses on local information, leading to more accurate feature extraction. The essential features captured by the transformer encoder are passed to the decoder through two skip connection pathways. The CBAM module refines the features and passes them to the convolution block, leveraging attention mechanisms to emphasize relevant information. Meanwhile, locality self-attention is employed to pass essential features to the involution block, reinforcing the model's ability to capture more global features in the polyp regions. Experiments were conducted on five public datasets: CVC-ClinicDB, CVC-ColonDB, Kvasir-SEG, Etis-LaribPolypDB, and CVC-300. 
InCoLoTransNet outperforms 15 state-of-the-art polyp segmentation methods, achieving the highest mean Dice score of 93% and a mean intersection over union of 90% on CVC-ColonDB. It also generalizes well: on unseen datasets it achieved mean Dice and mean intersection over union scores of 85% and 79% on CVC-ColonDB, 91% and 87% on CVC-300, and 79% and 70% on Etis-LaribPolypDB, respectively.
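The Dice coefficient and intersection over union (IoU) reported in this abstract are standard overlap measures between a predicted mask and a ground-truth mask. The following is a minimal sketch of how they are commonly computed for one image, not the authors' evaluation code. The function name `dice_and_iou`, the nested-list mask format, and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not taken from the cited paper.
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked as polyp in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    pred_area = sum(1 for p in flat_pred if p)
    target_area = sum(1 for t in flat_target if t)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy 2x3 masks (made-up values).
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

A "mean" Dice or IoU is typically the average of these per-image values over a test set.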
Affiliation(s)
- Yassine Oukdach
- LabSIV, Department of Computer Science, Faculty of Sciences, Ibnou Zohr University, Agadir, 80000, Morocco.
- Anass Garbaz
- LabSIV, Department of Computer Science, Faculty of Sciences, Ibnou Zohr University, Agadir, 80000, Morocco
- Zakaria Kerkaou
- LabSIV, Department of Computer Science, Faculty of Sciences, Ibnou Zohr University, Agadir, 80000, Morocco
- Mohamed El Ansari
- Informatics and Applications Laboratory, Department of Computer Sciences, Faculty of Science, Moulay Ismail University, B.P 11201, Meknès, 52000, Morocco
- Lahcen Koutti
- LabSIV, Department of Computer Science, Faculty of Sciences, Ibnou Zohr University, Agadir, 80000, Morocco
- Ahmed Fouad El Ouafdi
- LabSIV, Department of Computer Science, Faculty of Sciences, Ibnou Zohr University, Agadir, 80000, Morocco
- Mouna Salihoun
- Faculty of Medicine and Pharmacy of Rabat, Mohammed V University of Rabat, Rabat, 10000, Morocco
2
Mohamed AAA, Hançerlioğullari A, Rahebi J, Rezaeizadeh R, Lopez-Guede JM. Colon Cancer Disease Diagnosis Based on Convolutional Neural Network and Fishier Mantis Optimizer. Diagnostics (Basel) 2024; 14:1417. [PMID: 39001307 PMCID: PMC11241213 DOI: 10.3390/diagnostics14131417] [Received: 05/28/2024] [Revised: 06/25/2024] [Accepted: 06/27/2024] [Indexed: 07/16/2024] Open
Abstract
Colon cancer is a prevalent and potentially fatal disease that demands early and accurate diagnosis for effective treatment. Traditional diagnostic approaches for colon cancer often face limitations in accuracy and efficiency, leading to challenges in early detection and treatment. In response to these challenges, this paper introduces an innovative method that leverages artificial intelligence, specifically convolutional neural network (CNN) and Fishier Mantis Optimizer, for the automated detection of colon cancer. The utilization of deep learning techniques, specifically CNN, enables the extraction of intricate features from medical imaging data, providing a robust and efficient diagnostic model. Additionally, the Fishier Mantis Optimizer, a bio-inspired optimization algorithm inspired by the hunting behavior of the mantis shrimp, is employed to fine-tune the parameters of the CNN, enhancing its convergence speed and performance. This hybrid approach aims to address the limitations of traditional diagnostic methods by leveraging the strengths of both deep learning and nature-inspired optimization to enhance the accuracy and effectiveness of colon cancer diagnosis. The proposed method was evaluated on a comprehensive dataset comprising colon cancer images, and the results demonstrate its superiority over traditional diagnostic approaches. The CNN-Fishier Mantis Optimizer model exhibited high sensitivity, specificity, and overall accuracy in distinguishing between cancer and non-cancer colon tissues. The integration of bio-inspired optimization algorithms with deep learning techniques not only contributes to the advancement of computer-aided diagnostic tools for colon cancer but also holds promise for enhancing the early detection and diagnosis of this disease, thereby facilitating timely intervention and improved patient prognosis. Various CNN designs, such as GoogLeNet and ResNet-50, were employed to capture features associated with colon diseases. 
However, the abundance of features introduced inaccuracies in both feature extraction and data classification. To address this issue, feature reduction was implemented using the Fishier Mantis Optimizer, which outperformed alternative methods such as genetic algorithms and simulated annealing. The evaluation yielded a sensitivity of 94.87%, a specificity of 96.19%, an accuracy of 97.65%, and an F1-score of 96.76%.
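Sensitivity, specificity, accuracy, and F1-score all derive from the four counts of a binary confusion matrix. The sketch below shows the usual definitions. The function name `binary_metrics` and the example counts are hypothetical and are not the data behind the figures reported above.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion-matrix counts.

    tp/fp/tn/fn: true positives, false positives, true negatives, false negatives.
    Hypothetical helper for illustration; not taken from the cited paper.
    """
    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is zero.
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)   # recall on the cancer class
    specificity = safe_div(tn, tn + fp)   # recall on the non-cancer class
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    precision = safe_div(tp, tp + fp)
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Made-up counts for a 200-image test set.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```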
Affiliation(s)
- Amna Ali A Mohamed
- Department of Material Science and Engineering, University of Kastamonu, Kastamonu 37150, Turkey
- Javad Rahebi
- Department of Software Engineering, Istanbul Topkapi University, Istanbul 34087, Turkey
- Rezvan Rezaeizadeh
- Department of Physics, Faculty of Science, University of Guilan, Rasht P.O. Box 41335-1914, Guilan, Iran
- Jose Manuel Lopez-Guede
- Department of Systems and Automatic Control, Faculty of Engineering of Vitoria-Gasteiz, University of the Basque Country (UPV/EHU), C/Nieves Cano 12, 01006 Vitoria-Gasteiz, Spain
3
Xu C, Fan K, Mo W, Cao X, Jiao K. Dual ensemble system for polyp segmentation with submodels adaptive selection ensemble. Sci Rep 2024; 14:6152. [PMID: 38485963 PMCID: PMC10940608 DOI: 10.1038/s41598-024-56264-2] [Received: 08/25/2023] [Accepted: 03/04/2024] [Indexed: 03/18/2024] Open
Abstract
Colonoscopy is one of the main methods to detect colon polyps, and its detection is widely used to prevent and diagnose colon cancer. With the rapid development of computer vision, deep learning-based semantic segmentation methods for colon polyps have been widely researched. However, the accuracy and stability of some methods in colon polyp segmentation tasks show potential for further improvement. In addition, the issue of selecting appropriate sub-models in ensemble learning for the colon polyp segmentation task still needs to be explored. In order to solve the above problems, we first implement the utilization of multi-complementary high-level semantic features through the Multi-Head Control Ensemble. Then, to solve the sub-model selection problem in training, we propose SDBH-PSO Ensemble for sub-model selection and optimization of ensemble weights for different datasets. The experiments were conducted on the public datasets CVC-ClinicDB, Kvasir, CVC-ColonDB, ETIS-LaribPolypDB and PolypGen. The results show that the DET-Former, constructed based on the Multi-Head Control Ensemble and the SDBH-PSO Ensemble, consistently provides improved accuracy across different datasets. Among them, the Multi-Head Control Ensemble demonstrated superior feature fusion capability in the experiments, and the SDBH-PSO Ensemble demonstrated excellent sub-model selection capability. The sub-model selection capabilities of the SDBH-PSO Ensemble will continue to have significant reference value and practical utility as deep learning networks evolve.
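To make the idea of ensemble weights concrete, the sketch below fuses the probability maps of several sub-models by a weighted average and thresholds the result. It is a generic weighted soft-voting illustration only: it does not implement the Multi-Head Control Ensemble or the SDBH-PSO search, and the function name `weighted_ensemble`, the fixed weights, and the 0.5 threshold are assumptions. In the setting described above, an optimizer would choose the weights (and which sub-models receive nonzero weight) per dataset.

```python
def weighted_ensemble(prob_maps, weights, threshold=0.5):
    """Fuse per-pixel polyp probabilities from several sub-models.

    prob_maps: list of 2D nested lists with values in [0, 1], one per sub-model.
    weights:   one non-negative weight per sub-model (a weight of 0 drops it).
    Returns (fused probability map, binary mask).
    Hypothetical helper for illustration; not the method of the cited paper.
    """
    if len(prob_maps) != len(weights):
        raise ValueError("need exactly one weight per sub-model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalize so the weights sum to 1

    height, width = len(prob_maps[0]), len(prob_maps[0][0])
    fused = [
        [sum(w * m[r][c] for w, m in zip(norm, prob_maps)) for c in range(width)]
        for r in range(height)
    ]
    mask = [[1 if v >= threshold else 0 for v in row] for row in fused]
    return fused, mask


# Two toy sub-model outputs for a 1x2 image (made-up values).
model_a = [[0.9, 0.2]]
model_b = [[0.4, 0.6]]
fused, mask = weighted_ensemble([model_a, model_b], weights=[3, 1])
print(fused, mask)
```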
Affiliation(s)
- Cun Xu
- Guilin University of Electronic Technology, Guilin, 541000, China
- Kefeng Fan
- China Electronics Standardization Institute, Beijing, 100007, China.
- Wei Mo
- Guilin University of Electronic Technology, Guilin, 541000, China
- Xuguang Cao
- Guilin University of Electronic Technology, Guilin, 541000, China
- Kaijie Jiao
- Guilin University of Electronic Technology, Guilin, 541000, China
4
Gangrade S, Sharma PC, Sharma AK, Singh YP. Modified DeeplabV3+ with multi-level context attention mechanism for colonoscopy polyp segmentation. Comput Biol Med 2024; 170:108096. [PMID: 38320340 DOI: 10.1016/j.compbiomed.2024.108096] [Received: 09/11/2023] [Revised: 01/31/2024] [Accepted: 02/01/2024] [Indexed: 02/08/2024]
Abstract
The development of automated methods for analyzing medical images of colon cancer is one of the main research fields. A colonoscopy is a medical procedure that enables a doctor to look for any abnormalities like polyps, cancer, or inflammatory tissue inside the colon and rectum. It falls under the category of gastrointestinal illnesses, and it claims the lives of almost two million people worldwide. Video endoscopy is an advanced medical imaging approach to diagnose gastrointestinal disorders such as inflammatory bowel disease, ulcerative colitis, esophagitis, and polyps. Medical video endoscopy generates several images, which must be reviewed by specialists. The difficulty of manual diagnosis has sparked research towards computer-aided techniques that can quickly and reliably diagnose all generated images. The proposed methodology establishes a framework for diagnosing colonoscopy diseases. Endoscopists can lower the risk of polyps turning into cancer during colonoscopies by using more accurate computer-assisted polyp detection and segmentation. With the aim of creating a model that can automatically distinguish polyps in images, we present a modified DeeplabV3+ model in this study to carry out segmentation tasks successfully and efficiently. The framework's encoder uses a pre-trained dilated convolutional residual network for optimal feature map resolution. The robustness of the modified model is tested against state-of-the-art segmentation approaches. In this work, we employed two publicly available datasets, CVC-ClinicDB and Kvasir-SEG, and obtained Dice similarity coefficients of 0.97 and 0.95, respectively. The results show that the modified DeeplabV3+ model improves segmentation efficiency and effectiveness in both software and hardware with only minor changes.
Affiliation(s)
- Shweta Gangrade
- School of Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India; School of Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India
- Prakash Chandra Sharma
- School of Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India; School of Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India
- Akhilesh Kumar Sharma
- School of Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India; School of Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India
- Yadvendra Pratap Singh
- School of Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India; School of Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India.
5
Pu Q, Xi Z, Yin S, Zhao Z, Zhao L. Advantages of transformer and its application for medical image segmentation: a survey. Biomed Eng Online 2024; 23:14. [PMID: 38310297 PMCID: PMC10838005 DOI: 10.1186/s12938-024-01212-4] [Received: 09/22/2023] [Accepted: 01/22/2024] [Indexed: 02/05/2024] Open
Abstract
PURPOSE: Convolution operator-based neural networks have shown great success in medical image segmentation over the past decade. The U-shaped network with a codec structure is one of the most widely used models. The transformer, a technology from natural language processing, can capture long-distance dependencies and has been applied in the Vision Transformer to achieve state-of-the-art performance on image classification tasks. Recently, researchers have extended the transformer to medical image segmentation tasks, resulting in good models. METHODS: This review comprises publications selected through a Web of Science search. We focused on papers published since 2018 that applied the transformer architecture to medical image segmentation. We conducted a systematic analysis of these studies and summarized the results. RESULTS: To better comprehend the benefits of convolutional neural networks and transformers, the construction of the codec and transformer modules is first explained. Second, the transformer-based medical image segmentation models are summarized. The commonly used evaluation metrics for medical image segmentation tasks are then listed. Finally, a large number of medical segmentation datasets are described. CONCLUSION: Although pure transformer models without any convolution operator exist, the sample size of medical image segmentation datasets still restricts the development of the transformer, even though this limitation can be relieved by a pretrained model. More often than not, researchers are still designing models that combine transformer and convolution operators.
Affiliation(s)
- Qiumei Pu
- School of Information Engineering, Minzu University of China, Beijing, 100081, China
- Zuoxin Xi
- School of Information Engineering, Minzu University of China, Beijing, 100081, China
- CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety Institute of High Energy Physics, Chinese Academy of Sciences, Beijing, 100049, China
- Shuai Yin
- School of Information Engineering, Minzu University of China, Beijing, 100081, China
- Zhe Zhao
- The Fourth Medical Center of PLA General Hospital, Beijing, 100039, China
- Lina Zhao
- CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety Institute of High Energy Physics, Chinese Academy of Sciences, Beijing, 100049, China.