1. Wang Z, Guo L, Zhao S, Zhang S, Zhao X, Fang J, Wang G, Lu H, Yu J, Tian Q. Multi-Scale Group Agent Attention-Based Graph Convolutional Decoding Networks for 2D Medical Image Segmentation. IEEE J Biomed Health Inform 2025; 29:2718-2730. [PMID: 40030822] [DOI: 10.1109/jbhi.2024.3523112]
Abstract
Automated medical image segmentation plays a crucial role in assisting doctors in diagnosing diseases. Feature decoding is a critical yet challenging step in medical image segmentation. To address this issue, this work proposes a novel feature decoding network, the multi-scale group agent attention-based graph convolutional decoding network (MSGAA-GCDN), which learns local-global features in graph structures for 2D medical image segmentation. The proposed MSGAA-GCDN combines a graph convolutional network (GCN) with a lightweight multi-scale group agent attention (MSGAA) mechanism to represent features both globally and locally within a graph structure. Moreover, in the skip connections, a simple yet efficient attention-based upsampling convolution fusion (AUCF) module is designed to enhance encoder-decoder feature fusion in both the channel and spatial dimensions. Extensive experiments are conducted on three typical medical image segmentation tasks: Synapse abdominal multi-organ, cardiac organ, and polyp lesion segmentation. Experimental results demonstrate that MSGAA-GCDN outperforms state-of-the-art methods and that the designed MSGAA is a lightweight yet effective attention architecture. MSGAA-GCDN can easily serve as a plug-and-play decoder cascaded with other encoders for general medical image segmentation tasks.
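For readers unfamiliar with graph-convolutional decoding, the building block is the standard GCN propagation rule. The following NumPy sketch of one normalized propagation step over a toy graph is illustrative only, not the authors' MSGAA-GCDN code:

```python
import numpy as np

def gcn_layer(X, A, W):
    """One GCN propagation step: ReLU(D^-1/2 (A+I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    S = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalisation
    return np.maximum(S @ X @ W, 0.0)           # ReLU

# toy graph of three "region" nodes in a chain 0-1-2
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])                        # 2-D node features
W = np.eye(2)                                   # identity weights for clarity
H = gcn_layer(X, A, W)
print(H.shape)  # (3, 2)
```

The symmetric normalisation keeps each node's aggregated feature on a comparable scale regardless of its degree.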
2. Oukdach Y, Garbaz A, Kerkaou Z, Ansari ME, Koutti L, Ouafdi AFE, Salihoun M. InCoLoTransNet: An Involution-Convolution and Locality Attention-Aware Transformer for Precise Colorectal Polyp Segmentation in GI Images. J Imaging Inform Med 2025. [PMID: 39825142] [DOI: 10.1007/s10278-025-01389-7]
Abstract
Gastrointestinal (GI) disease examination presents significant challenges to doctors due to the intricate structure of the human digestive system. Colonoscopy and wireless capsule endoscopy are the most commonly used tools for GI examination. However, the large amount of data generated by these technologies requires the expertise and intervention of doctors for disease identification, making manual analysis a very time-consuming task. The development of a computer-assisted system is therefore highly desirable to help clinical professionals make decisions in a low-cost and effective way. In this paper, we introduce a novel framework, InCoLoTransNet, designed for polyp segmentation. The study is based on a transformer and a convolution-involution neural network, following the encoder-decoder architecture. We employ a vision transformer in the encoder to focus on the global context, while the decoder involves a convolution-involution collaboration for resampling the polyp features. Involution enhances the model's ability to adaptively capture spatial and contextual information, while convolution focuses on local information, leading to more accurate feature extraction. The essential features captured by the transformer encoder are passed to the decoder through two skip-connection pathways. A convolutional block attention module (CBAM) refines the features and passes them to the convolution block, leveraging attention mechanisms to emphasize relevant information. Meanwhile, locality self-attention passes essential features to the involution block, reinforcing the model's ability to capture more global features in the polyp regions. Experiments were conducted on five public datasets: CVC-ClinicDB, CVC-ColonDB, Kvasir-SEG, Etis-LaribPolypDB, and CVC-300. The results obtained by InCoLoTransNet are optimal when compared with 15 state-of-the-art polyp segmentation methods, achieving the highest mean Dice score of 93% and a mean intersection over union of 90% on CVC-ColonDB. Additionally, InCoLoTransNet distinguishes itself in generalization performance, achieving high mean Dice coefficient and mean intersection over union scores on unseen datasets: 85% and 79% on CVC-ColonDB, 91% and 87% on CVC-300, and 79% and 70% on Etis-LaribPolypDB, respectively.
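For context, involution inverts convolution's weight-sharing pattern: the kernel is generated at each pixel from that pixel's own features and shared across channels, rather than shared across space. A minimal single-group NumPy sketch follows; the kernel-generating matrix `W_gen` is a hypothetical stand-in for a learned generator, not the paper's implementation:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def involution(x, W_gen, k=3):
    """Simplified single-group involution: each pixel generates its own
    k*k kernel from its feature vector, then applies it to its local
    neighbourhood, sharing that kernel across all channels."""
    H, W, C = x.shape
    kernels = x @ W_gen                          # (H, W, k*k) per-pixel kernels
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    patches = sliding_window_view(xp, (k, k), axis=(0, 1))  # (H, W, C, k, k)
    kern = kernels.reshape(H, W, 1, k, k)        # broadcast one kernel over channels
    return (patches * kern).sum(axis=(-1, -2))   # (H, W, C)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))
W_gen = rng.standard_normal((4, 9)) * 0.1        # hypothetical kernel generator
y = involution(x, W_gen)
print(y.shape)  # (8, 8, 4)
```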
Affiliation(s)
- Yassine Oukdach: LabSIV, Department of Computer Science, Faculty of Sciences, Ibnou Zohr University, Agadir, 80000, Morocco
- Anass Garbaz: LabSIV, Department of Computer Science, Faculty of Sciences, Ibnou Zohr University, Agadir, 80000, Morocco
- Zakaria Kerkaou: LabSIV, Department of Computer Science, Faculty of Sciences, Ibnou Zohr University, Agadir, 80000, Morocco
- Mohamed El Ansari: Informatics and Applications Laboratory, Department of Computer Sciences, Faculty of Science, Moulay Ismail University, B.P 11201, Meknès, 52000, Morocco
- Lahcen Koutti: LabSIV, Department of Computer Science, Faculty of Sciences, Ibnou Zohr University, Agadir, 80000, Morocco
- Ahmed Fouad El Ouafdi: LabSIV, Department of Computer Science, Faculty of Sciences, Ibnou Zohr University, Agadir, 80000, Morocco
- Mouna Salihoun: Faculty of Medicine and Pharmacy of Rabat, Mohammed V University of Rabat, Rabat, 10000, Morocco
3. Du X, Xu X, Chen J, Zhang X, Li L, Liu H, Li S. UM-Net: Rethinking ICGNet for polyp segmentation with uncertainty modeling. Med Image Anal 2025; 99:103347. [PMID: 39316997] [DOI: 10.1016/j.media.2024.103347]
Abstract
Automatic segmentation of polyps from colonoscopy images plays a critical role in the early diagnosis and treatment of colorectal cancer. Nevertheless, some bottlenecks remain. In our previous work, we focused on polyps with intra-class inconsistency and low contrast and addressed them with ICGNet. Because of differences in equipment and in the specific locations and properties of polyps, the color distribution of the collected images is inconsistent. ICGNet was designed primarily around reverse-contour guide information and local-global context information and ignored this inconsistent color distribution, which leads to overfitting and makes it difficult for the model to focus only on beneficial image content. In addition, a trustworthy segmentation model should not only produce high-precision results but also provide a measure of uncertainty to accompany its predictions so that physicians can make informed decisions. However, ICGNet only gives the segmentation result and lacks an uncertainty measure. To address these new bottlenecks, we extend the original ICGNet to a comprehensive and effective network (UM-Net) with two main contributions whose practical value is confirmed by experiments. First, we employ a color transfer operation to weaken the relationship between color and polyps, making the model more concerned with polyp shape. Second, we provide uncertainty estimates to represent the reliability of the segmentation results and use variance to rectify the uncertainty. Our improved method is evaluated on five polyp datasets and shows competitive results compared with other advanced methods in both learning ability and generalization capability. The source code is available at https://github.com/dxqllp/UM-Net.
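The color transfer idea can be illustrated with simple per-channel statistics matching in the style of Reinhard et al. This NumPy sketch is a generic stand-in, not UM-Net's actual operation:

```python
import numpy as np

def color_transfer(src, ref):
    """Per-channel mean/std matching: recolour `src` with the colour
    statistics of `ref`, weakening the link between colour and content."""
    out = np.empty_like(src, dtype=float)
    for c in range(src.shape[2]):
        s, r = src[..., c], ref[..., c]
        out[..., c] = (s - s.mean()) / (s.std() + 1e-8) * r.std() + r.mean()
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(1)
src = rng.uniform(0.2, 0.4, (16, 16, 3))   # dark source image
ref = rng.uniform(0.6, 0.8, (16, 16, 3))   # bright reference image
out = color_transfer(src, ref)
print(abs(out.mean() - ref.mean()) < 1e-3)  # True: statistics now match the reference
```

Applied with randomly chosen references during training, such an operation acts as a colour augmentation that pushes the model toward shape cues.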
Affiliation(s)
- Xiuquan Du: Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, China; School of Computer Science and Technology, Anhui University, Hefei, China
- Xuebin Xu: School of Computer Science and Technology, Anhui University, Hefei, China
- Jiajia Chen: School of Computer Science and Technology, Anhui University, Hefei, China
- Xuejun Zhang: School of Computer Science and Technology, Anhui University, Hefei, China
- Lei Li: Department of Neurology, Shuyang Affiliated Hospital of Nanjing University of Traditional Chinese Medicine, Suqian, China
- Heng Liu: Department of Gastroenterology, The First Affiliated Hospital of Anhui Medical University, Hefei, China
- Shuo Li: Department of Biomedical Engineering, Case Western Reserve University, Cleveland, USA
4. Song Z, Kang X, Wei X, Li S. Pixel-Centric Context Perception Network for Camouflaged Object Detection. IEEE Trans Neural Netw Learn Syst 2024; 35:18576-18589. [PMID: 37819817] [DOI: 10.1109/tnnls.2023.3319323]
Abstract
Camouflaged object detection (COD) aims to identify object pixels visually embedded in the background environment. Existing deep learning methods fail to utilize the context information around different pixels adequately and efficiently. To solve this problem, a novel pixel-centric context perception network (PCPNet) is proposed, whose core idea is to customize a personalized context for each pixel based on an automatic estimate of its surroundings. Specifically, PCPNet first employs an encoder equipped with the designed vital component generation (VCG) module to obtain a set of compact features rich in low-level spatial and high-level semantic information across multiple subspaces. Then, we present a parameter-free pixel importance estimation (PIE) function based on multi-window information fusion; object pixels with complex backgrounds are assigned higher PIE values. Subsequently, PIE is used to regularize the optimization loss, so that the network pays more attention to pixels with higher PIE values during decoding. Finally, a local continuity refinement module (LCRM) refines the detection results. Extensive experiments on four COD benchmarks, five salient object detection (SOD) benchmarks, and five polyp segmentation benchmarks demonstrate the superiority of PCPNet over other state-of-the-art methods.
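The effect of PIE-regularized optimization can be sketched as a per-pixel importance weighting of a standard loss. The following NumPy example is illustrative only; the weighting scheme and its normalization are assumptions, not PCPNet's exact loss:

```python
import numpy as np

def weighted_bce(pred, target, importance):
    """Per-pixel binary cross-entropy scaled by an importance map, so
    harder pixels (higher importance) contribute more to the loss."""
    eps = 1e-7
    p = np.clip(pred, eps, 1 - eps)
    bce = -(target * np.log(p) + (1 - target) * np.log(1 - p))
    w = importance / importance.mean()        # normalise so weights average to 1
    return (w * bce).mean()

pred = np.array([[0.9, 0.2], [0.6, 0.1]])
target = np.array([[1.0, 0.0], [1.0, 0.0]])
uniform = np.ones_like(pred)
hard_px = np.array([[1.0, 1.0], [4.0, 1.0]])  # up-weight the poorly-predicted pixel
print(weighted_bce(pred, target, hard_px) > weighted_bce(pred, target, uniform))  # True
```

Up-weighting the pixel where the prediction is weakest (0.6 against a target of 1) raises the loss, steering gradients toward exactly those pixels.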
5. Peng C, Qian Z, Wang K, Zhang L, Luo Q, Bi Z, Zhang W. MugenNet: A Novel Combined Convolution Neural Network and Transformer Network with Application in Colonic Polyp Image Segmentation. Sensors (Basel) 2024; 24:7473. [PMID: 39686010] [DOI: 10.3390/s24237473]
Abstract
Accurate polyp image segmentation is of great significance because it can help in the detection of polyps. The convolutional neural network (CNN) is a common automatic segmentation method, but its main disadvantage is a long training time. The Transformer is another method that can be adapted to automatic segmentation through its self-attention mechanism, which assigns different importance weights to each piece of information and thus achieves high computational efficiency during segmentation. However, a potential drawback of the Transformer is the risk of information loss. The study reported in this paper employs the well-known hybridization principle to combine a CNN and a Transformer so as to retain the strengths of both. Specifically, this study applies the method to the early detection of colonic polyps and implements a model called MugenNet for colonic polyp image segmentation. We conducted a comprehensive experiment comparing MugenNet with other CNN models on five publicly available datasets, along with an ablation experiment on MugenNet. The results show that MugenNet achieves a mean Dice of 0.714 on the ETIS dataset, the best performance on this dataset among the compared models, with an inference speed of 56 FPS. The overall outcome of this study is a method for optimally combining two complementary machine learning methods.
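The mean Dice reported above is the standard overlap metric used throughout this listing; for reference, a minimal NumPy implementation:

```python
import numpy as np

def dice_score(pred, gt, eps=1e-8):
    """Dice coefficient for binary masks: 2|A∩B| / (|A|+|B|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

gt = np.zeros((8, 8), dtype=int); gt[2:6, 2:6] = 1      # 16-pixel square "polyp"
pred = np.zeros((8, 8), dtype=int); pred[3:7, 3:7] = 1  # prediction shifted by one pixel
print(round(dice_score(pred, gt), 4))  # 0.5625
```

The epsilon keeps the metric defined when both masks are empty; a perfect prediction scores 1.0.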
Affiliation(s)
- Chen Peng: School of Mechanical and Power Engineering, East China University of Science and Technology, Shanghai 200237, China
- Zhiqin Qian: School of Mechanical and Power Engineering, East China University of Science and Technology, Shanghai 200237, China
- Kunyu Wang: School of Mechanical and Power Engineering, East China University of Science and Technology, Shanghai 200237, China
- Lanzhu Zhang: School of Mechanical and Power Engineering, East China University of Science and Technology, Shanghai 200237, China
- Qi Luo: School of Mechanical and Power Engineering, East China University of Science and Technology, Shanghai 200237, China
- Zhuming Bi: Department of Engineering, Purdue University, West Lafayette, IN 47907, USA
- Wenjun Zhang: Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
6. Xu W, Xu R, Wang C, Li X, Xu S, Guo L. PSTNet: Enhanced Polyp Segmentation With Multi-Scale Alignment and Frequency Domain Integration. IEEE J Biomed Health Inform 2024; 28:6042-6053. [PMID: 38954569] [DOI: 10.1109/jbhi.2024.3421550]
Abstract
Accurate segmentation of colorectal polyps in colonoscopy images is crucial for the effective diagnosis and management of colorectal cancer (CRC). However, current deep learning-based methods primarily rely on fusing RGB information across multiple scales, which limits their ability to identify polyps accurately because of the restricted information available in the RGB domain and feature misalignment during multi-scale aggregation. To address these limitations, we propose the Polyp Segmentation Network with Shunted Transformer (PSTNet), a novel approach that integrates both the RGB and frequency-domain cues present in the images. PSTNet comprises three key modules: the Frequency Characterization Attention Module (FCAM) for extracting frequency cues and capturing polyp characteristics, the Feature Supplementary Alignment Module (FSAM) for aligning semantic information and reducing misalignment noise, and the Cross Perception Localization Module (CPM) for synergizing frequency cues with high-level semantics to achieve efficient polyp segmentation. Extensive experiments on challenging datasets demonstrate PSTNet's significant improvement in polyp segmentation accuracy across various metrics, consistently outperforming state-of-the-art methods. The integration of frequency-domain cues and the novel architectural design of PSTNet advance computer-assisted polyp segmentation, facilitating more accurate diagnosis and management of CRC.
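The frequency-domain cues PSTNet exploits can be illustrated with a simple FFT high-pass filter; this sketch (the radius and circular filter shape are arbitrary choices, not FCAM's design) shows how sharp boundaries, unlike flat regions, carry high-frequency energy:

```python
import numpy as np

def high_freq_cue(img, radius=4):
    """High-frequency cue map: zero out low frequencies in the 2-D FFT
    and return the magnitude of the residual image."""
    F = np.fft.fftshift(np.fft.fft2(img))
    H, W = img.shape
    yy, xx = np.ogrid[:H, :W]
    low = (yy - H // 2) ** 2 + (xx - W // 2) ** 2 <= radius ** 2
    F[low] = 0.0                               # suppress low-frequency content
    return np.abs(np.fft.ifft2(np.fft.ifftshift(F)))

flat = np.ones((32, 32))                       # textureless image: no high-freq energy
square = np.zeros((32, 32)); square[8:24, 8:24] = 1.0  # sharp-edged "polyp"
print(high_freq_cue(flat).max() < 1e-9, high_freq_cue(square).max() > 0.1)  # True True
```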
7. Bhattacharya D, Reuter K, Behrendt F, Maack L, Grube S, Schlaefer A. PolypNextLSTM: a lightweight and fast polyp video segmentation network using ConvNext and ConvLSTM. Int J Comput Assist Radiol Surg 2024; 19:2111-2119. [PMID: 39115609] [PMCID: PMC11442634] [DOI: 10.1007/s11548-024-03244-6]
Abstract
PURPOSE Single-image UNet architectures, commonly employed in polyp segmentation, lack the temporal insight clinicians gain from video data when diagnosing polyps. To mirror clinical practice more faithfully, our proposed solution, PolypNextLSTM, leverages video-based deep learning, harnessing temporal information for superior segmentation performance with the least parameter overhead, making it potentially suitable for edge devices. METHODS PolypNextLSTM employs a UNet-like structure with ConvNext-Tiny as its backbone, strategically omitting the last two layers to reduce the parameter count. Our temporal fusion module, a convolutional long short-term memory (ConvLSTM), effectively exploits temporal features. Our primary novelty is PolypNextLSTM itself, which is the leanest in parameters and the fastest model evaluated, surpassing the performance of five state-of-the-art image-based and video-based deep learning models. The evaluation on the SUN-SEG dataset spans easy-to-detect and hard-to-detect polyp scenarios, along with videos containing challenging artefacts such as fast motion and occlusion. RESULTS Comparison against five image-based and five video-based models demonstrates PolypNextLSTM's superiority, achieving a Dice score of 0.7898 on the hard-to-detect polyp test set and surpassing image-based PraNet (0.7519) and video-based PNS+ (0.7486). Notably, our model excels on videos featuring complex artefacts such as ghosting and occlusion. CONCLUSION PolypNextLSTM, integrating a pruned ConvNext-Tiny with a ConvLSTM for temporal fusion, not only exhibits superior segmentation performance but also achieves the highest frames per second among the evaluated models. Code can be found at https://github.com/mtec-tuhh/PolypNextLSTM.
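A ConvLSTM replaces the matrix multiplications of an ordinary LSTM with convolutions, so the memory retains its spatial layout across frames. A minimal NumPy sketch of one step follows, with toy sizes and random weights; it illustrates the mechanism only, not the PolypNextLSTM module:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d(x, k):
    """'Same'-padded 2-D convolution: x is (H, W, Cin), k is (kh, kw, Cin, Cout)."""
    kh, kw = k.shape[:2]
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2), (0, 0)))
    patches = sliding_window_view(xp, (kh, kw), axis=(0, 1))  # (H, W, Cin, kh, kw)
    return np.einsum('hwcij,ijco->hwo', patches, k)

def convlstm_step(x, h, c, Wx, Wh, b):
    """One ConvLSTM step: the four LSTM gates come from convolutions over
    the current frame x and the previous hidden state h, so the cell
    memory keeps its spatial layout."""
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    gates = conv2d(x, Wx) + conv2d(h, Wh) + b        # (H, W, 4*Ch)
    i, f, o, g = np.split(gates, 4, axis=-1)
    c_new = sig(f) * c + sig(i) * np.tanh(g)         # spatial cell-state update
    h_new = sig(o) * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
Cin, Ch = 3, 4
Wx = rng.standard_normal((3, 3, Cin, 4 * Ch)) * 0.1
Wh = rng.standard_normal((3, 3, Ch, 4 * Ch)) * 0.1
b = np.zeros(4 * Ch)
h = np.zeros((8, 8, Ch)); c = np.zeros((8, 8, Ch))
for _ in range(5):                                    # fuse a 5-frame clip
    frame = rng.standard_normal((8, 8, Cin))
    h, c = convlstm_step(frame, h, c, Wx, Wh, b)
print(h.shape)  # (8, 8, 4)
```

The final hidden state summarises the clip while staying a feature map, which is what lets a decoder consume it directly.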
Affiliation(s)
- Debayan Bhattacharya: Institute of Medical Technology and Intelligent Systems, Technische Universitaet Hamburg, Hamburg, Germany
- Konrad Reuter: Institute of Medical Technology and Intelligent Systems, Technische Universitaet Hamburg, Hamburg, Germany
- Finn Behrendt: Institute of Medical Technology and Intelligent Systems, Technische Universitaet Hamburg, Hamburg, Germany
- Lennart Maack: Institute of Medical Technology and Intelligent Systems, Technische Universitaet Hamburg, Hamburg, Germany
- Sarah Grube: Institute of Medical Technology and Intelligent Systems, Technische Universitaet Hamburg, Hamburg, Germany
- Alexander Schlaefer: Institute of Medical Technology and Intelligent Systems, Technische Universitaet Hamburg, Hamburg, Germany
8. Sun J, Chen K, He Z, Ren S, He X, Liu X, Peng C. Medical image analysis using improved SAM-Med2D: segmentation and classification perspectives. BMC Med Imaging 2024; 24:241. [PMID: 39285324] [PMCID: PMC11403950] [DOI: 10.1186/s12880-024-01401-6]
Abstract
The recently emerged SAM-Med2D represents a state-of-the-art advancement in medical image segmentation. By fine-tuning the large vision model Segment Anything Model (SAM) on extensive medical datasets, it has achieved impressive results in cross-modal medical image segmentation. However, its reliance on interactive prompts may restrict its applicability under specific conditions. To address this limitation, we introduce SAM-AutoMed, which achieves automatic segmentation of medical images by replacing the original prompt encoder with an improved MobileNet v3 backbone. Its performance on multiple datasets surpasses both SAM and SAM-Med2D. Current enhancements of the large vision model SAM lack applications in medical image classification. We therefore introduce SAM-MedCls, which combines the encoder of SAM-Med2D with our designed attention modules to construct an end-to-end medical image classification model. It performs well on datasets of various modalities, even achieving state-of-the-art results, indicating its potential to become a universal model for medical image classification.
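Turning a segmentation encoder into an image-level classifier generally requires pooling patch features into a single vector before a linear head. The following generic attention-pooling head is a hypothetical sketch of that step, not SAM-MedCls's attention modules:

```python
import numpy as np

def attention_pool_classify(tokens, w_score, W_cls):
    """Attention pooling over encoder tokens followed by a linear
    classifier: score each token, softmax the scores into weights,
    and classify the weighted summary vector."""
    scores = tokens @ w_score                        # (N,) relevance per token
    a = np.exp(scores - scores.max()); a /= a.sum()  # softmax weights
    pooled = a @ tokens                              # (D,) weighted summary
    return pooled @ W_cls, a                         # class logits, attention

rng = np.random.default_rng(0)
tokens = rng.standard_normal((64, 32))               # e.g. 8x8 patch embeddings
w_score = rng.standard_normal(32)                    # hypothetical scoring vector
W_cls = rng.standard_normal((32, 3))                 # 3-class linear head
logits, attn = attention_pool_classify(tokens, w_score, W_cls)
print(logits.shape)  # (3,)
```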
Affiliation(s)
- Jiakang Sun: Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, 610213, Sichuan, China; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 101499, China
- Ke Chen: Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, 610213, Sichuan, China; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 101499, China
- Zhiyi He: Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, 610213, Sichuan, China; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 101499, China
- Siyuan Ren: Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, 610213, Sichuan, China; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 101499, China
- Xinyang He: Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, 610213, Sichuan, China; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 101499, China
- Xu Liu: Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, 610213, Sichuan, China; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 101499, China
- Cheng Peng: Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, 610213, Sichuan, China; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 101499, China
9. Rajasekar D, Theja G, Prusty MR, Chinara S. Efficient colorectal polyp segmentation using wavelet transformation and AdaptUNet: A hybrid U-Net. Heliyon 2024; 10:e33655. [PMID: 39040380] [PMCID: PMC11261057] [DOI: 10.1016/j.heliyon.2024.e33655]
Abstract
The prevalence of colorectal cancer, which primarily emerges from polyps, underscores the importance of their early detection in colonoscopy images. Due to the inherent complexity and variability of polyp appearances, the task remains difficult despite recent advances in medical technology. To tackle these challenges, a deep learning model featuring a customized U-Net architecture, AdaptUNet, is proposed. Attention mechanisms and skip connections facilitate the effective combination of low-level details and high-level contextual information for accurate polyp segmentation. Further, wavelet transformations are used to extract useful features overlooked by conventional image processing. The model achieves benchmark results with a Dice coefficient of 0.9104, an intersection over union (IoU) of 0.8368, and a balanced accuracy of 0.9880 on the CVC-300 dataset. It also shows exceptional performance on other datasets, including Kvasir-SEG and Etis-LaribDB. Training was performed on the HyperKvasir segmented-image dataset, further evidencing the model's ability to handle diverse data inputs. The proposed method offers a comprehensive and efficient implementation for polyp detection without compromising performance, promising improved precision and a reduction in the manual labour of colorectal polyp detection.
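The wavelet features mentioned above can be illustrated with a one-level 2-D Haar transform, which splits an image into a low-pass approximation band and three directional detail bands. This NumPy sketch is illustrative only, not AdaptUNet's exact transform:

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D Haar wavelet transform: returns the low-pass
    approximation (LL) plus detail bands from column differences (LH,
    responds to vertical edges) and row differences (HL/HH)."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row pairs: average
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row pairs: difference
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0      # vertical-edge detail
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0      # horizontal-edge detail
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0      # diagonal detail
    return LL, LH, HL, HH

img = np.zeros((8, 8)); img[:, 3:] = 1.0      # vertical step edge
LL, LH, HL, HH = haar_dwt2(img)
print(LL.shape, np.abs(LH).max() == 0.5, np.abs(HL).max() == 0.0)  # (4, 4) True True
```

The vertical edge shows up only in the column-difference band, which is the kind of structure-sensitive cue a segmentation network can consume alongside raw pixels.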
Affiliation(s)
- Devika Rajasekar: School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India
- Girish Theja: School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India
- Manas Ranjan Prusty: Centre for Cyber Physical Systems, Vellore Institute of Technology, Chennai, India
- Suchismita Chinara: Department of Computer Science and Engineering, National Institute of Technology, Rourkela, India
10. Li Z, Yi M, Uneri A, Niu S, Jones C. RTA-Former: Reverse Transformer Attention for Polyp Segmentation. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-5. [PMID: 40031481] [DOI: 10.1109/embc53108.2024.10782181]
Abstract
Polyp segmentation is a key aspect of colorectal cancer prevention, enabling early detection and guiding subsequent treatment. Intelligent diagnostic tools, including deep learning solutions, are widely explored to streamline and potentially automate this process. However, even with many powerful network architectures, producing accurate edge segmentation remains a problem. In this paper, we introduce a novel network, RTA-Former, that employs a transformer model as the encoder backbone and innovatively adapts reverse attention (RA) with a transformer stage in the decoder for enhanced edge segmentation. The experimental results show that RTA-Former achieves state-of-the-art (SOTA) performance on five polyp segmentation datasets. The strong capability of RTA-Former holds promise for improving the accuracy of Transformer-based polyp segmentation, potentially leading to better clinical decisions and patient outcomes. Our code is publicly available on GitHub.
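Reverse attention, as the name suggests, inverts a coarse prediction so the network attends to what it has not yet confidently segmented, which is typically the ambiguous boundary region. A minimal NumPy sketch of the core operation (illustrative; RTA-Former applies it within transformer decoder stages):

```python
import numpy as np

def reverse_attention(features, coarse_logits):
    """Weight features by (1 - sigmoid(coarse prediction)), so regions the
    model already segments confidently are suppressed and the decoder
    focuses on the uncertain remainder."""
    sig = 1.0 / (1.0 + np.exp(-coarse_logits))
    ra = 1.0 - sig                              # high where confidence is low/negative
    return features * ra[..., None]             # broadcast over channels

feat = np.ones((8, 8, 4))
logits = np.full((8, 8), -5.0)                  # confident background
logits[2:6, 2:6] = 5.0                          # confident polyp interior
out = reverse_attention(feat, logits)
print(out[3, 3, 0] < 0.05, out[0, 0, 0] > 0.95)  # True True: interior suppressed
```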
11. Xu C, Fan K, Mo W, Cao X, Jiao K. Dual ensemble system for polyp segmentation with submodels adaptive selection ensemble. Sci Rep 2024; 14:6152. [PMID: 38485963] [PMCID: PMC10940608] [DOI: 10.1038/s41598-024-56264-2]
Abstract
Colonoscopy is one of the main methods of detecting colon polyps, and such detection is widely used to prevent and diagnose colon cancer. With the rapid development of computer vision, deep learning-based semantic segmentation methods for colon polyps have been widely researched. However, the accuracy and stability of some methods on colon polyp segmentation tasks leave room for further improvement. In addition, the issue of selecting appropriate sub-models in ensemble learning for the colon polyp segmentation task still needs to be explored. To solve these problems, we first exploit multiple complementary high-level semantic features through the Multi-Head Control Ensemble. Then, to solve the sub-model selection problem during training, we propose the SDBH-PSO Ensemble for sub-model selection and the optimization of ensemble weights for different datasets. The experiments were conducted on the public datasets CVC-ClinicDB, Kvasir, CVC-ColonDB, ETIS-LaribPolypDB and PolypGen. The results show that the DET-Former, constructed from the Multi-Head Control Ensemble and the SDBH-PSO Ensemble, consistently provides improved accuracy across different datasets. The Multi-Head Control Ensemble demonstrated superior feature fusion capability in the experiments, and the SDBH-PSO Ensemble demonstrated excellent sub-model selection capability. The sub-model selection capabilities of the SDBH-PSO Ensemble will retain significant reference value and practical utility as deep learning networks evolve.
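Whatever procedure selects the sub-models and tunes their weights (here, the SDBH-PSO Ensemble), the inference-time fusion reduces to a convex combination of the sub-models' probability maps. A NumPy sketch with made-up sub-model outputs and weights:

```python
import numpy as np

def ensemble_masks(prob_maps, weights):
    """Weighted average of sub-model probability maps followed by a 0.5
    threshold: the basic fusion step any weighted segmentation ensemble
    performs at inference time."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                             # normalise to a convex combination
    fused = np.tensordot(w, np.stack(prob_maps), axes=1)
    return (fused >= 0.5).astype(int), fused

m1 = np.array([[0.9, 0.4], [0.2, 0.8]])         # hypothetical sub-model outputs
m2 = np.array([[0.8, 0.6], [0.1, 0.9]])
m3 = np.array([[0.3, 0.2], [0.4, 0.7]])
mask, fused = ensemble_masks([m1, m2, m3], weights=[0.5, 0.3, 0.2])
print(mask)  # [[1 0] [0 1]]
```

An optimizer such as PSO would search over the `weights` vector to maximize a validation metric; the fusion itself stays this simple.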
Affiliation(s)
- Cun Xu: Guilin University of Electronic Technology, Guilin, 541000, China
- Kefeng Fan: China Electronics Standardization Institute, Beijing, 100007, China
- Wei Mo: Guilin University of Electronic Technology, Guilin, 541000, China
- Xuguang Cao: Guilin University of Electronic Technology, Guilin, 541000, China
- Kaijie Jiao: Guilin University of Electronic Technology, Guilin, 541000, China
12. Wang M, An X, Pei Z, Li N, Zhang L, Liu G, Ming D. An Efficient Multi-Task Synergetic Network for Polyp Segmentation and Classification. IEEE J Biomed Health Inform 2024; 28:1228-1239. [PMID: 37155397] [DOI: 10.1109/jbhi.2023.3273728]
Abstract
Colonoscopy is considered the best diagnostic tool for the early detection and resection of polyps, which can effectively prevent consequential colorectal cancer. In clinical practice, segmenting and classifying polyps from colonoscopic images is of great significance, since these tasks provide valuable information for diagnosis and treatment. In this study, we propose an efficient multi-task synergetic network (EMTS-Net) for concurrent polyp segmentation and classification, and we introduce a polyp classification benchmark for exploring the potential correlations between the two tasks. The framework is composed of an enhanced multi-scale network (EMS-Net) for coarse-grained polyp segmentation, an EMTS-Net (Class) branch for accurate polyp classification, and an EMTS-Net (Seg) branch for fine-grained polyp segmentation. Specifically, we first obtain coarse segmentation masks using EMS-Net. Then, we concatenate these rough masks with the colonoscopic images to help EMTS-Net (Class) locate and classify polyps precisely. To further enhance segmentation performance, we propose a random multi-scale (RMS) training strategy to eliminate the interference caused by redundant information. In addition, we design an offline dynamic class activation mapping (OFLD CAM) generated by the combined effect of EMTS-Net (Class) and the RMS strategy, which efficiently and elegantly optimizes the bottlenecks between the multi-task networks and helps EMTS-Net (Seg) perform more accurate polyp segmentation. We evaluate EMTS-Net on the polyp segmentation and classification benchmarks: it achieves an average mDice of 0.864 in polyp segmentation and an average AUC of 0.913 with an average accuracy of 0.924 in polyp classification. Quantitative and qualitative evaluations on both benchmarks demonstrate that EMTS-Net achieves the best performance and outperforms previous state-of-the-art methods in terms of both efficiency and generalization.
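Class activation mapping, the building block behind OFLD CAM, projects a linear classifier's weights back onto the final feature maps to localize class evidence. A minimal NumPy sketch of plain CAM (the offline dynamic variant in the paper is more involved):

```python
import numpy as np

def class_activation_map(features, fc_weights, cls):
    """Plain CAM: weight the final feature maps by one class's classifier
    weights, sum over channels, and normalise to [0, 1]."""
    cam = features @ fc_weights[:, cls]       # (H, W): channel-weighted sum
    cam -= cam.min()
    return cam / (cam.max() + 1e-8)           # normalise for visualisation

rng = np.random.default_rng(0)
feats = rng.random((8, 8, 16))                # hypothetical backbone output
fc_w = rng.random((16, 2))                    # hypothetical 2-class linear head
cam = class_activation_map(feats, fc_w, cls=1)
print(cam.shape)  # (8, 8)
```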
13. Huang Z, Xie F, Qing W, Wang M, Liu M, Sun D. MGF-net: Multi-channel group fusion enhancing boundary attention for polyp segmentation. Med Phys 2024; 51:407-418. [PMID: 37403578] [DOI: 10.1002/mp.16584]
Abstract
BACKGROUND Colonic polyps are the most prevalent neoplastic lesions detected during colorectal cancer screening, and the timely detection and excision of these precursor lesions is crucial for preventing multiple malignancies and reducing mortality rates. PURPOSE The pressing need for intelligent polyp detection has led to the development of a high-precision intelligent polyp segmentation network designed to improve polyp screening rates during colonoscopies. METHODS In this study, we employed ResNet50 as the backbone network and embedded a multi-channel grouping fusion encoding module in the third to fifth stages to extract high-level semantic features of polyps. Receptive field modules were utilized to capture multi-scale features, and grouping fusion modules were employed to capture salient features in different group channels, guiding the decoder to generate an initial global mapping with improved accuracy. To refine the segmentation of the initial global mapping, we introduced an enhanced boundary weight attention module that adaptively thresholds the initial global mapping using learnable parameters. A self-attention mechanism was then utilized to calculate the long-distance dependency relationships of the polyp boundary area, yielding an output feature map with enhanced boundaries that effectively refines the boundary of the target area. RESULTS We carried out contrast experiments between MGF-Net and mainstream polyp segmentation networks on five public datasets: ColonDB, CVC-ColonDB, CVC-612, Kvasir, and ETIS. The results demonstrate that the segmentation accuracy of MGF-Net is significantly improved on these datasets, and a hypothesis test confirmed the statistical significance of the computed results. CONCLUSIONS The proposed MGF-Net outperforms existing mainstream baseline networks and presents a promising solution to the pressing need for intelligent polyp detection. The model is available at https://github.com/xiefanghhh/MGF-NET.
Affiliation(s)
- Zhiyong Huang, School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China
- Fang Xie, School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China
- Wencheng Qing, School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China
- Mengyao Wang, School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China
- Man Liu, School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China
- Daming Sun, Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
|
14
|
Jain S, Atale R, Gupta A, Mishra U, Seal A, Ojha A, Jaworek-Korjakowska J, Krejcar O. CoInNet: A Convolution-Involution Network With a Novel Statistical Attention for Automatic Polyp Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3987-4000. [PMID: 37768798 DOI: 10.1109/tmi.2023.3320151] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/30/2023]
Abstract
Polyps are very common abnormalities in the human gastrointestinal tract, and their early diagnosis may help reduce the risk of colorectal cancer. Vision-based computer-aided diagnostic systems automatically identify polyp regions to assist surgeons in their removal. Due to their varying shape, color, size, and texture, and their unclear boundaries, polyp segmentation in images is a challenging problem. Existing deep learning segmentation models mostly rely on convolutional neural networks, which are limited in learning the diversity of visual patterns at different spatial locations and fail to capture inter-feature dependencies. Vision transformer models have also been deployed for polyp segmentation because of their powerful global feature extraction capabilities, but they too must be supplemented by convolution layers to learn local contextual information. In the present paper, a polyp segmentation model, CoInNet, is proposed with a novel feature extraction mechanism that leverages the strengths of convolution and involution operations and learns to highlight polyp regions in images by considering the relationships between different feature maps through a statistical feature attention unit. To further aid the network in learning polyp boundaries, an anomaly boundary approximation module is introduced that refines segmentation results through recursively fed feature fusion. Remarkably, even tiny polyps occupying only 0.01% of an image area can be precisely segmented by CoInNet. This is crucial for clinical applications, as small polyps are easily overlooked even in manual examination given the voluminous size of wireless capsule endoscopy videos. CoInNet outperforms thirteen state-of-the-art methods on five benchmark polyp segmentation datasets.
|
15
|
Sui D, Liu W, Zhang Y, Li Y, Luo G, Wang K, Guo M. ColonNet: A novel polyp segmentation framework based on LK-RFB and GPPD. Comput Biol Med 2023; 166:107541. [PMID: 37804779 DOI: 10.1016/j.compbiomed.2023.107541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2023] [Revised: 09/11/2023] [Accepted: 09/28/2023] [Indexed: 10/09/2023]
Abstract
Colorectal cancer (CRC) is the most prevalent malignant tumor of the digestive system and a formidable global health challenge, ranking as the fourth leading cause of cancer-related deaths worldwide. Despite considerable advances in understanding and treating CRC, recurrence and metastasis remain major causes of high morbidity and mortality during treatment. Colonoscopy is currently the predominant method for CRC screening, and artificial intelligence has emerged as a promising tool for aiding polyp diagnosis. Unfortunately, most segmentation methods suffer from limited accuracy and poor generalization across datasets, and slow processing and analysis speed in particular has become a major obstacle. In this study, we propose a fast and efficient polyp segmentation framework based on a Large-Kernel Receptive Field Block (LK-RFB) and a Global Parallel Partial Decoder (GPPD). The proposed ColonNet has been extensively tested and proven effective, achieving a DICE coefficient of over 0.910 and a frame rate of over 102 FPS on the CVC-300 dataset. Compared to state-of-the-art (SOTA) methods, ColonNet achieves the highest FPS while delivering the best or comparable segmentation performance on five publicly available datasets, establishing a new SOTA. The code will be released at: https://github.com/SPECTRELWF/ColonNet.
Affiliation(s)
- Dong Sui, School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Weifeng Liu, School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Yue Zhang, College of Computer Science and Technology, Harbin Engineering University, Harbin, China
- Yang Li, School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Gongning Luo, Perceptual Computing Research Center, Harbin Institute of Technology, Harbin, China
- Kuanquan Wang, Perceptual Computing Research Center, Harbin Institute of Technology, Harbin, China
- Maozu Guo, School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
|
16
|
Wang Z, Gao F, Yu L, Tian S. UACENet: Uncertain area attention and cross‐image context extraction network for polyp segmentation. INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY 2023; 33:1973-1987. [DOI: 10.1002/ima.22906] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/10/2022] [Accepted: 04/23/2023] [Indexed: 12/09/2024]
Abstract
Accurately segmenting polyps from colonoscopy images is essential for early screening and diagnosis of colorectal cancer. In recent years, with the encoder-decoder architecture, many advanced methods have been applied to this task and have achieved significant improvements. However, accurate segmentation of polyps remains challenging due to the irregular shape and size of polyps, the low contrast between polyp and background in some images, and environmental influences such as illumination and mucus. To tackle these challenges, we propose a novel uncertain area attention and cross-image context extraction network for accurate polyp segmentation, which consists of an uncertain area attention module (UAAM), a cross-image context extraction module (CCEM), and an adaptive fusion module (AFM). UAAM is guided by the output prediction of the adjacent decoding layer and focuses on difficult boundary regions without neglecting the background and foreground, so that more edge details and uncertain information can be captured. CCEM innovatively captures multi-scale global context within an image and implicit contextual information across multiple images, fusing them to enhance the extraction of global location information. AFM fuses the local detail information extracted by UAAM and the global location information extracted by CCEM with the decoding-layer features through multiple fusions and adaptive attention to enhance feature representation. Our method is extensively evaluated on four public datasets and generally achieves state-of-the-art performance compared to other advanced methods.
Affiliation(s)
- Zhi Wang, College of Software Engineering, Xinjiang University, Urumqi, China
- Feng Gao, Department of Gastroenterology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, China
- Long Yu, College of Information Science and Engineering, Xinjiang University, Urumqi, China
- Shengwei Tian, College of Software Engineering, Xinjiang University, Urumqi, China
|
17
|
Ghimire R, Lee SW. MMNet: A Mixing Module Network for Polyp Segmentation. SENSORS (BASEL, SWITZERLAND) 2023; 23:7258. [PMID: 37631792 PMCID: PMC10458640 DOI: 10.3390/s23167258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/12/2023] [Revised: 08/03/2023] [Accepted: 08/16/2023] [Indexed: 08/27/2023]
Abstract
Traditional encoder-decoder networks like U-Net have been extensively used for polyp segmentation. However, such networks have demonstrated limitations in explicitly modeling long-range dependencies. In such networks, local patterns are emphasized over the global context, as each convolutional kernel focuses on only a local subset of pixels in the entire image. Several recent transformer-based networks have been shown to overcome such limitations. Such networks encode long-range dependencies using self-attention methods and thus learn highly expressive representations. However, due to the computational complexity of modeling the whole image, self-attention is expensive to compute, as there is a quadratic increment in cost with the increase in pixels in the image. Thus, patch embedding has been utilized, which groups small regions of the image into single input features. Nevertheless, these transformers still lack inductive bias, even with the image as a 1D sequence of visual tokens. This results in the inability to generalize to local contexts due to limited low-level features. We introduce a hybrid transformer combined with a convolutional mixing network to overcome computational and long-range dependency issues. A pretrained transformer network is introduced as a feature-extracting encoder, and a mixing module network (MMNet) is introduced to capture the long-range dependencies with a reduced computational cost. Precisely, in the mixing module network, we use depth-wise and 1 × 1 convolution to model long-range dependencies to establish spatial and cross-channel correlation, respectively. The proposed approach is evaluated qualitatively and quantitatively on five challenging polyp datasets across six metrics. Our MMNet outperforms the previous best polyp segmentation methods.
Affiliation(s)
- Raman Ghimire, Pattern Recognition and Machine Learning Lab, Department of IT Convergence Engineering, Gachon University, Seongnam 13557, Republic of Korea
- Sang-Woong Lee, Pattern Recognition and Machine Learning Lab, Department of AI Software, Gachon University, Seongnam 13557, Republic of Korea
|
18
|
Yue G, Zhuo G, Li S, Zhou T, Du J, Yan W, Hou J, Liu W, Wang T. Benchmarking Polyp Segmentation Methods in Narrow-Band Imaging Colonoscopy Images. IEEE J Biomed Health Inform 2023; 27:3360-3371. [PMID: 37099473 DOI: 10.1109/jbhi.2023.3270724] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/27/2023]
Abstract
In recent years, there has been significant progress in polyp segmentation in white-light imaging (WLI) colonoscopy images, particularly with methods based on deep learning (DL). However, little attention has been paid to the reliability of these methods in narrow-band imaging (NBI) data. NBI improves visibility of blood vessels and helps physicians observe complex polyps more easily than WLI, but NBI images often include polyps with small/flat appearances, background interference, and camouflage properties, making polyp segmentation a challenging task. This paper proposes a new polyp segmentation dataset (PS-NBI2K) consisting of 2,000 NBI colonoscopy images with pixel-wise annotations, and presents benchmarking results and analyses for 24 recently reported DL-based polyp segmentation methods on PS-NBI2K. The results show that existing methods struggle to locate polyps with smaller sizes and stronger interference, and that extracting both local and global features improves performance. There is also a trade-off between effectiveness and efficiency, and most methods cannot achieve the best results in both areas simultaneously. This work highlights potential directions for designing DL-based polyp segmentation methods in NBI colonoscopy images, and the release of PS-NBI2K aims to drive further development in this field.
|
19
|
Zhang T, Bur AM, Kraft S, Kavookjian H, Renslo B, Chen X, Luo B, Wang G. Gender, Smoking History, and Age Prediction from Laryngeal Images. J Imaging 2023; 9:109. [PMID: 37367457 DOI: 10.3390/jimaging9060109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2023] [Revised: 05/22/2023] [Accepted: 05/25/2023] [Indexed: 06/28/2023] Open
Abstract
Flexible laryngoscopy is commonly performed by otolaryngologists to detect laryngeal diseases and to recognize potentially malignant lesions. Recently, researchers have introduced machine learning techniques to facilitate automated diagnosis using laryngeal images and achieved promising results. The diagnostic performance can be improved when patients' demographic information is incorporated into models. However, the manual entry of patient data is time-consuming for clinicians. In this study, we made the first endeavor to employ deep learning models to predict patient demographic information to improve the detector model's performance. The overall accuracy for gender, smoking history, and age was 85.5%, 65.2%, and 75.9%, respectively. We also created a new laryngoscopic image set for the machine learning study and benchmarked the performance of eight classical deep learning models based on CNNs and Transformers. The results can be integrated into current learning models to improve their performance by incorporating the patient's demographic information.
Affiliation(s)
- Tianxiao Zhang, Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA
- Andrés M Bur, Department of Otolaryngology-Head and Neck Surgery, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Shannon Kraft, Department of Otolaryngology-Head and Neck Surgery, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Hannah Kavookjian, Department of Otolaryngology-Head and Neck Surgery, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Bryan Renslo, Department of Otolaryngology-Head and Neck Surgery, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Xiangyu Chen, Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA
- Bo Luo, Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA
- Guanghui Wang, Department of Computer Science, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
|
20
|
Zhang H, Zhong X, Li G, Liu W, Liu J, Ji D, Li X, Wu J. BCU-Net: Bridging ConvNeXt and U-Net for medical image segmentation. Comput Biol Med 2023; 159:106960. [PMID: 37099973 DOI: 10.1016/j.compbiomed.2023.106960] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2022] [Revised: 04/12/2023] [Accepted: 04/17/2023] [Indexed: 04/28/2023]
Abstract
Medical image segmentation enables doctors to observe lesion regions better and make accurate diagnostic decisions. Single-branch models such as U-Net have achieved great progress in this field. However, the complementary local and global pathological semantics of heterogeneous neural networks have not yet been fully explored. The class-imbalance problem remains a serious issue. To alleviate these two problems, we propose a novel model called BCU-Net, which leverages the advantages of ConvNeXt in global interaction and U-Net in local processing. We propose a new multilabel recall loss (MRL) module to relieve the class imbalance problem and facilitate deep-level fusion of local and global pathological semantics between the two heterogeneous branches. Extensive experiments were conducted on six medical image datasets including retinal vessel and polyp images. The qualitative and quantitative results demonstrate the superiority and generalizability of BCU-Net. In particular, BCU-Net can handle diverse medical images with diverse resolutions. It has a flexible structure owing to its plug-and-play characteristics, which promotes its practicality.
Affiliation(s)
- Hongbin Zhang, School of Software, East China Jiaotong University, China
- Xiang Zhong, School of Software, East China Jiaotong University, China
- Guangli Li, School of Information Engineering, East China Jiaotong University, China
- Wei Liu, School of Software, East China Jiaotong University, China
- Jiawei Liu, School of Software, East China Jiaotong University, China
- Donghong Ji, School of Cyber Science and Engineering, Wuhan University, China
- Xiong Li, School of Software, East China Jiaotong University, China
- Jianguo Wu, The Second Affiliated Hospital of Nanchang University, China
|
21
|
Chang Y, Zheng Z, Sun Y, Zhao M, Lu Y, Zhang Y. DPAFNet: A Residual Dual-Path Attention-Fusion Convolutional Neural Network for Multimodal Brain Tumor Segmentation. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/15/2022]
|
22
|
Jiang X, Cai W, Zhang Z, Jiang B, Yang Z, Wang X. MAGNet: A Camouflaged Object Detection Network Simulating the Observation Effect of a Magnifier. ENTROPY (BASEL, SWITZERLAND) 2022; 24:1804. [PMID: 36554209 PMCID: PMC9778132 DOI: 10.3390/e24121804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/19/2022] [Revised: 12/01/2022] [Accepted: 12/06/2022] [Indexed: 06/17/2023]
Abstract
In recent years, protecting important objects by simulating animal camouflage has been widely employed in many fields, giving rise to camouflaged object detection (COD) technology. COD is more difficult than traditional object detection due to the high degree of fusion between camouflaged objects and the background. In this paper, we strive to identify camouflaged objects more accurately and efficiently. Inspired by the use of magnifiers to search for hidden objects in pictures, we propose a COD network that simulates the observation effect of a magnifier, called the MAGnifier Network (MAGNet). Specifically, MAGNet contains two parallel modules: the ergodic magnification module (EMM) and the attention focus module (AFM). The EMM is designed to mimic the process of a magnifier enlarging an image, and the AFM simulates the observation process in which human attention is highly focused on a particular region. The two sets of output camouflaged object maps are merged to simulate the observation of an object through a magnifier. In addition, a weighted key-point area perception loss function, which is more applicable to COD, is designed based on the two modules to give greater attention to the camouflaged object. Extensive experiments demonstrate that, compared with 19 cutting-edge detection models, MAGNet achieves the best comprehensive performance on eight evaluation metrics on the public COD dataset. Additionally, compared to other COD methods, MAGNet has lower computational complexity and faster segmentation. We also validated the model's generalization ability on a military camouflaged object dataset constructed in-house. Finally, we experimentally explored some extended applications of COD.
Affiliation(s)
- Wei Cai, Xi’an Research Institute of High Technology, Xi’an 710064, China
|
23
|
UPolySeg: A U-Net-Based Polyp Segmentation Network Using Colonoscopy Images. GASTROENTEROLOGY INSIGHTS 2022. [DOI: 10.3390/gastroent13030027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
Colonoscopy is the gold-standard procedure for examining the lower gastrointestinal region, and colorectal polyps are one condition detected through it. Even though technical advancements have improved the early detection of colorectal polyps, a high percentage are still missed due to various factors. Polyp segmentation can play a significant role in early-stage detection and can thus help reduce the severity of the disease. In this work, the authors applied several image pre-processing techniques, such as coherence transport and contrast limited adaptive histogram equalization (CLAHE), to handle different challenges in colonoscopy images. The processed image was then segmented into polyp and normal pixels using a U-Net-based deep learning segmentation model named UPolySeg. The main framework of UPolySeg is an encoder-decoder architecture with same-level feature concatenation between encoder and decoder, along with dilated convolutions. The model was experimentally verified on the publicly available Kvasir-SEG dataset, achieving a global accuracy of 96.77%, a dice coefficient of 96.86%, an IoU of 87.91%, a recall of 95.57%, and a precision of 92.29%. The new UPolySeg polyp segmentation framework improved performance by 1.93% compared with prior work.
|
24
|
Yang Y, Zhang T, Li G, Kim T, Wang G. An unsupervised domain adaptation model based on dual-module adversarial training. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.12.060] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
25
|
Patel K, Wang G. A Discriminative Channel Diversification Network for Image Classification. Pattern Recognit Lett 2022; 153:176-182. [PMID: 35938044 PMCID: PMC9348547 DOI: 10.1016/j.patrec.2021.12.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023]
Abstract
Channel attention mechanisms in convolutional neural networks have proven effective in various computer vision tasks. However, the performance improvement comes with additional model complexity and computation cost. In this paper, we propose a lightweight and effective attention module, called the channel diversification block, to enhance the global context by establishing channel relationships at the global level. Unlike other channel attention mechanisms, the proposed module focuses on the most discriminative features by giving more attention to the spatially distinguishable channels while taking into account the channel activation. Different from other attention models that plug the module in between several intermediate layers, the proposed module is embedded at the end of the backbone networks, making it easy to implement. Extensive experiments on the CIFAR-10, SVHN, and Tiny-ImageNet datasets demonstrate that the proposed module improves the performance of the baseline networks by a margin of 3% on average.
Affiliation(s)
- Krushi Patel, Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA
- Guanghui Wang, Department of Computer Science, Ryerson University, Toronto, ON, Canada M5B 2K3 (corresponding author)
|
26
|
Li K, Fathan MI, Patel K, Zhang T, Zhong C, Bansal A, Rastogi A, Wang JS, Wang G. Colonoscopy polyp detection and classification: Dataset creation and comparative evaluations. PLoS One 2021; 16:e0255809. [PMID: 34403452 PMCID: PMC8370621 DOI: 10.1371/journal.pone.0255809] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2021] [Accepted: 07/25/2021] [Indexed: 12/12/2022] Open
Abstract
Colorectal cancer (CRC) is one of the most common types of cancer with a high mortality rate. Colonoscopy is the preferred procedure for CRC screening and has proven to be effective in reducing CRC mortality. Thus, a reliable computer-aided polyp detection and classification system can significantly increase the effectiveness of colonoscopy. In this paper, we create an endoscopic dataset collected from various sources and annotate the ground truth of polyp location and classification results with the help of experienced gastroenterologists. The dataset can serve as a benchmark platform to train and evaluate the machine learning models for polyp classification. We have also compared the performance of eight state-of-the-art deep learning-based object detection models. The results demonstrate that deep CNN models are promising in CRC screening. This work can serve as a baseline for future research in polyp detection and classification.
Affiliation(s)
- Kaidong Li, Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS, United States of America
- Mohammad I. Fathan, Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS, United States of America
- Krushi Patel, Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS, United States of America
- Tianxiao Zhang, Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS, United States of America
- Cuncong Zhong, Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS, United States of America
- Ajay Bansal, Gastroenterology, Hepatology and Motility, The University of Kansas Medical Center, Kansas City, KS, United States of America
- Amit Rastogi, Gastroenterology, Hepatology and Motility, The University of Kansas Medical Center, Kansas City, KS, United States of America
- Jean S. Wang, Department of Medicine, Washington University School of Medicine, Saint Louis, MO, United States of America
- Guanghui Wang, Department of Computer Science, Ryerson University, Toronto, ON, Canada
|