51
Yin X, Zeng J, Hou T, Tang C, Gan C, Jain DK, García S. RSAFormer: A method of polyp segmentation with region self-attention transformer. Comput Biol Med 2024;172:108268. PMID: 38493598. DOI: 10.1016/j.compbiomed.2024.108268.
Abstract
Colonoscopy plays a key role in early screening and clinical diagnosis of colon cancer, yet fine-grained polyp segmentation remains challenging: existing state-of-the-art models have limited segmentation ability because the boundaries between normal tissue and polyps are unclear and the two appear highly similar. To address this problem, we propose a region self-attention enhancement network (RSAFormer) with a transformer encoder to capture more robust features. Unlike other strong methods, RSAFormer employs a dual-decoder structure to generate diverse feature maps; compared with traditional single-decoder designs, this offers more flexibility and detail in feature extraction. RSAFormer also introduces a region self-attention enhancement (RSA) module to acquire more accurate feature information and foster a stronger interplay between low-level and high-level features. This module enhances uncertain areas, identified from regional context, to extract more precise boundary information. Extensive experiments on five widely used polyp datasets demonstrate RSAFormer's proficiency: it achieves 92.2% and 83.5% mean Dice on Kvasir and ETIS, respectively, outperforming most state-of-the-art models.
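Several entries in this list report results as mean Dice. For reference, a minimal sketch of the Dice coefficient on binary masks (NumPy arrays and the epsilon smoothing are illustrative assumptions, not code from any cited paper):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# |A ∩ B| = 1, |A| = 2, |B| = 1 → Dice = 2/3
a = np.array([[1, 1], [0, 0]])
b = np.array([[1, 0], [0, 0]])
```

The "mean Dice" (mDice) figures quoted in these abstracts average this score over all images in a test set.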
Affiliation(s)
- Xuehui Yin
- School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.
- Jun Zeng
- School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.
- Tianxiao Hou
- School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.
- Chao Tang
- School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.
- Chenquan Gan
- School of Cyber Security and Information Law, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.
- Deepak Kumar Jain
- Key Laboratory of Intelligent Control and Optimization for Industrial Equipment of Ministry of Education, Dalian University of Technology, Dalian 116024, China; Symbiosis Institute of Technology, Symbiosis International University, Pune 412115, India.
- Salvador García
- Department of Computer Science and Artificial Intelligence, Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada, Granada 18071, Spain.
52
Yang C, Zhang Z. PFD-Net: Pyramid Fourier Deformable Network for medical image segmentation. Comput Biol Med 2024;172:108302. PMID: 38503092. DOI: 10.1016/j.compbiomed.2024.108302.
Abstract
Medical image segmentation is crucial for accurately locating lesion regions and assisting doctors in diagnosis. However, most existing methods fail to effectively exploit both local details and global semantic information, and thus cannot capture fine-grained content such as small targets and irregular boundaries. To address this issue, we propose a novel Pyramid Fourier Deformable Network (PFD-Net) for medical image segmentation that leverages the strengths of both CNNs and Transformers. PFD-Net first uses a PVTv2-based Transformer as the primary encoder to capture global information, and further enhances both local and global feature representations with a Fast Fourier Convolution Residual (FFCR) module. A Dilated Deformable Refinement (DDR) module then strengthens the model's capacity to comprehend the global semantic structure of shape-diverse targets and their irregular boundaries. Lastly, a Cross-Level Fusion Block with deformable convolution (CLFB) combines the decoded feature maps from the final DDR-based decoder block with local features from a CNN auxiliary encoder branch, improving the network's ability to perceive targets that resemble surrounding structures. Extensive experiments were conducted on nine public medical image datasets covering five segmentation tasks: polyp, abdominal, cardiac, gland cell, and nuclei segmentation. Qualitative and quantitative results demonstrate that PFD-Net outperforms existing state-of-the-art methods across evaluation metrics, achieving the highest mDice of 0.826 on the most challenging dataset (ETIS), a 1.8% improvement over the previous best-performing HSNet and a 3.6% improvement over the next-best PVT-CASCADE. Codes are available at https://github.com/ChaorongYang/PFD-Net.
Affiliation(s)
- Chaorong Yang
- College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang 050024, China; Hebei Provincial Engineering Research Center for Supply Chain Big Data Analytics & Data Security, Shijiazhuang 050024, China; Hebei Provincial Key Laboratory of Network & Information Security, Hebei Normal University, Shijiazhuang 050024, China.
- Zhaohui Zhang
- College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang 050024, China; Hebei Provincial Engineering Research Center for Supply Chain Big Data Analytics & Data Security, Shijiazhuang 050024, China; Hebei Provincial Key Laboratory of Network & Information Security, Hebei Normal University, Shijiazhuang 050024, China.
53
Zhang Y, Yang G, Gong C, Zhang J, Wang S, Wang Y. Polyp segmentation with interference filtering and dynamic uncertainty mining. Phys Med Biol 2024;69:075016. PMID: 38382099. DOI: 10.1088/1361-6560/ad2b94.
Abstract
Objective. Accurate polyp segmentation from colonoscopy images plays a crucial role in the early diagnosis and treatment of colorectal cancer. However, existing polyp segmentation methods are inevitably affected by various image noises, such as reflections, motion blur, and feces, which significantly degrade the performance and generalization of a model. Coupled with the ambiguous boundaries between polyps and surrounding tissue, i.e. small inter-class differences, accurate polyp segmentation remains a challenging problem.
Approach. To address these issues, we propose a novel two-stage polyp segmentation method that leverages a preprocessing sub-network (Pre-Net) and a dynamic uncertainty mining network (DUMNet) to improve segmentation accuracy. Pre-Net identifies and filters out interference regions before feeding the colonoscopy images to the polyp segmentation network DUMNet. To handle confusing polyp boundaries, DUMNet employs an uncertainty mining module (UMM) that dynamically focuses on foreground, background, and uncertain regions according to per-pixel confidence. UMM mines and enhances more detailed context, leading to coarse-to-fine polyp segmentation and precise localization of polyp regions.
Main results. We conduct experiments on five popular polyp segmentation benchmarks: ETIS, CVC-ClinicDB, CVC-ColonDB, EndoScene, and Kvasir. Our method achieves state-of-the-art performance. Furthermore, the proposed Pre-Net is highly portable and can improve the accuracy of existing polyp segmentation models.
Significance. The proposed method improves polyp segmentation performance by eliminating interference and mining uncertain regions, aiding doctors in making precise diagnoses and reducing the risk of colorectal cancer. Our code will be released at https://github.com/zyh5119232/DUMNet.
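The uncertainty-mining idea above hinges on splitting pixels by prediction confidence. A hedged sketch of that partition step (the thresholds and NumPy formulation are illustrative assumptions, not the paper's actual UMM):

```python
import numpy as np

def partition_by_confidence(prob, lo=0.3, hi=0.7):
    """Split a predicted probability map into confident foreground,
    confident background, and an uncertain band left for refinement.
    Thresholds lo/hi are illustrative, not from the cited paper."""
    foreground = prob >= hi                  # confident polyp pixels
    background = prob <= lo                  # confident non-polyp pixels
    uncertain = ~(foreground | background)   # ambiguous band to mine further
    return foreground, background, uncertain

prob = np.array([[0.9, 0.5],
                 [0.1, 0.75]])
fg, bg, un = partition_by_confidence(prob)
```

A coarse-to-fine model would then spend extra capacity only on the `uncertain` mask rather than re-processing every pixel.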
Affiliation(s)
- Yunhua Zhang
- Northeastern University, Shenyang 110819, People's Republic of China
- DUT Artificial Intelligence Institute, Dalian 116024, People's Republic of China
- Gang Yang
- Northeastern University, Shenyang 110819, People's Republic of China
- Congjin Gong
- Northeastern University, Shenyang 110819, People's Republic of China
- Jianhao Zhang
- Northeastern University, Shenyang 110819, People's Republic of China
- Shuo Wang
- Northeastern University, Shenyang 110819, People's Republic of China
- Yutao Wang
- Northeastern University, Shenyang 110819, People's Republic of China
54
Xu C, Fan K, Mo W, Cao X, Jiao K. Dual ensemble system for polyp segmentation with submodels adaptive selection ensemble. Sci Rep 2024;14:6152. PMID: 38485963. PMCID: PMC10940608. DOI: 10.1038/s41598-024-56264-2. Open access.
Abstract
Colonoscopy is one of the main methods for detecting colon polyps and is widely used to prevent and diagnose colon cancer. With the rapid development of computer vision, deep learning-based semantic segmentation of colon polyps has been widely researched, yet the accuracy and stability of many methods still leave room for improvement. In addition, how to select appropriate sub-models for ensemble learning in the colon polyp segmentation task remains an open question. To solve these problems, we first exploit multiple complementary high-level semantic features through a Multi-Head Control Ensemble. Then, to solve the sub-model selection problem during training, we propose the SDBH-PSO Ensemble, which selects sub-models and optimizes ensemble weights for different datasets. Experiments were conducted on the public datasets CVC-ClinicDB, Kvasir, CVC-ColonDB, ETIS-LaribPolypDB, and PolypGen. The results show that DET-Former, built on the Multi-Head Control Ensemble and the SDBH-PSO Ensemble, consistently improves accuracy across datasets: the Multi-Head Control Ensemble demonstrated superior feature fusion, and the SDBH-PSO Ensemble demonstrated excellent sub-model selection. These sub-model selection capabilities should retain significant reference value and practical utility as deep learning networks evolve.
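The abstract above does not spell out the SDBH-PSO mechanics; as a stand-in, here is a minimal sketch of the underlying idea of fitting convex ensemble weights for two submodels against a Dice-style score. The grid search, threshold, and all names are illustrative assumptions (a PSO variant would search the same weight space more cleverly):

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def best_two_model_weights(p1, p2, target, steps=11):
    """Search w in [0, 1] for the blend w*p1 + (1-w)*p2, scoring the
    thresholded blend with Dice on a validation target mask."""
    best_w, best_score = 0.0, -1.0
    for w in np.linspace(0.0, 1.0, steps):
        blended = (w * p1 + (1.0 - w) * p2) >= 0.5
        score = dice(blended, target)
        if score > best_score:
            best_w, best_score = float(w), score
    return best_w, best_score

target = np.array([1, 1, 0, 0], dtype=bool)      # validation ground truth
p1 = np.array([0.9, 0.8, 0.2, 0.1])              # stronger submodel
p2 = np.array([0.4, 0.2, 0.8, 0.9])              # weaker submodel
w, s = best_two_model_weights(p1, p2, target)    # weight favors p1
```

Sub-model *selection* then amounts to dropping any submodel whose optimized weight collapses toward zero.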
Affiliation(s)
- Cun Xu
- Guilin University of Electronic Technology, Guilin, 541000, China
- Kefeng Fan
- China Electronics Standardization Institute, Beijing, 100007, China.
- Wei Mo
- Guilin University of Electronic Technology, Guilin, 541000, China
- Xuguang Cao
- Guilin University of Electronic Technology, Guilin, 541000, China
- Kaijie Jiao
- Guilin University of Electronic Technology, Guilin, 541000, China
55
Zhang Y, Shen Z, Jiao R. Segment anything model for medical image segmentation: Current applications and future directions. Comput Biol Med 2024;171:108238. PMID: 38422961. DOI: 10.1016/j.compbiomed.2024.108238.
Abstract
Due to the inherent flexibility of prompting, foundation models have emerged as the predominant force in the fields of natural language processing and computer vision. The recent introduction of the Segment Anything Model (SAM) signifies a noteworthy expansion of the prompt-driven paradigm into the domain of image segmentation, thereby introducing a plethora of previously unexplored capabilities. However, the viability of its application to medical image segmentation remains uncertain, given the substantial distinctions between natural and medical images. In this work, we provide a comprehensive overview of recent endeavors aimed at extending the efficacy of SAM to medical image segmentation tasks, encompassing both empirical benchmarking and methodological adaptations. Additionally, we explore potential avenues for future research directions in SAM's role within medical image segmentation. While direct application of SAM to medical image segmentation does not yield satisfactory performance on multi-modal and multi-target medical datasets so far, numerous insights gleaned from these efforts serve as valuable guidance for shaping the trajectory of foundational models in the realm of medical image analysis. To support ongoing research endeavors, we maintain an active repository that contains an up-to-date paper list and a succinct summary of open-source projects at https://github.com/YichiZhang98/SAM4MIS.
Affiliation(s)
- Yichi Zhang
- School of Data Science, Fudan University, Shanghai, China.
- Zhenrong Shen
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Rushi Jiao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
56
Wang H, Hu T, Zhang Y, Zhang H, Qi Y, Wang L, Ma J, Du M. Unveiling camouflaged and partially occluded colorectal polyps: Introducing CPSNet for accurate colon polyp segmentation. Comput Biol Med 2024;171:108186. PMID: 38394804. DOI: 10.1016/j.compbiomed.2024.108186.
Abstract
BACKGROUND: Segmenting colorectal polyps presents a significant challenge due to the diverse variations in their size, shape, texture, and intricate backgrounds. Particularly demanding are the so-called "camouflaged" polyps, which are partially concealed by surrounding tissues or fluids, adding complexity to their detection.
METHODS: We present CPSNet, an innovative model designed for camouflaged polyp segmentation. CPSNet incorporates three key modules: a Deep Multi-Scale-Feature Fusion Module, a Camouflaged Object Detection Module, and a Multi-Scale Feature Enhancement Module. These modules work collaboratively to improve the segmentation process, enhancing both robustness and accuracy.
RESULTS: Our experiments confirm the effectiveness of CPSNet. Compared with state-of-the-art methods in colon polyp segmentation, CPSNet consistently outperforms the competition. Particularly noteworthy is its performance on the ETIS-LaribPolypDB dataset, where it achieved a 2.3% increase in the Dice coefficient over the Polyp-PVT model.
CONCLUSION: CPSNet marks a significant advancement in colorectal polyp segmentation. Its combination of multi-scale feature fusion, camouflaged object detection, and feature enhancement holds considerable promise for clinical applications.
Affiliation(s)
- Huafeng Wang
- School of Information Technology, North China University of Technology, Beijing 100041, China.
- Tianyu Hu
- School of Information Technology, North China University of Technology, Beijing 100041, China.
- Yanan Zhang
- School of Information Technology, North China University of Technology, Beijing 100041, China.
- Haodu Zhang
- School of Intelligent Systems Engineering, Sun Yat-sen University, Guangzhou 510335, China.
- Yong Qi
- School of Information Technology, North China University of Technology, Beijing 100041, China.
- Longzhen Wang
- Department of Gastroenterology, Second People's Hospital, Changzhi, Shanxi 046000, China.
- Jianhua Ma
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510335, China.
- Minghua Du
- Department of Emergency, PLA General Hospital, Beijing 100853, China.
Collapse
|
57
Shao D, Yang H, Liu C, Ma L. AFANet: Adaptive feature aggregation for polyp segmentation. Med Eng Phys 2024;125:104118. PMID: 38508807. DOI: 10.1016/j.medengphy.2024.104118.
Abstract
Deep learning-based polyp segmentation is superior in terms of both speed and accuracy. It is essential for the early detection and treatment of colorectal cancer and has the potential to greatly reduce the disease's overall prevalence. However, due to the varied forms and sizes of polyps, as well as the blurred boundaries between the polyp region and the surrounding mucus, most existing algorithms are unable to provide highly accurate colorectal polyp segmentation. To overcome these obstacles, we propose an adaptive feature aggregation network (AFANet). It contains two main modules: a Multi-modal Balancing Attention Module (MMBA) and a Global Context Module (GCM). The MMBA extracts improved local characteristics for inference by integrating local contextual information while attending to three regions: foreground, background, and border. The GCM takes global information from the top of the encoder and passes it to the decoder layers to further exploit global contextual features in the pathological image. Comprehensive experimental validation of the proposed technique on two benchmark datasets, Kvasir-SEG and CVC-ClinicDB, achieves Dice of 92.11% and 94.76% and mIoU of 91.07% and 94.54%, respectively. The experimental results demonstrate that the strategy outperforms other cutting-edge approaches.
Affiliation(s)
- Dangguo Shao
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
- Haiqiong Yang
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
- Cuiyin Liu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China.
- Lei Ma
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
58
Mozaffari J, Amirkhani A, Shokouhi SB. ColonGen: an efficient polyp segmentation system for generalization improvement using a new comprehensive dataset. Phys Eng Sci Med 2024;47:309-325. PMID: 38224384. DOI: 10.1007/s13246-023-01368-8.
Abstract
Colorectal cancer (CRC) is one of the most common causes of cancer-related deaths. While polyp detection is important for diagnosing CRC, high miss rates for polyps have been reported during colonoscopy. Most deep learning methods extract features from images using convolutional neural networks (CNNs). In recent years, vision transformer (ViT) models have been employed for image processing and have been successful in image segmentation. Image processing can be improved by combining transformer models, which can extract spatial location information, with CNNs, which are capable of aggregating local information. Despite this, recent research shows limited effectiveness in increasing data diversity and generalization accuracy. This paper investigates the generalization proficiency of transformer-based polyp image segmentation and proposes a novel approach using two different ViT architectures. This allows the model to learn representations from different perspectives, which can then be combined into a richer feature representation. Additionally, a more universal and comprehensive dataset has been derived from the datasets presented in related research, which can be used to improve generalization. We first evaluated the generalization of our proposed model using three distinct training-testing scenarios. Our experimental results demonstrate that our ColonGen-V1 outperforms other state-of-the-art methods in all scenarios. As a next step, we used the comprehensive dataset to improve the model's performance on in- and out-of-domain data. The results show that our ColonGen-V2 outperforms state-of-the-art studies by 5.1%, 1.3%, and 1.1% on the ETIS-Larib, Kvasir-Seg, and CVC-ColonDB datasets, respectively. The inclusive dataset and the model introduced in this paper are available to the public at https://github.com/javadmozaffari/Polyp_segmentation.
Affiliation(s)
- Javad Mozaffari
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, 16846-13114, Iran
- Abdollah Amirkhani
- School of Automotive Engineering, Iran University of Science and Technology, Tehran, 16846-13114, Iran.
- Shahriar B Shokouhi
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, 16846-13114, Iran
59
He Y, Yi Y, Zheng C, Kong J. BGF-Net: Boundary guided filter network for medical image segmentation. Comput Biol Med 2024;171:108184. PMID: 38417386. DOI: 10.1016/j.compbiomed.2024.108184.
Abstract
How to fuse low-level and high-level features effectively is crucial to improving the accuracy of medical image segmentation. Most CNN-based segmentation models adopt attention mechanisms to fuse features from different levels, but they do not effectively exploit the guidance information carried by high-level features, which is often highly beneficial, to steer the extraction of low-level features. To address this problem, we design multiple guided modules and develop a boundary-guided filter network (BGF-Net) for more accurate medical image segmentation. To the best of our knowledge, this is the first time that boundary guidance information has been introduced into the medical image segmentation task. Specifically, we first propose a simple yet effective channel boundary guided module that makes the segmentation model pay more attention to the relevant channel weights. We further design a novel spatial boundary guided module that complements the channel boundary guided module and is aware of the most important spatial positions. Finally, we propose a boundary guided filter that preserves the structural information from the previous feature map and guides the model to learn more important feature information. We conduct extensive experiments on skin lesion, polyp, and gland segmentation datasets, including ISIC 2016, CVC-EndoSceneStil, and GlaS, to test the proposed BGF-Net. The experimental results demonstrate that BGF-Net performs better than other state-of-the-art methods.
Affiliation(s)
- Yanlin He
- College of Information Sciences and Technology, Northeast Normal University, Changchun, 130117, China
- Yugen Yi
- School of Software, Jiangxi Normal University, Nanchang, 330022, China
- Caixia Zheng
- College of Information Sciences and Technology, Northeast Normal University, Changchun, 130117, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China.
- Jun Kong
- Institute for Intelligent Elderly Care, Changchun Humanities and Sciences College, Changchun, 130117, China.
60
Soo JMP, Koh FHX. Detection of sessile serrated adenoma using artificial intelligence-enhanced endoscopy: an Asian perspective. ANZ J Surg 2024;94:362-365. PMID: 38149749. DOI: 10.1111/ans.18785.
Abstract
BACKGROUND: As the serrated pathway has gained prominence as an alternative colorectal carcinogenesis pathway, sessile serrated adenomas or polyps (SSA/P) have been highlighted as lesions to rule out during colonoscopy. These lesions are morphologically difficult to detect on endoscopy and can be mistaken for hyperplastic polyps due to similar endoscopic features. Given their rapid progression and malignant transformation, interval cancer is a likely consequence of undetected or overlooked SSA/P. Real-time artificial intelligence (AI)-assisted colonoscopy via a computer-assisted detection system (CADe) is an increasingly useful tool for improving the adenoma detection rate by providing a second eye during the procedure. In this article, we present a video guide illustrating the detection of SSA/P during AI-assisted colonoscopy.
METHODS: Consultant-grade endoscopists used a real-time AI-assisted colonoscopy device, as part of a larger prospective study, to detect suspicious lesions, which were later histopathologically confirmed to be SSA/P.
RESULTS: All lesions were picked up by the CADe, which highlighted suspicious polyps to the clinician with a real-time green box. Three SSA/P of varying morphology are described with reference to classical SSA/P features and compared with the features of the hyperplastic polyp found in our study. All three SSA/P are in keeping with the JNET classification (Type 1).
CONCLUSION: CADe is a highly useful aid during endoscopy for the detection of SSA/P, but it must be complemented by good endoscopy skill and bowel preparation for effective detection, and by biopsy with subsequent accurate histological diagnosis.
Affiliation(s)
- Joycelyn Mun-Peng Soo
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Frederick Hong-Xiang Koh
- Colorectal Service, Department of General Surgery, Sengkang General Hospital, SingHealth Services, Singapore, Singapore
61
Wang M, An X, Pei Z, Li N, Zhang L, Liu G, Ming D. An Efficient Multi-Task Synergetic Network for Polyp Segmentation and Classification. IEEE J Biomed Health Inform 2024;28:1228-1239. PMID: 37155397. DOI: 10.1109/jbhi.2023.3273728.
Abstract
Colonoscopy is considered the best diagnostic tool for early detection and resection of polyps, which can effectively prevent consequential colorectal cancer. In clinical practice, segmenting and classifying polyps from colonoscopic images is of great significance, since it provides precious information for diagnosis and treatment. In this study, we propose an efficient multi-task synergetic network (EMTS-Net) for concurrent polyp segmentation and classification, and we introduce a polyp classification benchmark for exploring the potential correlations between the two tasks. The framework is composed of an enhanced multi-scale network (EMS-Net) for coarse-grained polyp segmentation, an EMTS-Net (Class) for accurate polyp classification, and an EMTS-Net (Seg) for fine-grained polyp segmentation. Specifically, we first obtain coarse segmentation masks using EMS-Net. Then, we concatenate these rough masks with colonoscopic images to help EMTS-Net (Class) locate and classify polyps precisely. To further enhance segmentation performance, we propose a random multi-scale (RMS) training strategy to eliminate the interference caused by redundant information. In addition, we design an offline dynamic class activation mapping (OFLD CAM) generated by the combined effect of EMTS-Net (Class) and the RMS strategy, which efficiently optimizes bottlenecks between the multi-task networks and helps EMTS-Net (Seg) perform more accurate polyp segmentation. On the polyp segmentation and classification benchmarks, EMTS-Net achieves an average mDice of 0.864 in segmentation and an average AUC of 0.913 with an average accuracy of 0.924 in classification. Quantitative and qualitative evaluations demonstrate that EMTS-Net achieves the best performance and outperforms previous state-of-the-art methods in terms of both efficiency and generalization.
62
Jia X, Shen Y, Yang J, Song R, Zhang W, Meng MQH, Liao JC, Xing L. PolypMixNet: Enhancing semi-supervised polyp segmentation with polyp-aware augmentation. Comput Biol Med 2024;170:108006. PMID: 38325216. DOI: 10.1016/j.compbiomed.2024.108006.
Abstract
BACKGROUND: AI-assisted polyp segmentation in colonoscopy plays a crucial role in enabling prompt diagnosis and treatment of colorectal cancer. However, the lack of sufficient annotated data poses a significant challenge for supervised learning approaches. Existing semi-supervised learning methods also suffer from performance degradation, mainly due to task-specific characteristics such as the class imbalance in polyp segmentation.
PURPOSE: To develop an effective semi-supervised learning framework for accurate polyp segmentation in colonoscopy that addresses the challenges of limited annotated data and class imbalance.
METHODS: We proposed PolypMixNet, a semi-supervised framework for colorectal polyp segmentation, which uses novel augmentation techniques and a Mean Teacher architecture to improve model performance. PolypMixNet introduces the polyp-aware mixup (PolypMix) algorithm and incorporates dual-level consistency regularization. PolypMix addresses the class imbalance in colonoscopy datasets and enhances the diversity of training data: by performing a polyp-aware mixup on unlabeled samples, it generates mixed images with polyp context along with their artificial labels. A polyp-directed soft pseudo-labeling (PDSPL) mechanism generates high-quality pseudo labels and eliminates the dilution of lesion features caused by mixup operations. To enforce consistency during training, we introduce the PolypMix prediction consistency (PMPC) loss and the PolypMix attention consistency (PMAC) loss, applied at both the image and feature levels. Code is available at https://github.com/YChienHung/PolypMix.
RESULTS: PolypMixNet was evaluated on four public colonoscopy datasets, achieving 88.97% Dice and 88.85% mIoU on the benchmark Kvasir-SEG dataset. When the labeled training data is limited to 15%, PolypMixNet outperforms state-of-the-art semi-supervised approaches with a 2.88-point improvement in Dice, and reaches performance comparable to its fully supervised counterpart. Extensive ablation studies validate the effectiveness of each module and highlight the superiority of the proposed approach.
CONCLUSION: PolypMixNet effectively addresses the challenges posed by limited annotated data and unbalanced class distributions in polyp segmentation. By leveraging unlabeled data with novel augmentation and consistency regularization techniques, it achieves state-of-the-art performance. We believe the insights and contributions presented here will pave the way for further advances in semi-supervised polyp segmentation and inspire future research in the medical imaging domain.
Affiliation(s)
- Xiao Jia
- School of Control Science and Engineering, Shandong University, Jinan, China.
- Yutian Shen
- Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China.
- Jianhong Yang
- School of Control Science and Engineering, Shandong University, Jinan, China.
- Ran Song
- School of Control Science and Engineering, Shandong University, Jinan, China.
- Wei Zhang
- School of Control Science and Engineering, Shandong University, Jinan, China.
- Max Q-H Meng
- Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China.
- Joseph C Liao
- Department of Urology, Stanford University, Stanford, 94305, CA, USA; VA Palo Alto Health Care System, Palo Alto, 94304, CA, USA.
- Lei Xing
- Department of Radiation Oncology, Stanford University, Stanford, 94305, CA, USA.
63
Wang Z, Yu L, Tian S, Huo X. CRMEFNet: A coupled refinement, multiscale exploration and fusion network for medical image segmentation. Comput Biol Med 2024; 171:108202. [PMID: 38402839 DOI: 10.1016/j.compbiomed.2024.108202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2023] [Revised: 12/22/2023] [Accepted: 02/18/2024] [Indexed: 02/27/2024]
Abstract
Accurate segmentation of target areas in medical images, such as lesions, is essential for disease diagnosis and clinical analysis. In recent years, deep learning methods have been intensively researched and have generated significant progress in medical image segmentation tasks. However, most existing methods have limitations in modeling multilevel feature representations and in identifying complex textured pixels at contrasting boundaries. This paper proposes a novel coupled refinement, multiscale exploration and fusion network (CRMEFNet) for medical image segmentation, which explores the optimization and fusion of multiscale features to address the abovementioned limitations. CRMEFNet consists of three main innovations: a coupled refinement module (CRM), a multiscale exploration and fusion module (MEFM), and a cascaded progressive decoder (CPD). The CRM decouples features into low-frequency body features and high-frequency edge features, and performs targeted optimization of both to enhance intraclass uniformity and interclass differentiation. The MEFM performs a two-stage exploration and fusion of multiscale features using our proposed multiscale aggregation attention mechanism, which explores the differentiated information within cross-level features and enhances the contextual connections between them to achieve adaptive feature fusion. Compared to existing complex decoders, the CPD decoder (consisting of the CRM and MEFM) performs fine-grained pixel recognition while retaining complete semantic location information, and it has a simple design and excellent performance. Experimental results from five medical image segmentation tasks, ten datasets and twelve comparison models demonstrate the state-of-the-art performance, interpretability, flexibility and versatility of CRMEFNet.
Affiliation(s)
- Zhi Wang
- College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Long Yu
- College of Network Center, Xinjiang University, Urumqi, 830000, China; Signal and Signal Processing Laboratory, College of Information Science and Engineering, Xinjiang University, Urumqi, 830000, China.
- Shengwei Tian
- College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Xiangzuo Huo
- Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China; Signal and Signal Processing Laboratory, College of Information Science and Engineering, Xinjiang University, Urumqi, 830000, China
64
Zhu PC, Wan JJ, Shao W, Meng XC, Chen BL. Colorectal image analysis for polyp diagnosis. Front Comput Neurosci 2024; 18:1356447. [PMID: 38404511 PMCID: PMC10884282 DOI: 10.3389/fncom.2024.1356447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2023] [Accepted: 01/05/2024] [Indexed: 02/27/2024] Open
Abstract
Colorectal polyps are an important early manifestation of colorectal cancer, and their detection is significant for colorectal cancer prevention. Although timely detection and manual intervention can reduce the chance that polyps become cancerous, most existing methods ignore the uncertainties and location problems of polyps, causing degraded detection performance. To address these problems, in this paper we propose a novel colorectal image analysis method for polyp diagnosis via PAM-Net. Specifically, a parallel attention module is designed to enhance the analysis of colorectal polyp images and improve the certainty of polyp predictions. In addition, our method introduces the GWD loss to enhance the accuracy of polyp diagnosis from the perspective of polyp location. Extensive experimental results demonstrate the effectiveness of the proposed method compared with SOTA baselines. This study improves polyp detection accuracy and contributes to polyp detection in clinical medicine.
Affiliation(s)
- Peng-Cheng Zhu
- Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, China
- Jing-Jing Wan
- Department of Gastroenterology, The Second People's Hospital of Huai'an, The Affiliated Huai'an Hospital of Xuzhou Medical University, Huaian, Jiangsu, China
- Wei Shao
- Nanjing University of Aeronautics and Astronautics Shenzhen Research Institute, Shenzhen, China
- Xian-Chun Meng
- Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, China
- Bo-Lun Chen
- Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, China
- Department of Physics, University of Fribourg, Fribourg, Switzerland
65
Yue G, Zhuo G, Yan W, Zhou T, Tang C, Yang P, Wang T. Boundary uncertainty aware network for automated polyp segmentation. Neural Netw 2024; 170:390-404. [PMID: 38029720 DOI: 10.1016/j.neunet.2023.11.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2023] [Revised: 07/15/2023] [Accepted: 11/22/2023] [Indexed: 12/01/2023]
Abstract
Recently, leveraging deep neural networks for automated colorectal polyp segmentation has emerged as a hot topic, owing to its advantages in avoiding the limitations of visual inspection, e.g., fatigue and subjectivity. However, most existing methods do not pay enough attention to the uncertain areas of colonoscopy images and often provide unsatisfactory segmentation performance. In this paper, we propose a novel boundary uncertainty aware network (BUNet) for precise and robust colorectal polyp segmentation. Specifically, considering that polyps vary greatly in size and shape, we first adopt a pyramid vision transformer encoder to learn multi-scale feature representations. Then, a simple yet effective boundary exploration module (BEM) is proposed to explore boundary cues from the low-level features. To make the network focus on ambiguous areas where the prediction score is biased to neither the foreground nor the background, we further introduce a boundary uncertainty aware module (BUM) that explores error-prone regions from the high-level features with the assistance of the boundary cues provided by the BEM. Through top-down hybrid deep supervision, BUNet implements coarse-to-fine polyp segmentation and precisely localizes polyp regions. Extensive experiments on five public datasets show that BUNet is superior to thirteen competing methods in terms of both effectiveness and generalization ability.
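The "ambiguous area" that boundary-uncertainty methods like this target can be characterized as the set of pixels whose foreground probability sits near 0.5. A minimal sketch of such an uncertainty map (our illustration of the general idea, not the authors' BUM module; the function name and the 0.25 band are assumptions):

```python
import numpy as np

def uncertainty_map(prob, band=0.25):
    """Highlight error-prone pixels whose foreground probability is
    biased to neither foreground nor background.

    Confidence is |p - 0.5| rescaled to [0, 1]; the returned weight
    (1 - confidence) is close to 1 for ambiguous pixels, and the
    binary mask marks pixels inside the `band` around 0.5 that a
    refinement module could emphasise.
    """
    confidence = np.abs(prob - 0.5) * 2.0           # 0 = ambiguous, 1 = certain
    uncertain = (np.abs(prob - 0.5) < band).astype(float)
    return 1.0 - confidence, uncertain

prob = np.array([[0.05, 0.48],
                 [0.52, 0.95]])                      # toy prediction map
weight, mask = uncertainty_map(prob)
```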
Affiliation(s)
- Guanghui Yue
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Marshall Laboratory of Biomedical Engineering, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen 518060, China
- Guibin Zhuo
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Marshall Laboratory of Biomedical Engineering, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen 518060, China
- Weiqing Yan
- School of Computer and Control Engineering, Yantai University, Yantai 264005, China
- Tianwei Zhou
- College of Management, Shenzhen University, Shenzhen 518060, China.
- Chang Tang
- School of Computer Science, China University of Geosciences, Wuhan 430074, China
- Peng Yang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Marshall Laboratory of Biomedical Engineering, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen 518060, China
- Tianfu Wang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Marshall Laboratory of Biomedical Engineering, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen 518060, China
66
Yin Y, Luo S, Zhou J, Kang L, Chen CYC. LDCNet: Lightweight dynamic convolution network for laparoscopic procedures image segmentation. Neural Netw 2024; 170:441-452. [PMID: 38039682 DOI: 10.1016/j.neunet.2023.11.055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2023] [Revised: 11/07/2023] [Accepted: 11/24/2023] [Indexed: 12/03/2023]
Abstract
Medical image segmentation is fundamental to modern healthcare systems, especially for reducing surgical risk and for treatment planning. Transanal total mesorectal excision (TaTME) has emerged as a recent focal point in laparoscopic research and a pivotal modality for the treatment of colorectal cancer. Real-time instance segmentation of surgical imagery during TaTME procedures can serve as an invaluable tool in assisting surgeons, ultimately reducing surgical risk. The dynamic variations in the size and shape of anatomical structures within intraoperative images pose a formidable challenge, rendering precise instance segmentation of TaTME images a task of considerable complexity. Deep learning has demonstrated its efficacy in medical image segmentation. However, existing models struggle to achieve satisfactory accuracy while maintaining manageable computational complexity on TaTME data. To address this conundrum, we propose a lightweight dynamic convolution network (LDCNet) that matches the segmentation performance of state-of-the-art (SOTA) medical image segmentation networks while running at the speed of a lightweight convolutional neural network. Experimental results demonstrate the promising performance of LDCNet, which consistently exceeds previous SOTA approaches. Code is available at github.com/yinyiyang416/LDCNet.
Affiliation(s)
- Yiyang Yin
- Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, Guangdong, China
- Shuangling Luo
- Department of General Surgery (Colorectal Surgery), The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510655, Guangdong, China; Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Department of Colorectal Surgery, Guangzhou, 510655, Guangdong, China; The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510655, Guangdong, China
- Jun Zhou
- Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, Guangdong, China
- Liang Kang
- Department of General Surgery (Colorectal Surgery), The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510655, Guangdong, China; Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Department of Colorectal Surgery, Guangzhou, 510655, Guangdong, China; The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510655, Guangdong, China.
- Calvin Yu-Chian Chen
- Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, Guangdong, China; AI for Science (AI4S) - Preferred Program, Peking University Shenzhen Graduate School, Shenzhen, 518055, Guangdong, China; School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen, 518055, Guangdong, China; Department of Medical Research, China Medical University Hospital, Taichung, 40447, Taiwan; Department of Bioinformatics and Medical Engineering, Asia University, Taichung, 41354, Taiwan.
67
Zhang Y, Zhou T, Tao Y, Wang S, Wu Y, Liu B, Gu P, Chen Q, Chen DZ. TestFit: A plug-and-play one-pass test time method for medical image segmentation. Med Image Anal 2024; 92:103069. [PMID: 38154382 DOI: 10.1016/j.media.2023.103069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2023] [Revised: 10/16/2023] [Accepted: 12/19/2023] [Indexed: 12/30/2023]
Abstract
Deep learning (DL) based methods have been extensively studied for medical image segmentation, mostly emphasizing the design and training of DL networks; only a few attempts have been made to develop methods for applying DL models at test time. In this paper, we study whether a given off-the-shelf segmentation network can be stably improved on the fly at test time in an online processing-and-learning fashion. We propose a new online test-time method, called TestFit, that improves the results of a given off-the-shelf DL segmentation model at test time by actively fitting the test data distribution. TestFit first creates a supplementary network (SuppNet) from the given trained off-the-shelf segmentation network (referred to as OGNet) and applies SuppNet together with OGNet for test-time inference. OGNet keeps its hypothesis derived from the original training set to prevent the model from collapsing, while SuppNet seeks to fit the test data distribution. Segmentation results and supervision signals (for updating SuppNet) are generated by combining the outputs of OGNet and SuppNet on the fly. TestFit needs only one pass over each test sample - the same as the traditional test pipeline - and requires no training-time preparation. Since each update must rely on a single test sample with no manual annotation, we develop a series of technical treatments to improve the stability and effectiveness of the proposed online test-time training method. TestFit works in a plug-and-play fashion, requires minimal hyper-parameter tuning, and is easy to use in practice. Experiments on a large collection of 2D and 3D datasets demonstrate the capability of our TestFit method.
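The fusion step described above, combining a frozen model's output with an adapting copy's output to produce both the served prediction and a pseudo supervision signal, can be sketched as follows. This is a deliberately simplified illustration under assumed choices (equal-weight averaging, a 0.5 threshold, and the SuppNet update step omitted); the paper's actual combination and update rules differ.

```python
import numpy as np

def combine_and_supervise(p_og, p_supp, alpha=0.5, thr=0.5):
    """One-pass test-time fusion in the spirit of TestFit.

    `p_og` comes from the frozen off-the-shelf model (OGNet) and
    anchors the prediction against collapse; `p_supp` comes from the
    adapting copy (SuppNet). The fused map is the served
    segmentation, and its thresholded version acts as the pseudo
    supervision signal that would drive the SuppNet update.
    """
    fused = alpha * p_og + (1.0 - alpha) * p_supp
    pseudo_label = (fused > thr).astype(np.float32)
    return fused, pseudo_label

p_og = np.array([0.9, 0.2, 0.6])     # frozen model's pixel scores
p_supp = np.array([0.7, 0.4, 0.8])   # adapting model's pixel scores
fused, pseudo = combine_and_supervise(p_og, p_supp)
```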
Affiliation(s)
- Yizhe Zhang
- Nanjing University of Science and Technology, Jiangsu 210094, China.
- Tao Zhou
- Nanjing University of Science and Technology, Jiangsu 210094, China
- Yuhui Tao
- Nanjing University of Science and Technology, Jiangsu 210094, China
- Shuo Wang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Ye Wu
- Nanjing University of Science and Technology, Jiangsu 210094, China
- Benyuan Liu
- University of Massachusetts Lowell, MA 01854, USA
- Qiang Chen
- Nanjing University of Science and Technology, Jiangsu 210094, China
68
Li W, Huang Z, Li F, Zhao Y, Zhang H. CIFG-Net: Cross-level information fusion and guidance network for Polyp Segmentation. Comput Biol Med 2024; 169:107931. [PMID: 38181608 DOI: 10.1016/j.compbiomed.2024.107931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2023] [Revised: 12/03/2023] [Accepted: 01/01/2024] [Indexed: 01/07/2024]
Abstract
Colorectal cancer is a common malignant tumor of the digestive tract, and most colorectal cancers arise from colorectal polyp lesions. Timely detection and removal of colorectal polyps can substantially reduce the incidence of colorectal cancer, and accurate polyp segmentation provides important polyp information that aids early diagnosis and treatment. However, polyps of the same type can vary in texture, color, and even size, and some polyps are similar in color to the surrounding healthy tissue, which makes the boundary between the polyp and its surroundings unclear. To overcome the issues of inaccurate polyp localization and unclear boundary segmentation, we propose a polyp segmentation network based on cross-level information fusion and guidance. We use a Transformer encoder to extract a more robust feature representation. In addition, to refine the processing of feature information from the encoder, we propose the edge feature processing module (EFPM) and the cross-level information processing module (CIPM). The EFPM focuses on boundary information in polyp features; after processing each feature, it obtains clear and accurate polyp boundary features, which mitigates unclear boundary segmentation. The CIPM aggregates and processes the multi-scale features transmitted by the encoder layers, addressing inaccurate polyp localization by using multi-level features to obtain polyp location information. To make better use of the processed features, we also propose an information guidance module (IGM) that integrates the outputs of the EFPM and CIPM to obtain accurate positioning and segmentation of polyps. Experiments on five public polyp datasets using six metrics demonstrate that the proposed network is more robust and segments more accurately; compared with other advanced algorithms, CIFG-Net has superior performance. Code available at: https://github.com/zspnb/CIFG-Net.
Affiliation(s)
- Weisheng Li
- Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, China.
- Zhaopeng Huang
- Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, China
- Feiyan Li
- Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, China
- Yinghui Zhao
- Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, China
- Hongchuan Zhang
- Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, China
69
Tian C, Zhang Z, Gao X, Zhou H, Ran R, Jiao Z. An Implicit-Explicit Prototypical Alignment Framework for Semi-Supervised Medical Image Segmentation. IEEE J Biomed Health Inform 2024; 28:929-940. [PMID: 37930923 DOI: 10.1109/jbhi.2023.3330667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2023]
Abstract
Semi-supervised learning methods have been explored to mitigate the scarcity of pixel-level annotation in medical image segmentation tasks. Consistency learning, the mainstream approach in semi-supervised training, suffers from low efficiency and poor stability due to inaccurate supervision and insufficient feature representation. Prototypical learning is one plausible way to handle this problem because of the feature aggregation inherent in prototype calculation. However, previous works have not fully studied how to enhance supervision quality and feature representation using prototypical learning under the semi-supervised condition. To address this issue, we propose an implicit-explicit alignment (IEPAlign) framework to foster semi-supervised consistency training. Specifically, we develop an implicit prototype alignment method based on dynamic multiple prototypes computed on the fly. We then design a multiple-prediction voting strategy for reliable unlabeled mask generation and prototype calculation to improve supervision quality. Afterward, to boost the intra-class consistency and inter-class separability of pixel-wise features in semi-supervised segmentation, we construct a region-aware hierarchical prototype alignment, which transmits information from labeled to unlabeled data and from certain to uncertain regions. We evaluate IEPAlign on three medical image segmentation tasks. Extensive experimental results demonstrate that the proposed method outperforms other popular semi-supervised segmentation methods and achieves performance comparable to fully-supervised training.
70
Zhang N, Yu L, Zhang D, Wu W, Tian S, Kang X, Li M. CT-Net: Asymmetric compound branch Transformer for medical image segmentation. Neural Netw 2024; 170:298-311. [PMID: 38006733 DOI: 10.1016/j.neunet.2023.11.034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2022] [Revised: 09/08/2023] [Accepted: 11/13/2023] [Indexed: 11/27/2023]
Abstract
The Transformer architecture has been widely applied to image segmentation due to its powerful ability to capture long-range dependencies. However, its ability to capture local features is relatively weak, and it requires a large amount of training data. Medical image segmentation tasks, on the other hand, place high demands on local features and often involve small datasets, so existing Transformer networks show a significant decrease in performance when applied directly to this task. To address these issues, we have designed a new medical image segmentation architecture called CT-Net. It effectively extracts local and global representations using an asymmetric asynchronous branch parallel structure, while reducing unnecessary computational cost. In addition, we propose a high-density information fusion strategy that efficiently fuses the features of the two branches using a fusion module of only 0.05M parameters. This strategy ensures high portability and provides the conditions for directly applying transfer learning to solve dataset dependency issues. Finally, we have designed a parameter-adjustable multi-perceptive loss function for this architecture to optimize the training process from both pixel-level and global perspectives. Tested on 5 different tasks with 9 datasets, CT-Net improves IoU by 7.3% and 1.8% on the GlaS and MoNuSeg datasets respectively compared to SwinUNet, and improves average DSC on the Synapse dataset by 3.5%.
Affiliation(s)
- Ning Zhang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Long Yu
- College of Information Science and Engineering, Xinjiang University, Urumqi 830000, China; College of Network Center, Xinjiang University, Urumqi 830000, China.
- Dezhi Zhang
- People's Hospital of Xinjiang Uygur Autonomous Region, Xinjiang University, China
- Weidong Wu
- People's Hospital of Xinjiang Uygur Autonomous Region, Xinjiang University, China
- Shengwei Tian
- College of Software, Xinjiang University, Urumqi 830000, China
- Xiaojing Kang
- People's Hospital of Xinjiang Uygur Autonomous Region, Xinjiang University, China
- Min Li
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
71
Fan K, Xu C, Cao X, Jiao K, Mo W. Tri-branch feature pyramid network based on federated particle swarm optimization for polyp segmentation. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2024; 21:1610-1624. [PMID: 38303480 DOI: 10.3934/mbe.2024070] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/03/2024]
Abstract
Deep learning technology has shown considerable potential in various domains. However, due to privacy issues associated with medical data, legal and ethical constraints often result in smaller datasets, which hinders the applicability of deep learning in medical image processing. To address this challenge, we propose the Federated Particle Swarm Optimization algorithm, designed to increase the efficiency of decentralized data utilization in federated learning and to protect privacy during model training. To stabilize the federated learning process, we introduce the Tri-branch feature pyramid network (TFPNet), a multi-branch structure model. TFPNet mitigates instability during aggregation-model deployment and ensures fast convergence through its multi-branch structure. We conducted experiments on four different public datasets: CVC-ClinicDB, Kvasir, CVC-ColonDB and ETIS-LaribPolypDB. The results show that the Federated Particle Swarm Optimization algorithm outperforms single-dataset training and the Federated Averaging algorithm when using independent scattered data, and that TFPNet converges faster and achieves superior segmentation accuracy compared to other models.
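For context on the Federated Averaging baseline that the proposed Federated Particle Swarm Optimization is compared against, the standard FedAvg aggregation step can be sketched in a few lines. This is a generic textbook illustration, not the paper's code; the function name and the toy one-layer "models" are our own.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Aggregate client model weights as in Federated Averaging.

    Each client trains locally on its private data; the server
    averages the weight tensors layer by layer, weighting each
    client by its dataset size, and redeploys the aggregate. No raw
    images ever leave the clients, which is the privacy motivation
    cited above.
    """
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()            # per-client mixing weights
    n_layers = len(client_weights[0])
    return [
        sum(c * w[i] for c, w in zip(coeffs, client_weights))
        for i in range(n_layers)
    ]

# two clients, a one-layer "model" each; client 0 holds 3x more data
w_a = [np.array([1.0, 2.0])]
w_b = [np.array([3.0, 6.0])]
agg = federated_average([w_a, w_b], client_sizes=[75, 25])
```

The paper's contribution replaces this simple size-weighted average with a particle-swarm-guided aggregation, which this sketch does not attempt to reproduce.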
Affiliation(s)
- Kefeng Fan
- China Electronics Standardization Institute, Beijing 100007, China
- Cun Xu
- School of Electronic and Automation, Guilin University of Electronic Technology, Guilin 541004, China
- Xuguang Cao
- School of Electronic and Automation, Guilin University of Electronic Technology, Guilin 541004, China
- Kaijie Jiao
- School of Electronic and Automation, Guilin University of Electronic Technology, Guilin 541004, China
- Wei Mo
- School of Electronic and Automation, Guilin University of Electronic Technology, Guilin 541004, China
72
Sharma P, Nayak DR, Balabantaray BK, Tanveer M, Nayak R. A survey on cancer detection via convolutional neural networks: Current challenges and future directions. Neural Netw 2024; 169:637-659. [PMID: 37972509 DOI: 10.1016/j.neunet.2023.11.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2023] [Revised: 10/21/2023] [Accepted: 11/04/2023] [Indexed: 11/19/2023]
Abstract
Cancer is a condition in which abnormal cells divide uncontrollably and damage body tissues, so detecting cancer at an early stage is highly essential. Currently, medical images play an indispensable role in detecting various cancers; however, manual interpretation of these images by radiologists is observer-dependent, time-consuming, and tedious. An automatic decision-making process is thus an essential need for cancer detection and diagnosis. This paper presents a comprehensive survey on automated cancer detection in various human body organs, namely, the breast, lung, liver, prostate, brain, skin, and colon, using convolutional neural networks (CNNs) and medical imaging techniques. It also includes a brief discussion of state-of-the-art deep learning based cancer detection methods, their outcomes, and the medical imaging data used. Finally, the datasets used for cancer detection, the limitations of existing solutions, and future trends and challenges in this domain are discussed. The utmost goal of this paper is to provide comprehensive and insightful information to researchers interested in developing CNN-based models for cancer detection.
Affiliation(s)
- Pallabi Sharma
- School of Computer Science, UPES, Dehradun, 248007, Uttarakhand, India.
- Deepak Ranjan Nayak
- Department of Computer Science and Engineering, Malaviya National Institute of Technology, Jaipur, 302017, Rajasthan, India.
- Bunil Kumar Balabantaray
- Computer Science and Engineering, National Institute of Technology Meghalaya, Shillong, 793003, Meghalaya, India.
- M Tanveer
- Department of Mathematics, Indian Institute of Technology Indore, Simrol, 453552, Indore, India.
- Rajashree Nayak
- School of Applied Sciences, Birla Global University, Bhubaneswar, 751029, Odisha, India.
73
Zhang W, Lu F, Su H, Hu Y. Dual-branch multi-information aggregation network with transformer and convolution for polyp segmentation. Comput Biol Med 2024; 168:107760. [PMID: 38064849 DOI: 10.1016/j.compbiomed.2023.107760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2023] [Revised: 10/21/2023] [Accepted: 11/21/2023] [Indexed: 01/10/2024]
Abstract
Computer-Aided Diagnosis (CAD) for polyp detection is one of the most notable showcases of deep learning in medicine: with deep learning technologies, the accuracy of polyp segmentation is beginning to surpass that of human experts. A critical step in such a CAD pipeline is segmenting colorectal polyps from colonoscopy images. Despite remarkable successes attained by recent deep learning works, much improvement is still needed for challenging cases. For instance, motion blur and light reflection can introduce significant noise into the image, and polyps of the same type vary widely in size, color, and texture. To address such challenges, this paper proposes a novel dual-branch multi-information aggregation network (DBMIA-Net) for polyp segmentation, which accurately, reliably, and efficiently segments a variety of colorectal polyps. Specifically, a dual-branch encoder with transformer and convolutional neural network (CNN) branches extracts polyp features, and two multi-information aggregation modules in the decoder fuse multi-scale features adaptively: a global information aggregation (GIA) module and an edge information aggregation (EIA) module. In addition, to enhance the representation learning capability of potential channel feature associations, this paper also proposes a novel adaptive channel graph convolution (ACGC). To validate the effectiveness and advantages of the proposed network, we compare it with several state-of-the-art (SOTA) methods on five public datasets. Experimental results consistently demonstrate that DBMIA-Net obtains significantly superior segmentation performance across six popular evaluation metrics. In particular, we achieve 94.12% mean Dice on the CVC-ClinicDB dataset, a 4.22% improvement over the previous state-of-the-art method PraNet. Compared with SOTA algorithms, DBMIA-Net has a better fitting ability and stronger generalization ability.
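As a loose illustration of the dual-branch idea described in this abstract (not the authors' DBMIA-Net code), the NumPy sketch below blends a CNN branch's local features with a transformer branch's global features through a learned per-channel gate; the function name, `fuse_logits`, and the array shapes are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dual_branch_fuse(cnn_feat, trans_feat, fuse_logits):
    """Adaptively blend local (CNN) and global (transformer) features.

    cnn_feat, trans_feat: (H, W, C) feature maps from the two branches.
    fuse_logits: (C,) learned per-channel logits (a hypothetical stand-in
    for the network's learned aggregation weights).
    """
    w = sigmoid(fuse_logits)  # per-channel blend weight in (0, 1)
    return w[None, None, :] * cnn_feat + (1.0 - w)[None, None, :] * trans_feat
```

With zero logits the gate sits at 0.5, so the two branches contribute equally; training would push each channel toward the branch that helps it most.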
Affiliation(s)
- Wenyu Zhang
- School of Information Science and Engineering, Lanzhou University, China
- Fuxiang Lu
- School of Information Science and Engineering, Lanzhou University, China
- Hongjing Su
- School of Information Science and Engineering, Lanzhou University, China
- Yawen Hu
- School of Information Science and Engineering, Lanzhou University, China
74
Wu L, Gao X, Hu Z, Zhang S. Pattern-Aware Transformer: Hierarchical Pattern Propagation in Sequential Medical Images. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:405-415. [PMID: 37594875 DOI: 10.1109/tmi.2023.3306468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/20/2023]
Abstract
This paper investigates how to effectively mine contextual information among sequential images and jointly model them in medical imaging tasks. Different from state-of-the-art methods that model sequential correlations via point-wise token encoding, this paper develops a novel hierarchical pattern-aware tokenization strategy. It handles distinct visual patterns independently and hierarchically, which not only ensures the full flexibility of attention aggregation under different pattern representations but also preserves both local and global information simultaneously. Based on this strategy, we propose a Pattern-Aware Transformer (PATrans) featuring a global-local dual-path pattern-aware cross-attention mechanism to achieve hierarchical pattern matching and propagation among sequential images. Furthermore, PATrans is plug-and-play and can be seamlessly integrated into various backbone networks for diverse downstream sequence modeling tasks. We demonstrate its general application paradigm across four domains and five benchmarks, spanning video object detection and 3D volumetric semantic segmentation. Impressively, PATrans sets a new state-of-the-art across all these benchmarks, i.e., CVC-Video (92.3% detection F1), ASU-Mayo (99.1% localization F1), Lung Tumor (78.59% DSC), Nasopharynx Tumor (75.50% DSC), and Kidney Tumor (87.53% DSC). Codes and models are available at https://github.com/GGaoxiang/PATrans.
75
Dao HV, Nguyen BP, Nguyen TT, Lam HN, Nguyen TTH, Dang TT, Hoang LB, Le HQ, Dao LV. Application of artificial intelligence in gastrointestinal endoscopy in Vietnam: a narrative review. Ther Adv Gastrointest Endosc 2024; 17:26317745241306562. [PMID: 39734422 PMCID: PMC11672465 DOI: 10.1177/26317745241306562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/16/2024] [Accepted: 11/25/2024] [Indexed: 12/31/2024] Open
Abstract
The utilization of artificial intelligence (AI) in gastrointestinal (GI) endoscopy has witnessed significant progress and promising results worldwide in recent years. From 2019 to 2023, the European Society of Gastrointestinal Endoscopy released multiple guidelines and consensus statements with recommendations on integrating AI for detecting and classifying lesions in practical endoscopy. In Vietnam, since 2019, several preliminary studies have been conducted to develop AI algorithms for GI endoscopy, focusing on lesion detection. These studies have yielded high accuracy, ranging from 86% to 92%. For upper GI endoscopy, ongoing research directions comprise image quality assessment, detection of anatomical landmarks, simulation of image-enhanced endoscopy, and semi-automated tools supporting the delineation of GI lesions on endoscopic images. For lower GI endoscopy, most studies focus on developing AI algorithms for the detection of colorectal polyps and their classification by risk of malignancy. In conclusion, the application of AI in this field represents a promising research direction, presenting both challenges and opportunities for real-world implementation within the Vietnamese healthcare context.
Affiliation(s)
- Hang Viet Dao
- Research and Education Department, Institute of Gastroenterology and Hepatology, 09 Dao Duy Anh Street, Dong Da District, Hanoi City, Vietnam
- Department of Internal Medicine, Hanoi Medical University, Hanoi, Vietnam
- Endoscopy Center, Hanoi Medical University Hospital, Hanoi, Vietnam
- Hoa Ngoc Lam
- Institute of Gastroenterology and Hepatology, Hanoi, Vietnam
- Thao Thi Dang
- Institute of Gastroenterology and Hepatology, Hanoi, Vietnam
- Long Bao Hoang
- Institute of Gastroenterology and Hepatology, Hanoi, Vietnam
- Hung Quang Le
- Endoscopy Center, Hanoi Medical University Hospital, Hanoi, Vietnam
- Long Van Dao
- Department of Internal Medicine, Hanoi Medical University, Hanoi, Vietnam
- Endoscopy Center, Hanoi Medical University Hospital, Hanoi, Vietnam
- Institute of Gastroenterology and Hepatology, Hanoi, Vietnam
76
Jain S, Atale R, Gupta A, Mishra U, Seal A, Ojha A, Jaworek-Korjakowska J, Krejcar O. CoInNet: A Convolution-Involution Network With a Novel Statistical Attention for Automatic Polyp Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3987-4000. [PMID: 37768798 DOI: 10.1109/tmi.2023.3320151] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/30/2023]
Abstract
Polyps are very common abnormalities in human gastrointestinal regions, and their early diagnosis may help reduce the risk of colorectal cancer. Vision-based computer-aided diagnostic systems automatically identify polyp regions to assist surgeons in their removal. Due to their varying shape, color, size, and texture, and their unclear boundaries, polyp segmentation in images is a challenging problem. Existing deep learning segmentation models mostly rely on convolutional neural networks, which have certain limitations in learning the diversity of visual patterns at different spatial locations and fail to capture inter-feature dependencies. Vision transformer models have also been deployed for polyp segmentation due to their powerful global feature extraction capabilities, but they too must be supplemented by convolution layers to learn contextual local information. In the present paper, a polyp segmentation model, CoInNet, is proposed with a novel feature extraction mechanism that leverages the strengths of convolution and involution operations and learns to highlight polyp regions in images by considering the relationships between different feature maps through a statistical feature attention unit. To further aid the network in learning polyp boundaries, an anomaly boundary approximation module is introduced that uses recursively fed feature fusion to refine segmentation results. Remarkably, even tiny polyps covering only 0.01% of an image area are precisely segmented by CoInNet. This is crucial for clinical applications, as small polyps can easily be overlooked even in manual examination due to the voluminous size of wireless capsule endoscopy videos. CoInNet outperforms thirteen state-of-the-art methods on five benchmark polyp segmentation datasets.
77
Xu S, Duan L, Zhang Y, Zhang Z, Sun T, Tian L. Graph- and transformer-guided boundary aware network for medical image segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 242:107849. [PMID: 37837887 DOI: 10.1016/j.cmpb.2023.107849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/28/2023] [Revised: 09/29/2023] [Accepted: 10/06/2023] [Indexed: 10/16/2023]
Abstract
BACKGROUND AND OBJECTIVE Despite the considerable progress achieved by U-Net-based models, medical image segmentation remains a challenging task due to complex backgrounds, irrelevant noises, and ambiguous boundaries. In this study, we present a novel approach called U-shaped Graph- and Transformer-guided Boundary Aware Network (GTBA-Net) to tackle these challenges. METHODS GTBA-Net uses the pre-trained ResNet34 as its basic structure, and involves Global Feature Aggregation (GFA) modules for target localization, Graph-based Dynamic Feature Fusion (GDFF) modules for effective noise suppression, and Uncertainty-based Boundary Refinement (UBR) modules for accurate delineation of ambiguous boundaries. The GFA modules employ an efficient self-attention mechanism to facilitate coarse target localization amidst complex backgrounds, without introducing additional computational complexity. The GDFF modules leverage graph attention mechanism to aggregate information hidden among high- and low-level features, effectively suppressing target-irrelevant noises while preserving valuable spatial details. The UBR modules introduce an uncertainty quantification strategy and auxiliary loss to guide the model's focus towards target regions and uncertain "ridges", gradually mitigating boundary uncertainty and ultimately achieving accurate boundary delineation. RESULTS Comparative experiments on five datasets encompassing diverse modalities (including X-ray, CT, endoscopic procedures, and ultrasound) demonstrate that the proposed GTBA-Net outperforms existing methods in various challenging scenarios. Subsequent ablation studies further demonstrate the efficacy of the GFA, GDFF, and UBR modules in target localization, noise suppression, and ambiguous boundary delineation, respectively. 
CONCLUSIONS GTBA-Net exhibits substantial potential for extensive application in the field of medical image segmentation, particularly in scenarios involving complex backgrounds, target-irrelevant noises, or ambiguous boundaries.
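The uncertainty idea behind the UBR module can be approximated generically with the per-pixel entropy of the predicted foreground probability, which peaks where the prediction is near 0.5, i.e. along ambiguous boundary "ridges". The NumPy sketch below is a stand-in for the paper's (unspecified) quantification strategy, not its actual implementation.

```python
import numpy as np

def boundary_uncertainty(prob, eps=1e-7):
    """Per-pixel binary entropy of a foreground-probability map.

    prob: array of foreground probabilities in [0, 1]. The result is
    maximal (ln 2) where prob == 0.5 and falls toward 0 at confident
    pixels, so high values trace ambiguous boundaries.
    """
    p = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
```

A refinement stage could then weight its loss by this map so training concentrates on the uncertain ridge pixels.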
Affiliation(s)
- Shanshan Xu
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China; Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
- Lianhong Duan
- The Second School of Clinical Medicine, Southern Medical University, Guangzhou, China; Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, Beijing, China
- Yang Zhang
- Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, Beijing, China
- Zhicheng Zhang
- Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, Beijing, China
- Tiansheng Sun
- The Second School of Clinical Medicine, Southern Medical University, Guangzhou, China; Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, Beijing, China
- Lixia Tian
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
78
Samarasena J, Yang D, Berzin TM. AGA Clinical Practice Update on the Role of Artificial Intelligence in Colon Polyp Diagnosis and Management: Commentary. Gastroenterology 2023; 165:1568-1573. [PMID: 37855759 DOI: 10.1053/j.gastro.2023.07.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/23/2023] [Revised: 06/06/2023] [Accepted: 07/17/2023] [Indexed: 10/20/2023]
Abstract
DESCRIPTION The purpose of this American Gastroenterological Association (AGA) Institute Clinical Practice Update (CPU) is to review the available evidence and provide expert commentary on the current landscape of artificial intelligence in the evaluation and management of colorectal polyps. METHODS This CPU was commissioned and approved by the AGA Institute Clinical Practice Updates Committee (CPUC) and the AGA Governing Board to provide timely guidance on a topic of high clinical importance to the AGA membership and underwent internal peer review by the CPUC and external peer review through standard procedures of Gastroenterology. This Expert Commentary incorporates important as well as recently published studies in this field, and it reflects the experiences of the authors who are experienced endoscopists with expertise in the field of artificial intelligence and colorectal polyps.
Affiliation(s)
- Jason Samarasena
- Division of Gastroenterology, University of California Irvine, Orange, California
- Dennis Yang
- Center for Interventional Endoscopy, AdventHealth, Orlando, Florida
- Tyler M Berzin
- Center for Advanced Endoscopy, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, Massachusetts
79
Chen W, Zhang R, Zhang Y, Bao F, Lv H, Li L, Zhang C. Pact-Net: Parallel CNNs and Transformers for medical image segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 242:107782. [PMID: 37690317 DOI: 10.1016/j.cmpb.2023.107782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/18/2023] [Revised: 07/20/2023] [Accepted: 08/28/2023] [Indexed: 09/12/2023]
Abstract
BACKGROUND AND OBJECTIVE Segmenting diseased regions in medical images supports clinical diagnosis and treatment. Because medical images usually have low contrast and some structures vary greatly in size and shape, models are prone to over-segmentation and under-segmentation. These problems are particularly evident in the segmentation of skin damage, where blurred boundaries and patient-specific variation further increase the difficulty of skin lesion segmentation. Currently, most researchers use deep learning networks to solve these problems. However, traditional convolution methods often fail to obtain satisfactory segmentation performance due to their shortcomings in capturing global features. Recently, Transformers, with their strong global information extraction ability, have achieved satisfactory results in computer vision, which brings new options for further optimizing medical image segmentation models. METHODS To extract more features relevant to medical image segmentation and use them to further optimize the skin image segmentation model, we designed a network that combines CNNs and Transformers to improve local and global features, called Parallel CNNs and Transformers for Medical Image Segmentation (Pact-Net). Specifically, exploiting the advantages of Transformers in extracting global information, we create a novel fusion module, CSMF, which uses channel and spatial attention mechanisms and a multi-scale mechanism to effectively fuse the global information extracted by Transformers into the local features extracted by CNNs. The two branches of Pact-Net therefore run in parallel to effectively capture global and local information. RESULTS Pact-Net exceeds the models submitted to the three challenges ISIC 2016, ISIC 2017, and ISIC 2018, with the metrics required by the datasets reaching 86.95%, 79.31%, and 84.14%, respectively. We also conducted medical image segmentation experiments on cell and polyp datasets to evaluate the robustness, learning, and generalization ability of the network. An ablation study of each part of Pact-Net confirms the validity of each component, and comparison with state-of-the-art methods on different metrics confirms the superiority of the network. CONCLUSIONS This paper exploits the advantages of CNNs and Transformers in extracting local and global features, and further integrates these features for skin lesion segmentation. Compared with state-of-the-art methods, Pact-Net achieves the most advanced segmentation performance on the skin lesion segmentation datasets, which can help doctors diagnose and treat diseases.
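The abstract's CSMF module fuses transformer global information into CNN local features via channel and spatial attention. The sketch below is a minimal, hypothetical rendering of that pattern in NumPy (the real module also has multi-scale branches, which are omitted here): channels are gated by a spatially pooled global descriptor, then positions are gated by a channel-pooled spatial map.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def csmf_fuse(local_feat, global_feat):
    """Gate CNN local features with attention derived from global features.

    local_feat, global_feat: (H, W, C) maps from the CNN and transformer
    branches. This is an illustrative simplification, not the paper's code.
    """
    # Channel attention: pool the global map spatially, gate each channel.
    ch = sigmoid(global_feat.mean(axis=(0, 1)))        # (C,)
    gated = local_feat * ch[None, None, :]
    # Spatial attention: pool over channels, gate each position.
    sp = sigmoid(global_feat.mean(axis=2))             # (H, W)
    return gated * sp[..., None]
```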
Affiliation(s)
- Weilin Chen
- School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, Shandong, 250014, China
- Rui Zhang
- School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, Shandong, 250014, China
- Yunfeng Zhang
- School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, Shandong, 250014, China
- Fangxun Bao
- School of Mathematics, Shandong University, Jinan, Shandong, 250100, China
- Haixia Lv
- School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, Shandong, 250014, China
- Longhao Li
- School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, Shandong, 250014, China
- Caiming Zhang
- School of Software, Shandong University, Jinan, Shandong, 250101, China; Shandong Co-Innovation Center of Future Intelligent Computing, Yantai, Shandong, 264025, China
80
Zhu S, Gao J, Liu L, Yin M, Lin J, Xu C, Xu C, Zhu J. Public Imaging Datasets of Gastrointestinal Endoscopy for Artificial Intelligence: a Review. J Digit Imaging 2023; 36:2578-2601. [PMID: 37735308 PMCID: PMC10584770 DOI: 10.1007/s10278-023-00844-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2023] [Revised: 05/03/2023] [Accepted: 05/03/2023] [Indexed: 09/23/2023] Open
Abstract
With the advances in endoscopic technologies and artificial intelligence, a large number of endoscopic imaging datasets have been made public to researchers around the world. This study aims to review and introduce these datasets. An extensive literature search was conducted to identify appropriate datasets in PubMed, and other targeted searches were conducted in GitHub, Kaggle, and Simula to identify datasets directly. We provided a brief introduction to each dataset and evaluated the characteristics of the datasets included. Moreover, two national datasets in progress were discussed. A total of 40 datasets of endoscopic images were included, of which 34 were accessible for use. Basic and detailed information on each dataset was reported. Of all the datasets, 16 focus on polyps, and 6 focus on small bowel lesions. Most datasets (n = 16) were constructed by colonoscopy only, followed by normal gastrointestinal endoscopy and capsule endoscopy (n = 9). This review may facilitate the usage of public dataset resources in endoscopic research.
Affiliation(s)
- Shiqi Zhu
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China
- Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Jingwen Gao
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China
- Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Lu Liu
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China
- Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Minyue Yin
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China
- Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Jiaxi Lin
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China
- Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Chang Xu
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China
- Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Chunfang Xu
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China
- Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Jinzhou Zhu
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China
- Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
81
Mu N, Guo J, Wang R. Automated polyp segmentation based on a multi-distance feature dissimilarity-guided fully convolutional network. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:20116-20134. [PMID: 38052639 DOI: 10.3934/mbe.2023891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/07/2023]
Abstract
Colorectal malignancies often arise from adenomatous polyps, which typically begin as solitary, asymptomatic growths before progressing to malignancy. Colonoscopy is widely recognized as a highly efficacious clinical polyp detection method, offering valuable visual data that facilitates precise identification and subsequent removal of these tumors. Nevertheless, accurately segmenting individual polyps poses a considerable difficulty because polyps exhibit intricate and changeable characteristics, including shape, size, color, quantity and growth context during different stages. The presence of similar contextual structures around polyps significantly hampers the performance of commonly used convolutional neural network (CNN)-based automatic detection models to accurately capture valid polyp features, and these large receptive field CNN models often overlook the details of small polyps, which leads to the occurrence of false detections and missed detections. To tackle these challenges, we introduce a novel approach for automatic polyp segmentation, known as the multi-distance feature dissimilarity-guided fully convolutional network. This approach comprises three essential components, i.e., an encoder-decoder, a multi-distance difference (MDD) module and a hybrid loss (HL) module. Specifically, the MDD module primarily employs a multi-layer feature subtraction (MLFS) strategy to propagate features from the encoder to the decoder, which focuses on extracting information differences between neighboring layers' features at short distances, and both short and long-distance feature differences across layers. Drawing inspiration from pyramids, the MDD module effectively acquires discriminative features from neighboring layers or across layers in a continuous manner, which helps to strengthen feature complementary across different layers. The HL module is responsible for supervising the feature maps extracted at each layer of the network to improve prediction accuracy. 
Our experimental results on four challenging datasets demonstrate that the proposed approach achieves superior automatic polyp segmentation performance on six evaluation criteria compared with five current state-of-the-art approaches.
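The multi-layer feature subtraction (MLFS) strategy described above computes information differences between features at short (adjacent-layer) and long (across-layer) distances. The NumPy sketch below illustrates that idea under simplifying assumptions (nearest-neighbor 2x upsampling, each level exactly twice the resolution of the previous one); it is a loose rendering, not the paper's MDD module.

```python
import numpy as np

def upsample2x_nearest(x):
    """Nearest-neighbor 2x upsampling of an (H, W, C) map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def multi_distance_differences(feats):
    """Absolute feature differences at short and long layer distances.

    feats: list of (H_i, W_i, C) maps ordered deep -> shallow, each level
    twice the resolution of the previous. Returns |difference| maps for
    adjacent pairs (short distance) and skip-one pairs (long distance).
    """
    diffs = []
    for i in range(len(feats) - 1):                      # short distance
        diffs.append(np.abs(feats[i + 1] - upsample2x_nearest(feats[i])))
    for i in range(len(feats) - 2):                      # long distance
        up = upsample2x_nearest(upsample2x_nearest(feats[i]))
        diffs.append(np.abs(feats[i + 2] - up))
    return diffs
```

The difference maps highlight where a shallower layer carries detail the deeper layer missed, which is the complementary signal the decoder aggregates.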
Affiliation(s)
- Nan Mu
- College of Computer Science, Sichuan Normal University, Chengdu 610101, China
- Visual Computing and Virtual Reality Key Laboratory of Sichuan, Sichuan Normal University, Chengdu 610068, China
- Education Big Data Collaborative Innovation Center of Sichuan 2011, Chengdu 610101, China
- Jinjia Guo
- Chongqing University-University of Cincinnati Joint Co-op Institution, Chongqing University, Chongqing 400044, China
- Rong Wang
- College of Computer Science, Sichuan Normal University, Chengdu 610101, China
- Visual Computing and Virtual Reality Key Laboratory of Sichuan, Sichuan Normal University, Chengdu 610068, China
- Education Big Data Collaborative Innovation Center of Sichuan 2011, Chengdu 610101, China
82
Chen F, Ma H, Zhang W. SegT: Separated edge-guidance transformer network for polyp segmentation. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:17803-17821. [PMID: 38052537 DOI: 10.3934/mbe.2023791] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/07/2023]
Abstract
Accurate segmentation of colonoscopic polyps is considered a fundamental step in medical image analysis and surgical interventions. Many recent studies have made improvements based on the encoder-decoder framework, which can effectively segment diverse polyps. Such improvements mainly aim to enhance local features by using global features and applying attention methods. However, relying only on the global information of the final encoder block can result in losing local regional features from the intermediate layers. In addition, determining the edges between benign regions and polyps can be challenging. To address the aforementioned issues, we propose a novel separated edge-guidance transformer (SegT) network that aims to build an effective polyp segmentation model. A transformer encoder that learns a more robust representation than existing convolutional neural network-based approaches was specifically applied. To determine the precise segmentation of polyps, we utilize a separated edge-guidance module consisting of separator and edge-guidance blocks. The separator block is a two-stream operator that highlights edges between the background and foreground, whereas the edge-guidance block lies behind both streams to strengthen the understanding of the edge. Lastly, an innovative cascade fusion module fuses the refined multi-level features. To evaluate the effectiveness of SegT, we conducted experiments with five challenging public datasets, and the proposed model achieved state-of-the-art performance.
Affiliation(s)
- Feiyu Chen
- Department of Mathematics, Physics and Information Sciences, Shaoxing University, Shaoxing, China
- Haiping Ma
- Department of Mathematics, Physics and Information Sciences, Shaoxing University, Shaoxing, China
- Weijia Zhang
- Department of Mathematics, Physics and Information Sciences, Shaoxing University, Shaoxing, China
- Department of AOP Physics, Visiting Scholar, University of Oxford, Oxford, United Kingdom
83
Lee GE, Cho J, Choi SI. Shallow and reverse attention network for colon polyp segmentation. Sci Rep 2023; 13:15243. [PMID: 37709828 PMCID: PMC10502036 DOI: 10.1038/s41598-023-42436-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2023] [Accepted: 09/10/2023] [Indexed: 09/16/2023] Open
Abstract
Polyp segmentation is challenging because the boundary between polyps and mucosa is ambiguous. Several models have considered using attention mechanisms to solve this problem, but they use only the limited information obtained from a single type of attention. We propose a new dual-attention network for colon polyp segmentation, called SRaNet, based on shallow and reverse attention modules. The shallow attention mechanism removes background noise while emphasizing local detail by focusing on the foreground. In contrast, reverse attention helps distinguish the boundary between polyps and mucous membranes more clearly by focusing on the background. The two attention mechanisms are adaptively fused using a "Softmax Gate". Combining the two types of attention enables the model to capture complementary foreground and boundary features, so the proposed model predicts the boundaries of polyps more accurately than other models. We present the results of extensive experiments on polyp benchmarks to show that the proposed method outperforms existing models on both seen and unseen data. Furthermore, the results show that the proposed dual attention module increases the explainability of the model.
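The shallow/reverse attention pairing with a softmax gate can be sketched as follows, assuming a sigmoid foreground map from a coarse prediction; the reverse map is simply its complement, and a 2-way softmax blends the two. The gate logits and shapes here are illustrative assumptions, not SRaNet's actual parameterization.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dual_attention_fuse(feat, coarse_pred, gate_logits):
    """Fuse shallow (foreground) and reverse (background) attention maps.

    feat: (H, W, C) feature map; coarse_pred: (H, W) coarse logits;
    gate_logits: (2,) learned gate scores (hypothetical stand-in for the
    paper's "Softmax Gate").
    """
    fg = sigmoid(coarse_pred)     # shallow attention: foreground emphasis
    rev = 1.0 - fg                # reverse attention: background emphasis
    gate = np.exp(gate_logits) / np.exp(gate_logits).sum()  # 2-way softmax
    attn = gate[0] * fg + gate[1] * rev
    return feat * attn[..., None]
```

Because the gate is learned, the network can lean on foreground attention inside polyps and on reverse attention near ambiguous boundaries.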
Affiliation(s)
- Go-Eun Lee
- Department of Computer Science and Engineering, Dankook University, Yongin, 16890, South Korea
- Jungchan Cho
- School of Computing, Gachon University, Seongnam, 13120, South Korea
- Sang-Il Choi
- Department of Computer Science and Engineering, Dankook University, Yongin, 16890, South Korea
84
Shukla S, Birla L, Gupta AK, Gupta P. Trustworthy Medical Image Segmentation with improved performance for in-distribution samples. Neural Netw 2023; 166:127-136. [PMID: 37487410 DOI: 10.1016/j.neunet.2023.06.047] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2023] [Revised: 06/13/2023] [Accepted: 06/30/2023] [Indexed: 07/26/2023]
Abstract
Despite the enormous achievements of Deep Learning (DL) based models, their non-transparent nature has led to restricted applicability and distrusted predictions. Such predictions emerge from erroneous In-Distribution (ID) and Out-Of-Distribution (OOD) samples, with disastrous effects in the medical domain, specifically in Medical Image Segmentation (MIS). To mitigate such effects, several existing works detect OOD samples; however, trustworthiness issues arising from ID samples still require thorough investigation. To this end, this paper proposes a novel method, TrustMIS (Trustworthy Medical Image Segmentation), which provides trustworthiness estimates and improved ID-sample performance for DL-based MIS models. TrustMIS works in three stages: IT (Investigating Trustworthiness), INT (Improving Non-Trustworthy prediction), and CSO (Classifier Switching Operation). Initially, the IT method investigates the trustworthiness of MIS by leveraging similar characteristics and consistency analysis of an input and its variants. Subsequently, the INT method employs the IT method to improve the performance of the MIS model, leveraging the observation that an input yielding erroneous segmentation can be segmented correctly when rotated. Eventually, the CSO method employs the INT method to scrutinise several MIS models and selects the model that delivers the most trustworthy prediction. Experiments conducted on publicly available datasets using well-known MIS models reveal that TrustMIS successfully provides a trustworthiness measure, outperforms existing methods, and improves the performance of state-of-the-art MIS models. Our implementation is available at https://github.com/SnehaShukla937/TrustMIS.
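The consistency analysis of an input and its rotated variants can be approximated generically as below: segment each rotated copy, rotate the mask back, and score agreement with the original prediction via Dice. This is a sketch of the idea, with the 90-degree rotation set and mean-Dice aggregation as assumptions, not the paper's exact IT method.

```python
import numpy as np

def dice(a, b, eps=1e-7):
    """Dice overlap between two boolean masks."""
    inter = np.logical_and(a, b).sum()
    return (2.0 * inter + eps) / (a.sum() + b.sum() + eps)

def rotation_consistency(segment_fn, image, k_list=(1, 2, 3)):
    """Trustworthiness proxy: agreement of predictions under rotation.

    segment_fn: callable mapping a 2D image to a boolean mask.
    Returns the mean Dice between the original prediction and the
    back-rotated predictions on 90*k degree rotated copies.
    """
    base = segment_fn(image)
    scores = []
    for k in k_list:
        rot_pred = segment_fn(np.rot90(image, k))
        scores.append(dice(base, np.rot90(rot_pred, -k)))  # undo rotation
    return float(np.mean(scores))
```

A perfectly rotation-equivariant segmenter scores 1.0; low scores flag predictions that should not be trusted without a second look.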
Affiliation(s)
- Sneha Shukla
- Indian Institute of Technology Indore, Indore, India
- Puneet Gupta
- Indian Institute of Technology Indore, Indore, India
85
Yang L, Zhai C, Liu Y, Yu H. CFHA-Net: A polyp segmentation method with cross-scale fusion strategy and hybrid attention. Comput Biol Med 2023; 164:107301. [PMID: 37573723 DOI: 10.1016/j.compbiomed.2023.107301] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2023] [Revised: 07/10/2023] [Accepted: 07/28/2023] [Indexed: 08/15/2023]
Abstract
Colorectal cancer is a prevalent disease in modern times, with most cases caused by polyps; the segmentation of polyps has therefore garnered significant attention in the field of medical image segmentation. In recent years, variant networks derived from U-Net have demonstrated good results on polyp segmentation challenges. In this paper, a polyp segmentation model called CFHA-Net is proposed that combines a cross-scale feature fusion strategy and a hybrid attention mechanism. Inspired by feature learning, the encoder unit incorporates a cross-scale context fusion (CCF) module that performs cross-layer feature fusion and enhances feature information at different scales. The skip connection is optimized by the proposed triple hybrid attention (THA) module, which aggregates spatial and channel attention features from three directions to improve long-range dependence between features and help identify subsequent polyp lesion boundaries. Additionally, a dense-receptive feature fusion (DFF) module, which combines dense connections and multi-receptive-field fusion, is added at the bottleneck layer to capture more comprehensive context information. Furthermore, a hybrid pooling (HP) module and a hybrid upsampling (HU) module are proposed to help the segmentation network acquire more contextual features. A series of experiments on three typical polyp segmentation datasets (CVC-ClinicDB, Kvasir-SEG, EndoTect) evaluates the effectiveness and generalization of the proposed CFHA-Net. The experimental results demonstrate the validity and generalization of the proposed method, with many performance metrics surpassing those of related advanced segmentation networks. The proposed CFHA-Net could therefore be a promising solution to the challenges of polyp segmentation in medical image analysis. The source code is available at https://github.com/CXzhai/CFHA-Net.git.
Affiliation(s)
- Lei Yang
- School of Electrical and Information Engineering, Zhengzhou University, Henan Province, 450001, China; Robot Perception and Control Engineering Laboratory of Henan Province, 450001, China
- Chenxu Zhai
- School of Electrical and Information Engineering, Zhengzhou University, Henan Province, 450001, China; Robot Perception and Control Engineering Laboratory of Henan Province, 450001, China
- Yanhong Liu
- School of Electrical and Information Engineering, Zhengzhou University, Henan Province, 450001, China; Robot Perception and Control Engineering Laboratory of Henan Province, 450001, China
- Hongnian Yu
- School of Electrical and Information Engineering, Zhengzhou University, Henan Province, 450001, China; Built Environment, Edinburgh Napier University, Edinburgh EH10 5DT, UK

86
Liu Z, Lv Q, Yang Z, Li Y, Lee CH, Shen L. Recent progress in transformer-based medical image analysis. Comput Biol Med 2023; 164:107268. [PMID: 37494821 DOI: 10.1016/j.compbiomed.2023.107268] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2023] [Revised: 05/30/2023] [Accepted: 07/16/2023] [Indexed: 07/28/2023]
Abstract
The transformer is primarily used in the field of natural language processing. Recently, it has been adopted and shows promise in the computer vision (CV) field. Medical image analysis (MIA), as a critical branch of CV, also greatly benefits from this state-of-the-art technique. In this review, we first recap the core component of the transformer, the attention mechanism, and the detailed structures of the transformer. After that, we depict the recent progress of the transformer in the field of MIA. We organize the applications in a sequence of different tasks, including classification, segmentation, captioning, registration, detection, enhancement, localization, and synthesis. The mainstream classification and segmentation tasks are further divided into eleven medical image modalities. A large number of experiments studied in this review illustrate that the transformer-based method outperforms existing methods through comparisons with multiple evaluation metrics. Finally, we discuss the open challenges and future opportunities in this field. This task-modality review with the latest contents, detailed information, and comprehensive comparison may greatly benefit the broad MIA community.
Affiliation(s)
- Zhaoshan Liu
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
- Qiujie Lv
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore; School of Intelligent Systems Engineering, Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, 518107, China
- Ziduo Yang
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore; School of Intelligent Systems Engineering, Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, 518107, China
- Yifan Li
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
- Chau Hung Lee
- Department of Radiology, Tan Tock Seng Hospital, 11 Jalan Tan Tock Seng, Singapore, 308433, Singapore
- Lei Shen
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore

87
Ghimire R, Lee SW. MMNet: A Mixing Module Network for Polyp Segmentation. SENSORS (BASEL, SWITZERLAND) 2023; 23:7258. [PMID: 37631792 PMCID: PMC10458640 DOI: 10.3390/s23167258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/12/2023] [Revised: 08/03/2023] [Accepted: 08/16/2023] [Indexed: 08/27/2023]
Abstract
Traditional encoder-decoder networks like U-Net have been extensively used for polyp segmentation. However, such networks have demonstrated limitations in explicitly modeling long-range dependencies: local patterns are emphasized over the global context, as each convolutional kernel attends to only a local subset of pixels in the entire image. Several recent transformer-based networks have been shown to overcome such limitations. These networks encode long-range dependencies using self-attention and thus learn highly expressive representations. However, self-attention is expensive to compute, as its cost grows quadratically with the number of pixels in the image. Thus, patch embedding has been utilized, which groups small regions of the image into single input features. Nevertheless, these transformers still lack inductive bias, even when the image is treated as a 1D sequence of visual tokens. This results in an inability to generalize to local contexts due to limited low-level features. We introduce a hybrid transformer combined with a convolutional mixing network to overcome the computational and long-range dependency issues. A pretrained transformer network is introduced as a feature-extracting encoder, and a mixing module network (MMNet) is introduced to capture long-range dependencies at a reduced computational cost. Precisely, in the mixing module network, we use depth-wise and 1 × 1 convolutions to establish spatial and cross-channel correlation, respectively. The proposed approach is evaluated qualitatively and quantitatively on five challenging polyp datasets across six metrics. Our MMNet outperforms the previous best polyp segmentation methods.
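The mixing described above, a depth-wise convolution for spatial correlation followed by a 1 × 1 convolution for cross-channel correlation, follows the familiar depth-wise separable factorization. A small NumPy sketch (illustrative only, not the MMNet code; sizes are arbitrary) shows how the two steps operate and why the factorization is cheaper than a dense convolution:

```python
import numpy as np

def depthwise_separable(x, dw_kernels, pw_weights):
    """Depth-wise 3x3 conv ('same' padding) + 1x1 point-wise conv.

    x: (C, H, W); dw_kernels: (C, 3, 3); pw_weights: (C_out, C).
    """
    C, H, W = x.shape
    pad = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    spatial = np.zeros_like(x)
    for c in range(C):                       # spatial correlation, per channel
        for i in range(H):
            for j in range(W):
                spatial[c, i, j] = np.sum(pad[c, i:i+3, j:j+3] * dw_kernels[c])
    # 1x1 conv == a matrix multiply over the channel axis (cross-channel mixing)
    return np.einsum('oc,chw->ohw', pw_weights, spatial)

C, C_out, k = 16, 32, 3
standard_params = C_out * C * k * k              # one dense 3x3 convolution
separable_params = C * k * k + C_out * C         # depth-wise + point-wise
print(standard_params, separable_params)         # prints: 4608 656
```

The parameter count drops by roughly a factor of k²·C_out/(k² + C_out), which is the usual motivation for this factorization in efficiency-oriented designs.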
Affiliation(s)
- Raman Ghimire
- Pattern Recognition and Machine Learning Lab, Department of IT Convergence Engineering, Gachon University, Seongnam 13557, Republic of Korea
- Sang-Woong Lee
- Pattern Recognition and Machine Learning Lab, Department of AI Software, Gachon University, Seongnam 13557, Republic of Korea

88
Shamshad F, Khan S, Zamir SW, Khan MH, Hayat M, Khan FS, Fu H. Transformers in medical imaging: A survey. Med Image Anal 2023; 88:102802. [PMID: 37315483 DOI: 10.1016/j.media.2023.102802] [Citation(s) in RCA: 186] [Impact Index Per Article: 93.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2022] [Revised: 03/11/2023] [Accepted: 03/23/2023] [Indexed: 06/16/2023]
Abstract
Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest in Transformers, which can capture global context, in contrast to CNNs with local receptive fields. Inspired by this transition, in this survey we attempt to provide a comprehensive review of the applications of Transformers in medical imaging, covering various aspects ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, restoration, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications, we develop a taxonomy, identify application-specific challenges, provide insights on how to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, including the identification of key challenges and open problems, and outline promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development in this field, we intend to regularly update the relevant latest papers and their open-source implementations at https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging.
Affiliation(s)
- Fahad Shamshad
- MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Salman Khan
- MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; CECS, Australian National University, Canberra ACT 0200, Australia
- Syed Waqas Zamir
- Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Munawar Hayat
- Faculty of IT, Monash University, Clayton VIC 3800, Australia
- Fahad Shahbaz Khan
- MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; Computer Vision Laboratory, Linköping University, Sweden
- Huazhu Fu
- Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore

89
Xie Y, Yu Y, Liao M, Sun C. Gastric polyp detection module based on improved attentional feature fusion. Biomed Eng Online 2023; 22:72. [PMID: 37468936 DOI: 10.1186/s12938-023-01130-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2022] [Accepted: 06/26/2023] [Indexed: 07/21/2023] Open
Abstract
Gastric cancer is a deadly disease, and gastric polyps are at high risk of becoming cancerous. Therefore, the timely detection of gastric polyps is of great importance, as it can effectively reduce the incidence of gastric cancer. At present, object detection methods based on deep learning are widely used in medical imaging. However, as the contrast between the background and the polyps is not strong in gastroscopic images, it is difficult to distinguish polyps of various sizes from the background. In this paper, to improve the detection performance for endoscopic gastric polyps, we propose an improved attentional feature fusion module. First, in order to enhance the contrast between the background and the polyps, we propose an attention module that enables the network to make full use of target location information; it can suppress interference from background information and highlight effective features. Therefore, on the basis of accurate positioning, the network can focus on determining whether the current location contains a gastric polyp or background. Then, this module is combined with our feature fusion module to form a new attentional feature fusion model that can mitigate the effects of semantic differences during feature fusion, using multi-scale fusion information to obtain more accurate attention weights and improve the detection performance for polyps of different sizes. In this work, we conduct experiments on our own dataset of gastric polyps. Experimental results show that the proposed attentional feature fusion module outperforms the common feature fusion module and can reduce cases where polyps are missed or misdetected.
Affiliation(s)
- Yun Xie
- School of Intelligence Science and Technology, University of Science and Technology Beijing, Beijing, China
- Yao Yu
- School of Intelligence Science and Technology, University of Science and Technology Beijing, Beijing, China
- Mingchao Liao
- School of Intelligence Science and Technology, University of Science and Technology Beijing, Beijing, China
- Changyin Sun
- School of Artificial Intelligence, Anhui University, Anhui, China

90
Bian H, Jiang M, Qian J. The investigation of constraints in implementing robust AI colorectal polyp detection for sustainable healthcare system. PLoS One 2023; 18:e0288376. [PMID: 37437026 DOI: 10.1371/journal.pone.0288376] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2023] [Accepted: 06/24/2023] [Indexed: 07/14/2023] Open
Abstract
Colorectal cancer (CRC) is one of the significant threats to public health and the sustainable healthcare system during urbanization. As the primary screening method, colonoscopy can effectively detect polyps before they evolve into cancerous growths. However, current visual inspection by endoscopists is insufficient for consistently reliable polyp detection in colonoscopy videos and images during CRC screening. Artificial Intelligence (AI)-based object detection is considered a potent solution for overcoming the limitations of visual inspection and mitigating human error in colonoscopy. This study implemented a YOLOv5 object detection model to investigate the performance of mainstream one-stage approaches in colorectal polyp detection. Meanwhile, a variety of training datasets and model structure configurations were employed to identify the determinative factors in practical applications. The designed experiments show that the model yields acceptable results when assisted by transfer learning, and highlight that the primary constraint in implementing deep learning polyp detection is the scarcity of training data. Model performance improved by 15.6% in terms of average precision (AP) when the original training dataset was expanded. Furthermore, the experimental results were analysed from a clinical perspective to identify potential causes of false positives. In addition, a quality management framework is proposed for future dataset preparation and model development in AI-driven polyp detection tasks for smart healthcare solutions.
Affiliation(s)
- Haitao Bian
- College of Safety Science and Engineering, Nanjing Tech University, Nanjing, Jiangsu, China
- Min Jiang
- KLA Corporation, Milpitas, California, United States of America
- Jingjing Qian
- Department of Gastroenterology, The Second Hospital of Nanjing, Nanjing University of Chinese Medicine, Nanjing, Jiangsu, China

91
Jin Q, Hou H, Zhang G, Li Z. FEGNet: A Feedback Enhancement Gate Network for Automatic Polyp Segmentation. IEEE J Biomed Health Inform 2023; 27:3420-3430. [PMID: 37126617 DOI: 10.1109/jbhi.2023.3272168] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]
Abstract
Regular colonoscopy is an effective way to prevent colorectal cancer by detecting colorectal polyps. Automatic polyp segmentation significantly aids clinicians in precisely locating polyp areas for further diagnosis. However, polyp segmentation is a challenging problem, since polyps appear in a variety of shapes, sizes, and textures, and they tend to have ambiguous boundaries. In this paper, we propose a U-shaped model named Feedback Enhancement Gate Network (FEGNet) for accurate polyp segmentation to overcome these difficulties. Specifically, for the high-level features, we design a novel Recurrent Gate Module (RGM) based on the feedback mechanism, which can refine attention maps without any additional parameters. RGM consists of a Feature Aggregation Attention Gate (FAAG) and a Multi-Scale Module (MSM). FAAG can aggregate context and feedback information, and MSM is applied to capture multi-scale information, which is critical for the segmentation task. In addition, we propose a straightforward but effective edge extraction module that detects polyp boundaries from low-level features and is used to guide the training of early features. In our experiments, quantitative and qualitative evaluations show that the proposed FEGNet achieves the best results in polyp segmentation compared to other state-of-the-art models on five colonoscopy datasets.
92
Yue G, Zhuo G, Li S, Zhou T, Du J, Yan W, Hou J, Liu W, Wang T. Benchmarking Polyp Segmentation Methods in Narrow-Band Imaging Colonoscopy Images. IEEE J Biomed Health Inform 2023; 27:3360-3371. [PMID: 37099473 DOI: 10.1109/jbhi.2023.3270724] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/27/2023]
Abstract
In recent years, there has been significant progress in polyp segmentation in white-light imaging (WLI) colonoscopy images, particularly with methods based on deep learning (DL). However, little attention has been paid to the reliability of these methods in narrow-band imaging (NBI) data. NBI improves visibility of blood vessels and helps physicians observe complex polyps more easily than WLI, but NBI images often include polyps with small/flat appearances, background interference, and camouflage properties, making polyp segmentation a challenging task. This paper proposes a new polyp segmentation dataset (PS-NBI2K) consisting of 2,000 NBI colonoscopy images with pixel-wise annotations, and presents benchmarking results and analyses for 24 recently reported DL-based polyp segmentation methods on PS-NBI2K. The results show that existing methods struggle to locate polyps with smaller sizes and stronger interference, and that extracting both local and global features improves performance. There is also a trade-off between effectiveness and efficiency, and most methods cannot achieve the best results in both areas simultaneously. This work highlights potential directions for designing DL-based polyp segmentation methods in NBI colonoscopy images, and the release of PS-NBI2K aims to drive further development in this field.
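Benchmarks of this kind score predicted masks against pixel-wise annotations with overlap metrics; the abstract does not list the specific metrics, but the Dice coefficient is the standard choice in polyp segmentation. A minimal implementation for binary masks:

```python
import numpy as np

def dice_coefficient(pred, gt, eps=1e-7):
    """Dice similarity between binary masks: 2*|A∩B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
print(round(dice_coefficient(pred, gt), 3))   # prints: 0.667
```

The `eps` term keeps the score defined when both masks are empty, a common edge case in frames without polyps.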
93
Wang J, Tian S, Yu L, Zhou Z, Wang F, Wang Y. HIGF-Net: Hierarchical information-guided fusion network for polyp segmentation based on transformer and convolution feature learning. Comput Biol Med 2023; 161:107038. [PMID: 37230017 DOI: 10.1016/j.compbiomed.2023.107038] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2022] [Revised: 01/22/2023] [Accepted: 05/11/2023] [Indexed: 05/27/2023]
Abstract
Polyp segmentation plays an important role in image analysis during colonoscopy screening, improving the diagnostic efficiency for early colorectal cancer. However, due to the variable shape and size of polyps, the small difference between the lesion area and the background, and interference from image acquisition conditions, existing segmentation methods often miss polyps and produce rough boundary divisions. To overcome these challenges, we propose a multi-level fusion network called HIGF-Net, which uses a hierarchical guidance strategy to aggregate rich information and produce reliable segmentation results. Specifically, our HIGF-Net excavates deep global semantic information and shallow local spatial features of images with a Transformer encoder and a CNN encoder together. Then, a double-stream structure is used to transmit polyp shape properties between feature layers at different depths. This module calibrates the position and shape of polyps of different sizes to improve the model's efficient use of the rich polyp features. In addition, a Separate Refinement module refines the polyp profile in uncertain regions to highlight the difference between the polyp and the background. Finally, in order to adapt to diverse collection environments, a Hierarchical Pyramid Fusion module merges the features of multiple layers with different representational capabilities. We evaluate the learning and generalization abilities of HIGF-Net on five datasets (Kvasir-SEG, CVC-ClinicDB, ETIS, CVC-300, and CVC-ColonDB) using six evaluation metrics. Experimental results show that the proposed model is effective in polyp feature mining and lesion identification, and its segmentation performance is better than that of ten excellent models.
Affiliation(s)
- Junwen Wang
- College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Shengwei Tian
- College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Long Yu
- College of Network Center, Xinjiang University, Urumqi, 830000, China; Signal and Signal Processing Laboratory, College of Information Science and Engineering, Xinjiang University, Urumqi, 830000, China
- Zhicheng Zhou
- College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Fan Wang
- College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Yongtao Wang
- College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China

94
Nanni L, Fantozzi C, Loreggia A, Lumini A. Ensembles of Convolutional Neural Networks and Transformers for Polyp Segmentation. SENSORS (BASEL, SWITZERLAND) 2023; 23:4688. [PMID: 37430601 DOI: 10.3390/s23104688] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/13/2023] [Revised: 04/29/2023] [Accepted: 05/09/2023] [Indexed: 07/12/2023]
Abstract
In the realm of computer vision, semantic segmentation is the task of recognizing objects in images at the pixel level. This is done by performing a classification of each pixel. The task is complex and requires sophisticated skills and knowledge about the context to identify objects' boundaries. The importance of semantic segmentation in many domains is undisputed. In medical diagnostics, it simplifies the early detection of pathologies, thus mitigating the possible consequences. In this work, we provide a review of the literature on deep ensemble learning models for polyp segmentation and develop new ensembles based on convolutional neural networks and transformers. The development of an effective ensemble entails ensuring diversity between its components. To this end, we combined different models (HarDNet-MSEG, Polyp-PVT, and HSNet) trained with different data augmentation techniques, optimization methods, and learning rates, which we experimentally demonstrate to be useful to form a better ensemble. Most importantly, we introduce a new method to obtain the segmentation mask by averaging intermediate masks after the sigmoid layer. In our extensive experimental evaluation, the average performance of the proposed ensembles over five prominent datasets beat any other solution that we know of. Furthermore, the ensembles also performed better than the state-of-the-art on two of the five datasets, when individually considered, without having been specifically trained for them.
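The key trick described above, averaging each member's intermediate mask after the sigmoid layer rather than fusing hard binary masks, can be sketched as follows (a minimal illustration of the general scheme, not the authors' code; the logit values and threshold are made up):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ensemble_mask(logit_maps, threshold=0.5):
    """Average soft (post-sigmoid) masks from several models, then binarize.

    logit_maps: list of (H, W) arrays of raw segmentation logits,
    one per ensemble member.
    """
    soft = np.mean([sigmoid(m) for m in logit_maps], axis=0)
    return (soft >= threshold).astype(np.uint8)

# Three hypothetical members disagree; averaging the soft masks lets a
# confident member outvote a mildly negative one on the first pixel.
a = np.array([[ 3.0, -3.0]])
b = np.array([[ 2.0, -1.0]])
c = np.array([[-1.0, -2.0]])
print(ensemble_mask([a, b, c]))   # prints: [[1 0]]
```

Averaging before thresholding preserves each member's confidence, whereas majority-voting on binarized masks would discard it.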
Affiliation(s)
- Loris Nanni
- Department of Information Engineering, University of Padova, 35122 Padova, Italy
- Carlo Fantozzi
- Department of Information Engineering, University of Padova, 35122 Padova, Italy
- Andrea Loreggia
- Department of Information Engineering, University of Brescia, 25121 Brescia, Italy
- Alessandra Lumini
- Department of Computer Science and Engineering, University of Bologna, 40126 Bologna, Italy

95
Hu K, Chen W, Sun Y, Hu X, Zhou Q, Zheng Z. PPNet: Pyramid pooling based network for polyp segmentation. Comput Biol Med 2023; 160:107028. [PMID: 37201273 DOI: 10.1016/j.compbiomed.2023.107028] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2023] [Revised: 04/24/2023] [Accepted: 05/09/2023] [Indexed: 05/20/2023]
Abstract
Colonoscopy is the gold-standard method for investigating the gastrointestinal tract. Localizing polyps in colonoscopy images plays a vital role in colonoscopy screening and is also quite important for subsequent treatment, e.g., polyp resection. Many deep learning-based methods have been applied to the polyp segmentation problem. However, precise polyp segmentation is still an open issue. Considering the effectiveness of the Pyramid Pooling Transformer (P2T) in modeling long-range dependencies and capturing robust contextual features, as well as the power of pyramid pooling in extracting features, we propose a pyramid pooling based network for polyp segmentation, namely PPNet. We first adopt the P2T as the encoder for extracting more powerful features. Next, a pyramid feature fusion module (PFFM) combined with a channel attention scheme is utilized to learn a global contextual feature, in order to guide the information transition in the decoder branch. Aiming to enhance the effectiveness of PPNet on feature extraction during the decoder stage layer by layer, we introduce the memory-keeping pyramid pooling module (MPPM) into each side branch of the encoder, and transmit the corresponding feature to each lower-level side branch. Experimental results on five public colorectal polyp segmentation datasets are given and discussed. Our method performs better than several state-of-the-art polyp segmentation networks, demonstrating the effectiveness of the pyramid pooling mechanism for colorectal polyp segmentation.
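Pyramid pooling, the mechanism PPNet builds on, pools the feature map at several grid resolutions and concatenates the upsampled results so each position carries context at multiple scales. A minimal NumPy sketch of the classic scheme (illustrative only, not the PPNet modules; the bin sizes follow common practice):

```python
import numpy as np

def avg_pool_grid(feat, bins):
    """Adaptive average pooling of a (C, H, W) map into a (C, bins, bins) grid."""
    C, H, W = feat.shape
    out = np.zeros((C, bins, bins))
    ys = np.linspace(0, H, bins + 1).astype(int)
    xs = np.linspace(0, W, bins + 1).astype(int)
    for i in range(bins):
        for j in range(bins):
            out[:, i, j] = feat[:, ys[i]:ys[i+1], xs[j]:xs[j+1]].mean(axis=(1, 2))
    return out

def pyramid_pool(feat, bin_sizes=(1, 2, 3, 6)):
    """Pool at several grid resolutions, upsample back (nearest), concatenate."""
    C, H, W = feat.shape
    branches = [feat]
    for b in bin_sizes:
        pooled = avg_pool_grid(feat, b)
        up = pooled.repeat(int(np.ceil(H / b)), axis=1)[:, :H, :]
        up = up.repeat(int(np.ceil(W / b)), axis=2)[:, :, :W]
        branches.append(up)
    return np.concatenate(branches, axis=0)  # (C * (1 + len(bin_sizes)), H, W)

feat = np.random.rand(4, 12, 12)
out = pyramid_pool(feat)
print(out.shape)   # prints: (20, 12, 12)
```

In a real network, each pooled branch would pass through a 1 × 1 convolution before concatenation to reduce its channel count; the sketch omits that to stay minimal.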
Affiliation(s)
- Keli Hu
- Department of Computer Science and Engineering, Shaoxing University, Shaoxing, 312000, PR China; Cancer Center, Department of Gastroenterology, Zhejiang Provincial People's Hospital (Affiliated People's Hospital, Hangzhou Medical College), Hangzhou, 310014, PR China; Information Technology R&D Innovation Center of Peking University, Shaoxing, 312000, PR China
- Wenping Chen
- Department of Computer Science and Engineering, Shaoxing University, Shaoxing, 312000, PR China
- YuanZe Sun
- Department of Computer Science and Engineering, Shaoxing University, Shaoxing, 312000, PR China
- Xiaozhao Hu
- Shaoxing People's Hospital, Shaoxing, 312000, PR China
- Qianwei Zhou
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, PR China
- Zirui Zheng
- Department of Computer Science and Engineering, Shaoxing University, Shaoxing, 312000, PR China

96
Wang KN, Zhuang S, Ran QY, Zhou P, Hua J, Zhou GQ, He X. DLGNet: A dual-branch lesion-aware network with the supervised Gaussian Mixture model for colon lesions classification in colonoscopy images. Med Image Anal 2023; 87:102832. [PMID: 37148864 DOI: 10.1016/j.media.2023.102832] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2022] [Revised: 01/20/2023] [Accepted: 04/20/2023] [Indexed: 05/08/2023]
Abstract
Colorectal cancer is one of the malignant tumors with the highest mortality due to the lack of obvious early symptoms; it is usually at an advanced stage when discovered. Thus, the automatic and accurate classification of early colon lesions is of great significance for clinically estimating the status of colon lesions and formulating appropriate diagnostic programs. However, it is challenging to classify full-stage colon lesions due to the large inter-class similarities and intra-class differences of the images. In this work, we propose a novel dual-branch lesion-aware neural network (DLGNet) that classifies intestinal lesions by exploring the intrinsic relationship between diseases, composed of four modules: a lesion location module, a dual-branch classification module, an attention guidance module, and an inter-class Gaussian loss function. Specifically, the elaborate dual-branch module integrates the original image and the lesion patch obtained by the lesion localization module to explore and interact with lesion-specific features from global and local perspectives. Also, the feature-guided module guides the model to attend to disease-specific features by learning remote dependencies through spatial and channel attention after network feature learning. Finally, the inter-class Gaussian loss function is proposed, which assumes that each feature extracted by the network follows an independent Gaussian distribution and makes the inter-class clustering more compact, thereby improving the discriminative ability of the network. Extensive experiments on the collected 2568 colonoscopy images achieve an average accuracy of 91.50%, and the proposed method surpasses the state-of-the-art methods. This study is the first to classify colon lesions at each stage, and it achieves promising colon disease classification performance. To motivate the community, we have made our code publicly available via https://github.com/soleilssss/DLGNet.
Affiliation(s)
- Kai-Ni Wang
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
- Shuaishuai Zhuang
- The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Qi-Yong Ran
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
- Ping Zhou
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
- Jie Hua
- The First Affiliated Hospital of Nanjing Medical University, Nanjing, China; Liyang People's Hospital, Liyang Branch Hospital of Jiangsu Province Hospital, Liyang, China
- Guang-Quan Zhou
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
- Xiaopu He
- The First Affiliated Hospital of Nanjing Medical University, Nanjing, China

97
Su Y, Cheng J, Zhong C, Zhang Y, Ye J, He J, Liu J. FeDNet: Feature Decoupled Network for polyp segmentation from endoscopy images. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/19/2023]
98
Wei X, Ye F, Wan H, Xu J, Min W. TANet: Triple Attention Network for medical image segmentation. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]
99
Yue G, Li S, Zhou T, Wang M, Du J, Jiang Q, Gao W, Wang T, Lv J. Adaptive Context Exploration Network for Polyp Segmentation in Colonoscopy Images. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE 2023; 7:487-499. [DOI: 10.1109/tetci.2022.3193677] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2025]
Affiliation(s)
- Guanghui Yue
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
- Siying Li
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
- Tianwei Zhou
- College of Management, Shenzhen University, Shenzhen, China
- Miaohui Wang
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen, China
- Jingfeng Du
- Department of Gastroenterology and Hepatology, Shenzhen University General Hospital, Shenzhen, China
- Qiuping Jiang
- School of Information Science and Engineering, Ningbo University, Ningbo, China
- Wei Gao
- School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University, Shenzhen, China
- Tianfu Wang
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
- Jun Lv
- School of Computer and Control Engineering, Yantai University, Yantai, China
100
Dhaliwal J, Walsh CM. Artificial Intelligence in Pediatric Endoscopy: Current Status and Future Applications. Gastrointest Endosc Clin N Am 2023; 33:291-308. [PMID: 36948747] [DOI: 10.1016/j.giec.2022.12.001]
Abstract
The application of artificial intelligence (AI) has great promise for improving pediatric endoscopy. The majority of preclinical studies have been undertaken in adults, with the greatest progress being made in the context of colorectal cancer screening and surveillance. This development has only been possible with advances in deep learning, like the convolutional neural network model, which has enabled real-time detection of pathology. Comparatively, the majority of deep learning systems developed in inflammatory bowel disease have focused on predicting disease severity and were developed using still images rather than videos. The application of AI to pediatric endoscopy is in its infancy, thus providing an opportunity to develop clinically meaningful and fair systems that do not perpetuate societal biases. In this review, we provide an overview of AI, summarize the advances of AI in endoscopy, and describe its potential application to pediatric endoscopic practice and education.
Affiliation(s)
- Jasbir Dhaliwal
- Division of Pediatric Gastroenterology, Hepatology and Nutrition, Cincinnati Children's Hospital Medical Center, University of Cincinnati, OH, USA.
| | - Catharine M Walsh
- Division of Gastroenterology, Hepatology, and Nutrition, and the SickKids Research and Learning Institutes, The Hospital for Sick Children, Toronto, ON, Canada; Department of Paediatrics and The Wilson Centre, University of Toronto, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada