1
Huang S, Ge Y, Liu D, Hong M, Zhao J, Loui AC. Rethinking Copy-Paste for Consistency Learning in Medical Image Segmentation. IEEE Trans Image Process 2025; 34:1060-1074. [PMID: 40031728 DOI: 10.1109/tip.2025.3536208]
Abstract
Semi-supervised learning based on consistency learning offers significant promise for enhancing medical image segmentation. Current approaches use copy-paste as an effective data-perturbation technique to facilitate weak-to-strong consistency learning. However, these techniques often reduce the accuracy of the synthetic labels that accompany the synthetic data and introduce excessive perturbations into the distribution of the training data. Such over-perturbation causes the data distribution to stray from its true distribution, impairing the model's generalization as it learns decision boundaries. We propose a weak-to-strong consistency learning framework that addresses these issues with two primary designs: 1) it emphasizes the use of highly reliable data to enhance the quality of synthetic labels through cross-copy-pasting between labeled and unlabeled datasets; 2) it employs uncertainty estimation and foreground-region constraints to carefully filter the regions used for copy-pasting, so that the copy-paste operation introduces a beneficial perturbation to the training data distribution. Our framework extends the copy-paste method by addressing its inherent limitations and amplifying the potential of data perturbations for consistency learning. We extensively validated our model on six publicly available medical image segmentation datasets spanning different diagnostic tasks, including segmentation of cardiac structures, prostate structures, brain structures, skin lesions, and gastrointestinal polyps. The results demonstrate that our method significantly outperforms state-of-the-art models. For instance, on the PROMISE12 dataset for prostate structure segmentation, using only 10% labeled data, our method achieves a 15.31% higher Dice score than the baseline models. Our experimental code will be made publicly available at https://github.com/slhuang24/RCP4CL.
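The cross-copy-paste idea above can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' released code: the rectangular-mask scheme, the 50% area fraction, and all function names are assumptions for illustration.

```python
# Sketch of copy-paste between a labeled and an unlabeled image (2D lists).
# A random rectangular region of one image is pasted onto the other.
import random

def random_box_mask(h, w, frac=0.5, rng=random.Random(0)):
    """Binary mask with a random rectangle of roughly `frac` side length set to 1."""
    bh, bw = int(h * frac), int(w * frac)
    top, left = rng.randrange(h - bh + 1), rng.randrange(w - bw + 1)
    return [[1 if top <= i < top + bh and left <= j < left + bw else 0
             for j in range(w)] for i in range(h)]

def copy_paste(src, dst, mask):
    """Paste the mask==1 pixels of `src` onto `dst` (equal-shaped 2D lists)."""
    return [[src[i][j] if mask[i][j] else dst[i][j]
             for j in range(len(dst[0]))] for i in range(len(dst))]
```

A mixed training image takes the labeled image's pixels inside the box and the unlabeled image's pixels outside it; the mixed supervision signal is assembled the same way from the ground-truth label and the pseudo-label.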
2
Chen J, Huang W, Zhang J, Debattista K, Han J. Addressing inconsistent labeling with cross image matching for scribble-based medical image segmentation. IEEE Trans Image Process 2025; PP:842-853. [PMID: 40031274 DOI: 10.1109/tip.2025.3530787]
Abstract
In recent years, there has been a notable surge in the adoption of weakly-supervised learning for medical image segmentation, using scribble annotation as a means to reduce annotation costs. However, the inherent characteristics of scribble labeling, marked by incompleteness, subjectivity, and a lack of standardization, introduce inconsistencies into the annotations. These inconsistencies pose significant challenges for the network's learning process and ultimately degrade segmentation performance. To address this challenge, we propose creating a reference set to guide pixel-level feature matching, constructed from class-specific tokens and pixel-level features extracted from various images. Serving as a repository of diverse pixel styles and classes, the reference set becomes the cornerstone of a pixel-level feature matching strategy. This strategy enables effective comparison of unlabeled pixels, offering guidance particularly in learning scenarios characterized by inconsistent and incomplete scribbles. The proposed strategy incorporates smoothing and regression techniques to align pixel-level features across different images. By leveraging the diversity of pixel sources, our matching approach enhances the network's ability to learn consistent patterns from the reference set. This, in turn, mitigates the impact of inconsistent and incomplete labeling, resulting in improved segmentation outcomes. Extensive experiments conducted on three publicly available datasets demonstrate the superiority of our approach over state-of-the-art methods in terms of segmentation accuracy and stability. The code will be made publicly available at https://github.com/jingkunchen/scribble-medical-segmentation.
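The reference-set matching strategy can be illustrated with a toy nearest-match classifier. Everything here (cosine similarity, the dictionary layout, the function names) is an assumption for illustration; the paper matches deep pixel features, not hand-made vectors.

```python
# Toy sketch: classify an unlabeled pixel by its most similar entry in a
# reference set of class-specific feature vectors gathered from many images.
import math

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def match_pixel(feature, reference_set):
    """Return the class whose reference vector is most similar to `feature`.

    `reference_set` maps class name -> list of feature vectors, so a pixel
    can match any of the diverse styles stored for that class."""
    best_cls, best_sim = None, -2.0
    for cls, vectors in reference_set.items():
        for v in vectors:
            s = cosine(feature, v)
            if s > best_sim:
                best_cls, best_sim = cls, s
    return best_cls
```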
3
Zeng Q, Xie Y, Lu Z, Lu M, Zhang J, Xia Y. Consistency-Guided Differential Decoding for Enhancing Semi-Supervised Medical Image Segmentation. IEEE Trans Med Imaging 2025; 44:44-56. [PMID: 39088492 DOI: 10.1109/tmi.2024.3429340]
Abstract
Semi-supervised learning (SSL) has proven beneficial for mitigating the issue of limited labeled data, especially in volumetric medical image segmentation. Unlike previous SSL methods, which focus on exploring highly confident pseudo-labels or developing consistency regularization schemes, our empirical findings suggest that differential decoder features emerge naturally when two decoders strive to generate consistent predictions. Based on this observation, we first analyze the value of this discrepancy in learning towards consistency, under both pseudo-labeling and consistency regularization settings, and subsequently propose a novel SSL method called LeFeD, which learns the feature-level discrepancies obtained from two decoders by feeding such information as feedback signals to the encoder. The core design of LeFeD is to enlarge the discrepancies by training differential decoders and then learn from the differential features iteratively. We evaluate LeFeD against eight state-of-the-art (SOTA) methods on three public datasets. Experiments show that LeFeD surpasses competitors without any bells and whistles, such as uncertainty estimation or strong constraints, and sets a new state of the art for semi-supervised medical image segmentation. Code has been released at https://github.com/maxwell0027/LeFeD.
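A toy rendering of the feedback loop described above: compute the discrepancy between two decoder outputs and inject it back into the encoder features. The shapes, the additive injection, and the weight are illustrative assumptions; LeFeD operates on deep feature maps inside the network.

```python
# Sketch: feature-level discrepancy between two decoders, fed back as a signal.
def discrepancy(feat_a, feat_b):
    """Element-wise absolute difference between two decoder feature maps."""
    return [[abs(a - b) for a, b in zip(ra, rb)]
            for ra, rb in zip(feat_a, feat_b)]

def feedback(encoder_feat, disc, weight=0.1):
    """Inject the discrepancy back into the encoder features (assumed additive)."""
    return [[e + weight * d for e, d in zip(re, rd)]
            for re, rd in zip(encoder_feat, disc)]
```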
4
Wang Y, Xiao B, Bi X, Li W, Gao X. Boundary-Aware Prototype in Semi-Supervised Medical Image Segmentation. IEEE Trans Image Process 2024; 33:5456-5467. [PMID: 39316477 DOI: 10.1109/tip.2024.3463412]
Abstract
The true label plays an important role in semi-supervised medical image segmentation (SSMIS) because it provides the most accurate supervision information when labels are limited. Popular SSMIS methods train labeled and unlabeled data separately, so the unlabeled data cannot be directly supervised by the true label. This limits the contribution of labels to model training. Is there an interactive mechanism that can break the separation between the two types of data training to maximize the utilization of true labels? Inspired by this question, we propose a novel consistency learning framework based on a non-parametric distance metric over boundary-aware prototypes to alleviate this problem. This method combines CNN-based linear classification and nearest-neighbor-based non-parametric classification in one framework, encouraging the two segmentation paradigms to make similar predictions for the same input. More importantly, the prototypes can be clustered from both labeled and unlabeled data features, so they serve as a bridge for interactive training between labeled and unlabeled data. When the prototype-based prediction is supervised by the true label, the supervisory signal simultaneously affects the feature extraction process for both kinds of data. In addition, boundary-aware prototypes can explicitly model the differences between the boundaries and centers of adjacent categories, so pixel-prototype contrastive learning is introduced to further improve the discriminability of features and make them more suitable for non-parametric distance measurement. Experiments show that although our method uses a modified lightweight UNet as the backbone, it outperforms comparison methods that use a 3D VNet with more parameters.
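The nearest-prototype (non-parametric) branch can be sketched as follows. The mean-feature prototype and L2 distance are standard choices assumed here for illustration, not details taken from the paper.

```python
# Sketch of non-parametric, prototype-based classification: prototypes are
# mean feature vectors (clusterable from labeled and unlabeled features alike),
# and a pixel is assigned to the class of its nearest prototype.
def make_prototype(features):
    """Average a list of equal-length feature vectors into one prototype."""
    n = len(features)
    return [sum(f[i] for f in features) / n for i in range(len(features[0]))]

def nearest_prototype(feature, prototypes):
    """Assign `feature` to the class of the closest prototype (squared L2)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda cls: dist2(feature, prototypes[cls]))
```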
5
Shi J, Li C, Gong T, Fu H. E2-MIL: An explainable and evidential multiple instance learning framework for whole slide image classification. Med Image Anal 2024; 97:103294. [PMID: 39128377 DOI: 10.1016/j.media.2024.103294]
Abstract
Multiple instance learning (MIL)-based methods have been widely adopted to process the whole slide image (WSI) in the field of computational pathology. Due to the sparse slide-level supervision, these methods usually lack good localization on the tumor regions, leading to poor interpretability. Moreover, they lack robust uncertainty estimation of prediction results, leading to poor reliability. To solve the above two limitations, we propose an explainable and evidential multiple instance learning (E2-MIL) framework for whole slide image classification. E2-MIL is mainly composed of three modules: a detail-aware attention distillation module (DAM), a structure-aware attention refined module (SRM), and an uncertainty-aware instance classifier (UIC). Specifically, DAM helps the global network locate more detail-aware positive instances by utilizing the complementary sub-bags to learn detailed attention knowledge from the local network. In addition, a masked self-guidance loss is also introduced to help bridge the gap between the slide-level labels and instance-level classification tasks. SRM generates a structure-aware attention map that locates the entire tumor region structure by effectively modeling the spatial relations between clustering instances. Moreover, UIC provides accurate instance-level classification results and robust predictive uncertainty estimation to improve the model reliability based on subjective logic theory. Extensive experiments on three large multi-center subtyping datasets demonstrate both slide-level and instance-level performance superiority of E2-MIL.
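Attention-based MIL pooling, the backbone idea behind instance-attention modules such as the ones described above, can be sketched as a weighted sum of instance features. The raw scores here are plain numbers standing in for a learned attention network; all names are assumptions.

```python
# Sketch of attention-based MIL pooling: instance features are weighted by
# softmax-normalized attention scores and summed into one bag (slide) vector.
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def mil_pool(instance_feats, attn_scores):
    """Attention-weighted sum of instance feature vectors."""
    w = softmax(attn_scores)
    dim = len(instance_feats[0])
    return [sum(wi * f[d] for wi, f in zip(w, instance_feats))
            for d in range(dim)]
```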
Affiliation(s)
- Jiangbo Shi
- School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, 710049, China
- Chen Li
- School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, 710049, China
- Tieliang Gong
- School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, 710049, China
- Huazhu Fu
- Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), 138632, Singapore
6
Zhang S, Yuan Z, Zhou X, Wang H, Chen B, Wang Y. VENet: Variational energy network for gland segmentation of pathological images and early gastric cancer diagnosis of whole slide images. Comput Methods Programs Biomed 2024; 250:108178. [PMID: 38652995 DOI: 10.1016/j.cmpb.2024.108178]
Abstract
BACKGROUND AND OBJECTIVE Gland segmentation of pathological images is an essential but challenging step for adenocarcinoma diagnosis. Although deep learning methods have recently made tremendous progress in gland segmentation, they still give unsatisfactory boundary and region segmentation of adjacent glands. These glands usually differ greatly in glandular appearance, and the statistical distributions of the training and test sets are inconsistent. These problems keep networks from generalizing well on the test set, complicating gland segmentation and early cancer diagnosis.
METHODS To address these problems, we propose a Variational Energy Network named VENet with a traditional variational energy Lv loss for gland segmentation of pathological images and early gastric cancer detection in whole slide images (WSIs). It effectively integrates a variational mathematical model with the data adaptability of deep learning to balance boundary and region segmentation. Furthermore, it can effectively segment and classify glands in large WSIs using reliable nucleus-width and nucleus-to-cytoplasm-ratio features.
RESULTS VENet was evaluated on the 2015 MICCAI Gland Segmentation challenge (GlaS) dataset, the Colorectal Adenocarcinoma Glands (CRAG) dataset, and a self-collected Nanfang Hospital dataset. Compared with state-of-the-art methods, our method achieved excellent performance on GlaS Test A (object Dice 95.62%, object F1 92.71%, object Hausdorff distance 73.13), GlaS Test B (object Dice 94.95%, object F1 95.60%, object Hausdorff distance 59.63), and CRAG (object Dice 95.08%, object F1 92.94%, object Hausdorff distance 28.01). On the Nanfang Hospital dataset, our method achieved a kappa of 0.78, an accuracy of 0.90, a sensitivity of 0.98, and a specificity of 0.80 on the classification of a test set of 69 WSIs.
CONCLUSIONS The experimental results show that the proposed model accurately predicts boundaries and outperforms state-of-the-art methods. It can be applied to the early diagnosis of gastric cancer by detecting regions of high-grade gastric intraepithelial neoplasia in WSIs, which can assist pathologists in analyzing large WSIs and making accurate diagnostic decisions.
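For reference, the pixel-level Dice coefficient underlying the scores reported above can be computed as below. Object-level Dice additionally matches each predicted gland to a ground-truth gland before averaging; this sketch shows only the plain binary-mask version.

```python
# Dice coefficient on flattened binary masks: 2|P ∩ G| / (|P| + |G|).
def dice(pred, gt):
    """`pred` and `gt` are equal-length lists of 0/1 values."""
    inter = sum(p and g for p, g in zip(pred, gt))
    total = sum(pred) + sum(gt)
    # Convention: two empty masks count as a perfect match.
    return 2.0 * inter / total if total else 1.0
```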
Affiliation(s)
- Shuchang Zhang
- Department of Mathematics, National University of Defense Technology, Changsha, China
- Ziyang Yuan
- Academy of Military Sciences of the People's Liberation Army, Beijing, China
- Xianchen Zhou
- Department of Mathematics, National University of Defense Technology, Changsha, China
- Hongxia Wang
- Department of Mathematics, National University of Defense Technology, Changsha, China
- Bo Chen
- Suzhou Research Center, Institute of Automation, Chinese Academy of Sciences, Suzhou, China
- Yadong Wang
- Department of Laboratory Pathology, Baiyun Branch, Nanfang Hospital, Southern Medical University, Guangzhou, China
7
Gai D, Huang Z, Min W, Geng Y, Wu H, Zhu M, Wang Q. SDMI-Net: Spatially Dependent Mutual Information Network for semi-supervised medical image segmentation. Comput Biol Med 2024; 174:108374. [PMID: 38582003 DOI: 10.1016/j.compbiomed.2024.108374]
Abstract
Semi-supervised medical image segmentation strives to polish deep models with a small amount of labeled data and a large amount of unlabeled data. The efficiency of most semi-supervised medical image segmentation methods based on voxel-level consistency learning is affected by low-confidence voxels. In addition, voxel-level consistency learning fails to consider the spatial correlation between neighboring voxels. To encourage reliable voxel-level consistency learning, we propose a dual-teacher affine-consistent uncertainty estimation method to filter out voxels with high uncertainty. Moreover, we design a spatially dependent mutual information module, which enhances the spatial dependence between neighboring voxels by maximizing the mutual information between the local voxel blocks predicted by the dual-teacher models and the student model, enabling consistency learning at the block level. On two benchmark medical image segmentation datasets, the Left Atrial Segmentation Challenge dataset and the BraTS-2019 dataset, our method achieves state-of-the-art performance both quantitatively and qualitatively.
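The uncertainty-based filtering step can be illustrated as follows: a voxel is kept for consistency training only when both teacher predictions agree on the class and are confident. The agreement-plus-threshold criterion and the value of `tau` are illustrative assumptions, not the paper's exact affine-consistency formulation.

```python
# Sketch of filtering high-uncertainty voxels using two teacher predictions.
# Each element of probs_t1 / probs_t2 is a per-voxel class-probability list.
def reliable_mask(probs_t1, probs_t2, tau=0.8):
    """1 where both teachers give the same class with probability >= tau."""
    mask = []
    for p1, p2 in zip(probs_t1, probs_t2):
        c1 = max(range(len(p1)), key=p1.__getitem__)
        c2 = max(range(len(p2)), key=p2.__getitem__)
        mask.append(1 if c1 == c2 and p1[c1] >= tau and p2[c2] >= tau else 0)
    return mask
```

The consistency loss is then applied only where the mask is 1, so low-confidence or disputed voxels do not propagate noise to the student.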
Affiliation(s)
- Di Gai
- School of Mathematics and Computer Science, Nanchang University, Nanchang, 330031, China; Institute of Metaverse, Nanchang University, Nanchang, 330031, China
- Zheng Huang
- School of Mathematics and Computer Science, Nanchang University, Nanchang, 330031, China
- Weidong Min
- School of Mathematics and Computer Science, Nanchang University, Nanchang, 330031, China; Institute of Metaverse, Nanchang University, Nanchang, 330031, China
- Yuhan Geng
- School of Public Health, University of Michigan, Ann Arbor, MI, 48109, USA
- Haifan Wu
- School of Mathematics and Computer Science, Nanchang University, Nanchang, 330031, China
- Meng Zhu
- School of Mathematics and Computer Science, Nanchang University, Nanchang, 330031, China
- Qi Wang
- School of Mathematics and Computer Science, Nanchang University, Nanchang, 330031, China; Institute of Metaverse, Nanchang University, Nanchang, 330031, China
8
Zhou C, Ye L, Peng H, Liu Z, Wang J, Ramírez-De-Arellano A. A Parallel Convolutional Network Based on Spiking Neural Systems. Int J Neural Syst 2024; 34:2450022. [PMID: 38487872 DOI: 10.1142/s0129065724500229]
Abstract
Deep convolutional neural networks have shown advanced performance in accurately segmenting images. In this paper, an SNP-like convolutional neuron structure is introduced, abstracted from the nonlinear mechanism in nonlinear spiking neural P (NSNP) systems. A U-shaped convolutional neural network named SNP-like parallel-convolutional network, or SPC-Net, is then constructed for segmentation tasks. Dual-convolution concatenate (DCC) and dual-convolution addition (DCA) network blocks are designed for the encoder and decoder stages, respectively. The two blocks employ parallel convolutions with different kernel sizes to improve feature representation ability and make full use of spatial detail information, and they use different feature fusion strategies to achieve feature complementarity and augmentation. Furthermore, a dual-scale pooling (DSP) module in the bottleneck is designed to improve feature extraction capability: it extracts multi-scale contextual information and reduces information loss while extracting salient features. SPC-Net is applied to medical image segmentation tasks and compared with several recent segmentation methods on the GlaS and CRAG datasets. The proposed SPC-Net achieves a 90.77% Dice coefficient, an 83.76% IoU score, an 83.93% F1 score, an 86.33% ObjDice coefficient, and a 135.60 Obj-Hausdorff distance. The experimental results show that the proposed model achieves good segmentation performance.
Affiliation(s)
- Chi Zhou
- School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China
- Lulin Ye
- School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China
- Hong Peng
- School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China
- Zhicai Liu
- School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China
- Jun Wang
- School of Electrical Engineering and Electronic Information, Xihua University, Chengdu 610039, P. R. China
- Antonio Ramírez-De-Arellano
- Research Group of Natural Computing, Department of Computer Science and Artificial Intelligence, University of Seville, Sevilla 41012, Spain
9
Chen C, Chen Y, Li X, Ning H, Xiao R. Linear semantic transformation for semi-supervised medical image segmentation. Comput Biol Med 2024; 173:108331. [PMID: 38522252 DOI: 10.1016/j.compbiomed.2024.108331]
Abstract
Medical image segmentation is a research focus and a foundation for developing intelligent medical systems. Recently, deep learning for medical image segmentation has become standard practice and achieved significant success, advancing disease diagnosis, reconstruction, and surgical planning. However, semantic learning is often inefficient owing to the lack of supervision of feature maps, so high-quality segmentation models still rely on numerous, accurate data annotations. Learning robust semantic representations in latent spaces remains a challenge. In this paper, we propose a novel semi-supervised learning framework that learns vital attributes in medical images, constructing generalized representations from diverse semantics to realize medical image segmentation. We first build a self-supervised learning part that achieves context recovery by reconstructing the space and intensity of medical images, providing semantic representations for feature maps. Subsequently, we combine the semantic-rich feature maps and apply a simple linear semantic transformation to convert them into image segmentation. The proposed framework was tested on five medical segmentation datasets. Quantitative assessments show that our method achieves the highest scores on the IXI (73.78%), ScaF (47.50%), COVID-19-Seg (50.72%), PC-Seg (65.06%), and Brain-MR (72.63%) datasets. Finally, we compared our method with the latest semi-supervised learning methods and obtained DSC values of 77.15% and 75.22%, ranking first on two representative datasets. The experimental results show not only that the proposed linear semantic transformation applies effectively to medical image segmentation, but also that its simplicity and ease of use support robust segmentation in semi-supervised learning. Our code is now open at: https://github.com/QingYunA/Linear-Semantic-Transformation-for-Semi-Supervised-Medical-Image-Segmentation.
Affiliation(s)
- Cheng Chen
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Yunqing Chen
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Xiaoheng Li
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Huansheng Ning
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Ruoxiu Xiao
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China; Shunde Innovation School, University of Science and Technology Beijing, Foshan, 100024, China
10
Huang S, Luo J, Ou Y, Shen W, Pang Y, Nie X, Zhang G. SD-Net: a semi-supervised double-cooperative network for liver segmentation from computed tomography (CT) images. J Cancer Res Clin Oncol 2024; 150:79. [PMID: 38316678 PMCID: PMC10844439 DOI: 10.1007/s00432-023-05564-7]
Abstract
INTRODUCTION The automatic segmentation of the liver is a crucial step in obtaining quantitative biomarkers for accurate clinical diagnosis and computer-aided decision support systems. This task is challenging due to the frequent presence of noise and sampling artifacts in computed tomography (CT) images, as well as the complex background, variable shape, and blurry boundaries of the liver. Standard segmentation of medical images based on fully supervised convolutional networks demands accurate dense annotations. Such a learning framework is built on laborious manual annotation with strict requirements for expertise, leading to a shortage of high-quality labels. METHODS To overcome this limitation and exploit massive weakly labeled data, we relaxed the rigid labeling requirement and developed a semi-supervised double-cooperative network (SD-Net). SD-Net is trained to segment the complete liver volume from preoperative abdominal CT images using limited labeled datasets and large-scale unlabeled datasets. Specifically, to enrich the diversity of unsupervised information, SD-Net consists of two collaborative network models. Within the supervised training module, we introduce an adaptive mask refinement approach: each of the two network models first predicts the labeled dataset, after which adaptive mask refinement of the differing predictions is applied to obtain more accurate liver segmentation results. In the unsupervised training module, a dynamic pseudo-label generation strategy is proposed: each of the two models predicts the unlabeled data, and the better prediction is taken as the pseudo-label for training. RESULTS AND DISCUSSION Based on the experimental findings, the proposed method achieves a Dice score exceeding 94%, indicating its high accuracy and suitability for everyday clinical use.
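The dynamic pseudo-label strategy, choosing the better of the two models' predictions, can be sketched as below. Using mean maximum softmax probability as the "better" criterion is an assumption for illustration; the paper's selection rule may differ.

```python
# Sketch: two co-trained models predict the unlabeled volume; the more
# confident prediction is converted to hard labels and used as the pseudo-label.
def confidence(probs):
    """Mean of the per-voxel maximum class probability."""
    return sum(max(p) for p in probs) / len(probs)

def pick_pseudo_label(probs_a, probs_b):
    """Choose argmax labels from the more confident of the two models."""
    probs = probs_a if confidence(probs_a) >= confidence(probs_b) else probs_b
    return [max(range(len(p)), key=p.__getitem__) for p in probs]
```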
Affiliation(s)
- Shixin Huang
- School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Department of Scientific Research, The People's Hospital of Yubei District of Chongqing city, Chongqing, 401120, China
- Jiawei Luo
- West China Biomedical Big Data Center, West China Hospital, Chengdu, 610044, China
- Yangning Ou
- School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, 201418, China
- Wangjun Shen
- Chongqing Human Resources Development Service Center, Chongqing, 400065, China
- Yu Pang
- School of Optoelectronic Engineering, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Xixi Nie
- College of Computer Science and Technology, The Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Guo Zhang
- School of Medical Information and Engineering, Southwest Medical University, Luzhou, 646000, China
11
Jiao R, Zhang Y, Ding L, Xue B, Zhang J, Cai R, Jin C. Learning with limited annotations: A survey on deep semi-supervised learning for medical image segmentation. Comput Biol Med 2024; 169:107840. [PMID: 38157773 DOI: 10.1016/j.compbiomed.2023.107840]
Abstract
Medical image segmentation is a fundamental and critical step in many image-guided clinical approaches. Recent success of deep learning-based segmentation methods usually relies on a large amount of labeled data, which is particularly difficult and costly to obtain, especially in the medical imaging domain where only experts can provide reliable and accurate annotations. Semi-supervised learning has emerged as an appealing strategy and been widely applied to medical image segmentation tasks to train deep models with limited annotations. In this paper, we present a comprehensive review of recently proposed semi-supervised learning methods for medical image segmentation and summarize both the technical novelties and empirical results. Furthermore, we analyze and discuss the limitations and several unsolved problems of existing approaches. We hope this review can inspire the research community to explore solutions to this challenge and further advance the field of medical image segmentation.
Affiliation(s)
- Rushi Jiao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; School of Engineering Medicine, Beihang University, Beijing, 100191, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Yichi Zhang
- School of Data Science, Fudan University, Shanghai, 200433, China; Artificial Intelligence Innovation and Incubation Institute, Fudan University, Shanghai, 200433, China
- Le Ding
- School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China
- Bingsen Xue
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Jicong Zhang
- School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China; Hefei Innovation Research Institute, Beihang University, Hefei, 230012, China
- Rong Cai
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beihang University, Beijing, 100191, China
- Cheng Jin
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China; Beijing Anding Hospital, Capital Medical University, Beijing, 100088, China
12
Qiu Z, Gan W, Yang Z, Zhou R, Gan H. Dual uncertainty-guided multi-model pseudo-label learning for semi-supervised medical image segmentation. Math Biosci Eng 2024; 21:2212-2232. [PMID: 38454680 DOI: 10.3934/mbe.2024097]
Abstract
Semi-supervised medical image segmentation is currently a highly researched area. Pseudo-label learning is a traditional semi-supervised learning method that acquires additional knowledge by generating pseudo-labels for unlabeled data. However, this method relies on the quality of the pseudo-labels and can lead to an unstable training process due to differences between samples. Additionally, generating pseudo-labels directly from the model itself accelerates noise accumulation, resulting in low-confidence pseudo-labels. To address these issues, we propose a dual uncertainty-guided multi-model pseudo-label learning framework (DUMM) for semi-supervised medical image segmentation. The framework consists of two main parts: a sample selection module based on sample-level uncertainty (SUS), intended to achieve a more stable and smooth training process, and a multi-model pseudo-label generation module based on pixel-level uncertainty (PUM), intended to obtain high-quality pseudo-labels. We conducted a series of experiments on two public medical datasets, ACDC2017 and ISIC2018. Compared to the baseline, we improved the Dice scores by 6.5% and 4.0% on the two datasets, respectively. Furthermore, our results showed a clear advantage over comparative methods, validating the feasibility and applicability of our approach.
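The two uncertainty levels can be illustrated with predictive entropy: per-pixel entropy flags unreliable pseudo-label pixels, and its per-sample mean ranks samples for stable training. The entropy criterion and the selection scheme are illustrative assumptions, not the paper's exact modules.

```python
# Sketch of pixel-level and sample-level uncertainty from class probabilities.
import math

def entropy(p):
    """Shannon entropy of one probability vector (0·log 0 treated as 0)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def pixel_uncertainty(probs):
    """Per-pixel predictive entropy for a list of probability vectors."""
    return [entropy(p) for p in probs]

def select_stable_samples(batch_probs, k):
    """Indices of the k samples with the lowest mean pixel uncertainty."""
    scores = [sum(pixel_uncertainty(s)) / len(s) for s in batch_probs]
    return sorted(range(len(scores)), key=scores.__getitem__)[:k]
```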
Affiliation(s)
- Zhanhong Qiu
- School of Computer Science, Hubei University of Technology, Wuhan 430068, China
- Weiyan Gan
- School of Computer Science, Hubei University of Technology, Wuhan 430068, China
- Zhi Yang
- School of Computer Science, Hubei University of Technology, Wuhan 430068, China
- Ran Zhou
- School of Computer Science, Hubei University of Technology, Wuhan 430068, China
- Haitao Gan
- School of Computer Science, Hubei University of Technology, Wuhan 430068, China
13
Chen C, Qi S, Zhou K, Lu T, Ning H, Xiao R. Pairwise attention-enhanced adversarial model for automatic bone segmentation in CT images. Phys Med Biol 2023; 68. [PMID: 36634367 DOI: 10.1088/1361-6560/acb2ab] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2022] [Accepted: 01/12/2023] [Indexed: 01/14/2023]
Abstract
Objective. Bone segmentation is a critical step in screw placement navigation. Although deep learning methods have driven rapid progress in bone segmentation, separating individual bones remains challenging due to their irregular shapes and similar representational features. Approach. In this paper, we proposed the pairwise attention-enhanced adversarial model (Pair-SegAM) for automatic bone segmentation in computed tomography images, which comprises two parts: a segmentation model and a discriminator. Because the predictions of the segmentation model carry complex semantics, we improve the discriminator to strengthen its awareness of the target region and its parsing of semantic features. The Pair-SegAM has a pairwise structure that uses two calculation mechanisms to construct pairwise attention maps, and then applies semantic fusion to filter out unstable regions. The improved discriminator therefore provides more refined information for capturing the bone outline, effectively enhancing the segmentation model. Main results. To evaluate the Pair-SegAM, we selected two bone datasets and compared our method against several bone segmentation models and recent adversarial models on both datasets. The experimental results show that our method not only achieves superior bone segmentation performance but also generalizes effectively. Significance. Our method provides more efficient segmentation of specific bones and has the potential to be extended to other semantic segmentation domains.
Affiliation(s)
- Cheng Chen
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, People's Republic of China
- Siyu Qi
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, People's Republic of China
- Kangneng Zhou
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, People's Republic of China
- Tong Lu
- Visual 3D Medical Science and Technology Development Co. Ltd, Beijing 100082, People's Republic of China
- Huansheng Ning
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, People's Republic of China
- Ruoxiu Xiao
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, People's Republic of China; Shunde Innovation School, University of Science and Technology Beijing, Foshan 100024, People's Republic of China
14
Mansour AE, Mohammed A, Elsayed HAEA, Elramly S. Spatial-Net for Human-Object Interaction Detection. IEEE ACCESS 2022; 10:88920-88931. [DOI: 10.1109/access.2022.3199380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/01/2023]
Affiliation(s)
- Ahmed E. Mansour
- Electronics and Electrical Communication Department, Faculty of Engineering, Ain Shams University, Cairo, Egypt
- Ammar Mohammed
- Computer Science Department, Faculty of Graduate Studies of Statistical Researches, Cairo University, Giza, Egypt
- Hussein Abd El Atty Elsayed
- Electronics and Electrical Communication Department, Faculty of Engineering, Ain Shams University, Cairo, Egypt
- Salwa Elramly
- Electronics and Electrical Communication Department, Faculty of Engineering, Ain Shams University, Cairo, Egypt