1. Guo S, Liu Z, Yang Z, Lee CH, Lv Q, Shen L. Multi-scale multi-object semi-supervised consistency learning for ultrasound image segmentation. Neural Netw 2025; 184:107095. [PMID: 39754842] [DOI: 10.1016/j.neunet.2024.107095]
Abstract
Manual annotation of ultrasound images relies on expert knowledge and requires significant time and financial resources. Semi-supervised learning (SSL) exploits large amounts of unlabeled data to improve model performance under limited labeled data. However, it faces two challenges: fusion of contextual information at multiple scales and bias of spatial information between multiple objects. We propose a consistency learning-based multi-scale multi-object (MSMO) semi-supervised framework for ultrasound image segmentation. MSMO addresses these challenges by employing a context-aware encoder coupled with a multi-object semantic calibration and fusion decoder. First, the encoder extracts multi-scale, multi-object context-aware features and introduces an attention module to refine the feature maps and enhance channel information interaction. Then, the decoder uses HConvLSTM to calibrate the output features of the current object with the hidden state of the previous object and recursively fuses multi-object semantics at different scales. Finally, MSMO further reduces variations among multiple decoders under different perturbations through consistency constraints, thereby producing consistent predictions for highly uncertain areas. Extensive experiments show that the proposed MSMO outperforms the SSL baseline on four benchmark datasets for both single-object and multi-object ultrasound image segmentation. MSMO significantly reduces the burden of manual analysis of ultrasound images and holds great potential as a clinical tool. The source code is publicly available at https://github.com/lol88/MSMO.
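As a rough illustration of the consistency constraint summarized above, the sketch below pulls each perturbed decoder's prediction toward their consensus; the function and tensor shapes are hypothetical assumptions and do not reproduce MSMO's exact loss.

```python
import torch
import torch.nn.functional as F

def multi_decoder_consistency(decoder_logits):
    """decoder_logits: list of (B, C, H, W) logit tensors from differently perturbed decoders."""
    probs = [F.softmax(logits, dim=1) for logits in decoder_logits]
    mean_prob = torch.stack(probs, dim=0).mean(dim=0)   # consensus prediction across decoders
    # Pull every decoder toward the consensus; disagreement is largest in uncertain regions,
    # so the penalty concentrates there, as consistency regularization intends.
    return sum(F.mse_loss(p, mean_prob) for p in probs) / len(probs)
```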
Affiliation(s)
- Saidi Guo
- School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450002, China; School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
- Zhaoshan Liu
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore 117575, Singapore
- Ziduo Yang
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore 117575, Singapore; School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong 518107, China
- Chau Hung Lee
- Department of Radiology, Tan Tock Seng Hospital, 11 Jalan Tan Tock Seng, Singapore 308433, Singapore
- Qiujie Lv
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
- Lei Shen
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore 117575, Singapore
2. Li N, Pan Y, Qiu W, Xiong L, Wang Y, Zhang Y. Constantly optimized mean teacher for semi-supervised 3D MRI image segmentation. Med Biol Eng Comput 2024; 62:2231-2245. [PMID: 38514501] [DOI: 10.1007/s11517-024-03061-8]
Abstract
The mean teacher model and its variants, as important methods in semi-supervised learning, have demonstrated promising performance in magnetic resonance imaging (MRI) data segmentation. However, the superior performance the teacher model gains through the exponential moving average (EMA) is limited by the unreliability of unlabeled images, resulting in potentially unreliable predictions. In this paper, we propose a framework that optimizes the teacher model with reliable expert-annotated data while preserving the advantages of EMA. To avoid the tight coupling that results from EMA, we leverage data augmentation to provide two distinct perspectives for the teacher and student models. The teacher model adopts weak data augmentation to provide supervision for the student model and optimizes itself with real annotations, while the student uses strong data augmentation to avoid overfitting to noisy information. In addition, a double softmax helps the model resist noise and continue learning meaningful information from the images, which is a key component of the proposed model. Extensive experiments show that the proposed method exhibits competitive performance on the Left Atrium segmentation MRI dataset (LA) and the Brain Tumor Segmentation MRI dataset (BraTS2019). On the LA dataset, we achieved a Dice score of 91.02% using only 20% labeled data, close to the 91.14% obtained by the supervised approach using 100% labeled data. On the BraTS2019 dataset, the proposed method achieved 1.02% and 1.92% improvements with 5% and 10% labeled data, respectively, compared to the best baseline method on this dataset. This study demonstrates that the proposed model is a potential candidate for medical image segmentation in semi-supervised learning scenarios.
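The sketch below illustrates the generic mean-teacher machinery the abstract builds on (EMA weight tracking, weak augmentation for the teacher, strong augmentation for the student). All names are hypothetical; the paper's additional teacher update on expert-annotated data and the double-softmax detail are not reproduced here.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, alpha=0.99):
    """Classic mean-teacher EMA: teacher weights track a moving average of the student."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(alpha).add_(s_p, alpha=1.0 - alpha)

def semi_supervised_step(teacher, student, x_lab, y_lab, x_unlab, weak_aug, strong_aug):
    # Teacher sees weakly augmented unlabeled data and supplies soft targets;
    # the student learns from strongly augmented views plus the labeled batch.
    with torch.no_grad():
        target = F.softmax(teacher(weak_aug(x_unlab)), dim=1)
    pred_u = F.softmax(student(strong_aug(x_unlab)), dim=1)
    loss_unsup = F.mse_loss(pred_u, target)
    loss_sup = F.cross_entropy(student(x_lab), y_lab)
    # In the paper the teacher is additionally optimized on expert-annotated data
    # (the "constantly optimized" part); that update is omitted in this sketch.
    return loss_sup + loss_unsup
```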
Affiliation(s)
- Ning Li
- School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, People's Republic of China
- Yudong Pan
- School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, People's Republic of China
- Wei Qiu
- School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, People's Republic of China
- Lianjin Xiong
- School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, People's Republic of China
- Yaobin Wang
- School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, People's Republic of China
- Yangsong Zhang
- School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, People's Republic of China
- NHC Key Laboratory of Nuclear Technology Medical Transformation (Mianyang Central Hospital), Mianyang, 621000, People's Republic of China
- Key Laboratory of Testing Technology for Manufacturing Process, Ministry of Education, Southwest University of Science and Technology, Mianyang, 621010, People's Republic of China
3. Zheng Z, Hayashi Y, Oda M, Kitasaka T, Mori K. Revisiting instrument segmentation: Learning from decentralized surgical sequences with various imperfect annotations. Healthc Technol Lett 2024; 11:146-156. [PMID: 38638500] [PMCID: PMC11022234] [DOI: 10.1049/htl2.12068]
Abstract
This paper focuses on a new and challenging problem related to instrument segmentation: learning a generalizable model from distributed datasets with various imperfect annotations. Collecting a large-scale dataset for centralized learning is usually impeded by data silos and privacy issues. Moreover, local clients, such as hospitals or medical institutes, may hold datasets with diverse and imperfect annotations. These datasets can include scarce annotations (many samples are unlabelled), noisy labels prone to errors, and scribble annotations with less precision. Federated learning (FL) has emerged as an attractive paradigm for developing global models from such locally distributed datasets. However, its potential in instrument segmentation has yet to be fully investigated. Moreover, the problem of learning from various imperfect annotations in an FL setup is rarely studied, even though it presents a more practical and beneficial scenario. This work rethinks instrument segmentation in such a setting and proposes a practical FL framework for this issue. Notably, this approach surpassed centralized learning under various imperfect annotation settings. The method establishes a foundational benchmark, and future work can build upon it by considering each client owning various annotations, aligning more closely with real-world complexities.
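The abstract does not spell out the aggregation rule, so the sketch below shows plain FedAvg-style weighted averaging of client models only as a baseline illustration of the FL setup; the paper's framework for handling imperfect annotations is more involved, and the function names here are assumptions.

```python
import copy

def fed_avg(client_states, client_sizes):
    """Weighted average of client state_dicts; weights proportional to local dataset size."""
    total = float(sum(client_sizes))
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        global_state[key] = sum(
            state[key].float() * (n / total)   # cast so integer buffers average cleanly
            for state, n in zip(client_states, client_sizes)
        )
    return global_state
```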
Affiliation(s)
- Zhou Zheng
- Graduate School of Informatics, Nagoya University, Chikusa-ku, Nagoya, Aichi, Japan
- Yuichiro Hayashi
- Graduate School of Informatics, Nagoya University, Chikusa-ku, Nagoya, Aichi, Japan
- Masahiro Oda
- Graduate School of Informatics, Nagoya University, Chikusa-ku, Nagoya, Aichi, Japan
- Information Strategy Office, Information and Communications, Nagoya University, Chikusa-ku, Nagoya, Aichi, Japan
- Takayuki Kitasaka
- School of Information Science, Aichi Institute of Technology, Yagusa-cho, Toyota, Aichi, Japan
- Kensaku Mori
- Graduate School of Informatics, Nagoya University, Chikusa-ku, Nagoya, Aichi, Japan
- Information Strategy Office, Information and Communications, Nagoya University, Chikusa-ku, Nagoya, Aichi, Japan
- Research Center for Medical Bigdata, National Institute of Informatics, Chiyoda-ku, Tokyo, Japan
4. Jiao R, Zhang Y, Ding L, Xue B, Zhang J, Cai R, Jin C. Learning with limited annotations: A survey on deep semi-supervised learning for medical image segmentation. Comput Biol Med 2024; 169:107840. [PMID: 38157773] [DOI: 10.1016/j.compbiomed.2023.107840]
Abstract
Medical image segmentation is a fundamental and critical step in many image-guided clinical approaches. The recent success of deep learning-based segmentation methods usually relies on a large amount of labeled data, which is particularly difficult and costly to obtain, especially in the medical imaging domain where only experts can provide reliable and accurate annotations. Semi-supervised learning has emerged as an appealing strategy and has been widely applied to medical image segmentation tasks to train deep models with limited annotations. In this paper, we present a comprehensive review of recently proposed semi-supervised learning methods for medical image segmentation and summarize both the technical novelties and empirical results. Furthermore, we analyze and discuss the limitations and several unsolved problems of existing approaches. We hope this review can inspire the research community to explore solutions to this challenge and further advance the field of medical image segmentation.
Affiliation(s)
- Rushi Jiao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; School of Engineering Medicine, Beihang University, Beijing, 100191, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China.
- Yichi Zhang
- School of Data Science, Fudan University, Shanghai, 200433, China; Artificial Intelligence Innovation and Incubation Institute, Fudan University, Shanghai, 200433, China.
- Le Ding
- School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China.
- Bingsen Xue
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China.
- Jicong Zhang
- School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China; Hefei Innovation Research Institute, Beihang University, Hefei, 230012, China.
- Rong Cai
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beihang University, Beijing, 100191, China.
- Cheng Jin
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China; Beijing Anding Hospital, Capital Medical University, Beijing, 100088, China.
5. Yang H, Tan T, Tegzes P, Dong X, Tamada R, Ferenczi L, Avinash G. Light mixed-supervised segmentation for 3D medical image data. Med Phys 2024; 51:167-178. [PMID: 37909833] [DOI: 10.1002/mp.16816]
Abstract
BACKGROUND: Accurate 3D semantic segmentation models are essential for many clinical applications. Training a model for 3D segmentation requires voxel-level annotation, which is expensive to obtain due to laborious work and privacy protection. To accurately annotate 3D medical data, such as MRI, a common practice is to contour the volumetric data slice by slice along the principal axes.
PURPOSE: To reduce the annotation effort per slice, weakly supervised learning with a bounding box (Bbox) was proposed to leverage discriminative information via a tightness prior assumption. Nevertheless, this approach requires accurate and tight Bboxes, and performance drops significantly when tightness does not hold, that is, when a relaxed Bbox is applied. There is therefore a need to train a stable model from relaxed Bbox annotations.
METHODS: This paper presents a mixed-supervised training strategy to reduce the annotation effort for 3D segmentation tasks. In the proposed approach, a fully annotated contour is required for only a single slice of the volume, while the remaining slices containing targets are annotated with relaxed Bboxes. This mixed-supervised method combines fully supervised learning, a relaxed Bbox prior, and contrastive learning during training, which ensures the network properly exploits the discriminative information of the training volumes. The proposed method was evaluated on two public 3D medical imaging datasets (an MRI prostate dataset and a Vestibular Schwannoma [VS] dataset).
RESULTS: The proposed method obtained a high segmentation Dice score of 85.3% on the MRI prostate dataset and 83.3% on the VS dataset with relaxed Bbox annotations, close to a fully supervised model. Moreover, with the same relaxed Bbox annotations, the proposed method outperforms state-of-the-art methods. More importantly, model performance remains stable when the accuracy of the Bbox annotation varies.
CONCLUSIONS: The presented study proposes a mixed-supervised learning method for 3D medical imaging. The benefit is stable segmentation of targets in 3D images with low annotation-accuracy requirements, which enables easier model training on large-scale datasets.
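As a hedged illustration of the mixed-supervised idea (full supervision on one slice, a relaxed box prior elsewhere), the sketch below penalizes foreground probability predicted outside the Bbox; the paper's actual tightness prior and contrastive terms are not reproduced, and all names and the weighting are assumptions.

```python
import torch.nn.functional as F

def mixed_supervised_loss(logits_full, mask_full, logits_box, box_mask, w_box=0.1):
    """logits_*: (B, C, H, W) network outputs; mask_full: (B, H, W) integer labels for the
    fully annotated slice; box_mask: (B, H, W), 1 inside the (relaxed) Bbox, 0 outside."""
    loss_sup = F.cross_entropy(logits_full, mask_full)          # full supervision on one slice
    fg_prob = F.softmax(logits_box, dim=1)[:, 1:].sum(dim=1)    # total foreground probability
    loss_box = (fg_prob * (1.0 - box_mask)).mean()              # suppress foreground outside box
    return loss_sup + w_box * loss_box
```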
Affiliation(s)
- Tao Tan
- GE Healthcare, Eindhoven, The Netherlands
6. Masoumi N, Rivaz H, Hacihaliloglu I, Ahmad MO, Reinertsen I, Xiao Y. The Big Bang of Deep Learning in Ultrasound-Guided Surgery: A Review. IEEE Trans Ultrason Ferroelectr Freq Control 2023; 70:909-919. [PMID: 37028313] [DOI: 10.1109/tuffc.2023.3255843]
Abstract
Ultrasound (US) imaging is a paramount modality in many image-guided surgeries and percutaneous interventions, thanks to its high portability, temporal resolution, and cost-efficiency. However, due to its imaging principles, US is often noisy and difficult to interpret. Appropriate image processing can greatly enhance the applicability of the imaging modality in clinical practice. Compared with classic iterative optimization and machine learning (ML) approaches, deep learning (DL) algorithms have shown great performance in terms of accuracy and efficiency for US processing. In this work, we conduct a comprehensive review of DL algorithms in applications of US-guided interventions, summarize the current trends, and suggest future directions on the topic.
7. Zou W, Qi X, Zhou W, Sun M, Sun Z, Shan C. Graph Flow: Cross-Layer Graph Flow Distillation for Dual Efficient Medical Image Segmentation. IEEE Trans Med Imaging 2023; 42:1159-1171. [PMID: 36423314] [DOI: 10.1109/tmi.2022.3224459]
Abstract
With the development of deep convolutional neural networks, medical image segmentation has achieved a series of breakthroughs in recent years. However, high-performance convolutional neural networks typically entail numerous parameters and high computation costs, which hinders their application in resource-limited medical scenarios. Meanwhile, the scarcity of large-scale annotated medical image datasets further impedes the deployment of high-performance networks. To tackle these problems, we propose Graph Flow, a comprehensive knowledge distillation framework for both network-efficient and annotation-efficient medical image segmentation. Specifically, Graph Flow Distillation transfers the essence of cross-layer variations from a well-trained cumbersome teacher network to an untrained compact student network. In addition, an unsupervised Paraphraser Module is integrated to purify the knowledge of the teacher, which also benefits training stabilization. Furthermore, we build a unified distillation framework by integrating adversarial distillation and vanilla logits distillation, which can further refine the final predictions of the compact network. With different teacher networks (traditional convolutional architectures or prevalent transformer architectures) and student networks, we conduct extensive experiments on four medical image datasets with different modalities (Gastric Cancer, Synapse, BUSI, and CVC-ClinicDB). We demonstrate the prominent ability of our method on these datasets, achieving competitive performance. Moreover, we demonstrate the effectiveness of Graph Flow through a novel semi-supervised paradigm for dual efficient medical image segmentation. Our code will be available at Graph Flow.
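The sketch below shows only the vanilla logits-distillation component the abstract mentions (temperature-scaled KL divergence between teacher and student predictions); the Graph Flow and adversarial terms of the full framework are not shown, and the temperature value is an assumption.

```python
import torch.nn.functional as F

def logits_distillation(student_logits, teacher_logits, T=2.0):
    """Temperature-scaled KL divergence between teacher and student segmentation logits."""
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    # Standard T^2 scaling keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```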
8. Yang H, Shan C, Kolen AF, de With PHN. Medical instrument detection in ultrasound: a review. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10287-1]
Abstract
Medical instrument detection is essential for computer-assisted interventions, since it helps clinicians find instruments efficiently and interpret images more reliably, thereby improving clinical outcomes. This article reviews image-based medical instrument detection methods for ultrasound-guided (US-guided) operations. Literature is selected based on an exhaustive search of different sources, including Google Scholar, PubMed, and Scopus. We first discuss the key clinical applications of medical instrument detection in US, including delivering regional anesthesia, biopsy taking, prostate brachytherapy, and catheterization. Then, we present a comprehensive review of instrument detection methodologies, including non-machine-learning and machine-learning methods. Conventional non-machine-learning methods were studied extensively before the era of machine learning. The principal issues and potential research directions for future studies are summarized for the computer-assisted intervention community. In conclusion, although promising results have been obtained by current machine-learning and non-machine-learning methods for different clinical applications, thorough clinical validation is still required.
9. Han K, Liu L, Song Y, Liu Y, Qiu C, Tang Y, Teng Q, Liu Z. An Effective Semi-supervised Approach for Liver CT Image Segmentation. IEEE J Biomed Health Inform 2022; 26:3999-4007. [PMID: 35420991] [DOI: 10.1109/jbhi.2022.3167384]
Abstract
Despite the substantial progress made by deep networks in the field of medical image segmentation, they generally require sufficient pixel-level annotated data for training. The scale of training data remains the main bottleneck in obtaining a better deep segmentation model. Semi-supervised learning is an effective approach that alleviates the dependence on labeled data. However, most existing semi-supervised image segmentation methods do not generate high-quality pseudo labels to expand the training dataset. In this paper, we propose a deep semi-supervised approach for liver CT image segmentation by expanding the pseudo-labeling algorithm under the very-low-annotated-data paradigm. Specifically, the output features of labeled images from the pretrained network are combined with the corresponding pixel-level annotations to produce class representations via a mean operation. Pseudo labels for unlabeled images are then generated by calculating the distances between unlabeled feature vectors and each class representation. To further improve the quality of the pseudo labels, we adopt a series of operations to optimize them. A more accurate segmentation network is obtained by expanding the training dataset and adjusting the relative contributions of the supervised and unsupervised losses. In addition, a novel random-patch strategy based on prior locations is introduced for unlabeled images during training. Extensive experiments show that our method achieves more competitive results than other semi-supervised methods when fewer labeled slices of the LiTS dataset are available.
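Following the pipeline the abstract describes, the sketch below forms class representations as mean features of labeled pixels and assigns each unlabeled pixel to its nearest representation; the distance metric and the subsequent pseudo-label refinement steps are assumptions, and the names are hypothetical.

```python
import torch

def prototype_pseudo_labels(feat_lab, mask_lab, feat_unlab, num_classes):
    """feat_lab: (N, D) features of labeled pixels; mask_lab: (N,) integer labels;
    feat_unlab: (M, D) features of unlabeled pixels. Assumes every class occurs in mask_lab."""
    prototypes = torch.stack([
        feat_lab[mask_lab == c].mean(dim=0) for c in range(num_classes)
    ])                                              # (C, D) class representations (mean features)
    dists = torch.cdist(feat_unlab, prototypes)     # (M, C) distances to each class representation
    return dists.argmin(dim=1)                      # nearest class representation = pseudo label
```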