1. Triple-task mutual consistency for semi-supervised 3D medical image segmentation. Comput Biol Med 2024; 175:108506. [PMID: 38688127] [DOI: 10.1016/j.compbiomed.2024.108506] [Received: 11/29/2023] [Revised: 03/18/2024] [Accepted: 04/21/2024] [Indexed: 05/02/2024]
Abstract
Semi-supervised deep learning algorithms are an effective means of medical image segmentation. Among these methods, multi-task learning with consistency regularization has achieved outstanding results. However, most existing methods simply embed the Signed Distance Map (SDM) task into the network, which underestimates the potential of SDM for edge awareness and leads to excessive dependence between tasks. In this work, we propose a novel triple-task mutual consistency (TTMC) framework to enhance shape and edge awareness capabilities and overcome the task-dependence problem underestimated in previous work. Specifically, we construct the Signed Attention Map (SAM), a novel fusion image with an attention mechanism, and use it as an auxiliary task for segmentation to enhance edge awareness. We then implement a triple-task deep network that jointly predicts the voxel-wise classification map, the Signed Distance Map, and the Signed Attention Map. In our framework, an optimized differentiable transformation layer associates the SDM with the voxel-wise classification map and SAM prediction, while task-level consistency regularization utilizes unlabeled data in an unsupervised manner. Evaluated on the public Left Atrium dataset and the NIH Pancreas dataset, the proposed framework achieves significant performance gains by effectively utilizing unlabeled data, outperforming recent state-of-the-art semi-supervised segmentation methods. Code is available at https://github.com/Saocent/TTMC.
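For readers unfamiliar with the Signed Distance Map used as an auxiliary target in entries like this one, here is a minimal brute-force sketch (pure Python, illustrative only; it is not the authors' implementation, and sign conventions vary between papers — negative inside the object is assumed here):

```python
import math

def signed_distance_map(mask):
    """Brute-force signed distance map for a small 2D binary mask.

    Convention (one of several in the literature): negative inside the
    object, positive outside, zero on the boundary.
    """
    h, w = len(mask), len(mask[0])
    # Boundary cells: foreground cells with at least one background
    # (or out-of-image) 4-neighbour.
    boundary = []
    for i in range(h):
        for j in range(w):
            if mask[i][j]:
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if not (0 <= ni < h and 0 <= nj < w) or not mask[ni][nj]:
                        boundary.append((i, j))
                        break
    # Signed Euclidean distance to the nearest boundary cell.
    sdm = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            d = min(math.hypot(i - bi, j - bj) for bi, bj in boundary)
            sdm[i][j] = -d if mask[i][j] else d
    return sdm

mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
sdm = signed_distance_map(mask)
```

In practice, frameworks compute this with a distance transform (e.g. `scipy.ndimage.distance_transform_edt`) and regress the SDM alongside the classification map; the brute-force loop above is only meant to make the target itself concrete.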
2. Expectation maximisation pseudo labels. Med Image Anal 2024; 94:103125. [PMID: 38428272] [DOI: 10.1016/j.media.2024.103125] [Received: 03/04/2023] [Revised: 02/18/2024] [Accepted: 02/26/2024] [Indexed: 03/03/2024]
Abstract
In this paper, we study pseudo-labelling. Pseudo-labelling employs raw inferences on unlabelled data as pseudo-labels for self-training. We elucidate the empirical successes of pseudo-labelling by establishing a link between this technique and the Expectation Maximisation algorithm. Through this, we realise that the original pseudo-labelling serves as an empirical estimation of its more comprehensive underlying formulation. Following this insight, we present a full generalisation of pseudo-labels under Bayes' theorem, termed Bayesian Pseudo Labels. Subsequently, we introduce a variational approach to generate these Bayesian Pseudo Labels, involving the learning of a threshold to automatically select high-quality pseudo labels. In the remainder of the paper, we showcase the applications of pseudo-labelling and its generalised form, Bayesian Pseudo-Labelling, in the semi-supervised segmentation of medical images. Specifically, we focus on: (1) 3D binary segmentation of lung vessels from CT volumes; (2) 2D multi-class segmentation of brain tumours from MRI volumes; (3) 3D binary segmentation of whole brain tumours from MRI volumes; and (4) 3D binary segmentation of prostate from MRI volumes. We further demonstrate that pseudo-labels can enhance the robustness of the learned representations. The code is released in the following GitHub repository: https://github.com/moucheng2017/EMSSL.
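The paper above learns a selection threshold variationally; as a point of reference, plain pseudo-labelling with a fixed confidence threshold (the empirical baseline it generalizes) can be sketched as follows for the binary case. The fixed `tau` is an assumption for illustration:

```python
def select_pseudo_labels(probs, tau=0.9):
    """Keep only predictions whose confidence max(p, 1-p) reaches tau.

    probs: per-voxel foreground probabilities from raw inference on
    unlabelled data. Returns (index, hard_label) pairs for the
    confident voxels only; the rest are left out of self-training.
    """
    selected = []
    for i, p in enumerate(probs):
        conf = max(p, 1.0 - p)
        if conf >= tau:
            selected.append((i, 1 if p >= 0.5 else 0))
    return selected

pairs = select_pseudo_labels([0.97, 0.55, 0.02, 0.80], tau=0.9)
# → [(0, 1), (2, 0)]: only the two confident voxels are kept
```

Bayesian Pseudo Labels replace the hand-picked `tau` with a learned threshold; the selection structure is the same.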
3. The student-teacher framework guided by self-training and consistency regularization for semi-supervised medical image segmentation. PLoS One 2024; 19:e0300039. [PMID: 38648206] [PMCID: PMC11034649] [DOI: 10.1371/journal.pone.0300039] [Received: 12/01/2023] [Accepted: 02/20/2024] [Indexed: 04/25/2024]
Abstract
Due to the high suitability of semi-supervised learning for medical image segmentation, a great deal of valuable research has been conducted and has achieved noteworthy success in this field. However, many approaches confine their focus to a single semi-supervised framework, overlooking the potential gains in segmentation performance offered by integrating several frameworks. In this paper, we propose a novel semi-supervised framework named Pseudo-Label Mean Teacher (PLMT), which synergizes the self-training pipeline with pseudo-labeling and consistency regularization techniques. In particular, we integrate the student-teacher structure with a consistency loss into the self-training pipeline to facilitate a mutually beneficial enhancement between the two methods. This structure not only generates remarkably accurate pseudo-labels for the self-training pipeline but also furnishes additional pseudo-label supervision for the student-teacher framework. Moreover, to explore the impact of different semi-supervised losses on the segmentation performance of the PLMT framework, we introduce adaptive loss weights, allowing PLMT to dynamically adjust the weights of different semi-supervised losses during training. Extensive experiments on three public datasets demonstrate that our framework achieves the best performance, outperforming five other semi-supervised methods. PLMT is an initial exploration of a framework that melds the self-training pipeline with consistency regularization and offers a comparatively novel perspective on semi-supervised image segmentation.
4. Multi-granularity learning of explicit geometric constraint and contrast for label-efficient medical image segmentation and differentiable clinical function assessment. Med Image Anal 2024; 95:103183. [PMID: 38692098] [DOI: 10.1016/j.media.2024.103183] [Received: 08/04/2023] [Revised: 01/26/2024] [Accepted: 04/18/2024] [Indexed: 05/03/2024]
Abstract
Automated segmentation is a challenging task in medical image analysis that usually requires a large amount of manually labeled data. However, most current supervised learning based algorithms suffer from insufficient manual annotations, posing a significant difficulty for accurate and robust segmentation. In addition, most current semi-supervised methods lack explicit representations of geometric structure and semantic information, restricting segmentation accuracy. In this work, we propose a hybrid framework to learn polygon vertices, region masks, and their boundaries in a weakly/semi-supervised manner that significantly advances geometric and semantic representations. Firstly, we propose multi-granularity learning of explicit geometric structure constraints via polygon vertices (PolyV) and pixel-wise region (PixelR) segmentation masks in a semi-supervised manner. Secondly, we propose eliminating boundary ambiguity by using an explicit contrastive objective to learn a discriminative feature space of boundary contours at the pixel level with limited annotations. Thirdly, we exploit the task-specific clinical domain knowledge to differentiate the clinical function assessment end-to-end. The ground truth of clinical function assessment, on the other hand, can serve as auxiliary weak supervision for PolyV and PixelR learning. We evaluate the proposed framework on two tasks, including optic disc (OD) and cup (OC) segmentation along with vertical cup-to-disc ratio (vCDR) estimation in fundus images; left ventricle (LV) segmentation at end-diastolic and end-systolic frames along with ejection fraction (LVEF) estimation in two-dimensional echocardiography images. Experiments on nine large-scale datasets of the two tasks under different label settings demonstrate our model's superior performance on segmentation and clinical function assessment.
5. Automated Diagnosis of Major Depressive Disorder With Multi-Modal MRIs Based on Contrastive Learning: A Few-Shot Study. IEEE Trans Neural Syst Rehabil Eng 2024; 32:1566-1576. [PMID: 38512734] [DOI: 10.1109/tnsre.2024.3380357] [Indexed: 03/23/2024]
Abstract
Depression ranks among the most prevalent mood-related psychiatric disorders. Existing clinical diagnostic approaches relying on scale interviews are susceptible to individual and environmental variations. In contrast, the integration of neuroimaging techniques and computer science has provided compelling evidence for the quantitative assessment of major depressive disorder (MDD). However, one of the major challenges in computer-aided diagnosis of MDD is to automatically and effectively mine the complementary cross-modal information from limited datasets. In this study, we proposed a few-shot learning framework that integrates multi-modal MRI data based on contrastive learning. In the upstream task, it is designed to extract knowledge from heterogeneous data. Subsequently, the downstream task is dedicated to transferring the acquired knowledge to the target dataset, where a hierarchical fusion paradigm is also designed to integrate features across inter- and intra-modalities. Lastly, the proposed model was evaluated on a set of multi-modal clinical data, achieving average scores of 73.52% and 73.09% for accuracy and AUC, respectively. Our findings also reveal that the brain regions within the default mode network and cerebellum play a crucial role in the diagnosis, which provides further direction in exploring reproducible biomarkers for MDD diagnosis.
6. US2Mask: Image-to-mask generation learning via a conditional GAN for cardiac ultrasound image segmentation. Comput Biol Med 2024; 172:108282. [PMID: 38503085] [DOI: 10.1016/j.compbiomed.2024.108282] [Received: 02/05/2024] [Revised: 02/29/2024] [Accepted: 03/12/2024] [Indexed: 03/21/2024]
Abstract
Cardiac ultrasound (US) image segmentation is vital for evaluating clinical indices, but it often demands a large dataset and expert annotations, resulting in high costs for deep learning algorithms. To address this, our study presents a framework utilizing artificial intelligence generation technology to produce multi-class RGB masks for cardiac US image segmentation. The proposed approach directly performs semantic segmentation of the heart's main structures in US images from various scanning modes. Additionally, we introduce a novel learning approach based on conditional generative adversarial networks (CGAN) for cardiac US image segmentation, incorporating a conditional input and paired RGB masks. Experimental results from three cardiac US image datasets with diverse scan modes demonstrate that our approach outperforms several state-of-the-art models, showcasing improvements in five commonly used segmentation metrics, with lower noise sensitivity. Source code is available at https://github.com/energy588/US2mask.
7. SC-SSL: Self-Correcting Collaborative and Contrastive Co-Training Model for Semi-Supervised Medical Image Segmentation. IEEE Trans Med Imaging 2024; 43:1347-1364. [PMID: 37995173] [DOI: 10.1109/tmi.2023.3336534] [Indexed: 11/25/2023]
Abstract
Image segmentation achieves significant improvements with deep neural networks on the premise of a large amount of labeled training data, which is laborious to obtain for medical image tasks. Recently, semi-supervised learning (SSL) has shown great potential in medical image segmentation. However, these SSL methods usually neglect the quality of the learning target for unlabeled data. Therefore, this study proposes a novel self-correcting co-training scheme to learn a better target, one more similar to the ground-truth labels, from collaborative network outputs. Our work has three-fold highlights. First, we advance learning-target generation as a learning task, improving the learning confidence for unannotated data with a self-correcting module. Second, we impose a structure constraint to further encourage shape similarity between the improved learning target and the collaborative network outputs. Finally, we propose an innovative pixel-wise contrastive learning loss to boost the representation capacity under the guidance of the improved learning target, thus exploring unlabeled data more efficiently with awareness of semantic context. We have extensively evaluated our method against state-of-the-art semi-supervised approaches on four publicly available datasets: the ACDC dataset, M&Ms dataset, Pancreas-CT dataset, and Task_07 CT dataset. Experimental results at different labeled-data ratios show our method's superiority over existing methods, demonstrating its effectiveness for semi-supervised medical image segmentation.
8. Semi-Supervised Medical Image Segmentation Using Cross-Style Consistency With Shape-Aware and Local Context Constraints. IEEE Trans Med Imaging 2024; 43:1449-1461. [PMID: 38032771] [DOI: 10.1109/tmi.2023.3338269] [Indexed: 12/02/2023]
Abstract
Despite remarkable progress in deep learning-based semi-supervised medical image segmentation, application to real-life clinical scenarios still faces considerable challenges. For example, insufficient labeled data often makes it difficult for networks to capture the complexity and variability of the anatomical regions to be segmented. To address these problems, we design a new semi-supervised segmentation framework that aspires to produce anatomically plausible predictions. Our framework comprises two parallel networks: a shape-agnostic network and a shape-aware network. These networks learn from each other, enabling effective utilization of unlabeled data. The shape-aware network implicitly introduces shape guidance to capture fine-grained shape information, while the shape-agnostic network employs uncertainty estimation to obtain reliable pseudo-labels for its counterpart. We also employ a cross-style consistency strategy to enhance the networks' utilization of unlabeled data; it enriches the dataset to prevent overfitting and further eases the coupling of the two mutually learning networks. Our architecture also incorporates a novel loss term that facilitates learning the local context of segmentation, thereby enhancing overall prediction accuracy. Experiments on three different medical image datasets show that our method outperforms many strong semi-supervised segmentation methods, particularly in perceiving shape. Code is available at https://github.com/igip-liu/SLC-Net.
9. Knowledge distillation on individual vertebrae segmentation exploiting 3D U-Net. Comput Med Imaging Graph 2024; 113:102350. [PMID: 38340574] [DOI: 10.1016/j.compmedimag.2024.102350] [Received: 09/26/2023] [Revised: 02/01/2024] [Accepted: 02/01/2024] [Indexed: 02/12/2024]
Abstract
Recent advances in medical imaging have highlighted the critical development of algorithms for individual vertebral segmentation on computed tomography (CT) scans. Essential for diagnostic accuracy and treatment planning in orthopaedics, neurosurgery and oncology, these algorithms face challenges in clinical implementation, including integration into healthcare systems. Consequently, our focus lies in exploring the application of knowledge distillation (KD) methods to train shallower networks capable of efficiently segmenting vertebrae in CT scans. This approach aims to reduce segmentation time, enhance suitability for emergency cases, and optimize computational and memory resource efficiency. Building upon prior research in the field, a two-step segmentation approach was employed. Firstly, the spine's location was determined by predicting a heatmap, indicating the probability of each voxel belonging to the spine. Subsequently, an iterative segmentation of vertebrae was performed from the top to the bottom of the CT volume over the located spine, using a memory instance to record the already segmented vertebrae. KD methods were implemented by training a teacher network with performance similar to that found in the literature, and this knowledge was distilled to a shallower network (student). Two KD methods were applied: (1) using the soft outputs of both networks and (2) matching logits. Two publicly available datasets, comprising 319 CT scans from 300 patients and a total of 611 cervical, 2387 thoracic, and 1507 lumbar vertebrae, were used. To ensure dataset balance and robustness, effective data augmentation methods were applied, including cleaning the memory instance to replicate the first vertebra segmentation. The teacher network achieved an average Dice similarity coefficient (DSC) of 88.22% and a Hausdorff distance (HD) of 7.71 mm, showcasing performance similar to other approaches in the literature. 
Through knowledge distillation from the teacher network, the student network's performance improved, with an average DSC increasing from 75.78% to 84.70% and an HD decreasing from 15.17 mm to 8.08 mm. Compared to other methods, our teacher network exhibited up to 99.09% fewer parameters, 90.02% faster inference time, 88.46% shorter total segmentation time, and 89.36% less associated carbon (CO2) emission rate. Regarding our student network, it featured 75.00% fewer parameters than our teacher, resulting in a 36.15% reduction in inference time, a 33.33% decrease in total segmentation time, and a 42.96% reduction in CO2 emissions. This study marks the first exploration of applying KD to the problem of individual vertebrae segmentation in CT, demonstrating the feasibility of achieving comparable performance to existing methods using smaller neural networks.
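The two distillation methods named above — (1) soft outputs of both networks and (2) logit matching — have standard textbook forms, which can be sketched as follows. This is a generic sketch, not the paper's code; the temperature `T=2.0` is an assumed value for illustration:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_target_loss(student_logits, teacher_logits, T=2.0):
    """Method (1): KL divergence between temperature-softened teacher
    and student distributions, scaled by T^2 as is conventional."""
    p = softmax(teacher_logits, T)  # soft teacher targets
    q = softmax(student_logits, T)  # student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return T * T * kl

def logit_matching_loss(student_logits, teacher_logits):
    """Method (2): mean squared error directly on the raw logits."""
    n = len(student_logits)
    return sum((s - t) ** 2 for s, t in zip(student_logits, teacher_logits)) / n

loss_same = soft_target_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])  # ≈ 0 when student matches teacher
mse = logit_matching_loss([0.0, 0.0], [1.0, 1.0])               # = 1.0
```

During student training, either loss is typically combined with the ordinary supervised segmentation loss on labeled voxels.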
10. Constantly optimized mean teacher for semi-supervised 3D MRI image segmentation. Med Biol Eng Comput 2024:10.1007/s11517-024-03061-8. [PMID: 38514501] [DOI: 10.1007/s11517-024-03061-8] [Received: 09/10/2023] [Accepted: 02/23/2024] [Indexed: 03/23/2024]
Abstract
The mean teacher model and its variants, as important methods in semi-supervised learning, have demonstrated promising performance in magnetic resonance imaging (MRI) data segmentation. However, the superior performance the teacher model gains through the exponential moving average (EMA) is limited by the unreliability of unlabeled images, resulting in potentially unreliable predictions. In this paper, we propose a framework to optimize the teacher model with reliable expert-annotated data while preserving the advantages of EMA. To avoid the tight coupling that results from EMA, we leverage data augmentations to provide two distinct perspectives for the teacher and student models. The teacher model adopts weak data augmentation to provide supervision for the student model and optimizes itself with real annotations, while the student uses strong data augmentation to avoid overfitting on noisy information. In addition, double softmax, a key component of the proposed model, helps the model resist noise and continue learning meaningful information from the images. Extensive experiments show that the proposed method exhibits competitive performance on the Left Atrium segmentation MRI dataset (LA) and the Brain Tumor Segmentation MRI dataset (BraTS2019). For the LA dataset, we achieved a Dice score of 91.02% using only 20% labeled data, close to the 91.14% obtained by the supervised approach using 100% labeled data. For the BraTS2019 dataset, the proposed method achieved improvements of 1.02% and 1.92% on 5% and 10% labeled data, respectively, compared to the best baseline method on this dataset. This study demonstrates that the proposed model is a potential candidate for medical image segmentation in semi-supervised learning scenarios.
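The EMA update this entry builds on (and which the mean-teacher entries above also rely on) is a one-liner per parameter. A minimal sketch of just the standard EMA component — the paper's additional optimization of the teacher with real annotations is not shown, and `alpha` is an assumed smoothing value:

```python
def ema_update(teacher_params, student_params, alpha=0.99):
    """Classic mean-teacher update: teacher <- alpha*teacher + (1-alpha)*student.

    Parameters are flat lists of floats here for clarity; in a real
    framework the same rule is applied tensor-by-tensor after each
    student optimizer step.
    """
    return [alpha * t + (1 - alpha) * s
            for t, s in zip(teacher_params, student_params)]

teacher = [0.0, 1.0]
student = [1.0, 1.0]
teacher = ema_update(teacher, student, alpha=0.9)
# teacher ≈ [0.1, 1.0]: it moves a small step toward the student
```

Large `alpha` keeps the teacher a slowly varying ensemble of past students, which is exactly the source of the tight coupling the paper tries to relax.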
11. MDT: semi-supervised medical image segmentation with mixup-decoupling training. Phys Med Biol 2024; 69:065012. [PMID: 38324897] [DOI: 10.1088/1361-6560/ad2715] [Received: 11/07/2023] [Accepted: 02/07/2024] [Indexed: 02/09/2024]
Abstract
Objective. In the field of medicine, semi-supervised segmentation algorithms hold crucial research significance while also facing substantial challenges, primarily due to the extreme scarcity of expert-level annotated medical image data. However, many existing semi-supervised methods still process labeled and unlabeled data in inconsistent ways, which can cause knowledge learned from labeled data to be partially discarded. This not only lacks a variety of perturbations to explore potentially robust information in unlabeled data but also ignores the confirmation bias and class imbalance issues of pseudo-labeling methods. Approach. To solve these problems, this paper proposes a semi-supervised medical image segmentation method, mixup-decoupling training (MDT), that combines the ideas of consistency and pseudo-labeling. First, MDT introduces a new perturbation strategy, mixup-decoupling, to fully regularize training data. It not only mixes labeled and unlabeled data at the data level but also performs decoupling operations between the output predictions of mixed target data and labeled data at the feature level to obtain strong-version predictions of unlabeled data. It then establishes a dual learning paradigm based on consistency and pseudo-labeling. Second, MDT employs a novel categorical entropy filtering approach to pick high-confidence pseudo-labels for unlabeled data, facilitating more refined supervision. Main results. This paper compares MDT with other advanced semi-supervised methods on 2D and 3D datasets separately. Extensive experimental results show that MDT achieves competitive segmentation performance and outperforms other state-of-the-art semi-supervised segmentation methods. Significance. MDT greatly reduces the demand for manually labeled data and eases the difficulty of data annotation. It not only outperforms many advanced semi-supervised image segmentation methods in quantitative and qualitative experiments but also provides a new, extensible idea for semi-supervised learning and computer-aided diagnosis research.
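The data-level mixing step described in this entry is ordinary mixup. A minimal sketch of just that step (the feature-level decoupling of the mixed predictions is specific to the paper and not reproduced; `lam` is an assumed mixing coefficient, normally drawn from a Beta distribution):

```python
def mixup(x_labeled, x_unlabeled, lam=0.7):
    """Convex combination of a labeled and an unlabeled sample.

    Images are flat lists of floats here; in practice the same
    element-wise blend is applied to whole tensors, and lam is
    sampled per batch from Beta(a, a).
    """
    return [lam * a + (1 - lam) * b for a, b in zip(x_labeled, x_unlabeled)]

mixed = mixup([1.0, 0.0], [0.0, 1.0], lam=0.7)
# mixed ≈ [0.7, 0.3]
```

The mixed image serves as a strong perturbation of the unlabeled input, against which consistency with the unmixed prediction can be enforced.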
12. Semi-supervised segmentation of orbit in CT images with paired copy-paste strategy. Comput Biol Med 2024; 171:108176. [PMID: 38401453] [DOI: 10.1016/j.compbiomed.2024.108176] [Received: 11/29/2023] [Revised: 02/06/2024] [Accepted: 02/18/2024] [Indexed: 02/26/2024]
Abstract
The segmentation of the orbit in computed tomography (CT) images plays a crucial role in the quantitative analysis of orbital decompression surgery for patients with Thyroid-associated Ophthalmopathy (TAO). However, orbit segmentation, particularly in postoperative images, remains challenging due to significant shape variation and a limited amount of labeled data. In this paper, we present a two-stage semi-supervised framework for automatic segmentation of the orbit in both preoperative and postoperative images, consisting of a pseudo-label generation stage and a semi-supervised segmentation stage. A Paired Copy-Paste strategy is introduced to fuse features extracted from both preoperative and postoperative images, augmenting the network's ability to discern changes in orbital boundaries. More specifically, we employ random cropping to transfer regions from labeled preoperative images (foreground) onto unlabeled postoperative images (background), as well as from unlabeled preoperative images (foreground) onto labeled postoperative images (background); note that each pair of preoperative and postoperative images comes from the same patient. The semi-supervised segmentation network (stage 2) processes the two mixed images using a combination of supervisory signals from pseudo labels (stage 1) and ground truth. Training and testing were conducted on a CT dataset from the Eye Hospital of Wenzhou Medical University. Experimental results demonstrate that the proposed method achieves a mean Dice similarity coefficient (DSC) of 91.92% with only 5% labeled data, surpassing the current state-of-the-art method by 2.4%.
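The random-crop transfer at the heart of the copy-paste strategy can be sketched in a few lines. This is an illustrative simplification (2D lists, a fixed crop size, no label mixing), not the paper's implementation:

```python
import random

def paired_copy_paste(foreground, background, crop, seed=None):
    """Copy a randomly located crop-sized region from `foreground`
    onto the same location in `background`.

    Both images must have the same shape; in the paired setting they
    would be preoperative/postoperative scans of the same patient,
    so the pasted region stays anatomically aligned.
    """
    if seed is not None:
        random.seed(seed)
    h, w = len(background), len(background[0])
    ch, cw = crop
    top = random.randint(0, h - ch)
    left = random.randint(0, w - cw)
    out = [row[:] for row in background]  # copy, leave input intact
    for i in range(ch):
        for j in range(cw):
            out[top + i][left + j] = foreground[top + i][left + j]
    return out

fg = [[1.0] * 4 for _ in range(4)]
bg = [[0.0] * 4 for _ in range(4)]
out = paired_copy_paste(fg, bg, crop=(2, 2), seed=0)
# exactly one 2x2 block of foreground values lands in the background
```

The matching segmentation labels (or pseudo labels) are mixed with the same crop so that supervision stays consistent with the mixed image.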
13. Semi-supervised image segmentation using a residual-driven mean teacher and an exponential Dice loss. Artif Intell Med 2024; 148:102757. [PMID: 38325920] [DOI: 10.1016/j.artmed.2023.102757] [Received: 06/28/2023] [Revised: 11/13/2023] [Accepted: 12/29/2023] [Indexed: 02/09/2024]
Abstract
Semi-supervised segmentation plays an important role in computer vision and medical image analysis and can alleviate the burden of acquiring abundant expert-annotated images. In this paper, we developed a residual-driven semi-supervised segmentation method (termed RDMT) based on the classical mean teacher (MT) framework by introducing a novel model-level residual perturbation and an exponential Dice (eDice) loss. The introduced perturbation is integrated into the exponential moving average (EMA) scheme to enhance the performance of the MT, while the eDice loss improves a network's sensitivity to object boundaries. We validated the method by segmenting the 3D Left Atrium (LA) and the 2D optic cup (OC) from the public LASC and REFUGE datasets, based on the V-Net and U-Net, respectively. Extensive experiments demonstrated that the method achieved average Dice scores of 0.8776 and 0.7751 when trained on 10% and 20% labeled images, respectively, for the LA and OC regions depicted in the LASC and REFUGE datasets. It significantly outperformed the MT and is competitive with several existing semi-supervised segmentation methods (i.e., HCMT, UAMT, DTC and SASS).
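For context on the loss family this entry modifies, here is a sketch of the standard soft Dice loss with an optional exponent. The `gamma` exponent is only one plausible reading of "exponential" — the exact eDice definition should be taken from the paper itself:

```python
def soft_dice_loss(pred, target, gamma=1.0, eps=1e-6):
    """Soft Dice loss over flat probability/label vectors.

    gamma=1.0 gives the standard form 1 - Dice; raising the Dice
    coefficient to a power gamma is an assumed 'exponential' variant,
    not necessarily the paper's eDice. eps avoids division by zero
    on empty masks.
    """
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    dice = (2.0 * inter + eps) / (denom + eps)
    return 1.0 - dice ** gamma

loss = soft_dice_loss([1.0, 1.0, 0.0], [1.0, 0.0, 0.0])
# Dice = 2*1/(2+1) = 2/3, so loss ≈ 1/3
```

In the supervised branch this is applied to the labeled images; the MT consistency term operates separately on teacher/student predictions.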
14. Fully immersive virtual reality for skull-base surgery: surgical training and beyond. Int J Comput Assist Radiol Surg 2024; 19:51-59. [PMID: 37347346] [DOI: 10.1007/s11548-023-02956-5] [Received: 03/09/2023] [Accepted: 05/08/2023] [Indexed: 06/23/2023]
Abstract
PURPOSE: A virtual reality (VR) system, where surgeons can practice procedures on virtual anatomies, is a scalable and cost-effective alternative to cadaveric training. The fully digitized virtual surgeries can also be used to assess the surgeon's skills using measurements that are otherwise hard to collect in reality. Thus, we present the Fully Immersive Virtual Reality System (FIVRS) for skull-base surgery, which combines surgical simulation software with a high-fidelity hardware setup. METHODS: FIVRS allows surgeons to follow normal clinical workflows inside the VR environment. It uses advanced rendering designs and drilling algorithms for realistic bone ablation. A head-mounted display with ergonomics similar to that of surgical microscopes improves immersiveness. Extensive multi-modal data are recorded for post-analysis, including eye gaze, motion, force, and video of the surgery. A user-friendly interface is also designed to ease the learning curve of using FIVRS. RESULTS: We present results from a user study involving surgeons with various levels of expertise. The preliminary data recorded by FIVRS differentiate between participants with different levels of expertise, promising future research on automatic skill assessment. Furthermore, informal feedback from the study participants about the system's intuitiveness and immersiveness was positive. CONCLUSION: We present FIVRS, a fully immersive VR system for skull-base surgery. FIVRS features a realistic software simulation coupled with modern hardware for improved realism. The system is completely open source and provides feature-rich data in an industry-standard format.
15. Sparse annotation learning for dense volumetric MR image segmentation with uncertainty estimation. Phys Med Biol 2023; 69:015009. [PMID: 38035374] [DOI: 10.1088/1361-6560/ad111b] [Received: 09/11/2023] [Accepted: 11/30/2023] [Indexed: 12/02/2023]
Abstract
Objective. Training neural networks for pixel-wise or voxel-wise image segmentation is a challenging task that requires a considerable amount of training samples with highly accurate and densely delineated ground truth maps. This challenge becomes especially prominent in the medical imaging domain, where obtaining reliable annotations for training samples is a difficult, time-consuming, and expert-dependent process. Therefore, developing models that can perform well under the conditions of limited annotated training data is desirable. Approach. In this study, we propose an innovative framework called the extremely sparse annotation neural network (ESA-Net), which learns with only the single central slice label for 3D volumetric segmentation and explores both intra-slice pixel dependencies and inter-slice image correlations with uncertainty estimation. Specifically, ESA-Net consists of four specially designed distinct components: (1) an intra-slice pixel dependency-guided pseudo-label generation module that exploits uncertainty in network predictions while generating pseudo-labels for unlabeled slices with temporal ensembling; (2) an inter-slice image correlation-constrained pseudo-label propagation module which propagates labels from the labeled central slice to unlabeled slices by self-supervised registration with rotation ensembling; (3) a pseudo-label fusion module that fuses the two sets of generated pseudo-labels with voxel-wise uncertainty guidance; and (4) a final segmentation network optimization module to make final predictions with scoring-based label quantification. Main results. Extensive experimental validations have been performed on two popular yet challenging magnetic resonance image segmentation tasks and compared to five state-of-the-art methods. Significance. Results demonstrate that our proposed ESA-Net can consistently achieve better segmentation performance even under the extremely sparse annotation setting, highlighting its effectiveness in exploiting information from unlabeled data.
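The first component — pseudo-labels from temporally ensembled predictions, gated by uncertainty — follows a common pattern that can be sketched as below. This is an illustrative reconstruction of the generic technique, not the authors' ESA-Net code; the decay factor and entropy threshold are assumptions.

```python
import numpy as np

def temporal_ensemble_update(ema_probs, new_probs, alpha=0.6):
    """Exponential moving average of per-voxel class probabilities
    accumulated across training epochs (temporal ensembling)."""
    return alpha * ema_probs + (1.0 - alpha) * new_probs

def pseudo_labels_with_uncertainty(ema_probs, entropy_thresh=0.5):
    """Argmax pseudo-labels, keeping only voxels whose predictive
    entropy is below a threshold (low-uncertainty voxels)."""
    eps = 1e-12
    entropy = -np.sum(ema_probs * np.log(ema_probs + eps), axis=-1)
    labels = np.argmax(ema_probs, axis=-1)
    keep = entropy < entropy_thresh
    return labels, keep
```

In a full pipeline, `keep` would mask the supervision signal so only confident pseudo-labels contribute to the loss.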
16
A deep weakly semi-supervised framework for endoscopic lesion segmentation. Med Image Anal 2023; 90:102973. [PMID: 37757643 DOI: 10.1016/j.media.2023.102973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2022] [Revised: 07/19/2023] [Accepted: 09/14/2023] [Indexed: 09/29/2023]
Abstract
In the field of medical image analysis, accurate lesion segmentation is beneficial for subsequent clinical diagnosis and treatment planning. Currently, various deep learning-based methods have been proposed to deal with the segmentation task. Despite achieving some promising performance, fully-supervised learning approaches require pixel-level annotations for model training, which are tedious and time-consuming for experienced radiologists to collect. In this paper, we propose a weakly semi-supervised segmentation framework, called Point Segmentation Transformer (Point SEGTR). Particularly, the framework utilizes a small amount of fully-supervised data with pixel-level segmentation masks and a large amount of weakly-supervised data with point-level annotations (i.e., annotating a point inside each object) for network training, which significantly reduces the demand for pixel-level annotations. To fully exploit the pixel-level and point-level annotations, we propose two regularization terms, i.e., multi-point consistency and symmetric consistency, to boost the quality of pseudo labels, which are then adopted to train a student model for inference. Extensive experiments are conducted on three endoscopy datasets with different lesion structures and several body sites (e.g., colorectal and nasopharynx). Comprehensive experimental results substantiate the effectiveness and generality of our proposed method, as well as its potential to loosen the requirements of pixel-level annotations, which is valuable for clinical applications.
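The multi-point consistency idea — predictions at several point annotations inside the same object should agree — can be illustrated with a minimal sketch. This is not the Point SEGTR implementation; the mean-squared formulation and names here are illustrative assumptions.

```python
import numpy as np

def multi_point_consistency(prob_map, points):
    """Penalize disagreement among the class distributions predicted at
    several annotated points inside one object, measured as the mean
    squared deviation from their average distribution."""
    preds = np.stack([prob_map[y, x] for (y, x) in points])  # (P, C)
    return float(np.mean((preds - preds.mean(axis=0)) ** 2))
```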
17
Continual Nuclei Segmentation via Prototype-Wise Relation Distillation and Contrastive Learning. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3794-3804. [PMID: 37610902 DOI: 10.1109/tmi.2023.3307892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/25/2023]
Abstract
Deep learning models have achieved remarkable success in multi-type nuclei segmentation. These models are mostly trained at once with the full annotation of all types of nuclei available, but lack the ability to continually learn new classes due to the problem of catastrophic forgetting. In this paper, we study the practical and important class-incremental continual learning problem, where the model is incrementally updated to new classes without access to previous data. We propose a novel continual nuclei segmentation method to avoid forgetting knowledge of old classes and facilitate the learning of new classes, by achieving feature-level knowledge distillation with prototype-wise relation distillation and contrastive learning. Concretely, prototype-wise relation distillation imposes constraints on the inter-class relation similarity, encouraging the encoder to extract similar class distributions for old classes in the feature space. Prototype-wise contrastive learning with a hard sampling strategy enhances the intra-class compactness and inter-class separability of features, improving the performance on both old and new classes. Experiments on two multi-type nuclei segmentation benchmarks, i.e., MoNuSAC and CoNSeP, demonstrate the effectiveness of our method with superior performance over many competitive methods. Codes are available at https://github.com/zzw-szu/CoNuSeg.
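Prototype-wise relation distillation constrains the inter-class similarity structure of the old and new models' feature spaces. The following is a minimal NumPy sketch of the general idea only; the actual method operates on deep features during incremental training, and the cosine/MSE choices here are assumptions.

```python
import numpy as np

def class_prototypes(features, labels, num_classes):
    """Mean feature vector per class (the class prototype)."""
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def relation_distillation_loss(proto_old, proto_new):
    """Match the inter-class cosine-similarity matrices of the old
    (frozen) and new (updating) encoders."""
    def sim(p):
        p = p / np.linalg.norm(p, axis=1, keepdims=True)
        return p @ p.T
    return float(np.mean((sim(proto_old) - sim(proto_new)) ** 2))
```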
18
MARGANVAC: metal artifact reduction method based on generative adversarial network with variable constraints. Phys Med Biol 2023; 68:205005. [PMID: 37696272 DOI: 10.1088/1361-6560/acf8ac] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2023] [Accepted: 09/11/2023] [Indexed: 09/13/2023]
Abstract
Objective. Metal artifact reduction (MAR) has been a key issue in CT imaging. Recently, MAR methods based on deep learning have achieved promising results. However, when deploying deep learning-based MAR in real-world clinical scenarios, two prominent challenges arise. One limitation is the lack of paired training data in real applications, which limits the practicality of supervised methods. Another limitation is that image-domain methods suitable for more application scenarios are inadequate in performance, while end-to-end approaches with better performance are only applicable to fan-beam CT due to large memory consumption. Approach. We propose a novel image-domain MAR method based on the generative adversarial network with variable constraints (MARGANVAC) to improve MAR performance. The proposed variable constraint is a kind of time-varying cost function that can relax the fidelity constraint at the beginning and gradually strengthen the fidelity constraint as the training progresses. To better deploy our image-domain supervised method into practical scenarios, we develop a transfer method to mimic real metal artifacts by first extracting the real metal traces and then adding them to artifact-free images to generate paired training data. Main results. The effectiveness of the proposed method is validated in simulated fan-beam experiments and real cone-beam experiments. All quantitative and qualitative results demonstrate that the proposed method achieves superior performance compared with the competing methods. Significance. The MARGANVAC model proposed in this paper is an image-domain model that can be conveniently applied to various scenarios such as fan-beam and cone-beam CT. At the same time, its performance is on par with cutting-edge dual-domain MAR approaches. In addition, the metal artifact transfer method proposed in this paper can easily generate paired data with real artifact features, which can be better used for model training in real scenarios.
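The variable constraint is a time-varying cost function that relaxes fidelity early and strengthens it as training progresses. One possible schedule is sketched below; the quadratic ramp and the maximum weight are illustrative assumptions, not the paper's exact function.

```python
def fidelity_weight(epoch, total_epochs, w_max=10.0):
    """Time-varying fidelity weight: near zero at the start of training,
    ramping up (here quadratically) to w_max at the end."""
    t = min(max(epoch / total_epochs, 0.0), 1.0)
    return w_max * t * t

def total_loss(adv_loss, fid_loss, epoch, total_epochs):
    """Adversarial term plus progressively tightened fidelity term."""
    return adv_loss + fidelity_weight(epoch, total_epochs) * fid_loss
```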
19
Min-Max Similarity: A Contrastive Semi-Supervised Deep Learning Network for Surgical Tools Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:2832-2841. [PMID: 37037256 PMCID: PMC10597739 DOI: 10.1109/tmi.2023.3266137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
A common problem with segmentation of medical images using neural networks is the difficulty of obtaining a sufficient amount of pixel-level annotated data for training. To address this issue, we propose a semi-supervised segmentation network based on contrastive learning. In contrast to the previous state-of-the-art, we introduce Min-Max Similarity (MMS), a contrastive learning form of dual-view training that employs classifiers and projectors to build all-negative, and positive and negative feature pairs, respectively, formulating the learning as solving a min-max similarity problem. The all-negative pairs are used to supervise the networks' learning from different views and to capture general features, and the consistency of unlabeled predictions is measured by a pixel-wise contrastive loss between positive and negative pairs. To quantitatively and qualitatively evaluate our proposed method, we test it on four public endoscopy surgical tool segmentation datasets and one cochlear implant surgery dataset, which we manually annotated. Results indicate that our proposed method consistently outperforms state-of-the-art semi-supervised and fully supervised segmentation algorithms. Our semi-supervised segmentation algorithm can successfully recognize unknown surgical tools and provide good predictions. In addition, our MMS approach achieves inference speeds of about 40 frames per second (fps) and is suitable for real-time video segmentation.
20
Self-supervised-RCNN for medical image segmentation with limited data annotation. Comput Med Imaging Graph 2023; 109:102297. [PMID: 37729826 DOI: 10.1016/j.compmedimag.2023.102297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2023] [Revised: 09/01/2023] [Accepted: 09/02/2023] [Indexed: 09/22/2023]
Abstract
Many successful methods developed for medical image analysis based on machine learning use supervised learning approaches, which often require large datasets annotated by experts to achieve high accuracy. However, medical data annotation is time-consuming and expensive, especially for segmentation tasks. To overcome the problem of learning with limited labeled medical image data, an alternative deep learning training strategy based on self-supervised pretraining on unlabeled imaging data is proposed in this work. For the pretraining, different distortions are arbitrarily applied to random areas of unlabeled images. Next, a Mask-RCNN architecture is trained to localize the distortion location and recover the original image pixels. This pretrained model is assumed to gain knowledge of the relevant texture in the images from the self-supervised pretraining on unlabeled imaging data. This provides a good basis for fine-tuning the model to segment the structure of interest using a limited amount of labeled training data. The effectiveness of the proposed method in different pretraining and fine-tuning scenarios was evaluated based on the Osteoarthritis Initiative dataset with the aim of segmenting effusions in MRI datasets of the knee. Applying the proposed self-supervised pretraining method improved the Dice score by up to 18% compared to training the models using only the limited annotated data. The proposed self-supervised learning approach can be applied to many other medical image analysis tasks including anomaly detection, segmentation, and classification.
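The pretext task described above — corrupt random areas of unlabeled images, then train the network to localize the distortion and recover the original pixels — can be sketched as follows. This is a generic illustration: the paper applies several distortion types and trains a Mask-RCNN on the result, neither of which is reproduced here.

```python
import numpy as np

def distort_random_patch(image, rng, patch=8):
    """Corrupt a random square patch with noise. Returns the distorted
    image, the localization mask the model must predict, and the
    original pixels it must recover."""
    h, w = image.shape
    y = int(rng.integers(0, h - patch + 1))
    x = int(rng.integers(0, w - patch + 1))
    distorted = image.copy()
    distorted[y:y + patch, x:x + patch] = rng.normal(size=(patch, patch))
    mask = np.zeros((h, w), dtype=bool)
    mask[y:y + patch, x:x + patch] = True
    target = image[y:y + patch, x:x + patch].copy()
    return distorted, mask, target
```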
21
Ipsilateral Lesion Detection Refinement for Tomosynthesis. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3080-3090. [PMID: 37227903 PMCID: PMC11033619 DOI: 10.1109/tmi.2023.3280135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
Computer-aided detection (CAD) frameworks for breast cancer screening have been researched for several decades. Early adoption of deep-learning models in CAD frameworks has shown greatly improved detection performance compared to traditional CAD on single-view images. Recently, studies have improved performance by merging information from multiple views within each screening exam. Clinically, the integration of lesion correspondence during screening is a complicated decision process that depends on the correct execution of several referencing steps. However, most multi-view CAD frameworks are deep-learning-based black-box techniques. Fully end-to-end designs make it very difficult to analyze model behaviors and fine-tune performance. More importantly, the black-box nature of the techniques discourages clinical adoption due to the lack of explicit reasoning for each multi-view referencing step. Therefore, there is a need for a multi-view detection framework that can not only detect cancers accurately but also provide step-by-step, multi-view reasoning. In this work, we present Ipsilateral-Matching-Refinement Networks (IMR-Net) for digital breast tomosynthesis (DBT) lesion detection across multiple views. Our proposed framework adaptively refines the single-view detection scores based on explicit ipsilateral lesion matching. IMR-Net is built on a robust, single-view detection CAD pipeline with a commercial development DBT dataset of 24675 DBT volumetric views from 8034 exams. Performance is measured using location-based, case-level receiver operating characteristic (ROC) and case-level free-response ROC (FROC) analysis.
22
Self-supervised Semantic Segmentation: Consistency over Transformation. ... IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS. IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION 2023; 2023:2646-2655. [PMID: 38298808 PMCID: PMC10829429 DOI: 10.1109/iccvw60793.2023.00280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/02/2024]
Abstract
Accurate medical image segmentation is of utmost importance for enabling automated clinical decision procedures. However, prevailing supervised deep learning approaches for medical image segmentation encounter significant challenges due to their heavy dependence on extensive labeled training data. To tackle this issue, we propose a novel self-supervised algorithm, S3-Net, which integrates a robust framework based on the proposed Inception Large Kernel Attention (I-LKA) modules. This architectural enhancement makes it possible to comprehensively capture contextual information while preserving local intricacies, thereby enabling precise semantic segmentation. Furthermore, considering that lesions in medical images often exhibit deformations, we leverage deformable convolution as an integral component to effectively capture and delineate lesion deformations for superior object boundary definition. Additionally, our self-supervised strategy emphasizes the acquisition of invariance to affine transformations, which are commonly encountered in medical scenarios. This emphasis on robustness with respect to geometric distortions significantly enhances the model's ability to accurately model and handle such distortions. To enforce spatial consistency and promote the grouping of spatially connected image pixels with similar feature representations, we introduce a spatial consistency loss term. This aids the network in effectively capturing the relationships among neighboring pixels and enhancing the overall segmentation quality. The S3-Net approach iteratively learns pixel-level feature representations for image content clustering in an end-to-end manner. Our experimental results on skin lesion and lung organ segmentation tasks show the superior performance of our method compared to the SOTA approaches.
23
Leveraging global binary masks for structure segmentation in medical images. Phys Med Biol 2023; 68:10.1088/1361-6560/acf2e2. [PMID: 37607564 PMCID: PMC10511220 DOI: 10.1088/1361-6560/acf2e2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2022] [Accepted: 08/22/2023] [Indexed: 08/24/2023]
Abstract
Deep learning (DL) models for medical image segmentation are highly influenced by intensity variations of input images and lack generalization due to primarily utilizing pixels' intensity information for inference. Acquiring sufficient training data is another challenge limiting models' applications. Here, we proposed to leverage the consistency of organs' anatomical position and shape information in medical images. We introduced a framework leveraging recurring anatomical patterns through global binary masks for organ segmentation. Two scenarios were studied: (1) global binary masks were the only input for the U-Net based model, forcing it to exclusively encode organs' position and shape information for rough segmentation or localization; (2) global binary masks were incorporated as an additional channel providing position/shape clues to mitigate training data scarcity. Two datasets of brain and heart computed tomography (CT) images with their ground truth were split into (26:10:10) and (12:3:5) for training, validation, and test respectively. The two scenarios were evaluated using the full training split as well as reduced subsets of training data. In scenario (1), training exclusively on global binary masks led to Dice scores of 0.77 ± 0.06 and 0.85 ± 0.04 for the brain and heart structures respectively. Average Euclidean distances of 3.12 ± 1.43 mm and 2.5 ± 0.93 mm were obtained relative to the center of mass of the ground truth for the brain and heart structures respectively. The outcomes indicate that a surprising degree of position and shape information is encoded through global binary masks. In scenario (2), incorporating global binary masks led to significantly higher accuracy relative to the model trained on only CT images in small subsets of training data; the performance improved by 4.3%-125.3% and 1.3%-48.1% for 1-8 training cases of the brain and heart datasets respectively. The findings imply the advantages of utilizing global binary masks for building models that are robust to image intensity variations, as well as an effective approach to boost performance when access to labeled training data is highly limited.
24
Efficient Multi-Organ Segmentation From 3D Abdominal CT Images With Lightweight Network and Knowledge Distillation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:2513-2523. [PMID: 37030798 DOI: 10.1109/tmi.2023.3262680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Accurate segmentation of multiple abdominal organs from Computed Tomography (CT) images plays an important role in computer-aided diagnosis, treatment planning, and follow-up. Currently, 3D Convolutional Neural Networks (CNNs) have achieved promising performance for automatic medical image segmentation tasks. However, most existing 3D CNNs have a large number of parameters and require huge numbers of floating point operations (FLOPs), and 3D CT volumes are large, leading to high computational cost, which limits their clinical application. To tackle this issue, we propose a novel framework based on a lightweight network and Knowledge Distillation (KD) for delineating multiple organs from 3D CT volumes. We first propose a novel lightweight medical image segmentation network named LCOV-Net for reducing the model size and then introduce two knowledge distillation modules (i.e., Class-Affinity KD and Multi-Scale KD) to effectively distill the knowledge from a heavy-weight teacher model to improve LCOV-Net's segmentation accuracy. Experiments on two public abdominal CT datasets for multiple organ segmentation showed that: 1) our LCOV-Net outperformed existing lightweight 3D segmentation models in both computational cost and accuracy; 2) the proposed KD strategy effectively improved the performance of the lightweight network and outperformed existing KD methods; 3) combining the proposed LCOV-Net and KD strategy, our framework achieved better performance than the state-of-the-art 3D nnU-Net with only one-fifth of the parameters. The code is available at https://github.com/HiLab-git/LCOVNet-and-KD.
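The distillation side of this framework transfers soft teacher predictions to the lightweight student. A minimal sketch of standard temperature-scaled KD is shown below; the paper's Class-Affinity and Multi-Scale KD modules are more elaborate, and this only illustrates the underlying mechanism.

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in classic knowledge distillation."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)))
    return float(kl * T * T / p.shape[0])
```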
25
Ambiguity-selective consistency regularization for mean-teacher semi-supervised medical image segmentation. Med Image Anal 2023; 88:102880. [PMID: 37413792 DOI: 10.1016/j.media.2023.102880] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2022] [Revised: 05/17/2023] [Accepted: 06/22/2023] [Indexed: 07/08/2023]
Abstract
Semi-supervised learning has greatly advanced medical image segmentation since it effectively alleviates the need of acquiring abundant annotations from experts, wherein the mean-teacher model, known as a milestone of perturbed consistency learning, commonly serves as a standard and simple baseline. Inherently, learning from consistency can be regarded as learning from stability under perturbations. Recent improvement leans toward more complex consistency learning frameworks, yet, little attention is paid to the consistency target selection. Considering that the ambiguous regions from unlabeled data contain more informative complementary clues, in this paper, we improve the mean-teacher model to a novel ambiguity-consensus mean-teacher (AC-MT) model. Particularly, we comprehensively introduce and benchmark a family of plug-and-play strategies for ambiguous target selection from the perspectives of entropy, model uncertainty and label noise self-identification, respectively. Then, the estimated ambiguity map is incorporated into the consistency loss to encourage consensus between the two models' predictions in these informative regions. In essence, our AC-MT aims to find out the most worthwhile voxel-wise targets from the unlabeled data, and the model especially learns from the perturbed stability of these informative regions. The proposed methods are extensively evaluated on left atrium segmentation and brain tumor segmentation. Encouragingly, our strategies bring substantial improvement over recent state-of-the-art methods. The ablation study further demonstrates our hypothesis and shows impressive results under various extreme annotation conditions.
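The core AC-MT idea — restrict consistency regularization to the most ambiguous voxels — can be sketched with an entropy-based selector, one of the plug-and-play strategies the abstract mentions. The top-fraction heuristic below is an illustrative assumption, not the paper's exact criterion.

```python
import numpy as np

def entropy_map(probs):
    """Per-voxel predictive entropy of a class-probability map."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=-1)

def ambiguity_masked_consistency(student_probs, teacher_probs, top_frac=0.2):
    """MSE consistency between student and teacher, computed only on the
    top fraction of voxels by teacher entropy (the ambiguous regions)."""
    ent = entropy_map(teacher_probs)
    k = max(1, int(top_frac * ent.size))
    thresh = np.sort(ent.ravel())[-k]     # k-th largest entropy
    mask = ent >= thresh
    diff = np.sum((student_probs - teacher_probs) ** 2, axis=-1)
    return float(diff[mask].mean())
```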
26
Semi-Supervised Medical Image Segmentation With Voxel Stability and Reliability Constraints. IEEE J Biomed Health Inform 2023; 27:3912-3923. [PMID: 37155391 DOI: 10.1109/jbhi.2023.3273609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/10/2023]
Abstract
Semi-supervised learning is becoming an effective solution in medical image segmentation because annotations are costly and tedious to acquire. Methods based on the teacher-student model use consistency regularization and uncertainty estimation and have shown good potential in dealing with limited annotated data. Nevertheless, the existing teacher-student model is seriously limited by the exponential moving average algorithm, which leads to the optimization trap. Moreover, the classic uncertainty estimation method calculates the global uncertainty for images but does not consider local region-level uncertainty, which is unsuitable for medical images with blurry regions. In this article, the Voxel Stability and Reliability Constraint (VSRC) model is proposed to address these issues. Specifically, the Voxel Stability Constraint (VSC) strategy is introduced to optimize parameters and exchange effective knowledge between two independent initialized models, which can break through the performance bottleneck and avoid model collapse. Moreover, a new uncertainty estimation strategy, the Voxel Reliability Constraint (VRC), is proposed for use in our semi-supervised model to consider the uncertainty at the local region level. We further extend our model to auxiliary tasks and propose a task-level consistency regularization with uncertainty estimation. Extensive experiments on two 3D medical image datasets demonstrate that our method outperforms other state-of-the-art semi-supervised medical image segmentation methods under limited supervision.
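For reference, the exponential moving average update that the abstract identifies as the teacher-student bottleneck has the generic mean-teacher form below; this is the mechanism being critiqued, not the proposed VSC replacement.

```python
def ema_update(teacher_params, student_params, decay=0.99):
    """Mean-teacher weight update: each teacher parameter is an
    exponential moving average of the corresponding student parameter."""
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_params, student_params)]
```

Because the teacher is tied to the student's trajectory, the two models cannot exchange independent knowledge, which is the optimization trap the VSC strategy is designed to break.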
27
Anatomy-aware self-supervised learning for anomaly detection in chest radiographs. iScience 2023; 26:107086. [PMID: 37434699 PMCID: PMC10331430 DOI: 10.1016/j.isci.2023.107086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2023] [Revised: 04/17/2023] [Accepted: 06/06/2023] [Indexed: 07/13/2023] Open
Abstract
In this study, we present a self-supervised learning (SSL)-based model that enables anatomical structure-based unsupervised anomaly detection (UAD). The model employs an anatomy-aware pasting (AnatPaste) augmentation tool that uses a threshold-based lung segmentation pretext task to create anomalies in normal chest radiographs used for model pretraining. These anomalies are similar to real anomalies and help the model recognize them. We evaluate our model using three open-source chest radiograph datasets. Our model achieves areas under the curve (AUCs) of 92.1%, 78.7%, and 81.9%, which are the highest among those of existing UAD models. To the best of our knowledge, this is the first SSL model to employ anatomical information from segmentation as a pretext task. The performance of AnatPaste shows that incorporating anatomical information into SSL can effectively improve accuracy.
28
Improve the performance of CT-based pneumonia classification via source data reweighting. Sci Rep 2023; 13:9401. [PMID: 37296239 PMCID: PMC10251339 DOI: 10.1038/s41598-023-35938-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2022] [Accepted: 05/26/2023] [Indexed: 06/12/2023] Open
Abstract
Pneumonia is a life-threatening disease. Computer tomography (CT) imaging is broadly used for diagnosing pneumonia. To assist radiologists in accurately and efficiently detecting pneumonia from CT scans, many deep learning methods have been developed. These methods require large amounts of annotated CT scans, which are difficult to obtain due to privacy concerns and high annotation costs. To address this problem, we develop a three-level optimization based method which leverages CT data from a source domain to mitigate the lack of labeled CT scans in a target domain. Our method automatically identifies and downweights low-quality source CT data examples which are noisy or have large domain discrepancy with target data, by minimizing the validation loss of a target model trained on reweighted source data. On a target dataset with 2218 CT scans and a source dataset with 349 CT images, our method achieves an F1 score of 91.8% in detecting pneumonia and an F1 score of 92.4% in detecting other types of pneumonia, which are significantly better than those achieved by state-of-the-art baseline methods.
29
Bootstrapping Semi-supervised Medical Image Segmentation with Anatomical-Aware Contrastive Distillation. INFORMATION PROCESSING IN MEDICAL IMAGING : PROCEEDINGS OF THE ... CONFERENCE 2023; 13939:641-653. [PMID: 37409056 PMCID: PMC10322187 DOI: 10.1007/978-3-031-34048-2_49] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/07/2023]
Abstract
Contrastive learning has shown great promise over annotation scarcity problems in the context of medical image segmentation. Existing approaches typically assume a balanced class distribution for both labeled and unlabeled medical images. However, medical image data in reality is commonly imbalanced (i.e., multi-class label imbalance), which naturally yields blurry contours and usually incorrectly labels rare objects. Moreover, it remains unclear whether all negative samples are equally negative. In this work, we present ACTION, an Anatomical-aware ConTrastive dIstillatiON framework, for semi-supervised medical image segmentation. Specifically, we first develop an iterative contrastive distillation algorithm by softly labeling the negatives rather than binary supervision between positive and negative pairs. We also capture more semantically similar features from the randomly chosen negative set compared to the positives to enforce the diversity of the sampled data. Second, we raise a more important question: Can we really handle imbalanced samples to yield better performance? Hence, the key innovation in ACTION is to learn global semantic relationship across the entire dataset and local anatomical features among the neighbouring pixels with minimal additional memory footprint. During the training, we introduce anatomical contrast by actively sampling a sparse set of hard negative pixels, which can generate smoother segmentation boundaries and more accurate predictions. Extensive experiments across two benchmark datasets and different unlabeled settings show that ACTION significantly outperforms the current state-of-the-art semi-supervised methods.
30
MFA-Net: Multiple Feature Association Network for medical image segmentation. Comput Biol Med 2023; 158:106834. [PMID: 37003067 DOI: 10.1016/j.compbiomed.2023.106834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2022] [Revised: 03/01/2023] [Accepted: 03/26/2023] [Indexed: 03/30/2023]
Abstract
Medical image segmentation plays a crucial role in computer-aided diagnosis. However, due to the large variability of medical images, accurate segmentation is a highly challenging task. In this paper, we present a novel medical image segmentation network named the Multiple Feature Association Network (MFA-Net), which is based on deep learning techniques. The MFA-Net utilizes an encoder-decoder architecture with skip connections as its backbone network, and a parallelly dilated convolutions arrangement (PDCA) module is integrated between the encoder and the decoder to capture more representative deep features. Furthermore, a multi-scale feature restructuring module (MFRM) is introduced to restructure and fuse the deep features of the encoder. To enhance global attention perception, the proposed global attention stacking (GAS) modules are cascaded on the decoder. The proposed MFA-Net leverages novel global attention mechanisms to improve the segmentation performance at different feature scales. We evaluated our MFA-Net on four segmentation tasks, including lesions in intestinal polyp, liver tumor, prostate cancer, and skin lesion. Our experimental results and ablation study demonstrate that the proposed MFA-Net outperforms state-of-the-art methods in terms of global positioning and local edge recognition.
31
Semi-Supervised Medical Image Segmentation Using Adversarial Consistency Learning and Dynamic Convolution Network. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:1265-1277. [PMID: 36449588 DOI: 10.1109/tmi.2022.3225687] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Popular semi-supervised medical image segmentation networks often suffer from erroneous supervision from unlabeled data since they usually use consistency learning under different data perturbations to regularize model training. These networks ignore the relationship between labeled and unlabeled data, and only compute single pixel-level consistency, leading to uncertain prediction results. Besides, these networks often require a large number of parameters since their backbone networks are designed for supervised image segmentation tasks. Moreover, these networks often face a high over-fitting risk since only a small number of training samples is typically available for semi-supervised image segmentation. To address the above problems, in this paper, we propose a novel adversarial self-ensembling network using dynamic convolution (ASE-Net) for semi-supervised medical image segmentation. First, we use an adversarial consistency training strategy (ACTS) that employs two discriminators based on consistency learning to obtain prior relationships between labeled and unlabeled data. The ACTS can simultaneously compute pixel-level and image-level consistency of unlabeled data under different data perturbations to improve the prediction quality of labels. Second, we design a dynamic convolution-based bidirectional attention component (DyBAC) that can be embedded in any segmentation network, aiming at adaptively adjusting the weights of ASE-Net based on the structural information of input samples. This component effectively improves the feature representation ability of ASE-Net and reduces the overfitting risk of the network. The proposed ASE-Net has been extensively tested on three publicly available datasets, and experiments indicate that ASE-Net is superior to state-of-the-art networks, and reduces computational costs and memory overhead. The code is available at https://github.com/SUST-reynole/ASE-Net.
32
Occlusion-robust scene flow-based tissue deformation recovery incorporating a mesh optimization model. Int J Comput Assist Radiol Surg 2023:10.1007/s11548-023-02889-z. [PMID: 37067752 DOI: 10.1007/s11548-023-02889-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2023] [Accepted: 03/27/2023] [Indexed: 04/18/2023]
Abstract
PURPOSE Tissue deformation recovery reconstructs the change in shape and surface strain caused by tool-tissue interaction or respiration, providing motion and shape information that is essential for improving the safety of minimally invasive surgery. The binocular vision-based approach is a practical candidate for deformation recovery, as no extra devices are required. However, previous methods suffer from limitations such as reliance on biomechanical priors and vulnerability to occlusion caused by surgical instruments. To address these issues, we propose a deformation recovery method incorporating mesh structures and scene flow. METHODS The method consists of three modules. First, a two-step scene flow generation module extracts 3D motion from the binocular sequence. Second, we propose a strain-based filtering method to denoise the original scene flow. Third, a mesh optimization model is proposed that strengthens robustness to occlusion by employing contextual connectivity. RESULTS In a phantom and an in vivo experiment, the feasibility of the method in recovering surface deformation in the presence of tool-induced occlusion was demonstrated. Surface reconstruction accuracy was quantitatively evaluated by comparing the recovered mesh surface with the 3D scanned model in the phantom experiment. Results show that the overall error is 0.70 ± 0.55 mm. CONCLUSION The method has been demonstrated to continuously recover surface deformation using a mesh representation, with robustness to occlusion caused by surgical forceps, and promises to be suitable for application in actual surgery.
33
Individualized Statistical Modeling of Lesions in Fundus Images for Anomaly Detection. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:1185-1196. [PMID: 36446017 DOI: 10.1109/tmi.2022.3225422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Anomaly detection in fundus images remains challenging because fundus images often contain diverse types of lesions with various locations, sizes, shapes, and colors. Current methods achieve anomaly detection mainly by reconstructing the fundus image background, or separating it from a fundus image, under the guidance of a set of normal fundus images. The reconstruction methods, however, ignore the constraint from lesions. The separation methods primarily model the diverse lesions with pixel-based independent and identically distributed (i.i.d.) properties, neglecting the individualized variations of different types of lesions and their structural properties. Hence, these methods may have difficulty distinguishing lesions from fundus image backgrounds, especially in the presence of normal personalized variations (NPV). To address these challenges, we propose a patch-based non-i.i.d. mixture of Gaussians (MoG) to model diverse lesions, adapting to variations in their statistical distribution across fundus images and to their patch-like structural properties. Further, we introduce the weighted Schatten p-norm as the metric of low-rank decomposition to enhance the accuracy of the learned fundus image backgrounds and reduce false positives caused by NPV. With the individualized modeling of the diverse lesions and the background learning, fundus image backgrounds and NPV are finely learned and subsequently distinguished from diverse lesions, ultimately improving anomaly detection. The proposed method is evaluated on two real-world databases and one artificial database, outperforming state-of-the-art methods.
34
Transfer learning framework for low-dose CT reconstruction based on marginal distribution adaptation in multiscale. Med Phys 2023; 50:1450-1465. [PMID: 36321246 DOI: 10.1002/mp.16027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2022] [Revised: 09/05/2022] [Accepted: 09/09/2022] [Indexed: 11/05/2022] Open
Abstract
BACKGROUND With the increasing use of computed tomography (CT) in clinical practice, limiting CT radiation exposure to reduce potential cancer risks has become one of the important directions of medical imaging research. As the dose decreases, the reconstructed CT image is severely degraded by projection noise. PURPOSE As an important method of image processing, supervised deep learning has been widely used in the restoration of low-dose CT (LDCT) in recent years. However, the normal-dose CT (NDCT) corresponding to a specific LDCT (regarded as the label of the LDCT, which is necessary for supervised learning) is very difficult to obtain, so the application of supervised learning methods to LDCT reconstruction is limited. It is therefore necessary to construct an unsupervised deep learning framework for LDCT reconstruction that does not depend on paired LDCT-NDCT datasets. METHODS We present an unsupervised learning framework for transferring from the identity mapping to the low-dose reconstruction task, called marginal distribution adaptation in multiscale (MDAM). For NDCTs as source domain data, MDAM is an identity map with two parts: first, it establishes a dimensionality reduction mapping, which can obtain the same feature distribution from NDCTs and LDCTs; the NDCTs are then retrieved by reconstructing the image overview and details from the low-dimensional features. For the purpose of feature transfer between the source domain and the target domain (LDCTs), we introduce multiscale feature extraction into the MDAM, and then eliminate differences in the probability distributions of these multiscale features between NDCTs and LDCTs through wavelet decomposition and domain adaptation learning. RESULTS Image quality evaluation metrics and subjective quality scores show that, as an unsupervised method, the performance of MDAM approaches or even surpasses some state-of-the-art supervised methods. In particular, MDAM has been favorably evaluated in terms of noise suppression, structural preservation, and lesion detection. CONCLUSIONS We demonstrated that the MDAM framework can reconstruct corresponding NDCTs from LDCTs with high accuracy, without relying on any labels. Moreover, it is more suitable for clinical application than supervised learning methods.
35
SynthSeg: Segmentation of brain MRI scans of any contrast and resolution without retraining. Med Image Anal 2023; 86:102789. [PMID: 36857946 PMCID: PMC10154424 DOI: 10.1016/j.media.2023.102789] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2022] [Revised: 01/20/2023] [Accepted: 02/22/2023] [Indexed: 03/03/2023]
Abstract
Despite advances in data augmentation and transfer learning, convolutional neural networks (CNNs) struggle to generalise to unseen domains. When segmenting brain scans, CNNs are highly sensitive to changes in resolution and contrast: even within the same MRI modality, performance can decrease across datasets. Here we introduce SynthSeg, the first segmentation CNN robust against changes in contrast and resolution. SynthSeg is trained with synthetic data sampled from a generative model conditioned on segmentations. Crucially, we adopt a domain randomisation strategy in which we fully randomise the contrast and resolution of the synthetic training data. Consequently, SynthSeg can segment real scans from a wide range of target domains without retraining or fine-tuning, which enables straightforward analysis of huge amounts of heterogeneous clinical data. Because SynthSeg only requires segmentations to be trained (no images), it can learn from labels obtained by automated methods on diverse populations (e.g., ageing and diseased), thus achieving robustness to a wide range of morphological variability. We demonstrate SynthSeg on 5,000 scans of six modalities (including CT) and ten resolutions, where it exhibits unparalleled generalisation compared with supervised CNNs, state-of-the-art domain adaptation, and Bayesian segmentation. Finally, we demonstrate the generalisability of SynthSeg by applying it to cardiac MRI and CT scans.
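The domain-randomisation idea, rendering a training image from a label map with fully random contrast and resolution, can be sketched as follows. This is a toy numpy illustration under stated assumptions (random per-label Gaussian intensities, crude down/up-sampling for resolution), not SynthSeg's actual generative model.

```python
import numpy as np

def synthesize(label_map, rng):
    """Render a training image from a segmentation with fully random
    contrast: each label receives a random mean/std (hypothetical
    parameters, in the spirit of domain randomisation)."""
    img = np.zeros(label_map.shape, dtype=float)
    for lab in np.unique(label_map):
        mask = label_map == lab
        mean, std = rng.uniform(0, 1), rng.uniform(0.01, 0.1)
        img[mask] = rng.normal(mean, std, size=mask.sum())
    # crude resolution randomisation: downsample by a random factor,
    # then upsample back by nearest-neighbour repetition
    f = rng.integers(1, 4)
    low = img[::f, ::f]
    up = np.repeat(np.repeat(low, f, axis=0), f, axis=1)
    return up[: img.shape[0], : img.shape[1]]

rng = np.random.default_rng(0)
labels = np.zeros((16, 16), dtype=int)
labels[4:12, 4:12] = 1                       # toy two-label "segmentation"
img = synthesize(labels, rng)
```

A network trained on many such randomised renderings never sees two images with the same contrast or resolution, which is the claimed source of robustness.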
36
Triplet attention and dual-pool contrastive learning for clinic-driven multi-label medical image classification. Med Image Anal 2023; 86:102772. [PMID: 36822050 DOI: 10.1016/j.media.2023.102772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2022] [Revised: 11/21/2022] [Accepted: 02/10/2023] [Indexed: 02/18/2023]
Abstract
Multi-label classification (MLC) can attach multiple labels to a single image, and has achieved promising results on medical images. But existing MLC methods still face challenging clinical realities in practical use, such as: (1) medical risks arising from misclassification, (2) the sample imbalance problem among different diseases, and (3) the inability to classify diseases that are not pre-defined (unseen diseases). Here, we design a hybrid label to improve the flexibility of MLC methods and alleviate the sample imbalance problem. Specifically, in the labeled training set, we retain independent labels for high-frequency diseases with enough samples and use a hybrid label to merge low-frequency diseases with fewer samples. The hybrid label can also be used to handle unseen diseases in practical use. In this paper, we propose Triplet Attention and Dual-pool Contrastive Learning (TA-DCL) for multi-label medical image classification based on the aforementioned label representation. The TA-DCL architecture is a triplet attention network (TAN), which combines category-attention, self-attention and cross-attention to learn high-quality label embeddings for all disease labels by mining effective information from medical images. DCL includes dual-pool contrastive training (DCT) and dual-pool contrastive inference (DCI). DCT optimizes the clustering centers of label embeddings belonging to different disease labels to improve the discrimination of label embeddings. DCI reduces the misclassification of sick cases, lowering clinical risk and improving the ability to detect unseen diseases by contrast of differences. TA-DCL is validated on two public medical image datasets, ODIR and NIH-ChestXray14, showing superior performance to other state-of-the-art MLC methods. Code is available at https://github.com/ZhangYH0502/TA-DCL.
37
A Novel Contrastive Self-Supervised Learning Framework for Solving Data Imbalance in Solder Joint Defect Detection. ENTROPY (BASEL, SWITZERLAND) 2023; 25:268. [PMID: 36832635 PMCID: PMC9954869 DOI: 10.3390/e25020268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/21/2022] [Revised: 01/16/2023] [Accepted: 01/23/2023] [Indexed: 06/18/2023]
Abstract
Poor chip solder joints can severely affect the quality of finished printed circuit boards (PCBs). Due to the diversity of solder joint defects and the scarcity of anomaly data, automatically and accurately detecting all types of solder joint defects in the production process in real time is a challenging task. To address this issue, we propose a flexible framework based on contrastive self-supervised learning (CSSL). In this framework, we first design several special data augmentation approaches to generate abundant synthetic not-good (sNG) data from the normal solder joint data. Then, we develop a data filter network to distill the highest-quality data from the sNG data. Based on the proposed CSSL framework, a high-accuracy classifier can be obtained even when the available training data are very limited. Ablation experiments verify that the proposed method can effectively improve the ability of the classifier to learn normal solder joint (OK) features. In comparative experiments, the classifier trained with the help of the proposed method achieves an accuracy of 99.14% on the test set, better than other competitive methods. In addition, its inference time is less than 6 ms per chip image, which favors real-time defect detection of chip solder joints.
38
BIoMT-ISeg: Blockchain internet of medical things for intelligent segmentation. Front Physiol 2023; 13:1097204. [PMID: 36714314 PMCID: PMC9879662 DOI: 10.3389/fphys.2022.1097204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2022] [Accepted: 12/20/2022] [Indexed: 01/13/2023] Open
Abstract
To train on complicated medical data in Internet of Medical Things (IoMT) scenarios, this study develops an end-to-end intelligent framework that incorporates ensemble learning, genetic algorithms, blockchain technology, and various U-Net based architectures. Genetic algorithms are used to optimize the hyper-parameters of the architectures, and the training process is protected with the help of blockchain technology. Finally, an ensemble learning system based on a voting mechanism combines the local outputs of the various segmentation models into a global output. Our method shows that strong performance may be achieved in a condensed number of epochs with a high learning rate and a small batch size. As a result, we outperform standard solutions on well-known medical databases: the proposed solution reaches 95% intersection over union, compared to baseline solutions that remain below 80%. Moreover, with the proposed blockchain strategy, 76% of attacks were detected.
39
Semi-supervised medical image classification with adaptive threshold pseudo-labeling and unreliable sample contrastive loss. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
40
WORD: A large scale dataset, benchmark and clinical applicable study for abdominal organ segmentation from CT image. Med Image Anal 2022; 82:102642. [DOI: 10.1016/j.media.2022.102642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2022] [Revised: 08/18/2022] [Accepted: 09/20/2022] [Indexed: 11/22/2022]
41
Momentum Contrastive Voxel-wise Representation Learning for Semi-supervised Volumetric Medical Image Segmentation. MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION : MICCAI ... INTERNATIONAL CONFERENCE ON MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION 2022; 13434:639-652. [PMID: 37465615 PMCID: PMC10352821 DOI: 10.1007/978-3-031-16440-8_61] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/20/2023]
Abstract
Contrastive learning (CL) aims to learn useful representations without relying on expert annotations in the context of medical image segmentation. Existing approaches mainly contrast a single positive vector (i.e., an augmentation of the same image) against a set of negatives from the entire remainder of the batch by simply mapping all input features into the same constant vector. Despite their impressive empirical performance, those methods have the following shortcomings: (1) it remains a formidable challenge to prevent collapse to trivial solutions; and (2) we argue that not all voxels within the same image are equally positive, since dissimilar anatomical structures exist within the same image. In this work, we present a novel Contrastive Voxel-wise Representation Learning (CVRL) method to effectively learn low-level and high-level features by capturing 3D spatial context and rich anatomical information along both the feature and the batch dimensions. Specifically, we first introduce a novel CL strategy to promote feature diversity among the 3D representation dimensions. We train the framework through bi-level (i.e., low-level and high-level) contrastive optimization on 3D images. Experiments on two benchmark datasets under different labeled settings demonstrate the superiority of our proposed framework. More importantly, we also prove that our method inherits the hardness-aware property of standard CL approaches. Code will be available soon.
Collapse
|
42
|
Real-time medical image denoising and information hiding model based on deep wavelet multiscale autonomous unmanned analysis. Soft comput 2022. [DOI: 10.1007/s00500-022-07322-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/16/2022]
43
Mutual consistency learning for semi-supervised medical image segmentation. Med Image Anal 2022; 81:102530. [PMID: 35839737 DOI: 10.1016/j.media.2022.102530] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2021] [Revised: 06/16/2022] [Accepted: 07/01/2022] [Indexed: 11/20/2022]
Abstract
In this paper, we propose a novel mutual consistency network (MC-Net+) to effectively exploit unlabeled data for semi-supervised medical image segmentation. The MC-Net+ model is motivated by the observation that deep models trained with limited annotations are prone to output highly uncertain and easily mis-classified predictions in ambiguous regions (e.g., adhesive edges or thin branches) of medical images. Leveraging these challenging samples can make semi-supervised segmentation model training more effective. Therefore, our proposed MC-Net+ model consists of two new designs. First, the model contains one shared encoder and multiple slightly different decoders (i.e., using different up-sampling strategies). The statistical discrepancy of the decoders' outputs is computed to denote the model's uncertainty, which indicates unlabeled hard regions. Second, we apply a novel mutual consistency constraint between one decoder's probability output and the other decoders' soft pseudo labels. In this way, we minimize the discrepancy of multiple outputs (i.e., the model uncertainty) during training and force the model to generate invariant results in such challenging regions, aiming at regularizing model training. We compared the segmentation results of our MC-Net+ model with five state-of-the-art semi-supervised approaches on three public medical datasets. Extended experiments with two standard semi-supervised settings demonstrate the superior performance of our model over other methods, setting a new state of the art for semi-supervised medical image segmentation. Our code is released publicly at https://github.com/ycwu1997/MC-Net.
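The mutual consistency constraint between decoders can be sketched roughly as below. This is an illustrative numpy version that assumes temperature sharpening to form the soft pseudo labels and MSE as the consistency loss; it is not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sharpen(p, T=0.5, axis=0):
    # temperature sharpening turns one decoder's probabilities
    # into a soft pseudo label (assumed mechanism, for illustration)
    q = p ** (1.0 / T)
    return q / q.sum(axis=axis, keepdims=True)

def mutual_consistency(probs_list):
    # each decoder's probability map is pulled toward the sharpened
    # outputs of every other decoder
    loss = 0.0
    for i, p in enumerate(probs_list):
        for j, q in enumerate(probs_list):
            if i != j:
                loss += np.mean((p - sharpen(q)) ** 2)
    return loss

rng = np.random.default_rng(0)
p1 = softmax(rng.normal(size=(2, 8, 8)))    # decoder 1, 2 classes
p2 = softmax(rng.normal(size=(2, 8, 8)))    # decoder 2 (different up-sampling)
loss = mutual_consistency([p1, p2])
```

When the decoders already agree on confident one-hot predictions, the loss vanishes; it is largest exactly in the ambiguous regions the abstract highlights.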
44
Nuclear-medicine probes: where we are and where we are going. Med Phys 2022; 49:4372-4390. [PMID: 35526220 PMCID: PMC9545507 DOI: 10.1002/mp.15690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2022] [Revised: 04/08/2022] [Accepted: 04/26/2022] [Indexed: 11/10/2022] Open
Abstract
Nuclear medicine probes have become key to the identification and precise localization of sentinel lymph nodes and other occult lesions (i.e., tumors) via the systemic administration of radiotracers. Intraoperative nuclear probes are central to the surgical management of some malignancies as well as to the determination of positive surgical margins, thus reducing the extent and potential morbidity of surgery. Depending on their application, nuclear probes are classified into two main categories: counting and imaging. Counting probes present a simple design, are handheld (so they can be moved rapidly), and provide only acoustic signals when detecting radiation; imaging probes, also known as cameras, have more complex hardware and can also provide images, but at the cost of increased intervention time, as the camera has to be displaced slowly. This review article begins with an introductory section highlighting the relevance of nuclear-based probes and their components, as well as the main differences between ionization-based (semiconductor) and scintillation-based probes. Then, the most significant performance parameters of the probe are reviewed (i.e., sensitivity, contrast, count rate capabilities, shielding, and energy and spatial resolution), as well as the different types of probes based on the nature of the target radiation, namely gamma (γ), beta (β) (positron and electron), and Cherenkov. Various available intraoperative nuclear probes are finally compared in terms of performance to discuss the state of the art of nuclear medicine probes. The manuscript concludes by discussing the ideal probe design and the aspects to be considered when selecting nuclear-medicine probes.
45
Abstract
Despite the substantial progress made by deep networks in the field of medical image segmentation, they generally require sufficient pixel-level annotated data for training, and the scale of the training data remains the main bottleneck for obtaining a better deep segmentation model. Semi-supervised learning is an effective approach for alleviating the dependence on labeled data. However, most existing semi-supervised image segmentation methods do not generate high-quality pseudo labels to expand the training dataset. In this paper, we propose a deep semi-supervised approach for liver CT image segmentation by extending a pseudo-labeling algorithm under a very-low-annotation paradigm. Specifically, the output features of labeled images from the pretrained network are combined with the corresponding pixel-level annotations to produce class representations by a mean operation. Pseudo labels for unlabeled images are then generated by computing the distances between unlabeled feature vectors and each class representation. To further improve their quality, we apply a series of operations to optimize the pseudo labels. A more accurate segmentation network is obtained by expanding the training dataset and adjusting the contributions of the supervised and unsupervised losses. Besides, a novel random patch scheme based on prior locations is introduced for unlabeled images during training. Extensive experiments show our method achieves more competitive results than other semi-supervised methods when fewer labeled slices of the LiTS dataset are available.
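The prototype-based pseudo-labeling step described above, mean labeled features per class followed by nearest-prototype assignment for unlabeled pixels, admits a compact sketch. The feature dimensions and data here are toy stand-ins for the pretrained network's outputs.

```python
import numpy as np

def class_prototypes(features, labels, n_classes):
    # class representation: mean feature vector over labeled pixels of each class
    return np.stack([features[labels == c].mean(axis=0) for c in range(n_classes)])

def pseudo_label(features, prototypes):
    # assign each unlabeled feature vector to its nearest class prototype
    d = np.linalg.norm(features[:, None, :] - prototypes[None, :, :], axis=2)
    return d.argmin(axis=1)

rng = np.random.default_rng(0)
# labeled pixels: 2 classes with well-separated 3-D features
# (a toy stand-in for the pretrained network's feature maps)
feats_l = np.concatenate([rng.normal(0, 0.1, (50, 3)),
                          rng.normal(1, 0.1, (50, 3))])
labs_l = np.array([0] * 50 + [1] * 50)
protos = class_prototypes(feats_l, labs_l, n_classes=2)

feats_u = np.array([[0.05, 0.0, 0.1], [0.9, 1.1, 1.0]])   # unlabeled pixels
pseudo = pseudo_label(feats_u, protos)
```

The paper additionally filters and refines these raw pseudo labels before adding them to the training set; that refinement is omitted here.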
46
All-Around Real Label Supervision: Cyclic Prototype Consistency Learning for Semi-supervised Medical Image Segmentation. IEEE J Biomed Health Inform 2022; 26:3174-3184. [PMID: 35324450 DOI: 10.1109/jbhi.2022.3162043] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Semi-supervised learning has substantially advanced medical image segmentation by alleviating the heavy burden of acquiring costly expert-examined annotations. In particular, consistency-based approaches have attracted attention for their superior performance. In these approaches, the real labels are only utilized to supervise their paired images via a supervised loss, while the unlabeled images are exploited by enforcing perturbation-based "unsupervised" consistency without explicit guidance from those real labels. Intuitively, however, the expert-examined real labels contain more reliable supervision signals. Observing this, we ask an unexplored but interesting question: can we exploit the unlabeled data via explicit real label supervision for semi-supervised training? To this end, we discard the previous perturbation-based consistency and instead absorb the essence of non-parametric prototype learning. Based on prototypical networks, we propose a novel cyclic prototype consistency learning (CPCL) framework, which is constructed from a labeled-to-unlabeled (L2U) prototypical forward process and an unlabeled-to-labeled (U2L) backward process. The two processes synergistically enhance the segmentation network by encouraging more discriminative and compact features. In this way, our framework turns the previous "unsupervised" consistency into a new "supervised" consistency, giving our method its "all-around real label supervision" property. Extensive experiments on brain tumor segmentation from MRI and kidney segmentation from CT images show that our CPCL can effectively exploit the unlabeled data and outperforms other state-of-the-art semi-supervised medical image segmentation methods.
47
Structure attention co-training neural network for neovascularization segmentation in intravascular optical coherence tomography. Med Phys 2022; 49:1723-1738. [PMID: 35061247 DOI: 10.1002/mp.15477] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2021] [Revised: 01/09/2022] [Accepted: 01/10/2022] [Indexed: 11/11/2022] Open
Abstract
PURPOSE To develop and validate a neovascularization (NV) segmentation model for intravascular optical coherence tomography (IVOCT) using deep learning methods. METHODS AND MATERIALS A total of 1950 2D slices from 70 IVOCT pullbacks were used in our study. We randomly selected 1273 2D slices from 44 patients as the training set, 379 2D slices from 11 patients as the validation set, and 298 2D slices from the remaining 15 patients as the testing set. Automatic NV segmentation is quite challenging, as it must address speckle noise, shadow artifacts, high distribution variation, and related issues. To meet these challenges, a new deep learning-based segmentation method is developed based on a co-training architecture with an integrated structural attention mechanism. Co-training is used to exploit the features of three consecutive slices. The structural attention mechanism comprises spatial and channel attention modules and is integrated into the co-training architecture at each up-sampling step. A cascaded fixed network is further incorporated to achieve segmentation at the image level in a coarse-to-fine manner. RESULTS Extensive experiments were performed, including a comparison with several state-of-the-art deep learning-based segmentation methods. The consistency of the results with manual segmentation was also investigated. Our proposed automatic NV segmentation method achieved the highest correlation with manual delineation by interventional cardiologists (Pearson correlation coefficient 0.825). CONCLUSION In this work, we proposed a co-training architecture with an integrated structural attention mechanism to segment NV in IVOCT images. The good agreement between our segmentation results and manual segmentation indicates that the proposed method has great potential for application in the clinical investigation of NV-related plaque diagnosis and treatment.
48
Variance-aware attention U-Net for multi-organ segmentation. Med Phys 2021; 48:7864-7876. [PMID: 34716711 DOI: 10.1002/mp.15322] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2021] [Revised: 10/06/2021] [Accepted: 10/23/2021] [Indexed: 01/20/2023] Open
Abstract
PURPOSE With the continuous development of deep learning-based medical image segmentation technology, more robust and accurate performance is expected on more challenging tasks, such as multi-organ segmentation, small or irregular areas, and ambiguous boundaries. METHODS We propose a variance-aware attention U-Net to solve the problem of multi-organ segmentation. Specifically, a simple yet effective variance-based uncertainty mechanism is devised to evaluate the discrimination of each voxel via its prediction probability. The proposed variance uncertainty is further embedded into an attention architecture, which not only aggregates multi-level deep features at a global level but also enforces the network to pay extra attention to voxels with uncertain predictions during training. RESULTS Extensive experiments on a challenging abdominal multi-organ CT dataset show that our proposed method consistently outperforms cutting-edge attention networks with respect to the Dice index (DSC), 95% Hausdorff distance (95HD), and average symmetric surface distance (ASSD). CONCLUSIONS The proposed network provides an accurate and robust solution for multi-organ segmentation and has the potential to improve other segmentation applications.
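The variance-based uncertainty mechanism can be illustrated schematically: the per-voxel variance of the class-probability vector serves as a proxy for discrimination, and a loss weight then emphasises uncertain voxels. The mapping from variance to uncertainty below is an assumption for illustration, not the paper's exact definition.

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def variance_uncertainty(probs):
    """Per-voxel uncertainty from the variance of the class-probability
    vector: a confident (near one-hot) voxel has high variance across
    classes, while an ambiguous (near uniform) voxel has variance near
    zero. The inversion/normalisation here is an illustrative choice."""
    v = probs.var(axis=0)
    return 1.0 - v / v.max()        # high value = uncertain voxel

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 8, 8))  # 4 organ classes on an 8x8 slice
u = variance_uncertainty(softmax(logits))
# training could then up-weight the loss at uncertain voxels:
weights = 1.0 + u
```

With this convention, a uniform prediction gets weight near 2 while a confident one gets weight near 1, focusing training on ambiguous boundaries.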
49
Comparison of convolutional neural networks for detecting large vessel occlusion on computed tomography angiography. Med Phys 2021; 48:6060-6068. [PMID: 34287944 PMCID: PMC8568625 DOI: 10.1002/mp.15122] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2021] [Revised: 07/04/2021] [Accepted: 07/05/2021] [Indexed: 12/19/2022] Open
Abstract
PURPOSE Artificial intelligence diagnosis and triage of large vessel occlusion may quicken the clinical response for a subset of time-sensitive acute ischemic stroke patients, improving outcomes. Differences in architectural elements within data-driven convolutional neural network (CNN) models impact performance. Foreknowledge of effective model architectural elements for domain-specific problems can narrow the search for candidate models and inform strategic model design and adaptation to optimize performance on available data. Here, we study CNN architectures spanning a range of learnable parameter counts and architectural elements, such as parallel processing branches and residual connections with varying methods of recombining residual information. METHODS We compare five CNNs: ResNet-50, DenseNet-121, EfficientNet-B0, PhiNet, and an Inception module-based network, on a computed tomography angiography large vessel occlusion detection task. The models were trained and preliminarily evaluated with 10-fold cross-validation on preprocessed scans (n = 240). An ablation study was performed on PhiNet owing to its superior cross-validated test performance across accuracy, precision, recall, specificity, and F1 score. The final evaluation of all models was performed on a withheld external validation set (n = 60), and these predictions were subsequently calibrated with sigmoid curves. RESULTS Uncalibrated results on the withheld external validation set show that DenseNet-121 had the best average performance on accuracy, precision, recall, specificity, and F1 score. After calibration, DenseNet-121 maintained superior performance on all metrics except recall. CONCLUSIONS The number of learnable parameters in our five models and the best-ablated PhiNet related directly to cross-validated test performance: the smaller the model, the better. However, this pattern did not hold for generalization on the withheld external validation set. DenseNet-121 generalized best; we posit this was due to its heavy use of residual connections utilizing concatenation, which causes feature maps from earlier layers to be used deeper in the network while aiding gradient flow and regularization.