1
Liu X, Zhang J, Zhang Y, Chen L, Luo L, Tang J. Weakly supervised segmentation of retinal layers on OCT images with AMD using uncertainty prototype and boundary regression. Med Image Anal 2025; 102:103572. [PMID: 40179629] [DOI: 10.1016/j.media.2025.103572]
Abstract
Retinal layer segmentation in optical coherence tomography (OCT) images is a critical step in the diagnosis and treatment of age-related macular degeneration (AMD). In recent years, densely supervised OCT layer segmentation methods have made significant progress. However, obtaining pixel-by-pixel labeled masks for OCT retinal images is time-consuming and labor-intensive. To reduce dependence on dense annotations, this paper proposes a novel weakly supervised layer segmentation method with an Uncertainty Prototype module and Boundary Regression loss (W-UPBR), which requires only scribble annotations. Specifically, we first propose a feature enhancement U-Net (FEU-Net) to alleviate the severe layer distortion in OCT images with AMD; this model serves as the backbone of a dual-branch network framework. Within FEU-Net, in addition to the basic U-Net, two modules are proposed: the global-local context-aware (GLCA) module, which captures both global and local contextual information, and the multi-scale fusion (MSF) module, designed to fuse multi-scale features. Secondly, we propose an uncertainty prototype module that combines an uncertainty-guided prototype with a distance optimization loss. This module exploits the similarities and dissimilarities between OCT images, thereby reducing mis-segmentation of layers caused by interference factors. Furthermore, a mixed pseudo-label strategy blends different predictions to alleviate the limitations of insufficient supervision and further promote network training. Finally, we design a boundary regression loss that constrains boundaries in both 1D and 2D under the supervision of the generated mixed pseudo-labels, thereby reducing topological errors.
The proposed method was evaluated on three datasets; the results show that it outperformed other state-of-the-art weakly supervised methods and achieved performance comparable to fully supervised methods.
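As a rough illustration of the uncertainty-guided prototype idea, the sketch below down-weights high-entropy (uncertain) pixels when averaging features into a class prototype. The function name and the exact weighting scheme are assumptions for illustration, not the W-UPBR authors' implementation.

```python
import numpy as np

def uncertainty_weighted_prototype(features, probs, labels, cls):
    """Compute a class prototype, down-weighting uncertain pixels.

    features: (N, D) pixel embeddings; probs: (N, C) softmax outputs;
    labels: (N,) hard predictions; cls: class index.
    Pixels with high normalized predictive entropy contribute less.
    """
    eps = 1e-8
    # Normalized entropy in [0, 1] as a simple uncertainty estimate.
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    uncertainty = entropy / np.log(probs.shape[1])
    weights = (1.0 - uncertainty) * (labels == cls)
    if weights.sum() < eps:
        return np.zeros(features.shape[1])
    return (weights[:, None] * features).sum(0) / weights.sum()
```

A confident pixel of the target class dominates the prototype, while a near-uniform prediction contributes almost nothing.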
Affiliation(s)
- Xiaoming Liu
- School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, China; Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan 430065, China
- Jia Zhang
- School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, China; Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan 430065, China
- Ying Zhang
- Aier Eye Hospital of Wuhan University, Wuhan 430014, China
- Li Chen
- School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, China; Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan 430065, China
- Liangfu Luo
- School of Computer Science Institute, Wuhan Qingchuan University, Wuhan 430205, China
- Jinshan Tang
- Department of Health Administration and Policy, George Mason University, Fairfax, VA 22030, USA
2
Liu Y, Lin L, Wong KKY, Tang X. ProCNS: Progressive Prototype Calibration and Noise Suppression for Weakly-Supervised Medical Image Segmentation. IEEE J Biomed Health Inform 2025; 29:2845-2858. [PMID: 40030827] [DOI: 10.1109/jbhi.2024.3522958]
Abstract
Weakly-supervised segmentation (WSS) has emerged as a way to mitigate the conflict between annotation cost and model performance by adopting sparse annotation formats (e.g., point, scribble, or block). Typical approaches attempt to exploit anatomy and topology priors to directly expand sparse annotations into pseudo-labels. However, due to a lack of attention to the ambiguous boundaries in medical images and insufficient exploration of sparse supervision, existing approaches tend to generate erroneous and overconfident pseudo proposals in noisy regions, leading to cumulative model error and performance degradation. In this work, we propose a novel WSS approach, named ProCNS, encompassing two synergistic modules built on the principles of progressive prototype calibration and noise suppression. Specifically, we design a Prototype-based Regional Spatial Affinity (PRSA) loss to maximize the pair-wise affinities between spatial and semantic elements, providing the model with more reliable guidance. The affinities are derived from the input images and the prototype-refined predictions. Meanwhile, we propose an Adaptive Noise Perception and Masking (ANPM) module to obtain richer and more representative prototype representations; it adaptively identifies and masks noisy regions within the pseudo proposals, reducing potential erroneous interference during prototype computation. Furthermore, we generate specialized soft pseudo-labels for the noisy regions identified by ANPM, providing supplementary supervision. Extensive experiments on six medical image segmentation tasks involving different modalities demonstrate that the proposed framework significantly outperforms representative state-of-the-art methods.
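A minimal sketch of the noise-masking intuition behind ANPM: keep only pixels whose network confidence in their pseudo-label class is high, so low-confidence (likely noisy) pixels are excluded from prototype computation. The fixed-threshold criterion here is an assumption; the paper's adaptive rule differs.

```python
import numpy as np

def mask_noisy_pixels(probs, pseudo_labels, tau=0.7):
    """Flag pixels whose pseudo-label looks unreliable.

    A pixel is kept only if the network's probability for its
    pseudo-label class exceeds tau. probs: (N, C) softmax outputs;
    pseudo_labels: (N,) ints. Returns a boolean keep-mask of shape (N,).
    """
    conf = probs[np.arange(len(pseudo_labels)), pseudo_labels]
    return conf > tau
```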
3
Szűcs ÁI, Kári B, Pártos O. Myocardial perfusion imaging SPECT left ventricle segmentation with graphs. EJNMMI Phys 2025; 12:21. [PMID: 40063231] [PMCID: PMC11893936] [DOI: 10.1186/s40658-025-00728-5]
Abstract
PURPOSE Various specialized and general collimators are used for myocardial perfusion imaging (MPI) with single-photon emission computed tomography (SPECT) to assess different types of coronary artery disease (CAD). Alongside the wide variability in imaging characteristics, the a priori "learnt" information on left ventricular (LV) shape can affect the final diagnosis of the imaging protocol. This study evaluates the effect of incorporating prior information into the segmentation process, compared to deep learning (DL) approaches, as well as the differences among 4 collimation techniques on 5 different datasets. METHODS The study used a database of 80 patients: 40 came from mixed black-box collimators, and 10 each from multi-pinhole (MPH), low-energy high-resolution (LEHR), CardioC, and CardioD collimators. Testing evaluated a new continuous graph-based approach that automatically segments the left ventricular volume using prior information on the cardiac geometry. The technique is based on the continuous max-flow (CMF) min-cut algorithm, whose performance was evaluated with precision, recall, IoU, and Dice score metrics. RESULTS The developed method showed a clear improvement over deep learning, reaching higher scores on most of the evaluation metrics. Further investigating the different collimation techniques, evaluation of receiver operating characteristic (ROC) curves showed different stabilities across the collimators. A Wilcoxon signed-rank test on the outlines of the LVs showed that the collimation procedures are differentiable. To further investigate these phenomena, the model parameters of the LVs were reconstructed and evaluated with the uniform manifold approximation and projection (UMAP) method, which further confirmed that collimators can be differentiated from the projected LV shapes alone.
CONCLUSIONS The results show that incorporating prior information can enhance the performance of segmentation methods, and collimation strategies have a strong effect on the projected cardiac geometry.
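The four evaluation metrics reported here (precision, recall, IoU, Dice) all follow from a binary confusion matrix; a self-contained sketch:

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Precision, recall, IoU, and Dice for binary masks.

    pred, gt: boolean arrays of equal shape.
    """
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 1.0
    return precision, recall, iou, dice
```

Note that Dice = 2·IoU / (1 + IoU), so the two metrics rank methods identically; reporting both mainly aids comparison across papers.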
Affiliation(s)
- Ádám István Szűcs
- Computer Algebra, Eötvös Loránd University, Pázmány Péter blvd. 1/c, Budapest, Pest, 1117, Hungary
- Béla Kári
- Nuclear Medicine, Semmelweis University, Üllői street 78b, Budapest, Pest, 1083, Hungary
- Oszkár Pártos
- Nuclear Medicine, Semmelweis University, Üllői street 78b, Budapest, Pest, 1083, Hungary
4
Aly M. Weakly-supervised thyroid ultrasound segmentation: Leveraging multi-scale consistency, contextual features, and bounding box supervision for accurate target delineation. Comput Biol Med 2025; 186:109669. [PMID: 39809086] [DOI: 10.1016/j.compbiomed.2025.109669]
Abstract
Weakly-supervised learning (WSL) methods have gained significant attention in medical image segmentation, but they often struggle to delineate boundaries accurately because they overfit to weak annotations such as bounding boxes. This issue is particularly pronounced in thyroid ultrasound images, where low contrast and noisy backgrounds hinder precise segmentation. In this paper, we propose a novel weakly-supervised segmentation framework that addresses these challenges. Our framework integrates several key components: the Spatial Arrangement Consistency (SAC) branch, the Hierarchical Prediction Consistency (HPC) branch, the Contextual Feature Integration (CFI) branch, and the Multi-scale Prototype Refinement (MPR) module. These elements work together to enhance segmentation performance and mitigate overfitting to bounding box annotations. Specifically, the SAC branch ensures spatial alignment of the predicted segmentation with the target by evaluating maximum activations along both the horizontal and vertical dimensions of the bounding box. The HPC branch refines prototypes for target and background regions from semantic feature maps, comparing secondary predictions with the initial ones to improve segmentation accuracy. The CFI branch enhances feature representation by incorporating contextual information from neighboring regions, while the MPR module further refines segmentation accuracy by balancing global context and local details through multi-scale feature refinement. We evaluate the method on two thyroid ultrasound datasets, TG3K and TN3K, using comprehensive metrics including mIOU, DSC, HD95, DI, ACC, PR, and SE. On the TG3K dataset, the proposed method achieved an mIOU of 71.85%, DSC of 85.92%, HD95 of 13.09 mm, and ACC of 0.93, significantly outperforming existing weakly-supervised methods.
On the TN3K dataset, the model achieved an mIOU of 70.45%, DSC of 84.81%, HD95 of 14.16 mm, and ACC of 0.91, further validating its robustness across datasets. In terms of precision (PR) and sensitivity (SE), the proposed method achieved PR = 0.91 and SE = 0.86 on TG3K, and PR = 0.89 and SE = 0.86 on TN3K. These results show that the model not only improves segmentation accuracy and boundary delineation (HD95) but also significantly reduces the dependency on pixel-level annotations, providing an effective solution for weakly-supervised thyroid ultrasound segmentation. The method demonstrates performance competitive with fully-supervised approaches at reduced annotation time, improving the practicality of deep learning-based segmentation in clinical settings.
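The SAC branch's use of row- and column-wise maximum activations inside a tight bounding box can be sketched as a simple consistency penalty: every row and column of a tight box should contain at least one target pixel, so the per-row and per-column maxima of the predicted foreground probability should approach 1. The formulation below is an illustrative reading, not the paper's exact loss.

```python
import numpy as np

def sac_loss(prob_map, box):
    """Spatial-arrangement consistency for a tight bounding box.

    prob_map: (H, W) predicted foreground probabilities in [0, 1];
    box: (y0, y1, x0, x1) with exclusive upper bounds.
    """
    y0, y1, x0, x1 = box
    crop = prob_map[y0:y1, x0:x1]
    row_max = crop.max(axis=1)  # best activation per row of the box
    col_max = crop.max(axis=0)  # best activation per column of the box
    return float((1 - row_max).mean() + (1 - col_max).mean())
```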
Affiliation(s)
- Mohammed Aly
- Department of Artificial Intelligence, Faculty of Artificial Intelligence, Egyptian Russian University, 11829, Badr City, Egypt
5
Nguyen-Tat TB, Vo HA, Dang PS. QMaxViT-Unet+: A query-based MaxViT-Unet with edge enhancement for scribble-supervised segmentation of medical images. Comput Biol Med 2025; 187:109762. [PMID: 39919665] [DOI: 10.1016/j.compbiomed.2025.109762]
Abstract
The deployment of advanced deep learning models for medical image segmentation is often constrained by the requirement for extensively annotated datasets. Weakly-supervised learning, which allows less precise labels, has become a promising solution to this challenge. Building on this approach, we propose QMaxViT-Unet+, a novel framework for scribble-supervised medical image segmentation. This framework is built on the U-Net architecture, with the encoder and decoder replaced by Multi-Axis Vision Transformer (MaxViT) blocks. These blocks enhance the model's ability to learn local and global features efficiently. Additionally, our approach integrates a query-based Transformer decoder to refine features and an edge enhancement module to compensate for the limited boundary information in the scribble label. We evaluate the proposed QMaxViT-Unet+ on four public datasets focused on cardiac structures, colorectal polyps, and breast cancer: ACDC, MS-CMRSeg, SUN-SEG, and BUSI. Evaluation metrics include the Dice similarity coefficient (DSC) and the 95th percentile of Hausdorff distance (HD95). Experimental results show that QMaxViT-Unet+ achieves 89.1% DSC and 1.316 mm HD95 on ACDC, 88.4% DSC and 2.226 mm HD95 on MS-CMRSeg, 71.4% DSC and 4.996 mm HD95 on SUN-SEG, and 69.4% DSC and 50.122 mm HD95 on BUSI. These results demonstrate that our method outperforms existing approaches in terms of accuracy, robustness, and efficiency while remaining competitive with fully-supervised learning approaches. This makes it ideal for medical image analysis, where high-quality annotations are often scarce and require significant effort and expense. The code is available at https://github.com/anpc849/QMaxViT-Unet.
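Scribble-supervised methods like this one build on a partial cross-entropy loss that scores only the annotated pixels; a minimal sketch (the paper adds query-based decoding and edge enhancement on top of such a base loss):

```python
import numpy as np

def partial_cross_entropy(probs, scribble, ignore_index=-1):
    """Cross-entropy evaluated only on scribble-annotated pixels.

    probs: (N, C) softmax outputs; scribble: (N,) class ids, with
    ignore_index marking unannotated pixels (which contribute nothing).
    """
    mask = scribble != ignore_index
    if not mask.any():
        return 0.0
    # Probability assigned to the annotated class at each labeled pixel.
    picked = probs[mask, scribble[mask]]
    return float(-np.log(picked + 1e-8).mean())
```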
Affiliation(s)
- Thien B Nguyen-Tat
- University of Information Technology, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam
- Hoang-An Vo
- University of Information Technology, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam
- Phuoc-Sang Dang
- University of Information Technology, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam
6
Lin L, Liu Y, Wu J, Cheng P, Cai Z, Wong KKY, Tang X. FedLPPA: Learning Personalized Prompt and Aggregation for Federated Weakly-Supervised Medical Image Segmentation. IEEE Trans Med Imaging 2025; 44:1127-1139. [PMID: 39423080] [DOI: 10.1109/tmi.2024.3483221]
Abstract
Federated learning (FL) effectively mitigates the data-silo challenge created by policy and privacy concerns, implicitly harnessing more data for deep model training. However, traditional centralized FL models struggle with diverse multi-center data, especially under significant data heterogeneity, notably in medical contexts. In medical image segmentation, the growing imperative to curtail annotation costs has amplified the importance of weakly-supervised techniques that utilize sparse annotations such as points and scribbles. A pragmatic FL paradigm should accommodate diverse annotation formats across different sites, a research topic that remains under-investigated. In this context, we propose a novel personalized FL framework with learnable prompt and aggregation (FedLPPA) to uniformly leverage heterogeneous weak supervision for medical image segmentation. In FedLPPA, a learnable universal knowledge prompt is maintained, complemented by multiple learnable personalized data-distribution prompts and prompts representing the supervision sparsity. Integrated with sample features through a dual-attention mechanism, these prompts empower each local task decoder to adapt to both the local distribution and the supervision form. Concurrently, a dual-decoder strategy, predicated on prompt similarity, is introduced to enhance pseudo-label generation in weakly-supervised learning, alleviating the overfitting and noise accumulation inherent to local data, while an adaptable aggregation method customizes the task decoder on a parameter-wise basis. Extensive experiments on four distinct medical image segmentation tasks involving different modalities underscore the superiority of FedLPPA, with efficacy closely paralleling that of fully supervised centralized training. Our code and data will be available at https://github.com/llmir/FedLPPA.
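A toy sketch of similarity-driven personalized aggregation: each client averages the other clients' decoder parameters with softmax weights over prompt cosine similarities. This is a simplified reading of FedLPPA's adaptable aggregation with hypothetical names, not its exact rule.

```python
import numpy as np

def personalized_aggregate(client_params, prompts, k):
    """Aggregate decoder parameters for client k, weighted by how
    similar each client's prompt is to client k's prompt.

    client_params: list of (D,) parameter vectors; prompts: list of
    (P,) prompt vectors; k: index of the aggregating client.
    """
    sims = np.array([
        float(np.dot(prompts[k], p) /
              (np.linalg.norm(prompts[k]) * np.linalg.norm(p) + 1e-8))
        for p in prompts
    ])
    w = np.exp(sims) / np.exp(sims).sum()  # softmax over clients
    return sum(wi * np.asarray(pi) for wi, pi in zip(w, client_params))
```

With identical prompts, this reduces to a plain FedAvg-style mean; dissimilar prompts shift weight toward like-distributed clients.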
7
Chen J, Huang W, Zhang J, Debattista K, Han J. Addressing inconsistent labeling with cross image matching for scribble-based medical image segmentation. IEEE Trans Image Process 2025; PP:842-853. [PMID: 40031274] [DOI: 10.1109/tip.2025.3530787]
Abstract
In recent years, there has been a notable surge in the adoption of weakly-supervised learning for medical image segmentation, using scribble annotation as a means to reduce annotation costs. However, the inherent characteristics of scribble labeling, marked by incompleteness, subjectivity, and a lack of standardization, introduce inconsistencies into the annotations. These inconsistencies pose significant challenges for the network's learning process, ultimately affecting segmentation performance. To address this challenge, we propose creating a reference set to guide pixel-level feature matching, constructed from class-specific tokens and pixel-level features extracted from various images. Serving as a repository of diverse pixel styles and classes, the reference set becomes the cornerstone of a pixel-level feature-matching strategy. This strategy enables effective comparison of unlabeled pixels, offering guidance particularly in learning scenarios characterized by inconsistent and incomplete scribbles. The proposed strategy incorporates smoothing and regression techniques to align pixel-level features across different images. By leveraging the diversity of pixel sources, our matching approach enhances the network's ability to learn consistent patterns from the reference set. This, in turn, mitigates the impact of inconsistent and incomplete labeling, resulting in improved segmentation outcomes. Extensive experiments on three publicly available datasets demonstrate the superiority of our approach over state-of-the-art methods in terms of segmentation accuracy and stability. The code will be made publicly available at https://github.com/jingkunchen/scribble-medical-segmentation.
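The core matching step, assigning an unlabeled pixel the class of its most similar reference entry, can be sketched with cosine similarity (the paper's full strategy adds smoothing and regression; names here are illustrative):

```python
import numpy as np

def match_to_reference(feature, reference_feats, reference_labels):
    """Return the class of the reference entry most similar to the
    given pixel feature, by cosine similarity.

    feature: (D,); reference_feats: (M, D); reference_labels: (M,).
    """
    f = feature / (np.linalg.norm(feature) + 1e-8)
    r = reference_feats / (
        np.linalg.norm(reference_feats, axis=1, keepdims=True) + 1e-8)
    sims = r @ f  # cosine similarity to every reference entry
    return reference_labels[int(np.argmax(sims))]
```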
8
Zhang Y, Zhao S, Gu H, Mazurowski MA. How to Efficiently Annotate Images for Best-Performing Deep Learning-Based Segmentation Models: An Empirical Study with Weak and Noisy Annotations and Segment Anything Model. J Imaging Inform Med 2025. [PMID: 39843720] [DOI: 10.1007/s10278-025-01408-7]
Abstract
Deep neural networks (DNNs) have demonstrated exceptional performance across various image segmentation tasks. However, preparing datasets for training segmentation DNNs is both labor-intensive and costly, as it typically requires pixel-level annotations for each object of interest. To mitigate this challenge, alternative approaches such as weak labels (e.g., bounding boxes or scribbles) or less precise (noisy) annotations can be employed. Noisy and weak labels are significantly quicker to generate, allowing more annotated images within the same time frame, though the decrease in annotation quality may adversely impact the segmentation performance of the resulting model. In this study, we conducted a comprehensive cost-effectiveness evaluation of six variants of annotation strategies (9-10 sub-variants in total) across 4 datasets and conclude that the common practice of precisely outlining objects of interest is virtually never the optimal approach when the annotation budget is limited. Both noisy and weak annotations showed use cases that yield performance similar to their perfectly annotated counterparts, yet with significantly better cost-effectiveness. We hope our findings help researchers become aware of the available options and use their annotation budgets more efficiently, especially where accurately acquiring labels for target objects is particularly costly. Our code will be made available at https://github.com/yzluka/AnnotationEfficiency2D.
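For intuition, a bounding-box weak label can be derived from a precise mask by filling its tight bounding rectangle; a minimal sketch (not the study's annotation tooling):

```python
import numpy as np

def mask_to_bbox_label(mask):
    """Convert a dense binary mask into a filled bounding-box weak
    label, one of the cheaper annotation formats such studies compare
    against precise outlines. mask: (H, W) array of {0, 1}."""
    ys, xs = np.nonzero(mask)
    box = np.zeros_like(mask)
    if len(ys):
        box[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = 1
    return box
```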
Affiliation(s)
- Yixin Zhang
- Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
- Shen Zhao
- Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
- Hanxue Gu
- Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
- Maciej A Mazurowski
- Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
- Department of Radiology, Duke University, Durham, NC, USA
- Department of Biostatistics & Bioinformatics, Duke University, Durham, NC, USA
- Department of Computer Science, Duke University, Durham, NC, USA
9
Zhang L, Li W, Bi K, Li P, Zhang L, Liu H. FDDSeg: Unleashing the Power of Scribble Annotation for Cardiac MRI Images Through Feature Decomposition Distillation. IEEE J Biomed Health Inform 2025; 29:285-296. [PMID: 38787661] [DOI: 10.1109/jbhi.2024.3404884]
Abstract
Cardiovascular diseases can be diagnosed with computer assistance using magnetic resonance imaging (MRI) images produced by the MRI sensor. Deep learning-based scribble-supervised MRI image segmentation has recently demonstrated impressive results. However, the majority of current approaches have an excessive number of model parameters and do not fully utilize scribble annotations. We developed a feature decomposition distillation deep learning method, named FDDSeg, for scribble-supervised cardiac MRI image segmentation. The public ACDC and MSCMR cardiac MRI datasets were used to evaluate the segmentation performance of FDDSeg. FDDSeg adopts a scribble annotation reuse policy to help provide accurate boundaries, and intermediate features are split into class regions and class-free regions using pseudo-labels to further improve feature learning. Effective distillation knowledge is then captured by feature decomposition. FDDSeg was compared with 7 state-of-the-art methods (MAAG, ShapePU, CycleMix, Dual-Branch, ZscribbleSeg, Perturbation Dual-Branch, and ScribbleVC) on both the ACDC and MSCMR datasets. FDDSeg performed best on the DSC (89.05% and 88.75%), JC (80.30% and 79.78%), and HD95 (5.76% and 4.44%) metrics with only 2.01 M parameters. FDDSeg can segment cardiac MRI images more precisely with only scribble annotations at lower computational cost, which may help increase the efficiency of quantitative cardiac analysis.
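The class-region / class-free-region split that FDDSeg distills from can be sketched as masking intermediate features with a pseudo-label mask; shapes and the function signature here are illustrative, not the paper's implementation.

```python
import numpy as np

def decompose_features(features, pseudo_mask):
    """Split a feature map into class-region and class-free-region
    components using a binary pseudo-label mask.

    features: (C, H, W); pseudo_mask: (H, W) in {0, 1}.
    Returns (class_feat, free_feat), which sum back to features.
    """
    class_feat = features * pseudo_mask        # inside predicted class
    free_feat = features * (1 - pseudo_mask)   # background / class-free
    return class_feat, free_feat
```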
10
Huang Z, Wang Z, Zhao T, Ding X, Yang X. Toward high-quality pseudo masks from noisy or weak annotations for robust medical image segmentation. Neural Netw 2025; 181:106850. [PMID: 39520897] [DOI: 10.1016/j.neunet.2024.106850]
Abstract
Deep learning networks excel at image segmentation when abundant, accurately annotated training samples are available. However, in medical applications, acquiring large quantities of high-quality labeled images is prohibitively expensive. Thus, learning from imperfect annotations (e.g., noisy or weak annotations) has emerged as a prominent research area in medical image segmentation. This work aims to extract high-quality pseudo masks from imperfect annotations with the assistance of a small number of clean labels. Our core motivation is the understanding that different types of imperfect annotations inherently exhibit unique noise patterns. Comparing clean annotations with the corresponding imperfectly annotated labels can effectively identify potential noise patterns at minimal additional cost. To this end, we propose a two-phase framework comprising a noise identification network and a noise-robust segmentation network. The former implicitly learns noise patterns and revises labels accordingly; it includes a three-branch network to identify different types of noise. The latter further mitigates the negative influence of residual annotation noise using parallel segmentation networks with different initializations and a label softening strategy. Extensive experimental results on two public datasets demonstrate that our method can effectively refine annotation flaws and achieve segmentation performance superior to state-of-the-art methods.
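The label-softening idea can be sketched with generic label smoothing; the paper ties its softening to the noise regions it identifies, so this is only an approximation of the strategy.

```python
import numpy as np

def soften_labels(hard_labels, num_classes, smoothing=0.2):
    """Turn hard one-hot targets into soft targets so residual
    annotation noise is penalized less sharply.

    hard_labels: (N,) class ids. Each row of the output still sums to 1.
    """
    one_hot = np.eye(num_classes)[hard_labels]
    return one_hot * (1 - smoothing) + smoothing / num_classes
```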
Affiliation(s)
- Zihang Huang
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
- Zhiwei Wang
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
- Tianyu Zhao
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
- Xiaohuan Ding
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
- Xin Yang
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
11
Li C, Zheng Z, Wu D. Shape-Aware Adversarial Learning for Scribble-Supervised Medical Image Segmentation with a MaskMix Siamese Network: A Case Study of Cardiac MRI Segmentation. Bioengineering (Basel) 2024; 11:1146. [PMID: 39593806] [PMCID: PMC11592347] [DOI: 10.3390/bioengineering11111146]
Abstract
The transition in medical image segmentation from fine-grained to coarse-grained annotation methods, notably scribble annotation, offers a practical and efficient preparation for deep learning applications. However, these methods often compromise segmentation precision and result in irregular contours. This study targets enhancing scribble-supervised segmentation to match the accuracy of fine-grained annotation. Capitalizing on the consistency of target shapes across unpaired datasets, this study introduces a shape-aware scribble-supervised learning framework (MaskMixAdv) addressing two critical tasks: (1) pseudo-label generation, where a mixup-based masking strategy enables image-level and feature-level data augmentation to enrich coarse-grained scribble annotations, and a dual-branch Siamese network generates fine-grained pseudo-labels; (2) pseudo-label optimization, where a CNN-based discriminator refines pseudo-label contours by distinguishing them from external unpaired masks during model fine-tuning. MaskMixAdv works under constrained annotation conditions as a label-efficient learning approach for medical image segmentation. A case study on public cardiac MRI datasets demonstrated that MaskMixAdv outperformed state-of-the-art methods and narrowed the performance gap between scribble-supervised and mask-supervised segmentation. This innovation cuts annotation time by at least 95%, with only a minor impact on Dice performance, specifically a 2.6% reduction. These outcomes indicate that efficient, cost-effective scribble annotation can achieve high segmentation accuracy, significantly reducing the typical requirement for fine-grained annotations.
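The image-level half of the mixup-based masking strategy can be sketched as a binary-mask composition of two images (the framework also mixes at the feature level and applies the same mask to the scribbles):

```python
import numpy as np

def mask_mix(img_a, img_b, mask):
    """Compose a training image: masked regions come from img_a,
    the complement from img_b. mask: binary array broadcastable to
    the image shapes."""
    return mask * img_a + (1 - mask) * img_b
```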
Affiliation(s)
- Zhong Zheng
- College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China; (C.L.); (D.W.)
12
Han M, Luo X, Xie X, Liao W, Zhang S, Song T, Wang G, Zhang S. DMSPS: Dynamically mixed soft pseudo-label supervision for scribble-supervised medical image segmentation. Med Image Anal 2024; 97:103274. [PMID: 39043109] [DOI: 10.1016/j.media.2024.103274]
Abstract
High performance of deep learning on medical image segmentation relies on large-scale, pixel-level dense annotations, which pose a substantial burden on medical experts due to the laborious and time-consuming annotation process, particularly for 3D images. To reduce the labeling cost while maintaining relatively satisfactory segmentation performance, weakly-supervised learning with sparse labels has attracted increasing attention. In this work, we present a scribble-based framework for medical image segmentation, called Dynamically Mixed Soft Pseudo-label Supervision (DMSPS). Concretely, we extend a backbone with an auxiliary decoder to form a dual-branch network that enhances the feature capture capability of the shared encoder. Considering that most pixels lack labels and hard pseudo-labels tend to be over-confident, resulting in poor segmentation, we propose to use soft pseudo-labels, generated by dynamically mixing the decoders' predictions, as auxiliary supervision. To further enhance performance, we adopt a two-stage approach in which the sparse scribbles are expanded based on low-uncertainty predictions from the first-stage model, yielding more annotated pixels to train the second-stage model. Experiments on the ACDC dataset for cardiac structure segmentation, the WORD dataset for 3D abdominal organ segmentation, and the BraTS2020 dataset for 3D brain tumor segmentation showed that: (1) compared with the baseline, our method improved the average DSC from 50.46% to 89.51%, from 75.46% to 87.56%, and from 52.61% to 76.53% on the three datasets, respectively; (2) DMSPS outperformed five state-of-the-art scribble-supervised segmentation methods and generalizes to different segmentation backbones. The code is available online at: https://github.com/HiLab-git/DMSPS.
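The soft pseudo-label construction, dynamically mixing the two decoders' predictions with a random convex coefficient, can be sketched as follows; the uniform sampling of the mixing coefficient is an assumption, not necessarily the paper's schedule.

```python
import numpy as np

def dynamic_soft_pseudo_label(p_main, p_aux, rng=None):
    """Soft pseudo-label as a convex mix of two decoder outputs.

    p_main, p_aux: (N, C) softmax outputs from the two branches.
    Each call samples a fresh mixing coefficient.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    alpha = rng.uniform()  # dynamic mixing coefficient in [0, 1)
    mixed = alpha * p_main + (1.0 - alpha) * p_aux
    # A convex mix of distributions is a distribution; renormalize
    # defensively against numerical drift.
    return mixed / mixed.sum(axis=1, keepdims=True)
```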
Affiliation(s)
- Meng Han
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Xiangde Luo
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China
- Xiangjiang Xie
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Wenjun Liao
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, Sichuan Cancer Center, Chengdu, China; School of Medicine, University of Electronic Science and Technology of China, Chengdu, China
- Shichuan Zhang
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, Sichuan Cancer Center, Chengdu, China
- Tao Song
- SenseTime Research, Shanghai, China
- Guotai Wang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China
- Shaoting Zhang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China
13
Yan P, Li M, Zhang J, Li G, Jiang Y, Luo H. Cold SegDiffusion: A novel diffusion model for medical image segmentation. Knowl Based Syst 2024; 301:112350. [DOI: 10.1016/j.knosys.2024.112350]
14
You C, Dai W, Liu F, Min Y, Dvornek NC, Li X, Clifton DA, Staib L, Duncan JS. Mine Your Own Anatomy: Revisiting Medical Image Segmentation With Extremely Limited Labels. IEEE Trans Pattern Anal Mach Intell 2024; PP:11136-11151. [PMID: 39269798] [PMCID: PMC11903367] [DOI: 10.1109/tpami.2024.3461321]
Abstract
Recent studies on contrastive learning have achieved remarkable performance in medical image segmentation by leveraging only a few labels. Existing methods mainly focus on instance discrimination and invariant mapping (i.e., pulling positive samples closer and pushing negative samples apart in the feature space). However, they face three common pitfalls: (1) tailness: medical image data usually follow an implicit long-tail class distribution, so blindly leveraging all pixels in training can lead to data imbalance and deteriorated performance; (2) consistency: it remains unclear whether a segmentation model has learned meaningful yet consistent anatomical features, due to the intra-class variation between different anatomical features; and (3) diversity: the intra-slice correlations within the entire dataset have received significantly less attention. This motivates us to seek a principled approach for strategically making use of the dataset itself to discover similar yet distinct samples from different anatomical views. In this paper, we introduce a novel semi-supervised 2D medical image segmentation framework termed Mine yOur owN Anatomy (MONA) and make three contributions. First, prior work argues that every pixel matters equally to model training; we observe empirically that this alone is unlikely to define meaningful anatomical features, mainly due to the lack of a supervision signal. We show two simple solutions towards learning invariances: stronger data augmentations and nearest neighbors. Second, we construct a set of objectives that encourage the model to decompose medical images into a collection of anatomical features in an unsupervised manner. Lastly, we demonstrate, both empirically and theoretically, the efficacy of MONA on three benchmark datasets, achieving new state-of-the-art results under different labeled semi-supervised settings. MONA makes minimal assumptions on domain expertise and hence constitutes a practical and versatile solution for medical image analysis. We provide PyTorch-like pseudo-code in the supplementary material.
15
Qu Y, Lu T, Zhang S, Wang G. ScribSD+: Scribble-supervised medical image segmentation based on simultaneous multi-scale knowledge distillation and class-wise contrastive regularization. Comput Med Imaging Graph 2024; 116:102416. [PMID: 39018640] [DOI: 10.1016/j.compmedimag.2024.102416]
Abstract
Although deep learning has achieved state-of-the-art performance for automatic medical image segmentation, it often requires a large amount of pixel-level manual annotation for training. Obtaining these high-quality annotations is time-consuming and requires specialized knowledge, which hinders widespread applications that rely on such annotations to train models with good segmentation performance. Using scribble annotations can substantially reduce the annotation cost, but often leads to poor segmentation performance due to insufficient supervision. In this work, we propose a novel framework named ScribSD+, based on multi-scale knowledge distillation and class-wise contrastive regularization, for learning from scribble annotations. For a student network supervised by scribbles and a teacher network updated via Exponential Moving Average (EMA), we first introduce multi-scale prediction-level Knowledge Distillation (KD) that leverages the soft predictions of the teacher network to supervise the student at multiple scales, and then propose class-wise contrastive regularization that encourages feature similarity within the same class and dissimilarity across different classes, thereby effectively improving the segmentation performance of the student network. Experimental results on the ACDC dataset for heart structure segmentation and a fetal MRI dataset for placenta and fetal brain segmentation demonstrate that our method significantly improves the student's performance and outperforms five state-of-the-art scribble-supervised learning methods. Consequently, the method has potential for reducing the annotation cost in developing deep learning models for clinical diagnosis.
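The EMA teacher-student scheme mentioned in the abstract can be sketched as follows; the decay value of 0.99 and the plain parameter-list representation are assumptions for illustration, not details from the paper.

```python
import numpy as np

def ema_update(teacher_params, student_params, decay=0.99):
    """Update teacher weights as an exponential moving average of the student.

    teacher <- decay * teacher + (1 - decay) * student, applied per tensor.
    The teacher therefore changes slowly, yielding the smoother soft
    predictions used to distil the student.
    """
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_params, student_params)]

# Toy check: with a constant student, the teacher converges toward it.
teacher = [np.zeros(4)]
student = [np.ones(4)]
for _ in range(500):
    teacher = ema_update(teacher, student)
assert np.all(teacher[0] > 0.99)  # slow drift toward the student's weights
```

The slow drift is the point of the design: the teacher averages over many student states, so its predictions are less noisy than any single student snapshot.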
Affiliation(s)
- Yijie Qu
- University of Electronic Science and Technology of China, Chengdu, China
- Tao Lu
- Sichuan Provincial People's Hospital, Chengdu, China
- Shaoting Zhang
- University of Electronic Science and Technology of China, Chengdu, China; Shanghai AI lab, Shanghai, China
- Guotai Wang
- University of Electronic Science and Technology of China, Chengdu, China; Shanghai AI lab, Shanghai, China
16
Zhang H, Cai Z. ConvNextUNet: A small-region attentioned model for cardiac MRI segmentation. Comput Biol Med 2024; 177:108592. [PMID: 38781642] [DOI: 10.1016/j.compbiomed.2024.108592]
Abstract
Cardiac MRI segmentation is a significant research area in medical image processing, holding immense clinical and scientific importance in assisting the diagnosis and treatment of heart diseases. Currently, existing cardiac MRI segmentation algorithms are often constrained by specific datasets and conditions, leading to a notable decrease in segmentation performance when applied to diverse datasets. These limitations affect the algorithm's overall performance and generalization capabilities. Inspired by ConvNext, we introduce a two-dimensional cardiac MRI segmentation U-shaped network called ConvNextUNet. It is the first application of a combination of ConvNext and the U-shaped architecture in the field of cardiac MRI segmentation. Firstly, we incorporate up-sampling modules into the original ConvNext architecture and combine it with the U-shaped framework to achieve accurate reconstruction. Secondly, we integrate an Input Stem into ConvNext and introduce attention mechanisms along the bridging path. By merging features extracted from both the encoder and decoder, a probability distribution is obtained through linear and nonlinear transformations, serving as attention weights, thereby enhancing the signal of the same region of interest. The resulting attention weights are applied to the decoder features, highlighting the region of interest. This allows the model to simultaneously consider local context and global details during the learning phase, fully leveraging the advantages of both global and local perception for a more comprehensive understanding of cardiac anatomical structures. Consequently, the model demonstrates a clear advantage and robust generalization capability, especially in small-region segmentation. Experimental results on the ACDC, LVQuan19, and RVSC datasets confirm that the ConvNextUNet model outperforms the current state-of-the-art models, particularly in small-region segmentation tasks. Furthermore, we conducted cross-dataset training and testing experiments, which revealed that the pre-trained model can accurately segment diverse cardiac datasets, showcasing its powerful generalization capabilities. The source code of this project is available at https://github.com/Zemin-Cai/ConvNextUNet.
Affiliation(s)
- Huiyi Zhang
- The Department of Electronic Engineering, Shantou University, Shantou, Guangdong 515063, PR China; Key Laboratory of Digital Signal and Image Processing of Guangdong Province, Shantou, Guangdong 515063, PR China
- Zemin Cai
- The Department of Electronic Engineering, Shantou University, Shantou, Guangdong 515063, PR China; Key Laboratory of Digital Signal and Image Processing of Guangdong Province, Shantou, Guangdong 515063, PR China
17
Jalata I, Nakarmi U. Cut-Puzzle mix: Scribble Guided Medical Image Segmentation without Segmentation Masks. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-5. [PMID: 40039028] [DOI: 10.1109/embc53108.2024.10782628]
Abstract
The majority of contemporary fully supervised segmentation algorithms excel at quantifying human anatomy, significantly advancing the field. However, the reliance of many deep neural networks on extensive datasets with full pixel-wise annotations poses challenges, as creating such annotated training data is both laborious and financially demanding. In response, there has been a notable shift towards leveraging limited data, specifically in the form of scribble annotations. This study explores training strategies that learn the parameters of a pixel-wise segmentation network solely from scribble annotations, employing cut-mix and puzzle-mix strategies. To further strengthen supervision regularization, consistency losses are combined with cross-entropy, penalizing inconsistent segmentations and leading to a noteworthy improvement in segmentation performance. These techniques are evaluated on the publicly available cardiac (ACDC) and MSCMR segmentation datasets. Our proposed method demonstrates impressive performance, surpassing the results achieved by state-of-the-art methods.
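A cut-mix style operation for scribble-supervised training can be sketched as below: a rectangular patch of one image, together with the scribble pixels drawn on it, is pasted into another. The box sampling scheme and the label conventions (-1 marking unlabeled pixels) are illustrative assumptions, not the authors' exact recipe.

```python
import numpy as np

def cut_mix(img_a, img_b, scribble_a, scribble_b, rng):
    """Paste a random rectangular patch of image/scribble B into image/scribble A.

    Mixing the images together with their scribble annotations lets the network
    see composite anatomy while the sparse supervision stays consistent with
    the pixels it was drawn on.
    """
    h, w = img_a.shape
    ch, cw = rng.integers(1, h), rng.integers(1, w)      # patch size (assumed uniform)
    y, x = rng.integers(0, h - ch + 1), rng.integers(0, w - cw + 1)
    mixed_img, mixed_scr = img_a.copy(), scribble_a.copy()
    mixed_img[y:y + ch, x:x + cw] = img_b[y:y + ch, x:x + cw]
    mixed_scr[y:y + ch, x:x + cw] = scribble_b[y:y + ch, x:x + cw]
    return mixed_img, mixed_scr

rng = np.random.default_rng(1)
a, b = np.zeros((8, 8)), np.ones((8, 8))
sa, sb = np.full((8, 8), -1), np.full((8, 8), 2)   # -1 = unlabeled scribble pixel
img, scr = cut_mix(a, b, sa, sb, rng)
# Pixels copied from b carry b's scribble labels; the rest keep a's.
assert np.array_equal(img == 1, scr == 2)
```

The consistency losses mentioned in the abstract would then compare the network's prediction on the mixed input against the correspondingly mixed predictions of the unmixed inputs.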
18
Li Z, Zheng Y, Shan D, Yang S, Li Q, Wang B, Zhang Y, Hong Q, Shen D. ScribFormer: Transformer Makes CNN Work Better for Scribble-Based Medical Image Segmentation. IEEE Trans Med Imaging 2024; 43:2254-2265. [PMID: 38324425] [DOI: 10.1109/tmi.2024.3363190]
Abstract
Most recent scribble-supervised segmentation methods adopt a CNN framework with an encoder-decoder architecture. Despite its multiple benefits, this framework can generally only capture short-range feature dependencies, because convolutional layers have local receptive fields, which makes it difficult to learn global shape information from the limited information provided by scribble annotations. To address this issue, this paper proposes a new CNN-Transformer hybrid solution for scribble-supervised medical image segmentation called ScribFormer. The proposed ScribFormer model has a triple-branch structure, i.e., a hybrid of a CNN branch, a Transformer branch, and an attention-guided class activation map (ACAM) branch. Specifically, the CNN branch collaborates with the Transformer branch to fuse the local features learned by the CNN with the global representations obtained from the Transformer, which effectively overcomes the limitations of existing scribble-supervised segmentation methods. Furthermore, the ACAM branch unifies the shallow and deep convolution features to further improve the model's performance. Extensive experiments on two public datasets and one private dataset show that ScribFormer outperforms state-of-the-art scribble-supervised segmentation methods and even achieves better results than fully-supervised segmentation methods. The code is released at https://github.com/HUANGLIZI/ScribFormer.
19
Li Y, Wang L, Huang X, Wang Y, Dong L, Ge R, Zhou H, Ye J, Zhang Q. Sketch-Supervised Histopathology Tumour Segmentation: Dual CNN-Transformer With Global Normalised CAM. IEEE J Biomed Health Inform 2024; 28:66-77. [PMID: 37368799] [DOI: 10.1109/jbhi.2023.3289984]
Abstract
Deep learning methods are now frequently used to segment histopathology images with high-quality annotations. Compared with well-annotated data, coarse, scribble-like labelling is more cost-effective and easier to obtain in clinical practice. However, coarse annotations provide limited supervision, so employing them directly to train a segmentation network remains challenging. We present a sketch-supervised method, called DCTGN-CAM, based on a dual CNN-Transformer network and a modified global normalised class activation map. By modelling global and local tumour features simultaneously, the dual CNN-Transformer network produces accurate patch-based tumour classification probabilities while training only on lightly annotated data. With the global normalised class activation map, more descriptive gradient-based representations of the histopathology images can be obtained, and tumour segmentation can be inferred with high accuracy. Additionally, we collect a private skin cancer dataset named BSS, which contains fine and coarse annotations for three types of cancer. To facilitate reproducible performance comparison, experts were also invited to provide coarse annotations on the public liver cancer dataset PAIP2019. On the BSS dataset, our DCTGN-CAM segmentation outperforms state-of-the-art methods, achieving 76.68% IoU and 86.69% Dice scores on the sketch-based tumour segmentation task. On the PAIP2019 dataset, our method achieves a Dice gain of 8.37% compared with the U-Net baseline.
20
Ying J, Huang W, Fu L, Yang H, Cheng J. Weakly supervised segmentation of uterus by scribble labeling on endometrial cancer MR images. Comput Biol Med 2023; 167:107582. [PMID: 37922606] [DOI: 10.1016/j.compbiomed.2023.107582]
Abstract
Uterine segmentation of endometrial cancer MR images can be a valuable diagnostic tool for gynecologists. However, uterine segmentation based on deep learning relies on manual pixel-level annotation, which is time-consuming, laborious and subjective. To reduce this dependence, we propose a weakly supervised method for uterine segmentation on endometrial cancer MRI slices that requires only scribble labels and is enhanced by pseudo-label techniques, an exponential geodesic distance loss and input disturbance strategies. Specifically, the limitations caused by the shortage of supervision are addressed by dynamically mixing the two outputs of a dual-branch network to generate pseudo-labels, expanding the supervision information and promoting mutual supervision training. On the other hand, considering the large difference in grayscale intensity between the uterus and surrounding tissues, the exponential geodesic distance loss is introduced to enhance the network's ability to capture the edge of the uterus. Input disturbance strategies are incorporated to adapt to the flexible and variable characteristics of the uterus and further improve segmentation performance. The proposed method is evaluated on MRI images from 135 cases of endometrial cancer. Compared with four other weakly supervised segmentation methods, the proposed method performs best, with mean DI, HD95, Recall, Precision and ADP of 92.8%, 11.632, 92.7%, 93.6% and 6.5%, improvements of 2.1%, 9.144, 0.6%, 2.4% and 2.9%, respectively. The experimental results demonstrate that the proposed method is more effective than other weakly supervised methods and achieves performance similar to fully supervised ones.
Affiliation(s)
- Jie Ying
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Wei Huang
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Le Fu
- Department of Radiology, Shanghai First Maternity and Infant Hospital, Tongji University School of Medicine, Shanghai, China
- Haima Yang
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Jiangzihao Cheng
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China
21
Lin L, Peng L, He H, Cheng P, Wu J, Wong KKY, Tang X. YoloCurvSeg: You only label one noisy skeleton for vessel-style curvilinear structure segmentation. Med Image Anal 2023; 90:102937. [PMID: 37672901] [DOI: 10.1016/j.media.2023.102937]
Abstract
Weakly-supervised learning (WSL) has been proposed to alleviate the conflict between data annotation cost and model performance by employing sparsely-grained (i.e., point-, box-, or scribble-wise) supervision, and has shown promising performance, particularly in the image segmentation field. However, it remains a very challenging task given the limited supervision, especially when only a small number of labeled samples are available. Additionally, almost all existing WSL segmentation methods are designed for star-convex structures, which are very different from curvilinear structures such as vessels and nerves. In this paper, we propose a novel sparsely annotated segmentation framework for curvilinear structures, named YoloCurvSeg. An essential component of YoloCurvSeg is image synthesis. Specifically, a background generator delivers image backgrounds that closely match the real distribution by inpainting dilated skeletons. The extracted backgrounds are then combined, through a multilayer patch-wise contrastive learning synthesizer, with randomly emulated curves produced by a Space Colonization Algorithm-based foreground generator. In this way, a synthetic dataset with both images and curve segmentation labels is obtained at the cost of only one or a few noisy skeleton annotations. Finally, a segmenter is trained with the generated dataset and, optionally, an unlabeled dataset. The proposed YoloCurvSeg is evaluated on four publicly available datasets (OCTA500, CORN, DRIVE and CHASEDB1), and the results show that YoloCurvSeg outperforms state-of-the-art WSL segmentation methods by large margins. With only one noisy skeleton annotation (respectively 0.14%, 0.03%, 1.40%, and 0.65% of the full annotation), YoloCurvSeg achieves more than 97% of the fully-supervised performance on each dataset. Code and datasets will be released at https://github.com/llmir/YoloCurvSeg.
Affiliation(s)
- Li Lin
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China
- Linkai Peng
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China
- Huaqing He
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China
- Pujin Cheng
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China
- Jiewei Wu
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China
- Kenneth K Y Wong
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China
- Xiaoying Tang
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China
22
Peng K, Peng Y, Liao H, Yang Z, Feng W. Automated bone marrow cell classification through dual attention gates dense neural networks. J Cancer Res Clin Oncol 2023; 149:16971-16981. [PMID: 37740765] [DOI: 10.1007/s00432-023-05384-9]
Abstract
PURPOSE The morphology of bone marrow cells is essential in identifying malignant hematological disorders. Automatic classification of bone marrow cell morphology based on convolutional neural networks shows considerable promise in terms of diagnostic efficiency and accuracy. However, due to the lack of acceptable accuracy in existing bone marrow cell classification algorithms, automatic classification is still infrequently used in clinical facilities. To address this issue of precision, we propose a Dual Attention Gates DenseNet (DAGDNet) to construct a novel, efficient and high-precision bone marrow cell classification model. METHODS DAGDNet is constructed by embedding a novel dual attention gates (DAGs) mechanism in the architecture of DenseNet. DAGs are used to filter and highlight the position-related features in DenseNet to improve the precision and recall of neural network-based cell classifiers. We constructed a dataset of bone marrow cell morphology from the First Affiliated Hospital of Chongqing Medical University, which mainly consists of leukemia samples, to train and test our proposed DAGDNet, together with the bone marrow cell classification dataset. RESULTS When evaluated on a multi-center dataset, experimental results show that our proposed DAGDNet outperforms image classification models such as DenseNet and ResNeXt in bone marrow cell classification. The mean precision of DAGDNet on the Munich Leukemia Laboratory dataset is 88.1%, achieving state-of-the-art performance while maintaining high efficiency. CONCLUSION Our data demonstrate that DAGDNet can improve the efficacy of automatic bone marrow cell classification and can be exploited as an assistive diagnostic tool in clinical applications. Moreover, DAGDNet is an efficient model that can swiftly inspect a large number of bone marrow cells and offers the benefit of reducing the probability of an incorrect diagnosis.
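As a rough illustration of how an attention gate re-weights features, the sketch below implements a generic additive gate; it is not the authors' dual attention gates (DAGs) design, and all shapes, weight matrices and the function name are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, w_x, w_g):
    """Generic additive attention gate (a sketch, not the authors' DAG module).

    x: feature map (C, H, W) from a dense block; g: gating features of the same
    shape; w_x, w_g: (C, C) channel-mixing weights. The gate produces a
    per-pixel map in (0, 1) that re-weights x, highlighting position-related
    features and suppressing the rest.
    """
    c, h, w = x.shape
    xf = x.reshape(c, -1)
    gf = g.reshape(c, -1)
    # Additive attention: combine both inputs, squash to a single-channel map.
    att = sigmoid((w_x @ xf + w_g @ gf).sum(axis=0)).reshape(h, w)
    return x * att  # broadcast the gate over all channels

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
g = rng.standard_normal((8, 4, 4))
w_x = rng.standard_normal((8, 8)) * 0.1
w_g = rng.standard_normal((8, 8)) * 0.1
y = attention_gate(x, g, w_x, w_g)
assert y.shape == x.shape
```

Because the gate values lie strictly in (0, 1), the output is always a damped copy of the input, with the damping decided jointly by the features and the gating signal.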
Affiliation(s)
- Kaiyi Peng
- Department of Clinical Hematology, Key Laboratory of Laboratory Medical Diagnostics Designated by the Ministry of Education, School of Laboratory Medicine, Chongqing Medical University, No. 1, Yixueyuan Road, Chongqing, 400016, China
- Yuhang Peng
- Department of Clinical Hematology, Key Laboratory of Laboratory Medical Diagnostics Designated by the Ministry of Education, School of Laboratory Medicine, Chongqing Medical University, No. 1, Yixueyuan Road, Chongqing, 400016, China
- Hedong Liao
- Department of Hematology, The First Affiliated Hospital of Chongqing Medical University, No. 1, Youyi Road, Chongqing, 400016, China
- Zesong Yang
- Department of Hematology, The First Affiliated Hospital of Chongqing Medical University, No. 1, Youyi Road, Chongqing, 400016, China
- Wenli Feng
- Department of Clinical Hematology, Key Laboratory of Laboratory Medical Diagnostics Designated by the Ministry of Education, School of Laboratory Medicine, Chongqing Medical University, No. 1, Yixueyuan Road, Chongqing, 400016, China
23
Wang L, Zhou H, Xu N, Liu Y, Jiang X, Li S, Feng C, Xu H, Deng K, Song J. A general approach for automatic segmentation of pneumonia, pulmonary nodule, and tuberculosis in CT images. iScience 2023; 26:107005. [PMID: 37534183] [PMCID: PMC10391673] [DOI: 10.1016/j.isci.2023.107005]
Abstract
Proposing a general segmentation approach for lung lesions, including pulmonary nodules, pneumonia, and tuberculosis, in CT images will improve efficiency in radiology. However, the performance of generative adversarial networks is hampered by the limited availability of annotated samples and the catastrophic forgetting of the discriminator, whereas the universality of traditional morphology-based methods is insufficient for segmenting diverse lung lesions. A cascaded dual-attention network with a context-aware pyramid feature extraction module was designed to address these challenges. A self-supervised rotation loss was designed to mitigate discriminator forgetting. The proposed model achieved Dice coefficients of 70.92, 73.55, and 68.52% on multi-center pneumonia, lung nodule, and tuberculosis test datasets, respectively. No significant decrease in accuracy was observed (p > 0.10) when a small training sample size was used. The cyclic training of the discriminator was reduced with self-supervised rotation loss (p < 0.01). The proposed approach is promising for segmenting multiple lung lesion types in CT images.
Affiliation(s)
- Lu Wang
- Department of Library, Shengjing Hospital of China Medical University, Shenyang, Liaoning 110004, China
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, China
- He Zhou
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, China
- Nan Xu
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, China
- Yuchan Liu
- Department of Radiology, The First Affiliated Hospital of University of Science and Technology of China (USTC), Division of Life Sciences and Medicine, USTC, Hefei, Anhui 230036, China
- Xiran Jiang
- School of Intelligent Medicine, China Medical University, Shenyang, Liaoning 110122, China
- Shu Li
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, China
- Chaolu Feng
- Key Laboratory of Intelligent Computing in Medical Image (MIIC), Ministry of Education, Shenyang, Liaoning 110169, China
- Hainan Xu
- Department of Obstetrics and Gynecology, Pelvic Floor Disease Diagnosis and Treatment Center, Shengjing Hospital of China Medical University, Shenyang, Liaoning 110004, China
- Kexue Deng
- Department of Radiology, The First Affiliated Hospital of University of Science and Technology of China (USTC), Division of Life Sciences and Medicine, USTC, Hefei, Anhui 230036, China
- Jiangdian Song
- School of Health Management, China Medical University, Shenyang, Liaoning 110122, China
24
Cai W, Xie L, Yang W, Li Y, Gao Y, Wang T. DFTNet: Dual-Path Feature Transfer Network for Weakly Supervised Medical Image Segmentation. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:2530-2540. [PMID: 35951571] [DOI: 10.1109/tcbb.2022.3198284]
Abstract
Medical image segmentation has long suffered from the problem of expensive labels. Acquiring pixel-level annotations is time-consuming, labor-intensive, and relies on extensive expert knowledge. Bounding box annotations, in contrast, are relatively easy to acquire. Thus, in this paper, we explore segmenting images through a novel Dual-path Feature Transfer design with only bounding box annotations. Specifically, a Target-aware Reconstructor is proposed to extract target-related features by reconstructing the pixels within the bounding box through a channel and spatial attention module. Then, a sliding Feature Fusion and Transfer Module (FFTM) fuses the features extracted by the Reconstructor and transfers them to guide the Segmentor. Finally, we present the Confidence Ranking Loss (CRLoss), which dynamically assigns weights to the loss of each pixel based on the network's confidence. CRLoss mitigates the impact of inaccurate pseudo-labels on performance. Extensive experiments demonstrate that our proposed model achieves state-of-the-art performance on the Medical Segmentation Decathlon (MSD) Brain Tumour and PROMISE12 datasets.
25
Wang G, Luo X, Gu R, Yang S, Qu Y, Zhai S, Zhao Q, Li K, Zhang S. PyMIC: A deep learning toolkit for annotation-efficient medical image segmentation. Comput Methods Programs Biomed 2023; 231:107398. [PMID: 36773591] [DOI: 10.1016/j.cmpb.2023.107398]
Abstract
BACKGROUND AND OBJECTIVE Open-source deep learning toolkits are one of the driving forces for developing medical image segmentation models that are essential for computer-assisted diagnosis and treatment procedures. Existing toolkits mainly focus on fully supervised segmentation that assumes full and accurate pixel-level annotations are available. Such annotations are time-consuming and difficult to acquire for segmentation tasks, which makes learning from imperfect labels highly desirable for reducing the annotation cost. We aim to develop a new deep learning toolkit to support annotation-efficient learning for medical image segmentation, which can accelerate and simplify the development of deep learning models with a limited annotation budget, e.g., learning from partial, sparse or noisy annotations. METHODS Our proposed toolkit named PyMIC is a modular deep learning library for medical image segmentation tasks. In addition to basic components that support development of high-performance models for fully supervised segmentation, it contains several advanced components tailored for learning from imperfect annotations, such as loading annotated and unannotated images, loss functions for unannotated, partially or inaccurately annotated images, and training procedures for co-learning between multiple networks, etc. PyMIC is built on the PyTorch framework and supports development of semi-supervised, weakly supervised and noise-robust learning methods for medical image segmentation. RESULTS We present several illustrative medical image segmentation tasks based on PyMIC: (1) Achieving competitive performance on fully supervised learning; (2) Semi-supervised cardiac structure segmentation with only 10% training images annotated; (3) Weakly supervised segmentation using scribble annotations; and (4) Learning from noisy labels for chest radiograph segmentation.
CONCLUSIONS The PyMIC toolkit is easy to use and facilitates efficient development of medical image segmentation models with imperfect annotations. It is modular and flexible, which enables researchers to develop high-performance models with low annotation cost. The source code is available at:https://github.com/HiLab-git/PyMIC.
Affiliation(s)
- Guotai Wang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China.
- Xiangde Luo
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China
- Ran Gu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Shuojue Yang
- Laboratory for Computational Sensing and Robotics, Johns Hopkins University, Baltimore, USA
- Yijie Qu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Shuwei Zhai
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Qianfei Zhao
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Kang Li
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China
- Shaoting Zhang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China
26
Zou W, Qi X, Zhou W, Sun M, Sun Z, Shan C. Graph Flow: Cross-Layer Graph Flow Distillation for Dual Efficient Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:1159-1171. [PMID: 36423314 DOI: 10.1109/tmi.2022.3224459] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
With the development of deep convolutional neural networks, medical image segmentation has achieved a series of breakthroughs in recent years. However, high-performance convolutional neural networks always mean numerous parameters and high computation costs, which hinders their application in resource-limited medical scenarios. Meanwhile, the scarcity of large-scale annotated medical image datasets further impedes the application of high-performance networks. To tackle these problems, we propose Graph Flow, a comprehensive knowledge distillation framework for both network-efficient and annotation-efficient medical image segmentation. Specifically, Graph Flow Distillation transfers the essence of cross-layer variations from a well-trained cumbersome teacher network to an untrained compact student network. In addition, an unsupervised Paraphraser Module is integrated to purify the knowledge of the teacher, which also benefits training stabilization. Furthermore, we build a unified distillation framework by integrating adversarial distillation and vanilla logits distillation, which can further refine the final predictions of the compact network. With different teacher networks (traditional convolutional or prevalent transformer architectures) and student networks, we conduct extensive experiments on four medical image datasets with different modalities (Gastric Cancer, Synapse, BUSI, and CVC-ClinicDB). We demonstrate the prominent ability of our method on these datasets, achieving competitive performance. Moreover, we demonstrate the effectiveness of Graph Flow through a novel semi-supervised paradigm for dual efficient medical image segmentation. Our code will be available at Graph Flow.
27
Chaitanya K, Erdil E, Karani N, Konukoglu E. Local contrastive loss with pseudo-label based self-training for semi-supervised medical image segmentation. Med Image Anal 2023; 87:102792. [PMID: 37054649 DOI: 10.1016/j.media.2023.102792] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2021] [Revised: 11/25/2022] [Accepted: 03/02/2023] [Indexed: 03/13/2023]
Abstract
Supervised deep learning-based methods yield accurate results for medical image segmentation. However, they require large labeled datasets, and obtaining them is a laborious task that requires clinical expertise. Semi-/self-supervised learning-based approaches address this limitation by exploiting unlabeled data along with limited annotated data. Recent self-supervised learning methods use contrastive loss to learn good global-level representations from unlabeled images and achieve high performance in classification tasks on popular natural image datasets like ImageNet. In pixel-level prediction tasks such as segmentation, it is crucial to also learn good local-level representations along with global representations to achieve better accuracy. However, the impact of existing local contrastive loss-based methods remains limited for learning good local representations, because similar and dissimilar local regions are defined based on random augmentations and spatial proximity rather than on the semantic labels of local regions, owing to the lack of large-scale expert annotations in the semi-/self-supervised setting. In this paper, we propose a local contrastive loss to learn good pixel-level features useful for segmentation by exploiting semantic label information obtained from pseudo-labels of unlabeled images alongside limited annotated images with ground-truth (GT) labels. In particular, the proposed contrastive loss encourages similar representations for pixels that have the same pseudo-/GT label, while pushing them away from the representations of pixels with a different pseudo-/GT label. We perform pseudo-label based self-training and train the network by jointly optimizing the proposed contrastive loss on both labeled and unlabeled sets and the segmentation loss on only the limited labeled set.
We evaluated the proposed approach on three public medical datasets of cardiac and prostate anatomies, and obtained high segmentation performance with a limited labeled set of one or two 3D volumes. Extensive comparisons with state-of-the-art semi-supervised and data augmentation methods and concurrent contrastive learning methods demonstrate the substantial improvement achieved by the proposed method. The code is made publicly available at https://github.com/krishnabits001/pseudo_label_contrastive_training.
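The label-guided contrast described above — pulling together pixels that share a pseudo-/GT label and pushing apart pixels with different labels — can be sketched in NumPy as a supervised contrastive loss over pixel embeddings. This is an illustrative form under assumed names and a standard InfoNCE-style formulation, not the authors' exact loss; it also assumes every pixel has at least one positive in the batch.

```python
import numpy as np

def pixel_contrastive_loss(feats, labels, tau=0.1, eps=1e-8):
    """Label-guided local contrastive loss (illustrative sketch).

    feats: (N, D) L2-normalised pixel embeddings sampled from feature maps.
    labels: (N,) pseudo-labels or ground-truth labels for those pixels.
    Pixels with the same label act as positives for each other; all other
    pixels act as negatives. Assumes each pixel has at least one positive.
    """
    sim = feats @ feats.T / tau                         # pairwise similarities
    np.fill_diagonal(sim, -np.inf)                      # exclude self-pairs
    exp_sim = np.exp(sim - sim.max(axis=1, keepdims=True))  # stable softmax
    same = labels[:, None] == labels[None, :]           # positive-pair mask
    np.fill_diagonal(same, False)
    denom = exp_sim.sum(axis=1) + eps                   # all pairs
    pos = (exp_sim * same).sum(axis=1) + eps            # same-label pairs only
    return float(np.mean(-np.log(pos / denom)))
```

When embeddings cluster by label the loss is near zero; assigning the same embeddings mismatched labels drives it up, which is the behaviour the abstract's loss is designed to exploit.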
Affiliation(s)
- Krishna Chaitanya
- Computer Vision Laboratory, ETH Zurich, Sternwartstrasse 7, Zurich 8092, Switzerland.
- Ertunc Erdil
- Computer Vision Laboratory, ETH Zurich, Sternwartstrasse 7, Zurich 8092, Switzerland
- Neerav Karani
- Computer Vision Laboratory, ETH Zurich, Sternwartstrasse 7, Zurich 8092, Switzerland
- Ender Konukoglu
- Computer Vision Laboratory, ETH Zurich, Sternwartstrasse 7, Zurich 8092, Switzerland
28
Bond-Taylor S, Leach A, Long Y, Willcocks CG. Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:7327-7347. [PMID: 34591756 DOI: 10.1109/tpami.2021.3116668] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Deep generative models are a class of techniques that train deep neural networks to model the distribution of training samples. Research has fragmented into various interconnected approaches, each of which makes trade-offs including run-time, diversity, and architectural restrictions. In particular, this compendium covers energy-based models, variational autoencoders, generative adversarial networks, autoregressive models, and normalizing flows, in addition to numerous hybrid approaches. These techniques are compared and contrasted, explaining the premises behind each and how they are interrelated, while reviewing current state-of-the-art advances and implementations.
29
WORD: A large scale dataset, benchmark and clinical applicable study for abdominal organ segmentation from CT image. Med Image Anal 2022; 82:102642. [DOI: 10.1016/j.media.2022.102642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2022] [Revised: 08/18/2022] [Accepted: 09/20/2022] [Indexed: 11/22/2022]
30
Kuang X, Cheung JPY, Wong KYK, Lam WY, Lam CH, Choy RW, Cheng CP, Wu H, Yang C, Wang K, Li Y, Zhang T. Spine-GFlow: A hybrid learning framework for robust multi-tissue segmentation in lumbar MRI without manual annotation. Comput Med Imaging Graph 2022; 99:102091. [PMID: 35803034 DOI: 10.1016/j.compmedimag.2022.102091] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2022] [Revised: 05/30/2022] [Accepted: 06/13/2022] [Indexed: 10/18/2022]
Abstract
Most learning-based magnetic resonance image (MRI) segmentation methods rely on manual annotation to provide supervision, which is extremely tedious, especially when multiple anatomical structures are required. In this work, we aim to develop a hybrid framework named Spine-GFlow that combines the image features learned by a CNN model with anatomical priors for multi-tissue segmentation in sagittal lumbar MRI. Our framework does not require any manual annotation and is robust against image feature variation caused by different imaging settings and/or underlying pathology. Our contributions include: 1) a rule-based method that automatically generates weak annotations (initial seed areas), 2) a novel proposal generation method that integrates multi-scale image features and anatomical priors, 3) a comprehensive loss for CNN training that optimizes pixel classification and feature distribution simultaneously. Spine-GFlow has been validated on two independent datasets: HKDDC (containing images obtained from three different machines) and IVDM3Seg. The segmentation results for vertebral bodies (VB), intervertebral discs (IVD), and the spinal canal (SC) are evaluated quantitatively using intersection over union (IoU) and the Dice coefficient. Results show that our method, without requiring manual annotation, achieves segmentation performance comparable to a model trained with full supervision (mean Dice 0.914 vs 0.916).
Affiliation(s)
- Xihe Kuang
- Department of Orthopaedics and Traumatology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
- Jason Pui Yin Cheung
- Department of Orthopaedics and Traumatology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
- Kwan-Yee K Wong
- Department of Computer Science, Faculty of Engineering, University of Hong Kong, Hong Kong, China
- Wai Yi Lam
- Department of Orthopaedics and Traumatology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
- Chak Hei Lam
- Department of Orthopaedics and Traumatology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
- Richard W Choy
- Department of Orthopaedics and Traumatology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
- Honghan Wu
- Department of Orthopaedics and Traumatology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
- Cao Yang
- Department of Orthopedics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
- Kun Wang
- Department of Orthopedics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
- Yang Li
- School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
- Teng Zhang
- Department of Orthopaedics and Traumatology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
31
Luo X, Wang G, Liao W, Chen J, Song T, Chen Y, Zhang S, Metaxas DN, Zhang S. Semi-supervised medical image segmentation via uncertainty rectified pyramid consistency. Med Image Anal 2022; 80:102517. [PMID: 35732106 DOI: 10.1016/j.media.2022.102517] [Citation(s) in RCA: 67] [Impact Index Per Article: 22.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2021] [Revised: 05/26/2022] [Accepted: 06/10/2022] [Indexed: 02/05/2023]
Abstract
Although Convolutional Neural Networks (CNNs) have achieved promising performance in many medical image segmentation tasks, they rely on a large set of labeled images for training, which is expensive and time-consuming to acquire. Semi-supervised learning has shown the potential to alleviate this challenge by learning from a large set of unlabeled images together with limited labeled samples. In this work, we present a simple yet efficient consistency regularization approach for semi-supervised medical image segmentation, called Uncertainty Rectified Pyramid Consistency (URPC). Inspired by the pyramid feature network, we use a pyramid-prediction network that obtains a set of segmentation predictions at different scales. For semi-supervised learning, URPC learns from unlabeled data by minimizing the discrepancy between each of the pyramid predictions and their average. We further present multi-scale uncertainty rectification to boost the pyramid consistency regularization, where the rectification tempers the consistency loss at outlier pixels that may have predictions substantially different from the average, potentially due to upsampling errors or a lack of labeled data. Experiments on two public datasets and an in-house clinical dataset showed that: 1) URPC achieves large performance improvements by utilizing unlabeled data, and 2) compared with five existing semi-supervised methods, URPC achieves better or comparable results with a simpler pipeline. Furthermore, we build a semi-supervised medical image segmentation codebase to boost research on this topic: https://github.com/HiLab-git/SSL4MIS.
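The core consistency term described here — penalizing each pyramid scale's deviation from the mean prediction, tempered at pixels where the scales disagree — can be sketched in a few lines of NumPy. The variance-based uncertainty proxy and exponential weighting below are illustrative assumptions, not URPC's exact rectification scheme, and the predictions are assumed already upsampled to a common size.

```python
import numpy as np

def pyramid_consistency(preds):
    """Uncertainty-tempered pyramid consistency (illustrative sketch).

    preds: list of softmax prediction maps of shape (H, W, C), one per
    pyramid scale, all resampled to the same spatial size. Each scale is
    pulled toward the average prediction; pixels where scales disagree
    strongly (high variance) are down-weighted as uncertain outliers.
    """
    stack = np.stack(preds)                  # (S, H, W, C)
    avg = stack.mean(axis=0)                 # mean pyramid prediction
    var = stack.var(axis=0).sum(axis=-1)     # per-pixel disagreement proxy
    weight = np.exp(-var)                    # temper loss at uncertain pixels
    sq_err = ((stack - avg) ** 2).sum(axis=-1).mean(axis=0)  # consistency term
    return float(np.mean(weight * sq_err))
```

When all scales agree the loss vanishes, and disagreement contributes loss that grows more slowly than the raw squared error because the uncertainty weight shrinks at the same pixels.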
Affiliation(s)
- Xiangde Luo
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai AI Lab, Shanghai, China
- Guotai Wang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai AI Lab, Shanghai, China
- Wenjun Liao
- Department of Radiation Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Jieneng Chen
- College of Electronics and Information Engineering, Tongji University, Shanghai, China
- Tao Song
- SenseTime Research, Shanghai, China
- Yinan Chen
- SenseTime Research, Shanghai, China; West China Biomedical Big Data Center, Sichuan University West China Hospital, Chengdu, China
- Shichuan Zhang
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, University of Electronic Science and Technology of China, Chengdu, China
- Dimitris N Metaxas
- Department of Computer Science, Rutgers University, Piscataway NJ 08854, USA
- Shaoting Zhang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai AI Lab, Shanghai, China