1. Shen Y, Dreizin D, Inigo B, Unberath M. ProtoSAM-3D: Interactive semantic segmentation in volumetric medical imaging via a Segment Anything Model and mask-level prototypes. Comput Med Imaging Graph 2025; 121:102501. [PMID: 39919534] [PMCID: PMC11875884] [DOI: 10.1016/j.compmedimag.2025.102501]
Abstract
Semantic segmentation of volumetric medical images is essential for accurate delineation of anatomic structures and pathology, enabling quantitative analysis in precision medicine applications. While volumetric segmentation has been extensively studied, most existing methods require full supervision and struggle to generalize to new classes at inference time, particularly for irregular, ill-defined targets such as tumors, where fine-grained, high-salience segmentation is required. Consequently, conventional semantic segmentation methods cannot easily offer zero/few-shot generalization to segment objects of interest beyond their closed training set. Foundation models, such as the Segment Anything Model (SAM), have demonstrated promising zero-shot generalization for interactive instance segmentation based on user prompts. However, these models sacrifice semantic knowledge for generalization capabilities that largely rely on collaborative user prompting to inject semantics. For volumetric medical image analysis, a unified approach that combines the semantic understanding of conventional segmentation methods with the flexible, prompt-driven capabilities of SAM is essential for comprehensive anatomical delineation. On the one hand, it is natural to exploit anatomic knowledge to enable semantic segmentation without any user interaction. On the other hand, SAM-like approaches to segmenting unknown classes via prompting provide the flexibility needed to segment structures beyond the closed training set, enabling quantitative analysis. To address these needs in a unified framework, we introduce ProtoSAM-3D, which extends SAM to semantic segmentation of volumetric data via a novel mask-level prototype prediction approach while retaining the flexibility of SAM. Our model utilizes an innovative spatially-aware Transformer to fuse instance-specific intermediate representations from the SAM encoder and decoder, obtaining a comprehensive feature embedding for each mask. These embeddings are then classified by computing similarity with learned prototypes. By predicting prototypes instead of classes directly, ProtoSAM-3D gains the flexibility to rapidly adapt to new classes with minimal retraining. Furthermore, we introduce an auto-prompting method to enable semantic segmentation of known classes without user interaction. We demonstrate state-of-the-art zero/few-shot performance on multi-organ segmentation in CT and MRI. Experimentally, ProtoSAM-3D achieves competitive performance compared to fully supervised methods. Our work represents a step towards interactive semantic segmentation models with SAM for volumetric medical image processing.
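The prototype-matching step described above reduces to a similarity lookup against learned class vectors. A minimal PyTorch sketch of such a cosine-similarity prototype head follows; the embedding size, temperature, and all names are assumptions for illustration, not the authors' implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeClassifier(nn.Module):
    """Classify mask embeddings by cosine similarity to learned class prototypes."""
    def __init__(self, embed_dim: int, num_classes: int, temperature: float = 0.07):
        super().__init__()
        # One learnable prototype vector per known class.
        self.prototypes = nn.Parameter(torch.randn(num_classes, embed_dim))
        self.temperature = temperature

    def forward(self, mask_embeddings: torch.Tensor) -> torch.Tensor:
        # mask_embeddings: (B, D) -- one fused embedding per predicted mask.
        z = F.normalize(mask_embeddings, dim=-1)
        p = F.normalize(self.prototypes, dim=-1)
        # Cosine similarity scaled into logits; argmax gives the class.
        return z @ p.t() / self.temperature

# Because classes live in the prototype table rather than a fixed softmax head,
# adapting to a new class amounts to adding a prototype row, not full retraining.
clf = PrototypeClassifier(embed_dim=256, num_classes=10)
logits = clf(torch.randn(4, 256))  # (4, 10)
```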
Affiliation(s)
- David Dreizin
- University of Maryland School of Medicine, Baltimore, USA
2. Li L, Liu T, Wang P, Su L, Wang L, Wang X, Chen C. Multiple perception contrastive learning for automated ovarian tumor classification in CT images. Abdom Radiol (NY) 2025. [PMID: 40074925] [DOI: 10.1007/s00261-025-04879-y]
Abstract
Ovarian cancer is among the most common malignant tumours in women worldwide, and early identification is essential for enhancing patient survival chances. The development of automated and trustworthy diagnostic techniques is necessary because traditional CT image interpretation mostly depends on the subjective assessment of radiologists, which can result in variability. Deep learning approaches in medical image analysis have advanced significantly, particularly showing considerable promise in the automatic categorisation of ovarian tumours. This research presents an automated diagnostic approach for ovarian tumour CT images utilising supervised contrastive learning and a Multiple Perception Encoder (MP Encoder). The approach incorporates T-Pro technology to augment data diversity and simulates semantic perturbations to increase the model's generalisation capability. The incorporation of a Multi-Scale Perception Module (MSP Module) and a Multi-Attention Module (MA Module) enhances the model's sensitivity to the intricate morphology and subtle characteristics of ovarian tumours, resulting in improved classification accuracy and robustness, ultimately achieving an average classification accuracy of 98.43%. Experimental results indicate the method's efficacy in ovarian tumour classification, particularly in cases involving tumours with intricate morphology or poor image quality, where it markedly enhances classification accuracy. This deep learning framework tackles the complexities of ovarian tumour CT image interpretation, offering clinicians enhanced diagnostic support and aiding the optimisation of early detection and treatment strategies for ovarian cancer.
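The supervised contrastive objective referenced above pulls same-class embeddings together and pushes other classes apart. Below is a minimal sketch of the generic supervised contrastive (SupCon-style) loss, not necessarily the paper's exact variant:

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over a batch of projections."""
    # features: (N, D) projection-head outputs; labels: (N,) integer classes.
    features = F.normalize(features, dim=1)
    sim = features @ features.t() / temperature              # (N, N) similarities
    n = features.size(0)
    not_self = ~torch.eye(n, dtype=torch.bool, device=features.device)
    # Positive pairs share a label (excluding each anchor itself).
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & not_self
    sim = sim.masked_fill(~not_self, float('-inf'))          # drop self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Average log-probability over positives, for anchors that have any.
    masked_log_prob = torch.where(pos_mask, log_prob, torch.zeros_like(log_prob))
    pos_counts = pos_mask.sum(1)
    valid = pos_counts > 0
    return -(masked_log_prob.sum(1)[valid] / pos_counts[valid]).mean()

loss = supervised_contrastive_loss(torch.randn(8, 128), torch.randint(0, 3, (8,)))
```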
Affiliation(s)
- Lingwei Li
- School of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang, 471032, China
- School of Medical Imaging, Qilu Medical University, Zibo, 255300, China
- Peng Wang
- School of Medical Imaging, Qilu Medical University, Zibo, 255300, China
- Lianzheng Su
- School of Medical Imaging, Qilu Medical University, Zibo, 255300, China
- Lei Wang
- School of Medical Imaging, Qilu Medical University, Zibo, 255300, China
- Xinmiao Wang
- School of Medical Imaging, Qilu Medical University, Zibo, 255300, China
- Chidao Chen
- School of Medical Imaging, Qilu Medical University, Zibo, 255300, China
3. Huang Z, Wang Z, Zhao T, Ding X, Yang X. Toward high-quality pseudo masks from noisy or weak annotations for robust medical image segmentation. Neural Netw 2025; 181:106850. [PMID: 39520897] [DOI: 10.1016/j.neunet.2024.106850]
Abstract
Deep learning networks excel at image segmentation when abundant, accurately annotated training samples are available. However, in medical applications, acquiring large quantities of high-quality labeled images is prohibitively expensive. Thus, learning from imperfect annotations (e.g., noisy or weak annotations) has emerged as a prominent research area in medical image segmentation. This work aims to extract high-quality pseudo masks from imperfect annotations with the assistance of a small number of clean labels. Our core motivation is the understanding that different types of imperfect annotations inherently exhibit unique noise patterns. Comparing clean annotations with corresponding imperfectly annotated labels can effectively identify potential noise patterns at minimal additional cost. To this end, we propose a two-phase framework comprising a noise identification network and a noise-robust segmentation network. The former implicitly learns noise patterns and revises labels accordingly; it includes a three-branch network to identify different types of noise. The latter further mitigates the negative influence of residual annotation noise using parallel segmentation networks with different initializations and a label softening strategy. Extensive experimental results on two public datasets demonstrate that our method can effectively refine annotation flaws and achieves segmentation performance superior to state-of-the-art methods.
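A label softening strategy of the kind mentioned can be sketched in a few lines; the eps value and the simple linear relaxation below are assumptions, shown only to illustrate how softened targets damp gradients from residually noisy pixels:

```python
import torch

def soften_labels(hard_mask: torch.Tensor, eps: float = 0.2) -> torch.Tensor:
    """Relax possibly-noisy hard pixel labels toward uncertainty.

    hard_mask: (B, 1, H, W) binary pseudo mask from the first phase.
    Foreground pixels become (1 - eps) and background pixels become eps,
    so a residually mislabeled pixel contributes a smaller gradient than
    a fully confident hard target would.
    """
    return hard_mask * (1.0 - eps) + (1.0 - hard_mask) * eps

soft = soften_labels(torch.randint(0, 2, (2, 1, 64, 64)).float())  # values in {0.2, 0.8}
```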
Affiliation(s)
- Zihang Huang
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
- Zhiwei Wang
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
- Tianyu Zhao
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
- Xiaohuan Ding
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
- Xin Yang
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
4. Li Y, Li C, Yang T, Chen L, Huang M, Yang L, Zhou S, Liu H, Xia J, Wang S. Multiview deep learning networks based on automated breast volume scanner images for identifying breast cancer in BI-RADS 4. Front Oncol 2024; 14:1399296. [PMID: 39309734] [PMCID: PMC11412795] [DOI: 10.3389/fonc.2024.1399296]
Abstract
Objectives: To develop and validate a deep learning (DL) based automatic segmentation and classification system to classify benign and malignant BI-RADS 4 lesions imaged with ABVS. Methods: From May to December 2020, patients with BI-RADS 4 lesions from Centre 1 and Centre 2 were retrospectively enrolled and divided into a training set (Centre 1) and an independent test set (Centre 2). All included patients underwent an ABVS examination within one week before biopsy. A two-stage DL framework consisting of an automatic segmentation module and an automatic classification module was developed. The preprocessed ABVS images were input into the segmentation module for BI-RADS 4 lesion segmentation. The classification model was constructed to extract features and output the probability of malignancy. The diagnostic performances among different ABVS views (axial, sagittal, coronal, and multi-view) and DL architectures (Inception-v3, ResNet 50, and MobileNet) were compared. Results: A total of 251 BI-RADS 4 lesions from 216 patients were included (178 in the training set and 73 in the independent test set). The average Dice coefficient, precision, and recall of the segmentation module in the test set were 0.817 ± 0.142, 0.903 ± 0.183, and 0.886 ± 0.187, respectively. The DL model based on multiview ABVS images and Inception-v3 achieved the best performance, with an AUC, sensitivity, specificity, PPV, and NPV of 0.949 (95% CI: 0.945-0.953), 82.14%, 95.56%, 92.00%, and 89.58%, respectively, in the test set. Conclusions: The developed multiview DL model enables automatic segmentation and classification of BI-RADS 4 lesions in ABVS images.
Affiliation(s)
- Yini Li
- Department of Ultrasound, The Affiliated Hospital of Southwest Medical University, Sichuan, China
- Cao Li
- Department of Radiology, The Affiliated Hospital of Southwest Medical University, Sichuan, China
- Tao Yang
- Department of Ultrasound, The Affiliated Hospital of Southwest Medical University, Sichuan, China
- Lingzhi Chen
- Department of Ultrasound, The Affiliated Hospital of Southwest Medical University, Sichuan, China
- Mingquan Huang
- Department of Breast Surgery, The Affiliated Hospital of Southwest Medical University, Sichuan, China
- Lu Yang
- Department of Radiology, The Affiliated Hospital of Southwest Medical University, Sichuan, China
- Shuxian Zhou
- Artificial Intelligence Innovation Center, Research Institute of Tsinghua, Guangdong, China
- Huaqing Liu
- Artificial Intelligence Innovation Center, Research Institute of Tsinghua, Guangdong, China
- Jizhu Xia
- Department of Ultrasound, The Affiliated Hospital of Southwest Medical University, Sichuan, China
- Shijie Wang
- Department of Ultrasound, The Affiliated Hospital of Southwest Medical University, Sichuan, China
5. Aneesh RP, Zacharias J. Semantic segmentation in skin surface microscopic images with artifacts removal. Comput Biol Med 2024; 180:108975. [PMID: 39153395] [DOI: 10.1016/j.compbiomed.2024.108975]
Abstract
Skin surface imaging has been used to examine skin lesions with a microscope for over a century and is commonly known as epiluminescence microscopy, dermatoscopy, or dermoscopy. Skin surface microscopy has been recommended to reduce the necessity of biopsy, and the technique can improve the clinical diagnostic performance for pigmented skin lesions. Different imaging techniques are employed in dermatology to detect disease. Segmentation and classification are the two main steps in the examination, and classification performance is influenced by the algorithm employed in the segmentation procedure. The most difficult aspect of segmentation is removing unwanted artifacts. Many deep learning models have been developed to segment skin lesions. In this paper, an analysis of common artifacts is proposed to investigate the segmentation performance of deep learning models on skin surface microscopic images. The most prevalent artifacts in skin images are hair and dark corners; these artifacts can be observed in the majority of dermoscopy images captured through various imaging techniques. While hair detection and removal methods are common, the introduction of dark corner detection and removal represents a novel approach to skin lesion segmentation. Segmentation performance is assessed comprehensively using the surface density of artifacts. Assessment of the PH2, ISIC 2017, and ISIC 2018 datasets demonstrates significant enhancements, with Dice coefficients rising to 93.49 (from 86.81), 85.86 (from 79.91), and 75.38 (from 51.28), respectively, upon artifact removal. These results underscore the pivotal significance of artifact removal techniques in amplifying the efficacy of deep learning models for skin lesion segmentation.
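The reported Dice coefficient follows the usual definition, Dice = 2|P∩G| / (|P| + |G|), for predicted mask P and ground truth G. A small reference implementation for binary masks:

```python
import torch

def dice_coefficient(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Mean per-image Dice = 2|P ∩ G| / (|P| + |G|) on binary masks of shape (B, ...)."""
    pred = pred.flatten(1).float()
    target = target.flatten(1).float()
    inter = (pred * target).sum(1)
    return ((2 * inter + eps) / (pred.sum(1) + target.sum(1) + eps)).mean()

score = dice_coefficient(torch.randint(0, 2, (2, 128, 128)), torch.randint(0, 2, (2, 128, 128)))
```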
Affiliation(s)
- Aneesh R P
- College of Engineering Trivandrum, APJ Abdul Kalam Technological University, Thiruvananthapuram, Kerala, India
- Joseph Zacharias
- College of Engineering Trivandrum, APJ Abdul Kalam Technological University, Thiruvananthapuram, Kerala, India
6. Zhan F, Wang W, Chen Q, Guo Y, He L, Wang L. Three-Direction Fusion for Accurate Volumetric Liver and Tumor Segmentation. IEEE J Biomed Health Inform 2024; 28:2175-2186. [PMID: 38109246] [DOI: 10.1109/jbhi.2023.3344392]
Abstract
Biomedical image segmentation of organs, tissues and lesions has gained increasing attention in clinical treatment planning and navigation, and involves the exploration of two-dimensional (2D) and three-dimensional (3D) contexts in the biomedical image. Compared to 2D methods, 3D methods pay more attention to inter-slice correlations, which offer additional spatial information for image segmentation. An organ or tumor has a 3D structure that can be observed from three directions, yet previous studies focus only on the vertical axis, limiting the understanding of the relationship between a tumor and its surrounding tissues. Important information can also be obtained from the sagittal and coronal axes. Spatial information of organs and tumors can therefore be gathered from three directions, i.e., the sagittal, coronal and vertical axes, to better understand the invasion depth of a tumor and its relationship with the surrounding tissues. Moreover, the edges of organs and tumors in biomedical images may be blurred. To address these problems, we propose a three-direction fusion volumetric segmentation (TFVS) model for segmenting 3D biomedical images from three perspectives in the sagittal, coronal and transverse planes, respectively. We use the dataset of the liver task provided by the Medical Segmentation Decathlon challenge to train our model. The TFVS method demonstrates competitive performance on the 3D-IRCADB dataset. In addition, the t-test and Wilcoxon signed-rank test are performed to show the statistical significance of the improvement by the proposed method over the baseline methods. The proposed method is expected to be beneficial in guiding and facilitating clinical diagnosis and treatment.
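Three-direction fusion can be illustrated by running a 2D segmenter slice-wise along each axis and averaging the re-aligned probability volumes. A sketch under the assumption of simple unweighted averaging (the paper's fusion may be learned rather than a plain mean):

```python
import torch

@torch.no_grad()
def fuse_three_directions(volume, axial_net, coronal_net, sagittal_net):
    """Average class probabilities predicted slice-wise along each anatomical axis.

    volume: (D, H, W) tensor; each net maps (N, 1, h, w) -> (N, C, h, w)
    softmax probabilities. Returns fused probabilities of shape (C, D, H, W).
    """
    # Axial: slice along D -> (D, C, H, W) -> (C, D, H, W)
    ax = axial_net(volume.unsqueeze(1)).permute(1, 0, 2, 3)
    # Coronal: slice along H -> (H, C, D, W) -> (C, D, H, W)
    co = coronal_net(volume.permute(1, 0, 2).unsqueeze(1)).permute(1, 2, 0, 3)
    # Sagittal: slice along W -> (W, C, D, H) -> (C, D, H, W)
    sa = sagittal_net(volume.permute(2, 0, 1).unsqueeze(1)).permute(1, 2, 3, 0)
    return (ax + co + sa) / 3.0
```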
7. Wang Y, Yang Z, Liu X, Li Z, Wu C, Wang Y, Jin K, Chen D, Jia G, Chen X, Ye J, Huang X. PGKD-Net: Prior-guided and Knowledge Diffusive Network for Choroid Segmentation. Artif Intell Med 2024; 150:102837. [PMID: 38553151] [DOI: 10.1016/j.artmed.2024.102837]
Abstract
The thickness of the choroid is considered an important indicator in clinical diagnosis. Accurate choroid segmentation in retinal OCT images is therefore crucial for monitoring various ophthalmic diseases. However, this remains challenging due to blurry boundaries and interference from other lesions. To address these issues, we propose a novel prior-guided and knowledge diffusive network (PGKD-Net) to fully utilize retinal structural information to highlight choroidal region features and boost segmentation performance. Specifically, it is composed of two parts: a Prior-mask Guided Network (PG-Net) for coarse segmentation and a Knowledge Diffusive Network (KD-Net) for fine segmentation. In addition, we design two novel feature enhancement modules, Multi-Scale Context Aggregation (MSCA) and Multi-Level Feature Fusion (MLFF). The MSCA module captures long-distance dependencies between features from different receptive fields and improves the model's ability to learn global context. The MLFF module integrates the cascaded context knowledge learned from PG-Net to benefit fine-level segmentation. Comprehensive experiments are conducted to evaluate the performance of the proposed PGKD-Net. Experimental results show that our method achieves superior segmentation accuracy over other state-of-the-art methods. Our code is publicly available at: https://github.com/yzh-hdu/choroid-segmentation.
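Multi-scale context aggregation is commonly built from parallel dilated convolutions with different receptive fields. One plausible sketch of such a module follows; the dilation rates and fusion by 1x1 convolution are assumptions, not necessarily the paper's exact MSCA design:

```python
import torch
import torch.nn as nn

class MultiScaleContextAggregation(nn.Module):
    """Parallel dilated convolutions capture context at several receptive fields."""
    def __init__(self, channels: int, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        # Concatenated branch outputs are projected back to the input width.
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

y = MultiScaleContextAggregation(64)(torch.randn(1, 64, 32, 32))  # (1, 64, 32, 32)
```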
Affiliation(s)
- Yaqi Wang
- College of Media Engineering, Communication University of Zhejiang, Hangzhou, China
- Zehua Yang
- Hangzhou Dianzi University, Hangzhou, China
- Xindi Liu
- Department of Ophthalmology, School of Medicine, The Second Affiliated Hospital of Zhejiang University, Hangzhou, China
- Zhi Li
- Hangzhou Dianzi University, Hangzhou, China
- Chengyu Wu
- Department of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, China
- Yizhen Wang
- Hangzhou Dianzi University, Hangzhou, China
- Kai Jin
- Department of Ophthalmology, School of Medicine, The Second Affiliated Hospital of Zhejiang University, Hangzhou, China
- Dechao Chen
- Hangzhou Dianzi University, Hangzhou, China
- Juan Ye
- Department of Ophthalmology, School of Medicine, The Second Affiliated Hospital of Zhejiang University, Hangzhou, China
8. Li W, Huang W, Zheng Y. CorrDiff: Corrective Diffusion Model for Accurate MRI Brain Tumor Segmentation. IEEE J Biomed Health Inform 2024; 28:1587-1598. [PMID: 38215328] [DOI: 10.1109/jbhi.2024.3353272]
Abstract
Accurate segmentation of brain tumors in MRI images is imperative for precise clinical diagnosis and treatment. However, existing medical image segmentation methods exhibit errors, which can be categorized into two types: random errors and systematic errors. Random errors, arising from various unpredictable effects, pose challenges in terms of detection and correction. Conversely, systematic errors, attributable to systematic effects, can be effectively addressed through machine learning techniques. In this paper, we propose a corrective diffusion model for accurate MRI brain tumor segmentation that corrects systematic errors. This marks the first application of the diffusion model to correcting systematic segmentation errors. Additionally, we introduce the Vector Quantized Variational Autoencoder (VQ-VAE) to compress the original data into a discrete codebook, which not only reduces the dimensionality of the training data but also enhances the stability of the correction diffusion model. Furthermore, we propose a Multi-Fusion Attention Mechanism, which effectively enhances the segmentation of brain tumor images and improves the flexibility and reliability of the corrective diffusion model. Our model is evaluated on the BRATS2019, BRATS2020, and Jun Cheng datasets. Experimental results demonstrate the effectiveness of our model over state-of-the-art methods in brain tumor segmentation.
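The VQ-VAE compression step maps each encoder vector to its nearest entry in a discrete codebook. A minimal sketch with the usual straight-through gradient estimator (codebook size, shapes, and names are hypothetical):

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Quantize encoder outputs to their nearest codebook entries (VQ-VAE style)."""
    def __init__(self, num_codes: int, dim: int):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z):                                  # z: (B, N, D) encoder vectors
        w = self.codebook.weight                           # (K, D)
        # Squared Euclidean distance from every vector to every code: (B, N, K).
        d = z.pow(2).sum(-1, keepdim=True) - 2 * z @ w.t() + w.pow(2).sum(-1)
        idx = d.argmin(dim=-1)                             # discrete code indices
        q = self.codebook(idx)                             # quantized vectors, (B, N, D)
        # Straight-through estimator: gradients bypass the non-differentiable argmin.
        return z + (q - z).detach(), idx

quantized, codes = VectorQuantizer(num_codes=512, dim=64)(torch.randn(2, 100, 64))
```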
9. Zhang B, Li J, Bai Y, Jiang Q, Yan B, Wang Z. An Improved Microaneurysm Detection Model Based on SwinIR and YOLOv8. Bioengineering (Basel) 2023; 10:1405. [PMID: 38135996] [PMCID: PMC10740408] [DOI: 10.3390/bioengineering10121405]
Abstract
Diabetic retinopathy (DR) is a microvascular complication of diabetes. Microaneurysms (MAs) are often observed in the retinal vessels of diabetic patients and represent one of the earliest signs of DR. Accurate and efficient detection of MAs is crucial for the diagnosis of DR. In this study, an automatic model (MA-YOLO) is proposed for MA detection in fluorescein angiography (FFA) images. To obtain detailed features and improve the discriminability of MAs in FFA images, SwinIR was utilized to reconstruct super-resolution images. To solve the problems of missed detection of small features and feature information loss, an MA detection layer was added between the neck and the head sections of YOLOv8. To enhance the generalization ability of the MA-YOLO model, transfer learning was conducted between high-resolution images and low-resolution images. To avoid excessive penalization due to geometric factors and address sample distribution imbalance, the loss function was optimized by taking the Wise-IoU loss as a bounding box regression loss. The performance of the MA-YOLO model in MA detection was compared with that of other state-of-the-art models, including SSD, RetinaNet, YOLOv5, YOLOX, and YOLOv7. The results showed that the MA-YOLO model had the best performance in MA detection, as shown by its optimal metrics, including recall, precision, F1 score, and AP, which were 88.23%, 97.98%, 92.85%, and 94.62%, respectively. Collectively, the proposed MA-YOLO model is suitable for the automatic detection of MAs in FFA images, which can assist ophthalmologists in the diagnosis of the progression of DR.
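Wise-IoU builds on the plain IoU ratio between predicted and ground-truth boxes. The sketch below shows only that IoU core for axis-aligned (x1, y1, x2, y2) boxes; Wise-IoU's dynamic, distance-based focusing weights are omitted:

```python
import torch

def iou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """1 - IoU for boxes given as (x1, y1, x2, y2), both of shape (N, 4)."""
    # Intersection rectangle, clamped so disjoint boxes contribute zero area.
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    return (1.0 - iou).mean()
```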
Affiliation(s)
- Bowei Zhang
- College of Information Science, Shanghai Ocean University, Shanghai 201306, China
- Jing Li
- Department of Ophthalmology, Eye Institute, Eye and ENT Hospital, Fudan University, Shanghai 201114, China
- Yun Bai
- College of Information Science, Shanghai Ocean University, Shanghai 201306, China
- Qing Jiang
- The Affiliated Eye Hospital, Nanjing Medical University, Nanjing 211166, China
- Biao Yan
- Department of Ophthalmology, Eye Institute, Eye and ENT Hospital, Fudan University, Shanghai 201114, China
- Zhenhua Wang
- College of Information Science, Shanghai Ocean University, Shanghai 201306, China
10. Wang H, Cao P, Yang J, Zaiane O. MCA-UNet: multi-scale cross co-attentional U-Net for automatic medical image segmentation. Health Inf Sci Syst 2023; 11:10. [PMID: 36721640] [PMCID: PMC9884736] [DOI: 10.1007/s13755-022-00209-4]
Abstract
Medical image segmentation is a challenging task due to the high variation in shape, size and position of infections or lesions in medical images. It is necessary to construct multi-scale representations to capture image contents at different scales. However, it is still difficult for U-Net with a simple skip connection to model the global multi-scale context. To overcome this, we propose dense skip connections with cross co-attention in U-Net to bridge the semantic gaps for accurate automatic medical image segmentation. We name our method MCA-UNet, which enjoys two benefits: (1) it has a strong ability to model multi-scale features, and (2) it jointly explores spatial and channel attention. The experimental results on the COVID-19 and IDRiD datasets suggest that our MCA-UNet produces more precise segmentation for consolidation, ground-glass opacity (GGO), microaneurysms (MA) and hard exudates (EX). The source code of this work will be released via https://github.com/McGregorWwww/MCA-UNet/.
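Cross co-attention on a skip connection can be sketched as the decoder feature gating the encoder feature along both channel and spatial dimensions. A hypothetical minimal form (the paper's actual module operates across multiple scales and dense connections):

```python
import torch
import torch.nn as nn

class CoAttentionSkip(nn.Module):
    """Gate an encoder skip feature with channel and spatial attention
    derived from the decoder feature at the same scale."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(nn.Conv2d(channels, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, skip, decoder):
        # The decoder feature decides which channels and positions of the
        # encoder skip are worth passing across the semantic gap.
        return skip * self.channel_gate(decoder) * self.spatial_gate(decoder)

gated = CoAttentionSkip(32)(torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64))
```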
Affiliation(s)
- Haonan Wang
- Computer Science and Engineering, Northeastern University, Shenyang, China
- Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Shenyang, China
- Peng Cao
- Computer Science and Engineering, Northeastern University, Shenyang, China
- Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Shenyang, China
- Jinzhu Yang
- Computer Science and Engineering, Northeastern University, Shenyang, China
- Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Shenyang, China
- Osmar Zaiane
- Amii, University of Alberta, Edmonton, AB, Canada
11. Chen Y, Yu L, Wang JY, Panjwani N, Obeid JP, Liu W, Liu L, Kovalchuk N, Gensheimer MF, Vitzthum LK, Beadle BM, Chang DT, Le QT, Han B, Xing L. Adaptive Region-Specific Loss for Improved Medical Image Segmentation. IEEE Trans Pattern Anal Mach Intell 2023; 45:13408-13421. [PMID: 37363838] [PMCID: PMC11346301] [DOI: 10.1109/tpami.2023.3289667]
Abstract
Defining the loss function is an important part of neural network design and critically determines the success of deep learning modeling. A significant shortcoming of the conventional loss functions is that they weight all regions in the input image volume equally, despite the fact that the system is known to be heterogeneous (i.e., some regions can achieve high prediction performance more easily than others). Here, we introduce a region-specific loss to lift the implicit assumption of homogeneous weighting for better learning. We divide the entire volume into multiple sub-regions, each with an individualized loss constructed for optimal local performance. Effectively, this scheme imposes higher weightings on the sub-regions that are more difficult to segment, and vice versa. Furthermore, the regional false positive and false negative errors are computed for each input image during a training step and the regional penalty is adjusted accordingly to enhance the overall accuracy of the prediction. Using different public and in-house medical image datasets, we demonstrate that the proposed regionally adaptive loss paradigm outperforms conventional methods in the multi-organ segmentations, without any modification to the neural network architecture or additional data preparation.
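The regionally adaptive idea, splitting the volume into sub-regions and up-weighting the harder ones, can be illustrated as follows. The 2x2 grid, BCE base loss, and weight bounds are assumptions; the paper additionally adjusts regional false-positive/false-negative penalties, which this sketch omits:

```python
import torch
import torch.nn.functional as F

def region_specific_loss(logits, target, grid=(2, 2), min_w=0.5, max_w=2.0):
    """Split each image into sub-regions and up-weight the harder ones.

    logits: (B, 1, H, W) raw predictions; target: (B, 1, H, W) binary masks.
    H and W are assumed divisible by the grid dimensions.
    """
    bce = F.binary_cross_entropy_with_logits(logits, target.float(), reduction='none')
    B, _, H, W = bce.shape
    gh, gw = grid
    # Reshape so each (gh, gw) cell averages its own pixel losses.
    regions = bce.reshape(B, 1, gh, H // gh, gw, W // gw)
    per_region = regions.mean(dim=(3, 5))                       # (B, 1, gh, gw)
    # Regions with higher current loss (harder to segment) get larger weights.
    weights = per_region / (per_region.mean(dim=(2, 3), keepdim=True) + 1e-8)
    weights = weights.clamp(min_w, max_w).detach()              # no gradient through weights
    return (per_region * weights).mean()

loss = region_specific_loss(torch.randn(2, 1, 64, 64), torch.randint(0, 2, (2, 1, 64, 64)))
```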
12. Yin W, Zhou D, Nie R. DI-UNet: dual-branch interactive U-Net for skin cancer image segmentation. J Cancer Res Clin Oncol 2023; 149:15511-15524. [PMID: 37646827] [DOI: 10.1007/s00432-023-05319-4]
Abstract
PURPOSE: Skin disease is a prevalent type of physical ailment that can manifest in a multitude of forms. Many internal diseases can be directly reflected on the skin, and if left unattended, skin diseases can potentially develop into skin cancer. Accurate and effective segmentation of skin lesions, especially melanoma, is critical for early detection and diagnosis of skin cancer. However, the complex color variations, boundary ambiguity, and scale variations in skin lesion regions present significant challenges for precise segmentation. METHODS: We propose a novel approach for melanoma segmentation using a dual-branch interactive U-Net architecture. Two distinct sampling strategies are simultaneously integrated into the network, creating a vertical dual-branch structure. Meanwhile, we introduce a novel dual-channel symmetrical convolution block (DCS-Conv), which employs a symmetrical design, enabling the network to exhibit a horizontal dual-branch structure. The combination of the vertical and horizontal dual-branch structures enhances both the depth and width of the network, providing greater diversity and rich multiscale cascade features. Additionally, this paper introduces a novel residual fuse-and-select module (RFS module), which leverages self-attention to focus on specific skin cancer features and reduce irrelevant artifacts, further improving segmentation accuracy. RESULTS: We evaluated our approach on two public skin cancer datasets, ISIC2016 and PH2, and achieved state-of-the-art results, surpassing previous outcomes in terms of segmentation accuracy and overall performance. CONCLUSION: Our proposed approach holds tremendous potential to aid dermatologists in clinical decision-making.
Affiliation(s)
- Wen Yin
- School of Information Science and Engineering, Yunnan University, Kunming, 650504, China
- Dongming Zhou
- School of Information Science and Engineering, Yunnan University, Kunming, 650504, China
- Rencan Nie
- School of Information Science and Engineering, Yunnan University, Kunming, 650504, China
13. Karimi D, Gholipour A. Improving Calibration and Out-of-Distribution Detection in Deep Models for Medical Image Segmentation. IEEE Trans Artif Intell 2023; 4:383-397. [PMID: 37868336] [PMCID: PMC10586223] [DOI: 10.1109/tai.2022.3159510]
Abstract
Convolutional Neural Networks (CNNs) have proved to be powerful medical image segmentation models. In this study, we address some of the main unresolved issues regarding these models. Specifically, training of these models on small medical image datasets is still challenging, with many studies promoting techniques such as transfer learning. Moreover, these models are infamous for producing over-confident predictions and for failing silently when presented with out-of-distribution (OOD) test data. In this paper, for improving prediction calibration we advocate for multi-task learning, i.e., training a single model on several different datasets, spanning different organs of interest and different imaging modalities. We show that multi-task learning can significantly improve model confidence calibration. For OOD detection, we propose a novel method based on spectral analysis of CNN feature maps. We show that different datasets, representing different imaging modalities and/or different organs of interest, have distinct spectral signatures, which can be used to identify whether or not a test image is similar to the images used for training. We show that our proposed method is more accurate than several competing methods, including methods based on prediction uncertainty and image classification.
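Spectral-signature OOD detection of this flavor can be sketched by summarizing a feature map's 2D Fourier spectrum into a radial energy profile and scoring the distance to training-set statistics. The binning scheme and z-score distance below are assumptions, not the authors' exact method:

```python
import torch

def spectral_signature(feature_map: torch.Tensor, k: int = 16) -> torch.Tensor:
    """Summarize a CNN feature map (C, H, W) by the radial energy of its 2D spectrum."""
    spec = torch.fft.fft2(feature_map).abs().mean(0)       # (H, W) mean magnitude
    spec = torch.fft.fftshift(spec)                        # center the DC component
    H, W = spec.shape
    yy, xx = torch.meshgrid(torch.arange(H), torch.arange(W), indexing='ij')
    r = ((yy - H // 2) ** 2 + (xx - W // 2) ** 2).float().sqrt()
    # Bucket spectral energy into k radial bins (low to high frequency).
    bins = (r / (r.max() + 1e-8) * (k - 1)).long()
    sig = torch.zeros(k).scatter_add_(0, bins.flatten(), spec.flatten())
    return sig / sig.sum()                                 # normalized energy profile

def ood_score(sig, train_mean, train_std):
    # Simple z-score distance: large values suggest the test image does not
    # resemble the training distribution for this layer.
    return ((sig - train_mean) / (train_std + 1e-8)).abs().mean()
```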
Affiliation(s)
- Davood Karimi
- Department of Radiology, Boston Children's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Ali Gholipour
- Department of Radiology, Boston Children's Hospital and Harvard Medical School, Boston, Massachusetts, USA
14. Wang Y, Su J, Xu Q, Zhong Y. A Collaborative Learning Model for Skin Lesion Segmentation and Classification. Diagnostics (Basel) 2023; 13:912. [PMID: 36900056] [PMCID: PMC10001355] [DOI: 10.3390/diagnostics13050912]
Abstract
The automatic segmentation and classification of skin lesions are two essential tasks in computer-aided skin cancer diagnosis. Segmentation aims to detect the location and boundary of the skin lesion, while classification evaluates the type of skin lesion. The location and contour information provided by segmentation is essential for classification, while disease classification helps generate target localization maps that assist the segmentation task. Although segmentation and classification are studied independently in most cases, we find that meaningful information can be explored using the correlation between the two tasks, especially when sample data are insufficient. In this paper, we propose a collaborative learning deep convolutional neural network (CL-DCNN) model based on the teacher-student learning method for dermatological segmentation and classification. To generate high-quality pseudo-labels, we provide a self-training method: the segmentation network is selectively retrained through classification-network screening of pseudo-labels. Specifically, we obtain high-quality pseudo-labels for the segmentation network by providing a reliability measure. We also employ class activation maps to improve the localization ability of the segmentation network. Furthermore, we provide lesion contour information by using the lesion segmentation masks to improve the recognition ability of the classification network. Experiments are carried out on the ISIC 2017 and ISIC Archive datasets. The CL-DCNN model achieved a Jaccard index of 79.1% on the skin lesion segmentation task and an average AUC of 93.7% on the skin disease classification task, which is superior to advanced skin lesion segmentation and classification methods.
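The pseudo-label screening loop can be sketched as a confidence filter: segmentation masks on unlabeled images are retained for retraining only when the classification network is sufficiently confident. A hypothetical minimal version (the threshold and interfaces are assumed, not the paper's reliability measure):

```python
import torch

@torch.no_grad()
def screen_pseudo_labels(images, masks, classifier, expected_class, tau=0.9):
    """Keep only pseudo segmentation masks the classifier trusts.

    images: (N, C, H, W) unlabeled inputs; masks: (N, 1, H, W) candidate
    pseudo masks; classifier returns (N, num_classes) logits. A mask is
    accepted for segmentation retraining only if the predicted probability
    of the expected lesion class exceeds tau.
    """
    probs = torch.softmax(classifier(images), dim=1)[:, expected_class]
    keep = probs >= tau
    return images[keep], masks[keep], probs[keep]
```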
Affiliation(s)
- Ying Wang
- School of Information Science and Engineering, University of Jinan, Jinan 250022, China
- Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan 250022, China
- Jie Su
- School of Information Science and Engineering, University of Jinan, Jinan 250022, China
- Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan 250022, China
- Correspondence: Tel.: +86-15054125550
- Qiuyu Xu
- School of Information Science and Engineering, University of Jinan, Jinan 250022, China
- Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan 250022, China
- Yixin Zhong
- School of Information Science and Engineering, University of Jinan, Jinan 250022, China
- Artificial Intelligence Research Institute, University of Jinan, Jinan 250022, China
15. Liu Z, Zhao C, Lu Y, Jiang Y, Yan J. Multi-scale graph learning for ovarian tumor segmentation from CT images. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.09.093]
16. Yang H, Chen Q, Fu K, Zhu L, Jin L, Qiu B, Ren Q, Du H, Lu Y. Boosting medical image segmentation via conditional-synergistic convolution and lesion decoupling. Comput Med Imaging Graph 2022; 101:102110. [PMID: 36057184] [DOI: 10.1016/j.compmedimag.2022.102110]
Abstract
Medical image segmentation is a critical step in pathology assessment and monitoring. Many methods utilize a deep convolutional neural network for various medical segmentation tasks, such as polyp segmentation, skin lesion segmentation, etc. However, due to the inherent difficulty of medical images and tremendous data variations, they usually perform poorly in intractable cases. In this paper, we propose an input-specific network called the conditional-synergistic convolution and lesion decoupling network (CCLDNet) to solve these issues. First, in contrast to existing CNN-based methods with stationary convolutions, we propose the conditional synergistic convolution (CSConv), which generates a specialist convolution kernel for each lesion. CSConv has dynamic modeling capability and can be leveraged as a basic block to construct other networks for a broad range of vision tasks. Second, we devise a lesion decoupling strategy (LDS) that decouples the original lesion segmentation map into two soft labels, i.e., a lesion center label and a lesion boundary label, to reduce the segmentation difficulty. Besides, we use a transformer network as the backbone, further removing the fixed structure of the standard CNN and strengthening the dynamic modeling capability of the whole framework. Our CCLDNet outperforms state-of-the-art approaches by a large margin on a variety of benchmarks, including polyp segmentation (89.22% Dice score on EndoScene) and skin lesion segmentation (91.15% Dice score on ISIC2018). Our code is available at https://github.com/QianChen98/CCLD-Net.
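Per-input kernel generation of the CSConv kind can be illustrated with a CondConv-style mixture of expert kernels, routed by globally pooled features. This is a generic dynamic-convolution sketch, not the paper's exact CSConv:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalConv2d(nn.Module):
    """Per-input convolution: mix K expert kernels with weights predicted
    from the input itself, so each image gets its own specialist kernel."""
    def __init__(self, in_ch: int, out_ch: int, k: int = 3, experts: int = 4):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(experts, out_ch, in_ch, k, k) * 0.02)
        self.route = nn.Linear(in_ch, experts)
        self.k = k

    def forward(self, x):                                   # x: (B, C, H, W)
        B = x.size(0)
        # Routing weights from globally pooled features: one mixture per image.
        alpha = torch.softmax(self.route(x.mean(dim=(2, 3))), dim=1)    # (B, E)
        w = torch.einsum('be,eoihw->boihw', alpha, self.weight)         # (B, O, I, k, k)
        # Grouped-convolution trick applies a different kernel to each image.
        out = F.conv2d(
            x.reshape(1, -1, x.size(2), x.size(3)),                     # (1, B*C, H, W)
            w.reshape(-1, w.size(2), self.k, self.k),                   # (B*O, I, k, k)
            padding=self.k // 2,
            groups=B,
        )
        return out.reshape(B, -1, x.size(2), x.size(3))

y = ConditionalConv2d(16, 32)(torch.randn(2, 16, 24, 24))  # (2, 32, 24, 24)
```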
Affiliation(s)
- Huakun Yang
- College of Information Science and Technology, University of Science and Technology of China, Hefei 230041, China
- Qian Chen
- Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing 100191, China; Department of Biomedical Engineering, College of Future Technology, Peking University, Beijing 100871, China
- Keren Fu
- College of Computer Science, National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu 610065, China
- Lei Zhu
- Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing 100191, China; Department of Biomedical Engineering, College of Future Technology, Peking University, Beijing 100871, China
- Lujia Jin
- Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing 100191, China; Department of Biomedical Engineering, College of Future Technology, Peking University, Beijing 100871, China
- Bensheng Qiu
- College of Information Science and Technology, University of Science and Technology of China, Hefei 230041, China
- Qiushi Ren
- Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing 100191, China; Department of Biomedical Engineering, College of Future Technology, Peking University, Beijing 100871, China
- Hongwei Du
- College of Information Science and Technology, University of Science and Technology of China, Hefei 230041, China
- Yanye Lu
- Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing 100191, China
17. Shen X, Wu X, Liu R, Li H, Yin J, Wang L, Ma H. Accurate segmentation of breast tumor in ultrasound images through joint training and refined segmentation. Phys Med Biol 2022; 67. [DOI: 10.1088/1361-6560/ac8964]
Abstract
Objective. This paper proposes an automatic breast tumor segmentation method for two-dimensional (2D) ultrasound images, which is significantly more accurate, robust, and adaptable than common deep learning models on small datasets. Approach. A generalized joint training and refined segmentation framework (JR) was established, involving a joint training module (J-module) and a refined segmentation module (R-module). In the J-module, two segmentation networks are trained simultaneously, under the guidance of the proposed Jocor for Segmentation (JFS) algorithm. In the R-module, the output of the J-module is refined by the proposed area first (AF) algorithm and marked watershed (MW) algorithm. The AF mainly reduces false positives, which arise easily from the inherent features of breast ultrasound images, in light of the area, distance, average radical derivative (ARD) and radical gradient index (RGI) of candidate contours. Meanwhile, the MW avoids over-segmentation and refines segmentation results. To verify its performance, the JR framework was evaluated on three breast ultrasound image datasets. Image dataset A contains 1036 images from local hospitals. Image datasets B and C are two public datasets, containing 562 images and 163 images, respectively. The evaluation was followed by related ablation experiments. Main results. The JR outperformed the other state-of-the-art (SOTA) methods on the three image datasets, especially on image dataset B. Compared with the SOTA methods, the JR improved the true positive ratio (TPR) and Jaccard index (JI) by 1.5% and 3.2%, respectively, and reduced the false positive ratio (FPR) by 3.7% on image dataset B. The results of the ablation experiments show that each component of the JR matters and contributes to the segmentation accuracy, particularly in the reduction of false positives. Significance. This study successfully combines traditional segmentation methods with deep learning models. The proposed method can segment small-scale breast ultrasound image datasets efficiently and effectively, with excellent generalization performance.
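The marked-watershed refinement step can be sketched with scikit-image: confident foreground and background pixels become markers, and the watershed transform resolves the uncertain band between them (the thresholds below are assumptions):

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

def refine_with_marked_watershed(prob_map: np.ndarray, hi: float = 0.8, lo: float = 0.3) -> np.ndarray:
    """Refine a CNN probability map (H, W) in [0, 1] with a marked watershed.

    Confident interior pixels (> hi) seed foreground markers, confident
    background (< lo) seeds a background marker, and the watershed floods
    the uncertain band in between, avoiding over-segmentation.
    """
    markers = np.zeros(prob_map.shape, dtype=np.int32)
    markers[prob_map < lo] = 1                      # background marker
    fg, _ = ndi.label(prob_map > hi)                # connected foreground seeds
    markers[fg > 0] = fg[fg > 0] + 1                # foreground ids start at 2
    labels = watershed(-prob_map, markers)          # flood basins of the inverted map
    return (labels > 1).astype(np.uint8)            # binary refined mask
```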
18. DGCU–Net: A new dual gradient-color deep convolutional neural network for efficient skin lesion segmentation. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103829]
19. You Z, Jiang M, Shi Z, Zhao M, Shi C, Du S, Hérard AS, Souedet N, Delzescaux T. Multiscale segmentation- and error-guided iterative convolutional neural network for cerebral neuron segmentation in microscopic images. Microsc Res Tech 2022; 85:3541-3552. [PMID: 35855638] [DOI: 10.1002/jemt.24206]
Abstract
This article uses microscopy images obtained from diverse anatomical regions of the macaque brain for neuron semantic segmentation. The complex structure of the brain, the large intra-class staining intensity difference within the neuron class, the small inter-class staining intensity difference between the neuron and tissue classes, and the unbalanced dataset increase the difficulty of neuron semantic segmentation. To address this problem, we propose a multiscale segmentation- and error-guided iterative convolutional neural network (MSEG-iCNN) to improve semantic segmentation performance in major anatomical regions of the macaque brain. After evaluating microscopic images from 17 anatomical regions, the semantic segmentation performance for neurons is improved by 10.6%, 4.0%, 1.5%, and 1.2% compared with Random Forest, FCN-8s, U-Net, and UNet++, respectively. Especially for neurons with brighter staining intensity in anatomical regions such as the lateral geniculate, globus pallidus and hypothalamus, the performance is improved by 66.1%, 23.9%, 11.2%, and 6.7%, respectively. Experiments show that our proposed method can efficiently segment neurons with a wide range of staining intensities. The semantic segmentation results can be further used for neuron instance segmentation, morphological analysis and disease diagnosis, and cell segmentation more broadly plays a critical role in extracting cerebral information, such as cell counting, cell morphometry and distribution analysis.
Affiliation(s)
- Zhenzhen You
- Shaanxi Key Laboratory for Network Computing and Security Technology, School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China
- CEA-CNRS-UMR 9199, Laboratoire des Maladies Neurodégénératives, MIRCen, Fontenay-aux-Roses, Université Paris-Saclay, Paris, France
- Ming Jiang
- National Laboratory of Radar Signal Processing, Xidian University, Xi'an, China
- Zhenghao Shi
- Shaanxi Key Laboratory for Network Computing and Security Technology, School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China
- Minghua Zhao
- Shaanxi Key Laboratory for Network Computing and Security Technology, School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China
- Cheng Shi
- Shaanxi Key Laboratory for Network Computing and Security Technology, School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China
- Shuangli Du
- Shaanxi Key Laboratory for Network Computing and Security Technology, School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China
- Anne-Sophie Hérard
- CEA-CNRS-UMR 9199, Laboratoire des Maladies Neurodégénératives, MIRCen, Fontenay-aux-Roses, Université Paris-Saclay, Paris, France
- Nicolas Souedet
- CEA-CNRS-UMR 9199, Laboratoire des Maladies Neurodégénératives, MIRCen, Fontenay-aux-Roses, Université Paris-Saclay, Paris, France
- Thierry Delzescaux
- CEA-CNRS-UMR 9199, Laboratoire des Maladies Neurodégénératives, MIRCen, Fontenay-aux-Roses, Université Paris-Saclay, Paris, France
20. Xia H, Rao Z, Zhou Z. A multi-scale gated network for retinal hemorrhage detection. Appl Intell 2022. [DOI: 10.1007/s10489-022-03476-6]
21. Mishra S, Zhang Y, Chen DZ, Hu XS. Data-Driven Deep Supervision for Medical Image Segmentation. IEEE Trans Med Imaging 2022; 41:1560-1574. [PMID: 35030076] [DOI: 10.1109/tmi.2022.3143371]
Abstract
Medical image segmentation plays a vital role in disease diagnosis and analysis. However, data-dependent difficulties such as low image contrast, noisy background, and complicated objects of interest render the segmentation problem challenging. These difficulties diminish dense prediction and make it difficult for known approaches to explore data-specific attributes for robust feature extraction. In this paper, we study medical image segmentation by focusing on robust data-specific feature extraction to achieve improved dense prediction. We propose a new deep convolutional neural network (CNN), which exploits specific attributes of input datasets to utilize deep supervision for enhanced feature extraction. In particular, we strategically locate and deploy auxiliary supervision by matching the object perceptive field (OPF) (which we define and compute) with the layer-wise effective receptive fields (LERF) of the network. This helps the model pay close attention to distinct input-data-dependent features, which the network might otherwise 'ignore' during training. Further, to achieve better target localization and refined dense prediction, we propose densely decoded networks (DDN), which selectively introduce additional network connections (the 'crutch' connections). Using five public datasets (two retinal vessel, melanoma, optic disc/cup, and spleen segmentation) and two in-house datasets (lymph node and fungus segmentation), we verify the effectiveness of our proposed approach in 2D and 3D segmentation.
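Auxiliary (deep) supervision generally attaches small prediction heads to intermediate feature maps and adds their losses to the main objective. A generic sketch follows; the paper's OPF/LERF-based placement of these heads is not reproduced here:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AuxiliaryHead(nn.Module):
    """1x1 projection that lets an intermediate feature map be supervised
    directly against the upsampled ground-truth mask."""
    def __init__(self, in_ch: int, num_classes: int):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, num_classes, kernel_size=1)

    def forward(self, feat, target_size):
        # Project to class logits, then resize to the label resolution.
        return F.interpolate(self.proj(feat), size=target_size,
                             mode='bilinear', align_corners=False)

def deep_supervision_loss(main_logits, aux_logits_list, target, aux_weight=0.4):
    """Main cross-entropy plus down-weighted auxiliary terms."""
    loss = F.cross_entropy(main_logits, target)
    for aux in aux_logits_list:
        loss = loss + aux_weight * F.cross_entropy(aux, target)
    return loss
```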
22. Huang S, Li J, Xiao Y, Shen N, Xu T. RTNet: Relation Transformer Network for Diabetic Retinopathy Multi-Lesion Segmentation. IEEE Trans Med Imaging 2022; 41:1596-1607. [PMID: 35041595] [DOI: 10.1109/tmi.2022.3143833]
Abstract
Automatic diabetic retinopathy (DR) lesion segmentation is of great value in assisting ophthalmologists with diagnosis. Although much research has been conducted on this task, most prior works paid too much attention to network design rather than the pathological associations among lesions. By investigating the pathogenic causes of DR lesions in advance, we found that certain lesions lie close to specific vessels and present relative patterns to each other. Motivated by this observation, we propose a relation transformer block (RTB) to incorporate attention mechanisms at two main levels: a self-attention transformer exploits global dependencies among lesion features, while a cross-attention transformer allows interactions between lesion and vessel features, integrating valuable vascular information to alleviate ambiguity in lesion detection caused by complex fundus structures. In addition, to capture small lesion patterns first, we propose a global transformer block (GTB) which preserves detailed information in the deep network. By integrating the above blocks in dual branches, our network segments the four kinds of lesions simultaneously. Comprehensive experiments on the IDRiD and DDR datasets demonstrate the superiority of our approach, which achieves competitive performance compared to the state of the art.
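The cross-attention interaction between lesion and vessel features can be sketched with a standard multi-head attention layer in which lesion tokens query vessel tokens. This is a generic sketch, not the exact RTB:

```python
import torch
import torch.nn as nn

class LesionVesselCrossAttention(nn.Module):
    """Lesion tokens query vessel tokens so vascular context can
    disambiguate lesion detection."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, lesion_tokens, vessel_tokens):
        # lesion_tokens: (B, N, D) queries; vessel_tokens: (B, M, D) keys/values.
        ctx, _ = self.attn(lesion_tokens, vessel_tokens, vessel_tokens)
        return self.norm(lesion_tokens + ctx)   # residual fusion of vascular context

fused = LesionVesselCrossAttention(128)(torch.randn(2, 64, 128), torch.randn(2, 32, 128))
```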
23. Zhang Z, Tian C, Bai HX, Jiao Z, Tian X. Discriminative Error Prediction Network for Semi-supervised Colon Gland Segmentation. Med Image Anal 2022; 79:102458. [DOI: 10.1016/j.media.2022.102458]
24. Liu P, Yue N, Chen J. A Machine-Learning-Based Medical Imaging Fast Recognition of Injury Mechanism for Athletes of Winter Sports. Front Public Health 2022; 10:842452. [PMID: 35372194] [PMCID: PMC8968734] [DOI: 10.3389/fpubh.2022.842452]
Abstract
The Beijing 2022 Winter Olympics, centered on winter sports, will begin soon, and athletes from different countries will arrive in Beijing one after another for training and competition. The health protection of winter sports athletes is very important in training and competition. The occurrence of sports injury is characterized by multiple factors, uncertainty, and accidents; this paper focuses on head injury, which has the highest severity. High safety awareness among athletes helps reduce injuries, but awareness alone cannot effectively prevent injuries in competition, so timely treatment of injured athletes is particularly important. After athletes are injured, a telemedicine image acquisition system can be built so that medical experts can identify athletes' injuries in time and provide the basis for further diagnosis and treatment. To improve the accuracy of medical image processing, a C-support vector machine (SVM) medical image segmentation method combining the Chan-Vese (CV) model and SVM is proposed in this paper. After segmentation, the edge and detail features of the image are more prominent, meeting the high-precision requirements of medical image segmentation. Meanwhile, a high-precision registration algorithm for brain functional time-series images based on machine learning (ML) is proposed, with automatic optimization of the registration performed by the ML algorithm. The experimental results show that the proposed algorithm achieves segmentation accuracy above 80% and registration time below 40 ms, which can provide a reference for doctors to quickly identify injuries and shorten treatment time.
25. Wang R, Chen S, Ji C, Fan J, Li Y. Boundary-Aware Context Neural Network for Medical Image Segmentation. Med Image Anal 2022; 78:102395. [DOI: 10.1016/j.media.2022.102395]
26. Xie Y, Zhang J, Liao Z, Verjans J, Shen C, Xia Y. Intra- and Inter-Pair Consistency for Semi-Supervised Gland Segmentation. IEEE Trans Image Process 2022; 31:894-905. [PMID: 34951847] [DOI: 10.1109/tip.2021.3136716]
Abstract
Accurate gland segmentation in histology tissue images is a critical but challenging task. Although deep models have demonstrated superior performance in medical image segmentation, they commonly require a large amount of annotated data, which is hard to obtain due to the extensive labor costs and expertise required. In this paper, we propose an intra- and inter-pair consistency-based semi-supervised (I2CS) model that can be trained on both labeled and unlabeled histology images for gland segmentation. Considering that each image contains glands and hence different images could potentially share consistent semantics in the feature space, we introduce a novel intra- and inter-pair consistency module to explore such consistency for learning with unlabeled data. It first characterizes the pixel-level relation between a pair of images in the feature space to create an attention map that highlights regions with the same semantics on different images. Then, it imposes a consistency constraint on the attention maps obtained from multiple image pairs, thereby filtering low-confidence attention regions to generate refined attention maps that are merged with the original features to improve their representation ability. In addition, we design an object-level loss to address the issues caused by touching glands. We evaluated our model against several recent gland segmentation methods and three typical semi-supervised methods on the GlaS and CRAG datasets. Our results not only demonstrate the effectiveness of the proposed intra- and inter-pair consistency module and Obj-Dice loss, but also indicate that the proposed I2CS model achieves state-of-the-art gland segmentation performance on both benchmarks.
27
He X, Tan EL, Bi H, Zhang X, Zhao S, Lei B. Fully Transformer Network for Skin Lesion Analysis. Med Image Anal 2022; 77:102357. [DOI: 10.1016/j.media.2022.102357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2021] [Revised: 12/26/2021] [Accepted: 01/06/2022] [Indexed: 10/19/2022]
28
Wu H, Chen S, Chen G, Wang W, Lei B, Wen Z. FAT-Net: Feature adaptive transformers for automated skin lesion segmentation. Med Image Anal 2021; 76:102327. [PMID: 34923250 DOI: 10.1016/j.media.2021.102327] [Citation(s) in RCA: 125] [Impact Index Per Article: 31.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2021] [Revised: 11/17/2021] [Accepted: 12/01/2021] [Indexed: 12/23/2022]
Abstract
Skin lesion segmentation from dermoscopic images is essential for improving the quantitative analysis of melanoma. However, it remains a challenging task owing to the large scale variations and irregular shapes of skin lesions; in addition, blurred boundaries between lesions and the surrounding tissue increase the probability of incorrect segmentation. Because traditional convolutional neural networks (CNNs) are inherently limited in capturing global context information, traditional CNN-based methods usually cannot achieve satisfactory segmentation performance. In this paper, we propose FAT-Net, a novel feature adaptive transformer network based on the classical encoder-decoder architecture, which integrates an extra transformer branch to effectively capture long-range dependencies and global context information. We further employ a memory-efficient decoder and a feature adaptation module that enhance the fusion of adjacent-level features by activating effective channels and suppressing irrelevant background noise. Extensive experiments on four public skin lesion segmentation datasets (ISIC 2016, ISIC 2017, ISIC 2018, and PH2) verify the effectiveness of the proposed method. Ablation studies demonstrate the contribution of the feature adaptive transformers and memory-efficient strategies, and comparisons with state-of-the-art methods confirm the superiority of FAT-Net in both accuracy and inference speed. The code is available at https://github.com/SZUcsh/FAT-Net.
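The sketch below illustrates what a feature adaptation module of the kind described might look like in PyTorch: adjacent-level features are fused and then re-weighted by a channel gate. It is an interpretation of the abstract, not code from the linked repository; the module name, channel widths, and the squeeze-and-excitation style gate are assumptions.

# A rough sketch of adjacent-level feature fusion with channel gating.
import torch
import torch.nn as nn

class FeatureAdaptation(nn.Module):
    def __init__(self, ch_low, ch_high, ch_out):
        super().__init__()
        self.proj = nn.Conv2d(ch_low + ch_high, ch_out, kernel_size=1)
        self.gate = nn.Sequential(                 # channel attention
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch_out, ch_out // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch_out // 4, ch_out, 1), nn.Sigmoid())

    def forward(self, low, high):
        # Upsample the coarser (high-level) features to the finer grid.
        high = nn.functional.interpolate(high, size=low.shape[-2:],
                                         mode="bilinear", align_corners=False)
        fused = self.proj(torch.cat([low, high], dim=1))
        return fused * self.gate(fused)            # activate useful channels

fam = FeatureAdaptation(ch_low=64, ch_high=128, ch_out=64)
out = fam(torch.randn(1, 64, 56, 56), torch.randn(1, 128, 28, 28))
print(out.shape)  # torch.Size([1, 64, 56, 56])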
Affiliation(s)
- Huisi Wu
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China
- Shihuai Chen
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China
- Guilian Chen
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China
- Wei Wang
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China
- Baiying Lei
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, 518060, China
- Zhenkun Wen
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China
29
Analysis of the ISIC image datasets: Usage, benchmarks and recommendations. Med Image Anal 2021; 75:102305. [PMID: 34852988 DOI: 10.1016/j.media.2021.102305] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2021] [Revised: 08/09/2021] [Accepted: 11/08/2021] [Indexed: 11/20/2022]
Abstract
The International Skin Imaging Collaboration (ISIC) datasets have become a leading repository for machine learning researchers in medical image analysis, especially in skin cancer detection and malignancy assessment. They contain tens of thousands of dermoscopic photographs together with gold-standard lesion diagnosis metadata, and the associated yearly challenges have produced major contributions to the field, with papers reporting performance measures well in excess of those of human experts. Skin cancers divide into two major groups, melanoma and non-melanoma; although less prevalent, melanoma is considered the more serious because it can quickly spread to other organs if not treated at an early stage. In this paper, we summarise the usage of ISIC dataset images and present an analysis of the yearly releases from 2016 to 2020. Our analysis found a significant number of duplicate images, both within and between the datasets, including duplicates spread across testing and training sets. Because of these irregularities, we propose a duplicate removal strategy and recommend a curated dataset for researchers working with ISIC data. Given that ISIC 2020 focused on melanoma classification, we conduct experiments to provide benchmark results on the ISIC 2020 test set, with additional analysis on the smaller ISIC 2017 test set; testing was completed after applying our duplicate removal strategy and an additional data balancing step. After removing 14,310 duplicate images from the training set, our benchmark results show good levels of melanoma prediction, with an AUC of 0.80 for the best performing model. As our aim was not to maximise network performance, we did not include additional steps in our experiments. Finally, we provide recommendations for future research by highlighting irregularities that may present research challenges. A list of image files, with references to the original ISIC dataset sources, for the recommended curated training set will be shared on our GitHub repository (available at www.github.com/mmu-dermatology-research/isic_duplicate_removal_strategy).
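A duplicate screening step in this spirit can be prototyped in a few lines; the sketch below catches exact duplicates by file hashing and near-duplicates with a simple average hash and a Hamming-distance threshold. The hashing method and the threshold are illustrative assumptions, since the paper's exact strategy lives in its repository rather than in the abstract.

# A small sketch of duplicate screening for image collections.
import hashlib
import numpy as np
from PIL import Image
from pathlib import Path

def file_md5(path):
    # Exact byte-level duplicates.
    return hashlib.md5(Path(path).read_bytes()).hexdigest()

def average_hash(path, size=8):
    # 64-bit signature: pixels above the mean of an 8x8 grayscale thumbnail.
    img = np.asarray(Image.open(path).convert("L").resize((size, size)),
                     dtype=np.float32)
    return (img > img.mean()).flatten()

def near_duplicates(paths, max_hamming=5):
    sigs = [(p, average_hash(p)) for p in paths]
    pairs = []
    for i, (p1, s1) in enumerate(sigs):
        for p2, s2 in sigs[i + 1:]:
            if int((s1 != s2).sum()) <= max_hamming:
                pairs.append((p1, p2))
    return pairs

# Usage: flag candidate duplicates across training and test folders, then
# review and remove them before benchmarking, e.g.
# dups = near_duplicates(sorted(Path("isic_images").glob("*.jpg")))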
30
Guo W, Zhang G, Gong Z, Li Q. Effective integration of object boundaries and regions for improving the performance of medical image segmentation by using two cascaded networks. Comput Methods Programs Biomed 2021; 212:106423. [PMID: 34673377 DOI: 10.1016/j.cmpb.2021.106423] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/09/2021] [Accepted: 09/13/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND AND OBJECTIVES: Existing CNN-based segmentation methods use object regions alone as training labels, and the potentially useful boundaries annotated by radiologists are not used directly during training. We therefore proposed a framework of two cascaded networks that integrates both region and boundary information to improve the accuracy of object segmentation.
METHODS: The first network extracts the boundary from the original image. The predicted boundary from the first network is dilated and, together with the corresponding original image, used to train the second network for the final segmentation. Compared with object regions, boundaries may provide additional useful local information. The cascaded networks were evaluated on three datasets: 40 CT scans for segmenting the esophagus, heart, trachea, and aorta; 247 chest radiographs for segmenting the lung, heart, and clavicle; and 101 retinal images for segmenting the optic disk and cup. Mean Dice, 90% Hausdorff distance, and Euclidean distance were used for quantitative evaluation.
RESULTS: Compared with the conventional U-Net baseline, the two cascaded networks consistently improved mean Dice and reduced the mean 90% Hausdorff and Euclidean distances for all objects; for certain objects, the 90% Hausdorff distance was reduced by as much as a factor of ten.
CONCLUSIONS: The boundary is highly informative for object segmentation, and integrating object boundaries with regions improves segmentation results compared with using object regions alone.
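The cascade is easy to express schematically; the PyTorch sketch below uses a toy backbone to show the data flow, with the predicted boundary dilated (here via max-pooling, an assumed choice) and concatenated with the image as input to the second network.

# A schematic sketch of the boundary-then-region cascade (assumed
# architecture, not the authors' release).
import torch
import torch.nn as nn
import torch.nn.functional as F

def tiny_net(in_ch, out_ch):                       # placeholder backbone
    return nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, out_ch, 3, padding=1))

net1 = tiny_net(1, 1)                              # image -> boundary logits
net2 = tiny_net(2, 1)                              # image + boundary -> region

image = torch.randn(1, 1, 128, 128)
boundary = torch.sigmoid(net1(image))
# Dilation via max-pooling thickens the predicted boundary, giving net2 a
# tolerant band of local edge evidence rather than a one-pixel curve.
dilated = F.max_pool2d(boundary, kernel_size=5, stride=1, padding=2)
region_logits = net2(torch.cat([image, dilated], dim=1))
print(region_logits.shape)                         # torch.Size([1, 1, 128, 128])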
Affiliation(s)
- Wei Guo
- Huazhong University of Science and Technology, China; Shenyang Aerospace University, China
- Qiang Li
- Huazhong University of Science and Technology, China
31
Guo S. Fundus image segmentation via hierarchical feature learning. Comput Biol Med 2021; 138:104928. [PMID: 34662814 DOI: 10.1016/j.compbiomed.2021.104928] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2021] [Revised: 10/06/2021] [Accepted: 10/06/2021] [Indexed: 01/28/2023]
Abstract
Fundus Image Segmentation (FIS) is an essential procedure for the automated diagnosis of ophthalmic diseases. Deep fully convolutional networks have recently been widely used for FIS with state-of-the-art performance, the representative model being the U-Net with its encoder-decoder architecture. I believe this design is suboptimal for FIS because consecutive pooling operations in the encoder produce low-resolution representations and lose detailed spatial information, which is particularly important for segmenting tiny vessels and lesions. Motivated by this, a high-resolution hierarchical network (HHNet) is proposed to learn semantic-rich high-resolution representations while preserving spatial detail. Specifically, a High-resolution Feature Learning (HFL) module with increasing dilation rates is first designed to learn high-level, high-resolution representations. The HHNet is then constructed from three HFL modules and two feature aggregation modules; it runs in a coarse-to-fine manner and outputs fine segmentation maps at the last level. Extensive experiments on fundus lesion segmentation, vessel segmentation, and optic cup segmentation reveal that the proposed method is highly competitive or even superior in terms of segmentation accuracy and computational cost, indicating its potential advantages in clinical application.
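The core idea of the HFL module, growing the receptive field with increasing dilation rates instead of pooling so that resolution never drops, can be sketched as follows; the block structure, channel width, and dilation schedule are illustrative assumptions rather than the paper's exact configuration.

# A minimal sketch of a high-resolution feature-learning block: stacked
# dilated convolutions enlarge the receptive field without downsampling.
import torch
import torch.nn as nn

class HFLBlock(nn.Module):
    def __init__(self, ch, dilations=(1, 2, 4)):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(ch, ch, 3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
            for d in dilations])

    def forward(self, x):
        for layer in self.layers:
            x = x + layer(x)        # residual update, resolution never drops
        return x

x = torch.randn(1, 32, 256, 256)
print(HFLBlock(32)(x).shape)        # same 256x256 resolution throughout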
Affiliation(s)
- Song Guo
- School of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an, 710055, China
32
Xia H, Lan Y, Song S, Li H. A multi-scale segmentation-to-classification network for tiny microaneurysm detection in fundus images. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107140] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
33
Zhang J, Xie Y, Pang G, Liao Z, Verjans J, Li W, Sun Z, He J, Li Y, Shen C, Xia Y. Viral Pneumonia Screening on Chest X-Rays Using Confidence-Aware Anomaly Detection. IEEE Trans Med Imaging 2021; 40:879-890. [PMID: 33245693 PMCID: PMC8544953 DOI: 10.1109/tmi.2020.3040950] [Citation(s) in RCA: 113] [Impact Index Per Article: 28.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/05/2020] [Revised: 11/10/2020] [Accepted: 11/22/2020] [Indexed: 05/24/2023]
Abstract
Clusters of viral pneumonia occurrences over a short period may be a harbinger of an outbreak or pandemic. Rapid and accurate detection of viral pneumonia on chest X-rays can be of significant value for large-scale screening and epidemic prevention, particularly when more sophisticated imaging modalities are not readily accessible. However, the emergence of novel mutated viruses causes a substantial dataset shift that can greatly limit the performance of classification-based approaches. In this paper, we formulate the task of differentiating viral pneumonia from non-viral pneumonia and healthy controls as a one-class classification-based anomaly detection problem, and propose the confidence-aware anomaly detection (CAAD) model, which consists of a shared feature extractor, an anomaly detection module, and a confidence prediction module. If the anomaly score produced by the anomaly detection module is large enough, or the confidence score estimated by the confidence prediction module is small enough, the input is accepted as an anomaly case (i.e., viral pneumonia). The major advantage of this approach over binary classification is that we avoid modeling individual viral pneumonia classes explicitly and treat all known viral pneumonia cases as anomalies, improving the one-class model. The proposed model outperforms binary classification models on the clinical X-VIRAL dataset, which contains 5,977 viral pneumonia (non-COVID-19) cases and 37,393 non-viral pneumonia or healthy cases. Moreover, when tested directly on the X-COVID dataset, which contains 106 COVID-19 cases and 107 normal controls, without any fine-tuning, our model achieves an AUC of 83.61% and a sensitivity of 71.70%, comparable to the performance of radiologists reported in the literature.
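The acceptance rule stated in the abstract (flag a case if the anomaly score is high or the confidence score is low) translates directly into code; in the sketch below, the two thresholds are placeholders that would be tuned on validation data, not values from the paper.

# The CAAD decision rule as a standalone sketch.
import numpy as np

def caad_decision(anomaly_score, confidence_score,
                  t_anomaly=0.7, t_confidence=0.3):
    flagged_anomalous = anomaly_score >= t_anomaly    # anomaly module fires
    flagged_uncertain = confidence_score <= t_confidence  # low confidence
    return flagged_anomalous | flagged_uncertain      # anomaly if either fires

scores_a = np.array([0.9, 0.2, 0.5])   # anomaly-module outputs
scores_c = np.array([0.8, 0.2, 0.9])   # confidence-module outputs
print(caad_decision(scores_a, scores_c))  # [ True  True False]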
Affiliation(s)
- Jianpeng Zhang
- National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
- Yutong Xie
- National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
- Guansong Pang
- School of Computer Science, The University of Adelaide, SA 5005, Australia
- Zhibin Liao
- School of Computer Science, The University of Adelaide, SA 5005, Australia
- Johan Verjans
- School of Computer Science, The University of Adelaide, SA 5005, Australia
- Zongji Sun
- National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
- Jian He
- Department of Radiology, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing 210029, China
- Yi Li
- GreyBird Ventures, LLC, Concord, MA 01742, USA
- Chunhua Shen
- School of Computer Science, The University of Adelaide, SA 5005, Australia
- Yong Xia
- National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China; Research and Development Institute, Northwestern Polytechnical University in Shenzhen, Shenzhen 518057, China
34
Zhang J, Xie Y, Wang Y, Xia Y. Inter-Slice Context Residual Learning for 3D Medical Image Segmentation. IEEE Trans Med Imaging 2021; 40:661-672. [PMID: 33125324 DOI: 10.1109/tmi.2020.3034995] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Automated and accurate 3D medical image segmentation plays an essential role in helping medical professionals evaluate disease progression and plan treatment quickly. Although deep convolutional neural networks (DCNNs) have been widely applied to this task, their accuracy still needs to be improved, mainly because of their limited ability to perceive 3D context. In this paper, we propose the 3D context residual network (ConResNet) for accurate segmentation of 3D medical images. The model consists of an encoder, a segmentation decoder, and a context residual decoder, with a context residual module bridging the two decoders at each scale. Each context residual module contains both context residual mapping and context attention mapping; the former explicitly learns inter-slice context information, and the latter uses that context as a form of attention to boost segmentation accuracy. We evaluated the model on the MICCAI 2018 Brain Tumor Segmentation (BraTS) dataset and the NIH Pancreas Segmentation (Pancreas-CT) dataset. Our results not only demonstrate the effectiveness of the proposed 3D context residual learning scheme but also indicate that ConResNet is more accurate than six top-ranking methods in brain tumor segmentation and seven top-ranking methods in pancreas segmentation.
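One way to read the context residual module is sketched below: adjacent-slice feature differences form an explicit inter-slice context signal, which is also turned into an attention gate on the segmentation features. The module layout and the sigmoid gating are assumptions based on the abstract, not the authors' released code.

# A rough sketch of inter-slice context residual learning.
import torch
import torch.nn as nn

class ContextResidual(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.res_conv = nn.Conv3d(ch, ch, 3, padding=1)

    def forward(self, seg_feat):
        """seg_feat: (B, C, D, H, W) features from the segmentation decoder."""
        # Context residual mapping: adjacent-slice feature differences.
        residual = seg_feat[:, :, 1:] - seg_feat[:, :, :-1]
        residual = nn.functional.pad(residual, (0, 0, 0, 0, 0, 1))  # restore D
        context = self.res_conv(residual)
        # Context attention mapping: use the context to gate the features.
        attn = torch.sigmoid(context)
        return seg_feat * (1 + attn), context   # boosted features + context

m = ContextResidual(ch=16)
feats, ctx = m(torch.randn(1, 16, 8, 64, 64))
print(feats.shape, ctx.shape)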