1
Ji Z, Ye Y, Ma X. BDFormer: Boundary-aware dual-decoder transformer for skin lesion segmentation. Artif Intell Med 2025;162:103079. [PMID: 39983372] [DOI: 10.1016/j.artmed.2025.103079]
Abstract
Segmenting skin lesions from dermatoscopic images is crucial for improving the quantitative analysis of skin cancer. However, automatic segmentation of skin lesions remains a challenging task due to the presence of unclear boundaries, artifacts, and obstacles such as hair and veins, all of which complicate the segmentation process. Transformers have demonstrated superior capabilities in capturing long-range dependencies through self-attention mechanisms and are gradually replacing CNNs in this domain. However, one of their primary limitations is the inability to effectively capture local details, which is crucial for handling unclear boundaries and significantly affects segmentation accuracy. To address this issue, we propose a novel boundary-aware dual-decoder transformer that employs a single encoder and dual-decoder framework for both skin lesion segmentation and dilated boundary segmentation. Within this model, we introduce a shifted window cross-attention block to build the dual-decoder structure and apply multi-task distillation to enable efficient interaction of inter-task information. Additionally, we propose a multi-scale aggregation strategy to refine the extracted features, ensuring optimal predictions. To further enhance boundary details, we incorporate a dilated boundary loss function, which expands the single-pixel boundary mask into planar information. We also introduce a task-wise consistency loss to promote consistency across tasks. Our method is evaluated on three datasets: ISIC2018, ISIC2017, and PH2, yielding promising results with excellent performance compared to state-of-the-art models. The code is available at https://github.com/Yuxuan-Ye/BDFormer.
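The dilated boundary loss described above expands a single-pixel boundary mask into planar information. A minimal pure-Python sketch of that mask construction (illustrative only, not the authors' released code; function and variable names are ours):

```python
def boundary(mask):
    """Extract the one-pixel boundary of a binary mask: foreground
    pixels with at least one 4-connected background (or out-of-image)
    neighbour."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if mask[i][j]:
                nbrs = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
                if any(not (0 <= a < h and 0 <= b < w) or not mask[a][b]
                       for a, b in nbrs):
                    out[i][j] = 1
    return out

def dilate(mask, r=1):
    """Binary dilation with a (2r+1)x(2r+1) square structuring element,
    turning the thin boundary into a planar band for supervision."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if any(mask[a][b]
                   for a in range(max(0, i - r), min(h, i + r + 1))
                   for b in range(max(0, j - r), min(w, j + r + 1))):
                out[i][j] = 1
    return out

# 5x5 lesion mask: a filled 3x3 square in the centre
lesion = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
band = dilate(boundary(lesion), r=1)  # dilated boundary supervision mask
```

The dilated band gives the boundary loss a region to penalize rather than a single pixel ring, which is the intuition behind the paper's "planar information" phrasing.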
Affiliation(s)
- Zexuan Ji
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
- Yuxuan Ye
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
- Xiao Ma
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China

2
Fan JC, Luan H, Qiao Y, Li Y, Ren Y. Detection and segmentation of pulmonary embolism in 3D CT pulmonary angiography using a threshold adjustment segmentation network. Sci Rep 2025;15:7263. [PMID: 40025153] [PMCID: PMC11873198] [DOI: 10.1038/s41598-025-91807-1]
Abstract
Pulmonary embolism is a life-threatening condition where early diagnosis and precise localization are crucial for improving patient outcomes. While CT pulmonary angiography (CTPA) is the primary method for detecting pulmonary embolism, existing segmentation algorithms struggle to effectively distinguish thrombi from vascular structures in complex 3D CTPA images, often leading to both false positives and false negatives. To address these challenges, the Threshold Adjustment Segmentation Network (TSNet) is proposed to enhance segmentation performance in 3D CTPA images. TSNet incorporates two core modules: the Threshold Adjustment Module (TAD) and the Geometric-Topological Axial Feature Module (GT-AFM). TAD utilizes logarithmic scaling, adaptive adjustments, and nonlinear transformations to optimize the probability distributions of thrombi and vessels, reducing false positives while improving the sensitivity of thrombus detection. GT-AFM integrates geometric features and topological information to enhance the recognition of complex vascular and thrombotic structures, improving spatial feature processing. Experimental results show that TSNet achieves a sensitivity of 0.761 and a false positives per scan of 1.273 at ε = 0 mm. With an increased tolerance of ε = 5 mm, sensitivity improves to 0.878 and false positives per scan decreases to 0.515, significantly reducing false positives. These results indicate that TSNet demonstrates superior segmentation performance under various tolerance levels, showing robustness and a well-balanced trade-off between sensitivity and false positives, making it highly promising for clinical applications.
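The sensitivity and false-positives-per-scan figures above depend on the localization tolerance ε. A hedged sketch of one common way such detection metrics are computed — a greedy centroid-matching rule that may differ from the paper's exact evaluation protocol:

```python
import math

def evaluate(preds, truths, eps):
    """Match predicted thrombus centroids to ground truth: a prediction
    within eps (mm) of a still-unmatched truth counts as a true
    positive, otherwise as a false positive.
    Returns (sensitivity, false_positive_count) for one scan."""
    matched = set()
    fps = 0
    for p in preds:
        hit = None
        for k, t in enumerate(truths):
            if k not in matched and math.dist(p, t) <= eps:
                hit = k
                break
        if hit is None:
            fps += 1
        else:
            matched.add(hit)
    sens = len(matched) / len(truths) if truths else 1.0
    return sens, fps

# one toy scan: two true emboli, two detections (one slightly off)
sens, fps = evaluate([(1, 0, 0), (20, 0, 0)], [(0, 0, 0), (10, 0, 0)], eps=5)
```

Loosening ε (as in the abstract's ε = 0 mm vs ε = 5 mm results) converts near-miss detections into matches, which is why sensitivity rises and false positives per scan fall together.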
Affiliation(s)
- Jian-Cong Fan
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong Province, PR China
- Provincial Key Laboratory for Information Technology of Wisdom Mining of Shandong Province, Shandong University of Science and Technology, Qingdao, China
- Haoyang Luan
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong Province, PR China
- Provincial Key Laboratory for Information Technology of Wisdom Mining of Shandong Province, Shandong University of Science and Technology, Qingdao, China
- Yaqian Qiao
- Department of Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Yang Li
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong Province, PR China
- Provincial Key Laboratory for Information Technology of Wisdom Mining of Shandong Province, Shandong University of Science and Technology, Qingdao, China
- Yande Ren
- Department of Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China

3
Zhou J, Wang S, Wang H, Li Y, Li X. Multi-Modality Fusion and Tumor Sub-Component Relationship Ensemble Network for Brain Tumor Segmentation. Bioengineering (Basel) 2025;12:159. [PMID: 40001679] [PMCID: PMC11851405] [DOI: 10.3390/bioengineering12020159]
Abstract
Deep learning technology has been widely used in brain tumor segmentation with multi-modality magnetic resonance imaging, helping doctors achieve faster and more accurate diagnoses. Previous studies have demonstrated that the weighted fusion segmentation method effectively extracts modality importance, laying a solid foundation for multi-modality magnetic resonance imaging segmentation. However, the challenge of fusing multi-modality features with single-modality features remains unresolved, which motivated us to explore an effective fusion solution. We propose a multi-modality and single-modality feature recalibration network for magnetic resonance imaging brain tumor segmentation. Specifically, we designed a dual recalibration module that achieves accurate feature calibration by integrating the complementary features of multi-modality with the specific features of a single modality. Experimental results on the BraTS 2018 dataset showed that the proposed method outperformed existing multi-modal network methods across multiple evaluation metrics, with spatial recalibration significantly improving the results, including Dice score increases of 1.7%, 0.5%, and 1.6% for the enhanced tumor core, whole tumor, and tumor core regions, respectively.
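The Dice-score improvements quoted above refer to the standard Dice similarity coefficient between predicted and ground-truth masks; a minimal sketch:

```python
def dice(pred, truth):
    """Dice similarity coefficient between two binary masks given as
    flattened 0/1 lists: 2*|A∩B| / (|A| + |B|).
    Returns 1.0 for two empty masks by convention."""
    inter = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2 * inter / total if total else 1.0

# example: masks overlap on 2 of 3 foreground pixels each -> Dice = 2/3
score = dice([1, 1, 1, 0], [0, 1, 1, 1])
```

A "1.7% Dice increase" in such tables is an absolute change in this coefficient (e.g. 0.850 to 0.867), computed per tumor sub-region.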
Affiliation(s)
- Jinyan Zhou
- Basic Medical College, Heilongjiang University of Chinese Medicine, Harbin 150040, China
- Shuwen Wang
- Basic Medical College, Heilongjiang University of Chinese Medicine, Harbin 150040, China
- Hao Wang
- Department of Control Science and Engineering, Harbin Institute of Technology, Harbin 150001, China
- Yaxue Li
- Basic Medical College, Heilongjiang University of Chinese Medicine, Harbin 150040, China
- Xiang Li
- Department of Control Science and Engineering, Harbin Institute of Technology, Harbin 150001, China
- College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
- Hebei Key Laboratory of Micro-Nano Precision Optical Sensing and Measurement Technology, Qinhuangdao 066004, China

4
Zhou Y, Li L, Wang C, Song L, Yang G. GobletNet: Wavelet-Based High-Frequency Fusion Network for Semantic Segmentation of Electron Microscopy Images. IEEE Trans Med Imaging 2025;44:1058-1069. [PMID: 39365717] [DOI: 10.1109/tmi.2024.3474028]
Abstract
Semantic segmentation of electron microscopy (EM) images is crucial for nanoscale analysis. With the development of deep neural networks (DNNs), semantic segmentation of EM images has achieved remarkable success. However, current EM image segmentation models are usually extensions or adaptations of natural or biomedical models. They lack the full exploration and utilization of the intrinsic characteristics of EM images. Furthermore, they are often designed only for several specific segmentation objects and lack versatility. In this study, we quantitatively analyze the characteristics of EM images compared with those of natural and other biomedical images via the wavelet transform. To better utilize these characteristics, we design a high-frequency (HF) fusion network, GobletNet, which outperforms state-of-the-art models by a large margin in the semantic segmentation of EM images. We use the wavelet transform to generate HF images as extra inputs and use an extra encoding branch to extract HF information. Furthermore, we introduce a fusion-attention module (FAM) into GobletNet to facilitate better absorption and fusion of information from raw images and HF images. Extensive benchmarking on seven public EM datasets (EPFL, CREMI, SNEMI3D, UroCell, MitoEM, Nanowire and BetaSeg) demonstrates the effectiveness of our model. The code is available at https://github.com/Yanfeng-Zhou/GobletNet.
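GobletNet's extra input channel is a high-frequency (HF) image produced by the wavelet transform. A toy single-level Haar example in pure Python (for illustration only; the paper's pipeline and wavelet choice are more elaborate):

```python
def haar_hf_1d(signal):
    """One level of the 1-D Haar wavelet transform, keeping only the
    high-frequency (detail) coefficients: d_i = (x_{2i} - x_{2i+1}) / 2."""
    return [(signal[2 * i] - signal[2 * i + 1]) / 2
            for i in range(len(signal) // 2)]

def hf_image(img):
    """Row-wise HF map of a 2-D image (list of rows): flat regions give
    zeros, sharp intensity changes give large magnitudes, so the HF
    channel behaves like an edge/texture image."""
    return [haar_hf_1d(row) for row in img]

# each row is flat then jumps sharply: HF response appears at the jump
img = [[10, 10, 10, 200],
       [10, 10, 10, 200]]
hf = hf_image(img)
```

Feeding `hf` through a separate encoding branch, as the abstract describes, lets the network attend to exactly the fine membrane-like structures that dominate EM images.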
5
Zhang S, Shen X, Chen X, Yu Z, Ren B, Yang H, Zhang XY, Zhou Y. CQformer: Learning Dynamics Across Slices in Medical Image Segmentation. IEEE Trans Med Imaging 2025;44:1043-1057. [PMID: 39388328] [DOI: 10.1109/tmi.2024.3477555]
Abstract
Prevalent studies on deep learning-based 3D medical image segmentation capture the continuous variation across 2D slices mainly via convolution, Transformer, inter-slice interaction, and time series models. In this work, via modeling this variation by an ordinary differential equation (ODE), we propose a cross instance query-guided Transformer architecture (CQformer) that leverages features from preceding slices to improve the segmentation performance of subsequent slices. Its key components include a cross-attention mechanism in an ODE formulation, which bridges the features of contiguous 2D slices of the 3D volumetric data. In addition, a regression head is employed to shorten the gap between the bottleneck and the prediction layer. Extensive experiments on 7 datasets with various modalities (CT, MRI) and tasks (organ, tissue, and lesion) demonstrate that CQformer outperforms previous state-of-the-art segmentation algorithms on 6 datasets by 0.44%-2.45%, and achieves the second highest performance of 88.30% on the BTCV dataset. The code is available at https://github.com/qbmizsj/CQformer.
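Modeling cross-slice variation as an ODE amounts to updating each slice's features with a discrete integration step. A toy explicit-Euler sketch of that view (CQformer's actual update is a learned cross-attention, not this hand-written dynamics):

```python
def propagate(h0, f, steps, dt=1.0):
    """Explicit-Euler integration of dh/ds = f(h): the feature vector
    for slice k+1 is h_k + dt * f(h_k), mirroring the idea of features
    evolving continuously across contiguous 2-D slices of a volume."""
    h = list(h0)
    trajectory = [list(h)]
    for _ in range(steps):
        h = [hi + dt * fi for hi, fi in zip(h, f(h))]
        trajectory.append(list(h))
    return trajectory

# toy dynamics: exponential decay dh/ds = -0.5 * h
traj = propagate([2.0], lambda h: [-0.5 * hi for hi in h], steps=2)
```

In the paper, `f` is effectively the cross-attention between a slice's features and those of its predecessor, so information flows forward through the volume the way state flows through an ODE solver.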
6
Xin J, Yu Y, Shen Q, Zhang S, Su N, Wang Z. BCT-Net: semantic-guided breast cancer segmentation on BUS. Med Biol Eng Comput 2025. [PMID: 39883373] [DOI: 10.1007/s11517-025-03304-2]
Abstract
Accurately and swiftly segmenting breast tumors is critical for cancer diagnosis and treatment. Ultrasound imaging is one of the most widely employed methods in clinical practice, but challenges such as low contrast, blurred boundaries, and prevalent shadows in ultrasound images make tumor segmentation a daunting task. In this study, we propose BCT-Net, a network amalgamating CNN and transformer components for breast tumor segmentation. BCT-Net integrates a dual-level attention mechanism to capture more features and redefines the skip connection module. We introduce a classification task as an auxiliary task to impart additional semantic information to the segmentation network, employing supervised contrastive learning. A hybrid objective loss function is proposed, which combines pixel-wise cross-entropy, binary cross-entropy, and supervised contrastive learning losses. Experiments conducted on the BUSI dataset of breast ultrasound images demonstrate that BCT-Net achieves high accuracy, with precision (Pre) and Dice similarity coefficient (DSC) scores of 86.12% and 88.70%, respectively.
Affiliation(s)
- Junchang Xin
- School of Computer Science and Engineering, Northeastern University, Shenyang, 110169, China
- Yaqi Yu
- School of Computer Science and Engineering, Northeastern University, Shenyang, 110169, China
- Qi Shen
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, 110169, China
- Shudi Zhang
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, 110169, China
- Na Su
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, 110169, China
- Zhiqiong Wang
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, 110169, China

7
Lopez-Ramirez F, Soleimani S, Azadi JR, Sheth S, Kawamoto S, Javed AA, Tixier F, Hruban RH, Fishman EK, Chu LC. Radiomics machine learning algorithm facilitates detection of small pancreatic neuroendocrine tumors on CT. Diagn Interv Imaging 2025;106:28-40. [PMID: 39278763] [DOI: 10.1016/j.diii.2024.08.003]
Abstract
PURPOSE The purpose of this study was to develop a radiomics-based algorithm to identify small pancreatic neuroendocrine tumors (PanNETs) on CT and evaluate its robustness across manual and automated segmentations, exploring the feasibility of automated screening. MATERIALS AND METHODS Patients with pathologically confirmed T1 stage PanNETs and healthy controls undergoing dual-phase CT imaging were retrospectively identified. Manual segmentation of pancreas and tumors was performed, then automated pancreatic segmentations were generated using a pretrained neural network. A total of 1223 radiomics features were independently extracted from both segmentation volumes, in the arterial and venous phases separately. Ten final features were selected to train classifiers to identify PanNETs and controls. The cohort was divided into training and testing sets, and performance of classifiers was assessed using area under the receiver operating characteristic curve (AUC), specificity and sensitivity, and compared against two radiologists blinded to the diagnoses. RESULTS A total of 135 patients with 142 PanNETs, and 135 healthy controls were included. There were 168 women and 102 men, with a mean age of 55.4 ± 11.6 (standard deviation) years (range: 20-85 years). Median PanNET size was 1.3 cm (Q1, 1.0; Q3, 1.5; range: 0.5-1.9). The arterial phase LightGBM model achieved the best performance in the test set, with 90% sensitivity (95% confidence interval [CI]: 80-98), 76% specificity (95% CI: 62-88) and an AUC of 0.87 (95% CI: 0.79-0.94). Using features from the automated segmentations, this model achieved an AUC of 0.86 (95% CI: 0.79-0.93). In comparison, the two radiologists achieved a mean 50% sensitivity and 100% specificity using arterial phase CT images. CONCLUSION Radiomics features identify small PanNETs, with stable performance when extracted using automated segmentations. These models demonstrate high sensitivity, complementing the high specificity of radiologists, and could serve as opportunistic screeners.
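The AUC values above can be read as the Mann-Whitney statistic: the probability that a randomly chosen PanNET case scores higher than a randomly chosen control. A minimal sketch of that empirical computation (illustrative, not the study's code):

```python
def auc(pos_scores, neg_scores):
    """Empirical AUC via the Mann-Whitney U statistic: the fraction of
    positive/negative score pairs where the positive ranks higher,
    counting ties as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# toy classifier scores: cases mostly score above controls
case_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3])  # 5 of 6 pairs ranked correctly
```

This pairwise view also explains why an AUC of 0.87 can coexist with a single operating point of 90% sensitivity / 76% specificity: the operating point is one threshold on the same score distribution.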
Affiliation(s)
- Felipe Lopez-Ramirez
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Sahar Soleimani
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Javad R Azadi
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Sheila Sheth
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA; Department of Radiology, New York University Grossman School of Medicine, New York, NY 10016, USA
- Satomi Kawamoto
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Ammar A Javed
- Department of Surgery, New York University Grossman School of Medicine, New York, NY 10016, USA
- Florent Tixier
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Ralph H Hruban
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA; Sol Goldman Pancreatic Cancer Research Center, Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Elliot K Fishman
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Linda C Chu
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA

8
Zhuang Y, Liu H, Fang W, Ma G, Sun S, Zhu Y, Zhang X, Ge C, Chen W, Long J, Song E. A 3D hierarchical cross-modality interaction network using transformers and convolutions for brain glioma segmentation in MR images. Med Phys 2024;51:8371-8389. [PMID: 39137295] [DOI: 10.1002/mp.17354]
Abstract
BACKGROUND Precise glioma segmentation from multi-parametric magnetic resonance (MR) images is essential for brain glioma diagnosis. However, due to the indistinct boundaries between tumor sub-regions and the heterogeneous appearance of gliomas in volumetric MR scans, designing a reliable and automated glioma segmentation method is still challenging. Although existing 3D Transformer-based or convolution-based segmentation networks have obtained promising results via multi-modal feature fusion strategies or contextual learning methods, they widely lack the capability of hierarchical interaction between different modalities and cannot effectively learn comprehensive feature representations related to all glioma sub-regions. PURPOSE To overcome these problems, in this paper, we propose a 3D hierarchical cross-modality interaction network (HCMINet) using Transformers and convolutions for accurate multi-modal glioma segmentation, which leverages an effective hierarchical cross-modality interaction strategy to sufficiently learn modality-specific and modality-shared knowledge correlated to glioma sub-region segmentation from multi-parametric MR images. METHODS In the HCMINet, we first design a hierarchical cross-modality interaction Transformer (HCMITrans) encoder to hierarchically encode and fuse heterogeneous multi-modal features via Transformer-based intra-modal embeddings and inter-modal interactions in multiple encoding stages, which effectively captures complex cross-modality correlations while modeling global contexts. Then, we combine the HCMITrans encoder with a modality-shared convolutional encoder to construct a dual-encoder architecture in the encoding stage, which can learn abundant contextual information from global and local perspectives. Finally, in the decoding stage, we present a progressive hybrid context fusion (PHCF) decoder to progressively fuse the local and global features extracted by the dual-encoder architecture, using a local-global context fusion (LGCF) module to efficiently alleviate the contextual discrepancy among the decoding features. RESULTS Extensive experiments were conducted on two public and competitive glioma benchmark datasets: the BraTS2020 dataset with 494 patients and the BraTS2021 dataset with 1251 patients. Results show that our proposed method outperforms existing Transformer-based and CNN-based methods using other multi-modal fusion strategies. Specifically, the proposed HCMINet achieves state-of-the-art mean DSC values of 85.33% and 91.09% on the BraTS2020 online validation dataset and the BraTS2021 local testing dataset, respectively. CONCLUSIONS Our proposed method can accurately and automatically segment glioma regions from multi-parametric MR images, which is beneficial for the quantitative analysis of brain gliomas and helpful for reducing the annotation burden of neuroradiologists.
Affiliation(s)
- Yuzhou Zhuang
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Hong Liu
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Wei Fang
- Wuhan Zhongke Industrial Research Institute of Medical Science Co., Ltd, Wuhan, China
- Guangzhi Ma
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Sisi Sun
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Yunfeng Zhu
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Xu Zhang
- Wuhan United Imaging Healthcare Surgical Technology Co., Ltd, Wuhan, China
- Chuanbin Ge
- Wuhan United Imaging Healthcare Surgical Technology Co., Ltd, Wuhan, China
- Wenyang Chen
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Jiaosong Long
- School of Art and Design, Hubei University of Technology, Wuhan, China
- Enmin Song
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China

9
Li H, Hussin N, He D, Geng Z, Li S. Design of image segmentation model based on residual connection and feature fusion. PLoS One 2024;19:e0309434. [PMID: 39361568] [PMCID: PMC11449362] [DOI: 10.1371/journal.pone.0309434]
Abstract
With the development of deep learning technology, convolutional neural networks have made great progress in the field of image segmentation. However, for complex scenes and multi-scale target images, existing techniques are still unable to achieve effective image segmentation. In view of this, an image segmentation model based on residual connections and feature fusion is proposed. The model makes comprehensive use of the deep feature extraction ability of residual connections and the multi-scale feature integration ability of feature fusion. To address the problems of background complexity and information loss in traditional image segmentation, experiments were carried out on two publicly available datasets. On the ISPRS Vaihingen dataset and the Caltech UCSD Birds200 dataset, when the model completed the 56th and 84th iterations, respectively, the average accuracy of FRes-MFDNN was the highest, at 97.89% and 98.24%, respectively. At runtimes of 0.20 s and 0.26 s on these datasets, the F1 value of the FRes-MFDNN method was the largest, approaching 100%. FRes-MFDNN segmented four images in the ISPRS Vaihingen dataset, with segmentation accuracies of 91.44%, 92.12%, 94.02% and 91.41% for images 1-4, respectively. In practical applications, the MSRF-Net, LBN-AA-SPN, ARG-Otsu, and FRes-MFDNN methods were used to segment unlabeled bird images. The results showed that FRes-MFDNN preserved details more completely, and its overall effect was significantly better than that of the other three models. Meanwhile, in ordinary scene images, despite a certain degree of noise and occlusion, the model still accurately recognized and segmented the main bird subjects. Overall, compared with traditional models, FRes-MFDNN segmentation significantly improves the completeness, detail, and spatial continuity of pixels, making it more suitable for complex scenes.
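The residual connections underpinning FRes-MFDNN add a learned correction to an identity path, which eases gradient flow in deep feature extractors. In skeletal form (a generic sketch, not the paper's architecture):

```python
def residual_block(x, layer):
    """A residual connection: the block computes layer(x) as a learned
    correction and adds it to the identity path, so the block defaults
    to the identity when the learned transform is (near) zero."""
    return [xi + yi for xi, yi in zip(x, layer(x))]

# if the learned transform outputs zeros, the block passes x through
identity_out = residual_block([1.0, 2.0], lambda v: [0.0 for _ in v])

# a non-trivial transform nudges the features instead of replacing them
nudged_out = residual_block([1.0, 2.0], lambda v: [0.5 * t for t in v])
```

Stacking such blocks is what allows very deep extraction without the vanishing-gradient degradation that plain stacked layers suffer, which is the property the abstract's "deep feature extraction ability of residual connections" refers to.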
Affiliation(s)
- Hong Li
- School of Information Engineering, Pingdingshan University, Pingdingshan, China
- Faculty of Engineering, Built Environment and Information Technology, SEGi University, Kota Damansara, Malaysia
- Norriza Hussin
- Faculty of Engineering, Built Environment and Information Technology, SEGi University, Kota Damansara, Malaysia
- Dandan He
- School of Information Engineering, Pingdingshan University, Pingdingshan, China
- Faculty of Engineering, Built Environment and Information Technology, SEGi University, Kota Damansara, Malaysia
- Zexun Geng
- School of Information Engineering, Pingdingshan University, Pingdingshan, China
- Shengpu Li
- School of Information Engineering, Pingdingshan University, Pingdingshan, China

10
Song H, Mao X, Yu J, Li Q, Wang Y. I³Net: Inter-Intra-Slice Interpolation Network for Medical Slice Synthesis. IEEE Trans Med Imaging 2024;43:3306-3318. [PMID: 38669167] [DOI: 10.1109/tmi.2024.3394033]
Abstract
Medical imaging is limited by acquisition time and scanning equipment. CT and MR volumes, reconstructed with thicker slices, are anisotropic with high in-plane resolution and low through-plane resolution. We reveal an intriguing phenomenon: due to this nature of the data, performing slice-wise interpolation from the axial view can yield greater benefits than performing super-resolution from other views. Based on this observation, we propose an Inter-Intra-slice Interpolation Network (I³Net), which fully explores information from the high in-plane resolution and compensates for the low through-plane resolution. The through-plane branch supplements the limited information contained in the low through-plane resolution from the high in-plane resolution and enables continual and diverse feature learning. The in-plane branch transforms features to the frequency domain and enforces an equal learning opportunity for all frequency bands in a global context learning paradigm. We further propose a cross-view block to take advantage of the information from all three views online. Extensive experiments on two public datasets demonstrate the effectiveness of I³Net, which noticeably outperforms state-of-the-art super-resolution, video frame interpolation and slice interpolation methods by a large margin. We achieve 43.90 dB in PSNR, with at least 1.14 dB improvement under the upscale factor of ×2 on the MSD dataset, with faster inference. Code is available at https://github.com/DeepMed-Lab-ECNU/Medical-Image-Reconstruction.
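For contrast with the learned I³Net, the naive through-plane baseline is plain linear interpolation between adjacent slices; a minimal sketch of that baseline (not the paper's method):

```python
def interpolate_slices(vol, factor=2):
    """Linear slice-wise upsampling along the through-plane axis:
    insert (factor - 1) evenly spaced slices between each adjacent
    pair.  vol is a list of 2-D slices (lists of rows of numbers)."""
    out = []
    for a, b in zip(vol, vol[1:]):
        out.append(a)
        for k in range(1, factor):
            w = k / factor
            out.append([[(1 - w) * x + w * y for x, y in zip(ra, rb)]
                        for ra, rb in zip(a, b)])
    out.append(vol[-1])
    return out

vol = [[[0.0, 0.0]], [[2.0, 4.0]]]      # two 1x2 slices
up = interpolate_slices(vol, factor=2)  # three slices after ×2 upsampling
```

Linear blending cannot synthesize new anatomy between slices, which is exactly the gap the frequency-domain in-plane branch and cross-view block of I³Net are designed to close.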
11
Huang X, Huang J, Zhao K, Zhang T, Li Z, Yue C, Chen W, Wang R, Chen X, Zhang Q, Fu Y, Wang Y, Guo Y. SASAN: Spectrum-Axial Spatial Approach Networks for Medical Image Segmentation. IEEE Trans Med Imaging 2024;43:3044-3056. [PMID: 38557622] [DOI: 10.1109/tmi.2024.3383466]
Abstract
Ophthalmic diseases such as central serous chorioretinopathy (CSC) significantly impair the vision of millions of people globally. Precise segmentation of choroid and macular edema is critical for diagnosing and treating these conditions. However, existing 3D medical image segmentation methods often fall short due to the heterogeneous nature and blurry features of these conditions, compounded by medical image clarity issues and noise interference arising from equipment and environmental limitations. To address these challenges, we propose the Spectrum Analysis Synergy Axial-Spatial Network (SASAN), an approach that innovatively integrates spectrum features using the Fast Fourier Transform (FFT). SASAN incorporates two key modules: the Frequency Integrated Neural Enhancer (FINE), which mitigates noise interference, and the Axial-Spatial Elementum Multiplier (ASEM), which enhances feature extraction. Additionally, we introduce the Self-Adaptive Multi-Aspect Loss (L_SM), which balances image regions, distribution, and boundaries, adaptively updating weights during training. We compiled and meticulously annotated the Choroid and Macular Edema OCT Mega Dataset (CMED-18k), currently the world's largest dataset of its kind. Comparative analysis against 13 baselines shows our method surpasses these benchmarks, achieving the highest Dice scores and lowest HD95 in the CMED and OIMHS datasets. Our code is publicly available at https://github.com/IMOP-lab/SASAN-Pytorch.
12
Xiao Z, Zhang Y, Deng Z, Liu F. Light3DHS: A lightweight 3D hippocampus segmentation method using multiscale convolution attention and vision transformer. Neuroimage 2024;292:120608. [PMID: 38626817] [DOI: 10.1016/j.neuroimage.2024.120608]
Abstract
The morphological analysis and volume measurement of the hippocampus are crucial to the study of many brain diseases. Therefore, an accurate hippocampal segmentation method is beneficial for the development of clinical research in brain diseases. U-Net and its variants have become prevalent in hippocampus segmentation of Magnetic Resonance Imaging (MRI) due to their effectiveness, and the architecture based on Transformer has also received some attention. However, some existing methods focus too much on the shape and volume of the hippocampus rather than its spatial information, and the extracted information is independent of each other, ignoring the correlation between local and global features. In addition, many methods cannot be effectively applied to practical medical image segmentation due to many parameters and high computational complexity. To this end, we combined the advantages of CNNs and ViTs (Vision Transformer) and proposed a simple and lightweight model: Light3DHS for the segmentation of the 3D hippocampus. In order to obtain richer local contextual features, the encoder first utilizes a multi-scale convolutional attention module (MCA) to learn the spatial information of the hippocampus. Considering the importance of local features and global semantics for 3D segmentation, we used a lightweight ViT to learn high-level features of scale invariance and further fuse local-to-global representation. To evaluate the effectiveness of encoder feature representation, we designed three decoders of different complexity to generate segmentation maps. Experiments on three common hippocampal datasets demonstrate that the network achieves more accurate hippocampus segmentation with fewer parameters. Light3DHS performs better than other state-of-the-art algorithms.
Affiliation(s)
- Zhiyong Xiao
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214122, China; Institut Fresnel, Centre National de la Recherche Scientifique, Marseille, 13397, France
- Yuhong Zhang
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214122, China
- Zhaohong Deng
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214122, China
- Fei Liu
- Wuxi Hospital of Traditional Chinese Medicine, Wuxi, 214071, China.
13
Zaman A, Hassan H, Zeng X, Khan R, Lu J, Yang H, Miao X, Cao A, Yang Y, Huang B, Guo Y, Kang Y. Adaptive Feature Medical Segmentation Network: an adaptable deep learning paradigm for high-performance 3D brain lesion segmentation in medical imaging. Front Neurosci 2024; 18:1363930. [PMID: 38680446 PMCID: PMC11047127 DOI: 10.3389/fnins.2024.1363930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2023] [Accepted: 03/04/2024] [Indexed: 05/01/2024] Open
Abstract
Introduction In neurological diagnostics, accurate detection and segmentation of brain lesions are crucial. Identifying these lesions is challenging because of their complex morphology, especially with traditional methods, which are either computationally demanding for only a marginal gain or sacrifice fine detail for computational efficiency. Balancing performance and precision in compute-intensive medical imaging therefore remains an active research topic. Methods We introduce a novel encoder-decoder network architecture named the Adaptive Feature Medical Segmentation Network (AFMS-Net) with two encoder variants: the Single Adaptive Encoder Block (SAEB) and the Dual Adaptive Encoder Block (DAEB). A squeeze-and-excite mechanism is employed in the SAEB to emphasize significant features while disregarding peripheral details; this variant is best suited to scenarios requiring quick, efficient segmentation focused on key lesion areas. In contrast, the DAEB utilizes an advanced channel-spatial attention strategy for fine-grained delineation and multi-class classification. Additionally, both architectures incorporate a Segmentation Path (SegPath) module between the encoder and decoder, refining segmentation, enhancing feature extraction, and improving model performance and stability. Results AFMS-Net demonstrates exceptional performance across several notable datasets, including BraTS 2021, ATLAS 2021, and ISLES 2022. Its design constructs a lightweight architecture capable of handling complex segmentation challenges with high precision. Discussion The proposed AFMS-Net addresses the critical balance between performance and computational efficiency in brain lesion segmentation. By introducing two tailored encoder variants, the network adapts to varying requirements for speed and feature detail. This approach not only advances the state of the art in lesion segmentation but also provides a scalable framework for future research in medical image processing.
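The squeeze-and-excite mechanism that AFMS-Net's SAEB employs can be illustrated with a minimal numpy sketch of the standard SE operation (global pooling, bottleneck MLP, sigmoid gating); the shapes, weights, and reduction ratio below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def squeeze_excite(feats, w1, w2):
    """Standard squeeze-and-excite channel reweighting.

    feats: (C, H, W) feature map; w1: (C//r, C); w2: (C, C//r).
    """
    # Squeeze: global average pooling per channel -> (C,)
    z = feats.mean(axis=(1, 2))
    # Excite: bottleneck MLP with ReLU, then sigmoid gate in (0, 1)
    s = np.maximum(w1 @ z, 0.0)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ s)))
    # Scale: reweight each channel by its learned importance
    return feats * gate[:, None, None]

# Toy example: 4 channels, reduction ratio r = 2, random weights
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8, 8))
w1 = rng.standard_normal((2, 4))
w2 = rng.standard_normal((4, 2))
out = squeeze_excite(feats, w1, w2)
print(out.shape)  # (4, 8, 8)
```

Because the gate lies in (0, 1), each output channel is a damped copy of its input, so "significant" channels (gate near 1) survive while peripheral ones are suppressed.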
Affiliation(s)
- Asim Zaman
- School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China
- School of Applied Technology, Shenzhen University, Shenzhen, China
- Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Medical School, Shenzhen University, Shenzhen, China
- Haseeb Hassan
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China
- Xueqiang Zeng
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China
- School of Applied Technology, Shenzhen University, Shenzhen, China
- Rashid Khan
- School of Applied Technology, Shenzhen University, Shenzhen, China
- Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Medical School, Shenzhen University, Shenzhen, China
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China
- Jiaxi Lu
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China
- School of Applied Technology, Shenzhen University, Shenzhen, China
- Huihui Yang
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China
- School of Applied Technology, Shenzhen University, Shenzhen, China
- Xiaoqiang Miao
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Anbo Cao
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China
- School of Applied Technology, Shenzhen University, Shenzhen, China
- Yingjian Yang
- Shenzhen Lanmage Medical Technology Co., Ltd, Shenzhen, China
- Bingding Huang
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China
- Yingwei Guo
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China
- School of Electrical and Information Engineering, Northeast Petroleum University, Daqing, China
- Yan Kang
- School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China
- School of Applied Technology, Shenzhen University, Shenzhen, China
- Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Medical School, Shenzhen University, Shenzhen, China
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
14
Yang C, Zhang H, Chi D, Li Y, Xiao Q, Bai Y, Li Z, Li H, Li H. Contour attention network for cerebrovascular segmentation from TOF-MRA volumetric images. Med Phys 2024; 51:2020-2031. [PMID: 37672343 DOI: 10.1002/mp.16720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2022] [Revised: 06/25/2023] [Accepted: 07/20/2023] [Indexed: 09/08/2023] Open
Abstract
BACKGROUND Cerebrovascular segmentation is a crucial step in the computer-assisted diagnosis of cerebrovascular pathologies. However, accurate extraction of cerebral vessels from time-of-flight magnetic resonance angiography (TOF-MRA) data remains challenging due to their complex topology and slender shape. PURPOSE Existing deep learning-based approaches pay more attention to the vessel skeleton and ignore the contour, which limits segmentation performance on cerebrovascular structures. We aim to weight the contour of brain vessels in shallow features when concatenating them with deep features; this yields more accurate cerebrovascular details and narrows the semantic gap between multilevel features. METHODS This work proposes a novel framework for priority extraction of contours in cerebrovascular structures. We first design a neighborhood-based algorithm to generate the ground truth of the cerebrovascular contour from the original annotations, which introduces useful shape information for the segmentation network. Moreover, we propose an encoder-dual decoder contour attention network (CA-Net), which consists of the dilated asymmetric convolution block (DACB) and the contour attention module (CAM). The ancillary decoder uses the DACB to obtain cerebrovascular contour features under the supervision of the contour annotations. The CAM transforms these features into a spatial attention map to increase the weight of contour voxels in the main decoder and better restore vessel contour details. RESULTS CA-Net is thoroughly validated on two publicly available datasets, and the experimental results demonstrate that our network outperforms the competitors for cerebrovascular segmentation, achieving average Dice similarity coefficients (DSC) of 68.15% and 99.92% on the natural and synthetic datasets, respectively. Our method segments cerebrovascular structures with better completeness. CONCLUSIONS We propose a new framework comprising contour annotation generation and a cerebrovascular segmentation network that better captures tiny vessels and improves vessel connectivity.
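The neighborhood-based contour ground truth that supervises CA-Net's ancillary decoder can be approximated with a common morphological recipe: label as contour every foreground voxel that has at least one background neighbor, i.e., the mask minus its erosion. A sketch under that assumption (the paper's exact neighborhood rule may differ):

```python
import numpy as np
from scipy.ndimage import binary_erosion

def contour_ground_truth(mask):
    """Derive a contour label from a binary vessel mask: voxels that
    belong to the mask but have at least one background neighbor."""
    interior = binary_erosion(mask, structure=np.ones((3, 3), dtype=bool))
    return mask & ~interior

# Toy annotation: a filled 4x4 square inside an 8x8 slice
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
contour = contour_ground_truth(mask)
print(contour.sum())  # 12: the 16-pixel square minus its 2x2 interior
```

The same recipe extends to 3D by using a 3x3x3 structuring element, which is how such shape supervision is typically generated for volumetric vessel masks.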
Affiliation(s)
- Chaozhi Yang
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
- Dianwei Chi
- School of Artificial Intelligence, Yantai Institute of Technology, Yantai, China
- Yachuan Li
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
- Qian Xiao
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
- Yun Bai
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
- Zongmin Li
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
- Shengli College of China University of Petroleum, Dongying, China
- Hongyi Li
- Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Science, Beijing, China
- Hua Li
- Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
15
Li P, Li Z, Wang Z, Li C, Wang M. mResU-Net: multi-scale residual U-Net-based brain tumor segmentation from multimodal MRI. Med Biol Eng Comput 2024; 62:641-651. [PMID: 37981627 DOI: 10.1007/s11517-023-02965-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2023] [Accepted: 11/01/2023] [Indexed: 11/21/2023]
Abstract
Brain tumor segmentation is an important direction in medical image processing; its main goal is to accurately delineate the tumor region in brain MRI. This study proposes a new end-to-end model for brain tumor segmentation: a multi-scale deep residual convolutional neural network called mResU-Net. The semantic gap between the encoder and decoder is bridged by the skip connections of the U-Net structure, while the residual structure alleviates the vanishing gradient problem during training and ensures sufficient information flow in deep networks. On this basis, multi-scale convolution kernels improve segmentation accuracy for targets of different sizes, and channel attention modules are integrated into the network to further improve its accuracy. The proposed model achieves average Dice scores of 0.9289, 0.9277, and 0.8965 for tumor core (TC), whole tumor (WT), and enhancing tumor (ET) on the BraTS 2021 dataset, respectively. Comparing these segmentation results with existing techniques shows that mResU-Net can significantly improve segmentation performance on brain tumor subregions.
Affiliation(s)
- Pengcheng Li
- School of Mechanical and Power Engineering, Harbin University of Science and Technology, Harbin, Heilongjiang, 150000, China.
- Zhihao Li
- School of Mechanical and Power Engineering, Harbin University of Science and Technology, Harbin, Heilongjiang, 150000, China
- Zijian Wang
- School of Mechanical and Power Engineering, Harbin University of Science and Technology, Harbin, Heilongjiang, 150000, China
- Chaoxiang Li
- School of Mechanical and Power Engineering, Harbin University of Science and Technology, Harbin, Heilongjiang, 150000, China
- Monan Wang
- School of Mechanical and Power Engineering, Harbin University of Science and Technology, Harbin, Heilongjiang, 150000, China
16
Xu W, Yang H, Shi Y, Tan T, Liu W, Pan X, Deng Y, Gao F, Su R. ERNet: Edge Regularization Network for Cerebral Vessel Segmentation in Digital Subtraction Angiography Images. IEEE J Biomed Health Inform 2024; 28:1472-1483. [PMID: 38090824 DOI: 10.1109/jbhi.2023.3342195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/07/2024]
Abstract
Stroke is a leading cause of disability and fatality in the world, with ischemic stroke being the most common type. Digital Subtraction Angiography images, the gold standard in the operation process, can accurately show the contours and blood flow of cerebral vessels. The segmentation of cerebral vessels in DSA images can effectively help physicians assess the lesions. However, due to the disturbances in imaging parameters and changes in imaging scale, accurate cerebral vessel segmentation in DSA images is still a challenging task. In this paper, we propose a novel Edge Regularization Network (ERNet) to segment cerebral vessels in DSA images. Specifically, ERNet employs the erosion and dilation processes on the original binary vessel annotation to generate pseudo-ground truths of False Negative and False Positive, which serve as constraints to refine the coarse predictions based on their mapping relationship with the original vessels. In addition, we exploit a Hybrid Fusion Module based on convolution and transformers to extract local features and build long-range dependencies. Moreover, to support and advance the open research in the field of ischemic stroke, we introduce FPDSA, the first pixel-level semantic segmentation dataset for cerebral vessels. Extensive experiments on FPDSA illustrate the leading performance of our ERNet.
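ERNet's erosion/dilation pseudo-ground truths can be sketched along these lines: one plausible reading of the abstract is that the band stripped by erosion flags FN-prone vessel voxels, while the band added by dilation flags FP-prone background. A scipy sketch under that assumption (structuring elements and iteration counts are illustrative):

```python
import numpy as np
from scipy.ndimage import binary_erosion, binary_dilation

def pseudo_ground_truths(vessel, iterations=1):
    """Erosion/dilation pseudo-labels around a binary vessel annotation.

    fn_band: vessel voxels stripped by erosion (easily missed -> FN-prone).
    fp_band: background voxels added by dilation (easily over-segmented
    -> FP-prone).
    """
    eroded = binary_erosion(vessel, iterations=iterations)
    dilated = binary_dilation(vessel, iterations=iterations)
    fn_band = vessel & ~eroded
    fp_band = dilated & ~vessel
    return fn_band, fp_band

vessel = np.zeros((9, 9), dtype=bool)
vessel[4, 1:8] = True        # a 1-voxel-thick horizontal vessel
fn_band, fp_band = pseudo_ground_truths(vessel)
# A 1-voxel-thick vessel erodes away entirely, so every vessel voxel
# is FN-prone; the dilation halo marks the FP-prone surround.
print(fn_band.sum(), fp_band.sum())  # 7 16
```

The two bands are disjoint by construction, which is what makes them usable as separate refinement targets for a coarse prediction.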
17
Xie Y, Zhang J, Xia Y, Shen C. Learning From Partially Labeled Data for Multi-Organ and Tumor Segmentation. IEEE Trans Pattern Anal Mach Intell 2023; 45:14905-14919. [PMID: 37672381 DOI: 10.1109/tpami.2023.3312587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/08/2023]
Abstract
Medical image benchmarks for organ and tumor segmentation suffer from the partial-labeling issue owing to the intensive cost of labor and expertise. Current mainstream approaches follow the practice of one network solving one task; with this pipeline, not only is performance limited by the typically small dataset of a single task, but the computation cost also increases linearly with the number of tasks. To address this, we propose a Transformer-based dynamic on-demand network (TransDoDNet) that learns to segment organs and tumors on multiple partially labeled datasets. Specifically, TransDoDNet has a hybrid backbone composed of a convolutional neural network and a Transformer. A dynamic head enables the network to accomplish multiple segmentation tasks flexibly: unlike existing approaches that fix kernels after training, the kernels in the dynamic head are generated adaptively by the Transformer, which employs self-attention to model long-range organ-wise dependencies and decodes an organ embedding that represents each organ. We create a large-scale partially labeled Multi-Organ and Tumor Segmentation benchmark, termed MOTS, and demonstrate the superior performance of our TransDoDNet over other competitors on seven organ and tumor segmentation tasks. This study also provides a general 3D medical image segmentation model, which has been pre-trained on the large-scale MOTS benchmark and demonstrates advanced performance over current predominant self-supervised learning methods.
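The dynamic-head idea — segmentation kernels generated from a task/organ embedding rather than fixed after training — can be reduced to a toy numpy example. Here a simple linear generator stands in for the paper's Transformer, and all shapes, names, and embeddings are hypothetical:

```python
import numpy as np

def dynamic_head(feats, task_embed, w_gen):
    """Dynamic 1x1-conv head: the kernel is generated on demand from a
    task/organ embedding instead of being fixed after training.

    feats: (C, H, W) shared features; task_embed: (D,); w_gen: (C, D).
    """
    kernel = w_gen @ task_embed                            # (C,) 1x1 kernel
    logits = np.tensordot(kernel, feats, axes=([0], [0]))  # (H, W)
    return 1.0 / (1.0 + np.exp(-logits))  # per-pixel foreground probability

rng = np.random.default_rng(1)
feats = rng.standard_normal((16, 4, 4))   # one backbone forward pass
w_gen = rng.standard_normal((16, 8))      # kernel generator (stand-in)
liver, kidney = rng.standard_normal(8), rng.standard_normal(8)
# Same features, different embeddings -> different segmentation maps,
# so one network serves multiple partially labeled tasks.
m1 = dynamic_head(feats, liver, w_gen)
m2 = dynamic_head(feats, kidney, w_gen)
print(m1.shape)  # (4, 4)
```

The point of the construction is that the backbone runs once, and the per-task cost is only the cheap generated head, avoiding the linear growth in computation that one-network-per-task pipelines incur.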
18
Chen Z, Zhuo W, Wang T, Cheng J, Xue W, Ni D. Semi-Supervised Representation Learning for Segmentation on Medical Volumes and Sequences. IEEE Trans Med Imaging 2023; 42:3972-3986. [PMID: 37756175 DOI: 10.1109/tmi.2023.3319973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/29/2023]
Abstract
Benefiting from massive labeled samples, deep learning-based segmentation methods have achieved great success on two-dimensional natural images. However, segmenting high-dimensional medical volumes and sequences remains challenging because of the considerable clinical expertise required to make large-scale annotations. Self- and semi-supervised learning methods have been shown to improve performance by exploiting unlabeled data, but they still lack local semantic discrimination and exploitation of volume/sequence structure. In this work, we propose a semi-supervised representation learning method with two novel modules that enhance the features in the encoder and decoder, respectively. For the encoder, based on the continuity between slices/frames and the common spatial layout of organs across subjects, we propose an asymmetric network with an attention-guided predictor that enables prediction between feature maps of different slices of unlabeled data. For the decoder, based on the semantic consistency between labeled and unlabeled data, we introduce a novel semantic contrastive learning scheme to regularize the feature maps in the decoder. The two parts are trained jointly on both labeled and unlabeled volumes/sequences in a semi-supervised manner. When evaluated on three benchmark datasets of medical volumes and sequences, our model outperforms existing methods by a large margin, with DSC gains of 7.3% on ACDC, 6.5% on Prostate, and 3.2% on CAMUS when only a few labeled samples are available. Further, results on the M&M dataset show that the proposed method yields improvement without any domain adaptation techniques for data from an unknown domain. Intensive evaluations reveal the effectiveness of the representation mining and the superior performance of our method. The code is available at https://github.com/CcchenzJ/BootstrapRepresentation.
19
Yang J, Jiao L, Shang R, Liu X, Li R, Xu L. EPT-Net: Edge Perception Transformer for 3D Medical Image Segmentation. IEEE Trans Med Imaging 2023; 42:3229-3243. [PMID: 37216246 DOI: 10.1109/tmi.2023.3278461] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/24/2023]
Abstract
The convolutional neural network has achieved remarkable results in most medical image segmentation applications, but the intrinsic locality of the convolution operation limits its ability to model long-range dependencies. Although the Transformer, designed for sequence-to-sequence global prediction, was conceived to solve this problem, it can suffer from limited localization capability due to insufficient low-level detail features. Moreover, low-level features carry rich fine-grained information that greatly influences edge segmentation decisions for different organs, yet a simple CNN module struggles to capture the edge information in fine-grained features, and the computation and memory consumed in processing high-resolution 3D features are costly. This paper proposes an encoder-decoder network, called EPT-Net, that effectively combines edge perception and the Transformer structure to segment medical images accurately. Under this framework, we propose a Dual Position Transformer to effectively enhance 3D spatial positioning ability. In addition, as low-level features contain detailed information, we design an Edge Weight Guidance module that extracts edge information by minimizing an edge information function without adding network parameters. Furthermore, we verified the effectiveness of the proposed method on three datasets: SegTHOR 2019, Multi-Atlas Labeling Beyond the Cranial Vault, and a re-labeled KiTS19 dataset that we call KiTS19-M. The experimental results show that EPT-Net improves significantly over state-of-the-art medical image segmentation methods.
20
Zhu W, Fang L, Ye X, Medani M, Escorcia-Gutierrez J. IDRM: Brain tumor image segmentation with boosted RIME optimization. Comput Biol Med 2023; 166:107551. [PMID: 37832284 DOI: 10.1016/j.compbiomed.2023.107551] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2023] [Revised: 09/13/2023] [Accepted: 09/28/2023] [Indexed: 10/15/2023]
Abstract
Timely diagnosis of medical conditions can significantly mitigate the risks they pose to human life; consequently, there is an urgent demand for effective auxiliary models that assist physicians in accurately diagnosing medical conditions from imaging data. While multi-threshold image segmentation models have garnered considerable attention for their simplicity and ease of implementation, the selection of threshold combinations greatly influences segmentation performance. Traditional optimization algorithms often require substantial time to solve multi-threshold image segmentation problems, and their segmentation accuracy is frequently unsatisfactory, so metaheuristic algorithms have been employed in this domain. However, several algorithms suffer from premature convergence and inadequate exploration of the solution space when selecting thresholds. For instance, the recently proposed RIME optimizer, inspired by the physical phenomenon of rime ice, falls short in avoiding local optima and fully exploring the solution space. Therefore, this study introduces an enhanced version of RIME, called IDRM, which incorporates an interactive mechanism and a Gaussian diffusion strategy. The interactive mechanism facilitates information exchange among agents, enabling them to evolve toward more promising directions and increasing the likelihood of discovering the optimal solution, while the Gaussian diffusion strategy enhances the agents' local exploration capability and expands their search within the solution space, effectively preventing them from becoming trapped in local optima. Experimental results on 30 benchmark test functions demonstrate that IDRM exhibits favorable optimization performance across various functions, showcasing its robustness and convergence properties. Furthermore, the algorithm is applied to select threshold combinations for brain tumor image segmentation, and the results are evaluated using metrics such as the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index Measure (SSIM). The overall findings consistently highlight the exceptional performance of this approach, further validating the effectiveness of IDRM for image segmentation problems.
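For context, the multi-threshold objective such metaheuristics typically optimize is Otsu's between-class variance; the exhaustive search below is the brute-force baseline that algorithms like IDRM aim to replace. The toy histogram and the choice of objective are illustrative assumptions, not taken from the paper:

```python
import numpy as np
from itertools import combinations

def between_class_variance(hist, thresholds):
    """Otsu-style objective: probability-weighted variance of the class
    means for the intensity classes induced by the thresholds."""
    p = hist / hist.sum()
    levels = np.arange(len(hist))
    mu_total = (p * levels).sum()
    edges = [0, *[t + 1 for t in sorted(thresholds)], len(hist)]
    var = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = p[lo:hi].sum()
        if w > 0:
            mu = (p[lo:hi] * levels[lo:hi]).sum() / w
            var += w * (mu - mu_total) ** 2
    return var

def exhaustive_multi_otsu(hist, n_thresholds=2):
    """Brute force over all threshold pairs -- exponential in the number
    of thresholds, which is why metaheuristics are used instead."""
    return max(combinations(range(1, len(hist) - 1), n_thresholds),
               key=lambda t: between_class_variance(hist, t))

# Toy 16-level image with three well-separated intensity modes
img = np.array([1] * 50 + [7] * 30 + [13] * 20)
hist = np.bincount(img, minlength=16)
print(exhaustive_multi_otsu(hist, 2))  # a pair separating the 3 modes
```

Any threshold pair that cleanly separates the three modes ties for the maximum, which also illustrates why the fitness landscape has plateaus that trap naive optimizers.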
Affiliation(s)
- Wei Zhu
- School of Resources and Safety Engineering, Central South University, Changsha, 410083, China.
- Liming Fang
- School of Humanities and Communication, Zhejiang Gongshang University, Hangzhou, 310000, China.
- Xia Ye
- School of the 1st Clinical Medical Sciences (School of Information and Engineering), Wenzhou Medical University, Wenzhou, 325000, China.
- Mohamed Medani
- Department of Computer Science, College of Science and Art at Mahayil, King Khalid University, Muhayil Aseer, 62529, Saudi Arabia.
- José Escorcia-Gutierrez
- Department of Computational Science and Electronics, Universidad de la Costa, CUC, Barranquilla, 080002, Colombia.
21
Ma M, Zhang X, Li Y, Wang X, Zhang R, Wang Y, Sun P, Wang X, Sun X. ConvLSTM coordinated longitudinal transformer under spatio-temporal features for tumor growth prediction. Comput Biol Med 2023; 164:107313. [PMID: 37562325 DOI: 10.1016/j.compbiomed.2023.107313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2023] [Revised: 07/17/2023] [Accepted: 08/07/2023] [Indexed: 08/12/2023]
Abstract
Accurate quantification of tumor growth patterns can reveal the development of the disease: from key features such as growth rate and expansion, physicians can intervene and diagnose patients more efficiently to improve the cure rate. However, existing longitudinal growth models cannot adequately analyze long-range spatiotemporal dependencies between tumor growth pixels and fail to effectively fit the nonlinear growth law of tumors. We therefore propose the ConvLSTM-coordinated longitudinal Transformer (LCTformer) under spatiotemporal features for tumor growth prediction. We design an Adaptive Edge Enhancement Module (AEEM) to learn the static spatial features of tumors of different sizes under time series and make the model focus on tumor edge regions. In addition, we propose a Growth Prediction Module (GPM) to characterize future growth trends; it consists of a Longitudinal Transformer and a ConvLSTM. Based on adaptive abstract features of current tumors, the Longitudinal Transformer explores dynamic growth patterns across spatiotemporal CT sequences and learns the future morphological features of tumors in parallel under the dual views of residual information and sequence motion relationships. The ConvLSTM better learns the location information of target tumors and complements the Longitudinal Transformer in jointly predicting future tumor imaging to reduce the loss of growth information. Finally, a Channel Enhancement Fusion Module (CEFM) densely fuses the generated tumor feature images across channel and spatial dimensions and enables accurate quantification of the whole tumor growth process. Our model has been rigorously trained and tested on the NLST dataset, achieving average prediction performance of 88.52% (Dice score), 89.64% (recall), and 11.06 (RMSE), which can improve the work efficiency of doctors.
Affiliation(s)
- Manfu Ma
- College of Computer Science & Engineering, Northwest Normal University, Lanzhou, 730070, China
- Xiaoming Zhang
- College of Computer Science & Engineering, Northwest Normal University, Lanzhou, 730070, China
- Yong Li
- College of Computer Science & Engineering, Northwest Normal University, Lanzhou, 730070, China.
- Xia Wang
- Department of Pharmacy, The People's Hospital of Gansu Province, Lanzhou, 730000, China
- Ruigen Zhang
- College of Computer Science & Engineering, Northwest Normal University, Lanzhou, 730070, China
- Yang Wang
- College of Computer Science & Engineering, Northwest Normal University, Lanzhou, 730070, China
- Penghui Sun
- College of Computer Science & Engineering, Northwest Normal University, Lanzhou, 730070, China
- Xuegang Wang
- College of Computer Science & Engineering, Northwest Normal University, Lanzhou, 730070, China
- Xuan Sun
- College of Computer Science & Engineering, Northwest Normal University, Lanzhou, 730070, China
22
Liu L, Chang J, Liu Z, Zhang P, Xu X, Shang H. Hybrid Contextual Semantic Network for Accurate Segmentation and Detection of Small-Size Stroke Lesions From MRI. IEEE J Biomed Health Inform 2023; 27:4062-4073. [PMID: 37155390 DOI: 10.1109/jbhi.2023.3273771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/10/2023]
Abstract
Stroke is a cerebrovascular disease with high mortality and disability rates. A stroke typically produces lesions of different sizes, and accurate segmentation and detection of small lesions are closely related to patient prognosis. However, while large lesions are usually identified correctly, small lesions are often missed. This article presents a hybrid contextual semantic network (HCSNet) that can accurately segment and detect small stroke lesions from magnetic resonance images simultaneously. HCSNet inherits the advantages of the encoder-decoder architecture and applies a novel hybrid contextual semantic module that generates high-quality contextual semantic features from spatial and channel contextual semantic features through the skip-connection layer. Moreover, a mixing-loss function is proposed to optimize HCSNet for unbalanced small lesions. HCSNet is trained and evaluated on 2D magnetic resonance images from the Anatomical Tracings of Lesions After Stroke challenge (ATLAS R2.0). Extensive experiments demonstrate that HCSNet outperforms several state-of-the-art methods in segmenting and detecting small stroke lesions, and visualization and ablation experiments reveal that the hybrid semantic module improves both the segmentation and detection performance of HCSNet.
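A mixing loss for unbalanced small lesions is commonly built by combining a region-overlap term with a voxel-wise term; the sketch below pairs soft Dice with positively weighted binary cross-entropy as one plausible instantiation (HCSNet's exact formulation is not specified in the abstract, and all weights here are illustrative):

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: overlap-based, largely insensitive to class imbalance."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def weighted_bce(pred, target, pos_weight=10.0, eps=1e-7):
    """Binary cross-entropy with extra weight on the rare lesion voxels."""
    pred = np.clip(pred, eps, 1.0 - eps)
    loss = -(pos_weight * target * np.log(pred)
             + (1.0 - target) * np.log(1.0 - pred))
    return loss.mean()

def mixing_loss(pred, target, alpha=0.5):
    """Convex combination of the region (Dice) and voxel (BCE) terms."""
    return alpha * dice_loss(pred, target) + (1 - alpha) * weighted_bce(pred, target)

# Tiny lesion: 2 positive voxels in a 10x10 slice
target = np.zeros((10, 10))
target[4, 4] = target[4, 5] = 1.0
good = np.where(target > 0, 0.9, 0.05)   # confident, correct prediction
bad = np.full((10, 10), 0.05)            # misses the lesion entirely
print(mixing_loss(good, target) < mixing_loss(bad, target))  # True
```

With plain unweighted BCE, predicting "all background" on a 2-voxel lesion is barely penalized; the Dice term and the positive weight are what make the small lesion dominate the loss.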
23
Xu B, Zhang X, Tian C, Yan W, Wang Y, Zhang D, Liao X, Cai X. Automatic segmentation of white matter hyperintensities and correlation analysis for cerebral small vessel disease. Front Neurol 2023; 14:1242685. [PMID: 37576013 PMCID: PMC10413581 DOI: 10.3389/fneur.2023.1242685] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2023] [Accepted: 07/06/2023] [Indexed: 08/15/2023] Open
Abstract
Objective: Cerebral white matter hyperintensity is a key imaging marker of cerebral small vessel disease, and brain MRI is used to assess the degree of pathological change in white matter regions. In this paper, we propose a framework for automatic 3D segmentation of brain white matter hyperintensity based on MRI images, addressing the low accuracy and segmentation inhomogeneity of existing 3D methods. We conducted correlation analyses with cognitive assessment parameters and multiple-comparison analyses to investigate differences in white matter hyperintensity volume among three cognitive states: dementia, MCI, and NCI. The study also explored the correlation between cognitive assessment scores and white matter hyperintensity volume. Methods: This paper proposes an automatic 3D segmentation framework for white matter hyperintensity using a deep multi-mapping encoder-decoder structure. The method introduces a 3D residual mapping structure for the encoder and decoder. A Multi-layer Cross-connected Residual Mapping Module (MCRCM) is proposed in the encoding stage to enhance the expressiveness of the model and its perception of detailed features. A Spatial Attention Weighted Enhanced Supervision Module (SAWESM) is proposed in the decoding stage to adjust the supervision strategy through a spatial attention weighting mechanism, guiding the decoder to perform feature reconstruction and detail recovery more effectively. Results: Experimental data were obtained from a private, independently collected brain white matter dataset. The proposed framework achieved higher segmentation accuracy than nnU-Net and nnU-Net-ResNet, and white matter hyperintensity volume correlated with both cognitive assessment parameters, MMSE and MoCA, with p < 0.001. This indicates that larger white matter hyperintensity volumes are associated with lower MMSE and MoCA scores, and thus with poorer cognitive function.
White matter hyperintensity volume decreased across the three cognitive states in the order dementia, MCI, NCI. Conclusion: The paper proposes an automatic 3D segmentation framework for brain white matter hyperintensity that achieves high-precision segmentation. The experimental results show that segmented lesion volume is negatively correlated with MMSE and MoCA scores. This correlation analysis provides promising prospects for the treatment of cerebral small vessel disease through 3D segmentation analysis of brain white matter. The volume differences among subjects in the three cognitive states can help to better understand the mechanism of cognitive decline in clinical research.
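The negative volume-score association reported above is a plain Pearson correlation. A minimal sketch with made-up illustrative values (not data from the study): larger hypothetical lesion volumes paired with lower hypothetical MMSE scores yield a strongly negative r.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical WMH volumes (mL) and MMSE scores: larger lesions, lower scores.
volumes = [2.1, 5.4, 9.8, 14.2, 22.7, 31.5]
mmse = [29, 28, 25, 22, 18, 14]
r = pearson_r(volumes, mmse)  # strongly negative
```

A real analysis would additionally report a p-value (e.g. via a t-test on r), as the abstract does.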
Affiliation(s)
- Bin Xu
- Department of Neurosurgery, Shenzhen Second People's Hospital, The First Affiliated Hospital of Shenzhen University, Shenzhen, Guangdong, China
- Shenzhen University School of Medicine, Shenzhen, Guangdong, China
- Xiaofeng Zhang
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Congyu Tian
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Wei Yan
- Brain Cognition and Brain Disease Institute, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Yuanqing Wang
- Department of Neurosurgery, Shenzhen Second People's Hospital, The First Affiliated Hospital of Shenzhen University, Shenzhen, Guangdong, China
- Doudou Zhang
- Department of Neurosurgery, Shenzhen Second People's Hospital, The First Affiliated Hospital of Shenzhen University, Shenzhen, Guangdong, China
- Shenzhen University School of Medicine, Shenzhen, Guangdong, China
- Xiangyun Liao
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Xiaodong Cai
- Department of Neurosurgery, Shenzhen Second People's Hospital, The First Affiliated Hospital of Shenzhen University, Shenzhen, Guangdong, China
- Shenzhen University School of Medicine, Shenzhen, Guangdong, China
|
24
|
Shen L, Zhang Y, Wang Q, Qin F, Sun D, Min H, Meng Q, Xu C, Zhao W, Song X. Feature interaction network based on hierarchical decoupled convolution for 3D medical image segmentation. PLoS One 2023; 18:e0288658. [PMID: 37440581 DOI: 10.1371/journal.pone.0288658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2023] [Accepted: 06/30/2023] [Indexed: 07/15/2023] Open
Abstract
Manual image segmentation is time-consuming. An automatic and accurate method is needed to segment multimodal brain tumors in context-rich three-dimensional medical images, as such segmentation supports clinical treatment decisions and surgical planning. However, accurate deep-learning-based segmentation of medical images is challenging due to the diversity of tumors and the complex boundary interactions between sub-regions, while limited computing resources hinder the construction of efficient neural networks. We propose a feature fusion module based on a hierarchical decoupled convolution network and an attention mechanism to improve segmentation performance. We replace the skip connections of U-shaped networks with this feature fusion module to mitigate the category imbalance problem, thus contributing to the segmentation of more complicated medical images. We also introduce a global attention mechanism to further integrate the features learned by the encoder and exploit context information. The proposed method was evaluated on enhancing tumor, whole tumor, and tumor core, achieving Dice similarity coefficients of 0.775, 0.900, and 0.827, respectively, on the BraTS 2019 dataset, and 0.800, 0.902, and 0.841, respectively, on the BraTS 2018 dataset. The results show that our proposed method is general and a powerful tool for brain tumor image studies. Our code is available at: https://github.com/WSake/Feature-interaction-network-based-on-Hierarchical-Decoupled-Convolution.
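The Dice similarity coefficient used to score the three tumor sub-regions is simply 2|A∩B| / (|A| + |B|) over predicted and ground-truth masks. A minimal sketch on flat binary masks (the masks below are toy examples, not BraTS data):

```python
def dice(pred, truth):
    """Dice similarity coefficient for two binary masks given as flat 0/1 lists."""
    inter = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2.0 * inter / total if total else 1.0  # both masks empty: perfect match

pred = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [0, 1, 1, 0, 0, 1, 1, 0]
score = dice(pred, truth)  # intersection 3, sizes 4 + 4 -> 6/8 = 0.75
```

In practice the same formula is applied per class (enhancing tumor, whole tumor, tumor core) over the voxels of a 3D volume.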
Affiliation(s)
- Longfeng Shen
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui, China
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- Anhui Big-Data Research Center on University Management, Huaibei, Anhui, China
- Yingjie Zhang
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui, China
- Qiong Wang
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui, China
- Fenglan Qin
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui, China
- Dengdi Sun
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- Anhui Provincial Key Laboratory of Multimodal Cognitive Computing, School of Artificial Intelligence, Anhui University, Hefei, China
- Hai Min
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, Anhui, China
- Qianqian Meng
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui, China
- Chengzhen Xu
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui, China
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- Wei Zhao
- Huaibei People's Hospital, Huaibei, Anhui, China
- Xin Song
- Huaibei People's Hospital, Huaibei, Anhui, China
|
25
|
Cui H, Wang Y, Li Y, Xu D, Jiang L, Xia Y, Zhang Y. An Improved Combination of Faster R-CNN and U-Net Network for Accurate Multi-Modality Whole Heart Segmentation. IEEE J Biomed Health Inform 2023; 27:3408-3419. [PMID: 37040240 DOI: 10.1109/jbhi.2023.3266228] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/12/2023]
Abstract
Detailed information about the substructures of the whole heart is usually vital in the diagnosis of cardiovascular diseases and in 3D modeling of the heart. Deep convolutional neural networks have achieved state-of-the-art performance in 3D cardiac structure segmentation. However, when dealing with high-resolution 3D data, current methods employing tiling strategies usually degrade segmentation performance due to GPU memory constraints. This work develops a two-stage multi-modality whole heart segmentation strategy that adopts an improved Combination of Faster R-CNN and 3D U-Net (CFUN+). More specifically, the bounding box of the heart is first detected by Faster R-CNN, and then the original Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) images of the heart, aligned with the bounding box, are input into 3D U-Net for segmentation. The proposed CFUN+ method redefines the bounding box loss function by replacing the previous Intersection over Union (IoU) loss with the Complete Intersection over Union (CIoU) loss. Meanwhile, the integration of an edge loss makes the segmentation results more accurate and improves the convergence speed. The proposed method achieves an average Dice score of 91.1% on the Multi-Modality Whole Heart Segmentation (MM-WHS) 2017 challenge CT dataset, which is 5.2% higher than the baseline CFUN model, establishing state-of-the-art segmentation results. In addition, the segmentation time for a single heart has been reduced dramatically from a few minutes to less than 6 seconds.
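The CIoU loss mentioned above augments plain IoU with a normalized center-distance term and an aspect-ratio consistency term. A minimal sketch of the standard CIoU formulation for axis-aligned boxes, not the authors' training code:

```python
import math

def ciou_loss(box, gt):
    """Complete-IoU loss for axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Intersection and IoU
    ix1, iy1 = max(box[0], gt[0]), max(box[1], gt[1])
    ix2, iy2 = min(box[2], gt[2]), min(box[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_b = (box[2] - box[0]) * (box[3] - box[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_b + area_g - inter)

    # Squared center distance over squared diagonal of the enclosing box
    cbx, cby = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    cgx, cgy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (cbx - cgx) ** 2 + (cby - cgy) ** 2
    ex1, ey1 = min(box[0], gt[0]), min(box[1], gt[1])
    ex2, ey2 = max(box[2], gt[2]), max(box[3], gt[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2

    # Aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (math.atan((gt[2] - gt[0]) / (gt[3] - gt[1]))
                              - math.atan((box[2] - box[0]) / (box[3] - box[1]))) ** 2
    alpha = v / (1 - iou + v + 1e-9)
    return 1 - (iou - rho2 / c2 - alpha * v)

perfect = ciou_loss((0, 0, 10, 10), (0, 0, 10, 10))   # identical boxes -> loss 0
disjoint = ciou_loss((0, 0, 2, 2), (5, 5, 7, 7))      # non-overlapping -> loss > 1
```

Unlike plain IoU loss, the gradient does not vanish for non-overlapping boxes, which is what speeds up convergence.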
|
26
|
Qureshi A, Lim S, Suh SY, Mutawak B, Chitnis PV, Demer JL, Wei Q. Deep-Learning-Based Segmentation of Extraocular Muscles from Magnetic Resonance Images. Bioengineering (Basel) 2023; 10:699. [PMID: 37370630 PMCID: PMC10295225 DOI: 10.3390/bioengineering10060699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2023] [Revised: 05/31/2023] [Accepted: 06/02/2023] [Indexed: 06/29/2023] Open
Abstract
In this study, we investigated the performance of four deep learning frameworks, U-Net, U-NeXt, DeepLabV3+, and ConResNet, for multi-class pixel-based segmentation of the extraocular muscles (EOMs) from coronal MRI. The four models were evaluated and compared with the standard overlap metrics of intersection over union (IoU) and Dice, where U-Net achieved the highest overall IoU and Dice scores of 0.77 and 0.85, respectively. The centroid distance offset between identified and ground-truth EOM centroids was also measured; U-Net and DeepLabV3+ achieved comparably low offsets (p > 0.05) of 0.33 mm and 0.35 mm, respectively. Our results also demonstrated that segmentation accuracy varies across spatially different image planes. This study systematically compared factors that affect the variability of segmentation and morphometric accuracy of deep learning models when applied to segmenting EOMs from MRI.
Affiliation(s)
- Amad Qureshi
- Department of Bioengineering, George Mason University, Fairfax, VA 22030, USA
- Seongjin Lim
- Department of Ophthalmology, Neurology and Bioengineering, Jules Stein Eye Institute, University of California, Los Angeles, CA 90095, USA
- Soh Youn Suh
- Department of Ophthalmology, Neurology and Bioengineering, Jules Stein Eye Institute, University of California, Los Angeles, CA 90095, USA
- Bassam Mutawak
- Department of Bioengineering, George Mason University, Fairfax, VA 22030, USA
- Parag V. Chitnis
- Department of Bioengineering, George Mason University, Fairfax, VA 22030, USA
- Joseph L. Demer
- Department of Ophthalmology, Neurology and Bioengineering, Jules Stein Eye Institute, University of California, Los Angeles, CA 90095, USA
- Qi Wei
- Department of Bioengineering, George Mason University, Fairfax, VA 22030, USA
|
27
|
Zhao Y, Wang S, Zhang Y, Qiao S, Zhang M. WRANet: wavelet integrated residual attention U-Net network for medical image segmentation. COMPLEX INTELL SYST 2023:1-13. [PMID: 37361970 PMCID: PMC10248349 DOI: 10.1007/s40747-023-01119-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2023] [Accepted: 05/16/2023] [Indexed: 06/28/2023]
Abstract
Medical image segmentation is crucial for the diagnosis and analysis of disease. Deep convolutional neural network methods have achieved great success in medical image segmentation. However, they are highly susceptible to noise interference during forward propagation, where even weak noise can dramatically alter the network output. As the network deepens, it can also face problems such as exploding and vanishing gradients. To improve the robustness and segmentation performance of the network, we propose a wavelet-integrated residual attention network (WRANet) for medical image segmentation. We replace the standard downsampling modules (e.g., max pooling and average pooling) in CNNs with the discrete wavelet transform, decompose the features into low- and high-frequency components, and discard the high-frequency components to suppress noise. At the same time, the problem of feature loss is effectively addressed by introducing an attention mechanism. The experimental results show that our method performs aneurysm segmentation effectively, achieving a Dice score of 78.99%, an IoU score of 68.96%, a precision of 85.21%, and a sensitivity of 80.98%. In polyp segmentation, a Dice score of 88.89%, an IoU score of 81.74%, a precision of 91.32%, and a sensitivity of 91.07% were achieved. Furthermore, comparison with state-of-the-art techniques demonstrates the competitiveness of WRANet.
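The idea of replacing pooling with a discrete wavelet transform and dropping the high-frequency sub-bands can be sketched with a single-level 2D Haar transform. This toy version keeps only the low-frequency (LL) sub-band and is not the authors' implementation; the /4 normalization keeps values in the image range (the orthonormal Haar LL coefficient would use /2).

```python
def haar_ll(image):
    """One level of a 2D Haar transform on a 2D list, keeping only the
    low-frequency (LL) sub-band: downsample by 2, discard high-frequency detail."""
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h - 1, 2):
        row = []
        for j in range(0, w - 1, 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            row.append((a + b + c + d) / 4.0)  # LL coefficient (scaled average)
        out.append(row)
    return out

img = [[1, 3, 5, 7],
       [1, 3, 5, 7],
       [2, 2, 2, 2],
       [2, 2, 2, 2]]
low = haar_ll(img)  # 2x2 low-frequency map: [[2.0, 6.0], [2.0, 2.0]]
```

The discarded LH/HL/HH sub-bands are exactly where high-frequency noise concentrates, which is the motivation given in the abstract.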
Affiliation(s)
- Yawu Zhao
- School of Computer Science and Technology, China University of Petroleum, Qingdao, Shandong, China
- Shudong Wang
- School of Computer Science and Technology, China University of Petroleum, Qingdao, Shandong, China
- Yulin Zhang
- College of Mathematics and System Science, Shandong University of Science and Technology, Qingdao, Shandong, China
- Sibo Qiao
- School of Computer Science and Technology, China University of Petroleum, Qingdao, Shandong, China
- Mufei Zhang
- Inspur Cloud Information Technology Co., Inspur, Jinan, Shandong, China
|
28
|
Li X, Jiang Y, Li M, Zhang J, Yin S, Luo H. MSFR-Net: Multi-modality and single-modality feature recalibration network for brain tumor segmentation. Med Phys 2023; 50:2249-2262. [PMID: 35962724 DOI: 10.1002/mp.15933] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2022] [Revised: 05/16/2022] [Accepted: 06/14/2022] [Indexed: 11/11/2022] Open
Abstract
BACKGROUND: Accurate and automated brain tumor segmentation from multi-modality MR images plays a significant role in tumor treatment. However, existing approaches mainly focus on multi-modality fusion while ignoring the correlation between individual modalities and tumor subcomponents. For example, T2-weighted images visualize edema well, and T1-contrast images show good contrast between the enhancing tumor core and necrosis. In actual clinical practice, physicians also label tumors according to these characteristics. We design a method for brain tumor segmentation that utilizes both multi-modality fusion and single-modality characteristics. METHODS: A multi-modality and single-modality feature recalibration network (MSFR-Net) is proposed for brain tumor segmentation from MR images. Specifically, multi-modality and single-modality information are assigned to independent pathways. The multi-modality network explicitly learns the relationship between all modalities and all tumor subcomponents, while each single-modality network learns the relationship between a single modality and its highly correlated tumor subcomponents. A dual recalibration module (DRM) is then designed to connect the parallel single-modality and multi-modality networks at multiple stages; its function is to unify the two types of features in the same feature space. RESULTS: Experiments on the BraTS 2015 and BraTS 2018 datasets show that the proposed method is competitive with and superior to other state-of-the-art methods. It achieved a Dice coefficient of 0.86 and a Hausdorff distance of 4.82 on the BraTS 2018 dataset, and a Dice coefficient of 0.80, a positive predictive value of 0.76, and a sensitivity of 0.78 on the BraTS 2015 dataset. CONCLUSIONS: This work mirrors the manual labeling process of physicians by introducing the correlation between single modalities and tumor subcomponents into the segmentation network.
The method improves brain tumor segmentation performance and can be applied in clinical practice. The code of the proposed method is available at: https://github.com/xiangQAQ/MSFR-Net.
Affiliation(s)
- Xiang Li
- Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China
- Yuchen Jiang
- Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China
- Minglei Li
- Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China
- Jiusi Zhang
- Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China
- Shen Yin
- Department of Mechanical and Industrial Engineering, Faculty of Engineering, Norwegian University of Science and Technology, Trondheim, Norway
- Hao Luo
- Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China
|
29
|
Lu X, Xu Y, Yuan W. PDRF-Net: a progressive dense residual fusion network for COVID-19 lung CT image segmentation. EVOLVING SYSTEMS 2023; 15:1-17. [PMID: 38625320 PMCID: PMC9936947 DOI: 10.1007/s12530-023-09489-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2022] [Accepted: 02/02/2023] [Indexed: 02/19/2023]
Abstract
The lungs of patients with COVID-19 exhibit distinctive lesion features in chest CT images. Fast and accurate segmentation of lesion sites from lung CT images is significant for the diagnosis and monitoring of COVID-19 patients. To this end, we propose a progressive dense residual fusion network named PDRF-Net for COVID-19 lung CT segmentation. Dense skip connections are introduced to capture multi-level contextual information and compensate for feature loss during network propagation. An efficient aggregated residual module is designed for the encoding-decoding structure, combining a vision transformer with residual blocks to enable the network to extract richer, fine-detail features from CT images. Furthermore, we introduce a bilateral channel pixel-weighted module to progressively fuse the feature maps obtained from multiple branches. The proposed PDRF-Net obtains good segmentation results on two COVID-19 datasets, outperforming the baseline by 11.6% and 11.1%, respectively, and surpassing other mainstream methods. Thus, PDRF-Net serves as an easy-to-train, high-performance deep learning model for effective segmentation of COVID-19 lung CT images.
Affiliation(s)
- Xiaoyan Lu
- College of Big Data and Information Engineering, Guizhou University, Guiyang, Guizhou, People's Republic of China
- Yang Xu
- College of Big Data and Information Engineering, Guizhou University, Guiyang, Guizhou, People's Republic of China
- Guiyang Aluminum Magnesium Design and Research Institute Co., Ltd, Guiyang, Guizhou, People's Republic of China
- Wenhao Yuan
- College of Big Data and Information Engineering, Guizhou University, Guiyang, Guizhou, People's Republic of China
|
30
|
Zhang B, Wang Y, Ding C, Deng Z, Li L, Qin Z, Ding Z, Bian L, Yang C. Multi-scale feature pyramid fusion network for medical image segmentation. Int J Comput Assist Radiol Surg 2023; 18:353-365. [PMID: 36042149 DOI: 10.1007/s11548-022-02738-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2022] [Accepted: 08/11/2022] [Indexed: 02/03/2023]
Abstract
PURPOSE: Medical image segmentation is among the most widely used techniques in diagnostic and clinical research, and accurate segmentation of target organs with blurred borders and low-contrast adjacent organs in computed tomography (CT) imaging is crucial for clinical diagnosis and treatment. METHODS: In this article, we propose a Multi-Scale Feature Pyramid Fusion Network (MS-Net) based on a codec structure that combines a Multi-Scale Attention Module (MSAM) and a Stacked Feature Pyramid Module (SFPM). The MSAM is used in the skip connections and aims to extract context details at different levels by dynamically adjusting the receptive fields at different network depths; the SFPM, comprising multi-scale strategies and a multi-layer Feature Perception Module (FPM), is nested at the deepest point of the network and aims to better focus the network's attention on the target organ by adaptively increasing the weight of the features of interest. RESULTS: Experiments demonstrate that the proposed MS-Net significantly improved the Dice score from 91.74% to 94.54% on CHAOS, from 97.59% to 98.59% on Lung, and from 82.55% to 86.06% on ISIC 2018, compared with U-Net. Comparisons with six other state-of-the-art codec structures also show that the presented network has clear advantages on evaluation indicators such as mIoU, Dice, ACC, and AUC. CONCLUSION: The experimental results show that both the MSAM and the SFPM help the network improve segmentation, so that the proposed MS-Net achieves better results on the CHAOS, Lung, and ISIC 2018 segmentation tasks.
Affiliation(s)
- Bing Zhang
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
- Yang Wang
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
- Caifu Ding
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
- Ziqing Deng
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
- Linwei Li
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
- Zesheng Qin
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
- Zhao Ding
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
- Lifeng Bian
- Frontier Institute of Chip and System, Fudan University, Shanghai, 200433, China
- Chen Yang
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
|
31
|
Tong G, Jiang H, Yao YD. SDA-UNet: a hepatic vein segmentation network based on the spatial distribution and density awareness of blood vessels. Phys Med Biol 2023; 68. [PMID: 36623320 DOI: 10.1088/1361-6560/acb199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2022] [Accepted: 01/09/2023] [Indexed: 01/11/2023]
Abstract
Objective. Hepatic vein segmentation is a fundamental task for liver diagnosis and surgical navigation planning. Unlike other organs, the liver is the only organ with two sets of venous systems, and the segmentation target distribution in the hepatic vein scene is extremely unbalanced: the hepatic veins occupy a small area in abdominal CT slices. The morphology of the hepatic veins also differs from person to person, which makes segmentation difficult. The purpose of this study is to develop an automated hepatic vein segmentation model that guides clinical diagnosis. Approach. We introduce 3D spatial distribution and density awareness (SDA) of hepatic veins and propose an automatic segmentation network based on 3D U-Net that includes a multi-axial squeeze and excitation module (MASE) and a distribution correction module (DCM). The MASE restricts activation to regions containing hepatic veins, while the DCM improves awareness of the sparse spatial distribution of the hepatic veins. To obtain global axial information and spatial information at the same time, we study the effect of different training strategies on hepatic vein segmentation. Our method was evaluated on a public dataset and a private dataset, achieving Dice coefficients of 71.37% and 69.58%, an improvement of 3.60% and 3.30%, respectively, over the other SOTA models. Furthermore, distance- and volume-based metrics also show the superiority of our method. Significance. The proposed method greatly reduces false-positive areas and improves hepatic vein segmentation performance in CT images. It will assist doctors in making accurate diagnoses and in surgical navigation planning.
Affiliation(s)
- Guoyu Tong
- Software College, Northeastern University, Shenyang 110819, People's Republic of China
- Huiyan Jiang
- Software College, Northeastern University, Shenyang 110819, People's Republic of China
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang 110819, People's Republic of China
- Yu-Dong Yao
- Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, United States of America
|
32
|
Wang H, Xiao N, Luo S, Li R, Zhao J, Ma Y, Zhao J, Qiang Y, Wang L, Lian J. Multi-scale dense selective network based on border modeling for lung nodule segmentation. Int J Comput Assist Radiol Surg 2023; 18:845-853. [PMID: 36637749 DOI: 10.1007/s11548-022-02817-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2022] [Accepted: 12/20/2022] [Indexed: 01/14/2023]
Abstract
PURPOSE: Accurate quantification of pulmonary nodules helps physicians accurately diagnose and treat lung cancer. We aim to improve segmentation efficiency for irregular nodules while maintaining segmentation accuracy for simpler nodule types. METHODS: In this paper, we extract the distinctive edge region of pulmonary nodules and process it as a separate branch stream, the border stream, to explicitly model nodule edge information. We propose a multi-scale dense selective network based on border modeling (BorDenNet). Its overall framework consists of a dual-branch encoder-decoder that processes the classical image stream and the border stream in parallel. We design a dense attention module to keep feature maps strongly coupled and focused on key regions of pulmonary nodules. During decoding, a multi-scale selective attention module is proposed to establish long-range correlations between features at different scales, which further achieves finer feature discrimination and spatial recovery. We introduce a border context enhancement module to mutually fuse and enhance the edge-related voxel features contained in the image stream and the border stream, finally achieving accurate segmentation of pulmonary nodules. RESULTS: We evaluate BorDenNet rigorously on the public lung dataset LIDC-IDRI. For the target nodules, the average Dice score is 92.78%, the average sensitivity is 91.37%, and the average Hausdorff distance is 3.06 mm. We further test on a private dataset from Shanxi Provincial People's Hospital, which verifies the strong generalization of BorDenNet. BorDenNet improves segmentation efficiency for multiple nodule types, such as adherent pulmonary nodules and ground-glass pulmonary nodules.
CONCLUSION: Accurate segmentation of irregular pulmonary nodules yields important clinical parameters that can guide clinicians and improve clinical efficiency.
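The Hausdorff distance reported above measures the worst-case boundary disagreement: the largest distance from any point of one contour to the nearest point of the other, symmetrized. A minimal sketch on toy 2D point sets (stand-ins for predicted vs ground-truth nodule boundaries, not LIDC-IDRI data):

```python
import math

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets given as (x, y) lists."""
    def directed(src, dst):
        # Farthest that any point of src sits from its nearest neighbor in dst
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))

pred_pts = [(0, 0), (1, 0), (2, 0)]
true_pts = [(0, 1), (1, 1), (2, 4)]
hd = hausdorff(pred_pts, true_pts)  # dominated by (2, 4), which is 4 from (2, 0)
```

Because it is a maximum, a single stray boundary voxel inflates the score, which is why it complements overlap metrics like Dice.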
Affiliation(s)
- Hexi Wang
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030000, Shanxi, China
- Ning Xiao
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030000, Shanxi, China
- Shichao Luo
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030000, Shanxi, China
- Runrui Li
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030000, Shanxi, China
- Jun Zhao
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030000, Shanxi, China
- Yulan Ma
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030000, Shanxi, China
- Juanjuan Zhao
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030000, Shanxi, China
- College of Information, Jinzhong College of Information, Jinzhong, 030600, Shanxi, China
- Yan Qiang
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030000, Shanxi, China
- Long Wang
- College of Information, Jinzhong College of Information, Jinzhong, 030600, Shanxi, China
- Jianhong Lian
- Cancer Hospital, Shanxi Cancer Hospital, Taiyuan, 030000, Shanxi, China
|
33
|
Extension-contraction transformation network for pancreas segmentation in abdominal CT scans. Comput Biol Med 2023; 152:106410. [PMID: 36516578 DOI: 10.1016/j.compbiomed.2022.106410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2022] [Revised: 11/08/2022] [Accepted: 12/03/2022] [Indexed: 12/12/2022]
Abstract
Accurate and automatic pancreas segmentation from abdominal computed tomography (CT) scans is crucial for the diagnosis and prognosis of pancreatic diseases. However, the pancreas accounts for a relatively small portion of the scan and presents high anatomical variability and low contrast, so traditional automated segmentation methods fail to generate satisfactory results. In this paper, we propose an extension-contraction transformation network (ECTN) and deploy it in a cascaded two-stage framework for accurate pancreas segmentation. The model enhances the perception of 3D context by distinguishing and exploiting the extension and contraction transformations of the pancreas between slices. It consists of an encoder, a segmentation decoder, and an extension-contraction (EC) decoder. The EC decoder predicts the inter-slice extension and contraction transformation of the pancreas from the extension and contraction information generated by the segmentation decoder; meanwhile, its output is combined with the output of the segmentation decoder to reconstruct and refine the segmentation results. Quantitative evaluation was performed on the NIH Pancreas Segmentation (Pancreas-CT) dataset using 4-fold cross-validation. We obtained an average Precision of 86.59±6.14%, Recall of 85.11±5.96%, Dice similarity coefficient (DSC) of 85.58±3.98%, and Jaccard Index (JI) of 74.99±5.86%. Our method outperforms several baseline and state-of-the-art methods.
|
34
|
Chang Y, Zheng Z, Sun Y, Zhao M, Lu Y, Zhang Y. DPAFNet: A Residual Dual-Path Attention-Fusion Convolutional Neural Network for Multimodal Brain Tumor Segmentation. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/15/2022]
|
35
|
Zhang L, Xu F, Li Y, Zhang H, Xi Z, Xiang J, Wang B. A lightweight convolutional neural network model with receptive field block for C-shaped root canal detection in mandibular second molars. Sci Rep 2022; 12:17373. [PMID: 36253430 PMCID: PMC9576767 DOI: 10.1038/s41598-022-20411-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2022] [Accepted: 09/13/2022] [Indexed: 01/10/2023] Open
Abstract
Rapid and accurate detection of a C-shaped root canal in mandibular second molars can assist dentists in diagnosis and treatment. Oral panoramic radiography is one of the most effective methods for examining the root canals of teeth. Some deep-learning-based methods already learn the characteristics of C-shaped root canal tooth images. However, previous studies have shown that the accuracy of detecting the C-shaped root canal still needs improvement, and existing network structures are unsuitable for deployment on limited hardware resources. In this paper, a new lightweight convolutional neural network is designed that incorporates a receptive field block (RFB) to optimize feature extraction. To reduce the model's hardware requirements, a lightweight, multi-branch convolutional neural network was developed in this study, and the RFB, which has achieved excellent results in target detection and classification, was merged into it to improve feature extraction from C-shaped root canal tooth images. In the multiscale receptive field block, several small convolution kernels replace large ones, which allows the model to extract detailed features while reducing computational complexity. Finally, the accuracy and area under the receiver operating characteristic curve (AUC) for C-shaped root canals on our mandibular second molar image data were 0.9838 and 0.996, respectively. The results show that the deep learning model proposed in this paper is more accurate and has lower computational complexity than many similar studies. In addition, score-weighted class activation maps (Score-CAM) were generated to localize the internal structures that contributed to the predictions.
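The claim that stacked small kernels can replace a large kernel follows from standard receptive-field arithmetic; the recurrence below is a general sketch, not the paper's exact RFB architecture:

```python
def receptive_field(layers):
    """Overall receptive field of a stack of convolutions.

    Each layer is a (kernel_size, stride) pair. Standard recurrence:
    r_out = r_in + (k - 1) * j_in, with jump j_out = j_in * s.
    """
    r, j = 1, 1
    for k, s in layers:
        r += (k - 1) * j
        j *= s
    return r

# Two stacked 3x3 convolutions (stride 1) cover the same 5x5 field as one
# 5x5 convolution, with fewer weights (2 * 9 = 18 vs 25 per channel pair).
assert receptive_field([(3, 1), (3, 1)]) == receptive_field([(5, 1)]) == 5
```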
Affiliation(s)
- Lijuan Zhang
- Department of Oral Medicine, Shanxi Provincial People’s Hospital, Taiyuan, China
- Feng Xu
- College of Information and Computer, Taiyuan University of Technology, No. 79, Yingze West Street, Taiyuan, 030024, Shanxi, China
- Ying Li
- College of Information and Computer, Taiyuan University of Technology, No. 79, Yingze West Street, Taiyuan, 030024, Shanxi, China
- Huimin Zhang
- College of Information and Computer, Taiyuan University of Technology, No. 79, Yingze West Street, Taiyuan, 030024, Shanxi, China
- Ziyi Xi
- College of Information and Computer, Taiyuan University of Technology, No. 79, Yingze West Street, Taiyuan, 030024, Shanxi, China
- Jie Xiang
- College of Information and Computer, Taiyuan University of Technology, No. 79, Yingze West Street, Taiyuan, 030024, Shanxi, China
- Bin Wang
- College of Information and Computer, Taiyuan University of Technology, No. 79, Yingze West Street, Taiyuan, 030024, Shanxi, China
|
36
|
Yang Q, Guo X, Chen Z, Woo PYM, Yuan Y. D2-Net: Dual Disentanglement Network for Brain Tumor Segmentation With Missing Modalities. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:2953-2964. [PMID: 35576425 DOI: 10.1109/tmi.2022.3175478] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Multi-modal Magnetic Resonance Imaging (MRI) can provide complementary information for automatic brain tumor segmentation, which is crucial for diagnosis and prognosis. However, missing modality data is common in clinical practice and can cause most previous methods, which rely on complete modality data, to fail. Current state-of-the-art approaches cope with missing modalities by fusing multi-modal images and features to learn shared representations of tumor regions, but they often fail to explicitly capture the correlations among modalities and tumor regions. Inspired by the fact that modality information plays distinct roles in segmenting different tumor regions, we aim to explicitly exploit the correlations between modality-specific information and tumor-specific knowledge for segmentation. To this end, we propose a Dual Disentanglement Network (D2-Net) for brain tumor segmentation with missing modalities, which consists of a modality disentanglement stage (MD-Stage) and a tumor-region disentanglement stage (TD-Stage). In the MD-Stage, a spatial-frequency joint modality contrastive learning scheme is designed to directly decouple modality-specific information from MRI data. To decompose tumor-specific representations and extract discriminative holistic features, we propose an affinity-guided dense tumor-region knowledge distillation mechanism in the TD-Stage that aligns the features of a disentangled binary teacher network with those of a holistic student network. By explicitly discovering relations among modalities and tumor regions, our model can learn sufficient information for segmentation even when some modalities are missing. Extensive experiments on the public BraTS-2018 database demonstrate the superiority of our framework over state-of-the-art methods in missing-modality situations. Code is available at https://github.com/CityU-AIM-Group/D2Net.
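The affinity-guided distillation described above aligns teacher and student features via their pairwise relations. A toy sketch of the general idea (affinity matrices compared with an MSE penalty; the function names and shapes are illustrative, not D2-Net's actual layers):

```python
import numpy as np

def affinity(feat):
    """Pairwise cosine-similarity (affinity) matrix over spatial positions.

    feat: (C, N) array of C-dimensional features at N spatial positions.
    """
    f = feat / (np.linalg.norm(feat, axis=0, keepdims=True) + 1e-8)
    return f.T @ f  # (N, N)

def affinity_distill_loss(teacher_feat, student_feat):
    """Mean squared error between teacher and student affinity matrices."""
    return np.mean((affinity(teacher_feat) - affinity(student_feat)) ** 2)

rng = np.random.default_rng(0)
teacher = rng.normal(size=(16, 64))
student = rng.normal(size=(16, 64))
loss = affinity_distill_loss(teacher, student)   # positive for mismatched features
zero = affinity_distill_loss(teacher, teacher)   # identical features give ~0 loss
```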
|
37
|
Song M, Song W, Yang G, Chen C. Improving RGB-D Salient Object Detection via Modality-Aware Decoder. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:6124-6138. [PMID: 36112559 DOI: 10.1109/tip.2022.3205747] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Most existing RGB-D salient object detection (SOD) methods focus primarily on cross-modal and cross-level saliency fusion, which has proved efficient and effective. However, these methods still have a critical limitation: their fusion patterns, typically the combination of selective characteristics and its variations, depend too heavily on the network's non-linear adaptability. In such methods, the balance between RGB and D (depth) is formulated individually over intermediate feature slices, so the relation at the modality level may not be learned properly. The optimal RGB-D combination differs from scenario to scenario, and the exact complementary status is frequently determined by multiple modality-level factors, such as depth quality, the complexity of the RGB scene, and the degree of harmony between them. Existing approaches are therefore unlikely to achieve further performance breakthroughs, as their methodologies remain relatively insensitive to modality. To address this problem, this paper presents the Modality-aware Decoder (MaD). The key technical innovations are a series of feature-embedding, modality-reasoning, and feature back-projecting and collecting strategies, all of which upgrade the widely used multi-scale, multi-level decoding process to be modality-aware. Our MaD achieves competitive performance against other state-of-the-art (SOTA) models without any fancy tricks in the decoder's design. Codes and results will be publicly available at https://github.com/MengkeSong/MaD.
|
38
|
Zhu Y, Hu P, Li X, Tian Y, Bai X, Liang T, Li J. Multiscale unsupervised domain adaptation for automatic pancreas segmentation in CT volumes using adversarial learning. Med Phys 2022; 49:5799-5818. [PMID: 35833617 DOI: 10.1002/mp.15827] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2021] [Revised: 04/28/2022] [Accepted: 05/27/2022] [Indexed: 11/09/2022] Open
Abstract
PURPOSE Computer-aided automatic pancreas segmentation is essential for the early diagnosis and treatment of pancreatic diseases. However, annotating pancreas images requires professional doctors and considerable expenditure. Because of imaging differences across institution populations, scanning devices, imaging protocols, etc., model performance often degrades significantly when models trained on domain-specific (usually institution-specific) datasets are applied directly to data from a new domain (other centers/institutions). In this paper, we propose a novel unsupervised domain adaptation method based on adversarial learning to address pancreas segmentation under missing annotations and domain shift. METHODS A 3D semantic segmentation model with attention and residual modules is designed as the backbone pancreas segmentation model. In both the segmentation model and the domain adaptation discriminator network, a multiscale progressively weighted structure is introduced to acquire different fields of view. Features of labeled and unlabeled data are fed in pairs into the proposed multiscale discriminator to learn domain-specific characteristics. Then the unlabeled data features, with pseudo-domain labels, are fed to the discriminator to acquire domain-ambiguous information. With this adversarial learning strategy, the segmentation network is enhanced to segment unseen unlabeled data. RESULTS Experiments were conducted with two public annotated datasets as source datasets and one private dataset as the target dataset, whose annotations were used only for evaluation, not for training. The 3D segmentation model achieves performance comparable to state-of-the-art pancreas segmentation methods on the source domain.
After applying our domain adaptation architecture, the average Dice similarity coefficient (DSC) of the segmentation model trained on the NIH-TCIA source dataset increases from 58.79% to 72.73% on the local hospital dataset, while the performance of the target-domain segmentation model transferred from the MSD source dataset rises from 62.34% to 71.17%. CONCLUSIONS Correlations of features across data domains are used to train the pancreas segmentation model on an unlabeled data domain, improving the generalization of the model. Our results demonstrate that the proposed method enables the segmentation model to produce meaningful segmentations for data unseen during training. In the future, the proposed method has the potential to transfer segmentation models trained on public datasets to unannotated clinical CT images from local hospitals, effectively assisting radiologists in clinical practice.
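The adversarial scheme above follows the usual pattern: a discriminator learns to tell source features from target features, while the segmenter is rewarded for making target features indistinguishable from source ones. A minimal numpy sketch of the two loss terms (generic adversarial domain adaptation, not the paper's exact networks):

```python
import numpy as np

def bce(pred, label):
    """Binary cross-entropy; pred values are probabilities in (0, 1)."""
    eps = 1e-8
    return -np.mean(label * np.log(pred + eps) + (1 - label) * np.log(1 - pred + eps))

def discriminator_loss(d_source, d_target):
    """Discriminator: source features labeled 1, target features labeled 0."""
    return bce(d_source, 1.0) + bce(d_target, 0.0)

def adversarial_loss(d_target):
    """Segmenter's adversarial term: push target predictions toward the source label."""
    return bce(d_target, 1.0)

d_src = np.array([0.9, 0.8])  # discriminator confident these are source
d_tgt = np.array([0.2, 0.1])  # ... and confident these are target
```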
Affiliation(s)
- Yan Zhu
- Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, 310027, China
- Peijun Hu
- Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, 310027, China; Research Center for Healthcare Data Science, Zhejiang Lab, Hangzhou, 311100, China
- Xiang Li
- Department of Hepatobiliary and Pancreatic Surgery, the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China; Zhejiang Provincial Key Laboratory of Pancreatic Disease, Hangzhou, 310006, China
- Yu Tian
- Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, 310027, China
- Xueli Bai
- Department of Hepatobiliary and Pancreatic Surgery, the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China; Zhejiang Provincial Key Laboratory of Pancreatic Disease, Hangzhou, 310006, China
- Tingbo Liang
- Department of Hepatobiliary and Pancreatic Surgery, the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China; Zhejiang Provincial Key Laboratory of Pancreatic Disease, Hangzhou, 310006, China
- Jingsong Li
- Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, 310027, China; Research Center for Healthcare Data Science, Zhejiang Lab, Hangzhou, 311100, China
|
39
|
Wang Y, Yang Q, Tian L, Zhou X, Rekik I, Huang H. HFCF-Net: A hybrid-feature cross fusion network for COVID-19 lesion segmentation from CT volumetric images. Med Phys 2022; 49:3797-3815. [PMID: 35301729 PMCID: PMC9088496 DOI: 10.1002/mp.15600] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2021] [Revised: 02/16/2022] [Accepted: 02/21/2022] [Indexed: 11/22/2022] Open
Abstract
BACKGROUND The coronavirus disease 2019 (COVID-19) spread rapidly across the globe, seriously threatening people's health worldwide. To reduce the diagnostic burden on front-line doctors, an accurate and automatic lesion segmentation method is highly desirable in clinical practice. PURPOSE Many proposed two-dimensional (2D) methods for slice-based lesion segmentation cannot take full advantage of the spatial information in three-dimensional (3D) volume data, resulting in limited segmentation performance. Three-dimensional methods can utilize the spatial information but suffer from long training times and slow convergence. To solve these problems, we propose an end-to-end hybrid-feature cross fusion network (HFCF-Net) that fuses 2D and 3D features at three scales for accurate segmentation of COVID-19 lesions. METHODS The proposed HFCF-Net incorporates 2D and 3D subnets to effectively extract features within and between slices. A cross fusion module then bridges the 2D and 3D decoders at the same scale to fuse both types of features. The module consists of three cross fusion blocks, each of which contains a prior fusion path and a context fusion path to jointly learn better lesion representations. The former explicitly provides the 3D subnet with lesion-related prior knowledge, and the latter uses the 3D context information as attention guidance for the 2D subnet, promoting precise segmentation of the lesion regions. Furthermore, we explore an imbalance-robust adaptive learning loss function, comprising an image-level loss and a pixel-level loss, to tackle the apparent imbalance between the proportions of lesion and non-lesion voxels; it provides a learning strategy that dynamically adjusts the learning focus between the 2D and 3D branches during training for effective supervision.
RESULT Extensive experiments conducted on a publicly available dataset demonstrate that the proposed segmentation network significantly outperforms state-of-the-art methods for COVID-19 lesion segmentation, yielding a Dice similarity coefficient of 74.85%. A visual comparison of segmentation performance also confirms the superiority of the proposed network in segmenting lesions of different sizes. CONCLUSIONS In this paper, we propose a novel HFCF-Net for rapid and accurate COVID-19 lesion segmentation from chest computed tomography volume data. It fuses hybrid features in a cross manner, exploiting the complementary advantages of the 2D and 3D subnets to enhance segmentation performance. Benefiting from the cross fusion mechanism, the proposed HFCF-Net segments lesions more accurately with the knowledge acquired from both subnets.
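The context-fusion idea above, 3D context gating 2D slice features, can be sketched in a few lines. The gating scheme, names, and shapes below are illustrative assumptions, not HFCF-Net's actual blocks:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_fuse(feat2d, feat3d):
    """Toy context-fusion step: 3D volume features gate per-slice 2D features.

    feat2d: (D, C, H, W) stack of per-slice features from a 2D subnet.
    feat3d: (C, D, H, W) volume features from a 3D subnet.
    Residual attention: the gated 2D features are added back to the originals.
    """
    gate = sigmoid(feat3d).transpose(1, 0, 2, 3)  # align axes to (D, C, H, W)
    return feat2d * gate + feat2d

rng = np.random.default_rng(1)
f2d = rng.normal(size=(4, 8, 16, 16))
f3d = rng.normal(size=(8, 4, 16, 16))
fused = cross_fuse(f2d, f3d)  # same shape as the 2D feature stack
```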
Affiliation(s)
- Yanting Wang
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
- Qingyu Yang
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
- Lixia Tian
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
- Xuezhong Zhou
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
- Islem Rekik
- BASIRA Laboratory, Faculty of Computer and Informatics, Istanbul Technical University, Istanbul, Turkey
- School of Science and Engineering, Computing, University of Dundee, Dundee, UK
- Huifang Huang
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
|
40
|
Wang X, Li Z, Huang Y, Jiao Y. Multimodal medical image segmentation using multi-scale context-aware network. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.11.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
|
41
|
Shi C, Zhang J, Zhang X, Shen M, Chen H, Wang L. A recurrent skip deep learning network for accurate image segmentation. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103533] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
|
42
|
Guo X, Liu J, Yuan Y. Semantic-Oriented Labeled-to-Unlabeled Distribution Translation for Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:434-445. [PMID: 34543194 DOI: 10.1109/tmi.2021.3114329] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Automatic medical image segmentation plays a crucial role in many medical applications, such as disease diagnosis and treatment planning. Existing deep-learning-based models usually treat segmentation as pixel-wise classification and neglect the semantic correlations of pixels across different images, leading to vague feature distributions. Moreover, pixel-wise annotated data is rare in the medical domain, and the scarce annotated data usually exhibits a distribution biased against the desired one, hindering performance improvement under the supervised learning setting. In this paper, we propose a novel Labeled-to-unlabeled Distribution Translation (L2uDT) framework with Semantic-oriented Contrastive Learning (SoCL) to address these issues in medical image segmentation. In SoCL, a semantic grouping module clusters pixels into a set of semantically coherent groups, and a semantic-oriented contrastive loss constrains group-wise prototypes, so as to explicitly learn a feature space with intra-class compactness and inter-class separability. We then establish an L2uDT strategy to approximate the desired data distribution for unbiased optimization, translating the labeled data distribution under the guidance of extensive unlabeled data. In particular, a bias estimator measures the distribution bias, and a gradual-paced shift progressively translates the labeled data distribution toward the unlabeled one. Both labeled and translated data are leveraged to optimize the segmentation model simultaneously. We illustrate the effectiveness of the proposed method on two benchmark datasets, EndoScene and PROSTATEx, where it achieves state-of-the-art performance, clearly demonstrating its value for medical image segmentation. The source code is available at https://github.com/CityU-AIM-Group/L2uDT.
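The group-wise prototypes referenced above are, in the simplest reading, per-class mean features. A toy stand-in for the semantic grouping step (the function and shapes are illustrative, not the paper's module):

```python
import numpy as np

def class_prototypes(features, labels, num_classes):
    """Mean feature vector (prototype) per semantic class.

    features: (N, C) pixel features; labels: (N,) integer class ids.
    Assumes every class id in range(num_classes) appears at least once.
    """
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

feats = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]])
labs = np.array([0, 0, 1, 1])
protos = class_prototypes(feats, labs, 2)
# protos[0] = [2, 0] (mean of class-0 pixels), protos[1] = [0, 3]
```

A contrastive loss can then pull each pixel toward its own class prototype and push it away from the others, which is the intra-class compactness / inter-class separability the abstract describes.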
|
43
|
HT-Net: hierarchical context-attention transformer network for medical CT image segmentation. APPL INTELL 2022. [DOI: 10.1007/s10489-021-03010-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
|
44
|
Xie Y, Zhang J, Liao Z, Verjans J, Shen C, Xia Y. Intra- and Inter-Pair Consistency for Semi-Supervised Gland Segmentation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:894-905. [PMID: 34951847 DOI: 10.1109/tip.2021.3136716] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Accurate gland segmentation in histology tissue images is a critical but challenging task. Although deep models have demonstrated superior performance in medical image segmentation, they commonly require a large amount of annotated data, which is hard to obtain due to the extensive labor and expertise required. In this paper, we propose an intra- and inter-pair consistency-based semi-supervised (I2CS) model that can be trained on both labeled and unlabeled histology images for gland segmentation. Considering that each image contains glands and hence different images could share consistent semantics in the feature space, we introduce a novel intra- and inter-pair consistency module to explore such consistency when learning from unlabeled data. It first characterizes the pixel-level relation between a pair of images in the feature space to create an attention map that highlights regions with the same semantics on different images. It then imposes a consistency constraint on the attention maps obtained from multiple image pairs, filtering low-confidence attention regions to generate refined attention maps that are merged with the original features to improve their representation ability. In addition, we design an object-level loss to address the issues caused by touching glands. We evaluated our model against several recent gland segmentation methods and three typical semi-supervised methods on the GlaS and CRAG datasets. The results not only demonstrate the effectiveness of the proposed intra- and inter-pair consistency module and Obj-Dice loss, but also show that the I2CS model achieves state-of-the-art gland segmentation performance on both benchmarks.
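The pairwise attention map described above, relating each pixel of one image to the pixels of another, is commonly built from normalized feature similarities. A generic sketch under that assumption (not the paper's exact formulation):

```python
import numpy as np

def pair_attention(feat_a, feat_b):
    """Attention from each pixel of image A onto the pixels of image B.

    feat_a: (Na, C) and feat_b: (Nb, C) L2-normalized pixel features.
    Similarities are softmax-normalized over B's pixels, so each row of the
    result is a probability distribution highlighting semantically matching
    regions of B.
    """
    sim = feat_a @ feat_b.T                    # (Na, Nb) cosine similarities
    sim -= sim.max(axis=1, keepdims=True)      # subtract row max for stability
    w = np.exp(sim)
    return w / w.sum(axis=1, keepdims=True)

a = np.eye(3)  # three orthonormal pixel features for image A
b = np.eye(3)  # identical features for image B
attn = pair_attention(a, b)
# Each row sums to 1, and matching pixels receive the highest weight.
```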
|
45
|
Classification of Giemsa staining chromosome using input-aware deep convolutional neural network with integrated uncertainty estimates. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103120] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
|
46
|
|
47
|
X-CTRSNet: 3D cervical vertebra CT reconstruction and segmentation directly from 2D X-ray images. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107680] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
|