1
Wang Z, Gu J, Zhou W, He Q, Zhao T, Guo J, Lu L, He T, Bu J. Neural Memory State Space Models for Medical Image Segmentation. Int J Neural Syst 2025; 35:2450068. [PMID: 39343431 DOI: 10.1142/s0129065724500680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2024]
Abstract
With the rapid advancement of deep learning, computer-aided diagnosis and treatment have become crucial in medicine. UNet is a widely used architecture for medical image segmentation, and various methods for improving UNet have been extensively explored. One popular approach is incorporating transformers, though their quadratic computational complexity poses challenges. Recently, State-Space Models (SSMs), exemplified by Mamba, have gained significant attention as a promising alternative due to their linear computational complexity. Another approach, neural memory Ordinary Differential Equations (nmODEs), exhibits similar principles and achieves good results. In this paper, we explore the respective strengths and weaknesses of nmODEs and SSMs and propose a novel architecture, the nmSSM decoder, which combines the advantages of both approaches. This architecture possesses powerful nonlinear representation capabilities while retaining the ability to preserve input and process global information. We construct nmSSM-UNet using the nmSSM decoder and conduct comprehensive experiments on the PH2, ISIC2018, and BU-COCO datasets to validate its effectiveness in medical image segmentation. The results demonstrate the promising application value of nmSSM-UNet. Additionally, we conducted ablation experiments to verify the effectiveness of our proposed improvements on SSMs and nmODEs.
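As a rough illustration of the linear complexity attributed to SSMs in the abstract above, the sketch below runs a scalar discrete state-space recurrence, h_t = a*h_(t-1) + b*x_t and y_t = c*h_t, in a single pass over the sequence. The function name `ssm_scan` and the fixed scalar parameters `a`, `b`, `c` are assumptions made only for this example. It is not the nmSSM decoder described in the paper, and Mamba additionally makes its parameters depend on the input.

```python
def ssm_scan(xs, a=0.5, b=1.0, c=2.0):
    """Run a scalar linear state-space recurrence over a sequence.

    Hypothetical toy model (not taken from the paper):
        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t
    There is one state update per input, so the cost grows
    linearly with len(xs).
    """
    h = 0.0          # hidden state carried across time steps
    ys = []
    for x in xs:
        h = a * h + b * x   # state update
        ys.append(c * h)    # readout
    return ys


if __name__ == "__main__":
    # An impulse at t=0 decays geometrically through the state.
    print(ssm_scan([1.0, 0.0, 0.0]))  # [2.0, 1.0, 0.5]
```

By contrast, self-attention compares every position with every other position, which is where the quadratic cost mentioned in the abstract comes from.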
Affiliation(s)
- Zhihua Wang
  - College of Computer Science, Zhejiang University, Hangzhou, P. R. China
  - Zhejiang Provincial Key Laboratory of Service Robot, Hangzhou, Zhejiang Province, P. R. China
- Jingjun Gu
  - College of Computer Science, Zhejiang University, Hangzhou, P. R. China
  - Zhejiang Provincial Key Laboratory of Service Robot, Hangzhou, Zhejiang Province, P. R. China
- Wang Zhou
  - Department of Ultrasound, The First Affiliated Hospital of Anhui Medical University, Hefei, P. R. China
- Quansong He
  - College of Computer Science, Sichuan University, Chengdu, P. R. China
- Tianli Zhao
  - Department of Cardiovascular Surgery, The Second Xiangya Hospital, Central South University, Changsha, P. R. China
- Jialong Guo
  - College of Computer Science, Zhejiang University, Hangzhou, P. R. China
  - Zhejiang Provincial Key Laboratory of Service Robot, Hangzhou, Zhejiang Province, P. R. China
- Li Lu
  - Department of Ophthalmology, Eye Center, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, P. R. China
- Tao He
  - College of Computer Science, Sichuan University, Chengdu, P. R. China
- Jiajun Bu
  - College of Computer Science, Zhejiang University, Hangzhou, P. R. China
  - Zhejiang Provincial Key Laboratory of Service Robot, Hangzhou, Zhejiang Province, P. R. China
2
Zheng X, Yang Y, Li D, Deng Y, Xie Y, Yi Z, Ma L, Xu L. Precise Localization for Anatomo-Physiological Hallmarks of the Cervical Spine by Using Neural Memory Ordinary Differential Equation. Int J Neural Syst 2024; 34:2450056. [PMID: 39049777 DOI: 10.1142/s0129065724500564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2024]
Abstract
In the evaluation of cervical spine disorders, precise positioning of anatomo-physiological hallmarks is fundamental for calculating diverse measurement metrics. Although deep learning has achieved impressive results in keypoint localization, it still faces many limitations when applied to medical images. First, these methods struggle with the inherent variability in cervical spine datasets, which arises from imaging factors. Second, predicting keypoints that lie within only 4% of the entire X-ray image surface area poses a significant challenge. To tackle these issues, we propose a deep neural network architecture, NF-DEKR, specifically tailored for predicting keypoints in cervical spine physiological anatomy. By leveraging the neural memory ordinary differential equation, with its distinctive separation of memory and learning and its convergence to a single global attractor, our design effectively mitigates inherent data variability. Simultaneously, we introduce a Multi-Resolution Focus module to preprocess feature maps before they enter the disentangled regression branch and the heatmap branch. Employing a differentiated strategy for feature maps of varying scales, this approach yields more accurate predictions of densely localized keypoints. We construct a medical dataset, SCUSpineXray, comprising X-ray images annotated by orthopedic specialists and conduct similar experiments on the publicly available UWSpineCT dataset. Experimental results demonstrate that, compared to the baseline DEKR network, our proposed method enhances average precision by 2% to 3%, accompanied by a marginal increase in model parameters and floating-point operations (FLOPs). The code is available at https://github.com/Zhxyi/NF-DEKR.
Affiliation(s)
- Xi Zheng
  - Machine Intelligence Laboratory, College of Computer Science, Sichuan University, No. 24 South Section 1, Yihuan Road, Chengdu 610065, P. R. China
- Yi Yang
  - Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Road, Chengdu 610041, P. R. China
- Dehan Li
  - Machine Intelligence Laboratory, College of Computer Science, Sichuan University, No. 24 South Section 1, Yihuan Road, Chengdu 610065, P. R. China
- Yi Deng
  - Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Road, Chengdu 610041, P. R. China
- Yuexiong Xie
  - Machine Intelligence Laboratory, College of Computer Science, Sichuan University, No. 24 South Section 1, Yihuan Road, Chengdu 610065, P. R. China
- Zhang Yi
  - Machine Intelligence Laboratory, College of Computer Science, Sichuan University, No. 24 South Section 1, Yihuan Road, Chengdu 610065, P. R. China
- Litai Ma
  - Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Road, Chengdu 610041, P. R. China
- Lei Xu
  - Machine Intelligence Laboratory, College of Computer Science, Sichuan University, No. 24 South Section 1, Yihuan Road, Chengdu 610065, P. R. China
3
Rafiei MH, Gauthier LV, Adeli H, Takabi D. Self-Supervised Learning for Near-Wild Cognitive Workload Estimation. J Med Syst 2024; 48:107. [PMID: 39576291 DOI: 10.1007/s10916-024-02122-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 11/08/2024] [Indexed: 11/24/2024]
Abstract
Feedback on cognitive workload may reduce decision-making mistakes. Machine learning-based models can produce feedback from physiological data such as electroencephalography (EEG) and electrocardiography (ECG). Supervised machine learning requires large training data sets that are (1) relevant and decontaminated and (2) carefully labeled for accurate approximation, a costly and tedious procedure. Commercial over-the-counter devices are low-cost solutions for the real-time collection of physiological modalities. However, they produce significant artifacts when employed outside of laboratory settings, compromising machine learning accuracy. Additionally, the physiological modalities that most successfully machine-approximate cognitive workload in everyday settings are unknown. To address these challenges, a first-ever hybrid implementation of feature selection and self-supervised machine learning techniques is introduced. This model is employed on data collected outside controlled laboratory settings to (1) identify relevant physiological modalities to machine-approximate six levels of cognitive-physical workloads from a seven-modality repository and (2) postulate limited labeling experiments and machine-approximate mental-physical workloads using self-supervised learning techniques.
Affiliation(s)
- Mohammad H Rafiei
  - Whiting School of Engineering, Johns Hopkins University, 21218, Baltimore, MD, USA
- Lynne V Gauthier
  - Department of Physical Therapy and Kinesiology, University of Massachusetts Lowell, 01854, Lowell, MA, USA
- Hojjat Adeli
  - Departments of Biomedical Informatics and Neuroscience, The Ohio State University, 43210, Columbus, OH, USA
- Daniel Takabi
  - School of Cybersecurity, Old Dominion University, 23529, Norfolk, VA, USA
4
Díaz-Francés JÁ, Fernández-Rodríguez JD, Thurnhofer-Hemsi K, López-Rubio E. Semi-Supervised Semantic Image Segmentation by Deep Diffusion Models and Generative Adversarial Networks. Int J Neural Syst 2024; 34:2450057. [PMID: 39155691 DOI: 10.1142/s0129065724500576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/20/2024]
Abstract
Typically, deep learning models for image segmentation tasks are trained using large datasets of images annotated at the pixel level, which can be expensive and highly time-consuming. One way to reduce the number of annotated images required for training is to adopt a semi-supervised approach. In this regard, generative deep learning models, specifically Generative Adversarial Networks (GANs), have been adapted to semi-supervised training of segmentation tasks. This work proposes MaskGDM, a deep learning architecture that combines ideas from EditGAN, a GAN that jointly models images and their segmentations, with a generative diffusion model. With careful integration, we find that using a generative diffusion model can improve EditGAN's performance on multiple segmentation datasets, both multi-class and with binary labels. According to the quantitative results obtained, the proposed model improves multi-class image segmentation when compared to the EditGAN and DatasetGAN models, respectively, by [Formula: see text] and [Formula: see text]. Moreover, using the ISIC dataset, our proposal improves the results from other models by up to [Formula: see text] for the binary image segmentation approach.
Affiliation(s)
- José Ángel Díaz-Francés
  - ITIS Software, University of Málaga, Calle Arquitecto Francisco Peñalosa 18, Málaga 29010, Spain
- Karl Thurnhofer-Hemsi
  - ITIS Software, University of Málaga, Calle Arquitecto Francisco Peñalosa 18, Málaga 29010, Spain
- Ezequiel López-Rubio
  - ITIS Software, University of Málaga, Calle Arquitecto Francisco Peñalosa 18, Málaga 29010, Spain
5
Zhou C, Ye L, Peng H, Liu Z, Wang J, Ramírez-De-Arellano A. A Parallel Convolutional Network Based on Spiking Neural Systems. Int J Neural Syst 2024; 34:2450022. [PMID: 38487872 DOI: 10.1142/s0129065724500229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/28/2024]
Abstract
Deep convolutional neural networks have shown advanced performance in accurately segmenting images. In this paper, an SNP-like convolutional neuron structure is introduced, abstracted from the nonlinear mechanism in nonlinear spiking neural P (NSNP) systems. Then, a U-shaped convolutional neural network named SNP-like parallel-convolutional network, or SPC-Net, is constructed for segmentation tasks. The dual-convolution concatenate (DCC) and dual-convolution addition (DCA) network blocks are designed, respectively, in the encoder and decoder stages. The two blocks employ parallel convolution with different kernel sizes to improve feature representation ability and make full use of spatial detail information. Meanwhile, different feature fusion strategies are used to fuse their features to achieve feature complementarity and augmentation. Furthermore, a dual-scale pooling (DSP) module in the bottleneck is designed to improve the feature extraction capability, which can extract multi-scale contextual information and reduce information loss while extracting salient features. The SPC-Net is applied in medical image segmentation tasks and is compared with several recent segmentation methods on the GlaS and CRAG datasets. The proposed SPC-Net achieves 90.77% DICE coefficient, 83.76% IoU score and 83.93% F1 score, 86.33% ObjDice coefficient, 135.60 Obj-Hausdorff distance, respectively. The experimental results show that the proposed model can achieve good segmentation performance.
Affiliation(s)
- Chi Zhou
  - School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China
- Lulin Ye
  - School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China
- Hong Peng
  - School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China
- Zhicai Liu
  - School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China
- Jun Wang
  - School of Electrical Engineering and Electronic Information, Xihua University, Chengdu 610039, P. R. China
- Antonio Ramírez-De-Arellano
  - Research Group of Natural Computing, Department of Computer Science and Artificial Intelligence, University of Seville, Sevilla 41012, Spain
6
Li F, Jiang A, Li M, Xiao C, Ji W. HPFG: semi-supervised medical image segmentation framework based on hybrid pseudo-label and feature-guiding. Med Biol Eng Comput 2024; 62:405-421. [PMID: 37875739 DOI: 10.1007/s11517-023-02946-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 10/07/2023] [Indexed: 10/26/2023]
Abstract
Semi-supervised learning methods have been attracting much attention in medical image segmentation due to the lack of high-quality annotations. To cope with the problem of noisy pseudo-labels in semi-supervised medical image segmentation and the limitations of contrastive learning applications, we propose a semi-supervised medical image segmentation framework, HPFG, based on hybrid pseudo-label and feature-guiding, which consists of a hybrid pseudo-label strategy and two different feature-guiding modules. The hybrid pseudo-label strategy uses the CutMix operation and an auxiliary network to enable the labeled images to guide the unlabeled images to generate high-quality pseudo-labels and reduce the impact of pseudo-label noise. In addition, a feature-guiding encoder module based on feature-level contrastive learning is designed to guide the encoder to mine useful local and global image features, thus effectively enhancing the feature extraction capability of the model. At the same time, a feature-guiding decoder module based on adaptive class-level contrastive learning is designed to guide the decoder in better extracting class information, achieving intra-class affinity and inter-class separation, and effectively alleviating the class imbalance problem in medical datasets. Extensive experimental results show that the HPFG framework proposed in this paper outperforms existing semi-supervised medical image segmentation methods on three public datasets: ACDC, LIDC, and ISIC. Code is available at https://github.com/fakerlove1/HPFG.
Affiliation(s)
- Feixiang Li
  - College of Computer Science and Technology, Taiyuan University of Technology, Jinzhong, 030600, China
- Ailian Jiang
  - College of Computer Science and Technology, Taiyuan University of Technology, Jinzhong, 030600, China
- Mengyang Li
  - College of Computer Science and Technology, Taiyuan University of Technology, Jinzhong, 030600, China
- Cimei Xiao
  - College of Computer Science and Technology, Taiyuan University of Technology, Jinzhong, 030600, China
- Wei Ji
  - College of Computer Science and Technology, Taiyuan University of Technology, Jinzhong, 030600, China
7
Hu J, Yu C, Yi Z, Zhang H. Enhancing Robustness of Medical Image Segmentation Model with Neural Memory Ordinary Differential Equation. Int J Neural Syst 2023; 33:2350060. [PMID: 37743765 DOI: 10.1142/s0129065723500600] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 09/26/2023]
Abstract
Deep neural networks (DNNs) have emerged as a prominent model in medical image segmentation, achieving remarkable advancements in clinical practice. Despite the promising results reported in the literature, the effectiveness of DNNs depends on substantial quantities of high-quality annotated training data. During experiments, we observe a significant decline in the performance of DNNs on the test set when the labels of the training dataset are disrupted, revealing inherent limitations in the robustness of DNNs. In this paper, we find that the neural memory ordinary differential equation (nmODE), a recently proposed model based on ordinary differential equations (ODEs), not only addresses the robustness limitation but also enhances performance when trained on a clean training dataset. However, ODE-based models tend to be less computationally efficient than conventional discrete models because of the multiple function evaluations required by the ODE solver. Recognizing this efficiency limitation, we propose a novel approach called nmODE-based knowledge distillation (nmODE-KD). The proposed method aims to transfer knowledge from the continuous nmODE to a discrete layer, simultaneously enhancing the model's robustness and efficiency. The core concept of nmODE-KD is to force the discrete layer to mimic the continuous nmODE by minimizing the KL divergence between them. Experimental results on 18 organs-at-risk segmentation tasks demonstrate that nmODE-KD exhibits improved robustness compared to ODE-based models while also mitigating the efficiency limitation.
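The KL-divergence objective mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `softmax`, `kl_divergence`, and `distillation_loss` are hypothetical, the outputs of both branches are assumed to be per-position class logits, and the direction KL(teacher || student), with the nmODE branch as teacher, is a common knowledge-distillation convention that the abstract does not confirm.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(teacher_logits, student_logits):
    """Mean KL(teacher || student) over positions (e.g. pixels).

    Assumption: teacher_logits come from the continuous (nmODE) branch and
    student_logits from the discrete layer; each item is a list of class logits.
    """
    total = 0.0
    for t, s in zip(teacher_logits, student_logits):
        total += kl_divergence(softmax(t), softmax(s))
    return total / len(teacher_logits)


if __name__ == "__main__":
    teacher = [[2.0, 0.5], [0.1, 1.2]]
    student = [[0.5, 2.0], [1.2, 0.1]]
    print(distillation_loss(teacher, teacher))  # zero when the outputs match
    print(distillation_loss(teacher, student))  # positive when they differ
```

Minimizing this quantity with respect to the student's parameters pushes the discrete layer's output distribution toward that of the teacher.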
Affiliation(s)
- Junjie Hu
  - Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
- Chengrong Yu
  - Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
- Zhang Yi
  - Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
- Haixian Zhang
  - Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
8
Cui J, Xiao J, Hou Y, Wu X, Zhou J, Peng X, Wang Y. Unsupervised Domain Adaptive Dose Prediction via Cross-Attention Transformer and Target-Specific Knowledge Preservation. Int J Neural Syst 2023; 33:2350057. [PMID: 37771298 DOI: 10.1142/s0129065723500570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/30/2023]
Abstract
Radiotherapy is one of the leading treatments for cancer. To accelerate the implementation of radiotherapy in the clinic, various deep learning-based methods have been developed for automatic dose prediction. However, the effectiveness of these methods relies heavily on the availability of a substantial amount of labeled data, i.e. the dose distribution maps, which cost dosimetrists considerable time and effort to acquire. For low-incidence cancers, such as cervical cancer, it is often a luxury to collect an adequate amount of labeled data to train a well-performing deep learning (DL) model. To mitigate this problem, in this paper, we resort to the unsupervised domain adaptation (UDA) strategy to achieve accurate dose prediction for cervical cancer (target domain) by leveraging the well-labeled high-incidence rectal cancer (source domain). Specifically, we introduce the cross-attention mechanism to learn the domain-invariant features and develop a cross-attention transformer-based encoder to align the two different cancer domains. Meanwhile, to preserve the target-specific knowledge, we employ multiple domain classifiers to force the network to extract more discriminative target features. In addition, we employ two independent convolutional neural network (CNN) decoders to compensate for the lack of spatial inductive bias in the pure transformer and generate accurate dose maps for both domains. Furthermore, to enhance the performance, two additional losses, i.e. a knowledge distillation loss (KDL) and a domain classification loss (DCL), are incorporated to transfer the domain-invariant features while preserving domain-specific information. Experimental results on a rectal cancer dataset and a cervical cancer dataset have demonstrated that our method achieves the best quantitative results with [Formula: see text], [Formula: see text], and HI of 1.446, 1.231, and 0.082, respectively, and outperforms other methods in terms of qualitative assessment.
Affiliation(s)
- Jiaqi Cui
  - School of Computer Science, Sichuan University, Chengdu, P. R. China
- Jianghong Xiao
  - Department of Radiation Oncology, Cancer Center, West China Hospital, Sichuan University, Chengdu, P. R. China
- Yun Hou
  - Agile and Intelligent Computing Key Laboratory, Southwest China Institute of Electronic Technology, Chengdu, P. R. China
- Xi Wu
  - School of Computer Science, Chengdu University of Information Technology, P. R. China
- Jiliu Zhou
  - School of Computer Science, Sichuan University, Chengdu, P. R. China
- Xingchen Peng
  - Department of Biotherapy, Cancer Center, West China Hospital, Sichuan University, Chengdu, P. R. China
- Yan Wang
  - School of Computer Science, Sichuan University, Chengdu, P. R. China
9
Hao D, Li H, Zhang Y, Zhang Q. MUE-CoT: multi-scale uncertainty entropy-aware co-training framework for left atrial segmentation. Phys Med Biol 2023; 68:215008. [PMID: 37567214 DOI: 10.1088/1361-6560/acef8e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 08/11/2023] [Indexed: 08/13/2023]
Abstract
Objective. Accurate left atrial segmentation is the basis of the recognition and clinical analysis of atrial fibrillation. Supervised learning has achieved some competitive segmentation results, but the high annotation cost often limits its performance. Semi-supervised learning is implemented from limited labeled data and a large amount of unlabeled data and shows good potential in solving practical medical problems. Approach. In this study, we proposed a collaborative training framework for multi-scale uncertain entropy perception (MUE-CoT) and achieved efficient left atrial segmentation from a small amount of labeled data. Based on the pyramid feature network, learning is implemented from unlabeled data by minimizing the pyramid prediction difference. In addition, novel loss constraints are proposed for co-training in the study. The diversity loss is defined as a soft constraint so as to accelerate the convergence and a novel multi-scale uncertainty entropy calculation method and a consistency regularization term are proposed to measure the consistency between prediction results. The quality of pseudo-labels cannot be guaranteed in the pre-training period, so a confidence-dependent empirical Gaussian function is proposed to weight the pseudo-supervised loss. Main results. The experimental results of a publicly available dataset and an in-house clinical dataset proved that our method outperformed existing semi-supervised methods. For the two datasets with a labeled ratio of 5%, the Dice similarity coefficient scores were 84.94% ± 4.31 and 81.24% ± 2.4, the HD95 values were 4.63 mm ± 2.13 and 3.94 mm ± 2.72, and the Jaccard similarity coefficient scores were 74.00% ± 6.20 and 68.49% ± 3.39, respectively. Significance. The proposed model effectively addresses the challenges of limited data samples and high costs associated with manual annotation in the medical field, leading to enhanced segmentation accuracy.
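For readers unfamiliar with the two overlap metrics reported in the abstract above, the sketch below computes the Dice and Jaccard similarity coefficients for a pair of flattened binary masks. It is a generic illustration using the standard definitions; the function name `dice_and_jaccard` and the toy masks are assumptions, and it does not reproduce the paper's evaluation pipeline.

```python
def dice_and_jaccard(pred, target):
    """Return (Dice, Jaccard) for two flattened binary masks of equal length.

    Dice    = 2 * |P ∩ T| / (|P| + |T|)
    Jaccard = |P ∩ T| / |P ∪ T|
    """
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_size = sum(1 for p in pred if p)
    target_size = sum(1 for t in target if t)
    union = pred_size + target_size - intersection
    if union == 0:
        # Both masks are empty: treat as perfect agreement (a convention).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_size + target_size)
    jaccard = intersection / union
    return dice, jaccard


if __name__ == "__main__":
    # Hypothetical 4-pixel masks: one overlapping pixel out of two in each mask.
    dice, jaccard = dice_and_jaccard([1, 1, 0, 0], [1, 0, 1, 0])
    print(dice, jaccard)  # 0.5 and one third
```

For a single pair of masks the two scores are linked by Jaccard = Dice / (2 - Dice). Averages over many cases, such as the dataset-level figures quoted above, need not satisfy that identity exactly.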
Affiliation(s)
- Dechen Hao
  - School of Software, North University of China, Taiyuan Shanxi, People's Republic of China
- Hualing Li
  - School of Software, North University of China, Taiyuan Shanxi, People's Republic of China
- Yonglai Zhang
  - School of Software, North University of China, Taiyuan Shanxi, People's Republic of China
- Qi Zhang
  - Department of Cardiology, The Second Hospital of Shanxi Medical University, Taiyuan Shanxi, People's Republic of China
10
Zhu H, Wang J, Wang SH, Raman R, Górriz JM, Zhang YD. An Evolutionary Attention-Based Network for Medical Image Classification. Int J Neural Syst 2023; 33:2350010. [PMID: 36655400 DOI: 10.1142/s0129065723500107] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 12/23/2022]
Abstract
Deep learning has become a primary choice in medical image analysis due to its powerful representation capability. However, most existing deep learning models designed for medical image classification can only perform well on a specific disease. The performance drops dramatically when it comes to other diseases. Generalizability remains a challenging problem. In this paper, we propose an evolutionary attention-based network (EDCA-Net), which is an effective and robust network for medical image classification tasks. To extract task-related features from a given medical dataset, we first propose the densely connected attentional network (DCA-Net) where feature maps are automatically channel-wise weighted, and the dense connectivity pattern is introduced to improve the efficiency of information flow. To improve the model capability and generalizability, we introduce two types of evolution: intra- and inter-evolution. The intra-evolution optimizes the weights of DCA-Net, while the inter-evolution allows two instances of DCA-Net to exchange training experience during training. The evolutionary DCA-Net is referred to as EDCA-Net. The EDCA-Net is evaluated on four publicly accessible medical datasets of different diseases. Experiments showed that the EDCA-Net outperforms the state-of-the-art methods on three datasets and achieves comparable performance on the last dataset, demonstrating good generalizability for medical image classification.
Affiliation(s)
- Hengde Zhu
  - School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK
- Jian Wang
  - School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK
- Shui-Hua Wang
  - School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK
- Rajeev Raman
  - School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK
- Juan M Górriz
  - Department of Signal Theory, Networking and Communications, University of Granada, Granada 52005, Spain
- Yu-Dong Zhang
  - School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK
11
Abstract
Three-dimensional (3D) medical image segmentation plays a crucial role in medical care applications. Although various two-dimensional (2D) and 3D neural network models have been applied to 3D medical image segmentation and achieved impressive results, a trade-off remains between efficiency and accuracy. To address this issue, a novel mixture convolutional network (MixConvNet) is proposed, in which traditional 2D/3D convolutional blocks are replaced with novel MixConv blocks. In the MixConv block, 3D convolution is decomposed into a mixture of 2D convolutions from different views. Therefore, the MixConv block fully utilizes the advantages of 2D convolution and maintains the learning ability of 3D convolution. It acts as a 3D convolution and thus can process volumetric input directly and learn inter-slice features, which are absent in the traditional 2D convolutional block. At the same time, the proposed MixConv block contains only 2D convolutions; hence, it has significantly fewer trainable parameters and a smaller computation budget than a block containing 3D convolutions. Furthermore, the proposed MixConvNet is pre-trained with small input patches and fine-tuned with large input patches to further improve segmentation performance. In experiments on the Decathlon Heart dataset and Sliver07 dataset, the proposed MixConvNet outperformed state-of-the-art methods such as UNet3D, VNet, and nnUnet.
Affiliation(s)
- Jianyong Wang
  - Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, Sichuan, P. R. China
- Lei Zhang
  - Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, Sichuan, P. R. China
- Yi Zhang
  - Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, Sichuan, P. R. China
12
Semi-supervised structure attentive temporal mixup coherence for medical image segmentation. Biocybern Biomed Eng 2022. [DOI: 10.1016/j.bbe.2022.09.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
13
Wang K, Wang Y, Zhan B, Yang Y, Zu C, Wu X, Zhou J, Nie D, Zhou L. An Efficient Semi-Supervised Framework with Multi-Task and Curriculum Learning for Medical Image Segmentation. Int J Neural Syst 2022; 32:2250043. [DOI: 10.1142/s0129065722500435] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]