1
Qiu H, Zhong C, Gao C, Huang C. Boundary-enhanced local-global collaborative network for medical image segmentation. Sci Rep 2025; 15:9081. [PMID: 40097512; PMCID: PMC11914279; DOI: 10.1038/s41598-025-93875-9]
Abstract
Medical imaging plays a vital role as an auxiliary tool in clinical diagnosis and treatment, and segmentation is a crucial foundational step in medical image analysis. Nonetheless, challenges such as class imbalance and indistinct boundaries of regions of interest (ROIs) often complicate medical image segmentation. Constructing a network capable of accurately locating small ROIs and segmenting them precisely remains a significant task. In this paper, we propose a boundary information-enhanced local-global collaborative network (BELGNet). The network combines the local feature extraction capabilities of CNNs, the global feature recognition of state space models exemplified by Mamba, and boundary feature enhancement to learn a more comprehensive representation. Specifically, we propose a local-global collaborative encoder with attention fusion. This encoder integrates local and global features through a deep attention fusion module to address the challenge of segmenting small ROIs in class-imbalanced scenarios. We then develop a boundary information-enhanced decoder. Through the incremental application of boundary attention modules, this decoder emphasizes boundary features during image restoration, steering the network toward more complete segmentation. Extensive experiments on several public class-imbalanced medical image segmentation datasets demonstrate that the proposed BELGNet outperforms state-of-the-art methods.
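The abstract's deep attention fusion idea can be illustrated with a short sketch. Below is a minimal PyTorch gating module that blends a local (CNN) feature map with a global (state-space) feature map; the module name, the sigmoid gate, and the convex combination are illustrative assumptions, not the authors' published design.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuse a local (CNN) and a global (state-space) feature map."""
    def __init__(self, channels: int):
        super().__init__()
        # Predict a per-pixel gate from the concatenated features.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, local_feat: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
        a = self.gate(torch.cat([local_feat, global_feat], dim=1))
        # Convex combination: the gate decides how much each branch contributes.
        return a * local_feat + (1.0 - a) * global_feat

fused = AttentionFusion(64)(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
print(fused.shape)  # torch.Size([1, 64, 32, 32])
```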
Affiliation(s)
- Haiyan Qiu
- The Central Hospital of Yongzhou, Yongzhou, 425000, China
- Chi Zhong
- The Central Hospital of Yongzhou, Yongzhou, 425000, China
- Chengling Gao
- Zhejiang Key Laboratory of Intelligent Education Technology and Application, Zhejiang Normal University, Jinhua, 321004, China
- Changqin Huang
- Zhejiang Key Laboratory of Intelligent Education Technology and Application, Zhejiang Normal University, Jinhua, 321004, China
2
Liu Y, Yuan D, Xu Z, Zhan Y, Zhang H, Lu J, Lukasiewicz T. Pixel level deep reinforcement learning for accurate and robust medical image segmentation. Sci Rep 2025; 15:8213. [PMID: 40064951; PMCID: PMC11894052; DOI: 10.1038/s41598-025-92117-2]
Abstract
Existing deep learning methods have achieved significant success in medical image segmentation. However, this success largely relies on stacking advanced modules and architectures, which has created a path dependency. This path dependency is unsustainable, as it leads to ever larger models and higher deployment costs. To break this path dependency, we introduce deep reinforcement learning to enhance segmentation performance. However, current deep reinforcement learning methods face challenges such as high training cost, independent iterative processes, and high uncertainty of segmentation masks. Consequently, we propose a Pixel-level Deep Reinforcement Learning model with pixel-by-pixel Mask Generation (PixelDRL-MG) for more accurate and robust medical image segmentation. PixelDRL-MG adopts a dynamic iterative update policy, directly segmenting the regions of interest without requiring user interaction or coarse segmentation masks. We propose a Pixel-level Asynchronous Advantage Actor-Critic (PA3C) strategy that treats each pixel as an agent whose state (foreground or background) is iteratively updated through direct actions. Our experiments on two commonly used medical image segmentation datasets demonstrate that PixelDRL-MG achieves superior segmentation performance to state-of-the-art segmentation baselines (especially at boundaries) while using significantly fewer model parameters. We also conducted detailed ablation studies to enhance understanding and facilitate practical application. Additionally, PixelDRL-MG performs well in low-resource settings (i.e., 50-shot or 100-shot), making it an ideal choice for real-world scenarios.
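As a rough illustration of the pixel-as-agent idea, the toy PyTorch loop below keeps a binary state per pixel and lets a shared policy emit a keep/flip action for every pixel at each iteration; the network, action set, and greedy update are simplifications assumed here, not the published PA3C training procedure.

```python
import torch
import torch.nn as nn

# Shared policy: maps (image, current mask) to per-pixel action logits.
policy = nn.Sequential(
    nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 2, 3, padding=1),   # actions per pixel: 0 = keep, 1 = flip
)

image = torch.rand(1, 1, 64, 64)
mask = torch.zeros(1, 1, 64, 64)      # start from scratch: no coarse mask needed
with torch.no_grad():
    for step in range(5):             # dynamic iterative update of pixel states
        logits = policy(torch.cat([image, mask], dim=1))
        action = logits.argmax(dim=1, keepdim=True).float()  # greedy per-pixel action
        mask = (mask + action) % 2    # flip the selected pixels' states
```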
Affiliation(s)
- Yunxin Liu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, Hebei University of Technology, Tianjin, China
- Di Yuan
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, Hebei University of Technology, Tianjin, China
- Zhenghua Xu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, Hebei University of Technology, Tianjin, China
- Yuefu Zhan
- The Third People's Hospital of Longgang District Shenzhen, Shenzhen, China
- The Seventh People's Hospital of Chongqing, No. 1, Village 1, Lijiatuo Labor Union, Banan District, Chongqing, China
- Longgang Institute of Medical Imaging, Shantou University Medical College, Shenzhen, China
- Hainan Women and Children's Medical Center, Hainan, China
- Hongwei Zhang
- BigBear (Tianjin) Medical Technology Co., Ltd, Tianjin, China
- Jun Lu
- BigBear (Tianjin) Medical Technology Co., Ltd, Tianjin, China
- Thomas Lukasiewicz
- Institute of Logic and Computation, Vienna University of Technology, Vienna, Austria
- Department of Computer Science, University of Oxford, Oxford, United Kingdom
3
Zhang C, Mei M, Mei Z, Wu B, Chen S, Lu M, Lu C. On efficient expanding training datasets of breast tumor ultrasound segmentation model. Comput Biol Med 2024; 183:109274. [PMID: 39471661; DOI: 10.1016/j.compbiomed.2024.109274]
Abstract
Automatic segmentation of breast tumor ultrasound images can provide doctors with objective and efficient references for lesions and regions of interest. Both dataset optimization and model structure optimization are crucial for achieving optimal segmentation performance, and when breast tumor ultrasound datasets are insufficient for model training, it can be challenging to satisfy clinical needs through model structure enhancements alone. While significant research has focused on enhancing the architecture of deep learning models to improve tumor segmentation performance, there is comparatively little work dedicated to dataset augmentation. Conventional data augmentation techniques, such as rotation and transformation, often yield insufficient improvements in model accuracy. Deep learning methods for generating synthetic images, such as GANs, are primarily applied to produce visually natural-looking images; however, the accuracy of the labels for these generated images still requires manual verification, and the images lack diversity. They are therefore not suitable for augmenting the training datasets of image segmentation models. This study introduces a novel dataset augmentation approach that generates synthetic images by embedding tumor regions into normal images. We explore two synthesis methods: one using identical backgrounds and another using varying backgrounds. Through experimental validation, we demonstrate the efficiency of the synthetic datasets in enhancing the performance of image segmentation models. Notably, the method using different backgrounds yields a greater improvement than the identical-background approach. Our findings contribute to medical image analysis, particularly tumor segmentation, by providing a practical and effective dataset augmentation strategy that can significantly improve the accuracy and reliability of segmentation models.
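The core augmentation operation is simple to sketch. The NumPy function below pastes a masked tumor patch into a normal image at a chosen location and derives the segmentation label for free; the placement choice and the absence of blending are illustrative assumptions.

```python
import numpy as np

def embed_tumor(normal_img: np.ndarray, tumor_patch: np.ndarray,
                patch_mask: np.ndarray, top: int, left: int):
    """Paste masked tumor pixels into a normal image; return (image, label)."""
    synth = normal_img.copy()
    h, w = patch_mask.shape
    region = synth[top:top + h, left:left + w]
    region[patch_mask > 0] = tumor_patch[patch_mask > 0]   # copy tumor pixels only
    label = np.zeros(normal_img.shape[:2], dtype=np.uint8)
    label[top:top + h, left:left + w] = (patch_mask > 0)   # the mask comes for free
    return synth, label

normal = np.random.randint(0, 255, (256, 256), dtype=np.uint8)
patch = np.random.randint(0, 255, (64, 64), dtype=np.uint8)
mask = np.zeros((64, 64), dtype=np.uint8); mask[16:48, 16:48] = 1
img, lbl = embed_tumor(normal, patch, mask, top=100, left=80)
```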
Affiliation(s)
- Caicai Zhang
- School of Modern Information Technology, Zhejiang Polytechnic University of Mechanical and Electrical Engineering, 528 Binwen Road, Binjiang District, Hangzhou, 310053, Zhejiang, China
- Mei Mei
- Department of Ultrasound, The Second Affiliated Hospital, Zhejiang University School of Medicine, 88 Jiefang Road, Shangcheng District, Hangzhou 310009, Zhejiang, China
- Zhuolin Mei
- School of Computer and Big Data Science, Jiujiang University, 551 Qianjin East Road, Jiujiang 332005, Jiangxi, China
- Bin Wu
- School of Computer and Big Data Science, Jiujiang University, 551 Qianjin East Road, Jiujiang 332005, Jiangxi, China
- Shasha Chen
- School of Modern Information Technology, Zhejiang Polytechnic University of Mechanical and Electrical Engineering, 528 Binwen Road, Binjiang District, Hangzhou, 310053, Zhejiang, China
- Minfeng Lu
- School of Modern Information Technology, Zhejiang Polytechnic University of Mechanical and Electrical Engineering, 528 Binwen Road, Binjiang District, Hangzhou, 310053, Zhejiang, China
- Chenglang Lu
- School of Modern Information Technology, Zhejiang Polytechnic University of Mechanical and Electrical Engineering, 528 Binwen Road, Binjiang District, Hangzhou, 310053, Zhejiang, China
4
Xu Z, Wang S, Xu G, Liu Y, Yu M, Zhang H, Lukasiewicz T, Gu J. Automatic data augmentation for medical image segmentation using Adaptive Sequence-length based Deep Reinforcement Learning. Comput Biol Med 2024; 169:107877. [PMID: 38157774; DOI: 10.1016/j.compbiomed.2023.107877]
Abstract
Although existing deep reinforcement learning-based approaches have achieved some success in image augmentation tasks, their effectiveness and adequacy for data augmentation in intelligent medical image analysis remain unsatisfactory. We therefore propose a novel Adaptive Sequence-length based Deep Reinforcement Learning (ASDRL) model for Automatic Data Augmentation (AutoAug) in intelligent medical image analysis. The improvements of ASDRL-AutoAug are two-fold: (i) To remedy the problem that some augmented images are invalid, we construct a more accurate reward function based on different variations of the augmentation trajectories. This reward function assesses the validity of each augmentation transformation more accurately by incorporating different information about the validity of the augmented images. (ii) To alleviate the problem of insufficient augmentation, we further propose a more intelligent automatic stopping mechanism (ASM). ASM feeds a stop signal to the agent automatically by judging the adequacy of image augmentation, ensuring that every transformation applied before stopping smoothly improves model performance. Extensive experimental results on three medical image segmentation datasets show that (i) ASDRL-AutoAug greatly outperforms the state-of-the-art data augmentation methods in medical image segmentation tasks, (ii) the proposed improvements are both effective and essential for ASDRL-AutoAug to achieve superior performance, with the new reward evaluating transformations more accurately than existing reward functions, and (iii) ASDRL-AutoAug is adaptive to different images in terms of sequence length and generalizes across different segmentation models.
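A control-flow sketch may clarify the adaptive sequence length and the stopping mechanism. In the Python snippet below, an agent applies one transformation per step until an adequacy test emits the stop signal; the policy, transformations, and adequacy test are placeholder assumptions, not the paper's learned components.

```python
import random

TRANSFORMS = {
    "rotate": lambda im: im,      # placeholders for real image operations
    "flip": lambda im: im,
    "contrast": lambda im: im,
}

def augment_episode(image, policy, adequate, max_steps=8):
    trajectory = []
    for _ in range(max_steps):                # adaptive sequence length
        if adequate(image, trajectory):       # ASM: stop signal to the agent
            break
        name = policy(image, trajectory)      # agent picks the next transformation
        image = TRANSFORMS[name](image)
        trajectory.append(name)
    return image, trajectory

# Toy stand-ins: a random policy, and "adequate" after three transformations.
img, traj = augment_episode(
    image=object(),
    policy=lambda im, t: random.choice(list(TRANSFORMS)),
    adequate=lambda im, t: len(t) >= 3)
print(traj)
```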
Affiliation(s)
- Zhenghua Xu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Shengxin Wang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Gang Xu
- School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Yunxin Liu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Miao Yu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Hongwei Zhang
- School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China
- Thomas Lukasiewicz
- Institute of Logic and Computation, Vienna University of Technology, Vienna, Austria
- Department of Computer Science, University of Oxford, Oxford, United Kingdom
- Junhua Gu
- School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
5
Zhang J, Zhang S, Shen X, Lukasiewicz T, Xu Z. Multi-ConDoS: Multimodal Contrastive Domain Sharing Generative Adversarial Networks for Self-Supervised Medical Image Segmentation. IEEE Trans Med Imaging 2024; 43:76-95. [PMID: 37379176; DOI: 10.1109/tmi.2023.3290356]
Abstract
Existing self-supervised medical image segmentation usually encounters the domain shift problem (i.e., the input distribution of pre-training differs from that of fine-tuning) and/or the multimodality problem (i.e., it is based on single-modal data only and cannot utilize the fruitful multimodal information of medical images). To solve these problems, in this work, we propose multimodal contrastive domain sharing (Multi-ConDoS) generative adversarial networks to achieve effective multimodal contrastive self-supervised medical image segmentation. Compared to existing self-supervised approaches, Multi-ConDoS has the following three advantages: (i) it utilizes multimodal medical images to learn more comprehensive object features via multimodal contrastive learning; (ii) domain translation is achieved by integrating the cyclic learning strategy of CycleGAN and the cross-domain translation loss of Pix2Pix; (iii) novel domain sharing layers are introduced to learn not only domain-specific but also domain-sharing information from the multimodal medical images. Extensive experiments on two public multimodal medical image segmentation datasets show that, with only 5% (resp., 10%) of labeled data, Multi-ConDoS not only greatly outperforms the state-of-the-art self-supervised and semi-supervised medical image segmentation baselines with the same ratio of labeled data, but also achieves similar (sometimes even better) performance as fully supervised segmentation methods with 50% (resp., 100%) of labeled data, proving that our work can achieve superior segmentation performance with a very low labeling workload. Furthermore, ablation studies show that the above three improvements are all effective and essential for Multi-ConDoS to achieve this very superior performance.
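The described domain translation combines CycleGAN's cyclic reconstruction with a Pix2Pix-style paired translation loss. The PyTorch sketch below shows one plausible way to combine the two objectives; the toy generators, the loss weights, and the pairing assumption are illustrative, not the exact Multi-ConDoS formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def translation_losses(G_ab, G_ba, x_a, x_b, lam_cyc=10.0, lam_l1=100.0):
    fake_b = G_ab(x_a)                        # translate modality A -> B
    fake_a = G_ba(x_b)                        # translate modality B -> A
    # CycleGAN: translating back should reconstruct the original input.
    cyc = F.l1_loss(G_ba(fake_b), x_a) + F.l1_loss(G_ab(fake_a), x_b)
    # Pix2Pix: with paired multimodal scans, compare each translation to its target.
    l1 = F.l1_loss(fake_b, x_b) + F.l1_loss(fake_a, x_a)
    return lam_cyc * cyc + lam_l1 * l1

G = nn.Conv2d(1, 1, 1)                        # stand-in "generators" for the demo
loss = translation_losses(G, G, torch.rand(2, 1, 32, 32), torch.rand(2, 1, 32, 32))
```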
6
Xu Z, Tang J, Qi C, Yao D, Liu C, Zhan Y, Lukasiewicz T. Cross-domain attention-guided generative data augmentation for medical image analysis with limited data. Comput Biol Med 2024; 168:107744. [PMID: 38006826; DOI: 10.1016/j.compbiomed.2023.107744]
Abstract
Data augmentation is widely applied to medical image analysis tasks on limited datasets with imbalanced classes and insufficient annotations. However, traditional augmentation techniques cannot supply extra information, leaving diagnostic performance unsatisfactory. GAN-based generative methods have thus been proposed to obtain additional useful information for more effective data augmentation; but existing generative data augmentation techniques mainly encounter two problems: (i) current generative data augmentation lacks the capability to use cross-domain differential information to extend limited datasets; (ii) existing generative methods cannot provide effective supervised information in medical image segmentation tasks. To solve these problems, we propose an attention-guided cross-domain tumor image generation model (CDA-GAN) with an information enhancement strategy. CDA-GAN can generate diverse samples that expand the scale of datasets, improving the performance of medical image diagnosis and treatment tasks. In particular, we incorporate channel attention into a CycleGAN-based cross-domain generation network that captures inter-domain information and generates positive or negative samples of brain tumors. In addition, we propose a semi-supervised spatial attention strategy to guide the spatial information of features at the pixel level during tumor generation. Furthermore, we add spectral normalization to prevent discriminator mode collapse and stabilize the training procedure. Finally, to resolve an inapplicability problem in the segmentation task, we further propose an application strategy that uses this data augmentation model to achieve more accurate medical image segmentation with limited data. Experimental studies on two public brain tumor datasets (BraTS and TCIA) show that the proposed CDA-GAN model greatly outperforms state-of-the-art generative data augmentation in both practical medical image classification and segmentation tasks; e.g., CDA-GAN is 0.50%, 1.72%, 2.05%, and 0.21% better than the best SOTA baseline in terms of ACC, AUC, Recall, and F1, respectively, in the classification task on BraTS, while its improvements w.r.t. the best SOTA baseline in terms of Dice, Sens, HD95, and mIOU in the segmentation task on TCIA are 2.50%, 0.90%, 14.96%, and 4.18%, respectively.
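Channel attention of the squeeze-and-excitation kind mentioned here can be sketched compactly in PyTorch; the reduction ratio and block placement are assumptions, and the final line shows how spectral normalization is typically attached to a discriminator layer.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style channel attention of the kind inserted into a generator."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))        # squeeze: global average pooling
        return x * w.view(b, c, 1, 1)          # excite: rescale each channel

y = ChannelAttention(64)(torch.rand(2, 64, 16, 16))

# Spectral normalization on a discriminator layer is a one-liner in PyTorch:
disc_layer = nn.utils.spectral_norm(nn.Conv2d(64, 128, 4, stride=2, padding=1))
```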
Affiliation(s)
- Zhenghua Xu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Jiaqi Tang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Chang Qi
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Institute of Logic and Computation, Vienna University of Technology, Vienna, Austria
- Dan Yao
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Caihua Liu
- College of Computer Science and Technology, Civil Aviation University of China, Tianjin, China
- Yuefu Zhan
- Department of Radiology, Hainan Women and Children's Medical Center, Haikou, China
- Thomas Lukasiewicz
- Institute of Logic and Computation, Vienna University of Technology, Vienna, Austria
- Department of Computer Science, University of Oxford, Oxford, United Kingdom
7
Guo H, Liu H, Zhu H, Li M, Yu H, Zhu Y, Chen X, Xu Y, Gao L, Zhang Q, Shentu Y. Exploring a novel HE image segmentation technique for glioblastoma: A hybrid slime mould and differential evolution approach. Comput Biol Med 2024; 168:107653. [PMID: 37984200; DOI: 10.1016/j.compbiomed.2023.107653]
Abstract
Glioblastoma is a primary brain tumor with high incidence and mortality rates, posing a significant threat to human health, and providing diagnostic assistance for its management is crucial. Among image processing approaches, multi-threshold image segmentation (MIS) is considered the most efficient and intuitive method. In recent years, many scholars have combined different metaheuristic algorithms with MIS to improve the quality of image segmentation (IS). The slime mould algorithm (SMA) is a metaheuristic approach inspired by the foraging behavior of slime mould populations in nature. In this investigation, we introduce a hybridized variant named BDSMA, aimed at overcoming the inherent limitations of the original algorithm: inadequate exploitation capacity and a tendency to converge prematurely to local optima on complex multidimensional problems. To bolster the algorithm's optimization prowess, we integrate the original algorithm with a robust exploitative operator, differential evolution (DE). Additionally, we introduce a strategy for handling solutions that exceed the search boundaries. The incorporation of an advanced cooperative mixing model accelerates the convergence of BDSMA, refining its precision and preventing it from becoming trapped in local optima. To substantiate the effectiveness of the proposed approach, we conduct a comprehensive series of comparative experiments on 30 benchmark functions. The results demonstrate the superiority of our method in terms of both convergence speed and precision. Moreover, we propose a MIS technique and employ it in experiments on IS at both low and high threshold levels. The effectiveness of the BDSMA-based MIS technique is further showcased through its successful application to HE-stained medical images of brain glioblastoma. The evaluation of these experimental outcomes using image quality metrics conclusively underscores the exceptional efficacy of the proposed algorithm.
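The differential evolution operator that BDSMA hybridizes into SMA, together with boundary clipping, can be sketched in NumPy as follows; the DE/rand/1/bin variant, parameter values, and dummy objective are illustrative assumptions rather than the paper's exact configuration.

```python
import numpy as np

def de_step(pop: np.ndarray, fitness, F=0.5, CR=0.9, lb=0.0, ub=255.0):
    """One DE/rand/1/bin generation over an (n, dim) population (minimization)."""
    n, dim = pop.shape
    new_pop = pop.copy()
    for i in range(n):
        r1, r2, r3 = np.random.choice([j for j in range(n) if j != i], 3, replace=False)
        mutant = pop[r1] + F * (pop[r2] - pop[r3])      # differential mutation
        cross = np.random.rand(dim) < CR                # binomial crossover mask
        cross[np.random.randint(dim)] = True            # guarantee one crossed gene
        trial = np.where(cross, mutant, pop[i])
        trial = np.clip(trial, lb, ub)                  # boundary handling
        if fitness(trial) < fitness(pop[i]):            # greedy selection
            new_pop[i] = trial
    return new_pop

# Toy use: search 3 gray-level thresholds in [0, 255] under a dummy objective.
pop = np.random.uniform(0, 255, size=(20, 3))
pop = de_step(pop, fitness=lambda x: np.sum((x - 128) ** 2))
```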
Affiliation(s)
- Hongliang Guo
- College of Information Technology, Jilin Agricultural University, Changchun 130118, China
- Hanbo Liu
- College of Information Technology, Jilin Agricultural University, Changchun 130118, China
- Hong Zhu
- Department of Pathology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Mingyang Li
- College of Information Technology, Jilin Agricultural University, Changchun 130118, China
- Helong Yu
- College of Information Technology, Jilin Agricultural University, Changchun 130118, China
- Yun Zhu
- Department of Pathology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Xiaoxiao Chen
- Department of Pathology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Yujia Xu
- Department of Pathology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Lianxing Gao
- College of Engineering and Technology, Jilin Agricultural University, Changchun 130118, China
- Qiongying Zhang
- Department of Pathology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Yangping Shentu
- Department of Pathology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
8
Deng Z, Huang G, Yuan X, Zhong G, Lin T, Pun CM, Huang Z, Liang Z. QMLS: quaternion mutual learning strategy for multi-modal brain tumor segmentation. Phys Med Biol 2023; 69:015014. [PMID: 38061066; DOI: 10.1088/1361-6560/ad135e]
Abstract
Objective. Due to non-invasive imaging and the multimodality of magnetic resonance imaging (MRI), MRI-based multi-modal brain tumor segmentation (MBTS) studies have attracted more and more attention in recent years. With the great success of convolutional neural networks in various computer vision tasks, many MBTS models have been proposed to address the technical challenges of MBTS. However, data collection is usually limited in MBTS tasks, making it difficult for existing studies to fully explore multi-modal MRI images and mine the complementary information among different modalities. Approach. We propose a novel quaternion mutual learning strategy (QMLS), which consists of a voxel-wise lesion knowledge mutual learning mechanism (VLKML mechanism) and a quaternion multi-modal feature learning module (QMFL module). Specifically, the VLKML mechanism allows the networks to converge to a robust minimum so that aggressive data augmentation techniques can be applied to fully expand the limited data. In particular, the quaternion-valued QMFL module treats different modalities as components of quaternions to sufficiently learn complementary information among different modalities in the hypercomplex domain while reducing the number of parameters by about 75%. Main results. Extensive experiments on the BraTS 2020 and BraTS 2019 datasets indicate that QMLS achieves superior results to current popular methods with less computational cost. Significance. We propose a novel brain tumor segmentation algorithm that achieves better performance with fewer parameters, which facilitates the clinical application of automatic brain tumor segmentation.
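The hypercomplex treatment of modalities can be made concrete with a small PyTorch sketch: four MRI modalities act as the four components of a quaternion, mixed by a Hamilton-product convolution. The Hamilton product below follows one standard convention; the layer sizes and wiring are illustrative assumptions.

```python
import torch
import torch.nn as nn

class QuaternionConv(nn.Module):
    """Quaternion convolution: four weight components shared via the Hamilton product.
    A real conv mapping 4*in_c -> 4*out_c would need 16 weight blocks; this uses 4,
    which is where the roughly 75% parameter reduction comes from."""
    def __init__(self, in_c: int, out_c: int):
        super().__init__()
        self.wr = nn.Conv2d(in_c, out_c, 3, padding=1, bias=False)
        self.wx = nn.Conv2d(in_c, out_c, 3, padding=1, bias=False)
        self.wy = nn.Conv2d(in_c, out_c, 3, padding=1, bias=False)
        self.wz = nn.Conv2d(in_c, out_c, 3, padding=1, bias=False)

    def forward(self, r, x, y, z):  # e.g. r=T1, x=T1ce, y=T2, z=FLAIR features
        return (
            self.wr(r) - self.wx(x) - self.wy(y) - self.wz(z),
            self.wx(r) + self.wr(x) + self.wz(y) - self.wy(z),
            self.wy(r) - self.wz(x) + self.wr(y) + self.wx(z),
            self.wz(r) + self.wy(x) - self.wx(y) + self.wr(z),
        )

q = [torch.rand(1, 8, 32, 32) for _ in range(4)]
out = QuaternionConv(8, 16)(*q)   # four mixed feature maps, one per component
```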
Affiliation(s)
- Zhengnan Deng
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006, People's Republic of China
- Guoheng Huang
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006, People's Republic of China
- Xiaochen Yuan
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, People's Republic of China
- Guo Zhong
- School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, 510006, People's Republic of China
- Tongxu Lin
- School of Automation, Guangdong University of Technology, Guangzhou, 510006, People's Republic of China
- Chi-Man Pun
- Department of Computer and Information Science, University of Macau, Macao, People's Republic of China
- Zhixin Huang
- Department of Neurology, Guangdong Second Provincial General Hospital, Guangzhou, 510317, People's Republic of China
- Zhixin Liang
- Department of Nuclear Medicine, Jinshazhou Hospital, Guangzhou University of Chinese Medicine, Guangzhou, 510168, People's Republic of China
9
Yu M, Guo M, Zhang S, Zhan Y, Zhao M, Lukasiewicz T, Xu Z. RIRGAN: An end-to-end lightweight multi-task learning method for brain MRI super-resolution and denoising. Comput Biol Med 2023; 167:107632. [PMID: 39491379; DOI: 10.1016/j.compbiomed.2023.107632]
Abstract
A common problem in deep-learning-based low-level vision for medical images is that most research follows single-task learning (STL), addressing either low resolution or high noise, but not both. Our motivation is to design a model that performs super-resolution (SR) and denoising (DN) simultaneously, to cope with the realistic situation in which low-level vision medical images are both low-resolution and noisy. By improving an existing single image super-resolution (SISR) network and introducing the idea of multi-task learning (MTL), we propose RIRGAN, an end-to-end lightweight MTL network based on a generative adversarial network (GAN) that uses residual-in-residual blocks (RIR-Blocks) for feature extraction and can accomplish SR and DN tasks concurrently. The generator in RIRGAN is composed of several residual groups with a long skip connection (LSC), which helps form a very deep network and lets the network focus on learning high-frequency (HF) information. Introducing a discriminator based on a relativistic average discriminator (RaD) greatly improves the discriminator's ability and gives the generated images more realistic details. Meanwhile, a hybrid loss function not only ensures that RIRGAN has MTL ability but also lets RIRGAN balance quantitative evaluation metrics against qualitative human visual assessment. Experimental results show that the restored images of RIRGAN are superior to those of STL-based SR and DN methods in both subjective perception and objective evaluation metrics when processing low-level vision medical images. RIRGAN is thus better aligned with the practical requirements of medical practice.
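The residual-in-residual pattern with a long skip connection is easy to sketch in PyTorch; the block counts and widths below are illustrative assumptions, not RIRGAN's exact generator.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
            nn.Conv2d(c, c, 3, padding=1),
        )
    def forward(self, x):
        return x + self.body(x)             # short (inner) residual

class RIRGroup(nn.Module):
    def __init__(self, c, n_blocks=4):
        super().__init__()
        self.blocks = nn.Sequential(*[ResBlock(c) for _ in range(n_blocks)])
    def forward(self, x):
        return x + self.blocks(x)           # residual wrapped around residuals

class RIRBackbone(nn.Module):
    def __init__(self, c=64, n_groups=3):
        super().__init__()
        self.groups = nn.Sequential(*[RIRGroup(c) for _ in range(n_groups)])
    def forward(self, x):
        # LSC: the identity path carries low-frequency content, so the deep
        # stack can concentrate on learning high-frequency detail.
        return x + self.groups(x)

y = RIRBackbone()(torch.rand(1, 64, 24, 24))
```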
Affiliation(s)
- Miao Yu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Miaomiao Guo
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Shuai Zhang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Yuefu Zhan
- Department of Radiology, Hainan Women and Children's Medical Center, Haikou, China
- Mingkang Zhao
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Thomas Lukasiewicz
- Institute of Logic and Computation, Vienna University of Technology, Vienna, Austria
- Department of Computer Science, University of Oxford, Oxford, United Kingdom
- Zhenghua Xu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
10
Brémond Martin C, Simon Chane C, Clouchoux C, Histace A. Mu-Net: A Light Architecture for Small Dataset Segmentation of Brain Organoid Bright-Field Images. Biomedicines 2023; 11:2687. [PMID: 37893062; PMCID: PMC10603975; DOI: 10.3390/biomedicines11102687]
Abstract
Brain organoids (BOs) are cultures that replicate some early physiological or pathological developments of the human brain, and their shapes are usually extracted manually to characterize their growth. Due to their novelty, only small datasets of these images are available, whereas segmenting the organoid shape automatically with deep learning (DL) tools requires a larger number of images. Light U-Net segmentation architectures, which reduce training time while increasing sensitivity on small input datasets, have recently emerged. We further reduce the U-Net architecture and compare the proposed architecture (MU-Net) with U-Net and UNet-Mini on bright-field images of BOs using several data augmentation strategies. In each case, we perform leave-one-out cross-validation on 40 original and 40 synthesized images produced with an optimized adversarial autoencoder (AAE), or on 40 transformed images. The best results are achieved with U-Net trained on the optimized augmentation. However, our novel method, MU-Net, is more robust: it achieves nearly as accurate segmentation results regardless of the dataset used for training (various AAEs or transformation-based augmentation). In this study, we confirm that small datasets of BOs can be segmented with a light U-Net method almost as accurately as with the original method.
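A pared-down U-Net of the kind compared in this study can be sketched in a few lines of PyTorch; the depth and filter counts below are illustrative assumptions, not the exact MU-Net configuration.

```python
import torch
import torch.nn as nn

def block(ic, oc):
    return nn.Sequential(nn.Conv2d(ic, oc, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(oc, oc, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    """Two-level U-Net with few filters, for small bright-field datasets."""
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = block(1, 8), block(8, 16)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(16, 8, 2, stride=2)
        self.dec1 = block(16, 8)
        self.head = nn.Conv2d(8, 1, 1)        # binary organoid mask logits

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))  # skip connection
        return self.head(d1)

print(TinyUNet()(torch.rand(1, 1, 64, 64)).shape)  # torch.Size([1, 1, 64, 64])
```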
Affiliation(s)
- Clara Brémond Martin
- ETIS Laboratory UMR 8051 (CY Cergy Paris Université, ENSEA, CNRS), 6 Avenue du Ponceau, 95000 Cergy, France
- Witsee, 33 Ave. des Champs-Élysées, 75008 Paris, France
- Camille Simon Chane
- ETIS Laboratory UMR 8051 (CY Cergy Paris Université, ENSEA, CNRS), 6 Avenue du Ponceau, 95000 Cergy, France
- Aymeric Histace
- ETIS Laboratory UMR 8051 (CY Cergy Paris Université, ENSEA, CNRS), 6 Avenue du Ponceau, 95000 Cergy, France
11
Xu Z, Zhang X, Zhang H, Liu Y, Zhan Y, Lukasiewicz T. EFPN: Effective medical image detection using feature pyramid fusion enhancement. Comput Biol Med 2023; 163:107149. [PMID: 37348265; DOI: 10.1016/j.compbiomed.2023.107149]
Abstract
Feature pyramid networks (FPNs) are widely used in existing deep detection models to help them exploit multi-scale features. However, FPN-based deep detection models face two multi-scale feature fusion problems in medical image detection tasks: insufficient multi-scale feature fusion, and assigning the same importance to multi-scale features. Therefore, in this work, we propose a new enhanced backbone model, EFPN, to overcome these problems and help existing FPN-based detection models achieve much better medical image detection performance. We first introduce an additional top-down pyramid to help the detection networks fuse deeper multi-scale information; then, a scale enhancement module is developed that uses kernels of different sizes to generate more diverse multi-scale features. Finally, we propose a feature fusion attention module to estimate and assign different importance weights to features with different depths and scales. Extensive experiments are conducted on two public lesion detection datasets of different medical image modalities (X-ray and MRI). On the mAP and mR evaluation metrics, EFPN-based Faster R-CNNs improve by 1.55% and 4.3% on the PenD (X-ray) dataset, and by 2.74% and 3.1% on the BraTS (MRI) dataset, respectively, achieving much better performance than the state-of-the-art baselines in medical image detection tasks. The proposed three improvements are all essential and effective for EFPN to achieve superior performance; and besides Faster R-CNNs, EFPN can be easily applied to other deep models to significantly enhance their performance in medical image detection tasks.
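The feature fusion attention idea, assigning learned importance weights to pyramid levels, can be sketched as follows in PyTorch; resizing to a common scale and the softmax-weighted sum are illustrative assumptions, not the exact EFPN module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionAttention(nn.Module):
    """Weight pyramid levels by learned importance, then fuse them."""
    def __init__(self, channels: int):
        super().__init__()
        # One scalar importance score per pyramid level, from pooled features.
        self.score = nn.Linear(channels, 1)

    def forward(self, feats):                 # list of (B, C, Hi, Wi) maps
        size = feats[0].shape[-2:]
        resized = [F.interpolate(f, size=size, mode="nearest") for f in feats]
        pooled = torch.stack([f.mean(dim=(2, 3)) for f in resized], dim=1)  # (B, L, C)
        w = torch.softmax(self.score(pooled), dim=1)       # (B, L, 1) level weights
        stacked = torch.stack(resized, dim=1)              # (B, L, C, H, W)
        return (w.unsqueeze(-1).unsqueeze(-1) * stacked).sum(dim=1)

levels = [torch.rand(1, 64, s, s) for s in (32, 16, 8)]
fused = FusionAttention(64)(levels)           # (1, 64, 32, 32)
```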
Affiliation(s)
- Zhenghua Xu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin, China
- Xudong Zhang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin, China
- Hexiang Zhang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin, China
- Yunxin Liu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin, China
- Yuefu Zhan
- Department of Radiology, Hainan Women and Children's Medical Center, Haikou, China
- Thomas Lukasiewicz
- Institute of Logic and Computation, TU Wien, Vienna, Austria
- Department of Computer Science, University of Oxford, Oxford, United Kingdom