1
Ince S, Kunduracioglu I, Algarni A, Bayram B, Pacal I. Deep learning for cerebral vascular occlusion segmentation: A novel ConvNeXtV2 and GRN-integrated U-Net framework for diffusion-weighted imaging. Neuroscience 2025; 574:42-53. [PMID: 40204150] [DOI: 10.1016/j.neuroscience.2025.04.010]
Abstract
Cerebral vascular occlusion is a serious condition that can lead to stroke and permanent neurological damage because insufficient oxygen and nutrients reach brain tissue. Early diagnosis and accurate segmentation are critical for effective treatment planning. Owing to its high soft-tissue contrast, Magnetic Resonance Imaging (MRI) is commonly used to detect these occlusions and their consequences, such as ischemic stroke. However, low contrast, noise, and heterogeneous lesion structures in MRI images complicate manual segmentation and often lead to misinterpretation. Deep learning-based Computer-Aided Diagnosis (CAD) systems are therefore essential for faster and more accurate diagnosis and treatment, although they can face challenges such as high computational cost and difficulty segmenting small or irregular lesions. This study proposes a novel U-Net architecture enhanced with ConvNeXtV2 blocks and GRN-based Multi-Layer Perceptrons (MLPs) to address these challenges in cerebral vascular occlusion segmentation; it is the first application of ConvNeXtV2 in this domain. The proposed model significantly improves segmentation accuracy, even in low-contrast regions, while maintaining the high computational efficiency that is crucial for real-world clinical applications. To reduce false positives and improve overall accuracy, small lesions (≤5 pixels) were removed in a preprocessing step with the support of expert clinicians. Experimental results on the ISLES 2022 dataset showed superior performance, with an Intersection over Union (IoU) of 0.8015 and a Dice coefficient of 0.8894. Comparative analyses indicate that the proposed model achieves higher segmentation accuracy than existing U-Net variants and other methods, offering a promising solution for clinical use.
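As a rough illustration of the preprocessing and metrics described above (function names, connectivity, and dtype handling are our assumptions, not the authors' code), the ≤5-pixel lesion filter and the IoU/Dice overlap measures can be sketched with `scipy.ndimage` and NumPy:

```python
import numpy as np
from scipy import ndimage

def remove_small_lesions(mask, max_size=5):
    """Drop connected components of `max_size` pixels or fewer (illustrative)."""
    labeled, n = ndimage.label(mask)                       # default 4-connectivity
    sizes = ndimage.sum(mask, labeled, range(1, n + 1))    # pixels per component
    keep_labels = np.where(sizes > max_size)[0] + 1        # labels large enough to keep
    return np.isin(labeled, keep_labels).astype(mask.dtype)

def iou_and_dice(pred, gt):
    """Overlap metrics between two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union, 2 * inter / (pred.sum() + gt.sum())
```

Note that Dice and IoU are monotonically related (Dice = 2·IoU / (1 + IoU)), which is why papers often report both.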
Affiliation(s)
- Suat Ince
- Department of Radiology, University of Health Sciences, Van Education and Research Hospital, 65000 Van, Turkey.
- Ismail Kunduracioglu
- Department of Computer Engineering, Faculty of Engineering, Igdir University, 76000 Igdir, Turkey.
- Ali Algarni
- Department of Informatics and Computer Systems, College of Computer Science, King Khalid University, Abha 61421, Saudi Arabia.
- Bilal Bayram
- Department of Neurology, University of Health Sciences, Van Education and Research Hospital, 65000 Van, Turkey.
- Ishak Pacal
- Department of Computer Engineering, Faculty of Engineering, Igdir University, 76000 Igdir, Turkey; Department of Electronics and Information Technologies, Faculty of Architecture and Engineering, Nakhchivan State University, AZ 7012 Nakhchivan, Azerbaijan.
2
Kuang H, Wang Y, Tan X, Yang J, Sun J, Liu J, Qiu W, Zhang J, Zhang J, Yang C, Wang J, Chen Y. LW-CTrans: A lightweight hybrid network of CNN and Transformer for 3D medical image segmentation. Med Image Anal 2025; 102:103545. [PMID: 40107117] [DOI: 10.1016/j.media.2025.103545]
Abstract
Recent models based on convolutional neural networks (CNNs) and Transformers have achieved promising performance for 3D medical image segmentation. However, these methods segment small targets poorly even with large numbers of parameters. We therefore design a novel lightweight hybrid network, LW-CTrans, that combines the strengths of CNNs and Transformers and boosts global and local representation capability at different stages. Specifically, we first design a dynamic stem that can accommodate images of various resolutions. In the first stage of the hybrid encoder, to capture local features with fewer parameters, we propose a multi-path convolution (MPConv) block. In the middle stages of the hybrid encoder, to learn global and local features simultaneously, we propose a multi-view pooling based Transformer (MVPFormer), which projects the 3D feature map onto three 2D subspaces to deal with small objects, and use the MPConv block to enhance local representation learning. In the final stage, to capture mostly global features, only the proposed MVPFormer is used. Finally, to reduce the parameters of the decoder, we propose a multi-stage feature fusion module. Extensive experiments on three public datasets covering three tasks (stroke lesion segmentation, pancreatic cancer segmentation, and brain tumor segmentation) show that the proposed LW-CTrans achieves Dice scores of 62.35±19.51%, 64.69±20.58% and 83.75±15.77%, respectively, outperforming 16 state-of-the-art methods, with parameter counts (2.08M, 2.14M and 2.21M, respectively) smaller than those of non-lightweight 3D methods and close to those of lightweight methods. LW-CTrans also achieves the best performance for small lesion segmentation.
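The core idea behind MVPFormer's multi-view pooling — projecting a 3D feature map onto three 2D subspaces — can be sketched as axis-wise average pooling (a minimal illustration of the projection only; the published block is considerably more elaborate, and the pooling choice here is our assumption):

```python
import numpy as np

def multi_view_pool(feat):
    """Project a (C, D, H, W) feature map onto three 2D views by
    average-pooling along each spatial axis (illustrative sketch)."""
    axial    = feat.mean(axis=1)   # collapse depth  -> (C, H, W)
    coronal  = feat.mean(axis=2)   # collapse height -> (C, D, W)
    sagittal = feat.mean(axis=3)   # collapse width  -> (C, D, H)
    return axial, coronal, sagittal
```

Each 2D projection can then be processed with far cheaper 2D attention than full 3D self-attention, which is where the parameter savings come from.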
Affiliation(s)
- Hulin Kuang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410000, China
- Yahui Wang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410000, China
- Xianzhen Tan
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410000, China
- Jialin Yang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410000, China
- Jiarui Sun
- School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, Nanjing 210096, China
- Jin Liu
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410000, China
- Wu Qiu
- School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430000, China
- Jingyang Zhang
- School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, Nanjing 210096, China
- Jiulou Zhang
- Department of Radiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, 210096, China; Lab for Artificial Intelligence in Medical Imaging (LAIMI), School of Medical Imaging, Nanjing Medical University, Nanjing, 210096, China
- Chunfeng Yang
- School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, Nanjing 210096, China.
- Jianxin Wang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410000, China; Xinjiang Engineering Research Center of Big Data and Intelligent Software, School of Software, Xinjiang University, Urumqi, 830091, Xinjiang, China.
- Yang Chen
- School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, Nanjing 210096, China
3
Bao R, Weiss RJ, Bates SV, Song Y, He S, Li J, Bjornerud A, Hirschtick RL, Grant PE, Ou Y. PARADISE: Personalized and regional adaptation for HIE disease identification and segmentation. Med Image Anal 2025; 102:103419. [PMID: 40147073] [DOI: 10.1016/j.media.2024.103419]
Abstract
Hypoxic ischemic encephalopathy (HIE) is a brain dysfunction occurring in approximately 1-5 per 1000 term-born neonates. Accurate segmentation of HIE lesions in brain MRI is crucial for prognosis and diagnosis but presents a unique challenge because of the diffuse and small nature of these abnormalities, which has resulted in a substantial gap between the performance of machine learning-based segmentation methods and clinical expert annotations for HIE. To address this challenge, we introduce ParadiseNet, an algorithm specifically designed for HIE lesion segmentation. ParadiseNet incorporates global-local learning, progressive uncertainty learning, and self-evolution learning modules, all inspired by clinical interpretation of neonatal brain MRIs. These modules target unbalanced data distribution, boundary uncertainty, and imprecise lesion detection, respectively. Extensive experiments demonstrate that ParadiseNet significantly enhances small lesion (<1%) detection accuracy in HIE, achieving an improvement of over 4% in Dice and over 6% in NSD compared with U-Net and other general medical image segmentation algorithms.
Affiliation(s)
- Rina Bao
- Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA.
- Sheng He
- Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
- Jingpeng Li
- Boston Children's Hospital, Boston, MA, USA; Oslo University Hospital; University of Oslo, Norway
- Randy L Hirschtick
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, USA; Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
- P Ellen Grant
- Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
- Yangming Ou
- Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA.
4
Kabir MM, Rahman A, Hasan MN, Mridha MF. Computer vision algorithms in healthcare: Recent advancements and future challenges. Comput Biol Med 2025; 185:109531. [PMID: 39675214] [DOI: 10.1016/j.compbiomed.2024.109531]
Abstract
Computer vision has emerged as a promising technology with numerous applications in healthcare. This systematic review provides an overview of advancements and challenges associated with computer vision in healthcare. The review highlights the application areas where computer vision has made significant strides, including medical imaging, surgical assistance, remote patient monitoring, and telehealth. Additionally, it addresses the challenges related to data quality, privacy, model interpretability, and integration with existing healthcare systems. Ethical and legal considerations, such as patient consent and algorithmic bias, are also discussed. The review concludes by identifying future directions and opportunities for research, emphasizing the potential impact of computer vision on healthcare delivery and outcomes. Overall, this systematic review underscores the importance of understanding both the advancements and challenges in computer vision to facilitate its responsible implementation in healthcare.
Affiliation(s)
- Md Mohsin Kabir
- School of Innovation, Design and Engineering, Mälardalens University, Västerås, 722 20, Sweden.
- Ashifur Rahman
- Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Mirpur-2, Dhaka, 1216, Bangladesh.
- Md Nahid Hasan
- Department of Computer Science, University of Wisconsin-Milwaukee, Milwaukee, WI 53211, United States.
- M F Mridha
- Department of Computer Science, American International University-Bangladesh, Dhaka, 1229, Dhaka, Bangladesh.
5
Zhong S, Wang W, Feng Q, Zhang Y, Ning Z. Cross-view discrepancy-dependency network for volumetric medical image segmentation. Med Image Anal 2025; 99:103329. [PMID: 39236632] [DOI: 10.1016/j.media.2024.103329]
Abstract
Limited data poses a crucial challenge for deep learning-based volumetric medical image segmentation, and many methods try to represent the volume by its subvolumes (i.e., multi-view slices) to alleviate this issue. However, such methods generally sacrifice inter-slice spatial continuity. A promising avenue is to incorporate multi-view information into the network to enhance volume representation learning, but most existing studies tend to overlook the discrepancy and dependency across different views, ultimately limiting the potential of multi-view representations. To this end, we propose a cross-view discrepancy-dependency network (CvDd-Net) for volumetric medical image segmentation, which exploits multi-view slice priors to assist volume representation learning and explores view discrepancy and view dependency to improve performance. Specifically, we develop a discrepancy-aware morphology reinforcement (DaMR) module that effectively learns view-specific representations by mining morphological information (i.e., object boundary and position). Besides, we design a dependency-aware information aggregation (DaIA) module that adequately harnesses the multi-view slice priors, enhancing individual view representations of the volume and integrating them based on cross-view dependency. Extensive experiments on four medical image datasets (Thyroid, Cervix, Pancreas, and Glioma) demonstrate the efficacy of the proposed method on both fully-supervised and semi-supervised tasks.
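The multi-view slice decomposition that such methods build on amounts to slicing the volume along its three orthogonal axes; a generic sketch (not the CvDd-Net code, and the axis-to-anatomy naming is an assumption about orientation):

```python
import numpy as np

def orthogonal_views(vol):
    """Decompose a (D, H, W) volume into its three stacks of 2D slices."""
    axial    = [vol[k, :, :] for k in range(vol.shape[0])]  # D slices of (H, W)
    coronal  = [vol[:, k, :] for k in range(vol.shape[1])]  # H slices of (D, W)
    sagittal = [vol[:, :, k] for k in range(vol.shape[2])]  # W slices of (D, H)
    return axial, coronal, sagittal
```

The trade-off the abstract points at is visible here: each stack is cheap to process in 2D, but continuity across slices within a stack is lost unless the network reintroduces it.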
Affiliation(s)
- Shengzhou Zhong
- School of Biomedical Engineering, Southern Medical University, Guangzhou, Guangdong, 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Guangzhou, Guangdong, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China
- Wenxu Wang
- School of Biomedical Engineering, Southern Medical University, Guangzhou, Guangdong, 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Guangzhou, Guangdong, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China
- Qianjin Feng
- School of Biomedical Engineering, Southern Medical University, Guangzhou, Guangdong, 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Guangzhou, Guangdong, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China
- Yu Zhang
- School of Biomedical Engineering, Southern Medical University, Guangzhou, Guangdong, 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Guangzhou, Guangdong, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China.
- Zhenyuan Ning
- School of Biomedical Engineering, Southern Medical University, Guangzhou, Guangdong, 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Guangzhou, Guangdong, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China.
6
Osman YBM, Li C, Huang W, Wang S. Collaborative Learning for Annotation-Efficient Volumetric MR Image Segmentation. J Magn Reson Imaging 2024; 60:1604-1614. [PMID: 38156427] [DOI: 10.1002/jmri.29194]
Abstract
BACKGROUND Deep learning has presented great potential in accurate MR image segmentation when enough labeled data are provided for network optimization. However, manually annotating three-dimensional (3D) MR images is tedious and time-consuming, requiring experts with rich domain knowledge and experience. PURPOSE To build a deep learning method exploring sparse annotations, namely only a single two-dimensional slice label for each 3D training MR image. STUDY TYPE Retrospective. POPULATION Three-dimensional MR images of 150 subjects from two publicly available datasets were included. Among them, 50 (1377 image slices) are for prostate segmentation. The other 100 (8800 image slices) are for left atrium segmentation. Five-fold cross-validation experiments were carried out utilizing the first dataset. For the second dataset, 80 subjects were used for training and 20 were used for testing. FIELD STRENGTH/SEQUENCE 1.5 T and 3.0 T; axial T2-weighted and late gadolinium-enhanced, 3D respiratory navigated, inversion recovery prepared gradient echo pulse sequence. ASSESSMENT A collaborative learning method by integrating the strengths of semi-supervised and self-supervised learning schemes was developed. The method was trained using labeled central slices and unlabeled noncentral slices. Segmentation performance on testing set was reported quantitatively and qualitatively. STATISTICAL TESTS Quantitative evaluation metrics including boundary intersection-over-union (B-IoU), Dice similarity coefficient, average symmetric surface distance, and relative absolute volume difference were calculated. Paired t test was performed, and P < 0.05 was considered statistically significant. 
RESULTS Compared to fully supervised training with only the labeled central slice, and to mean teacher, uncertainty-aware mean teacher, deep co-training, interpolation consistency training (ICT), and ambiguity-consensus mean teacher, the proposed method achieved a substantial improvement in segmentation accuracy, increasing the mean B-IoU significantly by more than 10.0% for prostate segmentation (proposed method B-IoU: 70.3% ± 7.6% vs. ICT B-IoU: 60.3% ± 11.2%) and by more than 6.0% for left atrium segmentation (proposed method B-IoU: 66.1% ± 6.8% vs. ICT B-IoU: 60.1% ± 7.1%). DATA CONCLUSIONS A collaborative learning method trained using sparse annotations can segment the prostate and left atrium with high accuracy. LEVEL OF EVIDENCE 0 TECHNICAL EFFICACY: Stage 1.
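B-IoU, the headline metric here, restricts the overlap computation to a band around object boundaries so that boundary errors are not diluted by large interiors. One common formulation can be sketched as follows (the band `width` and the erosion/dilation construction are our assumptions; the study's exact definition may differ):

```python
import numpy as np
from scipy import ndimage

def boundary_band(mask, width=2):
    """Pixels within `width` of the mask boundary (dilation minus erosion)."""
    eroded = ndimage.binary_erosion(mask, iterations=width)
    dilated = ndimage.binary_dilation(mask, iterations=width)
    return dilated & ~eroded

def boundary_iou(pred, gt, width=2):
    """IoU restricted to the union of the two boundary bands."""
    band = boundary_band(pred, width) | boundary_band(gt, width)
    p, g = pred & band, gt & band
    inter = (p & g).sum()
    union = (p | g).sum()
    return inter / union if union else 1.0
```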
Affiliation(s)
- Yousuf Babiker M Osman
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
- Cheng Li
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Weijian Huang
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
- Peng Cheng Laboratory, Shenzhen, China
- Shanshan Wang
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Peng Cheng Laboratory, Shenzhen, China
7
Huang W, Li C, Yang H, Liu J, Liang Y, Zheng H, Wang S. Enhancing the vision-language foundation model with key semantic knowledge-emphasized report refinement. Med Image Anal 2024; 97:103299. [PMID: 39146702] [DOI: 10.1016/j.media.2024.103299]
Abstract
Recently, vision-language representation learning has made remarkable advancements in building medical foundation models, holding immense potential for transforming the landscape of clinical research and medical care. The underlying hypothesis is that the rich knowledge embedded in radiology reports can effectively assist and guide the learning process, reducing the need for additional labels. However, these reports tend to be complex and sometimes contain redundant descriptions that make it too challenging for representation learning to capture the key semantic information. This paper develops a novel iterative vision-language representation learning framework by proposing a key semantic knowledge-emphasized report refinement method. In particular, raw radiology reports are refined to highlight the key information according to a constructed clinical dictionary and two model-optimized knowledge-enhancement metrics. The iterative framework is designed to learn progressively, starting from a general understanding of the patient's condition based on raw reports and gradually refining and extracting the critical information essential to fine-grained analysis tasks. The effectiveness of the proposed framework is validated on various downstream medical image analysis tasks, including disease classification, region-of-interest segmentation, and phrase grounding. Our framework surpasses seven state-of-the-art methods in both fine-tuning and zero-shot settings, demonstrating its encouraging potential for different clinical applications.
Affiliation(s)
- Weijian Huang
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Peng Cheng Laboratory, Shenzhen 518066, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Cheng Li
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Hao Yang
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Peng Cheng Laboratory, Shenzhen 518066, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Jiarun Liu
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Peng Cheng Laboratory, Shenzhen 518066, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yong Liang
- Peng Cheng Laboratory, Shenzhen 518066, China
- Hairong Zheng
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Shanshan Wang
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
8
Zhang K, Zhu Y, Li H, Zeng Z, Liu Y, Zhang Y. MDANet: Multimodal difference aware network for brain stroke segmentation. Biomed Signal Process Control 2024; 95:106383. [DOI: 10.1016/j.bspc.2024.106383]
9
Zhu J, Bolsterlee B, Chow BVY, Song Y, Meijering E. Hybrid dual mean-teacher network with double-uncertainty guidance for semi-supervised segmentation of magnetic resonance images. Comput Med Imaging Graph 2024; 115:102383. [PMID: 38643551] [DOI: 10.1016/j.compmedimag.2024.102383]
Abstract
Semi-supervised learning has made significant progress in medical image segmentation. However, existing methods primarily utilize information from a single dimensionality, resulting in sub-optimal performance on challenging magnetic resonance imaging (MRI) data with multiple segmentation objects and anisotropic resolution. To address this issue, we present a Hybrid Dual Mean-Teacher (HD-Teacher) model with hybrid, semi-supervised, and multi-task learning to achieve effective semi-supervised segmentation. HD-Teacher employs a 2D and a 3D mean-teacher network to produce segmentation labels and signed distance fields from the hybrid information captured in both dimensionalities. This hybrid mechanism allows HD-Teacher to utilize features from 2D, 3D, or both dimensions as needed. Outputs from the 2D and 3D teacher models are dynamically combined based on confidence scores, forming a single hybrid prediction with estimated uncertainty. We propose a hybrid regularization module that encourages both student models to produce results close to the uncertainty-weighted hybrid prediction, further improving their feature extraction capability. Extensive experiments on binary and multi-class segmentation across three MRI datasets demonstrated that the proposed framework could (1) significantly outperform state-of-the-art semi-supervised methods, (2) surpass a fully-supervised VNet trained on substantially more annotated data, and (3) perform on par with human raters on a muscle and bone segmentation task. Code will be available at https://github.com/ThisGame42/Hybrid-Teacher.
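The confidence-based combination of 2D and 3D teacher outputs can be illustrated schematically. The weighting scheme and the per-voxel uncertainty proxy below are our simplifications for illustration, not the paper's exact formulation:

```python
import numpy as np

def fuse_predictions(p2d, p3d, c2d, c3d, eps=1e-8):
    """Confidence-weighted voxelwise fusion of two teacher probability maps.

    p2d, p3d: probability maps from the 2D and 3D teachers.
    c2d, c3d: per-voxel confidence maps in [0, 1] (assumed given).
    """
    w2d = c2d / (c2d + c3d + eps)          # normalized weight for the 2D teacher
    fused = w2d * p2d + (1.0 - w2d) * p3d  # hybrid prediction
    uncertainty = 1.0 - np.maximum(c2d, c3d)  # crude uncertainty proxy
    return fused, uncertainty
```

The fused map and its uncertainty could then weight a consistency loss so that students are pulled toward the hybrid prediction mainly where the teachers are confident.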
Affiliation(s)
- Jiayi Zhu
- School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia; Neuroscience Research Australia (NeuRA), Randwick, NSW 2031, Australia.
- Bart Bolsterlee
- Neuroscience Research Australia (NeuRA), Randwick, NSW 2031, Australia; Graduate School of Biomedical Engineering, University of New South Wales, Sydney, NSW 2052, Australia
- Brian V Y Chow
- Neuroscience Research Australia (NeuRA), Randwick, NSW 2031, Australia; School of Biomedical Sciences, University of New South Wales, Sydney, NSW 2052, Australia
- Yang Song
- School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia
- Erik Meijering
- School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia
10
Luo J, Dai P, He Z, Huang Z, Liao S, Liu K. Deep learning models for ischemic stroke lesion segmentation in medical images: A survey. Comput Biol Med 2024; 175:108509. [PMID: 38677171] [DOI: 10.1016/j.compbiomed.2024.108509]
Abstract
This paper provides a comprehensive review of deep learning models for ischemic stroke lesion segmentation in medical images. Ischemic stroke is a severe neurological disease and a leading cause of death and disability worldwide. Accurate segmentation of stroke lesions in medical images such as MRI and CT scans is crucial for diagnosis, treatment planning and prognosis. This paper first introduces common imaging modalities used for stroke diagnosis, discussing their capabilities in imaging lesions at different disease stages from the acute to chronic stage. It then reviews three major public benchmark datasets for evaluating stroke segmentation algorithms: ATLAS, ISLES and AISD, highlighting their key characteristics. The paper proceeds to provide an overview of foundational deep learning architectures for medical image segmentation, including CNN-based and transformer-based models. It summarizes recent innovations in adapting these architectures to the task of stroke lesion segmentation across the three datasets, analyzing their motivations, modifications and results. A survey of loss functions and data augmentations employed for this task is also included. The paper discusses various aspects related to stroke segmentation tasks, including prior knowledge, small lesions, and multimodal fusion, and then concludes by outlining promising future research directions. Overall, this comprehensive review covers critical technical developments in the field to support continued progress in automated stroke lesion segmentation.
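Among the loss functions such surveys cover, the combination of soft Dice and cross-entropy is a common baseline for stroke lesion segmentation because Dice handles the extreme foreground/background imbalance while cross-entropy stabilizes gradients. A minimal NumPy sketch (the 50/50 weighting is illustrative, not prescribed by the survey):

```python
import numpy as np

def dice_ce_loss(prob, target, w_dice=0.5, eps=1e-6):
    """Weighted sum of soft Dice loss and binary cross-entropy."""
    prob = np.clip(prob, eps, 1 - eps)                     # avoid log(0)
    inter = (prob * target).sum()
    dice = 1 - (2 * inter + eps) / (prob.sum() + target.sum() + eps)
    ce = -(target * np.log(prob) + (1 - target) * np.log(1 - prob)).mean()
    return w_dice * dice + (1 - w_dice) * ce
```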
Affiliation(s)
- Jialin Luo
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Peishan Dai
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China.
- Zhuang He
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Zhongchao Huang
- Department of Biomedical Engineering, School of Basic Medical Science, Central South University, Changsha, Hunan, China
- Shenghui Liao
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Kun Liu
- Brain Hospital of Hunan Province (The Second People's Hospital of Hunan Province), Changsha, Hunan, China
11
Zeng Q, Yu J, Hu Q, Yin K, Li Q, Huang J, Xie L, Wang J, Zhang C, Wang J, Zhang J, Feng Y. Investigation into white matter microstructure differences in visual training by using an automated fiber tract subclassification segmentation quantification method. Neurosci Lett 2024; 821:137574. [PMID: 38036084] [DOI: 10.1016/j.neulet.2023.137574]
Abstract
Visual training has emerged as a useful framework for investigating training-related brain plasticity; it is a highly complex task involving the interaction of visual orientation, attention, reasoning, and cognitive functions. However, the effect of long-term visual training on microstructural changes within white matter (WM) is poorly understood. Therefore, a set of visual training programs was designed, and automated fiber tract subclassification segmentation quantification based on diffusion magnetic resonance imaging was performed to capture the anatomical changes in the brains of visual trainees. First, 40 healthy matched participants were randomly assigned to a training group or a control group. The training group underwent 10 consecutive weeks of visual training. Then, the fiber tracts of the subjects were automatically identified and further classified into fiber clusters to determine differences between the two groups at a detailed scale. Next, each fiber cluster was divided into segments, enabling analysis of specific regions of a cluster. Lastly, the diffusion metrics of the two groups were comparatively analyzed to delineate the effects of visual training on WM microstructure. Our results showed significant differences between the training and control groups in fiber clusters of the cingulate bundle, thalamo-frontal tract, uncinate fasciculus, and corpus callosum. In addition, the training group exhibited lower mean fractional anisotropy and higher mean diffusivity and radial diffusivity than the control group. Therefore, long-term cognitive activities such as visual training may systematically influence the WM properties underlying cognition, attention, memory, and processing speed.
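The diffusion metrics compared in this study (fractional anisotropy, mean diffusivity, radial diffusivity) have standard closed-form definitions in terms of the three diffusion-tensor eigenvalues, which can be computed directly:

```python
import numpy as np

def dti_scalars(evals):
    """FA, MD, and RD from the three diffusion-tensor eigenvalues
    (l1 >= l2 >= l3), using the standard definitions."""
    l1, l2, l3 = evals
    md = (l1 + l2 + l3) / 3.0                      # mean diffusivity
    rd = (l2 + l3) / 2.0                           # radial diffusivity
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    fa = np.sqrt(1.5 * num / den)                  # fractional anisotropy in [0, 1]
    return fa, md, rd
```

Isotropic diffusion (equal eigenvalues) yields FA = 0, while fully anisotropic diffusion along one axis yields FA = 1, which is why lower FA alongside higher MD and RD is read as reduced directional coherence of the fiber bundles.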
Affiliation(s)
- Qingrun Zeng, Jiangli Yu, Qiming Hu, Jiahao Huang, Lei Xie, Jingqiang Wang, Chengzhe Zhang, Jiafeng Wang, Jiawei Zhang, Yuanjing Feng
- Institute of Information Processing and Automation, College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China; Zhejiang Provincial United Key Laboratory of Embedded Systems, Hangzhou 310023, China
- Kuiying Yin, Qixue Li
- Nanjing Research Institute of Electronic Technology, Nanjing 210012, China
12
Malik M, Chong B, Fernandez J, Shim V, Kasabov NK, Wang A. Stroke Lesion Segmentation and Deep Learning: A Comprehensive Review. Bioengineering (Basel) 2024; 11:86. [PMID: 38247963] [PMCID: PMC10813717] [DOI: 10.3390/bioengineering11010086]
Abstract
Stroke is a medical condition that affects around 15 million people annually. Patients and their families can face severe financial and emotional challenges, as stroke can cause motor, speech, cognitive, and emotional impairments. Stroke lesion segmentation identifies the stroke lesion visually while providing useful anatomical information. Although various computer-aided software tools are available for manual segmentation, state-of-the-art deep learning makes the task much easier. This review explores different deep-learning-based lesion segmentation models and the impact of different pre-processing techniques on their performance. It provides a comprehensive overview of state-of-the-art models, aiming to guide future research and contribute to the development of more robust and effective stroke lesion segmentation models.
Affiliation(s)
- Mishaim Malik
- Auckland Bioengineering Institute, The University of Auckland, Auckland 1010, New Zealand
- Benjamin Chong
- Auckland Bioengineering Institute, The University of Auckland, Auckland 1010, New Zealand
- Faculty of Medical and Health Sciences, The University of Auckland, Auckland 1010, New Zealand
- Centre for Brain Research, The University of Auckland, Auckland 1010, New Zealand
- Justin Fernandez
- Auckland Bioengineering Institute, The University of Auckland, Auckland 1010, New Zealand
- Centre for Brain Research, The University of Auckland, Auckland 1010, New Zealand
- Mātai Medical Research Institute, Gisborne 4010, New Zealand
- Vickie Shim
- Auckland Bioengineering Institute, The University of Auckland, Auckland 1010, New Zealand
- Mātai Medical Research Institute, Gisborne 4010, New Zealand
- Nikola Kirilov Kasabov
- Auckland Bioengineering Institute, The University of Auckland, Auckland 1010, New Zealand
- Knowledge Engineering and Discovery Research Innovation, School of Engineering, Computer and Mathematical Sciences, Auckland University of Technology, Auckland 1010, New Zealand
- Institute for Information and Communication Technologies, Bulgarian Academy of Sciences, 1113 Sofia, Bulgaria
- Knowledge Engineering Consulting Ltd., Auckland 1071, New Zealand
- Alan Wang
- Auckland Bioengineering Institute, The University of Auckland, Auckland 1010, New Zealand
- Faculty of Medical and Health Sciences, The University of Auckland, Auckland 1010, New Zealand
- Centre for Brain Research, The University of Auckland, Auckland 1010, New Zealand
- Mātai Medical Research Institute, Gisborne 4010, New Zealand
- Medical Imaging Research Centre, The University of Auckland, Auckland 1010, New Zealand
- Centre for Co-Created Ageing Research, The University of Auckland, Auckland 1010, New Zealand
13
Jiang Z, Wu Y, Huang L, Gu M. FDB-Net: Fusion double branch network combining CNN and transformer for medical image segmentation. J Xray Sci Technol 2024; 32:931-951. [PMID: 38848160] [DOI: 10.3233/xst-230413]
Abstract
BACKGROUND The rapid development of deep learning techniques has greatly improved the performance of medical image segmentation, and segmentation networks based on convolutional neural networks (CNNs) and Transformers have been widely used in this field. However, due to the restricted receptive field of the convolution operation and the limited local fine-detail extraction ability of the self-attention mechanism in Transformers, networks with a purely convolutional or purely Transformer backbone still perform poorly in medical image segmentation. METHODS In this paper, we propose FDB-Net (Fusion Double Branch Network), a double-branch medical image segmentation network combining a CNN and a Transformer. By using a CNN containing gnConv blocks and a Transformer containing Varied-Size Window Attention (VWA) blocks as the feature extraction backbone, the dual-path encoder gives the network a global receptive field as well as access to local detail features of the target. We also propose a new feature fusion module, Deep Feature Fusion (DFF), which fuses features from the two structurally different encoders during encoding, ensuring the effective fusion of global and local image information. CONCLUSION Our model achieves advanced results on all three typical medical image segmentation tasks, fully validating the effectiveness of FDB-Net.
Affiliation(s)
- Zhongchuan Jiang, Yun Wu, Lei Huang, Maohua Gu
- State Key Laboratory of Public Big Data, Guiyang, China
- College of Computer Science and Technology, Guizhou University, Guiyang, China
14
Osman YBM, Li C, Huang W, Wang S. Sparse annotation learning for dense volumetric MR image segmentation with uncertainty estimation. Phys Med Biol 2023; 69:015009. [PMID: 38035374] [DOI: 10.1088/1361-6560/ad111b]
Abstract
Objective. Training neural networks for pixel-wise or voxel-wise image segmentation is a challenging task that requires a considerable amount of training samples with highly accurate and densely delineated ground truth maps. This challenge becomes especially prominent in the medical imaging domain, where obtaining reliable annotations for training samples is a difficult, time-consuming, and expert-dependent process. Therefore, developing models that can perform well under conditions of limited annotated training data is desirable. Approach. In this study, we propose an innovative framework called the extremely sparse annotation neural network (ESA-Net) that learns from only the single central slice label for 3D volumetric segmentation, exploring both intra-slice pixel dependencies and inter-slice image correlations with uncertainty estimation. Specifically, ESA-Net consists of four specially designed components: (1) an intra-slice pixel dependency-guided pseudo-label generation module that exploits uncertainty in network predictions while generating pseudo-labels for unlabeled slices with temporal ensembling; (2) an inter-slice image correlation-constrained pseudo-label propagation module that propagates labels from the labeled central slice to unlabeled slices by self-supervised registration with rotation ensembling; (3) a pseudo-label fusion module that fuses the two sets of generated pseudo-labels with voxel-wise uncertainty guidance; and (4) a final segmentation network optimization module that makes final predictions with scoring-based label quantification. Main results. Extensive experimental validations were performed on two popular yet challenging magnetic resonance image segmentation tasks and compared to five state-of-the-art methods. Significance. Results demonstrate that our proposed ESA-Net consistently achieves better segmentation performance even under the extremely sparse annotation setting, highlighting its effectiveness in exploiting information from unlabeled data.
Affiliation(s)
- Yousuf Babiker M Osman
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, People's Republic of China
- University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China
- Cheng Li
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, People's Republic of China
- Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou 510080, People's Republic of China
- Weijian Huang
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, People's Republic of China
- University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China
- Peng Cheng Laboratory, Shenzhen 518066, People's Republic of China
- Shanshan Wang
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, People's Republic of China
- Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou 510080, People's Republic of China
- Peng Cheng Laboratory, Shenzhen 518066, People's Republic of China
15
Xu H, Xie H, Tan Q, Zhang Y. Meta semi-supervised medical image segmentation with label hierarchy. Health Inf Sci Syst 2023; 11:26. [PMID: 37325196] [PMCID: PMC10267083] [DOI: 10.1007/s13755-023-00222-1]
Abstract
Semi-supervised learning (SSL) has attracted increasing attention in medical image segmentation, where the mainstream usually explores perturbation-based consistency as a regularization to leverage unlabelled data. However, unlike directly optimizing segmentation task objectives, consistency regularization is a compromise by incorporating invariance towards perturbations, and inevitably suffers from noise in self-predicted targets. The above issues result in a knowledge gap between supervised guidance and unsupervised regularization. To bridge the knowledge gap, this work proposes a meta-based semi-supervised segmentation framework with the exploitation of label hierarchy. Two main prominent components named Divide and Generalize, and Label Hierarchy, are built in this work. Concretely, rather than merging all knowledge indiscriminately, we dynamically divide consistency regularization from supervised guidance as different domains. Then, a domain generalization technique is introduced with a meta-based optimization objective which ensures the update on supervised guidance should generalize to the consistency regularization, thereby bridging the knowledge gap. Furthermore, to alleviate the negative impact of noise in self-predicted targets, we propose to distill the noisy pixel-level consistency by exploiting label hierarchy and extracting hierarchical consistencies. Comprehensive experiments on two public medical segmentation benchmarks demonstrate the superiority of our framework to other semi-supervised segmentation methods, with new state-of-the-art results.
Collapse
Affiliation(s)
- Hai Xu, Hongtao Xie, Yongdong Zhang
- School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, Anhui, China
- Qingfeng Tan
- Cyberspace Institution of Advanced Technology, Guangzhou University, Guangzhou 511442, Guangdong, China
16
Li Y, Yang J, Yu T, Chi J, Liu F. Global attention-enabled texture enhancement network for MR image reconstruction. Magn Reson Med 2023; 90:1919-1931. [PMID: 37382206] [DOI: 10.1002/mrm.29785]
Abstract
PURPOSE Although recent convolutional neural network (CNN) methodologies have shown promising results in fast MR imaging, there is still a desire to explore how they can be used to learn the frequency characteristics of multicontrast images and reconstruct texture details. METHODS A global attention-enabled texture enhancement network (GATE-Net) with a frequency-dependent feature extraction module (FDFEM) and a convolution-based global attention module (GAM) is proposed to address the highly under-sampled MR image reconstruction problem. First, the FDFEM enables GATE-Net to effectively extract high-frequency features from the shareable information of multicontrast images to improve the texture details of reconstructed images. Second, the GAM, with lower computational complexity, has a receptive field covering the entire image, so it can fully exploit useful shareable information of multicontrast images while suppressing less beneficial information. RESULTS Ablation studies were conducted to evaluate the effectiveness of the proposed FDFEM and GAM. Experimental results under various acceleration rates and datasets consistently demonstrate the superiority of GATE-Net in terms of peak signal-to-noise ratio, structural similarity, and normalized mean square error. CONCLUSION A global attention-enabled texture enhancement network is proposed. It can be applied to multicontrast MR image reconstruction tasks with different acceleration rates and datasets, and it achieves superior performance in comparison with state-of-the-art methods.
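Two of the reconstruction metrics named above have simple closed forms. The sketch below assumes the peak value in PSNR is taken as the reference image's maximum (conventions vary across papers, so this is an assumption, not necessarily GATE-Net's exact definition); the arrays are toy illustration data:

```python
import numpy as np

def psnr(ref, rec):
    """Peak signal-to-noise ratio in dB; peak taken as ref.max()
    (one common convention; some papers fix the peak at 1.0 or 255)."""
    mse = np.mean((ref - rec) ** 2)
    return 10.0 * np.log10(ref.max() ** 2 / mse)

def nmse(ref, rec):
    """Normalized mean square error: ||ref - rec||^2 / ||ref||^2."""
    return np.sum((ref - rec) ** 2) / np.sum(ref ** 2)

ref = np.array([1.0, 2.0, 3.0])   # toy reference image
rec = ref + 0.1                   # toy reconstruction with uniform error
```

Lower NMSE and higher PSNR both indicate a reconstruction closer to the fully sampled reference.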
Affiliation(s)
- Yingnan Li, Teng Yu, Jieru Chi
- College of Electronics and Information, Qingdao University, Qingdao, Shandong, China
- Jie Yang
- College of Mechanical and Electrical Engineering, Qingdao University, Qingdao, Shandong, China
- Feng Liu
- School of Information Technology and Electrical Engineering, University of Queensland, Brisbane, Queensland, Australia
17
Wu Z, Zhang X, Li F, Wang S, Li J. TransRender: a transformer-based boundary rendering segmentation network for stroke lesions. Front Neurosci 2023; 17:1259677. [PMID: 37901438] [PMCID: PMC10601640] [DOI: 10.3389/fnins.2023.1259677]
Abstract
Vision transformer architectures attract widespread interest due to their robust representation of global features. As encoders, transformer-based methods achieve superior performance over convolutional neural networks and other popular networks in many medical image segmentation tasks. Due to the complex structure of the brain and the similar gray levels of healthy tissue and lesions, lesion segmentation suffers from over-smooth boundaries or inaccurate delineation. Existing methods, including transformers, use stacked convolutional layers as the decoder and treat each pixel uniformly as a grid cell, which is convenient for feature computation but often neglects the high-frequency features of the boundary while focusing excessively on region features. We propose an effective method for lesion boundary rendering called TransRender, which adaptively selects a series of important points and computes boundary features in a point-based rendering manner. A transformer-based encoder captures global information during the encoding stage, and several render modules efficiently map the encoded features of different levels back to the original spatial resolution by combining global and local features. Furthermore, a point-based loss function supervises the points generated by the render module, so that TransRender can continuously refine uncertain regions. We conducted substantial experiments on different stroke lesion segmentation datasets to demonstrate the efficiency of TransRender. Several evaluation metrics show that our method automatically segments stroke lesions with relatively high accuracy and low computational complexity.
Affiliation(s)
- Zelin Wu, Xueying Zhang, Fenglian Li, Suzhe Wang
- College of Electronic Information and Optical Engineering, Taiyuan University of Technology, Taiyuan, China
- Jiaying Li
- The First Clinical Medical College, Shanxi Medical University, Taiyuan, China
18
Ahmed R, Al Shehhi A, Hassan B, Werghi N, Seghier ML. An appraisal of the performance of AI tools for chronic stroke lesion segmentation. Comput Biol Med 2023; 164:107302. [PMID: 37572443] [DOI: 10.1016/j.compbiomed.2023.107302]
Abstract
Automated demarcation of stroke lesions from monospectral magnetic resonance imaging scans is extremely useful for diverse research and clinical applications, including lesion-symptom mapping to explain deficits and predict recovery. There is a significant surge of interest in the development of supervised artificial intelligence (AI) methods for that purpose, including deep learning, with performance comparable to trained experts. Such AI-based methods, however, require copious amounts of data. Thanks to the availability of large datasets, the development of AI-based methods for lesion segmentation has accelerated immensely in the last decade. One of these datasets is the Anatomical Tracings of Lesions After Stroke (ATLAS) dataset, which includes T1-weighted images from hundreds of chronic stroke survivors with their manually traced lesions. This systematic review offers an appraisal of the impact of the ATLAS dataset in promoting the development of AI-based segmentation of stroke lesions. An examination of all published studies that used the ATLAS dataset to both train and test their methods highlighted an overall moderate performance (median Dice index = 59.40%) and huge variability across studies in terms of data preprocessing, data augmentation, AI architecture, and mode of operation (two-dimensional versus three-dimensional methods). Perhaps most importantly, almost all AI tools were borrowed from existing computer vision architectures, with 90% of the selected studies relying on conventional convolutional neural network-based architectures. Overall, current research has not led to the development of robust AI architectures that can handle spatially heterogeneous lesion patterns. This review also highlights the difficulty of gauging the performance of AI tools in the presence of uncertainties in the definition of the ground truth.
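The Dice index reported throughout this appraisal, and the closely related intersection over union (IoU) used elsewhere in this list, are computed directly from overlapping binary masks. A minimal sketch with toy masks:

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    return 2.0 * inter / denom if denom else 1.0  # both empty: perfect match

def iou(a, b):
    """Intersection over union (Jaccard index) between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

pred  = np.array([1, 1, 0, 0])  # toy predicted mask
truth = np.array([1, 0, 1, 0])  # toy ground-truth mask
```

The two metrics are monotonically related (Dice = 2·IoU / (1 + IoU)), so they rank methods identically; Dice weights the overlap more generously.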
Affiliation(s)
- Ramsha Ahmed
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Aamna Al Shehhi
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates; Healthcare Engineering Innovation Center (HEIC), Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Bilal Hassan
- Department of Electrical Engineering and Computer Science, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Naoufel Werghi
- Healthcare Engineering Innovation Center (HEIC), Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates; Department of Electrical Engineering and Computer Science, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Mohamed L Seghier
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates; Healthcare Engineering Innovation Center (HEIC), Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
19
Zhao J, Xing Z, Chen Z, Wan L, Han T, Fu H, Zhu L. Uncertainty-Aware Multi-Dimensional Mutual Learning for Brain and Brain Tumor Segmentation. IEEE J Biomed Health Inform 2023; 27:4362-4372. [PMID: 37155398] [DOI: 10.1109/jbhi.2023.3274255]
Abstract
Existing segmentation methods for brain MRI data usually leverage 3D CNNs on 3D volumes or employ 2D CNNs on 2D image slices. We discovered that while volume-based approaches well respect spatial relationships across slices, slice-based methods typically excel at capturing fine local features. Furthermore, there is a wealth of complementary information between their segmentation predictions. Inspired by this observation, we develop an Uncertainty-aware Multi-dimensional Mutual learning framework to learn different dimensional networks simultaneously, each of which provides useful soft labels as supervision to the others, thus effectively improving the generalization ability. Specifically, our framework builds upon a 2D-CNN, a 2.5D-CNN, and a 3D-CNN, while an uncertainty gating mechanism is leveraged to facilitate the selection of qualified soft labels, so as to ensure the reliability of shared information. The proposed method is a general framework and can be applied to varying backbones. The experimental results on three datasets demonstrate that our method can significantly enhance the performance of the backbone network by notable margins, achieving a Dice metric improvement of 2.8% on MeniSeg, 1.4% on IBSR, and 1.3% on BraTS2020.
20
Huang SJ, Chen CC, Kao Y, Lu HHS. Feature-aware unsupervised lesion segmentation for brain tumor images using fast data density functional transform. Sci Rep 2023; 13:13582. [PMID: 37604860] [PMCID: PMC10442428] [DOI: 10.1038/s41598-023-40848-5]
Abstract
We demonstrate that isomorphically mapping gray-level medical image matrices onto energy spaces under the framework of the fast data density functional transform (fDDFT) can achieve unsupervised recognition of lesion morphology. By introducing the architecture of geometric deep learning and metrics of graph neural networks, the gridized density functionals of the fDDFT establish an unsupervised feature-aware mechanism with global convolutional kernels to extract the most likely lesion boundaries and produce lesion segmentation. An AutoEncoder-assisted module reduces the computational complexity from [Formula: see text] to [Formula: see text], thus efficiently speeding up global convolutional operations. We validate the performance on various open-access datasets and discuss limitations. The inference time for each object in large three-dimensional datasets is 1.76 s on average. The proposed gridized density functionals have activation capability synergized with gradient ascent operations, and hence can be modularized and embedded in pipelines of modern deep neural networks. Algorithms for geometric stability and similarity convergence also raise the accuracy of unsupervised recognition and segmentation of lesion images. The performance meets the standard requirement for conventional deep neural networks, with a median Dice score higher than 0.75. Experiments show that the synergy of the fDDFT and a naïve neural network improves training and inference time by 58% and 51%, respectively, and raises the Dice score to 0.9415. This advantage facilitates fast computational modeling in interdisciplinary applications and clinical investigation.
Affiliation(s)
- Shin-Jhe Huang, Chien-Chang Chen, Yamin Kao
- Geometric Data Vision Laboratory, Department of Biomedical Sciences and Engineering, National Central University, Taoyuan City 32001, Taiwan
- Henry Horng-Shing Lu
- Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan
21
Liu L, Chang J, Liu Z, Zhang P, Xu X, Shang H. Hybrid Contextual Semantic Network for Accurate Segmentation and Detection of Small-Size Stroke Lesions From MRI. IEEE J Biomed Health Inform 2023; 27:4062-4073. [PMID: 37155390] [DOI: 10.1109/jbhi.2023.3273771]
Abstract
Stroke is a cerebrovascular disease with high mortality and disability rates. A stroke typically produces lesions of different sizes, and the accurate segmentation and detection of small-size lesions is closely related to patient prognosis. However, while large lesions are usually identified correctly, small-size lesions are often missed. This article proposes a hybrid contextual semantic network (HCSNet) that can accurately and simultaneously segment and detect small-size stroke lesions from magnetic resonance images. HCSNet inherits the advantages of the encoder-decoder architecture and applies a novel hybrid contextual semantic module that generates high-quality contextual semantic features from spatial and channel contextual semantic features through the skip-connection layer. Moreover, a mixing loss function is proposed to optimize HCSNet for unbalanced small-size lesions. HCSNet is trained and evaluated on 2D magnetic resonance images produced from the Anatomical Tracings of Lesions After Stroke challenge (ATLAS R2.0). Extensive experiments demonstrate that HCSNet outperforms several state-of-the-art methods in segmenting and detecting small-size stroke lesions. Visualization and ablation experiments reveal that the hybrid semantic module improves both the segmentation and detection performance of HCSNet.
22
Wang Y, Li L, Li C, Xi Y, Lin Y, Wang S. Expert knowledge guided manifold representation learning for magnetic resonance imaging-based glioma grading. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104876]
23
Shi X, Li Y, Cheng J, Bai J, Zhao G, Chen YW. Multi-task Model for Glioma Segmentation and Isocitrate Dehydrogenase Status Prediction Using Global and Local Features. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-5. [PMID: 38083206] [DOI: 10.1109/embc40787.2023.10340355]
Abstract
According to the 2021 World Health Organization classification scheme for gliomas, isocitrate dehydrogenase (IDH) status is a particularly important basis for glioma diagnosis. In general, 3D multimodal brain MRI is an effective diagnostic tool. However, it is difficult even for experienced doctors to predict IDH status from brain MRI data alone, so surgery is required to confirm it. Previous studies have shown that the glioma regions of brain MRI images contain abundant information useful for diagnosis. Such studies usually require the glioma area to be marked in advance before IDH status can be predicted, which takes a long time and has high computational cost. A tumor segmentation model can automatically segment and locate the tumor region, which is exactly the information needed for the IDH prediction task. In this study, we proposed a multi-task deep learning model using 3D multimodal brain MRI images to perform glioma segmentation and IDH status prediction simultaneously, which effectively improved the accuracy of both tasks. First, we used a segmentation model to segment the tumor region. Then, features from the whole MRI image and the segmented glioma region were used as global and local features, respectively, to predict IDH status. The effectiveness of the proposed method was validated on a public glioma dataset from BraTS2020. Our experimental results show that the proposed method outperformed state-of-the-art methods with a prediction accuracy of 88.5% and an average Dice of 79.8%, improvements of 3% and 1%, respectively, over the previous state of the art.
24
Liu L, Chang J, Liang G, Xiong S. Simulated Quantum Mechanics-Based Joint Learning Network for Stroke Lesion Segmentation and TICI Grading. IEEE J Biomed Health Inform 2023; 27:3372-3383. [PMID: 37104101 DOI: 10.1109/jbhi.2023.3270861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/28/2023]
Abstract
Segmenting stroke lesions and assessing the thrombolysis in cerebral infarction (TICI) grade are two important but challenging prerequisites for the auxiliary diagnosis of stroke. However, most previous studies have focused on only one of the two tasks, without considering the relation between them. In our study, we propose a simulated quantum mechanics-based joint learning network (SQMLP-net) that simultaneously segments a stroke lesion and assesses the TICI grade. The correlation and heterogeneity between the two tasks are tackled with a single-input double-output hybrid network. SQMLP-net has a segmentation branch and a classification branch. The two branches share an encoder, which extracts and shares the spatial and global semantic information for the segmentation and classification tasks. Both tasks are optimized by a novel joint loss function that learns the intra- and inter-task weights between them. Finally, we evaluate SQMLP-net on a public stroke dataset (ATLAS R2.0). SQMLP-net obtains state-of-the-art metrics (Dice: 70.98% and accuracy: 86.78%) and outperforms single-task and existing advanced methods. An analysis found a negative correlation between the severity of the TICI grade and the accuracy of stroke lesion segmentation.
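The abstract's joint loss learns intra- and inter-task weights between the segmentation and classification losses. The paper does not spell out the formulation, so as a hedged sketch we show one common way such weights are learned, homoscedastic-uncertainty weighting with learnable log-variances (an assumption, not necessarily SQMLP-net's exact loss):

```python
import numpy as np

def joint_loss(seg_loss, cls_loss, log_var_seg, log_var_cls):
    """Uncertainty-weighted sum of two task losses.

    Each task loss is scaled by exp(-log_var), and the log-variances are added
    as regularizers so the weights cannot collapse to zero; in training the
    log-variances would be learned jointly with the network parameters.
    """
    w_seg = np.exp(-log_var_seg)
    w_cls = np.exp(-log_var_cls)
    return w_seg * seg_loss + w_cls * cls_loss + log_var_seg + log_var_cls

# With both log-variances at 0 the weights are 1 and the loss is a plain sum.
total = joint_loss(seg_loss=0.4, cls_loss=0.9, log_var_seg=0.0, log_var_cls=0.0)
```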
25
Murmu A, Kumar P. A novel Gateaux derivatives with efficient DCNN-Resunet method for segmenting multi-class brain tumor. Med Biol Eng Comput 2023:10.1007/s11517-023-02824-z. [PMID: 37338739 DOI: 10.1007/s11517-023-02824-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2022] [Accepted: 03/14/2023] [Indexed: 06/21/2023]
Abstract
In hospitals and pathology, observing the features and locations of brain tumors in Magnetic Resonance Images (MRI) is a crucial task for assisting medical professionals in both treatment and diagnosis. Multi-class information about a brain tumor is often obtained from the patient's MRI dataset. However, this information may vary in shape and size across different brain tumors, making it difficult to detect their locations in the brain. To resolve these issues, a novel customized Deep Convolution Neural Network (DCNN)-based Residual-Unet (ResUnet) model with Transfer Learning (TL) is proposed for predicting the locations of brain tumors in an MRI dataset. The DCNN model has been used to extract the features from input images and select the Region Of Interest (ROI), using the TL technique to train it faster. Furthermore, a min-max normalization approach is used to enhance the color intensity values around ROI boundary edges in the brain tumor images. Specifically, the boundary edges of the brain tumors have been detected by utilizing the Gateaux Derivatives (GD) method to identify the multi-class brain tumors precisely. The proposed scheme has been validated on two datasets, namely the brain tumor and Figshare MRI datasets, for multi-class Brain Tumor Segmentation (BTS). The experimental results have been analyzed using the evaluation metrics accuracy (99.78 and 99.03), Jaccard Coefficient (93.04 and 94.95), Dice Factor Coefficient (DFC) (92.37 and 91.94), Mean Absolute Error (MAE) (0.0019 and 0.0013), and Mean Squared Error (MSE) (0.0085 and 0.0012) for proper validation. The proposed system outperforms the state-of-the-art segmentation models on the MRI brain tumor dataset.
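The min-max normalization step mentioned above rescales ROI intensities to a fixed range. A minimal numpy sketch of the standard formula (illustrative only; the authors may apply it per-ROI or per-channel):

```python
import numpy as np

def min_max_normalize(roi, eps=1e-7):
    """Rescale intensity values of an ROI linearly to [0, 1]:
    x' = (x - min) / (max - min)."""
    lo, hi = roi.min(), roi.max()
    return (roi - lo) / (hi - lo + eps)

# Toy ROI with intensities 10..210 mapped onto [0, 1]
roi = np.array([[10.0, 60.0], [110.0, 210.0]])
norm = min_max_normalize(roi)
```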
Affiliation(s)
- Anita Murmu
- Computer Science and Engineering, National Institute of Technology Patna, Ashok Rajpath, Patna, 800005, Bihar, India.
- Piyush Kumar
- Computer Science and Engineering, National Institute of Technology Patna, Ashok Rajpath, Patna, 800005, Bihar, India
26
Zou J, Wang Z, Du X. A Double-Teacher Model Capable of Exploiting Isomorphic and Heterogeneous Discrepancy Information for Medical Image Segmentation. Diagnostics (Basel) 2023; 13:diagnostics13111971. [PMID: 37296823 DOI: 10.3390/diagnostics13111971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2023] [Revised: 05/20/2023] [Accepted: 05/31/2023] [Indexed: 06/12/2023] Open
Abstract
With its continuous development, deep learning has achieved relatively good results in the field of left atrial segmentation, and numerous semi-supervised methods in this field have been implemented based on consistency regularization to obtain high-performance 3D models through training. However, most semi-supervised methods focus on inter-model consistency and ignore inter-model discrepancy. Therefore, we designed an improved double-teacher framework with discrepancy information. Herein, one teacher learns 2D information, another learns both 2D and 3D information, and the two models jointly guide the student model during learning. Simultaneously, we extract the isomorphic/heterogeneous discrepancy information between the predictions of the student and teacher models to optimize the whole framework. Unlike other semi-supervised methods based on 3D models, ours only uses 3D information to assist 2D models and does not include a fully 3D model, thus addressing, to some extent, the large memory consumption and limited training data of 3D models. Our approach shows excellent performance on the left atrium (LA) dataset, similar to that of the best-performing available 3D semi-supervised methods.
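Teacher models in consistency-based semi-supervised segmentation are commonly maintained as an exponential moving average (EMA) of the student's weights. The abstract does not state how its two teachers are updated, so this is a hedged sketch of the generic mean-teacher update only, not this paper's specific mechanism:

```python
import numpy as np

def ema_update(teacher_w, student_w, alpha=0.99):
    """Exponential-moving-average update of teacher weights from student weights:
    teacher <- alpha * teacher + (1 - alpha) * student."""
    return alpha * teacher_w + (1.0 - alpha) * student_w

# Teacher weights drift toward the (here fixed) student weights over steps.
teacher = np.zeros(3)
student = np.ones(3)
for _ in range(10):
    teacher = ema_update(teacher, student, alpha=0.9)
# After 10 steps each teacher weight equals 1 - 0.9**10
```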
Affiliation(s)
- Junguo Zou
- School of Information Engineering, Chuzhou Polytechnic, Chuzhou 239000, China
- Zhaohe Wang
- School of Computer Science and Technology, Anhui University, Hefei 230601, China
- Xiuquan Du
- School of Computer Science and Technology, Anhui University, Hefei 230601, China
27
Song G, Zhou J, Wang K, Yao D, Chen S, Shi Y. Segmentation of multi-regional skeletal muscle in abdominal CT image for cirrhotic sarcopenia diagnosis. Front Neurosci 2023; 17:1203823. [PMID: 37360174 PMCID: PMC10289291 DOI: 10.3389/fnins.2023.1203823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2023] [Accepted: 05/12/2023] [Indexed: 06/28/2023] Open
Abstract
Background Sarcopenia is generally diagnosed from the total area of skeletal muscle in the CT axial slice located at the third lumbar (L3) vertebra. However, in patients with severe liver cirrhosis the abdominal muscles are squeezed, so the corresponding total skeletal muscle cannot be obtained accurately, which affects the diagnosis of sarcopenia. Purpose This study proposes a novel lumbar skeletal muscle network to automatically segment multi-regional skeletal muscle from CT images, and explores the relationship between cirrhotic sarcopenia and each skeletal muscle region. Methods This study utilizes the skeletal muscle characteristics of different spatial regions to improve a 2.5D U-Net enhanced by residual structure. Specifically, a 3D texture attention enhancement block is proposed to tackle the issue of blurred edges with similar intensities and poor segmentation between different skeletal muscle regions; it encodes skeletal muscle shape and muscle fibre texture to spatially constrain the integrity of each skeletal muscle region and alleviate the difficulty of identifying muscle boundaries in axial slices. Subsequently, a 3D encoding branch is constructed in conjunction with the 2.5D U-Net, which segments the lumbar skeletal muscle in multiple L3-related axial CT slices into four regions. Furthermore, the diagnostic cut-off values of the L3 skeletal muscle index (L3SMI) are investigated for identifying cirrhotic sarcopenia in the four muscle regions segmented from CT images of 98 patients with liver cirrhosis. Results Our method is evaluated on 317 CT images using five-fold cross-validation. For the four skeletal muscle regions segmented in the independent test set, the average DSC is 0.937 and the average surface distance is 0.558 mm. For sarcopenia diagnosis in 98 patients with liver cirrhosis, the cut-off values of the Rectus Abdominis, Right Psoas, Left Psoas, and Paravertebral regions are 16.67, 4.14, 3.76, and 13.20 cm2/m2 in females, and 22.51, 5.84, 6.10, and 17.28 cm2/m2 in males, respectively. Conclusion The proposed method can segment the four skeletal muscle regions related to the L3 vertebra with high accuracy. Furthermore, the analysis shows that the Rectus Abdominis region can be used to assist in the diagnosis of sarcopenia when the total muscle is not available.
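The L3SMI underlying these cut-offs is the segmented muscle area (cm²) normalized by height squared (m²). A small sketch applying the Rectus Abdominis cut-offs reported in the abstract (function names and the example patient are ours; cut-offs for the other three regions follow the same pattern):

```python
def l3_smi(muscle_area_cm2, height_m):
    """L3 skeletal muscle index: area (cm^2) divided by height squared (m^2)."""
    return muscle_area_cm2 / (height_m ** 2)

# Rectus Abdominis cut-offs from the abstract, in cm^2/m^2
CUTOFF_RECTUS_ABDOMINIS = {"female": 16.67, "male": 22.51}

def is_sarcopenic_ra(muscle_area_cm2, height_m, sex):
    """Flag sarcopenia when the Rectus Abdominis L3SMI falls below the
    sex-specific cut-off."""
    return l3_smi(muscle_area_cm2, height_m) < CUTOFF_RECTUS_ABDOMINIS[sex]

# Hypothetical male patient: 40 cm^2 of Rectus Abdominis, 1.70 m tall
flag = is_sarcopenic_ra(muscle_area_cm2=40.0, height_m=1.70, sex="male")
```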
Affiliation(s)
- Genshen Song
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai, China
- Ji Zhou
- Department of Gastroenterology and Hepatology, Zhongshan Hospital, Fudan University, Shanghai, China
- Kang Wang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai, China
- Demin Yao
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai, China
- Shiyao Chen
- Department of Gastroenterology and Hepatology, Zhongshan Hospital, Fudan University, Shanghai, China
- Yonghong Shi
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai, China
- Academy for Engineering & Technology, Fudan University, Shanghai, China
28
Wang J, Li S, Yu L, Qu A, Wang Q, Liu J, Wu Q. SDPN: A Slight Dual-Path Network With Local-Global Attention Guided for Medical Image Segmentation. IEEE J Biomed Health Inform 2023; 27:2956-2967. [PMID: 37030687 DOI: 10.1109/jbhi.2023.3260026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/10/2023]
Abstract
Accurate identification of lesions is a key step in surgical planning. However, this task presents two main challenges: 1) Due to the complex anatomical shapes of different lesions, most segmentation methods achieve outstanding performance only for a specific structure, rather than for other lesions at different locations. 2) The huge number of parameters limits existing transformer-based segmentation models. To overcome these problems, we propose a novel slight dual-path network (SDPN) to accurately segment lesions or organs with variable locations and significant differences. First, we design a dual-path module to integrate local with global features without obvious memory consumption. Second, a novel multi-spectrum attention module is proposed to pay further attention to detailed information, which can automatically adapt to the variable segmentation target. Then, a compression module based on tensor ring decomposition is designed to compress the convolutional and transformer structures. In the experiments, four datasets, including three benchmark datasets and a clinical dataset, are used to evaluate SDPN. The results show that SDPN performs better than other state-of-the-art methods for brain tumor, liver tumor, endometrial tumor, and cardiac segmentation. To assess generalizability, we train the network on Kvasir-SEG and test on CVC-ClinicDB, which was collected from a different institution. The quantitative analysis shows that the clinical evaluation results are consistent with the experts'. Therefore, this model may be a potential candidate for the segmentation of lesions and organs with variable locations in clinical applications.
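The compression module above is based on tensor ring decomposition, which factorizes a weight tensor into a ring of small 3-way cores. As a much simpler stand-in for the parameter-saving idea (an assumption, not the paper's method: plain truncated SVD rather than a tensor ring), the sketch below low-rank-factorizes a dense layer and counts the saving:

```python
import numpy as np

# A dense 256x256 weight matrix, to be replaced by rank-r factors.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))
U, s, Vt = np.linalg.svd(W, full_matrices=False)

r = 32  # retained rank
W_low = (U[:, :r] * s[:r]) @ Vt[:r, :]  # rank-r approximation of W

params_dense = W.size                             # 256*256 = 65536
params_low = U[:, :r].size + r + Vt[:r, :].size   # 2*256*32 + 32 = 16416
compression_ratio = params_dense / params_low     # ~4x fewer parameters
```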
29
Ou Y, Huang SX, Wong KK, Cummock J, Volpi J, Wang JZ, Wong STC. BBox-Guided Segmentor: Leveraging expert knowledge for accurate stroke lesion segmentation using weakly supervised bounding box prior. Comput Med Imaging Graph 2023; 107:102236. [PMID: 37146318 DOI: 10.1016/j.compmedimag.2023.102236] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2022] [Revised: 02/17/2023] [Accepted: 04/06/2023] [Indexed: 05/07/2023]
Abstract
Stroke is one of the leading causes of death and disability in the world. Despite intensive research on automatic stroke lesion segmentation from non-invasive imaging modalities including diffusion-weighted imaging (DWI), challenges remain, such as a lack of sufficient labeled data for training deep learning models and failure to detect small lesions. In this paper, we propose BBox-Guided Segmentor, a method that significantly improves the accuracy of stroke lesion segmentation by leveraging expert knowledge. Specifically, our model uses a very coarse bounding box label provided by the expert and then performs accurate segmentation automatically. The small overhead of having the expert provide a rough bounding box leads to a large performance improvement in segmentation, which is paramount to accurate stroke diagnosis. To train our model, we employ a weakly supervised approach that uses a large number of weakly labeled images with only bounding boxes and a small number of fully labeled images. The scarce fully labeled images are used to train a generator segmentation network, while adversarial training is used to leverage the large number of weakly labeled images to provide additional learning signals. We evaluate our method extensively using a unique clinical dataset of 99 fully labeled cases (i.e., with full segmentation map labels) and 831 weakly labeled cases (i.e., with only bounding box labels), and the results demonstrate the superior performance of our approach over state-of-the-art stroke lesion segmentation models. We also achieve performance competitive with a state-of-the-art fully supervised method while using less than one-tenth of the complete labels. Our proposed approach has the potential to improve stroke diagnosis and treatment planning, which may lead to better patient outcomes.
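A bounding-box weak label is typically consumed by rasterizing it into a binary prior mask that constrains where the lesion may lie. A minimal sketch of that conversion (illustrative only; box convention and function name are ours, not the authors' code):

```python
import numpy as np

def bbox_to_mask(shape, bbox):
    """Rasterize a coarse expert bounding box into a binary prior mask.

    bbox = (r0, c0, r1, c1) with inclusive start and exclusive end indices,
    matching numpy slicing conventions.
    """
    mask = np.zeros(shape, dtype=np.uint8)
    r0, c0, r1, c1 = bbox
    mask[r0:r1, c0:c1] = 1
    return mask

# A 3x4 box inside an 8x8 slice -> 12 foreground-prior pixels
prior = bbox_to_mask((8, 8), (2, 3, 5, 7))
```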
Affiliation(s)
- Yanglan Ou
- Data Science and Artificial Intelligence Area, College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802, USA.
- Sharon X Huang
- Data Science and Artificial Intelligence Area, College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802, USA
- Kelvin K Wong
- T.T. and W.F. Chao Center for BRAIN & Houston Methodist Cancer Center, Houston Methodist Hospital, Houston, TX 77030, USA
- Jonathon Cummock
- T.T. and W.F. Chao Center for BRAIN & Houston Methodist Cancer Center, Houston Methodist Hospital, Houston, TX 77030, USA
- John Volpi
- Eddy Scurlock Comprehensive Stroke Center, Department of Neurology, Houston Methodist Hospital, Houston, TX 77030, USA
- James Z Wang
- Data Science and Artificial Intelligence Area, College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802, USA
- Stephen T C Wong
- T.T. and W.F. Chao Center for BRAIN & Houston Methodist Cancer Center, Houston Methodist Hospital, Houston, TX 77030, USA
30
Yu W, Huang Z, Zhang J, Shan H. SAN-Net: Learning generalization to unseen sites for stroke lesion segmentation with self-adaptive normalization. Comput Biol Med 2023; 156:106717. [PMID: 36878125 DOI: 10.1016/j.compbiomed.2023.106717] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2022] [Revised: 01/31/2023] [Accepted: 02/26/2023] [Indexed: 03/06/2023]
Abstract
There is considerable interest in automatic stroke lesion segmentation on magnetic resonance (MR) images in the medical imaging field, as stroke is an important cerebrovascular disease. Although deep learning-based models have been proposed for this task, generalizing these models to unseen sites is difficult due not only to the large inter-site discrepancy among different scanners, imaging protocols, and populations, but also to the variations in stroke lesion shape, size, and location. To tackle this issue, we introduce a self-adaptive normalization network, termed SAN-Net, to achieve adaptive generalization on unseen sites for stroke lesion segmentation. Motivated by traditional z-score normalization and dynamic networks, we devise a masked adaptive instance normalization (MAIN) to minimize inter-site discrepancies, which standardizes input MR images from different sites into a site-unrelated style by dynamically learning affine parameters from the input; i.e., MAIN can affinely transform the intensity values. Then, we leverage a gradient reversal layer to force the U-Net encoder to learn site-invariant representation with a site classifier, which further improves the model's generalization in conjunction with MAIN. Finally, inspired by the "pseudosymmetry" of the human brain, we introduce a simple yet effective data augmentation technique, termed symmetry-inspired data augmentation (SIDA), that can be embedded within SAN-Net to double the sample size while halving memory consumption. Experimental results on the benchmark Anatomical Tracings of Lesions After Stroke (ATLAS) v1.2 dataset, which includes MR images from 9 different sites, demonstrate that under the "leave-one-site-out" setting, the proposed SAN-Net outperforms recently published methods in terms of quantitative metrics and qualitative comparisons.
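The SIDA idea exploits the brain's left-right pseudosymmetry: each axial slice and its lesion mask can be mirrored to double the sample count. A minimal numpy sketch of that flip-based doubling (illustrative only; how SAN-Net embeds SIDA to also halve memory is not reproduced here):

```python
import numpy as np

def symmetry_augment(images, masks):
    """Left-right flip every slice and its lesion mask, doubling the sample
    count by exploiting the pseudosymmetry of the brain.

    images, masks: arrays of shape (N, H, W); the flip is along the last
    (width) axis so left and right hemispheres are exchanged.
    """
    flipped_imgs = images[..., ::-1]
    flipped_masks = masks[..., ::-1]
    return (np.concatenate([images, flipped_imgs], axis=0),
            np.concatenate([masks, flipped_masks], axis=0))

imgs = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
msks = (imgs > 20).astype(np.uint8)
aug_imgs, aug_msks = symmetry_augment(imgs, msks)  # 2 samples become 4
```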
Affiliation(s)
- Weiyi Yu
- Institute of Science and Technology for Brain-inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433, China
- Zhizhong Huang
- Shanghai Key Lab of Intelligent Information Processing and the School of Computer Science, Fudan University, Shanghai 200433, China
- Junping Zhang
- Shanghai Key Lab of Intelligent Information Processing and the School of Computer Science, Fudan University, Shanghai 200433, China
- Hongming Shan
- Institute of Science and Technology for Brain-inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433, China; Shanghai Center for Brain Science and Brain-inspired Technology, Shanghai 201210, China
31
Xiang D, Yan S, Guan Y, Cai M, Li Z, Liu H, Chen X, Tian B. Semi-Supervised Dual Stream Segmentation Network for Fundus Lesion Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:713-725. [PMID: 36260572 DOI: 10.1109/tmi.2022.3215580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Accurate segmentation of retinal images can assist ophthalmologists in determining the degree of retinopathy and diagnosing other systemic diseases. However, the structure of the retina is complex, and different anatomical structures often affect the segmentation of fundus lesions. In this paper, a new segmentation strategy, a dual-stream segmentation network embedded into a conditional generative adversarial network, is proposed to improve the accuracy of retinal lesion segmentation. First, a dual-stream encoder is proposed to exploit the capabilities of two different networks and extract more feature information. Second, a multiple-level fuse block is proposed to decode the richer and more effective features from the two parallel encoders. Third, the proposed network is further trained in a semi-supervised adversarial manner to learn from labeled images and from unlabeled images with highly confident pseudo labels, which are selected by the dual-stream Bayesian segmentation network. An annotation discriminator is further proposed to reduce the negative tendency of predictions to become increasingly similar to the inaccurate predictions on unlabeled images. The proposed method is cross-validated on 384 clinical fundus fluorescein angiography images and 1040 optical coherence tomography images. Compared to state-of-the-art methods, the proposed method achieves better segmentation of the retinal capillary non-perfusion region and choroidal neovascularization.
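Selecting "highly confident pseudo labels" is often implemented by thresholding the predicted probabilities. The paper uses a Bayesian network for this; the sketch below shows only the simpler confidence-thresholding variant as a stand-in (an assumption, not the authors' selection rule):

```python
import numpy as np

def select_pseudo_labels(probs, tau=0.9):
    """Binarize foreground probabilities into pseudo labels and keep only
    pixels whose predicted class probability is at least tau."""
    confident = np.maximum(probs, 1.0 - probs) >= tau  # confidence of winning class
    pseudo = (probs >= 0.5).astype(np.uint8)
    return pseudo, confident

# Two of the four pixels (0.97 and 0.02) are confidently predicted
probs = np.array([[0.97, 0.55], [0.02, 0.40]])
pseudo, keep = select_pseudo_labels(probs, tau=0.9)
```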
32
Li L, Ma K, Song Y, Du X. TSRL-Net: Target-aware supervision residual learning for stroke segmentation. Comput Biol Med 2023; 159:106840. [PMID: 37116236 DOI: 10.1016/j.compbiomed.2023.106840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2022] [Revised: 03/21/2023] [Accepted: 03/26/2023] [Indexed: 03/30/2023]
Abstract
Accurate stroke segmentation is a crucial task in establishing a computer-aided diagnostic system for brain diseases. However, reducing false negatives and accurately segmenting strokes in MRI images is often challenging because of class imbalance and intraclass ambiguity. To address these issues, we propose a novel target-aware supervision residual learning framework for stroke segmentation. Considering the imbalance of positive and negative samples, a target-aware loss function is creatively designed to dilate strong attention regions, pay high attention to positive-sample losses, and compensate for the loss of negative samples around the target. Then, a coarse-grained residual learning module is developed to gradually restore the residual features lost during the decoding phase, alleviating the high number of false negatives caused by intraclass ambiguity. Here, our reverse/positive attention unit suppresses redundant target/background noise and allows a relatively more focused highlighting of important features in the target residual region. Extensive experiments were performed on the public Anatomical Tracings of Lesions After Stroke and Ischemic Stroke Lesion Segmentation datasets, with results suggesting the effectiveness of our proposed method compared to several state-of-the-art methods.
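"Dilating strong attention regions" and "compensating for the loss of negative samples around the target" can be pictured as a pixel-weight map that is boosted on a dilated ring around the lesion. A hedged numpy sketch of that idea (our simplification, not the paper's exact loss):

```python
import numpy as np

def dilate(mask, iterations=1):
    """4-neighbourhood binary dilation implemented with edge-safe array shifts."""
    out = mask.astype(bool)
    for _ in range(iterations):
        up    = np.roll(out, -1, axis=0); up[-1, :]   = False
        down  = np.roll(out,  1, axis=0); down[0, :]  = False
        left  = np.roll(out, -1, axis=1); left[:, -1] = False
        right = np.roll(out,  1, axis=1); right[:, 0] = False
        out = out | up | down | left | right
    return out

def target_aware_weights(mask, ring_weight=2.0):
    """Per-pixel loss weights: 1 everywhere, boosted on the dilated ring of
    background pixels around the lesion so they are not drowned out by the
    many easy negatives far from the target."""
    ring = dilate(mask, 1) & ~mask.astype(bool)
    w = np.ones_like(mask, dtype=float)
    w[ring] = ring_weight
    return w

# Single-pixel lesion at (2, 2): its 4 neighbours get weight 2.0
m = np.zeros((5, 5), dtype=np.uint8)
m[2, 2] = 1
w = target_aware_weights(m)
```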
Affiliation(s)
- Lei Li
- Department of Neurology, Shuyang Hospital Affiliated to Yangzhou University School of Medicine (Shuyang Hospital of Traditional Chinese Medicine), Suqian, Jiangsu, China
- Kunpeng Ma
- School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
- Yuhui Song
- School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
- Xiuquan Du
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, Anhui, China; School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
33
Yin HC, Lien JJJ. Cascaded Segmentation U-Net for Quality Evaluation of Scraping Workpiece. SENSORS (BASEL, SWITZERLAND) 2023; 23:998. [PMID: 36679795 PMCID: PMC9866543 DOI: 10.3390/s23020998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/22/2022] [Revised: 01/03/2023] [Accepted: 01/11/2023] [Indexed: 06/17/2023]
Abstract
In industry, the hand-scraping method is a key technology for achieving high precision in machine tools, and the quality of scraping workpieces directly affects the accuracy and service life of the machine tool. However, most quality evaluation of scraping workpieces is carried out by the scraping worker's subjective judgment, which results in differences in the quality of the scraping workpieces and is time-consuming. Hence, in this research, an edge-cloud computing system was developed to obtain the relevant parameters, the percentage of point (POP) and the peak points per square inch (PPI), for evaluating the quality of scraping workpieces. On the cloud computing server side, a novel network called cascaded segmentation U-Net is proposed to segment the height of points (HOP) (around 40 μm in height) with high quality while favoring training on small datasets, and a post-processing algorithm then automatically calculates POP and PPI. This research emphasizes the architecture of the network itself. The design of the components of our network is based on the basic idea of the identity function, which not only solves the problem of misjudging the oil ditch and the residual pigment but also allows the network to be trained end-to-end effectively. At the head of the network, a cascaded multi-stage pixel-wise classification is designed to obtain more accurate HOP borders. Furthermore, a "Cross-dimension Compression" stage is used to fuse high-dimensional semantic feature maps across the depth of the feature maps into low-dimensional feature maps, producing decipherable content for the final pixel-wise classification. Our system achieves errors of 3.7% for POP and 0.9 points for PPI. The novel network achieves an Intersection over Union (IoU) of 90.2%.
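The two quality parameters are straightforward once the HOP mask is segmented: POP is the fraction of surface area covered by points, and PPI counts peak points per square inch. A hedged sketch of the arithmetic (the peak count would come from connected-component labeling of the HOP mask, which we take as given here):

```python
import numpy as np

def pop_percentage(hop_mask):
    """POP: percentage of the evaluated surface covered by height-of-points pixels."""
    return 100.0 * hop_mask.sum() / hop_mask.size

def ppi(n_peaks, area_sq_inch):
    """PPI: number of peak points per square inch of the evaluated surface."""
    return n_peaks / area_sq_inch

# Toy 10x10 patch assumed to cover one square inch, with two point regions
mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:4, 2:4] = 1   # a 2x2 point (4 px)
mask[6:8, 6:9] = 1   # a 2x3 point (6 px)
pop = pop_percentage(mask)            # 10 of 100 pixels covered
rate = ppi(n_peaks=2, area_sq_inch=1.0)
```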
34
Novel artificial intelligent transformer U-NET for better identification and management of prostate cancer. Mol Cell Biochem 2022; 478:1439-1445. [DOI: 10.1007/s11010-022-04600-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2022] [Accepted: 10/24/2022] [Indexed: 11/10/2022]
35
Kalsotra R, Arora S. Performance analysis of U-Net with hybrid loss for foreground detection. MULTIMEDIA SYSTEMS 2022; 29:771-786. [PMID: 36406901 PMCID: PMC9641683 DOI: 10.1007/s00530-022-01014-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/06/2022] [Accepted: 10/10/2022] [Indexed: 06/16/2023]
Abstract
With the latest developments in deep neural networks, the convolutional neural network (CNN) has made considerable progress in the area of foreground detection. However, the top-ranked background subtraction algorithms for foreground detection still have many shortcomings, and it remains challenging to extract the true foreground against a complex background. To tackle this bottleneck, we propose a hybrid loss-assisted U-Net framework for foreground detection. The proposed deep learning model integrates transfer learning and a hybrid loss for better feature representation and faster model convergence. The core idea is to incorporate a reference background image and a change detection mask in the learning network. Furthermore, we empirically investigate the potential of the hybrid loss over a single loss function: the advantages of two significant loss functions are combined to tackle the class imbalance problem in foreground detection. The proposed technique demonstrates its effectiveness on standard datasets and performs better than the top-ranked methods in challenging environments. Moreover, experiments on unseen videos also confirm the efficacy of the proposed method.
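A common way to combine "two significant loss functions" against class imbalance is a weighted sum of a pixel-wise cross-entropy and a region-overlap Dice loss. The abstract does not name its two components, so the pairing below is an assumption, shown only to illustrate the hybrid-loss pattern:

```python
import numpy as np

def bce_loss(p, y, eps=1e-7):
    """Mean binary cross-entropy between probabilities p and labels y."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def dice_loss(p, y, eps=1e-7):
    """Soft Dice loss: 1 minus the soft overlap between p and y."""
    inter = (p * y).sum()
    return 1.0 - (2 * inter + eps) / (p.sum() + y.sum() + eps)

def hybrid_loss(p, y, w=0.5):
    """Convex combination of a pixel-wise loss (BCE) and a region loss (Dice),
    pairing their complementary strengths against class imbalance."""
    return w * bce_loss(p, y) + (1 - w) * dice_loss(p, y)

p = np.array([0.9, 0.8, 0.1, 0.2])  # confident, mostly correct predictions
y = np.array([1.0, 1.0, 0.0, 0.0])
loss = hybrid_loss(p, y)
```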
Affiliation(s)
- Rudrika Kalsotra
- Department of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra, 182320 India
- Sakshi Arora
- Department of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra, 182320 India
36
Moon HS, Heffron L, Mahzarnia A, Obeng-Gyasi B, Holbrook M, Badea CT, Feng W, Badea A. Automated multimodal segmentation of acute ischemic stroke lesions on clinical MR images. Magn Reson Imaging 2022; 92:45-57. [PMID: 35688400 PMCID: PMC9949513 DOI: 10.1016/j.mri.2022.06.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2022] [Revised: 06/01/2022] [Accepted: 06/02/2022] [Indexed: 02/09/2023]
Abstract
Magnetic resonance (MR) imaging (MRI) is commonly used to diagnose, assess and monitor stroke. Accurate and timely segmentation of stroke lesions provides the anatomico-structural information that can aid physicians in predicting prognosis, as well as in decision making and triaging for various rehabilitation strategies. To segment stroke lesions, MR protocols, including diffusion-weighted imaging (DWI) and T2-weighted fluid attenuated inversion recovery (FLAIR) are often utilized. These imaging sequences are usually acquired with different spatial resolutions due to time constraints. Within the same image, voxels may be anisotropic, with reduced resolution along slice direction for diffusion scans in particular. In this study, we evaluate the ability of 2D and 3D U-Net Convolutional Neural Network (CNN) architectures to segment ischemic stroke lesions using single contrast (DWI) and dual contrast images (T2w FLAIR and DWI). The predicted segmentations correlate with post-stroke motor outcome measured by the National Institutes of Health Stroke Scale (NIHSS) and Fugl-Meyer Upper Extremity (FM-UE) index based on the lesion loads overlapping the corticospinal tracts (CST), which is a neural substrate for motor movement and function. Although the four methods performed similarly, the 2D multimodal U-Net achieved the best results with a mean Dice of 0.737 (95% CI: 0.705, 0.769) and a relatively high correlation between the weighted lesion load and the NIHSS scores (both at baseline and at 90 days). A monotonically constrained quintic polynomial regression yielded R2 = 0.784 and 0.875 for weighted lesion load versus baseline and 90-Days NIHSS respectively, and better corrected Akaike information criterion (AICc) scores than those of the linear regression. In addition, using the quintic polynomial regression model to regress the weighted lesion load to the 90-Days FM-UE score results in an R2 of 0.570 with a better AICc score than that of the linear regression. Our results suggest that the multi-contrast information enhanced the accuracy of the segmentation and the prediction accuracy for upper extremity motor outcomes. Expanding the training dataset to include different types of stroke lesions and more data points will help add a temporal longitudinal aspect and increase the accuracy. Furthermore, adding patient-specific data may improve the inference about the relationship between imaging metrics and functional outcomes.
Affiliation(s)
- Hae Sol Moon: Department of Biomedical Engineering, Duke University, Durham, NC, United States
- Lindsay Heffron: Department of Orthopaedic Surgery, Duke University School of Medicine, Durham, NC, United States
- Ali Mahzarnia: Department of Radiology, Duke University School of Medicine, Durham, NC, United States
- Barnabas Obeng-Gyasi: Department of Radiology, Duke University School of Medicine, Durham, NC, United States
- Matthew Holbrook: Department of Radiology, Duke University School of Medicine, Durham, NC, United States
- Cristian T Badea: Department of Biomedical Engineering, Duke University, Durham, NC, United States; Department of Radiology, Duke University School of Medicine, Durham, NC, United States
- Wuwei Feng: Department of Neurology, Duke University School of Medicine, Durham, NC, United States
- Alexandra Badea: Department of Biomedical Engineering, Duke University, Durham, NC, United States; Department of Radiology, Duke University School of Medicine, Durham, NC, United States; Department of Neurology, Duke University School of Medicine, Durham, NC, United States; Brain Imaging and Analysis Center, Duke University School of Medicine, NC, United States
37. Yalçın S, Vural H. Brain stroke classification and segmentation using encoder-decoder based deep convolutional neural networks. Comput Biol Med 2022; 149:105941. [DOI: 10.1016/j.compbiomed.2022.105941]
38. Du X, Ma K, Song Y. AGMR-Net: Attention-guided multiscale recovery framework for stroke segmentation. Comput Med Imaging Graph 2022; 101:102120. [PMID: 36179432] [DOI: 10.1016/j.compmedimag.2022.102120]
Abstract
Automatic and accurate lesion segmentation is critical to the clinical estimation of the lesion status of stroke diseases and appropriate diagnostic systems. Although existing methods have achieved remarkable results, their further adoption is hindered by: (1) intraclass inconsistency, i.e., large variability between different areas of the lesion; and (2) interclass indistinction, in which normal brain tissue resembles the lesion in appearance. To meet these challenges in stroke segmentation, this paper proposes a novel method, namely the attention-guided multiscale recovery framework (AGMR-Net). Firstly, a coarse-grained patch attention (CPA) module is adopted in the encoding stage to obtain a patch-based coarse-grained attention map in a multistage, explicitly supervised way, enabling target spatial context saliency representation with a patch-based weighting technique that eliminates the effect of intraclass inconsistency. Secondly, to obtain more detailed boundary partitioning and meet the challenge of interclass indistinction, a newly designed cross-dimensional feature fusion (CFF) module is used to capture global contextual information to further guide the selective aggregation of 2D and 3D features, which can compensate for the lack of boundary learning capability of 2D convolution. Lastly, in the decoding stage, an innovatively designed multiscale deconvolution upsampling (MDU) module is used for enhanced recovery of target spatial and boundary information. AGMR-Net is evaluated on the open-source dataset Anatomical Tracings of Lesions After Stroke, achieving the highest Dice similarity coefficient of 0.594, a Hausdorff distance of 27.005 mm, and an average symmetric surface distance of 7.137 mm, which demonstrates that the proposed method outperforms state-of-the-art methods and has great potential for stroke diagnosis.
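The boundary metrics reported in this entry (Hausdorff distance and average symmetric surface distance) are computed between the surfaces of two binary masks. The sketch below is a minimal NumPy-only illustration for 2D masks, assuming 4-connectivity for surface extraction and the "pooled directed distances" convention for ASSD; it is not the evaluation code used by the paper.

```python
import numpy as np

def surface(mask: np.ndarray) -> np.ndarray:
    # A foreground pixel is on the surface if any 4-connected neighbour
    # is background (interior pixels have all four neighbours set).
    p = np.pad(mask, 1, constant_values=False)
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    return mask & ~interior

def surface_distances(a: np.ndarray, b: np.ndarray):
    # Directed distances: each A-surface pixel to its nearest B-surface
    # pixel, and vice versa (brute force, fine for small masks).
    pa = np.argwhere(surface(a)).astype(float)
    pb = np.argwhere(surface(b)).astype(float)
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    return d.min(axis=1), d.min(axis=0)

def assd(a, b):
    # One common ASSD convention: mean of the pooled directed distances.
    da, db = surface_distances(a, b)
    return float(np.concatenate([da, db]).mean())

def hausdorff(a, b):
    # Symmetric Hausdorff distance: worst-case directed surface distance.
    da, db = surface_distances(a, b)
    return float(max(da.max(), db.max()))

a = np.zeros((16, 16), dtype=bool); a[4:10, 4:10] = True
b = np.zeros((16, 16), dtype=bool); b[5:11, 4:10] = True  # shifted one row
```

Conventions vary (some definitions average the two directed means instead of pooling, and clinical tools often use physical voxel spacing), so reported values depend on the exact variant.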
Affiliation(s)
- Xiuquan Du: Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, China
- Kunpeng Ma: School of Computer Science and Technology, Anhui University, China
- Yuhui Song: School of Computer Science and Technology, Anhui University, China
39. Khezrpour S, Seyedarabi H, Razavi SN, Farhoudi M. Automatic segmentation of the brain stroke lesions from MR flair scans using improved U-net framework. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103978]
40. Automatic Segmentation and Quantitative Assessment of Stroke Lesions on MR Images. Diagnostics (Basel) 2022; 12:2055. [PMID: 36140457] [PMCID: PMC9497525] [DOI: 10.3390/diagnostics12092055]
Abstract
Lesion studies are crucial in establishing brain-behavior relationships, and accurately segmenting the lesion represents the first step in achieving this. Manual lesion segmentation is the gold standard for chronic strokes. However, it is labor-intensive, subject to bias, and limits sample size. Therefore, our objective is to develop an automatic segmentation algorithm for chronic stroke lesions on T1-weighted MR images.
Methods: To train our model, we utilized an open-source dataset: ATLAS v2.0 (Anatomical Tracings of Lesions After Stroke). We partitioned the dataset of 655 T1 images with manual segmentation labels into five subsets and performed 5-fold cross-validation to avoid overfitting. We used a deep neural network (DNN) architecture for model training.
Results: To evaluate model performance, we used three metrics that pertain to diverse aspects of volumetric segmentation: shape, location, and size. The Dice similarity coefficient (DSC) compares the spatial overlap between manual and machine segmentation; the average DSC was 0.65 (0.61-0.67; 95% bootstrapped CI). Average symmetric surface distance (ASSD) measures contour distances between the two segmentations; the ASSD between manual and automatic segmentation was 12 mm. Finally, we compared total lesion volumes; the Pearson correlation coefficient (ρ) between the manual and automatically segmented lesion volumes was 0.97 (p < 0.001).
Conclusions: We present the first automated segmentation model trained on a large multicentric dataset. This model will enable automated on-demand processing of MRI scans and quantitative chronic stroke lesion assessment.
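The headline overlap metric in this entry, the Dice similarity coefficient, is straightforward to compute from two binary masks. A minimal sketch (the toy masks and the `dice` helper are illustrative, not from the paper):

```python
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    # DSC = 2*|A & B| / (|A| + |B|); 1.0 means perfect overlap.
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

manual = np.zeros((8, 8), dtype=int); manual[2:6, 2:6] = 1  # 16-voxel lesion
auto = np.zeros((8, 8), dtype=int);   auto[3:7, 2:6] = 1    # same size, shifted
print(dice(manual, auto))  # 2*12 / (16 + 16) = 0.75
```

Because DSC is normalized by total lesion size, it complements the volume correlation the paper also reports: two masks can agree on volume while overlapping poorly, and DSC catches that.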
41. Li C, Li W, Liu C, Zheng H, Cai J, Wang S. Artificial intelligence in multi-parametric magnetic resonance imaging: A review. Med Phys 2022; 49:e1024-e1054. [PMID: 35980348] [DOI: 10.1002/mp.15936]
Abstract
Multi-parametric magnetic resonance imaging (mpMRI) is an indispensable tool in the clinical workflow for the diagnosis and treatment planning of various diseases. Machine learning-based artificial intelligence (AI) methods, especially those adopting the deep learning technique, have been extensively employed to perform mpMRI image classification, segmentation, registration, detection, reconstruction, and super-resolution. The current availability of increasing computational power and fast-improving AI algorithms has empowered numerous computer-based systems for applying mpMRI to disease diagnosis, imaging-guided radiotherapy, patient risk and overall survival time prediction, and the development of advanced quantitative imaging technology for magnetic resonance fingerprinting. However, the wide application of these developed systems in the clinic is still limited by a number of factors, including robustness, reliability, and interpretability. This survey aims to provide an overview for new researchers in the field as well as radiologists, with the hope that they can understand the general concepts, main application scenarios, and remaining challenges of AI in mpMRI.
Affiliation(s)
- Cheng Li: Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Wen Li: Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Chenyang Liu: Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Hairong Zheng: Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Jing Cai: Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Shanshan Wang: Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; Peng Cheng Laboratory, Shenzhen, 518066, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou, 510080, China
42. Cheng J, Liu J, Kuang H, Wang J. A Fully Automated Multimodal MRI-Based Multi-Task Learning for Glioma Segmentation and IDH Genotyping. IEEE Trans Med Imaging 2022; 41:1520-1532. [PMID: 35020590] [DOI: 10.1109/tmi.2022.3142321]
Abstract
The accurate prediction of isocitrate dehydrogenase (IDH) mutation and glioma segmentation are important tasks for computer-aided diagnosis using preoperative multimodal magnetic resonance imaging (MRI). The two tasks are ongoing challenges due to the significant inter-tumor and intra-tumor heterogeneity. The existing methods to address them are mostly based on single-task approaches that do not consider the correlation between the two tasks. In addition, the acquisition of IDH genetic labels is expensive, resulting in limited IDH mutation data for modeling. To comprehensively address these problems, we propose a fully automated multimodal MRI-based multi-task learning framework for simultaneous glioma segmentation and IDH genotyping. Specifically, the task correlation and heterogeneity are tackled with a hybrid CNN-Transformer encoder, consisting of a convolutional neural network and a transformer, which extracts shared spatial and global information that is passed to a decoder for glioma segmentation and a multi-scale classifier for IDH genotyping. Then, a multi-task learning loss is designed to balance the two tasks by combining the segmentation and classification loss functions with uncertainty-based weights. Finally, an uncertainty-aware pseudo-label selection is proposed to generate IDH pseudo-labels from larger unlabeled data, improving the accuracy of IDH genotyping through semi-supervised learning. We evaluate our method on a multi-institutional public dataset. Experimental results show that our proposed multi-task network achieves promising performance and outperforms the single-task learning counterparts and other existing state-of-the-art methods. With the introduction of unlabeled data, the semi-supervised multi-task learning framework further improves the performance of glioma segmentation and IDH genotyping. The source codes of our framework are publicly available at https://github.com/miacsu/MTTU-Net.git.
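Balancing two task losses with uncertainty-based weights, as this entry describes, is commonly implemented with learned homoscedastic-uncertainty terms. The sketch below shows that general formulation, not the authors' exact loss; the function name and the fixed scalar inputs are illustrative (in training, the log-variances would be learnable parameters).

```python
import numpy as np

def uncertainty_weighted_loss(l_seg: float, l_cls: float,
                              log_var_seg: float, log_var_cls: float) -> float:
    # Each task loss is scaled by a learned precision exp(-log_var);
    # adding log_var back as a regularizer keeps the learned weights
    # from collapsing to zero.
    return float(np.exp(-log_var_seg) * l_seg + log_var_seg
                 + np.exp(-log_var_cls) * l_cls + log_var_cls)

# With both log-variances at zero the total reduces to l_seg + l_cls.
total = uncertainty_weighted_loss(0.8, 0.4, 0.0, 0.0)
```

The appeal of this scheme is that the trade-off between segmentation and classification is learned rather than hand-tuned: a task whose loss is noisy drives its log-variance up, automatically down-weighting it.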
43. Sheng M, Xu W, Yang J, Chen Z. Cross-Attention and Deep Supervision UNet for Lesion Segmentation of Chronic Stroke. Front Neurosci 2022; 16:836412. [PMID: 35392415] [PMCID: PMC8980944] [DOI: 10.3389/fnins.2022.836412]
Abstract
Stroke is an acute cerebrovascular disease with high incidence, mortality, and disability rates. Determining the location and volume of the lesion in MR images promotes accurate stroke diagnosis and surgical planning. Therefore, the automatic recognition and segmentation of stroke lesions has important clinical significance for large-scale stroke imaging analysis. Stroke lesion segmentation faces several problems, such as foreground-background imbalance, positional uncertainty, and unclear boundaries. To meet this challenge, this paper proposes a cross-attention and deep supervision UNet (CADS-UNet) to segment chronic stroke lesions from T1-weighted MR images. Specifically, we propose a cross-spatial attention module, which differs from the usual self-attention module: location information interactively selects encoder and decoder features to enrich the lost spatial focus. At the same time, a channel attention mechanism is used to screen the channel characteristics. Finally, combined with deep supervision and a mixed loss, the model is supervised more accurately. We compared and verified the model on the authoritative open dataset "Anatomical Tracings of Lesions After Stroke" (ATLAS), which fully demonstrates the effectiveness of our model.
Affiliation(s)
- Manjin Sheng: School of Informatics, Xiamen University, Xiamen, China
- Wenjie Xu: School of Informatics, Xiamen University, Xiamen, China
- Jane Yang: Department of Cognitive Science, University of California, San Diego, San Diego, CA, United States
- Zhongjie Chen: Department of Neurology, Zhongshan Hospital, Xiamen University, Xiamen, China
44. Bai N, Lin R, Wang Z, Cai S, Huang J, Su Z, Yao Y, Wen F, Li H, Huang Y, Zhao Y, Xia T, Lei M, Yang W, Qiu Z. Exploring New Characteristics: Using Deep Learning and 3D Reconstruction to Compare the Original COVID-19 and Its Delta Variant Based on Chest CT. Front Mol Biosci 2022; 9:836862. [PMID: 35359591] [PMCID: PMC8961806] [DOI: 10.3389/fmolb.2022.836862]
Abstract
Purpose: Computer-aided diagnostic methods were used to compare the characteristics of the Original COVID-19 and its Delta Variant.
Methods: This was a retrospective study. A deep learning segmentation model was applied to segment lungs and infections in CT. Three-dimensional (3D) reconstruction was used to create 3D models of the patient's lungs and infections. A stereoscopic segmentation method was proposed, which can subdivide the 3D lung into five lobes and 18 segments. An expert-based CT scoring system was improved, and artificial intelligence was used to score automatically instead of visual scoring. Non-linear regression and quantitative analysis were used to analyze the dynamic changes in the percentages of infection (POI).
Results: The POI in the five lung lobes of all patients were calculated and converted into CT scores. The CT scores of Original COVID-19 patients and Delta Variant patients since the onset of initial symptoms were fitted over time, respectively. The peak was found to occur on day 11 in Original COVID-19 patients and on day 15 in Delta Variant patients. The time course of lung changes in CT of Delta Variant patients was redetermined as early stage (0-3 days), progressive and peak stage (4-16 days), and absorption stage (17-42 days). The first RT-PCR negative time in Original COVID-19 patients appeared earlier than in Delta Variant patients (22 [17-30] vs. 39 [31-44], p < 0.001). Delta Variant patients had more re-detectable positive RT-PCR test results than Original COVID-19 patients after the first negative RT-PCR time (30.5% vs. 17.1%). In the early stage, CT scores in the right lower lobe differed significantly (Delta Variant vs. Original COVID-19, 0.8 ± 0.6 vs. 1.3 ± 0.6, p = 0.039). In the absorption stage, CT scores of the right middle lobe differed significantly (Delta Variant vs. Original COVID-19, 0.6 ± 0.7 vs. 0.3 ± 0.4, p = 0.012). The left and right lower lobes contributed most to lung involvement at any given time.
Conclusion: Compared with the Original COVID-19, the Delta Variant has a longer duration of lung changes, more re-detectable positive RT-PCR test results, different locations of pneumonia, and more lesions in the early stage, and its peak of infection occurred later.
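The conversion from per-lobe percentage of infection (POI) to a CT score can be sketched as below. The five score bands are an assumption based on common expert CT scoring systems for COVID-19, not the exact thresholds used in this study, and `ct_score` is an illustrative helper.

```python
def ct_score(poi: float) -> int:
    # Map a lobe's percentage of infection (0.0-1.0) to a 0-5 score:
    # 0 none, 1 <5%, 2 5-25%, 3 25-50%, 4 50-75%, 5 >75% (assumed bands).
    if poi <= 0:
        return 0
    bands = [0.05, 0.25, 0.50, 0.75]
    return 1 + sum(poi > b for b in bands)

# Total severity score for one scan: sum over the five lung lobes.
lobe_pois = [0.02, 0.30, 0.10, 0.60, 0.80]
total = sum(ct_score(p) for p in lobe_pois)
```

Summing the five per-lobe scores yields a scan-level severity score (0-25 under these assumed bands), which is the kind of quantity the abstract fits over time since symptom onset.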
Affiliation(s)
- Na Bai: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Ruikai Lin: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Zhiwei Wang: China United Network Communications Corporation Heilongjiang Branch, Harbin, China
- Shengyan Cai: Hongqi Hospital Affiliated to Mudanjiang Medical University, Mudanjiang, China
- Jianliang Huang: Zhangjiajie Hospital Affiliated to Hunan Normal University, Zhangjiajie, China
- Zhongrui Su: Zhangjiajie Hospital Affiliated to Hunan Normal University, Zhangjiajie, China
- Yuanzhen Yao: Zhangjiajie Hospital Affiliated to Hunan Normal University, Zhangjiajie, China
- Fang Wen: Medical College of Jishou University, Jishou, China
- Han Li: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Yuxin Huang: Heilongjiang Tuomeng Technology Co. Ltd., Harbin, China
- Yi Zhao: Heilongjiang Tuomeng Technology Co. Ltd., Harbin, China
- Tao Xia: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Mingsheng Lei (corresponding author): Zhangjiajie Hospital Affiliated to Hunan Normal University, Zhangjiajie, China
- Weizhen Yang (corresponding author): Hongqi Hospital Affiliated to Mudanjiang Medical University, Mudanjiang, China
- Zhaowen Qiu (corresponding author): College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
45.
Abstract
In magnetic resonance imaging (MRI) segmentation, conventional approaches use U-Net models with encoder-decoder structures, segmentation models based on vision transformers, or models that combine a vision transformer with an encoder-decoder structure. However, conventional models are large and slow to compute and, in vision transformer models, the computation increases sharply with image size. To overcome these problems, this paper proposes a model that combines Swin transformer blocks with a lightweight U-Net-type model built on a HarDNet-block-based encoder-decoder structure. To retain the hierarchical transformer and shifted-windows features of the Swin transformer, the Swin transformer is used in the first skip connection layer of the encoder instead of in the encoder-decoder bottleneck. The proposed model, called STHarDNet, was evaluated by splitting the Anatomical Tracings of Lesions After Stroke (ATLAS) dataset, which comprises 229 T1-weighted MRI images, into training and validation sets. It achieved Dice, IoU, precision, and recall values of 0.5547, 0.4185, 0.6764, and 0.5286, respectively, which are better than those of the state-of-the-art models U-Net, SegNet, PSPNet, FCHarDNet, TransHarDNet, Swin Transformer, Swin UNet, X-Net, and D-UNet. Thus, STHarDNet improves the accuracy and speed of MRI image-based stroke diagnosis.
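The four metrics this entry reports are all derivable from a pixel-level confusion matrix; the small sketch below uses an illustrative helper, not the STHarDNet evaluation code. Note that for a single confusion matrix Dice = 2*IoU/(1+IoU), although dataset-averaged values, as reported here, need not satisfy that identity.

```python
def seg_metrics(tp: int, fp: int, fn: int):
    # Pixel-level segmentation metrics from true positives, false
    # positives, and false negatives (true negatives are not needed).
    iou = tp / (tp + fp + fn)
    dice = 2 * tp / (2 * tp + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return iou, dice, precision, recall

iou, dice, precision, recall = seg_metrics(tp=6, fp=2, fn=2)
# For one confusion matrix, dice == 2*iou / (1 + iou) holds exactly.
```

This also explains why Dice is always numerically higher than IoU for the same prediction, as in the 0.5547 vs. 0.4185 pair above.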
46. Song H, Huang Z, Feng L, Zhong Y, Wen C, Guo J. RAFF-Net: An improved tongue segmentation algorithm based on residual attention network and multiscale feature fusion. Digit Health 2022; 8:20552076221136362. [PMID: 36339902] [PMCID: PMC9634193] [DOI: 10.1177/20552076221136362]
Abstract
Objective: Due to the complexity of face images, tongue segmentation is susceptible to interference from uneven tongue texture, the lips, and the face, causing traditional methods to fail to segment the tongue accurately. To address this problem, RAFF-Net, an automatic tongue region segmentation network based on a residual attention network and multiscale feature fusion, is proposed. It aims to improve tongue segmentation accuracy and achieve end-to-end automated segmentation.
Methods: Based on the UNet backbone, different numbers of ResBlocks combined with a Squeeze-and-Excitation (SE) block were used as the encoder to extract layered image features. The decoder structure of UNet was simplified and the number of model parameters reduced. Meanwhile, a multiscale feature fusion module was designed, and a custom loss function was used instead of the common cross-entropy loss to optimize the network parameters and further improve detection accuracy.
Results: RAFF-Net achieved a Mean Intersection over Union (MIoU) of 97.85% and an F1-score of 97.73%, improvements of 0.56% and 0.46%, respectively, over the original UNet; ablation experiments demonstrated that the improved algorithm contributes to a better tongue segmentation effect.
Conclusion: This study combined a residual attention network with multiscale feature fusion to effectively improve the segmentation accuracy of the tongue region, and optimized the input and output of the UNet network using different numbers of ResBlocks, the SE block, multiscale feature fusion, and a weighted loss function, increasing the stability of the network and improving its overall effect.
Affiliation(s)
- Haibei Song: School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Zonghai Huang: School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Li Feng: School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Yanmei Zhong: School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Chuanbiao Wen: School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Jinhong Guo: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
47. Huang B, Wei Z, Tang X, Fujita H, Cai Q, Gao Y, Wu T, Zhou L. Deep learning network for medical volume data segmentation based on multi axial plane fusion. Comput Methods Programs Biomed 2021; 212:106480. [PMID: 34736168] [DOI: 10.1016/j.cmpb.2021.106480]
Abstract
Background and Objective: High-dimensional data generally contains more accurate information for medical images; e.g., computerized tomography (CT) data can depict the three-dimensional structure of organs more precisely. However, high-dimensional data often requires enormous computation and memory in deep learning convolution networks, while dimensionality reduction usually leads to performance degradation.
Methods: In this paper, a two-dimensional deep learning segmentation network is proposed for medical volume data based on multi-pinacoidal plane fusion to cover more information under a controlled computation budget. The approach is compatible with different backbones while using the proposed model to extract the global information between different input layers.
Results: The approach works with different backbone networks. Using it, DeepUnet's Dice coefficient (Dice) and Positive Predictive Value (PPV) are 0.883 and 0.982, respectively, showing satisfactory progress. Various backbones can benefit from the method.
Conclusions: Through the comparison of different backbones, it can be found that the proposed network with multi-pinacoidal plane fusion achieves better results both quantitatively and qualitatively.
Affiliation(s)
- Bo Huang: Shanghai University of Engineering Science, 333 Longteng Road, Songjiang District, Shanghai, 201620, China
- Ziran Wei: Shanghai Changzheng Hospital, 415 Fengyang Road, Huangpu District, Shanghai, 200003, China
- Xianhua Tang: Changzhou United Imaging Healthcare Surgical Technology Co., Ltd, No.5 Longfan Road, Wujin High-Tech Industrial Development Zone, Changzhou, China
- Hamido Fujita: Faculty of Information Technology, Ho Chi Minh City University of Technology (HUTECH), Ho Chi Minh City, Vietnam; i-SOMET.org Incorporated Association, Iwate 020-0104, Japan; Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain; College of Mathematical Sciences, Harbin Engineering University, Harbin 150001, China
- Qingping Cai: Shanghai Changzheng Hospital, 415 Fengyang Road, Huangpu District, Shanghai, 200003, China
- Yongbin Gao: Shanghai University of Engineering Science, 333 Longteng Road, Songjiang District, Shanghai, 201620, China
- Tao Wu: Shanghai University of Medicine & Health Sciences, Shanghai, China
- Liang Zhou: Shanghai University of Medicine & Health Sciences, Shanghai, China
48. Wang S, Li C, Wang R, Liu Z, Wang M, Tan H, Wu Y, Liu X, Sun H, Yang R, Liu X, Chen J, Zhou H, Ben Ayed I, Zheng H. Annotation-efficient deep learning for automatic medical image segmentation. Nat Commun 2021; 12:5915. [PMID: 34625565] [PMCID: PMC8501087] [DOI: 10.1038/s41467-021-26216-9]
Abstract
Automatic medical image segmentation plays a critical role in scientific research and medical care. Existing high-performance deep learning methods typically rely on large training datasets with high-quality manual annotations, which are difficult to obtain in many clinical applications. Here, we introduce Annotation-effIcient Deep lEarning (AIDE), an open-source framework to handle imperfect training datasets. Methodological analyses and empirical evaluations are conducted, and we demonstrate that AIDE surpasses conventional fully-supervised models by presenting better performance on open datasets possessing scarce or noisy annotations. We further test AIDE in a real-life case study for breast tumor segmentation. Three datasets containing 11,852 breast images from three medical centers are employed, and AIDE, utilizing 10% training annotations, consistently produces segmentation maps comparable to those generated by fully-supervised counterparts or provided by independent radiologists. The 10-fold enhanced efficiency in utilizing expert labels has the potential to promote a wide range of biomedical applications.
Affiliation(s)
- Shanshan Wang: Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China; Peng Cheng Laboratory, Shenzhen, Guangdong, China; Pazhou Laboratory, Guangzhou, Guangdong, China
- Cheng Li: Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Rongpin Wang: Department of Medical Imaging, Guizhou Provincial People's Hospital, Guiyang, Guizhou, China
- Zaiyi Liu: Department of Medical Imaging, Guangdong General Hospital, Guangdong Academy of Medical Sciences, Guangzhou, Guangdong, China
- Meiyun Wang: Department of Medical Imaging, Henan Provincial People's Hospital & the People's Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Hongna Tan: Department of Medical Imaging, Henan Provincial People's Hospital & the People's Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Yaping Wu: Department of Medical Imaging, Henan Provincial People's Hospital & the People's Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Xinfeng Liu: Department of Medical Imaging, Guizhou Provincial People's Hospital, Guiyang, Guizhou, China
- Hui Sun: Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Rui Yang: Department of Urology, Renmin Hospital of Wuhan University, Wuhan, Hubei, China
- Xin Liu: Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Jie Chen: Peng Cheng Laboratory, Shenzhen, Guangdong, China; School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University, Shenzhen, Guangdong, China
- Huihui Zhou: Brain Cognition and Brain Disease Institute, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Hairong Zheng: Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
49. Bao Q, Mi S, Gang B, Yang W, Chen J, Liao Q. MDAN: Mirror Difference Aware Network for Brain Stroke Lesion Segmentation. IEEE J Biomed Health Inform 2021; 26:1628-1639. [PMID: 34543208] [DOI: 10.1109/jbhi.2021.3113460]
Abstract
Brain stroke lesion segmentation is of great importance for stroke rehabilitation neuroimaging analysis. Due to the large variance of stroke lesion shapes and the similarity of tissue intensity distributions, it remains a challenging task. To help detect abnormalities, the anatomical symmetries of brain magnetic resonance (MR) images have been widely used as visual cues in clinical practice. However, most methods do not fully utilize structural symmetry information in brain image segmentation. This paper presents a novel mirror difference aware network (MDAN) for stroke lesion segmentation in an encoder-decoder architecture, aiming at holistically exploiting the symmetries of image features. Specifically, a differential feature augmentation (DFA) module is developed in the encoding path to highlight the semantically pathological asymmetries of features in abnormalities. In the DFA module, a Siamese contrastive supervised loss is designed to enhance discriminative features, and a mirror position-based difference augmentation (MDA) module is used to further magnify the discrepancy information. Moreover, mirror feature fusion (MFF) modules are applied to fuse and transfer the information of both the original input and the horizontally flipped features to the decoding path. Extensive experiments on the Anatomical Tracings of Lesions After Stroke (ATLAS) dataset show that the proposed MDAN outperforms state-of-the-art methods.
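The core intuition behind mirror-difference cues, that roughly symmetric anatomy cancels while unilateral lesions stand out, can be sketched on an image or feature map by differencing it against its left-right flip. This is a simplified illustration of the idea, not the MDAN module itself, which operates on learned features with additional augmentation and fusion steps.

```python
import numpy as np

def mirror_difference(feat: np.ndarray) -> np.ndarray:
    # |F - flip_LR(F)|: symmetric structure cancels out, and any
    # left-right asymmetry shows up at both mirrored positions.
    return np.abs(feat - feat[..., ::-1])

# A left-right symmetric "image": every row reads 1, 2, 2, 1.
sym = np.tile(np.array([1.0, 2.0, 2.0, 1.0]), (4, 1))
lesion = sym.copy()
lesion[1, 0] += 5.0  # a unilateral "lesion" breaks the symmetry
```

In practice the brain is only approximately symmetric and rarely centered, so real pipelines register the image to a mid-sagittal plane first; without that step the difference map would light up everywhere.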
|
50
|
Dual-Path Attention Compensation U-Net for Stroke Lesion Segmentation. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2021; 2021:7552185. [PMID: 34504522 PMCID: PMC8423551 DOI: 10.1155/2021/7552185] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/09/2021] [Accepted: 08/19/2021] [Indexed: 11/17/2022]
Abstract
For the segmentation task of stroke lesions, the attention U-Net model, based on the self-attention mechanism, can suppress irrelevant regions in an input image while highlighting salient features useful for specific tasks. However, when the lesion is small and its contour is blurred, attention U-Net may generate wrong attention coefficient maps, leading to incorrect segmentation results. To cope with this issue, we propose a dual-path attention compensation U-Net (DPAC-UNet) network, which consists of a primary path network and an auxiliary path network. Both are attention U-Net models and identical in structure. The primary path network is the core network that performs accurate lesion segmentation and outputs the final segmentation result. The auxiliary path network generates auxiliary attention compensation coefficients and sends them to the primary path network to compensate for and correct possible attention coefficient errors. To realize the compensation mechanism of DPAC-UNet, we propose a weighted binary cross-entropy Tversky (WBCE-Tversky) loss to train the primary path network for accurate segmentation, and another compound loss function, called tolerance loss, to train the auxiliary path network to generate auxiliary compensation attention coefficient maps with an expanded coverage area for the compensation operations. We conducted segmentation experiments using the 239 MRI scans of the Anatomical Tracings of Lesions After Stroke (ATLAS) dataset to evaluate the performance and effectiveness of our method. The experimental results show that the DSC score of the proposed DPAC-UNet is 6% higher than that of the single-path attention U-Net, and also higher than those of existing segmentation methods in the related literature. Therefore, our method demonstrates powerful abilities in stroke lesion segmentation.
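The Tversky component of the WBCE-Tversky loss mentioned above follows the standard Tversky index, which generalizes the Dice loss by weighting false positives and false negatives separately. The sketch below shows only that standard Tversky term, not the paper's full compound loss (the weighted-BCE part and the specific weight values are not reproduced here); the default alpha/beta values are illustrative assumptions.

```python
import numpy as np

def tversky_loss(pred: np.ndarray, target: np.ndarray,
                 alpha: float = 0.7, beta: float = 0.3,
                 eps: float = 1e-7) -> float:
    """1 - Tversky index over flattened probability maps.

    alpha weights false positives, beta weights false negatives;
    alpha = beta = 0.5 recovers the soft Dice loss. A larger beta
    penalizes missed lesion pixels more, useful for small lesions.
    """
    pred, target = pred.ravel(), target.ravel()
    tp = np.sum(pred * target)               # soft true positives
    fp = np.sum(pred * (1.0 - target))       # soft false positives
    fn = np.sum((1.0 - pred) * target)       # soft false negatives
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)

# A perfect prediction drives the loss to ~0; misses raise it toward 1.
y = np.array([1.0, 0.0, 1.0])
assert abs(tversky_loss(y, y)) < 1e-6
assert 0.0 < tversky_loss(np.array([1.0, 0.0, 0.0]), y) < 1.0
```

In practice this term would be combined with a (weighted) binary cross-entropy term and minimized with a framework such as PyTorch; the NumPy form here is just to make the index itself concrete.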
|