51
Li L, Hu Z, Huang Y, Zhu W, Zhao C, Wang Y, Chen M, Yu J. BP-Net: Boundary and perfusion feature guided dual-modality ultrasound video analysis network for fibrous cap integrity assessment. Comput Med Imaging Graph 2023; 107:102246. [PMID: 37210966] [DOI: 10.1016/j.compmedimag.2023.102246] [Received: 01/05/2023] [Revised: 05/09/2023] [Accepted: 05/10/2023] [Indexed: 05/23/2023]
Abstract
Ultrasonography is one of the main imaging methods for monitoring and diagnosing atherosclerosis due to its non-invasiveness and low cost. Automatic differentiation of carotid plaque fibrous cap integrity from multi-modal ultrasound videos has significant diagnostic and prognostic value for patients with cardiovascular and cerebrovascular disease. However, the task faces several challenges, including high variation in plaque location and shape, the absence of an analysis mechanism focused on the fibrous cap, and the lack of an effective mechanism for capturing the relevance among multi-modal data during feature fusion and selection. To overcome these challenges, we propose a new target boundary and perfusion feature guided video analysis network (BP-Net), based on conventional B-mode ultrasound and contrast-enhanced ultrasound videos, for assessing the integrity of the fibrous cap. Building on our previously proposed plaque auto-tracking network, BP-Net introduces a plaque edge attention module and a reverse mechanism to focus the dual-video analysis on the fibrous cap of plaques. Moreover, to fully exploit the rich information on the fibrous cap and inside/outside the plaque, we propose a feature fusion module for the B-mode and contrast videos that filters out the most valuable features for fibrous cap integrity assessment. Finally, multi-head convolution attention is proposed and embedded into a transformer-based network to capture semantic features and global context information for accurate evaluation of fibrous cap integrity. The experimental results demonstrate that the proposed method has high accuracy and generalizability, with an accuracy of 92.35% and an AUC of 0.935, outperforming state-of-the-art deep learning based methods. A series of comprehensive ablation studies confirms the effectiveness of each proposed component and shows great potential for clinical application.
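The abstract does not spell out the fusion module's internals; as a hedged illustration, one common pattern for filtering the most valuable features from two modality branches is a learned sigmoid gate that blends the B-mode and contrast features. All shapes and weights below (`W_g`, `b_g`) are hypothetical stand-ins, not the authors' implementation:

```python
import numpy as np

def gated_fusion(feat_b, feat_c, W_g, b_g):
    """Fuse B-mode and contrast features with a learned sigmoid gate.

    feat_b, feat_c: (N, C) feature vectors from the two video branches.
    W_g: (2C, C) gate weights; b_g: (C,) gate bias (both hypothetical).
    """
    concat = np.concatenate([feat_b, feat_c], axis=1)        # (N, 2C)
    gate = 1.0 / (1.0 + np.exp(-(concat @ W_g + b_g)))       # (N, C) in (0, 1)
    return gate * feat_b + (1.0 - gate) * feat_c             # convex blend

rng = np.random.default_rng(0)
fb, fc = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
fused = gated_fusion(fb, fc, rng.normal(size=(16, 8)), np.zeros(8))
print(fused.shape)  # (4, 8)
```

Because the gate lies in (0, 1), each fused value is a convex combination of the two branch features, so neither modality is discarded outright.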
Affiliation(s)
- Leyin Li
- School of Information Science and Technology, Fudan University, Shanghai, China
- Zhaoyu Hu
- School of Information Science and Technology, Fudan University, Shanghai, China
- Yunqian Huang
- Department of Ultrasound, Tongren Hospital, Shanghai Jiao Tong University, Shanghai, China
- Wenqian Zhu
- Department of Ultrasound, Tongren Hospital, Shanghai Jiao Tong University, Shanghai, China
- Chengqian Zhao
- School of Information Science and Technology, Fudan University, Shanghai, China
- Yuanyuan Wang
- School of Information Science and Technology, Fudan University, Shanghai, China
- Man Chen
- Department of Ultrasound, Tongren Hospital, Shanghai Jiao Tong University, Shanghai, China
- Jinhua Yu
- School of Information Science and Technology, Fudan University, Shanghai, China
52
Černý M, Kybic J, Májovský M, Sedlák V, Pirgl K, Misiorzová E, Lipina R, Netuka D. Fully automated imaging protocol independent system for pituitary adenoma segmentation: a convolutional neural network-based model on sparsely annotated MRI. Neurosurg Rev 2023; 46:116. [PMID: 37162632] [DOI: 10.1007/s10143-023-02014-3] [Received: 01/31/2023] [Revised: 03/08/2023] [Accepted: 04/28/2023] [Indexed: 05/11/2023]
Abstract
This study aims to develop a fully automated, imaging protocol independent system for pituitary adenoma segmentation from magnetic resonance imaging (MRI) scans that can work without user interaction, and to evaluate its accuracy and utility for clinical applications. We trained two independent artificial neural networks on MRI scans of 394 patients. The scans were acquired according to various imaging protocols over the course of 11 years on 1.5T and 3T MRI systems. The segmentation model assigned a class label to each input pixel (pituitary adenoma, internal carotid artery, normal pituitary gland, background). The slice selection model classified slices as clinically relevant (structures of interest in slice) or irrelevant (anterior or posterior to the sella turcica). We used MRI data of another 99 patients to evaluate the performance of the model during training. We validated the model on a prospective cohort of 28 patients, achieving Dice coefficients of 0.910, 0.719, and 0.240 for the tumour, internal carotid artery, and normal gland labels, respectively. The slice selection model achieved 82.5% accuracy, 88.7% sensitivity, 76.7% specificity, and an AUC of 0.904. A human expert rated 71.4% of the segmentation results as accurate, 21.4% as slightly inaccurate, and 7.1% as coarsely inaccurate. Our model achieved good results, comparable with recent works by other authors, on the largest dataset to date and generalized well across imaging protocols. We discuss future clinical applications and their practical considerations; models and frameworks for clinical use have yet to be developed and evaluated.
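The per-label Dice coefficients reported above are computed from label masks in a few lines; a minimal sketch of that computation (the toy masks are illustrative, not the paper's data):

```python
import numpy as np

def dice(pred, gt, label):
    """Dice coefficient for one class label in a segmentation mask."""
    p, g = (pred == label), (gt == label)
    denom = p.sum() + g.sum()
    return 2.0 * np.logical_and(p, g).sum() / denom if denom else 1.0

# toy 2D masks: 0 = background, 1 = adenoma, 2 = carotid artery
gt   = np.array([[1, 1, 0], [2, 2, 0], [0, 0, 0]])
pred = np.array([[1, 0, 0], [2, 2, 0], [0, 0, 0]])
print(round(dice(pred, gt, 1), 3))  # 0.667
print(dice(pred, gt, 2))            # 1.0
```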
Affiliation(s)
- Martin Černý
- Department of Neurosurgery and Neurooncology, 1st Faculty of Medicine, Charles University, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
- 1st Faculty of Medicine, Charles University Prague, Kateřinská 1660/32, 121 08, Praha 2, Czech Republic
- Jan Kybic
- Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, 166 27, Praha 6, Czech Republic
- Martin Májovský
- Department of Neurosurgery and Neurooncology, 1st Faculty of Medicine, Charles University, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
- Vojtěch Sedlák
- Department of Radiodiagnostics, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
- Karin Pirgl
- Department of Neurosurgery and Neurooncology, 1st Faculty of Medicine, Charles University, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
- 3rd Faculty of Medicine, Charles University Prague, Ruská 87, 100 00, Praha 10, Czech Republic
- Eva Misiorzová
- Department of Neurosurgery, Faculty of Medicine, University of Ostrava, University Hospital Ostrava, 17. listopadu 1790/5, 708 52, Ostrava-Poruba, Czech Republic
- Radim Lipina
- Department of Neurosurgery, Faculty of Medicine, University of Ostrava, University Hospital Ostrava, 17. listopadu 1790/5, 708 52, Ostrava-Poruba, Czech Republic
- David Netuka
- Department of Neurosurgery and Neurooncology, 1st Faculty of Medicine, Charles University, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
53
Wang KN, Zhuang S, Ran QY, Zhou P, Hua J, Zhou GQ, He X. DLGNet: A dual-branch lesion-aware network with the supervised Gaussian Mixture model for colon lesions classification in colonoscopy images. Med Image Anal 2023; 87:102832. [PMID: 37148864] [DOI: 10.1016/j.media.2023.102832] [Received: 04/16/2022] [Revised: 01/20/2023] [Accepted: 04/20/2023] [Indexed: 05/08/2023]
Abstract
Colorectal cancer is one of the malignant tumors with the highest mortality, largely because it lacks obvious early symptoms and is usually discovered at an advanced stage. Thus, the automatic and accurate classification of early colon lesions is of great significance for clinically estimating the status of colon lesions and formulating appropriate diagnostic programs. However, classifying full-stage colon lesions is challenging due to the large inter-class similarities and intra-class differences of the images. In this work, we propose a novel dual-branch lesion-aware neural network (DLGNet) that classifies intestinal lesions by exploring the intrinsic relationship between diseases; it is composed of four modules: a lesion localization module, a dual-branch classification module, an attention guidance module, and an inter-class Gaussian loss function. Specifically, the dual-branch module integrates the original image with the lesion patch obtained by the lesion localization module to explore and interact with lesion-specific features from global and local perspectives. The attention guidance module directs the model toward disease-specific features by learning long-range dependencies through spatial and channel attention after network feature learning. Finally, the inter-class Gaussian loss function is proposed, which assumes that each feature extracted by the network follows an independent Gaussian distribution and makes the clustering within each class more compact, thereby improving the discriminative ability of the network. Extensive experiments on 2568 collected colonoscopy images achieve an average accuracy of 91.50%, surpassing state-of-the-art methods. This study is the first to classify colon lesions at every stage and achieves promising colon disease classification performance. To motivate the community, we have made our code publicly available at https://github.com/soleilssss/DLGNet.
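The exact form of the Gaussian loss is not given in the abstract; under one plausible reading of the independent-Gaussian assumption (unit variance per class), maximizing the per-class log-likelihood reduces to a center-loss-style compactness term. A hedged sketch of that reading, not the authors' confirmed loss:

```python
import numpy as np

def intra_class_compactness(features, labels):
    """Mean squared distance of each feature to its class mean.

    Under a unit-variance independent-Gaussian assumption per class,
    minimising this term is equivalent to maximising the per-class
    log-likelihood, pulling same-class features together.
    """
    loss, classes = 0.0, np.unique(labels)
    for c in classes:
        fc = features[labels == c]
        loss += np.sum((fc - fc.mean(axis=0)) ** 2)
    return loss / len(features)

feats = np.array([[0.0, 0.0], [0.0, 2.0], [5.0, 5.0], [5.0, 5.0]])
labs = np.array([0, 0, 1, 1])
print(intra_class_compactness(feats, labs))  # 0.5
```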
Affiliation(s)
- Kai-Ni Wang
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
- Shuaishuai Zhuang
- The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Qi-Yong Ran
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
- Ping Zhou
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
- Jie Hua
- The First Affiliated Hospital of Nanjing Medical University, Nanjing, China; Liyang People's Hospital, Liyang Branch Hospital of Jiangsu Province Hospital, Liyang, China
- Guang-Quan Zhou
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
- Xiaopu He
- The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
54
Pang J, Jiang C, Chen Y, Chang J, Feng M, Wang R, Yao J. 3D Shuffle-Mixer: An efficient context-aware vision learner of Transformer-MLP paradigm for dense prediction in medical volume. IEEE Trans Med Imaging 2023; 42:1241-1253. [PMID: 35849668] [DOI: 10.1109/tmi.2022.3191974] [Indexed: 05/04/2023]
Abstract
Dense prediction in medical volumes provides enriched guidance for clinical analysis. CNN backbones have hit a bottleneck due to their lack of long-range dependencies and global context modeling power. Recent works have proposed combining vision transformers with CNNs, owing to the transformer's strong global capture and learning capabilities. However, most works simply apply a pure transformer, which carries several fatal flaws (i.e., lack of inductive bias, heavy computation, and little consideration for 3D data). Therefore, designing an elegant and efficient vision transformer learner for dense prediction in medical volumes is promising and challenging. In this paper, we propose a novel 3D Shuffle-Mixer network following a new Local Vision Transformer-MLP paradigm for medical dense prediction. In our network, a local vision transformer block shuffles and learns spatial context from full-view slices of the rearranged volume, a residual axial-MLP mixes and captures the remaining volume context in a slice-aware manner, and an MLP view aggregator projects the learned full-view rich context onto the volume feature in a view-aware manner. Moreover, an Adaptive Scaled Enhanced Shortcut is proposed for the local vision transformer to adaptively enhance features along the spatial and channel dimensions, and a CrossMerge is proposed to appropriately skip-connect multi-scale features in the pyramid architecture. Extensive experiments demonstrate that the proposed model outperforms other state-of-the-art medical dense prediction methods.
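A residual axial-MLP, as described, mixes context along one axis of the volume with a shared linear map; a minimal numpy sketch under assumed shapes (the paper's actual block is more elaborate, and the weight matrix `W` here is a hypothetical stand-in):

```python
import numpy as np

def axial_mlp(volume, W, axis):
    """Mix a feature volume along one spatial axis with a shared
    linear map, plus a residual connection (slice-aware mixing).

    volume: (D, H, W_sp, C); W: (L, L) where L is the size of `axis`.
    """
    mixed = np.moveaxis(volume, axis, -1)      # bring target axis last
    mixed = mixed @ W                          # shared MLP over that axis
    mixed = np.moveaxis(mixed, -1, axis)       # restore layout
    return volume + mixed                      # residual path

vol = np.ones((2, 4, 4, 3))
out = axial_mlp(vol, np.eye(4) * 0.5, axis=1)  # mix along the H axis
print(out.shape)  # (2, 4, 4, 3)
```

With the toy half-identity weights, every output value is 1 + 0.5 = 1.5, which makes the residual path easy to verify.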
55
Jia Z, Zhu H, Zhu J, Ma P. Two-branch network for brain tumor segmentation using attention mechanism and super-resolution reconstruction. Comput Biol Med 2023; 157:106751. [PMID: 36934534] [DOI: 10.1016/j.compbiomed.2023.106751] [Received: 10/16/2022] [Revised: 02/12/2023] [Accepted: 03/06/2023] [Indexed: 03/17/2023]
Abstract
Accurate segmentation of brain tumors plays an important role in MRI-based diagnosis and treatment monitoring. However, the degree of lesions in each patient's tumor region is usually inconsistent, with large structural differences, and brain tumor MR images are characterized by low contrast and blur; as a result, current deep learning algorithms often cannot achieve accurate segmentation. To address this problem, we propose a novel end-to-end brain tumor segmentation algorithm that integrates an improved 3D U-Net network and super-resolution image reconstruction into one framework. In addition, a coordinate attention module is embedded before the upsampling operation of the backbone network, which enhances the capture of local texture features and global location features. To demonstrate the segmentation results of the proposed algorithm on different brain tumor MR images, we trained and evaluated it on BraTS datasets and compared it with other deep learning algorithms using Dice similarity scores. On the BraTS2021 dataset, the proposed algorithm achieves Dice similarity scores of 89.61%, 88.30%, and 91.05%, and 95% Hausdorff distances of 1.414 mm, 7.810 mm, and 4.583 mm for enhancing tumors, tumor cores, and whole tumors, respectively. The experimental results show that our method outperforms the baseline 3D U-Net and yields good performance on different datasets, indicating that it is robust for segmenting brain tumor MR images whose structures vary considerably.
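The coordinate attention module referenced here follows the published design (Hou et al., CVPR 2021): direction-aware pooling along each spatial axis, channel transforms, and sigmoid gates. A simplified single-feature-map sketch that replaces the shared bottleneck convolution with plain per-channel matrices (a deliberate simplification, not the module's full form):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coordinate_attention(x, Wh, Ww):
    """Simplified coordinate attention for one feature map x: (C, H, W).

    Direction-aware pooling keeps positional information along each
    axis; channel transforms (stand-ins for the paper's 1x1 convs)
    produce sigmoid gates that re-weight the input.
    """
    pool_h = x.mean(axis=2)            # (C, H): pooled along width
    pool_w = x.mean(axis=1)            # (C, W): pooled along height
    a_h = sigmoid(Wh @ pool_h)         # (C, H) gate, Wh: (C, C)
    a_w = sigmoid(Ww @ pool_w)         # (C, W) gate, Ww: (C, C)
    return x * a_h[:, :, None] * a_w[:, None, :]

rng = np.random.default_rng(1)
x = rng.normal(size=(3, 5, 5))
y = coordinate_attention(x, np.eye(3), np.eye(3))
print(y.shape)  # (3, 5, 5)
```

Since both gates lie in (0, 1), the module can only attenuate features, never amplify them, while preserving where along each axis the salient responses occur.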
Affiliation(s)
- Zhaohong Jia
- School of Internet, Anhui University, Hefei 230039, China
- Hongxin Zhu
- School of Internet, Anhui University, Hefei 230039, China
- Junan Zhu
- School of Internet, Anhui University, Hefei 230039, China
- Ping Ma
- School of Internet, Anhui University, Hefei 230039, China
56
Al Khalil Y, Amirrajab S, Lorenz C, Weese J, Pluim J, Breeuwer M. Reducing segmentation failures in cardiac MRI via late feature fusion and GAN-based augmentation. Comput Biol Med 2023; 161:106973. [PMID: 37209615] [DOI: 10.1016/j.compbiomed.2023.106973] [Received: 12/16/2022] [Revised: 04/05/2023] [Accepted: 04/22/2023] [Indexed: 05/22/2023]
Abstract
Cardiac magnetic resonance (CMR) image segmentation is an integral step in the analysis of cardiac function and the diagnosis of heart-related diseases. While recent deep learning-based approaches to automatic segmentation have shown great promise in alleviating the need for manual segmentation, most are not applicable to realistic clinical scenarios. This is largely due to training on mainly homogeneous datasets, without the variation in acquisition that typically occurs in multi-vendor and multi-site settings, and without pathological data. Such approaches frequently exhibit degraded prediction performance, particularly on outlier cases commonly associated with difficult pathologies, artifacts, and extensive changes in tissue shape and appearance. In this work, we present a model aimed at segmenting all three cardiac structures in a multi-center, multi-disease, and multi-view scenario. We propose a pipeline that addresses the challenges of segmenting such heterogeneous data, consisting of heart region detection, augmentation through image synthesis, and a late-fusion segmentation approach. Extensive experiments and analysis demonstrate the ability of the proposed approach to handle outlier cases during both training and testing, allowing better adaptation to unseen and difficult examples. Overall, we show that effectively reducing segmentation failures on outlier cases improves not only the average segmentation performance, but also the estimation of clinical parameters, leading to better consistency in derived metrics.
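Late fusion, in its simplest form, averages the class-probability maps of several segmentation models before taking the per-pixel argmax; a minimal sketch of that idea (the paper's fusion strategy may weight or combine models differently):

```python
import numpy as np

def late_fusion(prob_maps):
    """Fuse per-model class-probability maps by averaging, then take
    the argmax label per pixel.

    prob_maps: list of arrays shaped (K, H, W), each summing to 1
    over the K class axis.
    """
    mean_prob = np.mean(prob_maps, axis=0)   # (K, H, W)
    return np.argmax(mean_prob, axis=0)      # (H, W) label map

# two toy models scoring a 1x2 image with K = 2 classes
m1 = np.array([[[0.9, 0.2]], [[0.1, 0.8]]])
m2 = np.array([[[0.8, 0.6]], [[0.2, 0.4]]])
print(late_fusion([m1, m2]))  # [[0 1]]
```

On the second pixel the models disagree (0.8 vs 0.4 for class 1); averaging resolves the tie in favour of the more confident prediction.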
Affiliation(s)
- Yasmina Al Khalil
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Sina Amirrajab
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Josien Pluim
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Marcel Breeuwer
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands; Philips Healthcare, MR R&D - Clinical Science, Best, The Netherlands
57
Liu Z, Wei J, Li R, Zhou J. Learning multi-modal brain tumor segmentation from privileged semi-paired MRI images with curriculum disentanglement learning. Comput Biol Med 2023; 159:106927. [PMID: 37105113] [DOI: 10.1016/j.compbiomed.2023.106927] [Received: 11/12/2022] [Revised: 04/02/2023] [Accepted: 04/13/2023] [Indexed: 04/29/2023]
Abstract
Since the brain is the human body's primary command and control center, brain cancer is one of the most dangerous cancers. Automatic segmentation of brain tumors from multi-modal images is important in diagnosis and treatment. Due to the difficulty of obtaining paired multi-modal images in clinical practice, recent studies segment brain tumors relying solely on unpaired images and discarding the available paired images. Although these models remove the dependence on paired images, they cannot fully exploit the complementary information from different modalities, resulting in low unimodal segmentation accuracy. Hence, this work studies unimodal segmentation with privileged semi-paired images, i.e., a limited number of paired images is introduced into the training phase. Specifically, we present a novel two-step (intra-modality and inter-modality) curriculum disentanglement learning framework. Modality-specific style codes describe the attenuation of tissue features and image contrast, and modality-invariant content codes contain anatomical and functional information extracted from the input images. Besides, we address the problem of incomplete disentanglement by introducing constraints on the style and content spaces. Experiments on the BraTS2020 dataset show that our model outperforms competing models on unimodal segmentation, achieving average Dice scores of 82.91%, 72.62%, and 54.80% for WT (whole tumor), TC (tumor core), and ET (enhancing tumor), respectively. Finally, we evaluate our model's variable multi-modal brain tumor segmentation performance by introducing a fusion block (TFusion). The experimental results reveal that our model achieves the best WT segmentation performance across all 15 possible modality combinations, with 87.31% average accuracy. In summary, we propose a curriculum disentanglement learning framework for unimodal segmentation with privileged semi-paired images. Moreover, the benefits of the improved unimodal segmentation extend to variable multi-modal segmentation, demonstrating that improving unimodal segmentation performance is significant for brain tumor segmentation with missing modalities. Our code is available at https://github.com/scut-cszcl/SpBTS.
Affiliation(s)
- Zecheng Liu
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
- Jia Wei
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
- Rui Li
- Golisano College of Computing and Information Sciences, Rochester Institute of Technology, Rochester, NY, USA
- Jianlong Zhou
- Data Science Institute, University of Technology Sydney, Ultimo, NSW 2007, Australia
58
Li X, Jiang Y, Li M, Zhang J, Yin S, Luo H. MSFR-Net: Multi-modality and single-modality feature recalibration network for brain tumor segmentation. Med Phys 2023; 50:2249-2262. [PMID: 35962724] [DOI: 10.1002/mp.15933] [Received: 03/29/2022] [Revised: 05/16/2022] [Accepted: 06/14/2022] [Indexed: 11/11/2022]
Abstract
BACKGROUND Accurate and automated brain tumor segmentation from multi-modality MR images plays a significant role in tumor treatment. However, existing approaches mainly focus on multi-modality fusion while ignoring the correlations between individual modalities and tumor subcomponents. For example, T2-weighted images visualize edema well, and T1-contrast images provide good contrast between the enhancing tumor core and necrosis. In the actual clinical process, professional physicians also label tumors according to these characteristics. We design a method for brain tumor segmentation that utilizes both multi-modality fusion and single-modality characteristics. METHODS A multi-modality and single-modality feature recalibration network (MSFR-Net) is proposed for brain tumor segmentation from MR images. Specifically, multi-modality information and single-modality information are assigned to independent pathways. The multi-modality pathway explicitly learns the relationship between all modalities and all tumor subcomponents. The single-modality pathway learns the relationship between a single modality and its highly correlated tumor subcomponents. A dual recalibration module (DRM) is then designed to connect the parallel single-modality and multi-modality networks at multiple stages, unifying the two types of features into the same feature space. RESULTS Experiments on the BraTS 2015 and BraTS 2018 datasets show that the proposed method is competitive with and superior to other state-of-the-art methods. The proposed method achieved a Dice coefficient of 0.86 and a Hausdorff distance of 4.82 on the BraTS 2018 dataset, and a Dice coefficient of 0.80, a positive predictive value of 0.76, and a sensitivity of 0.78 on the BraTS 2015 dataset. CONCLUSIONS This work mirrors the manual labeling process of physicians and introduces the correlation between single modalities and tumor subcomponents into the segmentation network. The method improves the segmentation performance of brain tumors and can be applied in clinical practice. The code of the proposed method is available at: https://github.com/xiangQAQ/MSFR-Net.
Affiliation(s)
- Xiang Li
- Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China
- Yuchen Jiang
- Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China
- Minglei Li
- Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China
- Jiusi Zhang
- Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China
- Shen Yin
- Department of Mechanical and Industrial Engineering, Faculty of Engineering, Norwegian University of Science and Technology, Trondheim, Norway
- Hao Luo
- Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China
59
Feature generation and multi-sequence fusion based deep convolutional network for breast tumor diagnosis with missing MR sequences. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104536] [Indexed: 12/28/2022]
60
He Q, Dong M, Summerfield N, Glide-Hurst C. MAGNET: A modality-agnostic network for 3D medical image segmentation. Proc IEEE Int Symp Biomed Imaging 2023. [PMID: 38169907] [PMCID: PMC10760993] [DOI: 10.1109/isbi53787.2023.10230587] [Indexed: 01/05/2024]
Abstract
In this paper, we propose MAGNET, a novel modality-agnostic network for 3D medical image segmentation. Different from existing learning methods, MAGNET is specifically designed to handle real medical situations where multiple modalities/sequences are available during model training, but fewer are available or used at the time of clinical practice. Our results on multiple datasets show that MAGNET, trained on multi-modality data, has the unique ability to perform predictions using any subset of the training imaging modalities. It outperforms individually trained uni-modality models while further boosting performance when more modalities are available at testing.
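One common way to obtain this kind of any-subset behavior, and a plausible reading of such training setups, is to randomly drop whole input modalities during training so the network never relies on any single one; the sketch below is an assumption for illustration, not the paper's confirmed scheme:

```python
import numpy as np

def drop_modalities(volumes, rng, keep_prob=0.7):
    """Randomly zero out whole input modalities during training so the
    model learns to predict from any non-empty subset.

    volumes: (M, D, H, W) stack of M co-registered modalities.
    Returns the masked stack and the boolean keep mask.
    """
    M = volumes.shape[0]
    keep = rng.random(M) < keep_prob
    if not keep.any():                       # always keep at least one
        keep[rng.integers(M)] = True
    return volumes * keep[:, None, None, None], keep

rng = np.random.default_rng(7)
vols = np.ones((4, 2, 3, 3))                 # e.g. 4 MR sequences
masked, keep = drop_modalities(vols, rng)
print(masked.shape, keep.sum() >= 1)
```

At test time the same mask mechanism can encode whichever modalities are actually available for a given patient.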
Affiliation(s)
- Qisheng He
- Department of Computer Science, Wayne State University, 5057 Woodward Ave, Detroit, MI 48202
- Ming Dong
- Department of Computer Science, Wayne State University, 5057 Woodward Ave, Detroit, MI 48202
- Nicholas Summerfield
- Department of Human Oncology and Department of Medical Physics, University of Wisconsin-Madison, 600 Highland Ave, Madison, WI 53792
- Carri Glide-Hurst
- Department of Human Oncology and Department of Medical Physics, University of Wisconsin-Madison, 600 Highland Ave, Madison, WI 53792
61
Li J, Chen J, Tang Y, Wang C, Landman BA, Zhou SK. Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives. Med Image Anal 2023; 85:102762. [PMID: 36738650] [PMCID: PMC10010286] [DOI: 10.1016/j.media.2023.102762] [Received: 06/02/2022] [Revised: 01/18/2023] [Accepted: 01/27/2023] [Indexed: 02/01/2023]
Abstract
The Transformer, one of the latest technological advances in deep learning, has gained prevalence in natural language processing and computer vision. Since medical imaging bears some resemblance to computer vision, it is natural to inquire about the status quo of Transformers in medical imaging and ask: can Transformer models transform medical imaging? In this paper, we attempt to answer this question. After a brief introduction to the fundamentals of Transformers, especially in comparison with convolutional neural networks (CNNs), and a highlight of the key defining properties that characterize them, we offer a comprehensive review of state-of-the-art Transformer-based approaches for medical imaging, covering research progress in medical image segmentation, recognition, detection, registration, reconstruction, enhancement, etc. In particular, what distinguishes our review is its organization based on the Transformer's key defining properties, mostly derived from comparing the Transformer and CNN, and on the type of architecture, which specifies the manner in which the Transformer and CNN are combined, all helping readers to best understand the rationale behind the reviewed approaches. We conclude with discussions of future perspectives.
Affiliation(s)
- Jun Li
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Junyu Chen
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins Medical Institutes, Baltimore, MD, USA
- Yucheng Tang
- Department of Electrical and Computer Engineering, Vanderbilt University, Nashville, TN, USA
- Ce Wang
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Bennett A Landman
- Department of Electrical and Computer Engineering, Vanderbilt University, Nashville, TN, USA
- S Kevin Zhou
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China; School of Biomedical Engineering & Suzhou Institute for Advanced Research, Center for Medical Imaging, Robotics, and Analytic Computing & Learning (MIRACLE), University of Science and Technology of China, Suzhou 215123, China
62
Liang B, Tang C, Zhang W, Xu M, Wu T. N-Net: an UNet architecture with dual encoder for medical image segmentation. Signal Image Video Process 2023; 17:1-9. [PMID: 37362231] [PMCID: PMC10031177] [DOI: 10.1007/s11760-023-02528-9] [Received: 03/10/2022] [Revised: 07/08/2022] [Accepted: 02/07/2023] [Indexed: 06/28/2023]
Abstract
To assist physicians in diagnosis and treatment planning, accurate and automatic methods of organ segmentation are needed in clinical practice. UNet and its improved variants, such as UNet++ and UNet3+, have been powerful tools for medical image segmentation. In this paper, we focus on helping the encoder extract richer features and propose N-Net for medical image segmentation. On the basis of UNet, we propose a dual-encoder model that deepens the network and enhances its feature extraction ability. In our implementation, a Squeeze-and-Excitation (SE) module is added to the dual-encoder model to obtain channel-level global features. In addition, full-scale skip connections promote the integration of low-level details with high-level semantic information. The performance of our model is tested on lung and liver datasets and compared with UNet, UNet++ and UNet3+ both quantitatively, using the Dice, Recall, Precision, and F1 scores, and qualitatively. Our experiments demonstrate that N-Net outperforms UNet, UNet++ and UNet3+ on these datasets. By visual comparison of the segmentation results, N-Net produces more coherent organ boundaries and finer details.
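The SE module mentioned here is the standard Squeeze-and-Excitation design (Hu et al.): global average pooling per channel, a two-layer bottleneck, and sigmoid channel gates. A minimal single-feature-map sketch with hypothetical weight shapes:

```python
import numpy as np

def se_block(x, W1, W2):
    """Squeeze-and-Excitation re-weighting for a feature map x: (C, H, W).

    Squeeze: global average pool per channel. Excitation: a two-layer
    bottleneck (reduce, ReLU, expand, sigmoid) produces channel gates
    that rescale the input channel-wise.
    """
    z = x.mean(axis=(1, 2))                       # squeeze: (C,)
    s = np.maximum(W1 @ z, 0.0)                   # reduce + ReLU: (C//r,)
    gate = 1.0 / (1.0 + np.exp(-(W2 @ s)))        # expand + sigmoid: (C,)
    return x * gate[:, None, None]                # channel-wise rescale

rng = np.random.default_rng(2)
x = rng.normal(size=(8, 4, 4))
W1, W2 = rng.normal(size=(2, 8)), rng.normal(size=(8, 2))  # reduction r = 4
y = se_block(x, W1, W2)
print(y.shape)  # (8, 4, 4)
```

The gate is computed once per channel from global statistics, so the block adds channel-level context at negligible spatial cost.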
Affiliation(s)
- Bingtao Liang
- School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
- Chen Tang
- School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
- Wei Zhang
- Tianjin Key Laboratory of Ophthalmology and Visual Science, Tianjin Eye Institute, Clinical College of Ophthalmology of Tianjin Medical University, Tianjin Eye Hospital, Tianjin 300020, China
- Min Xu
- School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
- Tianbo Wu
- School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
63
Wang F, Cheng C, Cao W, Wu Z, Wang H, Wei W, Yan Z, Liu Z. MFCNet: A multi-modal fusion and calibration networks for 3D pancreas tumor segmentation on PET-CT images. Comput Biol Med 2023; 155:106657. [PMID: 36791551] [DOI: 10.1016/j.compbiomed.2023.106657] [Received: 05/24/2022] [Revised: 01/29/2023] [Accepted: 02/09/2023] [Indexed: 02/12/2023]
Abstract
In clinical diagnosis, positron emission tomography and computed tomography (PET-CT) images containing complementary information are fused. Tumor segmentation based on multi-modal PET-CT images is an important part of clinical diagnosis and treatment. However, existing PET-CT tumor segmentation methods mainly focus on fusing positron emission tomography (PET) and computed tomography (CT) features, which weakens modality specificity. In addition, the information interaction between different modal images is usually completed by simple addition or concatenation operations, which introduces irrelevant information during multi-modal semantic feature fusion, so effective features cannot be highlighted. To overcome this problem, this paper proposes a novel Multi-modal Fusion and Calibration Network (MFCNet) for tumor segmentation based on three-dimensional PET-CT images. First, a Multi-modal Fusion Down-sampling Block (MFDB) with a residual structure is developed. The proposed MFDB can fuse complementary features of multi-modal images while retaining the unique features of each modality. Second, a Multi-modal Mutual Calibration Block (MMCB) based on the inception structure is designed. The MMCB can guide the network to focus on the tumor region by combining different branch decoding features using an attention mechanism and extracting multi-scale pathological features using convolution kernels of different sizes. The proposed MFCNet is verified on both a public dataset (head and neck cancer) and an in-house dataset (pancreatic cancer). The experimental results indicate that on the public and in-house datasets, the average Dice values of the proposed multi-modal segmentation network are 74.14% and 76.20%, while the average Hausdorff distances are 6.41 and 6.84, respectively. The experimental results also show that the proposed MFCNet outperforms state-of-the-art methods on the two datasets.
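To make the addition-vs-concatenation distinction the abstract criticizes concrete: element-wise addition blends PET and CT responses into shared channels, while channel-wise concatenation keeps each modality's features separable for a later learned selection (e.g. a 1x1 convolution). A toy NumPy sketch with illustrative shapes, not the MFDB implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
pet = rng.standard_normal((4, 8, 8))  # (C, H, W) PET feature map
ct = rng.standard_normal((4, 8, 8))   # (C, H, W) CT feature map

# Additive fusion: modality-specific responses are summed and can
# no longer be told apart by downstream layers.
fused_add = pet + ct                           # (4, 8, 8)

# Channel-wise concatenation: both modalities survive intact, so a
# later learned per-pixel channel mix (a 1x1 convolution) can weight
# them instead of blending them up front.
fused_cat = np.concatenate([pet, ct], axis=0)  # (8, 8, 8)
w = rng.standard_normal((4, 8)) * 0.1          # 1x1 conv weights
mixed = np.einsum('oc,chw->ohw', w, fused_cat) # back to (4, 8, 8)
```

The first half of `fused_cat`'s channels is exactly `pet` and the second half exactly `ct`, which is what "retaining the unique features of each modality" requires before calibration.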
Affiliation(s)
- Fei Wang
- Institute of Biomedical Engineering, School of Communication and Information Engineering, Shanghai University, Shanghai, 200444, China; Department of Medical Imaging, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Chao Cheng
- Department of Nuclear Medicine, The First Affiliated Hospital of Naval Medical University (Changhai Hospital), Shanghai, 200433, China
- Weiwei Cao
- Department of Medical Imaging, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Zhongyi Wu
- Department of Medical Imaging, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Heng Wang
- School of Electronic and Information Engineering, Changchun University of Science and Technology, Changchun, 130022, China
- Wenting Wei
- School of Electronic and Information Engineering, Changchun University of Science and Technology, Changchun, 130022, China
- Zhuangzhi Yan
- Institute of Biomedical Engineering, School of Communication and Information Engineering, Shanghai University, Shanghai, 200444, China
- Zhaobang Liu
- Department of Medical Imaging, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
64
Zhang S, Miao Y, Chen J, Zhang X, Han L, Ran D, Huang Z, Pei N, Liu H, An C. Twist-Net: A multi-modality transfer learning network with the hybrid bilateral encoder for hypopharyngeal cancer segmentation. Comput Biol Med 2023; 154:106555. [PMID: 36701967] [DOI: 10.1016/j.compbiomed.2023.106555] [Received: 09/08/2022] [Revised: 12/31/2022] [Accepted: 01/11/2023] [Indexed: 01/15/2023]
Abstract
Hypopharyngeal cancer (HPC) is a rare disease, so automatically segmenting HPC tumors and metastatic lymph nodes (HPC risk areas) from medical images with a small-scale dataset is challenging. Combining low-level details and high-level semantics from feature maps at different scales can improve segmentation accuracy. Herein, we propose a Multi-Modality Transfer Learning Network with a Hybrid Bilateral Encoder (Twist-Net) for hypopharyngeal cancer segmentation. Specifically, we propose a Bilateral Transition (BT) block and a Bilateral Gather (BG) block to twist (fuse) high-level semantic feature maps and low-level detailed feature maps. We design a block with multi-receptive-field extraction capabilities, the M Block, to capture multi-scale information. To avoid overfitting caused by the small scale of the dataset, we propose a transfer learning method that transfers prior experience from large computer vision datasets to multi-modality medical imaging datasets. Our method outperforms competing methods on the HPC dataset, achieving the highest Dice of 82.98%. It is also superior to other methods on two public medical segmentation datasets, the CHASE_DB1 and BraTS2018 datasets, where its Dice is 79.83% and 84.87%, respectively. The code is available at: https://github.com/zhongqiu1245/TwistNet.
Affiliation(s)
- Shuo Zhang
- Beijing University of Technology, Beijing, China
- Yang Miao
- Beijing University of Technology, Beijing, China; Beijing Key Laboratory of Advanced Manufacturing Technology, Beijing, China
- Jun Chen
- Beijing Engineering Research Center of Pediatric Surgery, Engineering and Transformation Center, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, China
- Xiwei Zhang
- Department of Head and Neck Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Lei Han
- Beijing University of Posts and Telecommunications, Beijing, China
- Dongsheng Ran
- Beijing University of Posts and Telecommunications, Beijing, China
- Zehao Huang
- Department of Head and Neck Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Ning Pei
- Beijing Institute of Technology, Beijing, China
- Haibin Liu
- Beijing University of Technology, Beijing, China
- Changming An
- Department of Head and Neck Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
65
Zeng Z, Zhao T, Sun L, Zhang Y, Xia M, Liao X, Zhang J, Shen D, Wang L, He Y. 3D-MASNet: 3D mixed-scale asymmetric convolutional segmentation network for 6-month-old infant brain MR images. Hum Brain Mapp 2023; 44:1779-1792. [PMID: 36515219] [PMCID: PMC9921327] [DOI: 10.1002/hbm.26174] [Received: 07/24/2022] [Revised: 11/04/2022] [Accepted: 11/25/2022] [Indexed: 12/15/2022]
Abstract
Precise segmentation of infant brain magnetic resonance (MR) images into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF) is essential for studying neuroanatomical hallmarks of early brain development. However, for 6-month-old infants, the extremely low intensity contrast caused by inherent myelination hinders accurate tissue segmentation. Existing convolutional neural network (CNN)-based segmentation models for this task generally employ single-scale symmetric convolutions, which are inefficient for encoding the isointense tissue boundaries in baby brain images. Here, we propose a 3D mixed-scale asymmetric convolutional segmentation network (3D-MASNet) framework for brain MR images of 6-month-old infants. We replaced the traditional convolutional layer of an existing to-be-trained network with a 3D mixed-scale convolution block consisting of asymmetric kernels (MixACB) during the training phase and then equivalently converted it into the original network. Five canonical CNN segmentation models were evaluated using both T1- and T2-weighted images of 23 6-month-old infants from the iSeg-2019 dataset, which contained manual labels as ground truth. MixACB significantly enhanced the average accuracy of all five models, yielding the most considerable improvement in the fully convolutional network model (CC-3D-FCN) and the highest performance in the Dense U-Net model. This approach further obtained Dice coefficients of 0.931, 0.912, and 0.961 for GM, WM, and CSF, respectively, ranking first among 30 teams on the validation dataset of the iSeg-2019 Grand Challenge. Thus, the proposed 3D-MASNet can improve the accuracy of existing CNN-based segmentation models as a plug-and-play solution, offering a promising technique for future infant brain MRI studies.
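For reference, the Dice coefficient reported throughout these entries measures overlap between a predicted mask A and a ground-truth mask B as 2|A∩B| / (|A| + |B|). A minimal NumPy sketch on binary masks (the toy arrays are illustrative only):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-8):
    """Dice overlap between two binary masks of the same shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / (pred.sum() + target.sum() + eps)

# Toy 1D example: 3 overlapping voxels, 4 predicted, 4 true
# -> Dice = 2*3 / (4+4) = 0.75
pred = np.array([1, 1, 1, 1, 0, 0])
target = np.array([0, 1, 1, 1, 1, 0])
print(round(dice_coefficient(pred, target), 2))  # 0.75
```

In a multi-class setting such as GM/WM/CSF, this is computed once per tissue class against its binary mask.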
Affiliation(s)
- Zilong Zeng
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Tengda Zhao
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Lianglong Sun
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Yihe Zhang
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Mingrui Xia
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Xuhong Liao
- School of Systems Science, Beijing Normal University, Beijing, China
- Jiaying Zhang
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Dinggang Shen
- School of Biomedical Engineering, ShanghaiTech University, Shanghai, China
- Shanghai Clinical Research and Trial Center, Shanghai, China
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Li Wang
- Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Yong He
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Chinese Institute for Brain Research, Beijing, China
66
Baloi A, Costea C, Gutt R, Balacescu O, Turcu F, Belean B. Hexagonal-Grid-Layout Image Segmentation Using Shock Filters: Computational Complexity Case Study for Microarray Image Analysis Related to Machine Learning Approaches. Sensors (Basel) 2023; 23:2582. [PMID: 36904788] [PMCID: PMC10007319] [DOI: 10.3390/s23052582] [Received: 02/02/2023] [Revised: 02/17/2023] [Accepted: 02/21/2023] [Indexed: 06/18/2023]
Abstract
Hexagonal grid layouts are advantageous in microarray technology; moreover, hexagonal grids appear in many fields, especially given the rise of new nanostructures and metamaterials, leading to the need for image analysis of such structures. This work proposes a shock-filter-based approach driven by mathematical morphology for the segmentation of image objects arranged in a hexagonal grid. The original image is decomposed into a pair of rectangular grids such that their superposition regenerates the initial image. Within each rectangular grid, shock filters are again used to confine the foreground information of each image object to an area of interest. The proposed methodology was successfully applied to microarray spot segmentation, and its generality is underlined by the segmentation results obtained for two other types of hexagonal grid layouts. Assessing segmentation accuracy through quality measures specific to microarray images, such as the mean absolute error and the coefficient of variation, we found high correlations between our computed spot intensity features and the annotated reference values, indicating the reliability of the proposed approach. Moreover, because the shock-filter PDE formalism targets the one-dimensional luminance profile function, the computational complexity of determining the grid is minimized. The order of growth of our approach's computational complexity is at least one order of magnitude lower than that of state-of-the-art microarray segmentation approaches, ranging from classical to machine learning ones.
Affiliation(s)
- Aurel Baloi
- Research Center for Integrated Analysis and Territorial Management, University of Bucharest, 4-12 Regina Elisabeta, 030018 Bucharest, Romania
- Faculty of Administration and Business, University of Bucharest, 030018 Bucharest, Romania
- Carmen Costea
- Department of Mathematics, Faculty of Automation and Computer Science, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Robert Gutt
- Center of Advanced Research and Technologies for Alternative Energies, National Institute for Research and Development of Isotopic and Molecular Technologies, 400293 Cluj-Napoca, Romania
- Ovidiu Balacescu
- Department of Genetics, Genomics and Experimental Pathology, The Oncology Institute Prof. Dr. Ion Chiricuta, 400015 Cluj-Napoca, Romania
- Flaviu Turcu
- Center of Advanced Research and Technologies for Alternative Energies, National Institute for Research and Development of Isotopic and Molecular Technologies, 400293 Cluj-Napoca, Romania
- Faculty of Physics, Babes-Bolyai University, 400084 Cluj-Napoca, Romania
- Bogdan Belean
- Center of Advanced Research and Technologies for Alternative Energies, National Institute for Research and Development of Isotopic and Molecular Technologies, 400293 Cluj-Napoca, Romania
67
Wen M, Zhou Q, Tao B, Shcherbakov P, Xu Y, Zhang X. Short-term and long-term memory self-attention network for segmentation of tumours in 3D medical images. CAAI Trans Intell Technol 2023. [DOI: 10.1049/cit2.12179] [Indexed: 02/19/2023]
Affiliation(s)
- Mingwei Wen
- Department of Biomedical Engineering, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Quan Zhou
- Department of Biomedical Engineering, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Bo Tao
- State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, China
- Pavel Shcherbakov
- Institute for Control Science, Russian Academy of Sciences, Moscow, Russia
- Yang Xu
- Hubei Medical Devices Quality Supervision and Test Institute, Wuhan, China
- Xuming Zhang
- Department of Biomedical Engineering, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
68
Qiu J, Li L, Wang S, Zhang K, Chen Y, Yang S, Zhuang X. MyoPS-Net: Myocardial pathology segmentation with flexible combination of multi-sequence CMR images. Med Image Anal 2023; 84:102694. [PMID: 36495601] [DOI: 10.1016/j.media.2022.102694] [Received: 08/02/2022] [Revised: 10/05/2022] [Accepted: 11/16/2022] [Indexed: 11/29/2022]
Abstract
Myocardial pathology segmentation (MyoPS) can be a prerequisite for the accurate diagnosis and treatment planning of myocardial infarction. However, achieving this segmentation is challenging, mainly due to the inadequate and indistinct information in an image. In this work, we develop an end-to-end deep neural network, referred to as MyoPS-Net, to flexibly combine five-sequence cardiac magnetic resonance (CMR) images for MyoPS. To extract precise and adequate information, we design an effective yet flexible architecture to extract and fuse cross-modal features. This architecture can handle different numbers of CMR images and complex combinations of modalities, with output branches targeting specific pathologies. To impose anatomical knowledge on the segmentation results, we first propose a module to regularize myocardium consistency and localize the pathologies, and then introduce an inclusiveness loss to exploit the relations between myocardial scars and edema. We evaluated the proposed MyoPS-Net on two datasets, i.e., a private one consisting of 50 paired multi-sequence CMR images and a public one from the MICCAI 2020 MyoPS Challenge. Experimental results showed that MyoPS-Net achieves state-of-the-art performance in various scenarios. Note that in clinical practice, subjects may not have full sequences, such as missing LGE or mapping CMR scans. We therefore conducted extensive experiments to investigate the performance of the proposed method in dealing with such complex combinations of different CMR sequences. The results proved the superiority and generalizability of MyoPS-Net and, more importantly, indicated its practical clinical applicability. The code has been released via https://github.com/QJYBall/MyoPS-Net.
Affiliation(s)
- Junyi Qiu
- School of Data Science, Fudan University, Shanghai, China
- Lei Li
- Institute of Biomedical Engineering, University of Oxford, Oxford, UK
- Sihan Wang
- School of Data Science, Fudan University, Shanghai, China
- Ke Zhang
- School of Data Science, Fudan University, Shanghai, China
- Yinyin Chen
- Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai, China; Department of Medical Imaging, Shanghai Medical School, Fudan University and Shanghai Institute of Medical Imaging, Shanghai, China
- Shan Yang
- Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai, China; Department of Medical Imaging, Shanghai Medical School, Fudan University and Shanghai Institute of Medical Imaging, Shanghai, China
- Xiahai Zhuang
- School of Data Science, Fudan University, Shanghai, China
69
Wang M, Jiang H. Memory-Net: Coupling feature maps extraction and hierarchical feature maps reuse for efficient and effective PET/CT multi-modality image-based tumor segmentation. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2023.110399] [Indexed: 02/18/2023]
70
Wang M, Jiang H, Shi T, Yao YD. SCL-Net: Structured Collaborative Learning for PET/CT Based Tumor Segmentation. IEEE J Biomed Health Inform 2023; 27:1048-1059. [PMID: 37015562] [DOI: 10.1109/jbhi.2022.3226475] [Indexed: 12/12/2022]
Abstract
Collaborative learning methods for medical image segmentation are often variants of UNet in which the classifiers depend on each other while their outputs are supervised independently. However, such methods cannot explicitly ensure that optimizing the auxiliary classifier heads improves the segmentation of the target classifier. To resolve this problem, we propose a structured collaborative learning (SCL) method, which consists of a context-aware structured classifier population generation (CA-SCPG) module, where the feature propagation of the target classifier path is directly enhanced by the outputs of the auxiliary classifiers via a light-weighted high-level context-aware dense connection (HLCA-DC) mechanism, and a knowledge-aware structured classifier population supervision (KA-SCPS) module, where the auxiliary classifiers are properly supervised under the guidance of the target classifier's segmentations. Specifically, SCL is built on a recurrent-dense-siamese decoder (RDS-Decoder), which consists of multiple siamese decoder paths. CA-SCPG enhances the feature propagation of the decoder paths through HLCA-DC, which densely reuses the target-class output predictions of earlier decoder paths as inputs to later decoder paths. KA-SCPS supervises the classifier heads simultaneously with the KA-SCPS loss, which consists of a generalized weighted cross-entropy loss for deep class-imbalanced learning and a novel knowledge-aware Dice loss (KA-DL). KA-DL is a weighted Dice loss that broadcasts the knowledge learnt by the target classifier to the other classifier heads, harmonizing the learning process of the classifier population. Experiments are performed on PET/CT volumes with malignant melanoma, lymphoma, or lung cancer. The results demonstrate the superiority of SCL over state-of-the-art methods and baselines.
71
Yang Y, Xie F, Zhang H, Wang J, Liu J, Zhang Y, Ding H. Skin lesion classification based on two-modal images using a multi-scale fully-shared fusion network. Comput Methods Programs Biomed 2023; 229:107315. [PMID: 36586177] [DOI: 10.1016/j.cmpb.2022.107315] [Received: 07/01/2022] [Revised: 12/11/2022] [Accepted: 12/15/2022] [Indexed: 06/17/2023]
Abstract
BACKGROUND AND OBJECTIVE: Due to the complexity of skin lesion features, computer-aided diagnosis of skin diseases based on multi-modal images is considered a challenging task. Dermoscopic images and clinical images are commonly used to diagnose skin diseases in clinical scenarios, and the complementarity of their features promotes research on multi-modality classification in the computer-aided diagnosis field. Most current methods focus on the fusion between modalities and ignore the complementary information within each of them, which leads to the loss of intra-modality relations. Multi-modality models integrating features both within single modalities and across multiple modalities are limited in the literature. Therefore, a multi-modality model based on dermoscopic and clinical images is proposed to address this issue.
METHODS: We propose a Multi-scale Fully-shared Fusion Network (MFF-Net) that gathers features of dermoscopic images and clinical images for skin lesion classification. In MFF-Net, the multi-scale fusion structure combines deep and shallow features within individual modalities to reduce the loss of spatial information in high-level feature maps. The Dermo-Clinical Block (DCB) then integrates the feature maps from dermoscopic images and clinical images through channel-wise concatenation, using a fully-shared fusion strategy that explores complementary information at different stages.
RESULTS: We validated our model on a four-class two-modal skin disease dataset and showed that the proposed multi-scale structure, the fusion module DCBs, and the fully-shared fusion strategy each independently improve the performance of MFF-Net. Our method achieved the highest average accuracy of 72.9% on the 7-point checklist dataset, outperforming state-of-the-art single-modality and multi-modality methods with accuracy boosts of 7.1% and 3.4%, respectively.
CONCLUSIONS: The multi-scale fusion structure demonstrates the significance of intra-modality relations between clinical images and dermoscopic images. The proposed network, combining the multi-scale structure, DCBs, and the fully-shared fusion strategy, can effectively integrate the features of skin lesions across the two modalities and achieved promising accuracy on different skin diseases.
Affiliation(s)
- Yiguang Yang
- Image Processing Center, School of Astronautics, Beihang University, Beijing 100191, China
- Fengying Xie
- Image Processing Center, School of Astronautics, Beihang University, Beijing 100191, China
- Haopeng Zhang
- Image Processing Center, School of Astronautics, Beihang University, Beijing 100191, China
- Juncheng Wang
- Department of Dermatology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, National Clinical Research Center for Dermatologic and Immunologic Diseases, Chinese Academy of Medical Science and Peking Union Medical College, Beijing 100730, China
- Jie Liu
- Department of Dermatology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, National Clinical Research Center for Dermatologic and Immunologic Diseases, Chinese Academy of Medical Science and Peking Union Medical College, Beijing 100730, China
- Yilan Zhang
- Image Processing Center, School of Astronautics, Beihang University, Beijing 100191, China
- Haidong Ding
- Image Processing Center, School of Astronautics, Beihang University, Beijing 100191, China
72
Zheng L, Zhao M, Zhu J, Huang L, Zhao J, Liang D, Zhang D. Fusion of hyperspectral imaging (HSI) and RGB for identification of soybean kernel damages using ShuffleNet with convolutional optimization and cross stage partial architecture. Front Plant Sci 2023; 13:1098864. [PMID: 36743540] [PMCID: PMC9889993] [DOI: 10.3389/fpls.2022.1098864] [Received: 11/15/2022] [Accepted: 12/19/2022] [Indexed: 06/18/2023]
Abstract
Identification of soybean kernel damages is significant to prevent further disoperation. Hyperspectral imaging (HSI) has shown great potential in cereal kernel identification, but its low spatial resolution leads to external-feature infidelity and limits analysis accuracy. In this study, the fusion of HSI and RGB images and an improved ShuffleNet were combined to develop an identification method for soybean kernel damages. First, the HSI-RGB fusion network (HRFN) was designed based on super-resolution and spectral modification modules to process registered HSI and RGB image pairs and generate super-resolution HSI (SR-HSI) images. ShuffleNet improved with convolution optimization and a cross-stage partial architecture (ShuffleNet_COCSP) was used to build classification models on the optimal image set of effective wavelengths (OISEW) of the SR-HSI images, obtained by support vector machine and ShuffleNet. High-quality fusion of HSI and RGB, with obvious spatial improvement and satisfactory spectral conservation, was achieved by HRFN. ShuffleNet_COCSP with OISEW obtained the best recognition performance (ACCp = 98.36%, Params = 0.805 M, FLOPs = 0.097 G), outperforming other classification methods and other types of images. Overall, the proposed method provides accurate and reliable identification of soybean kernel damages and could be extended to the analysis of other quality indicators of various crop kernels.
73
Zhang S, Zhang J, Tian B, Lukasiewicz T, Xu Z. Multi-modal contrastive mutual learning and pseudo-label re-learning for semi-supervised medical image segmentation. Med Image Anal 2023; 83:102656. [PMID: 36327656] [DOI: 10.1016/j.media.2022.102656] [Received: 01/11/2022] [Revised: 10/04/2022] [Accepted: 10/12/2022] [Indexed: 12/12/2022]
Abstract
Semi-supervised learning has great potential in medical image segmentation tasks with few labeled data, but most existing methods consider only single-modal data. The complementary characteristics of multi-modal data can improve semi-supervised segmentation performance for each image modality. However, a shortcoming of most existing multi-modal solutions is that, because the processing models for the different modalities are highly coupled, multi-modal data are required not only during training but also at inference, which limits their use in clinical practice. Consequently, we propose a semi-supervised contrastive mutual learning (Semi-CML) segmentation framework, where a novel area-similarity contrastive (ASC) loss leverages cross-modal information and prediction consistency between modalities to conduct contrastive mutual learning. Although Semi-CML can improve the segmentation performance of both modalities simultaneously, there is a performance gap between the two modalities: the segmentation performance of one modality is usually better than that of the other. Therefore, we further develop a soft pseudo-label re-learning (PReL) scheme to remedy this gap. We conducted experiments on two public multi-modal datasets. The results show that Semi-CML with PReL greatly outperforms state-of-the-art semi-supervised segmentation methods and achieves similar (and sometimes even better) performance as fully supervised segmentation with 100% labeled data, while reducing the cost of data annotation by 90%. We also conducted ablation studies to evaluate the effectiveness of the ASC loss and the PReL module.
Affiliation(s)
- Shuo Zhang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China
- Jiaojiao Zhang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China
- Biao Tian
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China
- Zhenghua Xu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China.
|
74
|
Abozeid A, I. Taloba A, M. Abd El-Aziz R, Faiz Alwaghid A, Salem M, Elhadad A. An Efficient Indoor Localization Based on Deep Attention Learning Model. COMPUTER SYSTEMS SCIENCE AND ENGINEERING 2023; 46:2637-2650. [DOI: 10.32604/csse.2023.037761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/16/2022] [Accepted: 01/06/2023] [Indexed: 09/02/2023]
|
75
|
Chen X, Peng Y, Guo Y, Sun J, Li D, Cui J. MLRD-Net: 3D multiscale local cross-channel residual denoising network for MRI-based brain tumor segmentation. Med Biol Eng Comput 2022; 60:3377-3395. [DOI: 10.1007/s11517-022-02673-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2021] [Accepted: 09/17/2022] [Indexed: 11/11/2022]
|
76
|
Wang H, Chen X, Yu R, Wei Z, Yao T, Gao C, Li Y, Wang Z, Yi D, Wu Y. E-DU: Deep neural network for multimodal medical image segmentation based on semantic gap compensation. Comput Biol Med 2022; 151:106206. [PMID: 36395592 DOI: 10.1016/j.compbiomed.2022.106206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2022] [Revised: 09/08/2022] [Accepted: 10/09/2022] [Indexed: 12/27/2022]
Abstract
BACKGROUND U-Net, comprising encoder, decoder, and skip-connection structures, has become the benchmark network in medical image segmentation. However, directly fusing low-level and high-level convolutional features across a semantic gap through traditional skip connections can produce fuzzy feature maps and target-region segmentation errors. OBJECTIVE We use spatial enhancement filtering to compensate for the semantic gap and propose an enhanced dense U-Net (E-DU), aiming to improve segmentation performance and efficiency in multimodal medical image segmentation. METHODS Before combining encoder and decoder features, we replace the traditional skip connection with a multiscale denoise enhancement (MDE) module: the encoder features are deeply convolved with a spatial enhancement filter and then combined with the decoder features. The resulting E-DU is a simple and efficient fully convolutional network that can fuse semantically diverse features while denoising and enhancing the feature maps. RESULTS We performed experiments on medical image segmentation datasets spanning seven image modalities and combined MDE with various baseline networks in ablation studies. E-DU achieved the best segmentation results among the U-Net family on indicators such as DSC, with values of 97.78, 97.64, 95.31, 94.42, 94.93, 98.85, and 98.38 (%), respectively. Adding the MDE module to attention-based networks also improved segmentation performance and efficiency, indicating good generalization, and our method is competitive with advanced methods. CONCLUSION The proposed MDE module segments well, runs efficiently, and extends easily to multiple multimodal medical segmentation datasets. Our approach can support clinical multimodal medical image segmentation and make full use of image information to provide clinical decision support, with substantial application value and promotion prospects.
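A minimal sketch of "spatial enhancement filtering" applied to an encoder feature map before a skip connection, assuming a fixed 3x3 identity-plus-Laplacian sharpening kernel with zero padding; the paper's MDE module is multiscale and more elaborate than this.

```python
def laplacian_sharpen(img):
    # Apply a 3x3 identity-plus-Laplacian sharpening kernel (zero-padded)
    # to a 2D feature map given as a list of lists. Sharpening boosts
    # edges, illustrating how an enhanced encoder map could be passed to
    # the decoder instead of the raw skip-connection features.
    h, w = len(img), len(img[0])
    kernel = [[0, -1, 0],
              [-1, 5, -1],
              [0, -1, 0]]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc
    return out
```

On a uniform region, interior values are unchanged (5v minus four neighbors v), while edges and borders are amplified.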
Affiliation(s)
- Haojia Wang
- Department of Health Statistics, College of Preventive Medicine, Army Medical University, NO.30 Gaotanyan Street, Shapingba District, Chongqing, 400038, China
- Xicheng Chen
- Department of Health Statistics, College of Preventive Medicine, Army Medical University, NO.30 Gaotanyan Street, Shapingba District, Chongqing, 400038, China
- Rui Yu
- Tactical Health Service Department, NCO School of Army Medical University, Zhongshanxi Road 450, Qiaoxi District, Shijiazhuang, 050081, China
- Zeliang Wei
- Department of Health Statistics, College of Preventive Medicine, Army Medical University, NO.30 Gaotanyan Street, Shapingba District, Chongqing, 400038, China
- Tianhua Yao
- Department of Health Statistics, College of Preventive Medicine, Army Medical University, NO.30 Gaotanyan Street, Shapingba District, Chongqing, 400038, China
- Chengcheng Gao
- Department of Health Statistics, College of Preventive Medicine, Army Medical University, NO.30 Gaotanyan Street, Shapingba District, Chongqing, 400038, China
- Yang Li
- Department of Health Statistics, College of Preventive Medicine, Army Medical University, NO.30 Gaotanyan Street, Shapingba District, Chongqing, 400038, China
- Zhenyan Wang
- Department of Health Statistics, College of Preventive Medicine, Army Medical University, NO.30 Gaotanyan Street, Shapingba District, Chongqing, 400038, China
- Dong Yi
- Department of Health Statistics, College of Preventive Medicine, Army Medical University, NO.30 Gaotanyan Street, Shapingba District, Chongqing, 400038, China.
- Yazhou Wu
- Department of Health Statistics, College of Preventive Medicine, Army Medical University, NO.30 Gaotanyan Street, Shapingba District, Chongqing, 400038, China.
|
77
|
Chen Y, Jin D, Guo B, Bai X. Attention-Assisted Adversarial Model for Cerebrovascular Segmentation in 3D TOF-MRA Volumes. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:3520-3532. [PMID: 35759584 DOI: 10.1109/tmi.2022.3186731] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Cerebrovascular segmentation in time-of-flight magnetic resonance angiography (TOF-MRA) volumes is essential for a variety of diagnostic and analytical applications. However, accurate cerebrovascular segmentation in 3D TOF-MRA faces multiple issues, including vast variation in cerebrovascular morphology and intensity, a noisy background, and severe class imbalance between foreground cerebral vessels and background. In this work, a 3D adversarial network model called A-SegAN is proposed to segment cerebral vessels in TOF-MRA volumes. The model comprises a segmentation network, A-SegS, that predicts segmentation maps and a critic network, A-SegC, that discriminates predictions from ground truth. Based on this model, the aforementioned issues are addressed through the prevailing visual-attention mechanism. First, A-SegS incorporates feature-attention blocks to filter out discriminative feature maps even though the cerebral vasculature has varied appearances. Second, a hard-example-attention loss is exploited to boost the training of A-SegS on hard samples. Further, A-SegC is combined with an input-attention layer to attach importance to the foreground cerebrovascular class. The proposed methods were evaluated on a self-constructed, voxel-wise annotated cerebrovascular TOF-MRA segmentation dataset, and experimental results indicate that A-SegAN achieves competitive or better cerebrovascular segmentation than other deep learning methods, effectively alleviating the above issues.
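The hard-example-attention idea can be sketched with a focal-style weighting that shrinks the loss contribution of easy voxels so training concentrates on hard ones. This generic focal form is an assumption for illustration, not the paper's exact loss.

```python
import math

def hard_example_loss(probs, labels, gamma=2.0):
    # Focal-style binary cross-entropy over flattened voxel predictions:
    # pt is the probability assigned to the true class, and the
    # (1 - pt)^gamma factor down-weights confidently-correct (easy)
    # voxels, leaving hard examples to dominate the gradient.
    total = 0.0
    for p, y in zip(probs, labels):
        pt = p if y == 1 else 1.0 - p
        pt = min(max(pt, 1e-8), 1.0 - 1e-8)  # numerical safety
        total += -((1.0 - pt) ** gamma) * math.log(pt)
    return total / len(probs)
```

A confidently correct voxel (pt near 1) contributes almost nothing, while an uncertain one contributes a much larger loss.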
|
78
|
Wang M, Jiang H, Shi T, Wang Z, Guo J, Lu G, Wang Y, Yao YD. PSR-Nets: Deep neural networks with prior shift regularization for PET/CT based automatic, accurate, and calibrated whole-body lymphoma segmentation. Comput Biol Med 2022; 151:106215. [PMID: 36306584 DOI: 10.1016/j.compbiomed.2022.106215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2022] [Revised: 10/04/2022] [Accepted: 10/15/2022] [Indexed: 12/27/2022]
Abstract
Lymphoma is a cancer originating in lymphatic tissue. Automatic and accurate lymphoma segmentation is critical for its diagnosis and prognosis yet challenging because of severe class imbalance. Generally, deep neural networks trained with class-observation-frequency based re-weighting loss functions are used to address this problem. However, such losses can under-weight the majority class when data overlap exists, and they tend to be mis-calibrated. To resolve this, we propose a neural network with prior-shift regularization (PSR-Net), which comprises a UNet-like backbone with re-weighting loss functions and a prior-shift regularization (PSR) module consisting of a prior-shift layer (PSL), a regularizer generation layer (RGL), and an expected prediction confidence updating layer (EPCUL). We first propose a trainable expected prediction confidence (EPC) for each class. Periodically, the PSL shifts a prior training dataset to a more informative dataset based on the EPCs; the RGL presents a generalized informative-voxel-aware (GIVA) loss with EPCs and computes it on the informative dataset for model fine-tuning during back-propagation; and the EPCUL updates the EPCs to refresh the PSL and RGL for the next forward pass. PSR-Net is trained in a two-stage manner: the backbone is first trained with re-weighting loss functions; we then reload the best saved backbone and continue training with the weighted sum of the re-weighting loss functions, the GIVA regularizer, and an L2 loss on the EPCs for regularization fine-tuning. Extensive experiments were performed on PET/CT volumes with advanced-stage lymphomas. PSR-Net achieves 95.12% sensitivity and an 87.18% Dice coefficient, demonstrating its effectiveness compared with the baselines and the state of the art.
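A minimal sketch of the kind of class-observation-frequency re-weighting that PSR-Net builds on, assuming simple inverse-frequency weights normalized to sum to the number of classes; the paper's trainable EPC machinery is not reproduced here.

```python
def inverse_frequency_weights(voxel_counts):
    # Generic inverse-frequency class re-weighting: each class is weighted
    # by the reciprocal of its observed voxel count, then the weights are
    # rescaled so they sum to the number of classes. Rare classes (e.g.
    # lymphoma foreground) receive larger weights than the background.
    inv = [1.0 / c for c in voxel_counts]
    scale = len(voxel_counts) / sum(inv)
    return [w * scale for w in inv]
```

With 900 background voxels and 100 foreground voxels, the foreground weight is nine times the background weight, which is exactly the kind of scheme whose under-weighting and calibration issues motivate the PSR module.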
Affiliation(s)
- Meng Wang
- Department of Software College, Northeastern University, Shenyang 110819, China
- Huiyan Jiang
- Department of Software College, Northeastern University, Shenyang 110819, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang 110819, China.
- Tianyu Shi
- Department of Software College, Northeastern University, Shenyang 110819, China
- Zhiguo Wang
- Department of Nuclear Medicine, General Hospital of Northern Military Area, Shenyang 110016, China
- Jia Guo
- Department of Nuclear Medicine, General Hospital of Northern Military Area, Shenyang 110016, China
- Guoxiu Lu
- Department of Nuclear Medicine, General Hospital of Northern Military Area, Shenyang 110016, China
- Youchao Wang
- Department of Nuclear Medicine, General Hospital of Northern Military Area, Shenyang 110016, China
- Yu-Dong Yao
- Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA
|
79
|
Xia X, Wang J, Liang S, Ye F, Tian MM, Hu W, Xu L. An attention base U-net for parotid tumor autosegmentation. Front Oncol 2022; 12:1028382. [PMID: 36505865 PMCID: PMC9730401 DOI: 10.3389/fonc.2022.1028382] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2022] [Accepted: 10/26/2022] [Indexed: 11/25/2022] Open
Abstract
A parotid neoplasm is an uncommon condition, accounting for less than 3% of all head and neck cancers and less than 0.3% of all new cancers diagnosed annually. Because of their nonspecific imaging features and heterogeneous nature, accurate preoperative diagnosis remains a challenge. Automatic parotid tumor segmentation may help physicians evaluate these tumors. Two hundred eighty-five patients diagnosed with benign or malignant parotid tumors were enrolled in this study. Parotid and tumor tissues were segmented by three radiologists on T1-weighted (T1w), T2-weighted (T2w), and T1-weighted contrast-enhanced (T1wC) MR images. These images were randomly divided into a training dataset (90%) and a validation dataset (10%), and 10-fold cross-validation was performed to assess performance. An attention-based U-Net for parotid tumor autosegmentation was trained on the T1w, T2w, and T1wC MR images. On a separate dataset, the mean Dice similarity coefficient (DICE) for both parotids was 0.88, and the mean DICE for left and right tumors was 0.85 and 0.86, respectively. These results indicate that the model's performance is comparable to the radiologists' manual segmentation. In conclusion, an attention-based U-Net for parotid tumor autosegmentation may assist physicians in evaluating parotid gland tumors.
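The Dice similarity coefficient (DICE) reported above can be computed on flattened binary masks as follows. This is the standard definition, not code from the paper:

```python
def dice_coefficient(pred, truth):
    # Dice similarity coefficient between two flattened binary masks:
    # twice the intersection over the sum of the two mask sizes.
    # The small epsilon avoids division by zero for empty masks.
    inter = sum(p * t for p, t in zip(pred, truth))
    return 2.0 * inter / (sum(pred) + sum(truth) + 1e-8)
```

A perfect match scores 1.0; disjoint masks score 0.0, so the 0.85-0.88 values above indicate strong overlap with the manual contours.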
Affiliation(s)
- Xianwu Xia
- The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
- Department of Oncology Intervention, The Affiliated Municipal Hospital of Taizhou University, Taizhou, China
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Jiazhou Wang
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Sheng Liang
- Department of Oncology Intervention, The Affiliated Municipal Hospital of Taizhou University, Taizhou, China
- Fangfang Ye
- Department of Oncology Intervention, The Affiliated Municipal Hospital of Taizhou University, Taizhou, China
- Min-Ming Tian
- Department of Oncology Intervention, Jiangxi University of Traditional Chinese Medicine, Nanchang, Jiangxi, China
- Weigang Hu
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Leiming Xu
- The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
|
80
|
Li Y, Xu C, Han J, An Z, Wang D, Ma H, Liu C. MHAU-Net: Skin Lesion Segmentation Based on Multi-Scale Hybrid Residual Attention Network. SENSORS (BASEL, SWITZERLAND) 2022; 22:8701. [PMID: 36433298 PMCID: PMC9695536 DOI: 10.3390/s22228701] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/01/2022] [Revised: 11/03/2022] [Accepted: 11/07/2022] [Indexed: 06/16/2023]
Abstract
Melanoma is a leading cause of skin cancer deaths, and early diagnosis and treatment can significantly reduce patient mortality. Skin lesion boundary segmentation is key to accurately localizing a lesion in dermoscopic images. However, the irregular shape and size of lesions and their blurred boundaries pose significant challenges. In recent years, pixel-level semantic segmentation strategies based on convolutional neural networks have been widely used, but many methods still segment fuzzy boundaries inaccurately. In this paper, we propose a multi-scale hybrid attentional convolutional neural network (MHAU-Net) for the precise localization and segmentation of skin lesions. MHAU-Net has four main components: multi-scale resolution input, hybrid residual attention (HRA), dilated convolution, and atrous spatial pyramid pooling. Multi-scale resolution inputs provide richer visual information, and HRA addresses blurred boundaries and enhances the segmentation results. The Dice, mIoU, average specificity, and sensitivity on the ISIC2018 task 1 validation set were 93.69%, 90.02%, 92.7%, and 93.9%, respectively. These segmentation metrics are significantly better than those of the latest DCSAU-Net, UNeXt, and U-Net, and excellent segmentation results are achieved on different datasets. We also validated model robustness on the Kvasir-SEG dataset, with an overall sensitivity and average specificity of 95.91% and 96.28%, respectively.
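The sensitivity and specificity figures above follow from standard confusion-matrix counts; a minimal sketch on flattened binary masks (standard definitions, not the paper's evaluation code):

```python
def sensitivity_specificity(pred, truth):
    # Sensitivity (true-positive rate) and specificity (true-negative
    # rate) from per-pixel confusion counts over flattened binary masks.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    sens = tp / (tp + fn + 1e-8)  # fraction of lesion pixels recovered
    spec = tn / (tn + fp + 1e-8)  # fraction of background kept clean
    return sens, spec
```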
Affiliation(s)
- Yingjie Li
- School of Integrated Circuits, Anhui University, Hefei 230601, China
- Anhui Engineering Laboratory of Agro-Ecological Big Data, Hefei 230601, China
- Chao Xu
- School of Integrated Circuits, Anhui University, Hefei 230601, China
- Anhui Engineering Laboratory of Agro-Ecological Big Data, Hefei 230601, China
- Jubao Han
- School of Integrated Circuits, Anhui University, Hefei 230601, China
- Anhui Engineering Laboratory of Agro-Ecological Big Data, Hefei 230601, China
- Ziheng An
- School of Integrated Circuits, Anhui University, Hefei 230601, China
- Anhui Engineering Laboratory of Agro-Ecological Big Data, Hefei 230601, China
- Deyu Wang
- School of Integrated Circuits, Anhui University, Hefei 230601, China
- Anhui Engineering Laboratory of Agro-Ecological Big Data, Hefei 230601, China
- Haichao Ma
- School of Integrated Circuits, Anhui University, Hefei 230601, China
- Anhui Engineering Laboratory of Agro-Ecological Big Data, Hefei 230601, China
- Chuanxu Liu
- School of Integrated Circuits, Anhui University, Hefei 230601, China
- Anhui Engineering Laboratory of Agro-Ecological Big Data, Hefei 230601, China
|
81
|
Liu H, Jiao ML, Xing XY, Ou-Yang HQ, Yuan Y, Liu JF, Li Y, Wang CJ, Lang N, Qian YL, Jiang L, Yuan HS, Wang XD. BgNet: Classification of benign and malignant tumors with MRI multi-plane attention learning. Front Oncol 2022; 12:971871. [DOI: 10.3389/fonc.2022.971871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2022] [Accepted: 10/05/2022] [Indexed: 11/13/2022] Open
Abstract
Objectives: To propose a deep learning-based classification framework that performs patient-level benign/malignant tumor classification from a patient's multi-plane images and clinical information.
Methods: A total of 430 spinal tumor cases with axial and sagittal MRI plane images were included: 297 cases (14,072 images) for training and 133 cases (6,161 images) for testing. Based on a bipartite graph and attention learning, this study proposes a multi-plane attention learning framework, BgNet, for benign and malignant tumor diagnosis. In the bipartite graph structure, the tumor area in each plane is a vertex of the graph, and the matching between planes is an edge. Tumor areas from different plane images are spliced at the input layer. Building on the convolutional neural network ResNet and the visual attention model Swin Transformer, the study proposes a feature fusion model, ResNetST, that combines global and local information to extract correlated features across planes. BgNet consists of five modules: a bipartite-graph multi-plane fusion module, an input-layer fusion module, a feature-layer fusion module, a decision-layer fusion module, and an output module, which together fuse the patient's multi-plane image data at multiple levels to realize a comprehensive patient-level diagnosis of benign and malignant tumors.
Results: The accuracy (ACC) of BgNet with multiple planes (79.7%) was higher than with a single plane, and higher than or equal to that of four doctors (D1: 70.7%, p=0.219; D2: 54.1%, p<0.005; D3: 79.7%, p=0.006; D4: 72.9%, p=0.178). Moreover, with the aid of BgNet, the doctors' diagnostic accuracy and speed improved: the ACC of D1, D2, D3, and D4 increased by 4.5%, 21.8%, 0.8%, and 3.8%, respectively.
Conclusions: The proposed deep learning framework BgNet classifies benign and malignant tumors effectively and can help doctors improve their diagnostic efficiency and accuracy. The code is available at https://github.com/research-med/BgNet.
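Only the decision-layer step of BgNet's multi-level fusion lends itself to a short sketch from the abstract; the probability averaging, the optional per-plane weights, and the 0.5 threshold below are all illustrative assumptions.

```python
def decision_layer_fusion(plane_probs, weights=None):
    # Toy decision-layer fusion: combine per-plane malignancy
    # probabilities (e.g. from axial and sagittal branches) into one
    # patient-level score by a weighted average, then threshold it.
    if weights is None:
        weights = [1.0] * len(plane_probs)
    score = sum(w * p for w, p in zip(weights, plane_probs)) / sum(weights)
    return score, "malignant" if score >= 0.5 else "benign"
```

BgNet additionally fuses at the input and feature layers, which requires the full network and is not sketched here.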
|
82
|
Li M, Jiang Z, Shen W, Liu H. Deep learning in bladder cancer imaging: A review. Front Oncol 2022; 12:930917. [PMID: 36338676 PMCID: PMC9631317 DOI: 10.3389/fonc.2022.930917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2022] [Accepted: 09/30/2022] [Indexed: 11/13/2022] Open
Abstract
Deep learning (DL) is a rapidly developing field in machine learning (ML). The concept of deep learning originates from research on artificial neural networks and is an upgrade of traditional neural networks. It has achieved great success in various domains and has shown potential in solving medical problems, particularly when using medical images. Bladder cancer (BCa) is the tenth most common cancer in the world. Imaging, as a safe, noninvasive, and relatively inexpensive technique, is a powerful tool to aid in the diagnosis and treatment of bladder cancer. In this review, we provide an overview of the latest progress in the application of deep learning to the imaging assessment of bladder cancer. First, we review the current deep learning approaches used for bladder segmentation. We then provide examples of how deep learning helps in the diagnosis, staging, and treatment management of bladder cancer using medical images. Finally, we summarize the current limitations of deep learning and provide suggestions for future improvements.
Affiliation(s)
- Mingyang Li
- Department of Urology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Zekun Jiang
- Ministry of Education (MoE) Key Lab of Artificial Intelligence, Artificial Intelligence (AI) Institute, Shanghai Jiao Tong University, Shanghai, China
- Wei Shen
- Ministry of Education (MoE) Key Lab of Artificial Intelligence, Artificial Intelligence (AI) Institute, Shanghai Jiao Tong University, Shanghai, China
- *Correspondence: Haitao Liu; Wei Shen
- Haitao Liu
- Department of Urology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- *Correspondence: Haitao Liu; Wei Shen
|
83
|
Moon HS, Heffron L, Mahzarnia A, Obeng-Gyasi B, Holbrook M, Badea CT, Feng W, Badea A. Automated multimodal segmentation of acute ischemic stroke lesions on clinical MR images. Magn Reson Imaging 2022; 92:45-57. [PMID: 35688400 PMCID: PMC9949513 DOI: 10.1016/j.mri.2022.06.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2022] [Revised: 06/01/2022] [Accepted: 06/02/2022] [Indexed: 02/09/2023]
Abstract
Magnetic resonance imaging (MRI) is commonly used to diagnose, assess, and monitor stroke. Accurate and timely segmentation of stroke lesions provides anatomico-structural information that can aid physicians in predicting prognosis and in decision making and triaging for various rehabilitation strategies. To segment stroke lesions, MR protocols including diffusion-weighted imaging (DWI) and T2-weighted fluid-attenuated inversion recovery (FLAIR) are often utilized. These sequences are usually acquired at different spatial resolutions due to time constraints; within the same image, voxels may be anisotropic, with reduced resolution along the slice direction for diffusion scans in particular. In this study, we evaluate the ability of 2D and 3D U-Net convolutional neural network (CNN) architectures to segment ischemic stroke lesions using single-contrast (DWI) and dual-contrast (T2w FLAIR and DWI) images. The predicted segmentations correlate with post-stroke motor outcome, measured by the National Institutes of Health Stroke Scale (NIHSS) and the Fugl-Meyer Upper Extremity (FM-UE) index, through the lesion loads overlapping the corticospinal tracts (CST), a neural substrate for motor movement and function. Although the four methods performed similarly, the 2D multimodal U-Net achieved the best results, with a mean Dice of 0.737 (95% CI: 0.705, 0.769) and a relatively high correlation between the weighted lesion load and the NIHSS scores (both at baseline and at 90 days). A monotonically constrained quintic polynomial regression yielded R2 = 0.784 and 0.875 for weighted lesion load versus baseline and 90-day NIHSS, respectively, with better corrected Akaike information criterion (AICc) scores than linear regression. Regressing the weighted lesion load to the 90-day FM-UE score with the quintic model gave an R2 of 0.570, again with a better AICc score than linear regression. Our results suggest that multi-contrast information enhanced both segmentation accuracy and the prediction of upper-extremity motor outcomes. Expanding the training dataset to include different types of stroke lesions and more data points will add a temporal, longitudinal aspect and increase accuracy. Furthermore, adding patient-specific data may improve inference about the relationship between imaging metrics and functional outcomes.
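The R2 values above come from regressing weighted lesion load against outcome scores. As a simplified sketch, the coefficient of determination for a plain least-squares line (rather than the paper's monotonically constrained quintic fit) can be computed as follows, on made-up numbers:

```python
def r_squared(x, y):
    # R^2 for an ordinary least-squares line fit y ~ slope*x + intercept:
    # one minus the ratio of residual to total sum of squares.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot
```

A perfectly linear relationship gives R^2 = 1.0; the 0.570-0.875 values reported above would indicate a moderate-to-strong fit under the paper's quintic model.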
Affiliation(s)
- Hae Sol Moon
- Department of Biomedical Engineering, Duke University, Durham, NC, United States
- Lindsay Heffron
- Department of Orthopaedic Surgery, Duke University School of Medicine, Durham, NC, United States
- Ali Mahzarnia
- Department of Radiology, Duke University School of Medicine, Durham, NC, United States
- Barnabas Obeng-Gyasi
- Department of Radiology, Duke University School of Medicine, Durham, NC, United States
- Matthew Holbrook
- Department of Radiology, Duke University School of Medicine, Durham, NC, United States
- Cristian T Badea
- Department of Biomedical Engineering, Duke University, Durham, NC, United States; Department of Radiology, Duke University School of Medicine, Durham, NC, United States
- Wuwei Feng
- Department of Neurology, Duke University School of Medicine, Durham, NC, United States
- Alexandra Badea
- Department of Biomedical Engineering, Duke University, Durham, NC, United States; Department of Radiology, Duke University School of Medicine, Durham, NC, United States; Department of Neurology, Duke University School of Medicine, Durham, NC, United States; Brain Imaging and Analysis Center, Duke University School of Medicine, NC, United States.
|
84
|
|
85
|
Khaled A, Han JJ, Ghaleb TA. Learning to detect boundary information for brain image segmentation. BMC Bioinformatics 2022; 23:332. [PMID: 35953776 PMCID: PMC9367147 DOI: 10.1186/s12859-022-04882-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2022] [Accepted: 07/30/2022] [Indexed: 11/14/2022] Open
Abstract
MRI brain images are always of low contrast, which makes it difficult to identify to which region the information at an image boundary belongs. This makes feature extraction at boundaries more challenging, since such features can be misleading, mixing the properties of different brain regions. Hence, image boundary detection plays a vital role in medical image segmentation, and brain segmentation in particular, as unclear boundaries can worsen the results. Yet, given the low quality of brain images, boundary detection in the context of brain image segmentation remains challenging. Despite the research invested in improving boundary detection and brain segmentation, the two problems have been addressed independently; little attention has been paid to applying boundary detection to brain segmentation tasks. Therefore, in this paper, we propose a boundary detection-based model for brain image segmentation. To this end, we first design a boundary segmentation network for detecting and segmenting brain tissues. We then design a boundary information module (BIM) to distinguish the boundaries of the three different brain tissues. After that, we add a boundary attention gate (BAG) to the encoder output layers of our transformer to capture more informative local details. We evaluate the proposed model on two datasets of brain tissue images, including infant and adult brains. Extensive experiments show better performance (a Dice coefficient (DC) improvement of up to 5.3% over state-of-the-art models) in detecting and segmenting brain tissue images.
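A minimal stand-in for the boundary maps such a model consumes: mark foreground pixels that touch the background (or the image border) under 4-connectivity. This is a purely illustrative morphological sketch, not the paper's learned BIM.

```python
def mask_boundary(mask):
    # Return the set of (row, col) boundary pixels of a 2D binary mask:
    # foreground pixels with at least one 4-neighbour in the background,
    # or lying on the image border.
    h, w = len(mask), len(mask[0])
    boundary = set()
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                yy, xx = y + dy, x + dx
                if not (0 <= yy < h and 0 <= xx < w) or not mask[yy][xx]:
                    boundary.add((y, x))
                    break
    return boundary
```

For a solid 3x3 square, every pixel except the center is on the boundary.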
Affiliation(s)
- Afifa Khaled
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China.
- Jian-Jun Han
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Taher A Ghaleb
- School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada
|
86
|
Cao X, Chen H, Li Y, Peng Y, Zhou Y, Cheng L, Liu T, Shen D. Auto-DenseUNet: Searchable neural network architecture for mass segmentation in 3D automated breast ultrasound. Med Image Anal 2022; 82:102589. [DOI: 10.1016/j.media.2022.102589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2021] [Revised: 07/18/2022] [Accepted: 08/17/2022] [Indexed: 11/15/2022]
|
87
|
Zhou Q, Wang R, Zeng G, Fan H, Zheng G. Towards bridging the distribution gap: Instance to Prototype Earth Mover’s Distance for distribution alignment. Med Image Anal 2022; 82:102607. [DOI: 10.1016/j.media.2022.102607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2021] [Revised: 06/28/2022] [Accepted: 08/25/2022] [Indexed: 11/16/2022]
|
88
|
Li J, Chen H, Li Y, Peng Y, Sun J, Pan P. Cross-modality synthesis aiding lung tumor segmentation on multi-modal MRI images. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103655] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
89
|
Yang Y, Yan T, Jiang X, Xie R, Li C, Zhou T. MH-Net: Model-data-driven hybrid-fusion network for medical image segmentation. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
90
|
SVF-Net: spatial and visual feature enhancement network for brain structure segmentation. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03706-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
|
91
|
Cao J, Lai H, Zhang J, Zhang J, Xie T, Wang H, Bu J, Feng Q, Huang M. 2D-3D cascade network for glioma segmentation in multisequence MRI images using multiscale information. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022; 221:106894. [PMID: 35613498 DOI: 10.1016/j.cmpb.2022.106894] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/01/2021] [Revised: 04/21/2022] [Accepted: 05/14/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND AND OBJECTIVE Glioma segmentation is an important procedure for the treatment planning and follow-up evaluation of patients with glioma. UNet-based networks are widely used in medical image segmentation tasks and have achieved state-of-the-art performance. However, context information along the third dimension is ignored in 2D convolutions, whereas the difference between z-axis and in-plane resolutions is large in 3D convolutions. Moreover, the original UNet structure cannot capture fine details because of the reduced resolution of feature maps near the bottleneck layers. METHODS To address these issues, a novel 2D-3D cascade network with a multiscale information module is proposed for the multiclass segmentation of gliomas in multisequence MRI images. First, a 2D network is applied to fully exploit potential intra-slice features. A variational autoencoder module is incorporated into the 2D DenseUNet to regularize a shared encoder, extract useful information, and represent glioma heterogeneity. Second, a 3D DenseUNet is integrated with the 2D network in cascade mode to extract useful inter-slice features and alleviate the influence of the large difference between z-axis and in-plane resolutions. Moreover, a multiscale information module is used in both the 2D and 3D networks to further capture the fine details of gliomas. Finally, the whole 2D-3D cascade network is trained in an end-to-end manner, where the intra-slice and inter-slice features are fused and optimized jointly to take full advantage of the 3D image information. RESULTS Our method is evaluated on publicly available and clinical datasets and achieves competitive performance on both. CONCLUSIONS These results indicate that the proposed method may be a useful tool for glioma segmentation.
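The 2D-then-3D cascade the abstract describes can be sketched in a few lines. The following is a minimal NumPy illustration under loud assumptions: a single 3x3 "same" convolution stands in for the 2D DenseUNet, a z-neighbourhood average stands in for the inter-slice 3D stage, and all function names are hypothetical. It shows the data flow (per-slice 2D features, then cross-slice mixing, then fusion), not the authors' implementation.

```python
import numpy as np

def conv2d_same(img, kernel):
    """Stand-in for the 2D stage: one 3x3 'same' convolution over a slice."""
    h, w = img.shape
    padded = np.pad(img, 1)
    out = np.zeros_like(img)
    for i in range(h):
        for j in range(w):
            out[i, j] = (padded[i:i + 3, j:j + 3] * kernel).sum()
    return out

def cascade_2d3d(volume, kernel, alpha=0.5):
    """Run the 2D stage slice by slice, then a 3D stage that averages each
    slice with its z-neighbours (a stand-in for inter-slice 3D convolution),
    and fuse the two feature streams."""
    stage2d = np.stack([conv2d_same(s, kernel) for s in volume])  # intra-slice
    stage3d = np.empty_like(stage2d)
    for z in range(volume.shape[0]):                              # inter-slice
        lo, hi = max(0, z - 1), min(volume.shape[0], z + 2)
        stage3d[z] = stage2d[lo:hi].mean(axis=0)
    return alpha * stage2d + (1.0 - alpha) * stage3d

identity = np.zeros((3, 3))
identity[1, 1] = 1.0                       # identity kernel for sanity checks
vol = np.random.default_rng(0).normal(size=(4, 6, 6))
fused = cascade_2d3d(vol, identity)
```

The fused output keeps the input volume's shape, so downstream segmentation heads can consume it slice by slice or as a whole.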
Affiliation(s)
- Jianyun Cao
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Zhujiang Hospital, Southern Medical University, Guangzhou 510282, China.
- Haoran Lai
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China.
- Jiawei Zhang
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China.
- Junde Zhang
- Zhujiang Hospital, Southern Medical University, Guangzhou 510282, China.
- Tao Xie
- Zhujiang Hospital, Southern Medical University, Guangzhou 510282, China.
- Heqing Wang
- Zhujiang Hospital, Southern Medical University, Guangzhou 510282, China.
- Junguo Bu
- Zhujiang Hospital, Southern Medical University, Guangzhou 510282, China.
- Qianjin Feng
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China.
- Meiyan Huang
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China.
|
92
|
Unpaired multi-modal tumor segmentation with structure adaptation. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03610-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
93
|
Automatic Segmentation of Magnetic Resonance Images of Severe Patients with Advanced Liver Cancer and the Molecular Mechanism of Emodin-Induced Apoptosis of HepG2 Cells under the Deep Learning. JOURNAL OF HEALTHCARE ENGINEERING 2022; 2022:3951112. [PMID: 35295165 PMCID: PMC8920667 DOI: 10.1155/2022/3951112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/31/2021] [Accepted: 01/26/2022] [Indexed: 11/17/2022]
Abstract
To improve the accuracy of clinical diagnosis in severe patients with advanced liver cancer and enhance the effect of chemotherapy, the U-Net model was optimized by introducing a batch normalization (BN) layer and a dropout layer, and segmentation training and validation of the optimized model were performed on magnetic resonance (MR) image data. Subsequently, HepG2 cells were taken as the research objects and treated with 0, 10, 20, 40, 60, 80, and 100 μmol/L emodin (EMO). The methyl thiazolyl tetrazolium (MTT) method was used to measure changes in cell viability; acridine orange (AO)/ethidium bromide (EB) and 4',6-diamidino-2-phenylindole (DAPI) were used for staining; Annexin V fluorescein isothiocyanate (FITC)/propidium iodide (PI) (Annexin V-FITC/PI) was adopted to detect apoptosis after EMO treatment; and the Western blot (WB) method was used to examine changes in the protein expression levels of PARP, Bcl-2, and p53 in the treated cells. It was found that, compared with the original U-Net model, introducing the BN and dropout layers improved the robustness of the U-Net model, and the optimized U-Net model achieved the highest Dice similarity coefficient (DSC) (98.45%) and mean average precision (MAP) (0.88) for liver tumor segmentation.
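The two layers the abstract credits for the robustness gain are standard building blocks, and their forward passes can be sketched directly. This is a minimal NumPy illustration of batch normalization and inverted dropout as generally defined, not the paper's code; the function names are illustrative.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch dimension, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero units with probability `rate`, rescale the rest
    so the expected activation is unchanged; identity at inference time."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

rng = np.random.default_rng(0)
acts = rng.normal(loc=3.0, scale=2.0, size=(16, 4))  # a batch of activations
normed = batch_norm(acts, gamma=np.ones(4), beta=np.zeros(4))
```

After `batch_norm`, each feature column has approximately zero mean and unit variance, which is what stabilizes training when such layers are inserted into a U-Net.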
|
94
|
Jiang J, Rimner A, Deasy JO, Veeraraghavan H. Unpaired Cross-Modality Educed Distillation (CMEDL) for Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:1057-1068. [PMID: 34855590 PMCID: PMC9128665 DOI: 10.1109/tmi.2021.3132291] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Abstract
Accurate and robust segmentation of lung cancers from CT, even those located close to the mediastinum, is needed to more accurately plan and deliver radiotherapy and to measure treatment response. Therefore, we developed a new cross-modality educed distillation (CMEDL) approach using unpaired CT and MRI scans, whereby an informative teacher MRI network guides a student CT network to extract features that signal the difference between foreground and background. Our contribution eliminates two requirements of distillation methods: (i) paired image sets, by using image-to-image (I2I) translation, and (ii) pre-training of the teacher network with a large training set, by training all networks concurrently. Our framework uses end-to-end trained unpaired I2I translation, teacher, and student segmentation networks. The architectural flexibility of our framework is demonstrated using 3 segmentation and 2 I2I networks. Networks were trained with 377 CT and 82 T2w MRI scans from different sets of patients, with independent validation (N = 209 tumors) and testing (N = 609 tumors) datasets. Network design, methods to combine MRI with CT information, distillation learning under informative (MRI to CT), weak (CT to MRI) and equal (MRI to MRI) teachers, and ablation tests were evaluated. Accuracy was measured using the Dice similarity coefficient (DSC), surface Dice (sDSC), and Hausdorff distance at the 95th percentile (HD95). The CMEDL approach was significantly (p < 0.001) more accurate than non-CMEDL methods: with an informative teacher for CT lung tumor segmentation (DSC of 0.77 vs. 0.73), with a weak teacher for MRI lung tumor segmentation (DSC of 0.84 vs. 0.81), and with an equal teacher for MRI multi-organ segmentation (DSC of 0.90 vs. 0.88). CMEDL also reduced inter-rater lung tumor segmentation variability.
|
95
|
Wu L, Hu S, Liu C. MR brain segmentation based on DE-ResUnet combining texture features and background knowledge. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103541] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
|
96
|
Zhou J, Zhang X, Zhu Z, Lan X, Fu L, Wang H, Wen H. Cohesive Multi-Modality Feature Learning and Fusion for COVID-19 Patient Severity Prediction. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY : A PUBLICATION OF THE CIRCUITS AND SYSTEMS SOCIETY 2022; 32:2535-2549. [PMID: 35937181 PMCID: PMC9280852 DOI: 10.1109/tcsvt.2021.3063952] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/10/2020] [Revised: 02/14/2021] [Accepted: 02/25/2021] [Indexed: 06/15/2023]
Abstract
The outbreak of coronavirus disease (COVID-19) was a nightmare for citizens, hospitals, healthcare practitioners, and the economy in 2020. The overwhelming number of confirmed and suspected cases posed an unprecedented challenge to hospitals' capacity for management and medical resource distribution. To reduce the possibility of cross-infection and attend to a patient according to their severity level, expert diagnosis and sophisticated medical examinations are often required but hard to fulfil during a pandemic. To facilitate the assessment of a patient's severity, this paper proposes a multi-modality feature learning and fusion model for end-to-end COVID-19 patient severity prediction using the blood-test-supported electronic medical record (EMR) and chest computerized tomography (CT) scan images. To evaluate a patient's severity from the co-occurrence of salient clinical features, a High-order Factorization Network (HoFN) is proposed to learn the impact of a set of clinical features without tedious feature engineering. On the other hand, an attention-based deep convolutional neural network (CNN) with pre-trained parameters is used to process the lung CT images. Finally, to achieve cohesion of the cross-modality representation, we design a loss function that shifts the deep features of both modalities into the same feature space, which improves the model's performance and robustness when one modality is absent. Experimental results demonstrate that the proposed multi-modality feature learning and fusion model achieves high performance in an authentic scenario.
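The idea of pulling both modalities into one feature space so that a single present modality can stand in for the pair can be sketched compactly. The following NumPy sketch uses a simple mean-squared-distance alignment term as a stand-in for the paper's unspecified cohesion loss; all names (`cohesion_loss`, `severity_score`) are hypothetical, not the authors' API.

```python
import numpy as np

def cohesion_loss(emr_feat, ct_feat):
    """Pull the EMR and CT embeddings of the same patient together in a
    shared feature space (here: mean squared distance)."""
    return float(np.mean((emr_feat - ct_feat) ** 2))

def severity_score(emr_feat, ct_feat, w, b=0.0):
    """Fuse by averaging whichever modality embeddings are present, then
    apply a linear severity head. Because training aligned the modalities,
    a single present modality can stand in for both at inference time."""
    present = [f for f in (emr_feat, ct_feat) if f is not None]
    fused = np.mean(present, axis=0)
    return float(fused @ w + b)

rng = np.random.default_rng(0)
e = rng.normal(size=8)                 # EMR embedding (e.g. HoFN output)
c = e + 0.01 * rng.normal(size=8)      # CT embedding, nearly aligned after training
w = rng.normal(size=8)                 # linear severity head
```

When the two embeddings coincide, the cohesion term vanishes and the prediction is unchanged whether one or both modalities are supplied, which is exactly the robustness property the abstract claims.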
Affiliation(s)
- Jinzhao Zhou
- School of Computer Science and Engineering, South China University of Technology, Guangzhou 510641, China.
- Xingming Zhang
- School of Computer Science and Engineering, South China University of Technology, Guangzhou 510641, China.
- Ziwei Zhu
- School of Computer Science and Engineering, South China University of Technology, Guangzhou 510641, China.
- Xiangyuan Lan
- Department of Computer Science, Hong Kong Baptist University, Hong Kong.
- Lunkai Fu
- School of Computer Science and Engineering, South China University of Technology, Guangzhou 510641, China.
- Haoxiang Wang
- School of Computer Science and Engineering, South China University of Technology, Guangzhou 510641, China.
- Hanchun Wen
- Department of Critical Care Medicine, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, China.
|
97
|
Yue H, Liu J, Li J, Kuang H, Lang J, Cheng J, Peng L, Han Y, Bai H, Wang Y, Wang Q, Wang J. MLDRL: Multi-loss disentangled representation learning for predicting esophageal cancer response to neoadjuvant chemoradiotherapy using longitudinal CT images. Med Image Anal 2022; 79:102423. [DOI: 10.1016/j.media.2022.102423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2021] [Revised: 03/08/2022] [Accepted: 03/12/2022] [Indexed: 12/24/2022]
|
98
|
Wang J, Yu Z, Luan Z, Ren J, Zhao Y, Yu G. RDAU-Net: Based on a Residual Convolutional Neural Network With DFP and CBAM for Brain Tumor Segmentation. Front Oncol 2022; 12:805263. [PMID: 35311076 PMCID: PMC8924611 DOI: 10.3389/fonc.2022.805263] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2021] [Accepted: 01/14/2022] [Indexed: 12/20/2022] Open
Abstract
Due to the high heterogeneity of brain tumors, automatic brain tumor segmentation remains a challenging task. In this paper, we propose RDAU-Net, which adds dilated feature pyramid blocks with 3D CBAM blocks and inserts 3D CBAM blocks after the skip-connection layers. Moreover, a CBAM with channel and spatial attention facilitates the combination of more expressive feature information, leading to more efficient extraction of contextual information from images at various scales. Performance was evaluated on the Multimodal Brain Tumor Segmentation (BraTS) challenge data. Experimental results show that RDAU-Net achieves state-of-the-art performance; the Dice coefficient for the whole tumor (WT) on the BraTS 2019 dataset exceeded the baseline value by 9.2%.
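CBAM's two-step attention (channels first, then spatial positions) is easy to illustrate. The sketch below is a heavily simplified NumPy version for a single 3D-less feature map: the real CBAM passes the pooled channel statistics through a shared MLP and the pooled spatial maps through a 7x7 convolution, both omitted here, so treat this as a shape-level illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x):
    """x: (C, H, W). Reweight channels from pooled statistics (MLP omitted)."""
    avg = x.mean(axis=(1, 2))
    mx = x.max(axis=(1, 2))
    w = sigmoid(avg + mx)                 # (C,) weights in (0, 1)
    return x * w[:, None, None]

def spatial_attention(x):
    """Reweight spatial positions from cross-channel pooled statistics
    (the real CBAM applies a 7x7 conv here)."""
    avg = x.mean(axis=0)
    mx = x.max(axis=0)
    w = sigmoid(avg + mx)                 # (H, W) weights in (0, 1)
    return x * w[None, :, :]

def cbam(x):
    """Channel attention followed by spatial attention, as in CBAM."""
    return spatial_attention(channel_attention(x))

feat = np.random.default_rng(1).normal(size=(2, 4, 4))
refined = cbam(feat)
```

Because every attention weight lies in (0, 1), the module can only suppress features, never amplify them; the network learns which channels and positions to keep.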
Affiliation(s)
- Jingjing Wang
- College of Physics and Electronics Science, Shandong Normal University, Jinan, China.
- Zishu Yu
- College of Physics and Electronics Science, Shandong Normal University, Jinan, China.
- Zhenye Luan
- College of Physics and Electronics Science, Shandong Normal University, Jinan, China.
- Jinwen Ren
- College of Physics and Electronics Science, Shandong Normal University, Jinan, China.
- Yanhua Zhao
- Obstetrics and Gynecology, Tengzhou Xigang Central Health Center, Tengzhou, China.
- Gang Yu
- College of Physics and Electronics Science, Shandong Normal University, Jinan, China.
|
99
|
Shi W, Xu T, Yang H, Xi Y, Du Y, Li J, Li J. Attention Gate based dual-pathway Network for Vertebra Segmentation of X-ray Spine images. IEEE J Biomed Health Inform 2022; 26:3976-3987. [PMID: 35290194 DOI: 10.1109/jbhi.2022.3158968] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
Automatic spine and vertebra segmentation from X-ray spine images is a critical and challenging problem in many computer-aided spinal image analysis and disease diagnosis applications. In this paper, a two-stage automatic segmentation framework for spine X-ray images is proposed, which first locates the spine regions (including the backbone, sacrum and ilium) in the coarse stage and then identifies eighteen vertebrae (i.e., cervical vertebra 1, thoracic vertebrae 1-12 and lumbar vertebrae 1-5) with isolated and clear boundaries in the fine stage. A novel Attention Gate based dual-pathway Network (AGNet), composed of context and edge pathways, is designed to extract semantic and boundary information for segmentation of both spine and vertebra regions. A multi-scale supervision mechanism is applied to explore comprehensive features, and an Edge aware Fusion Mechanism (EFM) is proposed to fuse the features extracted from the two pathways. Several other image processing techniques, such as centralized backbone clipping, patch cropping and convex hull detection, are introduced to further refine the vertebra segmentation results. Experimental validation on a spine X-ray image dataset and a vertebrae dataset suggests that the proposed AGNet achieves superior performance compared with state-of-the-art segmentation methods, and the coarse-to-fine framework can be implemented in real spinal diagnosis systems.
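The attention gates that AGNet builds on can be sketched generically. The following NumPy sketch follows the common additive attention-gate formulation (as in Attention U-Net), not AGNet's own code; the weight matrices `w_x`, `w_g` and the vector `psi` are illustrative learned parameters.

```python
import numpy as np

def attention_gate(skip, gate, w_x, w_g, psi):
    """Additive attention gate on a skip connection.

    skip: (C, H, W) encoder features; gate: (C, H, W) coarser gating signal
    upsampled to the same size. w_x, w_g: (C_int, C) projections into an
    intermediate space; psi: (C_int,) collapses it to one coefficient per
    pixel. The sigmoid coefficients in (0, 1) suppress irrelevant regions.
    """
    c, h, w = skip.shape
    xs = skip.reshape(c, -1)                      # (C, H*W)
    gs = gate.reshape(c, -1)
    q = np.maximum(w_x @ xs + w_g @ gs, 0.0)      # ReLU of summed projections
    alpha = 1.0 / (1.0 + np.exp(-(psi @ q)))      # (H*W,) attention map
    return skip * alpha.reshape(1, h, w)

rng = np.random.default_rng(0)
skip = rng.normal(size=(3, 4, 4))
gate = rng.normal(size=(3, 4, 4))
w_x, w_g = rng.normal(size=(2, 3)), rng.normal(size=(2, 3))
psi = rng.normal(size=2)
gated = attention_gate(skip, gate, w_x, w_g, psi)
```

The gated skip features keep their shape, so the gate drops into a U-Net-style decoder without other changes.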
|
100
|
Chen C, Dou Q, Jin Y, Liu Q, Heng PA. Learning With Privileged Multimodal Knowledge for Unimodal Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:621-632. [PMID: 34633927 DOI: 10.1109/tmi.2021.3119385] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Multimodal learning usually requires a complete set of modalities during inference to maintain performance. Although training data can be well-prepared with high-quality multiple modalities, in many cases of clinical practice, only one modality can be acquired and important clinical evaluations have to be made based on the limited single modality information. In this work, we propose a privileged knowledge learning framework with the 'Teacher-Student' architecture, in which the complete multimodal knowledge that is only available in the training data (called privileged information) is transferred from a multimodal teacher network to a unimodal student network, via both a pixel-level and an image-level distillation scheme. Specifically, for the pixel-level distillation, we introduce a regularized knowledge distillation loss which encourages the student to mimic the teacher's softened outputs in a pixel-wise manner and incorporates a regularization factor to reduce the effect of incorrect predictions from the teacher. For the image-level distillation, we propose a contrastive knowledge distillation loss which encodes image-level structured information to enrich the knowledge encoding in combination with the pixel-level distillation. We extensively evaluate our method on two different multi-class segmentation tasks, i.e., cardiac substructure segmentation and brain tumor segmentation. Experimental results on both tasks demonstrate that our privileged knowledge learning is effective in improving unimodal segmentation and outperforms previous methods.
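The pixel-level distillation term the abstract describes (student mimics the teacher's softened outputs, with a regularization factor reducing the effect of incorrect teacher predictions) can be sketched as follows. This NumPy sketch is an assumption-laden illustration: the paper's exact regularization factor is not reproduced here, so it substitutes a simple down-weighting of pixels where the teacher's argmax disagrees with the ground truth, and all names are hypothetical.

```python
import numpy as np

def softmax(logits, t=1.0, axis=0):
    """Temperature-softened softmax over the class axis."""
    z = logits / t
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def regularized_kd_loss(student_logits, teacher_logits, labels, t=2.0, eps=1e-8):
    """Pixel-wise KL divergence to the teacher's softened outputs.

    student_logits, teacher_logits: (K, H, W); labels: (H, W) class indices.
    Pixels where the teacher's prediction is wrong are down-weighted
    (a stand-in for the paper's regularization of incorrect predictions).
    """
    p_t = softmax(teacher_logits, t)
    p_s = softmax(student_logits, t)
    kl = (p_t * (np.log(p_t + eps) - np.log(p_s + eps))).sum(axis=0)  # (H, W)
    correct = (teacher_logits.argmax(axis=0) == labels).astype(float)
    weight = 0.5 + 0.5 * correct          # halve the loss where teacher errs
    return float((weight * kl).mean() * t * t)

rng = np.random.default_rng(0)
t_logits = rng.normal(size=(3, 4, 4))
s_logits = rng.normal(size=(3, 4, 4))
labels = t_logits.argmax(axis=0)          # toy labels agreeing with the teacher
```

The `t * t` factor is the usual rescaling that keeps distillation-gradient magnitudes comparable across temperatures.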
|