1
Wu C, Guo M, Ma M, Wang K. TLTNet: A novel transscale cascade layered transformer network for enhanced retinal blood vessel segmentation. Comput Biol Med 2024; 178:108773. [PMID: 38925090] [DOI: 10.1016/j.compbiomed.2024.108773]
Abstract
Extracting global and local feature information from retinal blood vessel images remains challenging because of fuzzy edge features, noise, difficulty in distinguishing lesion regions from background, and loss of low-level feature information, all of which lead to insufficient feature extraction. To better solve these problems and fully extract the global and local feature information of the image, we propose a novel transscale cascade layered transformer network for enhanced retinal blood vessel segmentation, which consists of an encoder and a decoder connected by a transscale transformer cascade module. The encoder consists of a local-global transscale transformer module, a multi-head layered transscale adaptive embedding module, and a local context (LCNet) module. The transscale transformer cascade module learns local and global feature information, as well as multi-scale dependent features, from the first three layers of the encoder, fuses the hierarchical feature information from the skip connection block and the channel-token interaction fusion block, respectively, and inputs it to the decoder. The decoder includes a decoding module for the local context network and a transscale position transformer module, which feed the local and global feature information extracted by the encoder, with key position information retained, into the decoding module and the position embedding transformer module to recover and output prediction results consistent with the input. In addition, we propose an improved cross-entropy loss function based on the deviation distance between deterministic observation samples and prediction results. Combined with the proposed dual-transformer network, it is validated on the DRIVE and STARE datasets, where the segmentation accuracies are 97.26% and 97.87%, respectively. Compared with other state-of-the-art networks, the results show that the proposed network model has a significant competitive advantage in improving the segmentation performance of retinal blood vessel images.
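The abstract does not give the exact form of the deviation-based cross-entropy. As a rough illustration of the general idea, re-weighting per-pixel cross-entropy by how far the prediction deviates from its label, a minimal PyTorch sketch might look like this (the weighting form and the gamma parameter are assumptions, not the authors' formulation):

```python
import torch
import torch.nn.functional as F

def deviation_weighted_bce(pred_logits: torch.Tensor, target: torch.Tensor,
                           gamma: float = 1.0) -> torch.Tensor:
    """Binary cross-entropy where each pixel is re-weighted by the absolute
    deviation between predicted probability and label. Illustrative only;
    the published loss may differ."""
    prob = torch.sigmoid(pred_logits)
    # Per-pixel BCE, kept unreduced so it can be re-weighted.
    bce = F.binary_cross_entropy_with_logits(pred_logits, target, reduction="none")
    # Pixels whose prediction deviates strongly from the label get larger weight.
    weight = 1.0 + gamma * (prob - target).abs()
    return (weight * bce).mean()

# Example: a batch of 2 single-channel 64x64 vessel probability maps.
logits = torch.randn(2, 1, 64, 64)
labels = torch.randint(0, 2, (2, 1, 64, 64)).float()
print(deviation_weighted_bce(logits, labels))
```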
Affiliation(s)
- Chengwei Wu
- Key Laboratory of Modern Teaching Technology, Ministry of Education, School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China.
- Min Guo
- Key Laboratory of Modern Teaching Technology, Ministry of Education, School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China.
- Miao Ma
- Key Laboratory of Modern Teaching Technology, Ministry of Education, School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China.
- Kaiguang Wang
- Key Laboratory of Modern Teaching Technology, Ministry of Education, School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China.
2
Song J, Lu X, Gu Y. GMAlignNet: multi-scale lightweight brain tumor image segmentation with enhanced semantic information consistency. Phys Med Biol 2024; 69:115033. [PMID: 38657628] [DOI: 10.1088/1361-6560/ad4301]
Abstract
Although the U-shaped architecture, represented by UNet, has become a major network model for brain tumor segmentation, the repeated convolution and sampling operations can easily lead to the loss of crucial information. Additionally, directly fusing features from different levels without distinction can easily result in feature misalignment, affecting segmentation accuracy. On the other hand, traditional convolutional blocks used for feature extraction cannot capture the abundant multi-scale information present in brain tumor images. This paper proposes a multi-scale feature-aligned segmentation model called GMAlignNet that fully utilizes Ghost convolution to solve these problems. A Ghost hierarchical decoupled fusion unit and a Ghost hierarchical decoupled unit are used instead of standard convolutions in the encoding and decoding paths. This transformation replaces the holistic learning of volume structures by traditional convolutional blocks with multi-level learning on a specific view, facilitating the acquisition of abundant multi-scale contextual information through low-cost operations. Furthermore, a feature alignment unit is proposed that can utilize semantic information flow to guide the recovery of upsampled features. It performs pixel-level semantic information correction on features misaligned by feature fusion. The proposed method is also employed to optimize three classic networks, namely DMFNet, HDC-Net, and 3D UNet, demonstrating its effectiveness in automatic brain tumor segmentation. The proposed network model was applied to the BraTS 2018 dataset, and the results indicate that the proposed GMAlignNet achieved Dice coefficients of 81.65%, 90.07%, and 85.16% for enhancing tumor, whole tumor, and tumor core segmentation, respectively. Moreover, with only 0.29 M parameters and 26.88 G FLOPs, the network is lightweight and computationally efficient. Extensive experiments on the BraTS 2018, BraTS 2019, and BraTS 2020 datasets suggest that the proposed model also handles edge details and contour recognition well.
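GMAlignNet's units build on Ghost convolution. The published Ghost hierarchical decoupled units are not reproduced here, but the underlying Ghost module, which generates part of the output channels with a cheap depthwise convolution, can be sketched as follows (the 2D setting, kernel sizes, and the 1:1 intrinsic-to-ghost split are illustrative assumptions):

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Standard Ghost convolution: a small set of intrinsic feature maps is
    produced by an ordinary conv, then a cheap depthwise conv generates the
    remaining 'ghost' maps; both are concatenated. With ratio=2 the output
    channel count should be even so ghost channels match the intrinsic group."""

    def __init__(self, in_ch: int, out_ch: int, ratio: int = 2):
        super().__init__()
        intrinsic = out_ch // ratio           # channels from the ordinary conv
        ghost = out_ch - intrinsic            # channels from the cheap operation
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, intrinsic, kernel_size=1, bias=False),
            nn.BatchNorm2d(intrinsic), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            nn.Conv2d(intrinsic, ghost, kernel_size=3, padding=1,
                      groups=intrinsic, bias=False),   # depthwise: cheap ghosts
            nn.BatchNorm2d(ghost), nn.ReLU(inplace=True))

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

# Example: 16 -> 32 channels on a 2D slice.
print(GhostModule(16, 32)(torch.randn(1, 16, 96, 96)).shape)  # (1, 32, 96, 96)
```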
Affiliation(s)
- Jianli Song
- Inner Mongolia Key Laboratory of Pattern Recognition and Intelligent Image Processing, School of Digital and Intelligent Industry, Inner Mongolia University of Science and Technology, Baotou 014010, People's Republic of China
- Xiaoqi Lu
- Inner Mongolia Key Laboratory of Pattern Recognition and Intelligent Image Processing, School of Digital and Intelligent Industry, Inner Mongolia University of Science and Technology, Baotou 014010, People's Republic of China
- School of Information Engineering, Inner Mongolia University of Technology, Hohhot 010051, People's Republic of China
- Yu Gu
- Inner Mongolia Key Laboratory of Pattern Recognition and Intelligent Image Processing, School of Digital and Intelligent Industry, Inner Mongolia University of Science and Technology, Baotou 014010, People's Republic of China
3
Choi Y, Al-Masni MA, Jung KJ, Yoo RE, Lee SY, Kim DH. A single stage knowledge distillation network for brain tumor segmentation on limited MR image modalities. Comput Methods Programs Biomed 2023; 240:107644. [PMID: 37307766] [DOI: 10.1016/j.cmpb.2023.107644]
Abstract
BACKGROUND AND OBJECTIVE Precisely segmenting brain tumors using multimodal Magnetic Resonance Imaging (MRI) is an essential task for early diagnosis, disease monitoring, and surgical planning. Unfortunately, the complete four image modalities utilized in the well-known BraTS benchmark dataset: T1, T2, Fluid-Attenuated Inversion Recovery (FLAIR), and T1 Contrast-Enhanced (T1CE) are not regularly acquired in clinical practice due to the high cost and long acquisition time. Rather, it is common to utilize limited image modalities for brain tumor segmentation. METHODS In this paper, we propose a single-stage knowledge distillation algorithm that derives information from the missing modalities for better segmentation of brain tumors. Unlike previous works that adopted a two-stage framework to distill the knowledge from a pre-trained network into a student network, where the latter is trained on limited image modalities, we train both models simultaneously using a single-stage knowledge distillation algorithm. We transfer the information by reducing the redundancy from a teacher network trained on full image modalities to the student network using Barlow Twins loss on a latent-space level. To distill the knowledge on the pixel level, we further employ a deep supervision idea that trains the backbone networks of both teacher and student paths using Cross-Entropy loss. RESULTS We demonstrate that the proposed single-stage knowledge distillation approach improves the performance of the student network in each tumor category, with overall dice scores of 91.11% for Tumor Core, 89.70% for Enhancing Tumor, and 92.20% for Whole Tumor when only the FLAIR and T1CE images are used, outperforming the state-of-the-art segmentation methods. CONCLUSIONS The outcomes of this work prove the feasibility of exploiting knowledge distillation for segmenting brain tumors with limited image modalities, bringing it closer to clinical practice.
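The latent-level transfer described above uses a Barlow Twins-style redundancy-reduction loss between teacher and student embeddings. A minimal sketch of that loss follows; the normalization details, the off-diagonal weight lambd, and the embedding dimensions are assumptions rather than the paper's exact settings:

```python
import torch

def barlow_twins_loss(z_teacher: torch.Tensor, z_student: torch.Tensor,
                      lambd: float = 5e-3) -> torch.Tensor:
    """Redundancy-reduction loss between teacher and student latent vectors.

    z_* have shape (batch, dim). The cross-correlation matrix of the
    batch-normalized embeddings is pushed toward the identity: diagonal
    terms -> 1 (agreement), off-diagonal terms -> 0 (redundancy reduction).
    lambd is an illustrative weighting, not the value used in the paper."""
    n, _ = z_teacher.shape
    za = (z_teacher - z_teacher.mean(0)) / (z_teacher.std(0) + 1e-6)
    zb = (z_student - z_student.mean(0)) / (z_student.std(0) + 1e-6)
    c = za.T @ zb / n                                    # (dim, dim) cross-correlation
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    off_diag = c.pow(2).sum() - torch.diagonal(c).pow(2).sum()
    return on_diag + lambd * off_diag

# Example: flattened bottleneck features of a teacher (full modalities)
# and a student (limited modalities), batch of 8, 128-dim embeddings.
print(barlow_twins_loss(torch.randn(8, 128), torch.randn(8, 128)))
```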
Affiliation(s)
- Yoonseok Choi
- Department of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul 03722, Republic of Korea
- Mohammed A Al-Masni
- Department of Artificial Intelligence, College of Software & Convergence Technology, Daeyang AI Center, Sejong University, Seoul 05006, Republic of Korea
- Kyu-Jin Jung
- Department of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul 03722, Republic of Korea
- Roh-Eul Yoo
- Department of Radiology, Seoul National University Hospital, 101 Daehak-ro Jongno-gu, Seoul 03080, Republic of Korea; Department of Radiology, Seoul National University College of Medicine, 103 Daehak-ro Jongno-gu, Seoul 03080, Republic of Korea
- Seong-Yeong Lee
- Department of Radiology, Seoul National University Hospital, 101 Daehak-ro Jongno-gu, Seoul 03080, Republic of Korea
- Dong-Hyun Kim
- Department of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul 03722, Republic of Korea.
4
You X, Gu Y, Liu Y, Lu S, Tang X, Yang J. VerteFormer: A single-staged Transformer network for vertebrae segmentation from CT images with arbitrary field of views. Med Phys 2023; 50:6296-6318. [PMID: 37211910] [DOI: 10.1002/mp.16467]
Abstract
BACKGROUND Spinal diseases are burdening an increasing number of patients, and fully automatic vertebrae segmentation for CT images with arbitrary field of views (FOVs) is a fundamental step for computer-assisted spinal disease diagnosis and surgical intervention. Researchers have therefore worked to solve this challenging task in recent years. PURPOSE The task suffers from challenges including the intra-vertebrae inconsistency of segmentation and the poor identification of biterminal vertebrae in CT scans. Existing models also have limitations: they can be difficult to apply to spinal cases with arbitrary FOVs, or they employ multi-stage networks with high computational cost. In this paper, we propose a single-staged model called VerteFormer which can effectively deal with the challenges and limitations mentioned above. METHODS The proposed VerteFormer utilizes the advantage of the Vision Transformer (ViT), which does well in mining global relations for input data. The Transformer and UNet-based structure effectively fuses global and local features of vertebrae. Besides, we propose an Edge Detection (ED) block based on convolution and self-attention to divide neighboring vertebrae with clear boundary lines, which simultaneously promotes more consistent segmentation masks of vertebrae. To better identify the labels of vertebrae in the spine, particularly biterminal vertebrae, we further introduce global information generated from a Global Information Extraction (GIE) block. RESULTS We evaluate the proposed model on two public datasets: MICCAI Challenge VerSe 2019 and 2020. VerteFormer achieves dice scores of 86.39% and 86.54% on the public and hidden test datasets of VerSe 2019, and 84.53% and 86.86% on VerSe 2020, outperforming other Transformer-based models and single-staged methods specifically designed for the VerSe Challenge. Additional ablation experiments validate the effectiveness of the ViT block, the ED block, and the GIE block. CONCLUSIONS We propose a single-staged Transformer-based model for the task of fully automatic vertebrae segmentation from CT images with arbitrary FOVs. ViT demonstrates its effectiveness in modeling long-term relations, and the ED block and GIE block improve the segmentation performance of vertebrae. The proposed model can assist physicians in the diagnosis of spinal diseases and in surgical intervention, and it is also promising to be generalized and transferred to other applications of medical imaging.
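VerteFormer's ED and GIE blocks are not reproduced here, but the core ingredient, ViT-style global self-attention over flattened 3D feature tokens with a learnable positional embedding, can be sketched as follows (the channel count, token grid, and head count are illustrative assumptions):

```python
import torch
import torch.nn as nn

class GlobalTokenAttention(nn.Module):
    """Minimal transformer-style global attention over a 3D feature map:
    the volume is flattened into tokens, passed through multi-head
    self-attention with a learnable positional embedding, and reshaped back."""

    def __init__(self, channels: int = 64, tokens: int = 8 * 8 * 8, heads: int = 4):
        super().__init__()
        self.pos = nn.Parameter(torch.zeros(1, tokens, channels))
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):                      # x: (B, C, D, H, W)
        b, c, d, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (B, D*H*W, C)
        tokens = self.norm(tokens + self.pos)
        out, _ = self.attn(tokens, tokens, tokens)
        out = tokens + out                     # residual connection
        return out.transpose(1, 2).reshape(b, c, d, h, w)

# Example: bottleneck feature map of an encoder, 8x8x8 voxels, 64 channels.
print(GlobalTokenAttention()(torch.randn(1, 64, 8, 8, 8)).shape)
```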
Affiliation(s)
- Xin You
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, China
- Yun Gu
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, China
- Yingying Liu
- Research, Technology and Clinical, Medtronic Technology Center, Shanghai, China
- Steve Lu
- Visualization and Robotics, Medtronic Technology Center, Shanghai, China
- Xin Tang
- Research, Technology and Clinical, Medtronic Technology Center, Shanghai, China
- Jie Yang
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, China
5
Shen L, Zhang Y, Wang Q, Qin F, Sun D, Min H, Meng Q, Xu C, Zhao W, Song X. Feature interaction network based on hierarchical decoupled convolution for 3D medical image segmentation. PLoS One 2023; 18:e0288658. [PMID: 37440581] [DOI: 10.1371/journal.pone.0288658]
Abstract
Manual image segmentation is time-consuming. An automatic and accurate method is required to segment multimodal brain tumors in context-rich three-dimensional medical images so that the results can be used for clinical treatment decisions and surgical planning. However, accurate segmentation of medical images with deep learning is challenging due to the diversity of tumors and the complex boundary interactions between sub-regions, while limited computing resources hinder the construction of efficient neural networks. We propose a feature fusion module based on a hierarchical decoupled convolution network and an attention mechanism to improve segmentation performance. We replaced the skip connections of U-shaped networks with this feature fusion module to alleviate the category imbalance problem, thus contributing to the segmentation of more complicated medical images. We introduced a global attention mechanism to further integrate the features learned by the encoder and explore the context information. The proposed method was evaluated for enhancing tumor, whole tumor, and tumor core, achieving Dice similarity coefficient metrics of 0.775, 0.900, and 0.827, respectively, on the BraTS 2019 dataset and 0.800, 0.902, and 0.841, respectively, on the BraTS 2018 dataset. The results show that our proposed method is inherently general and is a powerful tool for brain tumor image studies. Our code is available at: https://github.com/WSake/Feature-interaction-network-based-on-Hierarchical-Decoupled-Convolution.
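The paper's fusion module combines hierarchical decoupled convolution with attention. As a generic illustration of attention-weighted skip fusion, not the authors' exact design, a channel-gated fusion of encoder and decoder features could look like this:

```python
import torch
import torch.nn as nn

class AttentionSkipFusion(nn.Module):
    """Generic attention-weighted replacement for a plain skip connection:
    encoder and decoder features are concatenated, squeezed into per-channel
    weights, and the re-weighted mix is returned. Reduction ratio is an
    illustrative choice."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.mix = nn.Conv3d(2 * channels, channels, kernel_size=1)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),                        # global context
            nn.Conv3d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv3d(channels // reduction, channels, 1), nn.Sigmoid())

    def forward(self, enc_feat, dec_feat):
        fused = self.mix(torch.cat([enc_feat, dec_feat], dim=1))
        return fused * self.gate(fused)                     # channel-wise re-weighting

# Example: fuse 32-channel encoder and decoder features of a 3D U-shaped network.
f = AttentionSkipFusion(32)
print(f(torch.randn(1, 32, 16, 16, 16), torch.randn(1, 32, 16, 16, 16)).shape)
```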
Affiliation(s)
- Longfeng Shen
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui, China
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- Anhui Big-Data Research Center on University Management, Huaibei, Anhui, China
- Yingjie Zhang
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui, China
- Qiong Wang
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui, China
- Fenglan Qin
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui, China
- Dengdi Sun
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- Anhui Provincial Key Laboratory of Multimodal Cognitive Computing, School of Artificial Intelligence, Anhui University, Hefei, China
- Hai Min
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, Anhui, China
- Qianqian Meng
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui, China
- Chengzhen Xu
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui, China
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- Wei Zhao
- Huaibei People's Hospital, Huaibei, Anhui, China
- Xin Song
- Huaibei People's Hospital, Huaibei, Anhui, China
6
An Adapted Deep Convolutional Neural Network for Automatic Measurement of Pancreatic Fat and Pancreatic Volume in Clinical Multi-Protocol Magnetic Resonance Images: A Retrospective Study with Multi-Ethnic External Validation. Biomedicines 2022; 10:2991. [PMID: 36428558] [PMCID: PMC9687882] [DOI: 10.3390/biomedicines10112991]
Abstract
Pancreatic volume and fat fraction are critical prognostic markers for metabolic diseases such as type 2 diabetes (T2D). Magnetic Resonance Imaging (MRI) is a non-invasive method required for quantifying the pancreatic fat fraction. The dramatic development of deep learning has enabled the automatic measurement of MR images. Therefore, based on MRI, we intend to develop a deep convolutional neural network (DCNN) that can accurately segment and measure pancreatic volume and fat fraction. This retrospective study involved abdominal MR images from 148 diabetic patients and 246 healthy normoglycemic participants. We randomly separated them into training and testing sets in an 80:20 proportion. In total, 2364 recognizable pancreas images were labeled and pre-processed with an upgraded superpixel algorithm to obtain a discernible pancreatic boundary. We then applied them to the novel DCNN model, mimicking the most accurate and latest manual pancreatic segmentation process. Fat phantom and erosion algorithms were employed to increase the accuracy. The results were evaluated by the Dice similarity coefficient (DSC). External validation datasets included 240 MR images from 10 additional patients. We assessed the pancreas and pancreatic fat volume using the DCNN and compared them with those of specialists. This DCNN employed the cutting-edge idea of manual pancreas segmentation and achieved the highest DSC (91.2%) compared with any reported models. It is the first framework to measure intra-pancreatic fat volume and fat deposition. In performance validation, the regression R2 values between manual operation and trained DCNN segmentation for pancreas volume and pancreatic fat volume were 0.9764 and 0.9675, respectively. The novel DCNN enables accurate pancreas segmentation and the measurement and calculation of pancreatic fat volume and fat fraction. It achieves the same segmentation level as experts. With further training, it may well surpass any expert and provide accurate measurements, which may have significant clinical relevance.
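The two headline metrics in this abstract, the Dice similarity coefficient for segmentation overlap and the regression R2 between manual and automatic volume measurements, can be computed as in the following toy sketch (the masks and volume series are made-up examples):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + 1e-8)

def r_squared(manual: np.ndarray, automatic: np.ndarray) -> float:
    """Coefficient of determination between manual and automatic measurements."""
    ss_res = np.sum((manual - automatic) ** 2)
    ss_tot = np.sum((manual - manual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy example: two overlapping masks and two short volume series (in mL).
a = np.zeros((32, 32), bool); a[8:24, 8:24] = True
b = np.zeros((32, 32), bool); b[10:26, 10:26] = True
print(dice_coefficient(a, b))
print(r_squared(np.array([50., 60., 70.]), np.array([51., 58., 71.])))
```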
7
Zeng G, Degonda C, Boschung A, Schmaranzer F, Gerber N, Siebenrock KA, Steppacher SD, Tannast M, Lerch TD. Three-Dimensional Magnetic Resonance Imaging Bone Models of the Hip Joint Using Deep Learning: Dynamic Simulation of Hip Impingement for Diagnosis of Intra- and Extra-articular Hip Impingement. Orthop J Sports Med 2021; 9:23259671211046916. [PMID: 34938819] [PMCID: PMC8685729] [DOI: 10.1177/23259671211046916]
Abstract
Background: Dynamic 3-dimensional (3D) simulation of hip impingement enables better understanding of complex hip deformities in young adult patients with femoroacetabular impingement (FAI). Deep learning algorithms may improve magnetic resonance imaging (MRI) segmentation. Purpose: (1) To evaluate the accuracy of 3D models created using convolutional neural networks (CNNs) for fully automatic MRI bone segmentation of the hip joint, (2) to correlate hip range of motion (ROM) between manual and automatic segmentation, and (3) to compare location of hip impingement in 3D models created using automatic bone segmentation in patients with FAI. Study Design: Cohort study (diagnosis); Level of evidence, 3. Methods: The authors retrospectively reviewed 31 hip MRI scans from 26 symptomatic patients (mean age, 27 years) with hip pain due to FAI. All patients had matched computed tomography (CT) and MRI scans of the pelvis and the knee. CT- and MRI-based osseous 3D models of the hip joint of the same patients were compared (MRI: T1 volumetric interpolated breath-hold examination high-resolution sequence; 0.8 mm3 isovoxel). CNNs were used to develop fully automatic bone segmentation of the hip joint, and the 3D models created using this method were compared with manual segmentation of CT- and MRI-based 3D models. Impingement-free ROM and location of hip impingement were calculated using previously validated collision detection software. Results: The difference between the CT- and MRI-based 3D models was <1 mm, and the difference between fully automatic and manual segmentation of MRI-based 3D models was <1 mm. The correlation of automatic and manual MRI-based 3D models was excellent and significant for impingement-free ROM (r = 0.995; P < .001), flexion (r = 0.953; P < .001), and internal rotation at 90° of flexion (r = 0.982; P < .001). The correlation for impingement-free flexion between automatic MRI-based 3D models and CT-based 3D models was 0.953 (P < .001). The location of impingement was not significantly different between manual and automatic segmentation of MRI-based 3D models, and the location of extra-articular hip impingement was not different between CT- and MRI-based 3D models. Conclusion: CNN can potentially be used in clinical practice to provide rapid and accurate 3D MRI hip joint models for young patients. The created models can be used for simulation of impingement during diagnosis of intra- and extra-articular hip impingement to enable radiation-free and patient-specific surgical planning for hip arthroscopy and open hip preservation surgery.
Affiliation(s)
- Guodong Zeng
- Sitem Center for Translational Medicine and Biomedical Entrepreneurship, University of Bern, Switzerland
- Celia Degonda
- Department of Orthopedic Surgery, Inselspital, University of Bern, Bern, Switzerland
- Adam Boschung
- Department of Orthopedic Surgery, Inselspital, University of Bern, Bern, Switzerland.,Department of Diagnostic, Interventional and Paediatric Radiology, University of Bern, Inselspital, Bern, Switzerland
- Florian Schmaranzer
- Department of Orthopedic Surgery, Inselspital, University of Bern, Bern, Switzerland.,Department of Diagnostic, Interventional and Paediatric Radiology, University of Bern, Inselspital, Bern, Switzerland
- Nicolas Gerber
- Sitem Center for Translational Medicine and Biomedical Entrepreneurship, University of Bern, Switzerland
- Klaus A Siebenrock
- Department of Orthopedic Surgery, Inselspital, University of Bern, Bern, Switzerland
- Simon D Steppacher
- Department of Orthopedic Surgery, Inselspital, University of Bern, Bern, Switzerland
- Moritz Tannast
- Department of Orthopedic Surgery, Inselspital, University of Bern, Bern, Switzerland.,Department of Orthopaedic Surgery and Traumatology, Cantonal Hospital, University of Fribourg, Fribourg, Switzerland
- Till D Lerch
- Department of Orthopedic Surgery, Inselspital, University of Bern, Bern, Switzerland.,Department of Diagnostic, Interventional and Paediatric Radiology, University of Bern, Inselspital, Bern, Switzerland
8
Tao R, Liu W, Zheng G. Spine-transformers: Vertebra labeling and segmentation in arbitrary field-of-view spine CTs via 3D transformers. Med Image Anal 2021; 75:102258. [PMID: 34670147] [DOI: 10.1016/j.media.2021.102258]
Abstract
In this paper, we address the problem of fully automatic labeling and segmentation of 3D vertebrae in arbitrary Field-Of-View (FOV) CT images. We propose a deep learning-based two-stage solution to tackle these two problems. More specifically, in the first stage, the challenging vertebra labeling problem is solved via a novel transformers-based 3D object detector that views automatic detection of vertebrae in arbitrary FOV CT scans as a one-to-one set prediction problem. The main components of the new method, called Spine-Transformers, are a one-to-one set based global loss that forces unique predictions and a lightweight 3D transformer architecture equipped with a skip connection and learnable positional embeddings for the encoder and decoder, respectively. We additionally propose an inscribed sphere-based object detector to replace the regular box-based object detector for better handling of volume orientation variations. Our method reasons about the relationships of different levels of vertebrae and the global volume context to directly infer all vertebrae in parallel. In the second stage, the segmentation of the identified vertebrae and the refinement of the detected centers are done by training one single multi-task encoder-decoder network for all vertebrae, as the network does not need to identify which vertebra it is working on. The two tasks share a common encoder path but have different decoder paths. Comprehensive experiments are conducted on two public datasets and one in-house dataset. The experimental results demonstrate the efficacy of the present approach.
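The one-to-one set prediction idea above relies on uniquely matching predictions to ground-truth vertebrae. A toy sketch of such a matching step, Hungarian assignment on center distances, is given below; the real Spine-Transformers loss also involves classification and inscribed-sphere terms that are not shown:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def one_to_one_match(pred_centers: np.ndarray, gt_centers: np.ndarray):
    """Assign each predicted vertebra center to exactly one ground-truth
    center by minimizing the total pairwise Euclidean distance."""
    cost = np.linalg.norm(pred_centers[:, None, :] - gt_centers[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)          # Hungarian matching
    return list(zip(rows.tolist(), cols.tolist())), cost[rows, cols].mean()

# Toy example: 3 predicted centers vs. 3 annotated vertebra centers (in mm).
pred = np.array([[10., 10., 42.], [11., 9., 65.], [9., 12., 20.]])
gt = np.array([[9., 11., 21.], [10., 10., 43.], [10., 10., 64.]])
print(one_to_one_match(pred, gt))
```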
Affiliation(s)
- Rong Tao
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No.800 Dongchuan Road, Shanghai 200240, China
- Wenyong Liu
- Key Laboratory of Biomechanics and Mechanobiology (Beihang University) of Ministry of Education, Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
- Guoyan Zheng
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No.800 Dongchuan Road, Shanghai 200240, China.
9
Xiao Z, He K, Liu J, Zhang W. Multi-view hierarchical split network for brain tumor segmentation. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102897]
10
Chlap P, Min H, Vandenberg N, Dowling J, Holloway L, Haworth A. A review of medical image data augmentation techniques for deep learning applications. J Med Imaging Radiat Oncol 2021; 65:545-563. [PMID: 34145766] [DOI: 10.1111/1754-9485.13261]
Abstract
Research in artificial intelligence for radiology and radiotherapy has recently become increasingly reliant on the use of deep learning-based algorithms. While the models these algorithms produce can significantly outperform more traditional machine learning methods, they rely on larger datasets being available for training. To address this issue, data augmentation has become a popular method for increasing the size of a training dataset, particularly in fields where large datasets are not typically available, which is often the case when working with medical images. Data augmentation aims to generate additional data which is used to train the model and has been shown to improve performance when validated on a separate unseen dataset. This approach has become commonplace, so to help understand the types of data augmentation techniques used in state-of-the-art deep learning models, we conducted a systematic review of the literature in which data augmentation was utilised on medical images (limited to CT and MRI) to train a deep learning model. Articles were categorised into basic, deformable, deep learning or other data augmentation techniques. As artificial intelligence models trained using augmented data make their way into the clinic, this review aims to give an insight into these techniques and confidence in the validity of the models produced.
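For context, the "basic" augmentation category reviewed here covers simple spatial and intensity transforms; a minimal sketch applied to a 3D volume is shown below (the specific transforms and parameter ranges are illustrative, not recommendations from the review):

```python
import numpy as np
from scipy.ndimage import rotate

def basic_augment(volume: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a random left-right flip, a small in-plane rotation, and mild
    additive Gaussian noise to a (depth, height, width) volume."""
    out = volume.copy()
    if rng.random() < 0.5:                       # random left-right flip
        out = out[:, :, ::-1]
    angle = rng.uniform(-10, 10)                 # small rotation in the axial plane
    out = rotate(out, angle, axes=(1, 2), reshape=False, order=1, mode="nearest")
    out = out + rng.normal(0.0, 0.01 * out.std() + 1e-8, out.shape)  # mild noise
    return out

# Example: augment a toy CT-like volume.
rng = np.random.default_rng(0)
vol = rng.normal(size=(16, 64, 64))
print(basic_augment(vol, rng).shape)
```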
Affiliation(s)
- Phillip Chlap
- South Western Sydney Clinical School, University of New South Wales, Sydney, New South Wales, Australia.,Ingham Institute for Applied Medical Research, Sydney, New South Wales, Australia.,Liverpool and Macarthur Cancer Therapy Centre, Liverpool Hospital, Sydney, New South Wales, Australia
- Hang Min
- South Western Sydney Clinical School, University of New South Wales, Sydney, New South Wales, Australia.,Ingham Institute for Applied Medical Research, Sydney, New South Wales, Australia.,The Australian e-Health and Research Centre, CSIRO Health and Biosecurity, Brisbane, Queensland, Australia
- Nym Vandenberg
- Institute of Medical Physics, University of Sydney, Sydney, New South Wales, Australia
- Jason Dowling
- South Western Sydney Clinical School, University of New South Wales, Sydney, New South Wales, Australia.,The Australian e-Health and Research Centre, CSIRO Health and Biosecurity, Brisbane, Queensland, Australia
- Lois Holloway
- South Western Sydney Clinical School, University of New South Wales, Sydney, New South Wales, Australia.,Ingham Institute for Applied Medical Research, Sydney, New South Wales, Australia.,Liverpool and Macarthur Cancer Therapy Centre, Liverpool Hospital, Sydney, New South Wales, Australia.,Institute of Medical Physics, University of Sydney, Sydney, New South Wales, Australia.,Centre for Medical Radiation Physics, University of Wollongong, Wollongong, New South Wales, Australia
- Annette Haworth
- Institute of Medical Physics, University of Sydney, Sydney, New South Wales, Australia
11
Luo Z, Jia Z, Yuan Z, Peng J. HDC-Net: Hierarchical Decoupled Convolution Network for Brain Tumor Segmentation. IEEE J Biomed Health Inform 2021; 25:737-745. [PMID: 32750914] [DOI: 10.1109/jbhi.2020.2998146]
Abstract
Accurate segmentation of brain tumors from magnetic resonance images (MRIs) is crucial for clinical treatment decision and surgical planning. Due to the large diversity of the tumors and complex boundary interactions between sub-regions, it is a great challenge. Besides accuracy, resource constraint is another important consideration. Recently, impressive improvement has been achieved for this task by using deep convolutional networks. However, most state-of-the-art models rely on expensive 3D convolutions as well as model cascade/ensemble strategies, which result in high computational overheads and undesired system complexity. For clinical usage, the challenge is how to pursue the best accuracy within very limited computational budgets. In this study, we segment 3D volumetric images in one pass with a hierarchical decoupled convolution network (HDC-Net), which is a light-weight but efficient pseudo-3D model. Specifically, we replace 3D convolutions with a novel hierarchical decoupled convolution (HDC) module, which can explore multi-scale multi-view spatial contexts with high efficiency. Extensive experiments on the BraTS 2018 and 2017 challenge datasets show that our method performs favorably against the state of the art in accuracy, yet with greatly reduced computational complexity.
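The exact HDC module is not reproduced here; one way to realize the decoupled multi-view idea, splitting channels into groups that are convolved in different view planes and then merged, is sketched below (the grouping and kernel layout are assumptions about the general approach, not the published module):

```python
import torch
import torch.nn as nn

class DecoupledViewConv(nn.Module):
    """Pseudo-3D alternative to a dense 3x3x3 convolution: channels are split
    into three groups, each convolved in a different view plane (axial,
    coronal, sagittal), then merged by a 1x1x1 convolution."""

    def __init__(self, channels: int):
        super().__init__()
        assert channels % 3 == 0
        g = channels // 3
        self.axial = nn.Conv3d(g, g, kernel_size=(1, 3, 3), padding=(0, 1, 1))
        self.coronal = nn.Conv3d(g, g, kernel_size=(3, 1, 3), padding=(1, 0, 1))
        self.sagittal = nn.Conv3d(g, g, kernel_size=(3, 3, 1), padding=(1, 1, 0))
        self.merge = nn.Conv3d(channels, channels, kernel_size=1)

    def forward(self, x):
        a, c, s = torch.chunk(x, 3, dim=1)
        out = torch.cat([self.axial(a), self.coronal(c), self.sagittal(s)], dim=1)
        return self.merge(out)

# Example: 48-channel 3D feature map; far fewer weights than a dense 3x3x3 conv.
print(DecoupledViewConv(48)(torch.randn(1, 48, 16, 32, 32)).shape)
```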
12
Zeng G, Schmaranzer F, Degonda C, Gerber N, Gerber K, Tannast M, Burger J, Siebenrock KA, Zheng G, Lerch TD. MRI-based 3D models of the hip joint enables radiation-free computer-assisted planning of periacetabular osteotomy for treatment of hip dysplasia using deep learning for automatic segmentation. Eur J Radiol Open 2020; 8:100303. [PMID: 33364259] [PMCID: PMC7753932] [DOI: 10.1016/j.ejro.2020.100303]
Abstract
Introduction: Both hip dysplasia (DDH) and femoroacetabular impingement (FAI) are complex three-dimensional hip pathologies causing hip pain and osteoarthritis in young patients. 3D MRI-based models were used for radiation-free computer-assisted surgical planning. Automatic segmentation of MRI-based 3D models is preferred because manual segmentation is time-consuming. The aims were to investigate (1) the difference and (2) the correlation for femoral head coverage (FHC) between automatic MRI-based and manual CT-based 3D models, and (3) the feasibility of preoperative planning in symptomatic patients with hip diseases. Methods: We performed an IRB-approved comparative, retrospective study of 31 hips (26 symptomatic patients with hip dysplasia or FAI). 3D MRI sequences and CT scans of the hip were acquired. Preoperative MRI included an axial-oblique T1 VIBE sequence (0.8 mm3 isovoxel) of the hip joint. Manual segmentation of MRI and CT scans was performed. Automatic segmentation of MRI-based 3D models was performed using deep learning. Results: (1) The difference between automatic and manual segmentation of MRI-based 3D hip joint models was below 1 mm (proximal femur 0.2 ± 0.1 mm and acetabulum 0.3 ± 0.5 mm). Dice coefficients of the proximal femur and the acetabulum were 98% and 97%, respectively. (2) The correlation for total FHC was excellent and significant (r = 0.975, p < 0.001) between automatic MRI-based and manual CT-based 3D models. The correlation for total FHC (r = 0.979, p < 0.001) between automatic and manual MRI-based 3D models was excellent. (3) Preoperative planning and simulation of periacetabular osteotomy was feasible in all patients (100%) with hip dysplasia or acetabular retroversion. Conclusions: Automatic segmentation of MRI-based 3D models using deep learning is as accurate as CT-based 3D models for patients with hip diseases of childbearing age. This allows radiation-free, patient-specific preoperative simulation and surgical planning of periacetabular osteotomy for patients with DDH.
Affiliation(s)
- Guodong Zeng
- Sitem Center for Translational Medicine and Biomedical Entrepreneurship, University of Bern, Switzerland
- Florian Schmaranzer
- Department of Orthopedic Surgery, Inselspital, University of Bern, Bern, Switzerland.,Department of Diagnostic, Interventional and Paediatric Radiology, University of Bern, Inselspital, Bern, Switzerland
- Celia Degonda
- Department of Orthopedic Surgery, Inselspital, University of Bern, Bern, Switzerland
- Nicolas Gerber
- Sitem Center for Translational Medicine and Biomedical Entrepreneurship, University of Bern, Switzerland
- Kate Gerber
- Sitem Center for Translational Medicine and Biomedical Entrepreneurship, University of Bern, Switzerland
- Moritz Tannast
- Department of Orthopedic Surgery, Inselspital, University of Bern, Bern, Switzerland.,Department of Orthopaedic Surgery and Traumatology, Cantonal Hospital, University of Fribourg, Switzerland
- Jürgen Burger
- Sitem Center for Translational Medicine and Biomedical Entrepreneurship, University of Bern, Switzerland
- Klaus A Siebenrock
- Department of Orthopedic Surgery, Inselspital, University of Bern, Bern, Switzerland
- Guoyan Zheng
- Institute for Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, China
- Till D Lerch
- Department of Orthopedic Surgery, Inselspital, University of Bern, Bern, Switzerland.,Department of Diagnostic, Interventional and Paediatric Radiology, University of Bern, Inselspital, Bern, Switzerland
13
Dong X, Xu S, Liu Y, Wang A, Saripan MI, Li L, Zhang X, Lu L. Multi-view secondary input collaborative deep learning for lung nodule 3D segmentation. Cancer Imaging 2020; 20:53. [PMID: 32738913] [PMCID: PMC7395980] [DOI: 10.1186/s40644-020-00331-0]
Abstract
BACKGROUND Convolutional neural networks (CNNs) have been extensively applied to two-dimensional (2D) medical image segmentation, yielding excellent performance. However, their application to three-dimensional (3D) nodule segmentation remains a challenge. METHODS In this study, we propose a multi-view secondary input residual (MV-SIR) convolutional neural network model for 3D lung nodule segmentation using the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) dataset of chest computed tomography (CT) images. Lung nodule cubes are prepared from the sample CT images. Further, from the axial, coronal, and sagittal perspectives, multi-view patches are generated with randomly selected voxels in the lung nodule cubes as centers. Our model consists of six submodels, which enable learning of features from 3D lung nodules sliced into three views; each submodel extracts voxel heterogeneity and shape heterogeneity features. We convert the segmentation of 3D lung nodules into voxel classification by inputting the multi-view patches into the model and determining whether the voxel points belong to the nodule. The structure of the secondary input residual submodel comprises a residual block followed by a secondary input module. We integrate the six submodels to classify whether voxel points belong to nodules, and then reconstruct the segmentation image. RESULTS The results of tests conducted using our model and comparison with other existing CNN models indicate that the MV-SIR model achieves excellent results in the 3D segmentation of pulmonary nodules, with a Dice coefficient of 0.926 and an average surface distance of 0.072. CONCLUSION Our MV-SIR model can accurately perform 3D segmentation of lung nodules with the same segmentation accuracy as the U-net model.
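The voxel-classification formulation above starts from extracting axial, coronal, and sagittal patches centred on each candidate voxel; a minimal sketch of that extraction step is shown below (the patch size and the absence of boundary handling are illustrative simplifications):

```python
import numpy as np

def multi_view_patches(volume: np.ndarray, center: tuple, size: int = 16):
    """Extract axial, coronal, and sagittal 2D patches centred on a voxel of a
    (z, y, x) volume, as used when casting 3D nodule segmentation as per-voxel
    classification."""
    z, y, x = center
    h = size // 2
    axial = volume[z, y - h:y + h, x - h:x + h]       # fixed z slice
    coronal = volume[z - h:z + h, y, x - h:x + h]     # fixed y slice
    sagittal = volume[z - h:z + h, y - h:y + h, x]    # fixed x slice
    return axial, coronal, sagittal

# Example: a 64^3 nodule cube; patches around the voxel being classified.
cube = np.random.rand(64, 64, 64)
for p in multi_view_patches(cube, (32, 32, 32)):
    print(p.shape)   # each patch is (16, 16)
```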
Affiliation(s)
- Xianling Dong
- Present Address: Department of Biomedical Engineering, Chengde Medical University, Chengde City, Hebei Province, China
- Shiqi Xu
- Present Address: Department of Biomedical Engineering, Chengde Medical University, Chengde City, Hebei Province, China
- Yanli Liu
- Present Address: Department of Biomedical Engineering, Chengde Medical University, Chengde City, Hebei Province, China
- Aihui Wang
- Department of Nuclear Medicine, Affiliated Hospital, Chengde Medical University, Chengde City, China
- M Iqbal Saripan
- Faculty of Engineering, Universiti Putra Malaysia, Serdang, Malaysia
- Li Li
- Present Address: Department of Biomedical Engineering, Chengde Medical University, Chengde City, Hebei Province, China
- Xiaolei Zhang
- Present Address: Department of Biomedical Engineering, Chengde Medical University, Chengde City, Hebei Province, China.
- Lijun Lu
- School of Biomedical Engineering and Guangdong Provincal Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, China.
14
Zheng H, Qian L, Qin Y, Gu Y, Yang J. Improving the slice interaction of 2.5D CNN for automatic pancreas segmentation. Med Phys 2020; 47:5543-5554. [PMID: 32502278] [DOI: 10.1002/mp.14303]
Abstract
PURPOSE Volumetric pancreas segmentation can be used in the diagnosis of pancreatic diseases, research on diabetes, and surgical planning. Since manual delineation is time-consuming and laborious, we develop a deep learning-based framework for automatic pancreas segmentation in three-dimensional (3D) medical images. METHODS A two-stage framework is designed for automatic pancreas delineation. In the localization stage, a Square Root Dice loss is developed to handle the trade-off between sensitivity and specificity. In the refinement stage, a novel 2.5D slice interaction network with a slice correlation module is proposed to capture the non-local cross-slice information at multiple feature levels. A self-supervised learning-based pre-training method, slice shuffle, is also designed to encourage inter-slice communication. To further improve the accuracy and robustness, ensemble learning and a recurrent refinement process are adopted in the segmentation flow. RESULTS The segmentation technique is validated on a public dataset (NIH Pancreas-CT) with 82 abdominal contrast-enhanced 3D CT scans. Fourfold cross-validation is performed to assess the capability and robustness of our method. The dice similarity coefficient, sensitivity, and specificity of our results are 86.21 ± 4.37%, 87.49 ± 6.38% and 85.11 ± 6.49%, respectively, which is state-of-the-art performance on this dataset. CONCLUSIONS We proposed an automatic pancreas segmentation framework and validated it on an open dataset. It is found that the 2.5D network benefits from multi-level slice interaction and that a suitable self-supervised pre-training method can boost the performance of the neural network. This technique could provide new image findings for the routine diagnosis of pancreatic disease.
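The Square Root Dice loss used in the localization stage is a modification of the standard soft Dice loss; since the abstract does not give the modified formula, only the standard form is sketched below (the smoothing constant and the binary setting are assumptions):

```python
import torch

def soft_dice_loss(pred_logits: torch.Tensor, target: torch.Tensor,
                   eps: float = 1e-6) -> torch.Tensor:
    """Standard soft Dice loss for binary volumetric segmentation; the paper's
    Square Root Dice variant, which rebalances sensitivity and specificity,
    is not reproduced here."""
    prob = torch.sigmoid(pred_logits).flatten(1)
    tgt = target.flatten(1)
    inter = (prob * tgt).sum(dim=1)
    denom = prob.sum(dim=1) + tgt.sum(dim=1)
    dice = (2 * inter + eps) / (denom + eps)
    return 1.0 - dice.mean()

# Example: batch of 2 predicted pancreas volumes, 32^3 voxels each.
logits = torch.randn(2, 1, 32, 32, 32)
labels = (torch.rand(2, 1, 32, 32, 32) > 0.9).float()
print(soft_dice_loss(logits, labels))
```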
Affiliation(s)
- Hao Zheng
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, 800 Dongchuan RD, Minhang District, Shanghai, 200240, China
- School of Biomedical Engineering, Shanghai Jiao Tong University, 800 Dongchuan RD, Minhang District, Shanghai, 200240, China
- Institute of Medical Robotics, Shanghai Jiao Tong University, 800 Dongchuan RD, Minhang District, Shanghai, 200240, China
- Lijun Qian
- Department of Radiology, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, 200240, China
- Yulei Qin
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, 800 Dongchuan RD, Minhang District, Shanghai, 200240, China
- Institute of Medical Robotics, Shanghai Jiao Tong University, 800 Dongchuan RD, Minhang District, Shanghai, 200240, China
- Yun Gu
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, 800 Dongchuan RD, Minhang District, Shanghai, 200240, China
- Institute of Medical Robotics, Shanghai Jiao Tong University, 800 Dongchuan RD, Minhang District, Shanghai, 200240, China
- Jie Yang
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, 800 Dongchuan RD, Minhang District, Shanghai, 200240, China
- Institute of Medical Robotics, Shanghai Jiao Tong University, 800 Dongchuan RD, Minhang District, Shanghai, 200240, China
15
Semantic segmentation of the multiform proximal femur and femoral head bones with the deep convolutional neural networks in low quality MRI sections acquired in different MRI protocols. Comput Med Imaging Graph 2020; 81:101715. [PMID: 32240933] [DOI: 10.1016/j.compmedimag.2020.101715]
Abstract
Medical image segmentation is one of the most crucial issues in medical image processing and analysis. In general, segmentation of the various structures in medical images is performed for further image analyses such as quantification, assessment, diagnosis, prognosis, and classification. In this paper, a study on the 2D semantic segmentation of the multiform, both spheric and aspheric, femoral head and proximal femur bones with deep convolutional neural networks (CNNs) in magnetic resonance imaging (MRI) sections of patients with Legg-Calve-Perthes disease (LCPD) is presented. In the scope of the proposed study, bilateral hip MRI sections acquired in the coronal plane were used. The MRI sections used were low-quality images obtained under different MRI protocols on 3 different MRI scanners with 1.5 T imaging capability. In performance evaluations, promising segmentation results were achieved with deep CNNs in low-quality MRI sections acquired under different MRI protocols. A success rate of about 90% was observed in the semantic segmentation of the multiform femoral head and proximal femur bones in a total of 194 MRI sections obtained from 33 MRI sequences of 13 patients with deep CNNs.