201
Zhang Y, Liu M, Yu F, Zeng T, Wang Y. An O-shape Neural Network With Attention Modules to Detect Junctions in Biomedical Images Without Segmentation. IEEE J Biomed Health Inform 2021; 26:774-785. [PMID: 34197332] [DOI: 10.1109/jbhi.2021.3094187]
Abstract
Junctions play an important role in biomedical research such as retinal biometric identification, retinal image registration, eye-related disease diagnosis and neuron reconstruction. However, junction detection in original biomedical images is extremely challenging. For example, retinal images contain many tiny blood vessels with complicated structures and low contrast, which makes junction detection difficult. In this paper, we propose an O-shape network architecture with attention modules (Attention O-Net), which includes a Junction Detection Branch (JDB) and a Local Enhancement Branch (LEB) to detect junctions in biomedical images without segmentation. In the JDB, a heatmap indicating the probabilities of junctions is estimated, and the positions with the locally highest values are then chosen as junctions; this alone, however, struggles when the images contain weak filament signals. Therefore, the LEB is constructed to enhance the thin-branch foreground and make the network pay more attention to regions with low contrast, which helps alleviate the foreground imbalance between thin and thick branches and detect the junctions of thin branches. Furthermore, attention modules are utilized to introduce feature maps from the LEB into the JDB, establishing a complementary relationship that further integrates local features and contextual information between the two branches. The proposed method achieves the highest average F1-scores of 0.82, 0.73 and 0.94 on two retinal datasets and one neuron dataset, respectively. The experimental results confirm that Attention O-Net outperforms other state-of-the-art detection methods and is helpful for retinal biometric identification.
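As background for the post-processing step this abstract describes (choosing the positions with the locally highest heatmap value), a minimal sketch of local-maximum picking might look as follows; the window size and probability threshold are illustrative assumptions, not values from the paper:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def pick_junctions(heatmap: np.ndarray, window: int = 7, thresh: float = 0.5):
    """Return (row, col) coordinates of local maxima above a threshold.

    heatmap: 2D array of per-pixel junction probabilities in [0, 1].
    window:  side length of the local neighbourhood (assumed, not from the paper).
    thresh:  minimum probability to accept a peak (assumed).
    """
    # A pixel is a local maximum if it equals the max of its neighbourhood.
    local_max = heatmap == maximum_filter(heatmap, size=window)
    peaks = local_max & (heatmap >= thresh)
    return np.argwhere(peaks)

if __name__ == "__main__":
    hm = np.zeros((64, 64))
    hm[10, 20] = 0.9   # synthetic junction
    hm[40, 40] = 0.7
    print(pick_junctions(hm))  # -> [[10 20] [40 40]]
```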
202
Liu Y, Yip LWL, Zheng Y, Wang L. Glaucoma screening using an attention-guided stereo ensemble network. Methods 2021; 202:14-21. [PMID: 34153436] [DOI: 10.1016/j.ymeth.2021.06.010]
Abstract
Glaucoma is a chronic eye disease that causes gradual vision loss and eventually blindness. Accurate glaucoma screening at an early stage is critical to mitigate its aggravation. Extracting high-quality features is critical in training classification models. In this paper, we propose a deep ensemble network with an attention mechanism that detects glaucoma using optic nerve head stereo images. The network consists of two main sub-components: a deep convolutional neural network that obtains global information and an attention-guided network that localizes the optic disc while maintaining beneficial information from other image regions. Both images in a stereo pair are fed into these sub-components, and the outputs are fused to generate the final prediction. Abundant image features from different views and regions are extracted, providing compensation when one of the stereo images is of poor quality. The attention-based localization method is trained in a weakly supervised manner and only image-level annotation is required, which avoids expensive segmentation labelling. Results on real patient images show that our approach increases recall (sensitivity) from the state-of-the-art 88.89% to 95.48%, while maintaining precision and performance stability. The marked reduction in the false-negative rate can significantly enhance the chance of successful early diagnosis of glaucoma.
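The entry does not specify the fusion operator, so the sketch below assumes the simplest choice, averaging per-branch logits over both views of the stereo pair; the tiny backbones are placeholders, not the paper's networks:

```python
import torch
import torch.nn as nn

class StereoEnsemble(nn.Module):
    """Toy two-branch ensemble: each image of a stereo pair goes through
    a global branch and an attention branch; logits are averaged."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        def tiny_cnn():
            return nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(16, num_classes),
            )
        self.global_branch = tiny_cnn()
        self.attention_branch = tiny_cnn()

    def forward(self, left: torch.Tensor, right: torch.Tensor) -> torch.Tensor:
        logits = [
            self.global_branch(left), self.global_branch(right),
            self.attention_branch(left), self.attention_branch(right),
        ]
        return torch.stack(logits).mean(dim=0)  # fuse the four predictions

left = torch.randn(1, 3, 224, 224)
right = torch.randn(1, 3, 224, 224)
print(StereoEnsemble()(left, right).shape)  # torch.Size([1, 2])
```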
Affiliation(s)
- Yuan Liu
- School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore.
- Yuanjin Zheng
- School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore.
- Lipo Wang
- School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore.
203
Chen S, Zou Y, Liu PX. IBA-U-Net: Attentive BConvLSTM U-Net with Redesigned Inception for medical image segmentation. Comput Biol Med 2021; 135:104551. [PMID: 34157471] [DOI: 10.1016/j.compbiomed.2021.104551]
Abstract
Accurate segmentation of medical images plays an essential role in their analysis and has a wide range of research and application value in fields such as medical research, disease diagnosis, disease analysis, and computer-assisted surgery. In recent years, deep convolutional neural networks have shown strong performance in medical image segmentation. However, because of the inherent challenges of medical images, such as dataset irregularities and the existence of outliers, segmentation approaches have not demonstrated sufficiently accurate and reliable results for clinical use. Our method is based on three key ideas: (1) integrating the BConvLSTM block and the attention block to reduce the semantic gap between the encoder and decoder feature maps and make the two feature maps more homogeneous; (2) factorizing convolutions with a large filter size using a Redesigned Inception block, which uses multiscale feature fusion to significantly increase the effective receptive field; and (3) devising a deep convolutional neural network with multiscale feature fusion and an Attentive BConvLSTM mechanism, which integrates the Attentive BConvLSTM block and the Redesigned Inception block into an encoder-decoder model called Attentive BConvLSTM U-Net with Redesigned Inception (IBA-U-Net). IBA-U-Net has been compared with U-Net and state-of-the-art segmentation methods on three publicly available datasets, covering lung, skin lesion, and retinal blood vessel segmentation, each with its unique challenges, and it improves prediction performance with slightly lower computational expense and fewer network parameters. By devising a deep convolutional neural network with multiscale feature fusion and an Attentive BConvLSTM mechanism, medical image segmentation tasks can be completed effectively and accurately with only 45% of the U-Net parameters.
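To illustrate idea (2), the sketch below factorizes a large receptive field into parallel branches of stacked 3x3 convolutions, inception-style; the branch layout and channel split are assumptions, not the paper's Redesigned Inception block:

```python
import torch
import torch.nn as nn

class FactorizedInception(nn.Module):
    """Parallel branches with different effective receptive fields.
    Two stacked 3x3 convs see a 5x5 region at lower cost than one 5x5."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        b = out_ch // 4
        self.branch1 = nn.Conv2d(in_ch, b, 1)                      # 1x1
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, b, 1), nn.ReLU(),
            nn.Conv2d(b, b, 3, padding=1))                         # 3x3
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, b, 1), nn.ReLU(),
            nn.Conv2d(b, b, 3, padding=1), nn.ReLU(),
            nn.Conv2d(b, b, 3, padding=1))                         # ~5x5
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(in_ch, out_ch - 3 * b, 1))

    def forward(self, x):
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.branch_pool(x)], dim=1)

x = torch.randn(1, 32, 64, 64)
print(FactorizedInception(32, 64)(x).shape)  # torch.Size([1, 64, 64, 64])
```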
Affiliation(s)
- Siyuan Chen
- The School of Information Engineering, Nanchang University, Jiangxi, Nanchang, 330031, China
- Yanni Zou
- The School of Information Engineering, Nanchang University, Jiangxi, Nanchang, 330031, China.
- Peter X Liu
- Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, K1S 5B6, Canada
204
Lei H, Liu W, Xie H, Zhao B, Yue G, Lei B. Unsupervised Domain Adaptation Based Image Synthesis and Feature Alignment for Joint Optic Disc and Cup Segmentation. IEEE J Biomed Health Inform 2021; 26:90-102. [PMID: 34061755] [DOI: 10.1109/jbhi.2021.3085770]
Abstract
Due to the discrepancy between different devices for fundus image collection, a well-trained neural network is usually unsuitable for a new dataset. To solve this problem, the unsupervised domain adaptation strategy has attracted a lot of attention. In this paper, we propose an unsupervised domain adaptation method based on image synthesis and feature alignment (ISFA) to segment the optic disc and cup in fundus images. A GAN-based image synthesis (IS) mechanism, along with the boundary information of the optic disc and cup, is utilized to generate target-like query images, which serve as an intermediate latent space between the source domain and target domain images to alleviate the domain shift problem. Specifically, we use content and style feature alignment (CSFA) to ensure feature consistency among source domain images, target-like query images and target domain images. Adversarial learning is used to extract domain-invariant features for output-level feature alignment (OLFA). To enhance the representation ability of domain-invariant boundary structure information, we introduce an edge attention module (EAM) for low-level feature maps. Eventually, we train our proposed method on the training set of the REFUGE challenge dataset and test it on the Drishti-GS and RIM-ONE_r3 datasets. On the Drishti-GS dataset, our method achieves about 3% improvement in Dice on optic cup segmentation over the next best method. We comprehensively discuss the robustness of our method for small-dataset domain adaptation. The experimental results also demonstrate the effectiveness of our method. Our code is available at https://github.com/thinkobj/ISFA.
205
Krishna Adithya V, Williams BM, Czanner S, Kavitha S, Friedman DS, Willoughby CE, Venkatesh R, Czanner G. EffUnet-SpaGen: An Efficient and Spatial Generative Approach to Glaucoma Detection. J Imaging 2021; 7:92. [PMID: 39080880] [PMCID: PMC8321378] [DOI: 10.3390/jimaging7060092]
Abstract
Current research in automated disease detection focuses on making algorithms "slimmer", reducing the need for large training datasets and accelerating recalibration for new data while achieving high accuracy. The development of slimmer models has become a hot research topic in medical imaging. In this work, we develop a two-phase model for glaucoma detection, identifying and exploiting a redundancy in fundus image data relating particularly to the geometry. We propose a novel algorithm for cup and disc segmentation, "EffUnet", with an efficient convolution block, and combine this with an extended spatial generative approach for geometry modelling and classification, termed "SpaGen". We demonstrate the high accuracy achievable by EffUnet in detecting the optic disc and cup boundaries, and show how our algorithm can be quickly trained with new data by recalibrating the EffUnet layer only. Our resulting glaucoma detection algorithm, "EffUnet-SpaGen", is optimized to significantly reduce the computational burden while surpassing the current state-of-the-art in glaucoma detection algorithms, with AUROC of 0.997 and 0.969 on the benchmark online datasets ORIGA and DRISHTI, respectively. Our algorithm also allows deformed areas of the optic rim to be displayed and investigated, providing explainability, which is crucial to successful adoption and implementation in clinical settings.
Affiliation(s)
- Venkatesh Krishna Adithya
- Department of Glaucoma, Aravind Eye Care System, Thavalakuppam, Pondicherry 605007, India
- Bryan M. Williams
- School of Computing and Communications, Lancaster University, Bailrigg, Lancaster LA1 4WA, UK
- Silvester Czanner
- School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool L3 3AF, UK
- Srinivasan Kavitha
- Department of Glaucoma, Aravind Eye Care System, Thavalakuppam, Pondicherry 605007, India
- David S. Friedman
- Glaucoma Center of Excellence, Harvard Medical School, Boston, MA 02114, USA
- Colin E. Willoughby
- Biomedical Research Institute, Ulster University, Coleraine, Co. Londonderry BT52 1SA, UK
- Rengaraj Venkatesh
- Department of Glaucoma, Aravind Eye Care System, Thavalakuppam, Pondicherry 605007, India
- Gabriela Czanner
- School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool L3 3AF, UK
206
Ge R, Cai H, Yuan X, Qin F, Huang Y, Wang P, Lyu L. MD-UNET: Multi-input dilated U-shape neural network for segmentation of bladder cancer. Comput Biol Chem 2021; 93:107510. [PMID: 34044203] [DOI: 10.1016/j.compbiolchem.2021.107510]
Abstract
Accurate segmentation of the tumour area is crucial for the treatment and prognosis of patients with bladder cancer. However, the complex information in MRI images poses an important challenge for accurately segmenting the lesion, for example, high inter-patient variability, variation in bladder size and noise interference. To address these issues, we propose the MD-UNET network structure, which uses multi-scale images as the input of the network and combines max-pooling with dilated convolution to increase the receptive field of the convolutional network. The results show that the proposed network obtains higher precision than existing models on the bladder cancer dataset, achieving state-of-the-art performance compared with other methods.
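A minimal sketch of the combination described here, max-pooling followed by a dilated convolution to enlarge the receptive field; the layer sizes and dilation rate are assumptions rather than the MD-UNET configuration:

```python
import torch
import torch.nn as nn

class DilatedDown(nn.Module):
    """One encoder stage: max-pooling halves the resolution, then a dilated
    3x3 convolution widens the receptive field without further downsampling.
    In a multi-input U-shape network, a downsampled copy of the image would
    also be injected at each scale (omitted here for brevity)."""
    def __init__(self, in_ch: int, out_ch: int, dilation: int = 2):
        super().__init__()
        self.block = nn.Sequential(
            nn.MaxPool2d(2),
            nn.Conv2d(in_ch, out_ch, kernel_size=3,
                      padding=dilation, dilation=dilation),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)

stage = DilatedDown(in_ch=16, out_ch=32, dilation=2)
print(stage(torch.randn(1, 16, 128, 128)).shape)  # torch.Size([1, 32, 64, 64])
```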
Affiliation(s)
- Ruiquan Ge
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, 310018, China
- Huihuang Cai
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, 310018, China
- Xin Yuan
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, 310018, China
- Feiwei Qin
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, 310018, China
- Yan Huang
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, 310018, China
- Pu Wang
- Computer School, Hubei University of Arts and Science, Xiangyang, 441053, China.
- Lei Lyu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250014, China.
207
Tao S, Jiang Y, Cao S, Wu C, Ma Z. Attention-Guided Network with Densely Connected Convolution for Skin Lesion Segmentation. Sensors (Basel) 2021; 21:3462. [PMID: 34065771] [PMCID: PMC8156456] [DOI: 10.3390/s21103462]
Abstract
The automatic segmentation of skin lesions is considered a key step in the diagnosis and treatment of skin lesions, which is essential to improve the survival rate of patients. However, due to low contrast, the texture and boundary are difficult to distinguish, which makes accurate segmentation of skin lesions challenging. To cope with these challenges, this paper proposes an attention-guided network with densely connected convolution for skin lesion segmentation, called CSAG and DCCNet. In the last step of the encoding path, the model uses densely connected convolution to replace the ordinary convolutional layer. A novel attention-oriented filter module called the Channel Spatial Fast Attention-guided Filter (CSFAG) is designed and embedded in the skip connections of CSAG and DCCNet. On the ISIC-2017 dataset, a large number of ablation experiments verified the superiority and robustness of the CSFAG module and the densely connected convolution. The segmentation performance of CSAG and DCCNet is compared with other recent algorithms, achieving very competitive results on all metrics. The robustness and cross-dataset performance of our method were tested on another publicly available dataset, PH2, further verifying the effectiveness of the model.
208
Skin Lesion Segmentation by U-Net with Adaptive Skip Connection and Structural Awareness. Appl Sci (Basel) 2021. [DOI: 10.3390/app11104528]
Abstract
Skin lesion segmentation is one of the pivotal stages in the diagnosis of melanoma. Many methods have been proposed but, to date, this is still a challenging task. Variations in size and color, fuzzy boundaries and the low contrast between lesion and normal skin are the adverse factors for deficient or excessive delineation of lesions, or even inaccurate lesion location detection. In this paper, to counter these problems, we introduce a deep learning method based on the U-Net architecture, which performs three tasks: lesion segmentation, boundary distance map regression and contour detection. The two auxiliary tasks provide an awareness of boundary and shape to the main encoder, which improves object localization and pixel-wise classification in the transition region from lesion tissue to healthy tissue. Moreover, to handle the large variation in size, Selective Kernel modules placed in the skip connections transfer multi-receptive-field features from the encoder to the decoder. Our methods are evaluated on three publicly available datasets: ISBI 2016, ISBI 2017 and PH2. The extensive experimental results show the effectiveness of the proposed method in the task of skin lesion segmentation.
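The Selective Kernel idea referenced above can be sketched as a two-branch block whose per-channel softmax weights select between receptive fields; this is a simplified version under assumed sizes, not the paper's exact module:

```python
import torch
import torch.nn as nn

class SelectiveKernel(nn.Module):
    """Two branches with different receptive fields (3x3 and dilated 3x3,
    roughly 5x5); channel-wise softmax weights choose between them."""
    def __init__(self, ch: int, reduction: int = 4):
        super().__init__()
        self.conv3 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv5 = nn.Conv2d(ch, ch, 3, padding=2, dilation=2)
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(ch, ch // reduction), nn.ReLU(),
            nn.Linear(ch // reduction, ch * 2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        u3, u5 = self.conv3(x), self.conv5(x)
        s = self.fc(u3 + u5)                      # fuse, then squeeze globally
        b, c = x.shape[0], x.shape[1]
        w = s.view(b, 2, c).softmax(dim=1)        # select across the two branches
        return u3 * w[:, 0].view(b, c, 1, 1) + u5 * w[:, 1].view(b, c, 1, 1)

sk = SelectiveKernel(32)
print(sk(torch.randn(1, 32, 64, 64)).shape)  # torch.Size([1, 32, 64, 64])
```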
209
Joint optic disc and optic cup segmentation based on boundary prior and adversarial learning. Int J Comput Assist Radiol Surg 2021; 16:905-914. [PMID: 33963969] [DOI: 10.1007/s11548-021-02373-6]
Abstract
PURPOSE The most direct means of glaucoma screening is to use the cup-to-disc ratio via colour fundus photography, the first step of which is the precise segmentation of the optic cup (OC) and optic disc (OD). In recent years, convolutional neural networks (CNNs) have shown outstanding performance in medical segmentation tasks. However, most CNN-based methods ignore the effect of boundary ambiguity on performance, which leads to low generalization. This paper is dedicated to solving this issue. METHODS In this paper, we propose a novel segmentation architecture, called BGA-Net, which introduces an auxiliary boundary branch and adversarial learning to jointly segment the OD and OC in a multi-label manner. To generate more accurate results, a generative adversarial network is exploited to encourage boundary and mask predictions to be similar to the ground truth. RESULTS Experimental results show that our BGA-Net achieves state-of-the-art OC and OD segmentation performance on three publicly available datasets, i.e., Dice scores for the optic disc/cup on the Drishti-GS, RIM-ONE-r3 and REFUGE datasets of 0.975/0.898, 0.967/0.872 and 0.951/0.866, respectively. CONCLUSION In this work, we not only achieve superior OD and OC segmentation results, but also confirm that the values calculated through the geometric relationship between the two are highly related to glaucoma.
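Because the cup-to-disc ratio recurs throughout this list, a short reference sketch of computing the vertical CDR from binary OD/OC masks may help; the vertical-diameter definition is standard, but the code is not from the cited paper:

```python
import numpy as np

def vertical_cdr(disc_mask: np.ndarray, cup_mask: np.ndarray) -> float:
    """Vertical cup-to-disc ratio from binary masks (1 = structure pixel).

    The vertical diameter of a region is the number of rows its pixels span.
    """
    def vertical_diameter(mask: np.ndarray) -> int:
        rows = np.where(mask.any(axis=1))[0]
        return 0 if rows.size == 0 else int(rows[-1] - rows[0] + 1)

    disc_d = vertical_diameter(disc_mask)
    if disc_d == 0:
        raise ValueError("empty optic disc mask")
    return vertical_diameter(cup_mask) / disc_d

disc = np.zeros((100, 100), dtype=np.uint8)
cup = np.zeros_like(disc)
disc[20:80, 30:70] = 1   # synthetic disc: 60 rows tall
cup[35:65, 40:60] = 1    # synthetic cup: 30 rows tall
print(vertical_cdr(disc, cup))  # 0.5
```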
210
Pathological Myopia Image Recognition Strategy Based on Data Augmentation and Model Fusion. J Healthc Eng 2021; 2021:5549779. [PMID: 34035883] [PMCID: PMC8118733] [DOI: 10.1155/2021/5549779]
Abstract
The automatic diagnosis of various retinal diseases based on fundus images is important in supporting clinical decision-making. Convolutional neural networks (CNNs) have achieved remarkable results in such tasks. However, their high expressive ability possibly leads to overfitting. Therefore, data augmentation (DA) techniques have been proposed to prevent overfitting while enriching datasets. Recent CNN architectures with more parameters render traditional DA techniques insufficient. In this study, we propose a new DA strategy based on multimodal fusion (DAMF), which integrates the standard DA method, data disrupting method, data mixing method, and auto-adjustment method to enhance the image data in the training dataset and create new training images. In addition, we fuse the results of the classifiers by voting on the basis of DAMF, which further improves the generalization ability of the model. The experimental results show that the optimal DA mode can be matched to the image dataset through our DA strategy. We evaluated DAMF on the iChallenge-PM dataset. Finally, we compared training results between 12 DAMF-processed datasets and the original training dataset. Compared with the original dataset, the optimal DAMF achieved an accuracy increase of 2.85% on iChallenge-PM.
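The classifier-fusion step can be sketched as plain majority voting over model predictions; the entry does not state the exact voting scheme, so this is an assumed variant:

```python
import numpy as np

def majority_vote(predictions: np.ndarray) -> np.ndarray:
    """Fuse class predictions from several models by majority vote.

    predictions: array of shape (n_models, n_samples) with integer labels.
    Returns an array of shape (n_samples,) with the most frequent label
    per sample (ties resolved toward the smaller label by argmax).
    """
    n_classes = predictions.max() + 1
    votes = np.apply_along_axis(
        lambda col: np.bincount(col, minlength=n_classes), 0, predictions)
    return votes.argmax(axis=0)

# Three models classifying five images (0 = normal, 1 = pathological myopia)
preds = np.array([[0, 1, 1, 0, 1],
                  [0, 1, 0, 0, 1],
                  [1, 1, 1, 0, 0]])
print(majority_vote(preds))  # [0 1 1 0 1]
```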
211
Shu Y, Zhang J, Xiao B, Li W. Medical image segmentation based on active fusion-transduction of multi-stream features. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.106950]
212
Yuan F, Zhang L, Xia X, Huang Q, Li X. A Gated Recurrent Network With Dual Classification Assistance for Smoke Semantic Segmentation. IEEE Trans Image Process 2021; 30:4409-4422. [PMID: 33798085] [DOI: 10.1109/tip.2021.3069318]
Abstract
Smoke is semi-transparent, leading to a highly complicated mixture of background and smoke. Sparse or small smoke is visually inconspicuous, and its boundary is often ambiguous. For these reasons, separating smoke from a single image is a very challenging task. To solve these problems, we propose a Classification-assisted Gated Recurrent Network (CGRNet) for smoke semantic segmentation. To discriminate smoke and smoke-like objects, we present a smoke segmentation strategy with dual classification assistance. Our classification module outputs two prediction probabilities for smoke. The first assistance uses one probability to explicitly regulate the segmentation module for accuracy improvement by supervising a cross-entropy classification loss. The second multiplies the segmentation result by the other probability for further refinement. This dual classification assistance greatly improves performance at the image level. In the segmentation module, we design an Attention Convolutional GRU module (Att-ConvGRU) to learn the long-range context dependence of features. To perceive small or inconspicuous smoke, we design a Multi-scale Context Contrasted Local Feature structure (MCCL) and a Dense Pyramid Pooling Module (DPPM) to improve the representation ability of our network. Extensive experiments validate that our method significantly outperforms existing state-of-the-art algorithms on smoke datasets, and it also obtains satisfactory results on challenging images with inconspicuous smoke and smoke-like objects.
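The second assistance, multiplying the segmentation result by an image-level probability, is straightforward to illustrate; the sketch assumes a single-channel (binary) smoke map and is not the authors' implementation:

```python
import torch

def refine_with_classifier(seg_logits: torch.Tensor,
                           cls_prob: torch.Tensor) -> torch.Tensor:
    """Scale the per-pixel smoke map by the image-level probability
    that the image contains smoke at all.

    seg_logits: (B, 1, H, W) raw segmentation logits.
    cls_prob:   (B,) image-level smoke probability from the classifier.
    """
    seg_prob = seg_logits.sigmoid()               # per-pixel smoke probability
    return seg_prob * cls_prob.view(-1, 1, 1, 1)  # suppress smoke-free images

seg = torch.randn(2, 1, 32, 32)
cls = torch.tensor([0.95, 0.05])   # image 2 is probably smoke-free
refined = refine_with_classifier(seg, cls)
print(refined[1].max() <= 0.05)    # tensor(True)
```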
213
Tian F, Gao Y, Fang Z, Gu J. Automatic coronary artery segmentation algorithm based on deep learning and digital image processing. Appl Intell 2021. [DOI: 10.1007/s10489-021-02197-6]
214
Hu J, Wang H, Wang J, Wang Y, He F, Zhang J. SA-Net: A scale-attention network for medical image segmentation. PLoS One 2021; 16:e0247388. [PMID: 33852577] [PMCID: PMC8046243] [DOI: 10.1371/journal.pone.0247388]
Abstract
Semantic segmentation of medical images provides an important cornerstone for subsequent tasks of image analysis and understanding. With rapid advancements in deep learning methods, conventional U-Net segmentation networks have been applied in many fields. Based on exploratory experiments, features at multiple scales have been found to be of great importance for the segmentation of medical images. In this paper, we propose a scale-attention deep learning network (SA-Net), which extracts features of different scales in a residual module and uses an attention module to enforce the scale-attention capability. SA-Net can better learn multi-scale features and achieve more accurate segmentation for different medical images. In addition, this work validates the proposed method across multiple datasets. The experimental results show that SA-Net achieves excellent performance in the applications of vessel detection in retinal images, lung segmentation, artery/vein (A/V) classification in retinal images and blastocyst segmentation. To facilitate SA-Net utilization by the scientific community, the code implementation will be made publicly available.
Affiliation(s)
- Jingfei Hu
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Hefei Innovation Research Institute, Beihang University, Hefei, China
- Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing, China
- School of Biomedical Engineering, Anhui Medical University, Hefei, China
- Hua Wang
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Hefei Innovation Research Institute, Beihang University, Hefei, China
- Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing, China
- School of Biomedical Engineering, Anhui Medical University, Hefei, China
- Jie Wang
- School of Computer Science and Engineering, Beihang University, Beijing, China
- Yunqi Wang
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Hefei Innovation Research Institute, Beihang University, Hefei, China
- Fang He
- Hefei Innovation Research Institute, Beihang University, Hefei, China
- Jicong Zhang
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Hefei Innovation Research Institute, Beihang University, Hefei, China
- Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing, China
- School of Biomedical Engineering, Anhui Medical University, Hefei, China
- Beijing Advanced Innovation Centre for Big Data-Based Precision Medicine, Beihang University, Beijing, China
215
Gridach M. PyDiNet: Pyramid Dilated Network for medical image segmentation. Neural Netw 2021; 140:274-281. [PMID: 33839599] [DOI: 10.1016/j.neunet.2021.03.023]
Abstract
Medical image segmentation is an important step in many generic applications such as population analysis and, more practically, can be made into a crucial tool in diagnosis and treatment planning. Previous approaches are based on two main architectures: fully convolutional networks and U-Net-based architectures. These methods rely on multiple pooling and striding layers, leading to the loss of important spatial information, and fail to capture details in medical images. In this paper, we propose a novel neural network called PyDiNet (Pyramid Dilated Network) to capture small and complex variations in medical images while preserving spatial information. To achieve this goal, PyDiNet uses a newly proposed pyramid dilated module (PDM), which consists of multiple dilated convolutions stacked in parallel. We combine several PDM modules to form the final PyDiNet architecture. We applied the proposed PyDiNet to different medical image segmentation tasks. Experimental results show that the proposed model achieves new state-of-the-art performance on three medical image segmentation benchmarks. Furthermore, PyDiNet was very competitive in the 2020 Endoscopic Artifact Detection challenge.
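A minimal sketch of a pyramid dilated module as described, multiple dilated convolutions applied in parallel and concatenated so that no pooling or striding is needed; the dilation rates and channel counts are assumptions:

```python
import torch
import torch.nn as nn

class PyramidDilatedModule(nn.Module):
    """Parallel 3x3 convolutions with increasing dilation rates; their
    outputs are concatenated so fine detail and wide context coexist
    without any pooling or striding."""
    def __init__(self, in_ch: int, branch_ch: int, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, branch_ch, 3, padding=r, dilation=r)
            for r in rates)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cat([b(x) for b in self.branches], dim=1)

pdm = PyramidDilatedModule(in_ch=32, branch_ch=16)
print(pdm(torch.randn(1, 32, 64, 64)).shape)  # torch.Size([1, 64, 64, 64])
```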
Affiliation(s)
- Mourad Gridach
- Department of Computer Science, High Institute of Technology, Agadir, Morocco.
216
Wang L, Gu J, Chen Y, Liang Y, Zhang W, Pu J, Chen H. Automated segmentation of the optic disc from fundus images using an asymmetric deep learning network. Pattern Recognit 2021; 112:107810. [PMID: 34354302] [PMCID: PMC8336919] [DOI: 10.1016/j.patcog.2020.107810]
Abstract
Accurate segmentation of the optic disc (OD) regions from color fundus images is a critical procedure for computer-aided diagnosis of glaucoma. We present a novel deep learning network to automatically identify the OD regions. On the basis of the classical U-Net framework, we define a unique sub-network and a decoding convolutional block. The sub-network is used to preserve important textures and facilitate their detection, while the decoding block is used to improve the contrast of the regions of interest with their background. We integrate these two components into the classical U-Net framework to improve the accuracy and reliability of segmenting the OD regions depicted in color fundus images. We train and evaluate the developed network using three publicly available datasets (i.e., MESSIDOR, ORIGA, and REFUGE). The results on an independent testing set (n=1,970 images) show a segmentation performance with an average Dice similarity coefficient (DSC), intersection over union (IOU), and Matthews correlation coefficient (MCC) of 0.9377, 0.8854, and 0.9383 when trained on the global field-of-view images, respectively, and 0.9735, 0.9494, and 0.9594 when trained on the local disc region images. When compared with three other classical networks (i.e., U-Net, M-Net, and Deeplabv3) on the same testing datasets, the developed network demonstrates a relatively higher performance.
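For reference, the three reported metrics can be computed from binary masks with their standard definitions (this is generic evaluation code, not the paper's):

```python
import numpy as np

def seg_metrics(pred: np.ndarray, truth: np.ndarray):
    """Dice, IoU and MCC for binary masks (arrays of 0/1)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    tn = np.sum(~pred & ~truth)
    dice = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return dice, iou, mcc

truth = np.zeros((64, 64), dtype=np.uint8); truth[16:48, 16:48] = 1
pred = np.zeros_like(truth); pred[20:48, 16:48] = 1
print(["%.4f" % m for m in seg_metrics(pred, truth)])  # dice, iou, mcc
```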
Affiliation(s)
- Lei Wang
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China
- Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, 211189, China
- Juan Gu
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China
- Yize Chen
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China
- Yuanbo Liang
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China
- Weijie Zhang
- Departments of Radiology and Bioengineering, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Jiantao Pu
- Departments of Radiology and Bioengineering, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Hao Chen
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China
217
Automated segmentation of optic disc and optic cup for glaucoma assessment using improved UNET++ architecture. Biocybern Biomed Eng 2021. [DOI: 10.1016/j.bbe.2021.05.011]
218
Xu Y, Hu M, Liu H, Yang H, Wang H, Lu S, Liang T, Li X, Xu M, Li L, Li H, Ji X, Wang Z, Li L, Weinreb RN, Wang N. A hierarchical deep learning approach with transparency and interpretability based on small samples for glaucoma diagnosis. NPJ Digit Med 2021; 4:48. [PMID: 33707616] [PMCID: PMC7952384] [DOI: 10.1038/s41746-021-00417-4]
Abstract
The application of deep learning algorithms for medical diagnosis in the real world faces challenges with transparency and interpretability. The labeling of large-scale samples leads to costly investment in developing deep learning algorithms. The application of human prior knowledge is an effective way to solve these problems. Previously, we developed a deep learning system for glaucoma diagnosis based on a large number of samples that had high sensitivity and specificity. However, it is a black box and the specific analytic methods cannot be elucidated. Here, we establish a hierarchical deep learning system based on a small number of samples that comprehensively simulates the diagnostic thinking of human experts. This system can extract the anatomical characteristics of the fundus images, including the optic disc, optic cup, and appearance of the retinal nerve fiber layer to realize automatic diagnosis of glaucoma. In addition, this system is transparent and interpretable, and the intermediate process of prediction can be visualized. Applying this system to three validation datasets of fundus images, we demonstrate performance comparable to that of human experts in diagnosing glaucoma. Moreover, it markedly improves the diagnostic accuracy of ophthalmologists. This system may expedite the screening and diagnosis of glaucoma, resulting in improved clinical outcomes.
Affiliation(s)
- Yongli Xu
- Department of Mathematics, Beijing University of Chemical Technology, Beijing, China
- Man Hu
- National Key Discipline of Pediatrics, Ministry of Education, Department of Ophthalmology, Beijing Children's Hospital, Capital Medical University, Beijing, China
- Hanruo Liu
- Beijing Institute of Ophthalmology, Beijing Tongren Hospital, Capital Medical University, Beijing Ophthalmology & Visual Science Key Lab, Beijing, China
- School of Information and Electronics, Beijing Institute of Technology, Beijing, China
- Hao Yang
- Department of Mathematics, Beijing University of Chemical Technology, Beijing, China
- Huaizhou Wang
- Beijing Institute of Ophthalmology, Beijing Tongren Hospital, Capital Medical University, Beijing Ophthalmology & Visual Science Key Lab, Beijing, China
- Shuai Lu
- Department of Mathematics, Beijing University of Chemical Technology, Beijing, China
- School of Information and Electronics, Beijing Institute of Technology, Beijing, China
- Tianwei Liang
- National Key Discipline of Pediatrics, Ministry of Education, Department of Ophthalmology, Beijing Children's Hospital, Capital Medical University, Beijing, China
- Xiaoxing Li
- Department of Mathematics, Beijing University of Chemical Technology, Beijing, China
- Mai Xu
- School of Electronic and Information Engineering, Beihang University, Beijing, China
- Liu Li
- School of Electronic and Information Engineering, Beihang University, Beijing, China
- Huiqi Li
- School of Information and Electronics, Beijing Institute of Technology, Beijing, China
- Xin Ji
- Beijing Shanggong Medical Technology Co., Ltd, Beijing, China
- Zhijun Wang
- Beijing Shanggong Medical Technology Co., Ltd, Beijing, China
- Li Li
- National Key Discipline of Pediatrics, Ministry of Education, Department of Ophthalmology, Beijing Children's Hospital, Capital Medical University, Beijing, China
- Robert N Weinreb
- Shiley Eye Institute, University of California San Diego, La Jolla, CA, USA
- Ningli Wang
- Beijing Institute of Ophthalmology, Beijing Tongren Hospital, Capital Medical University, Beijing Ophthalmology & Visual Science Key Lab, Beijing, China
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University & Capital Medical University, Beijing Tongren Hospital, Beijing, China
219
Xie H, Zeng X, Lei H, Du J, Wang J, Zhang G, Cao J, Wang T, Lei B. Cross-attention multi-branch network for fundus diseases classification using SLO images. Med Image Anal 2021; 71:102031. [PMID: 33798993] [DOI: 10.1016/j.media.2021.102031]
Abstract
Fundus disease classification is vital for human health. However, most existing methods detect diseases by means of single-angle fundus images, which leads to a lack of pathological information. To address this limitation, this paper proposes a novel deep learning method to complete different fundus disease classification tasks using ultra-wide-field scanning laser ophthalmoscopy (SLO) images, which have an ultra-wide field of view of 180-200°. The proposed deep model consists of a multi-branch network, an atrous spatial pyramid pooling (ASPP) module, a cross-attention module and a depth-wise attention module. Specifically, the multi-branch network employs the ResNet-34 model as the backbone to extract feature information, where the two-branch ResNet-34 model is followed by the ASPP module to extract multi-scale spatial contextual features by setting different dilation rates. The depth-wise attention module provides a global attention map from the multi-branch network, which enables the network to focus on the salient targets of interest. The cross-attention module adopts a cross-fusion mode to fuse the channel and spatial attention maps from the two-branch ResNet-34 model, which can enhance the representation ability of the disease-specific features. Extensive experiments on our collected SLO images and two publicly available datasets demonstrate that the proposed method outperforms state-of-the-art methods and achieves quite promising classification performance on fundus diseases.
Affiliation(s)
- Hai Xie
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
- Xianlu Zeng
- Shenzhen Eye Hospital, Shenzhen Key Ophthalmic Laboratory, Health Science Center, Shenzhen University, The Second Affiliated Hospital of Jinan University, Shenzhen, China
- Haijun Lei
- Guangdong Province Key Laboratory of Popular High-performance Computers, School of Computer and Software Engineering, Shenzhen University, Shenzhen, China
- Jie Du
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
- Jiantao Wang
- Shenzhen Eye Hospital, Shenzhen Key Ophthalmic Laboratory, Health Science Center, Shenzhen University, The Second Affiliated Hospital of Jinan University, Shenzhen, China
- Guoming Zhang
- Shenzhen Eye Hospital, Shenzhen Key Ophthalmic Laboratory, Health Science Center, Shenzhen University, The Second Affiliated Hospital of Jinan University, Shenzhen, China.
- Jiuwen Cao
- Key Lab for IOT and Information Fusion Technology of Zhejiang, Artificial Intelligence Institute, Hangzhou Dianzi University, Hangzhou, China
- Tianfu Wang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
- Baiying Lei
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China.
220
Wu H, Wang W, Zhong J, Lei B, Wen Z, Qin J. SCS-Net: A Scale and Context Sensitive Network for Retinal Vessel Segmentation. Med Image Anal 2021; 70:102025. [PMID: 33721692] [DOI: 10.1016/j.media.2021.102025]
Abstract
Accurately segmenting retinal vessels from retinal images is essential for the detection and diagnosis of many eye diseases. However, it remains a challenging task due to (1) the large variations of scale in retinal vessels and (2) the complicated anatomical context of retinal vessels, including complex vasculature and morphology, the low contrast between some vessels and the background, and the existence of exudates and hemorrhage. It is difficult for a model to capture representative and distinguishing features for retinal vessels under such large scale and semantic variations. Limited training data make this task even harder. To comprehensively tackle these challenges, we propose a novel scale and context sensitive network (SCS-Net) for retinal vessel segmentation. We first propose a scale-aware feature aggregation (SFA) module, aiming at dynamically adjusting the receptive fields to effectively extract multi-scale features. Then, an adaptive feature fusion (AFF) module is designed to guide efficient fusion between adjacent hierarchical features to capture more semantic information. Finally, a multi-level semantic supervision (MSS) module is employed to learn more distinctive semantic representations for refining the vessel maps. We conduct extensive experiments on six mainstream retinal image databases (DRIVE, CHASEDB1, STARE, IOSTAR, HRF, and LES-AV). The experimental results demonstrate the effectiveness of the proposed SCS-Net, which achieves better segmentation performance than other state-of-the-art approaches, especially for challenging cases with large scale variations and complex context environments.
Affiliation(s)
- Huisi Wu
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China, 518060
- Wei Wang
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China, 518060
- Jiafu Zhong
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China, 518060
- Baiying Lei
- School of Biomedical Engineering, Health Science Centers, Shenzhen University, National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, Marshall Laboratory of Biomedical Engineering, AI Research Center for Medical Image Analysis and Diagnosis, Shenzhen, China, 518060.
- Zhenkun Wen
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China, 518060
- Jing Qin
- Center for Smart Health, School of Nursing, The Hong Kong Polytechnic University, Hong Kong
221
Shabbir A, Rasheed A, Shehraz H, Saleem A, Zafar B, Sajid M, Ali N, Dar SH, Shehryar T. Detection of glaucoma using retinal fundus images: A comprehensive review. Math Biosci Eng 2021; 18:2033-2076. [PMID: 33892536] [DOI: 10.3934/mbe.2021106]
Abstract
Content-based image analysis and computer vision techniques are used in various health-care systems to detect diseases. Abnormalities in a human eye are detected through fundus images captured through a fundus camera. Among eye diseases, glaucoma is considered the second leading cause of blindness and results in neurodegenerative illness. Inappropriate intraocular pressure within the human eye is reported as the main cause of this disease. There are no symptoms of glaucoma at earlier stages, and if the disease remains unrectified it can lead to complete blindness. Early diagnosis of glaucoma can prevent permanent loss of vision. Manual examination of the human eye is a possible solution, but it is dependent on human effort. Automatic detection of glaucoma using a combination of image processing, artificial intelligence and computer vision can help to prevent and detect this disease. In this review article, we present a comprehensive review of the various types of glaucoma, the causes of glaucoma, possible treatments, publicly available image benchmarks, performance metrics, and various approaches based on digital image processing, computer vision, and deep learning. The review presents a detailed study of published research models that aim to detect glaucoma, from low-level feature extraction to recent trends based on deep learning. The pros and cons of each approach are discussed in detail, and tabular representations are used to summarize the results of each category. We report our findings and provide possible future research directions in the conclusion.
Affiliation(s)
- Amsa Shabbir
- Department of Software Engineering, Mirpur University of Science and Technology (MUST), Mirpur- AJK 10250, Pakistan
- Aqsa Rasheed
- Department of Software Engineering, Mirpur University of Science and Technology (MUST), Mirpur-AJK 10250, Pakistan
- Huma Shehraz
- Department of Software Engineering, Mirpur University of Science and Technology (MUST), Mirpur-AJK 10250, Pakistan
- Aliya Saleem
- Department of Software Engineering, Mirpur University of Science and Technology (MUST), Mirpur-AJK 10250, Pakistan
- Bushra Zafar
- Department of Computer Science, Government College University, Faisalabad 38000, Pakistan
- Muhammad Sajid
- Department of Electrical Engineering, Mirpur University of Science and Technology (MUST), Mirpur-AJK 10250, Pakistan
- Nouman Ali
- Department of Software Engineering, Mirpur University of Science and Technology (MUST), Mirpur-AJK 10250, Pakistan
- Saadat Hanif Dar
- Department of Software Engineering, Mirpur University of Science and Technology (MUST), Mirpur-AJK 10250, Pakistan
- Tehmina Shehryar
- Department of Software Engineering, Mirpur University of Science and Technology (MUST), Mirpur-AJK 10250, Pakistan
222
Kim KC, Cho HC, Jang TJ, Choi JM, Seo JK. Automatic detection and segmentation of lumbar vertebrae from X-ray images for compression fracture evaluation. Comput Methods Programs Biomed 2021; 200:105833. [PMID: 33250283] [DOI: 10.1016/j.cmpb.2020.105833]
Abstract
For compression fracture detection and evaluation, an automatic X-ray image segmentation technique that combines deep-learning and level-set methods is proposed. Automatic segmentation is much more difficult for X-ray images than for CT or MRI images because they contain overlapping shadows of thoracoabdominal structures including lungs, bowel gases, and other bony structures such as ribs. Additional difficulties include unclear object boundaries, the complex shape of the vertebra, inter-patient variability, and variations in image contrast. Accordingly, a structured hierarchical segmentation method is presented that combines the advantages of two deep-learning methods. Pose-driven learning is used to selectively identify the five lumbar vertebrae in an accurate and robust manner. With knowledge of the vertebral positions, M-net is employed to segment the individual vertebra. Finally, fine-tuning segmentation is applied by combining the level-set method with the previously obtained segmentation results. The performance of the proposed method was validated by 160 lumbar X-ray images, resulting in a mean Dice similarity metric of 91.60±2.22%. The results show that the proposed method achieves accurate and robust identification of each lumbar vertebra and fine segmentation of individual vertebra.
Affiliation(s)
- Kang Cheol Kim
- School of Mathematics and Computing (Computational Science and Engineering), Yonsei University, Seoul 03722, South Korea
- Hyun Cheol Cho
- School of Mathematics and Computing (Computational Science and Engineering), Yonsei University, Seoul 03722, South Korea
- Tae Jun Jang
- School of Mathematics and Computing (Computational Science and Engineering), Yonsei University, Seoul 03722, South Korea
- Jin Keun Seo
- School of Mathematics and Computing (Computational Science and Engineering), Yonsei University, Seoul 03722, South Korea
223
Shen Z, Fu H, Shen J, Shao L. Modeling and Enhancing Low-Quality Retinal Fundus Images. IEEE Trans Med Imaging 2021; 40:996-1006. [PMID: 33296301] [DOI: 10.1109/tmi.2020.3043495]
Abstract
Retinal fundus images are widely used for the clinical screening and diagnosis of eye diseases. However, fundus images captured by operators with various levels of experience have a large variation in quality. Low-quality fundus images increase uncertainty in clinical observation and lead to the risk of misdiagnosis. Moreover, due to the special optical beam of fundus imaging and the structure of the retina, natural image enhancement methods cannot be utilized directly to address this. In this article, we first analyze the ophthalmoscope imaging system and simulate a reliable degradation of the major inferior-quality factors, including uneven illumination, image blurring, and artifacts. Then, based on the degradation model, a clinically oriented fundus enhancement network (cofe-Net) is proposed to suppress global degradation factors while simultaneously preserving anatomical retinal structures and pathological characteristics for clinical observation and analysis. Experiments on both synthetic and real images demonstrate that our algorithm effectively corrects low-quality fundus images without losing retinal details. Moreover, we show that the fundus correction method can benefit medical image analysis applications, e.g., retinal vessel segmentation and optic disc/cup detection.
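To make the degradation-model idea concrete, here is a toy simulation of two of the named factors, uneven illumination and blurring; the radial illumination mask and all parameters are assumptions, not the paper's ophthalmoscope-derived model:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade_fundus(img: np.ndarray, blur_sigma: float = 2.0,
                   illum_strength: float = 0.6, seed: int = 0) -> np.ndarray:
    """Toy degradation: uneven illumination + Gaussian blur.

    img: float image in [0, 1], shape (H, W) or (H, W, 3).
    The off-center radial falloff and parameters are illustrative only.
    """
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    # Off-center radial falloff simulating uneven illumination.
    cy, cx = rng.uniform(0.3, 0.7, 2) * (h, w)
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.sqrt((yy - cy) ** 2 + (xx - cx) ** 2) / np.hypot(h, w)
    mask = 1.0 - illum_strength * r
    if img.ndim == 3:
        mask = mask[..., None]
    blurred = gaussian_filter(img, sigma=(blur_sigma, blur_sigma, 0)[:img.ndim])
    return np.clip(blurred * mask, 0.0, 1.0)

img = np.random.rand(128, 128, 3)
print(degrade_fundus(img).shape)  # (128, 128, 3)
```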
224
Multi-Scale and Multi-Branch Convolutional Neural Network for Retinal Image Segmentation. Symmetry (Basel) 2021. [DOI: 10.3390/sym13030365]
Abstract
The accurate segmentation of retinal images is a basic step in screening for retinopathy and glaucoma. Most existing retinal image segmentation methods extract insufficient feature information. They are susceptible to the impact of the lesion area and poor image quality, resulting in poor recovery of contextual information. This also causes the segmentation results of the model to be noisy and low in accuracy. Therefore, this paper proposes a multi-scale and multi-branch convolutional neural network model (MSMB-Net) for retinal image segmentation. The model uses atrous convolution with different expansion rates and skip connections to reduce the loss of feature information. Receptive fields of different sizes capture global context information. The model fully integrates shallow and deep semantic information and retains rich spatial information. The network embeds an improved attention mechanism to obtain more detailed information, which can improve the accuracy of segmentation. Finally, the method was validated on the fundus vascular datasets DRIVE, STARE and CHASE, with accuracies/F1-scores of 0.9708/0.8320, 0.9753/0.8469 and 0.9767/0.8190, respectively. The effectiveness of the method was further validated on the optic disc and optic cup DRISHTI-GS1 dataset, with an accuracy/F1-score of 0.9985/0.9770. Experimental results show that, compared with existing retinal image segmentation methods, our proposed method has good segmentation performance in all four benchmark tests.
225
Yuan X, Zhou L, Yu S, Li M, Wang X, Zheng X. A multi-scale convolutional neural network with context for joint segmentation of optic disc and cup. Artif Intell Med 2021; 113:102035. [PMID: 33685591] [DOI: 10.1016/j.artmed.2021.102035]
Abstract
Glaucoma is the leading cause of irreversible blindness. For glaucoma screening, the cup-to-disc ratio (CDR) is a significant indicator, whose calculation relies on the segmentation of the optic disc (OD) and optic cup (OC) in color fundus images. This study proposes a residual multi-scale convolutional neural network with a context semantic extraction module to jointly segment the OD and OC. The proposed method uses a W-shaped backbone network, including image pyramid multi-scale input, with side output layers as early classifiers to generate local prediction outputs. The proposed method includes a context extraction module that extracts contextual semantic information from multiple receptive field sizes and adaptively recalibrates channel-wise feature responses. It can effectively extract global information and reduce the semantic gaps in the fusion of deep and shallow semantic information. We validated the proposed method on four datasets, including DRISHTI-GS1, REFUGE, RIM-ONE r3, and a private dataset. The overlap errors are 0.0540, 0.0684, 0.0492, and 0.0511 in OC segmentation and 0.2332, 0.1777, 0.2372, and 0.2547 in OD segmentation, respectively. Experimental results indicate that the proposed method can estimate the CDR for large-scale glaucoma screening.
Affiliation(s)
- Xin Yuan
- College of Electrical Engineering, Sichuan University, Chengdu, Sichuan, China
- Lingxiao Zhou
- Department of Ophthalmology, First Affiliated Hospital of Xi'an Medical University, Xi'an, Shaanxi, China
- Shuyang Yu
- College of Electrical Engineering, Sichuan University, Chengdu, Sichuan, China
- Miao Li
- College of Electrical Engineering, Sichuan University, Chengdu, Sichuan, China
- Xiang Wang
- Department of Ophthalmology, Shandong First Medical University & Shandong Academy of Medical Sciences, Tai'an, Shandong, China
- Xiujuan Zheng
- College of Electrical Engineering, Sichuan University, Chengdu, Sichuan, China.
226
Zhang J, Xie Y, Wang Y, Xia Y. Inter-Slice Context Residual Learning for 3D Medical Image Segmentation. IEEE Trans Med Imaging 2021; 40:661-672. [PMID: 33125324] [DOI: 10.1109/tmi.2020.3034995]
Abstract
Automated and accurate 3D medical image segmentation plays an essential role in assisting medical professionals to evaluate disease progression and make fast therapeutic schedules. Although deep convolutional neural networks (DCNNs) have been widely applied to this task, the accuracy of these models still needs to be further improved, mainly due to their limited ability to perceive 3D context. In this paper, we propose the 3D context residual network (ConResNet) for the accurate segmentation of 3D medical images. This model consists of an encoder, a segmentation decoder, and a context residual decoder. We design the context residual module and use it to bridge both decoders at each scale. Each context residual module contains both context residual mapping and context attention mapping; the former aims to explicitly learn the inter-slice context information, and the latter uses such context as a kind of attention to boost segmentation accuracy. We evaluated this model on the MICCAI 2018 Brain Tumor Segmentation (BraTS) dataset and the NIH Pancreas Segmentation (Pancreas-CT) dataset. Our results not only demonstrate the effectiveness of the proposed 3D context residual learning scheme but also indicate that the proposed ConResNet is more accurate than six top-ranking methods in brain tumor segmentation and seven top-ranking methods in pancreas segmentation.
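The context residual mapping can be sketched as a difference between feature maps of adjacent slices; how the residual is supervised and fed back as attention is omitted here, so this is only the core idea under assumed tensor shapes:

```python
import torch

def inter_slice_residual(feats: torch.Tensor) -> torch.Tensor:
    """Context residual mapping, sketched: the difference between feature
    maps of adjacent slices highlights where the anatomy changes along
    the depth axis.

    feats: (B, C, D, H, W) 3D feature volume.
    Returns a (B, C, D-1, H, W) residual volume; in a full model this
    would be supervised and reused as attention (details assumed).
    """
    return (feats[:, :, 1:] - feats[:, :, :-1]).abs()

vol = torch.randn(1, 8, 16, 32, 32)
print(inter_slice_residual(vol).shape)  # torch.Size([1, 8, 15, 32, 32])
```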
Collapse
|
227
|
Hemelings R, Elen B, Blaschko MB, Jacob J, Stalmans I, De Boever P. Pathological myopia classification with simultaneous lesion segmentation using deep learning. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2021; 199:105920. [PMID: 33412285 DOI: 10.1016/j.cmpb.2020.105920] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/19/2020] [Accepted: 12/21/2020] [Indexed: 06/12/2023]
Abstract
BACKGROUND AND OBJECTIVES Pathological myopia (PM) is the seventh leading cause of blindness, with a reported global prevalence up to 3%. Early and automated PM detection from fundus images could help prevent blindness in a world population characterized by rising myopia prevalence. We aim to assess the use of convolutional neural networks (CNNs) for the detection of PM and semantic segmentation of myopia-induced lesions from fundus images on a recently introduced reference data set. METHODS This investigation reports on the results of CNNs developed for the recently introduced Pathological Myopia (PALM) dataset, which consists of 1200 images. Our CNN bundles lesion segmentation and PM classification, as the two tasks are heavily intertwined. Domain knowledge is also incorporated through a new Optic Nerve Head (ONH)-based prediction enhancement for the segmentation of atrophy and fovea localization. Finally, we are the first to approach fovea localization using segmentation instead of detection or regression models. Evaluation metrics include area under the receiver operating characteristic curve (AUC) for PM detection, Euclidean distance for fovea localization, and Dice and F1 metrics for the semantic segmentation tasks (optic disc, retinal atrophy and retinal detachment). RESULTS Models trained with 400 available training images achieved an AUC of 0.9867 for PM detection, and a Euclidean distance of 58.27 pixels on the fovea localization task, evaluated on a test set of 400 images. Dice and F1 metrics for semantic segmentation of lesions scored 0.9303 and 0.9869 on optic disc, 0.8001 and 0.9135 on retinal atrophy, and 0.8073 and 0.7059 on retinal detachment, respectively. CONCLUSIONS We report a successful approach for the simultaneous classification of pathological myopia and segmentation of associated lesions. Our work was acknowledged with an award in the context of the "Pathological Myopia detection from retinal images" challenge held during the IEEE International Symposium on Biomedical Imaging (April 2019). Considering that (pathological) myopia cases are often identified as false positives and negatives in glaucoma deep learning models, we envisage that the current work could aid future research to discriminate between glaucomatous and highly-myopic eyes, complemented by the localization and segmentation of landmarks such as the fovea, optic disc and atrophy.
Collapse
Affiliation(s)
- Ruben Hemelings
- Research Group Ophthalmology, KU Leuven, Herestraat 49, 3000 Leuven, Belgium; VITO NV, Boeretang 200, 2400 Mol, Belgium.
| | - Bart Elen
- VITO NV, Boeretang 200, 2400 Mol, Belgium
| | | | - Julie Jacob
- Ophthalmology Department, UZ Leuven, Herestraat 49, 3000 Leuven, Belgium
| | - Ingeborg Stalmans
- Research Group Ophthalmology, KU Leuven, Herestraat 49, 3000 Leuven, Belgium; Ophthalmology Department, UZ Leuven, Herestraat 49, 3000 Leuven, Belgium
| | - Patrick De Boever
- Hasselt University, Agoralaan building D, 3590 Diepenbeek, Belgium; VITO NV, Boeretang 200, 2400 Mol, Belgium
| |
Collapse
|
228
|
Liu B, Pan D, Song H. Joint optic disc and cup segmentation based on densely connected depthwise separable convolution deep network. BMC Med Imaging 2021; 21:14. [PMID: 33509106 PMCID: PMC7842021 DOI: 10.1186/s12880-020-00528-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2020] [Accepted: 11/19/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Glaucoma is an eye disease that causes vision loss and even blindness. The cup to disc ratio (CDR) is an important indicator for glaucoma screening and diagnosis. Accurate segmentation of the optic disc and cup helps obtain the CDR. Although many deep learning-based methods have been proposed to segment the disc and cup in fundus images, achieving highly accurate segmentation performance is still a great challenge due to the heavy overlap between the optic disc and cup. METHODS In this paper, we propose a two-stage method in which the optic disc is first located and then the optic disc and cup are segmented jointly within the region of interest. We also treat the joint optic disc and cup segmentation task as a multi-category semantic segmentation task, for which a deep learning-based model named DDSC-Net (densely connected depthwise separable convolution network) is proposed. Specifically, we employ depthwise separable convolutional layers and an image pyramid input to form a deeper and wider network to improve segmentation performance. Finally, we evaluate our method on two publicly available datasets, the Drishti-GS and REFUGE datasets. RESULTS The experiment results show that the proposed method outperforms state-of-the-art methods, such as pOSAL, GL-Net, M-Net and Stack-U-Net, in terms of Dice coefficients, with scores of 0.9780 (optic disc) and 0.9123 (optic cup) on the DRISHTI-GS dataset, and scores of 0.9601 (optic disc) and 0.8903 (optic cup) on the REFUGE dataset. Notably, in the more challenging optic cup segmentation task, our method outperforms GL-Net by 0.7% in Dice coefficient on the Drishti-GS dataset and outperforms pOSAL by 0.79% on the REFUGE dataset. CONCLUSIONS The promising segmentation performance reveals that our method has potential in assisting the screening and diagnosis of glaucoma.
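Depthwise separable convolution, the building block named here, factorizes a standard convolution into a per-channel spatial filter followed by a 1x1 pointwise mix, cutting parameters and FLOPs substantially. A standard PyTorch sketch:

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: a per-channel spatial convolution
    followed by a 1x1 pointwise convolution."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```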
Collapse
Affiliation(s)
- Bingyan Liu
- South China Normal University, Guangzhou, 510006, China
| | - Daru Pan
- South China Normal University, Guangzhou, 510006, China.
| | - Hui Song
- South China Normal University, Guangzhou, 510006, China
| |
Collapse
|
229
|
Li T, Bo W, Hu C, Kang H, Liu H, Wang K, Fu H. Applications of deep learning in fundus images: A review. Med Image Anal 2021; 69:101971. [PMID: 33524824 DOI: 10.1016/j.media.2021.101971] [Citation(s) in RCA: 99] [Impact Index Per Article: 24.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2020] [Accepted: 01/12/2021] [Indexed: 02/06/2023]
Abstract
The use of fundus images for the early screening of eye diseases is of great clinical importance. Due to its powerful performance, deep learning is becoming increasingly popular in related applications, such as lesion segmentation, biomarker segmentation, disease diagnosis and image synthesis. It is therefore necessary to summarize the recent developments in deep learning for fundus images in a review paper. In this review, we introduce 143 application papers with a carefully designed hierarchy. Moreover, 33 publicly available datasets are presented. Summaries and analyses are provided for each task. Finally, limitations common to all tasks are revealed and possible solutions are given. We will also release and regularly update the state-of-the-art results and newly-released datasets at https://github.com/nkicsl/Fundus_Review to adapt to the rapid development of this field.
Collapse
Affiliation(s)
- Tao Li
- College of Computer Science, Nankai University, Tianjin 300350, China
| | - Wang Bo
- College of Computer Science, Nankai University, Tianjin 300350, China
| | - Chunyu Hu
- College of Computer Science, Nankai University, Tianjin 300350, China
| | - Hong Kang
- College of Computer Science, Nankai University, Tianjin 300350, China
| | - Hanruo Liu
- Beijing Tongren Hospital, Capital Medical University, Beijing 100730, China
| | - Kai Wang
- College of Computer Science, Nankai University, Tianjin 300350, China.
| | - Huazhu Fu
- Inception Institute of Artificial Intelligence (IIAI), Abu Dhabi, UAE
| |
Collapse
|
230
|
Deep learning-based solvability of underdetermined inverse problems in medical imaging. Med Image Anal 2021; 69:101967. [PMID: 33517242 DOI: 10.1016/j.media.2021.101967] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2020] [Revised: 12/28/2020] [Accepted: 01/06/2021] [Indexed: 11/23/2022]
Abstract
Recently, with significant developments in deep learning techniques, solving underdetermined inverse problems has become a major concern in the medical imaging domain. Underdetermined problems are motivated by the desire to provide high-resolution medical images with as little data as possible, optimizing data collection for minimal acquisition time, cost-effectiveness, and low invasiveness. Typical examples include undersampled magnetic resonance imaging (MRI), interior tomography, and sparse-view computed tomography (CT), where deep learning techniques have achieved excellent performance. However, there is a lack of mathematical analysis of why deep learning methods perform well. This study aims to explain what structure in the training data allows deep learning to solve highly underdetermined problems, framed as learning a causal relationship. We present a particular low-dimensional solution model to highlight the advantage of deep learning methods over conventional methods, where the two approaches use the prior information of the solution in completely different ways. We also analyze whether deep learning methods can learn the desired reconstruction map from training data in three models (undersampled MRI, sparse-view CT, interior tomography). This paper also discusses the nonlinearity structure of underdetermined linear systems and conditions for learning (called the M-RIP condition).
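For concreteness, the underdetermined setting can be stated as recovering many unknowns from few measurements; a standard formulation in our notation (not necessarily the paper's):

```latex
% Recover x in R^n from m << n linear measurements y = A x:
%   undersampled MRI : A = P F   (P subsamples the Fourier transform F)
%   sparse-view CT   : A is a subsampled Radon transform
y = A x, \qquad A \in \mathbb{R}^{m \times n}, \quad m \ll n,
\qquad \hat{x} \;=\; \operatorname*{arg\,min}_{x \in \mathcal{M}} \; \lVert A x - y \rVert_2^2 .
```

Here the set \(\mathcal{M}\) is the low-dimensional solution model: conventional methods hand-craft it (e.g., sparsity in a transform domain), whereas a trained network encodes it implicitly through its training data.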
Collapse
|
231
|
Dai P, Zhang H, Cao X. SLOAN: Scale-Adaptive Orientation Attention Network for Scene Text Recognition. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:1687-1701. [PMID: 33360990 DOI: 10.1109/tip.2020.3045602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Scene text recognition, the final step of a scene text reading system, has made impressive progress with deep neural networks. However, existing recognition methods are devoted to handling geometrically regular or irregular scene text and remain limited on arbitrarily oriented scene text. Meanwhile, previous scene text recognizers usually learn single-scale feature representations for various-scale characters, which cannot model effective contexts for different characters. In this paper, we propose a novel scale-adaptive orientation attention network for arbitrary-orientation scene text recognition, which consists of a dynamic log-polar transformer and a sequence recognition network. Specifically, the dynamic log-polar transformer learns the log-polar origin so as to adaptively convert arbitrary rotations and scales of scene text into shifts in log-polar space, which helps generate rotation-aware and scale-aware visual representations. Next, the sequence recognition network is an encoder-decoder model that incorporates a novel character-level receptive field attention module to encode more valid contexts for various-scale characters. The whole architecture can be trained in an end-to-end manner, requiring only the word image and its corresponding ground-truth text. Extensive experiments on several public datasets demonstrate the effectiveness and superiority of our proposed method.
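The property exploited by the log-polar transformer is that rotation and scaling about the chosen origin become plain translations in (log-radius, angle) coordinates. A minimal nearest-neighbour resampling sketch in numpy (illustrative only; the paper's transformer is differentiable and learns the origin):

```python
import numpy as np

def log_polar_sample(img: np.ndarray, center, n_rho: int = 64, n_theta: int = 64):
    """Nearest-neighbour log-polar resampling about `center` (row, col):
    rotations and scalings of the input become translations in the
    (log-radius, angle) output grid."""
    h, w = img.shape[:2]
    cy, cx = center
    max_r = np.hypot(max(cy, h - cy), max(cx, w - cx))
    rho = np.exp(np.linspace(0.0, np.log(max_r), n_rho))       # log-spaced radii
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(rho, theta, indexing="ij")
    ys = np.clip((cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip((cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return img[ys, xs]
```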
Collapse
|
232
|
|
233
|
FFU-Net: Feature Fusion U-Net for Lesion Segmentation of Diabetic Retinopathy. BIOMED RESEARCH INTERNATIONAL 2021; 2021:6644071. [PMID: 33490274 PMCID: PMC7801055 DOI: 10.1155/2021/6644071] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/03/2020] [Revised: 11/25/2020] [Accepted: 12/21/2020] [Indexed: 11/18/2022]
Abstract
Diabetic retinopathy is one of the main causes of blindness in human eyes, and lesion segmentation is important groundwork for the diagnosis of diabetic retinopathy. Because small lesion areas are scattered across fundus images, it is laborious to segment diabetic retinopathy lesions effectively with the existing U-Net model. In this paper, we propose a new lesion segmentation model named FFU-Net (Feature Fusion U-Net) that enhances U-Net in the following ways. Firstly, the pooling layers in the network are replaced with convolutional layers to reduce spatial information loss in the fundus image. Then, we integrate a multiscale feature fusion (MSFF) block into the encoders, which helps the network learn multiscale features efficiently, and enrich the information carried by the skip connections and the lower-resolution decoder by fusing contextual channel attention (CCA) modules. Finally, to address data imbalance and misclassification, we present a Balanced Focal Loss function. In experiments on the benchmark IDRiD dataset, we perform an ablation study to verify the effectiveness of each component and compare FFU-Net against several state-of-the-art models. Compared with the baseline U-Net, FFU-Net improves segmentation performance by 11.97%, 10.68%, and 5.79% on the SEN, IOU, and DICE metrics, respectively. The quantitative and qualitative results demonstrate the superiority of our FFU-Net in the task of lesion segmentation of diabetic retinopathy.
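The abstract names a Balanced Focal Loss without spelling out its form; the sketch below shows the standard class-balanced focal loss such a design typically builds on (the alpha and gamma values are illustrative defaults, not the paper's):

```python
import torch
import torch.nn.functional as F

def balanced_focal_loss(logits, targets, alpha: float = 0.75, gamma: float = 2.0):
    """Class-balanced focal loss for binary lesion masks: alpha up-weights
    the rare foreground; (1 - p_t)^gamma down-weights easy pixels so
    training focuses on hard, misclassified ones."""
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)            # prob. of true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1.0 - p_t) ** gamma * ce).mean()
```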
Collapse
|
234
|
He A, Li T, Li N, Wang K, Fu H. CABNet: Category Attention Block for Imbalanced Diabetic Retinopathy Grading. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:143-153. [PMID: 32915731 DOI: 10.1109/tmi.2020.3023463] [Citation(s) in RCA: 79] [Impact Index Per Article: 19.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/05/2023]
Abstract
Diabetic Retinopathy (DR) grading is challenging due to the presence of intra-class variations, small lesions and imbalanced data distributions. The key for solving fine-grained DR grading is to find more discriminative features corresponding to subtle visual differences, such as microaneurysms, hemorrhages and soft exudates. However, small lesions are quite difficult to identify using traditional convolutional neural networks (CNNs), and an imbalanced DR data distribution will cause the model to pay too much attention to DR grades with more samples, greatly affecting the final grading performance. In this article, we focus on developing an attention module to address these issues. Specifically, for imbalanced DR data distributions, we propose a novel Category Attention Block (CAB), which explores more discriminative region-wise features for each DR grade and treats each category equally. In order to capture more detailed small lesion information, we also propose the Global Attention Block (GAB), which can exploit detailed and class-agnostic global attention feature maps for fundus images. By aggregating the attention blocks with a backbone network, the CABNet is constructed for DR grading. The attention blocks can be applied to a wide range of backbone networks and trained efficiently in an end-to-end manner. Comprehensive experiments are conducted on three publicly available datasets, showing that CABNet produces significant performance improvements for existing state-of-the-art deep architectures with few additional parameters and achieves the state-of-the-art results for DR grading. Code and models will be available at https://github.com/he2016012996/CABnet.
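As a rough picture of a class-agnostic global attention map of the kind described (not the published CAB/GAB internals, which are in the paper and code), a per-pixel gate can be sketched as:

```python
import torch.nn as nn

class GlobalSpatialAttention(nn.Module):
    """Generic class-agnostic spatial attention: a 1x1-convolution head
    predicts a per-pixel gate that highlights small, subtle structures
    in the backbone features."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels // 8, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 8, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.gate(x)   # B x C x H x W, gated per pixel
```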
Collapse
|
235
|
Zhang S, Wang H, Tian S, Zhang X, Li J, Lei R, Gao M, Liu C, Yang L, Bi X, Zhu L, Zhu S, Xu T, Yang R. A slice classification model-facilitated 3D encoder-decoder network for segmenting organs at risk in head and neck cancer. JOURNAL OF RADIATION RESEARCH 2021; 62:94-103. [PMID: 33029634 PMCID: PMC7779351 DOI: 10.1093/jrr/rraa094] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/05/2020] [Revised: 05/30/2020] [Indexed: 06/06/2023]
Abstract
For deep learning networks used to segment organs at risk (OARs) in head and neck (H&N) cancers, the class-imbalance problem between small-volume OARs and whole computed tomography (CT) images results in delineations with serious false positives on irrelevant slices and unnecessarily time-consuming calculations. To alleviate this problem, a slice classification model-facilitated 3D encoder-decoder network was developed and validated. In the developed two-step segmentation model, a slice classification model was first used to classify CT slices into six categories in the craniocaudal direction. The slices in the target categories for each OAR were then passed to the corresponding 3D encoder-decoder segmentation network. All patients were divided into training (n = 120), validation (n = 30) and testing (n = 20) datasets. The average accuracy of the slice classification model was 95.99%. The Dice similarity coefficient and 95% Hausdorff distance, respectively, for each OAR were as follows: right eye (0.88 ± 0.03 and 1.57 ± 0.92 mm), left eye (0.89 ± 0.03 and 1.35 ± 0.43 mm), right optic nerve (0.72 ± 0.09 and 1.79 ± 1.01 mm), left optic nerve (0.73 ± 0.09 and 1.60 ± 0.71 mm), brainstem (0.87 ± 0.04 and 2.28 ± 0.99 mm), right temporal lobe (0.81 ± 0.12 and 3.28 ± 2.27 mm), left temporal lobe (0.82 ± 0.09 and 3.73 ± 2.08 mm), right temporomandibular joint (0.70 ± 0.13 and 1.79 ± 0.79 mm), left temporomandibular joint (0.70 ± 0.16 and 1.98 ± 1.48 mm), mandible (0.89 ± 0.02 and 1.66 ± 0.51 mm), right parotid (0.77 ± 0.07 and 7.30 ± 4.19 mm) and left parotid (0.71 ± 0.12 and 8.41 ± 4.84 mm). The total segmentation time was 40.13 s. The 3D encoder-decoder network facilitated by the slice classification model demonstrated superior accuracy and efficiency in segmenting OARs in H&N CT images. This may significantly reduce the workload for radiation oncologists.
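Both reported metrics are straightforward to compute from binary masks and surface point sets; a brute-force sketch (illustrative, not optimized for large surfaces):

```python
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum() + 1e-8)

def hd95(pts_a: np.ndarray, pts_b: np.ndarray) -> float:
    """95% Hausdorff distance between two surface point sets (N x 3, in mm):
    the 95th percentile of symmetric nearest-neighbour distances, robust to
    a few outlier points (unlike the plain maximum)."""
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    return float(np.percentile(np.concatenate([d.min(axis=1), d.min(axis=0)]), 95))
```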
Collapse
Affiliation(s)
- Shuming Zhang
- Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
| | - Hao Wang
- Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
| | - Suqing Tian
- Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
| | - Xuyang Zhang
- Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
- Cancer Center, Beijing Luhe Hospital, Capital Medical University, Beijing, China
| | - Jiaqi Li
- Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
- Department of Emergency, Beijing Children’s Hospital, Capital Medical University, Beijing, China
| | - Runhong Lei
- Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
| | - Mingze Gao
- Beijing Linking Medical Technology Co., Ltd, Beijing, China
| | - Chunlei Liu
- Beijing Linking Medical Technology Co., Ltd, Beijing, China
| | - Li Yang
- Beijing Linking Medical Technology Co., Ltd, Beijing, China
| | - Xinfang Bi
- Beijing Linking Medical Technology Co., Ltd, Beijing, China
| | - Linlin Zhu
- Beijing Linking Medical Technology Co., Ltd, Beijing, China
| | - Senhua Zhu
- Beijing Linking Medical Technology Co., Ltd, Beijing, China
| | - Ting Xu
- Institute of Science and Technology Development, Beijing University of Posts and Telecommunications, Beijing, China
| | - Ruijie Yang
- Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
| |
Collapse
|
236
|
Kose K, Bozkurt A, Alessi-Fox C, Gill M, Longo C, Pellacani G, Dy JG, Brooks DH, Rajadhyaksha M. Segmentation of cellular patterns in confocal images of melanocytic lesions in vivo via a multiscale encoder-decoder network (MED-Net). Med Image Anal 2021; 67:101841. [PMID: 33142135 PMCID: PMC7885250 DOI: 10.1016/j.media.2020.101841] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2019] [Revised: 09/17/2020] [Accepted: 09/18/2020] [Indexed: 12/11/2022]
Abstract
In-vivo optical microscopy is advancing into routine clinical practice for non-invasively guiding diagnosis and treatment of cancer and other diseases, and is thus beginning to reduce the need for traditional biopsy. However, reading and analysis of optical microscopic images are generally still qualitative, relying mainly on visual examination. Here we present an automated semantic segmentation method called "Multiscale Encoder-Decoder Network (MED-Net)" that provides pixel-wise labeling into classes of patterns in a quantitative manner. The novelty in our approach is the modeling of textural patterns at multiple scales (magnifications, resolutions). This mimics the traditional procedure for examining pathology images, which routinely starts with low magnification (low resolution, large field of view) followed by closer inspection of suspicious areas with higher magnification (higher resolution, smaller fields of view). We trained and tested our model on non-overlapping partitions of 117 reflectance confocal microscopy (RCM) mosaics of melanocytic lesions, an extensive dataset for this application, collected at four clinics in the US and two in Italy. With patient-wise cross-validation, we achieved pixel-wise mean sensitivity and specificity of 74% and 92%, respectively, with a 0.74 Dice coefficient over six classes. In a second scenario, we partitioned the data clinic-wise and tested the generalizability of the model across clinics. In this setting, we achieved pixel-wise mean sensitivity and specificity of 77% and 94%, respectively, with a 0.77 Dice coefficient. We compared MED-Net against state-of-the-art semantic segmentation models and achieved better quantitative segmentation performance. Our results also suggest that, thanks to its nested multiscale architecture, the MED-Net model annotated RCM mosaics more coherently, avoiding unrealistically fragmented annotations.
Collapse
Affiliation(s)
- Kivanc Kose
- Dermatology Service, Memorial Sloan Kettering Cancer Center, New York, 11377, NY, USA.
| | - Alican Bozkurt
- Electrical and Computer Engineering Department, Northeastern University, Boston, 02115, MA, USA.
| | | | - Melissa Gill
- Department of Pathology at SUNY Downstate Medical Center, New York, 11203, NY, USA; SkinMedical Research Diagnostics, P.L.L.C., Dobbs Ferry, 10522, NY, USA; Faculty of Medicine and Health Sciences, University of Alcala de Henares, Madrid, Spain.
| | - Caterina Longo
- University of Modena and Reggio Emilia, Reggio Emilia, Italy; Azienda Unità Sanitaria Locale - IRCCS di Reggio Emilia, Centro Oncologico ad Alta Tecnologia Diagnostica-Dermatologia, Reggio Emilia, Italy.
| | | | - Jennifer G Dy
- Electrical and Computer Engineering Department, Northeastern University, Boston, 02115, MA, USA.
| | - Dana H Brooks
- Electrical and Computer Engineering Department, Northeastern University, Boston, 02115, MA, USA.
| | - Milind Rajadhyaksha
- Dermatology Service, Memorial Sloan Kettering Cancer Center, New York, 11377, NY, USA.
| |
Collapse
|
237
|
Loo J, Kriegel MF, Tuohy MM, Kim KH, Prajna V, Woodward MA, Farsiu S. Open-Source Automatic Segmentation of Ocular Structures and Biomarkers of Microbial Keratitis on Slit-Lamp Photography Images Using Deep Learning. IEEE J Biomed Health Inform 2021; 25:88-99. [PMID: 32248131 PMCID: PMC7781042 DOI: 10.1109/jbhi.2020.2983549] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
We propose a fully-automatic deep learning-based algorithm for segmentation of ocular structures and microbial keratitis (MK) biomarkers on slit-lamp photography (SLP) images. The dataset consisted of SLP images from 133 eyes with manual annotations by a physician, P1. A modified region-based convolutional neural network, SLIT-Net, was developed and trained using P1's annotations to identify and segment four pathological regions of interest (ROIs) on diffuse white light images (stromal infiltrate (SI), hypopyon, white blood cell (WBC) border, corneal edema border), one pathological ROI on diffuse blue light images (epithelial defect (ED)), and two non-pathological ROIs on all images (corneal limbus, light reflexes). To assess inter-reader variability, 75 eyes were manually annotated for pathological ROIs by a second physician, P2. Performance was evaluated using the Dice similarity coefficient (DSC) and Hausdorff distance (HD). Using seven-fold cross-validation, the DSC of the algorithm (as compared to P1) for all ROIs was good (range: 0.62-0.95) on all 133 eyes. For the subset of 75 eyes with manual annotations by P2, the DSC for pathological ROIs ranged from 0.69-0.85 (SLIT-Net) vs. 0.37-0.92 (P2). DSCs for SLIT-Net were not significantly different than P2 for segmenting hypopyons (p > 0.05) and higher than P2 for WBCs (p < 0.001) and edema (p < 0.001). DSCs were higher for P2 for segmenting SIs (p < 0.001) and EDs (p < 0.001). HDs were lower for P2 for segmenting SIs (p = 0.005) and EDs (p < 0.001) and not significantly different for hypopyons (p > 0.05), WBCs (p > 0.05), and edema (p > 0.05). This prototype fully-automatic algorithm to segment MK biomarkers on SLP images performed to expectations on an exploratory dataset and holds promise for quantification of corneal physiology and pathology.
Collapse
|
238
|
Hasan MK, Alam MA, Elahi MTE, Roy S, Martí R. DRNet: Segmentation and localization of optic disc and Fovea from diabetic retinopathy image. Artif Intell Med 2020; 111:102001. [PMID: 33461693 DOI: 10.1016/j.artmed.2020.102001] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2020] [Revised: 11/22/2020] [Accepted: 12/06/2020] [Indexed: 12/18/2022]
Abstract
BACKGROUND AND OBJECTIVE In modern ophthalmology, automated Computer-aided Screening Tools (CSTs) are crucial non-intrusive diagnosis methods, in which accurate segmentation of the Optic Disc (OD) and localization of the OD and Fovea centers are substantial integral parts. However, designing such an automated tool remains challenging due to small dataset sizes, inconsistency in the spatial, texture, and shape information of the OD and Fovea, and the presence of different artifacts. METHODS This article proposes an end-to-end encoder-decoder network, named DRNet, for the segmentation and localization of the OD and Fovea centers. In our DRNet, we propose a skip connection, named residual skip connection, for compensating the spatial information lost due to pooling in the encoder. Unlike the earlier skip connection in UNet, the proposed skip connection does not directly concatenate low-level feature maps from the encoder's early layers with the decoder features at the same scale. We validate DRNet using different publicly available datasets, such as IDRiD, RIMONE, DRISHTI-GS, and DRIVE for OD segmentation; IDRiD and HRF for OD center localization; and IDRiD for Fovea center localization. RESULTS The proposed DRNet achieves, for OD segmentation, mean Intersection over Union (mIoU) of 0.845, 0.901, 0.933, and 0.920 for IDRiD, RIMONE, DRISHTI-GS, and DRIVE, respectively. Our OD segmentation result, in terms of mIoU, outperforms the state-of-the-art results for the IDRiD and DRIVE datasets, whereas it outperforms state-of-the-art results in terms of mean sensitivity for the RIMONE and DRISHTI-GS datasets. The DRNet localizes the OD center with mean Euclidean Distance (mED) of 20.23 and 13.34 pixels for the IDRiD and HRF datasets, respectively, outperforming the state-of-the-art by 4.62 pixels on the IDRiD dataset. The DRNet also successfully localizes the Fovea center with an mED of 41.87 pixels on the IDRiD dataset, outperforming the state-of-the-art by 1.59 pixels on the same dataset. CONCLUSION As the proposed DRNet exhibits excellent performance even with limited training data and without intermediate intervention, it can be employed to design a better CST system to screen retinal images. Our source code, trained models, and ground-truth heatmaps for OD and Fovea center localization will be made publicly available upon publication on GitHub.
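The abstract contrasts its residual skip connection with UNet's plain concatenation; a speculative sketch of a residual-style skip (our reading of the description, not the published wiring):

```python
import torch.nn as nn

class ResidualSkip(nn.Module):
    """Residual-style skip: encoder features pass through a small residual
    transform before joining the decoder, instead of being concatenated raw.
    Shapes of enc_feat and dec_feat are assumed to match."""
    def __init__(self, channels: int):
        super().__init__()
        self.transform = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, enc_feat, dec_feat):
        return dec_feat + enc_feat + self.transform(enc_feat)
```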
Collapse
Affiliation(s)
- Md Kamrul Hasan
- Department of Electrical and Electronic Engineering, Khulna University of Engineering & Technology, Khulna 9203, Bangladesh.
| | - Md Ashraful Alam
- Department of Electrical and Electronic Engineering, Khulna University of Engineering & Technology, Khulna 9203, Bangladesh.
| | - Md Toufick E Elahi
- Department of Electrical and Electronic Engineering, Khulna University of Engineering & Technology, Khulna 9203, Bangladesh.
| | - Shidhartho Roy
- Department of Electrical and Electronic Engineering, Khulna University of Engineering & Technology, Khulna 9203, Bangladesh.
| | - Robert Martí
- Computer Vision and Robotics Institute, University of Girona, Spain.
| |
Collapse
|
239
|
Mirzania D, Thompson AC, Muir KW. Applications of deep learning in detection of glaucoma: A systematic review. Eur J Ophthalmol 2020; 31:1618-1642. [PMID: 33274641 DOI: 10.1177/1120672120977346] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
Glaucoma is the leading cause of irreversible blindness and disability worldwide. Nevertheless, the majority of patients do not know they have the disease and detection of glaucoma progression using standard technology remains a challenge in clinical practice. Artificial intelligence (AI) is an expanding field that offers the potential to improve diagnosis and screening for glaucoma with minimal reliance on human input. Deep learning (DL) algorithms have risen to the forefront of AI by providing nearly human-level performance, at times exceeding the performance of humans for detection of glaucoma on structural and functional tests. A succinct summary of present studies and challenges to be addressed in this field is needed. Following PRISMA guidelines, we conducted a systematic review of studies that applied DL methods for detection of glaucoma using color fundus photographs, optical coherence tomography (OCT), or standard automated perimetry (SAP). In this review article we describe recent advances in DL as applied to the diagnosis of glaucoma and glaucoma progression for application in screening and clinical settings, as well as the challenges that remain when applying this novel technique in glaucoma.
Collapse
Affiliation(s)
| | - Atalie C Thompson
- Duke University School of Medicine, Durham, NC, USA; Durham VA Medical Center, Durham, NC, USA
| | - Kelly W Muir
- Duke University School of Medicine, Durham, NC, USA; Durham VA Medical Center, Durham, NC, USA
| |
Collapse
|
240
|
Wang S, Yu L, Li K, Yang X, Fu CW, Heng PA. DoFE: Domain-Oriented Feature Embedding for Generalizable Fundus Image Segmentation on Unseen Datasets. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:4237-4248. [PMID: 32776876 DOI: 10.1109/tmi.2020.3015224] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Deep convolutional neural networks have significantly boosted the performance of fundus image segmentation when test datasets have the same distribution as the training datasets. However, in clinical practice, medical images often exhibit variations in appearance for various reasons, e.g., different scanner vendors and image quality. These distribution discrepancies could lead the deep networks to over-fit on the training datasets and lack generalization ability on the unseen test datasets. To alleviate this issue, we present a novel Domain-oriented Feature Embedding (DoFE) framework to improve the generalization ability of CNNs on unseen target domains by exploring the knowledge from multiple source domains. Our DoFE framework dynamically enriches the image features with additional domain prior knowledge learned from multi-source domains to make the semantic features more discriminative. Specifically, we introduce a Domain Knowledge Pool to learn and memorize the prior information extracted from multi-source domains. Then the original image features are augmented with domain-oriented aggregated features, which are induced from the knowledge pool based on the similarity between the input image and multi-source domain images. We further design a novel domain code prediction branch to infer this similarity and employ an attention-guided mechanism to dynamically combine the aggregated features with the semantic features. We comprehensively evaluate our DoFE framework on two fundus image segmentation tasks, including the optic cup and disc segmentation and vessel segmentation. Our DoFE framework generates satisfying segmentation results on unseen datasets and surpasses other domain generalization and network regularization methods.
Collapse
|
241
|
Shi T, Jiang H, Zheng B. A Stacked Generalization U-shape network based on zoom strategy and its application in biomedical image segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2020; 197:105678. [PMID: 32791449 DOI: 10.1016/j.cmpb.2020.105678] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/21/2020] [Accepted: 07/23/2020] [Indexed: 06/11/2023]
Abstract
BACKGROUND AND OBJECTIVE Deep neural network models can learn complex non-linear relationships in data and have superior flexibility and adaptability. A downside of this flexibility is that such models are sensitive to initial conditions, both in terms of the initial random weights and in terms of statistical noise in the training dataset. A downside of the adaptability is that deep convolutional networks usually show poor robustness or generalization when trained on extremely limited amounts of labeled data, especially in the biomedical imaging informatics field. METHODS In this paper, we propose and test a stacked generalization U-shape network (SG-UNet) based on a zoom strategy applied to biomedical image segmentation. SG-UNet is essentially a stacked generalization architecture consisting of multiple sub-modules, which takes multi-resolution images as input and uses hybrid features to segment regions of interest and detect diseases under multi-supervision. The proposed SG-UNet applies zoomed multi-supervision to search the global feature space without pre-training. Besides, the zoom loss function gradually concentrates training on a sparse set of hard samples. RESULTS We evaluated the proposed algorithm in comparison with several popular U-shape ensemble network architectures across multi-modal biomedical image segmentation tasks to segment malignant rectal cancers, polyps and glands from three imaging modalities: computed tomography (CT), digital colonoscopy and histopathology images. The proposed algorithm improves Dice coefficients by 3.116%, 2.676% and 2.356%, and F2-scores by 3.044%, 2.420% and 1.928%, on the three imaging modality datasets, respectively. Comparison results using different amounts of rectal cancer CT data show that the proposed algorithm has a slower tendency toward diminishing marginal efficiency. The gland segmentation results also support the feasibility of yielding performance comparable with other state-of-the-art methods. CONCLUSIONS The proposed algorithm can be trained efficiently on small image datasets without additional techniques such as fine-tuning, and achieves higher accuracy with less computational complexity than other stacked ensemble networks for biomedical image segmentation.
Collapse
Affiliation(s)
- Tianyu Shi
- Software College, Northeastern University, Shenyang 110819, China
| | - Huiyan Jiang
- Software College, Northeastern University, Shenyang 110819, China; Key Laboratory of Intelligent Computing in Biomedical Image, Ministry of Education, Northeastern University, Shenyang 110819, China.
| | - Bin Zheng
- School of Electrical and Computer Engineering, The University of Oklahoma, Norman, OK 73019, USA.
| |
Collapse
|
242
|
Bian X, Luo X, Wang C, Liu W, Lin X. Optic disc and optic cup segmentation based on anatomy guided cascade network. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2020; 197:105717. [PMID: 32957060 DOI: 10.1016/j.cmpb.2020.105717] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/01/2020] [Accepted: 08/15/2020] [Indexed: 05/26/2023]
Abstract
BACKGROUND AND OBJECTIVE Glaucoma, a worldwide eye disease, may cause irreversible vision damage. If not treated properly at an early stage, glaucoma eventually deteriorates into blindness. Various glaucoma screening methods, e.g. Ultrasound Biomicroscopy (UBM), Optical Coherence Tomography (OCT), and the Heidelberg Retina Tomograph (HRT), are available. However, retinal fundus photography examination, because of its low cost, is one of the most common solutions used to diagnose glaucoma. Clinically, the cup-to-disc ratio is an important indicator in glaucoma diagnosis, so precise fundus image segmentation to calculate the cup-to-disc ratio is the basis for glaucoma screening. METHODS In this paper, we propose a deep neural network that uses anatomical knowledge to guide the segmentation of fundus images, accurately segmenting the optic cup and the optic disc in a fundus image so that the cup-to-disc ratio can be calculated accurately. Optic disc and optic cup segmentation are typical small-target segmentation problems in biomedical images. We propose an attention-based cascade network that effectively accelerates the convergence of small-target segmentation during training and accurately preserves the detailed contours of small targets. RESULTS Our method, validated in the MICCAI REFUGE fundus image segmentation competition, achieves a 93.31% Dice score in optic disc segmentation and an 88.04% Dice score in optic cup segmentation. Moreover, we achieved a high CDR evaluation score, which is useful for glaucoma screening. CONCLUSIONS The proposed method successfully introduces anatomical knowledge into the segmentation task and achieves state-of-the-art performance in fundus image segmentation. It can also be used for both automatic segmentation and semi-automatic segmentation with human interaction.
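The clinically relevant quantity, the vertical cup-to-disc ratio, follows directly from the two segmentation masks; a minimal sketch:

```python
import numpy as np

def vertical_cdr(cup_mask: np.ndarray, disc_mask: np.ndarray) -> float:
    """Vertical cup-to-disc ratio from binary masks: ratio of the vertical
    (row-span) extents of the optic cup and optic disc."""
    def extent(mask: np.ndarray) -> int:
        rows = np.where(mask.any(axis=1))[0]
        return int(rows[-1] - rows[0] + 1) if rows.size else 0
    disc_h = extent(disc_mask)
    return extent(cup_mask) / disc_h if disc_h else 0.0
```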
Collapse
Affiliation(s)
- Xuesheng Bian
- Fujian Key Laboratory of Sensing and Computing for Smart Cities, Department of Computer Science, School of Informatics, Xiamen University, Xiamen 361005, China
| | - Xiongbiao Luo
- Fujian Key Laboratory of Sensing and Computing for Smart Cities, Department of Computer Science, School of Informatics, Xiamen University, Xiamen 361005, China
| | - Cheng Wang
- Fujian Key Laboratory of Sensing and Computing for Smart Cities, Department of Computer Science, School of Informatics, Xiamen University, Xiamen 361005, China.
| | - Weiquan Liu
- Fujian Key Laboratory of Sensing and Computing for Smart Cities, Department of Computer Science, School of Informatics, Xiamen University, Xiamen 361005, China
| | - Xiuhong Lin
- Fujian Key Laboratory of Sensing and Computing for Smart Cities, Department of Computer Science, School of Informatics, Xiamen University, Xiamen 361005, China
| |
Collapse
|
243
|
Fu H, Li F, Sun X, Cao X, Liao J, Orlando JI, Tao X, Li Y, Zhang S, Tan M, Yuan C, Bian C, Xie R, Li J, Li X, Wang J, Geng L, Li P, Hao H, Liu J, Kong Y, Ren Y, Bogunović H, Zhang X, Xu Y. AGE challenge: Angle Closure Glaucoma Evaluation in Anterior Segment Optical Coherence Tomography. Med Image Anal 2020; 66:101798. [DOI: 10.1016/j.media.2020.101798] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2020] [Revised: 07/30/2020] [Accepted: 07/31/2020] [Indexed: 12/13/2022]
|
244
|
IOSUDA: an unsupervised domain adaptation with input and output space alignment for joint optic disc and cup segmentation. APPL INTELL 2020. [DOI: 10.1007/s10489-020-01956-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
245
|
Fang X, Yan P. Multi-Organ Segmentation Over Partially Labeled Datasets With Multi-Scale Feature Abstraction. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:3619-3629. [PMID: 32746108 PMCID: PMC7665851 DOI: 10.1109/tmi.2020.3001036] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/13/2023]
Abstract
Shortage of fully annotated datasets has been a limiting factor in developing deep learning based image segmentation algorithms and the problem becomes more pronounced in multi-organ segmentation. In this paper, we propose a unified training strategy that enables a novel multi-scale deep neural network to be trained on multiple partially labeled datasets for multi-organ segmentation. In addition, a new network architecture for multi-scale feature abstraction is proposed to integrate pyramid input and feature analysis into a U-shape pyramid structure. To bridge the semantic gap caused by directly merging features from different scales, an equal convolutional depth mechanism is introduced. Furthermore, we employ a deep supervision mechanism to refine the outputs in different scales. To fully leverage the segmentation features from all the scales, we design an adaptive weighting layer to fuse the outputs in an automatic fashion. All these mechanisms together are integrated into a Pyramid Input Pyramid Output Feature Abstraction Network (PIPO-FAN). Our proposed method was evaluated on four publicly available datasets, including BTCV, LiTS, KiTS and Spleen, where very promising performance has been achieved. The source code of this work is publicly shared at https://github.com/DIAL-RPI/PIPO-FAN to facilitate others to reproduce the work and build their own models using the introduced mechanisms.
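The adaptive weighting layer that fuses outputs across scales can be as simple as softmax-normalized learnable weights over resized prediction maps; a minimal sketch of the idea (our reading, not the exact PIPO-FAN layer):

```python
import torch
import torch.nn as nn

class AdaptiveScaleFusion(nn.Module):
    """Fuses segmentation maps predicted at several scales with
    softmax-normalized learnable weights, so each scale's contribution is
    learned rather than fixed."""
    def __init__(self, n_scales: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_scales))

    def forward(self, outputs):
        # outputs: list of B x C x H x W maps, already resized to one resolution
        w = torch.softmax(self.logits, dim=0)
        return sum(wi * oi for wi, oi in zip(w, outputs))
```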
Collapse
|
246
|
Liu L, Wu FX, Wang YP, Wang J. Multi-Receptive-Field CNN for Semantic Segmentation of Medical Images. IEEE J Biomed Health Inform 2020; 24:3215-3225. [DOI: 10.1109/jbhi.2020.3016306] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
247
|
Cao G, Zhao W, Higashita R, Liu J, Chen W, Yuan J, Zhang Y, Yang M. An Efficient Lens Structures Segmentation Method on AS-OCT Images. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2020; 2020:1646-1649. [PMID: 33018311 DOI: 10.1109/embc44109.2020.9175944] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Segmentation of lens structures on anterior segment optical coherence tomography (AS-OCT) images is a fundamental task for cataract grading analysis. In this paper, in order to reduce the computational cost while maintaining segmentation accuracy, we propose an efficient method for lens structure segmentation. We adopt an efficient semantic segmentation network, first using it to extract the lens area instead of a conventional object detection method, and then using it again to segment the lens structures. Finally, we apply curve fitting processing (CFP) to the segmentation results. Experimental results show that our method performs well in terms of accuracy and processing speed, and could be applied to the CASIA II device for practical applications.
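Curve fitting as post-processing typically means replacing a jagged predicted boundary with a smooth low-order fit; a minimal numpy sketch (the paper's CFP details are not given here):

```python
import numpy as np

def fit_boundary(xs: np.ndarray, ys: np.ndarray, degree: int = 4) -> np.ndarray:
    """Replace jagged boundary pixels (x, y) extracted from a raw
    segmentation with a smooth low-order polynomial fit."""
    coeffs = np.polyfit(xs, ys, degree)    # least-squares polynomial fit
    return np.polyval(coeffs, xs)          # smoothed y-coordinates
```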
Collapse
|
248
|
Zhang L, Zhang J, Li Z, Song Y. A multiple-channel and atrous convolution network for ultrasound image segmentation. Med Phys 2020; 47:6270-6285. [PMID: 33007105 DOI: 10.1002/mp.14512] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2020] [Revised: 09/12/2020] [Accepted: 09/22/2020] [Indexed: 11/08/2022] Open
Abstract
PURPOSE Ultrasound image segmentation is a challenging task due to a low signal-to-noise ratio and poor image quality. Although several approaches based on convolutional neural networks (CNNs) have been applied to ultrasound image segmentation, they have weak generalization ability. We propose an end-to-end, multiple-channel and atrous CNN designed to extract a greater amount of semantic information for segmentation of ultrasound images. METHOD A multiple-channel and atrous convolution network is developed, referred to as MA-Net. Similar to U-Net, MA-Net is based on an encoder-decoder architecture and includes five modules: the encoder, atrous convolution, pyramid pooling, decoder, and residual skip pathway modules. In the encoder module, we aim to capture more information with multiple-channel convolution and use large-kernel convolution instead of small filters in each convolution operation. In the last layer, atrous convolution and pyramid pooling are used to extract multi-scale features. The architecture of the decoder is similar to that of the encoder module, except that up-sampling is used instead of down-sampling. Furthermore, the residual skip pathway module connects the subnetworks of the encoder and decoder to optimize learning from the deeper layers and improve the accuracy of segmentation. During the learning process, we adopt multi-task learning to enhance segmentation performance. Five types of datasets are used in our experiments. Because the original training data are limited, we apply data augmentation (e.g., horizontal and vertical flipping, random rotations, and random scaling) to our training data. We use the Dice score, precision, recall, Hausdorff distance (HD), average symmetric surface distance (ASD), and root mean square symmetric surface distance (RMSD) as the metrics for segmentation evaluation. Meanwhile, the Friedman test was performed as a nonparametric statistical analysis to compare the algorithms. RESULTS For the brachial plexus (BP), fetal head, and lymph node segmentation datasets, MA-Net achieved average Dice scores of 0.776, 0.973, and 0.858, respectively; average precisions of 0.787, 0.968, and 0.854, respectively; average recalls of 0.788, 0.978, and 0.885, respectively; average HDs (mm) of 13.591, 10.924, and 19.245, respectively; average ASDs (mm) of 4.822, 4.152, and 4.312, respectively; and average RMSDs (mm) of 4.979, 4.161, and 4.930, respectively. Compared with U-Net, U-Net++, M-Net, and Dilated U-Net, the average performance of MA-Net increased by approximately 5.68%, 2.85%, 6.59%, 36.03%, 23.64%, and 31.71% for Dice, precision, recall, HD, ASD, and RMSD, respectively. Moreover, we verified the generalization of MA-Net segmentation to lower-grade brain glioma MRI and lung CT images. In addition, MA-Net achieved the highest mean rank in the Friedman test. CONCLUSION The proposed MA-Net accurately segments ultrasound images with high generalization, and therefore offers a useful tool for diagnostic applications in ultrasound imaging.
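Atrous (dilated) convolution enlarges the receptive field without extra parameters; a common parallel-rate block, sketched in PyTorch (the dilation rates are illustrative, not MA-Net's exact configuration):

```python
import torch
import torch.nn as nn

class AtrousBlock(nn.Module):
    """Parallel atrous (dilated) 3x3 convolutions: increasing dilation rates
    sample increasingly wide context at the same parameter cost; the branch
    outputs are fused by a 1x1 convolution."""
    def __init__(self, in_ch: int, out_ch: int, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates
        )
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```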
Collapse
Affiliation(s)
- Lun Zhang
- School of Information Science and Engineering, Yunnan University, Kunming, Yunnan, 650091, China; Yunnan Vocational Institute of Energy Technology, Qujing, Yunnan, 655001, China
| | - Junhua Zhang
- School of Information Science and Engineering, Yunnan University, Kunming, Yunnan, 650091, China
| | - Zonggui Li
- School of Information Science and Engineering, Yunnan University, Kunming, Yunnan, 650091, China
| | - Yingchao Song
- School of Information Science and Engineering, Yunnan University, Kunming, Yunnan, 650091, China
| |
Collapse
|
249
|
Abstract
Kidney tumors represent a type of cancer that people of advanced age are more likely to develop. For this reason, it is important to exercise caution and provide diagnostic tests in the later stages of life. Medical imaging and deep learning methods are becoming increasingly attractive in this sense. Developing deep learning models that help physicians identify tumors through successful segmentation is of great importance. However, few successful systems exist for soft-tissue organs, such as the kidneys and the prostate, whose segmentation is relatively difficult. In such cases, V-Net-based models are mostly used. This paper proposes a new hybrid model combining the superior features of existing V-Net models. The model represents a more successful system, with improvements in the encoder and decoder phases that have not been applied before. We believe that this new hybrid V-Net model could help the majority of physicians, particularly those focused on kidney and kidney tumor segmentation. The proposed model showed better segmentation performance than existing imaging models and can be easily integrated into other systems due to its flexible structure and applicability. The hybrid V-Net model exhibited average Dice coefficients of 97.7% and 86.5% for kidney and tumor segmentation, respectively, and could therefore be used as a reliable method for soft-tissue organ segmentation.
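V-Net-style models are usually trained with a soft Dice objective, which directly optimizes the reported overlap metric; a minimal sketch:

```python
import torch

def soft_dice_loss(probs: torch.Tensor, target: torch.Tensor, eps: float = 1e-6):
    """Soft Dice loss, the differentiable objective V-Net-style models
    typically optimize; the reported Dice coefficient corresponds to
    1 - loss evaluated on binarized predictions."""
    dims = tuple(range(1, probs.ndim))              # sum over all but batch
    inter = (probs * target).sum(dims)
    denom = probs.sum(dims) + target.sum(dims)
    return (1.0 - (2.0 * inter + eps) / (denom + eps)).mean()
```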
Collapse
|
250
|
Cai Z, Xin J, Wu J, Liu S, Zuo W, Zheng N. Triple Multi-scale Adversarial Learning with Self-attention and Quality Loss for Unpaired Fundus Fluorescein Angiography Synthesis. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2020; 2020:1592-1595. [PMID: 33018298 DOI: 10.1109/embc44109.2020.9176302] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Clinically, Fundus Fluorescein Angiography (FA) is a more common means of Diabetic Retinopathy (DR) detection, since DR appears with much higher contrast in FA than in Color Fundus images (CF). However, acquiring FA carries a risk of death due to allergy to the fluorescein dye. Thus, in this paper, we explore a novel unpaired CycleGAN-based model for FA synthesis from CF, where strict structure similarity constraints are employed to guarantee a faithful mapping from one domain to the other. First, a triple multi-scale network architecture with multi-scale inputs, multi-scale discriminators and multi-scale cycle consistency losses is proposed to enhance the similarity between the two retinal modalities across scales. Second, a self-attention mechanism is introduced to improve the adaptive domain mapping ability of the model. Third, to further tighten the constraints at the feature level, a quality loss is employed between each generation and reconstruction step. Qualitative examples, as well as quantitative evaluations, are provided to support the robustness and accuracy of our proposed method.
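The cycle-consistency constraint at the heart of unpaired CycleGAN training requires that an image translated to the other domain and back reproduces the original; a minimal sketch (generator names are ours, hypothetical):

```python
import torch.nn.functional as F

def cycle_consistency_loss(g_cf2fa, g_fa2cf, cf, fa):
    """CycleGAN cycle-consistency term for unpaired CF<->FA synthesis:
    translating to the other domain and back must reproduce the input.
    g_cf2fa / g_fa2cf are the two generators (hypothetical names)."""
    return (F.l1_loss(g_fa2cf(g_cf2fa(cf)), cf) +
            F.l1_loss(g_cf2fa(g_fa2cf(fa)), fa))
```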
Collapse
|