1. Hosseinimanesh G, Alsheghri A, Keren J, Cheriet F, Guibault F. Personalized dental crown design: A point-to-mesh completion network. Med Image Anal 2025; 101:103439. [PMID: 39705822] [DOI: 10.1016/j.media.2024.103439]
Abstract
Designing dental crowns with computer-aided design software in dental laboratories is complex and time-consuming. Using real clinical datasets, we developed an end-to-end deep learning model that automatically generates personalized dental crown meshes. The input context includes the prepared tooth, its adjacent teeth, and the two closest teeth in the opposing jaw. The training set contains this context, the ground truth crown, and the extracted margin line. Our model consists of two components: First, a feature extractor converts the input point cloud into a set of local feature vectors, which are then fed into a transformer-based model to predict the geometric features of the crown. Second, a point-to-mesh module generates a dense array of points with normal vectors, and a differentiable Poisson surface reconstruction method produces an accurate crown mesh. Training is conducted with three losses: (1) a customized margin line loss; (2) a contrastive-based Chamfer distance loss; and (3) a mean square error (MSE) loss to control mesh quality. We compare our method with our previously published method, Dental Mesh Completion (DMC). Extensive testing confirms our method's superiority, achieving a 12.32% reduction in Chamfer distance and a 46.43% reduction in MSE compared to DMC. Margin line loss improves Chamfer distance by 5.59%.
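For illustration, the Chamfer-style point-cloud supervision mentioned above can be sketched in a minimal, generic form (this is not the authors' contrastive variant or their margin line loss):

```python
import torch

def chamfer_distance(pred: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    """Bidirectional Chamfer distance between two point clouds.

    pred: (N, 3) predicted crown points; gt: (M, 3) ground-truth points.
    Generic formulation for illustration only.
    """
    d = torch.cdist(pred, gt).pow(2)  # (N, M) pairwise squared distances
    # Nearest ground-truth point for each prediction, and vice versa
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()
```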
Affiliation(s)
- Ammar Alsheghri
- Mechanical Engineering Department, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran, 31261, Kingdom of Saudi Arabia; Interdisciplinary Research Center for Biosystems and Machines, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran, Kingdom of Saudi Arabia
2. Rekik A, Ben-Hamadou A, Smaoui O, Bouzguenda F, Pujades S, Boyer E. TSegLab: Multi-stage 3D dental scan segmentation and labeling. Comput Biol Med 2025; 185:109535. [PMID: 39708498] [DOI: 10.1016/j.compbiomed.2024.109535]
Abstract
This study introduces a novel deep learning approach for 3D teeth scan segmentation and labeling, designed to enhance accuracy in computer-aided design (CAD) systems. Our method is organized into three key stages: coarse localization, fine teeth segmentation, and labeling. In the teeth localization stage, we employ a Mask-RCNN model to detect teeth in a rendered three-channel 2D representation of the input scan. For fine teeth segmentation, each detected tooth mesh is isomorphically mapped to a 2D harmonic parameter space and segmented with a Mask-RCNN model for precise crown delineation. Finally, for labeling, we propose a graph neural network that captures both the 3D shape and spatial distribution of the teeth, along with a new data augmentation technique to simulate missing teeth and teeth position variation during training. The method is evaluated using three key metrics: Teeth Localization Accuracy (TLA), Teeth Segmentation Accuracy (TSA), and Teeth Identification Rate (TIR). We tested our approach on the Teeth3DS dataset, consisting of 1800 intraoral 3D scans, and achieved a TLA of 98.45%, TSA of 98.17%, and TIR of 97.61%, outperforming existing state-of-the-art techniques. These results suggest that our approach significantly enhances the precision and reliability of automatic teeth segmentation and labeling in dental CAD applications. Link to the project page: https://crns-smartvision.github.io/tseglab.
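The abstract does not define its three metrics; under one plausible reading of the Teeth Identification Rate (the fraction of ground-truth teeth assigned the correct label), a toy computation could look like this:

```python
def teeth_identification_rate(pred: dict, gt: dict) -> float:
    """Fraction of ground-truth teeth whose predicted (e.g., FDI) label
    matches. A simplified stand-in; the paper's exact definition may differ."""
    hits = sum(1 for tooth, label in gt.items() if pred.get(tooth) == label)
    return hits / len(gt)

# Three of four teeth correctly identified -> 0.75
print(teeth_identification_rate({1: 11, 2: 12, 3: 13, 4: 14},
                                {1: 11, 2: 12, 3: 13, 4: 21}))
```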
Affiliation(s)
- Ahmed Rekik
- Digital Research Center of Sfax, Technopark of Sfax, Sakiet Ezzit, 3021 Sfax, Tunisia; ISSAT, Gafsa University, Sidi Ahmed Zarrouk University Campus, 2112 Gafsa, Tunisia; Laboratory of Signals, systeMs, aRtificial Intelligence and neTworkS, Technopark of Sfax, Sakiet Ezzit, 3021 Sfax, Tunisia
- Achraf Ben-Hamadou
- Digital Research Center of Sfax, Technopark of Sfax, Sakiet Ezzit, 3021 Sfax, Tunisia; Laboratory of Signals, systeMs, aRtificial Intelligence and neTworkS, Technopark of Sfax, Sakiet Ezzit, 3021 Sfax, Tunisia
- Oussama Smaoui
- Udini, 37 BD Aristide Briand, 13100 Aix-En-Provence, France
- Sergi Pujades
- Inria, Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, France
- Edmond Boyer
- Inria, Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, France
3. Xu J, Guan B, Zhao J, Yi B, Li J. LungDepth: Self-Supervised Multi-Frame Monocular Depth Estimation for Bronchoscopy. Int J Med Robot 2025; 21:e70050. [PMID: 39893656] [DOI: 10.1002/rcs.70050]
Abstract
BACKGROUND Bronchoscopy is an essential procedure for conducting lung biopsies in clinical practice. Acquiring depth information from bronchoscopic image sequences is crucial for advancing intelligent bronchoscopy. METHODS A self-supervised multi-frame monocular depth estimation approach for bronchoscopy is constructed. Networks are trained by minimising the photometric reprojection error between the target frame and the reconstructed target frame. An adaptive dual attention module and a details emphasis module are introduced to better capture edge contours and internal details. In addition, the approach is evaluated on a self-made dataset and compared against other established methods. RESULTS Experimental results demonstrate that the proposed method outperforms other self-supervised monocular depth estimation approaches in both quantitative measurement and qualitative analysis. CONCLUSION Our monocular depth estimation approach for bronchoscopy achieves superior performance in terms of error and accuracy, and passes physical model validation, which can facilitate further research into intelligent bronchoscopic procedures.
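The photometric reprojection error driving the self-supervision is not spelled out in the abstract; a common monodepth-style formulation (SSIM plus L1, with placeholder weighting) is sketched below:

```python
import torch
import torch.nn.functional as F

def photometric_loss(target: torch.Tensor, warped: torch.Tensor,
                     alpha: float = 0.85) -> torch.Tensor:
    """Monodepth-style photometric reprojection error between the target
    frame and a source frame warped into the target view. Weighting and
    SSIM settings are placeholders, not the paper's exact choices.

    target, warped: (B, 3, H, W) images in [0, 1].
    """
    l1 = (target - warped).abs().mean(1, keepdim=True)
    # Local statistics over 3x3 windows for a simple SSIM term
    mu_t = F.avg_pool2d(target, 3, 1, 1)
    mu_w = F.avg_pool2d(warped, 3, 1, 1)
    sigma_t = F.avg_pool2d(target * target, 3, 1, 1) - mu_t ** 2
    sigma_w = F.avg_pool2d(warped * warped, 3, 1, 1) - mu_w ** 2
    sigma_tw = F.avg_pool2d(target * warped, 3, 1, 1) - mu_t * mu_w
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    ssim = ((2 * mu_t * mu_w + c1) * (2 * sigma_tw + c2)) / (
        (mu_t ** 2 + mu_w ** 2 + c1) * (sigma_t + sigma_w + c2))
    dssim = ((1 - ssim) / 2).clamp(0, 1).mean(1, keepdim=True)
    return (alpha * dssim + (1 - alpha) * l1).mean()
```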
Affiliation(s)
- Jingsheng Xu
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, Tianjin, China
- Bo Guan
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, Tianjin, China
- Jianchang Zhao
- School of Aerospace Engineering, National Engineering Research Center of Neuromodulation, Tsinghua University, Beijing, China
- Bo Yi
- Department of General Surgery, Third Xiangya Hospital, Central South University, Changsha, China
- Jianmin Li
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, Tianjin, China
4. Li S, Ma F, Yan F, Dong X, Guo Y, Meng J, Liu H. SFNet: Spatial and Frequency Domain Networks for Wide-Field OCT Angiography Retinal Vessel Segmentation. J Biophotonics 2025; 18:e202400420. [PMID: 39523861] [DOI: 10.1002/jbio.202400420]
Abstract
Automatic segmentation of blood vessels in fundus images is important to assist ophthalmologists in diagnosis. However, automatic segmentation of Optical Coherence Tomography Angiography (OCTA) blood vessels has not been fully investigated due to various difficulties, such as vessel complexity. In addition, only a few publicly available OCTA image datasets exist for training and validating segmentation algorithms. To address these issues, we first constructed a wide-field retinal OCTA segmentation dataset, the Retinal Vessels Images in OCTA (REVIO) dataset. Second, we propose a new retinal vessel segmentation network based on spatial and frequency domain networks (SFNet). The proposed model is tested on three benchmark datasets, including REVIO, ROSE, and OCTA-500. The experimental results show superior performance on segmentation tasks compared to representative methods.
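To give a concrete sense of what a frequency-domain branch can do, here is a toy low-pass operation on a feature map; this only illustrates the idea and is not SFNet's actual module:

```python
import torch

def frequency_lowpass(x: torch.Tensor, keep: float = 0.25) -> torch.Tensor:
    """Suppress high frequencies of a feature map with a hard rectangular
    mask in the FFT domain. x: (B, C, H, W); illustrative only."""
    spec = torch.fft.fft2(x, norm="ortho")
    spec = torch.fft.fftshift(spec, dim=(-2, -1))  # center low frequencies
    _, _, h, w = x.shape
    mask = torch.zeros(h, w, device=x.device)
    ch, cw = h // 2, w // 2
    dh, dw = max(1, int(h * keep / 2)), max(1, int(w * keep / 2))
    mask[ch - dh:ch + dh, cw - dw:cw + dw] = 1.0
    spec = spec * mask
    spec = torch.fft.ifftshift(spec, dim=(-2, -1))
    return torch.fft.ifft2(spec, norm="ortho").real
```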
Affiliation(s)
- Sien Li
- School of Computer Science, Qufu Normal University, Rizhao, Shandong, China
- Fei Ma
- School of Computer Science, Qufu Normal University, Rizhao, Shandong, China
- Fen Yan
- Ultrasound Medicine Department, Qufu People's Hospital, Qufu, Shandong, China
- Xiwei Dong
- School of Computer and Big Data Science, Jiujiang University, Jiujiang, Jiangxi, China
- Yanfei Guo
- School of Computer Science, Qufu Normal University, Rizhao, Shandong, China
- Jing Meng
- School of Computer Science, Qufu Normal University, Rizhao, Shandong, China
- Hongjuan Liu
- School of Computer Science, Qufu Normal University, Rizhao, Shandong, China
5. Roh J, Kim J, Lee J. Two-stage deep learning framework for occlusal crown depth image generation. Comput Biol Med 2024; 183:109220. [PMID: 39366141] [DOI: 10.1016/j.compbiomed.2024.109220]
Abstract
The generation of depth images of occlusal dental crowns is complicated by the need for customization in each case. To decrease the workload of skilled dental technicians, various computer vision models have been used to generate realistic occlusal crown depth images with definite crown surface structures that can ultimately be reconstructed into three-dimensional crowns and directly used in patient treatment. However, it has remained difficult for computer vision models to generate the structure of dental crowns in variable positions. In this paper, we propose a two-stage model for generating depth images of occlusal crowns in diverse positions. The model is divided into two parts, segmentation and inpainting, to obtain both shape and surface structure accuracy. The segmentation network focuses on the position and size of the crowns, which allows the model to adapt to diverse targets. The inpainting network, based on a GAN, generates curved structures of the crown surfaces from the target jaw image and a binary mask produced by the segmentation network. The performance of the model is evaluated via quantitative metrics for area detection and pixel-value accuracy. Compared to the baseline model, the proposed method reduced the MSE score from 0.007001 to 0.002618 and increased the DICE score from 0.9333 to 0.9648. This indicates that the model performs better on the binary mask thanks to the added segmentation network, and on the internal structure through the use of the inpainting network. The results also demonstrate an improved ability of the proposed model to restore realistic details compared to other models.
Affiliation(s)
- Junghyun Roh
- Graduate School of Artificial Intelligence, Ulsan National Institute of Science and Technology, 50, UNIST-gil, Ulsan, 44919, Republic of Korea
- Junhwi Kim
- Steinfeld Co., 75 Clarendon Ave, San Francisco, 94114, CA, USA
- Jimin Lee
- Graduate School of Artificial Intelligence, Ulsan National Institute of Science and Technology, 50, UNIST-gil, Ulsan, 44919, Republic of Korea; Department of Nuclear Engineering, Ulsan National Institute of Science and Technology, 50, UNIST-gil, Ulsan, 44919, Republic of Korea
6. Al-masni MA, Al-Shamiri AK, Hussain D, Gu YH. A Unified Multi-Task Learning Model with Joint Reverse Optimization for Simultaneous Skin Lesion Segmentation and Diagnosis. Bioengineering (Basel) 2024; 11:1173. [PMID: 39593832] [PMCID: PMC11592164] [DOI: 10.3390/bioengineering11111173]
Abstract
Classifying and segmenting skin cancer represent pivotal objectives for automated diagnostic systems that utilize dermoscopy images. However, these tasks present significant challenges due to the diverse shape variations of skin lesions and the inherently fuzzy nature of dermoscopy images, including low contrast and the presence of artifacts. Given the robust correlation between the classification of skin lesions and their segmentation, we propose that employing a combined learning method holds the promise of considerably enhancing the performance of both tasks. In this paper, we present a unified multi-task learning strategy that concurrently classifies abnormalities of skin lesions and allows for the joint segmentation of lesion boundaries. This approach integrates an optimization technique known as joint reverse learning, which fosters mutual enhancement through extracting shared features and limiting task dominance across the two tasks. The effectiveness of the proposed method was assessed using two publicly available datasets, ISIC 2016 and PH2, which included melanoma and benign skin cancers. In contrast to the single-task learning strategy, which solely focuses on either classification or segmentation, the experimental findings demonstrated that the proposed network improves the diagnostic capability of skin tumor screening and analysis. The proposed method achieves a significant segmentation performance on skin lesion boundaries, with Dice Similarity Coefficients (DSC) of 89.48% and 88.81% on the ISIC 2016 and PH2 datasets, respectively. Additionally, our multi-task learning approach enhances classification, increasing the F1 score from 78.26% (baseline ResNet50) to 82.07% on ISIC 2016 and from 82.38% to 85.50% on PH2. This work showcases its potential applicability across varied clinical scenarios.
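A generic joint objective for simultaneous segmentation and classification (without the paper's joint reverse optimization, which adds task balancing on top) could be sketched as:

```python
import torch
import torch.nn.functional as F

def multitask_loss(seg_logits, seg_gt, cls_logits, cls_gt,
                   w_seg: float = 1.0, w_cls: float = 1.0):
    """Joint segmentation + classification objective (generic combination).

    seg_logits: (B, 1, H, W); seg_gt: (B, 1, H, W) float in {0, 1};
    cls_logits: (B, K); cls_gt: (B,) class indices.
    """
    probs = torch.sigmoid(seg_logits)
    inter = (probs * seg_gt).sum(dim=(1, 2, 3))
    denom = probs.sum(dim=(1, 2, 3)) + seg_gt.sum(dim=(1, 2, 3))
    dice_loss = 1 - ((2 * inter + 1e-6) / (denom + 1e-6)).mean()
    ce = F.cross_entropy(cls_logits, cls_gt)
    return w_seg * dice_loss + w_cls * ce
```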
Affiliation(s)
- Mohammed A. Al-masni
- Department of Artificial Intelligence and Data Science, College of AI Convergence, Sejong University, Seoul 05006, Republic of Korea
- Abobakr Khalil Al-Shamiri
- School of Computer Science, University of Southampton Malaysia, Iskandar Puteri 79100, Johor, Malaysia
- Dildar Hussain
- Department of Artificial Intelligence and Data Science, College of AI Convergence, Sejong University, Seoul 05006, Republic of Korea
- Yeong Hyeon Gu
- Department of Artificial Intelligence and Data Science, College of AI Convergence, Sejong University, Seoul 05006, Republic of Korea
7. Garbaz A, Oukdach Y, Charfi S, El Ansari M, Koutti L, Salihoun M. MLFA-UNet: A multi-level feature assembly UNet for medical image segmentation. Methods 2024; 232:52-64. [PMID: 39481818] [DOI: 10.1016/j.ymeth.2024.10.010]
Abstract
Medical image segmentation is crucial for accurate diagnosis and treatment in medical image analysis. Among the various methods employed, fully convolutional networks (FCNs) have emerged as a prominent approach for segmenting medical images. Notably, the U-Net architecture and its variants have gained widespread adoption in this domain. This paper introduces MLFA-UNet, an innovative architectural framework aimed at advancing medical image segmentation. MLFA-UNet adopts a U-shaped architecture and integrates two pivotal modules: multi-level feature assembly (MLFA) and multi-scale information attention (MSIA), complemented by a pixel-vanishing (PV) attention mechanism. These modules synergistically enhance the segmentation process, fostering both robustness and segmentation precision. MLFA operates within both the network encoder and decoder, facilitating the extraction of local information crucial for accurately segmenting lesions. Furthermore, the bottleneck MSIA module replaces stacking modules, thereby expanding the receptive field and augmenting feature diversity, fortified by the PV attention mechanism. These integrated mechanisms work together to boost segmentation performance by effectively capturing both detailed local features and a broader range of contextual information, enhancing both accuracy and resilience in identifying lesions. To assess the versatility of the network, we evaluated MLFA-UNet across a range of medical image segmentation datasets encompassing diverse imaging modalities, including wireless capsule endoscopy (WCE), colonoscopy, and dermoscopic images. Our results consistently demonstrate that MLFA-UNet outperforms state-of-the-art algorithms, achieving Dice coefficients of 91.42%, 82.43%, 90.8%, and 88.68% for the MICCAI 2017 (Red Lesion), ISIC 2017, PH2, and CVC-ClinicDB datasets, respectively.
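The Dice coefficient reported above is the standard overlap measure between a predicted mask and the ground truth:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, gt: np.ndarray,
                     eps: float = 1e-7) -> float:
    """Standard Dice similarity between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return float((2.0 * inter + eps) / (pred.sum() + gt.sum() + eps))
```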
Affiliation(s)
- Anass Garbaz
- Laboratory of Computer Systems and Vision, Faculty of Science, Ibn Zohr University, Agadir, 80000, Morocco
- Yassine Oukdach
- Laboratory of Computer Systems and Vision, Faculty of Science, Ibn Zohr University, Agadir, 80000, Morocco
- Said Charfi
- Laboratory of Computer Systems and Vision, Faculty of Science, Ibn Zohr University, Agadir, 80000, Morocco
- Mohamed El Ansari
- Informatics and Applications Laboratory, Department of Computer Science, Faculty of Sciences, Moulay Ismail University, Meknes, 50000, Morocco
- Lahcen Koutti
- Laboratory of Computer Systems and Vision, Faculty of Science, Ibn Zohr University, Agadir, 80000, Morocco
- Mouna Salihoun
- Faculty of Medicine and Pharmacy, Mohammed V University, Rabat, 10100, Morocco
8. Li B, Wang J, Wang B, Shao Z, Li W, Huang J, Li P. BMCS-Net: A bi-directional multi-scale cascaded segmentation network based on transformer-guided feature aggregation for medical images. Comput Biol Med 2024; 180:108939. [PMID: 39079413] [DOI: 10.1016/j.compbiomed.2024.108939]
Abstract
Convolutional neural networks (CNNs) show great potential in medical image segmentation tasks and can provide a reliable basis for disease diagnosis and clinical research. However, CNNs exhibit general limitations in modeling explicit long-range relations, and existing cures, which resort to building deep encoders along with aggressive downsampling operations, lead to loss of localized details. The Transformer has a naturally excellent ability to model the global features and long-range correlations of the input information, which is strongly complementary to the inductive bias of CNNs. In this paper, a novel Bi-directional Multi-scale Cascaded Segmentation Network, BMCS-Net, is proposed to improve the performance of medical segmentation tasks by aggregating features obtained from Transformer and CNN branches. Specifically, a novel feature integration technique, termed the Two-stream Cascaded Feature Aggregation (TCFA) module, is designed to fuse features in the two-stream branches and solve the problem of gradual dilution of global information in the network. Besides, a Multi-Scale Expansion-Aware (MSEA) module based on convolutional feature perception and expansion is introduced to capture context information and further compensate for the loss of details. Extensive experiments demonstrate that BMCS-Net performs excellently on both skin and polyp segmentation datasets.
Affiliation(s)
- Bicao Li
- School of Electronic and Information Engineering, Zhongyuan University of Technology, Zhengzhou, 450007, China
- Jing Wang
- School of Electronic and Information Engineering, Zhongyuan University of Technology, Zhengzhou, 450007, China
- Bei Wang
- University Infirmary, Zhongyuan University of Technology, Zhengzhou, 450007, China
- Zhuhong Shao
- College of Information Engineering, Capital Normal University, Beijing, 100048, China
- Wei Li
- School of Electronic and Information Engineering, Zhongyuan University of Technology, Zhengzhou, 450007, China
- Jie Huang
- School of Electronic and Information Engineering, Zhongyuan University of Technology, Zhengzhou, 450007, China
- Panpan Li
- School of Electronic and Information Engineering, Zhongyuan University of Technology, Zhengzhou, 450007, China
9. Cho JH, Çakmak G, Choi J, Lee D, Yoon HI, Yilmaz B, Schimmel M. Deep learning-designed implant-supported posterior crowns: Assessing time efficiency, tooth morphology, emergence profile, occlusion, and proximal contacts. J Dent 2024; 147:105142. [PMID: 38906454] [DOI: 10.1016/j.jdent.2024.105142]
Abstract
OBJECTIVES To compare implant-supported crowns (ISCs) designed using deep learning (DL) software with those designed by a technician using conventional computer-aided design software. METHODS Twenty resin-based partially edentulous casts (maxillary and mandibular) used for fabricating ISCs were evaluated retrospectively. ISCs were designed using a DL-based method with no modification of the as-generated outcome (DB), a DL-based method with further optimization by a dental technician (DM), and a conventional computer-aided design method by a technician (NC). Time efficiency, crown contour, occlusal table area, cusp angle, cusp height, emergence profile angle, occlusal contacts, and proximal contacts were compared among groups. Depending on the distribution of measured data, various statistical methods were used for comparative analyses with a significance level of 0.05. RESULTS ISCs in the DB group showed a significantly higher time efficiency than those in the DM and NC groups (P ≤ 0.001). ISCs in the DM group exhibited significantly smaller volume deviations than those in the DB group when superimposed on ISCs in the NC group (DB-NC vs. DM-NC pairs, P ≤ 0.008). Except for the number and intensity of occlusal contacts (P ≤ 0.004), ISCs in the DB and DM groups had occlusal table areas, cusp angles, cusp heights, proximal contact intensities, and emergence profile angles similar to those in the NC group (P ≥ 0.157). CONCLUSIONS A DL-based method can be beneficial for designing posterior ISCs in terms of time efficiency, occlusal table area, cusp angle, cusp height, proximal contact, and emergence profile, similar to the conventional human-based method. CLINICAL SIGNIFICANCE A deep learning-based design method can achieve clinically acceptable functional properties of posterior ISCs. However, further optimization by a technician could improve specific outcomes, such as the crown contour or emergence profile angle.
Affiliation(s)
- Jun-Ho Cho
- Department of Prosthodontics, Seoul National University Dental Hospital, Seoul, Republic of Korea
- Gülce Çakmak
- Department of Reconstructive Dentistry and Gerodontology, School of Dental Medicine, University of Bern, Bern, Switzerland
- Jinhyeok Choi
- Department of Biomedical Sciences, Seoul National University, Seoul, Republic of Korea
- Dongwook Lee
- School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
- Hyung-In Yoon
- Department of Reconstructive Dentistry and Gerodontology, School of Dental Medicine, University of Bern, Bern, Switzerland; Department of Prosthodontics, School of Dentistry and Dental Research Institute, Seoul National University, Seoul, Republic of Korea
- Burak Yilmaz
- Department of Reconstructive Dentistry and Gerodontology, School of Dental Medicine, University of Bern, Bern, Switzerland; Department of Restorative, Preventive and Pediatric Dentistry, School of Dental Medicine, University of Bern, Bern, Switzerland; Division of Restorative and Prosthetic Dentistry, The Ohio State University, Columbus, OH, United States
- Martin Schimmel
- Department of Reconstructive Dentistry and Gerodontology, School of Dental Medicine, University of Bern, Bern, Switzerland; Division of Gerodontology and Removable Prosthodontics, University Clinics of Dental Medicine, University of Geneva, Geneva, Switzerland
10. Nie T, Zhao Y, Yao S. ELA-Net: An Efficient Lightweight Attention Network for Skin Lesion Segmentation. Sensors (Basel) 2024; 24:4302. [PMID: 39001081] [PMCID: PMC11243870] [DOI: 10.3390/s24134302]
Abstract
In clinical conditions limited by equipment, attaining lightweight skin lesion segmentation is pivotal, as it facilitates the integration of the model into diverse medical devices, thereby enhancing operational efficiency. However, the lightweight design of a model may degrade its accuracy, especially when dealing with complex images such as skin lesion images with irregular regions, blurred boundaries, and oversized boundaries. To address these challenges, we propose an efficient lightweight attention network (ELANet) for the skin lesion segmentation task. In ELANet, the two different attention mechanisms of the bilateral residual module (BRM) provide complementary information, enhancing sensitivity to features in the spatial and channel dimensions, respectively, and multiple BRMs are then stacked for efficient feature extraction of the input information. In addition, the network acquires global information and improves segmentation accuracy by putting feature maps of different scales through multi-scale attention fusion (MAF) operations. Finally, we evaluate the performance of ELANet on three publicly available datasets, ISIC2016, ISIC2017, and ISIC2018, and the experimental results show that our algorithm achieves mIoU scores of 89.87%, 81.85%, and 82.87% on the three datasets with only 0.459 M parameters, an excellent balance between accuracy and lightness that is superior to many existing segmentation methods.
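The mIoU figures quoted above follow the standard per-class intersection-over-union average; a reference implementation (per-benchmark averaging details may differ) is:

```python
import numpy as np

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    """Mean intersection-over-union over classes (standard definition)."""
    ious = []
    for c in range(num_classes):
        p, g = pred == c, gt == c
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue  # class absent in both prediction and ground truth
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious)) if ious else 0.0
```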
Affiliation(s)
- Tianyu Nie
- School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China
- Yishi Zhao
- School of Computer Science, China University of Geosciences, Wuhan 430074, China
- Engineering Research Center of Natural Resource Information Management and Digital Twin Engineering Software, Ministry of Education, Wuhan 430074, China
- Shihong Yao
- School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China
11. Pundhir A, Sagar S, Singh P, Raman B. Echoes of images: multi-loss network for image retrieval in vision transformers. Med Biol Eng Comput 2024; 62:2037-2058. [PMID: 38436836] [DOI: 10.1007/s11517-024-03055-6]
Abstract
This paper introduces a novel approach to enhance content-based image retrieval, validated on two benchmark datasets: ISIC-2017 and ISIC-2018. These datasets comprise skin lesion images that are crucial for innovations in skin cancer diagnosis and treatment. We advocate the use of a pre-trained Vision Transformer (ViT), a relatively uncharted concept in the realm of image retrieval, particularly in medical scenarios. In contrast to the traditionally employed Convolutional Neural Networks (CNNs), our findings suggest that ViT offers a more comprehensive understanding of the image context, essential in medical imaging. We further incorporate a weighted multi-loss function, delving into various losses such as triplet loss, distillation loss, contrastive loss, and cross-entropy loss. Our exploration investigates the most resilient combination of these losses to create a robust multi-loss function, thus enhancing the robustness of the learned feature space and improving precision and recall in the retrieval process. Instead of using all the loss functions, the proposed multi-loss function utilizes the combination of only cross-entropy loss, triplet loss, and distillation loss, and yields improvements of 6.52% and 3.45% in mean average precision on ISIC-2017 and ISIC-2018, respectively. Another innovation in our methodology is a two-branch network strategy, which concurrently boosts image retrieval and classification. Through our experiments, we underscore the effectiveness and the pitfalls of diverse loss configurations in image retrieval. Furthermore, our approach underlines the advantages of retrieval-based classification through majority voting rather than relying solely on the classification head, leading to enhanced prediction for melanoma, the most lethal type of skin cancer. Our results surpass existing state-of-the-art techniques on the ISIC-2017 and ISIC-2018 datasets, improving mean average precision by 1.01% and 4.36%, respectively, emphasizing the efficacy and promise of Vision Transformers paired with our tailor-made weighted loss function, especially in medical contexts. The proposed approach's effectiveness is substantiated through thorough ablation studies and an array of quantitative and qualitative outcomes. To promote reproducibility and support forthcoming research, our source code will be accessible on GitHub.
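The best-performing combination reported above (cross-entropy + triplet + distillation) can be sketched generically as follows; the weights and temperature are placeholders, not the paper's tuned values:

```python
import torch
import torch.nn.functional as F

def retrieval_multi_loss(emb_a, emb_p, emb_n, logits, labels,
                         teacher_logits, w=(1.0, 1.0, 1.0), tau=4.0):
    """Weighted sum of cross-entropy, triplet, and distillation losses.

    emb_a/p/n: (B, D) anchor/positive/negative embeddings;
    logits, teacher_logits: (B, K); labels: (B,). Illustrative weights.
    """
    ce = F.cross_entropy(logits, labels)
    triplet = F.triplet_margin_loss(emb_a, emb_p, emb_n, margin=1.0)
    kd = F.kl_div(F.log_softmax(logits / tau, dim=1),
                  F.softmax(teacher_logits / tau, dim=1),
                  reduction="batchmean") * tau * tau
    return w[0] * ce + w[1] * triplet + w[2] * kd
```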
Affiliation(s)
- Anshul Pundhir
- Department of Computer Science and Engineering, Indian Institute of Technology, Roorkee, 247667, Uttarakhand, India
- Shivam Sagar
- Department of Electrical Engineering, Indian Institute of Technology, Roorkee, 247667, Uttarakhand, India
- Pradeep Singh
- Department of Computer Science and Engineering, Indian Institute of Technology, Roorkee, 247667, Uttarakhand, India
- Balasubramanian Raman
- Department of Computer Science and Engineering, Indian Institute of Technology, Roorkee, 247667, Uttarakhand, India
12. Wang J, Tang Y, Xiao Y, Zhou JT, Fang Z, Yang F. GREnet: Gradually REcurrent Network With Curriculum Learning for 2-D Medical Image Segmentation. IEEE Trans Neural Netw Learn Syst 2024; 35:10018-10032. [PMID: 37022080] [DOI: 10.1109/tnnls.2023.3238381]
Abstract
Medical image segmentation is a vital stage in medical image analysis. Numerous deep-learning methods have emerged to improve the performance of 2-D medical image segmentation, owing to the fast growth of the convolutional neural network. Generally, the manually defined ground truth is used directly to supervise models in the training phase. However, direct supervision by the ground truth often results in ambiguity and distractors as complex challenges appear simultaneously. To alleviate this issue, we propose a gradually recurrent network with curriculum learning, which is supervised by gradual information of the ground truth. The whole model is composed of two independent networks. One is the segmentation network, denoted GREnet, which formulates 2-D medical image segmentation as a temporal task supervised by pixel-level gradual curricula in the training phase. The other is a curriculum-mining network. To a certain degree, the curriculum-mining network provides curricula of increasing difficulty from the ground truth of the training set by progressively uncovering hard-to-segment pixels in a data-driven manner. Given that segmentation is a pixel-level dense-prediction challenge, to the best of our knowledge, this is the first work to treat 2-D medical image segmentation as a temporal task with pixel-level curriculum learning. In GREnet, the naive UNet is adopted as the backbone, while ConvLSTM is used to establish the temporal link between gradual curricula. In the curriculum-mining network, UNet++ supplemented by a transformer is designed to deliver curricula through the outputs of the modified UNet++ at different layers. Experimental results demonstrate the effectiveness of GREnet on seven datasets, i.e., three lesion segmentation datasets in dermoscopic images, an optic disc and cup segmentation dataset and a blood vessel segmentation dataset in retinal images, a breast lesion segmentation dataset in ultrasound images, and a lung segmentation dataset in computed tomography (CT).
13. Zhou Z, Chen Y, He A, Que X, Wang K, Yao R, Li T. NKUT: Dataset and Benchmark for Pediatric Mandibular Wisdom Teeth Segmentation. IEEE J Biomed Health Inform 2024; 28:3523-3533. [PMID: 38557613] [DOI: 10.1109/jbhi.2024.3383222]
Abstract
Germectomy is a common surgery in pediatric dentistry to prevent the potential dangers caused by impacted mandibular wisdom teeth. Segmentation of mandibular wisdom teeth is a crucial step in surgery planning. However, manually segmenting teeth and bones from 3D volumes is time-consuming and may cause delays in treatment. Deep learning-based medical image segmentation methods have demonstrated the potential to reduce the burden of manual annotation, but they still require a lot of well-annotated data for training. In this paper, we first curated a Cone Beam Computed Tomography (CBCT) dataset, NKUT, for the segmentation of pediatric mandibular wisdom teeth; this marks the first publicly available dataset in this domain. Second, we propose a semantic separation scale-specific feature fusion network named WTNet, which introduces two branches to address the teeth and bone segmentation tasks. In WTNet, we design an Input Enhancement (IE) block and a Teeth-Bones Feature Separation (TBFS) block to solve the feature confusion and semantic blur problems in our task. Experimental results suggest that WTNet performs better on NKUT than previous state-of-the-art segmentation methods (such as TransUnet), with a maximum DSC lead of nearly 16%.
14. Su D, Luo J, Fei C. An Efficient and Rapid Medical Image Segmentation Network. IEEE J Biomed Health Inform 2024; 28:2979-2990. [PMID: 38457317] [DOI: 10.1109/jbhi.2024.3374780]
Abstract
Accurate medical image segmentation is an essential part of the medical image analysis process that provides detailed quantitative metrics. In recent years, extensions of classical networks such as UNet have achieved state-of-the-art performance on medical image segmentation tasks. However, the high model complexity of these networks limits their applicability to devices with constrained computational resources. To alleviate this problem, we propose a shallow hierarchical Transformer for medical image segmentation, called SHFormer. By decreasing the number of transformer blocks utilized, the model complexity of SHFormer can be reduced to an acceptable level. To improve the learned attention while keeping the structure lightweight, we propose a spatial-channel connection module. This module separately learns attention in the spatial and channel dimensions of the feature while interconnecting them to produce more focused attention. To keep the decoder lightweight, the MLP-D module is proposed to progressively fuse multi-scale features in which channels are aligned using Multi-Layer Perceptron (MLP) and spatial information is fused by convolutional blocks. We first validated the performance of SHFormer on the ISIC-2018 dataset. Compared to the latest network, SHFormer exhibits comparable performance with 15 times fewer parameters, 30 times lower computational complexity and 5 times higher inference efficiency. To test the generalizability of SHFormer, we introduced the polyp dataset for additional testing. SHFormer achieves comparable segmentation accuracy to the latest network while having lower computational overhead.
15. Hao S, Wang H, Chen R, Liao Q, Ji Z, Lyu T, Zhao L. DTONet: A Lightweight Model for Melanoma Segmentation. Bioengineering (Basel) 2024; 11:390. [PMID: 38671811] [PMCID: PMC11048536] [DOI: 10.3390/bioengineering11040390]
Abstract
With the further development of neural networks, automatic segmentation techniques for melanoma are becoming increasingly mature, especially when hardware resources are abundant, since segmentation accuracy can be improved by increasing the complexity and computational capacity of the model. However, a new problem arises in actual applications: high-end hardware may not be available, especially in hospitals and among the general public, who may have limited computing resources. In response to this situation, this paper proposes a lightweight deep learning network that can achieve high segmentation accuracy with minimal resource consumption. We introduce a network called DTONet (double-tailed octave network), which was specifically designed for this purpose. Its parameter count is only 30,859, roughly 1/256th that of the mainstream UNet model. Despite its reduced complexity, DTONet demonstrates superior accuracy, with an IOU improvement over other similar models. To validate the generalization capability of this model, we conducted tests on the PH2 dataset, and the results still outperformed existing models. The proposed DTONet network therefore exhibits excellent generalization ability with outstanding performance for its size.
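The parameter count the abstract emphasizes (30,859 for DTONet) is typically measured as the total number of trainable tensor elements, for example:

```python
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    """Number of trainable parameters in a model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Example with a tiny stand-in model (not DTONet itself): prints 233
print(count_parameters(nn.Sequential(nn.Conv2d(3, 8, 3), nn.Conv2d(8, 1, 1))))
```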
Affiliation(s)
- Shengnan Hao
- Hebei Key Laboratory of Industrial Intelligent Perception, North China University of Science and Technology, Tangshan 063210, China
- Hongzan Wang
- Hebei Key Laboratory of Industrial Intelligent Perception, North China University of Science and Technology, Tangshan 063210, China
- Rui Chen
- Changgeng Hospital, Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
- Qinping Liao
- Changgeng Hospital, Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
- Zhanlin Ji
- Hebei Key Laboratory of Industrial Intelligent Perception, North China University of Science and Technology, Tangshan 063210, China
- College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou 311300, China
- Tao Lyu
- Changgeng Hospital, Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
- Li Zhao
- Beijing National Research Center for Information Science and Technology, Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
16. Du H, Wang J, Liu M, Wang Y, Meijering E. SwinPA-Net: Swin Transformer-Based Multiscale Feature Pyramid Aggregation Network for Medical Image Segmentation. IEEE Trans Neural Netw Learn Syst 2024; 35:5355-5366. [PMID: 36121961] [DOI: 10.1109/tnnls.2022.3204090]
Abstract
The precise segmentation of medical images is one of the key challenges in pathology research and clinical practice. However, many medical image segmentation tasks suffer from large differences between different types of lesions and from similarity in shape and color between lesions and surrounding tissues, which seriously hinders improvement of segmentation accuracy. In this article, a novel method called Swin Pyramid Aggregation network (SwinPA-Net) is proposed by combining two designed modules with the Swin Transformer to learn more powerful and robust features. The two modules, named the dense multiplicative connection (DMC) module and the local pyramid attention (LPA) module, are proposed to aggregate the multiscale context information of medical images. The DMC module cascades multiscale semantic feature information through dense multiplicative feature fusion, which minimizes the interference of shallow background noise to improve feature expression and addresses the problem of excessive variation in lesion size and type. Moreover, the LPA module guides the network to focus on the region of interest by merging global and local attention, which helps address the similarity between lesions and surrounding tissues. The proposed network is evaluated on two public benchmark datasets for the polyp segmentation task and the skin lesion segmentation task, as well as a clinical private dataset for the laparoscopic image segmentation task. Compared with existing state-of-the-art (SOTA) methods, SwinPA-Net achieves the most advanced performance and outperforms the second-best method on the mean Dice score by 1.68%, 0.8%, and 1.2% on the three tasks, respectively.
17. Luo X, Zhang H, Huang X, Gong H, Zhang J. DBNet-SI: Dual branch network of shift window attention and inception structure for skin lesion segmentation. Comput Biol Med 2024; 170:108090. [PMID: 38320341] [DOI: 10.1016/j.compbiomed.2024.108090]
Abstract
The U-shaped convolutional neural network (CNN) has attained remarkable achievements in the segmentation of skin lesions. However, given the inherent locality of convolution, this architecture cannot effectively capture long-range pixel dependencies and multiscale global contextual information. Moreover, repeated convolution and downsampling operations can readily result in the omission of intricate local fine-grained details. In this paper, we propose a U-shaped network (DBNet-SI) equipped with a dual-branch module that combines shift window attention and inception structures. First, we propose the dual-branch module combining shift window attention and inception structures (MSI) to better capture multiscale global contextual information and long-range pixel dependencies. Specifically, we devise a cross-branch bidirectional interaction module within the MSI module to enable information complementarity between the two branches in the channel and spatial dimensions. Therefore, MSI is capable of extracting distinguishing and comprehensive features to accurately identify the skin lesion boundaries. Second, we devise a progressive feature enhancement and information compensation module (PFEIC), which progressively compensates for fine-grained features through reconstructed skip connections and integrated global context attention modules. The experimental results show the superior segmentation performance of DBNet-SI compared with other deep learning models on the ISIC2017 and ISIC2018 datasets. Ablation studies demonstrate that our model can effectively extract rich multiscale global contextual information and compensate for the loss of local details.
Affiliation(s)
- Xuqiong Luo
- School of Mathematics and Statistics, Changsha University of Science and Technology, Changsha 410114, China
- Hao Zhang
- School of Mathematics and Statistics, Changsha University of Science and Technology, Changsha 410114, China
- Xiaofei Huang
- School of Mathematics and Statistics, Changsha University of Science and Technology, Changsha 410114, China
- Hongfang Gong
- School of Mathematics and Statistics, Changsha University of Science and Technology, Changsha 410114, China
- Jin Zhang
- School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China
18. Cho JH, Çakmak G, Yi Y, Yoon HI, Yilmaz B, Schimmel M. Tooth morphology, internal fit, occlusion and proximal contacts of dental crowns designed by deep learning-based dental software: A comparative study. J Dent 2024; 141:104830. [PMID: 38163455] [DOI: 10.1016/j.jdent.2023.104830]
Abstract
OBJECTIVES This study compared the tooth morphology, internal fit, occlusion, and proximal contacts of dental crowns automatically generated via two deep learning (DL)-based dental software systems with those manually designed by an experienced dental technician using conventional software. METHODS Thirty partial arch scans of prepared posterior teeth were used. The crowns were designed using two DL-based methods (AA and AD) and a technician-based method (NC). The crown design outcomes were three-dimensionally compared, focusing on tooth morphology, internal fit, occlusion, and proximal contacts, by calculating the geometric relationship. Statistical analysis utilized the independent t-test, Mann-Whitney test, one-way ANOVA, and Kruskal-Wallis test with post hoc pairwise comparisons (α = 0.05). RESULTS The AA and AD groups, with the NC group as a reference, exhibited no significant tooth morphology discrepancies across entire external or occlusal surfaces. The AD group exhibited higher root mean square and positive average values on the axial surface (P < .05). The AD and NC groups exhibited a better internal fit than the AA group (P < .001). The cusp angles were similar across all groups (P = .065). The NC group yielded more occlusal contact points than the AD group (P = .006). Occlusal and proximal contact intensities varied among the groups (both P < .001). CONCLUSIONS Crowns designed by using both DL-based software programs exhibited similar morphologies on the occlusal and axial surfaces; however, they differed in internal fit, occlusion, and proximal contacts. Their overall performance was clinically comparable to that of the technician-based method in terms of the internal fit and number of occlusal contact points. CLINICAL SIGNIFICANCE DL-based dental software for crown design can streamline the digital workflow in restorative dentistry, ensuring clinically acceptable outcomes on tooth morphology, internal fit, occlusion, and proximal contacts. It can minimize the necessity of additional design optimization by a dental technician.
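The root mean square (RMS) and positive-average deviation values discussed above are simple statistics over signed surface distances obtained from 3D superimposition; given precomputed distances, they could be computed as:

```python
import numpy as np

def rms_deviation(distances: np.ndarray) -> tuple:
    """RMS and positive-average of signed point-to-surface deviations.

    `distances` holds signed distances from one crown surface to the
    reference; the study obtained these via superimposition software.
    """
    rms = float(np.sqrt(np.mean(distances ** 2)))
    pos = distances[distances > 0]
    pos_avg = float(pos.mean()) if pos.size else 0.0
    return rms, pos_avg
```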
Affiliation(s)
- Jun-Ho Cho
- Department of Prosthodontics, Seoul National University Dental Hospital, Seoul, Republic of Korea
- Gülce Çakmak
- Department of Reconstructive Dentistry and Gerodontology, School of Dental Medicine, University of Bern, Bern, Switzerland
- Yuseung Yi
- Department of Prosthodontics, Seoul National University Dental Hospital, Seoul, Republic of Korea
- Hyung-In Yoon
- Department of Reconstructive Dentistry and Gerodontology, School of Dental Medicine, University of Bern, Bern, Switzerland; Department of Prosthodontics, School of Dentistry and Dental Research Institute, Seoul National University, Seoul, Republic of Korea
- Burak Yilmaz
- Department of Reconstructive Dentistry and Gerodontology, School of Dental Medicine, University of Bern, Bern, Switzerland; Department of Restorative, Preventive and Pediatric Dentistry, School of Dental Medicine, University of Bern, Bern, Switzerland; Division of Restorative and Prosthetic Dentistry, The Ohio State University, Columbus, OH, USA
- Martin Schimmel
- Department of Reconstructive Dentistry and Gerodontology, School of Dental Medicine, University of Bern, Bern, Switzerland
19. Zuo Q, Shen Y, Zhong N, Chen CLP, Lei B, Wang S. Alzheimer's Disease Prediction via Brain Structural-Functional Deep Fusing Network. IEEE Trans Neural Syst Rehabil Eng 2023; 31:4601-4612. [PMID: 37971911] [DOI: 10.1109/tnsre.2023.3333952]
Abstract
Fusing structural-functional images of the brain has shown great potential to analyze the deterioration of Alzheimer's disease (AD). However, it is a big challenge to effectively fuse the correlated and complementary information from multimodal neuroimages. In this work, a novel model termed cross-modal transformer generative adversarial network (CT-GAN) is proposed to effectively fuse the functional and structural information contained in functional magnetic resonance imaging (fMRI) and diffusion tensor imaging (DTI). The CT-GAN can learn topological features and generate multimodal connectivity from multimodal imaging data in an efficient end-to-end manner. Moreover, the swapping bi-attention mechanism is designed to gradually align common features and effectively enhance the complementary features between modalities. By analyzing the generated connectivity features, the proposed model can identify AD-related brain connections. Evaluations on the public ADNI dataset show that the proposed CT-GAN can dramatically improve prediction performance and detect AD-related brain regions effectively. The proposed model also provides new insights into detecting AD-related abnormal neural circuits.
20. Cho JH, Yi Y, Choi J, Ahn J, Yoon HI, Yilmaz B. Time efficiency, occlusal morphology, and internal fit of anatomic contour crowns designed by dental software powered by generative adversarial network: A comparative study. J Dent 2023; 138:104739. [PMID: 37804938] [DOI: 10.1016/j.jdent.2023.104739]
Abstract
OBJECTIVES To evaluate the time efficiency, occlusal morphology, and internal fit of dental crowns designed using generative adversarial network (GAN)-based dental software compared to conventional dental software. METHODS Thirty datasets of partial arch scans for prepared posterior teeth were analyzed. Each crown was designed on each abutment using GAN-based software (AI) and conventional dental software (non-AI). The AI and non-AI groups were compared in terms of time efficiency by measuring the elapsed work time. The difference in the occlusal morphology of the crowns before and after design optimization and the internal fit of the crown to the prepared abutment were also evaluated by superimposition for each software. Data were analyzed using independent t tests or the Mann-Whitney test (α=.05). RESULTS The working time was significantly less for the AI group than the non-AI group at T1, T5, and T6 (P≤.043). The working time with AI was significantly shorter at T1, T3, T5, and T6 for the intraoral scan (P≤.036). Only at T2 (P≤.001) did the cast scan show a significant difference between the two groups. The crowns in the AI group showed less deviation in occlusal morphology and significantly better internal fit to the abutment than those in the non-AI group (both P<.001). CONCLUSIONS Crowns designed by the AI software showed better outcomes than those designed by the non-AI software in terms of time efficiency, deviation in occlusal morphology, and internal fit. CLINICAL SIGNIFICANCE The GAN-based software showed better time efficiency and less deviation in occlusal morphology during the design process than the conventional software, suggesting a higher probability of optimized outcomes of crown design.
Affiliation(s)
- Jun-Ho Cho
- Department of Prosthodontics, Seoul National University Dental Hospital, Seoul, Republic of Korea
- Yuseung Yi
- Department of Prosthodontics, Seoul National University Dental Hospital, Seoul, Republic of Korea
- Jinhyeok Choi
- Department of Biomedical Sciences, Seoul National University, Seoul, Republic of Korea
- Junseong Ahn
- Department of Computer Science, Korea University, Seoul, Republic of Korea
- Hyung-In Yoon
- Department of Prosthodontics, School of Dentistry and Dental Research Institute, Seoul National University, Seoul, Republic of Korea; Department of Reconstructive Dentistry and Gerodontology, School of Dental Medicine, University of Bern, Bern, Switzerland
- Burak Yilmaz
- Department of Reconstructive Dentistry and Gerodontology, School of Dental Medicine, University of Bern, Bern, Switzerland; Department of Restorative, Preventive and Pediatric Dentistry, School of Dental Medicine, University of Bern, Bern, Switzerland; Division of Restorative and Prosthetic Dentistry, The Ohio State University, Columbus, Ohio, United States
21. Wang K, Li Z, Wang H, Liu S, Pan M, Wang M, Wang S, Song Z. Improving brain tumor segmentation with anatomical prior-informed pre-training. Front Med (Lausanne) 2023; 10:1211800. [PMID: 37771979] [PMCID: PMC10525322] [DOI: 10.3389/fmed.2023.1211800]
Abstract
Introduction Precise delineation of glioblastoma in multi-parameter magnetic resonance images is pivotal for neurosurgery and subsequent treatment monitoring. Transformer models have shown promise in brain tumor segmentation, but their efficacy heavily depends on a substantial amount of annotated data. To address the scarcity of annotated data and improve model robustness, self-supervised learning methods using masked autoencoders have been devised. Nevertheless, these methods have not incorporated the anatomical priors of brain structures. Methods This study proposed an anatomical prior-informed masking strategy to enhance the pre-training of masked autoencoders, which combines data-driven reconstruction with anatomical knowledge. We investigate the likelihood of tumor presence in various brain structures, and this information is then utilized to guide the masking procedure. Results Compared with random masking, our method enables the pre-training to concentrate on regions that are more pertinent to downstream segmentation. Experiments conducted on the BraTS21 dataset demonstrate that our proposed method surpasses the performance of state-of-the-art self-supervised learning techniques. It enhances brain tumor segmentation in terms of both accuracy and data efficiency. Discussion Tailored mechanisms designed to extract valuable information from extensive data could enhance computational efficiency and performance, resulting in increased precision. Integrating anatomical priors with vision approaches remains a promising direction.
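A minimal sketch of prior-informed masking, i.e., sampling patches to mask in proportion to an anatomical prior rather than uniformly (the paper's actual procedure is more involved), might look like this:

```python
import torch

def prior_informed_mask(prior: torch.Tensor, mask_ratio: float = 0.75) -> torch.Tensor:
    """Sample patch indices to mask with probability proportional to an
    anatomical prior (e.g., per-patch tumor likelihood).

    prior: (N,) non-negative weights over N patches.
    Returns a boolean mask of shape (N,) with True = masked.
    """
    n_mask = int(round(mask_ratio * prior.numel()))
    idx = torch.multinomial(prior / prior.sum(), n_mask, replacement=False)
    mask = torch.zeros_like(prior, dtype=torch.bool)
    mask[idx] = True
    return mask

# Example: patches overlapping likely tumor regions get masked more often.
print(prior_informed_mask(torch.tensor([0.1, 0.1, 0.8, 0.9, 0.1, 0.2]), 0.5))
```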
Affiliation(s)
- Kang Wang: Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, China; Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai, China
- Zeyang Li: Department of Neurosurgery, Zhongshan Hospital, Fudan University, Shanghai, China
- Haoran Wang: Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, China; Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai, China
- Siyu Liu: Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, China; Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai, China
- Mingyuan Pan: Radiation Oncology Center, Huashan Hospital, Fudan University, Shanghai, China
- Manning Wang: Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, China; Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai, China
- Shuo Wang: Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, China; Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai, China
- Zhijian Song: Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, China; Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai, China
|
22
|
Arshad S, Amjad T, Hussain A, Qureshi I, Abbas Q. Dermo-Seg: ResNet-UNet Architecture and Hybrid Loss Function for Detection of Differential Patterns to Diagnose Pigmented Skin Lesions. Diagnostics (Basel) 2023; 13:2924. [PMID: 37761291 PMCID: PMC10527859 DOI: 10.3390/diagnostics13182924] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2023] [Revised: 08/29/2023] [Accepted: 09/11/2023] [Indexed: 09/29/2023] Open
Abstract
Convolutional neural network (CNN) models have been extensively applied to skin lesion segmentation due to their information discrimination capabilities. However, CNNs struggle to capture connections between long-range contexts when extracting deep semantic features from lesion images, resulting in a semantic gap that causes segmentation distortion in skin lesions. This makes it difficult to detect differential structures such as pigment networks, globules, streaks, negative networks, and milia-like cysts. To resolve these issues, we have proposed a semantic-based segmentation approach (Dermo-Seg) to detect differential structures of lesions using a UNet model with a transfer-learning-based ResNet-50 architecture and a hybrid loss function. The Dermo-Seg model uses the ResNet-50 backbone as the encoder of the UNet model. We have applied a combination of focal Tversky loss and IOU loss to handle the dataset's highly imbalanced class ratio; this hybrid loss addresses the pixel-level data imbalance within each class. The dataset was acquired from various sources, such as ISIC18, ISBI17, and HAM10000, to evaluate the Dermo-Seg model. The obtained results show that the proposed model performs well compared with existing models: it achieves a mean IOU score of 0.53 for streaks, 0.67 for pigment networks, 0.66 for globules, 0.58 for negative networks, and 0.53 for milia-like cysts, and reaches 96.4% overall on the IOU index. Our Dermo-Seg system thus improves the IOU index compared with the most recent networks.
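To make the hybrid loss concrete, the following is a hedged PyTorch sketch of a focal Tversky term combined with an IOU term; the weighting scheme and hyperparameter values are assumptions, not the paper's reported settings.

import torch

def hybrid_loss(pred, target, alpha=0.7, beta=0.3, gamma=0.75, w=0.5, eps=1e-6):
    # pred: sigmoid probabilities, target: binary mask (same shape).
    p, t = pred.flatten(), target.flatten()
    tp = (p * t).sum()
    fp = (p * (1 - t)).sum()
    fn = ((1 - p) * t).sum()
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    focal_tversky = (1 - tversky) ** gamma        # down-weights easy examples
    iou_loss = 1 - (tp + eps) / (tp + fp + fn + eps)
    return w * focal_tversky + (1 - w) * iou_loss

loss = hybrid_loss(torch.rand(2, 1, 64, 64),
                   torch.randint(0, 2, (2, 1, 64, 64)).float())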
Affiliation(s)
- Sannia Arshad: Department of Computer Science, Faculty of Basic and Applied Science, International Islamic University, Islamabad 44000, Pakistan
- Tehmina Amjad: Department of Computer Science, Faculty of Basic and Applied Science, International Islamic University, Islamabad 44000, Pakistan
- Ayyaz Hussain: Department of Computer Science, Quaid e Azam University, Islamabad 44000, Pakistan
- Imran Qureshi: College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
- Qaisar Abbas: College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
|
23
|
Zhou L, Liang L, Sheng X. GA-Net: Ghost convolution adaptive fusion skin lesion segmentation network. Comput Biol Med 2023; 164:107273. [PMID: 37562327 DOI: 10.1016/j.compbiomed.2023.107273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2023] [Revised: 06/30/2023] [Accepted: 07/16/2023] [Indexed: 08/12/2023]
Abstract
Automatic segmentation of skin lesions is a pivotal task in computer-aided diagnosis, playing a crucial role in the early detection and treatment of skin cancer. Despite the existence of numerous deep learning-based segmentation methods, lesion feature extraction during segmentation remains inadequate; consequently, skin lesion image segmentation continues to face challenges with missing detail and inaccurate delineation of the lesion region. In this paper, we propose a ghost convolution adaptive fusion network for skin lesion segmentation. First, the network incorporates a ghost module in place of the ordinary convolution layer, generating a rich skin lesion feature map for comprehensive target feature extraction. Subsequently, the network employs an adaptive fusion module and a bilateral attention module to connect the encoding and decoding layers, facilitating the integration of shallow and deep network information. Moreover, multi-level output patterns are used for pixel prediction: layer feature fusion effectively combines output features of different scales, thus improving image segmentation accuracy. The proposed network was extensively evaluated on three publicly available datasets: ISIC2016, ISIC2017, and ISIC2018. The experimental results demonstrated accuracies of 96.42%, 94.07%, and 95.03%, and kappa coefficients of 90.41%, 81.08%, and 86.96%, respectively. The overall performance of our network surpassed that of existing networks. Simulation experiments further revealed that the ghost convolution adaptive fusion network exhibited superior segmentation results for skin lesion images, offering new possibilities for the diagnosis of skin diseases.
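For readers unfamiliar with ghost convolution, this is a minimal PyTorch sketch of a ghost module in the spirit of the GhostNet design this paper builds on: a small set of ordinary convolutions produces intrinsic maps, and cheap depthwise convolutions generate the remaining "ghost" maps. All names and defaults are illustrative.

import torch
import torch.nn as nn

class GhostModule(nn.Module):
    def __init__(self, in_ch, out_ch, ratio=2, kernel=1, cheap_kernel=3):
        super().__init__()
        primary = out_ch // ratio            # intrinsic feature maps
        cheap = out_ch - primary             # ghost maps, cheap to compute
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary, kernel, padding=kernel // 2, bias=False),
            nn.BatchNorm2d(primary), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            nn.Conv2d(primary, cheap, cheap_kernel, padding=cheap_kernel // 2,
                      groups=primary, bias=False),  # depthwise = cheap
            nn.BatchNorm2d(cheap), nn.ReLU(inplace=True))

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

print(GhostModule(3, 16)(torch.randn(1, 3, 32, 32)).shape)  # (1, 16, 32, 32)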
Affiliation(s)
- Longsong Zhou: School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou, Jiangxi, 341000, China; Jinguan Copper Branch of Tongling Nonferrous Metals Group Co., Ltd., Tongling, Anhui, 244100, China
- Liming Liang: School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou, Jiangxi, 341000, China
- Xiaoqi Sheng: School of Computer Science and Engineering, South China University of Technology, Guangzhou, Guangdong, 510006, China
|
24
|
Chen Y, Wang T, Tang H, Zhao L, Zhang X, Tan T, Gao Q, Du M, Tong T. CoTrFuse: a novel framework by fusing CNN and transformer for medical image segmentation. Phys Med Biol 2023; 68:175027. [PMID: 37605997 DOI: 10.1088/1361-6560/acede8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2023] [Accepted: 08/07/2023] [Indexed: 08/23/2023]
Abstract
Medical image segmentation is a crucial and intricate process in medical image processing and analysis. With the advancements in artificial intelligence, deep learning techniques have been widely used in recent years for medical image segmentation; one such technique is the U-Net framework, based on U-shaped convolutional neural networks (CNNs), and its variants. However, these methods have limitations in capturing global and long-range semantic information because of the restricted receptive field intrinsic to the convolution operation. Transformers are attention-based models with excellent global modeling capabilities, but their ability to acquire local information is limited. To address this, we propose a network that combines the strengths of both CNNs and Transformers, called CoTrFuse. The proposed CoTrFuse network uses EfficientNet and Swin Transformer as dual encoders, and a Swin Transformer and CNN fusion module fuses the features of both branches before the skip connection structure. We evaluated the proposed network on two datasets: the ISIC-2017 challenge dataset and the COVID-QU-Ex dataset. Our experimental results demonstrate that CoTrFuse outperforms several state-of-the-art segmentation methods, indicating its superiority in medical image segmentation. The codes are available at https://github.com/BinYCn/CoTrFuse.
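A toy PyTorch sketch of fusing same-resolution CNN and transformer feature maps before a skip connection, in the spirit of the fusion described above; the concatenate-project-gate design is an assumption, not CoTrFuse's actual module.

import torch
import torch.nn as nn

class BranchFusion(nn.Module):
    # Fuse same-resolution CNN and transformer feature maps.
    def __init__(self, cnn_ch, trans_ch, out_ch):
        super().__init__()
        self.project = nn.Sequential(
            nn.Conv2d(cnn_ch + trans_ch, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        self.gate = nn.Sequential(               # channel gate on the fused map
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_ch, out_ch, 1), nn.Sigmoid())

    def forward(self, f_cnn, f_trans):
        fused = self.project(torch.cat([f_cnn, f_trans], dim=1))
        return fused * self.gate(fused)          # re-weight before the skip

x = BranchFusion(64, 96, 64)(torch.randn(1, 64, 56, 56), torch.randn(1, 96, 56, 56))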
Affiliation(s)
- Yuanbin Chen: College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China; Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Tao Wang: College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China; Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Hui Tang: College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China; Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Longxuan Zhao: College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China; Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Xinlin Zhang: College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China; Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Tao Tan: Faculty of Applied Science, Macao Polytechnic University, Macao 999078, People's Republic of China
- Qinquan Gao: College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China; Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Min Du: College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China; Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Tong Tong: College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China; Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
|
25
|
Chen J, Liu Q, Wei Z, Luo X, Lai M, Chen H, Liu J, Xu Y, Li J. ETU-Net: efficient Transformer and convolutional U-style connected attention segmentation network applied to endoscopic image of epistaxis. Front Med (Lausanne) 2023; 10:1198054. [PMID: 37636575 PMCID: PMC10450218 DOI: 10.3389/fmed.2023.1198054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2023] [Accepted: 07/19/2023] [Indexed: 08/29/2023] Open
Abstract
Epistaxis is a typical presentation in otolaryngology and the emergency department. When compressive therapy fails, direct nasal cautery becomes necessary, and performing it under nasal endoscopy is strongly recommended whenever possible. Limited by the operator's clinical experience, complications such as recurrence, nasal ulcers, and septal perforation may occur due to insufficient or excessive cautery. Deep learning technology is now widely used in the medical field because of its accurate and efficient recognition ability, yet research on epistaxis remains a blank area. In this work, we first collected and curated the Nasal Bleeding dataset, annotated and confirmed by many clinical specialists, filling a void in this sector. Second, we created ETU-Net, a deep learning model that integrates the strengths of attention convolution with Transformers, overcoming traditional models' difficulty in capturing contextual feature information and their insufficient sequence modeling capability in image segmentation. On the Nasal Bleeding dataset, our proposed model outperforms all other models we tested, achieving an Intersection over Union of 94.57% and an F1-Score of 97.15%. Finally, we summarized effective ways of combining artificial intelligence with medical treatment and tested them on multiple general datasets to demonstrate feasibility. The results show that our method has good domain adaptability and offers a cutting-edge reference for future medical technology development.
Affiliation(s)
- Junyang Chen: College of Information Engineering, Sichuan Agricultural University, Ya'an, China
- Qiurui Liu: Department of Otorhinolaryngology Head and Neck Surgery, Ya'an People's Hospital, Ya'an, China
- Zedong Wei: College of Information Engineering, Sichuan Agricultural University, Ya'an, China
- Xi Luo: Department of Otorhinolaryngology Head and Neck Surgery, Ya'an People's Hospital, Ya'an, China
- Mengzhen Lai: College of Information Engineering, Sichuan Agricultural University, Ya'an, China
- Hongkun Chen: College of Information Engineering, Sichuan Agricultural University, Ya'an, China
- Junlin Liu: College of Information Engineering, Sichuan Agricultural University, Ya'an, China
- Yanhong Xu: Department of Otorhinolaryngology Head and Neck Surgery, Ya'an People's Hospital, Ya'an, China
- Jun Li: Sichuan Key Laboratory of Agricultural Information Engineering, Ya'an, China
|
26
|
Mu J, Lin Y, Meng X, Fan J, Ai D, Chen D, Qiu H, Yang J, Gu Y. M-CSAFN: Multi-Color Space Adaptive Fusion Network for Automated Port-Wine Stains Segmentation. IEEE J Biomed Health Inform 2023; 27:3924-3935. [PMID: 37027679 DOI: 10.1109/jbhi.2023.3247479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/24/2023]
Abstract
Automatic segmentation of port-wine stains (PWS) from clinical images is critical for accurate diagnosis and objective assessment of PWS. However, this is a challenging task due to the color heterogeneity, low contrast, and indistinguishable appearance of PWS lesions. To address these challenges, we propose a novel multi-color space adaptive fusion network (M-CSAFN) for PWS segmentation. First, a multi-branch detection model is constructed based on six typical color spaces, which utilizes rich color texture information to highlight the difference between lesions and surrounding tissues. Second, an adaptive fusion strategy is used to fuse complementary predictions, which addresses the significant differences within lesions caused by color heterogeneity. Third, a structural similarity loss with color information is proposed to measure the detail error between predicted and ground-truth lesions. Additionally, a PWS clinical dataset consisting of 1413 image pairs was established for the development and evaluation of PWS segmentation algorithms. To verify the effectiveness and superiority of the proposed method, we compared it with other state-of-the-art methods on our collected dataset and four publicly available skin lesion datasets (ISIC 2016, ISIC 2017, ISIC 2018, and PH2). The experimental results show that our method achieves remarkable performance on our collected dataset, reaching 92.29% and 86.14% on the Dice and Jaccard metrics, respectively. Comparative experiments on the other datasets also confirmed the reliability and potential capability of M-CSAFN in skin lesion segmentation.
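A hedged sketch of the adaptive fusion idea: per-color-space predictions are combined with learned softmax weights into one map. The real M-CSAFN fusion is more elaborate, and all names here are illustrative.

import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, n_branches=6):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_branches))  # one weight per color space

    def forward(self, preds):                 # preds: (B, n_branches, H, W)
        w = torch.softmax(self.logits, dim=0)  # convex combination of branches
        return (preds * w.view(1, -1, 1, 1)).sum(dim=1, keepdim=True)

fused = AdaptiveFusion(6)(torch.rand(2, 6, 128, 128))  # -> (2, 1, 128, 128)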
|
27
|
Shamshad F, Khan S, Zamir SW, Khan MH, Hayat M, Khan FS, Fu H. Transformers in medical imaging: A survey. Med Image Anal 2023; 88:102802. [PMID: 37315483 DOI: 10.1016/j.media.2023.102802] [Citation(s) in RCA: 186] [Impact Index Per Article: 93.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2022] [Revised: 03/11/2023] [Accepted: 03/23/2023] [Indexed: 06/16/2023]
Abstract
Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest in Transformers, which can capture global context, compared to CNNs with local receptive fields. Inspired by this transition, in this survey we attempt to provide a comprehensive review of the applications of Transformers in medical imaging, covering various aspects ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, restoration, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications, we develop a taxonomy, identify application-specific challenges, provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, including the identification of key challenges, open problems, and promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development in this field, we intend to regularly update the relevant latest papers and their open-source implementations at https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging.
Affiliation(s)
- Fahad Shamshad: MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Salman Khan: MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; CECS, Australian National University, Canberra ACT 0200, Australia
- Syed Waqas Zamir: Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Munawar Hayat: Faculty of IT, Monash University, Clayton VIC 3800, Australia
- Fahad Shahbaz Khan: MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; Computer Vision Laboratory, Linköping University, Sweden
- Huazhu Fu: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore
|
28
|
Mirikharaji Z, Abhishek K, Bissoto A, Barata C, Avila S, Valle E, Celebi ME, Hamarneh G. A survey on deep learning for skin lesion segmentation. Med Image Anal 2023; 88:102863. [PMID: 37343323 DOI: 10.1016/j.media.2023.102863] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2022] [Revised: 02/01/2023] [Accepted: 05/31/2023] [Indexed: 06/23/2023]
Abstract
Skin cancer is a major public health problem that could benefit from computer-aided diagnosis to reduce the burden of this common disease. Skin lesion segmentation from images is an important step toward achieving this goal. However, the presence of natural and artificial artifacts (e.g., hair and air bubbles), intrinsic factors (e.g., lesion shape and contrast), and variations in image acquisition conditions make skin lesion segmentation a challenging task. Recently, various researchers have explored the applicability of deep learning models to skin lesion segmentation. In this survey, we cross-examine 177 research papers that deal with deep learning-based segmentation of skin lesions. We analyze these works along several dimensions, including input data (datasets, preprocessing, and synthetic data generation), model design (architecture, modules, and losses), and evaluation aspects (data annotation requirements and segmentation performance). We discuss these dimensions both from the viewpoint of select seminal works, and from a systematic viewpoint, examining how those choices have influenced current trends, and how their limitations should be addressed. To facilitate comparisons, we summarize all examined works in a comprehensive table as well as an interactive table available online.
Affiliation(s)
- Zahra Mirikharaji: Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
- Kumar Abhishek: Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
- Alceu Bissoto: RECOD.ai Lab, Institute of Computing, University of Campinas, Av. Albert Einstein 1251, Campinas 13083-852, Brazil
- Catarina Barata: Institute for Systems and Robotics, Instituto Superior Técnico, Avenida Rovisco Pais, Lisbon 1049-001, Portugal
- Sandra Avila: RECOD.ai Lab, Institute of Computing, University of Campinas, Av. Albert Einstein 1251, Campinas 13083-852, Brazil
- Eduardo Valle: RECOD.ai Lab, School of Electrical and Computing Engineering, University of Campinas, Av. Albert Einstein 400, Campinas 13083-952, Brazil
- M Emre Celebi: Department of Computer Science and Engineering, University of Central Arkansas, 201 Donaghey Ave., Conway, AR 72035, USA
- Ghassan Hamarneh: Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
|
29
|
Wu H, Huang X, Guo X, Wen Z, Qin J. Cross-Image Dependency Modeling for Breast Ultrasound Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:1619-1631. [PMID: 37018315 DOI: 10.1109/tmi.2022.3233648] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
We present a novel deep network (namely BUSSeg) equipped with both within- and cross-image long-range dependency modeling for automated lesion segmentation from breast ultrasound images, which is a quite daunting task due to (1) the large variation of breast lesions, (2) the ambiguous lesion boundaries, and (3) the existence of speckle noise and artifacts in ultrasound images. Our work is motivated by the fact that most existing methods only focus on modeling within-image dependencies while neglecting cross-image dependencies, which are essential for this task under limited training data and noise. We first propose a novel cross-image dependency module (CDM) with a cross-image contextual modeling scheme and a cross-image dependency loss (CDL) to capture more consistent feature expression and alleviate noise interference. Compared with existing cross-image methods, the proposed CDM has two merits. First, we utilize more complete spatial features instead of commonly used discrete pixel vectors to capture the semantic dependencies between images, mitigating the negative effects of speckle noise and making the acquired features more representative. Second, the proposed CDM includes both intra- and inter-class contextual modeling rather than just extracting homogeneous contextual dependencies. Furthermore, we develop a parallel bi-encoder architecture (PBA) that pairs a Transformer with a convolutional neural network to enhance BUSSeg's capability in capturing within-image long-range dependencies and hence offer richer features to the CDM. We conducted extensive experiments on two representative public breast ultrasound datasets, and the results demonstrate that the proposed BUSSeg consistently outperforms state-of-the-art approaches in most metrics.
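A toy sketch of cross-image dependency modeling: queries from one image attend to keys and values from another, so one image borrows context from the other. This is a simplification of the paper's CDM, using stock PyTorch attention.

import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
fa = torch.randn(1, 32 * 32, 64)   # image A features, flattened to (H*W, C)
fb = torch.randn(1, 32 * 32, 64)   # image B features
fa_enriched, _ = attn(query=fa, key=fb, value=fb)  # A borrows context from B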
|
30
|
Qin C, Zheng B, Zeng J, Chen Z, Zhai Y, Genovese A, Piuri V, Scotti F. Dynamically aggregating MLPs and CNNs for skin lesion segmentation with geometry regularization. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 238:107601. [PMID: 37210926 DOI: 10.1016/j.cmpb.2023.107601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/14/2022] [Revised: 04/24/2023] [Accepted: 05/13/2023] [Indexed: 05/23/2023]
Abstract
BACKGROUND AND OBJECTIVE Melanoma is a highly malignant skin tumor. Accurate segmentation of skin lesions from dermoscopy images is pivotal for computer-aided diagnosis of melanoma. However, blurred lesion boundaries, variable lesion shapes, and other interference factors pose a challenge in this regard. METHODS This work proposes a novel framework called CFF-Net (Cross Feature Fusion Network) for supervised skin lesion segmentation. The encoder of the network includes dual branches, where the CNN branch aims to extract rich local features while the MLP branch establishes both global-spatial-dependencies and global-channel-dependencies for precise delineation of skin lesions. Besides, a feature-interaction module between the two branches is designed to strengthen the feature representation by allowing dynamic exchange of spatial and channel information, so as to retain more spatial details and suppress irrelevant noise. Moreover, an auxiliary prediction task is introduced to learn the global geometric information, highlighting the boundary of the skin lesion. RESULTS Comprehensive experiments using four publicly available skin lesion datasets (i.e., ISIC 2018, ISIC 2017, ISIC 2016, and PH2) indicated that CFF-Net outperformed the state-of-the-art models. In particular, CFF-Net greatly increased the average Jaccard Index score from 79.71% to 81.86% in ISIC 2018, from 78.03% to 80.21% in ISIC 2017, from 82.58% to 85.38% in ISIC 2016, and from 84.18% to 89.71% in PH2 compared with U-Net. Ablation studies demonstrated the effectiveness of each proposed component. Cross-validation experiments in the ISIC 2018 and PH2 datasets verified the generalizability of CFF-Net under different skin lesion data distributions. Finally, comparison experiments using three public datasets demonstrated the superior performance of our model. CONCLUSION The proposed CFF-Net performed well in four public skin lesion datasets, especially for challenging cases with blurred edges of skin lesions and low contrast between skin lesions and background. CFF-Net can be employed for other segmentation tasks with better prediction and more accurate delineation of boundaries.
Affiliation(s)
- Chuanbo Qin: Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen 529020, China
- Bin Zheng: Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen 529020, China
- Junying Zeng: Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen 529020, China
- Zhuyuan Chen: Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen 529020, China
- Yikui Zhai: Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen 529020, China
- Angelo Genovese: Dipartimento di Informatica, Università degli Studi di Milano, 20133 Milano, Italy
- Vincenzo Piuri: Dipartimento di Informatica, Università degli Studi di Milano, 20133 Milano, Italy
- Fabio Scotti: Dipartimento di Informatica, Università degli Studi di Milano, 20133 Milano, Italy
|
31
|
Triplet attention fusion module: A concise and efficient channel attention module for medical image segmentation. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]
|
32
|
Karri M, Annavarapu CSR, Acharya UR. Skin lesion segmentation using two-phase cross-domain transfer learning framework. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 231:107408. [PMID: 36805279 DOI: 10.1016/j.cmpb.2023.107408] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/25/2022] [Revised: 01/31/2023] [Accepted: 02/04/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND AND OBJECTIVE Deep learning (DL) models have been used for medical imaging for a long time but they did not achieve their full potential in the past because of insufficient computing power and scarcity of training data. In recent years, we have seen substantial growth in DL networks because of improved technology and an abundance of data. However, previous studies indicate that even a well-trained DL algorithm may struggle to generalize data from multiple sources because of domain shifts. Additionally, the ineffectiveness of basic data fusion methods, the complexity of segmentation targets, and the low interpretability of current DL models limit their use in clinical decisions. To meet these challenges, we present a new two-phase cross-domain transfer learning system for effective skin lesion segmentation from dermoscopic images. METHODS Our system is based on two significant technical inventions. We examine a two-phase cross-domain transfer learning approach, including model-level and data-level transfer learning, by fine-tuning the system on two datasets, MoleMap and ImageNet. We then present nSknRSUNet, a high-performing DL network, for skin lesion segmentation using broad receptive fields and spatial edge attention feature fusion. We examine the trained model's generalization capabilities on skin lesion segmentation to quantify these two inventions. We cross-examine the model using two skin lesion image datasets, MoleMap and HAM10000, obtained from varied clinical contexts. RESULTS At data-level transfer learning for the HAM10000 dataset, the proposed model obtained 94.63% DSC and 99.12% accuracy. In cross-examination at data-level transfer learning for the MoleMap dataset, the proposed model obtained 93.63% DSC and 97.01% accuracy. CONCLUSION Numerous experiments reveal that our system produces excellent performance and improves upon state-of-the-art methods on both qualitative and quantitative measures.
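As a generic illustration of a two-stage transfer-learning recipe (not the paper's nSknRSUNet pipeline), the following sketch loads ImageNet weights, first trains only a new task head, then unfreezes the whole network for low-learning-rate fine-tuning; torchvision is assumed to be available, and the schedule is a placeholder.

import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)  # ImageNet init
model.fc = nn.Linear(model.fc.in_features, 2)  # new head for the target task

# Stage 1: freeze the backbone and train only the new head on the source set.
for p in model.parameters():
    p.requires_grad = False
for p in model.fc.parameters():
    p.requires_grad = True

# Stage 2: unfreeze everything and fine-tune on the target set at a low LR.
for p in model.parameters():
    p.requires_grad = True
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)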
Affiliation(s)
- Meghana Karri: Department of Computer Science and Engineering, Indian Institute of Technology (Indian School of Mines), Dhanbad, 826004, Jharkhand, India
- Chandra Sekhara Rao Annavarapu: Department of Computer Science and Engineering, Indian Institute of Technology (Indian School of Mines), Dhanbad, 826004, Jharkhand, India
- U Rajendra Acharya: Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, 599489, Singapore; Department of Biomedical Engineering, School of Science and Technology, SUSS University, Singapore; Department of Biomedical Informatics and Medical Engineering, Asia University, Taichung, Taiwan
|
33
|
Du W, Rao N, Yong J, Adjei PE, Hu X, Wang X, Gan T, Zhu L, Zeng B, Liu M, Xu Y. Early gastric cancer segmentation in gastroscopic images using a co-spatial attention and channel attention based triple-branch ResUnet. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 231:107397. [PMID: 36753915 DOI: 10.1016/j.cmpb.2023.107397] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/26/2022] [Revised: 12/15/2022] [Accepted: 02/01/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND AND OBJECTIVE The manual segmentation of early gastric cancer (EGC) lesions in gastroscopic images remains a challenging task because of the diversity of mucosal features, the irregular edges of EGC lesions, and the subtle differences between EGC lesions and healthy background mucosa. Hence, this study proposed an automatic segmentation framework, the co-spatial attention and channel attention based triple-branch ResUnet (CSA-CA-TB-ResUnet), to achieve accurate segmentation of EGC lesions and aid clinical diagnosis and treatment. METHODS The input gastroscopic image sequences of the triple-branch segmentation network CSA-CA-TB-ResUnet are first generated by the designed multi-branch input preprocessing (MBIP) module in order to fully utilize the rich correlation information among multiple gastroscopic images of the same lesion. The proposed CSA-CA-TB-ResUnet then performs the segmentation of EGC lesions, in which the co-spatial attention (CSA) mechanism is designed to activate the spatial location of EGC lesions by leveraging the correlations among multiple gastroscopic images of the same EGC lesion, and the channel attention (CA) mechanism is introduced to extract subtle discriminative features of EGC lesions by capturing the interdependencies between channel features. Finally, two gastroscopic image datasets from digestive endoscopy centers in the southwest and northeast regions of China were collected to validate the performance of the proposed segmentation method. RESULTS The correlation information among gastroscopic images was confirmed to improve the accuracy of EGC segmentation. On an unseen dataset, our EGC segmentation method achieved a Jaccard similarity index (JSI) of 84.54% (95% confidence interval (CI), 83.49%-85.56%), a threshold Jaccard index (TJI) of 81.73% (95% CI, 79.70%-83.61%), a Dice similarity coefficient (DSC) of 91.08% (95% CI, 90.40%-91.76%), and a pixel-wise accuracy (PA) of 91.18% (95% CI, 90.43%-91.87%), which is superior to other state-of-the-art methods. Even on challenging small lesions, the segmentation results of our CSA-CA-TB-ResUnet-based method were consistently and significantly better than those of other state-of-the-art methods. We also compared the segmentation results of our model with the diagnostic accuracy of junior and senior experts; the comparison indicated that our model performed better than the junior expert. CONCLUSIONS This study proposed a novel CSA-CA-TB-ResUnet-based EGC segmentation method with potential for real-time application in improving EGC clinical diagnosis and minimally invasive surgery.
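The channel attention described above resembles squeeze-and-excitation gating; here is a minimal PyTorch sketch of such a block, offered as an illustration rather than the CSA-CA-TB-ResUnet design itself.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),       # squeeze: global stats per channel
            nn.Linear(ch, ch // reduction), nn.ReLU(inplace=True),
            nn.Linear(ch // reduction, ch), nn.Sigmoid())  # excite: per-channel gates

    def forward(self, x):
        w = self.fc(x).view(x.size(0), -1, 1, 1)  # gate in [0, 1] for each channel
        return x * w

y = ChannelAttention(64)(torch.randn(2, 64, 32, 32))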
Affiliation(s)
- Wenju Du: Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Nini Rao: Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Jiahao Yong: Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Prince Ebenezer Adjei: Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Xiaoming Hu: Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Xiaotong Wang: Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Tao Gan: Digestive Endoscopic Center of West China Hospital, Sichuan University, Chengdu 610017, China
- Linlin Zhu: Digestive Endoscopic Center of West China Hospital, Sichuan University, Chengdu 610017, China
- Bing Zeng: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
- Mengyuan Liu: The First Hospital of China Medical University, China Medical University, Shenyang 110001, China
- Yongxue Xu: Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
34
|
Hasan MK, Ahamad MA, Yap CH, Yang G. A survey, review, and future trends of skin lesion segmentation and classification. Comput Biol Med 2023; 155:106624. [PMID: 36774890 DOI: 10.1016/j.compbiomed.2023.106624] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2022] [Revised: 01/04/2023] [Accepted: 01/28/2023] [Indexed: 02/03/2023]
Abstract
The Computer-aided Diagnosis or Detection (CAD) approach for skin lesion analysis is an emerging field of research that has the potential to alleviate the burden and cost of skin cancer screening. Researchers have recently indicated increasing interest in developing such CAD systems, with the intention of providing a user-friendly tool to dermatologists to reduce the challenges encountered or associated with manual inspection. This article aims to provide a comprehensive literature survey and review of a total of 594 publications (356 for skin lesion segmentation and 238 for skin lesion classification) published between 2011 and 2022. These articles are analyzed and summarized in a number of different ways to contribute vital information regarding the methods for the development of CAD systems. These ways include: relevant and essential definitions and theories, input data (dataset utilization, preprocessing, augmentations, and fixing imbalance problems), method configuration (techniques, architectures, module frameworks, and losses), training tactics (hyperparameter settings), and evaluation criteria. We intend to investigate a variety of performance-enhancing approaches, including ensemble and post-processing. We also discuss these dimensions to reveal their current trends based on utilization frequencies. In addition, we highlight the primary difficulties associated with evaluating skin lesion segmentation and classification systems using minimal datasets, as well as the potential solutions to these difficulties. Findings, recommendations, and trends are disclosed to inform future research on developing an automated and robust CAD system for skin lesion analysis.
Affiliation(s)
- Md Kamrul Hasan: Department of Bioengineering, Imperial College London, UK; Department of Electrical and Electronic Engineering (EEE), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh
- Md Asif Ahamad: Department of Electrical and Electronic Engineering (EEE), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh
- Choon Hwai Yap: Department of Bioengineering, Imperial College London, UK
- Guang Yang: National Heart and Lung Institute, Imperial College London, UK; Cardiovascular Research Centre, Royal Brompton Hospital, UK
|
35
|
Li J, Sun W, von Deneen KM, Fan X, An G, Cui G, Zhang Y. MG-Net: Multi-level global-aware network for thymoma segmentation. Comput Biol Med 2023; 155:106635. [PMID: 36791547 DOI: 10.1016/j.compbiomed.2023.106635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2022] [Revised: 01/26/2023] [Accepted: 02/04/2023] [Indexed: 02/10/2023]
Abstract
BACKGROUND AND OBJECTIVE Automatic thymoma segmentation in preoperative contrast-enhanced computed tomography (CECT) images is of great value for diagnosis. Although convolutional neural networks (CNNs) excel in medical image segmentation, they are challenged by thymomas with various shapes, scales, and textures, owing to the intrinsic locality of convolution operations. To overcome this deficit, we built a deep learning network with enhanced global-awareness for thymoma segmentation. METHODS We propose a multi-level global-aware network (MG-Net) for thymoma segmentation, in which multi-level feature interaction and integration are jointly designed to enhance the global-awareness of CNNs. In particular, we design the cross-attention block (CAB) to calculate pixel-wise interactions of multi-level features, resulting in the Global Enhanced Convolution Block, which enables the network to handle various thymomas by strengthening the global-awareness of the encoder. We further devise the Global Spatial Attention Module to integrate coarse- and fine-grain information, enhancing the semantic consistency between the encoder and decoder with CABs. We also develop an Adaptive Attention Fusion Module to adaptively aggregate different semantic-scale features in the decoder and preserve comprehensive details. RESULTS MG-Net has been evaluated against several state-of-the-art models on a self-collected CECT dataset and the NIH Pancreas-CT dataset. Results suggest that all designed components are effective, and that MG-Net has superior segmentation performance and generalization ability over existing models. CONCLUSION Both the qualitative and quantitative experimental results indicate that our MG-Net, with its global-aware ability, can achieve accurate thymoma segmentation and generalizes across different tasks. The code is available at: https://github.com/Leejyuan/MGNet.
Affiliation(s)
- Jingyuan Li: Center for Brain Imaging, School of Life Science and Technology, Xidian University & Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, Xi'an, Shaanxi, 710126, China; International Joint Research Center for Advanced Medical Imaging and Intelligent Diagnosis and Treatment & Xi'an Key Laboratory of Intelligent Sensing and Regulation of Trans-Scale Life Information, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China
- Wenfang Sun: International Joint Research Center for Advanced Medical Imaging and Intelligent Diagnosis and Treatment & Xi'an Key Laboratory of Intelligent Sensing and Regulation of Trans-Scale Life Information, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China; School of Aerospace Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China
- Karen M von Deneen: Center for Brain Imaging, School of Life Science and Technology, Xidian University & Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, Xi'an, Shaanxi, 710126, China; International Joint Research Center for Advanced Medical Imaging and Intelligent Diagnosis and Treatment & Xi'an Key Laboratory of Intelligent Sensing and Regulation of Trans-Scale Life Information, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China
- Xiao Fan: Center for Brain Imaging, School of Life Science and Technology, Xidian University & Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, Xi'an, Shaanxi, 710126, China; International Joint Research Center for Advanced Medical Imaging and Intelligent Diagnosis and Treatment & Xi'an Key Laboratory of Intelligent Sensing and Regulation of Trans-Scale Life Information, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China
- Gang An: Center for Brain Imaging, School of Life Science and Technology, Xidian University & Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, Xi'an, Shaanxi, 710126, China; International Joint Research Center for Advanced Medical Imaging and Intelligent Diagnosis and Treatment & Xi'an Key Laboratory of Intelligent Sensing and Regulation of Trans-Scale Life Information, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China
- Guangbin Cui: Department of Radiology, Tangdu Hospital, Fourth Military Medical University, Xi'an, Shaanxi, 710038, China
- Yi Zhang: Center for Brain Imaging, School of Life Science and Technology, Xidian University & Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, Xi'an, Shaanxi, 710126, China; International Joint Research Center for Advanced Medical Imaging and Intelligent Diagnosis and Treatment & Xi'an Key Laboratory of Intelligent Sensing and Regulation of Trans-Scale Life Information, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China
|
36
|
Qu T, Li X, Wang X, Deng W, Mao L, He M, Li X, Wang Y, Liu Z, Zhang L, Jin Z, Xue H, Yu Y. Transformer guided progressive fusion network for 3D pancreas and pancreatic mass segmentation. Med Image Anal 2023; 86:102801. [PMID: 37028237 DOI: 10.1016/j.media.2023.102801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2022] [Revised: 12/21/2022] [Accepted: 03/22/2023] [Indexed: 04/03/2023]
Abstract
Pancreatic masses are diverse in type, often making their clinical management challenging. This study addresses the segmentation and detection of various types of pancreatic masses while accurately segmenting the pancreas. Although the convolution operation performs well at extracting local details, it has difficulty capturing global representations. To alleviate this limitation, we propose a transformer guided progressive fusion network (TGPFN) that utilizes the global representation captured by the transformer to supplement the long-range dependencies lost by convolution operations at different resolutions. TGPFN is built on a branch-integrated network structure, where the convolutional neural network and transformer branches first perform separate feature extraction in the encoder, and the local and global features are then progressively fused in the decoder. To effectively integrate the information of the two branches, we design a transformer guidance flow to ensure feature consistency, and present a cross-network attention module to capture the channel dependencies. Extensive experiments with nnUNet (3D) show that TGPFN improves mass segmentation (Dice: 73.93% vs. 69.40%) and detection accuracy (detection rate: 91.71% vs. 84.97%) on 416 private CTs, and also obtains performance improvements in mass segmentation (Dice: 43.86% vs. 42.07%) and detection (detection rate: 83.33% vs. 71.74%) on 419 public CTs.
|
37
|
Zhang W, Lu F, Zhao W, Hu Y, Su H, Yuan M. ACCPG-Net: A skin lesion segmentation network with Adaptive Channel-Context-Aware Pyramid Attention and Global Feature Fusion. Comput Biol Med 2023; 154:106580. [PMID: 36716686 DOI: 10.1016/j.compbiomed.2023.106580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2022] [Revised: 01/09/2023] [Accepted: 01/22/2023] [Indexed: 01/26/2023]
Abstract
The computer-aided diagnosis system based on dermoscopic images has played an important role in the clinical treatment of skin lesions. An accurate, efficient, and automatic skin lesion segmentation method is an important auxiliary tool for clinical diagnosis. At present, skin lesion segmentation still faces great challenges. Existing deep-learning-based automatic segmentation methods frequently use convolutional neural networks (CNNs); however, a globally-shared feature re-weighting vector may not be optimal for predicting lesion areas in dermoscopic images. The presence of hairs and spots in some samples aggravates interference between similar categories and reduces segmentation accuracy. To solve this problem, this paper proposes a new deep network for precise skin lesion segmentation based on a U-shape structure. Specifically, two lightweight attention modules, an adaptive channel-context-aware pyramid attention (ACCAPA) module and a global feature fusion (GFF) module, are embedded in the network. The ACCAPA module models the characteristics of the lesion areas by dynamically learning channel information, contextual information, and global structure information. GFF is used for semantic information interaction between encoder and decoder layers at different levels. To validate the effectiveness of the proposed method, we test the performance of ACCPG-Net on several public skin lesion datasets. The results show that our method achieves better segmentation performance compared to other state-of-the-art methods.
Affiliation(s)
- Wenyu Zhang: School of Information Science and Engineering, Lanzhou University, China
- Fuxiang Lu: School of Information Science and Engineering, Lanzhou University, China
- Wei Zhao: School of Information Science and Engineering, Lanzhou University, China
- Yawen Hu: School of Information Science and Engineering, Lanzhou University, China
- Hongjing Su: School of Information Science and Engineering, Lanzhou University, China
- Min Yuan: School of Information Science and Engineering, Lanzhou University, China
|
38
|
Baloi A, Costea C, Gutt R, Balacescu O, Turcu F, Belean B. Hexagonal-Grid-Layout Image Segmentation Using Shock Filters: Computational Complexity Case Study for Microarray Image Analysis Related to Machine Learning Approaches. SENSORS (BASEL, SWITZERLAND) 2023; 23:2582. [PMID: 36904788 PMCID: PMC10007319 DOI: 10.3390/s23052582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/02/2023] [Revised: 02/17/2023] [Accepted: 02/21/2023] [Indexed: 06/18/2023]
Abstract
Hexagonal grid layouts are advantageous in microarray technology; hexagonal grids also appear in many other fields, especially with the rise of new nanostructures and metamaterials, creating a need for image analysis on such structures. This work proposes a shock-filter-based approach driven by mathematical morphology for the segmentation of image objects disposed in a hexagonal grid. The original image is decomposed into a pair of rectangular grids, such that their superposition generates the initial image. Within each rectangular grid, the shock filters are again used to confine the foreground information for each image object into an area of interest. The proposed methodology was successfully applied to microarray spot segmentation, while its generality is underlined by the segmentation results obtained for two other types of hexagonal grid layouts. Assessing segmentation accuracy through quality measures specific to microarray images, such as the mean absolute error and the coefficient of variation, we found high correlations of our computed spot intensity features with the annotated reference values, indicating the reliability of the proposed approach. Moreover, because the shock-filter PDE formalism targets the one-dimensional luminance profile function, the computational complexity of determining the grid is minimized. The order of growth of the computational complexity of our approach is at least one order of magnitude lower than that of state-of-the-art microarray segmentation approaches, ranging from classical to machine learning ones.
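A small numpy illustration of the grid decomposition: the centers of a hexagonally packed layout split cleanly into two axis-aligned rectangular sub-grids (even rows and odd rows) whose superposition reproduces the original lattice. The coordinates are synthetic, and the construction is illustrative rather than the paper's algorithm.

import numpy as np

def hex_centers(rows, cols, pitch=1.0):
    # Centers of a hexagonal grid: odd rows are shifted by half a pitch.
    ys = np.arange(rows) * pitch * np.sqrt(3) / 2
    pts = [(c * pitch + (r % 2) * pitch / 2, ys[r])
           for r in range(rows) for c in range(cols)]
    return np.array(pts)

pts = hex_centers(6, 8)
row_idx = (pts[:, 1] / (np.sqrt(3) / 2)).round().astype(int)
even = pts[row_idx % 2 == 0]   # first rectangular sub-grid
odd = pts[row_idx % 2 == 1]    # second rectangular sub-grid, half-pitch offset
# Each subset is an axis-aligned rectangular grid that 1D shock-filter
# luminance profiles can then segment independently.
assert len(even) + len(odd) == len(pts)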
Affiliation(s)
- Aurel Baloi: Research Center for Integrated Analysis and Territorial Management, University of Bucharest, 4-12 Regina Elisabeta, 030018 Bucharest, Romania; Faculty of Administration and Business, University of Bucharest, 030018 Bucharest, Romania
- Carmen Costea: Department of Mathematics, Faculty of Automation and Computer Science, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Robert Gutt: Center of Advanced Research and Technologies for Alternative Energies, National Institute for Research and Development of Isotopic and Molecular Technologies, 400293 Cluj-Napoca, Romania
- Ovidiu Balacescu: Department of Genetics, Genomics and Experimental Pathology, The Oncology Institute "Prof. Dr. Ion Chiricuta", 400015 Cluj-Napoca, Romania
- Flaviu Turcu: Center of Advanced Research and Technologies for Alternative Energies, National Institute for Research and Development of Isotopic and Molecular Technologies, 400293 Cluj-Napoca, Romania; Faculty of Physics, Babes-Bolyai University, 400084 Cluj-Napoca, Romania
- Bogdan Belean: Center of Advanced Research and Technologies for Alternative Energies, National Institute for Research and Development of Isotopic and Molecular Technologies, 400293 Cluj-Napoca, Romania
|
39
|
Pei Y, Mu L, Xu C, Li Q, Sen G, Sun B, Li X, Li X. Learning-based landmark detection in pelvis x-rays with attention mechanism: data from the osteoarthritis initiative. Biomed Phys Eng Express 2023; 9. [PMID: 36070671 DOI: 10.1088/2057-1976/ac8ffa] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2022] [Accepted: 09/07/2022] [Indexed: 01/07/2023]
Abstract
Patients with developmental dysplasia of the hip can have this problem throughout their lifetime. The condition is difficult for radiologists to detect on x-rays because of abrasion of anatomical structures; thus, the landmarks should be located automatically and precisely. In this paper, we propose an attention mechanism that combines multi-dimension information on the basis of separating the spatial dimension. The proposed attention mechanism decouples the spatial dimension and forms a width-channel dimension and a height-channel dimension via 1D pooling operations over the height and width of the spatial dimension. Non-local means operations are then performed to capture the correlation between long-range pixels in the width-channel dimension, as well as in the height-channel dimension, at different resolutions. The proposed attention modules are inserted into the skip connections of U-Net to form a novel landmark detection structure. This landmark detection method was trained and evaluated through five-fold cross-validation on an open-source dataset of 524 pelvis x-rays, each containing eight pelvic landmarks, and achieved excellent performance compared to other landmark detection models. The average point-to-point errors of U-Net, HR-Net, CE-Net, and the proposed network were 3.5651 mm, 3.6118 mm, 3.3914 mm, and 3.1350 mm, respectively. The results indicate that the proposed method has the highest detection accuracy. Furthermore, an open-source pelvis dataset is annotated and released for open research.
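A minimal PyTorch sketch of the decoupling idea: 1D pooling along the width and height yields height-channel and width-channel descriptors whose gates are recombined over the full map. The paper's non-local means operations are omitted, so this is illustrative only.

import torch
import torch.nn as nn

class AxisPoolAttention(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv_h = nn.Conv2d(ch, ch, 1)
        self.conv_w = nn.Conv2d(ch, ch, 1)

    def forward(self, x):                  # x: (B, C, H, W)
        h = x.mean(dim=3, keepdim=True)    # pool width  -> (B, C, H, 1)
        w = x.mean(dim=2, keepdim=True)    # pool height -> (B, C, 1, W)
        gate = torch.sigmoid(self.conv_h(h)) * torch.sigmoid(self.conv_w(w))
        return x * gate                    # gate broadcasts back to (B, C, H, W)

y = AxisPoolAttention(32)(torch.randn(1, 32, 64, 64))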
Collapse
Affiliation(s)
- Yun Pei
- State Key Laboratory of Integrated Optoelectronics, College of Electronic Science and Engineering, Jilin University, Changchun, 130012, People's Republic of China
| | - Lin Mu
- Department of Radiology, The First Hospital of Jilin University, Changchun, 130021, People's Republic of China
| | - Chuanxin Xu
- School of Electrical Engineering and Computer, Jilin Jianzhu University, Changchun, 130118, People's Republic of China
| | - Qiang Li
- Department of Orthopedics, the Second Hospital of Jilin University, Changchun, 130041, People's Republic of China
| | - Gan Sen
- Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, 830011, People's Republic of China
| | - Bin Sun
- Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, 830011, People's Republic of China
| | - Xiuying Li
- State Key Laboratory of Integrated Optoelectronics, College of Electronic Science and Engineering, Jilin University, Changchun, 130012, People's Republic of China
| | - Xueyan Li
- State Key Laboratory of Integrated Optoelectronics, College of Electronic Science and Engineering, Jilin University, Changchun, 130012, People's Republic of China; Peng Cheng Laboratory, Shenzhen, 518000, People's Republic of China
| |
Collapse
|
40
|
Zafar M, Sharif MI, Sharif MI, Kadry S, Bukhari SAC, Rauf HT. Skin Lesion Analysis and Cancer Detection Based on Machine/Deep Learning Techniques: A Comprehensive Survey. Life (Basel) 2023; 13:146. [PMID: 36676093 PMCID: PMC9864434 DOI: 10.3390/life13010146] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2022] [Revised: 12/25/2022] [Accepted: 12/28/2022] [Indexed: 01/06/2023] Open
Abstract
The skin is the human body's largest organ, and skin cancer is considered among the most dangerous cancers. Various pathological variations in the human body can cause abnormal cell growth due to genetic disorders, and such changes in human skin cells are very dangerous. Skin cancer gradually spreads to other parts of the body, and because of its high mortality rate, early diagnosis is essential. Visual inspection and manual examination of skin lesions make the determination of skin cancer difficult, so numerous early-recognition approaches have been proposed. With the fast progression of computer-aided diagnosis systems, a variety of deep learning, machine learning, and computer vision approaches have been combined for the analysis of medical samples and uncommon skin lesion samples. This research provides an extensive literature review of the methodologies, techniques, and approaches applied to the examination of skin lesions to date, covering the preprocessing, segmentation, feature extraction, feature selection, and classification stages of skin cancer recognition. The results of these approaches are impressive, but challenges remain in the analysis of skin lesions because of complex and rare features. Hence, the main objective is to examine the existing techniques used in the detection of skin cancer and to identify the remaining obstacles, helping researchers contribute to future research.
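For readers unfamiliar with the classical pipeline the survey covers, a skeletal sketch follows; every concrete choice here (Gaussian smoothing, Otsu thresholding, regionprops descriptors, SelectKBest + SVC) is an illustrative placeholder of ours, not a method prescribed by the survey:

```python
import numpy as np
from skimage import filters, measure
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def preprocess(img):
    # Placeholder denoising step (real pipelines also remove hair artifacts).
    return filters.gaussian(img, sigma=1.0)

def segment(img):
    # Placeholder lesion mask via Otsu thresholding (lesions are darker).
    return img < filters.threshold_otsu(img)

def extract_features(img, mask):
    # A few illustrative shape/intensity descriptors of the lesion region.
    props = measure.regionprops(mask.astype(int), intensity_image=img)[0]
    return [props.area, props.eccentricity, props.mean_intensity]

rng = np.random.default_rng(0)
img = preprocess(rng.random((64, 64)))          # stand-in dermoscopic image
feats = extract_features(img, segment(img))

# Feature selection + classification stage on a feature matrix X, labels y.
X, y = rng.random((40, 3)), rng.integers(0, 2, 40)
clf = make_pipeline(SelectKBest(f_classif, k=2), SVC()).fit(X, y)
```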
Collapse
Affiliation(s)
- Mehwish Zafar
- Department of Computer Science, COMSATS University Islamabad, Wah Campus, Wah Cantt 47040, Pakistan
| | - Muhammad Imran Sharif
- Department of Computer Science, COMSATS University Islamabad, Wah Campus, Wah Cantt 47040, Pakistan
| | - Muhammad Irfan Sharif
- Department of Computer Science, University of Education, Jauharabad Campus, Khushāb 41200, Pakistan
| | - Seifedine Kadry
- Department of Applied Data Science, Noroff University College, 4612 Kristiansand, Norway
- Artificial Intelligence Research Center (AIRC), Ajman University, Ajman P.O. Box 346, United Arab Emirates
- Department of Electrical and Computer Engineering, Lebanese American University, Byblos P.O. Box 13-5053, Lebanon
| | - Syed Ahmad Chan Bukhari
- Division of Computer Science, Mathematics and Science, Collins College of Professional Studies, St. John’s University, Queens, NY 11439, USA
| | - Hafiz Tayyab Rauf
- Centre for Smart Systems, AI and Cybersecurity, Staffordshire University, Stoke-on-Trent ST4 2DE, UK
| |
Collapse
|
41
|
Huang P, He P, Tian S, Ma M, Feng P, Xiao H, Mercaldo F, Santone A, Qin J. A ViT-AMC Network With Adaptive Model Fusion and Multiobjective Optimization for Interpretable Laryngeal Tumor Grading From Histopathological Images. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:15-28. [PMID: 36018875 DOI: 10.1109/tmi.2022.3202248] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/10/2023]
Abstract
The tumor grading of laryngeal cancer pathological images needs to be accurate and interpretable. Deep learning models based on attention mechanism-integrated convolution (AMC) blocks have good inductive bias but poor interpretability, whereas models based on vision transformer (ViT) blocks have good interpretability but weak inductive bias. Therefore, we propose an end-to-end ViT-AMC network (ViT-AMCNet) with adaptive model fusion and multiobjective optimization that integrates and fuses ViT and AMC blocks. Existing model fusion methods, however, often suffer from negative fusion: (1) there is no guarantee that the ViT and AMC blocks will simultaneously have good feature representation capability; and (2) the difference between the feature representations learned by the ViT and AMC blocks is not obvious, so the two representations contain much redundant information. Accordingly, we first prove the feasibility of fusing the ViT and AMC blocks based on Hoeffding's inequality. We then propose a multiobjective optimization method to address the problem that the ViT and AMC blocks cannot simultaneously achieve good feature representations. Finally, an adaptive model fusion method integrating a metrics block and a fusion block is proposed to increase the differences between feature representations and improve deredundancy capability. These methods improve the fusion ability of ViT-AMCNet, and experimental results demonstrate that it significantly outperforms state-of-the-art methods. Importantly, the visualized interpretive maps are closer to the regions of interest identified by pathologists, and the generalization ability is also excellent. Our code is publicly available at https://github.com/Baron-Huang/ViT-AMCNet.
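As a rough illustration of two-branch adaptive fusion, the sketch below gates ViT-branch and AMC-branch feature vectors with learned softmax weights before classification; the paper's metrics block, multiobjective optimization, and deredundancy machinery are not reproduced, and all names and sizes are ours:

```python
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    """Generic two-branch fusion head: learned per-sample gates weight the
    ViT-branch and AMC-branch features before a shared classifier."""
    def __init__(self, dim, num_classes):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, 2), nn.Softmax(dim=-1))
        self.head = nn.Linear(dim, num_classes)

    def forward(self, f_vit, f_amc):                      # both: (B, dim)
        w = self.gate(torch.cat([f_vit, f_amc], dim=-1))  # (B, 2) mixing weights
        fused = w[:, :1] * f_vit + w[:, 1:] * f_amc       # convex combination
        return self.head(fused)

logits = AdaptiveFusion(dim=256, num_classes=3)(torch.randn(4, 256),
                                                torch.randn(4, 256))
print(logits.shape)  # torch.Size([4, 3])
```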
Collapse
|
42
|
Wang J, Luo Y, Wang Z, Hounye AH, Cao C, Hou M, Zhang J. A cell phone app for facial acne severity assessment. APPL INTELL 2023; 53:7614-7633. [PMID: 35919632 PMCID: PMC9336136 DOI: 10.1007/s10489-022-03774-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/15/2022] [Indexed: 11/28/2022]
Abstract
Acne vulgaris, the most common skin disease, can impose substantial economic and psychological burdens on the people it affects, and its accurate grading plays a crucial role in patient treatment. In this paper, we first propose an acne grading criterion that considers lesion classifications and a metric for producing accurate severity ratings. Severity assessment is challenging because acne lesions of comparable severity look similar and are difficult to count. We crop facial skin images into several lesion patches and then process each patch with a lightweight acne regular network (Acne-RegNet). Acne-RegNet is built with a median filter and histogram equalization to improve image quality, a channel attention mechanism to boost the representational power of the network, a region-based focal loss to handle classification imbalance, and model pruning with feature-based knowledge distillation to reduce model size. After Acne-RegNet is applied, a severity score is calculated, and the acne grade is further refined using patient metadata. The entire assessment procedure was deployed to a mobile device as a phone app. Compared with state-of-the-art lightweight models, the proposed Acne-RegNet significantly improves the accuracy of lesion classification. The app demonstrated promising severity assessments (accuracy: 94.56%) and a dermatologist-level diagnosis on an internal clinical dataset. The proposed acne app could be a useful adjunct for assessing acne severity in clinical practice, enabling anyone with a smartphone to assess acne immediately, anywhere and anytime.
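Of the components listed, the focal loss is the most self-contained to sketch. Below is the standard multi-class focal loss; the region-based weighting used in Acne-RegNet is not reproduced here, and the gamma value is illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLoss(nn.Module):
    """Standard multi-class focal loss: down-weights easy examples
    (high p_t) so training focuses on hard, often minority-class samples."""
    def __init__(self, gamma=2.0):
        super().__init__()
        self.gamma = gamma

    def forward(self, logits, target):
        logp = F.log_softmax(logits, dim=-1)
        logp_t = logp.gather(1, target.unsqueeze(1)).squeeze(1)  # log p of true class
        p_t = logp_t.exp()
        return (-((1.0 - p_t) ** self.gamma) * logp_t).mean()

loss = FocalLoss()(torch.randn(8, 4), torch.randint(0, 4, (8,)))
print(loss)
```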
Collapse
Affiliation(s)
- Jiaoju Wang
- School of Mathematics and Statistics, Central South University, Changsha, 410083 Hunan China
| | - Yan Luo
- Department of dermatology of Xiangya hospital, Central South University, Changsha, 410083 Hunan China
| | - Zheng Wang
- School of Mathematics and Statistics, Central South University, Changsha, 410083 Hunan China; Science and Engineering School, Hunan First Normal University, Changsha, 410083 Hunan China
| | - Alphonse Houssou Hounye
- School of Mathematics and Statistics, Central South University, Changsha, 410083 Hunan China
| | - Cong Cao
- School of Mathematics and Statistics, Central South University, Changsha, 410083 Hunan China
| | - Muzhou Hou
- School of Mathematics and Statistics, Central South University, Changsha, 410083 Hunan China
| | - Jianglin Zhang
- Department of Dermatology, Shenzhen People's Hospital, The Second Clinical Medical College of Jinan University, The First Affiliated Hospital of Southern University of Science and Technology, Shenzhen, 518020 Guangdong China; Candidate Branch of National Clinical Research Center for Skin Diseases, Shenzhen, 518020 Guangdong China
| |
Collapse
|
43
|
Basu S, Gupta M, Rana P, Gupta P, Arora C. RadFormer: Transformers with global-local attention for interpretable and accurate Gallbladder Cancer detection. Med Image Anal 2023; 83:102676. [PMID: 36455424 DOI: 10.1016/j.media.2022.102676] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2021] [Revised: 09/17/2022] [Accepted: 10/27/2022] [Indexed: 11/21/2022]
Abstract
We propose a novel deep neural network architecture that learns interpretable representations for medical image analysis. The architecture generates global attention for the region of interest and then learns bag-of-words-style deep feature embeddings with local attention. The global and local feature maps are combined using a contemporary transformer architecture for highly accurate Gallbladder Cancer (GBC) detection from Ultrasound (USG) images. Our experiments indicate that the detection accuracy of the model exceeds even that of human radiologists, advocating its use as a second reader for GBC diagnosis. The bag-of-words embeddings allow the model to be probed for interpretable explanations of GBC detection consistent with those reported in the medical literature. We show that the proposed model not only helps explain the decisions of neural network models but also aids in the discovery of new visual features relevant to the diagnosis of GBC. Source code is available at https://github.com/sbasu276/RadFormer.
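A toy rendering of the global-local idea: a global branch produces a saliency map that reweights features, the weighted positions serve as local tokens, and a small transformer encoder fuses them through a class token. All layer sizes, names, and the two-class head are our assumptions, not RadFormer's published configuration:

```python
import torch
import torch.nn as nn

class GlobalLocalFusion(nn.Module):
    """Sketch of a global-local design: a 1x1-conv saliency map (global
    attention) gates the feature map, the gated positions become local
    tokens, and a transformer encoder fuses them via a CLS token."""
    def __init__(self, dim=128):
        super().__init__()
        self.global_attn = nn.Conv2d(dim, 1, kernel_size=1)   # saliency map
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.fuser = nn.TransformerEncoder(layer, num_layers=2)
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.head = nn.Linear(dim, 2)                         # e.g. GBC vs benign

    def forward(self, feat):                                  # (B, dim, H, W)
        b = feat.size(0)
        a = torch.sigmoid(self.global_attn(feat))             # (B, 1, H, W)
        tokens = (feat * a).flatten(2).transpose(1, 2)        # (B, H*W, dim)
        tokens = torch.cat([self.cls.expand(b, -1, -1), tokens], dim=1)
        return self.head(self.fuser(tokens)[:, 0])            # classify CLS token

print(GlobalLocalFusion()(torch.randn(2, 128, 8, 8)).shape)  # torch.Size([2, 2])
```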
Collapse
Affiliation(s)
- Soumen Basu
- Department of Computer Science, Indian Institute of Technology Delhi, New Delhi, India.
| | - Mayank Gupta
- Department of Computer Science, Indian Institute of Technology Delhi, New Delhi, India
| | - Pratyaksha Rana
- Department of Radiodiagnosis and Imaging, Postgraduate Institute of Medical Education & Research, Chandigarh, India
| | - Pankaj Gupta
- Department of Radiodiagnosis and Imaging, Postgraduate Institute of Medical Education & Research, Chandigarh, India
| | - Chetan Arora
- Department of Computer Science, Indian Institute of Technology Delhi, New Delhi, India
| |
Collapse
|
44
|
Yue G, Wei P, Zhou T, Jiang Q, Yan W, Wang T. Toward Multicenter Skin Lesion Classification Using Deep Neural Network With Adaptively Weighted Balance Loss. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:119-131. [PMID: 36063522 DOI: 10.1109/tmi.2022.3204646] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/10/2023]
Abstract
Recently, deep neural network-based methods have shown promising advantages in accurately recognizing skin lesions from dermoscopic images. However, most existing works focus on improving the network architecture for better feature representation and ignore the data imbalance issue, limiting their flexibility and accuracy across the multiple scenarios encountered in multi-center clinics. In general, different clinical centers have different data distributions, which places challenging requirements on a network's flexibility and accuracy. In this paper, we divert attention from architecture improvement to the data imbalance issue and propose a new solution for multi-center skin lesion classification by introducing a novel adaptively weighted balance (AWB) loss into a conventional classification network. Benefiting from AWB, the proposed solution has the following advantages: (1) different practical requirements are easily satisfied by changing only the backbone; (2) it is user-friendly, with no hyperparameter tuning; and (3) it adaptively enforces intraclass compactness and pays more attention to the minority class. Extensive experiments demonstrate that, compared with solutions equipped with state-of-the-art loss functions, the proposed solution is more flexible and more competent at tackling the multi-center imbalanced skin lesion classification task, with considerable performance on two benchmark datasets. The solution also proves effective on an imbalanced gastrointestinal disease classification task and an imbalanced diabetic retinopathy grading task. Code is available at https://github.com/Weipeishan2021.
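The exact AWB formulation is in the paper; as a hedged stand-in, the sketch below derives per-batch inverse-frequency class weights for cross-entropy, which captures the hyperparameter-free emphasis on minority classes but not the intraclass-compactness term:

```python
import torch
import torch.nn.functional as F

def adaptively_weighted_ce(logits, target, eps=1.0):
    """Cross-entropy with per-batch inverse-frequency class weights:
    rare classes in the current batch receive proportionally larger
    weight, with no hyperparameters to tune (eps only avoids div-by-zero)."""
    counts = torch.bincount(target, minlength=logits.size(1)).float()
    weights = 1.0 / (counts + eps)                    # rare class -> large weight
    weights = weights / weights.sum() * logits.size(1)  # normalize to mean 1
    return F.cross_entropy(logits, target, weight=weights)

loss = adaptively_weighted_ce(torch.randn(16, 7), torch.randint(0, 7, (16,)))
print(loss)
```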
Collapse
|
45
|
Dong Y, Wang L, Li Y. TC-Net: Dual coding network of Transformer and CNN for skin lesion segmentation. PLoS One 2022; 17:e0277578. [PMID: 36409714 PMCID: PMC9678318 DOI: 10.1371/journal.pone.0277578] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2022] [Accepted: 10/29/2022] [Indexed: 11/22/2022] Open
Abstract
Skin lesion segmentation has become an important recent direction in machine learning for medical applications. In a deep segmentation network, the convolutional neural network (CNN) uses convolution to capture local information, but it ignores long-range relationships between pixels and still cannot meet the precise segmentation requirements of some complex, low-contrast datasets. The Transformer performs well at modeling global feature information, but its ability to extract fine-grained local feature patterns is weak. In this work, we present TC-Net, a dual-encoding fusion architecture of Transformer and CNN that combines local and global feature information more accurately and thereby improves the segmentation of skin images. Our results demonstrate that the combination of CNN and Transformer brings a very significant improvement in overall segmentation performance and outperforms pure single-network models. Quantitative and qualitative experimental results and visual analyses on three datasets illustrate the robustness of TC-Net. Compared with Swin UNet on the ISIC2018 dataset, TC-Net improves the Dice index by 2.46% and the JA index by about 4%; on the ISBI2017 dataset, the Dice and JA indices rise by about 4%.
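A minimal dual-encoding stage in this spirit: a convolutional path for local detail and a multi-head self-attention path for global context, summed afterwards. TC-Net's actual fusion rules are more elaborate; this only illustrates the split, and all sizes are ours:

```python
import torch
import torch.nn as nn

class DualEncoderBlock(nn.Module):
    """Toy dual-encoding stage: convolution captures local patterns,
    self-attention over flattened spatial tokens captures global context,
    and the two paths are fused by addition."""
    def __init__(self, dim):
        super().__init__()
        self.local = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1), nn.BatchNorm2d(dim), nn.ReLU())
        self.globl = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, x):                         # (B, C, H, W)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)     # (B, H*W, C)
        g, _ = self.globl(tokens, tokens, tokens)  # global self-attention
        g = g.transpose(1, 2).reshape(b, c, h, w)
        return self.local(x) + g                  # local + global fusion

print(DualEncoderBlock(32)(torch.randn(1, 32, 16, 16)).shape)
```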
Collapse
Affiliation(s)
- Yuying Dong
- College of Information Science and Engineering, Xinjiang University, Urumqi, China
| | - Liejun Wang
- College of Information Science and Engineering, Xinjiang University, Urumqi, China
| | - Yongming Li
- College of Information Science and Engineering, Xinjiang University, Urumqi, China
| |
Collapse
|
46
|
Liu L, Liu Y, Zhou J, Guo C, Duan H. A novel MCF-Net: Multi-level context fusion network for 2D medical image segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022; 226:107160. [PMID: 36191351 DOI: 10.1016/j.cmpb.2022.107160] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/20/2022] [Revised: 08/14/2022] [Accepted: 09/25/2022] [Indexed: 06/16/2023]
Abstract
Medical image segmentation is a crucial step in clinical applications for the diagnosis and analysis of various diseases. U-Net-based convolutional neural networks have achieved impressive performance in medical image segmentation tasks, but their ability to integrate multi-level contextual information and extract features is often insufficient. In this paper, we present a novel multi-level context fusion network (MCF-Net) that improves the performance of U-Net on various segmentation tasks through three modules that fuse multi-scale contextual information: a hybrid attention-based residual atrous convolution (HARA) module, a multi-scale feature memory (MSFM) module, and a multi-receptive field fusion (MRFF) module. The HARA module effectively extracts multi-receptive-field features by combining atrous spatial pyramid pooling with an attention mechanism. The MSFM and MRFF modules fuse features of different levels and effectively extract contextual information. MCF-Net was evaluated on the ISIC 2018, DRIVE, BUSI, and Kvasir-SEG datasets, which contain challenging images of many sizes and widely varying anatomy. The experimental results show that MCF-Net is highly competitive with other U-Net models and offers great potential as a general-purpose deep learning model for 2D medical image segmentation.
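A rough reading of the HARA idea, under our own assumptions about dilation rates and wiring: parallel atrous convolutions gather multi-receptive-field features, and a squeeze-and-excitation-style gate reweights them inside a residual connection:

```python
import torch
import torch.nn as nn

class HARASketch(nn.Module):
    """ASPP-plus-attention sketch: parallel dilated 3x3 convolutions give
    several receptive fields, a 1x1 conv projects the concatenation back,
    and a channel-attention gate reweights it before a residual add."""
    def __init__(self, dim, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(dim, dim, 3, padding=r, dilation=r) for r in rates)
        self.proj = nn.Conv2d(dim * len(rates), dim, 1)
        self.se = nn.Sequential(                      # squeeze-and-excitation gate
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(dim, dim // 4, 1), nn.ReLU(),
            nn.Conv2d(dim // 4, dim, 1), nn.Sigmoid())

    def forward(self, x):
        y = self.proj(torch.cat([b(x) for b in self.branches], dim=1))
        return x + y * self.se(y)                     # attention-gated residual

print(HARASketch(64)(torch.randn(2, 64, 24, 24)).shape)
```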
Collapse
Affiliation(s)
- Lizhu Liu
- Engineering Research Center of Automotive Electrics and Control Technology, College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China; National Engineering Laboratory of Robot Visual Perception and Control Technology, School of Robotics, Hunan University, Changsha 410082, China.
| | - Yexin Liu
- Engineering Research Center of Automotive Electrics and Control Technology, College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China.
| | - Jian Zhou
- Engineering Research Center of Automotive Electrics and Control Technology, College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China.
| | - Cheng Guo
- Engineering Research Center of Automotive Electrics and Control Technology, College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China.
| | - Huigao Duan
- Engineering Research Center of Automotive Electrics and Control Technology, College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China.
| |
Collapse
|
47
|
Feng R, Zhuo L, Li X, Yin H, Wang Z. BLA-Net:Boundary learning assisted network for skin lesion segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022; 226:107190. [PMID: 36288686 DOI: 10.1016/j.cmpb.2022.107190] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/03/2022] [Revised: 10/14/2022] [Accepted: 10/17/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVE: Automatic skin lesion segmentation plays an important role in the computer-aided diagnosis of skin diseases. However, current segmentation networks cannot accurately detect the boundaries of skin lesion areas. METHODS: We propose a boundary learning assisted network for skin lesion segmentation, BLA-Net, which adopts ResNet34 as the backbone of an encoder-decoder framework. The overall architecture comprises two key components: a Primary Segmentation Network (PSNet) and an Auxiliary Boundary Learning Network (ABLNet). PSNet locates the skin lesion areas; Dynamic Deformable Convolution is introduced into the lower layers of its encoder so that the network can effectively handle complex lesion objects, and a Global Context Information Extraction Module embedded into the higher layers of the encoder captures multi-receptive-field and multi-scale global context features. ABLNet finely detects the boundaries of lesion areas from the low-level features of the encoder, using a proposed object regional attention mechanism to enhance the features of the lesion area and suppress those of irrelevant regions. ABLNet thereby assists PSNet in achieving accurate skin lesion segmentation. RESULTS: We verified the segmentation performance of the proposed method on two public dermoscopy datasets, ISBI 2016 and ISIC 2018, achieving Jaccard indices of 86.6% and 84.8% and Dice coefficients of 92.4% and 91.2%, respectively. CONCLUSIONS: Compared with existing methods, the proposed method achieves state-of-the-art segmentation accuracy with fewer model parameters and can assist dermatologists in clinical diagnosis and treatment.
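The two-branch supervision is easy to sketch: derive a thin boundary target from the ground-truth mask by morphological erosion, then add an auxiliary boundary loss to the primary segmentation loss. The erosion trick and the weighting alpha are our choices, not the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

def boundary_from_mask(mask):
    """Derive a thin boundary target from a binary mask: min-pooling
    (erosion) shrinks the mask, and the difference is the boundary ring,
    so the auxiliary branch needs no extra annotation."""
    eroded = -F.max_pool2d(-mask, 3, stride=1, padding=1)
    return (mask - eroded).clamp(0, 1)

def bla_style_loss(seg_logits, bnd_logits, gt_mask, alpha=0.5):
    """Joint objective: primary segmentation BCE plus an auxiliary
    boundary BCE from the derived boundary target."""
    seg = F.binary_cross_entropy_with_logits(seg_logits, gt_mask)
    bnd = F.binary_cross_entropy_with_logits(bnd_logits,
                                             boundary_from_mask(gt_mask))
    return seg + alpha * bnd

gt = (torch.rand(2, 1, 64, 64) > 0.5).float()
print(bla_style_loss(torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64), gt))
```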
Collapse
Affiliation(s)
- Ruiqi Feng
- Faculty of Information Technology, Beijing University of Technology, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, China
| | - Li Zhuo
- Faculty of Information Technology, Beijing University of Technology, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, China.
| | - Xiaoguang Li
- Faculty of Information Technology, Beijing University of Technology, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, China
| | - Hongxia Yin
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, China
| | - Zhenchang Wang
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, China.
| |
Collapse
|
48
|
Dynamic prototypical feature representation learning framework for semi-supervised skin lesion segmentation. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.08.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
|
49
|
Jiang Y, Cheng T, Dong J, Liang J, Zhang Y, Lin X, Yao H. Dermoscopic image segmentation based on Pyramid Residual Attention Module. PLoS One 2022; 17:e0267380. [PMID: 36112649 PMCID: PMC9481037 DOI: 10.1371/journal.pone.0267380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2021] [Accepted: 04/08/2022] [Indexed: 11/18/2022] Open
Abstract
We propose a stacked convolutional neural network incorporating a novel and efficient pyramid residual attention (PRA) module for the automatic segmentation of dermoscopic images. Precise segmentation is a significant and challenging step for computer-aided diagnosis in skin lesion assessment and treatment. The proposed PRA combines three widely used mechanisms: a pyramid structure that extracts feature information from the lesion area at different scales, residual connections that keep model training efficient, and an attention mechanism that screens for effective feature maps. Thanks to the PRA, our network obtains precise boundary information distinguishing healthy from diseased skin even for blurred lesion areas. Moreover, efficient stacking of the PRA increases the segmentation ability of a single module for lesion regions. Finally, we incorporate the encoder-decoder idea into the overall network: unlike traditional networks, we divide the segmentation procedure into three levels and construct the pyramid residual attention network (PRAN), in which the shallow level mainly processes spatial information, the middle level refines both spatial and semantic information, and the deep level intensively learns semantic information. The PRA is the basic module of PRAN and suffices to ensure the efficiency of this three-level architecture. We extensively evaluate our method on the ISIC2017 and ISIC2018 datasets, and the experimental results demonstrate that PRAN achieves segmentation performance that matches or exceeds state-of-the-art deep learning models under the same experimental conditions.
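One plausible reading of "pyramid residual attention", with our own scale choices: pool the feature map to several resolutions, process each, upsample back, and let a sigmoid gate select useful maps inside a residual connection:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PRASketch(nn.Module):
    """Pyramid (multi-scale pooling) + residual + attention in one module:
    each scale is convolved and upsampled back, the concatenation is turned
    into a sigmoid gate, and the gated features are added residually."""
    def __init__(self, dim, scales=(1, 2, 4)):
        super().__init__()
        self.convs = nn.ModuleList(nn.Conv2d(dim, dim, 3, padding=1)
                                   for _ in scales)
        self.scales = scales
        self.gate = nn.Conv2d(dim * len(scales), dim, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = []
        for s, conv in zip(self.scales, self.convs):
            y = F.adaptive_avg_pool2d(x, (max(h // s, 1), max(w // s, 1)))
            y = F.interpolate(conv(y), size=(h, w), mode='bilinear',
                              align_corners=False)
            feats.append(y)
        attn = torch.sigmoid(self.gate(torch.cat(feats, dim=1)))
        return x + x * attn                       # residual attention

print(PRASketch(32)(torch.randn(1, 32, 48, 48)).shape)
```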
Collapse
Affiliation(s)
- Yun Jiang
- College of Computer Science and Engineering, Lanzhou, Gansu, China
| | - Tongtong Cheng
- College of Computer Science and Engineering, Lanzhou, Gansu, China
| | - Jinkun Dong
- College of Computer Science and Engineering, Lanzhou, Gansu, China
| | - Jing Liang
- College of Computer Science and Engineering, Lanzhou, Gansu, China
| | - Yuan Zhang
- College of Computer Science and Engineering, Lanzhou, Gansu, China
| | - Xin Lin
- College of Computer Science and Engineering, Lanzhou, Gansu, China
| | - Huixia Yao
- College of Computer Science and Engineering, Lanzhou, Gansu, China
| |
Collapse
|
50
|
Zuo B, Lee F, Chen Q. An efficient U-shaped network combined with edge attention module and context pyramid fusion for skin lesion segmentation. Med Biol Eng Comput 2022; 60:1987-2000. [PMID: 35538200 DOI: 10.1007/s11517-022-02581-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2021] [Accepted: 04/22/2022] [Indexed: 12/17/2022]
Abstract
Skin lesion segmentation is an important step in skin diagnosis but remains challenging due to the variety of shapes, colours, and boundaries of melanoma. In this paper, we propose a novel and efficient U-shaped network named EAM-CPFNet, which combines an edge attention module (EAM) with context pyramid fusion (CPF) to improve skin lesion segmentation performance. First, we design a plug-and-play edge attention module (EAM) to highlight the edge information learned in the encoder. Second, we integrate two pyramid modules, collectively named context pyramid fusion (CPF), for contextual information fusion: multiple global pyramid guidance (GPG) modules, which replace the skip connections between the encoder and the decoder to capture global context information, and a scale-aware pyramid fusion (SAPF) module, which dynamically fuses multi-scale context information in high-level features using spatial and channel attention mechanisms. Furthermore, we introduce full-scale skip connections to enhance different levels of global context information. We evaluate the proposed method on the publicly available ISIC2018 dataset, and the experimental results demonstrate that it is highly competitive with other state-of-the-art methods for skin lesion segmentation.
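A plug-and-play edge gate in the spirit of the EAM: a fixed Laplacian filter exposes high-frequency (edge) content in an encoder feature, and a learned 1x1 convolution turns it into a multiplicative gate. The kernel choice and wiring are our assumptions, not the published module:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeAttentionSketch(nn.Module):
    """Edge-gating sketch: a depthwise Laplacian highlights edges in each
    channel, a 1x1 conv learns a per-channel gate from them, and the gated
    features are combined with the input residually."""
    def __init__(self, dim):
        super().__init__()
        lap = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]])
        # Depthwise Laplacian kernel, one copy per channel (fixed buffer).
        self.register_buffer('lap', lap.expand(dim, 1, 3, 3).clone())
        self.gate = nn.Conv2d(dim, dim, 1)

    def forward(self, x):                          # (B, C, H, W)
        edges = F.conv2d(x, self.lap, padding=1, groups=x.size(1))
        return x * torch.sigmoid(self.gate(edges)) + x

print(EdgeAttentionSketch(16)(torch.randn(2, 16, 32, 32)).shape)
```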
Collapse
Affiliation(s)
- Bin Zuo
- Shanghai Engineering Research Center of Assistive Devices, School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China
- Rehabilitation Engineering and Technology Institute, University of Shanghai for Science and Technology, Shanghai, 200093, China
| | - Feifei Lee
- Shanghai Engineering Research Center of Assistive Devices, School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China.
- Rehabilitation Engineering and Technology Institute, University of Shanghai for Science and Technology, Shanghai, 200093, China.
| | - Qiu Chen
- Major of Electrical Engineering and Electronics, Graduate School of Engineering, Kogakuin University, Tokyo, 163-8677, Japan.
| |
Collapse
|