1. Zhi Y, Bie H, Wang J, Ren L. Masked autoencoders with generalizable self-distillation for skin lesion segmentation. Med Biol Eng Comput 2024:10.1007/s11517-024-03086-z. [PMID: 38653880 DOI: 10.1007/s11517-024-03086-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Accepted: 03/29/2024] [Indexed: 04/25/2024]
Abstract
In skin lesion image segmentation, accurately identifying and delineating diseased regions is vital for in-depth analysis of skin cancer. Self-supervised learning, e.g., the masked autoencoder (MAE), has emerged as a potent force in medical imaging: it autonomously learns and extracts latent features from unlabeled data, yielding pre-trained models that greatly assist downstream tasks. To encourage pre-trained models to learn more comprehensively the global structural and local detail information inherent in dermoscopy images, we introduce a Teacher-Student architecture named TEDMAE. By incorporating a self-distillation mechanism, it learns holistic image feature information to improve the generalizable global knowledge learned by the student MAE model. To make the learned image features suitable for unknown test images, two optimization strategies are proposed to enhance the generalizability of the global features learned by the teacher model, thereby improving the overall generalization capability of the pre-trained models. Exterior Conversion Augmentation (EC) uses random convolutional kernels and linear interpolation to transform the input image into one with the same shape but altered intensities and textures, while Dynamic Feature Generation (DF) employs a nonlinear attention mechanism for feature merging, enhancing the expressive power of the features. Experimental results on three public skin disease datasets, ISIC2019, ISIC2017, and PH2, indicate that the proposed TEDMAE method outperforms several similar approaches. Specifically, TEDMAE achieved its best segmentation and generalization performance on the ISIC2017 and PH2 datasets, with Dice scores reaching 82.1% and 91.2%, respectively. The best Jaccard values were 72.6% and 84.5%, while the optimal HD95% values were 13.0% and 8.9%, respectively.
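The Exterior Conversion Augmentation step described in this abstract (a random convolution followed by linear interpolation with the input) can be illustrated with a minimal sketch. The abstract does not give the kernel size, kernel distribution, padding scheme, or interpolation weight, so everything below is an assumption made for illustration: a single-channel image stored as nested lists, a 3x3 Gaussian-sampled kernel, zero padding, and a uniformly sampled blend weight. The function name `random_conv_blend` is hypothetical and is not taken from the paper.

```python
import random


def random_conv_blend(image, kernel_size=3, rng=None):
    """Convolve with a random kernel, then blend linearly with the input.

    Illustrative only: kernel distribution, padding and blend weight are
    assumptions, not the settings used by the TEDMAE authors.
    """
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    pad = kernel_size // 2

    # Random kernel, scaled so the absolute weights sum to 1.
    raw = [[rng.gauss(0.0, 1.0) for _ in range(kernel_size)]
           for _ in range(kernel_size)]
    norm = sum(abs(v) for row in raw for v in row) or 1.0
    kernel = [[v / norm for v in row] for row in raw]

    alpha = rng.random()  # interpolation weight in [0, 1)

    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0.0
            for ky in range(kernel_size):
                for kx in range(kernel_size):
                    yy, xx = y + ky - pad, x + kx - pad
                    if 0 <= yy < h and 0 <= xx < w:  # zero padding outside
                        acc += kernel[ky][kx] * image[yy][xx]
            # Linear interpolation between the original and convolved pixel.
            row.append(alpha * image[y][x] + (1.0 - alpha) * acc)
        out.append(row)
    return out


if __name__ == "__main__":
    img = [[float(x + y) for x in range(5)] for y in range(4)]
    aug = random_conv_blend(img, rng=random.Random(0))
    print(len(aug), len(aug[0]))  # output has the same shape as the input
```

The output keeps the input's shape while its intensities and local texture change, which is the property the abstract attributes to EC.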
Affiliation(s)
- Yichen Zhi: Department of Intelligent Media Computing Center, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, 100876, People's Republic of China.
- Hongxia Bie: Department of Intelligent Media Computing Center, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, 100876, People's Republic of China.
- Jiali Wang: Department of Intelligent Media Computing Center, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, 100876, People's Republic of China.
- Lihan Ren: Department of Intelligent Media Computing Center, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, 100876, People's Republic of China.
2. Xin C, Liu Z, Ma Y, Wang D, Zhang J, Li L, Zhou Q, Xu S, Zhang Y. Transformer guided self-adaptive network for multi-scale skin lesion image segmentation. Comput Biol Med 2024; 169:107846. [PMID: 38184865 DOI: 10.1016/j.compbiomed.2023.107846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 12/03/2023] [Accepted: 12/11/2023] [Indexed: 01/09/2024]
Abstract
BACKGROUND In recent years, skin lesions have become a major public health concern, and their diagnosis and management depend heavily on correct lesion segmentation. Traditional convolutional neural networks (CNNs) have demonstrated promising results in skin lesion segmentation, but they are limited in their ability to capture long-range dependencies and intricate features. In addition, current medical image segmentation algorithms rarely consider how different categories are distributed across regions of the image, and they do not consider the spatial relationships between pixels. OBJECTIVES This study proposes a self-adaptive, position-aware skin lesion segmentation model, SapFormer, to capture global context and fine-grained detail, better capture spatial relationships, and adapt to different positional characteristics. SapFormer is a multi-scale dynamic position-aware structure designed to provide a more flexible representation of the relationships between skin lesion characteristics and lesion distribution. Additionally, it increases skin lesion segmentation accuracy and decreases incorrect segmentation of non-lesion areas. INNOVATIONS SapFormer uses multiple hybrid transformers for multi-scale feature encoding of skin images, and a transformer decoder for multi-scale positional feature sensing of the encoded features, to obtain fine-grained features of the lesion area and optimize the regional feature distribution. The self-adaptive feature framework, built upon the transformer decoder module, dynamically and automatically generates parameterizations with learnable properties at different positions. These parameterizations are derived from the multi-scale encoding characteristics of the input image. In addition, the paper uses a cross-attention network to optimize the features of the current region according to the features of other regions, aiming to increase skin lesion segmentation accuracy.
MAIN RESULTS The ISIC-2016, ISIC-2017, and ISIC-2018 datasets for skin lesions are used as the basis for the experiment. On these datasets, the proposed model has accuracy values of 97.9 %, 94.3 %, and 95.7 %, respectively. The proposed model's IOU values are, in order, 93.2 %, 86.4 %, and 89.4 %. The proposed model's DSC values are 96.4 %, 92.6 %, and 94.3 %, respectively. All three metrics surpass the performance of the majority of state-of-the-art (SOTA) models. SapFormer's metrics on these datasets demonstrate that it can precisely segment skin lesions. Notably, our approach exhibits remarkable noise resistance in non-lesion areas, while simultaneously conducting finer-grained regional feature extraction on the skin lesion image. CONCLUSIONS In conclusion, the integration of a transformer-guided position-aware network into semantic skin lesion segmentation results in a notable performance boost. The ability of our proposed network to capture spatial relationships and fine-grained details proves beneficial for effective skin lesion segmentation. By enhancing lesion localization, feature extraction, quantitative analysis, and classification accuracy, the proposed segmentation model improves the diagnostic efficiency of skin lesion analysis on dermoscopic images. It assists dermatologists in making more accurate and efficient diagnoses, ultimately leading to better patient care and outcomes. This research paves the way for advances in diagnosing and treating skin lesions, promoting better understanding and decision-making in the clinical setting.
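The three metrics reported in this abstract (accuracy, IoU, and DSC) can all be derived from the pixel-level confusion counts between a predicted mask and a ground-truth mask. The sketch below uses their standard definitions; it is not the authors' evaluation code, and the function name `segmentation_scores` and the toy masks are hypothetical.

```python
def segmentation_scores(pred, target):
    """Pixel accuracy, IoU (Jaccard) and DSC (Dice) for two binary masks.

    Masks are nested lists of 0/1 values with identical shapes.
    """
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1      # lesion pixel predicted as lesion
            elif p:
                fp += 1      # background predicted as lesion
            elif t:
                fn += 1      # lesion pixel missed
            else:
                tn += 1      # background predicted as background

    total = tp + fp + fn + tn
    union = tp + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # By convention, two empty masks count as a perfect match.
        "iou": tp / union if union else 1.0,
        "dsc": 2 * tp / (2 * tp + fp + fn) if union else 1.0,
    }


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    target = [[1, 0, 0],
              [0, 1, 1]]
    print(segmentation_scores(pred, target))
```

Accuracy counts correctly classified background pixels, whereas IoU and DSC ignore true negatives. This is why accuracy is typically the highest of the three figures on images dominated by non-lesion skin.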
Affiliation(s)
- Chao Xin: The First Affiliated Hospital of Ningbo University, Ningbo, 315211, China.
- Zhifang Liu: The First Affiliated Hospital of Ningbo University, Ningbo, 315211, China.
- Yizhao Ma: The First Affiliated Hospital of Ningbo University, Ningbo, 315211, China.
- Dianchen Wang: The First Affiliated Hospital of Ningbo University, Ningbo, 315211, China.
- Jing Zhang: The First Affiliated Hospital of Ningbo University, Ningbo, 315211, China.
- Lingzhi Li: The First Affiliated Hospital of Ningbo University, Ningbo, 315211, China.
- Qiongyan Zhou: The First Affiliated Hospital of Ningbo University, Ningbo, 315211, China.
- Suling Xu: The First Affiliated Hospital of Ningbo University, Ningbo, 315211, China.
3. Innani S, Dutande P, Baid U, Pokuri V, Bakas S, Talbar S, Baheti B, Guntuku SC. Generative adversarial networks based skin lesion segmentation. Sci Rep 2023; 13:13467. [PMID: 37596306 PMCID: PMC10439152 DOI: 10.1038/s41598-023-39648-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2023] [Accepted: 07/28/2023] [Indexed: 08/20/2023] [Open Access]
Abstract
Skin cancer is a serious condition that requires accurate diagnosis and treatment. One way to assist clinicians in this task is to use computer-aided diagnosis tools that automatically segment skin lesions from dermoscopic images. We propose a novel adversarial learning-based framework called Efficient-GAN (EGAN) that uses an unsupervised generative network to generate accurate lesion masks. It consists of a generator module with a top-down squeeze excitation-based compound scaled path and an asymmetric lateral connection-based bottom-up path, together with a discriminator module that distinguishes between original and synthetic masks. A morphology-based smoothing loss is also implemented to encourage the network to create smooth semantic boundaries of lesions. The framework is evaluated on the International Skin Imaging Collaboration Lesion Dataset. It outperforms the current state-of-the-art skin lesion segmentation approaches with a Dice coefficient, Jaccard similarity, and accuracy of 90.1%, 83.6%, and 94.5%, respectively. We also design a lightweight segmentation framework called Mobile-GAN (MGAN) that achieves performance comparable to EGAN with an order of magnitude fewer training parameters, resulting in faster inference in low-compute settings.
Affiliation(s)
- Shubham Innani: Center of Excellence in Signal and Image Processing, Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded, Maharashtra, India.
- Prasad Dutande: Center of Excellence in Signal and Image Processing, Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded, Maharashtra, India.
- Ujjwal Baid: Center of Excellence in Signal and Image Processing, Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded, Maharashtra, India; Center for Biomedical Image Computing and Analytics, University of Pennsylvania, Philadelphia, USA.
- Venu Pokuri: Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA.
- Spyridon Bakas: Center for Biomedical Image Computing and Analytics, University of Pennsylvania, Philadelphia, USA.
- Sanjay Talbar: Center of Excellence in Signal and Image Processing, Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded, Maharashtra, India.
- Bhakti Baheti: Center of Excellence in Signal and Image Processing, Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded, Maharashtra, India; Center for Biomedical Image Computing and Analytics, University of Pennsylvania, Philadelphia, USA.
- Sharath Chandra Guntuku: Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA.
4. Liu Y, Wang J, Wu C, Liu L, Zhang Z, Yu H. Fovea-UNet: detection and segmentation of lymph node metastases in colorectal cancer with deep learning. Biomed Eng Online 2023; 22:74. [PMID: 37479991 PMCID: PMC10362618 DOI: 10.1186/s12938-023-01137-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 07/11/2023] [Indexed: 07/23/2023] [Open Access]
Abstract
BACKGROUND Colorectal cancer is one of the most serious malignant tumors, and lymph node metastasis (LNM) from colorectal cancer is a major factor in patient management and prognosis. Accurate image-based detection of LNM is an important task in helping clinicians diagnose cancer. Recently, the U-Net architecture based on convolutional neural networks (CNNs) has been widely used to segment images for more precise cancer diagnosis. However, accurately segmenting important regions with high diagnostic value remains a great challenge because CNNs and the encoder-decoder (codec) structure are limited in aggregating detailed and non-local contextual information. In this work, we propose a high-performance, low-computation solution. METHODS Inspired by the working principle of the fovea in visual neuroscience, we propose Fovea-UNet, a novel U-Net-based network framework for cancer segmentation that adaptively adjusts the resolution according to the importance of the information and selectively focuses on the regions most relevant to colorectal LNM. Specifically, we design an effective, adaptively optimized pooling operation called Fovea Pooling (FP), which dynamically aggregates detailed and non-local contextual information according to pixel-level feature importance. In addition, an improved lightweight backbone network based on GhostNet is adopted to reduce the computational cost caused by FP. RESULTS Experimental results show that the proposed framework achieves higher performance than other state-of-the-art segmentation networks, with 79.38% IoU, 88.51% DSC, 92.82% sensitivity and 84.57% precision on the LNM dataset, and the parameter size is reduced to 23.23 MB. CONCLUSIONS The proposed framework can provide a valid tool for cancer diagnosis, especially for LNM of colorectal cancer.
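The four figures in the RESULTS sentence are linked by standard identities: DSC is the harmonic mean of precision and sensitivity, and IoU = DSC / (2 - DSC). The short check below recomputes DSC and IoU from the reported precision and sensitivity. It assumes the metrics were computed from pooled pixel counts, which the abstract does not state; if they were averaged per image, the identities hold only approximately. The function names are hypothetical.

```python
def dsc_from_precision_sensitivity(precision, sensitivity):
    """Dice coefficient as the harmonic mean of precision and sensitivity."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def iou_from_dsc(dsc):
    """Convert a Dice coefficient to IoU (Jaccard index)."""
    return dsc / (2 - dsc)


# Values reported in the abstract, written as fractions.
precision, sensitivity = 0.8457, 0.9282

dsc = dsc_from_precision_sensitivity(precision, sensitivity)
iou = iou_from_dsc(dsc)
print(f"DSC = {dsc:.4f}, IoU = {iou:.4f}")
```

The recomputed values agree with the reported 88.51% DSC and 79.38% IoU to within a few hundredths of a percentage point, so the four reported figures are mutually consistent.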
Affiliation(s)
- Yajiao Liu: School of Electrical and Information Engineering, Tianjin University, Tianjin, China.
- Jiang Wang: School of Electrical and Information Engineering, Tianjin University, Tianjin, China.
- Chenpeng Wu: Department of Pathology, Tangshan Gongren Hospital, Tangshan, China.
- Liyun Liu: Department of Pathology, Tangshan Gongren Hospital, Tangshan, China.
- Zhiyong Zhang: Department of Pathology, Tangshan Gongren Hospital, Tangshan, China.
- Haitao Yu: School of Electrical and Information Engineering, Tianjin University, Tianjin, China.
5. Khan S, Ali H, Shah Z. Identifying the role of vision transformer for skin cancer-A scoping review. Front Artif Intell 2023; 6:1202990. [PMID: 37529760 PMCID: PMC10388102 DOI: 10.3389/frai.2023.1202990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Accepted: 07/03/2023] [Indexed: 08/03/2023] [Open Access]
Abstract
Introduction Detecting and accurately diagnosing early melanocytic lesions is challenging due to extensive intra- and inter-observer variability. Dermoscopy images are widely used to identify and study skin cancer, but the blurred boundaries between lesions and surrounding tissues can lead to incorrect identification. Artificial Intelligence (AI) models, including vision transformers, have been proposed as a solution, but variations in symptoms and underlying effects hinder their performance. Objective This scoping review synthesizes and analyzes the literature that uses vision transformers for skin lesion detection. Methods The review follows the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines. The review searched online repositories such as IEEE Xplore, Scopus, Google Scholar, and PubMed to retrieve relevant articles. After screening and pre-processing, 28 studies that fulfilled the inclusion criteria were included. Results and discussions The review found that the use of vision transformers for skin cancer detection increased rapidly from 2020 to 2022 and has shown outstanding performance on dermoscopy images. Along with highlighting intrinsic visual ambiguities, irregular skin lesion shapes, and many other challenges, the review also discusses the key problems that undermine the trustworthiness of vision transformers in skin cancer diagnosis. This review provides new insights for practitioners and researchers to understand the current state of knowledge in this specialized research domain and outlines the best segmentation techniques to identify accurate lesion boundaries and perform melanoma diagnosis. These findings will ultimately assist practitioners and researchers in making more reliable decisions promptly.
6. Tang S, Yu X, Cheang CF, Liang Y, Zhao P, Yu HH, Choi IC. Transformer-based multi-task learning for classification and segmentation of gastrointestinal tract endoscopic images. Comput Biol Med 2023; 157:106723. [PMID: 36907035 DOI: 10.1016/j.compbiomed.2023.106723] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/04/2022] [Revised: 02/04/2023] [Accepted: 02/26/2023] [Indexed: 03/07/2023]
Abstract
Despite being widely used to help endoscopists identify gastrointestinal (GI) tract diseases through classification and segmentation, models based on convolutional neural networks (CNNs) have difficulty distinguishing between some ambiguous, similar-looking types of lesions in endoscopic images, and are hard to train when labeled datasets are lacking. These limitations prevent CNNs from further improving diagnostic accuracy. To address these challenges, we first propose a multi-task network (TransMT-Net) capable of learning two tasks simultaneously (classification and segmentation). It uses a transformer to learn global features and combines this with the strength of CNNs in learning local features, so as to predict lesion types and regions in GI tract endoscopic images more accurately. We further adopt active learning in TransMT-Net to tackle its hunger for labeled images. A dataset was created from the CVC-ClinicDB dataset, Macau Kiang Wu Hospital, and Zhongshan Hospital to evaluate model performance. The experimental results show that our model achieved 96.94% accuracy in the classification task and 77.76% Dice Similarity Coefficient in the segmentation task, outperforming other models on our test set. Active learning also benefited the model when starting from a small initial training set: with only 30% of the initial training set, its performance was comparable to that of most comparable models trained on the full training set. Consequently, the proposed TransMT-Net has demonstrated its potential on GI tract endoscopic images, and through active learning it can alleviate the shortage of labeled images.
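The abstract says active learning is used to reduce the need for labeled images but does not name the query strategy. As an illustration only, the sketch below uses entropy-based uncertainty sampling, a common choice that is assumed here rather than taken from the paper: the unlabeled images whose predicted class probabilities are least certain are sent for annotation first. The function names, image identifiers, and probabilities are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability vector (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(unlabeled_probs, budget):
    """Return the `budget` image ids with the most uncertain predictions.

    `unlabeled_probs` maps an image id to the model's predicted class
    probabilities for that image.
    """
    ranked = sorted(
        unlabeled_probs,
        key=lambda image_id: entropy(unlabeled_probs[image_id]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:budget]


# Hypothetical predictions for three unlabeled endoscopic images.
pool = {
    "img_a": [0.98, 0.01, 0.01],  # confident prediction
    "img_b": [0.40, 0.35, 0.25],  # very uncertain
    "img_c": [0.70, 0.20, 0.10],  # moderately uncertain
}
picked = select_for_labeling(pool, budget=2)
print(picked)
```

In a full loop, the selected images would be annotated, added to the training set, and the model retrained before the next selection round.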
Affiliation(s)
- Suigu Tang: Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Macao Special Administrative Region of China.
- Xiaoyuan Yu: Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Macao Special Administrative Region of China.
- Chak Fong Cheang: Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Macao Special Administrative Region of China.
- Yanyan Liang: Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Macao Special Administrative Region of China.
- Penghui Zhao: Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Macao Special Administrative Region of China.
- Hon Ho Yu: Kiang Wu Hospital, Macao Special Administrative Region of China.
- I Cheong Choi: Kiang Wu Hospital, Macao Special Administrative Region of China.
7. Hasan MK, Ahamad MA, Yap CH, Yang G. A survey, review, and future trends of skin lesion segmentation and classification. Comput Biol Med 2023; 155:106624. [PMID: 36774890 DOI: 10.1016/j.compbiomed.2023.106624] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/20/2022] [Revised: 01/04/2023] [Accepted: 01/28/2023] [Indexed: 02/03/2023]
Abstract
The Computer-aided Diagnosis or Detection (CAD) approach for skin lesion analysis is an emerging field of research that has the potential to alleviate the burden and cost of skin cancer screening. Researchers have recently indicated increasing interest in developing such CAD systems, with the intention of providing a user-friendly tool to dermatologists to reduce the challenges encountered or associated with manual inspection. This article aims to provide a comprehensive literature survey and review of a total of 594 publications (356 for skin lesion segmentation and 238 for skin lesion classification) published between 2011 and 2022. These articles are analyzed and summarized in a number of different ways to contribute vital information regarding the methods for the development of CAD systems. These ways include: relevant and essential definitions and theories, input data (dataset utilization, preprocessing, augmentations, and fixing imbalance problems), method configuration (techniques, architectures, module frameworks, and losses), training tactics (hyperparameter settings), and evaluation criteria. We intend to investigate a variety of performance-enhancing approaches, including ensemble and post-processing. We also discuss these dimensions to reveal their current trends based on utilization frequencies. In addition, we highlight the primary difficulties associated with evaluating skin lesion segmentation and classification systems using minimal datasets, as well as the potential solutions to these difficulties. Findings, recommendations, and trends are disclosed to inform future research on developing an automated and robust CAD system for skin lesion analysis.
Affiliation(s)
- Md Kamrul Hasan: Department of Bioengineering, Imperial College London, UK; Department of Electrical and Electronic Engineering (EEE), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh.
- Md Asif Ahamad: Department of Electrical and Electronic Engineering (EEE), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh.
- Choon Hwai Yap: Department of Bioengineering, Imperial College London, UK.
- Guang Yang: National Heart and Lung Institute, Imperial College London, UK; Cardiovascular Research Centre, Royal Brompton Hospital, UK.