1
Wang H, Ahn E, Bi L, Kim J. Self-supervised multi-modality learning for multi-label skin lesion classification. Comput Methods Programs Biomed 2025;265:108729. [PMID: 40184849] [DOI: 10.1016/j.cmpb.2025.108729]
Abstract
BACKGROUND: The clinical diagnosis of skin lesions involves the analysis of dermoscopic and clinical modalities. Dermoscopic images provide detailed views of surface structures, while clinical images offer complementary macroscopic information. Clinicians frequently use the seven-point checklist as an auxiliary tool for melanoma diagnosis and for identifying lesion attributes. Supervised deep learning approaches, such as convolutional neural networks, have performed well using dermoscopic and clinical modalities (multi-modality) and further enhanced classification by predicting seven skin lesion attributes (multi-label). However, the performance of these approaches relies on the availability of large-scale labeled data, which are costly and time-consuming to obtain, even more so when annotating multiple attributes. METHODS: To reduce the dependency on large labeled datasets, we propose a self-supervised learning (SSL) algorithm for multi-modality multi-label skin lesion classification. Compared with single-modality SSL, our algorithm enables multi-modality SSL by maximizing the similarities between paired dermoscopic and clinical images from different views. We introduce a novel multi-modal and multi-label SSL strategy that generates surrogate pseudo-multi-labels for seven skin lesion attributes through clustering analysis. A label-relation-aware module is proposed to refine each pseudo-label embedding, capturing the interrelationships between pseudo-multi-labels. We further illustrate the interrelationships of skin lesion attributes and their relationships with clinical diagnoses using an attention visualization technique. RESULTS: The proposed algorithm was validated using the well-benchmarked seven-point skin lesion dataset. Our results demonstrate that our method outperforms state-of-the-art SSL counterparts. Improvements in the area under the receiver operating characteristic curve, precision, sensitivity, and specificity were observed across various lesion attributes and melanoma diagnoses. CONCLUSIONS: Our self-supervised learning algorithm offers a robust and efficient solution for multi-modality multi-label skin lesion classification, reducing the reliance on large-scale labeled data. By effectively capturing and leveraging the complementary information between dermoscopic and clinical images and the interrelationships between lesion attributes, our approach holds the potential to improve clinical diagnostic accuracy in dermatology.
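To make the cross-modal objective concrete, the following is a minimal sketch of paired-view similarity maximization of the kind the abstract describes, written in PyTorch; the InfoNCE form, temperature, and function names are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of a cross-modal contrastive objective, assuming paired
# dermoscopic/clinical embeddings; the loss form and temperature are
# illustrative assumptions, not the published method.
import torch
import torch.nn.functional as F

def cross_modal_nce(z_derm: torch.Tensor, z_clin: torch.Tensor, tau: float = 0.1):
    """InfoNCE loss pulling paired dermoscopic/clinical views together.

    z_derm, z_clin: (B, D) embeddings of the two modalities for the same lesions.
    """
    z_derm = F.normalize(z_derm, dim=1)
    z_clin = F.normalize(z_clin, dim=1)
    logits = z_derm @ z_clin.t() / tau                       # (B, B) similarities
    targets = torch.arange(z_derm.size(0), device=z_derm.device)  # positives on diagonal
    # Symmetrize: each modality must retrieve its paired view in the other.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

The symmetric form lets each modality retrieve its paired partner, which is the essence of the multi-modality SSL objective described above.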
Affiliation(s)
- Hao Wang
- School of Computer Science, Faculty of Engineering, The University of Sydney, Sydney, NSW 2006, Australia; Institute of Translational Medicine, National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China.
- Euijoon Ahn
- College of Science and Engineering, James Cook University, Cairns, QLD 4870, Australia.
- Lei Bi
- Institute of Translational Medicine, National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China.
- Jinman Kim
- School of Computer Science, Faculty of Engineering, The University of Sydney, Sydney, NSW 2006, Australia.
2
Farndale L, Insall R, Yuan K. TriDeNT: Triple deep network training for privileged knowledge distillation in histopathology. Med Image Anal 2025;102:103479. [PMID: 40174325] [DOI: 10.1016/j.media.2025.103479]
Abstract
Computational pathology models rarely utilise data that will not be available for inference. This means most models cannot learn from highly informative data such as additional immunohistochemical (IHC) stains and spatial transcriptomics. We present TriDeNT, a novel self-supervised method for utilising privileged data that is not available during inference to improve performance. We demonstrate the efficacy of this method for a range of different paired data including immunohistochemistry, spatial transcriptomics and expert nuclei annotations. In all settings, TriDeNT outperforms other state-of-the-art methods in downstream tasks, with observed improvements of up to 101%. Furthermore, we provide qualitative and quantitative measurements of the features learned by these models and how they differ from baselines. TriDeNT offers a novel method to distil knowledge from scarce or costly data during training, creating significantly better models for routine inputs.
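For orientation, here is a deliberately simplified two-branch sketch of the privileged-training idea (TriDeNT itself trains three networks); the cosine alignment loss and class names are assumptions for illustration only.

```python
# Hedged sketch of privileged-pair training: a primary encoder learns from
# routine inputs while a privileged encoder sees paired data (e.g., an IHC
# stain) only at training time. Simplified to two branches for clarity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrivilegedPairTrainer(nn.Module):
    def __init__(self, primary_enc: nn.Module, privileged_enc: nn.Module):
        super().__init__()
        self.primary_enc = primary_enc        # kept for inference
        self.privileged_enc = privileged_enc  # discarded after training

    def loss(self, x_primary: torch.Tensor, x_privileged: torch.Tensor):
        z_p = F.normalize(self.primary_enc(x_primary), dim=1)
        z_v = F.normalize(self.privileged_enc(x_privileged), dim=1)
        # Align paired embeddings; only the primary branch is used at test time.
        return 1.0 - (z_p * z_v).sum(dim=1).mean()
```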
Affiliation(s)
- Lucas Farndale
- School of Cancer Sciences, University of Glasgow, Scotland, UK; Cancer Research UK Scotland Institute, Scotland, UK; School of Computing Science, University of Glasgow, Scotland, UK; School of Mathematics and Statistics, University of Glasgow, Scotland, UK.
- Robert Insall
- School of Cancer Sciences, University of Glasgow, Scotland, UK; Cancer Research UK Scotland Institute, Scotland, UK; Division of Biosciences, University College London, England, UK
- Ke Yuan
- School of Cancer Sciences, University of Glasgow, Scotland, UK; Cancer Research UK Scotland Institute, Scotland, UK; School of Computing Science, University of Glasgow, Scotland, UK.
3
Noman A, Beiji Z, Zhu C, Alhabib M, Al-Sabri R. FEGGNN: Feature-Enhanced Gated Graph Neural Network for robust few-shot skin disease classification. Comput Biol Med 2025;189:109902. [PMID: 40056840] [DOI: 10.1016/j.compbiomed.2025.109902]
Abstract
Accurate and timely classification of skin diseases is essential for effective dermatological diagnosis. However, the limited availability of annotated images, particularly for rare or novel conditions, poses a significant challenge. Although few-shot learning (FSL) methods in computer-aided diagnosis (CAD) can decrease the dependence on extensive labeled data, their efficacy is often diminished by these challenges, particularly the catastrophic forgetting defect during the sequence of few-shot tasks. To address these challenges, we propose a Feature Enhanced Gated Graph Neural Network (FEGGNN) framework to improve the few-shot classification of skin diseases. The FEGGNN leverages an efficient Asymmetric Convolutional Network (ACNet) to extract high-quality feature maps from skin lesion images, which are subsequently used to construct a graph where nodes represent feature vectors and edges indicate similarities between samples. The core of FEGGNN consists of multiple aggregation blocks within the Graph Neural Network (GNN) framework, which iteratively refine node and edge features. Each block updates node features by aggregating information from neighboring nodes, weighted by edge features, to capture contextual relationships. Simultaneously, Gated Recurrent Units (GRUs) model long-term dependencies across tasks, enabling effective knowledge transfer and mitigating catastrophic forgetting. The Efficient Channel Attention (ECA) mechanism further enhances edge feature updates by focusing on the most relevant feature channels, optimizing edge weight computation. This iterative refinement process enables FEGGNN to progressively enhance feature representations, ensuring robust performance in diverse few-shot classification tasks. FEGGNN's superior ability to generalize to unseen classes is demonstrated by its state-of-the-art performance, achieving 84.90% accuracy on Derm7pt and 95.19% on SD-198 in 2-way 5-shot settings.
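The description above maps onto a simple message-passing pattern; the sketch below shows one edge-weighted aggregation block with a GRU node update, where layer sizes and the edge-update rule are assumptions rather than the published architecture.

```python
# Illustrative sketch of one edge-weighted aggregation block with a gated
# (GRU) node update, loosely following the description above.
import torch
import torch.nn as nn

class GatedAggregationBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.gru = nn.GRUCell(dim, dim)
        self.edge_mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                      nn.Linear(dim, 1))

    def forward(self, nodes: torch.Tensor, edges: torch.Tensor):
        # nodes: (N, D) sample feature vectors; edges: (N, N) similarity weights.
        messages = edges @ nodes           # aggregate neighbor features
        nodes = self.gru(messages, nodes)  # gated node update
        # Recompute edge weights from updated node pairs.
        n = nodes.size(0)
        pairs = torch.cat([nodes.unsqueeze(1).expand(n, n, -1),
                           nodes.unsqueeze(0).expand(n, n, -1)], dim=-1)
        edges = torch.softmax(self.edge_mlp(pairs).squeeze(-1), dim=-1)
        return nodes, edges
```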
Affiliation(s)
- Abdulrahman Noman
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Zou Beiji
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Chengzhang Zhu
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
- Mohammed Alhabib
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Raeed Al-Sabri
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
4
Woerner S, Jaques A, Baumgartner CF. A comprehensive and easy-to-use multi-domain multi-task medical imaging meta-dataset. Sci Data 2025;12:666. [PMID: 40253434] [PMCID: PMC12009356] [DOI: 10.1038/s41597-025-04866-4]
Abstract
While the field of medical image analysis has undergone a transformative shift with the integration of machine learning techniques, the main challenge of these techniques is often the scarcity of large, diverse, and well-annotated datasets. Medical images vary in format, size, and other parameters and therefore require extensive preprocessing and standardization for use in machine learning. Addressing these challenges, we introduce the Medical Imaging Meta-Dataset (MedIMeta), a novel multi-domain, multi-task meta-dataset. MedIMeta contains 19 medical imaging datasets spanning 10 different domains and encompassing 54 distinct medical tasks, all of which are standardized to the same format and readily usable in PyTorch or other ML frameworks. We perform a technical validation of MedIMeta, demonstrating its utility through fully supervised and cross-domain few-shot learning baselines.
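Because every MedIMeta task is standardized to a common image format, a single generic PyTorch loop can, in principle, cover all of them; the directory layout and loader below are hypothetical illustrations, not MedIMeta's own API.

```python
# Hypothetical usage sketch: one generic loader for any standardized task.
# The folder layout "medimeta/derm/images_by_class" is an assumption.
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
task = datasets.ImageFolder("medimeta/derm/images_by_class", transform=tfm)
loader = DataLoader(task, batch_size=32, shuffle=True)

for images, labels in loader:  # identical loop shape for any of the 54 tasks
    pass  # train / evaluate any torch model here
```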
Affiliation(s)
- Stefano Woerner
- Cluster of Excellence "Machine Learning: New Perspectives for Science", University of Tübingen, Tübingen, Germany.
- Arthur Jaques
- Cluster of Excellence "Machine Learning: New Perspectives for Science", University of Tübingen, Tübingen, Germany
- Christian F Baumgartner
- Cluster of Excellence "Machine Learning: New Perspectives for Science", University of Tübingen, Tübingen, Germany
- Faculty of Health Sciences and Medicine, University of Lucerne, Lucerne, Switzerland
5
Khan A, Sajid MZ, Khan NA, Youssef A, Abbas Q. CAD-Skin: A Hybrid Convolutional Neural Network-Autoencoder Framework for Precise Detection and Classification of Skin Lesions and Cancer. Bioengineering (Basel) 2025;12:326. [PMID: 40281686] [PMCID: PMC12025204] [DOI: 10.3390/bioengineering12040326]
Abstract
Skin cancer is a class of disorders defined by the growth of abnormal cells on the body. Accurately identifying and diagnosing skin lesions is difficult because skin malignancies share many common characteristics and a wide range of morphologies. To face this challenge, deep learning algorithms have been proposed. In recent research, deep learning algorithms have shown diagnostic efficacy comparable to dermatologists in image-based skin lesion diagnosis. This work proposes a novel deep learning algorithm to detect skin cancer. The proposed CAD-Skin system detects and classifies skin lesions using deep convolutional neural networks and autoencoders to improve the classification efficiency of skin cancer. The CAD-Skin system was designed and developed using a modern preprocessing pipeline that combines multi-scale retinex, gamma correction, unsharp masking, and contrast-limited adaptive histogram equalization. In this work, we implemented a data augmentation strategy to deal with unbalanced datasets. This step improves the model's resilience to different pigmented skin conditions and avoids overfitting. Additionally, a Quantum Support Vector Machine (QSVM) algorithm is integrated for final-stage classification. Our proposed CAD-Skin enhances category recognition for different skin disease severities, including actinic keratosis, malignant melanoma, and other skin cancers. The proposed system was tested using the PAD-UFES-20-Modified, ISIC-2018, and ISIC-2019 datasets. The system reached accuracy rates of 98%, 99%, and 99%, respectively, which is higher than state-of-the-art work in the literature. The minimum accuracy achieved for individual skin disorders was 97.43%. Our research demonstrates that the proposed CAD-Skin provides precise diagnosis and timely detection of skin abnormalities, diversifying options for doctors and enhancing patient satisfaction during medical practice.
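A sketch of the kind of preprocessing chain described above (gamma correction, unsharp masking, CLAHE), using OpenCV; parameter values are illustrative assumptions and the multi-scale retinex step is omitted for brevity.

```python
# Illustrative preprocessing chain; gamma, sharpening, and CLAHE parameters
# are assumptions, not the published configuration.
import cv2
import numpy as np

def preprocess(bgr: np.ndarray) -> np.ndarray:
    # Gamma correction to normalize illumination.
    gamma = 1.2
    lut = ((np.arange(256) / 255.0) ** (1.0 / gamma) * 255).astype(np.uint8)
    img = cv2.LUT(bgr, lut)
    # Unsharp masking to emphasize lesion borders.
    blur = cv2.GaussianBlur(img, (0, 0), sigmaX=3)
    img = cv2.addWeighted(img, 1.5, blur, -0.5, 0)
    # CLAHE on the L channel for local contrast enhancement.
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    l = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(l)
    return cv2.cvtColor(cv2.merge([l, a, b]), cv2.COLOR_LAB2BGR)
```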
Affiliation(s)
- Abdullah Khan
- Department of Computer Software Engineering, Military College of Signals, National University of Science and Technology, Islamabad 44000, Pakistan
- Muhammad Zaheer Sajid
- Department of Computer Software Engineering, Military College of Signals, National University of Science and Technology, Islamabad 44000, Pakistan
- Nauman Ali Khan
- Department of Computer Software Engineering, Military College of Signals, National University of Science and Technology, Islamabad 44000, Pakistan
- Ayman Youssef
- Department of Computers and Systems, Electronics Research Institute, Cairo 12622, Egypt
- Qaisar Abbas
- College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
6
Patrício C, Teixeira LF, Neves JC. A two-step concept-based approach for enhanced interpretability and trust in skin lesion diagnosis. Comput Struct Biotechnol J 2025;28:71-79. [PMID: 40093651] [PMCID: PMC11907460] [DOI: 10.1016/j.csbj.2025.02.013]
Abstract
The main challenges hindering the adoption of deep learning-based systems in clinical settings are the scarcity of annotated data and the lack of interpretability and trust in these systems. Concept Bottleneck Models (CBMs) offer inherent interpretability by constraining the final disease prediction on a set of human-understandable concepts. However, this inherent interpretability comes at the cost of greater annotation burden. Additionally, adding new concepts requires retraining the entire system. In this work, we introduce a novel two-step methodology that addresses both of these challenges. By simulating the two stages of a CBM, we utilize a pretrained Vision Language Model (VLM) to automatically predict clinical concepts, and an off-the-shelf Large Language Model (LLM) to generate disease diagnoses grounded on the predicted concepts. Furthermore, our approach supports test-time human intervention, enabling corrections to predicted concepts, which improves final diagnoses and enhances transparency in decision-making. We validate our approach on three skin lesion datasets, demonstrating that it outperforms traditional CBMs and state-of-the-art explainable methods, all without requiring any training and utilizing only a few annotated examples. The code is available at https://github.com/CristianoPatricio/2-step-concept-based-skin-diagnosis.
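Structurally, the two-step pipeline can be summarized as below; `vlm_score_concepts` and `llm_diagnose` are hypothetical placeholders for the model calls (the authors' actual code is in the linked repository), and the concept list and threshold are illustrative.

```python
# Structural sketch of the two-step concept-based pipeline; the two callables
# are hypothetical stand-ins for VLM and LLM inference.
CONCEPTS = ["atypical pigment network", "blue-whitish veil", "irregular streaks"]

def two_step_diagnosis(image, vlm_score_concepts, llm_diagnose):
    # Step 1: the VLM scores each clinical concept for the image.
    scores = {c: vlm_score_concepts(image, c) for c in CONCEPTS}
    present = [c for c, s in scores.items() if s > 0.5]
    # Test-time human intervention: a clinician may edit `present` here.
    # Step 2: the LLM produces a diagnosis grounded in the predicted concepts.
    prompt = f"Concepts observed in the lesion: {', '.join(present) or 'none'}. Diagnosis?"
    return llm_diagnose(prompt)
```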
Affiliation(s)
- Cristiano Patrício
- Universidade da Beira Interior and NOVA LINCS, Portugal
- INESC TEC, Portugal
- Luís F. Teixeira
- Faculdade de Engenharia da Universidade do Porto, Portugal
- INESC TEC, Portugal
- João C. Neves
- Universidade da Beira Interior and NOVA LINCS, Portugal
7
Zuo L, Wang Z, Wang Y. A multi-stage multi-modal learning algorithm with adaptive multimodal fusion for improving multi-label skin lesion classification. Artif Intell Med 2025;162:103091. [PMID: 40015211] [DOI: 10.1016/j.artmed.2025.103091]
Abstract
Skin cancer occurs frequently and has become a major contributor to both cancer incidence and mortality. Accurate and timely diagnosis of skin cancer holds the potential to save lives. Deep learning-based methods have demonstrated significant advancements in the screening of skin cancers. However, most current approaches rely on a single modality input for diagnosis, thereby missing out on valuable complementary information that could enhance accuracy. Although some multimodal-based methods exist, they often lack adaptability and fail to fully leverage multimodal information. In this paper, we introduce a novel uncertainty-based hybrid fusion strategy for a multi-modal learning algorithm aimed at skin cancer diagnosis. Our approach combines three different modalities: clinical images, dermoscopy images, and metadata, to make the final classification. For the fusion of the two image modalities, we employ an intermediate fusion strategy that considers the similarity between clinical and dermoscopy images to extract features containing both complementary and correlated information. To capture the correlated information, we utilize cosine similarity, and we employ concatenation as the means of integrating complementary information. In the fusion of image and metadata modalities, we leverage uncertainty to obtain confident late fusion results, allowing our method to adaptively combine the information from different modalities. We conducted comprehensive experiments using a popular publicly available skin disease diagnosis dataset, and the results of these experiments demonstrate the effectiveness of our proposed method. Our proposed fusion algorithm could enhance the clinical applicability of automated skin lesion classification, offering a more robust and adaptive way to make automatic diagnoses with the help of the uncertainty mechanism. Code is available at https://github.com/Zuo-Lihan/CosCatNet-Adaptive_Fusion_Algorithm.
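A minimal sketch of the intermediate fusion idea for the two image modalities, assuming (B, D) feature vectors: cosine similarity captures correlated information while concatenation preserves complementary features. Shapes and the weighting scheme are assumptions, not the released implementation.

```python
# Sketch of similarity-aware intermediate fusion of two image modalities.
import torch
import torch.nn.functional as F

def fuse_image_modalities(f_clin: torch.Tensor, f_derm: torch.Tensor):
    # f_clin, f_derm: (B, D) features from clinical / dermoscopy encoders.
    sim = F.cosine_similarity(f_clin, f_derm, dim=1)       # (B,) correlation
    correlated = sim.unsqueeze(1) * (f_clin + f_derm)      # similarity-weighted
    complementary = torch.cat([f_clin, f_derm], dim=1)     # kept side by side
    return torch.cat([correlated, complementary], dim=1)   # (B, 3D) fused feature
```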
Affiliation(s)
- Lihan Zuo
- School of Computer and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610000, PR China
- Zizhou Wang
- Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Yan Wang
- Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore.
8
Shakya M, Patel R, Joshi S. A comprehensive analysis of deep learning and transfer learning techniques for skin cancer classification. Sci Rep 2025;15:4633. [PMID: 39920179] [PMCID: PMC11805976] [DOI: 10.1038/s41598-024-82241-w]
Abstract
Accurate and early diagnosis of melanoma is challenging due to the unique characteristics and varied shapes of skin lesions. To address this issue, the current study examines various deep learning-based approaches and provides an effective approach for classifying dermoscopic images into two categories of skin lesions. This research focuses on skin cancer images and provides solutions using deep learning approaches. Three approaches for classifying skin cancer images are investigated: (1) utilizing three fine-tuned pre-trained networks (VGG19, ResNet18, and MobileNet_V2) as classifiers; (2) employing three pre-trained networks (ResNet-18, VGG19, and MobileNet v2) as feature extractors in conjunction with four machine learning classifiers (SVM, DT, Naïve Bayes, and KNN); and (3) utilizing a combination of the aforementioned pre-trained networks as feature extractors in conjunction with the same machine learning classifiers. All of these algorithms are trained using segmented images obtained with the active contour approach. Prior to segmentation, a preprocessing step is performed, which involves scaling, denoising, and enhancing the image. Experimental performance is measured on the ISIC 2018 dataset, which contains 3300 images of benign and malignant skin lesions. 80% of the images from the ISIC 2018 dataset are allocated for training, while the remaining 20% are designated for testing. All approaches are trained with different settings for epochs, batch size, and learning rate. The results indicate that concatenating features from the ResNet-18 and MobileNet pre-trained networks with an SVM classifier achieved the maximum accuracy of 92.87%.
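Approach (3) can be sketched as follows with torchvision and scikit-learn, assuming torchvision's weights API (version 0.13 or later); preprocessing and hyperparameters are illustrative assumptions.

```python
# Sketch of concatenated frozen-feature extraction + SVM classification.
import torch
import torchvision.models as models
from sklearn.svm import SVC

resnet = models.resnet18(weights="IMAGENET1K_V1")
resnet.fc = torch.nn.Identity()             # exposes 512-d features
mobilenet = models.mobilenet_v2(weights="IMAGENET1K_V1")
mobilenet.classifier = torch.nn.Identity()  # exposes 1280-d features

@torch.no_grad()
def extract(batch: torch.Tensor):
    resnet.eval(); mobilenet.eval()
    # Concatenate the two feature vectors per image -> (B, 1792).
    return torch.cat([resnet(batch), mobilenet(batch)], dim=1).numpy()

# X_train: (N, 3, 224, 224) tensor of preprocessed, segmented lesion crops.
# clf = SVC(kernel="rbf").fit(extract(X_train), y_train)
```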
Affiliation(s)
- Manishi Shakya
- Department of Computer Application, UIT RGPV, Bhopal, MP, India.
- Ravindra Patel
- Department of Computer Application, UIT RGPV, Bhopal, MP, India
- Sunil Joshi
- Department of Computer Science and Engineering, SATI, Vidisha, MP, India
9
Zhang R, Du X, Yan J, Zhang S. The Decoupling Concept Bottleneck Model. IEEE Trans Pattern Anal Mach Intell 2025;47:1250-1265. [PMID: 39485693] [DOI: 10.1109/tpami.2024.3489597]
Abstract
The Concept Bottleneck Model (CBM) is an interpretable neural network that leverages high-level concepts to explain model decisions and conduct human-machine interaction. However, in real-world scenarios, the deficiency of informative concepts can impede the model's interpretability and subsequent interventions. This paper proves that insufficient concept information can lead to an inherent dilemma of concept and label distortions in CBM. To address this challenge, we propose the Decoupling Concept Bottleneck Model (DCBM), which comprises two phases: 1) DCBM for prediction and interpretation, which decouples heterogeneous information into explicit and implicit concepts while maintaining high label and concept accuracy, and 2) DCBM for human-machine interaction, which automatically corrects labels and traces wrong concepts via mutual information estimation. The construction of the interaction system can be formulated as a lightweight min-max optimization problem. Extensive experiments demonstrate success in alleviating concept/label distortions, especially when concepts are insufficient. In particular, we propose the Concept Contribution Score (CCS) to quantify the interpretability of DCBM. Numerical results demonstrate that CCS can be guaranteed by the Jensen-Shannon divergence constraint in DCBM. Moreover, DCBM supports two effective human-machine interactions, forward intervention and backward rectification, to further promote concept/label accuracy via interaction with human experts.
10
Lin M, Wang S, Ding Y, Zhao L, Wang F, Peng Y. An empirical study of using radiology reports and images to improve intensive care unit mortality prediction. JAMIA Open 2025;8:ooae137. [PMID: 39980476] [PMCID: PMC11841685] [DOI: 10.1093/jamiaopen/ooae137]
Abstract
Objectives: The predictive intensive care unit (ICU) scoring system is crucial for predicting patient outcomes, particularly mortality. Traditional scoring systems rely mainly on structured clinical data from electronic health records, which can overlook important clinical information in narratives and images. Materials and Methods: In this work, we build a deep learning-based survival prediction model that utilizes multimodality data for ICU mortality prediction. Four sets of features are investigated: (1) physiological measurements of Simplified Acute Physiology Score (SAPS) II, (2) common thorax diseases predefined by radiologists, (3) bidirectional encoder representations from transformers-based text representations, and (4) chest X-ray image features. The model was evaluated using the Medical Information Mart for Intensive Care IV dataset. Results: Our model achieves an average C-index of 0.7829 (95% CI, 0.7620-0.8038), surpassing the baseline using only SAPS-II features, which had a C-index of 0.7470 (95% CI, 0.7263-0.7676). Ablation studies further demonstrate the contributions of incorporating predefined labels (2.00% improvement), text features (2.44% improvement), and image features (2.82% improvement). Discussion and Conclusion: The deep learning model demonstrated superior performance to traditional machine learning methods under the same feature fusion setting for ICU mortality prediction. This study highlights the potential of integrating multimodal data into deep learning models to enhance the accuracy of ICU mortality prediction.
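For reference, the C-index reported above can be computed from its definition as the fraction of comparable patient pairs whose predicted risks are ordered consistently with their survival times; the implementation below is a plain restatement of that definition, not the authors' code.

```python
# Harrell's concordance index (C-index) written from its definition.
import numpy as np

def c_index(risk: np.ndarray, time: np.ndarray, event: np.ndarray) -> float:
    """Fraction of comparable pairs the model orders correctly.

    risk: higher value = predicted higher mortality risk; time: follow-up
    times; event: 1 if death observed, 0 if censored.
    """
    num, den = 0.0, 0.0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(n):
            if time[i] < time[j]:
                den += 1
                num += (risk[i] > risk[j]) + 0.5 * (risk[i] == risk[j])
    return num / den

# Example: c_index(np.array([0.9, 0.2, 0.5]), np.array([2, 10, 5]), np.array([1, 0, 1]))
```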
Affiliation(s)
- Mingquan Lin
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10022, United States
- Department of Surgery, University of Minnesota, Minneapolis, MN 55455, United States
- Song Wang
- Cockrell School of Engineering, The University of Texas at Austin, Austin, TX 78712, United States
- Ying Ding
- School of Information, The University of Texas at Austin, Austin, TX 78712, United States
- Lihui Zhao
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, United States
- Fei Wang
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10022, United States
- Yifan Peng
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10022, United States
11
Abhishek K, Jain A, Hamarneh G. Investigating the Quality of DermaMNIST and Fitzpatrick17k Dermatological Image Datasets. Sci Data 2025;12:196. [PMID: 39893183] [PMCID: PMC11787307] [DOI: 10.1038/s41597-025-04382-5]
Abstract
The remarkable progress of deep learning in dermatological tasks has brought us closer to achieving diagnostic accuracies comparable to those of human experts. However, while large datasets play a crucial role in the development of reliable deep neural network models, the quality of data therein and their correct usage are of paramount importance. Several factors can impact data quality, such as the presence of duplicates, data leakage across train-test partitions, mislabeled images, and the absence of a well-defined test partition. In this paper, we conduct meticulous analyses of three popular dermatological image datasets: DermaMNIST, its source HAM10000, and Fitzpatrick17k. We uncover these data quality issues, measure the effects of these problems on the benchmark results, and propose corrections to the datasets. Besides ensuring the reproducibility of our analysis by making our analysis pipeline and the accompanying code publicly available, we aim to encourage similar explorations and to facilitate the identification and addressing of potential data quality issues in other large datasets.
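One simple way to surface duplicate and train-test leakage issues of the kind analyzed above is perceptual hashing; the sketch below uses the third-party `imagehash` package with an assumed Hamming-distance threshold, and is not the authors' pipeline (which is linked from their paper).

```python
# Sketch: flag near-duplicate images shared between train and test splits
# via perceptual hashing; the distance threshold is an assumption.
from PIL import Image
import imagehash

def find_leakage(train_paths, test_paths, max_dist: int = 4):
    train_hashes = {p: imagehash.phash(Image.open(p)) for p in train_paths}
    leaks = []
    for tp in test_paths:
        h = imagehash.phash(Image.open(tp))
        for trp, trh in train_hashes.items():
            if h - trh <= max_dist:  # Hamming distance between hashes
                leaks.append((tp, trp))
    return leaks
```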
Affiliation(s)
- Kumar Abhishek
- School of Computing Science, Simon Fraser University, Burnaby, V5A 1S6, Canada.
- Aditi Jain
- Department of Mathematics, Indian Institute of Technology Delhi, New Delhi, 110016, India
- Ghassan Hamarneh
- School of Computing Science, Simon Fraser University, Burnaby, V5A 1S6, Canada
12
Xiao C, Zhu A, Xia C, Qiu Z, Liu Y, Zhao C, Ren W, Wang L, Dong L, Wang T, Guo L, Lei B. Attention-Guided Learning With Feature Reconstruction for Skin Lesion Diagnosis Using Clinical and Ultrasound Images. IEEE Trans Med Imaging 2025;44:543-555. [PMID: 39208042] [DOI: 10.1109/tmi.2024.3450682]
Abstract
Skin lesions are among the most common diseases, and most categories are highly similar in morphology and appearance. Deep learning models effectively reduce the variability between and within classes, improving diagnostic accuracy. However, existing multi-modal methods are limited to the surface information of lesions in clinical and dermatoscopic modalities, which hinders further improvement of skin lesion diagnostic accuracy. This motivates the study of the depth information of lesions in skin ultrasound. In this paper, we propose a novel skin lesion diagnosis network that combines clinical and ultrasound modalities to fuse the surface and depth information of the lesion and improve diagnostic accuracy. Specifically, we propose an attention-guided learning (AL) module that fuses clinical and ultrasound modalities from both local and global perspectives to enhance feature representation. The AL module consists of two parts: attention-guided local learning (ALL) computes the intra-modality and inter-modality correlations to fuse multi-scale information, which makes the network focus on the local information of each modality, and attention-guided global learning (AGL) fuses global information to further enhance the feature representation. In addition, we propose a feature reconstruction learning (FRL) strategy which encourages the network to extract more discriminative features and corrects the focus of the network to enhance the model's robustness and certainty. We conduct extensive experiments and the results confirm the superiority of our proposed method. Our code is available at: https://github.com/XCL-hub/AGFnet.
13
Li S, Li X, Xu X, Cheng KT. Dynamic Subcluster-Aware Network for Few-Shot Skin Disease Classification. IEEE Trans Neural Netw Learn Syst 2025;36:1872-1883. [PMID: 38090872] [DOI: 10.1109/tnnls.2023.3336765]
Abstract
This article addresses the problem of few-shot skin disease classification by introducing a novel approach called the subcluster-aware network (SCAN) that enhances accuracy in diagnosing rare skin diseases. The key insight motivating the design of SCAN is the observation that skin disease images within a class often exhibit multiple subclusters, characterized by distinct variations in appearance. To improve the performance of few-shot learning (FSL), we focus on learning a high-quality feature encoder that captures the unique subclustered representations within each disease class, enabling better characterization of feature distributions. Specifically, SCAN follows a dual-branch framework, where the first branch learns classwise features to distinguish different skin diseases, and the second branch aims to learn features, which can effectively partition each class into several groups so as to preserve the subclustered structure within each class. To achieve the objective of the second branch, we present a cluster loss to learn image similarities via unsupervised clustering. To ensure that the samples in each subcluster are from the same class, we further design a purity loss to refine the unsupervised clustering results. We evaluate the proposed approach on two public datasets for few-shot skin disease classification. The experimental results validate that our framework outperforms the state-of-the-art methods by around 2%-5% in terms of sensitivity, specificity, accuracy, and F1-score on the SD-198 and Derm7pt datasets.
14
Xu J, Huang K, Zhong L, Gao Y, Sun K, Liu W, Zhou Y, Guo W, Guo Y, Zou Y, Duan Y, Lu L, Wang Y, Chen X, Zhao S. RemixFormer++: A Multi-Modal Transformer Model for Precision Skin Tumor Differential Diagnosis With Memory-Efficient Attention. IEEE Trans Med Imaging 2025;44:320-337. [PMID: 39120989] [DOI: 10.1109/tmi.2024.3441012]
Abstract
Diagnosing malignant skin tumors accurately at an early stage can be challenging due to the ambiguous and even confusing visual characteristics displayed by various categories of skin tumors. To improve diagnostic precision, all available clinical data from multiple sources, particularly clinical images, dermoscopy images, and medical history, could be considered. Aligning with clinical practice, we propose a novel Transformer model, named RemixFormer++, which consists of a clinical image branch, a dermoscopy image branch, and a metadata branch. Given the unique characteristics inherent in clinical and dermoscopy images, specialized attention strategies are adopted for each type. Clinical images are processed through a top-down architecture, capturing both localized lesion details and global contextual information. Conversely, dermoscopy images undergo bottom-up processing with two-level hierarchical encoders, designed to pinpoint fine-grained structural and textural features. A dedicated metadata branch seamlessly integrates non-visual information by encoding relevant patient data. Fusing features from the three branches substantially boosts disease classification accuracy. RemixFormer++ demonstrates exceptional performance on four single-modality datasets (PAD-UFES-20, ISIC 2017/2018/2019). Compared with the previous best method on the public multi-modal Derm7pt dataset, we achieved an absolute 5.3% increase in averaged F1 and 1.2% in accuracy for the classification of five skin tumors. Furthermore, using a large-scale in-house dataset of 10,351 patients with the twelve most common skin tumors, our method obtained an overall classification accuracy of 92.6%. These promising results, on par with or better than the performance of 191 dermatologists in a comprehensive reader study, evidently imply the potential clinical usability of our method.
15
Yan S, Yu Z, Liu C, Ju L, Mahapatra D, Betz-Stablein B, Mar V, Janda M, Soyer P, Ge Z. Prompt-Driven Latent Domain Generalization for Medical Image Classification. IEEE Trans Med Imaging 2025;44:348-360. [PMID: 39137089] [DOI: 10.1109/tmi.2024.3443119]
Abstract
Deep learning models for medical image analysis easily suffer from distribution shifts caused by dataset artifact bias, camera variations, differences in the imaging station, etc., leading to unreliable diagnoses in real-world clinical settings. Domain generalization (DG) methods, which aim to train models on multiple domains to perform well on unseen domains, offer a promising direction to solve the problem. However, existing DG methods assume domain labels of each image are available and accurate, which is typically feasible for only a limited number of medical datasets. To address these challenges, we propose a unified DG framework for medical image classification without relying on domain labels, called Prompt-driven Latent Domain Generalization (PLDG). PLDG consists of unsupervised domain discovery and prompt learning. This framework first discovers pseudo domain labels by clustering the bias-associated style features, then leverages collaborative domain prompts to guide a Vision Transformer to learn knowledge from discovered diverse domains. To facilitate cross-domain knowledge learning between different prompts, we introduce a domain prompt generator that enables knowledge sharing between domain prompts and a shared prompt. A domain mixup strategy is additionally employed for more flexible decision margins and mitigates the risk of incorrect domain assignments. Extensive experiments on three medical image classification tasks and one debiasing task demonstrate that our method can achieve comparable or even superior performance than conventional DG algorithms without relying on domain labels. Our code is publicly available at https://github.com/SiyuanYan1/PLDG/tree/main.
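The unsupervised domain-discovery step can be illustrated as clustering style-like statistics into pseudo domain labels; the choice of channel mean/std statistics and the number of clusters below are assumptions, not PLDG's exact procedure.

```python
# Minimal sketch of pseudo-domain discovery: cluster style-like statistics
# (channel-wise feature means/stds) into pseudo domain labels.
import numpy as np
from sklearn.cluster import KMeans

def pseudo_domain_labels(features: np.ndarray, n_domains: int = 4):
    # features: (N, C, H, W) shallow feature maps, one per image.
    mu = features.mean(axis=(2, 3))      # (N, C) channel means
    sigma = features.std(axis=(2, 3))    # (N, C) channel stds
    style = np.concatenate([mu, sigma], axis=1)
    return KMeans(n_clusters=n_domains, n_init=10).fit_predict(style)
```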
16
Dominguez-Morales JP, Hernandez-Rodriguez JC, Duran-Lopez L, Conejo-Mir J, Pereyra-Rodriguez JJ. Melanoma Breslow Thickness Classification Using Ensemble-Based Knowledge Distillation With Semi-Supervised Convolutional Neural Networks. IEEE J Biomed Health Inform 2025;29:443-455. [PMID: 39302772] [DOI: 10.1109/jbhi.2024.3465929]
Abstract
Melanoma is considered a global public health challenge and is responsible for more than 90% of deaths related to skin cancer. Although the diagnosis of early melanoma is the main goal of dermoscopy, the discrimination between dermoscopic images of in situ and invasive melanomas can be a difficult task even for experienced dermatologists. Recent advances in artificial intelligence in the field of medical image analysis show that its application to dermoscopy, with the aim of supporting and providing a second opinion to the medical expert, could be of great interest. In this work, four datasets from different sources were used to train and evaluate deep learning models on in situ versus invasive melanoma classification and on Breslow thickness prediction. Supervised learning and semi-supervised learning using a multi-teacher ensemble knowledge distillation approach were considered and evaluated using a stratified 5-fold cross-validation scheme. The best supervised models achieved AUCs of 0.8085 ± 0.0242 and 0.8232 ± 0.0666 on the former and latter classification tasks, respectively. The best results were obtained using semi-supervised learning, with the best models achieving AUCs of 0.8547 and 0.8768, respectively. An external test set was also evaluated, where semi-supervision achieved higher performance in all the classification tasks. These results show that semi-supervised learning can improve the performance of trained models in different melanoma classification tasks compared to supervised learning. Automatic deep learning-based diagnosis systems could support medical professionals in their decisions, serving as a second opinion or as a triage tool for medical centers.
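A minimal sketch of the multi-teacher distillation signal on unlabeled images: the averaged softened teacher distributions supervise the student via a KL loss. The temperature and plain averaging are illustrative assumptions.

```python
# Multi-teacher ensemble distillation loss on unlabeled inputs.
import torch
import torch.nn.functional as F

def ensemble_distill_loss(student_logits, teacher_logits_list, T: float = 2.0):
    # Average the teachers' softened distributions (the ensemble target).
    teacher_probs = torch.stack(
        [F.softmax(t / T, dim=1) for t in teacher_logits_list]).mean(dim=0)
    student_log_probs = F.log_softmax(student_logits / T, dim=1)
    # T^2 rescales gradients to the usual magnitude, as in standard KD.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * T * T
```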
17
Matta S, Lamard M, Zhang P, Le Guilcher A, Borderie L, Cochener B, Quellec G. A systematic review of generalization research in medical image classification. Comput Biol Med 2024;183:109256. [PMID: 39427426] [DOI: 10.1016/j.compbiomed.2024.109256]
Abstract
Numerous Deep Learning (DL) classification models have been developed for a large spectrum of medical image analysis applications, which promises to reshape various facets of medical practice. Despite early advances in DL model validation and implementation, which encourage healthcare institutions to adopt them, a fundamental question remains: how can these models effectively handle domain shift? This question is crucial to limit DL models' performance degradation. Medical data are dynamic and prone to domain shift due to multiple factors. Two main shift types can occur over time: (1) covariate shift, mainly arising due to updates to medical equipment, and (2) concept shift, caused by inter-grader variability. To mitigate the problem of domain shift, existing surveys mainly focus on domain adaptation techniques, with an emphasis on covariate shift. More generally, no work has reviewed the state-of-the-art solutions while focusing on the shift types. This paper aims to explore existing domain generalization methods for DL-based classification models through a systematic review of the literature. It proposes a taxonomy based on the shift type the methods aim to solve. Papers were searched and gathered on Scopus up to 10 April 2023, and after eligibility screening and quality evaluation, 77 articles were identified. Exclusion criteria included: lack of methodological novelty (e.g., reviews, benchmarks), experiments conducted on a single mono-center dataset, or articles not written in English. The results of this paper show that learning-based methods are emerging for both shift types. Finally, we discuss future challenges, including the need for improved evaluation protocols and benchmarks, and envisioned future developments to achieve robust, generalized models for medical image classification.
Affiliation(s)
- Sarah Matta
- Université de Bretagne Occidentale, Brest, Bretagne, 29200, France; Inserm, UMR 1101, Brest, F-29200, France.
- Mathieu Lamard
- Université de Bretagne Occidentale, Brest, Bretagne, 29200, France; Inserm, UMR 1101, Brest, F-29200, France
- Philippe Zhang
- Université de Bretagne Occidentale, Brest, Bretagne, 29200, France; Inserm, UMR 1101, Brest, F-29200, France; Evolucare Technologies, Villers-Bretonneux, F-80800, France
- Béatrice Cochener
- Université de Bretagne Occidentale, Brest, Bretagne, 29200, France; Inserm, UMR 1101, Brest, F-29200, France; Service d'Ophtalmologie, CHRU Brest, Brest, F-29200, France
18
Sabir R, Mehmood T. Classification of melanoma skin Cancer based on Image Data Set using different neural networks. Sci Rep 2024;14:29704. [PMID: 39613788] [DOI: 10.1038/s41598-024-75143-4]
Abstract
This paper addresses the pressing issue of melanoma classification by leveraging advanced neural network models, specifically a basic Convolutional Neural Network (CNN), ResNet-18, and EfficientNet-B0. Our objectives encompass presenting and evaluating these models based on established practices in medical image diagnosis, and demonstrating their effectiveness in contributing to the critical task of saving lives through early and accurate melanoma diagnosis. Our methodology involves a multi-stage process, which includes image normalization and augmentation, followed by segmentation, feature extraction, and classification. Notably, the neural network models underwent rigorous evaluation, with EfficientNet-B0 exhibiting exceptional performance as the winning model. EfficientNet-B0 achieved a remarkable accuracy of 97%, surpassing ResNet-18 (87%) and the basic CNN (80%) in classifying malignant and benign cases. In addition to accuracy, a comprehensive set of evaluation metrics was employed for EfficientNet-B0: sensitivity of 99%, specificity of 93%, F1-score of 97%, precision of 95%, and an error rate of 3%. It also demonstrated a Matthews correlation coefficient of 94% and a geometric mean of 1.01. Across these metrics, EfficientNet-B0 consistently outperformed ResNet-18 and the basic CNN. The findings from this research suggest that neural network models, particularly EfficientNet-B0, hold significant promise for precise and efficient melanoma skin cancer detection.
Affiliation(s)
- Rukhsar Sabir
- School of Natural Sciences, National University of Science and Technology, Islamabad, 46000, Pakistan.
- Tahir Mehmood
- School of Natural Sciences, National University of Science and Technology, Islamabad, 46000, Pakistan
19
Yilmaz A, Yasar SP, Gencoglan G, Temelkuran B. DERM12345: A Large, Multisource Dermatoscopic Skin Lesion Dataset with 40 Subclasses. Sci Data 2024;11:1302. [PMID: 39609462] [PMCID: PMC11604664] [DOI: 10.1038/s41597-024-04104-3]
Abstract
Skin lesion datasets provide essential information for understanding various skin conditions and developing effective diagnostic tools. They aid artificial intelligence-based early detection of skin cancer, facilitate treatment planning, and contribute to medical education and research. Published large datasets only partially cover the subclassifications of skin lesions. This limitation highlights the need for more expansive and varied datasets to reduce false predictions and improve failure analysis for skin lesions. This study presents a diverse dataset comprising 12,345 dermatoscopic images with 40 subclasses of skin lesions, collected in Turkiye, covering different skin types in the transition zone between Europe and Asia. Each subgroup contains high-resolution images and expert annotations, providing a strong and reliable basis for future research. The detailed analysis of each subgroup provided in this study facilitates targeted research endeavors and enhances the depth of understanding regarding skin lesions. This dataset distinguishes itself through a diverse structure with its 5 super classes, 15 main classes, 40 subclasses and 12,345 high-resolution dermatoscopic images.
Affiliation(s)
- Abdurrahim Yilmaz
- Imperial College London, Division of Systems Medicine, Department of Metabolism, Digestion, and Reproduction, London, SW7 2AZ, United Kingdom.
- Sirin Pekcan Yasar
- The University of Health Sciences, Haydarpasa Numune Research and Training Hospital, Department of Dermatology and Venereology, Istanbul, 34668, Turkey
- Gulsum Gencoglan
- Istinye University, Liv Hospital Vadistanbul, Department of Dermatology and Venereology, Istanbul, 34010, Turkey.
- Burak Temelkuran
- Imperial College London, Division of Systems Medicine, Department of Metabolism, Digestion, and Reproduction, London, SW7 2AZ, United Kingdom.
20
Bayasi N, Hamarneh G, Garbi R. GC2: Generalizable Continual Classification of Medical Images. IEEE Trans Med Imaging 2024;43:3767-3779. [PMID: 38717881] [DOI: 10.1109/tmi.2024.3398533]
Abstract
Deep learning models have achieved remarkable success in medical image classification. These models are typically trained once on the available annotated images and thus lack the ability of continually learning new tasks (i.e., new classes or data distributions) due to the problem of catastrophic forgetting. Recently, there has been more interest in designing continual learning methods to learn different tasks presented sequentially over time while preserving previously acquired knowledge. However, these methods focus mainly on preventing catastrophic forgetting and are tested under a closed-world assumption; i.e., assuming the test data is drawn from the same distribution as the training data. In this work, we advance the state-of-the-art in continual learning by proposing GC2 for medical image classification, which learns a sequence of tasks while simultaneously enhancing its out-of-distribution robustness. To alleviate forgetting, GC2 employs a gradual culpability-based network pruning to identify an optimal subnetwork for each task. To improve generalization, GC2 incorporates adversarial image augmentation and knowledge distillation approaches for learning generalized and robust representations for each subnetwork. Our extensive experiments on multiple benchmarks in a task-agnostic inference demonstrate that GC2 significantly outperforms baselines and other continual learning methods in reducing forgetting and enhancing generalization. Our code is publicly available at the following link: https://github.com/nourhanb/TMI2024-GC2.
21
Huang Q, Li G. Knowledge graph based reasoning in medical image analysis: A scoping review. Comput Biol Med 2024;182:109100. [PMID: 39244959] [DOI: 10.1016/j.compbiomed.2024.109100]
Abstract
Automated computer-aided diagnosis (CAD) is becoming more significant in the field of medicine due to advancements in computer hardware performance and the progress of artificial intelligence. The knowledge graph is a structure for visually representing knowledge facts. In the last decade, a large body of work based on knowledge graphs has effectively improved the organization and interpretability of large-scale complex knowledge. Introducing knowledge graph inference into CAD is a research direction with significant potential. In this review, we first briefly cover the basic principles and application methods of knowledge graphs. Then, we systematically organize and analyze the research and application of knowledge graphs in medical imaging-assisted diagnosis. We also summarize the shortcomings of the current research, such as medical data barriers and deficiencies, low utilization of multimodal information, and weak interpretability. Finally, we propose future research directions with the potential to address the shortcomings of current approaches.
Affiliation(s)
- Qinghua Huang
- School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, Xi'an, 710072, Shaanxi, China.
- Guanghui Li
- School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, Xi'an, 710072, Shaanxi, China; School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Road, Chang'an District, Xi'an, 710129, Shaanxi, China.
22
Gómez-Martínez V, Chushig-Muzo D, Veierød MB, Granja C, Soguero-Ruiz C. Ensemble feature selection and tabular data augmentation with generative adversarial networks to enhance cutaneous melanoma identification and interpretability. BioData Min 2024;17:46. [PMID: 39478549] [PMCID: PMC11526724] [DOI: 10.1186/s13040-024-00397-7]
Abstract
BACKGROUND: Cutaneous melanoma is the most aggressive form of skin cancer, responsible for most skin cancer-related deaths. Recent advances in artificial intelligence, jointly with the availability of public dermoscopy image datasets, have made it possible to assist dermatologists in melanoma identification. While image feature extraction holds potential for melanoma detection, it often leads to high-dimensional data. Furthermore, most image datasets present the class imbalance problem, where a few classes have numerous samples, whereas others are under-represented. METHODS: In this paper, we propose to combine ensemble feature selection (FS) methods and data augmentation with conditional tabular generative adversarial networks (CTGAN) to enhance melanoma identification in imbalanced datasets. We employed dermoscopy images from two public datasets, PH2 and Derm7pt, which contain melanoma and not-melanoma lesions. To capture intrinsic information from skin lesions, we conducted two feature extraction (FE) approaches, including handcrafted and embedding features. For the former, color, geometric and first-, second-, and higher-order texture features were extracted, whereas for the latter, embeddings were obtained using ResNet-based models. To alleviate the high dimensionality in the FE, ensemble FS with filter methods was used and evaluated. For data augmentation, we conducted a progressive analysis of the imbalance ratio (IR), related to the amount of synthetic samples created, and evaluated the impact on the predictive results. To gain interpretability of predictive models, we used SHAP, bootstrap resampling statistical tests and UMAP visualizations. RESULTS: The combination of ensemble FS, CTGAN, and linear models achieved the best predictive results, reaching AUCROC values of 87% (with support vector machine and IR=0.9) and 76% (with LASSO and IR=1.0) for PH2 and Derm7pt, respectively. We also identified that melanoma lesions were mainly characterized by features related to color, while not-melanoma lesions were characterized by texture features. CONCLUSIONS: Our results demonstrate the effectiveness of ensemble FS and synthetic data in the development of models that accurately identify melanoma. This research advances skin lesion analysis, contributing to both melanoma detection and the interpretation of the main features for its identification.
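A hedged sketch of the CTGAN augmentation step: synthesize minority-class rows from tabular lesion features until a target imbalance ratio is reached. It uses the open-source `ctgan` package; the column name, epochs, and target ratio are assumptions, not the authors' configuration.

```python
# Sketch: oversample the minority (melanoma) class with CTGAN-generated rows.
import pandas as pd
from ctgan import CTGAN

def augment_minority(df: pd.DataFrame, label_col: str = "melanoma",
                     target_ir: float = 1.0, epochs: int = 300):
    minority = df[df[label_col] == 1]
    majority_n = (df[label_col] == 0).sum()
    n_new = int(target_ir * majority_n) - len(minority)
    if n_new <= 0:
        return df  # already at (or above) the target imbalance ratio
    gan = CTGAN(epochs=epochs)
    gan.fit(minority, discrete_columns=[label_col])  # learn minority distribution
    return pd.concat([df, gan.sample(n_new)], ignore_index=True)
```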
Affiliation(s)
- Vanesa Gómez-Martínez
- Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid, 28943, Spain.
- David Chushig-Muzo
- Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid, 28943, Spain
- Marit B Veierød
- Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway
- Conceição Granja
- Norwegian Centre for E-health Research, University Hospital of North Norway, Tromsø, 9019, Norway
- Cristina Soguero-Ruiz
- Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid, 28943, Spain
23
Ullah MA, Zia T, Kim J, Kadry S. An inherently interpretable deep learning model for local explanations using visual concepts. PLoS One 2024;19:e0311879. [PMID: 39466770] [PMCID: PMC11516011] [DOI: 10.1371/journal.pone.0311879]
Abstract
Over the past decade, deep learning has become the leading approach for various computer vision tasks and decision support systems. However, the opaque nature of deep learning models raises significant concerns about their fairness, reliability, and the underlying inferences they make. Many existing methods attempt to approximate the relationship between low-level input features and outcomes. However, humans tend to understand and reason based on high-level concepts rather than low-level input features. To bridge this gap, several concept-based interpretable methods have been developed. Most of these methods compute the importance of each discovered concept for a specific class. However, they often fail to provide local explanations. Additionally, these approaches typically rely on labeled concepts or learn directly from datasets, leading to the extraction of irrelevant concepts. They also tend to overlook the potential of these concepts to interpret model predictions effectively. This research proposes a two-stream model called the Cross-Attentional Fast/Slow Thinking Network (CA-SoftNet) to address these issues. The model is inspired by dual-process theory and integrates two key components: a shallow convolutional neural network (sCNN) as System-I for rapid, implicit pattern recognition and a cross-attentional concept memory network as System-II for transparent, controllable, and logical reasoning. Our evaluation across diverse datasets demonstrates the model's competitive accuracy, achieving 85.6%, 83.7%, 93.6%, and 90.3% on CUB 200-2011, Stanford Cars, ISIC 2016, and ISIC 2017, respectively. This performance outperforms existing interpretable models and is comparable to non-interpretable counterparts. Furthermore, our novel concept extraction method facilitates identifying and selecting salient concepts. These concepts are then used to generate concept-based local explanations that align with human thinking. Additionally, the model's ability to share similar concepts across distinct classes, such as in fine-grained classification, enhances its scalability for large datasets. This feature also induces human-like cognition and reasoning within the proposed framework.
Collapse
Affiliation(s)
- Mirza Ahsan Ullah
- Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
- Department of Software Engineering, University of Gujrat, Gujrat, Pakistan
| | - Tehseen Zia
- Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
| | - Jungeun Kim
- Department of Computer Engineering, Inha University, Incheon, Republic of Korea
| | - Seifedine Kadry
- Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon
- Department of Applied Data Science, Noroff University College, Kristiansand, Norway
| |
Collapse
|
24
|
Nguyen TNQ, García-Rudolph A, Saurí J, Kelleher JD. Multi-task learning for predicting quality-of-life and independence in activities of daily living after stroke: a proof-of-concept study. Front Neurol 2024; 15:1449234. [PMID: 39399874 PMCID: PMC11469734 DOI: 10.3389/fneur.2024.1449234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2024] [Accepted: 08/28/2024] [Indexed: 10/15/2024] Open
Abstract
A health-related (HR) profile is a set of health-related items recording the status of the patient at different follow-up times post-stroke. To support clinicians in designing rehabilitation treatment programs, we propose a novel multi-task learning (MTL) strategy for predicting post-stroke patient HR profiles. The HR profile in this study is measured by the Barthel index (BI) assessment or by the EQ-5D-3L questionnaire. Three datasets are used in this work, and for each dataset six neural network architectures are developed and tested. Results indicate that an MTL architecture combining a pre-trained network for all tasks with a concatenation strategy conditioned by a task grouping method is a promising approach for predicting the HR profile of a patient with stroke at different phases of the patient journey. These models obtained a mean F1-score of 0.434 (standard deviation 0.022; 95% confidence interval [0.428, 0.44]) calculated across all items when predicting BI at 3 months after stroke (MaS), 0.388 (standard deviation 0.029; 95% confidence interval [0.38, 0.397]) when predicting EQ-5D-3L at 6MaS, and 0.462 (standard deviation 0.029; 95% confidence interval [0.454, 0.47]) when predicting EQ-5D-3L at 18MaS. Furthermore, our MTL architecture outperforms the reference single-task learning models and the classic MTL model trained on all tasks in 8 out of 10 tasks when predicting BI at 3MaS, and has better prediction performance than the reference models on all tasks when predicting EQ-5D-3L at 6 and 18MaS. The models we present in this paper are the first to predict the components of the BI or the EQ-5D-3L, and our results demonstrate the potential benefits of using MTL in a health context to predict patient profiles.
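A minimal sketch of the hard-parameter-sharing idea behind such MTL models: one shared trunk with one classification head per HR-profile item, trained with a summed cross-entropy loss. Layer sizes, the number of items, and the number of levels per item are illustrative assumptions; the paper's pre-training, concatenation, and task-grouping strategies are not reproduced.

```python
# Hard-parameter-sharing MTL sketch: one shared trunk, one head per
# HR-profile item (e.g., a Barthel-index component). Sizes are illustrative.
import torch
import torch.nn as nn

class SharedTrunkMTL(nn.Module):
    def __init__(self, n_features, n_tasks=10, n_levels=3, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # one classification head per profile item
        self.heads = nn.ModuleList(
            nn.Linear(hidden, n_levels) for _ in range(n_tasks)
        )

    def forward(self, x):
        z = self.trunk(x)
        return [head(z) for head in self.heads]

def mtl_loss(logits_per_task, targets):
    """targets: (batch, n_tasks) integer class labels per item."""
    ce = nn.CrossEntropyLoss()
    return sum(ce(lg, targets[:, t]) for t, lg in enumerate(logits_per_task))
```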
Collapse
Affiliation(s)
- Thi Nguyet Que Nguyen
- Research Hub 4 - Digital Futures Research Hub, Technological University Dublin, Dublin, Ireland
- Artificial Intelligence in Digital Health and Medicine (AIDHM), Technological University Dublin, Dublin, Ireland
| | - Alejandro García-Rudolph
- Department of Research and Innovation, Institut Guttmann, Institut Universitari de Neurorehabilitació adscrit a la UAB, Barcelona, Spain
- Departament de Medicina, Universitat Autonoma De Barcelona, Bellaterra, Spain
- Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Barcelona, Spain
| | - Joan Saurí
- Department of Research and Innovation, Institut Guttmann, Institut Universitari de Neurorehabilitació adscrit a la UAB, Barcelona, Spain
- Departament de Medicina, Universitat Autonoma De Barcelona, Bellaterra, Spain
- Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Barcelona, Spain
| | - John D. Kelleher
- Artificial Intelligence in Digital Health and Medicine (AIDHM), Technological University Dublin, Dublin, Ireland
- School of Computer Science and Statistics, Trinity College Dublin, ADAPT Research Centre, Dublin, Ireland
| |
Collapse
|
25
|
Alipour N, Burke T, Courtney J. Skin Type Diversity in Skin Lesion Datasets: A Review. CURRENT DERMATOLOGY REPORTS 2024; 13:198-210. [PMID: 39184010 PMCID: PMC11343783 DOI: 10.1007/s13671-024-00440-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 07/22/2024] [Indexed: 08/27/2024]
Abstract
Purpose of review Skin type diversity in image datasets refers to the representation of various skin types. This diversity allows for the verification of comparable performance of a trained model across different skin types. A widespread problem in datasets involving human skin is the lack of verifiable diversity in skin types, making it difficult to evaluate whether the performance of trained models generalizes across different skin types. For example, diversity issues in skin lesion datasets, which are used to train deep learning-based models, often result in lower accuracy for darker skin types, which are typically under-represented in these datasets. Recent findings This issue has been discussed in previous works; however, the reporting of skin types, and the inherent diversity of the datasets, have not been fully assessed. Some works report skin types but do not attempt to assess the representation of each skin type in datasets. Others, focusing on skin lesions, identify the issue but do not measure skin type diversity in the datasets examined. Summary Effort is needed to address these shortcomings and move towards facilitating verifiable diversity. Building on previous works in skin lesion datasets, this review explores the general issue of skin type diversity by investigating and evaluating skin lesion datasets specifically. The main contributions of this work are an evaluation of publicly available skin lesion datasets and their metadata, to assess the frequency and completeness of reporting of skin type, and an investigation into the diversity and representation of each skin type within these datasets. Supplementary Information The online version contains supplementary material available at 10.1007/s13671-024-00440-0.
Collapse
Affiliation(s)
- Neda Alipour
- School of Electrical and Electronic Engineering, TU Dublin, City Campus, Dublin, Ireland
| | - Ted Burke
- School of Electrical and Electronic Engineering, TU Dublin, City Campus, Dublin, Ireland
| | - Jane Courtney
- School of Electrical and Electronic Engineering, TU Dublin, City Campus, Dublin, Ireland
| |
Collapse
|
26
|
Lyakhova UA, Lyakhov PA. Systematic review of approaches to detection and classification of skin cancer using artificial intelligence: Development and prospects. Comput Biol Med 2024; 178:108742. [PMID: 38875908 DOI: 10.1016/j.compbiomed.2024.108742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2024] [Revised: 06/03/2024] [Accepted: 06/08/2024] [Indexed: 06/16/2024]
Abstract
In recent years, there has been a significant improvement in the accuracy of the classification of pigmented skin lesions using artificial intelligence algorithms. Intelligent analysis and classification systems are significantly superior to visual diagnostic methods used by dermatologists and oncologists. However, the application of such systems in clinical practice is severely limited due to a lack of generalizability and risks of potential misclassification. Successful implementation of artificial intelligence-based tools into clinicopathological practice requires a comprehensive study of the effectiveness and performance of existing models, as well as further promising areas for potential research development. The purpose of this systematic review is to investigate and evaluate the accuracy of artificial intelligence technologies for detecting malignant forms of pigmented skin lesions. For the study, 10,589 scientific research and review articles were selected from electronic scientific publishers, of which 171 articles were included in the presented systematic review. All selected scientific articles are distributed according to the proposed neural network algorithms from machine learning to multimodal intelligent architectures and are described in the corresponding sections of the manuscript. This research aims to explore automated skin cancer recognition systems, from simple machine learning algorithms to multimodal ensemble systems based on advanced encoder-decoder models, visual transformers (ViT), and generative and spiking neural networks. In addition, as a result of the analysis, future directions of research, prospects, and potential for further development of automated neural network systems for classifying pigmented skin lesions are discussed.
Collapse
Affiliation(s)
- U A Lyakhova
- Department of Mathematical Modeling, North-Caucasus Federal University, 355017, Stavropol, Russia.
| | - P A Lyakhov
- Department of Mathematical Modeling, North-Caucasus Federal University, 355017, Stavropol, Russia; North-Caucasus Center for Mathematical Research, North-Caucasus Federal University, 355017, Stavropol, Russia.
| |
Collapse
|
27
|
Rasel MA, Abdul Kareem S, Kwan Z, Yong SS, Obaidellah U. Bluish veil detection and lesion classification using custom deep learnable layers with explainable artificial intelligence (XAI). Comput Biol Med 2024; 178:108758. [PMID: 38905895 DOI: 10.1016/j.compbiomed.2024.108758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2023] [Revised: 06/11/2024] [Accepted: 06/12/2024] [Indexed: 06/23/2024]
Abstract
Melanoma, one of the deadliest types of skin cancer, accounts for thousands of fatalities globally. The bluish, blue-whitish, or blue-white veil (BWV) is a critical feature for diagnosing melanoma, yet research into detecting BWV in dermatological images is limited. This study utilizes a non-annotated skin lesion dataset, which is converted into an annotated dataset using a proposed imaging algorithm (color threshold techniques) applied to lesion patches based on color palettes. A deep convolutional neural network (DCNN) is designed and trained separately on three dermoscopic datasets, both individually and in combination, using custom layers instead of standard activation function layers. The model is developed to categorize skin lesions based on the presence of BWV. The proposed DCNN demonstrates superior performance compared to conventional BWV detection models across different datasets. The model achieves a testing accuracy of 85.71% on the augmented PH2 dataset, 95.00% on the augmented ISIC archive dataset, 95.05% on the combined augmented (PH2+ISIC archive) dataset, and 90.00% on the Derm7pt dataset. An explainable artificial intelligence (XAI) algorithm is subsequently applied to interpret the DCNN's decision-making process in BWV detection. The proposed approach, coupled with XAI, significantly improves the detection of BWV in skin lesions, outperforming existing models and providing a robust tool for early melanoma diagnosis.
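Color-threshold pseudo-annotation of the kind described above can be sketched in a few lines. The HSV bounds and the area rule below are illustrative assumptions; the paper derives its thresholds from color palettes on lesion patches.

```python
# Sketch of color-threshold pseudo-labeling for the blue-white veil (BWV).
# The HSV bounds and the 5% area rule are illustrative assumptions.
import cv2

def bwv_pseudo_label(bgr_patch, min_area_frac=0.05):
    hsv = cv2.cvtColor(bgr_patch, cv2.COLOR_BGR2HSV)
    # bluish/blue-white range: hue ~ 90-130 on the OpenCV scale,
    # low-to-mid saturation, fairly high brightness
    mask = cv2.inRange(hsv, (90, 30, 120), (130, 160, 255))
    frac = mask.mean() / 255.0            # fraction of pixels in range
    return int(frac >= min_area_frac)     # 1 = BWV present, 0 = absent
```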
Collapse
Affiliation(s)
- M A Rasel
- Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, 50603, Malaysia
| | - Sameem Abdul Kareem
- Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, 50603, Malaysia
| | - Zhenli Kwan
- Division of Dermatology, Department of Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, 50603, Malaysia
| | - Shin Shen Yong
- Division of Dermatology, Department of Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, 50603, Malaysia
| | - Unaizah Obaidellah
- Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, 50603, Malaysia.
| |
Collapse
|
28
|
Saifullah S, Mercier D, Lucieri A, Dengel A, Ahmed S. The privacy-explainability trade-off: unraveling the impacts of differential privacy and federated learning on attribution methods. Front Artif Intell 2024; 7:1236947. [PMID: 39021435 PMCID: PMC11253022 DOI: 10.3389/frai.2024.1236947] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2023] [Accepted: 06/17/2024] [Indexed: 07/20/2024] Open
Abstract
Since the advent of deep learning (DL), the field has witnessed a continuous stream of innovations. However, the translation of these advancements into practical applications has not kept pace, particularly in safety-critical domains where artificial intelligence (AI) must meet stringent regulatory and ethical standards. This is underscored by the ongoing research in eXplainable AI (XAI) and privacy-preserving machine learning (PPML), which seek to address some limitations associated with these opaque and data-intensive models. Despite brisk research activity in both fields, little attention has been paid to their interaction. This work is the first to thoroughly investigate the effects of privacy-preserving techniques on explanations generated by common XAI methods for DL models. A detailed experimental analysis is conducted to quantify the impact of private training on the explanations provided by DL models, applied to six image datasets and five time series datasets across various domains. The analysis comprises three privacy techniques, nine XAI methods, and seven model architectures. The findings suggest non-negligible changes in explanations through the implementation of privacy measures. Apart from reporting individual effects of PPML on XAI, the paper gives clear recommendations for the choice of techniques in real applications. By unveiling the interdependencies of these pivotal technologies, this research marks an initial step toward resolving the challenges that hinder the deployment of AI in safety-critical settings.
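The privacy side of the trade-off studied here typically comes from differentially private training. A simplified sketch of the core DP-SGD step (per-sample gradient clipping plus Gaussian noise) is shown below; real experiments would use a dedicated library such as Opacus, and the clipping norm and noise multiplier here are placeholders.

```python
# Core DP-SGD step: per-sample gradient clipping + Gaussian noise.
# Simplified sketch; hyperparameters are placeholders, not recommendations.
import torch

def dp_sgd_step(model, loss_fn, xb, yb, lr=0.1, clip=1.0, sigma=1.0):
    params = [p for p in model.parameters() if p.requires_grad]
    accum = [torch.zeros_like(p) for p in params]
    for x, y in zip(xb, yb):                       # per-sample gradients
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in params))
        scale = min(1.0, clip / (norm.item() + 1e-12))  # clip L2 norm
        for a, p in zip(accum, params):
            a += p.grad * scale
    with torch.no_grad():
        for a, p in zip(accum, params):
            noise = torch.randn_like(a) * sigma * clip  # Gaussian mechanism
            p -= lr * (a + noise) / len(xb)
```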
Collapse
Affiliation(s)
- Saifullah Saifullah
- Department of Computer Science, RPTU Kaiserslautern-Landau, Kaiserslautern, Rhineland-Palatinate, Germany
- Smart Data and Knowledge Services (SDS), DFKI GmbH, Kaiserslautern, Rhineland-Palatinate, Germany
| | - Dominique Mercier
- Department of Computer Science, RPTU Kaiserslautern-Landau, Kaiserslautern, Rhineland-Palatinate, Germany
- Smart Data and Knowledge Services (SDS), DFKI GmbH, Kaiserslautern, Rhineland-Palatinate, Germany
| | - Adriano Lucieri
- Department of Computer Science, RPTU Kaiserslautern-Landau, Kaiserslautern, Rhineland-Palatinate, Germany
- Smart Data and Knowledge Services (SDS), DFKI GmbH, Kaiserslautern, Rhineland-Palatinate, Germany
| | - Andreas Dengel
- Department of Computer Science, RPTU Kaiserslautern-Landau, Kaiserslautern, Rhineland-Palatinate, Germany
- Smart Data and Knowledge Services (SDS), DFKI GmbH, Kaiserslautern, Rhineland-Palatinate, Germany
| | - Sheraz Ahmed
- Smart Data and Knowledge Services (SDS), DFKI GmbH, Kaiserslautern, Rhineland-Palatinate, Germany
| |
Collapse
|
29
|
Sinha A, Kawahara J, Pakzad A, Abhishek K, Ruthven M, Ghorbel E, Kacem A, Aouada D, Hamarneh G. DermSynth3D: Synthesis of in-the-wild annotated dermatology images. Med Image Anal 2024; 95:103145. [PMID: 38615432 DOI: 10.1016/j.media.2024.103145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2023] [Revised: 02/11/2024] [Accepted: 03/18/2024] [Indexed: 04/16/2024]
Abstract
In recent years, deep learning (DL) has shown great potential in the field of dermatological image analysis. However, existing datasets in this domain have significant limitations, including a small number of image samples, limited disease conditions, insufficient annotations, and non-standardized image acquisitions. To address these shortcomings, we propose a novel framework called DermSynth3D. DermSynth3D blends skin disease patterns onto 3D textured meshes of human subjects using a differentiable renderer and generates 2D images from various camera viewpoints under chosen lighting conditions in diverse background scenes. Our method adheres to top-down rules that constrain the blending and rendering process to create 2D images with skin conditions that mimic in-the-wild acquisitions, ensuring more meaningful results. The framework generates photo-realistic 2D dermatological images and the corresponding dense annotations for semantic segmentation of the skin, skin conditions, body parts, bounding boxes around lesions, depth maps, and other 3D scene parameters, such as camera position and lighting conditions. DermSynth3D allows for the creation of custom datasets for various dermatology tasks. We demonstrate the effectiveness of data generated using DermSynth3D by training DL models on synthetic data and evaluating them on various dermatology tasks using real 2D dermatological images. We make our code publicly available at https://github.com/sfu-mial/DermSynth3D.
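A 2D stand-in for the blending idea can convey the gist: alpha-composite a lesion texture onto a skin image under a soft mask, yielding both an image and a dense annotation. The actual framework blends on 3D textured meshes with a differentiable renderer; this sketch is purely illustrative and all inputs are hypothetical.

```python
# 2D stand-in for DermSynth3D-style blending: mask-guided compositing of a
# lesion texture onto skin, returning the image and a segmentation mask.
import numpy as np

def blend_lesion(skin, lesion, mask, alpha=0.8):
    """skin, lesion: HxWx3 float arrays in [0,1]; mask: HxW in [0,1]."""
    m = (alpha * mask)[..., None]                 # soft per-pixel opacity
    composite = (1.0 - m) * skin + m * lesion
    return composite, (mask > 0.5)                # image + dense annotation
```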
Collapse
Affiliation(s)
- Ashish Sinha
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
| | - Jeremy Kawahara
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
| | - Arezou Pakzad
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
| | - Kumar Abhishek
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
| | - Matthieu Ruthven
- Computer Vision, Imaging & Machine Intelligence Research Group, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg
| | - Enjie Ghorbel
- Computer Vision, Imaging & Machine Intelligence Research Group, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg; Cristal Laboratory, National School of Computer Sciences, University of Manouba, 2010, Tunisia
| | - Anis Kacem
- Computer Vision, Imaging & Machine Intelligence Research Group, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg
| | - Djamila Aouada
- Computer Vision, Imaging & Machine Intelligence Research Group, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg
| | - Ghassan Hamarneh
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada.
| |
Collapse
|
30
|
Li Y, El Habib Daho M, Conze PH, Zeghlache R, Le Boité H, Tadayoni R, Cochener B, Lamard M, Quellec G. A review of deep learning-based information fusion techniques for multimodal medical image classification. Comput Biol Med 2024; 177:108635. [PMID: 38796881 DOI: 10.1016/j.compbiomed.2024.108635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2023] [Revised: 03/18/2024] [Accepted: 05/18/2024] [Indexed: 05/29/2024]
Abstract
Multimodal medical imaging plays a pivotal role in clinical diagnosis and research, as it combines information from various imaging modalities to provide a more comprehensive understanding of the underlying pathology. Recently, deep learning-based multimodal fusion techniques have emerged as powerful tools for improving medical image classification. This review offers a thorough analysis of the developments in deep learning-based multimodal fusion for medical classification tasks. We explore the complementary relationships among prevalent clinical modalities and outline three main fusion schemes for multimodal classification networks: input fusion, intermediate fusion (encompassing single-level fusion, hierarchical fusion, and attention-based fusion), and output fusion. By evaluating the performance of these fusion techniques, we provide insight into the suitability of different network architectures for various multimodal fusion scenarios and application domains. Furthermore, we delve into challenges related to network architecture selection, the handling of incomplete multimodal data, and the potential limitations of multimodal fusion. Finally, we spotlight the promising future of Transformer-based multimodal fusion techniques and give recommendations for future research in this rapidly evolving field.
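Of the three fusion schemes outlined above, intermediate fusion is the easiest to sketch: two modality-specific encoders whose embeddings are concatenated before a shared classifier. The backbone (ResNet-18) and embedding sizes below are illustrative choices, not recommendations from the review.

```python
# Sketch of single-level intermediate fusion: per-modality encoders,
# concatenated embeddings, shared classifier. Sizes are illustrative.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class IntermediateFusion(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.enc_a = resnet18(num_classes=128)   # modality A encoder
        self.enc_b = resnet18(num_classes=128)   # modality B encoder
        self.head = nn.Sequential(nn.ReLU(), nn.Linear(256, n_classes))

    def forward(self, xa, xb):
        z = torch.cat([self.enc_a(xa), self.enc_b(xb)], dim=1)
        return self.head(z)
```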
Collapse
Affiliation(s)
- Yihao Li
- LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France
| | - Mostafa El Habib Daho
- LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France.
| | | | - Rachid Zeghlache
- LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France
| | - Hugo Le Boité
- Sorbonne University, Paris, France; Ophthalmology Department, Lariboisière Hospital, AP-HP, Paris, France
| | - Ramin Tadayoni
- Ophthalmology Department, Lariboisière Hospital, AP-HP, Paris, France; Paris Cité University, Paris, France
| | - Béatrice Cochener
- LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France; Ophthalmology Department, CHRU Brest, Brest, France
| | - Mathieu Lamard
- LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France
| | | |
Collapse
|
31
|
Fu W, Chen J, Zhou L. Boosting few-shot rare skin disease classification via self-supervision and distribution calibration. Biomed Eng Lett 2024; 14:877-889. [PMID: 38946819 PMCID: PMC11208389 DOI: 10.1007/s13534-024-00383-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2024] [Revised: 04/22/2024] [Accepted: 04/25/2024] [Indexed: 07/02/2024] Open
Abstract
Due to the difficulty in obtaining clinical samples and the high cost of labeling, rare skin diseases are characterized by data scarcity, making training deep neural networks for classification challenging. In recent years, few-shot learning has emerged as a promising solution, enabling models to recognize unseen disease classes from limited labeled samples. However, most existing methods ignored the fine-grained nature of rare skin diseases, resulting in poor performance when generalizing to highly similar classes. Moreover, the distributions learned from limited labeled data are biased, severely impairing the model's generalizability. This paper proposes a self-supervision distribution calibration network (SS-DCN) to address the above issues. Specifically, SS-DCN adopts a multi-task learning framework during pre-training. By introducing self-supervised tasks to aid in supervised learning, the model can learn more discriminative and transferable visual representations. Furthermore, SS-DCN applies an enhanced distribution calibration (EDC) strategy, which utilizes the statistics of base classes with sufficient samples to calibrate the biased distributions of novel classes with few-shot samples. By generating more samples from the calibrated distribution, EDC can provide sufficient supervision for subsequent classifier training. The proposed method is evaluated on three public skin disease datasets (i.e., ISIC2018, Derm7pt, and SD198), achieving significant performance improvements over state-of-the-art methods.
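The calibration step admits a compact sketch. Following the widely used distribution-calibration recipe (Tukey's power transform, then borrowing mean/covariance statistics from the nearest base classes before sampling synthetic features), the version below is a simplified reading; SS-DCN's exact EDC formulation and its self-supervised pre-training are not reproduced, and all hyperparameters are illustrative.

```python
# Sketch of distribution calibration for one few-shot novel class.
# Assumes non-negative feature vectors; k, lam, alpha are illustrative.
import numpy as np

def calibrate_and_sample(support, base_means, base_covs, k=2, lam=0.5,
                         alpha=0.2, n_samples=100, rng=np.random):
    """support: (n_shot, D); base_means: (C, D); base_covs: (C, D, D)."""
    s = np.power(support, lam)                      # Tukey power transform
    d = np.linalg.norm(base_means - s.mean(0), axis=1)
    near = np.argsort(d)[:k]                        # k closest base classes
    mu = (base_means[near].sum(0) + s.mean(0)) / (k + 1)
    cov = base_covs[near].mean(0) + alpha           # spread correction
    return rng.multivariate_normal(mu, cov, n_samples)
```

The synthetic features can then be pooled with the transformed support set to train a simple classifier such as logistic regression.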
Collapse
Affiliation(s)
- Wen Fu
- Institute of Microelectronics of Chinese Academy of Sciences, Beijing, 100029 China
- University of Chinese Academy of Sciences, Beijing, 100049 China
| | - Jie Chen
- Institute of Microelectronics of Chinese Academy of Sciences, Beijing, 100029 China
| | - Li Zhou
- Institute of Microelectronics of Chinese Academy of Sciences, Beijing, 100029 China
| |
Collapse
|
32
|
Guan H, Yap PT, Bozoki A, Liu M. Federated learning for medical image analysis: A survey. PATTERN RECOGNITION 2024; 151:110424. [PMID: 38559674 PMCID: PMC10976951 DOI: 10.1016/j.patcog.2024.110424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/04/2024]
Abstract
Machine learning in medical imaging often faces a fundamental dilemma, namely, the small sample size problem. Many recent studies suggest using multi-domain data pooled from different acquisition sites/centers to improve statistical power. However, medical images from different sites cannot be easily shared to build large datasets for model training due to privacy protection reasons. As a promising solution, federated learning, which enables collaborative training of machine learning models based on data from different sites without cross-site data sharing, has attracted considerable attention recently. In this paper, we conduct a comprehensive survey of the recent development of federated learning methods in medical image analysis. We have systematically gathered research papers on federated learning and its applications in medical image analysis published between 2017 and 2023. Our search and compilation were conducted using databases from IEEE Xplore, ACM Digital Library, Science Direct, Springer Link, Web of Science, Google Scholar, and PubMed. In this survey, we first introduce the background of federated learning for dealing with privacy protection and collaborative learning issues. We then present a comprehensive review of recent advances in federated learning methods for medical image analysis. Specifically, existing methods are categorized based on three critical aspects of a federated learning system, including client end, server end, and communication techniques. In each category, we summarize the existing federated learning methods according to specific research problems in medical image analysis and also provide insights into the motivations of different approaches. In addition, we provide a review of existing benchmark medical imaging datasets and software platforms for current federated learning research. We also conduct an experimental study to empirically evaluate typical federated learning methods for medical image analysis. This survey can help to better understand the current research status, challenges, and potential research opportunities in this promising research field.
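At the server end, the canonical aggregation rule covered in such surveys is FedAvg: a data-size-weighted average of client model weights. A minimal sketch follows (client-side local training is elided).

```python
# Minimal FedAvg aggregation sketch (server end): average client weights
# proportionally to local dataset sizes.
import copy

def fedavg(client_states, client_sizes):
    """client_states: list of state_dicts; client_sizes: list of ints."""
    total = float(sum(client_sizes))
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(
            sd[key].float() * (n / total)
            for sd, n in zip(client_states, client_sizes)
        )
    return avg  # load into the global model via model.load_state_dict(avg)
```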
Collapse
Affiliation(s)
- Hao Guan
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Pew-Thian Yap
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Andrea Bozoki
- Department of Neurology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Mingxia Liu
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| |
Collapse
|
33
|
Rios-Duarte JA, Diaz-Valencia AC, Combariza G, Feles M, Peña-Silva RA. Comprehensive analysis of clinical images contributions for melanoma classification using convolutional neural networks. Skin Res Technol 2024; 30:e13607. [PMID: 38742379 DOI: 10.1111/srt.13607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2023] [Accepted: 01/19/2024] [Indexed: 05/16/2024]
Abstract
BACKGROUND Timely diagnosis plays a critical role in determining melanoma prognosis, prompting the development of deep learning models to aid clinicians. Questions persist regarding the efficacy of clinical images alone or in conjunction with dermoscopy images for model training. This study aims to compare the classification performance for melanoma of three types of CNN models: those trained on clinical images, dermoscopy images, and a combination of paired clinical and dermoscopy images from the same lesion. MATERIALS AND METHODS We divided 914 image pairs into training, validation, and test sets. Models were built using pre-trained Inception-ResNetV2 convolutional layers for feature extraction, followed by binary classification. Training comprised 20 models per CNN type using sets of random hyperparameters. Best models were chosen based on validation AUC-ROC. RESULTS Significant AUC-ROC differences were found between clinical versus dermoscopy models (0.661 vs. 0.869, p < 0.001) and clinical versus clinical + dermoscopy models (0.661 vs. 0.822, p = 0.001). Significant sensitivity differences were found between clinical and dermoscopy models (0.513 vs. 0.799, p = 0.01), dermoscopy versus clinical + dermoscopy models (0.799 vs. 1.000, p = 0.02), and clinical versus clinical + dermoscopy models (0.513 vs. 1.000, p < 0.001). Significant specificity differences were found between dermoscopy versus clinical + dermoscopy models (0.800 vs. 0.288, p < 0.001) and clinical versus clinical + dermoscopy models (0.650 vs. 0.288, p < 0.001). CONCLUSION CNN models trained on dermoscopy images outperformed those relying solely on clinical images under our study conditions. The potential advantages of incorporating paired clinical and dermoscopy images for CNN-based melanoma classification appear less clear based on our findings.
Collapse
Affiliation(s)
| | | | - Germán Combariza
- Department of Mathematics, Universidad Externado de Colombia, Bogotá, Colombia
| | - Miguel Feles
- Department of Mathematics, Universidad Externado de Colombia, Bogotá, Colombia
| | - Ricardo A Peña-Silva
- School of Medicine, Universidad de los Andes, Bogotá, Colombia
- Lown Scholars Program, T.H. Chan School of Public Health, Harvard University, Boston, Massachusetts, USA
| |
Collapse
|
34
|
Kim C, Gadgil SU, DeGrave AJ, Omiye JA, Cai ZR, Daneshjou R, Lee SI. Transparent medical image AI via an image-text foundation model grounded in medical literature. Nat Med 2024; 30:1154-1165. [PMID: 38627560 DOI: 10.1038/s41591-024-02887-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2023] [Accepted: 02/27/2024] [Indexed: 04/21/2024]
Abstract
Building trustworthy and transparent image-based medical artificial intelligence (AI) systems requires the ability to interrogate data and models at all stages of the development pipeline, from training models to post-deployment monitoring. Ideally, the data and associated AI systems could be described using terms already familiar to physicians, but this requires medical datasets densely annotated with semantically meaningful concepts. In this study, we introduce a foundation model approach, named MONET (medical concept retriever), which learns how to connect medical images with text and densely scores images on concept presence to enable important tasks in medical AI development and deployment such as data auditing, model auditing and model interpretation. Dermatology provides a demanding use case for the versatility of MONET, due to the heterogeneity in diseases, skin tones and imaging modalities. We trained MONET based on 105,550 dermatological images paired with natural language descriptions from a large collection of medical literature. MONET can accurately annotate concepts across dermatology images as verified by board-certified dermatologists, competitively with supervised models built on previously concept-annotated dermatology datasets of clinical images. We demonstrate how MONET enables AI transparency across the entire AI system development pipeline, from building inherently interpretable models to dataset and model auditing, including a case study dissecting the results of an AI clinical trial.
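Concept scoring in a MONET-like system reduces to image-text similarity. The sketch below uses the open_clip package as a generic CLIP-style stand-in; MONET itself is trained on dermatology image-text pairs, which this sketch does not reproduce, and the prompt template is an assumption.

```python
# Sketch of concept scoring with a generic CLIP-type model: rank images by
# cosine similarity to a text prompt describing the concept.
import torch
import open_clip  # pip install open_clip_torch

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

@torch.no_grad()
def concept_scores(images, concept="blue-white veil"):
    """images: tensor (N,3,224,224) already run through `preprocess`."""
    img = model.encode_image(images)
    txt = model.encode_text(tokenizer([f"a dermoscopy image with {concept}"]))
    img = img / img.norm(dim=-1, keepdim=True)
    txt = txt / txt.norm(dim=-1, keepdim=True)
    return (img @ txt.T).squeeze(-1)    # higher = concept more present
```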
Collapse
Affiliation(s)
- Chanwoo Kim
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
| | - Soham U Gadgil
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
| | - Alex J DeGrave
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- Medical Scientist Training Program, University of Washington, Seattle, WA, USA
| | - Jesutofunmi A Omiye
- Department of Dermatology, Stanford School of Medicine, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA
| | - Zhuo Ran Cai
- Program for Clinical Research and Technology, Stanford University, Stanford, CA, USA
| | - Roxana Daneshjou
- Department of Dermatology, Stanford School of Medicine, Stanford, CA, USA.
- Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA.
| | - Su-In Lee
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA.
| |
Collapse
|
35
|
Malik FS, Yousaf MH, Sial HA, Viriri S. Exploring dermoscopic structures for melanoma lesions' classification. Front Big Data 2024; 7:1366312. [PMID: 38590699 PMCID: PMC10999676 DOI: 10.3389/fdata.2024.1366312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2024] [Accepted: 02/26/2024] [Indexed: 04/10/2024] Open
Abstract
Background Melanoma is one of the deadliest skin cancers; it originates from melanocytes in which sun exposure causes mutations. Early detection boosts the cure rate to 90%, but misclassification drops survival to 15-20%. Clinical variations challenge dermatologists in distinguishing benign nevi from melanomas. Current diagnostic methods, including visual analysis and dermoscopy, have limitations, emphasizing the need for artificial intelligence support in dermatology. Objectives In this paper, we aim to explore dermoscopic structures for the classification of melanoma lesions. The training of AI models faces a challenge known as brittleness, where small changes in input images impact the classification. A study explored AI vulnerability in discerning melanoma from benign lesions using features of size, color, and shape. Tests with artificial and natural variations revealed a notable decline in accuracy, emphasizing the necessity for additional information, such as dermoscopic structures. Methodology The study utilizes datasets with clinically marked dermoscopic images examined by expert clinicians. Transformer- and CNN-based models are employed to classify these images based on dermoscopic structures. Classification results are validated using feature visualization. To assess model susceptibility to image variations, classifiers are evaluated on test sets with original, duplicated, and digitally modified images. Additionally, testing is done on ISIC 2016 images. The study focuses on three dermoscopic structures crucial for melanoma detection: blue-white veil, dots/globules, and streaks. Results In evaluating model performance, adding convolutions to Vision Transformers proves highly effective, achieving up to 98% accuracy. CNN architectures like VGG-16 and DenseNet-121 reach 50-60% accuracy, performing best with features other than dermoscopic structures. Vision Transformers without convolutions exhibit reduced accuracy on diverse test sets, revealing their brittleness. OpenAI CLIP, a pre-trained model, consistently performs well across various test sets. To address brittleness, a mitigation method involving extensive data augmentation during training and 23 transformed duplicates at test time sustains accuracy. Conclusions This paper proposes a melanoma classification scheme utilizing three dermoscopic structures across the PH2 and Derm7pt datasets. The study addresses AI susceptibility to image variations. Although the dataset is small, future work includes collecting more annotated datasets and automatically computing dermoscopic structural features.
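The brittleness mitigation described in the Results admits a simple sketch: average the model's predictions over transformed duplicates at test time. The flips and rotations below are stand-ins for the paper's 23 transforms.

```python
# Test-time augmentation sketch: average softmax over transformed duplicates.
# Assumes square inputs so that 90-degree rotations preserve shape.
import torch

@torch.no_grad()
def tta_predict(model, x):
    """x: (1, 3, H, W) with H == W. Returns consensus class probabilities."""
    views = [x, torch.flip(x, dims=[3]), torch.flip(x, dims=[2])]
    views += [torch.rot90(x, k, dims=(2, 3)) for k in (1, 2, 3)]
    probs = torch.stack([model(v).softmax(dim=1) for v in views])
    return probs.mean(dim=0)
```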
Collapse
Affiliation(s)
- Fiza Saeed Malik
- Department of Computer Engineering, University of Engineering and Technology, Taxila, Pakistan
| | - Muhammad Haroon Yousaf
- Department of Computer Engineering, University of Engineering and Technology, Taxila, Pakistan
- School of Computing, College of Science, Engineering and Technology, University of South Africa (UNISA), Pretoria, South Africa
| | | | - Serestina Viriri
- School of Computing, College of Science, Engineering and Technology, University of South Africa (UNISA), Pretoria, South Africa
- School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban, South Africa
| |
Collapse
|
36
|
Abhishek K, Brown CJ, Hamarneh G. Multi-sample ζ-mixup: richer, more realistic synthetic samples from a p-series interpolant. JOURNAL OF BIG DATA 2024; 11:43. [PMID: 38528850 PMCID: PMC10960781 DOI: 10.1186/s40537-024-00898-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/20/2023] [Accepted: 02/28/2024] [Indexed: 03/27/2024]
Abstract
Modern deep learning training procedures rely on model regularization techniques such as data augmentation methods, which generate training samples that increase the diversity of data and richness of label information. A popular recent method, mixup, uses convex combinations of pairs of original samples to generate new samples. However, as we show in our experiments, mixup can produce undesirable synthetic samples, where the data is sampled off the manifold and can contain incorrect labels. We propose ζ-mixup, a generalization of mixup with provably and demonstrably desirable properties that allows convex combinations of T ≥ 2 samples, leading to more realistic and diverse outputs that incorporate information from T original samples by using a p-series interpolant. We show that, compared to mixup, ζ-mixup better preserves the intrinsic dimensionality of the original datasets, which is a desirable property for training generalizable models. Furthermore, we show that our implementation of ζ-mixup is faster than mixup, and extensive evaluation on controlled synthetic and 26 diverse real-world natural and medical image classification datasets shows that ζ-mixup outperforms mixup, CutMix, and traditional data augmentation techniques. The code will be released at https://github.com/kakumarabhishek/zeta-mixup.
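A simplified reading of ζ-mixup in code: draw mixing weights from a normalized p-series i^(-γ), permute them per synthetic sample so that one source sample still dominates, and mix both inputs and one-hot labels. The normalization over T terms and the γ value are illustrative; see the paper and repository for the exact formulation.

```python
# Simplified ζ-mixup sketch: p-series weights over T samples, permuted per
# synthetic output. gamma and the batch-as-T setup are illustrative.
import torch

def zeta_mixup(x, y_onehot, gamma=2.4):
    """x: (T, ...) input batch; y_onehot: (T, C) float labels."""
    T = x.shape[0]
    w = torch.arange(1, T + 1, dtype=torch.float) ** (-gamma)   # p-series
    w = w / w.sum()                                             # sums to 1
    W = torch.zeros(T, T)
    for i in range(T):
        W[i, torch.randperm(T)] = w     # row i: permuted weight vector
    x_mix = (W @ x.reshape(T, -1)).reshape_as(x)
    y_mix = W @ y_onehot
    return x_mix, y_mix
```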
Collapse
Affiliation(s)
- Kumar Abhishek
- School of Computing Science, Simon Fraser University, 8888 University Drive, Burnaby, V5A 1S6 Canada
| | - Colin J Brown
- Engineering, Hinge Health, 455 Market Street, Suite 700, San Francisco, 94105 USA
| | - Ghassan Hamarneh
- School of Computing Science, Simon Fraser University, 8888 University Drive, Burnaby, V5A 1S6 Canada
| |
Collapse
|
37
|
Tao T, Chen Y, Shang Y, He J, Hao J. SMMF: a self-attention-based multi-parametric MRI feature fusion framework for the diagnosis of bladder cancer grading. Front Oncol 2024; 14:1337186. [PMID: 38515574 PMCID: PMC10955083 DOI: 10.3389/fonc.2024.1337186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2023] [Accepted: 02/21/2024] [Indexed: 03/23/2024] Open
Abstract
Background Multi-parametric magnetic resonance imaging (MP-MRI) may provide comprehensive information for the graded diagnosis of bladder cancer (BCa). Nevertheless, existing methods ignore the complex correlations between these MRI sequences and fail to provide adequate information. Therefore, the main objective of this study is to enhance feature fusion and extract comprehensive features from MP-MRI using deep learning methods to achieve an accurate diagnosis of BCa grading. Methods In this study, a self-attention-based MP-MRI feature fusion framework (SMMF) is proposed to enhance the performance of the model by extracting and fusing features of both T2-weighted imaging (T2WI) and dynamic contrast-enhanced imaging (DCE) sequences. A new multiscale attention (MA) module is designed and embedded at the end of the convolutional neural network (CNN) to further extract rich features from T2WI and DCE. Finally, a self-attention feature fusion strategy (SAFF) is used to effectively capture and fuse the common and complementary features of patients' MP-MRIs. Results In a clinically collected sample of 138 BCa patients, the SMMF network demonstrated superior performance compared to existing deep learning-based bladder cancer grading models, with accuracy, F1-score, and AUC values of 0.9488, 0.9426, and 0.9459, respectively. Conclusion Our proposed SMMF framework combined with MP-MRI information can accurately predict the pathological grading of BCa and can better assist physicians in diagnosing BCa.
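The fusion step can be sketched with a standard multi-head attention layer over the concatenated token sets of the two sequences; this is a generic stand-in for SAFF, with illustrative dimensions.

```python
# Sketch of self-attention-based fusion of two MRI sequence features
# (e.g., T2WI and DCE tokens). A generic stand-in for SAFF.
import torch
import torch.nn as nn

class AttnFusion(nn.Module):
    def __init__(self, dim=256, heads=4, n_classes=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, feat_t2, feat_dce):        # each: (B, tokens, dim)
        tokens = torch.cat([feat_t2, feat_dce], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)  # cross-sequence mixing
        return self.head(fused.mean(dim=1))      # pool fused tokens
```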
Collapse
Affiliation(s)
- Tingting Tao
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
| | - Ying Chen
- Department of Radiology, Second Affiliated Hospital of Kunming Medical University, Kunming, China
| | - Yunyun Shang
- Department of Radiology, Second Affiliated Hospital of Kunming Medical University, Kunming, China
| | - Jianfeng He
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
- School of Physics and Electronic Engineering, Yuxi Normal University, Yuxi, China
| | - Jingang Hao
- Department of Radiology, Second Affiliated Hospital of Kunming Medical University, Kunming, China
| |
Collapse
|
38
|
Chanda T, Hauser K, Hobelsberger S, Bucher TC, Garcia CN, Wies C, Kittler H, Tschandl P, Navarrete-Dechent C, Podlipnik S, Chousakos E, Crnaric I, Majstorovic J, Alhajwan L, Foreman T, Peternel S, Sarap S, Özdemir İ, Barnhill RL, Llamas-Velasco M, Poch G, Korsing S, Sondermann W, Gellrich FF, Heppt MV, Erdmann M, Haferkamp S, Drexler K, Goebeler M, Schilling B, Utikal JS, Ghoreschi K, Fröhling S, Krieghoff-Henning E, Brinker TJ. Dermatologist-like explainable AI enhances trust and confidence in diagnosing melanoma. Nat Commun 2024; 15:524. [PMID: 38225244 PMCID: PMC10789736 DOI: 10.1038/s41467-023-43095-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2023] [Accepted: 10/31/2023] [Indexed: 01/17/2024] Open
Abstract
Artificial intelligence (AI) systems have been shown to help dermatologists diagnose melanoma more accurately; however, they lack transparency, hindering user acceptance. Explainable AI (XAI) methods can help to increase transparency, yet often lack precise, domain-specific explanations. Moreover, the impact of XAI methods on dermatologists' decisions has not yet been evaluated. Building upon previous research, we introduce an XAI system that provides precise and domain-specific explanations alongside its differential diagnoses of melanomas and nevi. Through a three-phase study, we assess its impact on dermatologists' diagnostic accuracy, diagnostic confidence, and trust in the XAI-support. Our results show strong alignment between XAI and dermatologist explanations. We also show that dermatologists' confidence in their diagnoses, and their trust in the support system significantly increase with XAI compared to conventional AI. This study highlights dermatologists' willingness to adopt such XAI systems, promoting future use in the clinic.
Collapse
Affiliation(s)
- Tirtha Chanda
- Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Katja Hauser
- Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Sarah Hobelsberger
- Department of Dermatology, University Hospital, Technical University Dresden, Dresden, Germany
| | - Tabea-Clara Bucher
- Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Carina Nogueira Garcia
- Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Christoph Wies
- Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Medical Faculty of University Heidelberg, Heidelberg, Germany
| | - Harald Kittler
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
| | - Philipp Tschandl
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
| | - Cristian Navarrete-Dechent
- Department of Dermatology, Escuela de Medicina, Pontificia Universidad Católica de Chile, Santiago, Chile
| | - Sebastian Podlipnik
- Dermatology Department, Hospital Clínic of Barcelona, University of Barcelona, IDIBAPS, Barcelona, Spain
| | - Emmanouil Chousakos
- 1st Department of Pathology, Medical School, National & Kapodistrian University of Athens, Athens, Greece
| | - Iva Crnaric
- Department of Dermatovenereology, Sestre milosrdnice University Hospital Center, Zagreb, Croatia
| | | | - Linda Alhajwan
- Department of Dermatology, Dubai London Clinic, Dubai, United Arab Emirates
| | | | - Sandra Peternel
- Department of Dermatovenereology, Clinical Hospital Center Rijeka, Faculty of Medicine, University of Rijeka, Rijeka, Croatia
| | | | - İrem Özdemir
- Department of Dermatology, Faculty of Medicine, Gazi University, Ankara, Turkey
| | - Raymond L Barnhill
- Department of Translational Research, Institut Curie, Unit of Formation and Research of Medicine University of Paris, Paris, France
| | | | - Gabriela Poch
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department of Dermatology, Venereology and Allergology, Berlin, Germany
| | - Sören Korsing
- Department of Dermatology, University Hospital Essen, University Duisburg-Essen, Essen, Germany
| | - Wiebke Sondermann
- Department of Dermatology, Uniklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | | | - Markus V Heppt
- Department of Dermatology, Uniklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Michael Erdmann
- Department of Dermatology, Uniklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Sebastian Haferkamp
- Department of Dermatology, University Hospital Regensburg, Regensburg, Germany
| | - Konstantin Drexler
- Department of Dermatology, University Hospital Regensburg, Regensburg, Germany
| | - Matthias Goebeler
- Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
| | - Bastian Schilling
- Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
| | - Jochen S Utikal
- Department of Dermatology, Venereology and Allergology, University Medical Center Mannheim, Ruprecht-Karl University of Heidelberg, Mannheim, Germany
| | - Kamran Ghoreschi
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department of Dermatology, Venereology and Allergology, Berlin, Germany
| | - Stefan Fröhling
- Division of Translational Medical Oncology, National Center for Tumor Diseases (NCT) Heidelberg and German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Eva Krieghoff-Henning
- Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Titus J Brinker
- Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany.
| |
Collapse
|
39
|
Zhang D, Li A, Wu W, Yu L, Kang X, Huo X. CR-Conformer: a fusion network for clinical skin lesion classification. Med Biol Eng Comput 2024; 62:85-94. [PMID: 37653185 DOI: 10.1007/s11517-023-02904-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2023] [Accepted: 08/03/2023] [Indexed: 09/02/2023]
Abstract
Deep convolutional neural network (DCNN) models have been widely used to diagnose skin lesions, and some of them have achieved diagnostic results comparable to or even better than dermatologists. Most publicly available skin lesion datasets used to train DCNNs consist of dermoscopic images. Expensive dermoscopic equipment is rarely available in rural clinics or small hospitals in remote areas. Therefore, it is of great significance to rely on clinical images for the computer-aided diagnosis of skin lesions. This paper proposes an improved dual-branch fusion network called CR-Conformer. It integrates a DCNN branch that can effectively extract local features and a Transformer branch that can extract global features to capture more valuable features in clinical skin lesion images. In addition, we improved the DCNN branch to extract enhanced features in four directions through a convolutional rotation operation, further improving the classification performance on clinical skin lesion images. To verify the effectiveness of our proposed method, we conducted comprehensive tests on a private dataset named XJUSL, which contains ten types of clinical skin lesions. The test results indicate that our proposed method reduced the number of parameters by 11.17 M and improved the accuracy of clinical skin lesion image classification by 1.08%. It has the potential to enable automatic diagnosis of skin lesions on mobile devices.
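One plausible reading of the convolutional rotation operation, sketched below: share a single convolution across four 90-degree rotations of the feature map, rotate the responses back, and average. This is an interpretation for illustration, not the paper's exact operator.

```python
# Sketch of a "convolutional rotation" block: one shared convolution applied
# to four rotations of the input, with responses rotated back and averaged.
import torch
import torch.nn as nn

class RotConv(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, kernel_size=3, padding=1)

    def forward(self, x):                        # x: (B, C, H, W)
        outs = [
            torch.rot90(self.conv(torch.rot90(x, k, dims=(2, 3))),
                        -k, dims=(2, 3))         # rotate, convolve, undo
            for k in range(4)                    # 0/90/180/270 degrees
        ]
        return torch.stack(outs).mean(dim=0)     # direction-averaged features
```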
Collapse
Affiliation(s)
- Dezhi Zhang
- Department of Dermatology and Venereology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, 830000, China
- Xinjiang Clinical Research Center for Dermatologic Diseases, Urumqi, China
- Xinjiang Key Laboratory of Dermatology Research (XJYS1707), Urumqi, China
| | - Aolun Li
- School of Information Science and Engineering, Xinjiang University, Urumqi, China
| | - Weidong Wu
- Department of Dermatology and Venereology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, 830000, China.
- Xinjiang Clinical Research Center for Dermatologic Diseases, Urumqi, China.
- Xinjiang Key Laboratory of Dermatology Research (XJYS1707), Urumqi, China.
| | - Long Yu
- School of Information Science and Engineering, Xinjiang University, Urumqi, China
| | - Xiaojing Kang
- Department of Dermatology and Venereology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, 830000, China
- Xinjiang Clinical Research Center for Dermatologic Diseases, Urumqi, China
- Xinjiang Key Laboratory of Dermatology Research (XJYS1707), Urumqi, China
| | - Xiangzuo Huo
- School of Information Science and Engineering, Xinjiang University, Urumqi, China
| |
Collapse
|
40
|
Ouyang Y, Wu Y, Wang H, Zhang C, Cheng F, Jiang C, Jin L, Cao Y, Li Q. Leveraging Historical Medical Records as a Proxy via Multimodal Modeling and Visualization to Enrich Medical Diagnostic Learning. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2024; 30:1238-1248. [PMID: 37874707 DOI: 10.1109/tvcg.2023.3326929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/26/2023]
Abstract
Simulation-based Medical Education (SBME) has been developed as a cost-effective means of enhancing the diagnostic skills of novice physicians and interns, thereby mitigating the need for resource-intensive mentor-apprentice training. However, feedback provided in most SBME is often directed towards improving the operational proficiency of learners, rather than providing summative medical diagnoses that result from experience and time. Additionally, the multimodal nature of medical data during diagnosis poses significant challenges for interns and novice physicians, including the tendency to overlook or over-rely on data from certain modalities, and difficulties in comprehending potential associations between modalities. To address these challenges, we present DiagnosisAssistant, a visual analytics system that leverages historical medical records as a proxy for multimodal modeling and visualization to enhance the learning experience of interns and novice physicians. The system employs elaborately designed visualizations to explore different modality data, offer diagnostic interpretive hints based on the constructed model, and enable comparative analyses of specific patients. Our approach is validated through two case studies and expert interviews, demonstrating its effectiveness in enhancing medical training.
Collapse
|
41
|
Akram T, Khan MA, Sharif M, Yasmin M. Skin lesion segmentation and recognition using multichannel saliency estimation and M-SVM on selected serially fused features. JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING 2024; 15:1083-1102. [DOI: 10.1007/s12652-018-1051-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/10/2018] [Accepted: 09/15/2018] [Indexed: 08/25/2024]
|
42
|
Lang Z, Zhang F, Wu B, Shao P, Shen S, Yao P, Liu P, Xu X. Research on an intelligent remote consultation system for skin tumors [article in Chinese]. CHINESE JOURNAL OF LASERS 2024; 51:0907021. [DOI: 10.3788/cjl231326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/03/2025]
|
43
|
Ali MU, Khalid M, Alshanbari H, Zafar A, Lee SW. Enhancing Skin Lesion Detection: A Multistage Multiclass Convolutional Neural Network-Based Framework. Bioengineering (Basel) 2023; 10:1430. [PMID: 38136020 PMCID: PMC10741172 DOI: 10.3390/bioengineering10121430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2023] [Revised: 12/07/2023] [Accepted: 12/14/2023] [Indexed: 12/24/2023] Open
Abstract
The early identification and treatment of various dermatological conditions depend on the detection of skin lesions. Due to advancements in computer-aided diagnosis and machine learning approaches, learning-based skin lesion analysis methods have attracted much interest recently. Employing the concept of transfer learning, this research proposes a deep convolutional neural network (CNN)-based multistage and multiclass framework to categorize seven types of skin lesions. In the first stage, a CNN model was developed to classify skin lesion images into two classes, namely benign and malignant. In the second stage, the model was then used with the transfer learning concept to further categorize benign lesions into five subcategories (melanocytic nevus, actinic keratosis, benign keratosis, dermatofibroma, and vascular) and malignant lesions into two subcategories (melanoma and basal cell carcinoma). The frozen weights of the CNN, developed and trained on correlated images, benefited transfer learning on the same type of images for the subclassification of the benign and malignant classes. The proposed multistage and multiclass technique achieved a classification accuracy of up to 93.4% on the public ISIC2018 skin lesion dataset for benign and malignant class identification. Furthermore, a high accuracy of 96.2% was achieved for the subclassification of both classes. Sensitivity, specificity, precision, and F1-score metrics further validated the effectiveness of the proposed multistage and multiclass framework. Compared to existing CNN models described in the literature, the proposed approach took less time to train and had a higher classification rate.
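The multistage inference logic is straightforward to sketch: a stage-1 network gates between benign and malignant, and the corresponding stage-2 network assigns the subclass. The class lists come from the abstract; the models themselves are placeholders, as is the assumption that index 1 of the stage-1 output means malignant.

```python
# Sketch of two-stage inference: stage 1 gates benign vs. malignant,
# stage 2 picks the subclass. Models and output conventions are placeholders.
import torch

BENIGN = ["melanocytic nevus", "actinic keratosis", "benign keratosis",
          "dermatofibroma", "vascular"]
MALIGNANT = ["melanoma", "basal cell carcinoma"]

@torch.no_grad()
def two_stage_predict(stage1, stage2_benign, stage2_malignant, x):
    is_malignant = stage1(x).argmax(dim=1).item() == 1  # assumed convention
    head, names = ((stage2_malignant, MALIGNANT) if is_malignant
                   else (stage2_benign, BENIGN))
    return names[head(x).argmax(dim=1).item()]
```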
Collapse
Affiliation(s)
- Muhammad Umair Ali
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul 05006, Republic of Korea;
| | - Majdi Khalid
- Department of Computer Science and Artificial Intelligence, College of Computers, Umm Al-Qura University, Makkah 21955, Saudi Arabia; (M.K.); (H.A.)
| | - Hanan Alshanbari
- Department of Computer Science and Artificial Intelligence, College of Computers, Umm Al-Qura University, Makkah 21955, Saudi Arabia; (M.K.); (H.A.)
| | - Amad Zafar
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul 05006, Republic of Korea;
| | - Seung Won Lee
- Department of Precision Medicine, Sungkyunkwan University School of Medicine, Suwon 16419, Republic of Korea
| |
Collapse
|
44
|
Wang Z, Zhang L, Shu X, Wang Y, Feng Y. Consistent representation via contrastive learning for skin lesion diagnosis. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 242:107826. [PMID: 37837885 DOI: 10.1016/j.cmpb.2023.107826] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/28/2023] [Revised: 09/19/2023] [Accepted: 09/21/2023] [Indexed: 10/16/2023]
Abstract
BACKGROUND Skin lesions are a prevalent ailment, with melanoma emerging as a particularly perilous variant. Encouragingly, artificial intelligence displays promising potential in early detection, yet its integration within clinical contexts, particularly involving multi-modal data, presents challenges. While multi-modal approaches enhance diagnostic efficacy, the influence of modal bias is often disregarded. METHODS In this investigation, a multi-modal feature learning technique termed "Contrast-based Consistent Representation Disentanglement" for dermatological diagnosis is introduced. This approach employs adversarial domain adaptation to disentangle features from distinct modalities, fostering a shared representation. Furthermore, a contrastive learning strategy is devised to incentivize the model to preserve uniformity in common lesion attributes across modalities. By emphasizing the learning of a uniform representation across modalities, the approach avoids reliance on supplementary data. RESULTS Assessment of the proposed technique on a seven-point criteria evaluation dataset yields an average accuracy of 76.1% for multi-classification tasks, surpassing state-of-the-art methods. The approach tackles modal bias, enabling the acquisition of a consistent representation of common lesion appearances that transcends modality boundaries. This study underscores the latent potential of multi-modal feature learning in dermatological diagnosis. CONCLUSION In summary, a multi-modal feature learning strategy is posited for dermatological diagnosis. This approach outperforms other state-of-the-art methods, underscoring its capacity to enhance diagnostic precision for skin lesions.
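As a hedged illustration of the cross-modal consistency objective described above, the sketch below aligns paired dermoscopic and clinical embeddings with a symmetric contrastive loss; the loss form, temperature, and embedding sizes are assumptions, and the paper's adversarial disentanglement component is omitted entirely.

```python
# Generic cross-modal contrastive alignment (an assumed simplification, not the
# paper's full Contrast-based Consistent Representation Disentanglement method).
import torch
import torch.nn.functional as F

def cross_modal_contrastive(z_derm: torch.Tensor, z_clin: torch.Tensor,
                            temperature: float = 0.1) -> torch.Tensor:
    """z_derm, z_clin: (N, D) embeddings of the same N lesions in two modalities."""
    z_derm = F.normalize(z_derm, dim=1)
    z_clin = F.normalize(z_clin, dim=1)
    logits = z_derm @ z_clin.t() / temperature   # (N, N) similarity matrix
    targets = torch.arange(z_derm.size(0))       # the i-th pair is the positive
    # Symmetrize so each modality is aligned toward the other.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Usage with random features standing in for encoder outputs:
loss = cross_modal_contrastive(torch.randn(8, 128), torch.randn(8, 128))
```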
Collapse
Affiliation(s)
- Zizhou Wang
- College of Computer Science, Sichuan University, Chengdu 610065, China; Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore.
| | - Lei Zhang
- College of Computer Science, Sichuan University, Chengdu 610065, China.
| | - Xin Shu
- College of Computer Science, Sichuan University, Chengdu 610065, China.
| | - Yan Wang
- Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore.
| | - Yangqin Feng
- Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore.
| |
Collapse
|
45
|
Brutti F, La Rosa F, Lazzeri L, Benvenuti C, Bagnoni G, Massi D, Laurino M. Artificial Intelligence Algorithms for Benign vs. Malignant Dermoscopic Skin Lesion Image Classification. Bioengineering (Basel) 2023; 10:1322. [PMID: 38002446 PMCID: PMC10669580 DOI: 10.3390/bioengineering10111322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2023] [Revised: 11/13/2023] [Accepted: 11/14/2023] [Indexed: 11/26/2023] Open
Abstract
In recent decades, the incidence of melanoma has grown rapidly. Hence, early diagnosis is crucial to improving clinical outcomes. Here, we propose and compare a classical image analysis-based machine learning method with a deep learning one to automatically classify benign vs. malignant dermoscopic skin lesion images. The same dataset of 25,122 publicly available dermoscopic images was used to train both models, while a disjointed test set of 200 images was used for the evaluation phase. The training dataset was randomly divided into 10 datasets of 19,932 images to obtain an equal distribution between the two classes. By testing both models on the disjoint set, the deep learning-based method returned accuracy of 85.4 ± 3.2% and specificity of 75.5 ± 7.6%, while the machine learning one showed accuracy and specificity of 73.8 ± 1.1% and 44.5 ± 4.7%, respectively. Although both approaches performed well in the validation phase, the convolutional neural network outperformed the ensemble boosted tree classifier on the disjoint test set, showing better generalization ability. The integration of new melanoma detection algorithms with digital dermoscopic devices could enable a faster screening of the population, improve patient management, and achieve better survival rates.
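A hedged sketch of the evaluation protocol reported above: a gradient-boosted tree classifier stands in for the ensemble boosted-tree model, and accuracy and specificity are computed on a disjoint test set. The synthetic features and classifier settings are illustrative assumptions.

```python
# Evaluating a boosted-tree baseline on a disjoint test set with the two metrics
# reported above; data here is synthetic stand-in material.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(1000, 32)), rng.integers(0, 2, 1000)
X_test, y_test = rng.normal(size=(200, 32)), rng.integers(0, 2, 200)

clf = GradientBoostingClassifier().fit(X_train, y_train)  # boosted-tree stand-in
y_pred = clf.predict(X_test)

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print("accuracy:", accuracy_score(y_test, y_pred))
print("specificity:", tn / (tn + fp))  # true-negative rate, as reported above
```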
Collapse
Affiliation(s)
- Francesca Brutti
- Institute of Clinical Physiology, National Research Council, 56124 Pisa, Italy; (F.B.); (F.L.R.); (C.B.)
| | - Federica La Rosa
- Institute of Clinical Physiology, National Research Council, 56124 Pisa, Italy; (F.B.); (F.L.R.); (C.B.)
| | - Linda Lazzeri
- Unit of Dermatology, Specialist Surgery Area, Department of General Surgery, Livorno Hospital, Azienda Usl Toscana Nord Ovest, 57124 Livorno, Italy; (L.L.); (G.B.)
| | - Chiara Benvenuti
- Institute of Clinical Physiology, National Research Council, 56124 Pisa, Italy; (F.B.); (F.L.R.); (C.B.)
| | - Giovanni Bagnoni
- Unit of Dermatology, Specialist Surgery Area, Department of General Surgery, Livorno Hospital, Azienda Usl Toscana Nord Ovest, 57124 Livorno, Italy; (L.L.); (G.B.)
| | - Daniela Massi
- Department of Health Sciences, Section of Pathological Anatomy, University of Florence, 50139 Florence, Italy;
| | - Marco Laurino
- Institute of Clinical Physiology, National Research Council, 56124 Pisa, Italy; (F.B.); (F.L.R.); (C.B.)
| |
Collapse
|
46
|
Branciforti F, Meiburger KM, Zavattaro E, Veronese F, Tarantino V, Mazzoletti V, Cristo ND, Savoia P, Salvi M. Impact of artificial intelligence-based color constancy on dermoscopical assessment of skin lesions: A comparative study. Skin Res Technol 2023; 29:e13508. [PMID: 38009044 PMCID: PMC10603308 DOI: 10.1111/srt.13508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2023] [Accepted: 10/12/2023] [Indexed: 11/28/2023]
Abstract
BACKGROUND The quality of dermoscopic images is affected by lighting conditions, operator experience, and device calibration. Color constancy algorithms reduce this variability by making images appear as if they were acquired under the same conditions, allowing artificial intelligence (AI)-based methods to achieve better results. The impact of color constancy algorithms has not yet been evaluated from a clinical dermatologist's workflow point of view. Here we propose an in-depth investigation of the impact of an AI-based color constancy algorithm, called DermoCC-GAN, on the skin lesion diagnostic routine. METHODS Three dermatologists, with different experience levels, carried out two assignments. The clinical experts evaluated key parameters such as perceived image quality, lesion diagnosis, and diagnosis confidence. RESULTS When the DermoCC-GAN color constancy algorithm was applied, the dermoscopic images were perceived to be of better quality overall. An increase in classification performance was observed, reaching a maximum accuracy of 74.67% for a six-class classification task. Finally, the use of normalized images results in an increase in the level of self-confidence in the qualitative diagnostic routine. CONCLUSIONS From the conducted analysis, it is evident that the impact of AI-based color constancy algorithms, such as DermoCC-GAN, is positive and brings qualitative benefits to the clinical practitioner.
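DermoCC-GAN itself is a learned color-constancy model; as a hedged stand-in, the sketch below applies the classical gray-world correction that learned methods are commonly compared against, so the reader can see what normalizing a dermoscopic image means in the simplest case. The function and its input are illustrative assumptions.

```python
# Classical gray-world color constancy: a simple stand-in for the learned
# DermoCC-GAN normalization discussed above (not the paper's algorithm).
import numpy as np

def gray_world(image: np.ndarray) -> np.ndarray:
    """image: (H, W, 3) float array in [0, 1]; returns a color-corrected copy."""
    means = image.reshape(-1, 3).mean(axis=0)      # per-channel mean intensity
    gain = means.mean() / np.maximum(means, 1e-8)  # scale channels to common gray
    return np.clip(image * gain, 0.0, 1.0)

corrected = gray_world(np.random.rand(256, 256, 3))  # stand-in dermoscopic image
```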
Collapse
Affiliation(s)
- Francesco Branciforti
- Biolab, PolitoBIOMed Lab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
| | - Kristen M. Meiburger
- Biolab, PolitoBIOMed Lab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
| | - Elisa Zavattaro
- Department of Health Science, University of Eastern Piedmont, Novara, Italy
| | - Nunzia Di Cristo
- Department of Health Science, University of Eastern Piedmont, Novara, Italy
| | - Paola Savoia
- Department of Health Science, University of Eastern Piedmont, Novara, Italy
| | - Massimo Salvi
- Biolab, PolitoBIOMed Lab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
| |
Collapse
|
47
|
Luo N, Zhong X, Su L, Cheng Z, Ma W, Hao P. Artificial intelligence-assisted dermatology diagnosis: From unimodal to multimodal. Comput Biol Med 2023; 165:107413. [PMID: 37703714 DOI: 10.1016/j.compbiomed.2023.107413] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2023] [Revised: 08/02/2023] [Accepted: 08/28/2023] [Indexed: 09/15/2023]
Abstract
Artificial intelligence (AI) is progressively permeating medicine, notably in the realm of assisted diagnosis. However, traditional unimodal AI models, which rely on large volumes of accurately labeled data and a single data type, prove insufficient for assisting dermatological diagnosis. Augmenting these models with text data from patient narratives and laboratory reports, and with image data from skin lesions, dermoscopy, and pathology, could significantly enhance their diagnostic capacity. Large-scale pre-trained multimodal models offer a promising solution, exploiting the burgeoning reservoir of clinical data and amalgamating various data types. This paper delves into unimodal models' methodologies, applications, and shortcomings while exploring how multimodal models can enhance accuracy and reliability. Furthermore, integrating cutting-edge technologies such as federated learning and multi-party privacy computing with AI can substantially mitigate patient privacy concerns in dermatological datasets and foster a move towards high-precision self-diagnosis. Diagnostic systems underpinned by large-scale pre-trained multimodal models can help dermatologists formulate effective diagnostic and treatment strategies and herald a transformative era in healthcare.
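As a hedged sketch of the unimodal-to-multimodal step this review advocates, the snippet below fuses image and text embeddings with a small late-fusion head; the encoder dimensions, fusion strategy, and class count are illustrative assumptions rather than any specific system surveyed.

```python
# Minimal late-fusion multimodal classifier (assumed toy design, not a surveyed model).
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, img_dim: int = 512, txt_dim: int = 256, num_classes: int = 6):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(img_dim + txt_dim, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, img_feat: torch.Tensor, txt_feat: torch.Tensor) -> torch.Tensor:
        # Concatenate modality embeddings, then classify jointly.
        return self.head(torch.cat([img_feat, txt_feat], dim=1))

model = LateFusionClassifier()
logits = model(torch.randn(4, 512), torch.randn(4, 256))  # stand-in embeddings
```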
Collapse
Affiliation(s)
- Nan Luo
- Hospital of Chengdu University of Traditional Chinese Medicine, No. 39 Shi-er-qiao Road, Chengdu, 610075, Sichuan, China.
| | - Xiaojing Zhong
- Hospital of Chengdu University of Traditional Chinese Medicine, No. 39 Shi-er-qiao Road, Chengdu, 610075, Sichuan, China.
| | - Luxin Su
- Hospital of Chengdu University of Traditional Chinese Medicine, No. 39 Shi-er-qiao Road, Chengdu, 610075, Sichuan, China.
| | - Zilin Cheng
- Hospital of Chengdu University of Traditional Chinese Medicine, No. 39 Shi-er-qiao Road, Chengdu, 610075, Sichuan, China.
| | - Wenyi Ma
- Hospital of Chengdu University of Traditional Chinese Medicine, No. 39 Shi-er-qiao Road, Chengdu, 610075, Sichuan, China.
| | - Pingsheng Hao
- Hospital of Chengdu University of Traditional Chinese Medicine, No. 39 Shi-er-qiao Road, Chengdu, 610075, Sichuan, China.
| |
Collapse
|
48
|
Bibi S, Khan MA, Shah JH, Damaševičius R, Alasiry A, Marzougui M, Alhaisoni M, Masood A. MSRNet: Multiclass Skin Lesion Recognition Using Additional Residual Block Based Fine-Tuned Deep Models Information Fusion and Best Feature Selection. Diagnostics (Basel) 2023; 13:3063. [PMID: 37835807 PMCID: PMC10572512 DOI: 10.3390/diagnostics13193063] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2023] [Revised: 09/19/2023] [Accepted: 09/24/2023] [Indexed: 10/15/2023] Open
Abstract
Cancer is one of the leading causes of illness and chronic disease worldwide. Skin cancer, particularly melanoma, is becoming a severe health problem due to its rising prevalence. The considerable death rate linked with melanoma requires early detection so that immediate and successful treatment can be provided. Lesion detection and classification are made more challenging by many forms of artifacts, such as hairs and noise, irregularity of lesion shape and color, and irrelevant features and textures. In this work, we propose a deep-learning architecture for multiclass skin cancer classification and melanoma detection. The proposed architecture consists of four core steps: image preprocessing, feature extraction and fusion, feature selection, and classification. A novel contrast enhancement technique is proposed based on image luminance information. After that, two pre-trained deep models, DarkNet-53 and DenseNet-201, are modified with a residual block at the end and trained through transfer learning. In the learning process, a genetic algorithm is applied to select hyperparameters. The resultant features are fused using a two-step approach named serial-harmonic mean. This step increases the accuracy of correct classification, but some irrelevant information is also observed. Therefore, an algorithm called marine predator optimization (MPA)-controlled Rényi entropy is developed to select the best features. The selected features are finally classified using machine learning classifiers. Two datasets, ISIC2018 and ISIC2019, were selected for the experimental process, on which maximum accuracies of 85.4% and 98.80%, respectively, were obtained. To prove the effectiveness of the proposed methods, a detailed comparison with several recent techniques is conducted, showing that the proposed framework outperforms them.
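The snippet below is a hedged, simplified reading of two ingredients named above: harmonic-mean feature fusion and Rényi-entropy feature scoring. The exact serial-harmonic-mean fusion and the MPA-controlled selection procedure are not specified here, so both steps (including the "serial" concatenation and the keep-lowest-entropy direction) are illustrative assumptions rather than the authors' algorithm.

```python
# Simplified harmonic-mean fusion and Rényi-entropy feature scoring
# (assumed readings of the techniques named above, not the paper's method).
import numpy as np

def harmonic_fuse(f1: np.ndarray, f2: np.ndarray) -> np.ndarray:
    """Element-wise harmonic mean of two equal-length feature vectors,
    serially concatenated with the originals (one plausible 'serial' reading)."""
    hm = 2 * f1 * f2 / np.maximum(f1 + f2, 1e-8)
    return np.concatenate([f1, f2, hm])

def renyi_entropy_scores(X: np.ndarray, alpha: float = 2.0, bins: int = 16) -> np.ndarray:
    """Per-feature Rényi entropy of order alpha over a histogram estimate."""
    scores = []
    for col in X.T:
        p, _ = np.histogram(col, bins=bins)
        p = p / p.sum()
        p = p[p > 0]
        scores.append(np.log((p ** alpha).sum()) / (1 - alpha))
    return np.array(scores)

fused = harmonic_fuse(np.random.rand(128), np.random.rand(128))
X = np.random.rand(100, 384)
keep = np.argsort(renyi_entropy_scores(X))[:50]  # keep 50 lowest-entropy features
                                                 # (selection direction is an assumption)
```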
Collapse
Affiliation(s)
- Sobia Bibi
- Department of CS, COMSATS University Islamabad, Wah Campus, Islamabad 45550, Pakistan; (S.B.); (J.H.S.)
| | - Muhammad Attique Khan
- Department of Computer Science and Mathematics, Lebanese American University, Beirut 1102-2801, Lebanon;
- Department of CS, HITEC University, Taxila 47080, Pakistan
| | - Jamal Hussain Shah
- Department of CS, COMSATS University Islamabad, Wah Campus, Islamabad 45550, Pakistan; (S.B.); (J.H.S.)
| | - Robertas Damaševičius
- Center of Excellence Forest 4.0, Faculty of Informatics, Kaunas University of Technology, 51368 Kaunas, Lithuania;
| | - Areej Alasiry
- College of Computer Science, King Khalid University, Abha 61413, Saudi Arabia; (A.A.); (M.M.)
| | - Mehrez Marzougui
- College of Computer Science, King Khalid University, Abha 61413, Saudi Arabia; (A.A.); (M.M.)
| | - Majed Alhaisoni
- Computer Sciences Department, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh 11564, Saudi Arabia;
| | - Anum Masood
- Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology (NTNU), 7034 Trondheim, Norway
| |
Collapse
|
49
|
Wang J, Horlacher M, Cheng L, Winther O. RNA trafficking and subcellular localization-a review of mechanisms, experimental and predictive methodologies. Brief Bioinform 2023; 24:bbad249. [PMID: 37466130 PMCID: PMC10516376 DOI: 10.1093/bib/bbad249] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2023] [Revised: 05/30/2023] [Accepted: 06/16/2023] [Indexed: 07/20/2023] Open
Abstract
RNA localization is essential for regulating spatial translation, where RNAs are trafficked to their target locations via various biological mechanisms. In this review, we discuss RNA localization in the context of molecular mechanisms, experimental techniques and machine learning-based prediction tools. Three main types of molecular mechanisms that control the localization of RNA to distinct cellular compartments are reviewed, including directed transport, protection from mRNA degradation, as well as diffusion and local entrapment. Advances in experimental methods, both image and sequence based, provide substantial data resources, which allow for the design of powerful machine learning models to predict RNA localizations. We review the publicly available predictive tools to serve as a guide for users and inspire developers to build more effective prediction models. Finally, we provide an overview of multimodal learning, which may provide a new avenue for the prediction of RNA localization.
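To ground the sequence-based prediction setting the review surveys, here is a hedged toy sketch: normalized 3-mer frequencies from RNA sequences feed a logistic-regression classifier. Real localization predictors use far richer models; the features, labels, and data below are all assumptions for illustration.

```python
# Toy sequence-based localization predictor: 3-mer frequencies + logistic regression
# (a deliberately simple stand-in for the tools reviewed above).
from itertools import product
import numpy as np
from sklearn.linear_model import LogisticRegression

KMERS = ["".join(p) for p in product("ACGU", repeat=3)]  # all 64 RNA 3-mers

def kmer_features(seq: str) -> np.ndarray:
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 2):                 # sliding window, overlapping
        counts[seq[i:i + 3]] += 1
    total = max(len(seq) - 2, 1)
    return np.array([counts[k] / total for k in KMERS])

# Synthetic training data: random sequences with random compartment labels.
rng = np.random.default_rng(1)
seqs = ["".join(rng.choice(list("ACGU"), size=200)) for _ in range(50)]
X = np.stack([kmer_features(s) for s in seqs])
y = rng.integers(0, 2, size=50)   # e.g. 0 = cytoplasm, 1 = nucleus (illustrative)

clf = LogisticRegression(max_iter=1000).fit(X, y)
```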
Collapse
Affiliation(s)
- Jun Wang
- Bioinformatics Centre, Department of Biology, University of Copenhagen, København Ø 2100, Denmark
| | - Marc Horlacher
- Computational Health Center, Helmholtz Center, Munich, Germany
| | - Lixin Cheng
- Shenzhen People’s Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen 518020, China
| | - Ole Winther
- Bioinformatics Centre, Department of Biology, University of Copenhagen, København Ø 2100, Denmark
- Center for Genomic Medicine, Rigshospitalet (Copenhagen University Hospital), Copenhagen 2100, Denmark
- Section for Cognitive Systems, Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby 2800, Denmark
| |
Collapse
|
50
|
Giavina-Bianchi M, Vitor WG, Fornasiero de Paiva V, Okita AL, Sousa RM, Machado B. Explainability agreement between dermatologists and five visual explanations techniques in deep neural networks for melanoma AI classification. Front Med (Lausanne) 2023; 10:1241484. [PMID: 37746081 PMCID: PMC10513767 DOI: 10.3389/fmed.2023.1241484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2023] [Accepted: 08/14/2023] [Indexed: 09/26/2023] Open
Abstract
Introduction The use of deep convolutional neural networks for analyzing skin lesion images has shown promising results. The identification of skin cancer by faster and less expensive means can lead to an early diagnosis, saving lives and avoiding treatment costs. However, to implement this technology in a clinical context, it is important for specialists to understand why a certain model makes a prediction; it must be explainable. Explainability techniques can be used to highlight the patterns of interest for a prediction. Methods Our goal was to test five different techniques: Grad-CAM, Grad-CAM++, Score-CAM, Eigen-CAM, and LIME, in order to analyze the agreement rate between the features highlighted by the visual explanation maps and three clinical criteria important for melanoma classification: asymmetry, border irregularity, and color heterogeneity (the ABC rule) in 100 melanoma images. Two dermatologists scored the visual maps and the clinical images using a semi-quantitative scale, and the results were compared. They also ranked their preferred techniques. Results We found that the techniques had different agreement rates and acceptance. In the overall analysis, Grad-CAM showed the best total+partial agreement rate (93.6%), followed by LIME (89.8%), Grad-CAM++ (88.0%), Eigen-CAM (86.4%), and Score-CAM (84.6%). Dermatologists ranked their favorite options: Grad-CAM and Grad-CAM++, followed by Score-CAM, LIME, and Eigen-CAM. Discussion Saliency maps are one of the few methods that can be used for visual explanations. Evaluating explainability with humans is the ideal way to assess the understanding and applicability of these methods. Our results demonstrate a significant agreement between the clinical features used by dermatologists to diagnose melanomas and the visual explanation techniques, especially Grad-CAM.
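For readers unfamiliar with the best-scoring technique, the sketch below computes a bare-bones Grad-CAM map: class-score gradients are pooled to weight the last convolutional feature maps. The backbone, target layer, and input are illustrative assumptions, not the study's setup.

```python
# Bare-bones Grad-CAM (assumed ResNet-18 setup, not the study's exact models).
import torch
from torchvision import models

model = models.resnet18(weights=None).eval()   # random weights; a stand-in model
feats = {}
model.layer4.register_forward_hook(lambda m, i, o: feats.update(a=o))

x = torch.randn(1, 3, 224, 224)                # stand-in lesion image
score = model(x)[0].max()                      # top-class score
grads = torch.autograd.grad(score, feats["a"])[0]   # d(score)/d(feature maps)

weights = grads.mean(dim=(2, 3), keepdim=True)       # pooled gradients per channel
cam = torch.relu((weights * feats["a"]).sum(dim=1))  # weighted map combination
cam = cam / cam.max().clamp_min(1e-8)                # normalize to [0, 1]
```

Upsampling the resulting low-resolution map to the image size and overlaying it as a heatmap yields the saliency visualizations the dermatologists scored.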
Collapse
|