1
Goessinger EV, Cerminara SE, Mueller AM, Gottfrois P, Huber S, Amaral M, Wenz F, Kostner L, Weiss L, Kunz M, Maul JT, Wespi S, Broman E, Kaufmann S, Patpanathapillai V, Treyer I, Navarini AA, Maul LV. Consistency of convolutional neural networks in dermoscopic melanoma recognition: A prospective real-world study about the pitfalls of augmented intelligence. J Eur Acad Dermatol Venereol 2024; 38:945-953. [PMID: 38158385 DOI: 10.1111/jdv.19777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2023] [Accepted: 10/23/2023] [Indexed: 01/03/2024]
Abstract
BACKGROUND Deep-learning convolutional neural networks (CNNs) have outperformed even experienced dermatologists in dermoscopic melanoma detection under controlled conditions. It remains unexplored how real-world dermoscopic image transformations affect CNN robustness. OBJECTIVES To investigate the consistency of melanoma risk assessment by two commercially available CNNs to help formulate recommendations for current clinical use. METHODS A comparative cohort study was conducted from January to July 2022 at the Department of Dermatology, University Hospital Basel. Five dermoscopic images of 116 different lesions on the torso of 66 patients were captured consecutively by the same operator without deliberate rotation. Classification was performed by two CNNs (CNN-1/CNN-2). Lesions were divided into four subgroups based on their initial risk scoring and clinical dignity assessment. Reliability was assessed by variation and intraclass correlation coefficients. Excisions were performed for melanoma suspicion or two consecutively elevated CNN risk scores, and benign lesions were confirmed by expert consensus (n = 3). RESULTS 117 repeated image series of 116 melanocytic lesions (2 melanomas, 16 dysplastic naevi, 29 naevi, 1 solar lentigo, 1 suspicious and 67 benign) were classified. CNN-1 demonstrated superior measurement repeatability for clinically benign lesions with an initial malignant risk score (mean variation coefficient (mvc): CNN-1: 49.5(±34.3)%; CNN-2: 71.4(±22.5)%; p = 0.03), while CNN-2 outperformed for clinically benign lesions with benign scoring (mvc: CNN-1: 49.7(±22.7)%; CNN-2: 23.8(±29.3)%; p = 0.002). Both systems exhibited lowest score consistency for lesions with an initial malignant risk score and benign assessment. In this context, averaging three initial risk scores achieved highest sensitivity of dignity assessment (CNN-1: 94%; CNN-2: 89%). 
Intraclass correlation coefficients indicated 'moderate'-to-'good' reliability for both systems (CNN-1: 0.80, 95% CI:0.71-0.87, p < 0.001; CNN-2: 0.67, 95% CI:0.55-0.77, p < 0.001). CONCLUSIONS Potential user-induced image changes can significantly influence CNN classification. For clinical application, we recommend using the average of three initial risk scores. Furthermore, we advocate for CNN robustness optimization by cross-validation with repeated image sets. TRIAL REGISTRATION ClinicalTrials.gov (NCT04605822).
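The repeatability measure and the averaging recommendation in this abstract can be illustrated with a minimal sketch. It assumes the variation coefficient is the sample standard deviation divided by the mean, expressed in percent; the abstract does not state which variant the authors used. The score values and the 0.5 cut-off are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def variation_coefficient(scores):
    """Coefficient of variation in percent (sample SD / mean); assumed definition."""
    m = mean(scores)
    if m == 0:
        raise ValueError("mean score is zero; coefficient of variation is undefined")
    return 100.0 * stdev(scores) / m


def averaged_initial_score(scores, n=3):
    """Mean of the first n scores of a repeated image series."""
    if len(scores) < n:
        raise ValueError("series is shorter than the requested number of scores")
    return mean(scores[:n])


# Hypothetical risk scores for five consecutive images of one lesion.
series = [0.62, 0.35, 0.48, 0.71, 0.29]
THRESHOLD = 0.5  # hypothetical malignancy cut-off, not taken from the study

cv = variation_coefficient(series)
avg3 = averaged_initial_score(series)
flag_first = series[0] >= THRESHOLD   # decision based on the first image only
flag_avg3 = avg3 >= THRESHOLD         # decision based on the mean of three images

print(f"variation coefficient: {cv:.1f}%")
print(f"first score flags lesion: {flag_first}, mean of three flags lesion: {flag_avg3}")
```

In this made-up series a single image would flag the lesion while the three-image mean would not, which is the kind of score instability the study describes.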
Affiliation(s)
- E V Goessinger
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- S E Cerminara
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- A M Mueller
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- P Gottfrois
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- S Huber
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- M Amaral
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- F Wenz
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- L Kostner
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- L Weiss
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- M Kunz
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- J-T Maul
  - Department of Dermatology, University Hospital Zurich, Zurich, Switzerland
- S Wespi
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- E Broman
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- S Kaufmann
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- V Patpanathapillai
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- I Treyer
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- A A Navarini
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
- L V Maul
  - Department of Dermatology, University Hospital Basel, Basel, Switzerland
  - Department of Dermatology, University Hospital Zurich, Zurich, Switzerland
2
Miller I, Rosic N, Stapelberg M, Hudson J, Coxon P, Furness J, Walsh J, Climstein M. Performance of Commercial Dermatoscopic Systems That Incorporate Artificial Intelligence for the Identification of Melanoma in General Practice: A Systematic Review. Cancers (Basel) 2024; 16:1443. [PMID: 38611119 PMCID: PMC11011068 DOI: 10.3390/cancers16071443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 04/03/2024] [Accepted: 04/04/2024] [Indexed: 04/14/2024] Open
Abstract
BACKGROUND Cutaneous melanoma remains an increasing global public health burden, particularly in fair-skinned populations. Advancing technologies, particularly artificial intelligence (AI), may provide an additional tool for clinicians to help detect malignancies more accurately. This systematic review aimed to report the performance metrics of commercially available convolutional neural networks (CNNs) tasked with detecting malignant melanoma (MM). METHODS A systematic literature search was performed using CINAHL, Medline, Scopus, ScienceDirect and Web of Science databases. RESULTS A total of 16 articles reporting MM were included in this review. The combined number of melanomas detected was 1160, and non-melanoma lesions were 33,010. The performance of market-approved technology and clinician performance for classifying melanoma was highly heterogeneous, with sensitivity ranging from 16.4 to 100.0%, specificity between 40.0 and 98.3% and accuracy between 44.0 and 92.0%. Less heterogeneity was observed when clinicians worked in unison with AI, with sensitivity ranging between 83.3 and 100.0%, specificity between 83.7 and 87.3%, and accuracy between 86.4 and 86.9%. CONCLUSION Instead of focusing on the performance of AI versus clinicians for classifying melanoma, more consistent performance has been obtained when clinicians' work is supported by AI, facilitating management decisions and improving health outcomes.
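The three performance metrics pooled in this review follow from a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not drawn from any of the 16 included articles.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and accuracy from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / (tp + fn),            # melanomas correctly flagged
        "specificity": tn / (tn + fp),            # non-melanomas correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts: 50 melanomas and 200 non-melanoma lesions.
metrics = diagnostic_metrics(tp=45, fn=5, tn=160, fp=40)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because accuracy depends on the ratio of melanomas to non-melanoma lesions, it is not directly comparable across studies with different case mixes, which may contribute to the heterogeneity reported above.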
Affiliation(s)
- Ian Miller
  - Aquatic Based Research, Southern Cross University, Bilinga, QLD 4225, Australia
  - Faculty of Health, Southern Cross University, Bilinga, QLD 4225, Australia
  - Specialist Suite, John Flynn Hospital, Tugun, QLD 4224, Australia
- Nedeljka Rosic
  - Aquatic Based Research, Southern Cross University, Bilinga, QLD 4225, Australia
  - Faculty of Health, Southern Cross University, Bilinga, QLD 4225, Australia
- Michael Stapelberg
  - Aquatic Based Research, Southern Cross University, Bilinga, QLD 4225, Australia
  - Faculty of Health, Southern Cross University, Bilinga, QLD 4225, Australia
  - Specialist Suite, John Flynn Hospital, Tugun, QLD 4224, Australia
- Jeremy Hudson
  - Faculty of Health, Southern Cross University, Bilinga, QLD 4225, Australia
  - North Queensland Skin Centre, Townsville, QLD 4810, Australia
- Paul Coxon
  - Faculty of Health, Southern Cross University, Bilinga, QLD 4225, Australia
  - North Queensland Skin Centre, Townsville, QLD 4810, Australia
- James Furness
  - Water Based Research Unit, Bond University, Robina, QLD 4226, Australia
- Joe Walsh
  - Sport Science Institute, Sydney, NSW 2000, Australia
  - AI Consulting Group, Sydney, NSW 2000, Australia
- Mike Climstein
  - Aquatic Based Research, Southern Cross University, Bilinga, QLD 4225, Australia
  - Faculty of Health, Southern Cross University, Bilinga, QLD 4225, Australia
  - Physical Activity, Lifestyle, Ageing and Wellbeing Faculty Research Group, University of Sydney, Sydney, NSW 2050, Australia
3
Ji H, Li J, Zhu X, Fan L, Jiang W, Chen Y. Enhancing assisted diagnostic accuracy in scalp psoriasis: A Multi-Network Fusion Object Detection Framework for dermoscopic pattern diagnosis. Skin Res Technol 2024; 30:e13698. [PMID: 38634154 PMCID: PMC11024501 DOI: 10.1111/srt.13698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Accepted: 04/02/2024] [Indexed: 04/19/2024]
Abstract
BACKGROUND Dermoscopy is a common method of scalp psoriasis diagnosis, and several artificial intelligence techniques have been used to assist dermoscopy in the diagnosis of nail fungus disease, the most commonly used being the convolutional neural network algorithm; however, convolutional neural networks are only the most basic algorithm, and the use of object detection algorithms to assist dermoscopy in the diagnosis of scalp psoriasis has not been reported. OBJECTIVES To establish a dermoscopic modality diagnostic framework for scalp psoriasis based on object detection technology and image enhancement to improve diagnostic efficiency and accuracy. METHODS We analyzed the dermoscopic patterns of scalp psoriasis diagnosed at the 72nd Group army hospital of PLA from January 1, 2020 to December 31, 2021, and selected scalp seborrheic dermatitis as a control group. Based on dermoscopic images and major dermoscopic patterns of scalp psoriasis and scalp seborrheic dermatitis, we investigated a multi-network fusion object detection framework based on the object detection technique Faster R-CNN and the image enhancement technique contrast limited adaptive histogram equalization (CLAHE), for assisting in the diagnosis of scalp psoriasis and scalp seborrheic dermatitis, as well as for differentiating the major dermoscopic patterns of the two diseases. The diagnostic performance of the multi-network fusion object detection framework was compared with that of dermatologists. RESULTS A total of 1876 dermoscopic images were collected, including 1218 for scalp psoriasis versus 658 for scalp seborrheic dermatitis. Based on these images, training and testing were performed using a multi-network fusion object detection framework. The results showed that the test accuracy, specificity, sensitivity, and Youden index for the diagnosis of scalp psoriasis were 91.0%, 89.5%, 91.0%, and 0.805, and for the main dermoscopic patterns of scalp psoriasis and scalp seborrheic dermatitis, the diagnostic results were 89.9%, 97.7%, 89.9%, and 0.876. When the diagnostic results were compared with those of five dermatologists, the fusion framework performed better than the dermatologists. CONCLUSIONS Studies have shown some differences in dermoscopic patterns between scalp psoriasis and scalp seborrheic dermatitis. The proposed multi-network fusion object detection framework has higher diagnostic performance for scalp psoriasis than dermatologists.
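The Youden index reported here is, by its standard definition, sensitivity plus specificity minus one. A short sketch using the sensitivity and specificity values quoted in the abstract shows that the two reported indices are consistent with that definition; the function name is illustrative only.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Values quoted in the abstract (as fractions rather than percentages).
j_diagnosis = youden_index(sensitivity=0.910, specificity=0.895)
j_patterns = youden_index(sensitivity=0.899, specificity=0.977)

print(f"scalp psoriasis diagnosis: J = {j_diagnosis:.3f}")
print(f"dermoscopic patterns:      J = {j_patterns:.3f}")
```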
Affiliation(s)
- Honghai Ji
  - School of Electronics & Control Engineering, North China University of Technology, Beijing, China
- Jiaqi Li
  - School of Electronics & Control Engineering, North China University of Technology, Beijing, China
- Xiaoyang Zhu
  - Department of Dermatology, 72nd Group army hospital of PLA, Huzhou, China
- Lingling Fan
  - School of Automation, Beijing Information Science and Technology University, Beijing, China
- Weiwei Jiang
  - Department of Dermatology, 72nd Group army hospital of PLA, Huzhou, China
  - Department of Dermatology, Shanghai Key Laboratory of Medical Mycology, Changzheng Hospital, Naval Medical University, Shanghai, China
- Yang Chen
  - Department of Dermatology, 72nd Group army hospital of PLA, Huzhou, China
4
Yee J, Rosendahl C, Aoude LG. The role of artificial intelligence and convolutional neural networks in the management of melanoma: a clinical, pathological, and radiological perspective. Melanoma Res 2024; 34:96-104. [PMID: 38141179 PMCID: PMC10906187 DOI: 10.1097/cmr.0000000000000951] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/16/2023] [Accepted: 11/29/2023] [Indexed: 12/25/2023]
Abstract
Clinical dermatoscopy and pathological slide assessment are essential in the diagnosis and management of patients with cutaneous melanoma. For those presenting with stage IIC disease and beyond, radiological investigations are often considered. The dermatoscopic, whole slide and radiological images used during clinical care are often stored digitally, enabling artificial intelligence (AI) and convolutional neural networks (CNN) to learn, analyse and contribute to the clinical decision-making. A keyword search of the Medline database was performed to assess the progression, capabilities and limitations of AI and CNN and their use in the diagnosis and management of cutaneous melanoma. Full-text articles were reviewed if they related to dermatoscopy, pathological slide assessment or radiology. Through analysis of 95 studies, we demonstrate that diagnostic accuracy of AI/CNN can be superior (or at least equal) to clinicians. However, variability in image acquisition, pre-processing, segmentation, and feature extraction remains challenging. With current technological abilities, AI/CNN and clinicians working together perform better than either alone in all subspecialty domains relating to cutaneous melanoma. AI has the potential to enhance the diagnostic capabilities of junior dermatology trainees, primary care skin cancer clinicians and general practitioners. For experienced clinicians, AI provides a cost-efficient second opinion. From a pathological and radiological perspective, CNN has the potential to improve workflow efficiency, allowing clinicians to achieve more in a finite amount of time. Until the challenges of AI/CNN are reliably met, however, they can only remain an adjunct to clinical decision-making.
Affiliation(s)
- Joshua Yee
  - Faculty of Medicine, University of Queensland, St Lucia
- Cliff Rosendahl
  - Primary Care Clinical Unit, Medical School, The University of Queensland, Herston
- Lauren G. Aoude
  - Frazer Institute, The University of Queensland, Woolloongabba, QLD, Australia
5
Kolasa K, Admassu B, Hołownia-Voloskova M, Kędzior KJ, Poirrier JE, Perni S. Systematic reviews of machine learning in healthcare: a literature review. Expert Rev Pharmacoecon Outcomes Res 2024; 24:63-115. [PMID: 37955147 DOI: 10.1080/14737167.2023.2279107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 10/31/2023] [Indexed: 11/14/2023]
Abstract
INTRODUCTION The increasing availability of data and computing power has made machine learning (ML) a viable approach to faster, more efficient healthcare delivery. METHODS A systematic literature review (SLR) of published SLRs evaluating ML applications in healthcare settings published between 1 January 2010 and 27 March 2023 was conducted. RESULTS In total, 220 SLRs covering 10,462 ML algorithms were reviewed. The main application of AI in medicine related to clinical prediction and disease prognosis in oncology and neurology with the use of imaging data. Accuracy, specificity, and sensitivity were provided in 56%, 28%, and 25% of SLRs, respectively. Internal and external validation was reported in 53% and less than 1% of the cases, respectively. The most common modeling approach was neural networks (2,454 ML algorithms), followed by support vector machine and random forest/decision trees (1,578 and 1,522 ML algorithms, respectively). EXPERT OPINION The review indicated considerable reporting gaps in terms of ML performance and of both internal and external validation. Greater accessibility to healthcare data for developers can ensure the faster adoption of ML algorithms into clinical practice.
Affiliation(s)
- Katarzyna Kolasa
  - Division of Health Economics and Healthcare Management, Kozminski University, Warsaw, Poland
- Bisrat Admassu
  - Division of Health Economics and Healthcare Management, Kozminski University, Warsaw, Poland
6
Liutkus J, Kriukas A, Stragyte D, Mazeika E, Raudonis V, Galetzka W, Stang A, Valiukeviciene S. Accuracy of a Smartphone-Based Artificial Intelligence Application for Classification of Melanomas, Melanocytic Nevi, and Seborrheic Keratoses. Diagnostics (Basel) 2023; 13:2139. [PMID: 37443533 DOI: 10.3390/diagnostics13132139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 06/16/2023] [Accepted: 06/20/2023] [Indexed: 07/15/2023] Open
Abstract
Current artificial intelligence algorithms can classify melanomas at a level equivalent to that of experienced dermatologists. The objective of this study was to assess the accuracy of a smartphone-based "You Only Look Once" neural network model for the classification of melanomas, melanocytic nevi, and seborrheic keratoses. The algorithm was trained using 59,090 dermatoscopic images. Testing was performed on histologically confirmed lesions: 32 melanomas, 35 melanocytic nevi, and 33 seborrheic keratoses. The results of the algorithm's decisions were compared with those of two skilled dermatologists and five beginners in dermatoscopy. The algorithm's sensitivity and specificity for melanomas were 0.88 (0.71-0.96) and 0.87 (0.76-0.94), respectively. The algorithm surpassed the beginner dermatologists, who achieved a sensitivity of 0.83 (0.77-0.87). For melanocytic nevi, the algorithm outclassed each group of dermatologists, attaining a sensitivity of 0.77 (0.60-0.90). The algorithm's sensitivity for seborrheic keratoses was 0.52 (0.34-0.69). The smartphone-based "You Only Look Once" neural network model achieved a high sensitivity and specificity in the classification of melanomas and melanocytic nevi with an accuracy similar to that of skilled dermatologists. However, a bigger dataset is required in order to increase the algorithm's sensitivity for seborrheic keratoses.
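The bracketed ranges after each sensitivity and specificity are presumably 95% confidence intervals; with only 32 to 35 lesions per class they are necessarily wide. The sketch below computes a Wilson score interval as an illustration. Two assumptions are made that the abstract does not confirm: the interval method (the authors may have used an exact or other method), and the count of 28 correctly classified melanomas out of 32, which is inferred from the reported sensitivity of 0.88.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Assumed counts: 28 of 32 melanomas detected (inferred, not stated in the abstract).
low, high = wilson_interval(successes=28, n=32)
print(f"sensitivity 28/32 = {28 / 32:.2f}, Wilson interval {low:.2f} to {high:.2f}")
```

The result lies close to the reported 0.71-0.96, and the small difference is what one would expect if a different interval method was used in the study.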
Affiliation(s)
- Jokubas Liutkus
  - Department of Skin and Venereal Diseases, Lithuanian University of Health Sciences, 44307 Kaunas, Lithuania
  - Department of Skin and Venereal Diseases, Hospital of Lithuanian University of Health Sciences Kauno Klinikos, 50161 Kaunas, Lithuania
- Arturas Kriukas
  - Department of Skin and Venereal Diseases, Lithuanian University of Health Sciences, 44307 Kaunas, Lithuania
  - Department of Skin and Venereal Diseases, Hospital of Lithuanian University of Health Sciences Kauno Klinikos, 50161 Kaunas, Lithuania
- Dominyka Stragyte
  - Department of Skin and Venereal Diseases, Lithuanian University of Health Sciences, 44307 Kaunas, Lithuania
  - Department of Skin and Venereal Diseases, Hospital of Lithuanian University of Health Sciences Kauno Klinikos, 50161 Kaunas, Lithuania
- Erikas Mazeika
  - Department of Skin and Venereal Diseases, Lithuanian University of Health Sciences, 44307 Kaunas, Lithuania
  - Department of Skin and Venereal Diseases, Hospital of Lithuanian University of Health Sciences Kauno Klinikos, 50161 Kaunas, Lithuania
- Vidas Raudonis
  - Artificial Intelligence Center, Kaunas University of Technology, 51423 Kaunas, Lithuania
- Wolfgang Galetzka
  - Institute of Medical Informatics, Biometrics and Epidemiology, University Hospital Essen, 45130 Essen, Germany
- Andreas Stang
  - Institute of Medical Informatics, Biometrics and Epidemiology, University Hospital Essen, 45130 Essen, Germany
- Skaidra Valiukeviciene
  - Department of Skin and Venereal Diseases, Lithuanian University of Health Sciences, 44307 Kaunas, Lithuania
  - Department of Skin and Venereal Diseases, Hospital of Lithuanian University of Health Sciences Kauno Klinikos, 50161 Kaunas, Lithuania
7
Grossarth S, Mosley D, Madden C, Ike J, Smith I, Huo Y, Wheless L. Recent Advances in Melanoma Diagnosis and Prognosis Using Machine Learning Methods. Curr Oncol Rep 2023; 25:635-645. [PMID: 37000340 PMCID: PMC10339689 DOI: 10.1007/s11912-023-01407-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/13/2023] [Indexed: 04/01/2023]
Abstract
PURPOSE OF REVIEW The purpose was to summarize the current role and state of artificial intelligence and machine learning in the diagnosis and management of melanoma. RECENT FINDINGS Deep learning algorithms can identify melanoma from clinical, dermoscopic, and whole slide pathology images with increasing accuracy. Efforts to provide more granular annotation to datasets and to identify new predictors are ongoing. There have been many incremental advances in both melanoma diagnostics and prognostic tools using artificial intelligence and machine learning. Higher quality input data will further improve these models' capabilities.
Affiliation(s)
- Sarah Grossarth
  - Quillen College of Medicine, East Tennessee State University, Johnson City, TN, USA
- Christopher Madden
  - Department of Dermatology, Vanderbilt University Medicine Center, Nashville, TN, USA
  - State University of New York Downstate College of Medicine, Brooklyn, NY, USA
- Jacqueline Ike
  - Department of Dermatology, Vanderbilt University Medicine Center, Nashville, TN, USA
  - Meharry Medical College, Nashville, TN, USA
- Isabelle Smith
  - Department of Dermatology, Vanderbilt University Medicine Center, Nashville, TN, USA
  - Vanderbilt University, Nashville, TN, USA
- Yuankai Huo
  - Department of Computer Science and Electrical Engineering, Vanderbilt University, Nashville, TN, 37235, USA
- Lee Wheless
  - Department of Dermatology, Vanderbilt University Medicine Center, Nashville, TN, USA
  - Department of Medicine, Division of Epidemiology, Vanderbilt University Medical Center, Nashville, TN, USA
  - Tennessee Valley Healthcare System VA Medical Center, Nashville, TN, USA
8
Fogelberg K, Chamarthi S, Maron RC, Niebling J, Brinker TJ. Domain shifts in dermoscopic skin cancer datasets: Evaluation of essential limitations for clinical translation. N Biotechnol 2023:S1871-6784(23)00021-3. [PMID: 37146681 DOI: 10.1016/j.nbt.2023.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 04/12/2023] [Accepted: 04/26/2023] [Indexed: 05/07/2023]
Abstract
The limited ability of convolutional neural networks (CNNs) to generalize to images from previously unseen domains is a major limitation, in particular for safety-critical clinical tasks such as dermoscopic skin cancer classification. In order to translate CNN-based applications into the clinic, it is essential that they are able to adapt to domain shifts. Such new conditions can arise through the use of different image acquisition systems or varying lighting conditions. In dermoscopy, shifts can also occur as a change in patient age or the occurrence of rare lesion localizations (e.g. palms). These are not prominently represented in most training datasets and can therefore lead to a decrease in performance. In order to verify the generalizability of classification models in real-world clinical settings, it is crucial to have access to data which mimics such domain shifts. To our knowledge, no dermoscopic image dataset exists where such domain shifts are properly described and quantified. We therefore grouped publicly available images from the ISIC archive based on their metadata (e.g. acquisition location, lesion localization, patient age) to generate meaningful domains. To verify that these domains are in fact distinct, we used multiple quantification measures to estimate the presence and intensity of domain shifts. Additionally, we analyzed the performance on these domains with and without an unsupervised domain adaptation technique. We observed that in most of our grouped domains, domain shifts in fact exist. Based on our results, we believe these datasets to be helpful for testing the generalization capabilities of dermoscopic skin cancer classifiers.
Affiliation(s)
- Katharina Fogelberg
  - Digital Biomarkers for Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Sireesha Chamarthi
  - Data Analysis and Intelligence, German Aerospace Center (DLR - Institute of Data science), Jena, Germany
- Roman C Maron
  - Digital Biomarkers for Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Julia Niebling
  - Data Analysis and Intelligence, German Aerospace Center (DLR - Institute of Data science), Jena, Germany
- Titus J Brinker
  - Digital Biomarkers for Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
9
Kommoss KS, Winkler JK, Mueller-Christmann C, Bardehle F, Toberer F, Stolz W, Kraenke T, Hofmann-Wellenhof R, Blum A, Enk A, Rosenberger A, Haenssle HA. Observational study investigating the level of support from a convolutional neural network in face and scalp lesions deemed diagnostically 'unclear' by dermatologists. Eur J Cancer 2023; 185:53-60. [PMID: 36963352 DOI: 10.1016/j.ejca.2023.02.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Revised: 02/24/2023] [Accepted: 02/26/2023] [Indexed: 03/07/2023]
Abstract
BACKGROUND The clinical diagnosis of face and scalp lesions (FSL) is challenging due to overlapping features. Dermatologists encountering diagnostically 'unclear' lesions may benefit from artificial intelligence support via convolutional neural networks (CNN). METHODS In a web-based classification task, dermatologists (n = 64) diagnosed a convenience sample of 100 FSL as 'benign', 'malignant', or 'unclear' and indicated their management decisions ('no action', 'follow-up', 'treatment/excision'). A market-approved CNN (Moleanalyzer-Pro®, FotoFinder Systems, Germany) was applied for binary classifications (benign/malignant) of dermoscopic images. RESULTS After reviewing one dermoscopic image per case, dermatologists labelled 562 of 6400 diagnoses (8.8%) as 'unclear' and mostly managed these by follow-up examinations (57.3%, n = 322) or excisions (42.5%, n = 239). Management was incorrect in 58.8% of 291 truly malignant cases (171 'follow-up' or 'no action') and 43.9% of 271 truly benign cases (119 'excision'). Accepting CNN classifications in unclear cases would have reduced false management decisions to 4.1% in truly malignant and 31.7% in truly benign lesions (both p < 0.01). After receiving full case information 239 diagnoses (3.7%) remained 'unclear' to dermatologists, now triggering more excisions (72.0%) than follow-up examinations (28.0%). These management decisions were incorrect in 32.8% of 116 truly malignant cases and 76.4% of 123 truly benign cases. Accepting CNN classifications would have reduced false management decisions to 6.9% in truly malignant lesions and to 38.2% in truly benign cases (both p < 0.01). CONCLUSIONS Dermatologists mostly managed diagnostically 'unclear' FSL by treatment/excision or follow-up examination. Following CNN classifications as guidance in unclear cases seems suitable to significantly reduce incorrect decisions.
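The percentages in the first part of these results can be rechecked from the counts given alongside them. The sketch below does only that arithmetic; the helper name is illustrative, and all counts are taken from the abstract itself.

```python
def pct(part, whole):
    """Percentage of part in whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


unclear_share = pct(562, 6400)        # diagnoses labelled 'unclear' after one image
follow_up_share = pct(322, 562)       # 'unclear' cases managed by follow-up
excision_share = pct(239, 562)        # 'unclear' cases managed by excision
missed_malignant = pct(171, 291)      # truly malignant, managed without excision
excised_benign = pct(119, 271)        # truly benign, managed by excision
still_unclear = pct(239, 6400)        # 'unclear' after full case information

print(unclear_share, follow_up_share, excision_share)
print(missed_malignant, excised_benign, still_unclear)
```

Each value agrees with the figure quoted in the abstract (8.8%, 57.3%, 42.5%, 58.8%, 43.9% and 3.7%).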
Affiliation(s)
- Julia K Winkler
  - Department of Dermatology, University of Heidelberg, Heidelberg, Germany
- Felicitas Bardehle
  - Department of Dermatology, University of Heidelberg, Heidelberg, Germany
- Ferdinand Toberer
  - Department of Dermatology, University of Heidelberg, Heidelberg, Germany
- Wilhelm Stolz
  - Department of Dermatology, Allergology and Environmental Medicine II, Hospital Thalkirchner Street, Munich, Germany
- Teresa Kraenke
  - Department of Dermatology and Venerology, Medical University of Graz, Graz, Austria
- Andreas Blum
  - Public, Private and Teaching Practice of Dermatology, Konstanz, Germany
- Alexander Enk
  - Department of Dermatology, University of Heidelberg, Heidelberg, Germany
- Albert Rosenberger
  - Department of Genetic Epidemiology, University of Goettingen, Goettingen, Germany
- Holger A Haenssle
  - Department of Dermatology, University of Heidelberg, Heidelberg, Germany
10
Differentiating malignant and benign eyelid lesions using deep learning. Sci Rep 2023; 13:4103. [PMID: 36914694 PMCID: PMC10011394 DOI: 10.1038/s41598-023-30699-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2022] [Accepted: 02/28/2023] [Indexed: 03/16/2023] Open
Abstract
Artificial intelligence as a screening tool for eyelid lesions will be helpful for early diagnosis of eyelid malignancies and proper decision-making. This study aimed to evaluate the performance of a deep learning model in differentiating eyelid lesions using clinical eyelid photographs in comparison with human ophthalmologists. We included 4954 photographs from 928 patients in this retrospective cross-sectional study. Images were classified into three categories: malignant lesion, benign lesion, and no lesion. Two pre-trained convolutional neural network (CNN) models, DenseNet-161 and EfficientNetV2-M architectures, were fine-tuned to classify images into three or two (malignant versus benign) categories. For a ternary classification, the mean diagnostic accuracies of the CNNs were 82.1% and 83.0% using DenseNet-161 and EfficientNetV2-M, respectively, which were inferior to those of the nine clinicians (87.0-89.5%). For the binary classification, the mean accuracies were 87.5% and 92.5% using DenseNet-161 and EfficientNetV2-M models, which was similar to that of the clinicians (85.8-90.0%). The mean AUC of the two CNN models was 0.908 and 0.950, respectively. Gradient-weighted class activation map successfully highlighted the eyelid tumors on clinical photographs. Deep learning models showed a promising performance in discriminating malignant versus benign eyelid lesions on clinical photographs, reaching the level of human observers.
11
Liopyris K, Gregoriou S, Dias J, Stratigos AJ. Artificial Intelligence in Dermatology: Challenges and Perspectives. Dermatol Ther (Heidelb) 2022; 12:2637-2651. [PMID: 36306100 PMCID: PMC9674813 DOI: 10.1007/s13555-022-00833-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 06/08/2022] [Accepted: 10/07/2022] [Indexed: 01/07/2023] Open
Abstract
Artificial intelligence (AI) based on machine learning and convolutional neural networks (CNN) is rapidly becoming a realistic prospect in dermatology. Non-melanoma skin cancer is the most common cancer worldwide and melanoma is one of the deadliest forms of cancer. Dermoscopy has improved physicians' diagnostic accuracy for skin cancer recognition, but unfortunately it remains comparatively low. AI could provide invaluable aid in the early evaluation and diagnosis of skin cancer. In the last decade, there has been a breakthrough in new research and publications in the field of AI. Studies have shown that CNN algorithms can classify skin lesions from dermoscopic images with superior or at least equivalent performance compared to clinicians. Even though AI algorithms have shown very promising results for the diagnosis of skin cancer in reader studies, their generalizability and applicability in everyday clinical practice remain elusive. Herein we attempted to summarize the potential pitfalls and challenges of AI that were underlined in reader studies and pinpoint strategies to overcome limitations in future studies. Finally, we tried to analyze the advantages and opportunities that lie ahead for a better future for dermatology and patients, with the potential use of AI in our practices.
Affiliation(s)
- Konstantinos Liopyris
  - 1st Department of Dermatology-Venereology, Andreas Sygros Hospital, National and Kapodistrian University of Athens, 5 Ionos Dragoumi Str, 16121, Athens, Greece
  - Dermatology Department, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
- Stamatios Gregoriou
  - 1st Department of Dermatology-Venereology, Andreas Sygros Hospital, National and Kapodistrian University of Athens, 5 Ionos Dragoumi Str, 16121, Athens, Greece
- Julia Dias
  - 1st Department of Dermatology-Venereology, Andreas Sygros Hospital, National and Kapodistrian University of Athens, 5 Ionos Dragoumi Str, 16121, Athens, Greece
- Alexandros J Stratigos
  - 1st Department of Dermatology-Venereology, Andreas Sygros Hospital, National and Kapodistrian University of Athens, 5 Ionos Dragoumi Str, 16121, Athens, Greece
|
12
|
Winkler JK, Haenssle HA. [Artificial intelligence-based classification for the diagnostics of skin cancer]. Dermatologie (Heidelb) 2022; 73:838-844. [PMID: 36094608 DOI: 10.1007/s00105-022-05058-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/22/2022] [Indexed: 06/15/2023]
Abstract
Convolutional neural networks (CNN) achieve a level of performance comparable or even superior to that of dermatologists in the assessment of pigmented and nonpigmented skin lesions. In the analysis of images by artificial neural networks, images pass at the pixel level through various layers of the network with different graphic filters. Based on excellent study results, a first deep learning network (Moleanalyzer pro, Fotofinder Systems GmbH, Bad Birnbach, Germany) received market approval in Europe. However, such neural networks also reveal relevant limitations: rare entities with insufficient training images are classified less adequately, and image artifacts can lead to false diagnoses. Best results can ultimately be achieved in a cooperation of "man with machine". For future skin cancer screening, automated total body mapping is being evaluated, which combines total body photography, automated data extraction and assessment of all relevant skin lesions.
Affiliation(s)
- Julia K Winkler
  - Universitätshautklinik Heidelberg, Im Neuenheimer Feld 440, 69120, Heidelberg, Deutschland
- Holger A Haenssle
  - Universitätshautklinik Heidelberg, Im Neuenheimer Feld 440, 69120, Heidelberg, Deutschland
|
13
|
Nanni L, Brahnam S, Paci M, Ghidoni S. Comparison of Different Convolutional Neural Network Activation Functions and Methods for Building Ensembles for Small to Midsize Medical Data Sets. Sensors (Basel) 2022; 22:s22166129. [PMID: 36015898 PMCID: PMC9415767 DOI: 10.3390/s22166129] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/02/2022] [Revised: 08/09/2022] [Accepted: 08/12/2022] [Indexed: 05/08/2023]
Abstract
CNNs and other deep learners are now state-of-the-art in medical imaging research. However, the small sample size of many medical data sets dampens performance and results in overfitting. In some medical areas, it is simply too labor-intensive and expensive to amass images numbering in the hundreds of thousands. Building deep CNN ensembles of pre-trained CNNs is one powerful method for overcoming this problem. Ensembles combine the outputs of multiple classifiers to improve performance. This method relies on diversity, which can be introduced on many levels in the classification workflow. A recent ensembling method that has shown promise is to vary the activation functions in a set of CNNs or within different layers of a single CNN. This study aims to examine the performance of both methods using a large set of twenty activation functions, six of which are presented here for the first time: 2D Mexican ReLU, TanELU, MeLU + GaLU, Symmetric MeLU, Symmetric GaLU, and Flexible MeLU. The proposed method was tested on fifteen medical data sets representing various classification tasks. The best performing ensemble combined two well-known CNNs (VGG16 and ResNet50) whose standard ReLU activation layers were randomly replaced with another activation function. Results demonstrate the superiority in performance of this approach.
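As an illustration of the statement that ensembles combine the outputs of multiple classifiers, the following minimal Python sketch averages the softmax probabilities of several ensemble members and picks the class with the highest mean probability. The abstract does not say which fusion rule the authors used, so plain averaging is an assumption here, and the function names `softmax` and `ensemble_predict` as well as the example logits are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(member_logits: Sequence[Sequence[float]]) -> Tuple[int, List[float]]:
    """Fuse ensemble members by averaging their class probabilities.

    member_logits holds one logit vector per ensemble member, for example
    one per CNN variant with a different activation function (assumed setup).
    Returns the winning class index and the averaged probabilities.
    """
    probs = [softmax(logits) for logits in member_logits]
    n_members = len(probs)
    n_classes = len(probs[0])
    mean_probs = [
        sum(p[c] for p in probs) / n_members for c in range(n_classes)
    ]
    best_class = max(range(n_classes), key=lambda c: mean_probs[c])
    return best_class, mean_probs


if __name__ == "__main__":
    # Three hypothetical members scoring one image over two classes.
    label, mean_probs = ensemble_predict([[2.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
    print(label, mean_probs)
```

In this toy input one confident member votes for class 0 while two less confident members vote for class 1; averaging probabilities rather than counting votes lets each member's confidence influence the result.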
Affiliation(s)
- Loris Nanni
  - Department of Information Engineering, University of Padua, Via Gradenigo 6, 35131 Padova, Italy
- Sheryl Brahnam
  - Department of Information Technology and Cybersecurity, Missouri State University, 901 S. National Street, Springfield, MO 65804, USA
  - Correspondence:
- Michelangelo Paci
  - BioMediTech, Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, D 219, FI-33520 Tampere, Finland
- Stefano Ghidoni
  - Department of Information Engineering, University of Padua, Via Gradenigo 6, 35131 Padova, Italy
|
14
|
The Detection of Thread Roll's Margin Based on Computer Vision. Sensors 2021; 21:s21196331. [PMID: 34640651 PMCID: PMC8512785 DOI: 10.3390/s21196331] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/22/2021] [Revised: 09/11/2021] [Accepted: 09/17/2021] [Indexed: 11/16/2022]
Abstract
The automatic detection of the thread roll's margin is one of the core problems in the textile field. As the traditional detection method based on the thread's tension has the disadvantages of high cost and low reliability, this paper proposes a technology that installs a camera on a mobile robot and uses computer vision to detect the thread roll's margin. We define a thread roll's margin as the difference between the thread roll's radius and the bobbin's radius. Firstly, we capture images of the thread roll's end surface. Secondly, we obtain the bobbin's image coordinates by calculating the image's convolutions with a Circle Gradient Operator. Thirdly, we fit the contours of the thread roll and the bobbin to ellipses, and then delete false detections according to the bobbin's image coordinates. Finally, we restore every sub-image of the thread roll by a perspective transformation method, and establish the conversion relationship between the actual size and pixel size. The difference between the radii of the two concentric circles is the thread roll's margin. However, there are false detections, and these errors may exceed 19.4 mm when the margin is small. To improve the precision and delete false detections, we use deep learning to detect the radii of the thread roll and the bobbin and then calculate the thread roll's margin. After that, we fuse the two results. However, the deep learning method also produces some false detections. To eliminate the false detections completely, we therefore estimate the thread roll's margin according to the thread consumption speed. Lastly, we use a Kalman Filter to fuse the measured value and estimated value; the average error is less than 5.7 mm.
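The final fusion step described in this abstract can be sketched as a one-dimensional Kalman filter in Python: the margin is predicted from the thread consumption speed and then corrected with the vision-based measurement. This is only a minimal sketch under stated assumptions, not the authors' implementation; the class name `MarginFilter`, the linear consumption model, and the default noise variances are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class MarginFilter:
    """Scalar Kalman filter for a thread roll's margin (all values assumed)."""

    margin_mm: float               # current margin estimate in millimetres
    variance: float                # variance of that estimate in mm^2
    process_var: float = 0.5       # hypothetical noise added per prediction
    measurement_var: float = 25.0  # hypothetical variance of the vision result

    def predict(self, consumption_mm_per_s: float, dt_s: float) -> float:
        """Predict the margin from the consumption speed (assumed linear model)."""
        self.margin_mm -= consumption_mm_per_s * dt_s
        self.variance += self.process_var
        return self.margin_mm

    def update(self, measured_mm: float) -> float:
        """Correct the prediction with a measured margin."""
        gain = self.variance / (self.variance + self.measurement_var)
        self.margin_mm += gain * (measured_mm - self.margin_mm)
        self.variance *= 1.0 - gain
        return self.margin_mm


if __name__ == "__main__":
    # Hypothetical numbers: start at 100 mm, consume 1 mm/s for 2 s,
    # then receive a noisy vision measurement of 108 mm.
    tracker = MarginFilter(margin_mm=100.0, variance=10.0)
    tracker.predict(consumption_mm_per_s=1.0, dt_s=2.0)
    print(tracker.update(measured_mm=108.0))
```

Because the measurement variance is set larger than the prediction variance in this example, the fused value stays closer to the speed-based prediction than to the outlying measurement, which is the behaviour one would want when vision results occasionally contain false detections.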
|
15
|
Skin cancer classification via convolutional neural networks: systematic review of studies involving human experts. Eur J Cancer 2021; 156:202-216. [PMID: 34509059 DOI: 10.1016/j.ejca.2021.06.049] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Received: 05/16/2021] [Revised: 06/18/2021] [Accepted: 06/28/2021] [Indexed: 12/23/2022]
Abstract
BACKGROUND Multiple studies have compared the performance of artificial intelligence (AI)-based models for automated skin cancer classification to human experts, thus setting the cornerstone for a successful translation of AI-based tools into clinicopathological practice. OBJECTIVE The objective of the study was to systematically analyse the current state of research on reader studies involving melanoma and to assess their potential clinical relevance by evaluating three main aspects: test set characteristics (holdout/out-of-distribution data set, composition), test setting (experimental/clinical, inclusion of metadata) and representativeness of participating clinicians. METHODS PubMed, Medline and ScienceDirect were screened for peer-reviewed studies published between 2017 and 2021 and dealing with AI-based skin cancer classification involving melanoma. The search terms skin cancer classification, deep learning, convolutional neural network (CNN), melanoma (detection), digital biomarkers, histopathology and whole slide imaging were combined. Based on the search results, only studies that considered direct comparison of AI results with clinicians and had a diagnostic classification as their main objective were included. RESULTS A total of 19 reader studies fulfilled the inclusion criteria. Of these, 11 CNN-based approaches addressed the classification of dermoscopic images; 6 concentrated on the classification of clinical images, whereas 2 dermatopathological studies utilised digitised histopathological whole slide images. CONCLUSIONS All 19 included studies demonstrated superior or at least equivalent performance of CNN-based classifiers compared with clinicians. However, almost all studies were conducted in highly artificial settings based exclusively on single images of the suspicious lesions. Moreover, test sets mainly consisted of holdout images and did not represent the full range of patient populations and melanoma subtypes encountered in clinical practice.
|
16
|
Kittler H. Evolution of the Clinical, Dermoscopic and Pathologic Diagnosis of Melanoma. Dermatol Pract Concept 2021; 11:e2021163S. [PMID: 34447612 DOI: 10.5826/dpc.11s1a163s] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/26/2021] [Indexed: 10/31/2022]
Abstract
The conventional narrative states that the steadily rising incidence of melanoma among fair-skinned Caucasian populations during the last decades is caused by excessive UV-exposure. There is, however, no doubt that other factors had a significant impact on the rising incidence of melanoma. Pre-1980s the clinical diagnosis of melanoma was based on gross criteria such as ulceration or bleeding. Melanomas were often diagnosed in advanced stages when the prognosis was grim. In the mid-1980s education campaigns such as the propagation of the ABCD criteria, which addressed health care professionals and the public alike, shifted the focus towards early recognition. Dermatoscopy, which became increasingly popular in the mid-1990s, improved the accuracy for the diagnosis of melanoma in comparison to inspection with the unaided eye, especially for flat and small lesions lacking ABCD criteria. At the same time, pathologists began to lower their thresholds, particularly for the diagnosis of melanoma in situ. The melanoma epidemic that followed was mainly driven by an increase in the number of in situ or microinvasive melanomas. In a few decades, the landscape shifted from an undercalling to an overcalling of melanomas, a development that is now met with increased criticism. The gold standard of melanoma diagnosis is still conventional pathology, which is faced with low to moderate interobserver agreement. New insights in the molecular landscape of melanoma did not translate into techniques for the reliable diagnosis of gray zone lesions including small lesions. The aim of this review is to put our current view of melanoma diagnosis in historical context and to provide a narrative synthesis of its evolution. Based on this narrative I will provide suggestions on how to rebuild the trust in melanoma diagnosis accuracy and in the benefit of early recognition.
Affiliation(s)
- Harald Kittler
  - Department of Dermatology, Medical University of Vienna, Vienna, Austria
|
17
|
Females and Males Show Differences in Early-Stage Transcriptomic Biomarkers of Lung Adenocarcinoma and Lung Squamous Cell Carcinoma. Diagnostics (Basel) 2021; 11:diagnostics11020347. [PMID: 33669819 PMCID: PMC7922551 DOI: 10.3390/diagnostics11020347] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/20/2021] [Revised: 02/15/2021] [Accepted: 02/17/2021] [Indexed: 12/25/2022]
Abstract
The incidence and mortality rates of lung cancers differ between females and males. Therefore, sex information should be an important part of how a diagnostic model is trained and optimized. However, most existing studies do not fully utilize this information. This study carried out a comparative investigation between sex-specific models and sex-independent models. Three feature selection algorithms and five classifiers were utilized to evaluate the contribution of the sex information to the detection of early-stage lung cancers. For both lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), the sex-specific models outperformed the sex-independent models in the detection of early-stage lung cancers. The Venn plots suggested that females and males shared only a few transcriptomic biomarkers of early-stage lung cancers. Our experimental data suggested that sex information should be included in optimizing disease diagnosis models.
|