1. Sherbino J, Sibbald M, Norman G, LoGiudice A, Keuhl A, Lee M, Monteiro S. Crowdsourcing a diagnosis? Exploring the accuracy of the size and type of group diagnosis: an experimental study. BMJ Qual Saf 2024:bmjqs-2023-016695. [PMID: 38503488 DOI: 10.1136/bmjqs-2023-016695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 02/26/2024] [Indexed: 03/21/2024]
Abstract
BACKGROUND The consultation process, where a clinician seeks an opinion from another clinician, is foundational in medicine. However, the effectiveness of group diagnosis has not been studied. OBJECTIVE To compare individual diagnosis to group diagnosis on two dimensions: group size (n=3 or 6) and group process (interactive or artificial groups). METHODOLOGY Thirty-six internal or emergency medicine residents participated in the study. Initially, each resident worked through four written cases on their own, providing a primary diagnosis and a differential diagnosis. Next, participants formed into groups of three. Using a videoconferencing platform, they worked through four additional cases, collectively providing a single primary diagnosis and differential diagnosis. The process was repeated using a group of six with four new cases. Cases were all counterbalanced. Retrospectively, nominal (ie, artificial) groups were formed by aggregating individual participant data into subgroups of three and six and analytically computing scores. Presence of the correct diagnosis as primary diagnosis or included in the differential diagnosis, as well as the number of diagnoses mentioned, was calculated for all conditions. Means were compared using analysis of variance. RESULTS For both authentic and nominal groups, the diagnostic accuracy of group diagnosis was superior to individual for both the primary diagnosis and differential diagnosis. However, there was no improvement in diagnostic accuracy when comparing a group of three to a group of six. Interactive and nominal groups were equivalent; however, this may be an artefact of the method used to combine data. CONCLUSIONS Group diagnosis improves diagnostic accuracy. However, a larger group is not necessarily superior to a smaller group. In this study, interactive group discussion does not result in improved diagnostic accuracy.
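The nominal (artificial) group analysis described in this abstract can be illustrated with a small sketch. The abstract does not state the scoring rule, so the pooling rule used here (a nominal group is credited when any member listed the correct diagnosis) is an assumption, and the function name `nominal_group_accuracy` and the toy data are hypothetical.

```python
from itertools import combinations

def nominal_group_accuracy(individual_lists, correct, size):
    """Share of all possible nominal groups of `size` whose pooled
    differential contains the correct diagnosis (assumed pooling rule)."""
    groups = list(combinations(individual_lists, size))
    hits = sum(
        any(correct in differential for differential in group)
        for group in groups
    )
    return hits / len(groups)

# Hypothetical differentials from four residents for a single case.
differentials = [
    {"pulmonary embolism", "pneumonia"},
    {"pneumonia", "heart failure"},
    {"heart failure"},
    {"pulmonary embolism"},
]
individual = nominal_group_accuracy(differentials, "pulmonary embolism", 1)
triads = nominal_group_accuracy(differentials, "pulmonary embolism", 3)
print(individual, triads)
```

Under this pooling rule a nominal group can never score below its best member, which may be one reason the abstract cautions that the equivalence of interactive and nominal groups could be an artefact of how the data were combined.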
Affiliation(s)
- Jonathan Sherbino
  - Department of Medicine, McMaster University Faculty of Health Sciences, Hamilton, Ontario, Canada
- Matt Sibbald
  - Department of Medicine, McMaster University Faculty of Health Sciences, Hamilton, Ontario, Canada
- Geoffrey Norman
  - Department of Clinical Epidemiology and Biostatistics, McMaster University Faculty of Health Sciences, Hamilton, Ontario, Canada
- Andrew LoGiudice
  - Education Services, McMaster University Faculty of Health Sciences, Hamilton, Ontario, Canada
- Amy Keuhl
  - Education Services, McMaster University Faculty of Health Sciences, Hamilton, Ontario, Canada
- Mark Lee
  - Education Services, McMaster University Faculty of Health Sciences, Hamilton, Ontario, Canada
- Sandra Monteiro
  - Department of Medicine, McMaster University Faculty of Health Sciences, Hamilton, Ontario, Canada
2. Donoso F, Peirano D, Longo C, Apalla Z, Lallas A, Jaimes N, Navarrete-Dechent C. Gamified learning in dermatology and dermoscopy education: a paradigm shift. Clin Exp Dermatol 2023; 48:962-967. [PMID: 37155594 DOI: 10.1093/ced/llad177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Revised: 04/21/2023] [Accepted: 04/24/2023] [Indexed: 05/10/2023]
Abstract
Teaching methods in medical education have been changing. More recent teaching modalities have gone beyond the traditional delivery of knowledge, promoting learning motivation, and improving teaching and learning outcomes. 'Gamification' and 'serious games' are methodologies that use the principles of games to facilitate learning processes and the acquisition of skills and knowledge, thereby improving attitudes towards learning when compared with traditional teaching methods. As dermatology is a visual field, images are a key component of different teaching strategies. Likewise, dermoscopy, a noninvasive diagnostic technique that allows the visualization of structures within the epidermis and upper dermis, also uses images and pattern recognition strategies. A series of Apps using game-based strategy have been created to teach and facilitate dermoscopy learning; however, studies are required to demonstrate their effectiveness. This review summarizes the current evidence of game-based learning strategies in medical education, including dermatology and dermoscopy.
Affiliation(s)
- Caterina Longo
  - Department of Dermatology, University of Modena and Reggio Emilia, Modena, Italy
  - Centro Oncologico ad Alta Tecnologia Diagnostica, Azienda Unità Sanitaria Locale-IRCCS di Reggio Emilia, Reggio Emilia, Italy
- Aimilios Lallas
  - First Dermatology Department, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Natalia Jaimes
  - Dr Phillip Frost Department of Dermatology and Cutaneous Surgery
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL, USA
- Cristian Navarrete-Dechent
  - Department of Dermatology
  - Melanoma and Skin Cancer Unit, Escuela de Medicina, Pontificia Universidad Católica de Chile, Santiago, Chile
3. Barata C, Rotemberg V, Codella NCF, Tschandl P, Rinner C, Akay BN, Apalla Z, Argenziano G, Halpern A, Lallas A, Longo C, Malvehy J, Puig S, Rosendahl C, Soyer HP, Zalaudek I, Kittler H. A reinforcement learning model for AI-based decision support in skin cancer. Nat Med 2023; 29:1941-1946. [PMID: 37501017 PMCID: PMC10427421 DOI: 10.1038/s41591-023-02475-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/31/2022] [Accepted: 06/28/2023] [Indexed: 07/29/2023]
Abstract
We investigated whether human preferences hold the potential to improve diagnostic artificial intelligence (AI)-based decision support using skin cancer diagnosis as a use case. We utilized nonuniform rewards and penalties based on expert-generated tables, balancing the benefits and harms of various diagnostic errors, which were applied using reinforcement learning. Compared with supervised learning, the reinforcement learning model improved the sensitivity for melanoma from 61.4% to 79.5% (95% confidence interval (CI): 73.5-85.6%) and for basal cell carcinoma from 79.4% to 87.1% (95% CI: 80.3-93.9%). AI overconfidence was also reduced while simultaneously maintaining accuracy. Reinforcement learning increased the rate of correct diagnoses made by dermatologists by 12.0% (95% CI: 8.8-15.1%) and improved the rate of optimal management decisions from 57.4% to 65.3% (95% CI: 61.7-68.9%). We further demonstrated that the reward-adjusted reinforcement learning model and a threshold-based model outperformed naïve supervised learning in various clinical scenarios. Our findings suggest the potential for incorporating human preferences into image-based diagnostic algorithms.
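The idea of weighting diagnostic errors by nonuniform rewards and penalties can be sketched with a simple expected-reward decision rule. This is not the reinforcement learning model from the paper; it is a minimal illustration under stated assumptions, and the function name `best_action`, the class probabilities, and every value in the reward table are hypothetical.

```python
def best_action(probabilities, reward_table):
    """Pick the diagnosis with the highest expected reward.

    probabilities: {actual_class: probability} from some classifier.
    reward_table: {(predicted, actual_class): reward}; values are made up.
    """
    classes = list(probabilities)
    expected = {
        predicted: sum(
            probabilities[actual] * reward_table[(predicted, actual)]
            for actual in classes
        )
        for predicted in classes
    }
    return max(expected, key=expected.get), expected

# Hypothetical classifier output: the lesion is probably a nevus.
probs = {"melanoma": 0.3, "nevus": 0.7}
# Hypothetical table: missing a melanoma is penalised far more heavily
# than a false alarm.
rewards = {
    ("melanoma", "melanoma"): 5.0,
    ("melanoma", "nevus"): -1.0,
    ("nevus", "melanoma"): -10.0,
    ("nevus", "nevus"): 1.0,
}
choice, expected = best_action(probs, rewards)
print(choice, expected)
```

With these made-up numbers the rule selects melanoma even though nevus is the more probable class, which is how asymmetric penalties can raise sensitivity for the costlier error.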
Affiliation(s)
- Catarina Barata
  - Institute for Systems and Robotics, LARSyS, Instituto Superior Técnico, Lisbon, Portugal
- Veronica Rotemberg
  - Dermatology Service, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Philipp Tschandl
  - Department of Dermatology, Medical University of Vienna, Vienna, Austria
- Christoph Rinner
  - Center for Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Medical University of Vienna, Vienna, Austria
- Bengu Nisa Akay
  - Ankara University School of Medicine, Department of Dermatology, Ankara, Turkey
- Zoe Apalla
  - Second Department of Dermatology, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Allan Halpern
  - Dermatology Service, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Aimilios Lallas
  - Second Department of Dermatology, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Caterina Longo
  - Dermatology Unit, University of Modena and Reggio Emilia, Modena, Italy
  - Azienda Unità Sanitaria Locale - IRCCS di Reggio Emilia, Centro Oncologico ad Alta Tecnologia Diagnostica-Dermatologia, Reggio Emilia, Italy
- Josep Malvehy
  - Melanoma Unit, Dermatology Department, Hospital Clínic Barcelona, Universitat de Barcelona, IDIBAPS, Barcelona, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBER ER), Instituto de Salud Carlos III, Barcelona, Spain
- Susana Puig
  - Melanoma Unit, Dermatology Department, Hospital Clínic Barcelona, Universitat de Barcelona, IDIBAPS, Barcelona, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBER ER), Instituto de Salud Carlos III, Barcelona, Spain
- Cliff Rosendahl
  - General Practice Clinical Unit, Medical School, The University of Queensland, Brisbane, Queensland, Australia
- H Peter Soyer
  - Frazer Institute, The University of Queensland, Dermatology Research Centre, Brisbane, Queensland, Australia
- Iris Zalaudek
  - Department of Dermatology, Medical University of Trieste, Trieste, Italy
- Harald Kittler
  - Department of Dermatology, Medical University of Vienna, Vienna, Austria
4. Winkler JK, Haenssle HA. [Artificial intelligence-based classification for the diagnostics of skin cancer]. Dermatologie (Heidelberg, Germany) 2022; 73:838-844. [PMID: 36094608 DOI: 10.1007/s00105-022-05058-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/22/2022] [Indexed: 06/15/2023]
Abstract
Convolutional neural networks (CNN) achieve a level of performance comparable to or even superior to that of dermatologists in the assessment of pigmented and nonpigmented skin lesions. In the analysis of images by artificial neural networks, images on a pixel level pass through various layers of the network with different graphic filters. Based on excellent study results, a first deep learning network (Moleanalyzer pro, FotoFinder Systems GmbH, Bad Birnbach, Germany) received market approval in Europe. However, such neural networks also reveal relevant limitations, whereby rare entities with insufficient training images are classified less adequately and image artifacts can lead to false diagnoses. Best results can ultimately be achieved in a cooperation of "man with machine". For future skin cancer screening, automated total body mapping is being evaluated, which combines total body photography, automated data extraction and assessment of all relevant skin lesions.
Affiliation(s)
- Julia K Winkler
  - Universitätshautklinik Heidelberg, Im Neuenheimer Feld 440, 69120, Heidelberg, Germany
- Holger A Haenssle
  - Universitätshautklinik Heidelberg, Im Neuenheimer Feld 440, 69120, Heidelberg, Germany
5. Polesie S, Gillstedt M, Kittler H, Rinner C, Tschandl P, Paoli J. Assessment of melanoma thickness based on dermoscopy images: an open, web-based, international, diagnostic study. J Eur Acad Dermatol Venereol 2022; 36:2002-2007. [PMID: 35841304 PMCID: PMC9796258 DOI: 10.1111/jdv.18436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Accepted: 06/14/2022] [Indexed: 01/01/2023]
Abstract
BACKGROUND Preoperative assessment of whether a melanoma is invasive or in situ (MIS) is a common task that might have important implications for triage, prognosis and the selection of surgical margins. Several dermoscopic features suggestive of melanoma have been described, but only a few of these are useful in differentiating MIS from invasive melanoma. OBJECTIVE The primary aim of this study was to evaluate how accurately a large number of international readers, individually as well as collectively, were able to discriminate between MIS and invasive melanomas as well as estimate the Breslow thickness of invasive melanomas based on dermoscopy images. The secondary aim was to compare the accuracy of two machine learning convolutional neural networks (CNNs) and the collective reader response. METHODS We conducted an open, web-based, international, diagnostic reader study using an online platform. The online challenge opened on 10 May 2021 and closed on 19 July 2021 (71 days) and was advertised through several social media channels. The investigation included 1456 dermoscopy images of melanomas (788 MIS; 474 melanomas ≤1.0 mm and 194 >1.0 mm). A test set comprising 277 MIS and 246 invasive melanomas was used to compare readers and CNNs. RESULTS We analysed 22 314 readings by 438 international readers. The overall accuracy (95% confidence interval) for melanoma thickness was 56.4% (55.7%-57.0%), 63.4% (62.5%-64.2%) for MIS and 71.0% (70.3%-72.1%) for invasive melanoma. Readers accurately predicted the thickness in 85.9% (85.4%-86.4%) of melanomas ≤1.0 mm (including MIS) and in 70.8% (69.2%-72.5%) of melanomas >1.0 mm. The reader collective outperformed a de novo CNN but not a pretrained CNN in differentiating MIS from invasive melanoma. CONCLUSIONS Using dermoscopy images, readers and CNNs predict melanoma thickness with fair to moderate accuracy. Readers most accurately discriminated between thin (≤1.0 mm including MIS) and thick melanomas (>1.0 mm).
Affiliation(s)
- S. Polesie
  - Department of Dermatology and Venereology, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Dermatology and Venereology, Region Västra Götaland, Sahlgrenska University Hospital, Gothenburg, Sweden
- M. Gillstedt
  - Department of Dermatology and Venereology, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Dermatology and Venereology, Region Västra Götaland, Sahlgrenska University Hospital, Gothenburg, Sweden
- H. Kittler
  - Department of Dermatology, Medical University of Vienna, Vienna, Austria
- C. Rinner
  - Center of Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Medical University of Vienna, Vienna, Austria
- P. Tschandl
  - Department of Dermatology, Medical University of Vienna, Vienna, Austria
- J. Paoli
  - Department of Dermatology and Venereology, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Dermatology and Venereology, Region Västra Götaland, Sahlgrenska University Hospital, Gothenburg, Sweden
6. Barrutia L, Vega-Gutiérrez J, Santamarina-Albertos A. Benefits, drawbacks, and challenges of social media use in dermatology: A systematic review. J Dermatol Treat 2022; 33:2738-2757. [PMID: 35506617 DOI: 10.1080/09546634.2022.2069661] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 10/18/2022]
Abstract
The presence of dermatological information on social media has grown exponentially over the last two decades. Consequently, the recent literature on this topic is abundant. Many authors have highlighted that social media constitutes a unique opportunity for patient education. Additionally, numerous other benefits of these platforms have been reported. However, other authors have focused on the potential risks that these networks involve. The main concerns are patient confidentiality, legal considerations and ethical issues. Therefore, we stand at a crossroads where the many advantages of social media use in dermatology seem to be underestimated due to the presence of potential drawbacks. At this point, we propose that a systematic review of the positive and negative aspects of using social media in dermatology is necessary. We carried out a comprehensive systematic review dating from inception to July 2021. Finally, 161 articles were included. Fifteen benefits, 11 drawbacks and 10 challenges of social media use in dermatology were identified and discussed. Suggested strategies to address the identified drawbacks were provided. Overall, while there are risks to using social media, they are outnumbered by their benefits. Therefore, dermatologists should embrace this opportunity to educate patients and aim to create rigorous and engaging content.
Affiliation(s)
- Leire Barrutia
  - Dermatology, Medicine and Toxicology Department, University of Valladolid, Valladolid, Spain
  - Dermatology Department, Clinical University Hospital of Valladolid, Valladolid, Spain
- Jesús Vega-Gutiérrez
  - Dermatology, Medicine and Toxicology Department, University of Valladolid, Valladolid, Spain
  - Dermatology Department, Río Hortega University Hospital, Valladolid, Spain
- Alba Santamarina-Albertos
  - Dermatology, Medicine and Toxicology Department, University of Valladolid, Valladolid, Spain
  - Dermatology Department, Clinical University Hospital of Valladolid, Valladolid, Spain
7. Combalia M, Codella N, Rotemberg V, Carrera C, Dusza S, Gutman D, Helba B, Kittler H, Kurtansky NR, Liopyris K, Marchetti MA, Podlipnik S, Puig S, Rinner C, Tschandl P, Weber J, Halpern A, Malvehy J. Validation of artificial intelligence prediction models for skin cancer diagnosis using dermoscopy images: the 2019 International Skin Imaging Collaboration Grand Challenge. Lancet Digit Health 2022; 4:e330-e339. [PMID: 35461690 PMCID: PMC9295694 DOI: 10.1016/s2589-7500(22)00021-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 01/19/2021] [Revised: 12/23/2021] [Accepted: 01/26/2022] [Indexed: 01/08/2023]
8. Shi YC, Li J, Li SJ, Li ZP, Zhang HJ, Wu ZY, Wu ZY. Flap failure prediction in microvascular tissue reconstruction using machine learning algorithms. World J Clin Cases 2022; 10:3729-3738. [PMID: 35647170 PMCID: PMC9100718 DOI: 10.12998/wjcc.v10.i12.3729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2021] [Revised: 02/11/2022] [Accepted: 03/06/2022] [Indexed: 02/06/2023]
Abstract
BACKGROUND Microvascular tissue reconstruction is a well-established, commonly used technique for a wide variety of tissue defects. However, flap failure is associated with an additional hospital stay, medical cost burden, and mental stress. Therefore, understanding the risk factors associated with this event is of utmost importance.
AIM To develop machine learning-based predictive models for flap failure to identify the potential factors and screen out high-risk patients.
METHODS Using a data set of 946 consecutive patients who underwent microvascular free flap reconstruction of the head and neck, breast, back, and extremities, we established three machine learning models: a random forest classifier, a support vector machine, and gradient boosting. Model performance was evaluated with indicators such as the area under the receiver operating characteristic curve, accuracy, precision, recall, and F1 score. A multivariable regression analysis was performed for the most critical variables in the random forest model.
RESULTS Post-surgery, the flap failure event occurred in 34 patients (3.6%). The machine learning models based on various preoperative and intraoperative variables were successfully developed. Among them, the random forest classifier reached the best performance in receiver operating characteristic curve, with an area under the curve score of 0.770 in the test set. The top 10 variables in the random forest were age, body mass index, ischemia time, smoking, diabetes, experience, prior chemotherapy, hypertension, insulin, and obesity. Interestingly, only age, body mass index, and ischemic time were statistically associated with the outcomes.
CONCLUSION Machine learning-based algorithms, especially the random forest classifier, were very important in categorizing patients at high risk of flap failure. The occurrence of flap failure was a multifactor-driven event and was identified with numerous factors that warrant further investigation. Importantly, the successful application of machine learning models may help the clinician in decision-making, understanding the underlying pathologic mechanisms of the disease, and improving the long-term outcome of patients.
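The evaluation indicators named in the METHODS (accuracy, precision, recall, and F1 score) can be sketched from a confusion matrix as follows. The counts are hypothetical and are not taken from the study; they are chosen only to mimic a rare outcome such as flap failure, and the function name `classification_metrics` is an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical test set: 10 failures among 200 flaps, 4 of them detected.
metrics = classification_metrics(tp=4, fp=6, fn=6, tn=184)
print(metrics)
```

In this made-up example accuracy is 0.94 while recall is only 0.4, which illustrates why accuracy alone can mislead when the event is rare and why the other indicators are usually reported alongside it.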
Affiliation(s)
- Yu-Cang Shi
  - Department of Plastic Surgery, Affiliated Hospital of Guangdong Medical University, Zhanjiang 524001, Guangdong Province, China
- Jie Li
  - Department of Plastic Surgery, Affiliated Hospital of Guangdong Medical University, Zhanjiang 524001, Guangdong Province, China
- Shao-Jie Li
  - Department of Plastic Surgery, Affiliated Hospital of Guangdong Medical University, Zhanjiang 524001, Guangdong Province, China
- Zhan-Peng Li
  - Department of Plastic Surgery, Affiliated Hospital of Guangdong Medical University, Zhanjiang 524001, Guangdong Province, China
- Hui-Jun Zhang
  - Department of Plastic Surgery, Affiliated Hospital of Guangdong Medical University, Zhanjiang 524001, Guangdong Province, China
- Ze-Yong Wu
  - Department of Plastic Surgery, Affiliated Hospital of Guangdong Medical University, Zhanjiang 524001, Guangdong Province, China
- Zhi-Yuan Wu
  - Department of Plastic Surgery, Affiliated Hospital of Guangdong Medical University, Zhanjiang 524001, Guangdong Province, China
9. Winkler JK, Sies K, Fink C, Toberer F, Enk A, Abassi MS, Fuchs T, Blum A, Stolz W, Coras-Stepanek B, Cipic R, Guther S, Haenssle HA. [Collective human intelligence outperforms artificial intelligence in a quiz on the classification of skin lesions]. J Dtsch Dermatol Ges 2021; 19:1178-1185. [PMID: 34390156 DOI: 10.1111/ddg.14510_g] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/04/2020] [Accepted: 03/08/2021] [Indexed: 12/12/2022]
Affiliation(s)
- Tobias Fuchs
  - Department of Research and Development, FotoFinder Systems GmbH, Bad Birnbach
- Wilhelm Stolz
  - Department of Dermatology, Allergology and Environmental Medicine II, Hospital Thalkirchner Street, Munich
- Brigitte Coras-Stepanek
  - Department of Dermatology, Allergology and Environmental Medicine II, Hospital Thalkirchner Street, Munich
- Robert Cipic
  - Department of Dermatology, Allergology and Environmental Medicine II, Hospital Thalkirchner Street, Munich
- Stefanie Guther
  - Department of Dermatology, Allergology and Environmental Medicine II, Hospital Thalkirchner Street, Munich
10. Winkler JK, Sies K, Fink C, Toberer F, Enk A, Abassi MS, Fuchs T, Blum A, Stolz W, Coras-Stepanek B, Cipic R, Guther S, Haenssle HA. Collective human intelligence outperforms artificial intelligence in a skin lesion classification task. J Dtsch Dermatol Ges 2021; 19:1178-1184. [PMID: 34096688 DOI: 10.1111/ddg.14510] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/04/2020] [Accepted: 03/08/2021] [Indexed: 11/28/2022]
Abstract
BACKGROUND AND OBJECTIVES Convolutional neural networks (CNN) enable accurate diagnosis of medical images and perform on or above the level of individual physicians. Recently, collective human intelligence (CoHI) was shown to exceed the diagnostic accuracy of individuals. Thus, diagnostic performance of CoHI (120 dermatologists) versus individual dermatologists versus two state-of-the-art CNN was investigated. PATIENTS AND METHODS Cross-sectional reader study with presentation of 30 clinical cases to 120 dermatologists. Six diagnoses were offered and votes collected via remote voting devices (quizzbox®, Quizzbox Solutions GmbH, Stuttgart, Germany). Dermatoscopic images were classified by a binary and multiclass CNN (FotoFinder Systems GmbH, Bad Birnbach, Germany). Three sets of diagnostic classifications were scored against ground truth: (1) CoHI, (2) individual dermatologists, and (3) CNN. RESULTS CoHI attained a significantly higher accuracy [95 % confidence interval] (80.0 % [62.7 %-90.5 %]) than individual dermatologists (75.7 % [73.8 %-77.5 %]) and CNN (70.0 % [52.1 %-83.3 %]; all P < 0.001) in binary classifications. Moreover, CoHI achieved a higher sensitivity (82.4 % [59.0 %-93.8 %]) and specificity (76.9 % [49.7 %-91.8 %]) than individual dermatologists (sensitivity 77.8 % [75.3 %-80.2 %], specificity 73.0 % [70.6 %-75.4 %]) and CNN (sensitivity 70.6 % [46.9 %-86.7 %], specificity 69.2 % [42.4 %-87.3 %]). The diagnostic accuracy of CoHI was superior to that of individual dermatologists (P < 0.001) in multiclass evaluation, with the accuracy of the latter comparable to multiclass CNN. CONCLUSIONS Our analysis revealed that the majority vote of an interconnected group of dermatologists (CoHI) outperformed individuals and CNN in a demanding skin lesion classification task.
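The majority vote behind collective human intelligence (CoHI) can be sketched as a plurality count over the individual votes for each case. The votes, labels, and helper name `collective_vote` below are hypothetical, and the tie-breaking behaviour is an assumption rather than the study's procedure.

```python
from collections import Counter

def collective_vote(votes):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]

# Hypothetical votes from five readers on three cases, with ground truth.
cases = [
    (["melanoma", "melanoma", "nevus", "melanoma", "nevus"], "melanoma"),
    (["nevus", "nevus", "melanoma", "nevus", "nevus"], "nevus"),
    (["nevus", "nevus", "melanoma", "nevus", "melanoma"], "melanoma"),
]

collective_correct = sum(collective_vote(v) == truth for v, truth in cases)
individual_correct = sum(vote == truth for v, truth in cases for vote in v)

collective_accuracy = collective_correct / len(cases)
individual_accuracy = individual_correct / sum(len(v) for v, _ in cases)
print(collective_accuracy, individual_accuracy)
```

In this toy data the collective is right on two of three cases while the pooled individual accuracy is 0.6; the third case shows that a majority can still be wrong when most readers share the same error.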
Affiliation(s)
- Julia K Winkler
  - Department of Dermatology, University of Heidelberg, Heidelberg, Germany
- Katharina Sies
  - Department of Dermatology, University of Heidelberg, Heidelberg, Germany
- Christine Fink
  - Department of Dermatology, University of Heidelberg, Heidelberg, Germany
- Ferdinand Toberer
  - Department of Dermatology, University of Heidelberg, Heidelberg, Germany
- Alexander Enk
  - Department of Dermatology, University of Heidelberg, Heidelberg, Germany
- Tobias Fuchs
  - Department of Research and Development, FotoFinder Systems GmbH, Bad Birnbach, Germany
- Andreas Blum
  - Public, Private and Teaching Practice, Konstanz, Germany
- Wilhelm Stolz
  - Department of Dermatology, Allergology and Environmental Medicine II, Hospital Thalkirchner Street, Munich, Germany
- Brigitte Coras-Stepanek
  - Department of Dermatology, Allergology and Environmental Medicine II, Hospital Thalkirchner Street, Munich, Germany
- Robert Cipic
  - Department of Dermatology, Allergology and Environmental Medicine II, Hospital Thalkirchner Street, Munich, Germany
- Stefanie Guther
  - Department of Dermatology, Allergology and Environmental Medicine II, Hospital Thalkirchner Street, Munich, Germany
- Holger A Haenssle
  - Department of Dermatology, University of Heidelberg, Heidelberg, Germany
11. Ronzio L, Campagner A, Cabitza F, Gensini GF. Unity Is Intelligence: A Collective Intelligence Experiment on ECG Reading to Improve Diagnostic Performance in Cardiology. J Intell 2021; 9:jintelligence9020017. [PMID: 33915991 PMCID: PMC8167709 DOI: 10.3390/jintelligence9020017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2020] [Revised: 02/21/2021] [Accepted: 03/09/2021] [Indexed: 12/03/2022]
Abstract
Medical errors have a huge impact on clinical practice in terms of economic and human costs. As a result, technology-based solutions, such as those grounded in artificial intelligence (AI) or collective intelligence (CI), have attracted increasing interest as a means of reducing error rates and their impacts. Previous studies have shown that a combination of individual opinions based on rules, weighting mechanisms, or other CI solutions could improve diagnostic accuracy with respect to individual doctors. We conducted a study to investigate the potential of this approach in cardiology and, more precisely, in electrocardiogram (ECG) reading. To achieve this aim, we designed and conducted an experiment involving medical students, recent graduates, and residents, who were asked to annotate a collection of 10 ECGs of various complexity and difficulty. For each ECG, we considered groups of increasing size (from three to 30 members) and applied three different CI protocols. In all cases, the results showed a statistically significant improvement (ranging from 9% to 88%) in terms of diagnostic accuracy when compared to the performance of individual readers; this difference held for not only large groups, but also smaller ones. In light of these results, we conclude that CI approaches can support the tasks mentioned above, and possibly other similar ones as well. We discuss the implications of applying CI solutions to clinical settings, such as cases of augmented ‘second opinions’ and decision-making.
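Why accuracy can rise with group size is often explained with a simple binomial model of majority voting. The sketch below is not one of the three CI protocols used in the study; it assumes a binary decision, independent readers, and an identical per-reader accuracy `p`, all of which are simplifying assumptions, and the function name `majority_accuracy` is hypothetical.

```python
from math import comb

def majority_accuracy(p, n):
    """Probability that a strict majority of n independent readers,
    each correct with probability p, reaches the correct answer."""
    if n % 2 == 0:
        raise ValueError("use an odd group size to avoid ties")
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )

# Illustrative per-reader accuracy of 0.7 for groups of 1, 3 and 5.
for size in (1, 3, 5):
    print(size, round(majority_accuracy(0.7, size), 5))
```

Under these assumptions the gain shrinks as the group grows, and it disappears or reverses when readers are correlated or individually worse than chance, so the model should be read as intuition only.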
Affiliation(s)
- Luca Ronzio
  - Dipartimento di Informatica, Sistemistica e Comunicazione, University of Milano-Bicocca, Viale Sarca 336, 20126 Milan, Italy
- Andrea Campagner
  - Dipartimento di Informatica, Sistemistica e Comunicazione, University of Milano-Bicocca, Viale Sarca 336, 20126 Milan, Italy
- Federico Cabitza
  - Dipartimento di Informatica, Sistemistica e Comunicazione, University of Milano-Bicocca, Viale Sarca 336, 20126 Milan, Italy
12. Tschandl P, Rinner C, Apalla Z, Argenziano G, Codella N, Halpern A, Janda M, Lallas A, Longo C, Malvehy J, Paoli J, Puig S, Rosendahl C, Soyer HP, Zalaudek I, Kittler H. Human-computer collaboration for skin cancer recognition. Nat Med 2020; 26:1229-1234. [PMID: 32572267 DOI: 10.1038/s41591-020-0942-0] [Citation(s) in RCA: 265] [Impact Index Per Article: 66.3] [Received: 09/26/2019] [Accepted: 05/15/2020] [Indexed: 01/13/2023]
Abstract
The rapid increase in telemedicine coupled with recent advances in diagnostic artificial intelligence (AI) create the imperative to consider the opportunities and risks of inserting AI-based support into new paradigms of care. Here we build on recent achievements in the accuracy of image-based AI for skin cancer diagnosis to address the effects of varied representations of AI-based support across different levels of clinical expertise and multiple clinical workflows. We find that good quality AI-based support of clinical decision-making improves diagnostic accuracy over that of either AI or physicians alone, and that the least experienced clinicians gain the most from AI-based support. We further find that AI-based multiclass probabilities outperformed content-based image retrieval (CBIR) representations of AI in the mobile technology environment, and AI-based support had utility in simulations of second opinions and of telemedicine triage. In addition to demonstrating the potential benefits associated with good quality AI in the hands of non-expert clinicians, we find that faulty AI can mislead the entire spectrum of clinicians, including experts. Lastly, we show that insights derived from AI class-activation maps can inform improvements in human diagnosis. Together, our approach and findings offer a framework for future studies across the spectrum of image-based diagnostics to improve human-computer collaboration in clinical practice.
Affiliation(s)
- Philipp Tschandl
  - ViDIR Group, Department of Dermatology, Medical University of Vienna, Vienna, Austria
- Christoph Rinner
  - Center for Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Medical University of Vienna, Vienna, Austria
- Zoe Apalla
  - Department of Dermatology, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Noel Codella
  - IBM T. J. Watson Research Center, New York, NY, USA
- Allan Halpern
  - Dermatology Service, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Monika Janda
  - Centre for Health Services Research, Faculty of Medicine, The University of Queensland, Brisbane, Queensland, Australia
- Aimilios Lallas
  - Department of Dermatology, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Caterina Longo
  - Dermatology Unit, University of Modena and Reggio Emilia, Modena, Italy
  - Centro Oncologico ad Alta Tecnologia Diagnostica-Dermatologia, Azienda Unità Sanitaria Locale-IRCCS di Reggio Emilia, Reggio Emilia, Italy
- Josep Malvehy
  - Dermatology Department, Melanoma Unit, Hospital Clínic de Barcelona, IDIBAPS, Universitat de Barcelona, Barcelona, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBER ER), Instituto de Salud Carlos III, Barcelona, Spain
- John Paoli
  - Department of Dermatology and Venereology, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Dermatology and Venereology, Region Västra Götaland, Sahlgrenska University Hospital, Gothenburg, Sweden
- Susana Puig
  - Dermatology Department, Melanoma Unit, Hospital Clínic de Barcelona, IDIBAPS, Universitat de Barcelona, Barcelona, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBER ER), Instituto de Salud Carlos III, Barcelona, Spain
- Cliff Rosendahl
  - Faculty of Medicine, The University of Queensland, Brisbane, Queensland, Australia
- H Peter Soyer
  - Dermatology Research Centre, The University of Queensland Diamantina Institute, The University of Queensland, Brisbane, Queensland, Australia
- Iris Zalaudek
  - Department of Dermatology, Medical University of Trieste, Trieste, Italy
- Harald Kittler
  - ViDIR Group, Department of Dermatology, Medical University of Vienna, Vienna, Austria