1
Cabitza F, Natali C, Famiglini L, Campagner A, Caccavella V, Gallazzi E. Never tell me the odds: Investigating pro-hoc explanations in medical decision making. Artif Intell Med 2024;150:102819. [PMID: 38553159] [DOI: 10.1016/j.artmed.2024.102819]
Abstract
This paper examines a kind of explainable AI centered around what we term pro-hoc explanations: a form of support that offers alternative explanations (one for each possible outcome) instead of a single post-hoc explanation following specific advice. Specifically, our support mechanism utilizes explanations by examples, featuring analogous cases for each category in a binary setting. Pro-hoc explanations are an instance of what we call frictional AI, a general class of decision support aimed at achieving a useful compromise between increased decision effectiveness and the mitigation of cognitive risks, such as over-reliance, automation bias and deskilling. To illustrate an instance of frictional AI, we conducted an empirical user study investigating its impact on the radiological detection of vertebral fractures in X-rays. Our study engaged 16 orthopedists in a 'human-first, second-opinion' interaction protocol, in which clinicians first made initial assessments of the X-rays without AI assistance and then provided their final diagnosis after considering the pro-hoc explanations. Our findings indicate that physicians, particularly those with less experience, perceived pro-hoc XAI support as significantly beneficial, even though it did not notably enhance their diagnostic accuracy. However, their increased confidence in final diagnoses suggests a positive overall impact. Given the promisingly high effect size observed, our results advocate for further research into pro-hoc explanations specifically, and into the broader concept of frictional AI.
Affiliation(s)
- Federico Cabitza
- Università degli Studi di Milano-Bicocca, Milan, Italy; IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
- Chiara Natali
- Università degli Studi di Milano-Bicocca, Milan, Italy
- Enrico Gallazzi
- Istituto Ortopedico Gaetano Pini - ASST Pini-CTO, Milan, Italy
2
Pischedda G, Marinò L, Corsi K. Defensive medicine through the lens of the managerial perspective: a literature review. BMC Health Serv Res 2023;23:1104. [PMID: 37848915] [PMCID: PMC10580549] [DOI: 10.1186/s12913-023-10089-3]
Abstract
PURPOSE Several studies have been carried out on defensive medicine, but research from the managerial viewpoint is still scarce. The aim of the present study is therefore to conduct a literature review to better understand defensive medicine from a managerial perspective. DESIGN/METHODOLOGY/APPROACH A literature review was conducted of studies focusing on the organisational (meso) level of healthcare providers and on managerial practices. A final sample of 28 studies was processed. FINDINGS Defensive medicine has mainly been studied in the USA, and scholars have principally used quantitative surveys. High-risk specialities have been a critical field of investigation, and a large portion of the papers are published in journals covering the fields of medicine, health policy, education and law. The analysis showed that operations and the organisation of staffing were the most discussed managerial practices. No study considered planning and budgeting aspects. ORIGINALITY/VALUE The review confirmed that the managerial aspect of defensive medicine has not been fully addressed. Prompted by this gap, this study analyses the managerial background of the defensive medicine phenomenon and shows which managerial practices have been most analysed. This paper also contributes to developing the literature on defensive medicine from the managerial side. Areas for future research include qualitative studies investigating the behaviour of managers of healthcare companies, to offer a different perspective on defensive medicine and on organisations' decision-making. RESEARCH LIMITATIONS/IMPLICATIONS Some important publications might have been missed because only two databases were searched. A further limitation is the use of English as an inclusion criterion.
Affiliation(s)
- Gianfranco Pischedda
- Department of Economics and Business, University of Sassari, Via Muroni, 25, 07100, Sassari, Italy
- Ludovico Marinò
- Department of Economics and Business, University of Sassari, Via Muroni, 25, 07100, Sassari, Italy
- Katia Corsi
- Department of Economics and Business, University of Sassari, Via Muroni, 25, 07100, Sassari, Italy
3
Brereton TA, Malik MM, Lifson M, Greenwood JD, Peterson KJ, Overgaard SM. The Role of Artificial Intelligence Model Documentation in Translational Science: Scoping Review. Interact J Med Res 2023;12:e45903. [PMID: 37450330] [PMCID: PMC10382950] [DOI: 10.2196/45903]
Abstract
BACKGROUND Despite the touted potential of artificial intelligence (AI) and machine learning (ML) to revolutionize health care, clinical decision support tools, herein referred to as medical modeling software (MMS), have yet to realize the anticipated benefits. One proposed obstacle is the acknowledged gap in AI translation, which stems partly from the fragmentation of the processes and resources needed to support transparent MMS documentation. The absence of transparent reporting hinders the provision of evidence supporting the implementation of MMS in clinical practice, thereby serving as a substantial barrier to the translation of software from research settings to clinical practice. OBJECTIVE This study aimed to scope the current landscape of AI- and ML-based MMS documentation practices and to elucidate the function of documentation in facilitating the translation of ethical and explainable MMS into clinical workflows. METHODS A scoping review was conducted in accordance with PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines. PubMed was searched using Medical Subject Headings key concepts of AI, ML, ethical considerations, and explainability to identify publications detailing AI- and ML-based MMS documentation, supplemented by snowball sampling of selected reference lists. To capture implicit documentation practices not explicitly labeled as such, documentation was used as an inclusion criterion rather than a key concept. A 2-stage screening process (title and abstract screening and full-text review) was conducted by 1 author. A data extraction template was used to record publication-related information; barriers to developing ethical and explainable MMS; available standards, regulations, frameworks, or governance strategies related to documentation; and recommendations for documentation for papers that met the inclusion criteria.
RESULTS Of the 115 papers retrieved, 21 (18.3%) papers met the requirements for inclusion. Ethics and explainability were investigated in the context of AI- and ML-based MMS documentation and translation. Data detailing the current state and challenges and recommendations for future studies were synthesized. Notable themes defining the current state and challenges that required thorough review included bias, accountability, governance, and explainability. Recommendations identified in the literature to address present barriers call for a proactive evaluation of MMS, multidisciplinary collaboration, adherence to investigation and validation protocols, transparency and traceability requirements, and guiding standards and frameworks that enhance documentation efforts and support the translation of AI- and ML-based MMS. CONCLUSIONS Resolving barriers to translation is critical for MMS to deliver on expectations, including those barriers identified in this scoping review related to bias, accountability, governance, and explainability. Our findings suggest that transparent strategic documentation, aligning translational science and regulatory science, will support the translation of MMS by coordinating communication and reporting and reducing translational barriers, thereby furthering the adoption of MMS.
Affiliation(s)
- Tracey A Brereton
- Center for Digital Health, Mayo Clinic, Rochester, MN, United States
- Momin M Malik
- Center for Digital Health, Mayo Clinic, Rochester, MN, United States
- Mark Lifson
- Center for Digital Health, Mayo Clinic, Rochester, MN, United States
- Jason D Greenwood
- Department of Family Medicine, Mayo Clinic, Rochester, MN, United States
- Kevin J Peterson
- Center for Digital Health, Mayo Clinic, Rochester, MN, United States
4
Benzinger L, Ursin F, Balke WT, Kacprowski T, Salloch S. Should Artificial Intelligence be used to support clinical ethical decision-making? A systematic review of reasons. BMC Med Ethics 2023;24:48. [PMID: 37415172] [PMCID: PMC10327319] [DOI: 10.1186/s12910-023-00929-6]
Abstract
BACKGROUND Healthcare providers have to make ethically complex clinical decisions, which can be a source of stress. Researchers have recently introduced Artificial Intelligence (AI)-based applications to assist in clinical ethical decision-making. However, the use of such tools is controversial. This review aims to provide a comprehensive overview of the reasons given in the academic literature for and against their use. METHODS PubMed, Web of Science, Philpapers.org and Google Scholar were searched for all relevant publications. The resulting publications were screened by title and abstract according to defined inclusion and exclusion criteria, resulting in 44 papers whose full texts were analysed using the Kuckartz method of qualitative text analysis. RESULTS Artificial Intelligence might increase patient autonomy by improving the accuracy of predictions and allowing patients to receive their preferred treatment. It is thought to increase beneficence by providing reliable information, thereby supporting surrogate decision-making. Some authors fear that reducing ethical decision-making to statistical correlations may limit autonomy. Others argue that AI may not be able to replicate the process of ethical deliberation because it lacks human characteristics. Concerns have been raised about issues of justice, as AI may replicate existing biases in the decision-making process. CONCLUSIONS The prospective benefits of using AI in clinical ethical decision-making are manifold, but its development and use should be undertaken carefully to avoid ethical pitfalls. Several issues that are central to the discussion of Clinical Decision Support Systems, such as justice, explicability and human-machine interaction, have so far been neglected in the debate on AI for clinical ethics. TRIAL REGISTRATION This review is registered at Open Science Framework (https://osf.io/wvcs9).
Affiliation(s)
- Lasse Benzinger
- Institute for Ethics, History and Philosophy of Medicine, Hannover Medical School (MHH), Carl-Neuberg-Str. 1, 30625, Hannover, Germany
- Frank Ursin
- Institute for Ethics, History and Philosophy of Medicine, Hannover Medical School (MHH), Carl-Neuberg-Str. 1, 30625, Hannover, Germany
- Wolf-Tilo Balke
- Institute for Information Systems, TU Braunschweig, Braunschweig, Germany
- Tim Kacprowski
- Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
- Braunschweig Integrated Centre for Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Sabine Salloch
- Institute for Ethics, History and Philosophy of Medicine, Hannover Medical School (MHH), Carl-Neuberg-Str. 1, 30625, Hannover, Germany
5
Xue P, Si M, Qin D, Wei B, Seery S, Ye Z, Chen M, Wang S, Song C, Zhang B, Ding M, Zhang W, Bai A, Yan H, Dang L, Zhao Y, Rezhake R, Zhang S, Qiao Y, Qu Y, Jiang Y. Unassisted Clinicians Versus Deep Learning-Assisted Clinicians in Image-Based Cancer Diagnostics: Systematic Review With Meta-analysis. J Med Internet Res 2023;25:e43832. [PMID: 36862499] [PMCID: PMC10020907] [DOI: 10.2196/43832]
Abstract
BACKGROUND A number of publications have demonstrated that deep learning (DL) algorithms matched or outperformed clinicians in image-based cancer diagnostics, yet these algorithms are frequently considered opponents rather than partners. Although the clinician-in-the-loop DL approach has great potential, no study has systematically quantified the diagnostic accuracy of clinicians with and without the assistance of DL in image-based cancer identification. OBJECTIVE We systematically quantified the diagnostic accuracy of clinicians with and without the assistance of DL in image-based cancer identification. METHODS PubMed, Embase, IEEE Xplore, and the Cochrane Library were searched for studies published between January 1, 2012, and December 7, 2021. Any study design was permitted that compared unassisted clinicians and DL-assisted clinicians in cancer identification using medical imaging. Studies using medical waveform-data graphics and those investigating image segmentation rather than classification were excluded. Studies providing binary diagnostic accuracy data and contingency tables were included for further meta-analysis. Two subgroups, cancer type and imaging modality, were defined and analyzed. RESULTS In total, 9796 studies were identified, of which 48 were deemed eligible for systematic review. Twenty-five of these studies compared unassisted clinicians and DL-assisted clinicians and provided sufficient data for statistical synthesis. We found a pooled sensitivity of 83% (95% CI 80%-86%) for unassisted clinicians and 88% (95% CI 86%-90%) for DL-assisted clinicians. Pooled specificity was 86% (95% CI 83%-88%) for unassisted clinicians and 88% (95% CI 85%-90%) for DL-assisted clinicians. The pooled sensitivity and specificity values for DL-assisted clinicians were higher than for unassisted clinicians, at ratios of 1.07 (95% CI 1.05-1.09) and 1.03 (95% CI 1.02-1.05), respectively. Similar diagnostic performance by DL-assisted clinicians was observed across the predefined subgroups. CONCLUSIONS The diagnostic performance of DL-assisted clinicians appears better than that of unassisted clinicians in image-based cancer identification. However, caution should be exercised, because the evidence provided in the reviewed studies does not cover all the minutiae involved in real-world clinical practice. Combining qualitative insights from clinical practice with data-science approaches may improve DL-assisted practice, although further research is required. TRIAL REGISTRATION PROSPERO CRD42021281372; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=281372.
Affiliation(s)
- Peng Xue
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Mingyu Si
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Dongxu Qin
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Bingrui Wei
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Samuel Seery
- Faculty of Health and Medicine, Division of Health Research, Lancaster University, Lancaster, United Kingdom
- Zichen Ye
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Mingyang Chen
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Sumeng Wang
- Department of Cancer Epidemiology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Cheng Song
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Bo Zhang
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Ming Ding
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Wenling Zhang
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Anying Bai
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Huijiao Yan
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Le Dang
- Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yuqian Zhao
- Sichuan Cancer Hospital & Institute, Sichuan Cancer Center, School of Medicine, University of Electronic Science & Technology of China, Sichuan, China
- Remila Rezhake
- Affiliated Cancer Hospital, The 3rd Affiliated Teaching Hospital of Xinjiang Medical University, Xinjiang, China
- Shaokai Zhang
- Henan Cancer Hospital, Affiliated Cancer Hospital of Zhengzhou University, Henan, China
- Youlin Qiao
- Center for Global Health, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yimin Qu
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yu Jiang
- Department of Epidemiology and Biostatistics, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
6
Hatherley J, Sparrow R, Howard M. The Virtues of Interpretable Medical Artificial Intelligence. Camb Q Healthc Ethics 2022:1-10. [PMID: 36524245] [DOI: 10.1017/s0963180122000305]
Abstract
Artificial intelligence (AI) systems have demonstrated impressive performance across a variety of clinical tasks. However, notoriously, these systems are sometimes "black boxes." The initial response in the literature was a demand for "explainable AI." More recently, several authors have suggested that making AI more explainable or "interpretable" is likely to come at the cost of the accuracy of these systems, and that prioritizing interpretability in medical AI may constitute a "lethal prejudice." In this article, we defend the value of interpretability in the context of the use of AI in medicine. Clinicians may prefer interpretable systems over more accurate black boxes, which in turn is sufficient to give designers of AI reason to prefer more interpretable systems in order to ensure that AI is adopted and its benefits realized. Moreover, clinicians may be justified in this preference. Achieving the downstream benefits of AI depends critically on how the outputs of these systems are interpreted by physicians and patients. A preference for highly accurate black box AI systems over less accurate but more interpretable systems may itself constitute a form of lethal prejudice that diminishes the benefits of AI to patients, and perhaps even harms them.
Affiliation(s)
- Joshua Hatherley
- School of Philosophical, Historical, and International Studies, Monash University, Clayton, Victoria 3168, Australia
- Robert Sparrow
- School of Philosophical, Historical, and International Studies, Monash University, Clayton, Victoria 3168, Australia
- Mark Howard
- School of Philosophical, Historical, and International Studies, Monash University, Clayton, Victoria 3168, Australia
7
Abstract
The use of machine learning systems for decision-support in healthcare may exacerbate health inequalities. However, recent work suggests that algorithms trained on sufficiently diverse datasets could in principle combat health inequalities. One concern about these algorithms is that their performance for patients in traditionally disadvantaged groups exceeds their performance for patients in traditionally advantaged groups. This renders the algorithmic decisions unfair relative to the standard fairness metrics in machine learning. In this paper, we defend the permissible use of affirmative algorithms, that is, algorithms trained on diverse datasets that perform better for traditionally disadvantaged groups. Whilst such algorithmic decisions may be unfair, the fairness of algorithmic decisions is not the appropriate locus of moral evaluation. What matters is the fairness of final decisions, such as diagnoses, resulting from collaboration between clinicians and algorithms. We argue that affirmative algorithms can permissibly be deployed provided the resultant final decisions are fair.
Affiliation(s)
- Thomas Grote
- Ethics and Philosophy Lab; Cluster of Excellence: Machine Learning: New Perspectives for Science, University of Tübingen, Maria von Linden Str. 6, D-72076 Tübingen, Germany
- Geoff Keeling
- Institute for Human-Centered AI and McCoy Family Center for Ethics in Society, Stanford University, 450 Serra Mall, 94305 Stanford, CA USA