1. Kuersten A. Prudently Evaluating Medical Adaptive Machine Learning Systems. The American Journal of Bioethics 2024; 24:76-79. [PMID: 39283387] [DOI: 10.1080/15265161.2024.2388759]
Affiliation(s)
- Andreas Kuersten
- American Law Division, Congressional Research Service, Library of Congress
2. Chen X, Chen H, Wan J, Li J, Wei F. An enhanced AlexNet-Based model for femoral bone tumor classification and diagnosis using magnetic resonance imaging. J Bone Oncol 2024; 48:100626. [PMID: 39290649] [PMCID: PMC11407034] [DOI: 10.1016/j.jbo.2024.100626]
Abstract
Objective Bone tumors, known for their infrequent occurrence and diverse imaging characteristics, require precise differentiation into benign and malignant categories. Existing diagnostic approaches depend heavily on the laborious and variable manual delineation of tumor regions. Deep learning methods, particularly convolutional neural networks (CNNs), have emerged as a promising solution to these issues. This paper introduces an enhanced deep-learning model based on AlexNet to classify femoral bone tumors accurately. Methods This study involved 500 femoral tumor patients from July 2020 to January 2023, with 500 imaging cases (335 benign and 165 malignant). A CNN was employed for automated classification. The model framework encompassed training and testing stages, with 8 layers (5 convolutional and 3 fully connected) and ReLU activation. The essential architectural modification was the addition of Batch Normalization (BN) after the first and second convolutional layers. Comparative experiments against various existing methods were conducted to assess algorithm performance in tumor staging. Evaluation metrics encompassed accuracy, precision, sensitivity, specificity, F-measure, ROC curves, and AUC values. Results The analysis of precision, sensitivity, specificity, and F1 score demonstrates that the proposed method offers several advantages, including a low feature dimension and robust generalization (accuracy 98.34%, sensitivity 97.26%, specificity 95.74%, and F1 score 96.37%). These findings underscore its strong overall detection capability. Notably, while the compared algorithms exhibit broadly similar classification performance, the algorithm presented in this paper stands out with a higher AUC value (AUC = 0.848), signifying enhanced sensitivity and more robust specificity.
Conclusion This study presents an optimized AlexNet model for classifying femoral bone tumor images based on convolutional neural networks. The algorithm demonstrates higher accuracy, precision, sensitivity, specificity, and F1 score than other methods, and the AUC value confirms its strong performance in terms of sensitivity and specificity. This research contributes to the field of medical image classification by offering an efficient automated classification solution, and it holds the potential to advance the application of artificial intelligence in bone tumor classification.
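The evaluation metrics named in this abstract (accuracy, precision, sensitivity, specificity, F1) can all be derived from a binary confusion matrix. A minimal sketch follows; the counts used in the example are illustrative only, not the study's data:

```python
# Hedged sketch: computing standard binary-classification metrics from
# confusion-matrix counts. The counts below are made up for illustration.

def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return standard binary-classification metrics from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # recall / true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }

# Example with hypothetical counts, treating malignant as the positive class:
m = binary_metrics(tp=160, fp=7, tn=328, fn=5)
```

Note that with an imbalanced benign/malignant split like the one reported (335 vs 165), sensitivity and specificity are more informative than raw accuracy.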
Affiliation(s)
- Xu Chen
- Department of Orthopedic Surgery, The Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen, Guangdong 518107, PR China
- Hongkun Chen
- Department of Orthopedic Surgery, The Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen, Guangdong 518107, PR China
- Junming Wan
- Department of Orthopedic Surgery, The Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen, Guangdong 518107, PR China
- Jianjun Li
- Department of Orthopedic Surgery, Shengjing Hospital of China Medical University, Shenyang, Liaoning 110004, PR China
- Fuxin Wei
- Department of Orthopedic Surgery, The Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen, Guangdong 518107, PR China
3. Sparrow R, Hatherley J, Oakley J, Bain C. Should the Use of Adaptive Machine Learning Systems in Medicine be Classified as Research? The American Journal of Bioethics 2024; 24:58-69. [PMID: 38662360] [DOI: 10.1080/15265161.2024.2337429]
Abstract
A novel advantage of the use of machine learning (ML) systems in medicine is their potential to continue learning from new data after implementation in clinical practice. To date, considerations of the ethical questions raised by the design and use of adaptive machine learning systems in medicine have, for the most part, been confined to discussion of the so-called "update problem," which concerns how regulators should approach systems whose performance and parameters continue to change even after they have received regulatory approval. In this paper, we draw attention to a prior ethical question: whether the continuous learning that will occur in such systems after their initial deployment should be classified, and regulated, as medical research. We argue that there is a strong prima facie case that the use of continuous learning in medical ML systems should be categorized, and regulated, as research, and that individuals whose treatment involves such systems should be treated as research subjects.
4. Binkley C, Meda R, de Lara J. Discerning the Nature of MAMLS: Research, Quality Improvement, or Both? The American Journal of Bioethics 2024; 24:98-100. [PMID: 39283383] [DOI: 10.1080/15265161.2024.2388772]
Affiliation(s)
- Charles Binkley
- Hackensack Meridian Health
- Hackensack Meridian School of Medicine
- Markkula Center for Applied Ethics at Santa Clara University
- Rohan Meda
- Georgetown University School of Medicine
5. Ursin F, Müller R, Funer F, Liedtke W, Renz D, Wiertz S, Ranisch R. Non-empirical methods for ethics research on digital technologies in medicine, health care and public health: a systematic journal review. Medicine, Health Care and Philosophy 2024. [PMID: 39120780] [DOI: 10.1007/s11019-024-10222-x]
Abstract
Bioethics has developed approaches to address ethical issues in health care, similar to how technology ethics provides guidelines for ethical research on artificial intelligence, big data, and robotic applications. As these digital technologies are increasingly used in medicine, health care and public health, it is plausible that the approaches of technology ethics have influenced bioethical research. Just as the "empirical turn" in bioethics led to intense debates about appropriate moral theories, ethical frameworks and meta-ethics because of the increased use of empirical methodologies from the social sciences, the proliferation of health-related subtypes of technology ethics might have a comparable impact on current bioethical research. This systematic journal review analyses the reporting of ethical frameworks and non-empirical methods in argument-based research articles on digital technologies in medicine, health care and public health published in high-impact bioethics journals. We focus on articles reporting non-empirical research in original contributions. Our aim is to describe the methods currently used for the analysis of ethical issues arising from the application of digital technologies in medicine, health care and public health. We confine our analysis to non-empirical methods because empirical methods have been well researched elsewhere. Finally, we discuss our findings against the background of established methods for health technology assessment, the lack of a typology for non-empirical methods, and conceptual and methodological change in bioethics. Our descriptive results may serve as a starting point for reflecting on whether current ethical frameworks and non-empirical methods are adequate for researching the ethical issues arising from the application of digital technologies in medicine, health care and public health.
Affiliation(s)
- Frank Ursin
- Institute for Ethics, History and Philosophy of Medicine, Hannover Medical School, Carl-Neuberg-Strasse 1, 30625, Hannover, Germany
- Regina Müller
- Institute of Philosophy, University of Bremen, Enrique-Schmidt-Straße 7, 28359, Bremen, Germany
- Florian Funer
- Institute for Ethics and History of Medicine, Eberhard Karls University, Gartenstrasse 47, 72074, Tübingen, Germany
- Wenke Liedtke
- Faculty of Theology, University of Greifswald, Am Rubenowplatz 2-3, 17489, Greifswald, Germany
- David Renz
- Faculty of Protestant Theology, University of Bonn, Am Hofgarten 8, 53113, Bonn, Germany
- Svenja Wiertz
- Department of Medical Ethics and the History of Medicine, University of Freiburg, Stefan-Meier-Str. 26, 79104, Freiburg, Germany
- Robert Ranisch
- Junior Professorship for Medical Ethics with a Focus on Digitization, Faculty of Health Sciences Brandenburg, University of Potsdam, Am Mühlenberg 9, 14476, Potsdam-Golm, Germany
6. Hatherley J. Are clinicians ethically obligated to disclose their use of medical machine learning systems to patients? Journal of Medical Ethics 2024. [PMID: 39117396] [DOI: 10.1136/jme-2024-109905]
Abstract
It is commonly accepted that clinicians are ethically obligated to disclose their use of medical machine learning systems to patients, and that failure to do so would amount to a moral fault for which clinicians ought to be held accountable. Call this 'the disclosure thesis.' Four main arguments have been, or could be, given to support the disclosure thesis in the ethics literature: the risk-based argument, the rights-based argument, the materiality argument and the autonomy argument. In this article, I argue that each of these four arguments is unconvincing and, therefore, that the disclosure thesis ought to be rejected. I suggest that mandating disclosure may even risk harming patients by providing stakeholders with a way to avoid accountability for harm that results from improper applications or uses of these systems.
Affiliation(s)
- Joshua Hatherley
- Department of Philosophy and History of Ideas, Aarhus University, Aarhus, Denmark
7. Tikhomirov L, Semmler C, McCradden M, Searston R, Ghassemi M, Oakden-Rayner L. Medical artificial intelligence for clinicians: the lost cognitive perspective. Lancet Digit Health 2024; 6:e589-e594. [PMID: 39059890] [DOI: 10.1016/s2589-7500(24)00095-5]
Abstract
The development and commercialisation of medical decision systems based on artificial intelligence (AI) far outpaces our understanding of their value for clinicians. Although our analysis is applicable across many forms of medicine, we focus on characterising the diagnostic decisions of radiologists through the concept of ecologically bounded reasoning, review the differences between clinician decision making and medical AI model decision making, and reveal how these differences pose fundamental challenges for integrating AI into radiology. We argue that clinicians are contextually motivated, mentally resourceful decision makers, whereas AI models are contextually stripped, correlational decision makers, and we discuss misconceptions about clinician-AI interaction stemming from this misalignment of capabilities. We outline how future research on clinician-AI interaction could better address the cognitive considerations of decision making and be used to enhance the safety and usability of AI models in high-risk medical decision-making contexts.
Affiliation(s)
- Lana Tikhomirov
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia.
- Carolyn Semmler
- School of Psychology, University of Adelaide, Adelaide, SA, Australia
- Melissa McCradden
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia; School of Public Health, Hospital for Sick Children, University of Toronto, Toronto, ON, Canada
- Rachel Searston
- School of Psychology, University of Adelaide, Adelaide, SA, Australia
- Marzyeh Ghassemi
- Department of Electrical Engineering and Computer Science and Institute for Medical and Evaluative Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia
8. Rabindranath M, Naghibzadeh M, Zhao X, Holdsworth S, Brudno M, Sidhu A, Bhat M. Clinical Deployment of Machine Learning Tools in Transplant Medicine: What Does the Future Hold? Transplantation 2024; 108:1700-1708. [PMID: 39042768] [DOI: 10.1097/tp.0000000000004876]
Abstract
Medical applications of machine learning (ML) have shown promise in analyzing patient data to support clinical decision-making and provide patient-specific outcomes. In transplantation, ML applications include pretransplant patient prioritization, donor-recipient matching, and organ allocation, as well as posttransplant outcome prediction. Numerous studies have demonstrated the development and utility of ML models, which have the potential to augment transplant medicine. Despite increasing efforts to develop robust ML models for clinical use, very few of these tools are deployed in the healthcare setting. Here, we summarize the current applications of ML in transplantation and discuss a potential clinical deployment framework using examples from organ transplantation. We identified that creating an interdisciplinary team, curating a reliable dataset, addressing the barriers to implementation, and understanding current clinical evaluation models could help in deploying ML models into the transplant clinic setting.
Affiliation(s)
- Madhumitha Rabindranath
- Transplant AI Initiative, Ajmera Transplant Program, University Health Network, Toronto, ON, Canada
- Institute of Medical Science, University of Toronto, Toronto, ON, Canada
- Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada
- Maryam Naghibzadeh
- Transplant AI Initiative, Ajmera Transplant Program, University Health Network, Toronto, ON, Canada
- Xun Zhao
- Transplant AI Initiative, Ajmera Transplant Program, University Health Network, Toronto, ON, Canada
- Sandra Holdsworth
- Transplant AI Initiative, Ajmera Transplant Program, University Health Network, Toronto, ON, Canada
- Michael Brudno
- Transplant AI Initiative, Ajmera Transplant Program, University Health Network, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Aman Sidhu
- Transplant AI Initiative, Ajmera Transplant Program, University Health Network, Toronto, ON, Canada
- Department of Medicine, University of Toronto, Toronto, ON, Canada
- Mamatha Bhat
- Transplant AI Initiative, Ajmera Transplant Program, University Health Network, Toronto, ON, Canada
- Institute of Medical Science, University of Toronto, Toronto, ON, Canada
- Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada
- Department of Medicine, University of Toronto, Toronto, ON, Canada
9. Vandemeulebroucke T. The ethics of artificial intelligence systems in healthcare and medicine: from a local to a global perspective, and back. Pflugers Arch 2024. [PMID: 38969841] [DOI: 10.1007/s00424-024-02984-3]
Abstract
Artificial intelligence systems (ai-systems) (e.g. machine learning, generative artificial intelligence) in healthcare and medicine have been received with hopes of better care quality, more efficiency, lower care costs, etc. Simultaneously, these systems have been met with reservations regarding their impacts on stakeholders' privacy, on changing power dynamics, on systemic biases, etc. Fortunately, healthcare and medicine have been guided by a multitude of ethical principles, frameworks, and approaches, which also guide the use of ai-systems in healthcare and medicine in one form or another. Nevertheless, in this article, I argue that most of these approaches are inspired by a local isolationist view on ai-systems, here exemplified by the principlist approach. Despite positive contributions to laying out the ethical landscape of ai-systems in healthcare and medicine, such ethics approaches are too focused on a specific local healthcare and medical setting, be it a particular care relationship, a particular care organisation, or a particular society or region. By doing so, they lose sight of the global impacts ai-systems have, especially environmental impacts and related social impacts, such as increased health risks. To address this gap, this article presents a global approach to the ethics of ai-systems in healthcare and medicine consisting of five levels of ethical impact and analysis: individual-relational, organisational, societal, global, and historical. As such, this global approach incorporates the local isolationist view by integrating it into a wider landscape of ethical consideration, so as to ensure ai-systems meet the needs of everyone everywhere.
Affiliation(s)
- Tijs Vandemeulebroucke
- Bonn Sustainable AI Lab, Institut für Wissenschaft und Ethik, Universität Bonn-University of Bonn, Bonner Talweg 57, 53113, Bonn, Germany.
10. Bouhouita-Guermech S, Haidar H. Scoping Review Shows the Dynamics and Complexities Inherent to the Notion of "Responsibility" in Artificial Intelligence within the Healthcare Context. Asian Bioeth Rev 2024; 16:315-344. [PMID: 39022380] [PMCID: PMC11250714] [DOI: 10.1007/s41649-024-00292-7]
Abstract
The increasing integration of artificial intelligence (AI) in healthcare presents a host of ethical, legal, social, and political challenges involving various stakeholders. These challenges have prompted various studies proposing frameworks and guidelines to tackle them, emphasizing distinct phases of AI development, deployment, and oversight. As a result, the notion of responsible AI has become widespread, incorporating ethical principles such as transparency, fairness, responsibility, and privacy. This paper explores the existing literature on AI use in healthcare to examine how it addresses, defines, and discusses the concept of responsibility. We conducted a scoping review of literature related to AI responsibility in healthcare, searching databases and reference lists between January 2017 and January 2022 for terms related to "responsibility" and "AI in healthcare", and their derivatives. Following screening, 136 articles were included. Data were grouped into four thematic categories: (1) the variety of terminology used to describe and address responsibility; (2) principles and concepts associated with responsibility; (3) stakeholders' responsibilities in AI clinical development, use, and deployment; and (4) recommendations for addressing responsibility concerns. The results show the lack of a clear definition of AI responsibility in healthcare and highlight the importance of ensuring responsible development and implementation of AI in healthcare. Further research is necessary to clarify this notion and to contribute to developing frameworks regarding the types of responsibility (ethical/moral/professional, legal, and causal) of the various stakeholders involved in the AI lifecycle.
Affiliation(s)
- Hazar Haidar
- Ethics Programs, Department of Letters and Humanities, University of Quebec at Rimouski, Rimouski, Québec Canada
11. Ho A, Bavli I, Mahal R, McKeown MJ. Multi-Level Ethical Considerations of Artificial Intelligence Health Monitoring for People Living with Parkinson's Disease. AJOB Empir Bioeth 2024; 15:178-191. [PMID: 37889210] [DOI: 10.1080/23294515.2023.2274582]
Abstract
Artificial intelligence (AI) has garnered tremendous attention in health care, and many hope that AI can enhance our health system's ability to care for people with chronic and degenerative conditions, including Parkinson's Disease (PD). This paper reports the themes and lessons derived from a qualitative study with people living with PD, family caregivers, and health care providers regarding the ethical dimensions of using AI to monitor, assess, and predict PD symptoms and progression. Thematic analysis identified ethical concerns at four intersecting levels: personal, interpersonal, professional/institutional, and societal levels. Reflecting on potential benefits of predictive algorithms that can continuously collect and process longitudinal data, participants expressed a desire for more timely, ongoing, and accurate information that could enhance management of day-to-day fluctuations and facilitate clinical and personal care as their disease progresses. Nonetheless, they voiced concerns about intersecting ethical questions around evolving illness identities, familial and professional care relationships, privacy, and data ownership/governance. The multi-layer analysis provides a helpful way to understand the ethics of using AI in monitoring and managing PD and other chronic/degenerative conditions.
Affiliation(s)
- Anita Ho
- Centre for Applied Ethics, School of Population and Public Health, University of British Columbia, Vancouver, Canada
- Itai Bavli
- Centre for Applied Ethics, School of Population and Public Health, University of British Columbia, Vancouver, Canada
- Ravneet Mahal
- Pacific Parkinson's Research Centre, University of British Columbia, Vancouver, Canada
- Martin J McKeown
- Pacific Parkinson's Research Centre, University of British Columbia, Vancouver, Canada
12. Arawi T, El Bachour J, El Khansa T. The Fourth Industrial Revolution: Its Impact on Artificial Intelligence and Medicine in Developing Countries. Asian Bioeth Rev 2024; 16:513-526. [PMID: 39022373] [PMCID: PMC11250712] [DOI: 10.1007/s41649-024-00284-7]
Abstract
Artificial intelligence (AI) is the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings. Artificial intelligence can be both a blessing and a curse: a double-edged sword if not carefully wielded. While it holds massive potential benefits for humans, particularly in healthcare, by assisting in the treatment of diseases, surgeries, and record keeping, and by easing the lives of both patients and doctors, its misuse has the potential for harm through bias, unemployment, breaches of privacy, and lack of accountability, to name a few. In this article, we discuss the fourth industrial revolution through a focus on the core of this phenomenon, artificial intelligence. We outline what the fourth industrial revolution is, its basis in AI, and how it infiltrates human lives and society, akin to a transcendence. We focus on the potential dangers of AI and the ethical concerns it raises, particularly in developing countries in general and conflict zones in particular, and we offer potential solutions to these dangers. While we acknowledge the importance and potential of AI, we also call for cautious reservations before plunging straight into the exciting world of the future, one which we have long heard of only in science fiction movies.
Affiliation(s)
- Thalia Arawi
- Salim El Hoss Bioethics and Professionalism Program (SHBPP), Faculty of Medicine, American University of Beirut & Medical Center, Beirut, Lebanon
- Joseph El Bachour
- Salim El Hoss Bioethics and Professionalism Program (SHBPP), Faculty of Medicine, American University of Beirut & Medical Center, Beirut, Lebanon
- Tala El Khansa
- Salim El Hoss Bioethics and Professionalism Program (SHBPP), Faculty of Medicine, American University of Beirut & Medical Center, Beirut, Lebanon
13. Kasun M, Ryan K, Paik J, Lane-McKinley K, Dunn LB, Roberts LW, Kim JP. Academic machine learning researchers' ethical perspectives on algorithm development for health care: a qualitative study. J Am Med Inform Assoc 2024; 31:563-573. [PMID: 38069455] [PMCID: PMC10873830] [DOI: 10.1093/jamia/ocad238]
Abstract
OBJECTIVES We set out to describe academic machine learning (ML) researchers' ethical considerations regarding the development of ML tools intended for use in clinical care. MATERIALS AND METHODS We conducted in-depth, semistructured interviews with a sample of ML researchers in medicine (N = 10) as part of a larger study investigating stakeholders' ethical considerations in the translation of ML tools in medicine. We used a qualitative descriptive design, applying conventional qualitative content analysis in order to allow participant perspectives to emerge directly from the data. RESULTS Every participant viewed their algorithm development work as holding ethical significance. While participants shared positive attitudes toward continued ML innovation, they described concerns related to data sampling and labeling (eg, limitations to mitigating bias; ensuring the validity and integrity of data), and algorithm training and testing (eg, selecting quantitative targets; assessing reproducibility). Participants perceived a need to increase interdisciplinary training across stakeholders and to envision more coordinated and embedded approaches to addressing ethics issues. DISCUSSION AND CONCLUSION Participants described key areas where increased support for ethics may be needed; technical challenges affecting clinical acceptability; and standards related to scientific integrity, beneficence, and justice that may be higher in medicine compared to other industries engaged in ML innovation. Our results help shed light on the perspectives of ML researchers in medicine regarding the range of ethical issues they encounter or anticipate in their work, including areas where more attention may be needed to support the successful development and integration of medical ML tools.
Affiliation(s)
- Max Kasun
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States
- Katie Ryan
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States
- Jodi Paik
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States
- Kyle Lane-McKinley
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States
- Laura Bodin Dunn
- Department of Psychiatry, University of Arkansas for Medical Sciences, Little Rock, AR 72205, United States
- Laura Weiss Roberts
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States
- Jane Paik Kim
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States
14. Maier-Hein L, Reinke A, Godau P, Tizabi MD, Buettner F, Christodoulou E, Glocker B, Isensee F, Kleesiek J, Kozubek M, Reyes M, Riegler MA, Wiesenfarth M, Kavur AE, Sudre CH, Baumgartner M, Eisenmann M, Heckmann-Nötzel D, Rädsch T, Acion L, Antonelli M, Arbel T, Bakas S, Benis A, Blaschko MB, Cardoso MJ, Cheplygina V, Cimini BA, Collins GS, Farahani K, Ferrer L, Galdran A, van Ginneken B, Haase R, Hashimoto DA, Hoffman MM, Huisman M, Jannin P, Kahn CE, Kainmueller D, Kainz B, Karargyris A, Karthikesalingam A, Kofler F, Kopp-Schneider A, Kreshuk A, Kurc T, Landman BA, Litjens G, Madani A, Maier-Hein K, Martel AL, Mattson P, Meijering E, Menze B, Moons KGM, Müller H, Nichyporuk B, Nickel F, Petersen J, Rajpoot N, Rieke N, Saez-Rodriguez J, Sánchez CI, Shetty S, van Smeden M, Summers RM, Taha AA, Tiulpin A, Tsaftaris SA, Van Calster B, Varoquaux G, Jäger PF. Metrics reloaded: recommendations for image analysis validation. Nat Methods 2024; 21:195-212. [PMID: 38347141] [PMCID: PMC11182665] [DOI: 10.1038/s41592-023-02151-z]
Abstract
Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, the chosen performance metrics often do not reflect the domain interest and thus fail to adequately measure scientific progress, hindering the translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint: a structured representation of the given problem that captures all aspects relevant to metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases.
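The problem-fingerprint idea described in this abstract can be illustrated with a toy sketch. The mapping below is hypothetical and greatly simplified relative to the actual Metrics Reloaded decision trees; the names (`ProblemFingerprint`, `suggest_metrics`, the fingerprint fields) are our own for illustration, not the framework's API:

```python
# Illustrative sketch only: fingerprint-driven metric selection in the
# spirit of Metrics Reloaded. The real framework uses Delphi-derived
# decision trees; this toy mapping merely shows the idea that metric
# choice should follow structured problem properties, not habit.
from dataclasses import dataclass

@dataclass
class ProblemFingerprint:
    task: str                # "image_classification" | "semantic_segmentation" | ...
    class_imbalance: bool    # is the positive class rare?
    boundary_critical: bool  # do object boundaries matter clinically?

def suggest_metrics(fp: ProblemFingerprint) -> list[str]:
    if fp.task == "image_classification":
        # Plain accuracy is misleading under imbalance; prefer summary/ranking metrics.
        return ["balanced accuracy", "AUROC"] if fp.class_imbalance else ["accuracy", "F1"]
    if fp.task == "semantic_segmentation":
        # Overlap metrics ignore boundary quality; add a distance-based metric if needed.
        base = ["Dice similarity coefficient"]
        return (base + ["normalized surface distance"]) if fp.boundary_critical else base
    return ["task-specific metrics required"]

fp = ProblemFingerprint(task="image_classification", class_imbalance=True,
                        boundary_critical=False)
```

The design point carried over from the paper is that the fingerprint, not the metric, is the primary object: two problems with the same task type but different fingerprints can warrant different validation metrics.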
Affiliation(s)
- Lena Maier-Hein
- German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany.
- German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany.
- Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany.
- Medical Faculty, Heidelberg University, Heidelberg, Germany.
- National Center for Tumor Diseases (NCT), NCT Heidelberg, a partnership between DKFZ and University Medical Center Heidelberg, Heidelberg, Germany.
| | - Annika Reinke
- German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany.
- German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany.
- Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany.
| | - Patrick Godau
- German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
- Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
- National Center for Tumor Diseases (NCT), NCT Heidelberg, a partnership between DKFZ and University Medical Center Heidelberg, Heidelberg, Germany
| | - Minu D Tizabi
- German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
- National Center for Tumor Diseases (NCT), NCT Heidelberg, a partnership between DKFZ and University Medical Center Heidelberg, Heidelberg, Germany
| | - Florian Buettner
- German Cancer Consortium (DKTK), partner site Frankfurt/Mainz, a partnership between DKFZ and UCT Frankfurt-Marburg, Frankfurt am Main, Germany
- German Cancer Research Center (DKFZ) Heidelberg, Heidelberg, Germany
- Department of Medicine, Goethe University Frankfurt, Frankfurt am Main, Germany
- Department of Informatics, Goethe University Frankfurt, Frankfurt am Main, Germany
- Frankfurt Cancer Institute, Frankfurt am Main, Germany
| | - Evangelia Christodoulou
- German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
| | - Ben Glocker
- Department of Computing, Imperial College London, South Kensington Campus, London, UK
| | - Fabian Isensee
- German Cancer Research Center (DKFZ) Heidelberg, Division of Medical Image Computing, Heidelberg, Germany
- German Cancer Research Center (DKFZ) Heidelberg, HI Applied Computer Vision Lab, Heidelberg, Germany
| | - Jens Kleesiek
- Institute for AI in Medicine, University Medicine Essen, Essen, Germany
| | - Michal Kozubek
- Centre for Biomedical Image Analysis and Faculty of Informatics, Masaryk University, Brno, Czech Republic
| | - Mauricio Reyes
- ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
- Department of Radiation Oncology, University Hospital Bern, University of Bern, Bern, Switzerland
| | - Michael A Riegler
- Simula Metropolitan Center for Digital Engineering, Oslo, Norway
- Department of Computer Science, UiT The Arctic University of Norway, Tromsø, Norway
| | - Manuel Wiesenfarth
- German Cancer Research Center (DKFZ) Heidelberg, Division of Biostatistics, Heidelberg, Germany
| | - A Emre Kavur
- German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
- German Cancer Research Center (DKFZ) Heidelberg, Division of Medical Image Computing, Heidelberg, Germany
- German Cancer Research Center (DKFZ) Heidelberg, HI Applied Computer Vision Lab, Heidelberg, Germany
| | - Carole H Sudre
- MRC Unit for Lifelong Health and Ageing at UCL and Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK
- School of Biomedical Engineering and Imaging Science, King's College London, London, UK
| | - Michael Baumgartner
- German Cancer Research Center (DKFZ) Heidelberg, Division of Medical Image Computing, Heidelberg, Germany
| | - Matthias Eisenmann
- German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
| | - Doreen Heckmann-Nötzel
- German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
- National Center for Tumor Diseases (NCT), NCT Heidelberg, a partnership between DKFZ and University Medical Center Heidelberg, Heidelberg, Germany
| | - Tim Rädsch
- German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
- German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany
| | - Laura Acion
- Instituto de Cálculo, CONICET - Universidad de Buenos Aires, Buenos Aires, Argentina
| | - Michela Antonelli
- School of Biomedical Engineering and Imaging Science, King's College London, London, UK
- Centre for Medical Image Computing, University College London, London, UK
| | - Tal Arbel
- Centre for Intelligent Machines and MILA (Québec Artificial Intelligence Institute), McGill University, Montréal, Quebec, Canada
| | - Spyridon Bakas
- Division of Computational Pathology, Department of Pathology & Laboratory Medicine, Indiana University School of Medicine, IU Health Information and Translational Sciences Building, Indianapolis, IN, USA
- Center for Biomedical Image Computing and Analytics (CBICA), University of Pennsylvania, Philadelphia, PA, USA
| | - Arriel Benis
- Department of Digital Medical Technologies, Holon Institute of Technology, Holon, Israel
- European Federation for Medical Informatics, Le Mont-sur-Lausanne, Switzerland
| | - Matthew B Blaschko
- Center for Processing Speech and Images, Department of Electrical Engineering, KU Leuven, Leuven, Belgium
| | - M Jorge Cardoso
- School of Biomedical Engineering and Imaging Science, King's College London, London, UK
| | - Veronika Cheplygina
- Department of Computer Science, IT University of Copenhagen, Copenhagen, Denmark
| | - Beth A Cimini
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Gary S Collins
- Centre for Statistics in Medicine, University of Oxford, Nuffield Orthopaedic Centre, Oxford, UK
| | - Keyvan Farahani
- Center for Biomedical Informatics and Information Technology, National Cancer Institute, Bethesda, MD, USA
| | - Luciana Ferrer
- Instituto de Investigación en Ciencias de la Computación (ICC), CONICET-UBA, Ciudad Autónoma de Buenos Aires, Buenos Aires, Argentina
| | - Adrian Galdran
- BCN Medtech, Universitat Pompeu Fabra, Barcelona, Spain
- Australian Institute for Machine Learning AIML, University of Adelaide, Adelaide, South Australia, Australia
| | - Bram van Ginneken
- Fraunhofer MEVIS, Bremen, Germany
- Radboud Institute for Health Sciences, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Robert Haase
- Technische Universität (TU) Dresden, DFG Cluster of Excellence 'Physics of Life', Dresden, Germany
- Center for Systems Biology, Dresden, Germany
- Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Leipzig University, Leipzig, Germany
| | - Daniel A Hashimoto
- Department of Surgery, Perelman School of Medicine, Philadelphia, PA, USA
- General Robotics Automation Sensing and Perception Laboratory, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
| | - Michael M Hoffman
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Merel Huisman
- Department of Radiology and Nuclear Medicine, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Pierre Jannin
- Laboratoire Traitement du Signal et de l'Image - UMR_S 1099, Université de Rennes 1, Rennes, France
- INSERM, Paris, France
| | - Charles E Kahn
- Department of Radiology and Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, USA
| | - Dagmar Kainmueller
- Max-Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Biomedical Image Analysis and HI Helmholtz Imaging, Berlin, Germany
- Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
| | - Bernhard Kainz
- Department of Computing, Faculty of Engineering, Imperial College London, London, UK
- Department AIBE, Friedrich-Alexander-Universität (FAU), Erlangen-Nürnberg, Germany
| | - Annette Kopp-Schneider
- German Cancer Research Center (DKFZ) Heidelberg, Division of Biostatistics, Heidelberg, Germany
| | - Anna Kreshuk
- Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
| | - Tahsin Kurc
- Department of Biomedical Informatics, Stony Brook University, Health Science Center, Stony Brook, NY, USA
| | - Geert Litjens
- Department of Pathology, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Amin Madani
- Department of Surgery, University Health Network, Toronto, Ontario, Canada
| | - Klaus Maier-Hein
- German Cancer Research Center (DKFZ) Heidelberg, Division of Medical Image Computing, Heidelberg, Germany
- Pattern Analysis and Learning Group, Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, Germany
| | - Anne L Martel
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Physical Sciences, Sunnybrook Research Institute, Toronto, Ontario, Canada
| | - Peter Mattson
- Google, 1600 Amphitheatre Pkwy, Mountain View, CA, USA
| | - Erik Meijering
- School of Computer Science and Engineering, University of New South Wales, UNSW Sydney, Kensington, New South Wales, Australia
| | - Bjoern Menze
- Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland
| | - Karel G M Moons
- Julius Center for Health Sciences and Primary Care, UMC Utrecht, Utrecht University, Utrecht, the Netherlands
| | - Henning Müller
- Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO), Sierre, Switzerland
- Medical Faculty, University of Geneva, Geneva, Switzerland
| | - Brennan Nichyporuk
- MILA (Québec Artificial Intelligence Institute), Montréal, Quebec, Canada
| | - Felix Nickel
- Department of General, Visceral and Thoracic Surgery, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Jens Petersen
- German Cancer Research Center (DKFZ) Heidelberg, Division of Medical Image Computing, Heidelberg, Germany
| | - Nasir Rajpoot
- Tissue Image Analytics Laboratory, Department of Computer Science, University of Warwick, Coventry, UK
| | - Julio Saez-Rodriguez
- Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
- Faculty of Medicine, Heidelberg University Hospital, Heidelberg, Germany
| | - Clara I Sánchez
- Informatics Institute, Faculty of Science, University of Amsterdam, Amsterdam, the Netherlands
| | - Maarten van Smeden
- Julius Center for Health Sciences and Primary Care, UMC Utrecht, Utrecht University, Utrecht, the Netherlands
| | - Ronald M Summers
- National Institutes of Health Clinical Center, Bethesda, MD, USA
| | - Abdel A Taha
- Institute of Information Systems Engineering, TU Wien, Vienna, Austria
| | - Aleksei Tiulpin
- Research Unit of Health Sciences and Technology, Faculty of Medicine, University of Oulu, Oulu, Finland
- Neurocenter Oulu, Oulu University Hospital, Oulu, Finland
| | - Ben Van Calster
- Department of Development and Regeneration and EPI-centre, KU Leuven, Leuven, Belgium
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands
| | - Gaël Varoquaux
- Parietal project team, INRIA Saclay-Île de France, Palaiseau, France
| | - Paul F Jäger
- German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany.
- German Cancer Research Center (DKFZ) Heidelberg, Interactive Machine Learning Group, Heidelberg, Germany.
| |
|
15
|
Berners-Lee B. Reconciling healthism and techno-solutionism: An observational study of a digital mental health trial. SOCIOLOGY OF HEALTH & ILLNESS 2024; 46:39-58. [PMID: 37337395 DOI: 10.1111/1467-9566.13683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/31/2023] [Accepted: 05/12/2023] [Indexed: 06/21/2023]
Abstract
In a growing trend in digital psychiatry, algorithmic systems are used to determine correlations between data that is collected using wearable devices and self-reports of mood. They then offer recommendations for behaviour modification for improved mood. The present study consists of observations of the development of one of these systems. Descriptions of the trial emphasise the powerful role of the intrinsically motivated, responsible participant on one hand and the empowering machine learning (ML)-based technology on the other. This conceptualisation is shown to extend the neoliberal paradox of a freedom that, to be maintained, must be continually adjusted through discipline. Because of the paradoxical nature of this formulation, laboratory members disagree about the balance of agency between the objective machine learning system and the empowered participant. The guides who help participants interpret ML outputs and implement system recommendations are ascribed a replaceable role in formal accounts. Observations of this guidance practice make clear not only the important role played by guides but also how their work is relegated to the technological side of the broader formulation of the trial and further how this conceptualisation affects the way they conduct their work.
Affiliation(s)
- Ben Berners-Lee
- Department of Communication, UC San Diego, La Jolla, California, USA
| |
|
16
|
Herington J, McCradden MD, Creel K, Boellaard R, Jones EC, Jha AK, Rahmim A, Scott PJH, Sunderland JJ, Wahl RL, Zuehlsdorff S, Saboury B. Ethical Considerations for Artificial Intelligence in Medical Imaging: Data Collection, Development, and Evaluation. J Nucl Med 2023; 64:1848-1854. [PMID: 37827839 PMCID: PMC10690124 DOI: 10.2967/jnumed.123.266080] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2022] [Revised: 09/12/2023] [Indexed: 10/14/2023] Open
Abstract
The development of artificial intelligence (AI) within nuclear imaging involves several ethically fraught components at different stages of the machine learning pipeline, including during data collection, model training and validation, and clinical use. Drawing on the traditional principles of medical and research ethics, and highlighting the need to ensure health justice, the AI task force of the Society of Nuclear Medicine and Molecular Imaging has identified 4 major ethical risks: privacy of data subjects, data quality and model efficacy, fairness toward marginalized populations, and transparency of clinical performance. We provide preliminary recommendations to developers of AI-driven medical devices for mitigating the impact of these risks on patients and populations.
Affiliation(s)
- Jonathan Herington
- Department of Health Humanities and Bioethics and Department of Philosophy, University of Rochester, Rochester, New York
| | - Melissa D McCradden
- Department of Bioethics, Hospital for Sick Children, Toronto and Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
| | - Kathleen Creel
- Department of Philosophy and Religion and Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts
| | - Ronald Boellaard
- Department of Radiology and Nuclear Medicine, Cancer Centre Amsterdam, Amsterdam University Medical Centres, Amsterdam, The Netherlands
| | - Elizabeth C Jones
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Bethesda, Maryland
| | - Abhinav K Jha
- Department of Biomedical Engineering and Mallinckrodt Institute of Radiology, Washington University in St. Louis, St. Louis, Missouri
| | - Arman Rahmim
- Departments of Radiology and Physics, University of British Columbia, Vancouver, British Columbia, Canada
| | - Peter J H Scott
- Department of Radiology, University of Michigan Medical School, Ann Arbor, Michigan
| | - John J Sunderland
- Departments of Radiology and Physics, University of Iowa, Iowa City, Iowa
| | - Richard L Wahl
- Mallinckrodt Institute of Radiology, Washington University in St. Louis, St. Louis, Missouri; and
| | - Babak Saboury
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Bethesda, Maryland;
| |
|
17
|
Smith LA, Oakden-Rayner L, Bird A, Zeng M, To MS, Mukherjee S, Palmer LJ. Machine learning and deep learning predictive models for long-term prognosis in patients with chronic obstructive pulmonary disease: a systematic review and meta-analysis. Lancet Digit Health 2023; 5:e872-e881. [PMID: 38000872 DOI: 10.1016/s2589-7500(23)00177-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2023] [Revised: 06/26/2023] [Accepted: 08/29/2023] [Indexed: 11/26/2023]
Abstract
BACKGROUND Machine learning and deep learning models have been increasingly used to predict long-term disease progression in patients with chronic obstructive pulmonary disease (COPD). We aimed to summarise the performance of such prognostic models for COPD, compare their relative performances, and identify key research gaps. METHODS We conducted a systematic review and meta-analysis to compare the performance of machine learning and deep learning prognostic models and identify pathways for future research. We searched PubMed, Embase, the Cochrane Library, ProQuest, Scopus, and Web of Science from database inception to April 6, 2023, for studies in English using machine learning or deep learning to predict patient outcomes at least 6 months after initial clinical presentation in those with COPD. We included studies comprising human adults aged 18-90 years and allowed for any input modalities. We reported area under the receiver operator characteristic curve (AUC) with 95% CI for predictions of mortality, exacerbation, and decline in forced expiratory volume in 1 s (FEV1). We reported the degree of interstudy heterogeneity using Cochran's Q test (significant heterogeneity was defined as p≤0·10 or I2>50%). Reporting quality was assessed using the TRIPOD checklist and a risk-of-bias assessment was done using the PROBAST checklist. This study was registered with PROSPERO (CRD42022323052). FINDINGS We identified 3620 studies in the initial search. 18 studies were eligible, and, of these, 12 used conventional machine learning and six used deep learning models. Seven models analysed exacerbation risk, with only six reporting AUC and 95% CI on internal validation datasets (pooled AUC 0·77 [95% CI 0·69-0·85]) and there was significant heterogeneity (I2 97%, p<0·0001). 11 models analysed mortality risk, with only six reporting AUC and 95% CI on internal validation datasets (pooled AUC 0·77 [95% CI 0·74-0·80]) with significant degrees of heterogeneity (I2 60%, p=0·027). 
Two studies assessed decline in lung function and were unable to be pooled. Machine learning and deep learning models did not show significant improvement over pre-existing disease severity scores in predicting exacerbations (p=0·24). Three studies directly compared machine learning models against pre-existing severity scores for predicting mortality and pooled performance did not differ (p=0·57). Of the five studies that performed external validation, performance was worse than or equal to regression models. Incorrect handling of missing data, not reporting model uncertainty, and use of datasets that were too small relative to the number of predictive features included provided the largest risks of bias. INTERPRETATION There is limited evidence that conventional machine learning and deep learning prognostic models demonstrate superior performance to pre-existing disease severity scores. More rigorous adherence to reporting guidelines would reduce the risk of bias in future studies and aid study reproducibility. FUNDING None.
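The pooled-AUC and heterogeneity statistics the review reports can be reproduced in miniature with an inverse-variance pool and Cochran's Q. The per-study AUCs and standard errors below are illustrative, not the review's data, and a fixed-effect pool is shown for brevity; the review's pooling model may differ.

```python
import numpy as np

# Hypothetical per-study AUCs and standard errors (not the reviewed studies).
aucs = np.array([0.72, 0.75, 0.80, 0.77, 0.83, 0.74])
ses  = np.array([0.03, 0.04, 0.02, 0.05, 0.03, 0.04])

w = 1.0 / ses**2                       # inverse-variance weights
pooled = np.sum(w * aucs) / np.sum(w)  # fixed-effect pooled AUC
Q = np.sum(w * (aucs - pooled) ** 2)   # Cochran's Q statistic
df = len(aucs) - 1
I2 = max(0.0, (Q - df) / Q) * 100      # I^2: share of variability beyond chance

print(f"pooled AUC={pooled:.3f}, Q={Q:.2f}, I2={I2:.1f}%")
```

Heterogeneity is flagged when Q is improbably large for its degrees of freedom (the review used p≤0.10) or when I² exceeds 50%.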
Affiliation(s)
- Luke A Smith
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia; School of Public Health, University of Adelaide, Adelaide, SA, Australia.
| | - Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia; School of Public Health, University of Adelaide, Adelaide, SA, Australia
| | - Alix Bird
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia; School of Public Health, University of Adelaide, Adelaide, SA, Australia
| | - Minyan Zeng
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia; School of Public Health, University of Adelaide, Adelaide, SA, Australia
| | - Minh-Son To
- Health Data and Clinical Trials, Flinders University, Bedford Park, SA, Australia; South Australia Medical Imaging, Flinders Medical Centre, Bedford Park, SA, Australia
| | - Sutapa Mukherjee
- Department of Respiratory and Sleep Medicine, Southern Adelaide Local Health Network (SALHN), Bedford Park, SA, Australia; Adelaide Institute for Sleep Health/Flinders Health and Medical Research Institute, College of Medicine and Public Health, Flinders University, Bedford Park, SA, Australia
| | - Lyle J Palmer
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia; School of Public Health, University of Adelaide, Adelaide, SA, Australia
| |
|
18
|
McCradden MD, Joshi S, Anderson JA, London AJ. A normative framework for artificial intelligence as a sociotechnical system in healthcare. PATTERNS (NEW YORK, N.Y.) 2023; 4:100864. [PMID: 38035190 PMCID: PMC10682751 DOI: 10.1016/j.patter.2023.100864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/02/2023]
Abstract
Artificial intelligence (AI) tools are of great interest to healthcare organizations for their potential to improve patient care, yet their translation into clinical settings remains inconsistent. One of the reasons for this gap is that good technical performance does not inevitably result in patient benefit. We advocate for a conceptual shift wherein AI tools are seen as components of an intervention ensemble. The intervention ensemble describes the constellation of practices that, together, bring about benefit to patients or health systems. Shifting from a narrow focus on the tool itself toward the intervention ensemble prioritizes a "sociotechnical" vision for translation of AI that values all components of use that support beneficial patient outcomes. The intervention ensemble approach can be used for regulation, institutional oversight, and for AI adopters to responsibly and ethically appraise, evaluate, and use AI tools.
Affiliation(s)
- Melissa D. McCradden
- Department of Bioethics, The Hospital for Sick Children, Toronto, ON, Canada
- Genetics & Genome Biology Research Program, Peter Gilgan Center for Research & Learning, Toronto, ON, Canada
- Division of Clinical & Public Health, Dalla Lana School of Public Health, Toronto, ON, Canada
| | - Shalmali Joshi
- Department of Biomedical Informatics, Department of Computer Science (Affiliate), Data Science Institute, Columbia University, New York, NY, USA
| | - James A. Anderson
- Department of Bioethics, The Hospital for Sick Children, Toronto, ON, Canada
- Institute for Health Policy, Management, and Evaluation, University of Toronto, Toronto, ON, Canada
| | - Alex John London
- Department of Philosophy and Center for Ethics and Policy, Carnegie Mellon University, Pittsburgh, PA, USA
| |
|
19
|
Glocker B, Jones C, Roschewitz M, Winzeck S. Risk of Bias in Chest Radiography Deep Learning Foundation Models. Radiol Artif Intell 2023; 5:e230060. [PMID: 38074789 PMCID: PMC10698597 DOI: 10.1148/ryai.230060] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2023] [Revised: 08/07/2023] [Accepted: 08/24/2023] [Indexed: 03/15/2024]
Abstract
PURPOSE To analyze a recently published chest radiography foundation model for the presence of biases that could lead to subgroup performance disparities across biologic sex and race. MATERIALS AND METHODS This Health Insurance Portability and Accountability Act-compliant retrospective study used 127 118 chest radiographs from 42 884 patients (mean age, 63 years ± 17 [SD]; 23 623 male, 19 261 female) from the CheXpert dataset that were collected between October 2002 and July 2017. To determine the presence of bias in features generated by a chest radiography foundation model and baseline deep learning model, dimensionality reduction methods together with two-sample Kolmogorov-Smirnov tests were used to detect distribution shifts across sex and race. A comprehensive disease detection performance analysis was then performed to associate any biases in the features to specific disparities in classification performance across patient subgroups. RESULTS Ten of 12 pairwise comparisons across biologic sex and race showed statistically significant differences in the studied foundation model, compared with four significant tests in the baseline model. Significant differences were found between male and female (P < .001) and Asian and Black (P < .001) patients in the feature projections that primarily capture disease. Compared with average model performance across all subgroups, classification performance on the "no finding" label decreased between 6.8% and 7.8% for female patients, and performance in detecting "pleural effusion" decreased between 10.7% and 11.6% for Black patients. CONCLUSION The studied chest radiography foundation model demonstrated racial and sex-related bias, which led to disparate performance across patient subgroups; thus, this model may be unsafe for clinical applications.
Keywords: Conventional Radiography, Computer Application-Detection/Diagnosis, Chest Radiography, Bias, Foundation Models
Supplemental material is available for this article. Published under a CC BY 4.0 license. See also commentary by Czum and Parr in this issue.
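The bias check described in the methods pairs dimensionality reduction with two-sample Kolmogorov-Smirnov tests on the reduced features. A minimal sketch follows, with synthetic one-dimensional projections standing in for the CheXpert feature embeddings; the subgroup labels and shift size are illustrative only.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
# Synthetic 1D feature projections (e.g., a principal component) per subgroup;
# a location shift between the groups mimics subgroup-encoded information.
proj_group_a = rng.normal(loc=0.0, scale=1.0, size=500)
proj_group_b = rng.normal(loc=0.4, scale=1.0, size=500)

# Two-sample KS test: are the two feature distributions the same?
stat, p = ks_2samp(proj_group_a, proj_group_b)
print(f"KS statistic={stat:.3f}, p={p:.2e}")
if p < 0.001:
    print("Feature distributions differ across subgroups (potential encoded bias).")
```

A significant shift in the features does not by itself prove harm; as in the study, it should be followed by a per-subgroup classification-performance analysis.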
Affiliation(s)
- Ben Glocker
- From the Department of Computing, Imperial College London, South
Kensington Campus, London SW7 2AZ, United Kingdom
| | - Charles Jones
- From the Department of Computing, Imperial College London, South
Kensington Campus, London SW7 2AZ, United Kingdom
| | - Mélanie Roschewitz
- From the Department of Computing, Imperial College London, South
Kensington Campus, London SW7 2AZ, United Kingdom
| | - Stefan Winzeck
- From the Department of Computing, Imperial College London, South
Kensington Campus, London SW7 2AZ, United Kingdom
| |
|
20
|
Kim JP, Ryan K, Kasun M, Hogg J, Dunn LB, Roberts LW. Physicians' and Machine Learning Researchers' Perspectives on Ethical Issues in the Early Development of Clinical Machine Learning Tools: Qualitative Interview Study. JMIR AI 2023; 2:e47449. [PMID: 38875536 PMCID: PMC11041441 DOI: 10.2196/47449] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/21/2023] [Revised: 08/20/2023] [Accepted: 09/16/2023] [Indexed: 06/16/2024]
Abstract
BACKGROUND Innovative tools leveraging artificial intelligence (AI) and machine learning (ML) are rapidly being developed for medicine, with new applications emerging in prediction, diagnosis, and treatment across a range of illnesses, patient populations, and clinical procedures. One barrier for successful innovation is the scarcity of research in the current literature seeking and analyzing the views of AI or ML researchers and physicians to support ethical guidance. OBJECTIVE This study aims to describe, using a qualitative approach, the landscape of ethical issues that AI or ML researchers and physicians with professional exposure to AI or ML tools observe or anticipate in the development and use of AI and ML in medicine. METHODS Semistructured interviews were used to facilitate in-depth, open-ended discussion, and a purposeful sampling technique was used to identify and recruit participants. We conducted 21 semistructured interviews with a purposeful sample of AI and ML researchers (n=10) and physicians (n=11). We asked interviewees about their views regarding ethical considerations related to the adoption of AI and ML in medicine. Interviews were transcribed and deidentified by members of our research team. Data analysis was guided by the principles of qualitative content analysis. This approach, in which transcribed data is broken down into descriptive units that are named and sorted based on their content, allows for the inductive emergence of codes directly from the data set. RESULTS Notably, both researchers and physicians articulated concerns regarding how AI and ML innovations are shaped in their early development (ie, the problem formulation stage). Considerations encompassed the assessment of research priorities and motivations, clarity and centeredness of clinical needs, professional and demographic diversity of research teams, and interdisciplinary knowledge generation and collaboration. 
Phase-1 ethical issues identified by interviewees were notably interdisciplinary in nature and invited questions regarding how to align priorities and values across disciplines and ensure clinical value throughout the development and implementation of medical AI and ML. Relatedly, interviewees suggested interdisciplinary solutions to these issues, for example, more resources to support knowledge generation and collaboration between developers and physicians, engagement with a broader range of stakeholders, and efforts to increase diversity in research broadly and within individual teams. CONCLUSIONS These qualitative findings help elucidate several ethical challenges anticipated or encountered in AI and ML for health care. Our study is unique in that its use of open-ended questions allowed interviewees to explore their sentiments and perspectives without overreliance on implicit assumptions about what AI and ML currently are or are not. This analysis, however, does not include the perspectives of other relevant stakeholder groups, such as patients, ethicists, industry researchers or representatives, or other health care professionals beyond physicians. Additional qualitative and quantitative research is needed to reproduce and build on these findings.
Affiliation(s)
- Jane Paik Kim
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA, United States
| | - Katie Ryan
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA, United States
| | - Max Kasun
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA, United States
| | - Justin Hogg
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA, United States
| | - Laura B Dunn
- Department of Psychiatry, University of Arkansas for Medical Sciences, Little Rock, AR, United States
| | - Laura Weiss Roberts
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA, United States
| |
|
21
|
McCradden M, Hui K, Buchman DZ. Evidence, ethics and the promise of artificial intelligence in psychiatry. JOURNAL OF MEDICAL ETHICS 2023; 49:573-579. [PMID: 36581457 PMCID: PMC10423547 DOI: 10.1136/jme-2022-108447] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/29/2022] [Accepted: 11/29/2022] [Indexed: 05/20/2023]
Abstract
Researchers are studying how artificial intelligence (AI) can be used to better detect, prognosticate and subgroup diseases. The idea that AI might advance medicine's understanding of biological categories of psychiatric disorders, as well as provide better treatments, is appealing given the historical challenges with prediction, diagnosis and treatment in psychiatry. Given the power of AI to analyse vast amounts of information, some clinicians may feel obligated to align their clinical judgements with the outputs of the AI system. However, a potential epistemic privileging of AI in clinical judgements may lead to unintended consequences that could negatively affect patient treatment, well-being and rights. The implications are also relevant to precision medicine, digital twin technologies and predictive analytics generally. We propose that a commitment to epistemic humility can help promote judicious clinical decision-making at the interface of big data and AI in psychiatry.
Affiliation(s)
- Melissa McCradden
- Joint Centre for Bioethics, University of Toronto Dalla Lana School of Public Health, Toronto, Ontario, Canada
- Bioethics, The Hospital for Sick Children, Toronto, Ontario, Canada
- Genetics & Genome Biology, Peter Gilgan Centre for Research and Learning, Toronto, Ontario, Canada
| | - Katrina Hui
- Everyday Ethics Lab, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
| | - Daniel Z Buchman
- Joint Centre for Bioethics, University of Toronto Dalla Lana School of Public Health, Toronto, Ontario, Canada
- Everyday Ethics Lab, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
| |
|
22
|
Abstract
Translational bioethics expands the scope of research ethics to include multidisciplinary analyses of the societal implications of new translational science discoveries. Novel health privacy issues are raised by the collection, use, and disclosure of extensive and diverse big data for research on precision medicine. Similar privacy concerns surround the use of artificial intelligence to analyze vast troves of clinical records to improve patient outcomes. Embedding bioethics scholars with translational scientists can improve the technical analyses and timeliness of bioethical inquiries, but it complicates the task of producing independent and rigorous ethical assessments.
Affiliation(s)
- Mark A Rothstein
- Herbert F. Boehl Chair of Law and Medicine Emeritus at the University of Louisville
| |
|
23
|
Glocker B, Jones C, Bernhardt M, Winzeck S. Algorithmic encoding of protected characteristics in chest X-ray disease detection models. EBioMedicine 2023; 89:104467. [PMID: 36791660 PMCID: PMC10025760 DOI: 10.1016/j.ebiom.2023.104467] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2022] [Revised: 01/23/2023] [Accepted: 01/24/2023] [Indexed: 02/16/2023] Open
Abstract
BACKGROUND It has been rightfully emphasized that the use of AI for clinical decision making could amplify health disparities. An algorithm may encode protected characteristics, and then use this information for making predictions due to undesirable correlations in the (historical) training data. It remains unclear how we can establish whether such information is actually used. Besides the scarcity of data from underserved populations, very little is known about how dataset biases manifest in predictive models and how this may result in disparate performance. This article aims to shed some light on these issues by exploring methodology for subgroup analysis in image-based disease detection models. METHODS We utilize two publicly available chest X-ray datasets, CheXpert and MIMIC-CXR, to study performance disparities across race and biological sex in deep learning models. We explore test set resampling, transfer learning, multitask learning, and model inspection to assess the relationship between the encoding of protected characteristics and disease detection performance across subgroups. FINDINGS We confirm subgroup disparities in terms of shifted true and false positive rates which are partially removed after correcting for population and prevalence shifts in the test sets. We find that transfer learning alone is insufficient for establishing whether specific patient information is used for making predictions. The proposed combination of test-set resampling, multitask learning, and model inspection reveals valuable insights about the way protected characteristics are encoded in the feature representations of deep neural networks. INTERPRETATION Subgroup analysis is key for identifying performance disparities of AI models, but statistical differences across subgroups need to be taken into account when analyzing potential biases in disease detection. 
The proposed methodology provides a comprehensive framework for subgroup analysis enabling further research into the underlying causes of disparities. FUNDING European Research Council Horizon 2020, UK Research and Innovation.
Affiliation(s)
- Ben Glocker
- Department of Computing, Imperial College London, London, SW7 2AZ, UK.
| | - Charles Jones
- Department of Computing, Imperial College London, London, SW7 2AZ, UK
| | - Mélanie Bernhardt
- Department of Computing, Imperial College London, London, SW7 2AZ, UK
| | - Stefan Winzeck
- Department of Computing, Imperial College London, London, SW7 2AZ, UK
| |
|
24
|
Sendak M, Vidal D, Trujillo S, Singh K, Liu X, Balu S. Editorial: Surfacing best practices for AI software development and integration in healthcare. Front Digit Health 2023; 5:1150875. [PMID: 36895323 PMCID: PMC9989472 DOI: 10.3389/fdgth.2023.1150875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2023] [Accepted: 02/06/2023] [Indexed: 02/25/2023] Open
Affiliation(s)
- Mark Sendak
- Duke Institute for Health Innovation, Durham, NC, United States
| | | | | | - Karandeep Singh
- Division of Nephrology, Department of Internal Medicine, University of Michigan, Ann Arbor, MI, United States
| | - Xiaoxuan Liu
- Academic Unit of Ophthalmology, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, United Kingdom
| | - Suresh Balu
- Duke Institute for Health Innovation, Durham, NC, United States
| |
|
25
|
Ulloa M, Rothrock B, Ahmad FS, Jacobs M. Invisible clinical labor driving the successful integration of AI in healthcare. FRONTIERS IN COMPUTER SCIENCE 2022. [DOI: 10.3389/fcomp.2022.1045704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022] Open
Abstract
Artificial Intelligence and Machine Learning (AI/ML) tools are changing the landscape of healthcare decision-making. Vast amounts of data can lead to efficient triage and diagnosis of patients with the assistance of ML methodologies. However, more research has focused on the technological challenges of developing AI than on system integration. As a result, clinical teams' role in developing and deploying these tools has been overlooked. We look to three case studies from our research to describe the often invisible work that clinical teams do in driving the successful integration of clinical AI tools. Namely, clinical teams support data labeling, identifying algorithmic errors and accounting for workflow exceptions, translating algorithmic output to clinical next steps in care, and developing team awareness of how the tool is used once deployed. We call for detailed and extensive documentation strategies (of clinical labor, workflows, and team structures) to ensure this labor is valued and to promote sharing of sociotechnical implementation strategies.
|
26
|
Cox M, Panagides JC, Tabari A, Kalva S, Kalpathy-Cramer J, Daye D. Risk stratification with explainable machine learning for 30-day procedure-related mortality and 30-day unplanned readmission in patients with peripheral arterial disease. PLoS One 2022; 17:e0277507. [PMID: 36409699 PMCID: PMC9678279 DOI: 10.1371/journal.pone.0277507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2022] [Accepted: 10/28/2022] [Indexed: 11/22/2022] Open
Abstract
Predicting 30-day procedure-related mortality risk and 30-day unplanned readmission in patients undergoing lower extremity endovascular interventions for peripheral artery disease (PAD) may assist in improving patient outcomes. Risk prediction of 30-day mortality can help clinicians identify treatment plans to reduce the risk of death, and prediction of 30-day unplanned readmission may improve outcomes by identifying patients who may benefit from readmission prevention strategies. The goal of this study is to develop machine learning models to stratify risk of 30-day procedure-related mortality and 30-day unplanned readmission in patients undergoing lower extremity infra-inguinal endovascular interventions. We used a cohort of 14,444 cases from the American College of Surgeons National Surgical Quality Improvement Program database. For each outcome, we developed and evaluated multiple machine learning models, including Support Vector Machines, Multilayer Perceptrons, and Gradient Boosting Machines, and selected a random forest as the best-performing model for both outcomes. Our 30-day procedure-related mortality model achieved an AUC of 0.75 (95% CI: 0.71-0.79) and our 30-day unplanned readmission model achieved an AUC of 0.68 (95% CI: 0.67-0.71). Stratification of the test set by race (white and non-white), sex (male and female), and age (≥65 years and <65 years) and subsequent evaluation of demographic parity by AUC shows that both models perform equally well across race, sex, and age groups. We interpret the model globally and locally using Gini impurity and SHapley Additive exPlanations (SHAP). Using the top five predictors of mortality and readmission, we demonstrate differences in survival for subgroups stratified by these predictors, which underscores the utility of our model.
Affiliation(s)
- Meredith Cox
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States of America
| | - J. C. Panagides
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States of America
| | - Azadeh Tabari
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States of America
| | - Sanjeeva Kalva
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States of America
| | - Jayashree Kalpathy-Cramer
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States of America
| | - Dania Daye
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States of America
| |
|
27
|
Istasy P, Lee WS, Iansavichene A, Upshur R, Gyawali B, Burkell J, Sadikovic B, Lazo-Langner A, Chin-Yee B. The Impact of Artificial Intelligence on Health Equity in Oncology: Scoping Review. J Med Internet Res 2022; 24:e39748. [PMID: 36005841 PMCID: PMC9667381 DOI: 10.2196/39748] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2022] [Revised: 08/11/2022] [Accepted: 08/24/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND The field of oncology is at the forefront of advances in artificial intelligence (AI) in health care, providing an opportunity to examine the early integration of these technologies in clinical research and patient care. Hope that AI will revolutionize health care delivery and improve clinical outcomes has been accompanied by concerns about the impact of these technologies on health equity. OBJECTIVE We aimed to conduct a scoping review of the literature to address the question, "What are the current and potential impacts of AI technologies on health equity in oncology?" METHODS Following PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines for scoping reviews, we systematically searched MEDLINE and Embase electronic databases from January 2000 to August 2021 for records engaging with key concepts of AI, health equity, and oncology. We included all English-language articles that engaged with the 3 key concepts. Articles were analyzed qualitatively for themes pertaining to the influence of AI on health equity in oncology. RESULTS Of the 14,011 records, 133 (0.95%) identified from our review were included. We identified 3 general themes in the literature: the use of AI to reduce health care disparities (58/133, 43.6%), concerns surrounding AI technologies and bias (16/133, 12.1%), and the use of AI to examine biological and social determinants of health (55/133, 41.4%). A total of 3% (4/133) of articles focused on many of these themes. CONCLUSIONS Our scoping review revealed 3 main themes on the impact of AI on health equity in oncology, which relate to AI's ability to help address health disparities, its potential to mitigate or exacerbate bias, and its capability to help elucidate determinants of health. 
Gaps in the literature included a lack of discussion of ethical challenges with the application of AI technologies in low- and middle-income countries, lack of discussion of problems of bias in AI algorithms, and a lack of justification for the use of AI technologies over traditional statistical methods to address specific research questions in oncology. Our review highlights a need to address these gaps to ensure a more equitable integration of AI in cancer research and clinical practice. The limitations of our study include its exploratory nature, its focus on oncology as opposed to all health care sectors, and its analysis of solely English-language articles.
Affiliation(s)
- Paul Istasy
- Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
- Rotman Institute of Philosophy, Western University, London, ON, Canada
| | - Wen Shen Lee
- Department of Pathology & Laboratory Medicine, Schulich School of Medicine, Western University, London, ON, Canada
| | | | - Ross Upshur
- Division of Clinical Public Health, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Bridgepoint Collaboratory for Research and Innovation, Lunenfeld Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
| | - Bishal Gyawali
- Division of Cancer Care and Epidemiology, Department of Oncology, Queen's University, Kingston, ON, Canada
- Division of Cancer Care and Epidemiology, Department of Public Health Sciences, Queen's University, Kingston, ON, Canada
| | - Jacquelyn Burkell
- Faculty of Information and Media Studies, Western University, London, ON, Canada
| | - Bekim Sadikovic
- Department of Pathology & Laboratory Medicine, Schulich School of Medicine, Western University, London, ON, Canada
| | - Alejandro Lazo-Langner
- Division of Hematology, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
| | - Benjamin Chin-Yee
- Rotman Institute of Philosophy, Western University, London, ON, Canada
- Division of Hematology, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
- Division of Hematology, Department of Medicine, London Health Sciences Centre, London, ON, Canada
| |
|
28
|
You J, Zhang YR, Wang HF, Yang M, Feng JF, Yu JT, Cheng W. Development of a novel dementia risk prediction model in the general population: A large, longitudinal, population-based machine-learning study. EClinicalMedicine 2022; 53:101665. [PMID: 36187723 PMCID: PMC9519470 DOI: 10.1016/j.eclinm.2022.101665] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/14/2022] [Revised: 08/29/2022] [Accepted: 08/31/2022] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND The existing dementia risk models are limited to known risk factors and traditional statistical methods. We aimed to employ machine learning (ML) to develop a novel dementia prediction model by leveraging a rich-phenotypic variable space of 366 features covering multiple domains of health-related data. METHODS In this longitudinal population-based cohort of the UK Biobank (UKB), 425,159 non-demented participants were enrolled from 22 recruitment centres across the UK between March 1, 2006 and October 31, 2010. We implemented a data-driven strategy to identify predictors from 366 candidate variables covering a comprehensive range of genetic and environmental factors and developed the ML model to predict incident dementia and Alzheimer's Disease (AD) within five years, ten years, and over the full follow-up (median 11.9 [interquartile range 11.2-12.5] years). FINDINGS During a follow-up of 5,023,337 person-years, 5287 and 2416 participants developed dementia and AD, respectively. A novel UKB dementia risk prediction (UKB-DRP) model comprising ten predictors including age, ApoE ε4, pairs matching time, leg fat percentage, number of medications taken, reaction time, peak expiratory flow, mother's age at death, long-standing illness, and mean corpuscular volume was established. Our prediction model was internally evaluated based on five-fold cross-validation on discrimination and calibration, and it was further compared with existing prediction scales. The UKB-DRP model can achieve high discriminative accuracy in dementia (AUC 0.848 ± 0.007) and even better in AD (AUC 0.862 ± 0.015). The model was well-calibrated (Hosmer-Lemeshow goodness-of-fit p-value = 0.92), and the predictive power was solid in different incidence time groups.
More importantly, our model presented an apparent superiority over existing models like the Cardiovascular Risk Factors, Aging, and Incidence of Dementia Risk Score (AUC 0.705 ± 0.008), the Dementia Risk Score (AUC 0.752 ± 0.007), and the Australian National University Alzheimer's Disease Risk Index (AUC 0.584 ± 0.017). The model was internally validated in the general population of European ancestry and White ethnicity; thus, further validation with independent datasets is necessary to confirm these findings. INTERPRETATION Our ML-based UKB-DRP model incorporated ten easily accessible predictors with solid predictive power for incident dementia and AD within five years, ten years, and beyond, which can be used to identify individuals at high risk of dementia and AD in the general population. FUNDING This study was funded by grants from the Science and Technology Innovation 2030 Major Projects (2022ZD0211600), National Key R&D Program of China (2018YFC1312904, 2019YFA070950), National Natural Science Foundation of China (282071201, 81971032, 82071997), Shanghai Municipal Science and Technology Major Project (2018SHZDZX01), Research Start-up Fund of Huashan Hospital (2022QD002), Excellence 2025 Talent Cultivation Program at Fudan University (3030277001), Shanghai Rising-Star Program (21QA1408700), Medical Engineering Fund of Fudan University (yg2021-013), and the 111 Project (No. B18015).
Affiliation(s)
- Jia You
- Department of Neurology, Huashan Hospital, Institute of Science and Technology for Brain-Inspired Intelligence, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
| | - Ya-Ru Zhang
- Department of Neurology, Huashan Hospital, Institute of Science and Technology for Brain-Inspired Intelligence, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
| | - Hui-Fu Wang
- Department of Neurology, Huashan Hospital, Institute of Science and Technology for Brain-Inspired Intelligence, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
| | - Ming Yang
- Department of Neurology, Huashan Hospital, Institute of Science and Technology for Brain-Inspired Intelligence, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
| | - Jian-Feng Feng
- Department of Neurology, Huashan Hospital, Institute of Science and Technology for Brain-Inspired Intelligence, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, China
- Zhangjiang Fudan International Innovation Center, Shanghai, China
- Fudan ISTBI—ZJNU Algorithm Centre for Brain-inspired Intelligence, Zhejiang Normal University, Zhejiang, China
- Corresponding authors at: Room 2316, Guanghua Building, East Main Wing, Fudan University, No. 220 Handan Road, Shanghai, 200433, China.
| | - Jin-Tai Yu
- Department of Neurology, Huashan Hospital, Institute of Science and Technology for Brain-Inspired Intelligence, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Corresponding author at: Huashan Hospital, No. 12 Wulumuqi Zhong Road, Shanghai, 200040, China.
| | - Wei Cheng
- Department of Neurology, Huashan Hospital, Institute of Science and Technology for Brain-Inspired Intelligence, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Fudan ISTBI—ZJNU Algorithm Centre for Brain-inspired Intelligence, Zhejiang Normal University, Zhejiang, China
- Corresponding authors at: Room 2316, Guanghua Building, East Main Wing, Fudan University, No. 220 Handan Road, Shanghai, 200433, China.
| |
|
29
|
Data governance functions to support responsible data stewardship in pediatric radiology research studies using artificial intelligence. Pediatr Radiol 2022; 52:2111-2119. [PMID: 35790559 DOI: 10.1007/s00247-022-05427-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/19/2021] [Revised: 04/13/2022] [Accepted: 06/06/2022] [Indexed: 03/03/2023]
Abstract
The integration of human and machine intelligence promises to profoundly change the practice of medicine. The rapidly increasing adoption of artificial intelligence (AI) solutions highlights its potential to streamline physician work and optimize clinical decision-making, also in the field of pediatric radiology. Large imaging databases are necessary for training, validating and testing these algorithms. To better promote data accessibility in multi-institutional AI-enabled radiologic research, these databases centralize the large volumes of data required to effect accurate models and outcome predictions. However, such undertakings must consider the sensitivity of patient information and therefore utilize requisite data governance measures to safeguard data privacy and security, to recognize and mitigate the effects of bias and to promote ethical use. In this article we define data stewardship and data governance, review their key considerations and applicability to radiologic research in the pediatric context, and consider the associated best practices along with the ramifications of poorly executed data governance. We summarize several adaptable data governance frameworks and describe strategies for their implementation in the form of distributed and centralized approaches to data management.
|
30
|
Anderson JA, McCradden MD, Stephenson EA. Response to Open Peer Commentaries: On Social Harms, Big Tech, and Institutional Accountability. THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2022; 22:W6-W8. [PMID: 35593914 DOI: 10.1080/15265161.2022.2075977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Affiliation(s)
| | - Melissa D McCradden
- The Hospital for Sick Children
- Peter Gilgan Centre for Research and Learning
- Dalla Lana School of Public Health
| | | |
|
31
|
Kwong JCC, Erdman L, Khondker A, Skreta M, Goldenberg A, McCradden MD, Lorenzo AJ, Rickard M. The silent trial - the bridge between bench-to-bedside clinical AI applications. Front Digit Health 2022; 4:929508. [PMID: 36052317 PMCID: PMC9424628 DOI: 10.3389/fdgth.2022.929508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2022] [Accepted: 08/01/2022] [Indexed: 11/23/2022] Open
Abstract
As more artificial intelligence (AI) applications are integrated into healthcare, there is an urgent need for standardization and quality-control measures to ensure a safe and successful transition of these novel tools into clinical practice. We describe the role of the silent trial, which evaluates an AI model on prospective patients in real-time, while the end-users (i.e., clinicians) are blinded to predictions such that they do not influence clinical decision-making. We present our experience in evaluating a previously developed AI model to predict obstructive hydronephrosis in infants using the silent trial. Although the initial model performed poorly on the silent trial dataset (AUC dropped from 0.90 to 0.50), the model was refined by exploring issues related to dataset drift, bias, feasibility, and stakeholder attitudes. Specifically, we found a shift in distribution of age, laterality of obstructed kidneys, and change in imaging format. After correction of these issues, model performance improved and remained robust across two independent silent trial datasets (AUC 0.85–0.91). Furthermore, a gap in patient knowledge on how the AI model would be used to augment their care was identified. These concerns helped inform the patient-centered design for the user-interface of the final AI model. Overall, the silent trial serves as an essential bridge between initial model development and clinical trials assessment to evaluate the safety, reliability, and feasibility of the AI model in a minimal risk environment. Future clinical AI applications should make efforts to incorporate this important step prior to embarking on a full-scale clinical trial.
Affiliation(s)
- Jethro C. C. Kwong
- Division of Urology, Department of Surgery, University of Toronto, Toronto, ON, Canada
- Temerty Centre for AI Research and Education in Medicine, University of Toronto, Toronto, ON, Canada
| | - Lauren Erdman
- Temerty Centre for AI Research and Education in Medicine, University of Toronto, Toronto, ON, Canada
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Adree Khondker
- Division of Urology, Department of Surgery, The Hospital for Sick Children, Toronto, ON, Canada
| | - Marta Skreta
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
| | - Anna Goldenberg
- Temerty Centre for AI Research and Education in Medicine, University of Toronto, Toronto, ON, Canada
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Melissa D. McCradden
- Temerty Centre for AI Research and Education in Medicine, University of Toronto, Toronto, ON, Canada
- Department of Bioethics, The Hospital for Sick Children, Toronto, ON, Canada
- Division of Clinical and Public Health, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Genetics & Genome Biology, Peter Gilgan Centre for Research and Learning, Toronto, ON, Canada
| | - Armando J. Lorenzo
- Division of Urology, Department of Surgery, University of Toronto, Toronto, ON, Canada
- Division of Urology, Department of Surgery, The Hospital for Sick Children, Toronto, ON, Canada
| | - Mandy Rickard
- Division of Urology, Department of Surgery, The Hospital for Sick Children, Toronto, ON, Canada
- Correspondence: Mandy Rickard,
| |
|
32
|
Vayena E, Blasimme A. A Systemic Approach to the Oversight of Machine Learning Clinical Translation. THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2022; 22:23-25. [PMID: 35475963 DOI: 10.1080/15265161.2022.2055216] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
|
33
|
Ho CWL, Malpani R. Scaling up the Research Ethics Framework for Healthcare Machine Learning as Global Health Ethics and Governance. THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2022; 22:36-38. [PMID: 35475959 DOI: 10.1080/15265161.2022.2055209] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
|
34
|
Vandemeulebroucke T, Denier Y, Gastmans C. The Need for a Global Approach to the Ethical Evaluation of Healthcare Machine Learning. THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2022; 22:33-35. [PMID: 35475955 DOI: 10.1080/15265161.2022.2055207] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
|
35
|
Affiliation(s)
- Kadija Ferryman
- Johns Hopkins University Bloomberg School of Public Health and Johns Hopkins Berman Institute of Bioethics
| |
|
36
|
Martinez-Martin N, Cho MK. Bridging the AI Chasm: Can EBM Address Representation and Fairness in Clinical Machine Learning? THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2022; 22:30-32. [PMID: 35475967 PMCID: PMC9337773 DOI: 10.1080/15265161.2022.2055212] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/18/2023]
Affiliation(s)
| | - Mildred K. Cho
- Stanford Center for Biomedical Ethics
- Stanford University
| |
|
37
|
Blumenthal-Barby J, Lang B, Dorfman N, Kaplan H, Hooper WB, Kostick-Quenet K. Research on the Clinical Translation of Health Care Machine Learning: Ethicists Experiences on Lessons Learned. THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2022; 22:1-3. [PMID: 35475968 DOI: 10.1080/15265161.2022.2059199] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
|
38
|
Levi M, Bernstein M, Waeiss C. Broadening the Ethical Scope. THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2022; 22:26-28. [PMID: 35475958 DOI: 10.1080/15265161.2022.2055219] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
|
39
|
Spector-Bagdady K, Rahimzadeh V, Jaffe K, Moreno J. Promoting Ethical Deployment of Artificial Intelligence and Machine Learning in Healthcare. THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2022; 22:4-7. [PMID: 35499568 PMCID: PMC9805364 DOI: 10.1080/15265161.2022.2059206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]
|
40
|
Shaw J. Emerging Paradigms for Ethical Review of Research Using Artificial Intelligence. THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2022; 22:42-44. [PMID: 35475953 DOI: 10.1080/15265161.2022.2055206] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
|