Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Chen IY, Pierson E, Rose S, Joshi S, Ferryman K, Ghassemi M. Ethical Machine Learning in Healthcare. Annu Rev Biomed Data Sci 2021;4:123-144. [PMID: 34396058 PMCID: PMC8362902 DOI: 10.1146/annurev-biodatasci-092820-114757] [Citation(s) in RCA: 94] [Impact Index Per Article: 31.3] [Indexed: 12/15/2022]

Cited by Other Article(s)

Wahid KA, Cardenas CE, Marquez B, Netherton TJ, Kann BH, Court LE, He R, Naser MA, Moreno AC, Fuller CD, Fuentes D. Evolving Horizons in Radiation Therapy Auto-Contouring: Distilling Insights, Embracing Data-Centric Frameworks, and Moving Beyond Geometric Quantification. Adv Radiat Oncol 2024;9:101521. [PMID: 38799110 PMCID: PMC11111585 DOI: 10.1016/j.adro.2024.101521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 02/26/2024] [Indexed: 05/29/2024] Open

Zhang L, Richter LR, Wang Y, Ostropolets A, Elhadad N, Blei DM, Hripcsak G. Causal fairness assessment of treatment allocation with electronic health records. J Biomed Inform 2024;155:104656. [PMID: 38782170 PMCID: PMC11180553 DOI: 10.1016/j.jbi.2024.104656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 12/31/2023] [Accepted: 05/14/2024] [Indexed: 05/25/2024]

Abstract

OBJECTIVE

Healthcare continues to grapple with the persistent issue of treatment disparities, sparking concerns regarding the equitable allocation of treatments in clinical practice. While various fairness metrics have emerged to assess fairness in decision-making processes, a growing focus has been on causality-based fairness concepts due to their capacity to mitigate confounding effects and reason about bias. However, the application of causal fairness notions in evaluating the fairness of clinical decision-making with electronic health record (EHR) data remains an understudied domain. This study aims to address the methodological gap in assessing causal fairness of treatment allocation with electronic health records data. In addition, we investigate the impact of social determinants of health on the assessment of causal fairness of treatment allocation.

METHODS

We propose a causal fairness algorithm to assess fairness in clinical decision-making. Our algorithm accounts for the heterogeneity of patient populations and identifies potential unfairness in treatment allocation by conditioning on patients who have the same likelihood to benefit from the treatment. We apply this framework to a patient cohort with coronary artery disease derived from an EHR database to evaluate the fairness of treatment decisions.
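
The idea of comparing treatment allocation only among patients with a similar likelihood of benefit can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it assumes each patient has already been assigned to a treatment-response stratum by some upstream model, and the function name, record layout, and toy counts are all hypothetical.

```python
from collections import defaultdict


def allocation_gap_by_stratum(records, group_a, group_b):
    """Return {stratum: treatment rate of group_a minus treatment rate of group_b}.

    Each record is a (stratum, group, treated) tuple. Strata are assumed to be
    precomputed (hypothetical input); this sketch does not estimate them.
    """
    counts = defaultdict(lambda: [0, 0])  # (stratum, group) -> [treated, total]
    for stratum, group, treated in records:
        cell = counts[(stratum, group)]
        cell[0] += int(treated)
        cell[1] += 1

    gaps = {}
    for stratum in sorted({s for s, _ in counts}):
        a = counts.get((stratum, group_a))
        b = counts.get((stratum, group_b))
        if a and b:  # skip strata where either group is absent
            gaps[stratum] = a[0] / a[1] - b[0] / b[1]
    return gaps


# Toy data (invented for illustration only, not taken from the study).
records = (
    [(1, "women", True)] * 30 + [(1, "women", False)] * 70
    + [(1, "men", True)] * 36 + [(1, "men", False)] * 64
    + [(2, "women", True)] * 50 + [(2, "women", False)] * 50
    + [(2, "men", True)] * 50 + [(2, "men", False)] * 50
)
gaps = allocation_gap_by_stratum(records, "women", "men")
```

A negative gap in a stratum means that group_a received the treatment less often than group_b among patients placed in the same stratum. A real analysis would also need confounder adjustment and uncertainty estimates, which this sketch omits.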

RESULTS

Our analysis reveals notable disparities in coronary artery bypass grafting (CABG) allocation among different patient groups. Women were found to be 4.4%-7.7% less likely to receive CABG than men in two out of four treatment response strata. Similarly, Black or African American patients were 5.4%-8.7% less likely to receive CABG than others in three out of four response strata. These results were similar when social determinants of health (insurance and area deprivation index) were dropped from the algorithm. These findings highlight the presence of disparities in treatment allocation among similar patients, suggesting potential unfairness in the clinical decision-making process.

CONCLUSION

This study introduces a novel approach for assessing the fairness of treatment allocation in healthcare. By incorporating responses to treatment into the fairness framework, our method explores the potential of quantifying fairness from a causal perspective using EHR data. Our research advances the methodological development of fairness assessment in healthcare and highlights the importance of causality in determining treatment fairness.

Jain SS, Elias P, Poterucha T, Randazzo M, Lopez Jimenez F, Khera R, Perez M, Ouyang D, Pirruccello J, Salerno M, Einstein AJ, Avram R, Tison GH, Nadkarni G, Natarajan V, Pierson E, Beecy A, Kumaraiah D, Haggerty C, Avari Silva JN, Maddox TM. Artificial Intelligence in Cardiovascular Care-Part 2: Applications: JACC Review Topic of the Week. J Am Coll Cardiol 2024;83:2487-2496. [PMID: 38593945 DOI: 10.1016/j.jacc.2024.03.401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Accepted: 03/14/2024] [Indexed: 04/11/2024]

Affiliation(s)

Sneha S Jain Division of Cardiology, Stanford University School of Medicine, Palo Alto, California, USA
Pierre Elias Seymour, Paul and Gloria Milstein Division of Cardiology, Columbia University Irving Medical Center, New York, New York, USA; Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York, USA
Timothy Poterucha Seymour, Paul and Gloria Milstein Division of Cardiology, Columbia University Irving Medical Center, New York, New York, USA
Michael Randazzo Division of Cardiology, University of Chicago Medical Center, Chicago, Illinois, USA
Francisco Lopez Jimenez Department of Cardiology, Mayo Clinic College of Medicine, Rochester, Minnesota, USA
Rohan Khera Division of Cardiology, Yale School of Medicine, New Haven, Connecticut, USA
Marco Perez Division of Cardiology, Stanford University School of Medicine, Palo Alto, California, USA
David Ouyang Division of Cardiology, Cedars-Sinai Medical Center, Los Angeles, California, USA
James Pirruccello Division of Cardiology, University of California San Francisco, San Francisco, California, USA
Michael Salerno Division of Cardiology, Stanford University School of Medicine, Palo Alto, California, USA
Andrew J Einstein Seymour, Paul and Gloria Milstein Division of Cardiology, Columbia University Irving Medical Center, New York, New York, USA
Robert Avram Division of Cardiology, Montreal Heart Institute, Montreal, Quebec, Canada
Geoffrey H Tison Division of Cardiology, University of California San Francisco, San Francisco, California, USA
Girish Nadkarni Icahn School of Medicine at Mount Sinai, New York, New York, USA
Vivek Natarajan Google Health, Mountain View, California, USA
Emma Pierson Department of Computer Science, Cornell Tech, New York, New York, USA
Ashley Beecy NewYork-Presbyterian Health System, New York, New York, USA; Division of Cardiology, Weill Cornell Medical College, New York, New York, USA
Deepa Kumaraiah Seymour, Paul and Gloria Milstein Division of Cardiology, Columbia University Irving Medical Center, New York, New York, USA; NewYork-Presbyterian Health System, New York, New York, USA
Chris Haggerty Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York, USA; NewYork-Presbyterian Health System, New York, New York, USA
Jennifer N Avari Silva Division of Cardiology, Washington University School of Medicine, St Louis, Missouri, USA
Thomas M Maddox Division of Cardiology, Washington University School of Medicine, St Louis, Missouri, USA.

Yang H, Zhu D, He S, Xu Z, Liu Z, Zhang W, Cai J. Enhancing psychiatric rehabilitation outcomes through a multimodal multitask learning model based on BERT and TabNet: An approach for personalized treatment and improved decision-making. Psychiatry Res 2024;336:115896. [PMID: 38626625 DOI: 10.1016/j.psychres.2024.115896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 04/03/2024] [Accepted: 04/05/2024] [Indexed: 04/18/2024]

McMahon GT. The Risks and Challenges of Artificial Intelligence in Endocrinology. J Clin Endocrinol Metab 2024;109:e1468-e1471. [PMID: 38471009 DOI: 10.1210/clinem/dgae017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Indexed: 03/14/2024]

Teotia K, Jia Y, Link Woite N, Celi LA, Matos J, Struja T. Variation in monitoring: Glucose measurement in the ICU as a case study to preempt spurious correlations. J Biomed Inform 2024;153:104643. [PMID: 38621640 PMCID: PMC11103268 DOI: 10.1016/j.jbi.2024.104643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Revised: 03/29/2024] [Accepted: 04/12/2024] [Indexed: 04/17/2024]

Abstract

OBJECTIVE

Health inequities can be influenced by demographic factors such as race and ethnicity, proficiency in English, and biological sex. Disparities may manifest as differential likelihood of testing which correlates directly with the likelihood of an intervention to address an abnormal finding. Our retrospective observational study evaluated the presence of variation in glucose measurements in the Intensive Care Unit (ICU).

METHODS

Using the MIMIC-IV database (2008-2019), which covers a single-center academic referral hospital in Boston (USA), we identified adult patients meeting sepsis-3 criteria. Exclusion criteria were diabetic ketoacidosis, ICU length of stay under 1 day, and unknown race or ethnicity. We performed a logistic regression analysis to assess differential likelihoods of glucose measurements on day 1. A negative binomial regression was fitted to assess the frequency of subsequent glucose readings. Analyses were adjusted for relevant clinical confounders and performed across three disparity proxy axes: race and ethnicity, sex, and English proficiency.

RESULTS

We studied 24,927 patients, of whom 19.5% belonged to racial and ethnic minority groups, 42.4% were female, and 9.8% had limited English proficiency. No significant differences were found for glucose measurement on day 1 in the ICU. This pattern was consistent irrespective of the axis of analysis, i.e. race and ethnicity, sex, or English proficiency. Conversely, subsequent measurement frequency revealed potential disparities. Specifically, males (incidence rate ratio (IRR) 1.06, 95% confidence interval (CI) 1.01 - 1.21), patients who identified themselves as Hispanic (IRR 1.11, 95% CI 1.01 - 1.21) or Black (IRR 1.06, 95% CI 1.01 - 1.12), and English-proficient patients (IRR 1.08, 95% CI 1.01 - 1.15) had higher rates of subsequent glucose readings.
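
For readers unfamiliar with incidence rate ratios, the following Python sketch shows how a crude (unadjusted) IRR and a Wald-type 95% confidence interval on the log scale can be computed from event counts and observation time. It is only an illustration of the quantity being reported: the study's estimates came from an adjusted negative binomial regression, and the function name and the toy counts below are hypothetical.

```python
import math


def incidence_rate_ratio(events_a, time_a, events_b, time_b, z=1.96):
    """Crude IRR of group A versus group B with a Wald CI on the log scale.

    events_* are event counts (e.g. glucose readings); time_* are the matching
    observation times (e.g. patient-days). No confounder adjustment is made.
    """
    irr = (events_a / time_a) / (events_b / time_b)
    # Standard error of log(IRR) under a Poisson assumption for the counts.
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Toy counts (invented for illustration only, not taken from the study).
irr, lower, upper = incidence_rate_ratio(530, 100.0, 500, 100.0)
```

An IRR above 1 means group A accrued readings at a higher rate than group B over the same observation time. An interval whose lower bound exceeds 1 is what the abstract treats as a potential disparity.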

CONCLUSION

We found disparities in ICU glucose measurements among patients with sepsis, although the magnitude was small. Variation in disease monitoring is a source of data bias that may lead to spurious correlations when modeling health data.

Fernández-Alvarez J, Molinari G, Kilcullen R, Delgadillo J, Drill R, Errázuriz P, Falkenstrom F, Firth N, O'Shea A, Paz C, Youn SJ, Castonguay LG. The Importance of Conducting Practice-oriented Research with Underserved Populations. Administration and Policy in Mental Health and Mental Health Services Research 2024;51:358-375. [PMID: 38157130 DOI: 10.1007/s10488-023-01337-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/15/2023] [Indexed: 01/03/2024]

Barea Mendoza JA, Valiente Fernandez M, Pardo Fernandez A, Gómez Álvarez J. Current perspectives on the use of artificial intelligence in critical patient safety. Med Intensiva 2024:S2173-5727(24)00080-8. [PMID: 38677902 DOI: 10.1016/j.medine.2024.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 03/11/2024] [Indexed: 04/29/2024]

Wang HE, Weiner JP, Saria S, Kharrazi H. Evaluating Algorithmic Bias in 30-Day Hospital Readmission Models: Retrospective Analysis. J Med Internet Res 2024;26:e47125. [PMID: 38422347 PMCID: PMC11066744 DOI: 10.2196/47125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2023] [Revised: 12/28/2023] [Accepted: 02/27/2024] [Indexed: 03/02/2024] Open

Abstract

BACKGROUND

The adoption of predictive algorithms in health care comes with the potential for algorithmic bias, which could exacerbate existing disparities. Fairness metrics have been proposed to measure algorithmic bias, but their application to real-world tasks is limited.

OBJECTIVE

This study aims to evaluate the algorithmic bias associated with the application of common 30-day hospital readmission models and assess the usefulness and interpretability of selected fairness metrics.

METHODS

We used 10.6 million adult inpatient discharges from Maryland and Florida from 2016 to 2019 in this retrospective study. Models predicting 30-day hospital readmissions were evaluated: LACE Index, modified HOSPITAL score, and modified Centers for Medicare & Medicaid Services (CMS) readmission measure, which were applied as-is (using existing coefficients) and retrained (recalibrated with 50% of the data). Predictive performances and bias measures were evaluated for all, between Black and White populations, and between low- and other-income groups. Bias measures included the parity of false negative rate (FNR), false positive rate (FPR), 0-1 loss, and generalized entropy index. Racial bias represented by FNR and FPR differences was stratified to explore shifts in algorithmic bias in different populations.
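
The parity of false negative and false positive rates mentioned above can be sketched in a few lines of Python. This is a generic illustration of the two metrics, not the study's code: the function names, the "A"/"B" group labels, and the toy predictions are hypothetical.

```python
def error_rates(y_true, y_pred):
    """Return (FNR, FPR) for binary labels and binary predictions."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif truth and not pred:
            fn += 1
        elif not truth and pred:
            fp += 1
        else:
            tn += 1
    # FNR = missed positives / all positives; FPR = false alarms / all negatives.
    fnr = fn / (fn + tp) if (fn + tp) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return fnr, fpr


def parity_gaps(y_true, y_pred, groups, group_a, group_b):
    """Return (FNR gap, FPR gap), each computed as group_a minus group_b."""
    def rates_for(name):
        idx = [i for i, g in enumerate(groups) if g == name]
        return error_rates([y_true[i] for i in idx], [y_pred[i] for i in idx])

    fnr_a, fpr_a = rates_for(group_a)
    fnr_b, fpr_b = rates_for(group_b)
    return fnr_a - fnr_b, fpr_a - fpr_b


# Toy data (invented for illustration only, not taken from the study).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["A"] * 4 + ["B"] * 4
fnr_gap, fpr_gap = parity_gaps(y_true, y_pred, groups, "A", "B")
```

In this toy case group A has the higher FNR (readmissions that were missed) and group B the higher FPR (patients flagged unnecessarily), so the two gaps have opposite signs.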

RESULTS

The retrained CMS model demonstrated the best predictive performance (area under the curve: 0.74 in Maryland and 0.68-0.70 in Florida), and the modified HOSPITAL score demonstrated the best calibration (Brier score: 0.16-0.19 in Maryland and 0.19-0.21 in Florida). Calibration was better in White (compared to Black) populations and other-income (compared to low-income) groups, and the area under the curve was higher or similar in the Black (compared to White) populations. The retrained CMS and modified HOSPITAL score had the lowest racial and income bias in Maryland. In Florida, both of these models overall had the lowest income bias and the modified HOSPITAL score showed the lowest racial bias. In both states, the White and higher-income populations showed a higher FNR, while the Black and low-income populations showed a higher FPR and a higher 0-1 loss. When stratified by hospital and population composition, these models demonstrated heterogeneous algorithmic bias in different contexts and populations.
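
The Brier score reported above is the mean squared difference between predicted probabilities and observed binary outcomes, so lower values are better. A minimal Python sketch follows. The function name and the toy probabilities are hypothetical, and the study's own computation may differ in detail.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Toy predictions (invented for illustration only, not taken from the study).
score = brier_score([0.9, 0.2, 0.6, 0.1], [1, 0, 1, 0])
```

Computing the score separately for each subgroup, for example by race or income, gives the kind of group-wise comparison the abstract describes.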

CONCLUSIONS

Caution must be taken when interpreting fairness measures at face value. A higher FNR or FPR could potentially reflect missed opportunities or wasted resources, but these measures could also reflect health care use patterns and gaps in care. Simply relying on the statistical notions of bias could obscure or underplay the causes of health disparity. The imperfect health data, analytic frameworks, and the underlying health systems must be carefully considered. Fairness measures can serve as a useful routine assessment to detect disparate model performances but are insufficient to inform mechanisms or policy changes. However, such an assessment is an important first step toward data-driven improvement to address existing health disparities.

Collins GS, Moons KGM, Dhiman P, Riley RD, Beam AL, Van Calster B, Ghassemi M, Liu X, Reitsma JB, van Smeden M, Boulesteix AL, Camaradou JC, Celi LA, Denaxas S, Denniston AK, Glocker B, Golub RM, Harvey H, Heinze G, Hoffman MM, Kengne AP, Lam E, Lee N, Loder EW, Maier-Hein L, Mateen BA, McCradden MD, Oakden-Rayner L, Ordish J, Parnell R, Rose S, Singh K, Wynants L, Logullo P. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 2024;385:e078378. [PMID: 38626948 PMCID: PMC11019967 DOI: 10.1136/bmj-2023-078378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/17/2024] [Indexed: 04/19/2024]

Affiliation(s)

Gary S Collins Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
Karel G M Moons Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
Paula Dhiman Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
Richard D Riley Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK
Andrew L Beam Department of Epidemiology, Harvard T H Chan School of Public Health, Boston, MA, USA
Ben Van Calster Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Biomedical Data Science, Leiden University Medical Centre, Leiden, Netherlands
Marzyeh Ghassemi Department of Electrical Engineering and Computer Science, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
Xiaoxuan Liu Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
Johannes B Reitsma Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
Maarten van Smeden Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
Anne-Laure Boulesteix Institute for Medical Information Processing, Biometry and Epidemiology, Faculty of Medicine, Ludwig-Maximilians-University of Munich and Munich Centre of Machine Learning, Germany
Jennifer Catherine Camaradou Patient representative, Health Data Research UK patient and public involvement and engagement group; Patient representative, University of East Anglia, Faculty of Health Sciences, Norwich Research Park, Norwich, UK
Leo Anthony Celi Beth Israel Deaconess Medical Center, Boston, MA, USA; Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Biostatistics, Harvard T H Chan School of Public Health, Boston, MA, USA
Spiros Denaxas Institute of Health Informatics, University College London, London, UK; British Heart Foundation Data Science Centre, London, UK
Alastair K Denniston National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK; Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
Ben Glocker Department of Computing, Imperial College London, London, UK
Robert M Golub Northwestern University Feinberg School of Medicine, Chicago, IL, USA
Hugh Harvey Hardian Health, Haywards Heath, UK
Georg Heinze Section for Clinical Biometrics, Centre for Medical Data Science, Medical University of Vienna, Vienna, Austria
Michael M Hoffman Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada; Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; Vector Institute for Artificial Intelligence, Toronto, ON, Canada
André Pascal Kengne Department of Medicine, University of Cape Town, Cape Town, South Africa
Emily Lam Patient representative, Health Data Research UK patient and public involvement and engagement group
Naomi Lee National Institute for Health and Care Excellence, London, UK
Elizabeth W Loder The BMJ, London, UK; Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
Lena Maier-Hein Department of Intelligent Medical Systems, German Cancer Research Centre, Heidelberg, Germany
Bilal A Mateen Institute of Health Informatics, University College London, London, UK; Wellcome Trust, London, UK; Alan Turing Institute, London, UK
Melissa D McCradden Department of Bioethics, Hospital for Sick Children, Toronto, ON, Canada; Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
Lauren Oakden-Rayner Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia
Johan Ordish Medicines and Healthcare products Regulatory Agency, London, UK
Richard Parnell Patient representative, Health Data Research UK patient and public involvement and engagement group
Sherri Rose Department of Health Policy and Center for Health Policy, Stanford University, Stanford, CA, USA
Karandeep Singh Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
Laure Wynants Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
Patricia Logullo Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK

Perets O, Stagno E, Yehuda EB, McNichol M, Anthony Celi L, Rappoport N, Dorotic M. Inherent Bias in Electronic Health Records: A Scoping Review of Sources of Bias. medRxiv 2024:2024.04.09.24305594. [PMID: 38680842 PMCID: PMC11046491 DOI: 10.1101/2024.04.09.24305594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2024]

Abstract

Objectives

Biases inherent in electronic health records (EHRs), and therefore in medical artificial intelligence (AI) models, may significantly exacerbate health inequities and challenge the adoption of ethical and responsible AI in healthcare. Biases arise from multiple sources, some of which are less well documented in the literature. Biases are encoded through how the data have been collected and labeled, through implicit and unconscious biases of clinicians, or through the tools used for data processing. These biases and their encoding in healthcare records undermine the reliability of such data and bias clinical judgments and medical outcomes. Moreover, when healthcare records are used to build data-driven solutions, the biases are further exacerbated, resulting in systems that perpetuate biases and induce healthcare disparities. This literature scoping review aims to categorize the main sources of biases inherent in EHRs.

Methods

We queried PubMed and Web of Science on January 19th, 2023, for peer-reviewed sources in English, published between 2016 and 2023, using the PRISMA approach to stepwise scoping of the literature. To select papers that empirically analyze bias in EHRs, 27 duplicates were removed from the initial yield of 430 papers, and 403 studies were screened for eligibility. After title and abstract screening, 196 articles were removed, and 96 articles were excluded after the full-text review, resulting in a final selection of 116 articles.

Results

Systematic categorizations of diverse sources of bias are scarce in the literature, while the effects of separate studies are often convoluted and methodologically contestable. Our categorization of published empirical evidence identified six main sources of bias: (a) bias arising from past clinical trials; (b) data-related biases arising from missing or incomplete information or poor labeling of data; human-related bias induced by (c) implicit clinician bias and (d) referral and admission bias; (e) diagnosis or risk disparities bias; and (f) biases in machinery and algorithms.

Conclusions

Machine learning and data-driven solutions can potentially transform healthcare delivery, but not without limitations. The core inputs in the systems (data and human factors) currently contain several sources of bias that are poorly documented and analyzed for remedies. The current evidence heavily focuses on data-related biases, while other sources are less often analyzed or anecdotal. However, these different sources of biases add to one another exponentially. Therefore, to understand the issues holistically we need to explore these diverse sources of bias. While racial biases in EHRs have often been documented, other sources of biases have been less frequently investigated and documented (e.g. gender-related biases, sexual orientation discrimination, socially induced biases, and implicit, often unconscious, human-related cognitive biases). Moreover, some existing studies lack causal evidence, illustrating the different prevalences of disease across groups, which does not per se prove causality. Our review shows that data-, human-, and machine-related biases are prevalent in healthcare and that they significantly impact healthcare outcomes and judgments and exacerbate disparities and differential treatment. Understanding how diverse biases affect AI systems and recommendations is critical. We suggest that researchers and medical personnel should develop safeguards and adopt data-driven solutions with a "bias-in-mind" approach. More empirical evidence is needed to tease out the effects of different sources of bias on health outcomes.

Nichol AA, Halley M, Federico C, Cho MK, Sankar PL. Moral Engagement and Disengagement in Health Care AI Development. AJOB Empir Bioeth 2024:1-10. [PMID: 38588388 DOI: 10.1080/23294515.2024.2336906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/10/2024]

Abstract

BACKGROUND

Machine learning (ML) is increasingly used in health care and can pose harms to patients, clinicians, health systems, and the public. In response, regulators have proposed an approach that would shift more responsibility to ML developers for mitigating potential harms. To be effective, this approach requires ML developers to recognize, accept, and act on responsibility for mitigating harms. However, little is known about how developers themselves view their obligations to mitigate harms.

METHODS

We conducted 40 semi-structured interviews with developers of ML predictive analytics applications for health care in the United States.

RESULTS

Participants varied widely in their perspectives on personal responsibility and included examples of both moral engagement and disengagement, albeit in a variety of forms. While most participants (70%) made a statement indicative of moral engagement, most of these statements reflected an awareness of moral issues, while only a subset included additional elements of engagement such as recognizing responsibility, alignment with personal values, addressing conflicts of interest, and opportunities for action. Further, we identified eight distinct categories of moral disengagement reflecting efforts to minimize potential harms or deflect personal responsibility for preventing or mitigating harms.

CONCLUSIONS

These findings suggest possible facilitators and barriers to the development of ethical ML that could act by encouraging moral engagement or discouraging moral disengagement. Regulatory approaches that depend on the ability of ML developers to recognize, accept, and act on responsibility for mitigating harms might have limited success without education and guidance for ML developers about the extent of their responsibilities and how to implement them.

Didier AJ, Nigro A, Noori Z, Omballi MA, Pappada SM, Hamouda DM. Application of machine learning for lung cancer survival prognostication-A systematic review and meta-analysis. Front Artif Intell 2024;7:1365777. [PMID: 38646415 PMCID: PMC11026647 DOI: 10.3389/frai.2024.1365777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Accepted: 03/18/2024] [Indexed: 04/23/2024] Open

Abstract

Introduction

Machine learning (ML) techniques have gained increasing attention in the field of healthcare, including predicting outcomes in patients with lung cancer. ML has the potential to enhance prognostication in lung cancer patients and improve clinical decision-making. In this systematic review and meta-analysis, we aimed to evaluate the performance of ML models compared to logistic regression (LR) models in predicting overall survival in patients with lung cancer.

Methods

We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) statement. A comprehensive search was conducted in Medline, Embase, and Cochrane databases using a predefined search query. Two independent reviewers screened abstracts, and conflicts were resolved by a third reviewer. Inclusion and exclusion criteria were applied to select eligible studies. Risk of bias assessment was performed using predefined criteria. Data extraction was conducted using the Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modeling Studies (CHARMS) checklist. A meta-analysis was performed to compare the discriminative ability of ML and LR models.

Results

The literature search resulted in 3,635 studies, and 12 studies with a total of 211,068 patients were included in the analysis. Six studies reported confidence intervals and were included in the meta-analysis. The performance of ML models varied across studies, with C-statistics ranging from 0.60 to 0.85. The pooled analysis showed that ML models had higher discriminative ability compared to LR models, with a weighted average C-statistic of 0.78 for ML models compared to 0.70 for LR models.
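
A weighted average C-statistic of the kind reported above can be illustrated with the following Python sketch. The abstract does not state the weighting scheme, so weighting each study by its sample size is an assumption made here for illustration. The function name and toy values are also hypothetical, and a formal meta-analysis would more typically use inverse-variance weights.

```python
def weighted_mean_c_statistic(c_statistics, weights):
    """Weighted average of per-study C-statistics.

    weights are assumed to be study sample sizes (an illustrative choice,
    not necessarily the scheme used in the review).
    """
    if len(c_statistics) != len(weights) or not weights:
        raise ValueError("inputs must be non-empty and of equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(c * w for c, w in zip(c_statistics, weights)) / total_weight


# Toy studies (invented for illustration only, not taken from the review).
pooled = weighted_mean_c_statistic([0.80, 0.70], [300, 100])
```

With these toy inputs the larger study pulls the pooled value toward its own C-statistic, which is why the result sits closer to 0.80 than to 0.70.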

Conclusion

Machine learning models show promise in predicting overall survival in patients with lung cancer, with superior discriminative ability compared to logistic regression models. However, further validation and standardization of ML models are needed before their widespread implementation in clinical practice. Future research should focus on addressing the limitations of the current literature, such as potential bias and heterogeneity among studies, to improve the accuracy and generalizability of ML models for predicting outcomes in patients with lung cancer. Further research and development of ML models in this field may lead to improved patient outcomes and personalized treatment strategies.

Mehandru N, Miao BY, Almaraz ER, Sushil M, Butte AJ, Alaa A. Evaluating large language models as agents in the clinic. NPJ Digit Med 2024;7:84. [PMID: 38570554 PMCID: PMC10991271 DOI: 10.1038/s41746-024-01083-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 03/22/2024] [Indexed: 04/05/2024] Open

Balagopalan A, Baldini I, Celi LA, Gichoya J, McCoy LG, Naumann T, Shalit U, van der Schaar M, Wagstaff KL. Machine learning for healthcare that matters: Reorienting from technical novelty to equitable impact. PLOS DIGITAL HEALTH 2024;3:e0000474. [PMID: 38620047 PMCID: PMC11018283 DOI: 10.1371/journal.pdig.0000474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/06/2023] [Accepted: 02/18/2024] [Indexed: 04/17/2024]

Wang R, Kuo PC, Chen LC, Seastedt KP, Gichoya JW, Celi LA. Drop the shortcuts: image augmentation improves fairness and decreases AI detection of race and other demographics from medical images. EBioMedicine 2024;102:105047. [PMID: 38471396 PMCID: PMC10945176 DOI: 10.1016/j.ebiom.2024.105047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2023] [Revised: 02/15/2024] [Accepted: 02/21/2024] [Indexed: 03/14/2024] Open

Abstract

BACKGROUND

It has been shown that AI models can learn race on medical images, leading to algorithmic bias. Our aim in this study was to enhance the fairness of medical image models by eliminating bias related to race, age, and sex. We hypothesise models may be learning demographics via shortcut learning and combat this using image augmentation.

METHODS

This study included 44,953 patients who identified as Asian, Black, or White (mean age, 60.68 years ±18.21; 23,499 women) for a total of 194,359 chest X-rays (CXRs) from the MIMIC-CXR database. The CheXpert dataset, comprising 45,095 patients (mean age, 63.10 years ±18.14; 20,437 women) and a total of 134,300 CXRs, was used for external validation. We also collected 1195 3D brain magnetic resonance imaging (MRI) scans from the ADNI database, which included 273 participants with an average age of 76.97 years ±14.22, of whom 142 were female. DL models were trained on either non-augmented or augmented images and assessed using disparity metrics. The features learned by the models were analysed using task transfer experiments and model visualisation techniques.

FINDINGS

In the detection of radiological findings, training a model using augmented CXR images was shown to reduce disparities in error rate among racial groups (-5.45%), age groups (-13.94%), and sex (-22.22%). For AD detection, the model trained with augmented MRI images showed 53.11% and 31.01% reductions in disparities in error rate among age and sex groups, respectively. Image augmentation led to a reduction in the model's ability to identify demographic attributes and resulted in the model trained for clinical purposes incorporating fewer demographic features.

INTERPRETATION

The model trained using the augmented images was less likely to be influenced by demographic information in detecting image labels. These results demonstrate that the proposed augmentation scheme could enhance the fairness of interpretations by DL models when dealing with data from patients with different demographic backgrounds.

FUNDING

National Science and Technology Council (Taiwan), National Institutes of Health.

Zhan K, Buhler KA, Chen IY, Fritzler MJ, Choi MY. Systemic lupus in the era of machine learning medicine. Lupus Sci Med 2024;11:e001140. [PMID: 38443092 PMCID: PMC11146397 DOI: 10.1136/lupus-2023-001140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2023] [Accepted: 01/26/2024] [Indexed: 03/07/2024]

Khan L, Shahreen M, Qazi A, Jamil Ahmed Shah S, Hussain S, Chang HT. Migraine headache (MH) classification using machine learning methods with data augmentation. Sci Rep 2024;14:5180. [PMID: 38431729 PMCID: PMC10908834 DOI: 10.1038/s41598-024-55874-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2023] [Accepted: 02/28/2024] [Indexed: 03/05/2024] Open

Abstract

Migraine headache, a prevalent and intricate neurovascular disease, presents significant challenges in its clinical identification. Existing techniques that use subjective pain intensity measures are insufficiently accurate to make a reliable diagnosis. Even though headaches are a common condition with poor diagnostic specificity, they have a significant negative influence on the brain, body, and general human function. In this era of deeply intertwined health and technology, machine learning (ML) has emerged as a crucial force in transforming every aspect of healthcare. Utilizing advanced facilities, ML has shown groundbreaking achievements in developing classification and automatic predictors. Deep learning models, in particular, have proven effective in solving complex problems spanning computer vision and data analytics. Consequently, the integration of ML in healthcare has become vital. Especially in developing countries, where limited medical resources and lack of awareness prevail, the need to forecast and categorize migraines using artificial intelligence (AI) becomes even more urgent. This study focuses on leveraging state-of-the-art ML algorithms, including support vector machine (SVM), K-nearest neighbors (KNN), random forest (RF), decision tree (DST), and deep neural networks (DNN), to predict and classify various types of migraines. The models were trained on a publicly available dataset, with and without data augmentation. The proposed models with data augmentation were trained to classify seven types of migraine. The results show that DNN, SVM, KNN, DST, and RF achieved accuracies of 99.66%, 94.60%, 97.10%, 88.20%, and 98.50%, respectively, with data augmentation, highlighting the transformative potential of AI in enhancing migraine diagnosis.

Chan SCC, Neves AL, Majeed A, Faisal A. Bridging the equity gap towards inclusive artificial intelligence in healthcare diagnostics. BMJ 2024;384:q490. [PMID: 38423556 DOI: 10.1136/bmj.q490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 03/02/2024]

Lock C, Tan NSM, Long IJ, Keong NC. Neuroimaging data repositories and AI-driven healthcare-Global aspirations vs. ethical considerations in machine learning models of neurological disease. Front Artif Intell 2024;6:1286266. [PMID: 38440234 PMCID: PMC10910099 DOI: 10.3389/frai.2023.1286266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2023] [Accepted: 12/27/2023] [Indexed: 03/06/2024] Open

Abstract

Neuroimaging data repositories are data-rich resources comprising brain imaging with clinical and biomarker data. The potential for such repositories to transform healthcare is tremendous, especially in their capacity to support machine learning (ML) and artificial intelligence (AI) tools. Current discussions about the generalizability of such tools in healthcare raise concerns about risk of bias: ML models underperform in women and ethnic and racial minorities. The use of ML may exacerbate existing healthcare disparities or cause post-deployment harms. Do neuroimaging data repositories, and their capacity to support ML/AI-driven clinical discoveries, have both the potential to accelerate innovative medicine and harden the gaps of social inequities in neuroscience-related healthcare? In this paper, we examined the ethical concerns of ML-driven modeling of global community neuroscience needs arising from the use of data amassed within neuroimaging data repositories. We explored this in two parts. Firstly, in a theoretical experiment, we argued for a South East Asian-based repository to redress global imbalances. Within this context, we then considered the ethical framework toward the inclusion vs. exclusion of the migrant worker population, a group subject to healthcare inequities. Secondly, we created a model simulating the impact of global variations in the presentation of anosmia risks in COVID-19 toward altering brain structural findings; we then performed a mini AI ethics experiment. In this experiment, we interrogated an actual pilot dataset (n = 17; 8 non-anosmic (47%) vs. 9 anosmic (53%)) using an ML clustering model. To create the COVID-19 simulation model, we bootstrapped to resample and amplify the dataset. This resulted in three hypothetical datasets: (i) matched (n = 68; 47% anosmic), (ii) predominant non-anosmic (n = 66; 73% disproportionate), and (iii) predominant anosmic (n = 66; 76% disproportionate). We found that the differing proportions of the same cohorts represented in each hypothetical dataset altered not only the relative importance of key features distinguishing between them but even the presence or absence of such features. The main objective of our mini experiment was to understand if ML/AI methodologies could be utilized toward modelling disproportionate datasets, in a manner we term "AI ethics." Further work is required to expand the approach proposed here into a reproducible strategy.

Li A, Mullin S, Elkin PL. Improving Prediction of Survival for Extremely Premature Infants Born at 23 to 29 Weeks Gestational Age in the Neonatal Intensive Care Unit: Development and Evaluation of Machine Learning Models. JMIR Med Inform 2024;12:e42271. [PMID: 38354033 PMCID: PMC10902770 DOI: 10.2196/42271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2022] [Revised: 02/02/2023] [Accepted: 12/28/2023] [Indexed: 03/02/2024] Open

Abstract

BACKGROUND

Infants born at extremely preterm gestational ages are typically admitted to the neonatal intensive care unit (NICU) after initial resuscitation. The subsequent hospital course can be highly variable, and despite counseling aided by available risk calculators, there are significant challenges with shared decision-making regarding life support and transition to end-of-life care. Improving predictive models can help providers and families navigate these unique challenges.

OBJECTIVE

Machine learning methods have previously demonstrated added predictive value for determining intensive care unit outcomes, and their use allows consideration of a greater number of factors that potentially influence newborn outcomes, such as maternal characteristics. Machine learning-based models were analyzed for their ability to predict the survival of extremely preterm neonates at initial admission.

METHODS

Maternal and newborn information was extracted from the health records of infants born between 23 and 29 weeks of gestation in the Medical Information Mart for Intensive Care III (MIMIC-III) critical care database. Applicable machine learning models predicting survival during the initial NICU admission were developed and compared. The same type of model was also examined using only features that would be available prepartum for the purpose of survival prediction prior to an anticipated preterm birth. Features most correlated with the predicted outcome were determined when possible for each model.

RESULTS

Of included patients, 37 of 459 (8.1%) expired. The resulting random forest model showed higher predictive performance than the frequently used Score for Neonatal Acute Physiology With Perinatal Extension II (SNAPPE-II) NICU model when considering extremely preterm infants of very low birth weight. Several other machine learning models were found to have good performance but did not show a statistically significant difference from previously available models in this study. Feature importance varied by model, and those of greater importance included gestational age; birth weight; initial oxygenation level; elements of the APGAR (appearance, pulse, grimace, activity, and respiration) score; and amount of blood pressure support. Important prepartum features also included maternal age, steroid administration, and the presence of pregnancy complications.

CONCLUSIONS

Machine learning methods have the potential to provide robust prediction of survival in the context of extremely preterm births and allow for consideration of additional factors such as maternal clinical and socioeconomic information. Evaluation of larger, more diverse data sets may provide additional clarity on comparative performance.

Giddings R, Joseph A, Callender T, Janes SM, van der Schaar M, Sheringham J, Navani N. Factors influencing clinician and patient interaction with machine learning-based risk prediction models: a systematic review. Lancet Digit Health 2024;6:e131-e144. [PMID: 38278615 DOI: 10.1016/s2589-7500(23)00241-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2023] [Revised: 10/20/2023] [Accepted: 11/14/2023] [Indexed: 01/28/2024]

Berman D, Hunter C, Hossain A, Yao J, Workman E, Guan S, Strickhart L, Beanlands R, Slater D, deKemp RA. Machine and deep learning models for accurate detection of ischemia and scar with myocardial blood flow positron emission tomography imaging. J Nucl Cardiol 2024;32:101797. [PMID: 38185409 DOI: 10.1016/j.nuclcard.2024.101797] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2024]

K M, Syed K. Arrhythmia classification for non-experts using infinite impulse response (IIR)-filter-based machine learning and deep learning models of the electrocardiogram. PeerJ Comput Sci 2024;10:e1774. [PMID: 38435599 PMCID: PMC10909216 DOI: 10.7717/peerj-cs.1774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2023] [Accepted: 12/04/2023] [Indexed: 03/05/2024]

Pierson E. Accuracy and Equity in Clinical Risk Prediction. N Engl J Med 2024;390:100-102. [PMID: 38198167 DOI: 10.1056/nejmp2311050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/11/2024]

Rahman A, Debnath T, Kundu D, Khan MSI, Aishi AA, Sazzad S, Sayduzzaman M, Band SS. Machine learning and deep learning-based approach in smart healthcare: Recent advances, applications, challenges and opportunities. AIMS Public Health 2024;11:58-109. [PMID: 38617415 PMCID: PMC11007421 DOI: 10.3934/publichealth.2024004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2023] [Accepted: 12/18/2023] [Indexed: 04/16/2024] Open

Schulte PJ, Goldberg JD, Oster RA, Ambrosius WT, Bonner LB, Cabral H, Carter RE, Chen Y, Desai M, Li D, Lindsell CJ, Pomann GM, Slade E, Tosteson TD, Yu F, Spratt H. Peer review of clinical and translational research manuscripts: Perspectives from statistical collaborators. J Clin Transl Sci 2024;8:e20. [PMID: 38384899 PMCID: PMC10879991 DOI: 10.1017/cts.2023.707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2023] [Revised: 11/29/2023] [Accepted: 12/19/2023] [Indexed: 02/23/2024] Open

Affiliation(s)

Phillip J. Schulte Division of Clinical Trials and Biostatistics, Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
Judith D. Goldberg Division of Biostatistics, Department of Population Health, New York University Grossman School of Medicine, New York, NY, USA
Robert A. Oster Department of Medicine, Division of Preventive Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
Walter T. Ambrosius Department of Biostatistics and Data Science, Wake Forest University School of Medicine, Winston-Salem, NC, USA
Lauren Balmert Bonner Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
Howard Cabral Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
Rickey E. Carter Department of Quantitative Health Sciences, Mayo Clinic, Jacksonville, FL, USA
Ye Chen Biostatistics, Epidemiology and Research Design (BERD), Tufts Clinical and Translational Science Institute (CTSI), Boston, MA, USA
Manisha Desai Quantitative Sciences Unit, Departments of Medicine, Biomedical Data Science, and Epidemiology and Population Health, Stanford University, Stanford, CA, USA
Dongmei Li Department of Clinical and Translational Research, Obstetrics and Gynecology and Public Health Sciences, University of Rochester Medical Center, Rochester, NY, USA
Christopher J. Lindsell Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
Gina-Maria Pomann Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
Emily Slade Department of Biostatistics, University of Kentucky, Lexington, KY, USA
Tor D. Tosteson Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, USA
Fang Yu Department of Biostatistics, University of Nebraska Medical Center, Omaha, NE, USA
Heidi Spratt Department of Biostatistics and Data Science, School of Public and Population Health, University of Texas Medical Branch, Galveston, TX, USA

Beam K, Sharma P, Levy P, Beam AL. Artificial intelligence in the neonatal intensive care unit: the time is now. J Perinatol 2024;44:131-135. [PMID: 37443271 DOI: 10.1038/s41372-023-01719-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/17/2023] [Revised: 06/24/2023] [Accepted: 07/03/2023] [Indexed: 07/15/2023]

Nieser KJ, Cochran AL. Quantifying and reducing inequity in average treatment effect estimation. BMC Med Res Methodol 2023;23:297. [PMID: 38102563 PMCID: PMC10722685 DOI: 10.1186/s12874-023-02104-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2023] [Accepted: 11/16/2023] [Indexed: 12/17/2023] Open

Herington J, McCradden MD, Creel K, Boellaard R, Jones EC, Jha AK, Rahmim A, Scott PJH, Sunderland JJ, Wahl RL, Zuehlsdorff S, Saboury B. Ethical Considerations for Artificial Intelligence in Medical Imaging: Data Collection, Development, and Evaluation. J Nucl Med 2023;64:1848-1854. [PMID: 37827839 PMCID: PMC10690124 DOI: 10.2967/jnumed.123.266080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2022] [Revised: 09/12/2023] [Indexed: 10/14/2023] Open

Allareddy V, Oubaidin M, Rampa S, Venugopalan SR, Elnagar MH, Yadav S, Lee MK. Call for algorithmic fairness to mitigate amplification of racial biases in artificial intelligence models used in orthodontics and craniofacial health. Orthod Craniofac Res 2023;26 Suppl 1:124-130. [PMID: 37846615 DOI: 10.1111/ocr.12721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/09/2023] [Indexed: 10/18/2023]

Mei Z, Zheng D, Ge M. Informative Artifacts in AI-Assisted Care. N Engl J Med 2023;389:10.1056/NEJMc2311525#sa2. [PMID: 38048205 DOI: 10.1056/nejmc2311525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/06/2023]

Trentham-Dietz A, Corley DA, Del Vecchio NJ, Greenlee RT, Haas JS, Hubbard RA, Hughes AE, Kim JJ, Kobrin S, Li CI, Meza R, Neslund-Dudas CM, Tiro JA. Data gaps and opportunities for modeling cancer health equity. J Natl Cancer Inst Monogr 2023;2023:246-254. [PMID: 37947335 PMCID: PMC11009506 DOI: 10.1093/jncimonographs/lgad025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2023] [Revised: 07/12/2023] [Accepted: 08/15/2023] [Indexed: 11/12/2023] Open

Arora A, Alderman JE, Palmer J, Ganapathi S, Laws E, McCradden MD, Oakden-Rayner L, Pfohl SR, Ghassemi M, McKay F, Treanor D, Rostamzadeh N, Mateen B, Gath J, Adebajo AO, Kuku S, Matin R, Heller K, Sapey E, Sebire NJ, Cole-Lewis H, Calvert M, Denniston A, Liu X. The value of standards for health datasets in artificial intelligence-based applications. Nat Med 2023;29:2929-2938. [PMID: 37884627 PMCID: PMC10667100 DOI: 10.1038/s41591-023-02608-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2023] [Accepted: 09/22/2023] [Indexed: 10/28/2023]

Affiliation(s)

Anmol Arora School of Clinical Medicine, University of Cambridge, Cambridge, UK
Joseph E Alderman Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
Joanne Palmer University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
Shaswath Ganapathi Sandwell and West Birmingham Hospitals NHS Trust, Birmingham, UK
Elinor Laws Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
Melissa D McCradden Department of Bioethics, The Hospital for Sick Children, Toronto, Ontario, Canada; Genetics and Genome Biology, Peter Gilgan Centre for Research and Learning, Toronto, Ontario, Canada; Dalla Lana School of Public Health, Toronto, Ontario, Canada
Lauren Oakden-Rayner The Australian Institute for Machine Learning, University of Adelaide, Adelaide, South Australia, Australia
Stephen R Pfohl Google Research, Mountain View, CA, USA
Marzyeh Ghassemi Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Vector Institute, Toronto, Ontario, Canada
Francis McKay The Ethox Centre and the Wellcome Centre for Ethics and Humanities, Nuffield Department of Population Health, University of Oxford, Oxford, UK
Darren Treanor Leeds Teaching Hospitals NHS Trust, Leeds, UK; University of Leeds, Leeds, UK; Department of Clinical Pathology and Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden; Center for Medical Image Science and Visualization, Linköping University, Linköping, Sweden
Negar Rostamzadeh Google Research, Montreal, Quebec, Canada
Bilal Mateen Institute for Health Informatics, University College London, London, UK; Wellcome Trust, London, UK
Jacqui Gath Patient and Public Involvement and Engagement (PPIE) Group, STANDING Together, Birmingham, UK
Adewole O Adebajo Patient and Public Involvement and Engagement (PPIE) Group, STANDING Together, Birmingham, UK
Stephanie Kuku Institute of Women's Health, UCL, London, UK
Rubeta Matin Oxford University Hospitals NHS Foundation Trust, Oxford, UK
Katherine Heller Google Research, Mountain View, CA, USA
Elizabeth Sapey Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK; PIONEER, HDR UK Hub in Acute Care, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
Neil J Sebire National Institute for Health and Care Research, Great Ormond Street Hospital Biomedical Research Centre, London, UK; Great Ormond Street Institute of Child Health, University Hospital London, London, UK
Heather Cole-Lewis Google Research, Mountain View, CA, USA
Melanie Calvert National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK; Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK; Centre for Patient Reported Outcomes Research, Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; National Institute for Health and Care Research Applied Research Collaboration West Midlands, University of Birmingham, Birmingham, UK; National Institute for Health and Care Research Birmingham-Oxford Blood and Transplant Research Unit in Precision Transplant and Cellular Therapeutics, University of Birmingham, Birmingham, UK; DEMAND Hub, University of Birmingham, Birmingham, UK; UK SPINE, University of Birmingham, Birmingham, UK
Alastair Denniston Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK; Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK; National Institute for Health and Care Research Biomedical Research Centre, Moorfields Eye Hospital/University College London, London, UK
Xiaoxuan Liu Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK

Ghassemi M. Presentation matters for AI-generated clinical advice. Nat Hum Behav 2023;7:1833-1835. [PMID: 37985904 DOI: 10.1038/s41562-023-01721-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2023]

Hubbard RA, Pujol TA, Alhajjar E, Edoh K, Martin ML. Sources of Disparities in Surveillance Mammography Performance and Risk-Guided Recommendations for Supplemental Breast Imaging: A Simulation Study. Cancer Epidemiol Biomarkers Prev 2023;32:1531-1541. [PMID: 37351916 PMCID: PMC10750297 DOI: 10.1158/1055-9965.epi-23-0330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2023] [Revised: 05/22/2023] [Accepted: 06/21/2023] [Indexed: 06/24/2023] Open

Abstract

BACKGROUND

Surveillance mammography is recommended for all women with a history of breast cancer. Risk-guided surveillance incorporating advanced imaging modalities based on individual risk of a second cancer could improve cancer detection. However, personalized surveillance may also amplify disparities.

METHODS

In simulated populations using inputs from the Breast Cancer Surveillance Consortium (BCSC), we investigated race- and ethnicity-based disparities. Disparities were decomposed into those due to primary breast cancer and treatment characteristics, social determinants of health (SDOH) and differential error in second cancer ascertainment by modeling populations with or without variation across race and ethnicity in the distribution of these characteristics. We estimated effects of disparities on mammography performance and supplemental imaging recommendations stratified by race and ethnicity.

RESULTS

In simulated cohorts based on 65,446 BCSC surveillance mammograms, when only cancer characteristics varied by race and ethnicity, mammograms for Black women had lower sensitivity compared with the overall population (64.1% vs. 71.1%). Differences between Black women and the overall population were larger when both cancer characteristics and SDOH varied by race and ethnicity (53.8% vs. 71.1%). Basing supplemental imaging recommendations on high predicted second cancer risk resulted in less frequent recommendations for Hispanic (6.7%) and Asian/Pacific Islander women (6.4%) compared with the overall population (10.0%).

CONCLUSIONS

Variation in cancer characteristics and SDOH led to disparities in surveillance mammography performance and recommendations for supplemental imaging.

IMPACT

Risk-guided surveillance imaging may exacerbate disparities. Decision-makers should consider implications for equity in cancer outcomes resulting from implementing risk-guided screening programs. See related In the Spotlight, p. 1479.

Ricci Lara MA, Rodríguez Kowalczuk MV, Lisa Eliceche M, Ferraresso MG, Luna DR, Benitez SE, Mazzuoccolo LD. A dataset of skin lesion images collected in Argentina for the evaluation of AI tools in this population. Sci Data 2023;10:712. [PMID: 37853053 PMCID: PMC10584927 DOI: 10.1038/s41597-023-02630-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2023] [Accepted: 10/11/2023] [Indexed: 10/20/2023] Open

Wahid KA, Cardenas CE, Marquez B, Netherton TJ, Kann BH, Court LE, He R, Naser MA, Moreno AC, Fuller CD, Fuentes D. Evolving Horizons in Radiotherapy Auto-Contouring: Distilling Insights, Embracing Data-Centric Frameworks, and Moving Beyond Geometric Quantification. ARXIV 2023:arXiv:2310.10867v1. [PMID: 37904737 PMCID: PMC10614971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Grants] [Subscribe] [Scholar Register] [Indexed: 11/01/2023]

Teotia K, Jia Y, Woite NL, Celi LA, Matos J, Struja T. Variation in monitoring: Glucose measurement in the ICU as a case study to preempt spurious correlations. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.10.12.23296568. [PMID: 37873163 PMCID: PMC10593024 DOI: 10.1101/2023.10.12.23296568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/25/2023]

Charpignon ML, Byers J, Cabral S, Celi LA, Fernandes C, Gallifant J, Lough ME, Mlombwa D, Moukheiber L, Ong BA, Panitchote A, William W, Wong AKI, Nazer L. Critical Bias in Critical Care Devices. Crit Care Clin 2023;39:795-813. [PMID: 37704341 DOI: 10.1016/j.ccc.2023.02.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]

Affiliation(s)

Marie-Laure Charpignon: Institute for Data, Systems, and Society (IDSS), E18-407A, 50 Ames Street, Cambridge, MA 02142, USA
Joseph Byers: Respiratory Therapy, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA 02215, USA
Stephanie Cabral: Department of Medicine, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA 02215, USA
Leo Anthony Celi: Laboratory for Computational Physiology, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA; Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Chrystinne Fernandes: Laboratory for Computational Physiology, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
Jack Gallifant: Imperial College London NHS Trust, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH, UK
Mary E Lough: Stanford Health Care, Stanford University, 300 Pasteur Drive, Stanford, CA 94305, USA
Donald Mlombwa: Zomba Central Hospital, 8th Avenue, Zomba, Malawi; Kamuzu College of Health Sciences, Blantyre, Malawi; St. Luke's College of Health Sciences, Chilema-Zomba, Malawi
Lama Moukheiber: Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Avenue, E25-330, Cambridge, MA 02139, USA
Bradley Ashley Ong: College of Medicine, University of the Philippines Manila, Calderon Hall, UP College of Medicine, 547 Pedro Gil Street, Ermita Manila, Philippines
Anupol Panitchote: Faculty of Medicine, Khon Kaen University, 123 Mittraparp Highway, Muang District, Khon Kaen 40002, Thailand
Wasswa William: Mbarara University of Science and Technology, P.O. Box 1410, Mbarara, Uganda
An-Kwok Ian Wong: Duke University Medical Center, 2424 Erwin Road, Suite 1102, Hock Plaza Box 2721, Durham, NC 27710, USA
Lama Nazer: King Hussein Cancer Center, Queen Rania Street 202, Amman, Jordan

Langlotz CP. The Future of AI and Informatics in Radiology: 10 Predictions. Radiology 2023;309:e231114. [PMID: 37874234 PMCID: PMC10623186 DOI: 10.1148/radiol.231114] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/29/2023] [Revised: 05/16/2023] [Accepted: 05/22/2023] [Indexed: 10/25/2023]

Herington J, McCradden MD, Creel K, Boellaard R, Jones EC, Jha AK, Rahmim A, Scott PJH, Sunderland JJ, Wahl RL, Zuehlsdorff S, Saboury B. Ethical Considerations for Artificial Intelligence in Medical Imaging: Deployment and Governance. J Nucl Med 2023;64:1509-1515. [PMID: 37620051 DOI: 10.2967/jnumed.123.266110] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/28/2022] [Revised: 07/11/2023] [Indexed: 08/26/2023] Open

Singh N, Lawrence K, Richardson S, Mann DM. Centering health equity in large language model deployment. PLOS DIGITAL HEALTH 2023;2:e0000367. [PMID: 37874780 PMCID: PMC10597518 DOI: 10.1371/journal.pdig.0000367] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 10/26/2023]

Hunter DJ, Holmes C. Where Medical Statistics Meets Artificial Intelligence. N Engl J Med 2023;389:1211-1219. [PMID: 37754286 DOI: 10.1056/nejmra2212850] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 09/28/2023]

McElfresh DC, Chen L, Oliva E, Joyce V, Rose S, Tamang S. A call for better validation of opioid overdose risk algorithms. J Am Med Inform Assoc 2023;30:1741-1746. [PMID: 37428897 PMCID: PMC10531142 DOI: 10.1093/jamia/ocad110] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/23/2023] [Revised: 05/11/2023] [Accepted: 07/01/2023] [Indexed: 07/12/2023] Open

Muralidharan V, Burgart A, Daneshjou R, Rose S. Recommendations for the use of pediatric data in artificial intelligence and machine learning ACCEPT-AI. NPJ Digit Med 2023;6:166. [PMID: 37673925 PMCID: PMC10482936 DOI: 10.1038/s41746-023-00898-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/26/2023] [Accepted: 08/03/2023] [Indexed: 09/08/2023] Open

Alvarez-Estevez D. Challenges of Applying Automated Polysomnography Scoring at Scale. Sleep Med Clin 2023;18:277-292. [PMID: 37532369 DOI: 10.1016/j.jsmc.2023.05.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/04/2023]

Gao Y, Sharma T, Cui Y. Addressing the Challenge of Biomedical Data Inequality: An Artificial Intelligence Perspective. Annu Rev Biomed Data Sci 2023;6:153-171. [PMID: 37104653 PMCID: PMC10529864 DOI: 10.1146/annurev-biodatasci-020722-020704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/29/2023]

Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, Scales N, Tanwani A, Cole-Lewis H, Pfohl S, Payne P, Seneviratne M, Gamble P, Kelly C, Babiker A, Schärli N, Chowdhery A, Mansfield P, Demner-Fushman D, Agüera Y Arcas B, Webster D, Corrado GS, Matias Y, Chou K, Gottweis J, Tomasev N, Liu Y, Rajkomar A, Barral J, Semturs C, Karthikesalingam A, Natarajan V. Large language models encode clinical knowledge. Nature 2023;620:172-180. [PMID: 37438534 PMCID: PMC10396962 DOI: 10.1038/s41586-023-06291-2] [Citation(s) in RCA: 244] [Impact Index Per Article: 244.0] [Received: 01/25/2023] [Accepted: 06/05/2023] [Indexed: 07/14/2023]

Abstract

Large language models (LLMs) have demonstrated impressive capabilities, but the bar for clinical applications is high. Attempts to assess the clinical knowledge of models typically rely on automated evaluations based on limited benchmarks. Here, to address these limitations, we present MultiMedQA, a benchmark combining six existing medical question answering datasets spanning professional medicine, research and consumer queries and a new dataset of medical questions searched online, HealthSearchQA. We propose a human evaluation framework for model answers along multiple axes including factuality, comprehension, reasoning, possible harm and bias. In addition, we evaluate Pathways Language Model (PaLM, a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA and Measuring Massive Multitask Language Understanding (MMLU) clinical topics), including 67.6% accuracy on MedQA (US Medical Licensing Exam-style questions), surpassing the prior state of the art by more than 17%. However, human evaluation reveals key gaps. To resolve this, we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, knowledge recall and reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLMs for clinical applications.

Polevikov S. Advancing AI in healthcare: A comprehensive review of best practices. Clin Chim Acta 2023;548:117519. [PMID: 37595864 DOI: 10.1016/j.cca.2023.117519] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/02/2023] [Revised: 08/14/2023] [Accepted: 08/15/2023] [Indexed: 08/20/2023]