1
Sun K, Lan T, Goh YM, Safiena S, Huang YH, Lytle B, He Y. An interpretable clustering approach to safety climate analysis: Examining driver group distinctions. Accident Analysis and Prevention 2024; 196:107420. [PMID: 38159513] [DOI: 10.1016/j.aap.2023.107420]
Abstract
The transportation industry, particularly the trucking sector, is prone to workplace accidents and fatalities. Accidents involving large trucks account for a considerable percentage of overall traffic fatalities. Recognizing the crucial role of safety climate in accident prevention, researchers have sought to understand its factors and measure its impact within organizations. While existing data-driven safety climate studies have made remarkable progress, clustering employees based on their safety climate perceptions is innovative and has not been extensively utilized in research. Identifying clusters of drivers based on their safety climate perceptions allows an organization to profile its workforce and devise more impactful interventions. The limited use of clustering approaches may be due to the difficulty of interpreting or explaining the factors that influence employees' cluster membership. Moreover, existing safety-related studies have not compared multiple clustering algorithms, resulting in potential bias. To address these problems, this study introduces an interpretable clustering approach for safety climate analysis. It compares five algorithms for clustering truck drivers based on their safety climate perceptions and proposes a novel method for quantitatively evaluating partial dependence plots (QPDP). To better interpret the clustering results, the study applies several interpretable machine learning measures (Shapley additive explanations, permutation feature importance, and QPDP) and explains the clusters in terms of the importance of different safety climate factors. The Python code used in this study is available at https://github.com/NUS-DBE/truck-driver-safety-climate. Drawing on data collected from more than 7,000 American truck drivers, this study makes a significant contribution to the scientific literature: it highlights the critical role of supervisory care promotion in distinguishing driver groups and showcases the advantages of machine learning techniques, such as cluster analysis, for enriching scientific knowledge in this field. Future studies could involve experimental methods to assess strategies for enhancing supervisory care promotion, as well as integrating deep learning clustering techniques with safety climate evaluation.
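As a concrete illustration of the pipeline this abstract describes, the sketch below clusters synthetic survey responses with two candidate algorithms and then interprets cluster membership by fitting a surrogate classifier and computing permutation feature importance. It is a minimal sketch assuming scikit-learn, not the authors' released code (linked above); the data, algorithm choices, and item names are placeholders.

```python
# Minimal sketch (not the authors' code): cluster survey responses, then
# explain cluster membership via a surrogate classifier.
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = StandardScaler().fit_transform(rng.normal(size=(500, 8)))  # fake item scores

# Compare candidate clustering algorithms by silhouette score.
candidates = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0),
    "agglomerative": AgglomerativeClustering(n_clusters=3),
}
labels = {name: algo.fit_predict(X) for name, algo in candidates.items()}
for name, lab in labels.items():
    print(name, round(silhouette_score(X, lab), 3))

# Interpretation step: a surrogate classifier predicts cluster membership,
# and permutation importance ranks the survey items driving the split.
surrogate = RandomForestClassifier(random_state=0).fit(X, labels["kmeans"])
result = permutation_importance(surrogate, X, labels["kmeans"],
                                n_repeats=10, random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"item_{i}: {result.importances_mean[i]:.3f}")
```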
Affiliation(s)
- Kailai Sun
- National University of Singapore, Singapore
- Yimin He
- University of Nebraska Omaha, United States
2
Fehr J, Citro B, Malpani R, Lippert C, Madai VI. A trustworthy AI reality-check: the lack of transparency of artificial intelligence products in healthcare. Front Digit Health 2024; 6:1267290. [PMID: 38455991] [PMCID: PMC10919164] [DOI: 10.3389/fdgth.2024.1267290]
Abstract
Trustworthy medical AI requires transparency about the development and testing of underlying algorithms to identify biases and communicate potential risks of harm. Abundant guidance exists on how to achieve transparency for medical AI products, but it is unclear whether publicly available information adequately informs about their risks. To assess this, we retrieved public documentation on the 14 available CE-certified AI-based radiology products of the IIb risk category in the EU from vendor websites, scientific publications, and the European EUDAMED database. Using a self-designed survey, we reported on their development, validation, ethical considerations, and deployment caveats, according to trustworthy AI guidelines. We scored each question with 0, 0.5, or 1 to rate whether the required information was "unavailable," "partially available," or "fully available." The transparency of each product was calculated relative to all 55 questions. Transparency scores ranged from 6.4% to 60.9%, with a median of 29.1%. Major transparency gaps included missing documentation on training data, ethical considerations, and limitations for deployment. Ethical aspects like consent, safety monitoring, and GDPR compliance were rarely documented. Furthermore, deployment caveats for different demographics and medical settings were scarce. In conclusion, the public documentation of authorized medical AI products in Europe lacks sufficient transparency to inform about safety and risks. We call on lawmakers and regulators to establish legally mandated requirements for public and substantive transparency to fulfill the promise of trustworthy AI for health.
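The arithmetic behind the reported scores is straightforward; a small sketch follows, with invented ratings (the 55 survey questions and their answers are in the paper, not reproduced here):

```python
# Each of the 55 questions is scored 0 ("unavailable"), 0.5 ("partially
# available") or 1 ("fully available"); a product's transparency score is
# the sum relative to all 55 questions. Ratings below are invented.
ratings = [1, 0.5, 0] * 18 + [1]
assert len(ratings) == 55
transparency = 100 * sum(ratings) / len(ratings)
print(f"transparency: {transparency:.1f}%")  # 50.9% for these made-up ratings
```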
Affiliation(s)
- Jana Fehr
- Digital Health & Machine Learning, Hasso Plattner Institute, Potsdam, Germany
- Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
- QUEST Center for Responsible Research, Berlin Institute of Health (BIH), Charité Universitätsmedizin Berlin, Berlin, Germany
- Brian Citro
- Independent Researcher, Chicago, IL, United States
- Christoph Lippert
- Digital Health & Machine Learning, Hasso Plattner Institute, Potsdam, Germany
- Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Vince I. Madai
- QUEST Center for Responsible Research, Berlin Institute of Health (BIH), Charité Universitätsmedizin Berlin, Berlin, Germany
- Faculty of Computing, Engineering and the Built Environment, School of Computing and Digital Technology, Birmingham City University, Birmingham, United Kingdom
3
Kasun M, Ryan K, Paik J, Lane-McKinley K, Dunn LB, Roberts LW, Kim JP. Academic machine learning researchers' ethical perspectives on algorithm development for health care: a qualitative study. J Am Med Inform Assoc 2024; 31:563-573. [PMID: 38069455] [PMCID: PMC10873830] [DOI: 10.1093/jamia/ocad238]
Abstract
OBJECTIVES We set out to describe academic machine learning (ML) researchers' ethical considerations regarding the development of ML tools intended for use in clinical care. MATERIALS AND METHODS We conducted in-depth, semistructured interviews with a sample of ML researchers in medicine (N = 10) as part of a larger study investigating stakeholders' ethical considerations in the translation of ML tools in medicine. We used a qualitative descriptive design, applying conventional qualitative content analysis in order to allow participant perspectives to emerge directly from the data. RESULTS Every participant viewed their algorithm development work as holding ethical significance. While participants shared positive attitudes toward continued ML innovation, they described concerns related to data sampling and labeling (eg, limitations to mitigating bias; ensuring the validity and integrity of data), and algorithm training and testing (eg, selecting quantitative targets; assessing reproducibility). Participants perceived a need to increase interdisciplinary training across stakeholders and to envision more coordinated and embedded approaches to addressing ethics issues. DISCUSSION AND CONCLUSION Participants described key areas where increased support for ethics may be needed; technical challenges affecting clinical acceptability; and standards related to scientific integrity, beneficence, and justice that may be higher in medicine compared to other industries engaged in ML innovation. Our results help shed light on the perspectives of ML researchers in medicine regarding the range of ethical issues they encounter or anticipate in their work, including areas where more attention may be needed to support the successful development and integration of medical ML tools.
Affiliation(s)
- Max Kasun
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States
- Katie Ryan
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States
- Jodi Paik
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States
- Kyle Lane-McKinley
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States
- Laura Bodin Dunn
- Department of Psychiatry, University of Arkansas for Medical Sciences, Little Rock, AR 72205, United States
- Laura Weiss Roberts
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States
- Jane Paik Kim
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States
4
Graham SS, Shifflet S, Amjad M, Claborn K. An interpretable machine learning framework for opioid overdose surveillance from emergency medical services records. PLoS One 2024; 19:e0292170. [PMID: 38289927] [PMCID: PMC10826931] [DOI: 10.1371/journal.pone.0292170]
Abstract
The goal of this study is to develop and validate a lightweight, interpretable machine learning (ML) classifier to identify opioid overdoses in emergency medical services (EMS) records. We conducted a comparative assessment of three feature engineering approaches designed for use with unstructured narrative data. Opioid overdose annotations were provided by two harm reduction paramedics and two supporting annotators trained to reliably match expert annotations. Candidate feature engineering techniques included term frequency-inverse document frequency (TF-IDF), a highly performant concept vectorization approach, and a custom approach based on the count of empirically identified keywords. Each feature set was trained using four model architectures: generalized linear model (GLM), Naïve Bayes, neural network, and Extreme Gradient Boost (XGBoost). Ensembles of trained models were also evaluated. The custom feature models were additionally assessed for variable importance to aid interpretation. Models trained using TF-IDF feature engineering ranged from AUROC = 0.59 (95% CI: 0.53-0.66) for Naïve Bayes to AUROC = 0.76 (95% CI: 0.71-0.81) for the neural network. Models trained using concept vectorization features ranged from AUROC = 0.83 (95% CI: 0.78-0.88) for Naïve Bayes to AUROC = 0.89 (95% CI: 0.85-0.94) for the ensemble. Models trained using custom features were the most performant, with benchmarks ranging from AUROC = 0.92 (95% CI: 0.88-0.95) for the GLM to 0.93 (95% CI: 0.90-0.96) for the ensemble. The custom features model achieved positive predictive values (PPV) ranging from 80% to 100%, a substantial improvement over previously published EMS encounter opioid overdose classifiers. The application of this approach to county EMS data can productively inform local and targeted harm reduction initiatives.
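A minimal sketch of one arm of this comparison follows, assuming scikit-learn: TF-IDF features feeding a logistic-regression GLM, evaluated by AUROC. The narratives and labels are invented placeholders, and the concept-vectorization, keyword, and ensemble arms are omitted.

```python
# Sketch of the TF-IDF + GLM arm (placeholder data, not the study corpus).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

narratives = ["pt unresponsive, pinpoint pupils, narcan administered",
              "fall from standing, hip pain, denies loss of consciousness"] * 50
labels = [1, 0] * 50  # 1 = annotated opioid overdose

X_tr, X_te, y_tr, y_te = train_test_split(
    narratives, labels, test_size=0.3, random_state=0, stratify=labels)

# TF-IDF unigram/bigram features into a logistic-regression GLM.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(X_tr, y_tr)
print("AUROC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```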
Affiliation(s)
- S. Scott Graham
- Department of Rhetoric & Writing, Center for Health Communication, University of Texas at Austin, Austin, TX, United States of America
- Addiction Research Institute, University of Texas at Austin, Austin, TX, United States of America
- Savannah Shifflet
- Addiction Research Institute, University of Texas at Austin, Austin, TX, United States of America
- Maaz Amjad
- Addiction Research Institute, University of Texas at Austin, Austin, TX, United States of America
- Kasey Claborn
- Addiction Research Institute, University of Texas at Austin, Austin, TX, United States of America
- Steve Hicks School of Social Work, University of Texas at Austin, Austin, TX, United States of America
5
Green BL, Murphy A, Robinson E. Accelerating health disparities research with artificial intelligence. Front Digit Health 2024; 6:1330160. [PMID: 38322109] [PMCID: PMC10844447] [DOI: 10.3389/fdgth.2024.1330160]
Affiliation(s)
- B. Lee Green
- Department of Health Outcomes and Behavior, Moffitt Cancer Center, Tampa, FL, United States
- Anastasia Murphy
- Department of Health Outcomes and Behavior, Moffitt Cancer Center, Tampa, FL, United States
- Edmondo Robinson
- Center for Digital Health, Moffitt Cancer Center, Tampa, FL, United States
6
Cen HS, Dandamudi S, Lei X, Weight C, Desai M, Gill I, Duddalwar V. Diversity in Renal Mass Data Cohorts: Implications for Urology AI Researchers. Oncology 2023. [PMID: 38104555] [PMCID: PMC11178677] [DOI: 10.1159/000535841]
Abstract
Objective We examine the heterogeneity and distribution of the cohort populations in two publicly used radiological image cohorts, the Cancer Genome Atlas Kidney Renal Clear Cell Carcinoma (TCIA TCGA KIRC) collection and the 2019 MICCAI Kidney Tumor Segmentation Challenge (KiTS19), and their deviations from real-world renal cancer population data in the National Cancer Database (NCDB) Participant User Data File (PUF) and tertiary center data. PUF data are used as an anchor for prevalence rate bias assessment. Gene expression, and therefore the biology of RCC, differs by self-reported race, especially between African American and Caucasian populations. AI algorithms learn from datasets, but if a dataset misrepresents the population, the algorithm may reinforce that bias. Ignoring these demographic features may lead to inaccurate downstream effects, thereby limiting the translation of these analyses to clinical practice. Awareness of model training biases is vital to patient care decisions when using models in clinical settings. Method Data evaluated included gender, demographics, and reported pathologic grading and cancer staging. American Urological Association risk levels were used. Poisson regression was used to estimate population-based and sample-specific prevalence rates and corresponding 95% confidence intervals. SAS 9.4 was used for data analysis. Result Compared with PUF, KiTS19 and TCGA KIRC oversampled Caucasian patients by 9.5% (95% CI, -3.7% to 22.7%) and 15.1% (95% CI, 1.5% to 28.8%) and undersampled African American patients by 6.7% (95% CI, -10.0% to -3.3%) and 5.5% (95% CI, -9.3% to -1.8%), respectively. The tertiary cohort also undersampled African American patients, by 6.6% (95% CI, -8.7% to -4.6%), and largely undersampled aggressive cancers, by 14.7% (95% CI, -20.9% to -8.4%). No statistically significant difference in the rate of aggressive cancers was found among PUF, TCGA, and KiTS19; however, heterogeneities in risk are notable. Conclusion Heterogeneities between cohorts need to be considered in future AI training and cross-validation for renal masses.
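The prevalence-rate estimation named in the Methods can be sketched as follows, assuming statsmodels rather than SAS: a Poisson model of group counts with cohort size as exposure yields prevalence rates, rate ratios, and 95% CIs. The counts below are invented, not the paper's data.

```python
# Sketch of Poisson prevalence-rate estimation (invented counts).
import numpy as np
import statsmodels.api as sm

counts = np.array([120, 40])     # patients from one demographic group
totals = np.array([1000, 600])   # cohort sizes, used as exposure
cohort = np.array([0, 1])        # 0 = reference cohort, 1 = comparison

X = sm.add_constant(cohort)
fit = sm.GLM(counts, X, family=sm.families.Poisson(), exposure=totals).fit()

print("reference prevalence:", np.exp(fit.params[0]))
print("prevalence rate ratio:", np.exp(fit.params[1]))
print("95% CI for the ratio:", np.exp(fit.conf_int()[1]))
```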
Affiliation(s)
- Harmony Selena Cen
- Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Xiaomeng Lei
- Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Chris Weight
- Urologic Oncology, Cleveland Clinic, Cleveland, OH, USA
- Mihir Desai
- Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Inderbir Gill
- Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Vinay Duddalwar
- Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
7
Chin MH, Afsar-Manesh N, Bierman AS, Chang C, Colón-Rodríguez CJ, Dullabh P, Duran DG, Fair M, Hernandez-Boussard T, Hightower M, Jain A, Jordan WB, Konya S, Moore RH, Moore TT, Rodriguez R, Shaheen G, Snyder LP, Srinivasan M, Umscheid CA, Ohno-Machado L. Guiding Principles to Address the Impact of Algorithm Bias on Racial and Ethnic Disparities in Health and Health Care. JAMA Netw Open 2023; 6:e2345050. [PMID: 38100101] [PMCID: PMC11181958] [DOI: 10.1001/jamanetworkopen.2023.45050]
Abstract
Importance Health care algorithms are used for diagnosis, treatment, prognosis, risk stratification, and allocation of resources. Bias in the development and use of algorithms can lead to worse outcomes for racial and ethnic minoritized groups and other historically marginalized populations such as individuals with lower income. Objective To provide a conceptual framework and guiding principles for mitigating and preventing bias in health care algorithms to promote health and health care equity. Evidence Review The Agency for Healthcare Research and Quality and the National Institute for Minority Health and Health Disparities convened a diverse panel of experts to review evidence, hear from stakeholders, and receive community feedback. Findings The panel developed a conceptual framework to apply guiding principles across an algorithm's life cycle, centering health and health care equity for patients and communities as the goal, within the wider context of structural racism and discrimination. Multiple stakeholders can mitigate and prevent bias at each phase of the algorithm life cycle, including problem formulation (phase 1); data selection, assessment, and management (phase 2); algorithm development, training, and validation (phase 3); deployment and integration of algorithms in intended settings (phase 4); and algorithm monitoring, maintenance, updating, or deimplementation (phase 5). Five principles should guide these efforts: (1) promote health and health care equity during all phases of the health care algorithm life cycle; (2) ensure health care algorithms and their use are transparent and explainable; (3) authentically engage patients and communities during all phases of the health care algorithm life cycle and earn trustworthiness; (4) explicitly identify health care algorithmic fairness issues and trade-offs; and (5) establish accountability for equity and fairness in outcomes from health care algorithms. Conclusions and Relevance Multiple stakeholders must partner to create systems, processes, regulations, incentives, standards, and policies to mitigate and prevent algorithmic bias. Reforms should implement guiding principles that support promotion of health and health care equity in all phases of the algorithm life cycle as well as transparency and explainability, authentic community engagement and ethical partnerships, explicit identification of fairness issues and trade-offs, and accountability for equity and fairness.
Affiliation(s)
- Christine Chang
- Agency for Healthcare Research and Quality, Rockville, Maryland
- Malika Fair
- Association of American Medical Colleges, Washington, DC
- Anjali Jain
- Agency for Healthcare Research and Quality, Rockville, Maryland
- Stephen Konya
- Office of the National Coordinator for Health Information Technology, Washington, DC
- Roslyn Holliday Moore
- US Department of Health and Human Services Office of Minority Health, Rockville, Maryland
8
McCradden MD, Joshi S, Anderson JA, London AJ. A normative framework for artificial intelligence as a sociotechnical system in healthcare. Patterns (N Y) 2023; 4:100864. [PMID: 38035190] [PMCID: PMC10682751] [DOI: 10.1016/j.patter.2023.100864]
Abstract
Artificial intelligence (AI) tools are of great interest to healthcare organizations for their potential to improve patient care, yet their translation into clinical settings remains inconsistent. One of the reasons for this gap is that good technical performance does not inevitably result in patient benefit. We advocate for a conceptual shift wherein AI tools are seen as components of an intervention ensemble. The intervention ensemble describes the constellation of practices that, together, bring about benefit to patients or health systems. Shifting from a narrow focus on the tool itself toward the intervention ensemble prioritizes a "sociotechnical" vision for translation of AI that values all components of use that support beneficial patient outcomes. The intervention ensemble approach can be used for regulation, institutional oversight, and for AI adopters to responsibly and ethically appraise, evaluate, and use AI tools.
Affiliation(s)
- Melissa D. McCradden
- Department of Bioethics, The Hospital for Sick Children, Toronto, ON, Canada
- Genetics & Genome Biology Research Program, Peter Gilgan Center for Research & Learning, Toronto, ON, Canada
- Division of Clinical & Public Health, Dalla Lana School of Public Health, Toronto, ON, Canada
- Shalmali Joshi
- Department of Biomedical Informatics, Department of Computer Science (Affiliate), Data Science Institute, Columbia University, New York, NY, USA
- James A. Anderson
- Department of Bioethics, The Hospital for Sick Children, Toronto, ON, Canada
- Institute for Health Policy, Management, and Evaluation, University of Toronto, Toronto, ON, Canada
- Alex John London
- Department of Philosophy and Center for Ethics and Policy, Carnegie Mellon University, Pittsburgh, PA, USA
9
Li LT, Haley LC, Boyd AK, Bernstam EV. Technical/Algorithm, Stakeholder, and Society (TASS) barriers to the application of artificial intelligence in medicine: A systematic review. J Biomed Inform 2023; 147:104531. [PMID: 37884177] [DOI: 10.1016/j.jbi.2023.104531]
Abstract
INTRODUCTION The use of artificial intelligence (AI), particularly machine learning and predictive analytics, has shown great promise in health care. Despite its strong potential, there has been limited use in health care settings. In this systematic review, we aim to determine the main barriers to successful implementation of AI in healthcare and discuss potential ways to overcome these challenges. METHODS We conducted a literature search in PubMed (1/1/2001-1/1/2023). The search was restricted to publications in the English language and human study subjects. We excluded articles that did not discuss AI, machine learning, or predictive analytics and the barriers to the use of these techniques in health care. Using grounded theory methodology, we abstracted concepts to identify major barriers to AI use in medicine. RESULTS We identified a total of 2,382 articles. After reviewing the 306 included papers, we developed 19 major themes, which we categorized into three levels: the Technical/Algorithm, Stakeholder, and Society levels (TASS). These themes included: Lack of Explainability, Need for Validation Protocols, Need for Standards for Interoperability, Need for Reporting Guidelines, Need for Standardization of Performance Metrics, Lack of Plan for Updating Algorithm, Job Loss, Skills Loss, Workflow Challenges, Loss of Patient Autonomy and Consent, Disturbing the Patient-Clinician Relationship, Lack of Trust in AI, Logistical Challenges, Lack of Strategic Plan, Lack of Cost-effectiveness Analysis and Proof of Efficacy, Privacy, Liability, Bias and Social Justice, and Education. CONCLUSION We identified 19 major barriers to the use of AI in healthcare and categorized them into three levels: the Technical/Algorithm, Stakeholder, and Society levels (TASS). Future studies should expand on barriers in pediatric care and focus on developing clearly defined protocols to overcome these barriers.
Affiliation(s)
- Linda T Li
- Department of Surgery, Division of Pediatric Surgery, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, United States; McWilliams School of Biomedical Informatics at UT Health Houston, 7000 Fannin St, Suite 600, Houston, TX 77030, United States.
- Lauren C Haley
- McGovern Medical School at the University of Texas Health Science Center at Houston, 6431 Fannin St, Houston, TX 77030, United States.
- Alexandra K Boyd
- McGovern Medical School at the University of Texas Health Science Center at Houston, 6431 Fannin St, Houston, TX 77030, United States.
- Elmer V Bernstam
- McWilliams School of Biomedical Informatics at UT Health Houston, 7000 Fannin St, Suite 600, Houston, TX 77030, United States; McGovern Medical School at the University of Texas Health Science Center at Houston, 6431 Fannin St, Houston, TX 77030, United States.
10
Wang Y, Song Y, Ma Z, Han X. Multidisciplinary considerations of fairness in medical AI: A scoping review. Int J Med Inform 2023; 178:105175. [PMID: 37595374] [DOI: 10.1016/j.ijmedinf.2023.105175]
Abstract
INTRODUCTION Artificial intelligence (AI) technology has developed significantly in recent years. The fairness of medical AI is of great concern because of its direct relation to human life and health. This review analyzes the existing research literature on fairness in medical AI from the perspectives of computer science, medical science, and social science (including law and ethics). Its objective is to examine similarities and differences in the understanding of fairness, explore influencing factors, and investigate potential measures for implementing fairness in medical AI across English and Chinese literature. METHODS This study employed a scoping review methodology and searched the following databases: Web of Science, MEDLINE, PubMed, Ovid, CNKI, WANFANG Data, and others, for fairness issues in medical AI through February 2023. The search used keywords such as "artificial intelligence," "machine learning," "medical," "algorithm," "fairness," "decision-making," and "bias." The collected data were charted, synthesized, and subjected to descriptive and thematic analysis. RESULTS After reviewing 468 English papers and 356 Chinese papers, 53 and 42, respectively, were included in the final analysis. Our results show that the three disciplines differ significantly on the core issues. Beyond algorithmic bias and human bias, data is the foundation that affects fairness in medical AI. Legal, ethical, and technological measures all promote the implementation of fairness in medical AI. CONCLUSIONS Our review indicates a consensus across disciplines on the importance of data fairness as the foundation for achieving fairness in medical AI. However, there are substantial discrepancies in core aspects such as the concept of fairness, its influencing factors, and implementation measures. Consequently, future research should facilitate interdisciplinary discussion to bridge the cognitive gaps between fields and enhance the practical implementation of fairness in medical AI.
Affiliation(s)
- Yue Wang
- School of Law, Xi'an Jiaotong University, No.28, Xianning West Road, Xi'an, Shaanxi, 710049, PR China.
- Yaxin Song
- School of Law, Xi'an Jiaotong University, No.28, Xianning West Road, Xi'an, Shaanxi, 710049, PR China.
- Zhuo Ma
- School of Law, Xi'an Jiaotong University, No.28, Xianning West Road, Xi'an, Shaanxi, 710049, PR China.
- Xiaoxue Han
- Xi'an Jiaotong University Library, No.28, Xianning West Road, Xi'an, Shaanxi, 710049, PR China.
11
Teeple S, Chivers C, Linn KA, Halpern SD, Eneanya N, Draugelis M, Courtright K. Evaluating equity in performance of an electronic health record-based 6-month mortality risk model to trigger palliative care consultation: a retrospective model validation analysis. BMJ Qual Saf 2023; 32:503-516. [PMID: 37001995] [PMCID: PMC10898860] [DOI: 10.1136/bmjqs-2022-015173]
Abstract
OBJECTIVE Evaluate predictive performance of an electronic health record (EHR)-based, inpatient 6-month mortality risk model developed to trigger palliative care consultation among patient groups stratified by age, race, ethnicity, insurance and socioeconomic status (SES), which may vary due to social forces (eg, racism) that shape health, healthcare and health data. DESIGN Retrospective evaluation of prediction model. SETTING Three urban hospitals within a single health system. PARTICIPANTS All patients ≥18 years admitted between 1 January and 31 December 2017, excluding observation, obstetric, rehabilitation and hospice (n=58 464 encounters, 41 327 patients). MAIN OUTCOME MEASURES General performance metrics (c-statistic, integrated calibration index (ICI), Brier Score) and additional measures relevant to health equity (accuracy, false positive rate (FPR), false negative rate (FNR)). RESULTS For black versus non-Hispanic white patients, the model's accuracy was higher (0.051, 95% CI 0.044 to 0.059), FPR lower (-0.060, 95% CI -0.067 to -0.052) and FNR higher (0.049, 95% CI 0.023 to 0.078). A similar pattern was observed among patients who were Hispanic, younger, with Medicaid/missing insurance, or living in low SES zip codes. No consistent differences emerged in c-statistic, ICI or Brier Score. Younger age had the second-largest effect size in the mortality prediction model, and there were large standardised group differences in age (eg, 0.32 for non-Hispanic white versus black patients), suggesting age may contribute to systematic differences in the predicted probabilities between groups. CONCLUSIONS An EHR-based mortality risk model was less likely to identify some marginalised patients as potentially benefiting from palliative care, with younger age pinpointed as a possible mechanism. Evaluating predictive performance is a critical preliminary step in addressing algorithmic inequities in healthcare, which must also include evaluating clinical impact, and governance and regulatory structures for oversight, monitoring and accountability.
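The group-stratified error analysis described here can be reproduced in outline with a few lines of scikit-learn; the sketch below computes accuracy, FPR, and FNR within each group on invented data (the paper additionally reports c-statistics, calibration, Brier scores, and confidence intervals):

```python
# Sketch of a per-group error audit (invented labels/predictions/groups).
import numpy as np
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, 1000)      # 6-month mortality outcome
y_pred = rng.integers(0, 2, 1000)      # binarized model trigger
group = rng.choice(["A", "B"], 1000)   # e.g. race or insurance stratum

for g in np.unique(group):
    m = group == g
    tn, fp, fn, tp = confusion_matrix(y_true[m], y_pred[m]).ravel()
    acc = (tp + tn) / m.sum()
    fpr = fp / (fp + tn)  # flagged for consultation but survived
    fnr = fn / (fn + tp)  # potential palliative-care need missed
    print(f"group {g}: accuracy={acc:.3f}, FPR={fpr:.3f}, FNR={fnr:.3f}")
```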
Affiliation(s)
- Stephanie Teeple
- Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Palliative and Advanced Illness Research (PAIR) Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Kristin A Linn
- Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Scott D Halpern
- Palliative and Advanced Illness Research (PAIR) Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Nwamaka Eneanya
- Palliative and Advanced Illness Research (PAIR) Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Katherine Courtright
- Palliative and Advanced Illness Research (PAIR) Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
12
Walsh G, Stogiannos N, van de Venter R, Rainey C, Tam W, McFadden S, McNulty JP, Mekis N, Lewis S, O'Regan T, Kumar A, Huisman M, Bisdas S, Kotter E, Pinto dos Santos D, Sá dos Reis C, van Ooijen P, Brady AP, Malamateniou C. Responsible AI practice and AI education are central to AI implementation: a rapid review for all medical imaging professionals in Europe. BJR Open 2023; 5:20230033. [PMID: 37953871] [PMCID: PMC10636340] [DOI: 10.1259/bjro.20230033]
Abstract
Artificial intelligence (AI) has transitioned from the lab to the bedside, and it is increasingly being used in healthcare. Radiology and Radiography are on the frontline of AI implementation because of the use of big data for medical imaging and diagnosis across different patient groups. Safe and effective AI implementation requires that responsible and ethical practices are upheld by all key stakeholders, that there is harmonious collaboration between different professional groups, and that customised educational provisions are made for all involved. This paper outlines key principles of ethical and responsible AI, highlights recent educational initiatives for clinical practitioners and discusses the synergies between all medical imaging professionals as they prepare for the digital future in Europe. Responsible and ethical AI is vital to enhance a culture of safety and trust for healthcare professionals and patients alike. Educational and training provisions for medical imaging professionals on AI are central to the understanding of basic AI principles and applications, and many offerings currently exist in Europe. Education can facilitate the transparency of AI tools, but more formalised, university-led training is needed to ensure academic scrutiny, appropriate pedagogy, multidisciplinarity and customisation to learners' unique needs. As radiographers and radiologists work together and with other professionals to understand and harness the benefits of AI in medical imaging, it becomes clear that they are faced with the same challenges and that they have the same needs. The digital future belongs to multidisciplinary teams that work seamlessly together, learn together, manage risk collectively and collaborate for the benefit of the patients they serve.
Affiliation(s)
- Gemma Walsh
- Division of Midwifery & Radiography, City University of London, London, United Kingdom
- Clare Rainey
- School of Health Sciences, Ulster University, Derry~Londonderry, Northern Ireland
- Winnie Tam
- Division of Midwifery & Radiography, City University of London, London, United Kingdom
- Sonyia McFadden
- School of Health Sciences, Ulster University, Coleraine, United Kingdom
- Nejc Mekis
- Medical Imaging and Radiotherapy Department, University of Ljubljana, Faculty of Health Sciences, Ljubljana, Slovenia
- Sarah Lewis
- Discipline of Medical Imaging Science, Sydney School of Health Sciences, Faculty of Medicine and Health, University of Sydney, Sydney, Australia
- Tracy O'Regan
- The Society and College of Radiographers, London, United Kingdom
- Amrita Kumar
- Frimley Health NHS Foundation Trust, Frimley, United Kingdom
- Merel Huisman
- Department of Radiology, University Medical Center Utrecht, Utrecht, Netherlands
- Cláudia Sá dos Reis
- School of Health Sciences (HESAV), University of Applied Sciences and Arts Western Switzerland (HES-SO), Lausanne, Switzerland
13
Vorisek CN, Stellmach C, Mayer PJ, Klopfenstein SAI, Bures DM, Diehl A, Henningsen M, Ritter K, Thun S. Artificial Intelligence Bias in Health Care: Web-Based Survey. J Med Internet Res 2023; 25:e41089. [PMID: 37347528] [PMCID: PMC10337406] [DOI: 10.2196/41089]
Abstract
BACKGROUND Resources are increasingly spent on artificial intelligence (AI) solutions for medical applications aiming to improve diagnosis, treatment, and prevention of diseases. While the need for transparency and reduction of bias in data and algorithm development has been addressed in past studies, little is known about the knowledge and perception of bias among AI developers. OBJECTIVE This study's objective was to survey AI specialists in health care to investigate developers' perceptions of bias in AI algorithms for health care applications and their awareness and use of preventative measures. METHODS A web-based survey was provided in both German and English, comprising a maximum of 41 questions using branching logic within the REDCap web application. Only the results of participants with experience in the field of medical AI applications and complete questionnaires were included for analysis. Demographic data, technical expertise, and perceptions of fairness, as well as knowledge of biases in AI, were analyzed, and variations by gender, age, and work environment were assessed. RESULTS A total of 151 AI specialists completed the web-based survey. The median age was 30 (IQR 26-39) years, and 67% (101/151) of respondents were male. About one-third rated their AI development projects as fair (47/151, 31%) or moderately fair (51/151, 34%), 12% (18/151) reported their AI to be barely fair, and 1% (2/151) not fair at all. One participant identifying as diverse rated AI developments as barely fair, and the 2 participants of undefined gender rated AI developments as barely fair and moderately fair, respectively. Reasons for biases selected by respondents were lack of fair data (90/132, 68%), guidelines or recommendations (65/132, 49%), or knowledge (60/132, 45%). Half of the respondents worked with image data (83/151, 55%) from 1 center only (76/151, 50%), and 35% (53/151) worked with national data exclusively. CONCLUSIONS This study shows that developers' overall perception of fairness in their AI projects is moderate. Gender minorities did not once rate their AI development as fair or very fair. Therefore, further studies need to focus on minorities and women and their perceptions of AI. The results highlight the need to strengthen knowledge about bias in AI and provide guidelines on preventing biases in AI health care applications.
Affiliation(s)
- Carina Nina Vorisek
- Core Facility Digital Medicine and Interoperability, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Caroline Stellmach
- Core Facility Digital Medicine and Interoperability, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Paula Josephine Mayer
- Core Facility Digital Medicine and Interoperability, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Sophie Anne Ines Klopfenstein
- Core Facility Digital Medicine and Interoperability, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Institute for Medical Informatics, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Anke Diehl
- Stabsstelle Digitale Transformation, Universitätsmedizin Essen, Essen, Germany
- Maike Henningsen
- Faculty of Health, University of Witten/Herdecke, Witten, Germany
- Kerstin Ritter
- Department of Psychiatry and Psychotherapy, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Sylvia Thun
- Core Facility Digital Medicine and Interoperability, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
14
de Hond AAH, Kant IMJ, Fornasa M, Cinà G, Elbers PWG, Thoral PJ, Sesmu Arbous M, Steyerberg EW. Predicting Readmission or Death After Discharge From the ICU: External Validation and Retraining of a Machine Learning Model. Crit Care Med 2023; 51:291-300. [PMID: 36524820] [PMCID: PMC9848213] [DOI: 10.1097/ccm.0000000000005758]
Abstract
OBJECTIVES Many machine learning (ML) models have been developed for application in the ICU, but few models have been subjected to external validation. The performance of these models in new settings therefore remains unknown. The objective of this study was to assess the performance of an existing decision support tool based on a ML model predicting readmission or death within 7 days after ICU discharge before, during, and after retraining and recalibration. DESIGN A gradient boosted ML model was developed and validated on electronic health record data from 2004 to 2021. We performed an independent validation of this model on electronic health record data from 2011 to 2019 from a different tertiary care center. SETTING Two ICUs in tertiary care centers in The Netherlands. PATIENTS Adult patients who were admitted to the ICU and stayed for longer than 12 hours. INTERVENTIONS None. MEASUREMENTS AND MAIN RESULTS We assessed discrimination by area under the receiver operating characteristic curve (AUC) and calibration (slope and intercept). We retrained and recalibrated the original model and assessed performance via a temporal validation design. The final retrained model was cross-validated on all data from the new site. Readmission or death within 7 days after ICU discharge occurred in 577 of 10,052 ICU admissions (5.7%) at the new site. External validation revealed moderate discrimination with an AUC of 0.72 (95% CI 0.67-0.76). Retrained models showed improved discrimination with AUC 0.79 (95% CI 0.75-0.82) for the final validation model. Calibration was poor initially and good after recalibration via isotonic regression. CONCLUSIONS In this era of expanding availability of ML models, external validation and retraining are key steps to consider before applying ML models to new settings. Clinicians and decision-makers should take this into account when considering applying new ML models to their local settings.
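A compact sketch of the validation-and-recalibration steps named above follows, assuming scikit-learn and simulated predictions: AUC for discrimination, a logistic fit on the logit of the predicted risks for the calibration intercept and slope (one standard approach), and isotonic regression for recalibration. Nothing here reproduces the study's model or data.

```python
# Sketch: external validation metrics, then isotonic recalibration
# (simulated, deliberately miscalibrated predictions).
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
p_model = rng.uniform(0.01, 0.60, 2000)   # risks from the "old" model
y = rng.binomial(1, 0.6 * p_model)        # true risk is lower: miscalibration

print("AUC:", round(roc_auc_score(y, p_model), 3))

# Calibration intercept/slope: regress the outcome on logit(predicted risk).
logit = np.log(p_model / (1 - p_model)).reshape(-1, 1)
cal = LogisticRegression(C=1e6).fit(logit, y)   # near-unpenalized fit
print("intercept:", round(cal.intercept_[0], 2),
      "slope:", round(cal.coef_[0, 0], 2))

# Recalibrate with isotonic regression, as named in the abstract.
iso = IsotonicRegression(out_of_bounds="clip").fit(p_model, y)
p_recal = iso.predict(p_model)
```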
Affiliation(s)
- Anne A H de Hond
- Department of Information Technology and Digital Innovation, Leiden University Medical Centre, Leiden, The Netherlands
- Department of Biomedical Informatics, Stanford Medicine, Stanford, CA
- Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, The Netherlands
- Ilse M J Kant
- Department of Information Technology and Digital Innovation, Leiden University Medical Centre, Leiden, The Netherlands
- Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, The Netherlands
- Giovanni Cinà
- Pacmed, Stadhouderskade 55, Amsterdam, The Netherlands
- Institute of Logic, Language and Computation, University of Amsterdam, Amsterdam, The Netherlands
- Paul W G Elbers
- Department of Intensive Care Medicine, Laboratory for Critical Care Computational Intelligence, Amsterdam UMC, Amsterdam, The Netherlands
- Patrick J Thoral
- Department of Intensive Care Medicine, Laboratory for Critical Care Computational Intelligence, Amsterdam UMC, Amsterdam, The Netherlands
- M Sesmu Arbous
- Department of Intensive Care Medicine, Leiden University Medical Centre, Leiden, The Netherlands
- Ewout W Steyerberg
- Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, The Netherlands
15
Rubeis G, Fang ML, Sixsmith A. Equity in AgeTech for Ageing Well in Technology-Driven Places: The Role of Social Determinants in Designing AI-based Assistive Technologies. Science and Engineering Ethics 2022; 28:49. [PMID: 36301408] [PMCID: PMC9613787] [DOI: 10.1007/s11948-022-00397-y]
Abstract
AgeTech involves the use of emerging technologies to support the health, well-being and independent living of older adults. In this paper we focus on how AgeTech based on artificial intelligence (AI) may better support older adults to remain in their own living environment for longer, provide social connectedness, support wellbeing and mental health, and enable social participation. In order to assess and better understand the positive as well as negative outcomes of AI-based AgeTech, a critical analysis of ethical design, digital equity, and policy pathways is required. A crucial question is how AI-based AgeTech may drive practical, equitable, and inclusive multilevel solutions to support healthy, active ageing. In our paper, we aim to show that a focus on equity is key if AI-based AgeTech is to realize its full potential. We propose that equity should not just be an extra benefit or minimum requirement but the explicit aim of designing AI-based health tech. This means that social determinants affecting the use of or access to these technologies have to be addressed. We explore how complexity management, as a crucial element of AI-based AgeTech, may create or exacerbate social inequities by marginalising or ignoring social determinants. We identify bias, standardization, and access as the main ethical issues in this context and subsequently make recommendations as to how inequities that stem from AI-based AgeTech can be addressed.
Affiliation(s)
- Giovanni Rubeis
- Department General Health Studies, Division Biomedical and Public Health Ethics, Karl Landsteiner University of Health Sciences, Dr.-Karl-Dorrek-Straße 30, 3500 Krems, Austria
- Mei Lan Fang
- Department General Health Studies, Division Biomedical and Public Health Ethics, Karl Landsteiner University of Health Sciences, Dr.-Karl-Dorrek-Straße 30, 3500 Krems, Austria
- School of Health Sciences, University of Dundee, Nethergate, Dundee DD1 4HN, Scotland, UK
- Andrew Sixsmith
- Department General Health Studies, Division Biomedical and Public Health Ethics, Karl Landsteiner University of Health Sciences, Dr.-Karl-Dorrek-Straße 30, 3500 Krems, Austria
- Department of Gerontology, Simon Fraser University, Harbour Centre, Room 2800, Vancouver, BC V6B 5K3, Canada
16
Chavez-Yenter D, Goodman MS, Chen Y, Chu X, Bradshaw RL, Lorenz Chambers R, Chan PA, Daly BM, Flynn M, Gammon A, Hess R, Kessler C, Kohlmann WK, Mann DM, Monahan R, Peel S, Kawamoto K, Del Fiol G, Sigireddi M, Buys SS, Ginsburg O, Kaphingst KA. Association of Disparities in Family History and Family Cancer History in the Electronic Health Record With Sex, Race, Hispanic or Latino Ethnicity, and Language Preference in 2 Large US Health Care Systems. JAMA Netw Open 2022; 5:e2234574. [PMID: 36194411] [PMCID: PMC9533178] [DOI: 10.1001/jamanetworkopen.2022.34574]
Abstract
IMPORTANCE Clinical decision support (CDS) algorithms are increasingly being implemented in health care systems to identify patients for specialty care. However, systematic differences in missingness of electronic health record (EHR) data may lead to disparities in identification by CDS algorithms. OBJECTIVE To examine the availability and comprehensiveness of cancer family history information (FHI) in patients' EHRs by sex, race, Hispanic or Latino ethnicity, and language preference in 2 large health care systems in 2021. DESIGN, SETTING, AND PARTICIPANTS This retrospective EHR quality improvement study used EHR data from 2 health care systems: University of Utah Health (UHealth) and NYU Langone Health (NYULH). Participants included patients aged 25 to 60 years who had a primary care appointment in the previous 3 years. Data were collected or abstracted from the EHR from December 10, 2020, to October 31, 2021, and analyzed from June 15 to October 31, 2021. EXPOSURES Prior collection of cancer FHI in primary care settings. MAIN OUTCOMES AND MEASURES Availability was defined as having any FHI and any cancer FHI in the EHR and was examined at the patient level. Comprehensiveness was defined as whether a cancer family history observation in the EHR specified the type of cancer diagnosed in a family member, the relationship of the family member to the patient, and the age at onset for the family member and was examined at the observation level. RESULTS Among 144 484 patients in the UHealth system, 53.6% were women; 74.4% were non-Hispanic or non-Latino and 67.6% were White; and 83.0% had an English language preference. Among 377 621 patients in the NYULH system, 55.3% were women; 63.2% were non-Hispanic or non-Latino and 55.3% were White; and 89.9% had an English language preference. Patients from historically medically underserved groups-specifically, Black vs White patients (UHealth: 17.3% [95% CI, 16.1%-18.6%] vs 42.8% [95% CI, 42.5%-43.1%]; NYULH: 24.4% [95% CI, 24.0%-24.8%] vs 33.8% [95% CI, 33.6%-34.0%]), Hispanic or Latino vs non-Hispanic or non-Latino patients (UHealth: 27.2% [95% CI, 26.5%-27.8%] vs 40.2% [95% CI, 39.9%-40.5%]; NYULH: 24.4% [95% CI, 24.1%-24.7%] vs 31.6% [95% CI, 31.4%-31.8%]), Spanish-speaking vs English-speaking patients (UHealth: 18.4% [95% CI, 17.2%-19.1%] vs 40.0% [95% CI, 39.7%-40.3%]; NYULH: 15.1% [95% CI, 14.6%-15.6%] vs 31.1% [95% CI, 30.9%-31.2%]), and men vs women (UHealth: 30.8% [95% CI, 30.4%-31.2%] vs 43.0% [95% CI, 42.6%-43.3%]; NYULH: 23.1% [95% CI, 22.9%-23.3%] vs 34.9% [95% CI, 34.7%-35.1%])-had significantly lower availability and comprehensiveness of cancer FHI (P < .001). CONCLUSIONS AND RELEVANCE These findings suggest that systematic differences in the availability and comprehensiveness of FHI in the EHR may introduce informative presence bias as inputs to CDS algorithms. The observed differences may also exacerbate disparities for medically underserved groups. System-, clinician-, and patient-level efforts are needed to improve the collection of FHI.
Affiliation(s)
- Daniel Chavez-Yenter
- Huntsman Cancer Institute, University of Utah, Salt Lake City
- Department of Communication, University of Utah, Salt Lake City
- Melody S. Goodman
- School of Global Public Health, New York University, New York, New York
- Yuyu Chen
- School of Global Public Health, New York University, New York, New York
- Xiangying Chu
- School of Global Public Health, New York University, New York, New York
- Richard L. Bradshaw
- Department of Biomedical Informatics, University of Utah, Salt Lake City
- School of Medicine, University of Utah Health, Salt Lake City, Utah
- Brianne M. Daly
- Huntsman Cancer Institute, University of Utah, Salt Lake City
- Michael Flynn
- School of Medicine, University of Utah Health, Salt Lake City, Utah
- Amanda Gammon
- Huntsman Cancer Institute, University of Utah, Salt Lake City
- Rachel Hess
- Department of Population Health Sciences, University of Utah, Salt Lake City
- Department of Internal Medicine, University of Utah, Salt Lake City
- Cecelia Kessler
- Huntsman Cancer Institute, University of Utah, Salt Lake City
- Devin M. Mann
- Department of Population Health, New York University Grossman School of Medicine, New York University, New York, New York
- Rachel Monahan
- Perlmutter Cancer Center, NYU Langone Health, New York, New York
- Department of Population Health, New York University Grossman School of Medicine, New York University, New York, New York
- Sara Peel
- Huntsman Cancer Institute, University of Utah, Salt Lake City
- Kensaku Kawamoto
- Department of Biomedical Informatics, University of Utah, Salt Lake City
- Guilherme Del Fiol
- Department of Biomedical Informatics, University of Utah, Salt Lake City
- Saundra S. Buys
- Huntsman Cancer Institute, University of Utah, Salt Lake City
- Department of Internal Medicine, University of Utah, Salt Lake City
- Ophira Ginsburg
- Center for Global Health, National Cancer Institute, Rockville, Maryland
- Kimberly A. Kaphingst
- Huntsman Cancer Institute, University of Utah, Salt Lake City
- Department of Communication, University of Utah, Salt Lake City
17
Albert K, Delano M. Sex trouble: Sex/gender slippage, sex confusion, and sex obsession in machine learning using electronic health records. Patterns (N Y) 2022; 3:100534. [PMID: 36033589] [PMCID: PMC9403398] [DOI: 10.1016/j.patter.2022.100534]
Abstract
False assumptions that sex and gender are binary, static, and concordant are deeply embedded in the medical system. As machine learning researchers use medical data to build tools to solve novel problems, understanding how existing systems represent sex/gender incorrectly is necessary to avoid perpetuating harm. In this perspective, we identify and discuss three factors to consider when working with sex/gender in research: "sex/gender slippage," the frequent substitution of sex and sex-related terms for gender and vice versa; "sex confusion," the fact that any given sex variable holds many different potential meanings; and "sex obsession," the idea that the relevant variable for most inquiries related to sex/gender is sex assigned at birth. We then explore how these phenomena show up in medical machine learning research using electronic health records, with a specific focus on HIV risk prediction. Finally, we offer recommendations about how machine learning researchers can engage more carefully with questions of sex/gender.
Affiliation(s)
- Kendra Albert
- Cyberlaw Clinic, Harvard Law School, Cambridge, MA 02138, USA
- Maggie Delano
- Engineering Department, Swarthmore College, Swarthmore, PA 19146, USA
18
Čartolovni A, Tomičić A, Lazić Mosler E. Ethical, legal, and social considerations of AI-based medical decision-support tools: A scoping review. Int J Med Inform 2022; 161:104738. [PMID: 35299098] [DOI: 10.1016/j.ijmedinf.2022.104738]
Abstract
INTRODUCTION Recent developments in the field of Artificial Intelligence (AI) applied to healthcare promise to solve many of the existing global issues in advancing human health and managing global health challenges. This comprehensive review aims to surface not only the underlying ethical and legal implications but also the social implications (ELSI) that have been overlooked in recent reviews, while deserving equal attention in the development stage and certainly ahead of implementation in healthcare. It is intended to guide various stakeholders (e.g., designers, engineers, clinicians) in addressing the ELSI of AI at the design stage using the Ethics by Design (EbD) approach. METHODS The authors followed a systematised scoping methodology and searched the following databases: PubMed, Web of Science, Ovid, Scopus, IEEE Xplore, EBSCO Search (Academic Search Premier, CINAHL, PSYCINFO, APA PsycArticles, ERIC) for the ELSI of AI in healthcare through January 2021. Data were charted and synthesised, and the authors conducted a descriptive and thematic analysis of the collected data. RESULTS After reviewing 1108 papers, 94 were included in the final analysis. Our results show a growing interest in the academic community in the ELSI of AI. The main issues of concern identified in our analysis fall into four main clusters of impact: AI algorithms, physicians, patients, and healthcare in general. The most prevalent issues are patient safety, algorithmic transparency, lack of proper regulation, liability and accountability, impact on the patient-physician relationship, and governance of AI-empowered healthcare. CONCLUSIONS The results of our review confirm the potential of AI to significantly improve patient care, but the drawbacks to its implementation relate to complex ELSI that have yet to be addressed. Most ELSI refer to the impact on, and extension of, the reciprocal and fiduciary patient-physician relationship. With the integration of AI-based decision-making tools, a bilateral patient-physician relationship may shift into a trilateral one.
Collapse
Affiliation(s)
- Anto Čartolovni
- Digital Healthcare Ethics Laboratory (Digit-HeaL), Catholic University of Croatia, Ilica 242, 10 000 Zagreb, Croatia; School of Medicine, Catholic University of Croatia, Ilica 242, 10 000 Zagreb, Croatia.
| | - Ana Tomičić
- Digital Healthcare Ethics Laboratory (Digit-HeaL), Catholic University of Croatia, Ilica 242, 10 000 Zagreb, Croatia.
| | - Elvira Lazić Mosler
- School of Medicine, Catholic University of Croatia, Ilica 242, 10 000 Zagreb, Croatia; General Hospital Dr. Ivo Pedišić, Sisak, Croatia.
| |
Collapse
|
19
|
SHIFTing artificial intelligence to be responsible in healthcare: A systematic review. Soc Sci Med 2022; 296:114782. [DOI: 10.1016/j.socscimed.2022.114782] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2021] [Revised: 02/02/2022] [Accepted: 02/03/2022] [Indexed: 12/12/2022]
|
20
|
Leo CG, Tumolo MR, Sabina S, Colella R, Recchia V, Ponzini G, Fotiadis DI, Bodini A, Mincarone P. Health Technology Assessment for In Silico Medicine: Social, Ethical and Legal Aspects. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:ijerph19031510. [PMID: 35162529 PMCID: PMC8835251 DOI: 10.3390/ijerph19031510] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Received: 12/23/2021] [Revised: 01/25/2022] [Accepted: 01/26/2022] [Indexed: 12/28/2022]
Abstract
The application of in silico medicine is constantly growing in the prevention, diagnosis, and treatment of diseases. These technologies support medical decisions and self-management, and can reduce, refine, and partially replace real-world studies of medical technologies. In silico medicine may challenge some key principles: transparency and fairness of data usage; data privacy and protection across platforms and systems; data availability and quality; data integration and interoperability; intellectual property; data sharing; and equal accessibility for persons and populations. Several social, ethical, and legal issues may consequently arise from its adoption. In this work, we provide an overview of these issues along with practical suggestions for their assessment from a health technology assessment perspective. We performed a narrative review with a search of MEDLINE/PubMed, ISI Web of Knowledge, Scopus, and Google Scholar. The following key aspects emerge as general reflections with an impact at the operational level: cultural resistance, the level of expertise of users, the degree of patient involvement, infrastructural requirements, risks to health, respect for patients’ rights, potential discrimination in access to and use of the technology, and the intellectual property of innovations. Our analysis shows that several challenges still need to be debated for in silico medicine to realise its full potential in healthcare processes.
Collapse
Affiliation(s)
- Carlo Giacomo Leo
- Institute of Clinical Physiology, National Research Council, 73100 Lecce, Italy; (C.G.L.); (M.R.T.); (V.R.)
| | - Maria Rosaria Tumolo
- Institute of Clinical Physiology, National Research Council, 73100 Lecce, Italy; (C.G.L.); (M.R.T.); (V.R.)
- Department of Biological and Environmental Sciences and Technology, University of Salento, 73100 Lecce, Italy;
| | - Saverio Sabina
- Institute of Clinical Physiology, National Research Council, 73100 Lecce, Italy; (C.G.L.); (M.R.T.); (V.R.)
- Correspondence:
| | - Riccardo Colella
- Department of Biological and Environmental Sciences and Technology, University of Salento, 73100 Lecce, Italy;
| | - Virginia Recchia
- Institute of Clinical Physiology, National Research Council, 73100 Lecce, Italy; (C.G.L.); (M.R.T.); (V.R.)
| | - Giuseppe Ponzini
- Institute for Research on Population and Social Policies, National Research Council, 72100 Brindisi, Italy; (G.P.); (P.M.)
| | - Dimitrios Ioannis Fotiadis
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, 45110 Ioannina, Greece;
- Department of Biomedical Research, Institute of Molecular Biology and Biotechnology—Foundation for Research and Technology Hellas (IMBB-FORTH), 45115 Ioannina, Greece
| | - Antonella Bodini
- Institute for Applied Mathematics and Information Technologies “E. Magenes”, National Research Council, 20133 Milan, Italy;
| | - Pierpaolo Mincarone
- Institute for Research on Population and Social Policies, National Research Council, 72100 Brindisi, Italy; (G.P.); (P.M.)
| |
Collapse
|
21
|
de Hond AAH, Leeuwenberg AM, Hooft L, Kant IMJ, Nijman SWJ, van Os HJA, Aardoom JJ, Debray TPA, Schuit E, van Smeden M, Reitsma JB, Steyerberg EW, Chavannes NH, Moons KGM. Guidelines and quality criteria for artificial intelligence-based prediction models in healthcare: a scoping review. NPJ Digit Med 2022; 5:2. [PMID: 35013569 PMCID: PMC8748878 DOI: 10.1038/s41746-021-00549-7] [Citation(s) in RCA: 105] [Impact Index Per Article: 52.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2021] [Accepted: 12/13/2021] [Indexed: 12/23/2022] Open
Abstract
While the opportunities of machine learning (ML) and artificial intelligence (AI) in healthcare are promising, complex data-driven prediction models require careful quality and applicability assessment before they are applied and disseminated in daily practice. This scoping review aimed to identify actionable guidance for those closely involved in AI-based prediction model (AIPM) development, evaluation, and implementation, including software engineers, data scientists, and healthcare professionals, and to identify potential gaps in this guidance. We performed a scoping review of the relevant literature providing guidance or quality criteria regarding the development, evaluation, and implementation of AIPMs using a comprehensive multi-stage screening strategy. PubMed, Web of Science, and the ACM Digital Library were searched, and AI experts were consulted. Topics were extracted from the identified literature and summarized across the six phases at the core of this review: (1) data preparation, (2) AIPM development, (3) AIPM validation, (4) software development, (5) AIPM impact assessment, and (6) AIPM implementation into daily healthcare practice. From 2683 unique hits, 72 relevant guidance documents were identified. Substantial guidance was found for data preparation, AIPM development, and AIPM validation (phases 1-3), while the later phases (software development, impact assessment, and implementation) have clearly received less attention in the scientific literature. The six phases of the AIPM development, evaluation, and implementation cycle provide a framework for the responsible introduction of AI-based prediction models in healthcare. Additional domain- and technology-specific research may be necessary, and more practical experience with implementing AIPMs is needed to support further guidance.
Collapse
Affiliation(s)
- Anne A H de Hond
- Department of Information Technology and Digital Innovation, Leiden University Medical Center, Leiden, The Netherlands.
- Clinical AI Implementation and Research Lab, Leiden University Medical Center, Leiden, The Netherlands.
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands.
| | - Artuur M Leeuwenberg
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands.
| | - Lotty Hooft
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Ilse M J Kant
- Department of Information Technology and Digital Innovation, Leiden University Medical Center, Leiden, The Netherlands
- Clinical AI Implementation and Research Lab, Leiden University Medical Center, Leiden, The Netherlands
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | - Steven W J Nijman
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Hendrikus J A van Os
- Clinical AI Implementation and Research Lab, Leiden University Medical Center, Leiden, The Netherlands
- National eHealth Living Lab, Leiden, The Netherlands
| | - Jiska J Aardoom
- National eHealth Living Lab, Leiden, The Netherlands
- Department of Public Health and Primary Care, Leiden University Medical Center, Leiden, The Netherlands
| | - Thomas P A Debray
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Ewoud Schuit
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Maarten van Smeden
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Johannes B Reitsma
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Ewout W Steyerberg
- Clinical AI Implementation and Research Lab, Leiden University Medical Center, Leiden, The Netherlands
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | - Niels H Chavannes
- National eHealth Living Lab, Leiden, The Netherlands
- Department of Public Health and Primary Care, Leiden University Medical Center, Leiden, The Netherlands
| | - Karel G M Moons
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| |
Collapse
|
22
|
Chauhan C, Gullapalli RR. Ethics of AI in Pathology: Current Paradigms and Emerging Issues. THE AMERICAN JOURNAL OF PATHOLOGY 2021; 191:1673-1683. [PMID: 34252382 PMCID: PMC8485059 DOI: 10.1016/j.ajpath.2021.06.011] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/18/2021] [Revised: 06/18/2021] [Accepted: 06/24/2021] [Indexed: 02/06/2023]
Abstract
Deep learning has rapidly advanced artificial intelligence (AI) and algorithmic decision-making (ADM) paradigms, affecting many traditional fields of medicine, including pathology, a heavily data-centric medical specialty. The structured nature of pathology data repositories makes them highly attractive to AI researchers seeking to train deep learning models to improve health care delivery. Additionally, there are enormous financial incentives driving the adoption of AI and ADM, given the promise of increased efficiency in the health care delivery process. If implemented incorrectly or used unethically, AI may exacerbate existing health care inequities. There is an urgent need to harness the vast power of AI in an ethically and morally justifiable manner. This review explores the key issues involving AI ethics in pathology. Issues related to the ethical design of pathology AI studies and the potential risks associated with implementing AI and ADM within the pathology workflow are discussed. Three key foundational principles of ethical AI (transparency, accountability, and governance) are described in the context of pathology. The future practice of pathology must be guided by these principles. Pathologists should be aware of the potential of AI to deliver superlative health care as well as the ethical pitfalls associated with it. Finally, pathologists must have a seat at the table to drive the future implementation of ethical AI in the practice of pathology.
Collapse
Affiliation(s)
- Chhavi Chauhan
- American Society for Investigative Pathology, Rockville, Maryland
| | - Rama R Gullapalli
- Department of Pathology, University of New Mexico, Albuquerque, New Mexico; Department of Chemical and Biological Engineering, University of New Mexico, Albuquerque, New Mexico.
| |
Collapse
|
23
|
Goirand M, Austin E, Clay-Williams R. Implementing Ethics in Healthcare AI-Based Applications: A Scoping Review. SCIENCE AND ENGINEERING ETHICS 2021; 27:61. [PMID: 34480239 DOI: 10.1007/s11948-021-00336-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/21/2021] [Accepted: 08/04/2021] [Indexed: 06/13/2023]
Abstract
A number of Artificial Intelligence (AI) ethics frameworks have been published in the last six years in response to growing concerns posed by the adoption of AI in different sectors, including healthcare. While there is a strong culture of medical ethics in healthcare, AI-based Healthcare Applications (AIHA) are challenging existing ethical and regulatory frameworks. This scoping review explores how ethics frameworks have been implemented in AIHA, how these implementations have been evaluated, and whether they have been successful. AI-specific ethics frameworks in healthcare appear to have seen limited adoption, and they are mostly used in conjunction with other ethics frameworks. The operationalisation of ethics frameworks is a complex endeavour, with challenges at multiple levels: ethical principles, design, technology, organisation, and regulation. Strategies identified in this review include proactive, contextual, technological, checklist-based, organisational, and evidence-based approaches. While interdisciplinary approaches show promise, how an ethics framework is implemented in an AIHA is rarely reported, and greater transparency is needed for trustworthy AI.
Collapse
Affiliation(s)
- Magali Goirand
- Australian Institute of Health Innovation, Macquarie University, Sydney, Australia.
| | - Elizabeth Austin
- Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
| | - Robyn Clay-Williams
- Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
| |
Collapse
|
24
|
Martinez-Martin N, Greely HT, Cho MK. Ethical Development of Digital Phenotyping Tools for Mental Health Applications: Delphi Study. JMIR Mhealth Uhealth 2021; 9:e27343. [PMID: 34319252 PMCID: PMC8367187 DOI: 10.2196/27343] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2021] [Revised: 05/06/2021] [Accepted: 05/21/2021] [Indexed: 01/15/2023] Open
Abstract
BACKGROUND Digital phenotyping (also known as personal sensing, intelligent sensing, or body computing) involves the collection of biometric and personal data in situ from digital devices, such as smartphones, wearables, or social media, to measure behavior or other health indicators. The collected data are analyzed to generate moment-by-moment quantification of a person's mental state and potentially predict future mental states. Digital phenotyping projects incorporate data from multiple sources, such as electronic health records, biometric scans, or genetic testing. As digital phenotyping tools can be used to study and predict behavior, they are of increasing interest for a range of consumer, government, and health care applications. In clinical care, digital phenotyping is expected to improve mental health diagnoses and treatment. At the same time, mental health applications of digital phenotyping present significant areas of ethical concern, particularly in terms of privacy and data protection, consent, bias, and accountability. OBJECTIVE This study aims to develop consensus statements regarding key areas of ethical guidance for mental health applications of digital phenotyping in the United States. METHODS We used a modified Delphi technique to identify the emerging ethical challenges posed by digital phenotyping for mental health applications and to formulate guidance for addressing these challenges. Experts in digital phenotyping, data science, mental health, law, and ethics participated as panelists in the study. The panel arrived at consensus recommendations through an iterative process involving interviews and surveys. The panelists focused primarily on clinical applications for digital phenotyping for mental health but also included recommendations regarding transparency and data protection to address potential areas of misuse of digital phenotyping data outside of the health care domain. RESULTS The findings of this study showed strong agreement related to these ethical issues in the development of mental health applications of digital phenotyping: privacy, transparency, consent, accountability, and fairness. Consensus regarding the recommendation statements was strongest when the guidance was stated broadly enough to accommodate a range of potential applications. The privacy and data protection issues that the Delphi participants found particularly critical to address related to the perceived inadequacies of current regulations and frameworks for protecting sensitive personal information and the potential for sale and analysis of personal data outside of health systems. CONCLUSIONS The Delphi study found agreement on a number of ethical issues to prioritize in the development of digital phenotyping for mental health applications. The Delphi consensus statements identified general recommendations and principles regarding the ethical application of digital phenotyping to mental health. As digital phenotyping for mental health is implemented in clinical care, there remains a need for empirical research and consultation with relevant stakeholders to further understand and address relevant ethical issues.
Collapse
Affiliation(s)
- Nicole Martinez-Martin
- Center for Biomedical Ethics, School of Medicine, Stanford University, Stanford, CA, United States
| | | | - Mildred K Cho
- Center for Biomedical Ethics, School of Medicine, Stanford University, Stanford, CA, United States
| |
Collapse
|
25
|
Antes AL, Burrous S, Sisk BA, Schuelke MJ, Keune JD, DuBois JM. Exploring perceptions of healthcare technologies enabled by artificial intelligence: an online, scenario-based survey. BMC Med Inform Decis Mak 2021; 21:221. [PMID: 34284756 PMCID: PMC8293482 DOI: 10.1186/s12911-021-01586-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2020] [Accepted: 07/02/2021] [Indexed: 01/14/2023] Open
Abstract
Background Healthcare is expected to increasingly integrate technologies enabled by artificial intelligence (AI) into patient care. Understanding perceptions of these tools is essential to successful development and adoption. This exploratory study gauged participants’ level of openness, concern, and perceived benefit associated with AI-driven healthcare technologies. We also explored socio-demographic, health-related, and psychosocial correlates of these perceptions. Methods We developed a measure depicting six AI-driven technologies that either diagnose, predict, or suggest treatment. We administered the measure via an online survey to adults (N = 936) in the United States using MTurk, a crowdsourcing platform. Participants indicated their level of openness to using the AI technology in the healthcare scenario. Items reflecting potential concerns and benefits associated with each technology accompanied the scenarios. Participants rated the extent to which the statements of concerns and benefits influenced their perception of favorability toward the technology. Participants completed measures of socio-demographics, health variables, and psychosocial variables such as trust in the healthcare system and trust in technology. Exploratory and confirmatory factor analyses of the concern and benefit items identified two factors representing overall level of concern and perceived benefit. Descriptive analyses examined levels of openness, concern, and perceived benefit. Correlational analyses explored associations of socio-demographic, health, and psychosocial variables with openness, concern, and benefit scores, while multivariable regression models examined these relationships concurrently. Results Participants were moderately open to AI-driven healthcare technologies (M = 3.1/5.0 ± 0.9), but openness varied by type of application, and the statements of concerns and benefits swayed views. Trust in the healthcare system and trust in technology were the strongest, most consistent correlates of openness, concern, and perceived benefit. Most other socio-demographic, health-related, and psychosocial variables were less strongly associated, or not associated at all, but multivariable models indicated that some personality characteristics (e.g., conscientiousness and agreeableness) and socio-demographics (e.g., full-time employment, age, sex, and race) were modestly related to perceptions. Conclusions Participants’ openness appears tenuous, suggesting that early promotion strategies and experiences with novel AI technologies may strongly influence views, especially if implementation of AI technologies increases or undermines trust. The exploratory nature of these findings warrants additional research. Supplementary Information The online version contains supplementary material available at 10.1186/s12911-021-01586-8.
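As a rough illustration of the analysis style this abstract describes (not the authors' code), the Python sketch below extracts two latent factors from synthetic concern/benefit item ratings and then regresses an openness score on a trust measure plus the factor scores. All data, loadings, and names are illustrative assumptions.

```python
# Hedged sketch of the described analysis: exploratory factor extraction on
# concern/benefit items, then regression of openness on trust + factor scores.
# Synthetic data; all variable names and effect sizes are assumptions.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n = 936
concern = rng.normal(0, 1, n)                  # latent concern
benefit = rng.normal(0, 1, n)                  # latent perceived benefit
# Six observed items: three load on concern, three on benefit.
items = np.column_stack(
    [concern + rng.normal(0, 0.5, n) for _ in range(3)] +
    [benefit + rng.normal(0, 0.5, n) for _ in range(3)])

fa = FactorAnalysis(n_components=2, rotation="varimax").fit(items)
scores = fa.transform(items)                   # per-respondent factor scores

trust_tech = rng.normal(0, 1, n)               # stand-in trust measure
openness = 0.4 * trust_tech - 0.3 * scores[:, 0] + rng.normal(0, 1, n)
reg = LinearRegression().fit(np.column_stack([trust_tech, scores]), openness)
print("coefficients (trust, factor 1, factor 2):", reg.coef_.round(2))
```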
Collapse
Affiliation(s)
- Alison L Antes
- Bioethics Research Center, Washington University School of Medicine in St. Louis, St. Louis, MO, USA.
| | - Sara Burrous
- Bioethics Research Center, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
| | - Bryan A Sisk
- Department of Pediatrics, Division of Hematology and Oncology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
| | - Matthew J Schuelke
- Division of Biostatistics, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
| | - Jason D Keune
- Departments of Surgery and Health Care Ethics, Bander Center for Medical Business Ethics, Saint Louis University, St. Louis, MO, USA
| | - James M DuBois
- Bioethics Research Center, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
| |
Collapse
|
26
|
Abstract
Machine learning models are built using training data, which is collected from human experience and is prone to bias. Humans demonstrate cognitive bias in their thinking and behavior, which is ultimately reflected in the collected data. From Amazon’s hiring system, built using ten years of human hiring experience, to judicial systems trained on human judging practices, such systems all include some element of bias. The best machine learning models are said to mimic humans’ cognitive ability, and thus such models are also inclined toward bias. Detecting and evaluating bias is therefore an important step toward more explainable models. In this work, we aim to explain bias in learning models in relation to humans’ cognitive bias and propose a wrapper technique to detect and evaluate bias in machine learning models using an openly accessible dataset from the UCI Machine Learning Repository. In this dataset, the potentially biased attributes (PBAs) are gender and race. The study introduces the concept of alternation functions to swap the values of PBAs and evaluates the impact on predictions using KL divergence. Results show females and Asians to be associated with low wages, raising open research questions for the research community to consider.
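The wrapper technique lends itself to a compact illustration. Below is a minimal Python sketch of our reading of it: train a model, apply an alternation function that swaps the values of a PBA, and measure how far the prediction distribution moves using KL divergence. The synthetic data, variable names, and binning choices are illustrative assumptions, not the authors' code (their study uses a UCI wage dataset with gender and race as PBAs).

```python
# Hedged sketch: detect bias by swapping a potentially biased attribute (PBA)
# with an "alternation function" and comparing prediction distributions via
# KL divergence. Synthetic data; all names and parameters are assumptions.
import numpy as np
from scipy.stats import entropy            # entropy(p, q) computes KL(p || q)
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
gender = rng.integers(0, 2, n)             # PBA: 0 / 1
skill = rng.normal(0, 1, n)
# Inject bias: the outcome depends on the PBA, not only on skill.
y = (skill + 0.8 * gender + rng.normal(0, 1, n) > 0.5).astype(int)
X = np.column_stack([gender, skill])

model = LogisticRegression().fit(X, y)

def alternate(X, pba_col=0):
    """Alternation function: flip the binary PBA, leave other features alone."""
    X_alt = X.copy()
    X_alt[:, pba_col] = 1 - X_alt[:, pba_col]
    return X_alt

p_orig = model.predict_proba(X)[:, 1]
p_alt = model.predict_proba(alternate(X))[:, 1]

# Histogram both score distributions on a shared grid, then compute KL.
bins = np.linspace(0, 1, 21)
h_orig, _ = np.histogram(p_orig, bins=bins, density=True)
h_alt, _ = np.histogram(p_alt, bins=bins, density=True)
eps = 1e-9                                 # avoid log(0) in the KL term
kl = entropy(h_orig + eps, h_alt + eps)
print(f"KL divergence after swapping the PBA: {kl:.4f}")
# A near-zero divergence suggests predictions are insensitive to the PBA;
# a large value flags bias tied to that attribute.
```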
Collapse
|
27
|
Impact of Different Approaches to Preparing Notes for Analysis With Natural Language Processing on the Performance of Prediction Models in Intensive Care. Crit Care Explor 2021; 3:e0450. [PMID: 34136824 PMCID: PMC8202578 DOI: 10.1097/cce.0000000000000450] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open
Abstract
OBJECTIVES: To evaluate whether different approaches to note text preparation (known as preprocessing) can impact machine learning model performance in the case of ICU mortality prediction. DESIGN: Clinical note text was used to build machine learning models for adults admitted to the ICU. The preprocessing strategies studied were none (raw text), cleaning text, stemming, term frequency-inverse document frequency (TF-IDF) vectorization, and creation of n-grams. Model performance was assessed by the area under the receiver operating characteristic curve (AUROC). Models were trained and internally validated on University of California San Francisco data using 10-fold cross validation, then externally validated on Beth Israel Deaconess Medical Center data. SETTING: ICUs at University of California San Francisco and Beth Israel Deaconess Medical Center. SUBJECTS: Ten thousand patients in the University of California San Francisco training and internal testing dataset and 27,058 patients in the external validation dataset from Beth Israel Deaconess Medical Center. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Mortality rates at Beth Israel Deaconess Medical Center and University of California San Francisco were 10.9% and 7.4%, respectively. Data are presented as AUROC (95% CI) for models validated at University of California San Francisco and AUROC for models validated at Beth Israel Deaconess Medical Center. Models built and trained on University of California San Francisco data for the prediction of in-hospital mortality improved from the raw note text model (AUROC, 0.84; CI, 0.80–0.89) to the TF-IDF model (AUROC, 0.89; CI, 0.85–0.94). When the models developed at University of California San Francisco were applied to Beth Israel Deaconess Medical Center data, there was a similar increase in performance from raw note text (AUROC, 0.72) to the TF-IDF model (AUROC, 0.83). CONCLUSIONS: Differences in preprocessing strategies for note text impacted model discrimination. A preprocessing pathway combining cleaning, stemming, and TF-IDF vectorization yielded the greatest improvement in model performance. Further study is needed, with particular emphasis on how to manage author implicit bias present in note text, before natural language processing algorithms are implemented in the clinical setting.
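For readers who want to reproduce the shape of this comparison, here is a hedged Python sketch (not the study's actual pipeline) of the best-performing strategy: cleaned and stemmed text fed into TF-IDF n-grams and scored by 10-fold cross-validated AUROC. The toy notes, cleaning rules, and logistic model are illustrative assumptions.

```python
# Hedged sketch of a cleaning + stemming + TF-IDF preprocessing pathway,
# evaluated by 10-fold cross-validated AUROC. Toy data; not the study's code.
import re
import numpy as np
from nltk.stem import PorterStemmer  # pip install nltk
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()

def clean_and_stem(note: str) -> str:
    """Cleaning + stemming: lowercase, strip non-letters, stem each token."""
    tokens = re.sub(r"[^a-z\s]", " ", note.lower()).split()
    return " ".join(stemmer.stem(t) for t in tokens)

# Toy stand-ins for ICU notes and in-hospital mortality labels.
rng = np.random.default_rng(0)
risky = ["sepsis", "intubated", "vasopressors", "unresponsive"]
benign = ["stable", "ambulating", "tolerating diet", "improving"]
notes, died = [], []
for _ in range(400):
    label = int(rng.integers(0, 2))
    pool = risky if label else benign
    notes.append("Patient is " + " ".join(rng.choice(pool, 5)) + ".")
    died.append(label)

pipeline = make_pipeline(
    TfidfVectorizer(preprocessor=clean_and_stem,
                    ngram_range=(1, 2),      # unigrams + bigrams
                    max_features=50_000),
    LogisticRegression(max_iter=1000),
)
# 10-fold cross-validated AUROC, mirroring the internal validation design.
auroc = cross_val_score(pipeline, notes, died, cv=10, scoring="roc_auc")
print(f"mean AUROC: {np.mean(auroc):.2f}")
```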
Collapse
|
28
|
Petersen C, Smith J, Freimuth RR, Goodman KW, Jackson GP, Kannry J, Liu H, Madhavan S, Sittig DF, Wright A. Recommendations for the safe, effective use of adaptive CDS in the US healthcare system: an AMIA position paper. J Am Med Inform Assoc 2021; 28:677-684. [PMID: 33447854 DOI: 10.1093/jamia/ocaa319] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2020] [Accepted: 12/01/2020] [Indexed: 02/07/2023] Open
Abstract
The development and implementation of clinical decision support (CDS) that trains itself and adapts its algorithms based on new data (here referred to as Adaptive CDS) present unique challenges and considerations. Although Adaptive CDS represents an expected progression from earlier work, the activities needed to appropriately manage and support the establishment and evolution of Adaptive CDS require new, coordinated initiatives and oversight that do not currently exist. In this AMIA position paper, the authors describe current and emerging challenges to the safe use of Adaptive CDS and lay out recommendations for the effective management and monitoring of Adaptive CDS.
Collapse
Affiliation(s)
- Carolyn Petersen
- Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota, USA
| | - Jeffery Smith
- The Office of the National Coordinator for Health Information Technology, Washington, DC, USA
| | - Robert R Freimuth
- Division of Digital Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA
| | - Kenneth W Goodman
- Institute for Bioethics and Health Policy, University of Miami Miller School of Medicine, Miami, Florida, USA
| | - Gretchen Purcell Jackson
- IBM Watson Health, Cambridge, Massachusetts, USA; Department of Pediatric Surgery, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Joseph Kannry
- Mount Sinai Health System, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Hongfang Liu
- Division of Digital Health Sciences, Mayo Clinic, Rochester, Minnesota, USA
| | - Subha Madhavan
- Department of Oncology, Georgetown Lombardi Comprehensive Cancer Center, Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington, DC, USA
| | - Dean F Sittig
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, UT-Memorial Hermann Center for Healthcare Quality & Safety, Houston, Texas, USA
| | - Adam Wright
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| |
Collapse
|
29
|
Jackson BR, Ye Y, Crawford JM, Becich MJ, Roy S, Botkin JR, de Baca ME, Pantanowitz L. The Ethics of Artificial Intelligence in Pathology and Laboratory Medicine: Principles and Practice. Acad Pathol 2021; 8:2374289521990784. [PMID: 33644301 PMCID: PMC7894680 DOI: 10.1177/2374289521990784] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2020] [Revised: 11/24/2020] [Accepted: 12/28/2020] [Indexed: 12/24/2022] Open
Abstract
Growing numbers of artificial intelligence applications are being developed and applied to pathology and laboratory medicine. These technologies introduce risks and benefits that must be assessed and managed through the lens of ethics. This article describes how long-standing principles of medical and scientific ethics can be applied to artificial intelligence using examples from pathology and laboratory medicine.
Collapse
Affiliation(s)
- Brian R. Jackson
- Department of Pathology, University of Utah School of Medicine, Salt Lake City, UT, USA
- ARUP Laboratories, Salt Lake City, UT, USA
| | - Ye Ye
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - James M. Crawford
- Department of Pathology and Laboratory Medicine, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
| | - Michael J. Becich
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - Somak Roy
- Division of Pathology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
| | - Jeffrey R. Botkin
- Department of Pediatrics, University of Utah School of Medicine, Salt Lake City, UT, USA
| | | | | |
Collapse
|
30
|
Smith J. Setting the agenda: an informatics-led policy framework for adaptive CDS. J Am Med Inform Assoc 2020; 27:1831-1833. [PMID: 33301025 PMCID: PMC7727380 DOI: 10.1093/jamia/ocaa239] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2020] [Indexed: 03/31/2024] Open
Affiliation(s)
- Jeffery Smith
- American Medical Informatics Association, Bethesda, Maryland, USA
| |
Collapse
|
31
|
Pfohl SR, Foryciarz A, Shah NH. An empirical characterization of fair machine learning for clinical risk prediction. J Biomed Inform 2020; 113:103621. [PMID: 33220494 DOI: 10.1016/j.jbi.2020.103621] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2020] [Revised: 10/06/2020] [Accepted: 11/05/2020] [Indexed: 11/19/2022]
Abstract
The use of machine learning to guide clinical decision making has the potential to worsen existing health disparities. Several recent works frame the problem as one of algorithmic fairness, a framework that has attracted considerable attention and criticism. However, the appropriateness of this framework is unclear due to both ethical and technical considerations, the latter of which include trade-offs between measures of fairness and model performance that are not well understood for predictive models of clinical outcomes. To inform the ongoing debate, we conduct an empirical study to characterize the impact of penalizing group fairness violations on an array of measures of model performance and group fairness. We repeat the analysis across multiple observational healthcare databases, clinical outcomes, and sensitive attributes. We find that procedures penalizing differences between the distributions of predictions across groups induce nearly universal degradation of multiple performance metrics within groups. On examining the secondary impact of these procedures, we observe heterogeneity in their effect on measures of fairness in calibration and ranking across experimental conditions. Beyond the reported trade-offs, we emphasize that analyses of algorithmic fairness in healthcare lack the contextual grounding and causal awareness necessary to reason about the mechanisms that lead to health disparities, as well as about the potential of algorithmic fairness methods to counteract those mechanisms. In light of these limitations, we encourage researchers building predictive models for clinical use to step outside the algorithmic fairness frame and engage critically with the broader sociotechnical context surrounding the use of machine learning in healthcare.
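To make the penalization idea concrete, here is a hedged numpy sketch of one simple instance of the general approach: logistic regression whose loss adds a penalty on the squared gap between groups' mean predicted risks. The data, the specific penalty form, and the hyperparameters are illustrative assumptions; the study evaluates a broader array of distribution-matching penalties and metrics.

```python
# Hedged sketch: logistic regression with an added group-fairness penalty,
# lam * (mean risk in group 0 - mean risk in group 1)^2, fit by gradient
# descent. Synthetic data; not the paper's exact procedure.
import numpy as np

rng = np.random.default_rng(1)
n, lam, lr = 2000, 5.0, 0.5
g = rng.integers(0, 2, n)                      # sensitive attribute
x = rng.normal(0, 1, (n, 3)) + 0.5 * g[:, None]
w_true = np.array([1.0, -1.0, 0.5])
y = (x @ w_true + rng.normal(0, 1, n) > 0).astype(float)

sigmoid = lambda z: 1 / (1 + np.exp(-z))
w = np.zeros(3)
for _ in range(500):
    p = sigmoid(x @ w)
    grad_bce = x.T @ (p - y) / n               # standard logistic gradient
    # Gradient of lam * gap^2 w.r.t. w, where gap is the mean-risk difference.
    s = p * (1 - p)                            # dp/dz for the sigmoid
    m0 = (x[g == 0] * s[g == 0, None]).mean(axis=0)
    m1 = (x[g == 1] * s[g == 1, None]).mean(axis=0)
    gap = p[g == 0].mean() - p[g == 1].mean()
    grad_fair = 2 * lam * gap * (m0 - m1)
    w -= lr * (grad_bce + grad_fair)

p = sigmoid(x @ w)
print(f"mean risk gap after penalty: {p[g == 0].mean() - p[g == 1].mean():+.3f}")
# As the abstract reports, shrinking this gap typically degrades within-group
# performance (calibration, discrimination), a trade-off worth measuring.
```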
Collapse
Affiliation(s)
- Stephen R Pfohl
- Stanford Center for Biomedical Informatics Research, Stanford University, 1265 Welch Road, Stanford, CA 94305, United States of America.
| | - Agata Foryciarz
- Stanford Center for Biomedical Informatics Research, Stanford University, 1265 Welch Road, Stanford, CA 94305, United States of America; Computer Science Department, Stanford University, 353 Jane Stanford Way, Stanford, CA 94305, United States of America.
| | - Nigam H Shah
- Stanford Center for Biomedical Informatics Research, Stanford University, 1265 Welch Road, Stanford, CA 94305, United States of America.
| |
Collapse
|