1
Weber S, Wyszynski M, Godefroid M, Plattfaut R, Niehaves B. How do medical professionals make sense (or not) of AI? A social-media-based computational grounded theory study and an online survey. Comput Struct Biotechnol J 2024;24:146-159. PMID: 38434249; PMCID: PMC10904922; DOI: 10.1016/j.csbj.2024.02.009.
Abstract
To investigate the opinions and attitudes of medical professionals toward adopting AI-enabled healthcare technologies in their daily work, we used a mixed-methods approach. Study 1 employed a qualitative computational grounded theory approach, analyzing 181 Reddit threads from the r/medicine subreddit. Using an unsupervised machine learning clustering method, we identified three key themes: (1) consequences of AI, (2) the physician-AI relationship, and (3) a proposed way forward. Reddit posts related to the first two themes indicated that medical professionals' fear of being replaced by AI, and skepticism toward AI, played a major role in the arguments. Moreover, the results suggest that this fear is driven by little or moderate knowledge about AI. Posts related to the third theme focused on factual discussions about how AI and medicine must be designed to become broadly adopted in health care. Study 2 quantitatively examined the relationship between fear of AI, knowledge about AI, and medical professionals' intention to use AI-enabled technologies in more detail. Results based on a sample of 223 medical professionals who participated in the online survey revealed that the intention to use AI technologies increases with knowledge about AI and that this effect is moderated by the fear of being replaced by AI.
Affiliation(s)
- Sebastian Weber
- University of Bremen, Digital Public, Bibliothekstr. 1, 28359 Bremen, Germany
- Marc Wyszynski
- University of Bremen, Digital Public, Bibliothekstr. 1, 28359 Bremen, Germany
- Marie Godefroid
- University of Siegen, Information Systems, Kohlbettstr. 15, 57072 Siegen, Germany
- Ralf Plattfaut
- University of Duisburg-Essen, Information Systems and Transformation Management, Universitätsstr. 9, 45141 Essen, Germany
- Bjoern Niehaves
- University of Bremen, Digital Public, Bibliothekstr. 1, 28359 Bremen, Germany
2
Chen H, Ma X, Rives H, Serpedin A, Yao P, Rameau A. Trust in Machine Learning Driven Clinical Decision Support Tools Among Otolaryngologists. Laryngoscope 2024;134:2799-2804. PMID: 38230948; DOI: 10.1002/lary.31260.
Abstract
BACKGROUND Machine learning driven clinical decision support tools (ML-CDST) are on the verge of being integrated into clinical settings, including in Otolaryngology-Head & Neck Surgery. In this study, we investigated whether such CDST may influence otolaryngologists' diagnostic judgement. METHODS Otolaryngologists were recruited virtually across the United States for this experiment on human-AI interaction. Participants were shown 12 different video-stroboscopic exams from patients with previously diagnosed laryngopharyngeal reflux or vocal fold paresis and asked to determine the presence of disease. They were then exposed to a random diagnosis purportedly resulting from an ML-CDST and given the opportunity to revise their diagnosis. The ML-CDST output was presented with no explanation, a general explanation, or a specific explanation of its logic. The ML-CDST impact on diagnostic judgement was assessed with McNemar's test. RESULTS Forty-five participants were recruited. When participants reported less confidence (268 observations), they were significantly (p = 0.001) more likely to change their diagnostic judgement after exposure to ML-CDST output compared to when they reported more confidence (238 observations). Participants were more likely to change their diagnostic judgement when presented with a specific explanation of the CDST logic (p = 0.048). CONCLUSIONS Our study suggests that otolaryngologists are susceptible to accepting ML-CDST diagnostic recommendations, especially when less confident. Otolaryngologists' trust in ML-CDST output is increased when accompanied with a specific explanation of its logic. LEVEL OF EVIDENCE 2 Laryngoscope, 134:2799-2804, 2024.
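McNemar's test, used above to assess whether exposure to ML-CDST output changed diagnostic judgement, compares paired binary outcomes and depends only on the discordant counts of the 2x2 table. A minimal sketch of the exact (binomial) form, with made-up counts for illustration rather than the study's data:

```python
from math import comb

def mcnemar_exact(b: int, c: int) -> float:
    """Exact McNemar's test p-value for paired binary data.

    b and c are the discordant cell counts of the 2x2 table
    (cases where the two paired conditions disagree). Under H0
    the discordant outcomes are Binomial(b + c, 0.5), so the
    two-sided p-value doubles the smaller tail.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)  # cap at 1 for symmetric tables

# Hypothetical counts: 15 clinicians changed their diagnosis only
# after seeing the AI output, 4 changed it only without it.
print(round(mcnemar_exact(15, 4), 4))
```

With these invented counts the asymmetry is unlikely under chance, which is the same logic the study applies to its 268 low-confidence and 238 high-confidence observations.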
Affiliation(s)
- Hannah Chen
- Sean Parker Institute for the Voice, Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medicine, New York, New York, USA
- Xiaoyue Ma
- Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medical College, New York, New York, USA
- Hal Rives
- Sean Parker Institute for the Voice, Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medicine, New York, New York, USA
- Aisha Serpedin
- Sean Parker Institute for the Voice, Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medicine, New York, New York, USA
- Peter Yao
- Sean Parker Institute for the Voice, Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medicine, New York, New York, USA
- Anaïs Rameau
- Sean Parker Institute for the Voice, Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medicine, New York, New York, USA
3
Kotter E, Pinto Dos Santos D. [Ethics and artificial intelligence]. Radiologie (Heidelb) 2024;64:498-502. PMID: 38499692; DOI: 10.1007/s00117-024-01286-0.
Abstract
The introduction of artificial intelligence (AI) into radiology promises to enhance efficiency and improve diagnostic accuracy, yet it also raises manifold ethical questions. These include data protection issues, the future role of radiologists, liability when using AI systems, and the avoidance of bias. To prevent data bias, the datasets need to be compiled carefully and to be representative of the target population. Accordingly, the upcoming European Union AI Act sets particularly high requirements for the datasets used in training medical AI systems. Cognitive bias occurs when radiologists place too much trust in the results provided by AI systems (overreliance). To date, diagnostic AI systems have been used almost exclusively as "second look" systems. If diagnostic AI systems are to be used in the future as "first look" systems or even as autonomous AI systems in order to enhance efficiency in radiology, the question of liability needs to be addressed, comparable to liability for autonomous driving. Such use of AI would also significantly change the role of radiologists.
Affiliation(s)
- Elmar Kotter
- Klinik für Diagnostische und Interventionelle Radiologie, Universitätsklinikum Freiburg, Hugstetterstr. 55, 79106, Freiburg, Deutschland
- Daniel Pinto Dos Santos
- Institut für Diagnostische und Interventionelle Radiologie, Uniklinik Köln, Kerpener Str. 62, 50937, Köln, Deutschland
- Institut für Diagnostische und Interventionelle Radiologie, Universitätsklinik Frankfurt, Theodor-Stern-Kai 7, 60596, Frankfurt am Main, Deutschland
4
Hasani AM, Singh S, Zahergivar A, Ryan B, Nethala D, Bravomontenegro G, Mendhiratta N, Ball M, Farhadi F, Malayeri A. Evaluating the performance of Generative Pre-trained Transformer-4 (GPT-4) in standardizing radiology reports. Eur Radiol 2024;34:3566-3574. PMID: 37938381; DOI: 10.1007/s00330-023-10384-x.
Abstract
OBJECTIVE Radiology reporting is an essential component of clinical diagnosis and decision-making. With the advent of advanced artificial intelligence (AI) models like GPT-4 (Generative Pre-trained Transformer 4), there is growing interest in evaluating their potential for optimizing or generating radiology reports. This study aimed to compare the quality and content of radiologist-generated and GPT-4 AI-generated radiology reports. METHODS A comparative study design was employed in the study, where a total of 100 anonymized radiology reports were randomly selected and analyzed. Each report was processed by GPT-4, resulting in the generation of a corresponding AI-generated report. Quantitative and qualitative analysis techniques were utilized to assess similarities and differences between the two sets of reports. RESULTS The AI-generated reports showed comparable quality to radiologist-generated reports in most categories. Significant differences were observed in clarity (p = 0.027), ease of understanding (p = 0.023), and structure (p = 0.050), favoring the AI-generated reports. AI-generated reports were more concise, with 34.53 fewer words and 174.22 fewer characters on average, but had greater variability in sentence length. Content similarity was high, with an average Cosine Similarity of 0.85, Sequence Matcher Similarity of 0.52, BLEU Score of 0.5008, and BERTScore F1 of 0.8775. CONCLUSION The results of this proof-of-concept study suggest that GPT-4 can be a reliable tool for generating standardized radiology reports, offering potential benefits such as improved efficiency, better communication, and simplified data extraction and analysis. However, limitations and ethical implications must be addressed to ensure the safe and effective implementation of this technology in clinical practice. 
CLINICAL RELEVANCE STATEMENT The findings of this study suggest that GPT-4 (Generative Pre-trained Transformer 4), an advanced AI model, has the potential to significantly contribute to the standardization and optimization of radiology reporting, offering improved efficiency and communication in clinical practice. KEY POINTS • Large language model-generated radiology reports exhibited high content similarity and moderate structural resemblance to radiologist-generated reports. • Performance metrics highlighted the strong matching of word selection and order, as well as high semantic similarity between AI and radiologist-generated reports. • Large language model demonstrated potential for generating standardized radiology reports, improving efficiency and communication in clinical settings.
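The content-similarity metrics reported above (Cosine Similarity, Sequence Matcher Similarity) can be illustrated with a short sketch using only the Python standard library. The two sample report sentences are invented, and a real evaluation would operate on full reports, typically with TF-IDF weighting rather than raw token counts:

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity over raw token counts (bag of words)."""
    va = Counter(re.findall(r"\w+", a.lower()))
    vb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

# Invented one-line "reports" standing in for radiologist vs. AI text.
radiologist = "No focal consolidation. Heart size is normal."
generated = "Heart size is normal. No focal consolidation is seen."

print(round(cosine_similarity(radiologist, generated), 2))
# SequenceMatcher captures order-sensitive overlap, which is why the
# paper's 0.52 sequence score sits well below its 0.85 cosine score.
print(round(SequenceMatcher(None, radiologist, generated).ratio(), 2))
```

The gap between the two numbers mirrors the paper's finding: high content overlap (cosine) can coexist with only moderate structural resemblance (sequence matching).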
Affiliation(s)
- Amir M Hasani
- Laboratory of Translation Research, National Heart Blood Lung Institute, NIH, Bethesda, MD, USA
- Shiva Singh
- Radiology & Imaging Sciences Department, Clinical Center, NIH, Bethesda, MD, USA
- Aryan Zahergivar
- Radiology & Imaging Sciences Department, Clinical Center, NIH, Bethesda, MD, USA
- Beth Ryan
- Urology Oncology Branch, National Cancer Institute, NIH, Bethesda, MD, USA
- Daniel Nethala
- Urology Oncology Branch, National Cancer Institute, NIH, Bethesda, MD, USA
- Neil Mendhiratta
- Urology Oncology Branch, National Cancer Institute, NIH, Bethesda, MD, USA
- Mark Ball
- Urology Oncology Branch, National Cancer Institute, NIH, Bethesda, MD, USA
- Faraz Farhadi
- Radiology & Imaging Sciences Department, Clinical Center, NIH, Bethesda, MD, USA
- Ashkan Malayeri
- Radiology & Imaging Sciences Department, Clinical Center, NIH, Bethesda, MD, USA
5
Yuan W, Du Z, Han S. Semi-supervised skin cancer diagnosis based on self-feedback threshold focal learning. Discov Oncol 2024;15:180. PMID: 38776027; PMCID: PMC11111630; DOI: 10.1007/s12672-024-01043-8.
Abstract
The worldwide prevalence of skin cancer necessitates accurate diagnosis to alleviate public health burdens. Although the application of artificial intelligence in image analysis and pattern recognition has improved the accuracy and efficiency of early skin cancer diagnosis, existing supervised learning methods are limited by their reliance on large amounts of labeled data. To overcome the limitations of data labeling and enhance the performance of diagnostic models, this study proposes a semi-supervised skin cancer diagnostic model based on Self-feedback Threshold Focal Learning (STFL), capable of using partially labeled data and a large number of unlabeled medical images to train models for unseen scenarios. The proposed model dynamically adjusts the selection threshold for unlabeled samples during training, effectively filtering reliable unlabeled samples, and uses focal learning to mitigate the impact of class imbalance in further training. The study is experimentally validated on the HAM10000 dataset, which includes images of various types of skin lesions, with experiments conducted across different scales of labeled samples. With just 500 annotated samples, the model demonstrates robust performance (0.77 accuracy, 0.6408 Kappa, 0.77 recall, 0.7426 precision, and 0.7462 F1-score), showcasing its efficiency with limited labeled data. Further, comprehensive testing validates the semi-supervised model's significant advancements in diagnostic accuracy and efficiency, underscoring the value of integrating unlabeled data. This model offers a new perspective on medical image processing and contributes robust scientific support for the early diagnosis and treatment of skin cancer.
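The abstract's core mechanism, a dynamically adjusted confidence threshold for accepting pseudo-labels plus focal loss to handle class imbalance, can be sketched roughly as follows. This is a simplified, hypothetical rendering: the update rule, constants, and sample confidences are illustrative, not the authors' exact STFL formulation.

```python
import math

def focal_loss(p: float, gamma: float = 2.0) -> float:
    """Focal loss for a correctly-labeled example with predicted
    probability p: -(1 - p)^gamma * log(p). Confident (easy)
    examples are down-weighted, focusing training on hard ones."""
    p = min(max(p, 1e-7), 1 - 1e-7)
    return -((1 - p) ** gamma) * math.log(p)

def select_pseudo_labels(confidences, threshold):
    """Keep only unlabeled samples whose max class probability
    clears the current threshold."""
    return [i for i, c in enumerate(confidences) if c >= threshold]

def update_threshold(threshold, accept_rate, target_rate=0.5, step=0.02):
    """Self-feedback step: if too many samples were accepted,
    raise the bar; otherwise lower it (clamped to [0.5, 0.99])."""
    if accept_rate > target_rate:
        threshold += step
    else:
        threshold -= step
    return min(0.99, max(0.5, threshold))

# One illustrative training step over invented model confidences.
conf = [0.95, 0.62, 0.88, 0.45, 0.99, 0.71]
thr = 0.8
kept = select_pseudo_labels(conf, thr)
thr = update_threshold(thr, len(kept) / len(conf))
print(kept, round(thr, 2))
```

The feedback loop keeps the pseudo-label pool from collapsing (threshold too high) or drowning in noise (threshold too low), while the focal term keeps majority lesion classes from dominating the loss.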
Affiliation(s)
- Weicheng Yuan
- College of Basic Medicine, Hebei Medical University, Zhongshan East, Shijiazhuang, 050017, Hebei, China
- Zeyu Du
- School of Health Science, University of Manchester, Sackville Street, Manchester, 610101, England, UK
- Shuo Han
- Department of Anatomy, Hebei Medical University, Zhongshan East, Shijiazhuang, 050017, Hebei, China
6
Rosen S, Saban M. Evaluating the reliability of ChatGPT as a tool for imaging test referral: a comparative study with a clinical decision support system. Eur Radiol 2024;34:2826-2837. PMID: 37828297; DOI: 10.1007/s00330-023-10230-0.
Abstract
OBJECTIVES As the technology continues to evolve and advance, we can expect to see artificial intelligence (AI) being used in increasingly sophisticated ways to make diagnoses and decisions, such as suggesting the most appropriate imaging referrals. We aim to explore whether Chat Generative Pretrained Transformer (ChatGPT) can provide accurate imaging referrals for clinical use that are at least as good as the ESR iGuide. METHODS A comparative study was conducted in a tertiary hospital. Data were collected from 97 consecutive cases admitted to the emergency department with abdominal complaints. We compared the imaging test referral recommendations suggested by the ESR iGuide and ChatGPT and analyzed cases of disagreement. In addition, we selected cases where ChatGPT recommended a chest abdominal pelvis (CAP) CT (n = 66) and asked four specialists to grade the appropriateness of the referral. RESULTS ChatGPT recommendations were consistent with the recommendations provided by the ESR iGuide. No statistical differences were found between the appropriateness of referrals by age or gender. In a sub-analysis of CAP cases, high agreement between ChatGPT and the specialists was found. Cases of disagreement (12.4%) were further analyzed and presented themes of vague recommendations such as "it would be advisable" and "this would help to rule out." CONCLUSIONS ChatGPT's ability to guide the selection of appropriate tests may be comparable to some degree with the ESR iGuide. Clinical, ethical, and regulatory implications still need to be addressed prior to clinical implementation. Further studies are needed to confirm these findings. CLINICAL RELEVANCE STATEMENT The article explores the potential of using advanced language models, such as ChatGPT, in healthcare as a CDS for selecting appropriate imaging tests.
Using ChatGPT can improve the efficiency of the decision-making process. KEY POINTS: • ChatGPT recommendations were highly consistent with the recommendations provided by the ESR iGuide. • ChatGPT's ability to guide the selection of appropriate tests may be comparable to some degree with the ESR iGuide's.
Affiliation(s)
- Shani Rosen
- Department of Health Technology and Policy Evaluation, Gertner Institute for Epidemiology and Health Policy, Institute of Epidemiology & Health Policy Research, Sheba Medical Center, Tel HaShomer, Ramat-Gan, Israel
- Nursing Department, School of Health Sciences, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Mor Saban
- Nursing Department, School of Health Sciences, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
7
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, Dos Santos DP, Tang A, Wald C, Slavotinek J. Developing, Purchasing, Implementing and Monitoring AI Tools in Radiology: Practical Considerations. A Multi-Society Statement From the ACR, CAR, ESR, RANZCR & RSNA. Can Assoc Radiol J 2024;75:226-244. PMID: 38251882; DOI: 10.1177/08465371231222229.
Abstract
Artificial Intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi-society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools.
Affiliation(s)
- Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, AL, USA
- Data Science Institute, American College of Radiology, Reston, VA, USA
- Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
- Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Nina Kottler
- Radiology Partners, El Segundo, CA, USA
- Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, CA, USA
- John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA, USA
- Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia
- Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
- An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montréal, QC, Canada
- Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, MA, USA
- Tufts University Medical School, Boston, MA, USA
- American College of Radiology, Reston, VA, USA
- John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, SA, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia
8
Cecil J, Lermer E, Hudecek MFC, Sauer J, Gaube S. Explainability does not mitigate the negative impact of incorrect AI advice in a personnel selection task. Sci Rep 2024;14:9736. PMID: 38679619; PMCID: PMC11056364; DOI: 10.1038/s41598-024-60220-5.
Abstract
Despite the rise of decision support systems enabled by artificial intelligence (AI) in personnel selection, their impact on decision-making processes is largely unknown. Consequently, we conducted five experiments (N = 1403 students and Human Resource Management (HRM) employees) investigating how people interact with AI-generated advice in a personnel selection task. In all pre-registered experiments, we presented correct and incorrect advice. In Experiments 1a and 1b, we manipulated the source of the advice (human vs. AI). In Experiments 2a, 2b, and 2c, we further manipulated the type of explainability of AI advice (2a and 2b: heatmaps and 2c: charts). We hypothesized that accurate and explainable advice improves decision-making. The independent variables were regressed on task performance, perceived advice quality and confidence ratings. The results consistently showed that incorrect advice negatively impacted performance, as people failed to dismiss it (i.e., overreliance). Additionally, we found that the effects of source and explainability of advice on the dependent variables were limited. The lack of reduction in participants' overreliance on inaccurate advice when the systems' predictions were made more explainable highlights the complexity of human-AI interaction and the need for regulation and quality standards in HRM.
Affiliation(s)
- Julia Cecil
- Department of Psychology, LMU Center for Leadership and People Management, LMU Munich, Munich, Germany
- Eva Lermer
- Department of Psychology, LMU Center for Leadership and People Management, LMU Munich, Munich, Germany
- Department of Business Psychology, Technical University of Applied Sciences Augsburg, Augsburg, Germany
- Matthias F C Hudecek
- Department of Experimental Psychology, University of Regensburg, Regensburg, Germany
- Jan Sauer
- Department of Business Administration, University of Applied Sciences Amberg-Weiden, Weiden, Germany
- Susanne Gaube
- Department of Psychology, LMU Center for Leadership and People Management, LMU Munich, Munich, Germany
- UCL Global Business School for Health, University College London, London, UK
9
Jiang T, Chen C, Zhou Y, Cai S, Yan Y, Sui L, Lai M, Song M, Zhu X, Pan Q, Wang H, Chen X, Wang K, Xiong J, Chen L, Xu D. Deep learning-assisted diagnosis of benign and malignant parotid tumors based on ultrasound: a retrospective study. BMC Cancer 2024;24:510. PMID: 38654281; PMCID: PMC11036551; DOI: 10.1186/s12885-024-12277-8.
Abstract
BACKGROUND To develop a deep learning (DL) model utilizing ultrasound images, and to evaluate its efficacy in distinguishing between benign and malignant parotid tumors (PTs), as well as its practicality in assisting clinicians with accurate diagnosis. METHODS A total of 2211 ultrasound images of 980 pathologically confirmed PTs (training set: n = 721; validation set: n = 82; internal test set: n = 89; external test set: n = 88) from 907 patients were retrospectively included in this study. Five DL networks of varying depths were constructed; the optimal model was selected, and diagnostic performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC). Furthermore, radiologists of different seniority were compared in the presence of the optimal auxiliary diagnosis model. Additionally, the diagnostic confusion matrix of the optimal model was calculated, and the characteristics of misjudged cases were analyzed and summarized. RESULTS ResNet18 demonstrated superior diagnostic performance, with an AUC of 0.947, accuracy of 88.5%, sensitivity of 78.2%, and specificity of 92.7% in the internal test set, and an AUC of 0.925, accuracy of 89.8%, sensitivity of 83.3%, and specificity of 90.6% in the external test set. The PTs were subjectively assessed twice by six radiologists, with and without the assistance of the model. With the model's assistance, both junior and senior radiologists demonstrated enhanced diagnostic performance. In the internal test set, AUC values increased by 0.062 and 0.082 for junior radiologists, and by 0.066 and 0.106 for senior radiologists, respectively.
CONCLUSIONS The ultrasound-based DL model demonstrates exceptional capability in distinguishing between benign and malignant PTs, helping radiologists of varying expertise levels achieve heightened diagnostic performance, and can serve as a noninvasive adjunct diagnostic imaging method for clinical use.
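The AUC values quoted above have a concrete probabilistic reading: the chance that a randomly chosen malignant case receives a higher model score than a randomly chosen benign one. A minimal sketch of that equivalence via the Mann-Whitney pair count, using invented scores rather than the study's outputs:

```python
def auc(pos_scores, neg_scores):
    """AUC as the normalized Mann-Whitney U statistic: the fraction
    of (positive, negative) pairs where the positive case is scored
    higher, counting ties as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical model scores for malignant (positive) and benign cases.
malignant = [0.91, 0.78, 0.64, 0.88]
benign = [0.35, 0.52, 0.64, 0.12, 0.40]
print(round(auc(malignant, benign), 3))
```

An AUC of 0.947, as reported for ResNet18's internal test set, therefore means the model ranks a malignant tumor above a benign one in roughly 95% of such pairs.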
Collapse
Affiliation(s)
- Tian Jiang
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Postgraduate training base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), 310022, Hangzhou, Zhejiang, China
- Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology, 310022, Hangzhou, Zhejiang, China
| | - Chen Chen
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, TaiZhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
| | - Yahan Zhou
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, TaiZhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
| | - Shenzhou Cai
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, TaiZhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
| | - Yuqi Yan
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Postgraduate training base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), 310022, Hangzhou, Zhejiang, China
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, TaiZhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
| | - Lin Sui
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Postgraduate training base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), 310022, Hangzhou, Zhejiang, China
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, TaiZhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
| | - Min Lai
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology, 310022, Hangzhou, Zhejiang, China
- Second Clinical College, Zhejiang University of Traditional Chinese Medicine, 310022, Hangzhou, Zhejiang, China
| | - Mei Song
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology, 310022, Hangzhou, Zhejiang, China
| | - Xi Zhu
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, TaiZhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
| | - Qianmeng Pan
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
| | - Hui Wang
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
| | - Xiayi Chen
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, TaiZhou, Zhejiang, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China
| | - Kai Wang
- Dongyang Hospital Affiliated to Wenzhou Medical University, 322100, Jinhua, Zhejiang, China
| | - Jing Xiong
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 518000, Shenzhen, Guangdong, China
| | - Liyu Chen
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China.
- Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology, 310022, Hangzhou, Zhejiang, China.
| | - Dong Xu
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, 310022, Hangzhou, Zhejiang, China.
- Postgraduate training base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), 310022, Hangzhou, Zhejiang, China.
- Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology, 310022, Hangzhou, Zhejiang, China.
- Wenling Big Data and Artificial Intelligence Institute in Medicine, 317502, Taizhou, Zhejiang, China.
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), 317502, Taizhou, Zhejiang, China.
| |
10
Montomoli J, Bitondo MM, Cascella M, Rezoagli E, Romeo L, Bellini V, Semeraro F, Gamberini E, Frontoni E, Agnoletti V, Altini M, Benanti P, Bignami EG. Algor-ethics: charting the ethical path for AI in critical care. J Clin Monit Comput 2024:10.1007/s10877-024-01157-y. [PMID: 38573370 DOI: 10.1007/s10877-024-01157-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2023] [Accepted: 03/22/2024] [Indexed: 04/05/2024]
Abstract
The integration of Clinical Decision Support Systems (CDSS) based on artificial intelligence (AI) in healthcare is a groundbreaking evolution with enormous potential, but its development and ethical implementation present unique challenges, particularly in critical care, where physicians often deal with life-threatening conditions requiring rapid action and with patients unable to participate in the decision-making process. Moreover, the development of AI-based CDSS is complex and should address different sources of bias, including data acquisition, health disparities, domain shifts during clinical use, and cognitive biases in decision-making. In this scenario, algor-ethics is mandatory; it emphasizes the integration of 'Human-in-the-Loop' and 'Algorithmic Stewardship' principles and the benefits of advanced data engineering. The establishment of Clinical AI Departments (CAID) is necessary to lead AI innovation in healthcare, ensuring ethical integrity and human-centered development in this rapidly evolving field.
Affiliation(s)
- Jonathan Montomoli
- Department of Anesthesia and Intensive Care, Infermi Hospital, Romagna Local Health Authority, Viale Settembrini 2, Rimini, 47923, Italy.
- Health Services Research, Evaluation and Policy Unit, Romagna Local Health Authority, Viale Settembrini 2, Rimini, 47923, Italy.
| | - Maria Maddalena Bitondo
- Department of Anesthesia and Intensive Care, Infermi Hospital, Romagna Local Health Authority, Viale Settembrini 2, Rimini, 47923, Italy
| | - Marco Cascella
- Unit of Anesthesia and Pain Medicine, Department of Medicine, Surgery and Dentistry "Scuola Medica Salernitana," University of Salerno, Baronissi, Salerno, Italy
| | - Emanuele Rezoagli
- School of Medicine and Surgery, University of Milano-Bicocca, Via Cadore, 48, Monza, 20900, Italy
- Dipartimento di Emergenza e Urgenza, Terapia intensiva e Semintensiva adulti e pediatrica, Fondazione IRCCS San Gerardo dei Tintori, Via Pergolesi, 33, Monza, 20900, Italy
| | - Luca Romeo
- Department of Economics and Law, University of Macerata, Macerata, 62100, Italy
| | - Valentina Bellini
- Anesthesiology, Critical Care and Pain Medicine Division, Department of Medicine and Surgery, University of Parma, Via Gramsci 14, Parma, 43125, Italy
| | - Federico Semeraro
- Department of Anesthesia, Intensive Care and Prehospital Emergency, Ospedale Maggiore Carlo Alberto Pizzardi, Largo Bartolo Nigrisoli, 2, Bologna, 40133, Italy
| | - Emiliano Gamberini
- Department of Anesthesia and Intensive Care, Infermi Hospital, Romagna Local Health Authority, Viale Settembrini 2, Rimini, 47923, Italy
| | - Emanuele Frontoni
- Department of Political Sciences, Communication and International Relations, University of Macerata, Macerata, 62100, Italy
| | - Vanni Agnoletti
- Department of Surgery and Trauma, Anesthesia and Intensive Care Unit, Maurizio Bufalini Hospital, Romagna Local Health Authority, Viale Giovanni Ghirotti, 286, Cesena, 47521, Italy
| | - Mattia Altini
- Hospital Care Sector, Emilia-Romagna Region, Via Aldo Moro, 21, Bologna, 40127, Italy
| | - Paolo Benanti
- Pontifical Gregorian University, Piazza della Pilotta 4, Roma, 00187, Italy
| | - Elena Giovanna Bignami
- Anesthesiology, Critical Care and Pain Medicine Division, Department of Medicine and Surgery, University of Parma, Via Gramsci 14, Parma, 43125, Italy
| |
11
Vaidya A, Chen RJ, Williamson DFK, Song AH, Jaume G, Yang Y, Hartvigsen T, Dyer EC, Lu MY, Lipkova J, Shaban M, Chen TY, Mahmood F. Demographic bias in misdiagnosis by computational pathology models. Nat Med 2024; 30:1174-1190. [PMID: 38641744 DOI: 10.1038/s41591-024-02885-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2023] [Accepted: 02/23/2024] [Indexed: 04/21/2024]
Abstract
Despite increasing numbers of regulatory approvals, deep learning-based computational pathology systems often overlook the impact of demographic factors on performance, potentially leading to biases. This concern is all the more important as computational pathology has leveraged large public datasets that underrepresent certain demographic groups. Using publicly available data from The Cancer Genome Atlas and the EBRAINS brain tumor atlas, as well as internal patient data, we show that whole-slide image classification models display marked performance disparities across different demographic groups when used to subtype breast and lung carcinomas and to predict IDH1 mutations in gliomas. For example, when using common modeling approaches, we observed performance gaps (in area under the receiver operating characteristic curve) between white and Black patients of 3.0% for breast cancer subtyping, 10.9% for lung cancer subtyping and 16.0% for IDH1 mutation prediction in gliomas. We found that richer feature representations obtained from self-supervised vision foundation models reduce performance variations between groups. These representations provide improvements upon weaker models even when those weaker models are combined with state-of-the-art bias mitigation strategies and modeling choices. Nevertheless, self-supervised vision foundation models do not fully eliminate these discrepancies, highlighting the continuing need for bias mitigation efforts in computational pathology. Finally, we demonstrate that our results extend to other demographic factors beyond patient race. Given these findings, we encourage regulatory and policy agencies to integrate demographic-stratified evaluation into their assessment guidelines.
Affiliation(s)
- Anurag Vaidya
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
| | - Richard J Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Drew F K Williamson
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology and Laboratory Medicine, Emory University School of Medicine, Atlanta, GA, USA
| | - Andrew H Song
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Guillaume Jaume
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Yuzhe Yang
- Electrical Engineering and Computer Science, MIT, Cambridge, MA, USA
| | - Thomas Hartvigsen
- School of Data Science, University of Virginia, Charlottesville, VA, USA
| | - Emma C Dyer
- T.H. Chan School of Public Health, Harvard University, Cambridge, MA, USA
| | - Ming Y Lu
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Electrical Engineering and Computer Science, MIT, Cambridge, MA, USA
| | - Jana Lipkova
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Muhammad Shaban
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Tiffany Y Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA.
- Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA.
| |
12
Balagopalan A, Baldini I, Celi LA, Gichoya J, McCoy LG, Naumann T, Shalit U, van der Schaar M, Wagstaff KL. Machine learning for healthcare that matters: Reorienting from technical novelty to equitable impact. PLOS DIGITAL HEALTH 2024; 3:e0000474. [PMID: 38620047 PMCID: PMC11018283 DOI: 10.1371/journal.pdig.0000474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/06/2023] [Accepted: 02/18/2024] [Indexed: 04/17/2024]
Abstract
Despite significant technical advances in machine learning (ML) over the past several years, the tangible impact of this technology in healthcare has been limited. This is due not only to the particular complexities of healthcare, but also to structural issues in the machine learning for healthcare (MLHC) community, which broadly rewards technical novelty over tangible, equitable impact. We structure our work as a healthcare-focused echo of the 2012 paper "Machine Learning that Matters", which highlighted such structural issues in the ML community at large and offered a series of clearly defined "Impact Challenges" to which the field should orient itself. Drawing on the expertise of a diverse and international group of authors, we engage in a narrative review and examine issues in the research background environment, training processes, evaluation metrics, and deployment protocols which act to limit the real-world applicability of MLHC. Broadly, we seek to distinguish between machine learning ON healthcare data and machine learning FOR healthcare: the former sees healthcare as merely a source of interesting technical challenges, while the latter regards ML as a tool in service of meeting tangible clinical needs. We offer specific recommendations for a series of stakeholders in the field, from ML researchers and clinicians to the institutions in which they work and the governments which regulate their data access.
Affiliation(s)
- Aparna Balagopalan
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology; Cambridge, Massachusetts, United States of America
| | - Ioana Baldini
- IBM Research; Yorktown Heights, New York, United States of America
| | - Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology; Cambridge, Massachusetts, United States of America
- Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center; Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard T.H. Chan School of Public Health; Boston, Massachusetts, United States of America
| | - Judy Gichoya
- Department of Radiology and Imaging Sciences, School of Medicine, Emory University; Atlanta, Georgia, United States of America
| | - Liam G. McCoy
- Division of Neurology, Department of Medicine, University of Alberta; Edmonton, Alberta, Canada
| | - Tristan Naumann
- Microsoft Research; Redmond, Washington, United States of America
| | - Uri Shalit
- The Faculty of Data and Decision Sciences, Technion; Haifa, Israel
| | - Mihaela van der Schaar
- Department of Applied Mathematics and Theoretical Physics, University of Cambridge; Cambridge, United Kingdom
- The Alan Turing Institute; London, United Kingdom
| | | |
13
Simmons C, DeGrasse J, Polakovic S, Aibinder W, Throckmorton T, Noerdlinger M, Papandrea R, Trenhaile S, Schoch B, Gobbato B, Routman H, Parsons M, Roche CP. Initial clinical experience with a predictive clinical decision support tool for anatomic and reverse total shoulder arthroplasty. EUROPEAN JOURNAL OF ORTHOPAEDIC SURGERY & TRAUMATOLOGY : ORTHOPEDIE TRAUMATOLOGIE 2024; 34:1307-1318. [PMID: 38095688 DOI: 10.1007/s00590-023-03796-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/09/2023] [Accepted: 11/19/2023] [Indexed: 04/02/2024]
Abstract
PURPOSE Clinical decision support tools (CDSTs) are software that generate patient-specific assessments that can be used to better inform healthcare provider decision making. Machine learning (ML)-based CDSTs have recently been developed for anatomic (aTSA) and reverse (rTSA) total shoulder arthroplasty to facilitate more data-driven, evidence-based decision making. Using this shoulder CDST as an example, this external validation study provides an overview of how ML-based algorithms are developed and discusses the limitations of these tools. METHODS An external validation of a novel CDST was conducted on 243 patients (120F/123M) who received a personalized prediction prior to surgery and had short-term clinical follow-up from 3 months to 2 years after primary aTSA (n = 43) or rTSA (n = 200). The outcome score and active range of motion predictions were compared to each patient's actual result at each timepoint, with accuracy quantified by the mean absolute error (MAE). RESULTS The results of this external validation demonstrate that the CDST's accuracy is similar to (within 10%) or better than the MAEs from the published internal validation. A few predictive models were observed to have substantially lower MAEs than in the internal validation, specifically Constant (31.6% better), active abduction (22.5% better), global shoulder function (20.0% better), active external rotation (19.0% better), and active forward elevation (16.2% better), which is encouraging; however, the sample size was small. CONCLUSION A greater understanding of the limitations of ML-based CDSTs will facilitate more responsible use and build trust and confidence, potentially leading to greater adoption. As CDSTs evolve, we anticipate greater shared decision making between the patient and surgeon, with the aim of achieving even better outcomes and greater levels of patient satisfaction.
Affiliation(s)
- Chelsey Simmons
- University of Florida, PO Box 116250, Gainesville, FL, 32605, USA
- Exactech, 2320 NW 66th Court, Gainesville, FL, 32653, USA
| | | | | | - William Aibinder
- University of Michigan, 1500 E. Medical Center Drive, Ann Arbor, MI, 48109, USA
| | | | - Mayo Noerdlinger
- Atlantic Orthopaedics and Sports Medicine, 1900 Lafayette Road, Portsmouth, NH, USA
| | | | | | - Bradley Schoch
- Mayo Clinic, Florida, 4500 San Pablo Rd., Jacksonville, FL, 32224, USA
| | - Bruno Gobbato
- R. José Emmendoerfer, 1449, Nova Brasília, Jaraguá do Sul, SC, 89252-278, Brazil
| | - Howard Routman
- Atlantis Orthopedics, 900 Village Square Crossing, #170, Palm Beach Gardens, FL, 33410, USA
| | - Moby Parsons
- 333 Borthwick Ave Suite #301, Portsmouth, NH, 03801, USA
| | | |
14
Ciet P, Eade C, Ho ML, Laborie LB, Mahomed N, Naidoo J, Pace E, Segal B, Toso S, Tschauner S, Vamyanmane DK, Wagner MW, Shelmerdine SC. The unintended consequences of artificial intelligence in paediatric radiology. Pediatr Radiol 2024; 54:585-593. [PMID: 37665368 DOI: 10.1007/s00247-023-05746-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/05/2023] [Revised: 08/07/2023] [Accepted: 08/08/2023] [Indexed: 09/05/2023]
Abstract
Over the past decade, there has been a dramatic rise in interest in the application of artificial intelligence (AI) in radiology. Originally, only 'narrow' AI tasks were possible; however, with the increasing availability of data, combined with easy access to powerful computer processing capabilities, we are becoming more able to generate complex and nuanced prediction models and elaborate solutions for healthcare. Nevertheless, these AI models are not without their failings, and sometimes the intended use for these solutions may not lead to predictable impacts for patients, society or those working within the healthcare profession. In this article, we provide an overview of the latest opinions regarding AI ethics, bias, limitations, challenges and considerations that we should all contemplate in this exciting and expanding field, with special attention to how this applies to the unique aspects of a paediatric population. By embracing AI technology and fostering a multidisciplinary approach, it is hoped that we can harness the power AI brings whilst minimising harm and ensuring a beneficial impact on radiology practice.
Affiliation(s)
- Pierluigi Ciet
- Department of Radiology and Nuclear Medicine, Erasmus MC - Sophia's Children's Hospital, Rotterdam, The Netherlands
- Department of Medical Sciences, University of Cagliari, Cagliari, Italy
| | | | - Mai-Lan Ho
- University of Missouri, Columbia, MO, USA
| | - Lene Bjerke Laborie
- Department of Radiology, Section for Paediatrics, Haukeland University Hospital, Bergen, Norway
- Department of Clinical Medicine, University of Bergen, Bergen, Norway
| | - Nasreen Mahomed
- Department of Radiology, University of Witwatersrand, Johannesburg, South Africa
| | - Jaishree Naidoo
- Paediatric Diagnostic Imaging, Dr J Naidoo Inc., Johannesburg, South Africa
- Envisionit Deep AI Ltd, Coveham House, Downside Bridge Road, Cobham, UK
| | - Erika Pace
- Department of Diagnostic Radiology, The Royal Marsden NHS Foundation Trust, London, UK
| | - Bradley Segal
- Department of Radiology, University of Witwatersrand, Johannesburg, South Africa
| | - Seema Toso
- Pediatric Radiology, Children's Hospital, University Hospitals of Geneva, Geneva, Switzerland
| | - Sebastian Tschauner
- Division of Paediatric Radiology, Department of Radiology, Medical University of Graz, Graz, Austria
| | - Dhananjaya K Vamyanmane
- Department of Pediatric Radiology, Indira Gandhi Institute of Child Health, Bangalore, India
| | - Matthias W Wagner
- Department of Diagnostic Imaging, Division of Neuroradiology, The Hospital for Sick Children, Toronto, Canada
- Department of Medical Imaging, University of Toronto, Toronto, ON, Canada
- Department of Neuroradiology, University Hospital Augsburg, Augsburg, Germany
| | - Susan C Shelmerdine
- Department of Clinical Radiology, Great Ormond Street Hospital for Children NHS Foundation Trust, Great Ormond Street, London, WC1H 3JH, UK.
- Great Ormond Street Hospital for Children, UCL Great Ormond Street Institute of Child Health, London, UK.
- NIHR Great Ormond Street Hospital Biomedical Research Centre, 30 Guilford Street, Bloomsbury, London, UK.
- Department of Clinical Radiology, St George's Hospital, London, UK.
| |
15
Anderson JW, Visweswaran S. Algorithmic Individual Fairness and Healthcare: A Scoping Review. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.03.25.24304853. [PMID: 38585746 PMCID: PMC10996729 DOI: 10.1101/2024.03.25.24304853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/09/2024]
Abstract
Objective Statistical and artificial intelligence algorithms are increasingly being developed for use in healthcare. These algorithms may reflect biases that magnify disparities in clinical care, and there is a growing need for understanding how algorithmic biases can be mitigated in pursuit of algorithmic fairness. Individual fairness constrains algorithms to the notion that "similar individuals should be treated similarly." We conducted a scoping review on algorithmic individual fairness to understand the current state of research on the metrics and methods developed to achieve individual fairness and on its applications in healthcare. Methods We searched three databases, PubMed, ACM Digital Library, and IEEE Xplore, for algorithmic individual fairness metrics, algorithmic bias mitigation, and healthcare applications. Our search was restricted to articles published between January 2013 and September 2023. We identified 1,886 articles through database searches and manually identified one additional article; from these, we included 30 articles in the review. Data from the selected articles were extracted, and the findings were synthesized. Results Based on the 30 articles in the review, we identified several themes, including philosophical underpinnings of fairness, individual fairness metrics, mitigation methods for achieving individual fairness, implications of achieving individual fairness on group fairness and vice versa, fairness metrics that combine individual fairness and group fairness, software for measuring and optimizing individual fairness, and applications of individual fairness in healthcare. Conclusion While there has been significant work on algorithmic individual fairness in recent years, the definition, use, and study of individual fairness remain in their infancy, especially in healthcare. Future research is needed to apply and evaluate individual fairness in healthcare comprehensively.
Affiliation(s)
| | - Shyam Visweswaran
- Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA
| |
16
Wei ML, Tada M, So A, Torres R. Artificial intelligence and skin cancer. Front Med (Lausanne) 2024; 11:1331895. [PMID: 38566925 PMCID: PMC10985205 DOI: 10.3389/fmed.2024.1331895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2023] [Accepted: 02/26/2024] [Indexed: 04/04/2024] Open
Abstract
Artificial intelligence is poised to rapidly reshape many fields, including that of skin cancer screening and diagnosis, both as a disruptive and assistive technology. Together with the collection and availability of large medical data sets, artificial intelligence will become a powerful tool that can be leveraged by physicians in their diagnoses and treatment plans for patients. This comprehensive review focuses on current progress toward AI applications for patients, primary care providers, dermatologists, and dermatopathologists, explores the diverse applications of image and molecular processing for skin cancer, and highlights AI's potential for patient self-screening and improving diagnostic accuracy for non-dermatologists. We additionally delve into the challenges and barriers to clinical implementation, paths forward for implementation and areas of active research.
Affiliation(s)
- Maria L. Wei
- Department of Dermatology, University of California, San Francisco, San Francisco, CA, United States
- Dermatology Service, San Francisco VA Health Care System, San Francisco, CA, United States
| | - Mikio Tada
- Institute for Neurodegenerative Diseases, University of California, San Francisco, San Francisco, CA, United States
| | - Alexandra So
- School of Medicine, University of California, San Francisco, San Francisco, CA, United States
| | - Rodrigo Torres
- Dermatology Service, San Francisco VA Health Care System, San Francisco, CA, United States
| |
17
Campion JR, O'Connor DB, Lahiff C. Human-artificial intelligence interaction in gastrointestinal endoscopy. World J Gastrointest Endosc 2024; 16:126-135. [PMID: 38577646 PMCID: PMC10989254 DOI: 10.4253/wjge.v16.i3.126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/31/2023] [Revised: 01/18/2024] [Accepted: 02/23/2024] [Indexed: 03/14/2024] Open
Abstract
The number and variety of applications of artificial intelligence (AI) in gastrointestinal (GI) endoscopy is growing rapidly. New technologies based on machine learning (ML) and convolutional neural networks (CNNs) are at various stages of development and deployment to assist patients and endoscopists in preparing for endoscopic procedures, in detection, diagnosis and classification of pathology during endoscopy and in confirmation of key performance indicators. Platforms based on ML and CNNs require regulatory approval as medical devices. Interactions between humans and the technologies we use are complex and are influenced by design, behavioural and psychological elements. Due to the substantial differences between AI and prior technologies, important differences may be expected in how we interact with advice from AI technologies. Human–AI interaction (HAII) may be optimised by developing AI algorithms to minimise false positives and designing platform interfaces to maximise usability. Human factors influencing HAII may include automation bias, alarm fatigue, algorithm aversion, learning effect and deskilling. Each of these areas merits further study in the specific setting of AI applications in GI endoscopy and professional societies should engage to ensure that sufficient emphasis is placed on human-centred design in development of new AI technologies.
Affiliation(s)
- John R Campion
- Department of Gastroenterology, Mater Misericordiae University Hospital, Dublin D07 AX57, Ireland
- School of Medicine, University College Dublin, Dublin D04 C7X2, Ireland
| | - Donal B O'Connor
- Department of Surgery, Trinity College Dublin, Dublin D02 R590, Ireland
| | - Conor Lahiff
- Department of Gastroenterology, Mater Misericordiae University Hospital, Dublin D07 AX57, Ireland
- School of Medicine, University College Dublin, Dublin D04 C7X2, Ireland
| |
18
Topff L, Steltenpool S, Ranschaert ER, Ramanauskas N, Menezes R, Visser JJ, Beets-Tan RGH, Hartkamp NS. Artificial intelligence-assisted double reading of chest radiographs to detect clinically relevant missed findings: a two-centre evaluation. Eur Radiol 2024:10.1007/s00330-024-10676-w. [PMID: 38466390 DOI: 10.1007/s00330-024-10676-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2023] [Revised: 01/21/2024] [Accepted: 02/01/2024] [Indexed: 03/13/2024]
Abstract
OBJECTIVES To evaluate an artificial intelligence (AI)-assisted double reading system for detecting clinically relevant missed findings on routinely reported chest radiographs. METHODS A retrospective study was performed in two institutions, a secondary care hospital and tertiary referral oncology centre. Commercially available AI software performed a comparative analysis of chest radiographs and radiologists' authorised reports using a deep learning and natural language processing algorithm, respectively. The AI-detected discrepant findings between images and reports were assessed for clinical relevance by an external radiologist, as part of the commercial service provided by the AI vendor. The selected missed findings were subsequently returned to the institution's radiologist for final review. RESULTS In total, 25,104 chest radiographs of 21,039 patients (mean age 61.1 years ± 16.2 [SD]; 10,436 men) were included. The AI software detected discrepancies between imaging and reports in 21.1% (5289 of 25,104). After review by the external radiologist, 0.9% (47 of 5289) of cases were deemed to contain clinically relevant missed findings. The institution's radiologists confirmed 35 of 47 missed findings (74.5%) as clinically relevant (0.1% of all cases). Missed findings consisted of lung nodules (71.4%, 25 of 35), pneumothoraces (17.1%, 6 of 35) and consolidations (11.4%, 4 of 35). CONCLUSION The AI-assisted double reading system was able to identify missed findings on chest radiographs after report authorisation. The approach required an external radiologist to review the AI-detected discrepancies. The number of clinically relevant missed findings by radiologists was very low. CLINICAL RELEVANCE STATEMENT The AI-assisted double reader workflow was shown to detect diagnostic errors and could be applied as a quality assurance tool. Although clinically relevant missed findings were rare, there is potential impact given the common use of chest radiography. 
KEY POINTS • A commercially available double reading system supported by artificial intelligence was evaluated to detect reporting errors in chest radiographs (n=25,104) from two institutions. • Clinically relevant missed findings were found in 0.1% of chest radiographs and consisted of unreported lung nodules, pneumothoraces and consolidations. • Applying AI software as a secondary reader after report authorisation can assist in reducing diagnostic errors without interrupting the radiologist's reading workflow. However, the number of AI-detected discrepancies was considerable and required review by a radiologist to assess their relevance.
Affiliation(s)
- Laurens Topff
- Department of Radiology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- GROW School for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
- Sanne Steltenpool
- Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Department of Radiology, Elisabeth-TweeSteden Hospital, Tilburg, The Netherlands
- Erik R Ranschaert
- Department of Radiology, St. Nikolaus Hospital, Eupen, Belgium
- Ghent University, Ghent, Belgium
- Naglis Ramanauskas
- Oxipit UAB, Vilnius, Lithuania
- Department of Radiology, Nuclear Medicine and Medical Physics, Institute of Biomedical Sciences, Faculty of Medicine, Vilnius University, Vilnius, Lithuania
- Renee Menezes
- Biostatistics Centre, Department of Psychosocial Research and Epidemiology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Jacob J Visser
- Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Regina G H Beets-Tan
- Department of Radiology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- GROW School for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
- Nolan S Hartkamp
- Department of Radiology, Elisabeth-TweeSteden Hospital, Tilburg, The Netherlands
19
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, Pinto Dos Santos D, Tang A, Wald C, Slavotinek J. Developing, purchasing, implementing and monitoring AI tools in radiology: Practical considerations. A multi-society statement from the ACR, CAR, ESR, RANZCR & RSNA. J Med Imaging Radiat Oncol 2024; 68:7-26. [PMID: 38259140 DOI: 10.1111/1754-9485.13612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2023] [Accepted: 11/23/2023] [Indexed: 01/24/2024]
Abstract
Artificial Intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi-society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools.
Affiliation(s)
- Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, Alabama, USA
- American College of Radiology Data Science Institute, Reston, Virginia, USA
- Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
- Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Nina Kottler
- Radiology Partners, El Segundo, California, USA
- Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, California, USA
- John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, San Francisco, California, USA
- Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, South Australia, Australia
- Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
- An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montreal, Quebec, Canada
- Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, Massachusetts, USA
- Tufts University Medical School, Boston, Massachusetts, USA
- Commission on Informatics, and Member, Board of Chancellors, American College of Radiology, Reston, Virginia, USA
- John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, Adelaide, South Australia, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, South Australia, Australia
20
Groh M, Badri O, Daneshjou R, Koochek A, Harris C, Soenksen LR, Doraiswamy PM, Picard R. Deep learning-aided decision support for diagnosis of skin disease across skin tones. Nat Med 2024; 30:573-583. [PMID: 38317019 PMCID: PMC10878981 DOI: 10.1038/s41591-023-02728-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2023] [Accepted: 11/16/2023] [Indexed: 02/07/2024]
Abstract
Although advances in deep learning systems for image-based medical diagnosis demonstrate their potential to augment clinical decision-making, the effectiveness of physician-machine partnerships remains an open question, in part because physicians and algorithms are both susceptible to systematic errors, especially for diagnosis of underrepresented populations. Here we present results from a large-scale digital experiment involving board-certified dermatologists (n = 389) and primary-care physicians (n = 459) from 39 countries to evaluate the accuracy of diagnoses submitted by physicians in a store-and-forward teledermatology simulation. In this experiment, physicians were presented with 364 images spanning 46 skin diseases and asked to submit up to four differential diagnoses. Specialists and generalists achieved diagnostic accuracies of 38% and 19%, respectively, but both specialists and generalists were four percentage points less accurate for the diagnosis of images of dark skin as compared to light skin. Fair deep learning system decision support improved the diagnostic accuracy of both specialists and generalists by more than 33%, but exacerbated the gap in the diagnostic accuracy of generalists across skin tones. These results demonstrate that well-designed physician-machine partnerships can enhance the diagnostic accuracy of physicians, illustrating that success in improving overall diagnostic accuracy does not necessarily address bias.
Affiliation(s)
- Matthew Groh
- Northwestern University Kellogg School of Management, Evanston, IL, USA
- MIT Media Lab, Cambridge, MA, USA
- Omar Badri
- Northeast Dermatology Associates, Beverly, MA, USA
- Roxana Daneshjou
- Stanford Department of Biomedical Data Science, Stanford, CA, USA
- Stanford Department of Dermatology, Redwood City, CA, USA
- Luis R Soenksen
- Wyss Institute for Bioinspired Engineering at Harvard, Boston, MA, USA
- P Murali Doraiswamy
- MIT Media Lab, Cambridge, MA, USA
- Duke University School of Medicine, Durham, NC, USA
21
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, Pinto Dos Santos D, Tang A, Wald C, Slavotinek J. Developing, Purchasing, Implementing and Monitoring AI Tools in Radiology: Practical Considerations. A Multi-Society Statement From the ACR, CAR, ESR, RANZCR & RSNA. J Am Coll Radiol 2024:S1546-1440(23)01020-7. [PMID: 38276923 DOI: 10.1016/j.jacr.2023.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/27/2024]
Abstract
Artificial intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi-society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools.
Affiliation(s)
- Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, Alabama; American College of Radiology Data Science Institute, Reston, Virginia
- Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
- Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Nina Kottler
- Radiology Partners, El Segundo, California; Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, California
- John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, California
- Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, Australia
- Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany; Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
- An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montréal, Québec, Canada
- Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, Massachusetts; Tufts University Medical School, Boston, Massachusetts; Commission on Informatics, and Member, Board of Chancellors, American College of Radiology, Virginia
- John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, Adelaide, Australia; College of Medicine and Public Health, Flinders University, Adelaide, Australia
22
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, Dos Santos DP, Tang A, Wald C, Slavotinek J. Developing, purchasing, implementing and monitoring AI tools in radiology: practical considerations. A multi-society statement from the ACR, CAR, ESR, RANZCR & RSNA. Insights Imaging 2024; 15:16. [PMID: 38246898 PMCID: PMC10800328 DOI: 10.1186/s13244-023-01541-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2024] Open
Abstract
Artificial Intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi-society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools.
Key points
• The incorporation of artificial intelligence (AI) in radiological practice demands increased monitoring of its utility and safety.
• Cooperation between developers, clinicians, and regulators will allow all involved to address ethical issues and monitor AI performance.
• AI can fulfil its promise to advance patient well-being if all steps from development to integration in healthcare are rigorously evaluated.
Affiliation(s)
- Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, AL, USA
- American College of Radiology Data Science Institute, Reston, VA, USA
- Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
- Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Nina Kottler
- Radiology Partners, El Segundo, CA, USA
- Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, CA, USA
- John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, USA
- Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, Australia
- Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
- An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montréal, Québec, Canada
- Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, MA, USA
- Tufts University Medical School, Boston, MA, USA
- Commission on Informatics, and Member, Board of Chancellors, American College of Radiology, Virginia, USA
- John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, Adelaide, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, Australia
23
Nguyen T. ChatGPT in Medical Education: A Precursor for Automation Bias? JMIR MEDICAL EDUCATION 2024; 10:e50174. [PMID: 38231545 PMCID: PMC10831594 DOI: 10.2196/50174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/21/2023] [Accepted: 12/11/2023] [Indexed: 01/18/2024]
Abstract
Artificial intelligence (AI) in health care has the promise of providing accurate and efficient results. However, AI can also be a black box, where the logic behind its results is nonrational. There are concerns if these questionable results are used in patient care. As physicians have the duty to provide care based on their clinical judgment in addition to their patients' values and preferences, it is crucial that physicians validate the results from AI. Yet, there are some physicians who exhibit a phenomenon known as automation bias, where there is an assumption from the user that AI is always right. This is a dangerous mindset, as users exhibiting automation bias will not validate the results, given their trust in AI systems. Several factors impact a user's susceptibility to automation bias, such as inexperience or being born in the digital age. In this editorial, I argue that these factors and a lack of AI education in the medical school curriculum cause automation bias. I also explore the harms of automation bias and why prospective physicians need to be vigilant when using AI. Furthermore, it is important to consider what attitudes are being taught to students when introducing ChatGPT, which could be some students' first time using AI, prior to their use of AI in the clinical setting. Therefore, in attempts to avoid the problem of automation bias in the long-term, in addition to incorporating AI education into the curriculum, as is necessary, the use of ChatGPT in medical education should be limited to certain tasks. Otherwise, having no constraints on what ChatGPT should be used for could lead to automation bias.
Affiliation(s)
- Tina Nguyen
- The University of Texas Medical Branch, Galveston, TX, United States
24
Day TG, Matthew J, Budd SF, Venturini L, Wright R, Farruggia A, Vigneswaran TV, Zidere V, Hajnal JV, Razavi R, Simpson JM, Kainz B. Interaction between clinicians and artificial intelligence to detect fetal atrioventricular septal defects on ultrasound: how can we optimize collaborative performance? ULTRASOUND IN OBSTETRICS & GYNECOLOGY : THE OFFICIAL JOURNAL OF THE INTERNATIONAL SOCIETY OF ULTRASOUND IN OBSTETRICS AND GYNECOLOGY 2024. [PMID: 38197584 DOI: 10.1002/uog.27577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/26/2023] [Revised: 12/19/2023] [Accepted: 12/30/2023] [Indexed: 01/11/2024]
Abstract
OBJECTIVES Artificial intelligence (AI) has shown promise in improving the performance of fetal ultrasound screening in detecting congenital heart disease (CHD). The effect of giving AI advice to human operators has not been studied in this context. Giving additional information about AI model workings, such as confidence scores for AI predictions, may be a way of further improving performance. Our aims were to investigate whether AI advice improved overall diagnostic accuracy (using a single CHD lesion as an exemplar), and to determine what, if any, additional information given to clinicians optimized the overall performance of the clinician-AI team. METHODS An AI model was trained to classify a single fetal CHD lesion (atrioventricular septal defect (AVSD)), using a retrospective cohort of 121 130 cardiac four-chamber images extracted from 173 ultrasound scan videos (98 with normal hearts, 75 with AVSD); a ResNet50 model architecture was used. Temperature scaling of model prediction probability was performed on a validation set, and gradient-weighted class activation maps (grad-CAMs) produced. Ten clinicians (two consultant fetal cardiologists, three trainees in pediatric cardiology and five fetal cardiac sonographers) were recruited from a center of fetal cardiology to participate. Each participant was shown 2000 fetal four-chamber images in a random order (1000 normal and 1000 AVSD). The dataset comprised 500 images, each shown in four conditions: (1) image alone without AI output; (2) image with binary AI classification; (3) image with AI model confidence; and (4) image with grad-CAM image overlays. The clinicians were asked to classify each image as normal or AVSD. RESULTS A total of 20 000 image classifications were recorded from 10 clinicians. 
The AI model alone achieved an accuracy of 0.798 (95% CI, 0.760-0.832), a sensitivity of 0.868 (95% CI, 0.834-0.902) and a specificity of 0.728 (95% CI, 0.702-0.754), and the clinicians without AI achieved an accuracy of 0.844 (95% CI, 0.834-0.854), a sensitivity of 0.827 (95% CI, 0.795-0.858) and a specificity of 0.861 (95% CI, 0.828-0.895). Showing a binary (normal or AVSD) AI model output resulted in significant improvement in accuracy to 0.865 (P < 0.001). This effect was seen in both experienced and less-experienced participants. Giving incorrect AI advice resulted in a significant deterioration in overall accuracy, from 0.761 to 0.693 (P < 0.001), which was driven by an increase in both Type-I and Type-II errors by the clinicians. This effect was worsened by showing model confidence (accuracy, 0.649; P < 0.001) or grad-CAM (accuracy, 0.644; P < 0.001). CONCLUSIONS AI has the potential to improve performance when used in collaboration with clinicians, even if the model performance does not reach expert level. Giving additional information about model workings such as model confidence and class activation map image overlays did not improve overall performance, and actually worsened performance for images for which the AI model was incorrect. © 2024 The Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of International Society of Ultrasound in Obstetrics and Gynecology.
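The "model confidence" shown to clinicians in condition 3 was produced by temperature scaling, which calibrates an overconfident network by dividing its logits by a single scalar T fitted on a validation set. A minimal NumPy sketch on synthetic data (not the study's code; the toy logits and grid search below are assumptions for illustration):

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax: T > 1 softens the probabilities.
    z = z / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T):
    # Negative log-likelihood of the true labels at temperature T.
    p = softmax(logits, T)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
    # Choose the single scalar T that minimises validation NLL.
    return min(grid, key=lambda T: nll(logits, labels, T))

# Toy validation set for a binary normal-vs-AVSD classifier: the model is
# always confident (logit margin 6) but wrong on exactly 20% of cases.
labels = np.arange(200) % 2
logits = np.zeros((200, 2))
logits[np.arange(200), labels] = 6.0
wrong = np.arange(200) % 5 == 0
logits[wrong] = logits[wrong][:, ::-1]

T = fit_temperature(logits, labels)
before = softmax(logits).max(axis=1).mean()
after = softmax(logits, T).max(axis=1).mean()
print(round(T, 2), round(before, 3), round(after, 3))
```

With an always-confident model that is wrong 20% of the time, the fitted T exceeds 1, pulling the average confidence down toward the model's actual accuracy of 0.8.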
Affiliation(s)
- T G Day
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Department of Congenital Heart Disease, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- J Matthew
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- S F Budd
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- L Venturini
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- R Wright
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- A Farruggia
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- T V Vigneswaran
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Department of Congenital Heart Disease, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- V Zidere
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Department of Congenital Heart Disease, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- Harris Birthright Research Centre, King's College London NHS Foundation Trust, London, UK
- J V Hajnal
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- R Razavi
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Department of Congenital Heart Disease, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- J M Simpson
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Department of Congenital Heart Disease, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- B Kainz
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
- Department of Computing, Faculty of Engineering, Imperial College London, London, UK
25
Dot G, Gajny L, Ducret M. [The challenges of artificial intelligence in odontology]. Med Sci (Paris) 2024; 40:79-84. [PMID: 38299907 DOI: 10.1051/medsci/2023199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2024] Open
Abstract
Artificial intelligence has numerous potential applications in dentistry, as these algorithms aim to improve the efficiency and safety of several clinical situations. While the first commercial solutions are being proposed, most of these algorithms have not been sufficiently validated for clinical use. This article describes the challenges surrounding the development of these new tools, to help clinicians to keep a critical eye on this technology.
Affiliation(s)
- Gauthier Dot
- UFR odontologie, université Paris Cité, Paris, France
- AP-HP, hôpital Pitié-Salpêtrière, service de médecine bucco-dentaire, Paris, France
- Institut de biomécanique humaine Georges Charpak, école nationale supérieure d'Arts et Métiers, Paris, France
- Laurent Gajny
- Institut de biomécanique humaine Georges Charpak, école nationale supérieure d'Arts et Métiers, Paris, France
- Maxime Ducret
- Faculté d'odontologie, université Claude Bernard Lyon 1, hospices civils de Lyon, Lyon, France
26
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, dos Santos DP, Tang A, Wald C, Slavotinek J. Developing, Purchasing, Implementing and Monitoring AI Tools in Radiology: Practical Considerations. A Multi-Society Statement from the ACR, CAR, ESR, RANZCR and RSNA. Radiol Artif Intell 2024; 6:e230513. [PMID: 38251899 PMCID: PMC10831521 DOI: 10.1148/ryai.230513] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2024]
Abstract
Artificial Intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi-society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools. This article is simultaneously published in Insights into Imaging (DOI 10.1186/s13244-023-01541-3), Journal of Medical Imaging and Radiation Oncology (DOI 10.1111/1754-9485.13612), Canadian Association of Radiologists Journal (DOI 10.1177/08465371231222229), Journal of the American College of Radiology (DOI 10.1016/j.jacr.2023.12.005), and Radiology: Artificial Intelligence (DOI 10.1148/ryai.230513). Keywords: Artificial Intelligence, Radiology, Automation, Machine Learning Published under a CC BY 4.0 license. ©The Author(s) 2024. Editor's Note: The RSNA Board of Directors has endorsed this article. It has not undergone review or editing by this journal.
Affiliation(s)
- Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, AL, USA
- American College of Radiology Data Science Institute, Reston, VA, USA
- Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
- Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Nina Kottler
- Radiology Partners, El Segundo, CA, USA
- Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, CA, USA
- John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, USA
- Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, Australia
- Daniel Pinto dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
- An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montréal, Québec, Canada
- Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, MA, USA
- Tufts University Medical School, Boston, MA, USA
- Commission on Informatics, and Member, Board of Chancellors, American College of Radiology, Virginia, USA
- John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, Adelaide, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, Australia
27
Teneggi J, Yi PH, Sulam J. Examination-Level Supervision for Deep Learning-based Intracranial Hemorrhage Detection on Head CT Scans. Radiol Artif Intell 2024; 6:e230159. [PMID: 38294324 PMCID: PMC10831525 DOI: 10.1148/ryai.230159] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2023] [Revised: 11/02/2023] [Accepted: 12/05/2023] [Indexed: 02/01/2024]
Abstract
Purpose To compare the effectiveness of weak supervision (ie, with examination-level labels only) and strong supervision (ie, with image-level labels) in training deep learning models for detection of intracranial hemorrhage (ICH) on head CT scans. Materials and Methods In this retrospective study, an attention-based convolutional neural network was trained with either local (ie, image level) or global (ie, examination level) binary labels on the Radiological Society of North America (RSNA) 2019 Brain CT Hemorrhage Challenge dataset of 21 736 examinations (8876 [40.8%] ICH) and 752 422 images (107 784 [14.3%] ICH). The CQ500 (436 examinations; 212 [48.6%] ICH) and CT-ICH (75 examinations; 36 [48.0%] ICH) datasets were employed for external testing. Performance in detecting ICH was compared between weak (examination-level labels) and strong (image-level labels) learners as a function of the number of labels available during training. Results On examination-level binary classification, strong and weak learners did not have different area under the receiver operating characteristic curve values on the internal validation split (0.96 vs 0.96; P = .64) and the CQ500 dataset (0.90 vs 0.92; P = .15). Weak learners outperformed strong ones on the CT-ICH dataset (0.95 vs 0.92; P = .03). Weak learners had better section-level ICH detection performance when more than 10 000 labels were available for training (average f1 = 0.73 vs 0.65; P < .001). Weakly supervised models trained on the entire RSNA dataset required 35 times fewer labels than equivalent strong learners. Conclusion Strongly supervised models did not achieve better performance than weakly supervised ones, which could reduce radiologist labor requirements for prospective dataset curation. Keywords: CT, Head/Neck, Brain/Brain Stem, Hemorrhage Supplemental material is available for this article. © RSNA, 2023 See also commentary by Wahid and Fuentes in this issue.
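Examination-level (weak) supervision of the kind compared here is typically implemented with attention-based pooling over per-image features, so a single label per examination suffices for training. A minimal NumPy sketch of the pooling step with random weights (the shapes, dimensions, and names are illustrative assumptions, not the authors' architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_pool(H, w, v):
    """Pool per-image feature vectors H (n_images x d) into one
    examination-level vector via a softmax over learned attention scores."""
    scores = np.tanh(H @ w) @ v          # one scalar score per image
    a = np.exp(scores - scores.max())
    a /= a.sum()                         # attention weights sum to 1
    return a @ H, a

d, k = 8, 4                              # feature and attention dimensions
w, v = rng.normal(size=(d, k)), rng.normal(size=k)

# One head CT examination = a variable-length bag of per-image features.
# Only the examination carries a label (ICH present/absent), not the images.
exam = rng.normal(size=(30, d))          # 30 sections, 8-dim features each
z, attn = attention_pool(exam, w, v)
print(z.shape, attn.shape)               # (8,) (30,)
```

During training, gradients from the single examination-level loss flow back through the attention weights, which is what lets a weakly supervised model localize positive sections without any image-level labels.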
Affiliation(s)
- Jacopo Teneggi
- From the Department of Computer Science (J.T.), Department of Biomedical Engineering (J.S.), and Mathematical Institute for Data Science (MINDS) (J.S., J.T.), Johns Hopkins University, 3400 N Charles St, Clark Hall, Suite 320, Baltimore, MD 21218; and University of Maryland Medical Intelligent Imaging Center (UM2ii), Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore, Md (P.H.Y.)
- Paul H. Yi
- Jeremias Sulam
28
Jabbour S, Fouhey D, Shepard S, Valley TS, Kazerooni EA, Banovic N, Wiens J, Sjoding MW. Measuring the Impact of AI in the Diagnosis of Hospitalized Patients: A Randomized Clinical Vignette Survey Study. JAMA 2023; 330:2275-2284. [PMID: 38112814 PMCID: PMC10731487 DOI: 10.1001/jama.2023.22295]
Abstract
Importance Artificial intelligence (AI) could support clinicians when diagnosing hospitalized patients; however, systematic bias in AI models could worsen clinician diagnostic accuracy. Recent regulatory guidance has called for AI models to include explanations to mitigate errors made by models, but the effectiveness of this strategy has not been established. Objectives To evaluate the impact of systematically biased AI on clinician diagnostic accuracy and to determine if image-based AI model explanations can mitigate model errors. Design, Setting, and Participants Randomized clinical vignette survey study administered between April 2022 and January 2023 across 13 US states involving hospitalist physicians, nurse practitioners, and physician assistants. Interventions Clinicians were shown 9 clinical vignettes of patients hospitalized with acute respiratory failure, including their presenting symptoms, physical examination, laboratory results, and chest radiographs. Clinicians were then asked to determine the likelihood of pneumonia, heart failure, or chronic obstructive pulmonary disease as the underlying cause(s) of each patient's acute respiratory failure. To establish baseline diagnostic accuracy, clinicians were shown 2 vignettes without AI model input. Clinicians were then randomized to see 6 vignettes with AI model input with or without AI model explanations. Among these 6 vignettes, 3 vignettes included standard-model predictions, and 3 vignettes included systematically biased model predictions. Main Outcomes and Measures Clinician diagnostic accuracy for pneumonia, heart failure, and chronic obstructive pulmonary disease. Results Median participant age was 34 years (IQR, 31-39) and 241 (57.7%) were female. Four hundred fifty-seven clinicians were randomized and completed at least 1 vignette, with 231 randomized to AI model predictions without explanations, and 226 randomized to AI model predictions with explanations. 
Clinicians' baseline diagnostic accuracy was 73.0% (95% CI, 68.3% to 77.8%) for the 3 diagnoses. When shown a standard AI model without explanations, clinician accuracy increased over baseline by 2.9 percentage points (95% CI, 0.5 to 5.2) and by 4.4 percentage points (95% CI, 2.0 to 6.9) when clinicians were also shown AI model explanations. Systematically biased AI model predictions decreased clinician accuracy by 11.3 percentage points (95% CI, 7.2 to 15.5) compared with baseline and providing biased AI model predictions with explanations decreased clinician accuracy by 9.1 percentage points (95% CI, 4.9 to 13.2) compared with baseline, representing a nonsignificant improvement of 2.3 percentage points (95% CI, -2.7 to 7.2) compared with the systematically biased AI model. Conclusions and Relevance Although standard AI models improve diagnostic accuracy, systematically biased AI models reduced diagnostic accuracy, and commonly used image-based AI model explanations did not mitigate this harmful effect. Trial Registration ClinicalTrials.gov Identifier: NCT06098950.
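The reported effects can be reproduced as simple percentage-point arithmetic on the baseline accuracy. These are rounded point estimates only; the confidence intervals and the 2.3-point gap come from the trial's statistical model, not from this arithmetic.

```python
BASELINE = 73.0  # clinicians' baseline diagnostic accuracy, in percent

def accuracy_under(effect_pp, baseline=BASELINE):
    """Accuracy under a condition, given its percentage-point change from baseline."""
    return round(baseline + effect_pp, 1)

standard_ai     = accuracy_under(+2.9)   # standard AI, no explanations
standard_ai_xai = accuracy_under(+4.4)   # standard AI with explanations
biased_ai       = accuracy_under(-11.3)  # systematically biased AI
biased_ai_xai   = accuracy_under(-9.1)   # biased AI with explanations
# Explanations recovered little of the harm from biased predictions
# (the trial's model-based estimate of that gap is 2.3 points, nonsignificant).
```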
Affiliation(s)
- Sarah Jabbour
- Computer Science and Engineering, University of Michigan, Ann Arbor
- David Fouhey
- Computer Science and Engineering, University of Michigan, Ann Arbor
- Now with Computer Science, Courant Institute, New York University, New York
- Now with Electrical and Computer Engineering, Tandon School of Engineering, New York University, New York
- Thomas S. Valley
- Pulmonary and Critical Care Medicine, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor
- Ella A. Kazerooni
- Department of Radiology, University of Michigan Medical School, Ann Arbor
- Nikola Banovic
- Computer Science and Engineering, University of Michigan, Ann Arbor
- Jenna Wiens
- Computer Science and Engineering, University of Michigan, Ann Arbor
- Michael W. Sjoding
- Pulmonary and Critical Care Medicine, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor
29
Smith CM, Weathers AL, Lewis SL. An overview of clinical machine learning applications in neurology. J Neurol Sci 2023; 455:122799. [PMID: 37979413 DOI: 10.1016/j.jns.2023.122799]
Abstract
Machine learning techniques for clinical applications are evolving, and the potential impact this will have on clinical neurology is important to recognize. By providing a broad overview on this growing paradigm of clinical tools, this article aims to help healthcare professionals in neurology prepare to navigate both the opportunities and challenges brought on through continued advancements in machine learning. This narrative review first elaborates on how machine learning models are organized and implemented. Machine learning tools are then classified by clinical application, with examples of uses within neurology described in more detail. Finally, this article addresses limitations and considerations regarding clinical machine learning applications in neurology.
Affiliation(s)
- Colin M Smith
- Lehigh Valley Fleming Neuroscience Institute, 1250 S Cedar Crest Blvd., Allentown, PA 18103, USA
- Allison L Weathers
- Cleveland Clinic Information Technology Division, 9500 Euclid Ave., Cleveland, OH 44195, USA
- Steven L Lewis
- Lehigh Valley Fleming Neuroscience Institute, 1250 S Cedar Crest Blvd., Allentown, PA 18103, USA
30
Funer F, Liedtke W, Tinnemeyer S, Klausen AD, Schneider D, Zacharias HU, Langanke M, Salloch S. Responsibility and decision-making authority in using clinical decision support systems: an empirical-ethical exploration of German prospective professionals' preferences and concerns. J Med Ethics 2023; 50:6-11. [PMID: 37217277 PMCID: PMC10803986 DOI: 10.1136/jme-2022-108814]
Abstract
Machine learning-driven clinical decision support systems (ML-CDSSs) seem impressively promising for future routine and emergency care. However, reflection on their clinical implementation reveals a wide array of ethical challenges. The preferences, concerns and expectations of professional stakeholders remain largely unexplored. Empirical research, however, may help to clarify the conceptual debate and its aspects in terms of their relevance for clinical practice. This study explores, from an ethical point of view, future healthcare professionals' attitudes to potential changes of responsibility and decision-making authority when using ML-CDSS. Twenty-seven semistructured interviews were conducted with German medical students and nursing trainees. The data were analysed based on qualitative content analysis according to Kuckartz. Interviewees' reflections are presented under three themes the interviewees describe as closely related: (self-)attribution of responsibility, decision-making authority and need of (professional) experience. The results illustrate the conceptual interconnectedness of professional responsibility and its structural and epistemic preconditions to be able to fulfil clinicians' responsibility in a meaningful manner. The study also sheds light on the four relata of responsibility understood as a relational concept. The article closes with concrete suggestions for the ethically sound clinical implementation of ML-CDSS.
Affiliation(s)
- Florian Funer
- Institute of Ethics, History and Philosophy of Medicine, Hannover Medical School, Hannover, Germany
- Institute of Ethics and History of Medicine, Eberhard Karls University Tübingen, Tübingen, Germany
- Wenke Liedtke
- Department of Social Work, Protestant University of Applied Sciences RWL, Bochum, Germany
- Sara Tinnemeyer
- Institute of Ethics, History and Philosophy of Medicine, Hannover Medical School, Hannover, Germany
- Diana Schneider
- Competence Center Emerging Technologies, Fraunhofer Institute for Systems and Innovation Research ISI, Karlsruhe, Germany
- Helena U Zacharias
- Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Hannover Medical School, Hannover, Germany
- Martin Langanke
- Department of Social Work, Protestant University of Applied Sciences RWL, Bochum, Germany
- Sabine Salloch
- Institute of Ethics, History and Philosophy of Medicine, Hannover Medical School, Hannover, Germany
31
Banerji CRS, Chakraborti T, Harbron C, MacArthur BD. Clinical AI tools must convey predictive uncertainty for each individual patient. Nat Med 2023; 29:2996-2998. [PMID: 37821686 DOI: 10.1038/s41591-023-02562-7]
Affiliation(s)
- Christopher R S Banerji
- The Alan Turing Institute, London, UK
- University College London Hospitals NHS Foundation Trust, London, UK
- UCL Cancer Institute, Faculty of Medical Sciences, University College London, London, UK
- Tapabrata Chakraborti
- The Alan Turing Institute, London, UK
- UCL Cancer Institute, Faculty of Medical Sciences, University College London, London, UK
- Ben D MacArthur
- The Alan Turing Institute, London, UK
- Faculty of Medicine, University of Southampton, Southampton, UK
- Mathematical Sciences, University of Southampton, Southampton, UK
32
Kostick-Quenet K, Lang BH, Smith J, Hurley M, Blumenthal-Barby J. Trust criteria for artificial intelligence in health: normative and epistemic considerations. J Med Ethics 2023:jme-2023-109338. [PMID: 37979976 PMCID: PMC11101592 DOI: 10.1136/jme-2023-109338]
Abstract
Rapid advancements in artificial intelligence and machine learning (AI/ML) in healthcare raise pressing questions about how much users should trust AI/ML systems, particularly for high stakes clinical decision-making. Ensuring that user trust is properly calibrated to a tool's computational capacities and limitations has both practical and ethical implications, given that overtrust or undertrust can influence over-reliance or under-reliance on algorithmic tools, with significant implications for patient safety and health outcomes. It is, thus, important to better understand how variability in trust criteria across stakeholders, settings, tools and use cases may influence approaches to using AI/ML tools in real settings. As part of a 5-year, multi-institutional Agency for Health Care Research and Quality-funded study, we identify trust criteria for a survival prediction algorithm intended to support clinical decision-making for left ventricular assist device therapy, using semistructured interviews (n=40) with patients and physicians, analysed via thematic analysis. Findings suggest that physicians and patients share similar empirical considerations for trust, which were primarily epistemic in nature, focused on accuracy and validity of AI/ML estimates. Trust evaluations considered the nature, integrity and relevance of training data rather than the computational nature of algorithms themselves, suggesting a need to distinguish 'source' from 'functional' explainability. To a lesser extent, trust criteria were also relational (endorsement from others) and sometimes based on personal beliefs and experience. We discuss implications for promoting appropriate and responsible trust calibration for clinical decision-making using AI/ML.
Affiliation(s)
- Kristin Kostick-Quenet
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA
- Benjamin H Lang
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA
- Department of Philosophy, University of Oxford, Oxford, Oxfordshire, UK
- Jared Smith
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA
- Meghan Hurley
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA
33
Kenny R, Fischhoff B, Davis A, Canfield C. Improving Social Bot Detection Through Aid and Training. Hum Factors 2023:187208231210145. [PMID: 37963198 DOI: 10.1177/00187208231210145]
Abstract
OBJECTIVE We test the effects of three aids on individuals' ability to detect social bots among Twitter personas: a bot indicator score, a training video, and a warning. BACKGROUND Detecting social bots can prevent online deception. We use a simulated social media task to evaluate three aids. METHOD Lay participants judged whether each of 60 Twitter personas was a human or social bot in a simulated online environment, using agreement between three machine learning algorithms to estimate the probability of each persona being a bot. Experiment 1 compared a control group and two intervention groups, one provided a bot indicator score for each tweet; the other provided a warning about social bots. Experiment 2 compared a control group and two intervention groups, one receiving the bot indicator scores and the other a training video, focused on heuristics for identifying social bots. RESULTS The bot indicator score intervention improved predictive performance and reduced overconfidence in both experiments. The training video was also effective, although somewhat less so. The warning had no effect. Participants rarely reported willingness to share content for a persona that they labeled as a bot, even when they agreed with it. CONCLUSIONS Informative interventions improved social bot detection; warning alone did not. APPLICATION We offer an experimental testbed and methodology that can be used to evaluate and refine interventions designed to reduce vulnerability to social bots. We show the value of two interventions that could be applied in many settings.
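The study's bot indicator was derived from agreement between three machine-learning algorithms. A minimal stand-in for such an agreement score, assuming a simple mean of detector probabilities (the function names and threshold here are hypothetical, not the study's actual scoring rule), looks like this:

```python
def bot_indicator(detector_probs):
    """Toy agreement score: average several detectors' P(bot) estimates.
    The study itself combined the outputs of three specific algorithms."""
    if not detector_probs:
        raise ValueError("need at least one detector output")
    return sum(detector_probs) / len(detector_probs)

def label_persona(detector_probs, threshold=0.5):
    # Classify a persona from the pooled indicator score.
    return "bot" if bot_indicator(detector_probs) >= threshold else "human"
```

For example, three detectors reporting 0.9, 0.8 and 0.7 pool to an indicator of 0.8, labeling the persona a bot.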
Affiliation(s)
- Ryan Kenny
- United States Army, Fayetteville, NC, USA
- Alex Davis
- Carnegie Mellon University, Pittsburgh, PA, USA
- Casey Canfield
- Missouri University of Science and Technology, Rolla, MO, USA
34
Nagendran M, Festor P, Komorowski M, Gordon AC, Faisal AA. Quantifying the impact of AI recommendations with explanations on prescription decision making. NPJ Digit Med 2023; 6:206. [PMID: 37935953 PMCID: PMC10630476 DOI: 10.1038/s41746-023-00955-z]
Abstract
The influence of AI recommendations on physician behaviour remains poorly characterised. We assess how clinicians' decisions may be influenced by additional information more broadly, and how this influence can be modified both by the source of the information (human peers or AI) and by the presence or absence of an AI explanation (XAI, here using simple feature importance). We used a modified between-subjects design in which intensive care doctors (N = 86) were presented on a computer, for each of 16 trials, with a patient case and prompted to prescribe continuous values for two drugs. We used a multi-factorial experimental design with four arms, where each clinician experienced all four arms on different subsets of our 24 patients. The four arms were (i) baseline (control), (ii) a peer scenario showing what doses other doctors had prescribed, (iii) an AI suggestion and (iv) an XAI suggestion. We found that additional information (peer, AI or XAI) had a strong influence on prescriptions (significantly so for AI, but not for peers), yet simple XAI did not have a greater influence than AI alone. Neither attitudes towards AI nor clinical experience correlated with the AI-supported decisions, and doctors' self-reports of how useful they found the XAI did not correlate with whether the XAI actually influenced their prescriptions. Our findings suggest that the marginal impact of simple XAI was low in this setting, and they cast doubt on the utility of self-reports as a valid metric for assessing XAI in clinical experts.
Affiliation(s)
- Myura Nagendran
- UKRI Centre for Doctoral Training in AI for Healthcare, Imperial College London, London, UK
- Division of Anaesthetics, Pain Medicine, and Intensive Care, Imperial College London, London, UK
- Brain and Behaviour Lab, Imperial College London, London, UK
- Paul Festor
- UKRI Centre for Doctoral Training in AI for Healthcare, Imperial College London, London, UK
- Brain and Behaviour Lab, Imperial College London, London, UK
- Department of Computing, Imperial College London, London, UK
- Matthieu Komorowski
- Division of Anaesthetics, Pain Medicine, and Intensive Care, Imperial College London, London, UK
- Anthony C Gordon
- Division of Anaesthetics, Pain Medicine, and Intensive Care, Imperial College London, London, UK
- Aldo A Faisal
- UKRI Centre for Doctoral Training in AI for Healthcare, Imperial College London, London, UK
- Brain and Behaviour Lab, Imperial College London, London, UK
- Department of Computing, Imperial College London, London, UK
- Institute of Artificial & Human Intelligence, University of Bayreuth, Bayreuth, Germany
35
Li MD, Little BP. Appropriate Reliance on Artificial Intelligence in Radiology Education. J Am Coll Radiol 2023; 20:1126-1130. [PMID: 37392983 DOI: 10.1016/j.jacr.2023.04.019]
Abstract
Users of artificial intelligence (AI) can become overreliant on AI, negatively affecting the performance of human-AI teams. For a future in which radiologists use interpretive AI tools routinely in clinical practice, radiology education will need to evolve to provide radiologists with the skills to use AI appropriately and wisely. In this work, we examine how overreliance on AI may develop in radiology trainees and explore how this problem can be mitigated, including through the use of AI-augmented education. Radiology trainees will still need to develop the perceptual skills and mastery of knowledge fundamental to radiology to use AI safely. We propose a framework for radiology trainees to use AI tools with appropriate reliance, drawing on lessons from human-AI interactions research.
Affiliation(s)
- Matthew D Li
- Department of Radiology and Diagnostic Imaging, Faculty of Medicine & Dentistry, University of Alberta, Edmonton, Alberta, Canada
- Brent P Little
- Mayo Clinic College of Medicine and Science, Department of Radiology, Division of Cardiothoracic Imaging, Mayo Clinic Florida, Florida; Committee Member, ACR Appropriateness Criteria Thoracic Imaging
36
Ghassemi M. Presentation matters for AI-generated clinical advice. Nat Hum Behav 2023; 7:1833-1835. [PMID: 37985904 DOI: 10.1038/s41562-023-01721-7]
Affiliation(s)
- Marzyeh Ghassemi
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Vector Institute, Toronto, Ontario, Canada
37
Schlicker N, Langer M, Hirsch MC. [How trustworthy is artificial intelligence? A model for the conflict between objectivity and subjectivity]. Inn Med (Heidelb) 2023; 64:1051-1057. [PMID: 37737496 DOI: 10.1007/s00108-023-01602-1]
Abstract
For the integration of artificial intelligence (AI) systems into medical processes, it is decisive to address both the trustworthiness of these systems and the trust that physicians and patients have in them. Too much trust can result in physicians uncritically relying on the technology, while too little trust may result in physicians not taking advantage of the full potential of AI-based technology in making decisions. To strike a balance between these extremes, it is crucial to assess the trustworthiness of a system correctly; only then is it possible to decide whether or not the system can be trusted. This article describes these relationships for the medical context. We show why trustworthiness and trust are important in the use of AI-based systems and how individuals can come to an accurate assessment of the trustworthiness of AI-based systems.
Affiliation(s)
- Nadine Schlicker
- Institut für Künstliche Intelligenz in der Medizin, Philipps-Universität Marburg, Baldingerstr., 35043 Marburg, Germany
- Markus Langer
- Fachbereich Psychologie, Arbeitseinheit Digitalisierung in psychologischen Handlungsfeldern, Philipps-Universität Marburg, Marburg, Germany
- Martin C Hirsch
- Institut für Künstliche Intelligenz in der Medizin, Philipps-Universität Marburg, Baldingerstr., 35043 Marburg, Germany
38
Vijayakumar S, Lee VV, Leong QY, Hong SJ, Blasiak A, Ho D. Physicians' Perspectives on AI in Clinical Decision Support Systems: Interview Study of the CURATE.AI Personalized Dose Optimization Platform. JMIR Hum Factors 2023; 10:e48476. [PMID: 37902825 PMCID: PMC10644191 DOI: 10.2196/48476]
Abstract
BACKGROUND Physicians play a key role in integrating new clinical technology into care practices through user feedback and growth propositions to developers of the technology. As physicians are stakeholders involved throughout the technology iteration process, understanding their roles as users can provide nuanced insights into the workings of these technologies. Therefore, understanding physicians' perceptions can be critical to clinical validation, implementation, and downstream adoption. Given the increasing prevalence of clinical decision support systems (CDSSs), there remains a need to gain an in-depth understanding of physicians' perceptions and expectations toward their downstream implementation. This paper explores physicians' perceptions of integrating CURATE.AI, a novel artificial intelligence (AI)-based, clinical-stage personalized dosing CDSS, into clinical practice. OBJECTIVE This study aims to understand physicians' perspectives on integrating CURATE.AI into clinical work and to gather insights on considerations for the implementation of AI-based CDSS tools. METHODS A total of 12 participants completed semistructured interviews examining their knowledge, experience, attitudes, risks, and future course of the personalized combination therapy dosing platform, CURATE.AI. Interviews were audio recorded, transcribed verbatim, and coded manually. The data were thematically analyzed. RESULTS Overall, 3 broad themes and 9 subthemes were identified through thematic analysis. The themes covered considerations that physicians perceived as significant across various stages of new technology development, including trial, clinical implementation, and mass adoption. CONCLUSIONS The study laid out the various ways physicians interpreted an AI-based personalized dosing CDSS, CURATE.AI, for their clinical practice.
The research pointed out that physicians' expectations during the different stages of technology exploration can be nuanced and layered with expectations of implementation that are relevant for technology developers and researchers.
Affiliation(s)
- Smrithi Vijayakumar
- The N.1 Institute for Health, National University of Singapore, Singapore, Singapore
- V Vien Lee
- The N.1 Institute for Health, National University of Singapore, Singapore, Singapore
- Qiao Ying Leong
- The N.1 Institute for Health, National University of Singapore, Singapore, Singapore
- Soo Jung Hong
- Department of Communications and New Media, National University of Singapore, Singapore, Singapore
- Agata Blasiak
- The N.1 Institute for Health, National University of Singapore, Singapore, Singapore
- Department of Biomedical Engineering, National University of Singapore, Singapore, Singapore
- The Institute for Digital Medicine (WisDM), Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Pharmacology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Dean Ho
- The N.1 Institute for Health, National University of Singapore, Singapore, Singapore
- Department of Biomedical Engineering, National University of Singapore, Singapore, Singapore
- The Institute for Digital Medicine (WisDM), Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Pharmacology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
39
Joo H, Mathis MR, Tam M, James C, Han P, Mangrulkar RS, Friedman CP, Vydiswaran VGV. Applying AI and Guidelines to Assist Medical Students in Recognizing Patients With Heart Failure: Protocol for a Randomized Trial. JMIR Res Protoc 2023; 12:e49842. [PMID: 37874618 PMCID: PMC10630872 DOI: 10.2196/49842]
Abstract
BACKGROUND The integration of artificial intelligence (AI) into health care is transforming both clinical practice and medical education. AI-based systems aim to improve the efficacy of clinical tasks, enhancing diagnostic accuracy and tailoring treatment delivery. As AI becomes increasingly prevalent in health care, it is critical for health care providers to use these systems responsibly to mitigate bias, ensure effective outcomes, and provide safe clinical practices. In this study, the clinical task is the identification of heart failure (HF) prior to surgery, with the intention of enhancing clinical decision-making skills. HF is a common and severe disease, but detection remains challenging due to its subtle manifestation, often concurrent with other medical conditions, and the absence of a simple and effective diagnostic test. While advanced HF algorithms have been developed, the use of these AI-based systems to enhance clinical decision-making in medical education remains understudied. OBJECTIVE This research protocol demonstrates our study design, the systematic procedure for selecting surgical cases from electronic health records, and the interventions. The primary objective of this study is to measure the effectiveness of interventions aimed at improving HF recognition before surgery; the second objective is to evaluate the impact of inaccurate AI recommendations; and the third is to explore the relationship between the inclination to accept AI recommendations and their accuracy. METHODS Our study used a 3 × 2 factorial design (intervention type × order of pre-post sets) for this randomized trial with medical students. The student participants are asked to complete a 30-minute e-learning module that includes key information about the intervention and a 5-question quiz, and a 60-minute review of 20 surgical cases to determine the presence of HF.
To mitigate selection bias in the pre- and posttests, we adopted a feature-based systematic sampling procedure. From a pool of 703 expert-reviewed surgical cases, 20 were selected based on features such as case complexity, model performance, and positive and negative labels. This study comprises three interventions: (1) a direct AI-based recommendation with a predicted HF score, (2) an indirect AI-based recommendation gauged through the area under the curve metric, and (3) an HF guideline-based intervention. RESULTS As of July 2023, 62 of the enrolled medical students have fulfilled this study's participation, including the completion of a short quiz and the review of 20 surgical cases. The subject enrollment commenced in August 2022 and will end in December 2023, with the goal of recruiting 75 medical students in years 3 and 4 with clinical experience. CONCLUSIONS We demonstrated a study protocol for the randomized trial, measuring the effectiveness of interventions using AI and HF guidelines among medical students to enhance HF recognition in preoperative care with electronic health record data. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) DERR1-10.2196/49842.
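The feature-based systematic sampling step — drawing 20 test cases from a pool of 703 so that features such as complexity and label are represented — can be sketched as seeded stratified sampling. This is a generic sketch under stated assumptions, not the authors' exact procedure; the stratum keys and counts are illustrative.

```python
import random
from collections import defaultdict

def stratified_sample(cases, stratum_of, n_total, seed=42):
    """Draw an equal number of cases from each stratum (e.g., defined by
    case complexity and HF label) to reduce selection bias in a test set."""
    rng = random.Random(seed)  # fixed seed -> reproducible selection
    strata = defaultdict(list)
    for case in cases:
        strata[stratum_of(case)].append(case)
    per_stratum = n_total // len(strata)
    sample = []
    for key in sorted(strata):  # sorted for deterministic ordering
        pool = strata[key]
        sample.extend(rng.sample(pool, min(per_stratum, len(pool))))
    return sample
```

With four strata (two complexity levels × positive/negative label) and n_total = 20, this yields five cases per stratum.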
Affiliation(s)
- Hyeon Joo
- Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI, United States
- Michael R Mathis
- Department of Anesthesiology, University of Michigan, Ann Arbor, MI, United States
- Marty Tam
- Department of Internal Medicine, Cardiology, University of Michigan, Ann Arbor, MI, United States
- Cornelius James
- Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI, United States
- Department of Pediatrics, University of Michigan, Ann Arbor, MI, United States
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI, United States
- Peijin Han
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
- Rajesh S Mangrulkar
- Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI, United States
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI, United States
- Charles P Friedman
- Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI, United States
- School of Information, University of Michigan, Ann Arbor, MI, United States
- V G Vinod Vydiswaran
- Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI, United States
- School of Information, University of Michigan, Ann Arbor, MI, United States
40
Vicente L, Matute H. Humans inherit artificial intelligence biases. Sci Rep 2023;13:15737. PMID: 37789032; PMCID: PMC10547752; DOI: 10.1038/s41598-023-42384-8.
Abstract
Artificial intelligence recommendations are sometimes erroneous and biased. In our research, we hypothesized that people who perform a (simulated) medical diagnostic task assisted by a biased AI system will reproduce the model's bias in their own decisions, even when they move to a context without AI support. In three experiments, participants completed a medical-themed classification task with or without the help of a biased AI system. The biased recommendations by the AI influenced participants' decisions. Moreover, when those participants, assisted by the AI, moved on to perform the task without assistance, they made the same errors as the AI had made during the previous phase. Thus, participants' responses mimicked AI bias even when the AI was no longer making suggestions. These results provide evidence of human inheritance of AI bias.
Affiliation(s)
- Lucía Vicente
- Department of Psychology, Deusto University, Avenida Universidades 24, 48007, Bilbao, Spain
- Helena Matute
- Department of Psychology, Deusto University, Avenida Universidades 24, 48007, Bilbao, Spain
41
Carboni C, Wehrens R, van der Veen R, de Bont A. Eye for an AI: More-than-seeing, fauxtomation, and the enactment of uncertain data in digital pathology. Soc Stud Sci 2023;53:712-737. PMID: 37154611; PMCID: PMC10543128; DOI: 10.1177/03063127231167589.
Abstract
Artificial Intelligence (AI) tools are being developed to assist with increasingly complex diagnostic tasks in medicine. This produces epistemic disruption in diagnostic processes, even in the absence of AI itself, through the datafication and digitalization encouraged by the promissory discourses around AI. In this study of the digitization of an academic pathology department, we mobilize Barad's agential realist framework to examine these epistemic disruptions. Narratives and expectations around AI-assisted diagnostics, which are inextricable from material changes, enact specific types of organizational change and produce epistemic objects that facilitate the emergence of some epistemic practices and subjects but hinder others. Agential realism allows us to study epistemic, ethical, and ontological changes enacted through digitization efforts simultaneously, while keeping a close eye on the attendant organizational changes. Based on ethnographic analysis of pathologists' changing work processes, we identify three types of uncertainty produced by digitization: sensorial, intra-active, and fauxtomated uncertainty. Sensorial and intra-active uncertainty stem from the ontological otherness of digital objects, materialized in their affordances, and result in digital slides' partial illegibility. Fauxtomated uncertainty stems from quasi-automated digital slide-making, which complicates the question of responsibility for epistemic objects and related knowledge by marginalizing the human.
Affiliation(s)
- Chiara Carboni
- Erasmus University Rotterdam, Rotterdam, The Netherlands
- Rik Wehrens
- Erasmus University Rotterdam, Rotterdam, The Netherlands
42
McCradden M, Hui K, Buchman DZ. Evidence, ethics and the promise of artificial intelligence in psychiatry. J Med Ethics 2023;49:573-579. PMID: 36581457; PMCID: PMC10423547; DOI: 10.1136/jme-2022-108447.
Abstract
Researchers are studying how artificial intelligence (AI) can be used to better detect, prognosticate and subgroup diseases. The idea that AI might advance medicine's understanding of biological categories of psychiatric disorders, as well as provide better treatments, is appealing given the historical challenges with prediction, diagnosis and treatment in psychiatry. Given the power of AI to analyse vast amounts of information, some clinicians may feel obligated to align their clinical judgements with the outputs of the AI system. However, a potential epistemic privileging of AI in clinical judgements may lead to unintended consequences that could negatively affect patient treatment, well-being and rights. The implications are also relevant to precision medicine, digital twin technologies and predictive analytics generally. We propose that a commitment to epistemic humility can help promote judicious clinical decision-making at the interface of big data and AI in psychiatry.
Affiliation(s)
- Melissa McCradden
- Joint Centre for Bioethics, University of Toronto Dalla Lana School of Public Health, Toronto, Ontario, Canada
- Bioethics, The Hospital for Sick Children, Toronto, Ontario, Canada
- Genetics & Genome Biology, Peter Gilgan Centre for Research and Learning, Toronto, Ontario, Canada
- Katrina Hui
- Everyday Ethics Lab, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
- Daniel Z Buchman
- Joint Centre for Bioethics, University of Toronto Dalla Lana School of Public Health, Toronto, Ontario, Canada
- Everyday Ethics Lab, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
43
Day TG, Matthew J, Budd S, Hajnal JV, Simpson JM, Razavi R, Kainz B. Sonographer interaction with artificial intelligence: collaboration or conflict? Ultrasound Obstet Gynecol 2023;62:167-174. PMID: 37523514; DOI: 10.1002/uog.26238.
Affiliation(s)
- T G Day
- Department of Congenital Cardiology, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- Faculty of Life Sciences and Medicine, School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
- J Matthew
- Faculty of Life Sciences and Medicine, School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
- S Budd
- Faculty of Life Sciences and Medicine, School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
- J V Hajnal
- Faculty of Life Sciences and Medicine, School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
- J M Simpson
- Department of Congenital Cardiology, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- Faculty of Life Sciences and Medicine, School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
- R Razavi
- Department of Congenital Cardiology, Evelina London Children's Healthcare, Guy's and St Thomas' NHS Foundation Trust, London, UK
- Faculty of Life Sciences and Medicine, School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
- B Kainz
- Faculty of Life Sciences and Medicine, School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
- Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Department of Computing, Faculty of Engineering, Imperial College London, London, UK
44
van Leeuwen K, Becks M, Grob D, de Lange F, Rutten J, Schalekamp S, Rutten M, van Ginneken B, de Rooij M, Meijer F. AI-support for the detection of intracranial large vessel occlusions: One-year prospective evaluation. Heliyon 2023;9:e19065. PMID: 37636476; PMCID: PMC10458691; DOI: 10.1016/j.heliyon.2023.e19065.
Abstract
Purpose Few studies have evaluated the real-world performance of radiological AI tools in clinical practice. Over one year, we prospectively evaluated the use of AI software to support the detection of intracranial large vessel occlusions (LVO) on CT angiography (CTA). Method Quantitative measures (user log-in attempts, AI standalone performance) and qualitative data (user surveys) were reviewed by a key-user group at three timepoints. A total of 491 CTA studies of 460 patients were included for analysis. Results The overall accuracy of the AI tool for LVO detection and localization was 87.6%, with a sensitivity of 69.1% and a specificity of 91.2%. Of 81 LVOs, 31 of 34 (91%) M1 occlusions were detected correctly, 19 of 38 (50%) M2 occlusions, and 6 of 9 (67%) ICA occlusions. The product was considered user-friendly. Users' diagnostic confidence for LVO detection remained unchanged over the year. The last measured net promoter score was -56%. Use of the AI tool fluctuated over the year with a declining trend. Conclusions Our pragmatic approach to evaluating the AI tool in clinical practice helped us monitor usage, estimate the added value perceived by users, and make an informed decision about continuing use of the tool.
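The reported overall accuracy is consistent with the stated sensitivity, specificity, and case mix. A minimal sketch of the arithmetic; the confusion-matrix counts below are back-calculated approximations from the published percentages, not the study's raw data:

```python
# Diagnostic metrics from a 2x2 confusion matrix. Counts are approximate
# reconstructions from the reported figures (81 LVO-positive of 491 CTA
# studies; sensitivity 69.1%, specificity 91.2%) -- illustrative only.
tp, fn = 56, 25    # ~69.1% of the 81 positives detected
tn, fp = 374, 36   # ~91.2% of the 410 negatives correctly ruled out

sensitivity = tp / (tp + fn)                 # ~0.691
specificity = tn / (tn + fp)                 # ~0.912
accuracy = (tp + tn) / (tp + fn + tn + fp)   # ~0.876, matching the reported 87.6%

print(f"sens={sensitivity:.1%} spec={specificity:.1%} acc={accuracy:.1%}")
```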
Affiliation(s)
- K.G. van Leeuwen
- Department of Medical Imaging, Radboud University Medical Center, Nijmegen, the Netherlands
- M.J. Becks
- Department of Medical Imaging, Radboud University Medical Center, Nijmegen, the Netherlands
- D. Grob
- Department of Medical Imaging, Radboud University Medical Center, Nijmegen, the Netherlands
- F. de Lange
- Department of Medical Imaging, Radboud University Medical Center, Nijmegen, the Netherlands
- J.H.E. Rutten
- Department of Medical Imaging, Radboud University Medical Center, Nijmegen, the Netherlands
- S. Schalekamp
- Department of Medical Imaging, Radboud University Medical Center, Nijmegen, the Netherlands
- M.J.C.M. Rutten
- Department of Medical Imaging, Radboud University Medical Center, Nijmegen, the Netherlands
- Department of Radiology, Jeroen Bosch Hospital, ‘s-Hertogenbosch, the Netherlands
- B. van Ginneken
- Department of Medical Imaging, Radboud University Medical Center, Nijmegen, the Netherlands
- M. de Rooij
- Department of Medical Imaging, Radboud University Medical Center, Nijmegen, the Netherlands
- F.J.A. Meijer
- Department of Medical Imaging, Radboud University Medical Center, Nijmegen, the Netherlands
45
Alarcón ÁS, Madrid NM, Seepold R, Ortega JA. Obstructive sleep apnea event detection using explainable deep learning models for a portable monitor. Front Neurosci 2023;17:1155900. PMID: 37521695; PMCID: PMC10375719; DOI: 10.3389/fnins.2023.1155900.
Abstract
Background Polysomnography (PSG) is the gold standard for detecting obstructive sleep apnea (OSA). However, this technique has many disadvantages when used outside the hospital or for daily monitoring. Portable monitors (PMs) aim to streamline the OSA detection process through deep learning (DL). Materials and methods We studied how to detect OSA events and calculate the apnea-hypopnea index (AHI) using deep learning models intended for implementation on PMs. Several deep learning models are presented after being trained on polysomnography data from the National Sleep Research Resource (NSRR) repository. The best hyperparameters for the DL architecture are presented. In addition, emphasis is placed on model explainability techniques, specifically Gradient-weighted Class Activation Mapping (Grad-CAM). Results The results for the best DL model are presented and analyzed. The interpretability of the DL model is also analyzed by studying the regions of the signals that are most relevant to its decisions. The best-performing model is a one-dimensional convolutional neural network (1D-CNN) with 84.3% accuracy. Conclusion The use of PMs with machine learning techniques for detecting OSA events still has a long way to go. However, our method for developing explainable DL models demonstrates that PMs appear to be a promising alternative to PSG for the future detection of obstructive apnea events and the automatic calculation of the AHI.
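For context, the AHI these models are asked to reproduce is a simple rate: respiratory events per hour of sleep (an AHI of 5 or more is the conventional threshold for at least mild OSA). A trivial sketch with invented numbers:

```python
# Apnea-hypopnea index: (apneas + hypopneas) per hour of sleep.
# The event counts and sleep duration below are invented for illustration.
def ahi(apneas, hypopneas, hours_of_sleep):
    return (apneas + hypopneas) / hours_of_sleep

print(ahi(10, 22, 6.4))  # 32 events over 6.4 h -> AHI of 5.0
```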
Affiliation(s)
- Ángel Serrano Alarcón
- School of Informatics, Reutlingen University, Reutlingen, Germany
- Computer Languages and Systems, University of Seville, Sevilla, Spain
- Ralf Seepold
- Computer Science, HTWG Konstanz, Konstanz, Germany
46
Wehkamp K, Krawczak M, Schreiber S. The Quality and Utility of Artificial Intelligence in Patient Care. Dtsch Arztebl Int 2023;120:463-469. PMID: 37218054; PMCID: PMC10487679; DOI: 10.3238/arztebl.m2023.0124.
Abstract
BACKGROUND Artificial intelligence (AI) is increasingly being used in patient care. In the future, physicians will need to understand not only the basic functioning of AI applications, but also their quality, utility, and risks. METHODS This article is based on a selective review of the literature on the principles, quality, limitations, and benefits of AI applications in patient care, along with examples of individual applications. RESULTS The number of AI applications in patient care is rising, with more than 500 approvals in the United States to date. Their quality and utility rest on a number of interdependent factors, including the real-life setting, the type and amount of data collected, the choice of variables used by the application, the algorithms used, and the goal and implementation of each application. Bias (which may be hidden) and errors can arise at all of these levels. Any evaluation of the quality and utility of an AI application must therefore be conducted according to the scientific principles of evidence-based medicine, a requirement that is often hampered by a lack of transparency. CONCLUSION AI has the potential to improve patient care while meeting the challenge of an ever-increasing surfeit of information and data in medicine with limited human resources. The limitations and risks of AI applications require critical and responsible consideration. This can best be achieved through a combination of scientific.
Affiliation(s)
- Kai Wehkamp
- Department of Internal Medicine I, University Medical Center Schleswig-Holstein, Campus Lübeck, Kiel, Germany
- Department for Medical Management, MSH Medical School Hamburg, Hamburg, Germany
- Michael Krawczak
- Institute of Medical Informatics and Statistics, Christian-Albrechts-University of Kiel, University Medical Center Schleswig-Holstein Campus Kiel, Germany
- Stefan Schreiber
- Department of Internal Medicine I, University Medical Center Schleswig-Holstein, Campus Lübeck, Kiel, Germany
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, University Medical Center Schleswig-Holstein Campus Kiel, Germany
47
Jiang LY, Liu XC, Nejatian NP, Nasir-Moin M, Wang D, Abidin A, Eaton K, Riina HA, Laufer I, Punjabi P, Miceli M, Kim NC, Orillac C, Schnurman Z, Livia C, Weiss H, Kurland D, Neifert S, Dastagirzada Y, Kondziolka D, Cheung ATM, Yang G, Cao M, Flores M, Costa AB, Aphinyanaphongs Y, Cho K, Oermann EK. Health system-scale language models are all-purpose prediction engines. Nature 2023;619:357-362. PMID: 37286606; PMCID: PMC10338337; DOI: 10.1038/s41586-023-06160-y.
Abstract
Physicians make critical time-constrained decisions every day. Clinical predictive models can help physicians and administrators make decisions by forecasting clinical and operational events. Existing structured data-based clinical predictive models have limited use in everyday practice owing to complexity in data processing, as well as model development and deployment1-3. Here we show that unstructured clinical notes from the electronic health record can enable the training of clinical language models, which can be used as all-purpose clinical predictive engines with low-resistance development and deployment. Our approach leverages recent advances in natural language processing4,5 to train a large language model for medical language (NYUTron) and subsequently fine-tune it across a wide range of clinical and operational predictive tasks. We evaluated our approach within our health system for five such tasks: 30-day all-cause readmission prediction, in-hospital mortality prediction, comorbidity index prediction, length of stay prediction, and insurance denial prediction. We show that NYUTron has an area under the curve (AUC) of 78.7-94.9%, with an improvement of 5.36-14.7% in the AUC compared with traditional models. We additionally demonstrate the benefits of pretraining with clinical text, the potential for increasing generalizability to different sites through fine-tuning and the full deployment of our system in a prospective, single-arm trial. These results show the potential for using clinical language models in medicine to read alongside physicians and provide guidance at the point of care.
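The AUC figures quoted here have a simple rank interpretation: the probability that a randomly chosen positive case is scored higher than a randomly chosen negative one. A small sketch of that interpretation; the scores are invented toy values, not NYUTron outputs:

```python
# ROC AUC via the Mann-Whitney pairwise-ranking interpretation: the
# fraction of (positive, negative) pairs ordered correctly, ties worth half.
def roc_auc(pos_scores, neg_scores):
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))

print(roc_auc([0.9, 0.8, 0.6], [0.7, 0.3, 0.2]))  # 8/9: one of nine pairs misordered
```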
Affiliation(s)
- Lavender Yao Jiang
- Department of Neurosurgery, NYU Langone Health, New York, NY, USA
- Center for Data Science, New York University, New York, NY, USA
- Xujin Chris Liu
- Department of Neurosurgery, NYU Langone Health, New York, NY, USA
- Electrical and Computer Engineering, Tandon School of Engineering, New York, NY, USA
- Duo Wang
- Predictive Analytics Unit, NYU Langone Health, New York, NY, USA
- Kevin Eaton
- Department of Internal Medicine, NYU Langone Health, New York, NY, USA
- Ilya Laufer
- Department of Neurosurgery, NYU Langone Health, New York, NY, USA
- Paawan Punjabi
- Department of Internal Medicine, NYU Langone Health, New York, NY, USA
- Madeline Miceli
- Department of Internal Medicine, NYU Langone Health, New York, NY, USA
- Nora C Kim
- Department of Neurosurgery, NYU Langone Health, New York, NY, USA
- Cordelia Orillac
- Department of Neurosurgery, NYU Langone Health, New York, NY, USA
- Zane Schnurman
- Department of Neurosurgery, NYU Langone Health, New York, NY, USA
- Hannah Weiss
- Department of Neurosurgery, NYU Langone Health, New York, NY, USA
- David Kurland
- Department of Neurosurgery, NYU Langone Health, New York, NY, USA
- Sean Neifert
- Department of Neurosurgery, NYU Langone Health, New York, NY, USA
- Grace Yang
- Department of Neurosurgery, NYU Langone Health, New York, NY, USA
- Center for Data Science, New York University, New York, NY, USA
- Ming Cao
- Department of Neurosurgery, NYU Langone Health, New York, NY, USA
- Center for Data Science, New York University, New York, NY, USA
- Yindalon Aphinyanaphongs
- Predictive Analytics Unit, NYU Langone Health, New York, NY, USA
- Department of Population Health, NYU Langone Health, New York, NY, USA
- Kyunghyun Cho
- Center for Data Science, New York University, New York, NY, USA
- Prescient Design, Genentech, New York, NY, USA
- Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
- Canadian Institute for Advanced Research, Toronto, Ontario, Canada
- Eric Karl Oermann
- Department of Neurosurgery, NYU Langone Health, New York, NY, USA
- Center for Data Science, New York University, New York, NY, USA
- Department of Radiology, NYU Langone Health, New York, NY, USA
48
Šuster S, Baldwin T, Verspoor K. Analysis of predictive performance and reliability of classifiers for quality assessment of medical evidence revealed important variation by medical area. J Clin Epidemiol 2023;159:58-69. PMID: 37120028; DOI: 10.1016/j.jclinepi.2023.04.006.
Abstract
OBJECTIVES Reliability is a major obstacle to the deployment of models for automated quality assessment. We analyze the models' calibration and selective classification performance. STUDY DESIGN AND SETTING We examine two systems for assessing the quality of medical evidence, EvidenceGRADEr and RobotReviewer, both developed from the Cochrane Database of Systematic Reviews (CDSR), which measure the strength of bodies of evidence and the risk of bias (RoB) of individual studies, respectively. We report their calibration error and Brier scores, present their reliability diagrams, and analyze the risk-coverage trade-off in selective classification. RESULTS The models are reasonably well calibrated on most quality criteria (expected calibration error [ECE] 0.04-0.09 for EvidenceGRADEr, 0.03-0.10 for RobotReviewer). However, both calibration and predictive performance vary significantly by medical area. This has ramifications for the application of such models in practice, as average performance is a poor indicator of group-level performance (e.g., health and safety at work, allergy and intolerance, and public health see much worse performance than cancer, pain and anesthesia, and neurology). We explore the reasons behind this disparity. CONCLUSION Practitioners adopting automated quality assessment should expect large fluctuations in system reliability and predictive performance depending on the medical area. Prospective indicators of such behavior should be researched further.
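For readers unfamiliar with the two reliability measures used here: the Brier score is the mean squared error of predicted probabilities, and expected calibration error bins predictions by confidence and averages the gap between each bin's confidence and its accuracy. A minimal sketch on invented toy data, not the paper's models or datasets:

```python
# Brier score: mean squared error between predicted probabilities and
# binary outcomes. Lower is better.
def brier(probs, labels):
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)

# Expected calibration error (ECE): bin predictions into equal-width
# confidence bins, then take the weighted average of |confidence - accuracy|.
def ece(probs, labels, n_bins=5):
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    err = 0.0
    for b in bins:
        if b:
            conf = sum(p for p, _ in b) / len(b)
            acc = sum(y for _, y in b) / len(b)
            err += (len(b) / len(probs)) * abs(conf - acc)
    return err

probs, labels = [0.9, 0.9, 0.1, 0.1], [1, 0, 0, 0]
print(brier(probs, labels), ece(probs, labels))  # ~0.21 and ~0.25 on this toy data
```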
Affiliation(s)
- Simon Šuster
- School of Computing and Information Systems, The University of Melbourne, Melbourne, Australia.
- Timothy Baldwin
- School of Computing and Information Systems, The University of Melbourne, Melbourne, Australia; Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Karin Verspoor
- School of Computing Technologies, RMIT University, Melbourne, Australia; School of Computing and Information Systems, The University of Melbourne, Melbourne, Australia
49
Dvijotham KD, Winkens J, Barsbey M, Ghaisas S, Stanforth R, Pawlowski N, Strachan P, Ahmed Z, Azizi S, Bachrach Y, Culp L, Daswani M, Freyberg J, Kelly C, Kiraly A, Kohlberger T, McKinney S, Mustafa B, Natarajan V, Geras K, Witowski J, Qin ZZ, Creswell J, Shetty S, Sieniek M, Spitz T, Corrado G, Kohli P, Cemgil T, Karthikesalingam A. Enhancing the reliability and accuracy of AI-enabled diagnosis via complementarity-driven deferral to clinicians. Nat Med 2023;29:1814-1820. PMID: 37460754; DOI: 10.1038/s41591-023-02437-x.
Abstract
Predictive artificial intelligence (AI) systems based on deep learning have been shown to achieve expert-level identification of diseases in multiple medical imaging settings, but can make errors in cases accurately diagnosed by clinicians and vice versa. We developed Complementarity-Driven Deferral to Clinical Workflow (CoDoC), a system that can learn to decide between the opinion of a predictive AI model and a clinical workflow. CoDoC enhances accuracy relative to clinician-only or AI-only baselines in clinical workflows that screen for breast cancer or tuberculosis (TB). For breast cancer screening, compared to double reading with arbitration in a screening program in the UK, CoDoC reduced false positives by 25% at the same false-negative rate, while achieving a 66% reduction in clinician workload. For TB triaging, compared to standalone AI and clinical workflows, CoDoC achieved a 5-15% reduction in false positives at the same false-negative rate for three of five commercially available predictive AI systems. To facilitate the deployment of CoDoC in future clinical settings, we present results showing that CoDoC's performance gains are sustained across several axes of variation (imaging modality, clinical setting and predictive AI system) and discuss the limitations of our evaluation and where further validation would be needed. We provide an open-source implementation to encourage further research and application.
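The deferral idea can be caricatured with a simple confidence-band rule. This sketch is an invented illustration of "deciding between an AI opinion and a clinical workflow", not the published CoDoC model, which learns its deferral behaviour from data rather than using a fixed band:

```python
# Toy deferral policy: act on the AI score when it is confident, and
# defer to the clinician's read inside an uncertain band. The band
# (0.2, 0.8) is invented for illustration only.
def decide(ai_score, clinician_read, lower=0.2, upper=0.8):
    if ai_score >= upper:
        return 1                 # confident AI positive
    if ai_score <= lower:
        return 0                 # confident AI negative
    return clinician_read        # uncertain: defer to the clinician

print(decide(0.95, 0), decide(0.5, 1), decide(0.05, 1))  # 1 1 0
```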
Affiliation(s)
- Laura Culp
- Google DeepMind, Toronto, Ontario, Canada
- Jan Witowski
- NYU Grossman School of Medicine, New York, NY, USA
50
Gasmi I, Calinghen A, Parienti JJ, Belloy F, Fohlen A, Pelage JP. Comparison of diagnostic performance of a deep learning algorithm, emergency physicians, junior radiologists and senior radiologists in the detection of appendicular fractures in children. Pediatr Radiol 2023;53:1675-1684. PMID: 36877239; DOI: 10.1007/s00247-023-05621-w.
Abstract
BACKGROUND Advances have been made in the use of artificial intelligence (AI) in diagnostic imaging, particularly in the detection of fractures on conventional radiographs. Few studies have examined fracture detection in the pediatric population, yet anatomical variation and skeletal development with the child's age call for studies specific to this group. Failure to diagnose fractures early in children may have serious consequences for growth. OBJECTIVE To evaluate the performance of an AI algorithm based on deep neural networks in detecting traumatic appendicular fractures in a pediatric population, and to compare the sensitivity, specificity, positive predictive value and negative predictive value of different readers and the AI algorithm. MATERIALS AND METHODS This retrospective study of 878 patients younger than 18 years evaluated conventional radiographs obtained after recent non-life-threatening trauma. All radiographs of the shoulder, arm, elbow, forearm, wrist, hand, leg, knee, ankle and foot were evaluated. The diagnostic performance of a consensus of radiology experts in pediatric imaging (reference standard) was compared with that of pediatric radiologists, emergency physicians, senior residents and junior residents. The predictions made by the AI algorithm were compared with the annotations made by the different physicians. RESULTS The algorithm predicted 174 fractures out of 182, corresponding to a sensitivity of 95.6%, a specificity of 91.64% and a negative predictive value of 98.76%. The AI predictions were close to those of pediatric radiologists (sensitivity 98.35%) and senior residents (95.05%) and above those of emergency physicians (81.87%) and junior residents (90.1%). The algorithm identified 3 (1.6%) fractures not initially seen by pediatric radiologists. CONCLUSION This study suggests that deep learning algorithms can be useful in improving the detection of fractures in children.
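The three reported metrics fit together arithmetically. A minimal sketch; the confusion-matrix counts below are approximate reconstructions from the reported figures, not the study's raw data:

```python
# Sensitivity, specificity, and negative predictive value from a 2x2
# confusion matrix. Counts are back-calculated approximations from the
# reported results (174 of 182 fractures detected; specificity 91.64%;
# NPV 98.76%) -- illustrative only.
tp, fn = 174, 8   # 174 of 182 fractures predicted
tn, fp = 636, 58  # reconstructed from the stated specificity and NPV

sensitivity = tp / (tp + fn)   # ~0.956
specificity = tn / (tn + fp)   # ~0.9164
npv = tn / (tn + fn)           # ~0.9876
```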
Affiliation(s)
- Idriss Gasmi
- Department of Radiology, Caen University Medical Center, 14033 Cedex 9, Caen, France
- Arvin Calinghen
- Department of Radiology, Caen University Medical Center, 14033 Cedex 9, Caen, France
- Jean-Jacques Parienti
- GRAM 2.0 EA2656 UNICAEN Normandie, University Hospital, Caen, France
- Department of Clinical Research, Caen University Hospital, Caen, France
- Frederique Belloy
- Department of Radiology, Caen University Medical Center, 14033 Cedex 9, Caen, France
- Audrey Fohlen
- Department of Radiology, Caen University Medical Center, 14033 Cedex 9, Caen, France
- UNICAEN CEA CNRS ISTCT- CERVOxy, Normandie University, 14000, Caen, France
- Jean-Pierre Pelage
- Department of Radiology, Caen University Medical Center, 14033 Cedex 9, Caen, France
- UNICAEN CEA CNRS ISTCT- CERVOxy, Normandie University, 14000, Caen, France