1
Le KDR, Tay SBP, Choy KT, Verjans J, Sasanelli N, Kong JCH. Applications of natural language processing tools in the surgical journey. Front Surg 2024; 11:1403540. [PMID: 38826809 PMCID: PMC11140056 DOI: 10.3389/fsurg.2024.1403540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 05/07/2024] [Indexed: 06/04/2024]
Abstract
Background: Natural language processing tools are increasingly being adopted across multiple industries worldwide. They have shown promising results; however, their use in the field of surgery is under-recognised. Many trials have assessed these benefits in small settings, with promising results, before large-scale adoption in surgery can be considered. This study aims to review the current research and insights into the potential for implementation of natural language processing tools in surgery.
Methods: A narrative review was conducted following a computer-assisted literature search of the Medline, EMBASE and Google Scholar databases. Papers related to natural language processing tools and their potential use in surgery were considered.
Results: Current applications of natural language processing tools within surgery are limited. The literature provides evidence of potential improvement in surgical capability and service delivery, such as through the use of these technologies to streamline processes including surgical triaging, data collection and auditing, surgical communication and documentation. Additionally, there is potential to extend these capabilities to surgical academia to improve processes in surgical research and allow innovation in the development of educational resources. Despite these outcomes, the evidence supporting these findings is weakened by small sample sizes and limited applicability to broader settings.
Conclusion: With the increasing adoption of natural language processing technology, such as in popular forms like ChatGPT, there has been increasing research into the use of these tools within surgery to improve surgical workflow and efficiency. This review highlights multifaceted applications of natural language processing within surgery, albeit with clear limitations due to the infancy of the infrastructure available to leverage these technologies. There remains room for more rigorous research into the broader capability of natural language processing technology within the field of surgery, and a need for cross-sectoral collaboration to understand how these algorithms can best be integrated.
Affiliation(s)
- Khang Duy Ricky Le
  - Department of General Surgical Specialties, The Royal Melbourne Hospital, Melbourne, VIC, Australia
  - Department of Surgical Oncology, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
  - Geelong Clinical School, Deakin University, Geelong, VIC, Australia
  - Department of Medical Education, The University of Melbourne, Melbourne, VIC, Australia
- Samuel Boon Ping Tay
  - Department of Anaesthesia and Pain Medicine, Eastern Health, Box Hill, VIC, Australia
- Kay Tai Choy
  - Department of Surgery, Austin Health, Melbourne, VIC, Australia
- Johan Verjans
  - Australian Institute for Machine Learning (AIML), University of Adelaide, Adelaide, SA, Australia
  - Lifelong Health Theme (Platform AI), South Australian Health and Medical Research Institute, Adelaide, SA, Australia
- Nicola Sasanelli
  - Division of Information Technology, Engineering and the Environment, University of South Australia, Adelaide, SA, Australia
  - Department of Operations (Strategic and International Partnerships), SmartSAT Cooperative Research Centre, Adelaide, SA, Australia
  - Agora High Tech, Adelaide, SA, Australia
- Joseph C. H. Kong
  - Department of Surgical Oncology, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
  - Monash University Department of Surgery, Alfred Hospital, Melbourne, VIC, Australia
  - Department of Colorectal Surgery, Alfred Hospital, Melbourne, VIC, Australia
  - Sir Peter MacCallum Department of Oncology, The University of Melbourne, Melbourne, VIC, Australia
2
Arenson M, Hogan J, Xu L, Lynch R, Lee YTH, Choi JD, Sun J, Adams A, Patzer RE. Predicting Kidney Transplant Recipient Cohorts' 30-Day Rehospitalization Using Clinical Notes and Electronic Health Care Record Data. Kidney Int Rep 2023; 8:489-498. [PMID: 36938078 PMCID: PMC10014371 DOI: 10.1016/j.ekir.2022.12.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Revised: 12/04/2022] [Accepted: 12/05/2022] [Indexed: 12/14/2022]
Abstract
Introduction: Rehospitalization after kidney transplant is costly to patients and health care systems and is associated with poor outcomes. Few prediction model studies have examined whether inclusion of clinical notes data from the electronic medical record (EMR) enhances prediction of rehospitalization.
Methods: In a retrospective, observational study of first-time, adult kidney transplant recipients at a large, urban hospital in the southeastern United States (2005-2015), we examined 30-day rehospitalization (30DR) using structured EMR and unstructured (i.e., clinical notes) data. We used natural language processing (NLP) methods on 8 types of clinical notes and included terms in predictive models using unsupervised machine learning approaches. The areas under both the receiver operating characteristic curve and the precision-recall curve (ROC and PRC, respectively) were used to determine and compare model accuracy, and 5-fold cross-validation tested model performance.
Results: Among 2060 kidney transplant recipients, 30.7% were readmitted within 30 days. Predictive models using clinical notes did not meaningfully improve performance over previous models using structured data alone (ROC 0.6821; 95% confidence interval [CI]: 0.6644, 0.6998). Predictive models built using solely clinical notes performed worse than models using both clinical notes and structured data. The data that contributed to the top performing models were not identical, but both included structured data and progress notes (ROC 0.6902; 95% CI: 0.6699, 0.7105).
Conclusions: Including new features from clinical notes in risk prediction models did not substantially increase predictive accuracy for 30DR for kidney transplant recipients. Future research should consider pooling data from multiple institutions to increase sample size and avoid overfitting models.
Affiliation(s)
- Michael Arenson
  - Department of Surgery, Division of Transplantation, Emory University School of Medicine, Atlanta, Georgia, USA
  - Department of Pediatrics, Child Health Equity Center, UMass Chan Medical School, Worcester, Massachusetts, USA
- Julien Hogan
  - Department of Surgery, Division of Transplantation, Emory University School of Medicine, Atlanta, Georgia, USA
- Liyan Xu
  - Department of Computer Science, Emory University, Atlanta, Georgia, USA
- Raymond Lynch
  - Department of Surgery, Division of Transplantation, Emory University School of Medicine, Atlanta, Georgia, USA
- Yi-Ting Hana Lee
  - Department of Surgery, Division of Transplantation, Emory University School of Medicine, Atlanta, Georgia, USA
- Jinho D. Choi
  - Department of Computer Science, Emory University, Atlanta, Georgia, USA
- Jimeng Sun
  - Department of Computer Science, University of Illinois, Urbana-Champaign, Champaign, Illinois, USA
- Andrew Adams
  - Department of Surgery, Division of Transplantation, University of Minnesota, Minneapolis, Minnesota, USA
- Rachel E. Patzer
  - Department of Surgery, Division of Transplantation, Emory University School of Medicine, Atlanta, Georgia, USA
  - Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, Georgia, USA
  - Correspondence: Rachel E. Patzer, Department of Surgery, Emory University School of Medicine, 101 Woodruff Circle, 5101 WMB, Atlanta, Georgia 30322, USA.
3
Kim M, Park S, Kim C, Choi M. Diagnostic accuracy of clinical outcome prediction using nursing data in intensive care patients: A systematic review. Int J Nurs Stud 2023; 138:104411. [PMID: 36495596 DOI: 10.1016/j.ijnurstu.2022.104411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2022] [Revised: 09/17/2022] [Accepted: 11/22/2022] [Indexed: 11/30/2022]
Abstract
BACKGROUND: Nursing data consist of observations of patients' conditions and information on nurses' clinical judgment based on critically ill patients' behavior and physiological signs. Nursing data in electronic health records were recently emphasized as important predictors of patients' deterioration but have not been systematically reviewed.
OBJECTIVE: We conducted a systematic review of prediction models using nursing data for clinical outcomes, such as prolonged hospital stay, readmission, and mortality in intensive care patients, compared to physiological data only. In addition, the type of nursing data used in prediction model development was investigated.
DESIGN: A systematic review.
METHODS: PubMed, CINAHL, Cochrane CENTRAL, EMBASE, IEEE Xplore Digital Library, Web of Science, and Scopus were searched. Clinical outcome prediction models using nursing data for intensive care patients were included. Clinical outcomes were prolonged hospital stay, readmission, and mortality. Data extracted from the selected studies included study design, data source, outcome definition, sample size, predictors, reference test, model development, model performance, and evaluation. The risk of bias and applicability were assessed using the Prediction model Risk of Bias Assessment Tool checklist. Descriptive summaries were produced based on paired forest plots and summary receiver operating characteristic curves.
RESULTS: Sixteen studies were included in the systematic review. The data types of predictors used in prediction models were categorized as physiological data, nursing data, and clinical notes. The types of nursing data consisted of nursing notes, assessments, documentation frequency, and flowsheet comments. The studies using physiological data as a reference test showed higher predictive performance with combined data or nursing data than with physiological data. The overall risk-of-bias assessment indicated that most of the included studies had a high risk of bias.
CONCLUSIONS: This study was conducted to identify and review the diagnostic accuracy of clinical outcome prediction using nursing data in intensive care patients. Most of the included studies developed models using nursing notes, and other studies used nursing assessments, documentation frequency, and flowsheet comments. Although the findings need careful interpretation due to the high risk of bias, the area under the curve scores of nursing data and combined data were higher than those of physiological data alone. It is necessary to establish a strategy in prediction modeling to utilize nursing data, clinical notes, and physiological data as predictors, considering the clinical context rather than physiological data alone.
REGISTRATION: The protocol for this study is registered with PROSPERO (registration number: CRD42021273319).
Affiliation(s)
- Mihui Kim
  - College of Nursing and Brain Korea 21 FOUR Project, Yonsei University, Seoul, Republic of Korea
- Sangwoo Park
  - College of Nursing and Mo-Im Kim Nursing Research Institute, Yonsei University, Seoul, Republic of Korea
- Changhwan Kim
  - School of Nursing, Johns Hopkins University, Baltimore, MD, United States of America
- Mona Choi
  - College of Nursing and Mo-Im Kim Nursing Research Institute, Yonsei University, Seoul, Republic of Korea
  - Yonsei Evidence Based Nursing Centre of Korea, A JBI Affiliated Group, Seoul, Republic of Korea
4
Morris MX, Song EY, Rajesh A, Kass N, Asaad M, Phillips BT. New Frontiers of Natural Language Processing in Surgery. Am Surg 2023; 89:43-48. [PMID: 35969539 DOI: 10.1177/00031348221117039] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/15/2022]
Abstract
The vast and ever-growing volume of electronic health records (EHR) has generated a wealth of information-rich data. Traditional, non-machine learning data extraction techniques are error-prone and laborious, hindering the analytical potential of these massive data sources. Equipped with natural language processing (NLP) tools, surgeons are better able to automate and customize their review to investigate and implement surgical solutions. We identify current perioperative applications of NLP algorithms as well as research limitations and future avenues to outline the impact and potential of this technology for progressing surgical innovation.
Affiliation(s)
- Miranda X Morris
  - Duke University School of Medicine, Durham, NC, USA
  - Duke Pratt School of Engineering, Durham, NC, USA
- Ethan Y Song
  - Division of Plastic, Maxillofacial, and Oral Surgery, Department of Surgery, Duke University, Durham, NC, USA
- Aashish Rajesh
  - Department of Surgery, University of Texas Health Science Center at San Antonio, San Antonio, TX, USA
- Nicolas Kass
  - University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Malke Asaad
  - Department of Plastic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
- Brett T Phillips
  - Division of Plastic, Maxillofacial, and Oral Surgery, Department of Surgery, Duke University, Durham, NC, USA
5
Liu CF, Hung CM, Ko SC, Cheng KC, Chao CM, Sung MI, Hsing SC, Wang JJ, Chen CJ, Lai CC, Chen CM, Chiu CC. An artificial intelligence system to predict the optimal timing for mechanical ventilation weaning for intensive care unit patients: A two-stage prediction approach. Front Med (Lausanne) 2022; 9:935366. [PMID: 36465940 PMCID: PMC9715756 DOI: 10.3389/fmed.2022.935366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Accepted: 10/11/2022] [Indexed: 11/03/2023]
Abstract
Background: For intensivists, accurately assessing the ideal timing for successful weaning from mechanical ventilation (MV) in the intensive care unit (ICU) is very challenging.
Purpose: To use an artificial intelligence (AI) approach to build two-stage predictive models, namely the try-weaning stage and the weaning MV stage, that determine the optimal timing of weaning from MV for intubated ICU patients, and to implement them in practice to assist clinical decision making.
Methods: AI and machine learning (ML) technologies were used to establish the predictive models in the two stages. Each stage comprised 11 prediction time points with 11 prediction models. Twenty-five features were used for the first-stage models, while 20 features were used for the second-stage models. The optimal models for each time point were selected for further practical implementation as a digital dashboard. Seven machine learning algorithms were used: Logistic Regression (LR), Random Forest (RF), Support Vector Machines (SVM), K Nearest Neighbor (KNN), lightGBM, XGBoost, and Multilayer Perceptron (MLP). The electronic medical records of the intubated ICU patients of Chi Mei Medical Center (CMMC) from 2016 to 2019 were included for modeling. Models with the highest area under the receiver operating characteristic curve (AUC) were regarded as optimal models and used to develop the prediction system accordingly.
Results: A total of 5,873 cases were included in machine learning modeling for Stage 1, with the AUCs of optimal models ranging from 0.843 to 0.953. Further, 4,172 cases were included for Stage 2, with the AUCs of optimal models ranging from 0.889 to 0.944. A prediction system (dashboard) with the optimal models of the two stages was developed and deployed in the ICU setting. Respiratory care members expressed high recognition of the AI dashboard for assisting ventilator weaning decisions. Also, the impact analysis of with- and without-AI assistance revealed that the AI models could shorten the patients' intubation time by 21 hours, in addition to showing substantial consistency between the two decision-making strategies.
Conclusion: We found that the two-stage AI prediction models could effectively and precisely predict the optimal timing to wean intubated ICU patients from ventilator use. This could reduce patient discomfort, improve medical quality, and lower medical costs. This AI-assisted prediction system can help clinicians cope with a high demand for ventilators during the COVID-19 pandemic.
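The selection rule described in the Methods (keep the model with the highest AUC at each time point) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `auc`, the model names, and all scores and labels are hypothetical, and the AUC is computed by the pairwise-ranking definition rather than by any particular library.

```python
def auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = successful weaning) and predicted probabilities
# from two candidate models at one prediction time point.
labels = [1, 0, 1, 1, 0, 0]
candidate_scores = {
    "model_a": [0.9, 0.2, 0.7, 0.4, 0.5, 0.1],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.3, 0.8],
}

# Score every candidate and keep the one with the highest AUC.
aucs = {name: auc(s, labels) for name, s in candidate_scores.items()}
best_model = max(aucs, key=aucs.get)
```

In this toy data `model_a` ranks 8 of the 9 positive-negative pairs correctly and is selected. Repeating the comparison per time point would yield one optimal model for each of the 11 time points in a stage.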
Affiliation(s)
- Chung-Feng Liu
  - Department of Medical Research, Chi Mei Medical Center, Tainan, Taiwan
- Chao-Ming Hung
  - Department of General Surgery, E-Da Cancer Hospital, Kaohsiung, Taiwan
  - College of Medicine, I-Shou University, Kaohsiung, Taiwan
- Shian-Chin Ko
  - Department of Respiratory Therapy, Chi Mei Medical Center, Tainan, Taiwan
- Kuo-Chen Cheng
  - Department of Internal Medicine, Chi Mei Medical Center, Tainan, Taiwan
- Chien-Ming Chao
  - Department of Intensive Care Medicine, Chi Mei Medical Center, Liouying, Taiwan
  - Department of Dental Laboratory Technology, Min-Hwei College of Health Care Management, Liouying, Taiwan
- Mei-I Sung
  - Department of Respiratory Therapy, Chi Mei Medical Center, Tainan, Taiwan
- Shu-Chen Hsing
  - Department of Respiratory Therapy, Chi Mei Medical Center, Tainan, Taiwan
- Jhi-Joung Wang
  - Department of Anesthesiology, Chi Mei Medical Center, Tainan, Taiwan
  - Department of Anesthesiology, National Defense Medical Center, Taipei, Taiwan
- Chia-Jung Chen
  - Department of Information Systems, Chi Mei Medical Center, Tainan, Taiwan
- Chih-Cheng Lai
  - Division of Hospital Medicine, Department of Internal Medicine, Chi Mei Medical Center, Tainan, Taiwan
- Chin-Ming Chen
  - Department of Intensive Care Medicine, Chi Mei Medical Center, Tainan, Taiwan
- Chong-Chi Chiu
  - Department of General Surgery, E-Da Cancer Hospital, Kaohsiung, Taiwan
  - School of Medicine, College of Medicine, I-Shou University, Kaohsiung, Taiwan
  - Department of Medical Education and Research, E-Da Cancer Hospital, Kaohsiung, Taiwan
  - Department of General Surgery, Chi Mei Medical Center, Tainan, Taiwan
6
Wang N, Wang M, Jiang L, Du B, Zhu B, Xi X. The predictive value of the Oxford Acute Severity of Illness Score for clinical outcomes in patients with acute kidney injury. Ren Fail 2022; 44:320-328. [PMID: 35168501 PMCID: PMC8856098 DOI: 10.1080/0886022x.2022.2027247] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/21/2022]
Abstract
Objective: To compare the performance of the Oxford Acute Severity of Illness Score (OASIS), the Acute Physiology and Chronic Health Evaluation II (APACHE II) score, the Simplified Acute Physiology Score II (SAPS II), and the Sequential Organ Failure Assessment (SOFA) score in predicting 28-day mortality in acute kidney injury (AKI) patients.
Methods: Data were extracted from the Beijing Acute Kidney Injury Trial (BAKIT). A total of 2954 patients with complete clinical data were included in this study. Receiver operating characteristic (ROC) curves were used to analyze and evaluate the predictive effects of the four scoring systems on the 28-day mortality risk of AKI patients and each subgroup. The best cutoff value was identified by the highest combined sensitivity and specificity using Youden’s index.
Results: Among the four scoring systems, the area under the curve (AUC) of OASIS was the highest. The comparison of AUC values of different scoring systems showed that there were no significant differences among OASIS, APACHE II, and SAPS II, which were better than SOFA. Moreover, logistic analysis revealed that OASIS was an independent risk factor for 28-day mortality in AKI patients. OASIS also had good predictive ability for the 28-day mortality of each subgroup of AKI patients.
Conclusion: OASIS, APACHE II, and SAPS II all presented good discrimination and calibration in predicting the 28-day mortality risk of AKI patients. OASIS, APACHE II, and SAPS II had better predictive accuracy than SOFA, but due to the complexity of APACHE II and SAPS II calculations, OASIS is a good substitute.
Trial Registration: This study was registered at www.chictr.org.cn (registration number ChiCTR-ONC-11001875). Registered on 14 December 2011.
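Cutoff selection by Youden's index, as mentioned in the Methods, picks the threshold that maximises J = sensitivity + specificity - 1. The sketch below illustrates the idea under stated assumptions: the function name `youden_best_cutoff` is hypothetical, a case is called positive when its score is at or above the cutoff, and the scores and outcomes are invented toy values, not BAKIT data.

```python
def youden_best_cutoff(scores, labels):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    A case is predicted positive when score >= cutoff. On ties in J the
    lowest cutoff is kept.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical severity scores and 28-day outcomes (1 = died).
scores = [20, 25, 30, 35, 40, 45, 50, 55]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_best_cutoff(scores, labels)
```

For these toy values the cutoff of 35 gives sensitivity 1.0 and specificity 0.75, so J = 0.75. A cutoff of 45 reaches the same J with the opposite trade-off, which shows that Youden's index weights sensitivity and specificity equally.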
Affiliation(s)
- Na Wang
  - Emergency Department of China Rehabilitation Research Center, Capital Medical University, Beijing, China
- Meiping Wang
  - Department of Epidemiology and Health Statistics, School of Public Health, Capital Medical University, Beijing, China
- Li Jiang
  - Department of Critical Care Medicine, Xuan Wu Hospital, Capital Medical University, Beijing, China
- Bin Du
  - Medical Intensive Care Unit, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Bo Zhu
  - Department of Critical Care Medicine, Fu Xing Hospital, Capital Medical University, Beijing, China
- Xiuming Xi
  - Department of Critical Care Medicine, Fu Xing Hospital, Capital Medical University, Beijing, China
7
Impact of Different Approaches to Preparing Notes for Analysis With Natural Language Processing on the Performance of Prediction Models in Intensive Care. Crit Care Explor 2021; 3:e0450. [PMID: 34136824 PMCID: PMC8202578 DOI: 10.1097/cce.0000000000000450] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/15/2022]
Abstract
OBJECTIVES: To evaluate whether different approaches to note text preparation (known as preprocessing) can impact machine learning model performance in the case of mortality prediction in the ICU.
DESIGN: Clinical note text was used to build machine learning models for adults admitted to the ICU. Preprocessing strategies studied were none (raw text), cleaning text, stemming, term frequency-inverse document frequency vectorization, and creation of n-grams. Model performance was assessed by the area under the receiver operating characteristic curve. Models were trained and internally validated on University of California San Francisco data using 10-fold cross validation. These models were then externally validated on Beth Israel Deaconess Medical Center data.
SETTING: ICUs at University of California San Francisco and Beth Israel Deaconess Medical Center.
SUBJECTS: Ten thousand patients in the University of California San Francisco training and internal testing dataset and 27,058 patients in the external validation dataset, Beth Israel Deaconess Medical Center.
INTERVENTIONS: None.
MEASUREMENTS AND MAIN RESULTS: Mortality rate at Beth Israel Deaconess Medical Center and University of California San Francisco was 10.9% and 7.4%, respectively. Data are presented as area under the receiver operating characteristic curve (95% CI) for models validated at University of California San Francisco and area under the receiver operating characteristic curve for models validated at Beth Israel Deaconess Medical Center. Models built and trained on University of California San Francisco data for the prediction of inhospital mortality improved from the raw note text model (AUROC, 0.84; CI, 0.80–0.89) to the term frequency-inverse document frequency model (AUROC, 0.89; CI, 0.85–0.94). When applying the models developed at University of California San Francisco to Beth Israel Deaconess Medical Center data, there was a similar increase in model performance from raw note text (area under the receiver operating characteristic curve at Beth Israel Deaconess Medical Center: 0.72) to the term frequency-inverse document frequency model (area under the receiver operating characteristic curve at Beth Israel Deaconess Medical Center: 0.83).
CONCLUSIONS: Differences in preprocessing strategies for note text impacted model discrimination. A preprocessing pathway that included cleaning, stemming, and term frequency-inverse document frequency vectorization yielded the greatest improvement in model performance. Further study is needed, with particular emphasis on how to manage author implicit bias present in note text, before natural language processing algorithms are implemented in the clinical setting.
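The preprocessing steps named in the Design section (cleaning, stemming, n-gram creation, and term frequency-inverse document frequency weighting) can be sketched with the Python standard library alone. Everything here is an illustrative assumption rather than the study's pipeline: the helper names are hypothetical, `crude_stem` is a toy suffix stripper standing in for a real stemmer, the weighting uses the plain form tf × log(N/df) whereas libraries often apply smoothed variants, and the three notes are invented.

```python
import math
import re
from collections import Counter


def clean(text):
    """Lowercase the note and keep only letters, returning a token list."""
    return re.sub(r"[^a-z ]+", " ", text.lower()).split()


def crude_stem(token):
    """Toy suffix stripper; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def ngrams(tokens, n):
    """Return the contiguous n-token sequences joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, max_n=2):
    """Return one {term: weight} dict per document (unigrams up to max_n-grams)."""
    term_lists = []
    for doc in docs:
        tokens = [crude_stem(t) for t in clean(doc)]
        terms = []
        for n in range(1, max_n + 1):
            terms.extend(ngrams(tokens, n))
        term_lists.append(terms)
    # Document frequency: in how many notes each term appears at least once.
    doc_freq = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        vectors.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return vectors


# Hypothetical note snippets.
notes = [
    "Patient intubated, sedated.",
    "Patient extubated; breathing comfortably.",
    "Patient walking today.",
]
vectors = tfidf(notes)
```

Because "patient" occurs in every note, its weight is zero, while a term confined to one note, such as the stem "intubat", receives a positive weight. This down-weighting of ubiquitous words is one plausible reason a TF-IDF representation could discriminate better than raw text.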
8
Abstract
OBJECTIVE: The aim of this study was to systematically assess the application and potential benefits of natural language processing (NLP) in surgical outcomes research.
SUMMARY BACKGROUND DATA: Widespread implementation of electronic health records (EHRs) has generated a massive patient data source. Traditional methods of data capture, such as billing codes and/or manual review of free-text narratives in EHRs, are highly labor-intensive, costly, subjective, and potentially prone to bias.
METHODS: A literature search of PubMed, MEDLINE, Web of Science, and Embase identified all articles published starting in 2000 that used NLP models to assess perioperative surgical outcomes. Evaluation metrics of NLP systems were assessed by means of pooled analysis and meta-analysis. Qualitative synthesis was carried out to assess the results and risk of bias on outcomes.
RESULTS: The present study included 29 articles, with over half (n = 15) published after 2018. The most common outcome identified using NLP was postoperative complications (n = 14). Compared to traditional non-NLP models, NLP models identified postoperative complications with higher sensitivity [0.92 (0.87-0.95) vs 0.58 (0.33-0.79), P < 0.001]. The specificities were comparable at 0.99 (0.96-1.00) and 0.98 (0.95-0.99), respectively. Using summary of likelihood ratio matrices, traditional non-NLP models have clinical utility for confirming documentation of outcomes/diagnoses, whereas NLP models may be reliably utilized for both confirming and ruling out documentation of outcomes/diagnoses.
CONCLUSIONS: NLP usage to extract a range of surgical outcomes, particularly postoperative complications, is accelerating across disciplines and areas of clinical outcomes research. NLP and traditional non-NLP approaches demonstrate similar performance measures, but NLP is superior in ruling out documentation of surgical outcomes.
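The confirm versus rule-out distinction in the Results can be illustrated by converting the pooled point estimates into likelihood ratios, using LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. This is a rough sketch only: the function name is hypothetical, computing ratios from pooled point estimates is an approximation and not the study's likelihood ratio matrix method, and the thresholds LR+ > 10 (confirming) and LR- < 0.1 (ruling out) are a commonly used rule of thumb assumed here, not taken from the abstract.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Pooled point estimates reported in the abstract.
nlp_pos, nlp_neg = likelihood_ratios(0.92, 0.99)    # LR+ about 92, LR- about 0.08
trad_pos, trad_neg = likelihood_ratios(0.58, 0.98)  # LR+ about 29, LR- about 0.43

# Rule-of-thumb thresholds (an assumption, see lead-in).
nlp_confirms, nlp_rules_out = nlp_pos > 10, nlp_neg < 0.1
trad_confirms, trad_rules_out = trad_pos > 10, trad_neg < 0.1
```

Under these assumptions both approaches clear the confirming threshold, but only the NLP estimates clear the rule-out threshold, which is consistent with the abstract's conclusion that the advantage of NLP lies in ruling out documentation of outcomes.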
9
Maassen O, Fritsch S, Palm J, Deffge S, Kunze J, Marx G, Riedel M, Schuppert A, Bickenbach J. Future Medical Artificial Intelligence Application Requirements and Expectations of Physicians in German University Hospitals: Web-Based Survey. J Med Internet Res 2021; 23:e26646. [PMID: 33666563 PMCID: PMC7980122 DOI: 10.2196/26646] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 12/19/2020] [Revised: 01/29/2021] [Accepted: 02/15/2021] [Indexed: 12/13/2022]
Abstract
BACKGROUND: The increasing development of artificial intelligence (AI) systems in medicine driven by researchers and entrepreneurs goes along with enormous expectations for medical care advancement. AI might change the clinical practice of physicians from almost all medical disciplines and in most areas of health care. While expectations for AI in medicine are high, practical implementations of AI for clinical practice are still scarce in Germany. Moreover, physicians' requirements and expectations of AI in medicine and their opinion on the usage of anonymized patient data for clinical and biomedical research have not been investigated widely in German university hospitals.
OBJECTIVE: This study aimed to evaluate physicians' requirements and expectations of AI in medicine and their opinion on the secondary usage of patient data for (bio)medical research (eg, for the development of machine learning algorithms) in university hospitals in Germany.
METHODS: A web-based survey was conducted addressing physicians of all medical disciplines in 8 German university hospitals. Answers were given using Likert scales and general demographic responses. Physicians were asked to participate locally via email in the respective hospitals.
RESULTS: The online survey was completed by 303 physicians (female: 121/303, 39.9%; male: 173/303, 57.1%; no response: 9/303, 3.0%) from a wide range of medical disciplines and work experience levels. Most respondents either had a positive (130/303, 42.9%) or a very positive attitude (82/303, 27.1%) towards AI in medicine. There was a significant association between the personal rating of AI in medicine and the self-reported technical affinity level (H4=48.3, P<.001). A vast majority of physicians expected the future of medicine to be a mix of human and artificial intelligence (273/303, 90.1%) but also requested a scientific evaluation before the routine implementation of AI-based systems (276/303, 91.1%). Physicians were most optimistic that AI applications would identify drug interactions (280/303, 92.4%) to improve patient care substantially but were quite reserved regarding AI-supported diagnosis of psychiatric diseases (62/303, 20.5%). Of the respondents, 82.5% (250/303) agreed that there should be open access to anonymized patient databases for medical and biomedical research.
CONCLUSIONS: Physicians in stationary patient care in German university hospitals show a generally positive attitude towards using most AI applications in medicine. Along with this optimism come several expectations and hopes that AI will assist physicians in clinical decision making. Especially in fields of medicine where huge amounts of data are processed (eg, imaging procedures in radiology and pathology) or data are collected continuously (eg, cardiology and intensive care medicine), physicians' expectations of AI to substantially improve future patient care are high. In the study, the greatest potential was seen in the application of AI for the identification of drug interactions, presumably due to the rising complexity of drug administration to polymorbid, polypharmacy patients. However, for the practical usage of AI in health care, regulatory and organizational challenges still have to be mastered.
Affiliation(s)
- Oliver Maassen
  - Department of Intensive Care Medicine, University Hospital RWTH Aachen, Aachen, Germany
  - SMITH Consortium of the German Medical Informatics Initiative, Leipzig, Germany
- Sebastian Fritsch
  - Department of Intensive Care Medicine, University Hospital RWTH Aachen, Aachen, Germany
  - SMITH Consortium of the German Medical Informatics Initiative, Leipzig, Germany
  - Jülich Supercomputing Centre, Forschungszentrum Jülich, Jülich, Germany
- Julia Palm
  - SMITH Consortium of the German Medical Informatics Initiative, Leipzig, Germany
  - Institute of Medical Statistics, Computer and Data Sciences, Jena University Hospital, Jena, Germany
- Saskia Deffge
  - Department of Intensive Care Medicine, University Hospital RWTH Aachen, Aachen, Germany
  - SMITH Consortium of the German Medical Informatics Initiative, Leipzig, Germany
- Julian Kunze
  - Department of Intensive Care Medicine, University Hospital RWTH Aachen, Aachen, Germany
  - SMITH Consortium of the German Medical Informatics Initiative, Leipzig, Germany
- Gernot Marx
  - Department of Intensive Care Medicine, University Hospital RWTH Aachen, Aachen, Germany
  - SMITH Consortium of the German Medical Informatics Initiative, Leipzig, Germany
- Morris Riedel
  - SMITH Consortium of the German Medical Informatics Initiative, Leipzig, Germany
  - Jülich Supercomputing Centre, Forschungszentrum Jülich, Jülich, Germany
  - School of Natural Sciences and Engineering, University of Iceland, Reykjavik, Iceland
- Andreas Schuppert
  - SMITH Consortium of the German Medical Informatics Initiative, Leipzig, Germany
  - Institute for Computational Biomedicine II, University Hospital RWTH Aachen, Aachen, Germany
- Johannes Bickenbach
  - Department of Intensive Care Medicine, University Hospital RWTH Aachen, Aachen, Germany
  - SMITH Consortium of the German Medical Informatics Initiative, Leipzig, Germany
|
10
|
Nguyen D, Ngo B, vanSonnenberg E. AI in the Intensive Care Unit: Up-to-Date Review. J Intensive Care Med 2020; 36:1115-1123. [PMID: 32985324 DOI: 10.1177/0885066620956620] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
AI is the latest technological trend likely to have a huge impact on medicine. AI's potential lies in its ability to process large volumes of data and perform complex pattern analyses. The ICU is an area of medicine that is particularly conducive to AI applications. Much current AI research in the ICU focuses on improving the handling of high volumes of data on high-risk patients and on making clinical workflow more efficient. Emerging topics of AI medicine in the ICU include AI sensors, sepsis prediction, AI in the NICU or SICU, and the legal role of AI in medicine. This review covers the current applications of AI medicine in the ICU, potential pitfalls, and other AI medicine-related topics relevant to the ICU.
Affiliation(s)
- Diep Nguyen
  - University of Arizona College of Medicine Phoenix, AZ, USA
- Brandon Ngo
  - University of Arizona College of Medicine Phoenix, AZ, USA
|
11
|
|
12
|
Could "Big Brother" Be Joining the Early Mobilization Team? Crit Care Med 2019; 47:1274-1276. [PMID: 31415314 DOI: 10.1097/ccm.0000000000003886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
13
|
Przybyła P, Brockmeier AJ, Ananiadou S. Quantifying risk factors in medical reports with a context-aware linear model. J Am Med Inform Assoc 2019; 26:537-546. [PMID: 30840055 PMCID: PMC6515525 DOI: 10.1093/jamia/ocz004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/20/2018] [Revised: 12/14/2018] [Accepted: 01/03/2019] [Indexed: 12/03/2022]
Abstract
OBJECTIVE We seek to quantify the mortality risk associated with mentions of medical concepts in textual electronic health records (EHRs). Recognizing mentions of named entities of relevant types (eg, conditions, symptoms, laboratory tests or behaviors) in text is a well-researched task. However, the level of risk associated with them depends partly on the textual context in which they appear, which may describe severity, temporal aspects, quantity, etc. METHODS To take into account that a given word appearing in the context of different risk factors (medical concepts) can make different contributions toward risk level, we propose a multitask approach, called context-aware linear modeling, which can be applied using appropriately regularized linear regression. To improve the performance for risk factors unseen in training data (eg, rare diseases), we take into account their distributional similarity to other concepts. RESULTS The evaluation is based on a corpus of 531 reports from EHRs with 99 376 risk factors rated manually by experts. Context-aware linear modeling significantly outperforms single-task models, and taking into account concept similarity further improves performance, reaching the level of agreement between human annotators. CONCLUSION Our results show that automatic quantification of risk factors in EHRs can achieve performance comparable to human assessment, and that taking into account the multitask structure of the problem and the ability to handle rare concepts is crucial for its accuracy.
Affiliation(s)
- Piotr Przybyła
  - National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, United Kingdom
- Austin J Brockmeier
  - National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, United Kingdom
- Sophia Ananiadou
  - National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, United Kingdom
|