1
Sarwal D, Wang L, Gandhi S, Sagheb Hossein Pour E, Janssens LP, Delgado AM, Doering KA, Mishra AK, Greenwood JD, Liu H, Majumder S. Identification of pancreatic cancer risk factors from clinical notes using natural language processing. Pancreatology 2024; 24:572-578. [PMID: 38693040 DOI: 10.1016/j.pan.2024.03.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Revised: 03/20/2024] [Accepted: 03/23/2024] [Indexed: 05/03/2024]
Abstract
OBJECTIVES: Screening for pancreatic ductal adenocarcinoma (PDAC) is considered in high-risk individuals (HRIs) with established PDAC risk factors, such as family history and germline mutations in PDAC susceptibility genes. Accurate assessment of risk factor status depends on provider knowledge and requires extensive manual chart review by experts. Natural Language Processing (NLP) has shown promise in automated data extraction from the electronic health record (EHR). We aimed to use NLP for automated extraction of PDAC risk factors from unstructured clinical notes in the EHR.
METHODS: We first developed rule-based NLP algorithms to extract PDAC risk factors at the document level, using an annotated corpus of 2091 clinical notes. Next, we further improved the NLP algorithms using a cohort of 1138 patients through patient-level training, validation, and testing, with comparison against a pre-specified reference standard. To minimize false-negative results, we prioritized algorithm recall.
RESULTS: In the test set (n = 807), the NLP algorithms achieved a recall of 0.933, precision of 0.790, and F1-score of 0.856 for family history of PDAC. For germline genetic mutations, the algorithm had a high recall of 0.851, while precision and F1-score were lower at 0.350 and 0.496, respectively. Most false positives for germline mutations resulted from erroneous recognition of tissue mutations.
CONCLUSIONS: Rule-based NLP algorithms applied to unstructured clinical notes are highly sensitive for automated identification of PDAC risk factors. Further validation in a large primary-care patient population is warranted to assess real-world utility in identifying HRIs for pancreatic cancer screening.
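The F1-scores reported in the results above can be checked against the reported precision and recall. The sketch below is not from the paper; it assumes the standard definition of F1 as the harmonic mean of precision and recall, and the helper name `f1_score` and the `reported` dictionary are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Test-set values as reported in the abstract (n = 807).
reported = {
    "family history of PDAC": {"recall": 0.933, "precision": 0.790, "f1": 0.856},
    "germline genetic mutations": {"recall": 0.851, "precision": 0.350, "f1": 0.496},
}

for factor, m in reported.items():
    computed = f1_score(m["precision"], m["recall"])
    print(f"{factor}: computed F1 = {computed:.3f}, reported F1 = {m['f1']:.3f}")
```

Under that assumption, both computed values agree with the reported F1-scores to three decimal places. The germline result shows how a low precision (0.350) pulls F1 down even when recall is high (0.851).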
Affiliation(s)
- Dhruv Sarwal
  - Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Liwei Wang
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, USA
- Sonal Gandhi
  - Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Laurens P Janssens
  - Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Adriana M Delgado
  - Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Karen A Doering
  - Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Anup Kumar Mishra
  - Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Jason D Greenwood
  - Department of Family Medicine, Mayo Clinic, Rochester, MN, USA; Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
- Hongfang Liu
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, USA
- Shounak Majumder
  - Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
2
Le KDR, Tay SBP, Choy KT, Verjans J, Sasanelli N, Kong JCH. Applications of natural language processing tools in the surgical journey. Front Surg 2024; 11:1403540. [PMID: 38826809 PMCID: PMC11140056 DOI: 10.3389/fsurg.2024.1403540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 05/07/2024] [Indexed: 06/04/2024]
Abstract
Background: Natural language processing tools are becoming increasingly adopted in multiple industries worldwide. They have shown promising results; however, their use in the field of surgery is under-recognised. Many trials have assessed these benefits in small settings, with promising results, before large-scale adoption can be considered in surgery. This study aims to review the current research and insights into the potential for implementation of natural language processing tools into surgery.
Methods: A narrative review was conducted following a computer-assisted literature search on Medline, EMBASE and Google Scholar databases. Papers related to natural language processing tools and their use in surgery were considered.
Results: Current applications of natural language processing tools within surgery are limited. From the literature, there is evidence of potential improvement in surgical capability and service delivery, such as through the use of these technologies to streamline processes including surgical triaging, data collection and auditing, surgical communication and documentation. Additionally, there is potential to extend these capabilities to surgical academia to improve processes in surgical research and allow innovation in the development of educational resources. Despite these outcomes, the evidence supporting these findings is challenged by small sample sizes with limited applicability to broader settings.
Conclusion: With the increasing adoption of natural language processing technology, such as in popular forms like ChatGPT, there has been increasing research in the use of these tools within surgery to improve surgical workflow and efficiency. This review highlights multifaceted applications of natural language processing within surgery, albeit with clear limitations due to the infancy of the infrastructure available to leverage these technologies. There remains room for more rigorous research into the broader capability of natural language processing technology within the field of surgery, and a need for cross-sectoral collaboration to understand the ways in which these algorithms can best be integrated.
Affiliation(s)
- Khang Duy Ricky Le
  - Department of General Surgical Specialties, The Royal Melbourne Hospital, Melbourne, VIC, Australia
  - Department of Surgical Oncology, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
  - Geelong Clinical School, Deakin University, Geelong, VIC, Australia
  - Department of Medical Education, The University of Melbourne, Melbourne, VIC, Australia
- Samuel Boon Ping Tay
  - Department of Anaesthesia and Pain Medicine, Eastern Health, Box Hill, VIC, Australia
- Kay Tai Choy
  - Department of Surgery, Austin Health, Melbourne, VIC, Australia
- Johan Verjans
  - Australian Institute for Machine Learning (AIML), University of Adelaide, Adelaide, SA, Australia
  - Lifelong Health Theme (Platform AI), South Australian Health and Medical Research Institute, Adelaide, SA, Australia
- Nicola Sasanelli
  - Division of Information Technology, Engineering and the Environment, University of South Australia, Adelaide, SA, Australia
  - Department of Operations (Strategic and International Partnerships), SmartSAT Cooperative Research Centre, Adelaide, SA, Australia
  - Agora High Tech, Adelaide, SA, Australia
- Joseph C. H. Kong
  - Department of Surgical Oncology, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
  - Monash University Department of Surgery, Alfred Hospital, Melbourne, VIC, Australia
  - Department of Colorectal Surgery, Alfred Hospital, Melbourne, VIC, Australia
  - Sir Peter MacCallum Department of Oncology, The University of Melbourne, Melbourne, VIC, Australia
3
Pathak A, Yu Z, Paredes D, Monsour EP, Rocha AO, Brito JP, Ospina NS, Wu Y. Extracting Thyroid Nodules Characteristics from Ultrasound Reports Using Transformer-based Natural Language Processing Methods. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2024; 2023:1193-1200. [PMID: 38222394 PMCID: PMC10785862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
The ultrasound characteristics of thyroid nodules guide the evaluation of thyroid cancer in patients with thyroid nodules. However, the characteristics of thyroid nodules are often documented in clinical narratives such as ultrasound reports. Previous studies have examined natural language processing (NLP) methods in extracting a limited number of characteristics (<9) using rule-based NLP systems. In this study, a multidisciplinary team of NLP experts and thyroid specialists identified thyroid nodule characteristics that are important for clinical care, composed annotation guidelines, developed a corpus, and compared 5 state-of-the-art transformer-based NLP methods, including BERT, RoBERTa, LongFormer, DeBERTa, and GatorTron, for extraction of thyroid nodule characteristics from ultrasound reports. Our GatorTron model, a transformer-based large language model trained using over 90 billion words of text, achieved the best strict and lenient F1-scores of 0.8851 and 0.9495, respectively, for the extraction of a total of 16 thyroid nodule characteristics, and 0.9321 for linking characteristics to nodules, outperforming other clinical transformer models. To the best of our knowledge, this is the first study to systematically categorize and apply transformer-based NLP models to extract a large number of clinically relevant thyroid nodule characteristics from ultrasound reports. This study lays the groundwork for assessing the documentation quality of thyroid ultrasound reports and examining outcomes of patients with thyroid nodules using electronic health records.
Affiliation(s)
- Aman Pathak
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Zehao Yu
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Daniel Paredes
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Elio Paul Monsour
  - Division of Endocrinology, Department of Medicine, College of Medicine, University of Florida, Gainesville, FL, USA
- Andrea Ortiz Rocha
  - Division of Endocrinology, Department of Medicine, College of Medicine, University of Florida, Gainesville, FL, USA
- Juan P Brito
  - Division of Endocrinology, Diabetes, Metabolism and Nutrition, Mayo Clinic Rochester, USA
- Naykky Singh Ospina
  - Division of Endocrinology, Department of Medicine, College of Medicine, University of Florida, Gainesville, FL, USA
- Yonghui Wu
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
4
Zhang J, Mazurowski MA, Allen BC, Wildman-Tobriner B. Multistep Automated Data Labelling Procedure (MADLaP) for thyroid nodules on ultrasound: An artificial intelligence approach for automating image annotation. Artif Intell Med 2023; 141:102553. [PMID: 37295897 DOI: 10.1016/j.artmed.2023.102553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Revised: 02/14/2023] [Accepted: 04/11/2023] [Indexed: 06/12/2023]
Abstract
Machine learning (ML) for diagnosis of thyroid nodules on ultrasound is an active area of research. However, ML tools require large, well-labeled datasets, the curation of which is time-consuming and labor-intensive. The purpose of our study was to develop and test a deep-learning-based tool to facilitate and automate the data annotation process for thyroid nodules; we named our tool Multistep Automated Data Labelling Procedure (MADLaP). MADLaP was designed to take multiple inputs including pathology reports, ultrasound images, and radiology reports. Using multiple step-wise 'modules' including rule-based natural language processing, deep-learning-based imaging segmentation, and optical character recognition, MADLaP automatically identified images of a specific thyroid nodule and correctly assigned a pathology label. The model was developed using a training set of 378 patients across our health system and tested on a separate set of 93 patients. Ground truths for both sets were selected by an experienced radiologist. Performance metrics including yield (how many labeled images the model produced) and accuracy (percentage correct) were measured using the test set. MADLaP achieved a yield of 63 % and an accuracy of 83 %. The yield progressively increased as the input data moved through each module, while accuracy peaked part way through. Error analysis showed that inputs from certain examination sites had lower accuracy (40 %) than the other sites (90 %, 100 %). MADLaP successfully created curated datasets of labeled ultrasound images of thyroid nodules. While accurate, the relatively suboptimal yield of MADLaP exposed some challenges when trying to automatically label radiology images from heterogeneous sources. The complex task of image curation and annotation could be automated, allowing for enrichment of larger datasets for use in machine learning development.
Affiliation(s)
- Jikai Zhang
  - Department of Electrical and Computer Engineering, Duke University, Room 10070, 2424 Erwin Rd, Durham, NC 27705, United States
- Maciej A Mazurowski
  - Department of Radiology, Duke University Medical Center, Durham, NC, United States; Department of Electrical and Computer Engineering, Department of Biostatistics and Bioinformatics, Department of Computer Science, Duke University, Room 9044, 2424 Erwin Rd, Durham, NC 27705, United States
- Brian C Allen
  - Department of Radiology, Duke University Medical Center, Duke University, Dept of Radiology, Box 3808, Durham, NC 27710, United States
- Benjamin Wildman-Tobriner
  - Department of Radiology, Duke University Medical Center, Duke University, Dept of Radiology, Box 3808, Durham, NC 27710, United States
5
Wang L, Fu S, Wen A, Ruan X, He H, Liu S, Moon S, Mai M, Riaz IB, Wang N, Yang P, Xu H, Warner JL, Liu H. Assessment of Electronic Health Record for Cancer Research and Patient Care Through a Scoping Review of Cancer Natural Language Processing. JCO Clin Cancer Inform 2022; 6:e2200006. [PMID: 35917480 PMCID: PMC9470142 DOI: 10.1200/cci.22.00006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Revised: 03/18/2022] [Accepted: 06/15/2022] [Indexed: 11/20/2022]
Abstract
PURPOSE: The advancement of natural language processing (NLP) has promoted the use of detailed textual data in electronic health records (EHRs) to support cancer research and to facilitate patient care. In this review, we aim to assess EHRs for cancer research and patient care by using the Minimal Common Oncology Data Elements (mCODE), which is a community-driven effort to define a minimal set of data elements for cancer research and practice. Specifically, we aim to assess the alignment of NLP-extracted data elements with mCODE and review existing NLP methodologies for extracting said data elements.
METHODS: Major literature databases were searched for cancer-related NLP articles written in English and published between January 2010 and September 2020. After retrieval, articles with EHRs as the data source were manually identified. A charting form was developed for relevant study analysis and used to categorize data under four main topics: metadata, EHR data and targeted cancer types, NLP methodology, and oncology data elements and standards.
RESULTS: A total of 123 publications were ultimately selected and included in our analysis. As expected, we found that cancer research and patient care require some data elements beyond mCODE. Transparency and reproducibility of NLP methods are insufficient, and NLP evaluation is inconsistent.
CONCLUSION: We conducted a comprehensive review of cancer NLP for research and patient care using EHR data. Issues and barriers for wide adoption of cancer NLP were identified and discussed.
Affiliation(s)
- Liwei Wang
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Sunyang Fu
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Andrew Wen
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Xiaoyang Ruan
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Huan He
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Sijia Liu
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Sungrim Moon
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Michelle Mai
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Irbaz B. Riaz
  - Department of Hematology/Oncology, Mayo Clinic, Scottsdale, AZ
- Nan Wang
  - Department of Computer Science and Engineering, College of Science and Engineering, University of Minnesota, Minneapolis, MN
- Ping Yang
  - Department of Quantitative Health Sciences, Mayo Clinic, Scottsdale, AZ
- Hua Xu
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX
- Jeremy L. Warner
  - Departments of Medicine (Hematology/Oncology), Vanderbilt University, Nashville, TN
  - Department of Biomedical Informatics, Vanderbilt University, Nashville, TN
- Hongfang Liu
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
6
Dedhia PH, Chen K, Song Y, LaRose E, Imbus JR, Peissig PL, Mendonca EA, Schneider DF. Ambiguous and Incomplete: Natural Language Processing Reveals Problematic Reporting Styles in Thyroid Ultrasound Reports. Methods Inf Med 2022; 61:11-18. [PMID: 34991173 DOI: 10.1055/s-0041-1740493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
OBJECTIVE: Natural language processing (NLP) systems convert unstructured text into analyzable data. Here, we describe the performance measures of NLP to capture granular details on nodules from thyroid ultrasound (US) reports and reveal critical issues with reporting language.
METHODS: We iteratively developed NLP tools using clinical Text Analysis and Knowledge Extraction System (cTAKES) and thyroid US reports from 2007 to 2013. We incorporated nine nodule features for NLP extraction. Next, we evaluated the precision, recall, and accuracy of our NLP tools using a separate set of US reports from an academic medical center (A) and a regional health care system (B) during the same period. Two physicians manually annotated each test-set report. A third physician then adjudicated discrepancies. The adjudicated "gold standard" was then used to evaluate NLP performance on the test set.
RESULTS: A total of 243 thyroid US reports contained 6,405 data elements. Inter-annotator agreement for all elements was 91.3%. Compared with the gold standard, overall recall of the NLP tool was 90%. NLP recall for thyroid lobe or isthmus characteristics was: laterality 96% and size 95%. NLP accuracy for nodule characteristics was: laterality 92%, size 92%, calcifications 76%, vascularity 65%, echogenicity 62%, contents 76%, and borders 40%. NLP recall for presence or absence of lymphadenopathy was 61%. Reporting style accounted for 18% of errors. For example, the word "heterogeneous" interchangeably referred to nodule contents or echogenicity. While nodule dimensions and laterality were often described, US reports described contents, echogenicity, vascularity, calcifications, borders, and lymphadenopathy only 46, 41, 17, 15, 9, and 41% of the time, respectively. Most nodule characteristics were equally likely to be described at hospital A compared with hospital B.
CONCLUSIONS: NLP can automate extraction of critical information from thyroid US reports. However, ambiguous and incomplete reporting language hinders performance of NLP systems regardless of institutional setting. Standardized or synoptic thyroid US reports could improve NLP performance.
Affiliation(s)
- Priya H Dedhia
  - Department of Surgery, Division of Surgical Oncology, Ohio State University Comprehensive Cancer Center and Ohio State University Wexner Medical Center, Columbus, Ohio, United States
- Kallie Chen
  - Department of Surgery, Division of Endocrine Surgery, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, United States
- Yiqiang Song
  - Department of Biostatistics and Medical Informatics, Department of Pediatrics, University of Wisconsin-Madison, Madison, Wisconsin, United States
- Eric LaRose
  - Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield Clinic Health System, Marshfield, Wisconsin, United States
- Joseph R Imbus
  - Department of Surgery, Division of Endocrine Surgery, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, United States
- Peggy L Peissig
  - Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield Clinic Health System, Marshfield, Wisconsin, United States
- Eneida A Mendonca
  - Department of Biostatistics and Medical Informatics, Department of Pediatrics, University of Wisconsin-Madison, Madison, Wisconsin, United States; Department of Pediatrics, Department of Biostatistics and Health Data Sciences, Indiana University, Indianapolis, Indiana, United States
- David F Schneider
  - Department of Surgery, Division of Endocrine Surgery, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, United States
7
Luong G, Idarraga AJ, Hsiao V, Schneider DF. Risk Stratifying Indeterminate Thyroid Nodules With Machine Learning. J Surg Res 2021; 270:214-220. [PMID: 34706298 DOI: 10.1016/j.jss.2021.09.015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/01/2021] [Revised: 08/26/2021] [Accepted: 09/21/2021] [Indexed: 12/15/2022]
Abstract
BACKGROUND: Up to 30% of thyroid nodules are classified as indeterminate after fine needle aspiration biopsy. These indeterminate thyroid nodules (ITNs) require surgical pathology for definitive diagnosis. Molecular testing provides additional pre-operative cancer risk stratification but adds expense and invasive testing. The purpose of this study is to utilize a machine learning (ML) algorithm to predict malignancy of ITNs using data available from less invasive tests.
MATERIALS AND METHODS: We conducted a retrospective study using medical records from one academic and one community center. Thyroid nodules with an indeterminate diagnosis on fine needle aspiration biopsy and completed diagnostic pathology were included. Linear, non-linear, and non-linear-ensemble ML methods were tested for accuracy when predicting malignancy using 10-fold cross-validation. Classifiers were evaluated using area under the receiver operating characteristics curve (AUROC).
RESULTS: A total of 355 nodules met inclusion criteria. Of these, 171 (48.2%) were diagnosed with cancer. A Random Forest classifier performed the best, producing an accuracy of 79.1%, a sensitivity of 75.5%, specificity of 82.4%, positive predictive value of 80.3%, negative predictive value of 79.0%, and an AUROC of 0.859.
CONCLUSIONS: ML methods accurately risk stratify ITNs using data gathered from existing, non-invasive, and inexpensive diagnostic tests. Applying an ML model with existing data can become a cost-effective alternative to molecular testing. Future studies will prospectively evaluate the performance of this ML approach when combined with expert judgment.
Affiliation(s)
- George Luong
  - University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin
- Alexander J Idarraga
  - University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin
- Vivian Hsiao
  - University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin
- David F Schneider
  - University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin