1. Smith BJ, Dey J, Medlock L, Solis D, Kirby K. Maximum-likelihood estimation of glandular fraction for mammography and its effect on microcalcification detection. Phys Eng Sci Med 2025. doi:10.1007/s13246-025-01540-2. PMID: 40327237.
Abstract
Breast tissue is mainly a mixture of adipose and fibro-glandular tissue. Cancer risk and the risk of undetected breast cancer increase with the amount of glandular tissue in the breast. Therefore, radiologists must report the total volume glandular fraction or a BI-RADS classification in screening and diagnostic mammography. In this work, a Maximum Likelihood algorithm accounting for count statistics and scatter is shown to estimate the pixel-wise glandular fraction from mammographic images. The pixel-wise glandular fraction provides information that helps localize dense tissue, and the total volume glandular fraction can be calculated from it. The algorithm was implemented for images acquired with an anti-scatter grid and for images acquired without a grid followed by software scatter removal. The work also studied whether presenting the pixel-wise glandular fraction image alongside the usual mammographic image has the potential to improve the contrast-to-noise ratio for microcalcifications in the breast. The algorithms were implemented and evaluated with TOPAS Geant4 generated images with known glandular fractions. These images were generated with and without microcalcifications present to study the effect of glandular fraction estimation on microcalcification detection. The algorithm was then applied to clinical images with and without microcalcifications. For the TOPAS simulated images, the glandular fraction was estimated with a root mean squared error of 6.6% for the anti-scatter-grid cases and 7.6% for the software scatter-removal (no anti-scatter grid) cases over a range of 2-9 cm compressed breast thickness. Average absolute errors over the same thickness range were 4.5% and 4.7% for the anti-scatter-grid and software scatter-removal methods, respectively. The errors were higher for greater thicknesses and glandular fractions. For the extreme case of 9 cm thickness, the glandular fraction estimation yielded 5%, 13%, and 16% mean absolute errors for 20%, 30%, and 50% glandular fraction. These errors decreased to 1.5%, 9%, and 13.2% when a narrower spectrum was used for the 9 cm thickness. Results from clinical images (where the true glandular fraction is unknown) show that the algorithm gives a glandular fraction within the average range expected from the literature. For microcalcification detection, the contrast-to-noise ratio improved by 17.5-548% in clinical images and 5.1-88% in TOPAS images. A method for accurately estimating the pixel-wise glandular fraction in images, which provides localization information about breast density, was demonstrated. The glandular fraction images also showed an improvement in contrast-to-noise ratio for detecting microcalcifications, a risk factor in breast cancer.
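The core estimation step lends itself to a compact illustration. The sketch below shows per-pixel maximum-likelihood estimation of the glandular fraction under a deliberately simplified forward model (monoenergetic beam, no scatter, Poisson counts); the attenuation coefficients, photon flux, and thickness are illustrative placeholders, not the spectra or values used in the paper.

```python
"""Minimal sketch of per-pixel maximum-likelihood glandular-fraction estimation.

Not the authors' implementation: it assumes a simplified monoenergetic beam
with no scatter, and the coefficients below are placeholder values.
"""
import numpy as np
from scipy.optimize import minimize_scalar

MU_GLAND = 0.80    # cm^-1, assumed effective attenuation of fibro-glandular tissue
MU_ADIPOSE = 0.45  # cm^-1, assumed effective attenuation of adipose tissue
N0 = 5.0e4         # assumed unattenuated photon count per pixel
THICKNESS_CM = 5.0

def expected_counts(g, thickness_cm=THICKNESS_CM):
    """Expected detected counts for glandular fraction g in [0, 1]."""
    mu_mix = g * MU_GLAND + (1.0 - g) * MU_ADIPOSE
    return N0 * np.exp(-mu_mix * thickness_cm)

def neg_log_likelihood(g, measured, thickness_cm=THICKNESS_CM):
    """Poisson negative log-likelihood (constant terms dropped)."""
    lam = expected_counts(g, thickness_cm)
    return lam - measured * np.log(lam)

def ml_glandular_fraction(measured, thickness_cm=THICKNESS_CM):
    """Maximum-likelihood estimate of the pixel glandular fraction."""
    res = minimize_scalar(neg_log_likelihood, bounds=(0.0, 1.0),
                          args=(measured, thickness_cm), method="bounded")
    return res.x

if __name__ == "__main__":
    true_g = 0.3
    rng = np.random.default_rng(0)
    counts = rng.poisson(expected_counts(true_g))   # simulated pixel measurement
    print(f"true g = {true_g:.2f}, ML estimate = {ml_glandular_fraction(counts):.3f}")
```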
Affiliation(s)
- Bryce J Smith: Physics Department, Mary Bird Perkins Cancer Center, Baton Rouge, LA, USA
- Joyoni Dey: Department of Physics and Astronomy, Louisiana State University, Baton Rouge, LA, USA
- Lacey Medlock: Department of Physics and Astronomy, Louisiana State University, Baton Rouge, LA, USA
- David Solis: Physics Department, Mary Bird Perkins Cancer Center, Baton Rouge, LA, USA
- Krystal Kirby: Physics Department, Mary Bird Perkins Cancer Center, Baton Rouge, LA, USA
2. Hussain S, Naseem U, Ali M, Avendaño Avalos DB, Cardona-Huerta S, Bosques Palomo BA, Tamez-Peña JG. TECRR: a benchmark dataset of radiological reports for BI-RADS classification with machine learning, deep learning, and large language model baselines. BMC Med Inform Decis Mak 2024;24:310. doi:10.1186/s12911-024-02717-7. PMID: 39444035; PMCID: PMC11515610.
Abstract
BACKGROUND Recently, machine learning (ML), deep learning (DL), and natural language processing (NLP) have shown promising results in classifying free-form radiological reports within their respective medical domains. In order to classify radiological reports properly, a high-quality annotated and curated dataset is required. Currently, no publicly available breast imaging-based radiological dataset exists for the classification of Breast Imaging Reporting and Data System (BI-RADS) categories and breast density scores, as characterized by the American College of Radiology (ACR). To tackle this problem, we construct and annotate a breast imaging-based radiological reports dataset and provide its benchmark results. The dataset was originally in Spanish. Board-certified radiologists collected and annotated it according to the BI-RADS lexicon and categories at the Breast Radiology department, TecSalud Hospitals Monterrey, Mexico. It was first translated into English using Google Translate and then preprocessed by removing duplicates and missing values. After preprocessing, the final dataset consists of 5046 unique reports from 5046 patients, all women, with an average age of 53 years. Furthermore, we used word-level NLP-based embedding techniques, term frequency-inverse document frequency (TF-IDF) and word2vec, to extract semantic and syntactic information. We also compared the performance of ML, DL, and large language model (LLM) classifiers for BI-RADS category classification. RESULTS The final breast imaging-based radiological reports dataset contains 5046 unique reports. We compared K-Nearest Neighbour (KNN), Support Vector Machine (SVM), Naive Bayes (NB), Random Forest (RF), Adaptive Boosting (AdaBoost), Gradient-Boosting (GB), Extreme Gradient Boosting (XGB), Long Short-Term Memory (LSTM), Bidirectional Encoder Representations from Transformers (BERT), and Biomedical Generative Pre-trained Transformer (BioGPT) classifiers. The BioGPT classifier with preprocessed data performed 6% better, with a mean sensitivity of 0.60 (95% confidence interval (CI), 0.391-0.812), than the second-best classifier, BERT, which achieved a mean sensitivity of 0.54 (95% CI, 0.477-0.607). CONCLUSION In this work, we propose a curated and annotated benchmark dataset that can be used for BI-RADS and breast density category classification. We also provide baseline results for a range of ML, DL, and LLM models for BI-RADS classification that can be used as a starting point for future investigation. The main objective of this investigation is to provide a repository for investigators who wish to enter the field and push its boundaries further.
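As a rough illustration of the classical baselines reported above, the sketch below builds a TF-IDF plus linear-SVM pipeline for BI-RADS category classification; the file name and column names are assumptions, not the actual TECRR release format.

```python
# Minimal sketch of a TF-IDF + classical-classifier baseline for report
# classification. "tecrr_reports.csv", "report_text" and "birads" are
# hypothetical placeholders for the dataset's real fields.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC
from sklearn.metrics import classification_report

df = pd.read_csv("tecrr_reports.csv")        # hypothetical export: one row per report
X_train, X_test, y_train, y_test = train_test_split(
    df["report_text"], df["birads"], test_size=0.2,
    stratify=df["birads"], random_state=42)

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=2, sublinear_tf=True)),
    ("svm", LinearSVC(C=1.0)),               # one of several classical baselines
])
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```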
Affiliation(s)
- Sadam Hussain: School of Engineering and Sciences, Tecnológico de Monterrey, Monterrey, 64849, Nuevo Leon, Mexico
- Usman Naseem: School of Computing, Macquarie University, Sydney, 2109, NSW, Australia
- Mansoor Ali: School of Engineering and Sciences, Tecnológico de Monterrey, Monterrey, 64849, Nuevo Leon, Mexico
3. López-Úbeda P, Martín-Noguerol T, Paulano-Godino F, Luna A. Comparative evaluation of image-based vs. text-based vs. multimodal AI approaches for automatic breast density assessment in mammograms. Comput Methods Programs Biomed 2024;255:108334. doi:10.1016/j.cmpb.2024.108334. PMID: 39053353.
Abstract
BACKGROUND AND OBJECTIVES In the last decade, there has been a growing interest in applying artificial intelligence (AI) systems to breast cancer assessment, including breast density evaluation. However, few models have been developed to integrate textual mammographic reports and mammographic images. Our aims are (1) to generate a natural language processing (NLP)-based AI system, (2) to evaluate an external image-based software, and (3) to develop a multimodal system, using the late fusion approach, by integrating image and text inferences for the automatic classification of breast density according to the American College of Radiology (ACR) guidelines in mammograms and radiological reports. METHODS We first compared different NLP models, three based on n-gram term frequency - inverse document frequency and two transformer-based architectures, using 1533 unstructured mammogram reports as a training set and 303 reports as a test set. Subsequently, we evaluated an external image-based software using 303 mammogram images. Finally, we assessed our multimodal system taking into account both text and mammogram images. RESULTS Our best NLP model achieved 88 % accuracy, while the external software and the multimodal system achieved 75 % and 80 % accuracy, respectively, in classifying ACR breast densities. CONCLUSION Although our multimodal system outperforms the image-based tool, it currently does not improve the results offered by the NLP model for ACR breast density classification. Nevertheless, the promising results observed here open the possibility to more comprehensive studies regarding the utilization of multimodal tools in the assessment of breast density.
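The late-fusion step described above can be illustrated in a few lines. The sketch below simply averages the per-class probabilities produced by a text model and an image model; the probability vectors and the fusion weight are toy values, not outputs of the systems evaluated in the study.

```python
# Illustrative late-fusion step: combine class-probability outputs of a text
# model and an image model for the four ACR density categories (A-D).
import numpy as np

CLASSES = np.array(["A", "B", "C", "D"])

def late_fusion(p_text, p_image, w_text=0.5):
    """Weighted average of per-class probabilities from two modalities."""
    p_text, p_image = np.asarray(p_text, float), np.asarray(p_image, float)
    fused = w_text * p_text + (1.0 - w_text) * p_image
    return fused / fused.sum(axis=-1, keepdims=True)

p_text = [0.05, 0.60, 0.30, 0.05]   # e.g. softmax output of a report classifier
p_image = [0.10, 0.35, 0.45, 0.10]  # e.g. output of a mammogram-based tool
fused = late_fusion(p_text, p_image, w_text=0.6)
print("fused probabilities:", fused, "-> predicted density:", CLASSES[fused.argmax()])
```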
Affiliation(s)
- Félix Paulano-Godino: Image Processing Unit, Engineering Department, HT Médica, Carmelo Torres n 2, 23007, Jaén, Spain
- Antonio Luna: MRI Unit, Radiology Department, HT Médica, Carmelo Torres n 2, 23007, Jaén, Spain
4. López-Úbeda P, Martín-Noguerol T, Luna A. Automatic classification and prioritisation of actionable BI-RADS categories using natural language processing models. Clin Radiol 2024;79:e1-e7. doi:10.1016/j.crad.2023.09.009. PMID: 37838546.
Abstract
AIM To facilitate the routine tasks performed by radiologists in their evaluation of breast radiology reports, by enhancing the communication of relevant results to referring physicians via a natural language processing (NLP)-based system that classifies and prioritises Breast Imaging Reporting and Data System (BI-RADS) categories. MATERIALS AND METHODS An NLP-based system was developed to classify and prioritise BI-RADS categories from breast ultrasound and mammogram reports, with the potential to streamline and speed up the standard procedures that radiologists must follow to evaluate and categorise breast imaging results. BI-RADS category extraction was divided into two specific tasks: (1) multi-label classification of BI-RADS categories (0-6) and (2) classification of high-priority (BI-RADS 0, 3, 4, and 5) and low-priority (BI-RADS 1, 2, and 6) reports according to the previous BI-RADS assessment. RESULTS To develop the NLP tool, three different Bidirectional Encoder Representations from Transformers (BERT)-based models (XLM-RoBERTa, BETO, and Bio-BERT-Spanish) were trained and tested on three distinct corpora (containing only breast ultrasound reports, only mammogram reports, or both), achieving an accuracy of 74.29-77.5% in detecting BI-RADS categories and 88.52-91.02% in prioritising reports. CONCLUSION The system designed can effectively classify all BI-RADS categories present in a single radiology report. In the clinical setting, such an automated tool can assist radiologists in evaluating breast radiology reports and decision-making tasks and can speed the communication of priority BI-RADS reports to referring physicians.
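The priority-grouping task described above reduces to a simple rule over the predicted category set. The sketch below shows that mapping together with the multi-label encoding one might use for the category classifier; the transformer models themselves are omitted and the predicted label sets are placeholders.

```python
# Sketch of the priority-grouping step: collapse a set of BI-RADS categories
# predicted for one report into a high-/low-priority flag. `predicted_categories`
# stands in for the multi-label output of a trained classifier.
from sklearn.preprocessing import MultiLabelBinarizer

HIGH_PRIORITY = {0, 3, 4, 5}   # categories to flag to referring physicians
LOW_PRIORITY = {1, 2, 6}

def report_priority(categories):
    """A report is high priority if any detected category is in the high set."""
    return "high" if set(categories) & HIGH_PRIORITY else "low"

# Multi-label encoding used to train/evaluate the category classifier (task 1).
mlb = MultiLabelBinarizer(classes=list(range(7)))
y = mlb.fit_transform([[1, 2], [0, 4], [2, 3, 6]])    # toy label sets, one per report

predicted_categories = [[1, 2], [0, 4], [2, 3, 6]]    # placeholder model output
print([report_priority(c) for c in predicted_categories])   # ['low', 'high', 'high']
```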
Affiliation(s)
- P López-Úbeda: NLP Department, HT Médica, C. Carmelo Torres 2, 23007 Jaén, Spain
- T Martín-Noguerol: MRI Unit, Radiology Department, HT Médica, C. Carmelo Torres 2, 23007 Jaén, Spain
- A Luna: MRI Unit, Radiology Department, HT Médica, C. Carmelo Torres 2, 23007 Jaén, Spain
5. Gholipour M, Khajouei R, Amiri P, Hajesmaeel Gohari S, Ahmadian L. Extracting cancer concepts from clinical notes using natural language processing: a systematic review. BMC Bioinformatics 2023;24:405. doi:10.1186/s12859-023-05480-0. PMID: 37898795; PMCID: PMC10613366.
Abstract
BACKGROUND Extracting information from free texts using natural language processing (NLP) can save time and reduce the burden of manually extracting large quantities of data from the complex clinical notes of cancer patients. This study aimed to systematically review studies that used NLP methods to automatically identify cancer concepts from clinical notes. METHODS PubMed, Scopus, Web of Science, and Embase were searched for English-language papers using a combination of the terms concerning "Cancer", "NLP", "Coding", and "Registries" until June 29, 2021. Two reviewers independently assessed the eligibility of papers for inclusion in the review. RESULTS Most of the software programs used for concept extraction were developed by the researchers themselves (n = 7). Rule-based algorithms were the most frequently used algorithms for developing these programs. In most articles, the criteria of accuracy (n = 14) and sensitivity (n = 12) were used to evaluate the algorithms. In addition, Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT) and the Unified Medical Language System (UMLS) were the most commonly used terminologies to identify concepts. Most studies focused on breast cancer (n = 4, 19%) and lung cancer (n = 4, 19%). CONCLUSION The use of NLP for extracting the concepts and symptoms of cancer has increased in recent years. Rule-based algorithms remain popular among developers. Given these algorithms' high accuracy and sensitivity in identifying and extracting cancer concepts, we suggest that future studies use them to extract the concepts of other diseases as well.
Affiliation(s)
- Maryam Gholipour: Student Research Committee, Kerman University of Medical Sciences, Kerman, Iran
- Reza Khajouei: Department of Health Information Sciences, Faculty of Management and Medical Information Sciences, Kerman University of Medical Sciences, Kerman, Iran
- Parastoo Amiri: Student Research Committee, Kerman University of Medical Sciences, Kerman, Iran
- Sadrieh Hajesmaeel Gohari: Medical Informatics Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
- Leila Ahmadian: Department of Health Information Sciences, Faculty of Management and Medical Information Sciences, Kerman University of Medical Sciences, Kerman, Iran
6. Mithun S, Jha AK, Sherkhane UB, Jaiswar V, Purandare NC, Dekker A, Puts S, Bermejo I, Rangarajan V, Zegers CML, Wee L. Clinical concept-based radiology reports classification pipeline for lung carcinoma. J Digit Imaging 2023;36:812-826. doi:10.1007/s10278-023-00787-z. PMID: 36788196; PMCID: PMC10287609.
Abstract
Rising incidence and mortality of cancer have led to a growing amount of research in the field. To learn from preexisting data, it has become important to capture maximum information related to disease type, stage, treatment, and outcomes. Medical imaging reports are rich in this kind of information but are available only as free text. The extraction of information from such unstructured text reports is labor-intensive. The use of Natural Language Processing (NLP) tools to extract information from radiology reports can make this process less time-consuming as well as more effective. In this study, we developed and compared different models for the classification of lung carcinoma reports using clinical concepts. This study was approved by the institutional ethics committee as a retrospective study with a waiver of informed consent. A clinical concept-based classification pipeline for lung carcinoma radiology reports was developed using rule-based as well as machine learning models, and the approaches were compared. The machine learning models used were XGBoost and two deep learning architectures based on bidirectional long short-term memory (Bi-LSTM) networks. A corpus consisting of 1700 radiology reports, including computed tomography (CT) and positron emission tomography/computed tomography (PET/CT) reports, was used for development and testing. Five hundred and one radiology reports from the MIMIC-III Clinical Database version 1.4 were used for external validation. The pipeline achieved an overall F1 score of 0.94 on the internal set and 0.74 on external validation, with the rule-based algorithm using expert input giving the best performance. Among the machine learning models, the Bi-LSTM_dropout model performed better than the XGBoost model and the Bi-LSTM_simple model on the internal set, whereas on external validation the Bi-LSTM_simple model performed better than the other two. This pipeline can be used for clinical concept-based classification of radiology reports related to lung carcinoma from a large corpus and also for automated annotation of these reports.
Affiliation(s)
- Sneha Mithun: Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, 6229 ET, Maastricht, The Netherlands; Department of Nuclear Medicine and Molecular Imaging, Tata Memorial Hospital, Mumbai, India; Homi Bhabha National Institute (HBNI), Deemed University, Mumbai, India
- Ashish Kumar Jha: Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, 6229 ET, Maastricht, The Netherlands; Department of Nuclear Medicine and Molecular Imaging, Tata Memorial Hospital, Mumbai, India; Homi Bhabha National Institute (HBNI), Deemed University, Mumbai, India
- Umesh B Sherkhane: Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, 6229 ET, Maastricht, The Netherlands; Department of Nuclear Medicine and Molecular Imaging, Tata Memorial Hospital, Mumbai, India
- Vinay Jaiswar: Department of Nuclear Medicine and Molecular Imaging, Tata Memorial Hospital, Mumbai, India
- Nilendu C Purandare: Department of Nuclear Medicine and Molecular Imaging, Tata Memorial Hospital, Mumbai, India; Homi Bhabha National Institute (HBNI), Deemed University, Mumbai, India
- Andre Dekker: Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, 6229 ET, Maastricht, The Netherlands
- Sander Puts: Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, 6229 ET, Maastricht, The Netherlands
- Inigo Bermejo: Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, 6229 ET, Maastricht, The Netherlands
- V Rangarajan: Department of Nuclear Medicine and Molecular Imaging, Tata Memorial Hospital, Mumbai, India; Homi Bhabha National Institute (HBNI), Deemed University, Mumbai, India
- Catharina M L Zegers: Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, 6229 ET, Maastricht, The Netherlands
- Leonard Wee: Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, 6229 ET, Maastricht, The Netherlands
7. Saha A, Burns L, Kulkarni AM. A scoping review of natural language processing of radiology reports in breast cancer. Front Oncol 2023;13:1160167. doi:10.3389/fonc.2023.1160167. PMID: 37124523; PMCID: PMC10130381.
Abstract
Various natural language processing (NLP) algorithms have been applied in the literature to analyze radiology reports pertaining to the diagnosis and subsequent care of cancer patients. Applications of this technology include cohort selection for clinical trials, population of large-scale data registries, and quality improvement in radiology workflows including mammography screening. This scoping review is the first to examine such applications in the specific context of breast cancer. Out of 210 identified articles initially, 44 met our inclusion criteria for this review. Extracted data elements included both clinical and technical details of studies that developed or evaluated NLP algorithms applied to free-text radiology reports of breast cancer. Our review illustrates an emphasis on applications in diagnostic and screening processes over treatment or therapeutic applications and describes growth in deep learning and transfer learning approaches in recent years, although rule-based approaches continue to be useful. Furthermore, we observe increased efforts in code and software sharing but not with data sharing.
Affiliation(s)
- Ashirbani Saha: Department of Oncology, McMaster University, Hamilton, ON, Canada; Hamilton Health Sciences and McMaster University, Escarpment Cancer Research Institute, Hamilton, ON, Canada
- Levi Burns: Michael G. DeGroote School of Medicine, McMaster University, Hamilton, ON, Canada
8. Zhang Y, Grant BMM, Hope AJ, Hung RJ, Warkentin MT, Lam ACL, Aggawal R, Xu M, Shepherd FA, Tsao MS, Xu W, Pakkal M, Liu G, McInnis MC. Using recurrent neural networks to extract high-quality information from lung cancer screening computerized tomography reports for inter-radiologist audit and feedback quality improvement. JCO Clin Cancer Inform 2023;7:e2200153. doi:10.1200/cci.22.00153. PMID: 36930839.
Abstract
PURPOSE Lung cancer screening programs generate a high volume of low-dose computed tomography (LDCT) reports that contain valuable information, typically in a free-text format. High-performance named-entity recognition (NER) models can extract relevant information from these reports automatically for inter-radiologist quality control. METHODS Using LDCT report data from a longitudinal lung cancer screening program (8,305 reports; 3,124 participants; 2006-2019), we trained a rule-based model and two bidirectional long short-term memory (Bi-LSTM) NER neural network models to detect clinically relevant information from LDCT reports. Model performance was tested using F1 scores and compared with a published open-source radiology NER model (Stanza) in an independent evaluation set of 150 reports. The top performing model was applied to a data set of 6,948 reports for an inter-radiologist quality control assessment. RESULTS The best performing model, a Bi-LSTM NER recurrent neural network model, had an overall F1 score of 0.950, which outperformed Stanza (F1 score = 0.872) and a rule-based NER model (F1 score = 0.809). Recall (sensitivity) for the best Bi-LSTM model ranged from 0.916 to 0.991 for different entity types; precision (positive predictive value) ranged from 0.892 to 0.997. Test performance remained stable across time periods. There was an average of a 2.86-fold difference in the number of identified entities between the most and the least detailed radiologists. CONCLUSION We built an open-source Bi-LSTM NER model that outperformed other open-source or rule-based radiology NER models. This model can efficiently extract clinically relevant information from lung cancer screening computerized tomography reports with high accuracy, enabling efficient audit and feedback to improve quality of patient care.
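For orientation, the sketch below shows a minimal Bi-LSTM token tagger of the general kind described above, written in PyTorch; the vocabulary size, tag set, and toy batch are assumptions, and the data pipeline, pre-trained embeddings, and evaluation code are omitted.

```python
# Minimal PyTorch sketch of a Bi-LSTM token tagger (e.g. BIO tags over LDCT
# report tokens). Only the network architecture and a dummy forward pass are
# shown; sizes and the toy batch are placeholders.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden_dim=128, pad_idx=0):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=pad_idx)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):                 # (batch, seq_len)
        embedded = self.embedding(token_ids)      # (batch, seq_len, emb_dim)
        hidden, _ = self.lstm(embedded)           # (batch, seq_len, 2*hidden_dim)
        return self.classifier(hidden)            # per-token tag logits

model = BiLSTMTagger(vocab_size=10000, num_tags=9)
dummy_batch = torch.randint(1, 10000, (4, 60))    # 4 reports, 60 tokens each
logits = model(dummy_batch)
toy_gold_tags = torch.full((4 * 60,), 2, dtype=torch.long)   # placeholder labels
loss = nn.CrossEntropyLoss(ignore_index=-100)(logits.reshape(-1, 9), toy_gold_tags)
print(logits.shape, float(loss))
```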
Affiliation(s)
- Yucheng Zhang: Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Benjamin M M Grant: Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, ON, Canada
- Andrew J Hope: Radiation Medicine Program, Princess Margaret Cancer Centre, and Department of Radiation Oncology, University of Toronto, Toronto, ON, Canada
- Rayjean J Hung: Prosserman Centre for Population Health Research, Lunenfeld-Tanenbaum Research Institute, Sinai Health Systems, Toronto, ON, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Matthew T Warkentin: Prosserman Centre for Population Health Research, Lunenfeld-Tanenbaum Research Institute, Sinai Health Systems, Toronto, ON, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Andrew C L Lam: Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, ON, Canada
- Reenika Aggawal: Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, ON, Canada
- Maria Xu: Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, ON, Canada
- Frances A Shepherd: Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, ON, Canada
- Ming-Sound Tsao: Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Laboratory Medicine and Pathology, University Health Network, Toronto, ON, Canada
- Wei Xu: Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada; Biostatistics, Princess Margaret Cancer Centre, Toronto, ON, Canada; Computational Biology and Medicine Program, Princess Margaret Cancer Centre, Toronto, ON, Canada
- Mini Pakkal: Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Division of Cardiothoracic Imaging, Joint Department of Medical Imaging, Toronto General Hospital, Toronto, ON, Canada
- Geoffrey Liu: Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, ON, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada; Biostatistics, Princess Margaret Cancer Centre, Toronto, ON, Canada; Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
- Micheal C McInnis: Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Division of Cardiothoracic Imaging, Joint Department of Medical Imaging, Toronto General Hospital, Toronto, ON, Canada
9. Berge GT, Granmo OC, Tveit TO, Munkvold BE, Ruthjersen AL, Sharma J. Machine learning-driven clinical decision support system for concept-based searching: a field trial in a Norwegian hospital. BMC Med Inform Decis Mak 2023;23:5. doi:10.1186/s12911-023-02101-x. PMID: 36627624; PMCID: PMC9832658.
Abstract
BACKGROUND Natural language processing (NLP) based clinical decision support systems (CDSSs) have demonstrated the ability to extract vital information from patient electronic health records (EHRs) to facilitate important decision support tasks. While obtaining accurate results that are interpretable in the medical domain is crucial, it is demanding because real-world EHRs contain many inconsistencies and inaccuracies. Further, testing of such machine learning-based systems in clinical practice has received limited attention, and these systems are yet to be accepted by clinicians for regular use. METHODS We present our results from the evaluation of an NLP-driven CDSS developed and implemented in a Norwegian hospital. The system incorporates unsupervised and supervised machine learning combined with rule-based algorithms for clinical concept-based searching to identify and classify allergies of concern for anesthesia and intensive care. The system also implements a semi-supervised machine learning approach to automatically annotate medical concepts in the narrative. RESULTS Evaluation of system adoption was performed using a mixed-methods approach applying the Unified Theory of Acceptance and Use of Technology (UTAUT) as a theoretical lens. Most of the respondents demonstrated a high degree of system acceptance and expressed a positive attitude towards the system in general and an intention to use the system in the future. Increased detection of patient allergies, and thus improved quality of practice and patient safety during surgery or ICU stays, was perceived as the most important advantage of the system. CONCLUSIONS Our combined machine learning and rule-based approach benefits system performance, efficiency, and interpretability. The results demonstrate that the proposed CDSS increases detection of patient allergies and that the system received high-level acceptance by the clinicians using it. Useful recommendations for further system improvements and implementation initiatives include reducing the number of alarms, expanding the system to cover more clinical concepts, integrating it more closely with the EHR system, and making more workstations available at the point of care.
Affiliation(s)
- G. T. Berge: Department of Information Systems, University of Agder, Kristiansand, Norway; Department of Technology and eHealth, Sørlandet Hospital Trust, Kristiansand, Norway
- O. C. Granmo: Department of ICT, University of Agder, Grimstad, Norway
- T. O. Tveit: Department of Technology and eHealth, Sørlandet Hospital Trust, Kristiansand, Norway; Department of Anaesthesia and Intensive Care, Sørlandet Hospital Trust, Kristiansand, Norway; Research Department, Sørlandet Hospital Trust, Kristiansand, Norway
- B. E. Munkvold: Department of Information Systems, University of Agder, Kristiansand, Norway
- A. L. Ruthjersen: Department of Technology and eHealth, Sørlandet Hospital Trust, Kristiansand, Norway
- J. Sharma: Department of Technology and eHealth, Sørlandet Hospital Trust, Kristiansand, Norway; Department of ICT, University of Agder, Grimstad, Norway
10. D’Anniballe VM, Tushar FI, Faryna K, Han S, Mazurowski MA, Rubin GD, Lo JY. Multi-label annotation of text reports from computed tomography of the chest, abdomen, and pelvis using deep learning. BMC Med Inform Decis Mak 2022;22:102. doi:10.1186/s12911-022-01843-4. PMID: 35428335; PMCID: PMC9011942.
Abstract
Background
There is progress to be made in building artificially intelligent systems to detect abnormalities that are not only accurate but can handle the true breadth of findings that radiologists encounter in body (chest, abdomen, and pelvis) computed tomography (CT). Currently, the major bottleneck for developing multi-disease classifiers is a lack of manually annotated data. The purpose of this work was to develop high throughput multi-label annotators for body CT reports that can be applied across a variety of abnormalities, organs, and disease states thereby mitigating the need for human annotation.
Methods
We used a dictionary approach to develop rule-based algorithms (RBA) for extraction of disease labels from radiology text reports. We targeted three organ systems (lungs/pleura, liver/gallbladder, kidneys/ureters) with four diseases per system based on their prevalence in our dataset. To expand the algorithms beyond pre-defined keywords, attention-guided recurrent neural networks (RNN) were trained using the RBA-extracted labels to classify reports as being positive for one or more diseases or normal for each organ system. Alternative effects on disease classification performance were evaluated using random initialization or pre-trained embedding as well as different sizes of training datasets. The RBA was tested on a subset of 2158 manually labeled reports and performance was reported as accuracy and F-score. The RNN was tested against a test set of 48,758 reports labeled by RBA and performance was reported as area under the receiver operating characteristic curve (AUC), with 95% CIs calculated using the DeLong method.
Results
Manual validation of the RBA confirmed 91–99% accuracy across the 15 different labels. Our models extracted disease labels from 261,229 radiology reports of 112,501 unique subjects. Pre-trained models outperformed random initialization across all diseases. As the training dataset size was reduced, performance was robust except for a few diseases with a relatively small number of cases. Pre-trained classification AUCs reached > 0.95 for all four disease outcomes and normality across all three organ systems.
Conclusions
Our label-extracting pipeline was able to encompass a variety of cases and diseases in body CT reports by generalizing beyond strict rules with exceptional accuracy. The method described can be easily adapted to enable automated labeling of hospital-scale medical data sets for training image-based disease classifiers.
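A toy version of the dictionary/rule-based labeling idea described in the Methods is sketched below; the keyword lists and negation pattern are illustrative only and are far simpler than the study's actual rule sets.

```python
# Toy dictionary/rule-based labeler (RBA): keyword lists per disease label plus
# a crude negation check on the text preceding each matched term.
import re

KEYWORDS = {
    "lungs_atelectasis": ["atelectasis", "atelectatic"],
    "liver_lesion": ["hepatic lesion", "liver mass", "hepatic mass"],
    "kidney_stone": ["nephrolithiasis", "renal calculus", "kidney stone"],
}
NEGATIONS = re.compile(r"\b(no|without|negative for)\b[^.]*$", re.IGNORECASE)

def label_report(report_text):
    """Return the set of disease labels whose keywords appear un-negated."""
    labels = set()
    for sentence in re.split(r"(?<=[.])\s+", report_text.lower()):
        for label, terms in KEYWORDS.items():
            for term in terms:
                idx = sentence.find(term)
                if idx >= 0 and not NEGATIONS.search(sentence[:idx]):
                    labels.add(label)
    return labels

print(label_report("There is bibasilar atelectasis. No renal calculus is seen."))
# -> {'lungs_atelectasis'}
```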
11. Dedhia PH, Chen K, Song Y, LaRose E, Imbus JR, Peissig PL, Mendonca EA, Schneider DF. Ambiguous and incomplete: natural language processing reveals problematic reporting styles in thyroid ultrasound reports. Methods Inf Med 2022;61:11-18. doi:10.1055/s-0041-1740493. PMID: 34991173.
Abstract
OBJECTIVE Natural language processing (NLP) systems convert unstructured text into analyzable data. Here, we describe the performance measures of NLP to capture granular details on nodules from thyroid ultrasound (US) reports and reveal critical issues with reporting language. METHODS We iteratively developed NLP tools using clinical Text Analysis and Knowledge Extraction System (cTAKES) and thyroid US reports from 2007 to 2013. We incorporated nine nodule features for NLP extraction. Next, we evaluated the precision, recall, and accuracy of our NLP tools using a separate set of US reports from an academic medical center (A) and a regional health care system (B) during the same period. Two physicians manually annotated each test-set report. A third physician then adjudicated discrepancies. The adjudicated "gold standard" was then used to evaluate NLP performance on the test-set. RESULTS A total of 243 thyroid US reports contained 6,405 data elements. Inter-annotator agreement for all elements was 91.3%. Compared with the gold standard, overall recall of the NLP tool was 90%. NLP recall for thyroid lobe or isthmus characteristics was: laterality 96% and size 95%. NLP accuracy for nodule characteristics was: laterality 92%, size 92%, calcifications 76%, vascularity 65%, echogenicity 62%, contents 76%, and borders 40%. NLP recall for presence or absence of lymphadenopathy was 61%. Reporting style accounted for 18% errors. For example, the word "heterogeneous" interchangeably referred to nodule contents or echogenicity. While nodule dimensions and laterality were often described, US reports only described contents, echogenicity, vascularity, calcifications, borders, and lymphadenopathy, 46, 41, 17, 15, 9, and 41% of the time, respectively. Most nodule characteristics were equally likely to be described at hospital A compared with hospital B. CONCLUSIONS NLP can automate extraction of critical information from thyroid US reports. However, ambiguous and incomplete reporting language hinders performance of NLP systems regardless of institutional setting. Standardized or synoptic thyroid US reports could improve NLP performance.
Affiliation(s)
- Priya H Dedhia: Department of Surgery, Division of Surgical Oncology, Ohio State University Comprehensive Cancer Center and Ohio State University Wexner Medical Center, Columbus, Ohio, United States
- Kallie Chen: Department of Surgery, Division of Endocrine Surgery, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, United States
- Yiqiang Song: Department of Biostatistics and Medical Informatics, Department of Pediatrics, University of Wisconsin-Madison, Madison, Wisconsin, United States
- Eric LaRose: Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield Clinic Health System, Marshfield, Wisconsin, United States
- Joseph R Imbus: Department of Surgery, Division of Endocrine Surgery, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, United States
- Peggy L Peissig: Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield Clinic Health System, Marshfield, Wisconsin, United States
- Eneida A Mendonca: Department of Biostatistics and Medical Informatics, Department of Pediatrics, University of Wisconsin-Madison, Madison, Wisconsin, United States; Department of Pediatrics, Department of Biostatistics and Health Data Sciences, Indiana University, Indianapolis, Indiana, United States
- David F Schneider: Department of Surgery, Division of Endocrine Surgery, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, United States
12. Shin D, Kam HJ, Jeon MS, Kim HY. Automatic classification of thyroid findings using static and contextualized ensemble natural language processing systems: development study. JMIR Med Inform 2021;9:e30223. doi:10.2196/30223. PMID: 34546183; PMCID: PMC8493453.
Abstract
Background In the case of Korean institutions and enterprises that collect nonstandardized and nonunified formats of electronic medical examination results from multiple medical institutions, a group of experienced nurses who can understand the results and related contexts initially classified the reports manually. The classification guidelines were established by years of workers’ clinical experiences and there were attempts to automate the classification work. However, there have been problems in which rule-based algorithms or human labor–intensive efforts can be time-consuming or limited owing to high potential errors. We investigated natural language processing (NLP) architectures and proposed ensemble models to create automated classifiers. Objective This study aimed to develop practical deep learning models with electronic medical records from 284 health care institutions and open-source corpus data sets for automatically classifying 3 thyroid conditions: healthy, caution required, and critical. The primary goal is to increase the overall accuracy of the classification, yet there are practical and industrial needs to correctly predict healthy (negative) thyroid condition data, which are mostly medical examination results, and minimize false-negative rates under the prediction of healthy thyroid conditions. Methods The data sets included thyroid and comprehensive medical examination reports. The textual data are not only documented in fully complete sentences but also written in lists of words or phrases. Therefore, we propose static and contextualized ensemble NLP network (SCENT) systems to successfully reflect static and contextual information and handle incomplete sentences. We prepared each convolution neural network (CNN)-, long short-term memory (LSTM)-, and efficiently learning an encoder that classifies token replacements accurately (ELECTRA)-based ensemble model by training or fine-tuning them multiple times. Through comprehensive experiments, we propose 2 versions of ensemble models, SCENT-v1 and SCENT-v2, with the single-architecture–based CNN, LSTM, and ELECTRA ensemble models for the best classification performance and practical use, respectively. SCENT-v1 is an ensemble of CNN and ELECTRA ensemble models, and SCENT-v2 is a hierarchical ensemble of CNN, LSTM, and ELECTRA ensemble models. SCENT-v2 first classifies the 3 labels using an ELECTRA ensemble model and then reclassifies them using an ensemble model of CNN and LSTM if the ELECTRA ensemble model predicted them as “healthy” labels. Results SCENT-v1 outperformed all the suggested models, with the highest F1 score (92.56%). SCENT-v2 had the second-highest recall value (94.44%) and the fewest misclassifications for caution-required thyroid condition while maintaining 0 classification error for the critical thyroid condition under the prediction of the healthy thyroid condition. Conclusions The proposed SCENT demonstrates good classification performance despite the unique characteristics of the Korean language and problems of data lack and imbalance, especially for the extremely low amount of critical condition data. The result of SCENT-v1 indicates that different perspectives of static and contextual input token representations can enhance classification performance. SCENT-v2 has a strong impact on the prediction of healthy thyroid conditions.
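The hierarchical re-classification idea behind SCENT-v2 can be illustrated independently of the underlying networks. The sketch below re-checks only the cases a first-stage model labels "healthy" using a soft-voting second stage; the probability arrays are random stand-ins for the ELECTRA, CNN, and LSTM ensemble outputs.

```python
# Sketch of hierarchical re-classification: a first-stage model assigns
# healthy / caution / critical, and reports it calls "healthy" are re-checked
# by a second-stage soft-voting ensemble. All probabilities are toy values.
import numpy as np

LABELS = np.array(["healthy", "caution", "critical"])

def hierarchical_predict(p_stage1, p_cnn, p_lstm):
    first = p_stage1.argmax(axis=1)
    second = ((p_cnn + p_lstm) / 2).argmax(axis=1)    # soft-voting second stage
    # Re-classify only the cases the first stage predicted as "healthy" (index 0).
    return np.where(first == 0, second, first)

rng = np.random.default_rng(7)
p_stage1 = rng.dirichlet(np.ones(3), size=5)   # fake first-stage probabilities
p_cnn = rng.dirichlet(np.ones(3), size=5)
p_lstm = rng.dirichlet(np.ones(3), size=5)
print(LABELS[hierarchical_predict(p_stage1, p_cnn, p_lstm)])
```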
Affiliation(s)
- Dongyup Shin: Graduate School of Information, Yonsei University, Seoul, Republic of Korea
- Hye Jin Kam: Healthcare, Life Solution Cluster, New Business Unit, Hanwha Life Insurance Co Ltd, Seoul, Republic of Korea
- Min-Seok Jeon: Data Analysis Team, Aimmed Co Ltd, Seoul, Republic of Korea
- Ha Young Kim: Graduate School of Information, Yonsei University, Seoul, Republic of Korea
13.
Abstract
Natural language processing (NLP) is an interdisciplinary field, combining linguistics, computer science, and artificial intelligence to enable machines to read and understand human language for meaningful purposes. Recent advancements in deep learning have begun to offer significant improvements in NLP task performance. These techniques have the potential to create new automated tools that could improve clinical workflows and unlock unstructured textual information contained in radiology and clinical reports for the development of radiology and clinical artificial intelligence applications. These applications will combine the appropriate application of classic linguistic and NLP preprocessing techniques, modern NLP techniques, and modern deep learning techniques.
Affiliation(s)
- Jack W Luo: Department of Radiology, McGill University, 1001 Decarie Boulevard, Room B02.9375, Montreal, QC H4A 3J1, Canada
- Jaron J R Chong: Department of Medical Imaging, Western University, 800 Commissioners Road East, Room C1-609, London, ON N6A 5W9, Canada
14. Zhao Y, Fu S, Bielinski SJ, Decker PA, Chamberlain AM, Roger VL, Liu H, Larson NB. Natural language processing and machine learning for identifying incident stroke from electronic health records: algorithm development and validation. J Med Internet Res 2021;23:e22951. doi:10.2196/22951. PMID: 33683212; PMCID: PMC7985804.
Abstract
Background Stroke is an important clinical outcome in cardiovascular research. However, the ascertainment of incident stroke is typically accomplished via time-consuming manual chart abstraction. Current phenotyping efforts using electronic health records for stroke focus on case ascertainment rather than incident disease, which requires knowledge of the temporal sequence of events. Objective The aim of this study was to develop a machine learning–based phenotyping algorithm for incident stroke ascertainment based on diagnosis codes, procedure codes, and clinical concepts extracted from clinical notes using natural language processing. Methods The algorithm was trained and validated using an existing epidemiology cohort consisting of 4914 patients with atrial fibrillation (AF) with manually curated incident stroke events. Various combinations of feature sets and machine learning classifiers were compared. Using a heuristic rule based on the composition of concepts and codes, we further detected the stroke subtype (ischemic stroke/transient ischemic attack or hemorrhagic stroke) of each identified stroke. The algorithm was further validated using a cohort (n=150) stratified sampled from a population in Olmsted County, Minnesota (N=74,314). Results Among the 4914 patients with AF, 740 had validated incident stroke events. The best-performing stroke phenotyping algorithm used clinical concepts, diagnosis codes, and procedure codes as features in a random forest classifier. Among patients with stroke codes in the general population sample, the best-performing model achieved a positive predictive value of 86% (43/50; 95% CI 0.74-0.93) and a negative predictive value of 96% (96/100). For subtype identification, we achieved an accuracy of 83% in the AF cohort and 80% in the general population sample. Conclusions We developed and validated a machine learning–based algorithm that performed well for identifying incident stroke and for determining type of stroke. The algorithm also performed well on a sample from a general population, further demonstrating its generalizability and potential for adoption by other institutions.
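As a rough sketch of this kind of phenotyping classifier, the code below trains a random forest on a feature matrix that concatenates NLP-derived concept counts with diagnosis/procedure code indicators; the data are synthetic placeholders, not the study's cohort or feature definitions.

```python
# Sketch: random forest over concatenated NLP concept counts and code flags,
# evaluated with cross-validated AUC. All data are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_patients = 500
concept_counts = rng.poisson(1.0, size=(n_patients, 20))      # NLP concept features
code_indicators = rng.integers(0, 2, size=(n_patients, 30))   # diagnosis/procedure code flags
X = np.hstack([concept_counts, code_indicators])
y = rng.integers(0, 2, size=n_patients)                       # 1 = validated incident stroke

clf = RandomForestClassifier(n_estimators=300, random_state=0)
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print("5-fold AUC:", auc.round(3), "mean:", auc.mean().round(3))
```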
Affiliation(s)
- Yiqing Zhao: Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Sunyang Fu: Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Suzette J Bielinski: Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Paul A Decker: Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Alanna M Chamberlain: Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Veronique L Roger: Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Hongfang Liu: Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Nicholas B Larson: Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
15. Esmaeili M, Ayyoubzadeh SM, Ahmadinejad N, Ghazisaeedi M, Nahvijou A, Maghooli K. A decision support system for mammography reports interpretation. Health Inf Sci Syst 2020;8:17. doi:10.1007/s13755-020-00109-5. PMID: 32257128; PMCID: PMC7113352.
Abstract
PURPOSE Mammography plays a key role in the diagnosis of breast cancer; however, decision-making based on mammography reports is still challenging. This paper addresses the challenges of decision-making based on mammography reports and proposes a Clinical Decision Support System (CDSS) that uses data mining methods to help clinicians interpret mammography reports. METHODS For this purpose, 2441 mammography reports were collected from Imam Khomeini Hospital from March 21, 2018, to March 20, 2019. In the first step, these mammography reports were analyzed and program code was developed to transform the reports into a dataset. Then, the weight of every feature of the dataset was calculated. Random Forest, Naïve Bayes, K-nearest neighbor (K-NN), and Deep Learning classifiers were applied to the dataset to build a model capable of predicting the need for referral to biopsy. Afterward, the models were evaluated using cross-validation, measuring the Area Under the Curve (AUC), accuracy, sensitivity, and specificity. RESULTS The mammography type (diagnostic or screening) and the mass and calcification features mentioned in the reports are the most important features for decision-making. Results reveal that the K-NN model is the most accurate and specific classifier, with accuracy and specificity values of 84.06% and 84.72%, respectively. The Random Forest classifier has the best sensitivity and AUC, with values of 87.74% and 0.905, respectively. CONCLUSIONS Accordingly, data mining approaches prove to be a helpful tool for making the final decision as to whether patients should be referred to biopsy based on mammography reports. The developed CDSS may be especially helpful for less experienced radiologists.
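The classifier comparison described above follows a standard cross-validation pattern, sketched below with scikit-learn; the feature matrix is random and stands in for the structured features (exam type, mass, and calcification descriptors) parsed from the reports.

```python
# Sketch of cross-validated comparison of several classifiers on report-derived
# features. The feature matrix and labels are random placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_validate

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(400, 15)).astype(float)   # placeholder report features
y = rng.integers(0, 2, size=400)                       # 1 = referred to biopsy

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "NaiveBayes": GaussianNB(),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=1),
}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=5,
                            scoring=["accuracy", "roc_auc", "recall"])
    print(name,
          "acc=%.3f" % scores["test_accuracy"].mean(),
          "auc=%.3f" % scores["test_roc_auc"].mean(),
          "sens=%.3f" % scores["test_recall"].mean())
```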
Affiliation(s)
- Marzieh Esmaeili: Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, 3rd Floor, No #17, Farredanesh Alley, Ghods St, Enghelab Ave, Tehran, Iran; Scientific Research Center, Tehran University of Medical Sciences, Tehran, Iran
- Seyed Mohammad Ayyoubzadeh: Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, 3rd Floor, No #17, Farredanesh Alley, Ghods St, Enghelab Ave, Tehran, Iran; Scientific Research Center, Tehran University of Medical Sciences, Tehran, Iran
- Nasrin Ahmadinejad: Medical Imaging Cancer, Imam Khomeini Hospital, Cancer Research Institute, Tehran, Iran; Advanced Diagnostic and Interventional Radiology Research Cancer (ADIR), Tehran University of Medical Sciences, Tehran, Iran
- Marjan Ghazisaeedi: Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, 3rd Floor, No #17, Farredanesh Alley, Ghods St, Enghelab Ave, Tehran, Iran
- Azin Nahvijou: Cancer Research Center, Cancer Institute of Iran, Tehran University of Medical Sciences, Tehran, Iran
- Keivan Maghooli: Department of Biomedical Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
16. Identification of patients with carotid stenosis using natural language processing. Eur Radiol 2020;30:4125-4133. doi:10.1007/s00330-020-06721-z.
17. Liu Y, Liu Q, Han C, Zhang X, Wang X. The implementation of natural language processing to extract index lesions from breast magnetic resonance imaging reports. BMC Med Inform Decis Mak 2019;19:288. doi:10.1186/s12911-019-0997-3. PMID: 31888615; PMCID: PMC6937920.
Abstract
BACKGROUND There are often multiple lesions in breast magnetic resonance imaging (MRI) reports and radiologists usually focus on describing the index lesion that is most crucial to clinicians in determining the management and prognosis of patients. Natural language processing (NLP) has been used for information extraction from mammography reports. However, few studies have investigated NLP in breast MRI data based on free-form text. The objective of the current study was to assess the validity of our NLP program to accurately extract index lesions and their corresponding imaging features from free-form text of breast MRI reports. METHODS This cross-sectional study examined 1633 free-form text reports of breast MRIs from 2014 to 2017. First, the NLP system was used to extract 9 features from all the lesions in the reports according to the Breast Imaging Reporting and Data System (BI-RADS) descriptors. Second, the index lesion was defined as the lesion with the largest number of imaging features. Third, we extracted the values of each imaging feature and the BI-RADS category from each index lesion. To evaluate the accuracy of our system, 478 reports were manually reviewed by two individuals. The time taken to extract data by NLP was compared with that by reviewers. RESULTS The NLP system extracted 889 lesions from 478 reports. The mean number of imaging features per lesion was 6.5 ± 2.1 (range: 3-9; 95% CI: 6.362-6.638). The mean number of imaging features per index lesion was 8.0 ± 1.1 (range: 5-9; 95% CI: 7.901-8.099). The NLP system demonstrated a recall of 100.0% and a precision of 99.6% for correct identification of the index lesion. The recall and precision of NLP to correctly extract the value of imaging features from the index lesions were 91.0 and 92.6%, respectively. The recall and precision for the correct identification of the BI-RADS categories were 96.6 and 94.8%, respectively. NLP generated the total results in less than 1 s, whereas the manual reviewers averaged 4.47 min and 4.56 min per report. CONCLUSIONS Our NLP method successfully extracted the index lesion and its corresponding information from free-form text.
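The "index lesion = lesion with the most described features" rule can be illustrated with a small amount of code. The regular expressions below cover only a handful of BI-RADS descriptors and are illustrative, not the authors' extraction rules.

```python
# Toy sketch: extract a few BI-RADS descriptors per lesion description and
# pick the lesion with the most descriptors as the index lesion.
import re

FEATURE_PATTERNS = {
    "size": r"\b\d+(\.\d+)?\s*(mm|cm)\b",
    "shape": r"\b(oval|round|irregular)\b",
    "margin": r"\b(circumscribed|spiculated|irregular margin)\b",
    "enhancement": r"\b(homogeneous|heterogeneous|rim) enhancement\b",
    "location": r"\b(left|right) breast\b",
}

def extract_features(lesion_text):
    """Return the BI-RADS descriptors found in one lesion description."""
    return {name for name, pat in FEATURE_PATTERNS.items()
            if re.search(pat, lesion_text, re.IGNORECASE)}

def index_lesion(report_text):
    """Split a report into lesion descriptions and pick the most fully described one."""
    lesions = [seg.strip() for seg in report_text.split("\n") if seg.strip()]
    return max(lesions, key=lambda seg: len(extract_features(seg)))

report = ("Right breast 1.2 cm oval circumscribed mass with homogeneous enhancement.\n"
          "Left breast small focus of enhancement.")
best = index_lesion(report)
print(best, "->", extract_features(best))
```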
Affiliation(s)
- Yi Liu: Department of Radiology, Peking University First Hospital, No. 8 Xishiku Street, Xicheng District, Beijing, 100034, China
- Qing Liu: Department of Radiology, Peking University Cancer Hospital and Institute, No. 52 Fucheng Road, Haidian District, Beijing, China
- Chao Han: Department of Radiology, Peking University First Hospital, No. 8 Xishiku Street, Xicheng District, Beijing, 100034, China
- Xiaodong Zhang: Department of Radiology, Peking University First Hospital, No. 8 Xishiku Street, Xicheng District, Beijing, 100034, China
- Xiaoying Wang: Department of Radiology, Peking University First Hospital, No. 8 Xishiku Street, Xicheng District, Beijing, 100034, China
18. Hughes KS, Zhou J, Bao Y, Singh P, Wang J, Yin K. Natural language processing to facilitate breast cancer research and management. Breast J 2019;26:92-99. doi:10.1111/tbj.13718. PMID: 31854067.
Abstract
The medical literature has been growing exponentially, and its size has become a barrier for physicians to locate and extract clinically useful information. Natural language processing (NLP), especially machine learning (ML)-based NLP, is a technology that potentially provides a promising solution. ML-based NLP is based on training a computational algorithm with a large number of annotated examples to allow the computer to "learn" and "predict" the meaning of human language. Although NLP has been widely applied in industry and business, most physicians are still not aware of the huge potential of this technology in medicine, and the implementation of NLP in breast cancer research and management is fairly limited. Using a successful real-world project that identified penetrance papers for breast and other cancer susceptibility genes, this review illustrates how to train and evaluate an NLP-based medical abstract classifier, incorporate it into a semiautomatic meta-analysis procedure, and validate the effectiveness of this procedure. Other implementations of NLP technology in breast cancer research, such as parsing pathology reports and mining electronic healthcare records, are also discussed. We hope this review will help breast cancer physicians and researchers to recognize, understand, and apply this technology to meet their own clinical or research needs.
Affiliation(s)
- Kevin S Hughes
- Division of Surgical Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
- Jingan Zhou
- Division of Surgical Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, MA; Department of General Surgery, Beijing Anzhen Hospital, Capital Medical University, Beijing, China
- Yujia Bao
- Computer Science & Artificial Intelligence, Massachusetts Institute of Technology, Boston, MA
- Preeti Singh
- Division of Surgical Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
- Jin Wang
- Division of Surgical Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, MA; Department of Breast Oncology, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center of Cancer Medicine, Guangzhou, China
- Kanhua Yin
- Division of Surgical Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, MA

19
Banerjee I, Bozkurt S, Alkim E, Sagreiya H, Kurian AW, Rubin DL. Automatic inference of BI-RADS final assessment categories from narrative mammography report findings. J Biomed Inform 2019; 92:103137. [PMID: 30807833 DOI: 10.1016/j.jbi.2019.103137] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2018] [Revised: 10/02/2018] [Accepted: 02/15/2019] [Indexed: 12/29/2022]
Abstract
We propose an efficient natural language processing approach for inferring the BI-RADS final assessment categories by analyzing only the mammogram findings reported by the mammographer in narrative form. The proposed hybrid method integrates semantic term embedding with distributional semantics, producing a context-aware vector representation of unstructured mammography reports. A large corpus of unannotated mammography reports (300,000) was used to learn the context of the key-terms using a distributional semantics approach, and the trained model was applied to generate context-aware vector representations of the reports annotated with BI-RADS category (22,091). The vectorized reports were utilized to train a supervised classifier to derive the BI-RADS assessment class. Even though the majority of the proposed embedding pipeline is unsupervised, the classifier was able to recognize substantial semantic information for deriving the BI-RADS categorization not only on a holdout internal test set but also on an external validation set (1900 reports). Our proposed method outperforms a recently published domain-specific rule-based system and could be relevant for evaluating concordance between radiologists. With minimal requirement for task-specific customization, the proposed method can be easily transferred to a different domain to support large-scale text mining or derivation of patient phenotype.
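The general recipe above is report vectorization from unannotated text followed by a supervised classifier. The sketch below illustrates only that recipe with a toy embedding table, mean pooling, and scikit-learn logistic regression; it does not reproduce the paper's distributional-semantics pipeline, and all data are invented.

```python
# Sketch: average word embeddings to vectorize report findings, then train a
# supervised classifier on the labelled subset. Embedding table, reports, and
# labels are toy stand-ins for the paper's corpus.
import numpy as np
from sklearn.linear_model import LogisticRegression

embeddings = {              # hypothetical pretrained word vectors (dim = 4)
    "mass": np.array([0.9, 0.1, 0.0, 0.2]),
    "spiculated": np.array([0.8, 0.0, 0.1, 0.7]),
    "benign": np.array([0.0, 0.9, 0.1, 0.0]),
    "calcifications": np.array([0.3, 0.2, 0.8, 0.1]),
}

def report_vector(text):
    """Context-unaware baseline: mean of known word vectors in the report."""
    vecs = [embeddings[w] for w in text.lower().split() if w in embeddings]
    return np.mean(vecs, axis=0) if vecs else np.zeros(4)

reports = ["spiculated mass", "benign calcifications", "mass", "benign"]
birads = ["4", "2", "4", "2"]          # toy BI-RADS labels

X = np.vstack([report_vector(r) for r in reports])
clf = LogisticRegression().fit(X, birads)
print(clf.predict([report_vector("spiculated mass calcifications")]))
```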
Affiliation(s)
- Imon Banerjee
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA.
- Selen Bozkurt
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA; Department of Biostatistics and Medical Informatics, Faculty of Medicine, Akdeniz University, Antalya 07059, Turkey
- Emel Alkim
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Hersh Sagreiya
- Department of Radiology, Stanford University School of Medicine, Stanford, CA, USA
- Allison W Kurian
- Medicine (Oncology) and Health Research and Policy, Stanford University School of Medicine, Stanford, CA, USA
- Daniel L Rubin
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA; Department of Radiology, Stanford University School of Medicine, Stanford, CA, USA

20
Banerjee I, Choi HH, Desser T, Rubin DL. A Scalable Machine Learning Approach for Inferring Probabilistic US-LI-RADS Categorization. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018; 2018:215-224. [PMID: 30815059 PMCID: PMC6371287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
We propose a scalable computerized approach for large-scale inference of Liver Imaging Reporting and Data System (LI-RADS) final assessment categories in narrative ultrasound (US) reports. Although our model was trained on reports created using a LI-RADS template, it was also able to infer LI-RADS scoring for unstructured reports that were created before the LI-RADS guidelines were established. No human-labelled data was required in any step of this study; for training, LI-RADS scores were automatically extracted from those reports that contained structured LI-RADS scores, and it translated the derived knowledge to reasoning on unstructured radiology reports. By providing automated LI-RADS categorization, our approach may enable standardizing screening recommendations and treatment planning of patients at risk for hepatocellular carcinoma, and it may facilitate AI-based healthcare research with US images by offering large scale text mining and data gathering opportunities from standard hospital clinical data repositories.
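The key idea above is weak supervision: LI-RADS scores stated explicitly in templated reports become automatic labels for a classifier that is then applied to unstructured, pre-guideline reports. A hedged sketch of that idea follows; the regex, pipeline, and reports are illustrative assumptions, not the authors' system.

```python
# Sketch of the weak-supervision idea: harvest LI-RADS scores from reports that
# state them, use them as automatic labels, then score unstructured reports.
# Regex, model choice, and data are illustrative, not the paper's configuration.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

LIRADS_RE = re.compile(r"LI-RADS\s*(?:category\s*)?(\d)", re.IGNORECASE)

templated = [
    "Observation 1: 12 mm nodule, LI-RADS 3.",
    "No focal observation. LI-RADS 1.",
    "18 mm observation with washout, LI-RADS 4.",
]
labels = [LIRADS_RE.search(t).group(1) for t in templated]   # auto-extracted labels
training_text = [LIRADS_RE.sub("", t) for t in templated]    # drop the score itself

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(training_text, labels)

legacy_report = "Solid 17 mm lesion with washout appearance."  # pre-guideline report
print(model.predict([legacy_report])[0])
```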
Affiliation(s)
- Imon Banerjee
- Department of Biomedical Data Science, Stanford University School of Medicine, Medical School Office Building, Stanford, CA 94305-5479
- Hailye H Choi
- Department of Radiology, Stanford University School of Medicine, Stanford, CA 94305-5479
- Terry Desser
- Department of Radiology, Stanford University School of Medicine, Stanford, CA 94305-5479
- Daniel L Rubin
- Department of Biomedical Data Science, Stanford University School of Medicine, Medical School Office Building, Stanford, CA 94305-5479
- Department of Radiology, Stanford University School of Medicine, Stanford, CA 94305-5479

21
Miao S, Xu T, Wu Y, Xie H, Wang J, Jing S, Zhang Y, Zhang X, Yang Y, Zhang X, Shan T, Wang L, Xu H, Wang S, Liu Y. Extraction of BI-RADS findings from breast ultrasound reports in Chinese using deep learning approaches. Int J Med Inform 2018; 119:17-21. [DOI: 10.1016/j.ijmedinf.2018.08.009] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2017] [Revised: 08/14/2018] [Accepted: 08/16/2018] [Indexed: 01/10/2023]
22
Short RG, Befera NT, Hoang JK, Tailor TD. A Normal Thyroid by Any Other Name: Linguistic Analysis of Statements Describing a Normal Thyroid Gland from Noncontrast Chest CT Reports. J Am Coll Radiol 2018; 15:1642-1647. [DOI: 10.1016/j.jacr.2018.04.016] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2018] [Revised: 03/30/2018] [Accepted: 04/12/2018] [Indexed: 10/16/2022]
23
Hassanzadeh H, Nguyen A, Karimi S, Chu K. Transferability of artificial neural networks for clinical document classification across hospitals: A case study on abnormality detection from radiology reports. J Biomed Inform 2018; 85:68-79. [PMID: 30026067 DOI: 10.1016/j.jbi.2018.07.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2018] [Revised: 06/25/2018] [Accepted: 07/14/2018] [Indexed: 10/28/2022]
Abstract
OBJECTIVE Application of machine learning techniques for automatic and reliable classification of clinical documents has shown promising results. However, machine learning models require abundant training data specific to each target hospital and may not be able to benefit from available labeled data from each of the hospitals due to data variations. Such training data limitations have presented one of the major obstacles for maximising potential application of machine learning approaches in the healthcare domain. We investigated transferability of artificial neural network models across hospitals from different domains representing various age demographic groups (i.e., children, adults, and mixed) in order to cope with such limitations. MATERIALS AND METHODS We explored the transferability of artificial neural networks for clinical document classification. Our case study was to detect abnormalities from limb X-ray reports obtained from the emergency department (ED) of three hospitals within different domains. Different transfer learning scenarios were investigated in order to employ a source hospital's trained model for addressing a target hospital's abnormality detection problem. RESULTS A Convolutional Neural Network (CNN) model exhibited the best effectiveness compared to other networks when employing an embedding model trained on a large corpus of clinical documents. Furthermore, CNN models derived from a source hospital outperformed a conventional machine learning approach based on Support Vector Machines (SVM) when applied to a different (target) hospital. These models were further improved by leveraging available training data in target hospitals and outperformed the models that used only the target hospital data with F1-Score of 0.92-0.96 across three hospitals. DISCUSSION Our transfer learning model used only simple vector representations of documents without any task-specific feature engineering. Transferring the CNN model significantly improved (approx. 10% in F1-Score) the state-of-the-art approach for clinical document classification based on a trivial transferred model. In addition, the results showed that transfer learning techniques can further improve a CNN model that is trained only on either a source or target hospital's data. CONCLUSION Transferring a pre-trained CNN model generated in one hospital to another facilitates application of machine learning approaches that reduce the need for both hospital-specific feature engineering and training data.
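A minimal PyTorch sketch of the transfer scenario discussed above: a small 1-D CNN over token embeddings is pretrained on source-hospital reports and then fine-tuned on a handful of target-hospital reports. The architecture, hyperparameters, and random token data are assumptions for illustration only, not the paper's configuration.

```python
# Sketch of CNN transfer for report classification: pretrain on the source
# hospital, then fine-tune the same weights on limited target-hospital data.
import torch
import torch.nn as nn

class ReportCNN(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=32, n_filters=16):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.fc = nn.Linear(n_filters, 2)            # normal vs abnormal

    def forward(self, x):                            # x: (batch, seq_len) token ids
        h = self.emb(x).transpose(1, 2)              # (batch, emb_dim, seq_len)
        h = torch.relu(self.conv(h)).max(dim=2).values
        return self.fc(h)

def train(model, X, y, epochs=5, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()

model = ReportCNN()
X_src = torch.randint(1, 1000, (64, 50))             # source-hospital reports (toy)
y_src = torch.randint(0, 2, (64,))
train(model, X_src, y_src)                            # pretrain at source hospital

X_tgt = torch.randint(1, 1000, (8, 50))               # small target-hospital sample
y_tgt = torch.randint(0, 2, (8,))
train(model, X_tgt, y_tgt, epochs=3, lr=1e-4)         # fine-tune transferred model
```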
Affiliation(s)
- Hamed Hassanzadeh
- The Australian e-Health Research Centre, CSIRO, Brisbane, Australia.
- Anthony Nguyen
- The Australian e-Health Research Centre, CSIRO, Brisbane, Australia.
- Kevin Chu
- Royal Brisbane and Women's Hospital, Queensland Health, Brisbane, Australia.

24
Zech J, Pain M, Titano J, Badgeley M, Schefflein J, Su A, Costa A, Bederson J, Lehar J, Oermann EK. Natural Language–based Machine Learning Models for the Annotation of Clinical Radiology Reports. Radiology 2018; 287:570-580. [DOI: 10.1148/radiol.2018171093] [Citation(s) in RCA: 86] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
Affiliation(s)
- John Zech, Margaret Pain, Joseph Titano, Marcus Badgeley, Javin Schefflein, Andres Su, Anthony Costa, Joshua Bederson, Joseph Lehar, Eric Karl Oermann
- From the Departments of Radiology (J.Z., J.T., J.S., A.S.) and Neurosurgery (M.P., M.B., A.C., J.B., E.K.O.), Icahn School of Medicine, 1 Gustave Levy Pl, New York, NY 10029; and Department of Bioengineering and Bioinformatics, Boston University, Boston, Mass (J.L.)

25
Automated annotation and classification of BI-RADS assessment from radiology reports. J Biomed Inform 2017; 69:177-187. [PMID: 28428140 DOI: 10.1016/j.jbi.2017.04.011] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2016] [Revised: 04/12/2017] [Accepted: 04/14/2017] [Indexed: 01/09/2023]
Abstract
The Breast Imaging Reporting and Data System (BI-RADS) was developed to reduce variation in the descriptions of findings. Manual analysis of breast radiology report data is challenging but is necessary for clinical and healthcare quality assurance activities. The objective of this study is to develop a natural language processing (NLP) system for automated BI-RADS categories extraction from breast radiology reports. We evaluated an existing rule-based NLP algorithm, and then we developed and evaluated our own method using a supervised machine learning approach. We divided the BI-RADS category extraction task into two specific tasks: (1) annotation of all BI-RADS category values within a report, (2) classification of the laterality of each BI-RADS category value. We used one algorithm for task 1 and evaluated three algorithms for task 2. Across all evaluations and model training, we used a total of 2159 radiology reports from 18 hospitals, from 2003 to 2015. Performance with the existing rule-based algorithm was not satisfactory. Conditional random fields showed a high performance for task 1 with an F-1 measure of 0.95. Rules from partial decision trees (PART) algorithm showed the best performance across classes for task 2 with a weighted F-1 measure of 0.91 for BIRADS 0-6, and 0.93 for BIRADS 3-5. Classification performance by class showed that performance improved for all classes from Naïve Bayes to Support Vector Machine (SVM), and also from SVM to PART. Our system is able to annotate and classify all BI-RADS mentions present in a single radiology report and can serve as the foundation for future studies that will leverage automated BI-RADS annotation, to provide feedback to radiologists as part of a learning health system loop.
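The study above splits the problem into annotating every BI-RADS category mention and then classifying its laterality (the published system used conditional random fields and PART rules). The snippet below is a deliberately simplified rule-based stand-in for those two subtasks, with a hand-chosen regex and context window; it is not the paper's model.

```python
# Simplified rule-based stand-in for the two subtasks described above:
# (1) annotate every BI-RADS category mention, (2) assign laterality from
# nearby context. The published system used CRF and PART classifiers instead.
import re

BIRADS_RE = re.compile(r"BI-?RADS[\s:]*([0-6])", re.IGNORECASE)

def annotate(report):
    findings = []
    for m in BIRADS_RE.finditer(report):
        window = report[max(0, m.start() - 60): m.start()].lower()
        if "left" in window and "right" not in window:
            side = "left"
        elif "right" in window and "left" not in window:
            side = "right"
        else:
            side = "bilateral/unspecified"
        findings.append({"category": m.group(1), "laterality": side})
    return findings

report = ("Right breast: stable benign calcifications, BI-RADS 2. "
          "Left breast: new spiculated mass, BI-RADS 4.")
print(annotate(report))
# [{'category': '2', 'laterality': 'right'}, {'category': '4', 'laterality': 'left'}]
```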
26
Pons E, Braun LMM, Hunink MGM, Kors JA. Natural Language Processing in Radiology: A Systematic Review. Radiology 2016; 279:329-43. [PMID: 27089187 DOI: 10.1148/radiol.16142770] [Citation(s) in RCA: 318] [Impact Index Per Article: 35.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023]
Abstract
Radiological reporting has generated large quantities of digital content within the electronic health record, which is potentially a valuable source of information for improving clinical care and supporting research. Although radiology reports are stored for communication and documentation of diagnostic imaging, harnessing their potential requires efficient and automated information extraction: they exist mainly as free-text clinical narrative, from which it is a major challenge to obtain structured data. Natural language processing (NLP) provides techniques that aid the conversion of text into a structured representation, and thus enables computers to derive meaning from human (ie, natural language) input. Used on radiology reports, NLP techniques enable automatic identification and extraction of information. By exploring the various purposes for their use, this review examines how radiology benefits from NLP. A systematic literature search identified 67 relevant publications describing NLP methods that support practical applications in radiology. This review takes a close look at the individual studies in terms of tasks (ie, the extracted information), the NLP methodology and tools used, and their application purpose and performance results. Additionally, limitations, future challenges, and requirements for advancing NLP in radiology will be discussed.
Affiliation(s)
- Ewoud Pons, Loes M M Braun, M G Myriam Hunink, Jan A Kors
- From the Departments of Radiology (E.P., L.M.M.B., M.G.M.H.) and Medical Informatics (J.A.K.), Erasmus Medical Center, PO Box 2040, 3000 CA Rotterdam, the Netherlands

27
Bozkurt S, Gimenez F, Burnside ES, Gulkesen KH, Rubin DL. Using automatically extracted information from mammography reports for decision-support. J Biomed Inform 2016; 62:224-31. [PMID: 27388877 DOI: 10.1016/j.jbi.2016.07.001] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2016] [Revised: 06/22/2016] [Accepted: 07/02/2016] [Indexed: 02/07/2023]
Abstract
OBJECTIVE To evaluate a system we developed that connects natural language processing (NLP) for information extraction from narrative text mammography reports with a Bayesian network for decision-support about breast cancer diagnosis. The ultimate goal of this system is to provide decision support as part of the workflow of producing the radiology report. MATERIALS AND METHODS We built a system that uses an NLP information extraction system (which extracts BI-RADS descriptors and clinical information from mammography reports) to provide the necessary inputs to a Bayesian network (BN) decision support system (DSS) that estimates lesion malignancy from BI-RADS descriptors. We used this integrated system to predict diagnosis of breast cancer from radiology text reports and evaluated it with a reference standard of 300 mammography reports. We collected two different outputs from the DSS: (1) the probability of malignancy and (2) the BI-RADS final assessment category. Since NLP may produce imperfect inputs to the DSS, we compared the difference between using perfect ("reference standard") structured inputs to the DSS ("RS-DSS") vs NLP-derived inputs ("NLP-DSS") on the output of the DSS using the concordance correlation coefficient. We measured the classification accuracy of the BI-RADS final assessment category when using NLP-DSS, compared with the ground truth category established by the radiologist. RESULTS The NLP-DSS and RS-DSS had closely matched probabilities, with a mean paired difference of 0.004±0.025. The concordance correlation of these paired measures was 0.95. The accuracy of the NLP-DSS to predict the correct BI-RADS final assessment category was 97.58%. CONCLUSION The accuracy of the information extracted from mammography reports using the NLP system was sufficient to provide accurate DSS results. We believe our system could ultimately reduce the variation in practice in mammography related to assessment of malignant lesions and improve management decisions.
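The paired DSS outputs above are compared with the concordance correlation coefficient. A small numpy sketch of that statistic, assuming Lin's standard formula and made-up paired probabilities, is shown here for reference.

```python
# Sketch: Lin's concordance correlation coefficient, the statistic used above
# to compare DSS probabilities driven by NLP inputs vs reference-standard
# inputs. The paired probabilities below are invented for illustration.
import numpy as np

def concordance_ccc(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()                    # population variances
    cov = ((x - mx) * (y - my)).mean()
    return 2 * cov / (vx + vy + (mx - my) ** 2)

p_rs_dss = [0.02, 0.10, 0.35, 0.80, 0.95]        # probabilities from perfect inputs
p_nlp_dss = [0.03, 0.12, 0.33, 0.78, 0.96]       # probabilities from NLP inputs
print(round(concordance_ccc(p_rs_dss, p_nlp_dss), 3))
```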
Affiliation(s)
- Selen Bozkurt
- Akdeniz University Faculty of Medicine, Department of Biostatistics and Medical Informatics, Antalya, Turkey
- Francisco Gimenez
- Department of Radiology and Medicine (Biomedical Informatics Research), Stanford University, Richard M. Lucas Center, 1201 Welch Road, Office P285, Stanford, CA 94305-5488, United States
- Kemal H Gulkesen
- Akdeniz University Faculty of Medicine, Department of Biostatistics and Medical Informatics, Antalya, Turkey
- Daniel L Rubin
- Department of Radiology and Medicine (Biomedical Informatics Research), Stanford University, Richard M. Lucas Center, 1201 Welch Road, Office P285, Stanford, CA 94305-5488, United States.

28
Lacson R, Harris K, Brawarsky P, Tosteson TD, Onega T, Tosteson ANA, Kaye A, Gonzalez I, Birdwell R, Haas JS. Evaluation of an Automated Information Extraction Tool for Imaging Data Elements to Populate a Breast Cancer Screening Registry. J Digit Imaging 2016; 28:567-75. [PMID: 25561069 DOI: 10.1007/s10278-014-9762-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022] Open
Abstract
Breast cancer screening is central to early breast cancer detection. Identifying and monitoring process measures for screening is a focus of the National Cancer Institute's Population-based Research Optimizing Screening through Personalized Regimens (PROSPR) initiative, which requires participating centers to report structured data across the cancer screening continuum. We evaluate the accuracy of automated information extraction of imaging findings from radiology reports, which are available as unstructured text. We present prevalence estimates of imaging findings for breast imaging received by women who obtained care in a primary care network participating in PROSPR (n = 139,953 radiology reports) and compared automatically extracted data elements to a "gold standard" based on manual review for a validation sample of 941 randomly selected radiology reports, including mammograms, digital breast tomosynthesis, ultrasound, and magnetic resonance imaging (MRI). The prevalence of imaging findings varies by data element and modality (e.g., suspicious calcification noted in 2.6% of screening mammograms, 12.1% of diagnostic mammograms, and 9.4% of tomosynthesis exams). In the validation sample, the accuracy of identifying imaging findings, including suspicious calcifications, masses, and architectural distortion (on mammogram and tomosynthesis); masses, cysts, non-mass enhancement, and enhancing foci (on MRI); and masses and cysts (on ultrasound), ranges from 0.8 to 1.0 for recall, precision, and F-measure. Information extraction tools can be used for accurate documentation of imaging findings as structured data elements from text reports for a variety of breast imaging modalities. These data can be used to populate screening registries to help elucidate more effective breast cancer screening processes.
Affiliation(s)
- Ronilda Lacson
- Department of Radiology, Brigham and Women's Hospital, 75 Francis Street, Boston, MA, 02115, USA.
- Harvard Medical School, Boston, MA, USA.
- Kimberly Harris
- Department of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
- Phyllis Brawarsky
- Department of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
- Tor D Tosteson
- Department of Community and Family Medicine, The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA
- Tracy Onega
- Department of Community and Family Medicine, The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA
- Anna N A Tosteson
- Department of Community and Family Medicine, The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA
- Department of Medicine, The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA
- Abby Kaye
- Department of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
- Irina Gonzalez
- Department of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
- Robyn Birdwell
- Department of Radiology, Brigham and Women's Hospital, 75 Francis Street, Boston, MA, 02115, USA
- Harvard Medical School, Boston, MA, USA
- Jennifer S Haas
- Department of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA

29
Burnside ES, Liu J, Wu Y, Onitilo AA, McCarty CA, Page CD, Peissig PL, Trentham-Dietz A, Kitchner T, Fan J, Yuan M. Comparing Mammography Abnormality Features to Genetic Variants in the Prediction of Breast Cancer in Women Recommended for Breast Biopsy. Acad Radiol 2016; 23:62-9. [PMID: 26514439 DOI: 10.1016/j.acra.2015.09.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2015] [Revised: 09/15/2015] [Accepted: 09/28/2015] [Indexed: 01/10/2023]
Abstract
RATIONALE AND OBJECTIVES The discovery of germline genetic variants associated with breast cancer has engendered interest in risk stratification for improved, targeted detection and diagnosis. However, there has yet to be a comparison of the predictive ability of these genetic variants with mammography abnormality descriptors. MATERIALS AND METHODS Our institutional review board-approved, Health Insurance Portability and Accountability Act-compliant study utilized a personalized medicine registry in which participants consented to provide a DNA sample and to participate in longitudinal follow-up. In our retrospective, age-matched, case-controlled study of 373 cases and 395 controls who underwent breast biopsy, we collected risk factors selected a priori based on the literature, including demographic variables based on the Gail model, common germline genetic variants, and diagnostic mammography findings according to Breast Imaging Reporting and Data System (BI-RADS). We developed predictive models using logistic regression to determine the predictive ability of (1) demographic variables, (2) 10 selected genetic variants, or (3) mammography BI-RADS features. We evaluated each model in turn by calculating a risk score for each patient using 10-fold cross-validation, used this risk estimate to construct receiver operating characteristic (ROC) curves, and compared the area under the ROC curve (AUC) of each using the DeLong method. RESULTS The performance of the regression model using demographic risk factors was not statistically different from the model using genetic variants (P = 0.9). The model using mammography features (AUC = 0.689) was superior to both the demographic model (AUC = 0.598; P < 0.001) and the genetic model (AUC = 0.601; P < 0.001). CONCLUSIONS BI-RADS features exceeded the ability of demographic and 10 selected germline genetic variants to predict breast cancer in women recommended for biopsy.
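A minimal scikit-learn sketch of the evaluation recipe described above (logistic regression per feature set, 10-fold cross-validated risk scores, AUC comparison) follows. The data are synthetic placeholders, and the DeLong test used in the paper is omitted because it is not part of scikit-learn.

```python
# Sketch of the evaluation recipe: fit logistic regression on each feature set,
# obtain 10-fold cross-validated risk scores, and compare AUCs. Synthetic data;
# the DeLong comparison from the paper is not reproduced here.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, n)                               # biopsy outcome (toy)
demographics = rng.normal(size=(n, 3)) + 0.2 * y[:, None]
snps = rng.integers(0, 3, (n, 10)) + (rng.random((n, 10)) < 0.05 * y[:, None])
birads = rng.normal(size=(n, 5)) + 0.6 * y[:, None]     # mammography descriptors

for name, X in [("demographic", demographics), ("genetic", snps), ("BI-RADS", birads)]:
    scores = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                               cv=10, method="predict_proba")[:, 1]
    print(f"{name} model AUC: {roc_auc_score(y, scores):.3f}")
```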
30
Rassu PC. Observed outcomes on the use of oxidized and regenerated cellulose polymer for breast conserving surgery - A case series. Ann Med Surg (Lond) 2015; 5:57-66. [PMID: 26865976 PMCID: PMC4709468 DOI: 10.1016/j.amsu.2015.12.050] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2015] [Revised: 12/01/2015] [Accepted: 12/19/2015] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Oxidized regenerated cellulose polymer (ORCP) may be used for reshaping and filling lack of volume in breast-conserving surgery (BCS). The study aimed to observe both the aesthetic and diagnostic outcomes in patients of different age, BMI, breast volume, and breast tissue composition over 36 months after BCS with ORCP. PATIENTS AND METHODS 18 patients with early breast cancer and with proliferative benign lesions underwent BCS with ORCP that was layered in a three-dimensional wafer and placed into the Chassaignac space between the mammary gland and the fascia of pectoralis major with no fixation. After surgery, patients started a clinical and instrumental 36-month follow-up with mammography, ultrasonography, magnetic resonance imaging (MRI) and cytological examination with fine needle aspiration when seroma occurred. RESULTS Below the median age of 66 years, no complications were observed, even in cases of both overweight and large breasts with low density. Above the median age, seromas occurred with either small or large skin retraction, with the exception of 1 patient with quite dense breasts and low BMI, who had no complications. In elderly patients, 1 case with quite dense breasts and high BMI showed severe seroma and skin retraction, while 1 case with low BMI and less dense breasts showed milder complications. CONCLUSION During 36 months after BCS with ORCP, a significant correlation between positive diagnostic and aesthetic outcomes and low patient age, dense breasts, and low BMI was observed. Despite the small number of cases, either low BMI or high breast density improved the aesthetic outcomes and reduced the severity of complications, even in elderly patients.
Affiliation(s)
- Pier Carlo Rassu
- MD, SC General Surgery, "San Giacomo" Hospital, Via Edilio Raggio, 12, 15067 Novi Ligure, AL, Italy

31
Ng KH, Lau S. Vision 20/20: Mammographic breast density and its clinical applications. Med Phys 2015; 42:7059-77. [PMID: 26632060 DOI: 10.1118/1.4935141] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022] Open
Affiliation(s)
- Kwan-Hoong Ng
- Department of Biomedical Imaging and University of Malaya Research Imaging Centre, Faculty of Medicine, University of Malaya, 50603 Kuala Lumpur, Malaysia
- Susie Lau
- Department of Biomedical Imaging and University of Malaya Research Imaging Centre, Faculty of Medicine, University of Malaya, 50603 Kuala Lumpur, Malaysia

32
Medina García R, Torres Serrano E, Segrelles Quilis JD, Blanquer Espert I, Martí Bonmatí L, Almenar Cubells D. A systematic approach for using DICOM structured reports in clinical processes: focus on breast cancer. J Digit Imaging 2015; 28:132-45. [PMID: 25200428 PMCID: PMC4359202 DOI: 10.1007/s10278-014-9728-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022] Open
Abstract
This paper describes a methodology for redesigning the clinical processes to manage diagnosis, follow-up, and response to treatment episodes of breast cancer. This methodology includes three fundamental elements: (1) identification of similar and contrasting cases that may be of clinical relevance based upon a target study, (2) codification of reports with standard medical terminologies, and (3) linking and indexing the structured reports obtained with different techniques in a common system. The combination of these elements should lead to improvements in the clinical management of breast cancer patients. The motivation for this work is the adaptation of the clinical processes for breast cancer created by the Valencian Community health authorities to the new techniques available for data processing. To achieve this adaptation, it was necessary to design nine Digital Imaging and Communications in Medicine (DICOM) structured report templates: six diagnosis templates and three summary templates that combine reports from clinical episodes. A prototype system is also described that links the lesion to the reports. Preliminary tests of the prototype have shown that the interoperability among the report templates allows correlating parameters from different reports. Further work is in progress to improve the methodology in order that it can be applied to clinical practice.
Affiliation(s)
- Erik Torres Serrano
- Institute for Molecular Imaging Technologies (I3M), Universitat Politècnica de València (UPVLC), Camino de Vera S/N, 46022 Valencia, Spain
- Luis Martí Bonmatí
- Medical Imaging Unit, University and Polytechnic Hospital La Fe, Valencia, Spain

33
Liu J, Wu Y, Ong I, Page D, Peissig P, McCarty C, Onitilo AA, Burnside E. Leveraging Interaction between Genetic Variants and Mammographic Findings for Personalized Breast Cancer Diagnosis. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2015; 2015:107-11. [PMID: 26306250 PMCID: PMC4525263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
Recent large-scale genome-wide association studies (GWAS) have identified a number of genetic variants associated with breast cancer which showed great potential for clinical translation, especially in breast cancer diagnosis via mammograms. However, the amount of interaction between these genetic variants and mammographic features that can be leveraged for personalized diagnosis remains unknown. Our study utilizes germline genetic variants and mammographic features that we collected in a breast cancer case-control study. By computing the conditional mutual information between the genetic variants and mammographic features given the breast cancer status, we identified six interaction pairs which elevate breast cancer risk and five interaction pairs which reduce breast cancer risk.
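The ranking quantity above is conditional mutual information between a genetic variant and a mammographic feature given breast cancer status. A worked numpy sketch of estimating I(X; Y | Z) from counts follows; the data are synthetic and the thresholds and variable encodings are assumptions.

```python
# Sketch: estimate conditional mutual information I(X; Y | Z) from counts, the
# quantity used above to rank variant-mammography interaction pairs.
# X = genetic variant, Y = mammographic feature, Z = breast cancer status.
import numpy as np

def conditional_mutual_information(x, y, z):
    x, y, z = map(np.asarray, (x, y, z))
    cmi = 0.0
    for zv in np.unique(z):
        mask = z == zv
        pz = mask.mean()
        xs, ys = x[mask], y[mask]
        for xv in np.unique(xs):
            for yv in np.unique(ys):
                pxyz = np.mean((xs == xv) & (ys == yv))
                if pxyz == 0:
                    continue
                pxz = np.mean(xs == xv)
                pyz = np.mean(ys == yv)
                cmi += pz * pxyz * np.log2(pxyz / (pxz * pyz))
    return cmi

rng = np.random.default_rng(1)
cancer = rng.integers(0, 2, 5000)                        # Z
variant = rng.integers(0, 2, 5000)                       # X (toy genotype)
feature = (variant & cancer) | rng.integers(0, 2, 5000)  # Y depends on X only when Z = 1
print(round(conditional_mutual_information(variant, feature, cancer), 4))
```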
Affiliation(s)
- Jie Liu
- University of Wisconsin, Madison, WI, US
- Yirong Wu
- University of Wisconsin, Madison, WI, US
- Irene Ong
- University of Wisconsin, Madison, WI, US
- David Page
- University of Wisconsin, Madison, WI, US
- Peggy Peissig
- Marshfield Clinic Research Foundation, Marshfield, WI, US
- Adedayo A. Onitilo
- Marshfield Clinic Research Foundation, Marshfield, WI, US; Department of Hematology/Oncology, Marshfield Clinic Weston Center, Weston, WI, US

34
Gao H, Aiello Bowles EJ, Carrell D, Buist DSM. Using natural language processing to extract mammographic findings. J Biomed Inform 2015; 54:77-84. [PMID: 25661260 DOI: 10.1016/j.jbi.2015.01.010] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2014] [Revised: 01/21/2015] [Accepted: 01/25/2015] [Indexed: 11/28/2022]
Abstract
OBJECTIVE Structured data on mammographic findings are difficult to obtain without manual review. We developed and evaluated a rule-based natural language processing (NLP) system to extract mammographic findings from free-text mammography reports. MATERIALS AND METHODS The NLP system extracted four mammographic findings: mass, calcification, asymmetry, and architectural distortion, using a dictionary look-up method on 93,705 mammography reports from Group Health. Status annotations and anatomical location annotations were associated with each NLP-detected finding through association rules. After excluding negated, uncertain, and historical findings, affirmative mentions of detected findings were summarized. Confidence flags were developed to denote reports with highly confident NLP results and reports with possible NLP errors. A random sample of 100 reports was manually abstracted to evaluate the accuracy of the system. RESULTS The NLP system correctly coded 96-99 of the 100 reports in our sample, depending on the finding. Measures of sensitivity, specificity and negative predictive values exceeded 0.92 for all findings. Positive predictive values were relatively low for some findings due to their low prevalence. DISCUSSION Our NLP system was implemented entirely in SAS Base, which makes it portable and easy to implement. It performed reasonably well with multiple applications, such as using confidence flags as a filter to improve the efficiency of manual review. Refinement of the library and association rules, and testing on more diverse samples, may further improve its performance. CONCLUSION Our NLP system successfully extracts clinically useful information from mammography reports. Moreover, SAS is a feasible platform for implementing NLP algorithms.
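A toy version of the dictionary look-up approach with a crude negation/uncertainty/history filter is sketched below. The term lists, cue words, and 40-character context window are illustrative assumptions, not the rules used at Group Health.

```python
# Toy version of the dictionary look-up approach: find finding terms, then drop
# mentions preceded by negation, uncertainty, or historical cues.
import re

FINDING_TERMS = ["mass", "calcification", "asymmetry", "architectural distortion"]
EXCLUDE_CUES = ["no ", "without ", "negative for", "possible", "prior", "history of"]

def affirmative_findings(report):
    text = report.lower()
    found = set()
    for term in FINDING_TERMS:
        for m in re.finditer(re.escape(term), text):
            context = text[max(0, m.start() - 40): m.start()]
            if not any(cue in context for cue in EXCLUDE_CUES):
                found.add(term)
    return sorted(found)

print(affirmative_findings(
    "There is a spiculated mass in the left breast. No suspicious "
    "calcification. History of architectural distortion, now resolved."))
# ['mass']
```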
Affiliation(s)
- Hongyuan Gao
- Group Health Research Institute, Seattle, WA, USA.

35
Automated extraction of BI-RADS final assessment categories from radiology reports with natural language processing. J Digit Imaging 2014; 26:989-94. [PMID: 23868515 DOI: 10.1007/s10278-013-9616-5] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022] Open
Abstract
The objective of this study is to evaluate a natural language processing (NLP) algorithm that determines American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) final assessment categories from radiology reports. This HIPAA-compliant study was granted institutional review board approval with waiver of informed consent. This cross-sectional study involved 1,165 breast imaging reports in the electronic medical record (EMR) from a tertiary care academic breast imaging center from 2009. Reports included screening mammography, diagnostic mammography, breast ultrasound, combined diagnostic mammography and breast ultrasound, and breast magnetic resonance imaging studies. Over 220 reports were included from each study type. The recall (sensitivity) and precision (positive predictive value) of a NLP algorithm to collect BI-RADS final assessment categories stated in the report final text was evaluated against a manual human review standard reference. For all breast imaging reports, the NLP algorithm demonstrated a recall of 100.0 % (95 % confidence interval (CI), 99.7, 100.0 %) and a precision of 96.6 % (95 % CI, 95.4, 97.5 %) for correct identification of BI-RADS final assessment categories. The NLP algorithm demonstrated high recall and precision for extraction of BI-RADS final assessment categories from the free text of breast imaging reports. NLP may provide an accurate, scalable data extraction mechanism from reports within EMRs to create databases to track breast imaging performance measures and facilitate optimal breast cancer population management strategies.
36
Liu J, Page D, Peissig P, McCarty C, Onitilo AA, Trentham-Dietz A, Burnside E. New genetic variants improve personalized breast cancer diagnosis. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2014; 2014:83-9. [PMID: 25717406 PMCID: PMC4333695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
Recent large-scale genome-wide association studies (GWAS) have identified a number of new genetic variants associated with breast cancer. However, the degree to which these genetic variants improve breast cancer diagnosis in concert with mammography remains unknown. We conducted a case-control study and collected mammography features and 77 genetic variants which reflect the state of the art GWAS findings on breast cancer. A naïve Bayes model was developed on the mammography features and these genetic variants. We observed that the incorporation of the genetic variants significantly improved breast cancer diagnosis based on mammographic findings.
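A minimal scikit-learn sketch of the comparison described above (a naive Bayes model on mammography features alone versus mammography features plus genetic variants, scored by AUC) is shown here. All data are synthetic placeholders; the paper's actual features and variants are not reproduced.

```python
# Sketch: naive Bayes on mammography features alone vs mammography features
# plus genetic variants, compared by cross-validated AUC. Data are synthetic.
import numpy as np
from sklearn.naive_bayes import BernoulliNB
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n = 600
y = rng.integers(0, 2, n)                                   # cancer status (toy)
mammo = (rng.random((n, 12)) < 0.2 + 0.2 * y[:, None]).astype(int)
variants = (rng.random((n, 77)) < 0.3 + 0.05 * y[:, None]).astype(int)

for name, X in [("mammography only", mammo),
                ("mammography + variants", np.hstack([mammo, variants]))]:
    p = cross_val_predict(BernoulliNB(), X, y, cv=10, method="predict_proba")[:, 1]
    print(f"{name}: AUC = {roc_auc_score(y, p):.3f}")
```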
Affiliation(s)
- Jie Liu
- University of Wisconsin, Madison, WI, US
- David Page
- University of Wisconsin, Madison, WI, US
- Peggy Peissig
- Marshfield Clinic Research Foundation, Marshfield, WI, US
- Adedayo A Onitilo
- Marshfield Clinic Research Foundation, Marshfield, WI, US; Department of Hematology/Oncology, Marshfield Clinic Weston Center, Weston, WI, US; School of Population Health, University of Queensland, Brisbane, Australia

37
Bui DDA, Zeng-Treitler Q. Learning regular expressions for clinical text classification. J Am Med Inform Assoc 2014; 21:850-7. [PMID: 24578357 DOI: 10.1136/amiajnl-2013-002411] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022] Open
Abstract
OBJECTIVES Natural language processing (NLP) applications typically use regular expressions that have been developed manually by human experts. Our goal is to automate both the creation and utilization of regular expressions in text classification. METHODS We designed a novel regular expression discovery (RED) algorithm and implemented two text classifiers based on RED. The RED+ALIGN classifier combines RED with an alignment algorithm, and RED+SVM combines RED with a support vector machine (SVM) classifier. Two clinical datasets were used for testing and evaluation: the SMOKE dataset, containing 1091 text snippets describing smoking status; and the PAIN dataset, containing 702 snippets describing pain status. We performed 10-fold cross-validation to calculate accuracy, precision, recall, and F-measure metrics. In the evaluation, an SVM classifier was trained as the control. RESULTS The two RED classifiers achieved 80.9-83.0% in overall accuracy on the two datasets, which is 1.3-3% higher than SVM's accuracy (p<0.001). Similarly, small but consistent improvements have been observed in precision, recall, and F-measure when RED classifiers are compared with SVM alone. More significantly, RED+ALIGN correctly classified many instances that were misclassified by the SVM classifier (8.1-10.3% of the total instances and 43.8-53.0% of SVM's misclassifications). CONCLUSIONS Machine-generated regular expressions can be effectively used in clinical text classification. The regular expression-based classifier can be combined with other classifiers, like SVM, to improve classification performance.
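The abstract above combines learned regular expressions with an SVM. The sketch below illustrates only the downstream combination (regex match indicators appended to bag-of-words features for an SVM); the regexes are hand-written placeholders, since the paper's RED algorithm for discovering them automatically is not reproduced here.

```python
# Sketch of the RED+SVM combination only: regular-expression match indicators
# are appended to bag-of-words features and fed to an SVM. The regexes here are
# hand-written placeholders; the paper learns them automatically (RED).
import re
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

REGEXES = [r"\bnon-?smoker\b", r"\bquit\s+smoking\b", r"\bsmokes?\b",
           r"\bpack[- ]?years?\b"]

snippets = ["Patient is a nonsmoker.", "She smokes 1 pack per day.",
            "Quit smoking 10 years ago.", "20 pack-year history, still smokes."]
labels = ["non-smoker", "smoker", "former", "smoker"]

def regex_features(texts):
    return np.array([[1 if re.search(p, t, re.IGNORECASE) else 0
                      for p in REGEXES] for t in texts])

vec = CountVectorizer()
X = np.hstack([vec.fit_transform(snippets).toarray(), regex_features(snippets)])
clf = LinearSVC().fit(X, labels)

test = ["He smokes occasionally."]
x_test = np.hstack([vec.transform(test).toarray(), regex_features(test)])
print(clf.predict(x_test)[0])
```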
Affiliation(s)
- Duy Duc An Bui
- Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, USA; VA Salt Lake City Health Care System, Salt Lake City, Utah, USA
- Qing Zeng-Treitler
- Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, USA; VA Salt Lake City Health Care System, Salt Lake City, Utah, USA

38
Pathak J, Kho AN, Denny JC. Electronic health records-driven phenotyping: challenges, recent advances, and perspectives. J Am Med Inform Assoc 2014; 20:e206-11. [PMID: 24302669 DOI: 10.1136/amiajnl-2013-002428] [Citation(s) in RCA: 177] [Impact Index Per Article: 16.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022] Open
Affiliation(s)
- Jyotishman Pathak
- Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA

39
Liu J, Page D, Nassif H, Shavlik J, Peissig P, McCarty C, Onitilo AA, Burnside E. Genetic variants improve breast cancer risk prediction on mammograms. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2013; 2013:876-885. [PMID: 24551380 PMCID: PMC3900221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
Several recent genome-wide association studies have identified genetic variants associated with breast cancer. However, how much these genetic variants may help advance breast cancer risk prediction based on other clinical features, like mammographic findings, is unknown. We conducted a retrospective case-control study, collecting mammographic findings and high-frequency/low-penetrance genetic variants from an existing personalized medicine data repository. A Bayesian network was developed using Tree Augmented Naive Bayes (TAN) by training on the mammographic findings, with and without the 22 genetic variants collected. We analyzed the predictive performance using the area under the ROC curve, and found that the genetic variants significantly improved breast cancer risk prediction on mammograms. We also identified the interaction effect between the genetic variants and collected mammographic findings in an attempt to link genotype to mammographic phenotype to better understand disease patterns, mechanisms, and/or natural history.
Affiliation(s)
- Jie Liu
- University of Wisconsin, Madison, WI, USA
- David Page
- University of Wisconsin, Madison, WI, USA
- Peggy Peissig
- Marshfield Clinic Research Foundation, Marshfield, WI, USA
- Adedayo A Onitilo
- Department of Hematology/Oncology, Marshfield Clinic Weston Center, Weston, WI, USA; Marshfield Clinic Research Foundation, Marshfield, WI, USA; School of Population Health, University of Queensland, Brisbane, Australia

40
Shivade C, Raghavan P, Fosler-Lussier E, Embi PJ, Elhadad N, Johnson SB, Lai AM. A review of approaches to identifying patient phenotype cohorts using electronic health records. J Am Med Inform Assoc 2013; 21:221-30. [PMID: 24201027 PMCID: PMC3932460 DOI: 10.1136/amiajnl-2013-001935] [Citation(s) in RCA: 301] [Impact Index Per Article: 25.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023] Open
Abstract
Objective To summarize literature describing approaches aimed at automatically identifying patients with a common phenotype. Materials and methods We performed a review of studies describing systems or reporting techniques developed for identifying cohorts of patients with specific phenotypes. Every full text article published in (1) Journal of American Medical Informatics Association, (2) Journal of Biomedical Informatics, (3) Proceedings of the Annual American Medical Informatics Association Symposium, and (4) Proceedings of Clinical Research Informatics Conference within the past 3 years was assessed for inclusion in the review. Only articles using automated techniques were included. Results Ninety-seven articles met our inclusion criteria. Forty-six used natural language processing (NLP)-based techniques, 24 described rule-based systems, 41 used statistical analyses, data mining, or machine learning techniques, while 22 described hybrid systems. Nine articles described the architecture of large-scale systems developed for determining cohort eligibility of patients. Discussion We observe that there is a rise in the number of studies associated with cohort identification using electronic medical records. Statistical analyses or machine learning, followed by NLP techniques, are gaining popularity over the years in comparison with rule-based systems. Conclusions There are a variety of approaches for classifying patients into a particular phenotype. Different techniques and data sources are used, and good performance is reported on datasets at respective institutions. However, no system makes comprehensive use of electronic medical records addressing all of their known weaknesses.
Affiliation(s)
- Chaitanya Shivade
- Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio, USA

41
Lin KW, Tharp M, Conway M, Hsieh A, Ross M, Kim J, Kim HE. Feasibility of using Clinical Element Models (CEM) to standardize phenotype variables in the database of genotypes and phenotypes (dbGaP). PLoS One 2013; 8:e76384. [PMID: 24058713 PMCID: PMC3776754 DOI: 10.1371/journal.pone.0076384] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/04/2013] [Accepted: 08/29/2013] [Indexed: 11/18/2022] Open
Abstract
The database of Genotypes and Phenotypes (dbGaP) contains various types of data generated from genome-wide association studies (GWAS). These data can be used to facilitate novel scientific discoveries and to reduce cost and time for exploratory research. However, idiosyncrasies and inconsistencies in phenotype variable names are a major barrier to reusing these data. We addressed these challenges in standardizing phenotype variables by formalizing their descriptions using Clinical Element Models (CEM). Designed to represent clinical data, CEMs were highly expressive and thus were able to represent a majority (77.5%) of the 215 phenotype variable descriptions. However, their high expressivity also made it difficult to directly apply them to research data such as phenotype variables in dbGaP. Our study suggested that simplification of the template models makes it more straightforward to formally represent the key semantics of phenotype variables.
Affiliation(s)
- Ko-Wei Lin
- Division of Biomedical Informatics, Department of Medicine, School of Medicine, University of California San Diego, La Jolla, California, United States of America
- Melissa Tharp
- Division of Biomedical Informatics, Department of Medicine, School of Medicine, University of California San Diego, La Jolla, California, United States of America
- Mike Conway
- Division of Biomedical Informatics, Department of Medicine, School of Medicine, University of California San Diego, La Jolla, California, United States of America
- Alexander Hsieh
- Division of Biomedical Informatics, Department of Medicine, School of Medicine, University of California San Diego, La Jolla, California, United States of America
- Mindy Ross
- Division of Biomedical Informatics, Department of Medicine, School of Medicine, University of California San Diego, La Jolla, California, United States of America
- Jihoon Kim
- Division of Biomedical Informatics, Department of Medicine, School of Medicine, University of California San Diego, La Jolla, California, United States of America
- Hyeon-Eui Kim
- Division of Biomedical Informatics, Department of Medicine, School of Medicine, University of California San Diego, La Jolla, California, United States of America

42
Chai KEK, Anthony S, Coiera E, Magrabi F. Using statistical text classification to identify health information technology incidents. J Am Med Inform Assoc 2013; 20:980-5. [PMID: 23666777 PMCID: PMC3756261 DOI: 10.1136/amiajnl-2012-001409] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2012] [Revised: 04/04/2013] [Accepted: 04/14/2013] [Indexed: 11/04/2022] Open
Abstract
OBJECTIVE To examine the feasibility of using statistical text classification to automatically identify health information technology (HIT) incidents in the USA Food and Drug Administration (FDA) Manufacturer and User Facility Device Experience (MAUDE) database. DESIGN We used a subset of 570 272 incidents including 1534 HIT incidents reported to MAUDE between 1 January 2008 and 1 July 2010. Text classifiers using regularized logistic regression were evaluated with both 'balanced' (50% HIT) and 'stratified' (0.297% HIT) datasets for training, validation, and testing. Dataset preparation, feature extraction, feature selection, cross-validation, classification, performance evaluation, and error analysis were performed iteratively to further improve the classifiers. Feature-selection techniques such as removing short words and stop words, stemming, lemmatization, and principal component analysis were examined. MEASUREMENTS κ statistic, F1 score, precision and recall. RESULTS Classification performance was similar on both the stratified (0.954 F1 score) and balanced (0.995 F1 score) datasets. Stemming was the most effective technique, reducing the feature set size to 79% while maintaining comparable performance. Training with balanced datasets improved recall (0.989) but reduced precision (0.165). CONCLUSIONS Statistical text classification appears to be a feasible method for identifying HIT reports within large databases of incidents. Automated identification should enable more HIT problems to be detected, analyzed, and addressed in a timely manner. Semi-supervised learning may be necessary when applying machine learning to big data analysis of patient safety incidents and requires further investigation.
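A minimal sketch of the classification set-up described above (bag-of-words features, regularized logistic regression, class re-balancing for the rare HIT class, kappa and F1 for evaluation) follows. The incident snippets are invented examples, and the stemming and feature-selection steps from the paper are omitted.

```python
# Sketch of the text-classification set-up: bag-of-words features, regularized
# logistic regression, class re-balancing for the rare HIT class, and kappa/F1
# for evaluation. The incident snippets below are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.metrics import f1_score, cohen_kappa_score

reports = ["monitor screen froze during infusion",
           "software upgrade corrupted records",
           "catheter fractured during insertion",
           "battery depleted in pump",
           "interface sent order to wrong patient record",
           "tubing leaked at connector"]
labels = [1, 1, 0, 0, 1, 0]     # 1 = health IT incident, 0 = other device incident

clf = make_pipeline(TfidfVectorizer(stop_words="english"),
                    LogisticRegression(C=1.0, class_weight="balanced"))
clf.fit(reports, labels)

pred = clf.predict(reports)      # in practice: a held-out stratified test set
print("F1:", f1_score(labels, pred), "kappa:", cohen_kappa_score(labels, pred))
```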
Affiliation(s)
- Kevin E K Chai
- Centre for Health Informatics, Australian Institute for Health Innovation, The University of New South Wales, Sydney, Australia
- Stephen Anthony
- The Kirby Institute for Infection and Immunity in Society, The University of New South Wales, Sydney, Australia
- Enrico Coiera
- Centre for Health Informatics, Australian Institute for Health Innovation, The University of New South Wales, Sydney, Australia
- Farah Magrabi
- Centre for Health Informatics, Australian Institute for Health Innovation, The University of New South Wales, Sydney, Australia

43
Yadav K, Sarioglu E, Smith M, Choi HA. Automated outcome classification of emergency department computed tomography imaging reports. Acad Emerg Med 2013; 20:848-54. [PMID: 24033628 DOI: 10.1111/acem.12174] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2012] [Revised: 02/26/2013] [Accepted: 03/20/2013] [Indexed: 11/29/2022]
Abstract
BACKGROUND Reliably abstracting outcomes from free-text electronic health records remains a challenge. While automated classification of free text has been a popular medical informatics topic, performance validation using real-world clinical data has been limited. The two main approaches are linguistic (natural language processing [NLP]) and statistical (machine learning). The authors have developed a hybrid system for abstracting computed tomography (CT) reports for specified outcomes. OBJECTIVES The objective was to measure performance of a hybrid NLP and machine learning system for automated outcome classification of emergency department (ED) CT imaging reports. The hypothesis was that such a system is comparable to medical personnel doing the data abstraction. METHODS A secondary analysis was performed on a prior diagnostic imaging study on 3,710 blunt facial trauma victims. Staff radiologists dictated CT reports as free text, which were then deidentified. A trained data abstractor manually coded the reference standard outcome of acute orbital fracture, with a random subset double-coded for reliability. The data set was randomly split evenly into training and testing sets. Training patient reports were used as input to the Medical Language Extraction and Encoding (MedLEE) NLP tool to create structured output containing standardized medical terms and modifiers for certainty and temporal status. Findings were filtered for low certainty and past/future modifiers and then combined with the manual reference standard to generate decision tree classifiers using data mining tools Waikato Environment for Knowledge Analysis (WEKA) 3.7.5 and Salford Predictive Miner 6.6. Performance of decision tree classifiers was evaluated on the testing set with or without NLP processing. RESULTS The performance of machine learning alone was comparable to prior NLP studies (sensitivity = 0.92, specificity = 0.93, precision = 0.95, recall = 0.93, f-score = 0.94), and the combined use of NLP and machine learning showed further improvement (sensitivity = 0.93, specificity = 0.97, precision = 0.97, recall = 0.96, f-score = 0.97). This performance is similar to, or better than, that of medical personnel in previous studies. CONCLUSIONS A hybrid NLP and machine learning automated classification system shows promise in coding free-text electronic clinical data.
Affiliation(s)
- Kabir Yadav
- Department of Emergency Medicine, The George Washington University, Washington, DC
- Efsun Sarioglu
- Computer Science Department, The George Washington University, Washington, DC
- Meaghan Smith
- Department of Emergency Medicine, The George Washington University, Washington, DC
- Hyeong-Ah Choi
- Computer Science Department, The George Washington University, Washington, DC

44
Nassif H, Cunha F, Moreira IC, Cruz-Correia R, Sousa E, Page D, Burnside E, Dutra I. Extracting BI-RADS Features from Portuguese Clinical Texts. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2012:1-4. [PMID: 23797461 PMCID: PMC3688645 DOI: 10.1109/bibm.2012.6392613] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
In this work we build the first BI-RADS parser for Portuguese free texts, modeled after existing approaches to extract BI-RADS features from English medical records. Our concept finder uses a semantic grammar based on the BIRADS lexicon and on iterative transferred expert knowledge. We compare the performance of our algorithm to manual annotation by a specialist in mammography. Our results show that our parser's performance is comparable to the manual method.
Affiliation(s)
- Inês C. Moreira
- Centro Hospitalar S. João and Faculty of Medicine of the University of Porto, Porto, Portugal; Superior School of Health Technology of Porto, Vila Nova de Gaia, Portugal; INESC Porto, Faculty of Engineering of University of Porto, Porto, Portugal
- Ricardo Cruz-Correia
- CINTESIS - Center for Research in Health Technologies and Information Systems, Faculty of Medicine of University of Porto, Porto, Portugal
- Eliana Sousa
- CIDES - Health Information and Decision Sciences, Faculty of Medicine of Universidade do Porto, Porto, Portugal
- David Page
- Dept. of Biostatistics and Medical Informatics, UW-Madison, USA
- Elizabeth Burnside
- Department of Radiology, University of Wisconsin, School of Medicine and Public Health, Madison, WI, USA
- Inês Dutra
- CRACS & INESC-TEC, Department of Computer Science, Faculty of Sciences, University of Porto, Porto, Portugal