Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zeng J, Banerjee I, Henry AS, Wood DJ, Shachter RD, Gensheimer MF, Rubin DL. Natural Language Processing to Identify Cancer Treatments With Electronic Medical Records. JCO Clin Cancer Inform 2021;5:379-393. [PMID: 33822653 DOI: 10.1200/cci.20.00173] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/20/2022] Open

Cited by Other Article(s)

Huang YZ, Chen YM, Lin CC, Chiu HY, Chang YC. A nursing note-aware deep neural network for predicting mortality risk after hospital discharge. Int J Nurs Stud 2024;156:104797. [PMID: 38788263 DOI: 10.1016/j.ijnurstu.2024.104797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 04/08/2024] [Accepted: 05/03/2024] [Indexed: 05/26/2024]

Abstract

BACKGROUND

ICU readmissions and post-discharge mortality pose significant challenges. Previous studies used EHRs and machine learning models, but mostly focused on structured data. Nursing records contain crucial unstructured information, but their utilization is challenging. Natural language processing (NLP) can extract structured features from clinical text. This study proposes the Crucial Nursing Description Extractor (CNDE) to predict post-ICU discharge mortality rates and identify high-risk patients for unplanned readmission by analyzing electronic nursing records.

OBJECTIVE

To develop a deep neural network (NurnaNet) that is aware of nursing records and is combined with a pre-trained biomedical and clinical language model (BioClinicalBERT) to analyze electronic health records (EHRs) in the MIMIC-III dataset and predict patients' risk of death within six months and two years.

DESIGN

A cohort and system development design was used.

SETTING(S)

Data were extracted from MIMIC-III, a database of critically ill patients in the US between 2001 and 2012, and then analyzed.

PARTICIPANTS

We calculated patients' age using admission time and date of birth information from the MIMIC dataset. Patients under 18 or over 89 years old, or who died in the hospital, were excluded. We analyzed 16,973 nursing records from patients' ICU stays.

METHODS

We developed the Crucial Nursing Description Extractor (CNDE), which extracts key content from text. We use the log-likelihood ratio to extract keywords and combine them with BioClinicalBERT. We predict the survival of discharged patients at six months and two years and evaluate the model using precision, recall, the F1-score, the receiver operating characteristic (ROC) curve, the area under the curve (AUC), and the precision-recall (PR) curve.
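
The abstract does not say which form of the log-likelihood ratio CNDE uses. As a minimal sketch, assuming the common corpus-comparison G2 statistic (a word's frequency in a target set of notes versus a reference set), keyword scoring might look like the following. The function names, the whitespace tokenizer, and the toy notes are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """G2 for a word seen a times in c target tokens and b times in d reference tokens."""
    # Expected counts if the word were equally frequent in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:  # 0 * log(0) is treated as 0
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2.0 * g2


def top_keywords(target_docs, reference_docs, k=5):
    """Rank words that are over-represented in target_docs by their G2 score."""
    target = Counter(w for doc in target_docs for w in doc.lower().split())
    reference = Counter(w for doc in reference_docs for w in doc.lower().split())
    c, d = sum(target.values()), sum(reference.values())
    scores = {}
    for word, a in target.items():
        b = reference.get(word, 0)
        if a / c > b / d:  # keep only words more frequent in the target corpus
            scores[word] = log_likelihood(a, b, c, d)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy notes, for illustration only.
deceased_notes = ["patient hypotensive on pressors", "pressors weaned patient stable"]
survivor_notes = ["patient stable ambulating", "patient discharged home stable"]
print(top_keywords(deceased_notes, survivor_notes, k=3))
```

Words that occur at the same rate in both sets score zero, so common terms such as "patient" drop out without a stop-word list.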

RESULTS

The findings indicate that NurnaNet achieved F1-scores of 0.67030 and 0.70874 for the six-month and two-year predictions, respectively. Compared with using BioClinicalBERT alone, performance improved by 2.05% and 1.08% for the six-month and two-year predictions, respectively.

CONCLUSIONS

CNDE can effectively condense long-form records and extract key content. NurnaNet achieves a good F1-score on nursing records, which helps identify patients at risk of death after hospital discharge so that follow-up and treatment plans can be adjusted as early as possible.

Gliwska E, Barańska K, Maćkowska S, Różańska A, Sobol A, Spinczyk D. The Use of Natural Language Processing for Computer-Aided Diagnostics and Monitoring of Body Image Perception in Patients with Cancers. Cancers (Basel) 2023;15:5437. [PMID: 38001696 PMCID: PMC10670138 DOI: 10.3390/cancers15225437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 11/08/2023] [Accepted: 11/13/2023] [Indexed: 11/26/2023] Open

Abstract

BACKGROUND

Head and neck cancers (H&NCs) constitute a significant part of all cancer cases. H&NC patients experience unintentional weight loss, poor nutritional status, or speech disorders. Medical interventions affect appearance and interfere with patients' self-perception of their bodies. Psychological consultations are often not feasible because of limited time.

METHODS

We used NLP to analyze the basic emotion intensity, sentiment about one's body, characteristic vocabulary, and potential areas of difficulty in free notes. The emotion intensity research uses the extended NAWL dictionary developed using word embedding. The sentiment analysis used a hybrid approach: a sentiment dictionary and a deep recursive network. The part-of-speech tagging and domain rules defined by a psycho-oncologist determine the distinct language traits. Potential areas of difficulty were analyzed using the dictionaries method with word polarity to define a given area and the presentation of a note using bag-of-words. Here, we applied the LSA method using SVD to reduce dimensionality. A total of 50 cancer patients requiring enteral nutrition participated in the study.

RESULTS

The results confirmed the complexity of emotions in patients with H&NC in relation to their body image. A negative attitude towards body image was detected in most of the patients. The method presented in the study appeared to be effective in assessing body image perception disturbances, but it cannot be used as the sole indicator of body image perception issues.

LIMITATIONS

The main problem in the research was the fairly wide age range of participants, which explains the potential diversity of vocabulary.

CONCLUSIONS

The combination of attributes of a patient's condition, which the method can determine for a specific patient, can indicate the direction of support for the patient, relatives, direct medical personnel, and psycho-oncologists.

Kotevski DP, Vajdic CM, Field M, Smee RI. Inter-hospital variation in data collection, radiotherapy treatment, and survival in patients with head and neck cancer: A multisite study. Radiother Oncol 2023;188:109843. [PMID: 37543056 DOI: 10.1016/j.radonc.2023.109843] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Revised: 06/14/2023] [Accepted: 07/27/2023] [Indexed: 08/07/2023]

Abstract

BACKGROUND AND PURPOSE

Inter-hospital inequalities in head and neck cancer (HNC) survival may exist due to variation in radiotherapy treatment-related factors. This study investigated inter-hospital variation in data collection, primary radiotherapy treatment, and survival in HNC patients from an Australian setting.

MATERIALS AND METHODS

Data collected in oncology information systems (OIS) from seven Australian hospitals was extracted for 3,182 adults treated with curative radiotherapy, with or without surgery or chemotherapy, for primary, non-metastatic squamous cell carcinoma of the head and neck (2000-2017). Death data was sourced from the National Death Index using record linkage. Multivariable Cox regression was used to assess the association between survival and hospital.

RESULTS

Inter-hospital variation in data collection, primary radiotherapy dose, and five-year HNC-related death was detected. Completion of eleven fields ranged from 66% to 98%. Patients treated with primary radiotherapy for Tis-T1N0 glottic cancers and for oral cavity and oropharynx cancers of any stage received significantly different time-corrected biologically equivalent doses in two gray fractions (EQD2T) by hospital, with observed deviation from Australian radiotherapy guidelines. Increased EQD2T dose was associated with a reduced risk of five-year HNC-related death in all patients and those treated with primary radiotherapy. Hospital, tumour site, and T and N classification were also identified as independent prognostic factors for five-year HNC-related death in all patients treated with radiotherapy.
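
For readers unfamiliar with the dose metric, the standard equivalent dose in 2 Gy fractions follows from the linear-quadratic model as EQD2 = D × (d + α/β) / (2 + α/β), where D is the total dose and d is the dose per fraction. The abstract does not state the time-correction term or the α/β value the authors used, so the sketch below omits the time correction entirely and treats α/β = 10 Gy as an assumed illustrative default. The function name is hypothetical.

```python
def eqd2(total_dose_gy, dose_per_fraction_gy, alpha_beta_gy=10.0):
    """Equivalent dose in 2 Gy fractions under the linear-quadratic model.

    No correction for overall treatment time is applied here; the
    alpha/beta default of 10 Gy is an assumption for illustration.
    """
    if total_dose_gy < 0 or dose_per_fraction_gy <= 0 or alpha_beta_gy <= 0:
        raise ValueError("doses and alpha/beta must be positive")
    return total_dose_gy * (dose_per_fraction_gy + alpha_beta_gy) / (2.0 + alpha_beta_gy)


# A schedule already delivered in 2 Gy fractions is unchanged.
print(eqd2(70.0, 2.0))
# A hypothetical hypofractionated schedule: 55 Gy in 20 fractions of 2.75 Gy.
print(eqd2(55.0, 2.75))
```

Expressing every schedule on the same 2 Gy scale is what makes doses comparable across hospitals that use different fractionation.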

CONCLUSION

Unexplained variation exists in HNC-related death in patients treated at Australian hospitals. Available routinely collected data in OIS are insufficient to explain variation in survival. Innovative data collection, extraction, and classification practices are needed to inform clinical practice.

Bitterman DS, Goldner E, Finan S, Harris D, Durbin EB, Hochheiser H, Warner JL, Mak RH, Miller T, Savova GK. An End-to-End Natural Language Processing System for Automatically Extracting Radiation Therapy Events From Clinical Texts. Int J Radiat Oncol Biol Phys 2023;117:262-273. [PMID: 36990288 PMCID: PMC10522797 DOI: 10.1016/j.ijrobp.2023.03.055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Revised: 02/15/2023] [Accepted: 03/17/2023] [Indexed: 03/29/2023]

Abstract

PURPOSE

Real-world evidence for radiation therapy (RT) is limited because it is often documented only in the clinical narrative. We developed a natural language processing system for automated extraction of detailed RT events from text to support clinical phenotyping.

METHODS AND MATERIALS

A multi-institutional data set of 96 clinician notes, 129 North American Association of Central Cancer Registries cancer abstracts, and 270 RT prescriptions from HemOnc.org was used and divided into train, development, and test sets. Documents were annotated for RT events and associated properties: dose, fraction frequency, fraction number, date, treatment site, and boost. Named entity recognition models for properties were developed by fine-tuning BioClinicalBERT and RoBERTa transformer models. A multiclass RoBERTa-based relation extraction model was developed to link each dose mention with each property in the same event. Models were combined with symbolic rules to create a hybrid end-to-end pipeline for comprehensive RT event extraction.

RESULTS

Named entity recognition models were evaluated on the held-out test set with F1 results of 0.96, 0.88, 0.94, 0.88, 0.67, and 0.94 for dose, fraction frequency, fraction number, date, treatment site, and boost, respectively. The relation model achieved an average F1 of 0.86 when the input was gold-labeled entities. The end-to-end system F1 result was 0.81. The end-to-end system performed best on North American Association of Central Cancer Registries abstracts (average F1 0.90), which are mostly copy-paste content from clinician notes.
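
The abstract reports F1 per property but does not say whether a predicted entity had to match the gold span exactly or only overlap it. As a rough sketch, assuming exact span-and-label matching, entity-level precision, recall, and F1 can be computed as below. The tuple layout (start, end, label) and the example spans are hypothetical.

```python
def span_prf(gold, predicted):
    """Precision, recall, and F1 over exactly matching (start, end, label) spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical annotations for one note: character offsets plus property label.
gold_spans = [(0, 5, "dose"), (10, 15, "fraction_number"), (20, 30, "treatment_site")]
pred_spans = [(0, 5, "dose"), (10, 15, "fraction_number"),
              (21, 30, "treatment_site"),  # boundary off by one: counts as an error
              (40, 44, "date")]            # spurious entity

precision, recall, f1 = span_prf(gold_spans, pred_spans)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Under exact matching, a boundary error costs both a false positive and a false negative, which is one plausible reason free-text properties such as treatment site score lower than numeric ones.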

CONCLUSIONS

We developed methods and a hybrid end-to-end system for RT event extraction, which is the first natural language processing system for this task. This system provides proof-of-concept for real-world RT data collection for research and is promising for the potential of natural language processing methods to support clinical care.

Mottin L, Goldman JP, Jäggli C, Achermann R, Gobeill J, Knafou J, Ehrsam J, Wicky A, Gérard CL, Schwenk T, Charrier M, Tsantoulis P, Lovis C, Leichtle A, Kiessling MK, Michielin O, Pradervand S, Foufi V, Ruch P. Multilingual RECIST classification of radiology reports using supervised learning. Front Digit Health 2023;5:1195017. [PMID: 37388252 PMCID: PMC10303934 DOI: 10.3389/fdgth.2023.1195017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 06/05/2023] [Indexed: 07/01/2023] Open

Affiliation(s)

Luc Mottin: HES-SO\HEG Genève, Information Sciences, Geneva, Switzerland; SIB Text Mining Group, Swiss Institute of Bioinformatics, Geneva, Switzerland
Jean-Philippe Goldman: Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
Christoph Jäggli: Inselspital – Bern University Hospital and University of Bern, Bern, Switzerland
Rita Achermann: Department of Radiology, Clinic of Radiology & Nuclear Medicine, University Hospital Basel, University of Basel, Basel, Switzerland
Julien Gobeill: HES-SO\HEG Genève, Information Sciences, Geneva, Switzerland; SIB Text Mining Group, Swiss Institute of Bioinformatics, Geneva, Switzerland
Julien Knafou: HES-SO\HEG Genève, Information Sciences, Geneva, Switzerland; SIB Text Mining Group, Swiss Institute of Bioinformatics, Geneva, Switzerland
Julien Ehrsam: Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
Alexandre Wicky: Precision Oncology Center, Oncology Department, Centre Hospitalier Universitaire Vaudois – CHUV, Lausanne, Switzerland
Camille L. Gérard: Precision Oncology Center, Oncology Department, Centre Hospitalier Universitaire Vaudois – CHUV, Lausanne, Switzerland
Tanja Schwenk: Department of Oncology, Kantonsspital Aarau, Aarau, Switzerland
Mélinda Charrier: Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
Petros Tsantoulis: Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
Christian Lovis: Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
Alexander Leichtle: Inselspital – Bern University Hospital and University of Bern, Bern, Switzerland
Michael K. Kiessling: Department of Medical Oncology and Hematology, University Hospital Zurich, Zurich, Switzerland
Olivier Michielin: Precision Oncology Center, Oncology Department, Centre Hospitalier Universitaire Vaudois – CHUV, Lausanne, Switzerland
Sylvain Pradervand: Precision Oncology Center, Oncology Department, Centre Hospitalier Universitaire Vaudois – CHUV, Lausanne, Switzerland
Vasiliki Foufi: Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
Patrick Ruch: HES-SO\HEG Genève, Information Sciences, Geneva, Switzerland; SIB Text Mining Group, Swiss Institute of Bioinformatics, Geneva, Switzerland

Gao J, He S, Hu J, Chen G. A hybrid system to understand the relations between assessments and plans in progress notes. J Biomed Inform 2023;141:104363. [PMID: 37054961 DOI: 10.1016/j.jbi.2023.104363] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/06/2023] [Revised: 04/05/2023] [Accepted: 04/07/2023] [Indexed: 04/15/2023]

Tyagi N, Bhushan B. Demystifying the Role of Natural Language Processing (NLP) in Smart City Applications: Background, Motivation, Recent Advances, and Future Research Directions. Wirel Pers Commun 2023;130:857-908. [PMID: 37168438 PMCID: PMC10019426 DOI: 10.1007/s11277-023-10312-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/25/2023] [Indexed: 05/13/2023]

Kotevski DP, Smee RI, Vajdic CM, Field M. Empirical comparison of routinely collected electronic health record data for head and neck cancer-specific survival in machine-learnt prognostic models. Head Neck 2023;45:365-379. [PMID: 36369773 PMCID: PMC10100433 DOI: 10.1002/hed.27241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Revised: 09/21/2022] [Accepted: 11/02/2022] [Indexed: 11/13/2022] Open

Kotevski DP, Smee RI, Field M, Broadley K, Vajdic CM. The Utility of Oncology Information Systems for Prognostic Modelling in Head and Neck Cancer. J Med Syst 2023;47:9. [PMID: 36640212 PMCID: PMC9840592 DOI: 10.1007/s10916-023-01907-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 01/03/2023] [Indexed: 01/15/2023]

Abstract

Cancer centres rely on electronic information in oncology information systems (OIS) to guide patient care. We investigated the completeness and accuracy of routinely collected head and neck cancer (HNC) data sourced from an OIS for suitability in prognostic modelling and other research. Three hundred and fifty-three adults diagnosed from 2000 to 2017 with head and neck squamous cell carcinoma, treated with radiotherapy, were eligible. Thirteen clinically relevant variables in HNC prognosis were extracted from a single-centre OIS and compared to that compiled separately in a research dataset. These two datasets were compared for agreement using Cohen's kappa coefficient for categorical variables, and intraclass correlation coefficients for continuous variables. Research data was 96% complete compared to 84% for OIS data. Agreement was perfect for gender (κ = 1.000), high for age (κ = 0.993), site (κ = 0.992), T (κ = 0.851) and N (κ = 0.812) stage, radiotherapy dose (κ = 0.889), fractions (κ = 0.856), and duration (κ = 0.818), and chemotherapy treatment (κ = 0.871), substantial for overall stage (κ = 0.791) and vital status (κ = 0.689), moderate for grade (κ = 0.547), and poor for performance status (κ = 0.110). Thirty-one other variables were poorly captured and could not be statistically compared. Documentation of clinical information within the OIS for HNC patients is routine practice; however, OIS data was less correct and complete than data collected for research purposes. Substandard collection of routine data may hinder advancements in patient care. Improved data entry, integration with clinical activities and workflows, system usability, data dictionaries, and training are necessary for OIS data to generate robust research. Data mining from clinical documents may supplement structured data collection.
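
To make the agreement statistic concrete, here is a small sketch of Cohen's kappa for one categorical variable recorded in two datasets. Kappa compares observed agreement with the agreement expected by chance from each dataset's marginal category frequencies. The function name and the T-stage values are hypothetical illustrations, not data from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(ratings_a)
    # Proportion of records on which the two sources agree.
    observed = sum(x == y for x, y in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the marginal frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:  # both sources use a single identical category
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical T stage as recorded in the OIS and in the research dataset.
ois_t_stage = ["T1", "T2", "T2", "T3", "T1", "T4", "T2", "T3"]
research_t_stage = ["T1", "T2", "T3", "T3", "T1", "T4", "T2", "T2"]
print(round(cohens_kappa(ois_t_stage, research_t_stage), 3))
```

Here the two sources agree on 6 of 8 records (75%), but kappa is lower than 0.75 because some of that agreement would occur by chance.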

Kotevski DP, Smee RI, Vajdic CM, Field M. Machine Learning and Nomogram Prognostic Modeling for 2-Year Head and Neck Cancer-Specific Survival Using Electronic Health Record Data: A Multisite Study. JCO Clin Cancer Inform 2023;7:e2200128. [PMID: 36596211 DOI: 10.1200/cci.22.00128] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/04/2023] Open

Abstract

PURPOSE

There is limited knowledge of the prediction of 2-year cancer-specific survival (CSS) in the head and neck cancer (HNC) population. The aim of this study is to develop and validate machine learning models and a nomogram for the prediction of 2-year CSS in patients with HNC using real-world data collected by major teaching and tertiary referral hospitals in New South Wales (NSW), Australia.

MATERIALS AND METHODS

Data collected in oncology information systems at multiple NSW Cancer Centres were extracted for 2,953 eligible adults diagnosed between 2000 and 2017 with squamous cell carcinoma of the head and neck. Death data were sourced from the National Death Index using record linkage. Machine learning and Cox regression/nomogram models were developed and internally validated in Python and R, respectively.

RESULTS

Machine learning models demonstrated highest performance (C-index) in the larynx and nasopharynx cohorts (0.82), followed by the oropharynx (0.79) and the hypopharynx and oral cavity cohorts (0.73). In the whole HNC population, C-indexes of 0.79 and 0.70 and Brier scores of 0.10 and 0.27 were reported for the machine learning and nomogram model, respectively. Cox regression analysis identified age, T and N classification, and time-corrected biologic equivalent dose in two gray fractions as independent prognostic factors for 2-year CSS. N classification was the most important feature used for prediction in the machine learning model followed by age.
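
The abstract does not say which C-index estimator was used. As an illustrative sketch, assuming Harrell's concordance index, the statistic is the fraction of comparable patient pairs in which the patient who died earlier was assigned the higher predicted risk. A pair is comparable only when the shorter follow-up time ends in an observed death rather than censoring. The function name and the toy follow-up data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by predicted risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:  # patient i died before j's last follow-up
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up in months, event flag (1 = cancer death, 0 = censored),
# and model-predicted risk.
months = [5, 8, 12, 20, 24]
died = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.5, 0.3, 0.1]
print(concordance_index(months, died, risk))  # 7 of 8 comparable pairs: 0.875
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.79 and 0.70 should be read.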

CONCLUSION

Machine learning and nomogram analysis predicted 2-year CSS with high performance using routinely collected and complete clinical information extracted from oncology information systems. These models function as visual decision-making tools to guide radiotherapy treatment decisions and provide insight into the prediction of survival outcomes in patients with HNC.

Khosravi B, Rouzrokh P, Erickson BJ. Getting More Out of Large Databases and EHRs with Natural Language Processing and Artificial Intelligence: The Future Is Here. J Bone Joint Surg Am 2022;104:51-55. [PMID: 36260045 DOI: 10.2106/jbjs.22.00567] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/01/2023]

Zeng J, Gensheimer MF, Rubin DL, Athey S, Shachter RD. Uncovering interpretable potential confounders in electronic medical records. Nat Commun 2022;13:1014. [PMID: 35197467 PMCID: PMC8866497 DOI: 10.1038/s41467-022-28546-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/25/2021] [Accepted: 01/28/2022] [Indexed: 12/25/2022] Open

Su D, Li Q, Zhang T, Veliz P, Chen Y, He K, Mahajan P, Zhang X. Prediction of acute appendicitis among patients with undifferentiated abdominal pain at emergency department. BMC Med Res Methodol 2022;22:18. [PMID: 35026994 PMCID: PMC8759254 DOI: 10.1186/s12874-021-01490-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/30/2020] [Accepted: 12/08/2021] [Indexed: 11/12/2022] Open

Abstract

Background

Early screening for and accurate identification of acute appendicitis (AA) among patients with undifferentiated symptoms associated with appendicitis during their emergency visit will improve patient safety and health care quality. The aim of the study was to compare models that predict AA among patients with undifferentiated symptoms at emergency visits using both structured data and free-text data from a national survey.

Methods

We performed a secondary data analysis on the 2005-2017 United States National Hospital Ambulatory Medical Care Survey (NHAMCS) data to estimate the association between a diagnosis of AA in emergency department (ED) patients and the demographic and clinical factors present during the patient's ED stay. We used binary logistic regression (LR) and random forest (RF) models incorporating natural language processing (NLP) to predict AA diagnosis among patients with undifferentiated symptoms.

Results

Among the 40,441 ED patients with assigned International Classification of Diseases (ICD) codes of AA and appendicitis-related symptoms between 2005 and 2017, 655 adults (2.3%) and 256 children (2.2%) had AA. For the LR model identifying AA diagnosis among adult ED patients, the c-statistic was 0.72 (95% CI: 0.69–0.75) for structured variables only, 0.72 (95% CI: 0.69–0.75) for unstructured variables only, and 0.78 (95% CI: 0.76–0.80) when including both structured and unstructured variables. For the LR model identifying AA diagnosis among pediatric ED patients, the c-statistic was 0.84 (95% CI: 0.79–0.89) for structured variables only, 0.78 (95% CI: 0.72–0.84) for unstructured variables only, and 0.87 (95% CI: 0.83–0.91) when including both structured and unstructured variables. The RF models showed c-statistics similar to those of the corresponding LR models.
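
For a binary outcome such as AA versus no AA, the c-statistic equals the area under the ROC curve: the probability that a randomly chosen patient with the diagnosis receives a higher predicted score than a randomly chosen patient without it. The sketch below computes it by direct pairwise comparison. The function name, labels, and scores are hypothetical, and the confidence intervals reported in the abstract are not reproduced here.

```python
def c_statistic(labels, scores):
    """AUC for binary labels (1 = case, 0 = non-case) via pairwise comparison."""
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # tied scores count as half
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted probabilities for six ED visits.
has_aa = [1, 0, 1, 0, 0, 1]
predicted = [0.8, 0.3, 0.6, 0.6, 0.2, 0.4]
print(round(c_statistic(has_aa, predicted), 3))
```

Because the statistic depends only on how cases rank against non-cases, it is unaffected by the low prevalence of AA in the sample.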

Conclusions

We developed models that predict AA diagnosis for adult and pediatric ED patients, and predictive accuracy improved with the inclusion of NLP elements and approaches.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12874-021-01490-9.
