1
Xie F, Fassett MJ, Im TM, Park D, Chiu VY, Getahun D. Identifying Elective Induction of Labor among a Diverse Pregnant Population from Electronic Health Records within a Large Integrated Health Care System. Am J Perinatol 2024. [PMID: 39209302 DOI: 10.1055/a-2405-3703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
OBJECTIVE Distinguishing between medically indicated induction of labor (iIOL) and elective induction of labor (eIOL) is a daunting process for researchers. We aimed to develop a Natural Language Processing (NLP) algorithm to identify eIOLs from electronic health records (EHRs) within a large integrated health care system. STUDY DESIGN We used structured and unstructured data from Kaiser Permanente Southern California's EHRs of patients who were <35 years old and had singleton deliveries between 37 and 40 gestational weeks. Induction of labor (IOL) pregnancies were identified if there was evidence of an IOL diagnosis code, procedure code, or documentation in a delivery flowsheet or progress note. A comprehensive NLP algorithm was developed and refined through an iterative process of chart reviews and adjudications, where IOL-associated reasons (medically indicated vs. elective induction) were reviewed. The final algorithm was applied to discern the indications of IOLs performed during the study period. RESULTS A total of 332,163 eligible pregnancies were identified between January 1, 2008, and December 31, 2022. Of these eligible pregnancies, 68,541 (20.6%) were IOL, of which 6,824 (10.0%) were eIOL. Validation of the NLP process against 300 randomly selected pregnancies (100 each of eIOL, iIOL, and non-IOL cases) yielded positive predictive values of 83.0% and 88.0% for eIOL and iIOL, respectively. The rates of eIOL among the maternal age groups ranged from 9.6% to 10.3%, except for the <20 years group (12.2%). Non-Hispanic White individuals had the highest rate of eIOL (13.2%), while non-Hispanic Asian/Pacific Islanders had the lowest rate of eIOL (7.8%). The rate of eIOL increased from 1.0% in the 37-week gestational age (GA) group to 20.6% in the 40-week GA group. CONCLUSION Findings suggest that the developed NLP algorithm effectively identifies eIOL. 
It can be utilized to support eIOL-related pharmacoepidemiological studies, fill in knowledge gaps, and provide content more relevant to researchers. KEY POINTS · An NLP algorithm was developed to identify indications of IOL. · The study algorithm was successfully implemented within a large integrated health care system. · The study algorithm can be utilized to support eIOL-related studies.
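The proportions reported in this abstract follow directly from the counts it gives. As a minimal sketch of that arithmetic (the helper name `pct` and the variable names are illustrative and do not come from the study):

```python
def pct(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)

# Counts as stated in the abstract.
eligible_pregnancies = 332_163
iol_pregnancies = 68_541
eiol_pregnancies = 6_824

# Share of eligible pregnancies that were induced.
iol_rate = pct(iol_pregnancies, eligible_pregnancies)
# Share of induced pregnancies that were elective.
eiol_rate = pct(eiol_pregnancies, iol_pregnancies)

print(iol_rate, eiol_rate)  # prints: 20.6 10.0
```

Note that the 10.0% figure uses induced pregnancies, not all eligible pregnancies, as its denominator.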
Affiliation(s)
- Fagen Xie
- Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California
- Michael J Fassett
- Department of Obstetrics and Gynecology, Kaiser Permanente West Los Angeles Medical Center, Los Angeles, California
- Department of Clinical Science, Kaiser Permanente Bernard J. Tyson School of Medicine, Pasadena, California
- Theresa M Im
- Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California
- Daniella Park
- Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California
- Vicki Y Chiu
- Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California
- Darios Getahun
- Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California
- Department of Health Systems Science, Kaiser Permanente Bernard J. Tyson School of Medicine, Pasadena, California
2
Hennessy A, Tran TH, Sasikumar SN, Al-Falahi Z. Machine learning, advanced data analysis, and a role in pregnancy care? How can we help improve preeclampsia outcomes? Pregnancy Hypertens 2024; 37:101137. [PMID: 38875933 DOI: 10.1016/j.preghy.2024.101137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Revised: 03/31/2024] [Accepted: 06/09/2024] [Indexed: 06/16/2024]
Abstract
The value of machine learning capacity in maternal health, and in particular in the prediction of preeclampsia, will only be realised when high-quality clinical data are provided, representative populations are included, different health systems and models of care are compared, and there is a culture of rapid use and application of real-time data and outcomes. This review has been undertaken to provide an overview of the language and early results of machine learning in a pregnancy and preeclampsia context. Clinicians of all backgrounds are encouraged to learn the language of Machine Learning (ML) and Artificial Intelligence (AI) to better understand their potential and utility to improve outcomes for women and their families. This review will outline some definitions and features of ML that will benefit clinicians' knowledge in the preeclampsia discipline, and also outline some of the future possibilities for preeclampsia-focussed clinicians via understanding AI. It will further explore the criticality of defining the risk and the outcome being determined.
Affiliation(s)
- Annemarie Hennessy
- Campbelltown Hospital, South Western Sydney Local Health District, Sydney, Australia; Western Sydney University, Sydney, Australia; University of Sydney, Sydney, Australia.
- Tu Hao Tran
- Campbelltown Hospital, South Western Sydney Local Health District, Sydney, Australia; Ingham Institute for Applied Medical Research, SWERI (South Western Emergency Research Institute), Australia.
- Suraj Narayanan Sasikumar
- Ingham Institute for Applied Medical Research, SWERI (South Western Emergency Research Institute), Australia.
- Zaidon Al-Falahi
- University of Sydney, Sydney, Australia; Ingham Institute for Applied Medical Research, SWERI (South Western Emergency Research Institute), Australia.
3
Li Y, Wang Z, Tan L, Liang L, Liu S, Huang J, Lin J, Peng K, Wang Z, Li Q, Jian W, Xie B, Gao Y, Zheng J. Hospitalization, case fatality, comorbidities, and isolated pathogens of adult inpatients with pneumonia from 2013 to 2022: a real-world study in Guangzhou, China. BMC Infect Dis 2024; 24:2. [PMID: 38166702 PMCID: PMC10759351 DOI: 10.1186/s12879-023-08929-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Accepted: 12/19/2023] [Indexed: 01/05/2024] Open
Abstract
BACKGROUND In the context of increasing population aging, ongoing drug-resistant pathogens and the COVID-19 epidemic, the changes in the epidemiological and clinical characteristics of patients with pneumonia remain unclear. This study aimed to assess the trends in hospitalization, case fatality, comorbidities, and isolated pathogens of pneumonia-related adult inpatients in Guangzhou during the last decade. METHODS We retrospectively enrolled hospitalized adults who had doctor-diagnosed pneumonia in the First Affiliated Hospital of Guangzhou Medical University from January 1, 2013 to December 31, 2022. A natural language processing system was applied to automatically extract the clinical data from electronic health records. We evaluated the proportion of pneumonia-related hospitalizations in total hospitalizations, pneumonia-related in-hospital case fatality, comorbidities, and species of isolated pathogens during the last decade. Binary logistic regression analysis was used to assess predictors for patients with prolonged length of stay (LOS). RESULTS A total of 38,870 cases were finally included in this study, with 70% males, median age of 64 (53, 73) years and median LOS of 7.9 (5.1, 12.8) days. Although the number of pneumonia-related hospitalizations showed an upward trend, the proportion of pneumonia-related hospitalizations decreased from 199.6 per 1000 inpatients in 2013 to 123.4 per 1000 in 2021, and the case fatality decreased from 50.2 per 1000 in 2013 to 23.9 per 1000 in 2022 (all P < 0.05). The most common comorbidities were chronic obstructive pulmonary disease, lung malignancy, cardiovascular diseases and diabetes. The most common pathogens were Pseudomonas aeruginosa, Candida albicans, Acinetobacter baumannii, Stenotrophomonas maltophilia, Klebsiella pneumoniae, and Staphylococcus aureus. 
Glucocorticoid use during hospitalization (odds ratio [OR] = 1.86, 95% confidence interval [CI]: 1.14-3.06), immunosuppressant use during hospitalization (OR = 1.99, 95% CI: 1.14-3.46), ICU admission (OR = 16.23, 95% CI: 11.25-23.83), receiving mechanical ventilation (OR = 3.58, 95% CI: 2.60-4.97), presence of other underlying diseases (OR = 1.54, 95% CI: 1.15-2.06), and elevated procalcitonin (OR = 1.61, 95% CI: 1.19-2.19) were identified as independent predictors for prolonged LOS. CONCLUSION The proportion of pneumonia-related hospitalizations and the in-hospital case fatality showed downward trends during the last decade. Pneumonia inpatients often had chronic underlying diseases, and gram-negative bacteria were frequently isolated from them. ICU admission was a significant predictor for prolonged LOS in pneumonia inpatients.
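An odds ratio from a logistic regression is the exponential of the fitted coefficient, so a reported OR and its interval can be mapped back to the log-odds scale. The sketch below does this for the glucocorticoid estimate, assuming the reported intervals are Wald-type intervals (coefficient ± 1.96 standard errors), which the abstract does not state; the function names are illustrative.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover the log-odds coefficient and its standard error.

    Assumes a Wald interval, i.e. one that is symmetric on the log scale.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se

def wald_ci(beta, se):
    """Rebuild the 95% interval on the odds-ratio scale."""
    return math.exp(beta - Z_95 * se), math.exp(beta + Z_95 * se)

# Glucocorticoid use: OR = 1.86, 95% CI 1.14-3.06 (values from the abstract).
beta, se = log_or_and_se(1.86, 1.14, 3.06)
low, high = wald_ci(beta, se)
print(round(beta, 3), round(se, 3), round(low, 2), round(high, 2))
```

The rebuilt interval lands close to the reported one but not exactly on it, because the published values are rounded and the interval may not have been computed this way.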
Affiliation(s)
- Yun Li
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Zhufeng Wang
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Lunfang Tan
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Lina Liang
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Shuyi Liu
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Jinhai Huang
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Junfeng Lin
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Kang Peng
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Zihui Wang
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Qiasheng Li
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Wenhua Jian
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Baosong Xie
- Department of Pulmonary and Critical Care Medicine, Fujian Provincial Hospital, Fujian Medical University, Fuzhou, China.
- Yi Gao
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.
- Jinping Zheng
- National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.
4
Lee SJ, Garcia GGP, Stanhope KK, Platner MH, Boulet SL. Interpretable machine learning to predict adverse perinatal outcomes: examining marginal predictive value of risk factors during pregnancy. Am J Obstet Gynecol MFM 2023; 5:101096. [PMID: 37454734 DOI: 10.1016/j.ajogmf.2023.101096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 06/13/2023] [Accepted: 07/13/2023] [Indexed: 07/18/2023]
Abstract
BACKGROUND The timely identification of nulliparas at high risk of adverse fetal and neonatal outcomes during pregnancy is crucial for initiating clinical interventions to prevent perinatal complications. Although machine learning methods have been applied to predict preterm birth and other pregnancy complications, many models do not provide explanations of their predictions, limiting the clinical use of the model. OBJECTIVE This study aimed to develop interpretable prediction models for a composite adverse perinatal outcome (stillbirth, neonatal death, estimated Combined Apgar score of <10, or preterm birth) at different points in time during the pregnancy and to evaluate the marginal predictive value of individual predictors in the context of a machine learning model. STUDY DESIGN This was a secondary analysis of the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-be data, a prospective cohort study in which 10,038 nulliparous pregnant individuals with singleton pregnancies were enrolled. Here, interpretable prediction models were developed using L1-regularized logistic regression for adverse perinatal outcomes using data available at 3 study visits during the pregnancy (visit 1: 6 0/7 to 13 6/7 weeks of gestation; visit 2: 16 0/7 to 21 6/7 weeks of gestation; visit 3: 22 0/7 to 29 6/7 weeks of gestation). We identified the important predictors for each model using SHapley Additive exPlanations, a model-agnostic method of computing explanations of model predictions, and evaluated the marginal predictive value of each predictor using the DeLong test. RESULTS Our interpretable machine learning models had areas under the receiver operating characteristic curve of 0.617 (95% confidence interval, 0.595-0.639; all predictor variables at visit 1), 0.652 (95% confidence interval, 0.631-0.673; all predictor variables at visit 2), and 0.673 (95% confidence interval, 0.651-0.694; all predictor variables at visit 3). 
For all visits, the placental biomarker inhibin A was a valuable predictor, as including inhibin A resulted in better performance in predicting adverse perinatal outcomes (P<.001, all visits). At visit 1, endoglin was also a valuable predictor (P<.001). At visit 2, free beta human chorionic gonadotropin (P=.001) and uterine artery pulsatility index (P=.023) were also valuable predictors. At visit 3, cervical length was also a valuable predictor (P<.001). CONCLUSION Despite various advances in predictive modeling in obstetrics, the accurate prediction of adverse perinatal outcomes remains difficult. Interpretable machine learning can help clinicians understand how predictions are made, but barriers exist to the widespread clinical adoption of machine learning models for adverse perinatal outcomes. A better understanding of the evolution of risk factors for adverse perinatal outcomes throughout pregnancy is necessary for the development of effective interventions.
Affiliation(s)
- Sun Ju Lee
- H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA (Ms Lee and Dr Garcia).
- Gian-Gabriel P Garcia
- H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA (Ms Lee and Dr Garcia)
- Kaitlyn K Stanhope
- Department of Gynecology and Obstetrics, Emory University School of Medicine, Atlanta, GA (Drs Stanhope, Platner, and Boulet)
- Marissa H Platner
- Department of Gynecology and Obstetrics, Emory University School of Medicine, Atlanta, GA (Drs Stanhope, Platner, and Boulet)
- Sheree L Boulet
- Department of Gynecology and Obstetrics, Emory University School of Medicine, Atlanta, GA (Drs Stanhope, Platner, and Boulet)
5
Borna S, Maniaci MJ, Haider CR, Maita KC, Torres-Guzman RA, Avila FR, Lunde JJ, Coffey JD, Demaerschalk BM, Forte AJ. Artificial Intelligence Models in Health Information Exchange: A Systematic Review of Clinical Implications. Healthcare (Basel) 2023; 11:2584. [PMID: 37761781 PMCID: PMC10531020 DOI: 10.3390/healthcare11182584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 09/14/2023] [Accepted: 09/16/2023] [Indexed: 09/29/2023] Open
Abstract
Electronic health record (EHR) systems collate patient data, and the integration and standardization of documents through Health Information Exchange (HIE) play a pivotal role in refining patient management. Although the clinical implications of AI in EHR systems have been extensively analyzed, its application in HIE as a crucial source of patient data is less explored. Addressing this gap, our systematic review delves into utilizing AI models in HIE, gauging their predictive prowess and potential limitations. Employing databases such as Scopus, CINAHL, Google Scholar, PubMed/Medline, and Web of Science and adhering to the PRISMA guidelines, we unearthed 1021 publications. Of these, 11 were shortlisted for the final analysis. A noticeable preference for machine learning models in prognosticating clinical results, notably in oncology and cardiac failures, was evident. The metrics displayed AUC values ranging between 61% and 99.91%. Sensitivity metrics spanned from 12% to 96.50%, specificity from 76.30% to 98.80%, positive predictive values varied from 83.70% to 94.10%, and negative predictive values between 94.10% and 99.10%. Despite variations in specific metrics, AI models drawing on HIE data unfailingly showcased commendable predictive proficiency in clinical verdicts, emphasizing the transformative potential of melding AI with HIE. However, variations in sensitivity highlight underlying challenges. As healthcare's path becomes more enmeshed with AI, a well-rounded, enlightened approach is pivotal to guarantee the delivery of trustworthy and effective AI-augmented healthcare solutions.
Affiliation(s)
- Sahar Borna
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Michael J. Maniaci
- Division of Hospital Internal Medicine, Mayo Clinic, Jacksonville, FL 32224, USA
- Clifton R. Haider
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN 55902, USA
- Karla C. Maita
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Jordan D. Coffey
- Center for Digital Health, Mayo Clinic, Rochester, MN 55902, USA
- Bart M. Demaerschalk
- Center for Digital Health, Mayo Clinic, Rochester, MN 55902, USA
- Department of Neurology, Mayo Clinic College of Medicine and Science, Phoenix, AZ 85054, USA
- Antonio J. Forte
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
6
Workman TE, Goulet JL, Brandt CA, Warren AR, Eleazer J, Skanderson M, Lindemann L, Blosnich JR, O'Leary J, Zeng‐Treitler Q. Identifying suicide documentation in clinical notes through zero-shot learning. Health Sci Rep 2023; 6:e1526. [PMID: 37706016 PMCID: PMC10495736 DOI: 10.1002/hsr2.1526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Revised: 08/08/2023] [Accepted: 08/11/2023] [Indexed: 09/15/2023] Open
Abstract
Background and Aims In deep learning, a major difficulty in identifying suicidality and its risk factors in clinical notes is the lack of training samples given the small number of true positive instances among the number of patients screened. This paper describes a novel methodology that identifies suicidality in clinical notes by addressing this data sparsity issue through zero-shot learning. Our general aim was to develop a tool that leveraged zero-shot learning to effectively identify suicidality documentation in all types of clinical notes. Methods US Veterans Affairs clinical notes served as data. The training data set label was determined using diagnostic codes of suicide attempt and self-harm. We used a base string associated with the target label of suicidality to provide auxiliary information by narrowing the positive training cases to those containing the base string. We trained a deep neural network by mapping the training documents' contents to a semantic space. For comparison, we trained another deep neural network using the identical training data set labels, and bag-of-words features. Results The zero-shot learning model outperformed the baseline model in terms of area under the curve, sensitivity, specificity, and positive predictive value at multiple probability thresholds. In applying a 0.90 probability threshold, the methodology identified notes documenting suicidality but not associated with a relevant ICD-10-CM code, with 94% accuracy. Conclusion This method can effectively identify suicidality without manual annotation.
Affiliation(s)
- Terri Elizabeth Workman
- Biomedical Informatics Center, The George Washington University, Washington, District of Columbia, USA
- VA Medical Center, Washington, District of Columbia, USA
- Joseph L. Goulet
- Department of Emergency Medicine, Yale School of Medicine, New Haven, Connecticut, USA
- VA Connecticut Healthcare System, West Haven, Connecticut, USA
- Cynthia A. Brandt
- Department of Emergency Medicine, Yale School of Medicine, New Haven, Connecticut, USA
- VA Connecticut Healthcare System, West Haven, Connecticut, USA
- Allison R. Warren
- PRIME Center, VA Connecticut Healthcare System, West Haven, Connecticut, USA
- Jacob Eleazer
- PRIME Center, VA Connecticut Healthcare System, West Haven, Connecticut, USA
- Luke Lindemann
- VA Connecticut Healthcare System, West Haven, Connecticut, USA
- John R. Blosnich
- Suzanne Dworak‐Peck School of Social Work, University of Southern California, Los Angeles, California, USA
- John O'Leary
- VA Connecticut Healthcare System, West Haven, Connecticut, USA
- Department of Internal Medicine, Yale School of Medicine, West Haven, Connecticut, USA
- Qing Zeng‐Treitler
- Biomedical Informatics Center, The George Washington University, Washington, District of Columbia, USA
- VA Medical Center, Washington, District of Columbia, USA
7
Fernandes MB, Valizadeh N, Alabsi HS, Quadri SA, Tesh RA, Bucklin AA, Sun H, Jain A, Brenner LN, Ye E, Ge W, Collens SI, Lin S, Das S, Robbins GK, Zafar SF, Mukerji SS, Westover MB. Classification of neurologic outcomes from medical notes using natural language processing. Expert Systems with Applications 2023; 214:119171. [PMID: 36865787 PMCID: PMC9974159 DOI: 10.1016/j.eswa.2022.119171] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/18/2023]
Abstract
Neurologic disability level at hospital discharge is an important outcome in many clinical research studies. Outside of clinical trials, neurologic outcomes must typically be extracted by labor intensive manual review of clinical notes in the electronic health record (EHR). To overcome this challenge, we set out to develop a natural language processing (NLP) approach that automatically reads clinical notes to determine neurologic outcomes, to make it possible to conduct larger scale neurologic outcomes studies. We obtained 7314 notes from 3632 patients hospitalized at two large Boston hospitals between January 2012 and June 2020, including discharge summaries (3485), occupational therapy (1472) and physical therapy (2357) notes. Fourteen clinical experts reviewed notes to assign scores on the Glasgow Outcome Scale (GOS) with 4 classes, namely 'good recovery', 'moderate disability', 'severe disability', and 'death' and on the Modified Rankin Scale (mRS), with 7 classes, namely 'no symptoms', 'no significant disability', 'slight disability', 'moderate disability', 'moderately severe disability', 'severe disability', and 'death'. For 428 patients' notes, 2 experts scored the cases generating interrater reliability estimates for GOS and mRS. After preprocessing and extracting features from the notes, we trained a multiclass logistic regression model using LASSO regularization and 5-fold cross validation for hyperparameter tuning. The model performed well on the test set, achieving a micro average area under the receiver operating characteristic and F-score of 0.94 (95% CI 0.93-0.95) and 0.77 (0.75-0.80) for GOS, and 0.90 (0.89-0.91) and 0.59 (0.57-0.62) for mRS, respectively. Our work demonstrates that an NLP algorithm can accurately assign neurologic outcomes based on free text clinical notes. This algorithm increases the scale of research on neurological outcomes that is possible with EHR data.
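The abstract reports a micro-averaged F-score, which pools true positives, false positives, and false negatives across all outcome classes before computing precision and recall. The sketch below illustrates that pooling in plain Python; the function name and the five toy labels are hypothetical and are not data from the study.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over every class, then score."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1  # predicted this class and it was correct
            elif pred == label:
                fp += 1  # predicted this class but the truth was another
            elif truth == label:
                fn += 1  # truth was this class but another was predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example using the four GOS class names from the abstract.
y_true = ["good recovery", "moderate disability", "severe disability",
          "death", "good recovery"]
y_pred = ["good recovery", "severe disability", "severe disability",
          "death", "moderate disability"]

score = micro_f1(y_true, y_pred)
print(round(score, 2))  # prints: 0.6
```

When every note receives exactly one label, each misclassification counts once as a false positive and once as a false negative, so the micro-averaged F1 equals plain accuracy (3 of 5 correct in the toy data).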
Affiliation(s)
- Marta B. Fernandes
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Clinical Data Animation Center (CDAC), MGH, Boston, MA, United States
- Navid Valizadeh
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Haitham S. Alabsi
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Syed A. Quadri
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Clinical Data Animation Center (CDAC), MGH, Boston, MA, United States
- Ryan A. Tesh
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Clinical Data Animation Center (CDAC), MGH, Boston, MA, United States
- Abigail A. Bucklin
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Clinical Data Animation Center (CDAC), MGH, Boston, MA, United States
- Haoqi Sun
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Clinical Data Animation Center (CDAC), MGH, Boston, MA, United States
- Aayushee Jain
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Clinical Data Animation Center (CDAC), MGH, Boston, MA, United States
- Laura N. Brenner
- Harvard Medical School, Boston, MA, United States
- Division of Pulmonary and Critical Care Medicine, MGH, Boston, MA, United States
- Division of General Internal Medicine, MGH, Boston, MA, United States
- Elissa Ye
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Clinical Data Animation Center (CDAC), MGH, Boston, MA, United States
- Wendong Ge
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Clinical Data Animation Center (CDAC), MGH, Boston, MA, United States
- Sarah I. Collens
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Stacie Lin
- Harvard Medical School, Boston, MA, United States
- Sudeshna Das
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Gregory K. Robbins
- Harvard Medical School, Boston, MA, United States
- Division of Infectious Diseases, MGH, Boston, MA, United States
- Sahar F. Zafar
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Shibani S. Mukerji
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Vaccine and Immunotherapy Center, Division of Infectious Diseases, MGH, Boston, MA, United States
- M. Brandon Westover
- Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Clinical Data Animation Center (CDAC), MGH, Boston, MA, United States
- McCance Center for Brain Health, MGH, Boston, MA, United States