1
Masison J, Lehmann HP, Wan J. Utilization of Computable Phenotypes in Electronic Health Record Research: A Review and Case Study in Atopic Dermatitis. J Invest Dermatol 2025; 145:1008-1016. [PMID: 39488781 PMCID: PMC12018156 DOI: 10.1016/j.jid.2024.08.025] [Received: 04/24/2024] [Revised: 08/05/2024] [Accepted: 08/18/2024] [Indexed: 11/04/2024]
Abstract
Querying electronic health records databases to accurately identify specific cohorts of patients has countless observational and interventional research applications. Computable phenotypes are computationally executable, explicit sets of selection criteria composed of data elements, logical expressions, and a combination of natural language processing and machine learning techniques enabling expedited patient cohort identification. Phenotyping encompasses a range of implementations, each with advantages and use cases. In this paper, the dermatologic computable phenotype literature is reviewed. We identify and evaluate approaches and community supports for computable phenotyping that have been used both generally and within dermatology and, as a case study, focus on studied phenotypes for atopic dermatitis.
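A computable phenotype of the rule-based kind described above can be sketched in a few lines of Python. This is only an illustration: the `Patient` record, the two-code threshold, and the therapy list are hypothetical choices, not criteria taken from the reviewed paper. The only external fact used is that `L20` is the ICD-10 category for atopic dermatitis.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Patient:
    # Hypothetical, simplified EHR record used only for this sketch.
    patient_id: str
    diagnosis_codes: List[str] = field(default_factory=list)
    medications: List[str] = field(default_factory=list)


AD_CODE_PREFIX = "L20"  # ICD-10 category for atopic dermatitis
TOPICAL_THERAPIES = {"tacrolimus", "hydrocortisone"}  # assumed example list


def meets_phenotype(patient: Patient) -> bool:
    """Illustrative rule: two AD codes, or one AD code plus a listed therapy."""
    n_codes = sum(code.startswith(AD_CODE_PREFIX) for code in patient.diagnosis_codes)
    has_therapy = any(m.lower() in TOPICAL_THERAPIES for m in patient.medications)
    return n_codes >= 2 or (n_codes >= 1 and has_therapy)


def select_cohort(patients: List[Patient]) -> List[str]:
    """Return the IDs of patients who satisfy the phenotype rule."""
    return [p.patient_id for p in patients if meets_phenotype(p)]


patients = [
    Patient("p1", ["L20.9", "L20.89"]),
    Patient("p2", ["L20.9"], ["Tacrolimus"]),
    Patient("p3", ["L30.9"], ["hydrocortisone"]),
    Patient("p4", ["L20.9"]),
]
cohort = select_cohort(patients)
```

Keeping the rule in a single function makes the selection criteria explicit and easy to validate against chart review; phenotypes that add natural language processing or machine learning components would replace or extend `meets_phenotype`.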
Affiliation(s)
- Joseph Masison
  - University of Connecticut School of Medicine, Farmington, Connecticut, USA
- Harold P Lehmann
  - Division of General Internal Medicine, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Joy Wan
  - Department of Dermatology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
2
Jian X, Zhang D, Yu Z, Xu H, Bian J, Wu Y, Tong J, Chen Y. Leveraging undecided cases in chart-reviewed phenotypes to enhance EHR-based association studies. J Biomed Inform 2025; 166:104839. [PMID: 40316004 DOI: 10.1016/j.jbi.2025.104839] [Received: 08/30/2024] [Revised: 03/25/2025] [Accepted: 04/23/2025] [Indexed: 05/04/2025]
Abstract
OBJECTIVES In electronic health record (EHR)-based association studies, phenotyping algorithms efficiently classify patient clinical outcomes into binary categories but are susceptible to misclassification errors. The gold standard, manual chart review, involves clinicians determining the true disease status based on their assessment of health records. Such clinician-labeled phenotypes are labor-intensive to obtain and typically limited to a small subset of patients, and they can introduce a third "undecided" category when phenotypes are indeterminate. We aim to effectively integrate the algorithm-derived and chart-reviewed outcomes when both are available in EHR-based association studies. MATERIAL AND METHODS We propose an augmented estimation method that combines the binary algorithm-derived phenotypes for the entire cohort with the trinary chart-reviewed phenotypes for a small, selected subset. Additionally, a cost-effective outcome-dependent sampling strategy is used to address rare-disease scenarios. The proposed trinary chart-reviewed phenotype integrated cost-effective augmented estimation (TriCA) was evaluated across a wide range of simulation settings and real-world applications, including EHR data on Alzheimer's disease and related dementias (ADRD) from the OneFlorida+ Clinical Research Network and cohort data on second breast cancer events (SBCE) from Kaiser Permanente Washington. RESULTS Compared to estimation based on random sampling, our augmented method improved mean square error by up to 28.3% in simulation studies; compared to estimation using only trinary chart-reviewed phenotypes, our method improved efficiency by up to 33.3% in ADRD data and 50.8% in SBCE data. DISCUSSION Our simulation studies and real-world applications demonstrate that, compared to existing methods, the proposed method provides unbiased estimates with higher statistical efficiency.
CONCLUSION The proposed method effectively combined binary algorithm-derived phenotypes for the whole cohort with trinary chart-reviewed outcomes for a limited validation set, making it applicable to a broader range of applications and enhancing risk factor identification in EHR-based association studies.
Affiliation(s)
- Xinyao Jian
  - The Center for Health Analytics and Synthesis of Evidence (CHASE), University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, USA
- Dazheng Zhang
  - The Center for Health Analytics and Synthesis of Evidence (CHASE), University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, USA
- Zehao Yu
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Hua Xu
  - Department of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, CT, USA
- Jiang Bian
  - Department of Biostatistics and Health Data Science, School of Medicine, Indiana University, Indianapolis, IN, USA; Regenstrief Institute, Indianapolis, IN, USA
- Yonghui Wu
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA; Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
- Jiayi Tong
  - The Center for Health Analytics and Synthesis of Evidence (CHASE), University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, USA; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Yong Chen
  - The Center for Health Analytics and Synthesis of Evidence (CHASE), University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, USA; The Graduate Group in Applied Mathematics and Computational Science, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA; Leonard Davis Institute of Health Economics, Philadelphia, PA, USA; Penn Medicine Center for Evidence-based Practice (CEP), Philadelphia, PA, USA; Penn Institute for Biomedical Informatics (IBI), Philadelphia, PA, USA
3
Akbasli IT, Birbilen AZ, Teksam O. Leveraging large language models to mimic domain expert labeling in unstructured text-based electronic healthcare records in non-english languages. BMC Med Inform Decis Mak 2025; 25:154. [PMID: 40165165 PMCID: PMC11959812 DOI: 10.1186/s12911-025-02871-6] [Received: 03/04/2024] [Accepted: 01/14/2025] [Indexed: 04/02/2025] [Open Access]
Abstract
BACKGROUND The integration of big data and artificial intelligence (AI) in healthcare, particularly through the analysis of electronic health records (EHR), presents significant opportunities for improving diagnostic accuracy and patient outcomes. However, the challenge of processing and accurately labeling vast amounts of unstructured data remains a critical bottleneck, necessitating efficient and reliable solutions. This study investigates the ability of domain specific, fine-tuned large language models (LLMs) to classify unstructured EHR texts with typographical errors through named entity recognition tasks, aiming to improve the efficiency and reliability of supervised learning AI models in healthcare. METHODS Turkish clinical notes from pediatric emergency room admissions at Hacettepe University İhsan Doğramacı Children's Hospital from 2018 to 2023 were analyzed. The data were preprocessed with open source Python libraries and categorized using a pretrained GPT-3 model, "text-davinci-003," before and after fine-tuning with domain-specific data on respiratory tract infections (RTI). The model's predictions were compared against ground truth labels established by pediatric specialists. RESULTS Out of 24,229 patient records classified as poorly labeled, 18,879 were identified without typographical errors and confirmed for RTI through filtering methods. The fine-tuned model achieved a 99.88% accuracy, significantly outperforming the pretrained model's 78.54% accuracy in identifying RTI cases among the remaining records. The fine-tuned model demonstrated superior performance metrics across all evaluated aspects compared to the pretrained model. CONCLUSIONS Fine-tuned LLMs can categorize unstructured EHR data with high accuracy, closely approximating the performance of domain experts. 
This approach significantly reduces the time and costs associated with manual data labeling, demonstrating the potential to streamline the processing of large-scale healthcare data for AI applications.
Affiliation(s)
- Izzet Turkalp Akbasli
  - Division of Pediatric Emergency, Department of Pediatrics, Faculty of Medicine, Hacettepe University, Ankara, Turkey
  - Life Support Center, Digital Health and Artificial Intelligence on Critical Care, Hacettepe University, Ankara, Turkey
- Ahmet Ziya Birbilen
  - Division of Pediatric Emergency, Department of Pediatrics, Faculty of Medicine, Hacettepe University, Ankara, Turkey
- Ozlem Teksam
  - Division of Pediatric Emergency, Department of Pediatrics, Faculty of Medicine, Hacettepe University, Ankara, Turkey
4
García-Barragán Á, Sakor A, Vidal ME, Menasalvas E, Gonzalez JCS, Provencio M, Robles V. NSSC: a neuro-symbolic AI system for enhancing accuracy of named entity recognition and linking from oncologic clinical notes. Med Biol Eng Comput 2025; 63:749-772. [PMID: 39485651 PMCID: PMC11891111 DOI: 10.1007/s11517-024-03227-4] [Received: 05/31/2024] [Accepted: 10/12/2024] [Indexed: 11/03/2024]
Abstract
Accurate recognition and linking of oncologic entities in clinical notes is essential for extracting insights across cancer research, patient care, clinical decision-making, and treatment optimization. We present the Neuro-Symbolic System for Cancer (NSSC), a hybrid AI framework that integrates neurosymbolic methods with named entity recognition (NER) and entity linking (EL) to transform unstructured clinical notes into structured terms using medical vocabularies, with the Unified Medical Language System (UMLS) as a case study. NSSC was evaluated on a dataset of clinical notes from breast cancer patients, demonstrating significant improvements in the accuracy of both entity recognition and linking compared to state-of-the-art models. Specifically, NSSC achieved a 33% improvement over BioFalcon and a 58% improvement over scispaCy. By combining large language models (LLMs) with symbolic reasoning, NSSC improves the recognition and interoperability of oncologic entities, enabling seamless integration with existing biomedical knowledge. This approach marks a significant advancement in extracting meaningful information from clinical narratives, offering promising applications in cancer research and personalized patient care.
Affiliation(s)
- Álvaro García-Barragán
  - Center of Biomedical Technology, Universidad Politécnica de Madrid, Campus Montegancedo, Pozuelo de Alarcón, 28223, Madrid, Spain
- Ahmad Sakor
  - Data Science Institute, Leibniz University of Hannover, Welfengarten 1, Hannover, 30060, Lower Saxony, Germany
  - Scientific Data Management Group, TIB-Leibniz Information Centre for Science and Technology, Welfengarten 1B, Hannover, 30167, Lower Saxony, Germany
- Maria-Esther Vidal
  - Data Science Institute, Leibniz University of Hannover, Welfengarten 1, Hannover, 30060, Lower Saxony, Germany
  - Scientific Data Management Group, TIB-Leibniz Information Centre for Science and Technology, Welfengarten 1B, Hannover, 30167, Lower Saxony, Germany
- Ernestina Menasalvas
  - Center of Biomedical Technology, Universidad Politécnica de Madrid, Campus Montegancedo, Pozuelo de Alarcón, 28223, Madrid, Spain
- Víctor Robles
  - Center of Biomedical Technology, Universidad Politécnica de Madrid, Campus Montegancedo, Pozuelo de Alarcón, 28223, Madrid, Spain
5
Takeuchi T, Horinouchi H, Takasawa K, Mukai M, Masuda K, Shinno Y, Okuma Y, Yoshida T, Goto Y, Yamamoto N, Ohe Y, Miyake M, Watanabe H, Kusumoto M, Aoki T, Nishimura K, Hamamoto R. A series of natural language processing for predicting tumor response evaluation and survival curve from electronic health records. BMC Med Inform Decis Mak 2025; 25:85. [PMID: 39962486 PMCID: PMC11834625 DOI: 10.1186/s12911-025-02928-6] [Received: 07/11/2024] [Accepted: 02/11/2025] [Indexed: 02/20/2025] [Open Access]
Abstract
BACKGROUND The clinical information housed within unstructured electronic health records (EHRs) has the potential to promote cancer research. The National Cancer Center Hospital (NCCH) is widely recognized as a leading institution for the treatment of thoracic malignancies in Japan. Information on medical treatment, particularly the characteristics of malignant tumors that occur in patients, tumor response evaluation, and adverse events, was compiled into the databases of each NCCH department from EHRs. However, there have been few opportunities for integrated analysis of data on both the hospital and research institute. METHODS We developed a method for predicting tumor response evaluation and survival curves of drug therapy from the EHRs of lung cancer patients using natural language processing. First, we developed a rule-based algorithm to predict treatment duration using a dictionary of anticancer drugs and regimens used for lung cancer treatment. Thereafter, we applied supervised learning to radiology reports during each treatment period and constructed a classification model to predict the tumor response evaluation of anticancer drugs and date when the progressive disease (PD) was determined. The predicted response and PD date can be used to draw a survival curve for the progression-free survival. RESULTS We used the EHRs of 716 lung cancer treatments at the NCCH and structured data of the cases as labels for the training and testing of supervised learning. The structured data were manually curated by physicians and CRCs. We investigated the results and performance of the proposed method. Individual predictions of tumor response evaluation and PD date were not extremely high. However, the final predicted survival curves were nearly similar to the actual survival curves. 
CONCLUSIONS Although it is difficult to construct a fully automated system using our method, we believe that it performs well enough to support physicians and CRCs in constructing the database and to provide clinical information that helps researchers identify opportunities for clinical studies.
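The abstract notes that predicted responses and PD dates can be turned into a progression-free survival curve. A standard way to do this is the Kaplan-Meier estimator; the abstract does not say which estimator the authors used, so the following is a minimal sketch under that assumption, with made-up durations (in months) and progression flags.

```python
def kaplan_meier(durations, progressed):
    """Kaplan-Meier estimate of progression-free survival.

    durations:  time from treatment start to PD or to last follow-up
    progressed: 1 if PD was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) at each PD time.
    """
    n_at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        flags_at_t = [e for d, e in zip(durations, progressed) if d == t]
        n_events = sum(flags_at_t)
        if n_events:
            # Multiply by the fraction of at-risk patients who did not progress at t.
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Patients who progressed or were censored at t leave the risk set.
        n_at_risk -= len(flags_at_t)
    return curve


# Hypothetical predicted durations and PD flags for five treatments.
curve = kaplan_meier([2, 4, 4, 6, 8], [1, 1, 0, 1, 0])
# curve is approximately [(2, 0.8), (4, 0.6), (6, 0.3)]
```

Because the estimator aggregates over the whole cohort, moderate errors in individual PD dates can partly cancel out, which is consistent with the abstract's observation that the predicted curves stayed close to the actual ones even though individual predictions were imperfect.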
Affiliation(s)
- Toshiki Takeuchi
  - Xcoo, Inc., Hongo-Sanchome TH Bldg., 6F, 2-40-8, Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan
- Hidehito Horinouchi
  - Department of Thoracic Oncology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Ken Takasawa
  - Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Masami Mukai
  - Division of Medical Informatics, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Ken Masuda
  - Department of Thoracic Oncology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Yuki Shinno
  - Department of Thoracic Oncology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Yusuke Okuma
  - Department of Thoracic Oncology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Tatsuya Yoshida
  - Department of Thoracic Oncology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Yasushi Goto
  - Department of Thoracic Oncology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Noboru Yamamoto
  - Department of Thoracic Oncology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Yuichiro Ohe
  - Department of Thoracic Oncology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Mototaka Miyake
  - Department of Diagnostic Radiology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Hirokazu Watanabe
  - Department of Diagnostic Radiology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Masahiko Kusumoto
  - Department of Diagnostic Radiology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
- Takashi Aoki
  - Xcoo, Inc., Hongo-Sanchome TH Bldg., 6F, 2-40-8, Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan
- Kunihiro Nishimura
  - Xcoo, Inc., Hongo-Sanchome TH Bldg., 6F, 2-40-8, Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan
- Ryuji Hamamoto
  - Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
6
Shen Y, Yu J, Zhou J, Hu G. Twenty-Five Years of Evolution and Hurdles in Electronic Health Records and Interoperability in Medical Research: Comprehensive Review. J Med Internet Res 2025; 27:e59024. [PMID: 39787599 PMCID: PMC11757985 DOI: 10.2196/59024] [Received: 03/31/2024] [Revised: 10/02/2024] [Accepted: 12/05/2024] [Indexed: 01/12/2025] [Open Access]
Abstract
BACKGROUND Electronic health records (EHRs) facilitate the accessibility and sharing of patient data among various health care providers, contributing to more coordinated and efficient care. OBJECTIVE This study aimed to summarize the evolution of secondary use of EHRs and their interoperability in medical research over the past 25 years. METHODS We conducted an extensive literature search in the PubMed, Scopus, and Web of Science databases using the keywords Electronic health record and Electronic medical record in the title or abstract and Medical research in all fields from 2000 to 2024. Specific terms were applied to different time periods. RESULTS The review yielded 2212 studies, all of which were then screened and processed in a structured manner. Of these 2212 studies, 2102 (93.03%) were included in the review analysis, of which 1079 (51.33%) studies were from 2000 to 2009, 582 (27.69%) were from 2010 to 2019, 251 (11.94%) were from 2020 to 2023, and 190 (9.04%) were from 2024. CONCLUSIONS The evolution of EHRs marks an important milestone in health care's journey toward integrating technology and medicine. From early documentation practices to the sophisticated use of artificial intelligence and big data analytics today, EHRs have become central to improving patient care, enhancing public health surveillance, and advancing medical research.
Affiliation(s)
- Yun Shen
  - Chronic Disease Epidemiology, Population and Public Health, Pennington Biomedical Research Center, Baton Rouge, LA, United States
- Jiamin Yu
  - Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Jian Zhou
  - Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Gang Hu
  - Chronic Disease Epidemiology, Population and Public Health, Pennington Biomedical Research Center, Baton Rouge, LA, United States
7
Su D, Zheng J, Shao YK, Liu JY, Liu XX, Yu K, Feng BH, Mei H, Qin S. Developing and validating a machine learning-based model for predicting in-hospital mortality among ICU-admitted heart failure patients: A study utilizing the MIMIC-III database. Digit Health 2025; 11:20552076251335705. [PMID: 40297352 PMCID: PMC12035218 DOI: 10.1177/20552076251335705] [Received: 12/24/2024] [Accepted: 04/01/2025] [Indexed: 04/30/2025] [Open Access]
Abstract
Background Although the assessment of in-hospital mortality risk among heart failure patients in the intensive care unit (ICU) is crucial for clinical decision-making, there is currently a lack of comprehensive models accurately predicting their prognosis. Machine learning techniques offer a powerful means to identify potential risk factors and predict outcomes within multivariable clinical data. Methods This study, based on the MIMIC-III database, extracted demographic characteristics, vital signs, laboratory test values, and comorbidity information of heart failure patients using structured query language. LASSO regression was employed for feature selection, and various machine learning algorithms were utilized to train models, including logistic regression (LR), random forest (RF), and gradient boosting (GB), among others. An ensemble learning model based on a soft voting mechanism was constructed. Model performance was evaluated using accuracy, recall, precision, F1 score, and AUC values through cross-validation and on an independent test set. Results In five-fold cross-validation, the soft voting ensemble learning model demonstrated the best overall performance, with accuracy and AUC values both at 0.86. Additionally, RF and GB models also performed well, with RF achieving an accuracy of 0.79 and an AUC of 0.79 on the independent test set, while the GB model achieved an accuracy of 0.77 and an AUC of 0.79. In contrast, other models such as LR, SVM, and KNN exhibited poorer performance in terms of accuracy and AUC values, indicating the significant advantage of ensemble methods in handling complex clinical prediction tasks. Conclusion This study demonstrates the potential of machine learning models, particularly ensemble learning models based on soft voting mechanisms, in predicting in-hospital mortality risk among heart failure patients in the ICU. 
The overall performance of the ensemble learning model confirms its effectiveness as an adjunct clinical decision-making tool. Future research should further optimize the models and validate them in a broader patient population to enhance their practical utility and accuracy in real clinical settings.
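The soft voting mechanism mentioned in this abstract averages the class probabilities predicted by each base model and thresholds the average, whereas hard voting counts each model's thresholded vote. The sketch below illustrates the difference with hypothetical mortality probabilities; the model names mirror the abstract, but the numbers, the 0.5 threshold, and the equal model weights are assumptions made for illustration only.

```python
from statistics import mean


def soft_vote(prob_by_model, threshold=0.5):
    """Average each patient's predicted probability across models, then threshold."""
    averaged = [mean(probs) for probs in zip(*prob_by_model.values())]
    labels = [int(p >= threshold) for p in averaged]
    return averaged, labels


def hard_vote(prob_by_model, threshold=0.5):
    """Majority vote over each model's thresholded prediction, for comparison."""
    labels = []
    for probs in zip(*prob_by_model.values()):
        votes = sum(p >= threshold for p in probs)
        labels.append(int(votes > len(probs) / 2))
    return labels


# Hypothetical predicted in-hospital mortality probabilities for three patients.
prob_by_model = {
    "logistic_regression": [0.20, 0.90, 0.85],
    "random_forest": [0.10, 0.45, 0.80],
    "gradient_boosting": [0.30, 0.45, 0.75],
}
averaged, soft_labels = soft_vote(prob_by_model)
hard_labels = hard_vote(prob_by_model)
```

For the second patient, one confident model outweighs two borderline ones under soft voting but is outvoted under hard voting, which shows how soft voting uses the strength of each model's prediction rather than only its direction.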
Affiliation(s)
- De Su
  - Department of Critical Care Medicine, Affiliated Hospital of Zunyi Medical University, Zunyi, Guizhou, P.R. China
- Jie Zheng
  - Department of Critical Care Medicine, Affiliated Hospital of Zunyi Medical University, Zunyi, Guizhou, P.R. China
- Yue-kai Shao
  - Department of Critical Care Medicine, Affiliated Hospital of Zunyi Medical University, Zunyi, Guizhou, P.R. China
- Jun-ya Liu
  - Department of Critical Care Medicine, Affiliated Hospital of Zunyi Medical University, Zunyi, Guizhou, P.R. China
- Xin-xin Liu
  - Department of Critical Care Medicine, Affiliated Hospital of Zunyi Medical University, Zunyi, Guizhou, P.R. China
- Kun Yu
  - Department of Critical Care Medicine, Affiliated Hospital of Zunyi Medical University, Zunyi, Guizhou, P.R. China
- Bang-hai Feng
  - Department of Critical Care Medicine, Zunyi Hospital of Traditional Chinese Medicine, Zunyi, Guizhou, P.R. China
- Hong Mei
  - Department of Critical Care Medicine, Affiliated Hospital of Zunyi Medical University, Zunyi, Guizhou, P.R. China
- Song Qin
  - Department of Critical Care Medicine, Affiliated Hospital of Zunyi Medical University, Zunyi, Guizhou, P.R. China
8
Murnan AW, Tscholl JJ, Ganta R, Duah HO, Qasem I, Sezgin E. Identification of Child Survivors of Sex Trafficking From Electronic Health Records: An Artificial Intelligence Guided Approach. Child Maltreatment 2024; 29:601-611. [PMID: 37545138 PMCID: PMC11000265 DOI: 10.1177/10775595231194599] [Indexed: 08/08/2023]
Abstract
Survivors of child sex trafficking (SCST) experience high rates of adverse health outcomes. During their victimization, survivors regularly seek healthcare yet are not identified. This study sought to utilize artificial intelligence (AI) to identify SCST and describe the elements of their healthcare presentation. An AI-supported keyword search was conducted to identify SCST within the electronic medical records (EMR) of ∼1.5 million patients at a large midwestern pediatric hospital. Descriptive analyses were used to evaluate associated diagnoses and clinical presentation. A sex trafficking-related keyword was identified in 0.18% of patient charts. Among this cohort, the most common associated diagnostic codes were for Confirmed Sexual/Physical Assault; Trauma and Stress-Related Disorders; Depressive Disorders; Anxiety Disorders; and Suicidal Ideation. Our findings are consistent with the many known adverse physical and psychological outcomes among SCST and illuminate the future potential of AI technology to improve screening and research efforts surrounding all aspects of this vulnerable population.
Affiliation(s)
- Aaron W Murnan
  - College of Nursing, University of Cincinnati, Cincinnati, OH, USA
- Jennifer J Tscholl
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
  - Division of Child and Family Advocacy, Center for Family Safety and Healing, Nationwide Children's Hospital, Columbus, OH, USA
- Rajesh Ganta
  - Information Technology Research and Innovation, Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
- Henry O Duah
  - College of Nursing, University of Cincinnati, Cincinnati, OH, USA
- Islam Qasem
  - College of Nursing, University of Cincinnati, Cincinnati, OH, USA
- Emre Sezgin
  - Information Technology Research and Innovation, Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
  - Center for Biobehavioral Health, The Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
9
Lee K, Mai Y, Liu Z, Raja K, Jun T, Ma M, Wang T, Ai L, Calay E, Oh W, Schadt E, Wang X. CriteriaMapper: establishing the automatic identification of clinical trial cohorts from electronic health records by matching normalized eligibility criteria and patient clinical characteristics. Sci Rep 2024; 14:25387. [PMID: 39455879 PMCID: PMC11511882 DOI: 10.1038/s41598-024-77447-x] [Received: 04/04/2024] [Accepted: 10/22/2024] [Indexed: 10/28/2024] [Open Access]
Abstract
The use of electronic health records (EHRs) holds the potential to enhance clinical trial activities. However, the identification of eligible patients within EHRs presents considerable challenges. We aimed to develop a CriteriaMapper system for phenotyping eligibility criteria, enabling the identification of patients from EHRs with clinical characteristics that match those criteria. We utilized clinical trial eligibility criteria and patient EHRs from the Mount Sinai Database. The CriteriaMapper system was developed to normalize the criteria using national standard terminologies and in-house databases, facilitating computability and queryability to bridge clinical trial criteria and EHRs. The system employed rule-based pattern recognition and manual annotation. Our system normalized 367 out of 640 unique eligibility criteria attributes, covering various medical conditions including non-small cell lung cancer, small cell lung cancer, prostate cancer, breast cancer, multiple myeloma, ulcerative colitis, Crohn's disease, non-alcoholic steatohepatitis, and sickle cell anemia. About 174 criteria were encoded with standard terminologies and 193 were normalized using the in-house reference tables. The agreement between automated and manual normalization was high (Cohen's Kappa = 0.82), and patient matching demonstrated a 0.94 F1 score. Our system has proven effective on EHRs from multiple institutions, showing broad applicability and promising improved clinical trial processes, leading to better patient selection, and enhanced clinical research outcomes.
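The two agreement metrics reported in this abstract, Cohen's kappa and the F1 score, can be computed directly from paired labels. The sketch below uses the textbook definitions with small made-up label vectors; it does not reproduce the paper's data or its evaluation code, and the variable names are illustrative.

```python
def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two label sequences of equal length.

    Undefined (division by zero) when expected agreement is exactly 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    labels = set(rater_a) | set(rater_b)
    # Expected agreement if the two raters labeled independently.
    expected = sum((rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)


def f1_score(truth, predicted, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(truth, predicted))
    fp = sum(t != positive and p == positive for t, p in zip(truth, predicted))
    fn = sum(t == positive and p != positive for t, p in zip(truth, predicted))
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical labels: 1 = criterion normalized / patient matched, 0 = not.
manual = [1, 1, 1, 1, 0, 0, 0, 0]
automated = [1, 1, 1, 0, 0, 0, 0, 1]

kappa = cohens_kappa(manual, automated)  # 0.5 for these vectors
f1 = f1_score(manual, automated)         # 0.75 for these vectors
```

Kappa is the more conservative of the two here because it discounts the 50% agreement that two raters with balanced labels would reach by chance.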
Affiliation(s)
- K Lee
  - GeneDx (Sema4), 333 Ludlow Street, Stamford, CT, 06902, USA
- Y Mai
  - GeneDx (Sema4), 333 Ludlow Street, Stamford, CT, 06902, USA
- Z Liu
  - GeneDx (Sema4), 333 Ludlow Street, Stamford, CT, 06902, USA
- K Raja
  - GeneDx (Sema4), 333 Ludlow Street, Stamford, CT, 06902, USA
- T Jun
  - GeneDx (Sema4), 333 Ludlow Street, Stamford, CT, 06902, USA
- M Ma
  - GeneDx (Sema4), 333 Ludlow Street, Stamford, CT, 06902, USA
- T Wang
  - GeneDx (Sema4), 333 Ludlow Street, Stamford, CT, 06902, USA
- L Ai
  - GeneDx (Sema4), 333 Ludlow Street, Stamford, CT, 06902, USA
- E Calay
  - GeneDx (Sema4), 333 Ludlow Street, Stamford, CT, 06902, USA
- W Oh
  - GeneDx (Sema4), 333 Ludlow Street, Stamford, CT, 06902, USA
  - Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY, 10029, USA
- E Schadt
  - GeneDx (Sema4), 333 Ludlow Street, Stamford, CT, 06902, USA
  - Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY, 10029, USA
- X Wang
  - GeneDx (Sema4), 333 Ludlow Street, Stamford, CT, 06902, USA
10
Soysal E, Roberts K. PheNormGPT: a framework for extraction and normalization of key medical findings. Database (Oxford) 2024; 2024:baae103. [PMID: 39444329 PMCID: PMC11498178 DOI: 10.1093/database/baae103] [Received: 03/31/2024] [Revised: 07/31/2024] [Accepted: 08/27/2024] [Indexed: 10/25/2024]
Abstract
This manuscript presents PheNormGPT, a framework for extraction and normalization of key findings in clinical text. PheNormGPT relies on an innovative approach, leveraging large language models to extract key findings and phenotypic data in unstructured clinical text and map them to Human Phenotype Ontology concepts. It utilizes OpenAI's GPT-3.5 Turbo and GPT-4 models with fine-tuning and few-shot learning strategies, including a novel few-shot learning strategy for custom-tailored few-shot example selection per request. PheNormGPT was evaluated in the BioCreative VIII Track 3: Genetic Phenotype Extraction from Dysmorphology Physical Examination Entries shared task. PheNormGPT achieved an F1 score of 0.82 for standard matching and 0.72 for exact matching, securing first place for this shared task.
Affiliation(s)
- Ekin Soysal
- McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin St #600, Houston, TX 77030, United States
| | - Kirk Roberts
- McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin St #600, Houston, TX 77030, United States
| |
|
11
|
Klang E, Tessler I, Apakama DU, Abbott E, Glicksberg BS, Arnold M, Moses A, Sakhuja A, Soroush A, Charney AW, Reich DL, McGreevy J, Gavin N, Carr B, Freeman R, Nadkarni GN. Assessing Retrieval-Augmented Large Language Model Performance in Emergency Department ICD-10-CM Coding Compared to Human Coders. medRxiv 2024:2024.10.15.24315526. [PMID: 39484238 PMCID: PMC11527068 DOI: 10.1101/2024.10.15.24315526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2024]
Abstract
Background Accurate medical coding is essential for clinical and administrative purposes but complicated, time-consuming, and biased. This study compares Retrieval-Augmented Generation (RAG)-enhanced LLMs to provider-assigned codes in producing ICD-10-CM codes from emergency department (ED) clinical records. Methods Retrospective cohort study using 500 ED visits randomly selected from the Mount Sinai Health System between January and April 2024. The RAG system integrated data from 1,038,066 past ED visits (2021-2023) into the LLMs' predictions to improve coding accuracy. Nine commercial and open-source LLMs were evaluated. The primary outcome was a head-to-head comparison of the ICD-10-CM codes generated by the RAG-enhanced LLMs and those assigned by the original providers. A panel of four physicians and two LLMs blindly reviewed the codes, comparing the RAG-enhanced LLM and provider-assigned codes on accuracy and specificity. Findings RAG-enhanced LLMs demonstrated superior performance to provider coders in both the accuracy and specificity of code assignments. In a targeted evaluation of 200 cases where discrepancies existed between GPT-4 and provider-assigned codes, human reviewers favored GPT-4 for accuracy in 447 instances, compared to 277 instances where providers' codes were preferred (p<0.001). Similarly, GPT-4 was selected for its superior specificity in 509 cases, whereas human coders were preferred in only 181 cases (p<0.001). Smaller open-access models, such as Llama-3.1-70B, also demonstrated substantial scalability when enhanced with RAG, with 218 instances of accuracy preference compared to 90 for providers' codes. Furthermore, across all models, the exact match rate between LLM-generated and provider-assigned codes significantly improved following RAG integration, with Qwen-2-7B increasing from 0.8% to 17.6% and Gemma-2-9b-it improving from 7.2% to 26.4%.
Interpretation RAG-enhanced LLMs improve medical coding accuracy in EDs, suggesting clinical workflow applications. These findings show that generative AI can improve clinical outcomes and reduce administrative burdens. Funding This work was supported in part through the computational and data resources and staff expertise provided by Scientific Computing and Data at the Icahn School of Medicine at Mount Sinai and supported by the Clinical and Translational Science Awards (CTSA) grant UL1TR004419 from the National Center for Advancing Translational Sciences. Research reported in this publication was also supported by the Office of Research Infrastructure of the National Institutes of Health under award number S10OD026880 and S10OD030463. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The funders played no role in study design, data collection, analysis and interpretation of data, or the writing of this manuscript. Twitter Summary A study showed AI models with retrieval-augmented generation outperformed human doctors in ED diagnostic coding accuracy and specificity. Even smaller AI models perform favorably when using RAG. This suggests potential for reducing administrative burden in healthcare, improving coding efficiency, and enhancing clinical documentation.
|
12
|
Martinson AK, Chin AT, Butte MJ, Rider NL. Artificial Intelligence and Machine Learning for Inborn Errors of Immunity: Current State and Future Promise. J Allergy Clin Immunol Pract 2024; 12:2695-2704. [PMID: 39127104 DOI: 10.1016/j.jaip.2024.08.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2024] [Revised: 07/10/2024] [Accepted: 08/01/2024] [Indexed: 08/12/2024]
Abstract
Artificial intelligence (AI) and machine learning (ML) research within medicine has exponentially increased over the last decade, with studies showcasing the potential of AI/ML algorithms to improve clinical practice and outcomes. Ongoing research and efforts to develop AI-based models have expanded to aid in the identification of inborn errors of immunity (IEI). The use of larger electronic health record data sets, coupled with advances in phenotyping precision and enhancements in ML techniques, has the potential to significantly improve the early recognition of IEI, thereby increasing access to equitable care. In this review, we provide a comprehensive examination of AI/ML for IEI, covering the spectrum from data preprocessing for AI/ML analysis to current applications within immunology, and address the challenges associated with implementing clinical decision support systems to refine the diagnosis and management of IEI.
Affiliation(s)
| | - Aaron T Chin
- Department of Pediatrics, Division of Immunology, Allergy and Rheumatology, University of California, Los Angeles, Los Angeles, Calif
| | - Manish J Butte
- Department of Pediatrics, Division of Immunology, Allergy and Rheumatology, University of California, Los Angeles, Los Angeles, Calif
| | - Nicholas L Rider
- Department of Health Systems & Implementation Science, Virginia Tech Carilion School of Medicine, Roanoke, Va; Department of Medicine, Division of Allergy-Immunology, Carilion Clinic, Roanoke, Va.
| |
|
13
|
Guralnik E. US public health surveillance, reimagined. Learn Health Syst 2024; 8:e10445. [PMID: 39444500 PMCID: PMC11493541 DOI: 10.1002/lrh2.10445] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/09/2024] [Revised: 06/24/2024] [Accepted: 07/25/2024] [Indexed: 10/25/2024]
Abstract
Introduction This study presents two novel concepts for standardizing electronic health records (EHR)-based public health surveillance through utilization of existing informatics methods and data platforms. Methods Drawing from the collective experience in applied epidemiology, health services research and health informatics, the author presents a vision for an alternative path to public health surveillance by repurposing existing tools and resources, such as (1) computable phenotypes which have already been created and validated for a variety of chronic diseases of interest to public health and (2) large data platforms/collaboratives, such as All of Us Research Program and National COVID Cohort Collaborative. Opportunities and challenges are discussed regarding EHR-based chronic disease surveillance, as well as the concept of reusing phenotype definitions and large data platforms for public health needs. Results/Framework Reusing computable phenotypes for EHR-based public health surveillance would require secure data platforms and nationally representative data. Standardization metrics for reuse of previously developed and validated computable phenotypes are also necessary and are currently being developed by the author. This study presents a reimagined Learning Health System framework by incorporating Public Health and two novel concept sets of solutions into the healthcare ecosystem. Conclusion/Next Steps Alternative approaches to limited resources and current infrastructure of the US Public Health System, especially as applied to disease surveillance, are needed and may be possible when repurposing the resources and methodologies across the Learning Health System.
Affiliation(s)
- Elina Guralnik
- Department of Health Administration and Policy, College of Public Health, George Mason University, Fairfax, VA, USA
| |
|
14
|
Xian S, Grabowska ME, Kullo IJ, Luo Y, Smoller JW, Wei WQ, Jarvik G, Mooney S, Crosslin D. Language-model-based patient embedding using electronic health records facilitates phenotyping, disease forecasting, and progression analysis. Res Sq 2024:rs.3.rs-4708839. [PMID: 39399661 PMCID: PMC11469380 DOI: 10.21203/rs.3.rs-4708839/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2024]
Abstract
Current studies regarding the secondary use of electronic health records (EHR) predominantly rely on domain expertise and existing medical knowledge. Though significant efforts have been devoted to investigating the application of machine learning algorithms in the EHR, efficient and powerful representation of patients is needed to unleash the potential of discovering new medical patterns underlying the EHR. Here, we present an unsupervised method for embedding high-dimensional EHR data at the patient level, aimed at characterizing patient heterogeneity in complex diseases and identifying new disease patterns associated with clinical outcome disparities. Inspired by the architecture of modern language models, specifically transformers with attention mechanisms, we use patient diagnosis and procedure codes as vocabularies and treat each patient as a sentence to perform the patient embedding. We applied this approach to 34,851 unique medical codes across 1,046,649 longitudinal patient events, including 102,739 patients from the electronic Medical Records and GEnomics (eMERGE) Network. The resulting patient vectors demonstrated excellent performance in predicting future disease events (median AUROC = 0.87 within one year) and bulk phenotyping (median AUROC = 0.84). We then illustrated the utility of these patient vectors in revealing heterogeneous comorbidity patterns, exemplified by disease subtypes in colorectal cancer and systemic lupus erythematosus, and capturing distinct longitudinal disease trajectories. External validation using EHR data from the University of Washington confirmed robust model performance, with median AUROCs of 0.83 and 0.84 for bulk phenotyping tasks and disease onset prediction, respectively. Importantly, the model reproduced the clustering results of disease subtypes identified in the eMERGE cohort and uncovered variations in overall mortality among these subtypes.
Together, these results underscore the potential of representation learning in EHRs to enhance patient characterization and associated clinical outcomes, thereby advancing disease forecasting and facilitating personalized medicine.
Affiliation(s)
- Su Xian
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA
| | - Monika E Grabowska
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
| | - Iftikhar J Kullo
- Department of Cardiovascular Medicine and the Gonda Vascular Center, Mayo Clinic, Rochester, Minnesota
| | - Yuan Luo
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine
| | - Jordan W Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston, MA
| | - Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
| | - Gail Jarvik
- Department of Medicine, Division of Medical Genetics, University of Washington, Seattle, WA
| | - Sean Mooney
- Center for Information Technology, National Institutes of Health
| | - David Crosslin
- Department of Medicine, Division of Biomedical Informatics and Genomics, Tulane University, New Orleans, LA
| |
|
15
|
Deng Y, Xing Y, Quach J, Chen X, Wu X, Zhang Y, Moureaud C, Yu M, Zhao Y, Wang L, Zhong S. Developing large language models to detect adverse drug events in posts on x. J Biopharm Stat 2024:1-12. [PMID: 39300965 DOI: 10.1080/10543406.2024.2403442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 09/08/2024] [Indexed: 09/22/2024]
Abstract
Adverse drug events (ADEs) are one of the major causes of hospital admissions and are associated with increased morbidity and mortality. Post-marketing ADE identification is one of the most important phases of drug safety surveillance. Traditionally, data for post-marketing surveillance have come mainly from spontaneous reporting systems such as the Food and Drug Administration Adverse Event Reporting System (FAERS). Social media data such as posts on X (formerly Twitter) contain rich patient and medication information and could potentially accelerate drug surveillance research. However, ADE information in social media data is usually locked in the text, making it difficult to use with traditional statistical approaches. In recent years, large language models (LLMs) have shown promise in many natural language processing tasks. In this study, we developed several LLMs to perform ADE classification on X data. We fine-tuned various LLMs including BERT-base, Bio_ClinicalBERT, RoBERTa, and RoBERTa-large. We also experimented with ChatGPT few-shot prompting and ChatGPT fine-tuned on the whole training data. We then evaluated the model performance based on sensitivity, specificity, negative predictive value, positive predictive value, accuracy, F1-measure, and area under the ROC curve. Our results showed that RoBERTa-large achieved the best F1-measure (0.8) among all models, followed by the fine-tuned ChatGPT model with an F1-measure of 0.75. Our feature importance analysis based on 1200 random samples and RoBERTa-large showed the most important features are as follows: "withdrawals"/"withdrawal", "dry", "dealing", "mouth", and "paralysis". The good model performance and clinically relevant features show the potential of LLMs in augmenting ADE detection for post-marketing drug safety surveillance.
Affiliation(s)
- Yu Deng
- Data & Statistical Sciences, AbbVie Inc, North Chicago, Illinois, USA
| | - Yunzhao Xing
- Data & Statistical Sciences, AbbVie Inc, North Chicago, Illinois, USA
| | - Jason Quach
- Computer Science & Engineering, University of California San Diego, La Jolla, California, USA
| | - Xiaotian Chen
- Data & Statistical Sciences, AbbVie Inc, North Chicago, Illinois, USA
| | - Xiaoqiang Wu
- Data & Statistical Sciences, AbbVie Inc, North Chicago, Illinois, USA
| | - Yafei Zhang
- Data & Statistical Sciences, AbbVie Inc, North Chicago, Illinois, USA
| | | | - Mengjia Yu
- Data & Statistical Sciences, AbbVie Inc, North Chicago, Illinois, USA
| | - Yujie Zhao
- Data & Statistical Sciences, AbbVie Inc, North Chicago, Illinois, USA
| | - Li Wang
- Data & Statistical Sciences, AbbVie Inc, North Chicago, Illinois, USA
| | - Sheng Zhong
- Data & Statistical Sciences, AbbVie Inc, North Chicago, Illinois, USA
| |
|
16
|
Yada S, Nishiyama T, Wakamiya S, Kawazoe Y, Imai S, Hori S, Aramaki E. Utility analysis and demonstration of real-world clinical texts: A case study on Japanese cancer-related EHRs. PLoS One 2024; 19:e0310432. [PMID: 39259727 PMCID: PMC11389901 DOI: 10.1371/journal.pone.0310432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Accepted: 08/30/2024] [Indexed: 09/13/2024]
Abstract
Real-world data (RWD) in the medical field, such as electronic health records (EHRs) and medication orders, are receiving increasing attention from researchers and practitioners. While structured data have played a vital role thus far, unstructured data represented by text (e.g., discharge summaries) are not effectively utilized because of the difficulty in extracting medical information. We evaluated the information gained by supplementing structured data with clinical concepts extracted from unstructured text by leveraging natural language processing techniques. Using a machine learning-based pretrained named entity recognition tool, we extracted disease and medication names from real discharge summaries in a Japanese hospital and linked them to medical concepts using medical term dictionaries. By comparing the diseases and medications mentioned in the text with medical codes in tabular diagnosis records, we found that: (1) the text data contained richer information on patient symptoms than tabular diagnosis records, whereas the medication-order table stored more injection data than text. In addition, (2) extractable information regarding specific diseases showed surprisingly small intersections among text, diagnosis records, and medication orders. Text data can thus be a useful supplement for RWD mining, which is further demonstrated by (3) our practical application system for drug safety evaluation, which exhaustively visualizes suspicious adverse drug effects caused by the simultaneous use of anticancer drug pairs. We conclude that proper use of textual information extraction can lead to better outcomes in medical RWD mining.
Affiliation(s)
- Shuntaro Yada
- Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan
| | - Tomohiro Nishiyama
- Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan
| | - Shoko Wakamiya
- Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan
| | - Yoshimasa Kawazoe
- Artificial Intelligence and Digital Twin in Healthcare, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
| | - Shungo Imai
- Division of Drug Informatics, Faculty of Pharmacy, Keio University, Tokyo, Japan
| | - Satoko Hori
- Division of Drug Informatics, Faculty of Pharmacy, Keio University, Tokyo, Japan
| | - Eiji Aramaki
- Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan
| |
|
17
|
Hayden N, Gilbert S, Poisson LM, Griffith B, Klochko C. Performance of GPT-4 with Vision on Text- and Image-based ACR Diagnostic Radiology In-Training Examination Questions. Radiology 2024; 312:e240153. [PMID: 39225605 DOI: 10.1148/radiol.240153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
Background Recent advancements, including image processing capabilities, present new potential applications of large language models such as ChatGPT (OpenAI), a generative pretrained transformer, in radiology. However, baseline performance of ChatGPT in radiology-related tasks is understudied. Purpose To evaluate the performance of GPT-4 with vision (GPT-4V) on radiology in-training examination questions, including those with images, to gauge the model's baseline knowledge in radiology. Materials and Methods In this prospective study, conducted between September 2023 and March 2024, the September 2023 release of GPT-4V was assessed using 386 retired questions (189 image-based and 197 text-only questions) from the American College of Radiology Diagnostic Radiology In-Training Examinations. Nine question pairs were identified as duplicates; only the first instance of each duplicate was considered in ChatGPT's assessment. A subanalysis assessed the impact of different zero-shot prompts on performance. Statistical analysis included χ2 tests of independence to ascertain whether the performance of GPT-4V varied between question types or subspecialty. The McNemar test was used to evaluate performance differences between the prompts, with Benjamini-Hochberg adjustment of the P values conducted to control the false discovery rate (FDR). A P value threshold of less than .05 denoted statistical significance. Results GPT-4V correctly answered 246 (65.3%) of the 377 unique questions, with significantly higher accuracy on text-only questions (81.5%, 159 of 195) than on image-based questions (47.8%, 87 of 182) (χ2 test, P < .001). Subanalysis revealed differences between prompts on text-based questions, where chain-of-thought prompting outperformed long instruction by 6.1% (McNemar, P = .02; FDR = 0.063), basic prompting by 6.8% (P = .009, FDR = 0.044), and the original prompting style by 8.9% (P = .001, FDR = 0.014).
No differences were observed between prompts on image-based questions with P values of .27 to >.99. Conclusion While GPT-4V demonstrated a level of competence in text-based questions, it showed deficits interpreting radiologic images. © RSNA, 2024 See also the editorial by Deng in this issue.
Affiliation(s)
- Nolan Hayden
- From the Department of Diagnostic Radiology, Henry Ford Health, 2799 W Grand Blvd, Detroit, MI, 48202 (N.H., B.G., C.K.); Michigan State University College of Osteopathic Medicine, East Lansing, Mich (S.G.); and Department of Public Health Sciences, Henry Ford Health, Michigan State University Health Sciences, Detroit, Mich (L.M.P.)
| | - Spencer Gilbert
- From the Department of Diagnostic Radiology, Henry Ford Health, 2799 W Grand Blvd, Detroit, MI, 48202 (N.H., B.G., C.K.); Michigan State University College of Osteopathic Medicine, East Lansing, Mich (S.G.); and Department of Public Health Sciences, Henry Ford Health, Michigan State University Health Sciences, Detroit, Mich (L.M.P.)
| | - Laila M Poisson
- From the Department of Diagnostic Radiology, Henry Ford Health, 2799 W Grand Blvd, Detroit, MI, 48202 (N.H., B.G., C.K.); Michigan State University College of Osteopathic Medicine, East Lansing, Mich (S.G.); and Department of Public Health Sciences, Henry Ford Health, Michigan State University Health Sciences, Detroit, Mich (L.M.P.)
| | - Brent Griffith
- From the Department of Diagnostic Radiology, Henry Ford Health, 2799 W Grand Blvd, Detroit, MI, 48202 (N.H., B.G., C.K.); Michigan State University College of Osteopathic Medicine, East Lansing, Mich (S.G.); and Department of Public Health Sciences, Henry Ford Health, Michigan State University Health Sciences, Detroit, Mich (L.M.P.)
| | - Chad Klochko
- From the Department of Diagnostic Radiology, Henry Ford Health, 2799 W Grand Blvd, Detroit, MI, 48202 (N.H., B.G., C.K.); Michigan State University College of Osteopathic Medicine, East Lansing, Mich (S.G.); and Department of Public Health Sciences, Henry Ford Health, Michigan State University Health Sciences, Detroit, Mich (L.M.P.)
| |
|
18
|
Shang Z, Chauhan V, Devi K, Patil S. Artificial Intelligence, the Digital Surgeon: Unravelling Its Emerging Footprint in Healthcare - The Narrative Review. J Multidiscip Healthc 2024; 17:4011-4022. [PMID: 39165254 PMCID: PMC11333562 DOI: 10.2147/jmdh.s482757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2024] [Accepted: 08/09/2024] [Indexed: 08/22/2024]
Abstract
Background Artificial Intelligence (AI) holds transformative potential for the healthcare industry, offering innovative solutions for diagnosis, treatment planning, and improving patient outcomes. As AI continues to be integrated into healthcare systems, it promises advancements across various domains. This review explores the diverse applications of AI in healthcare, along with the challenges and limitations that need to be addressed. The aim is to provide a comprehensive overview of AI's impact on healthcare and to identify areas for further development and focus. Main Applications The review discusses the broad range of AI applications in healthcare. In medical imaging and diagnostics, AI enhances the accuracy and efficiency of diagnostic processes, aiding in early disease detection. AI-powered clinical decision support systems assist healthcare professionals in patient management and decision-making. Predictive analytics using AI enables the prediction of patient outcomes and identification of potential health risks. AI-driven robotic systems have revolutionized surgical procedures, improving precision and outcomes. Virtual assistants and chatbots enhance patient interaction and support, providing timely information and assistance. In the pharmaceutical industry, AI accelerates drug discovery and development by identifying potential drug candidates and predicting their efficacy. Additionally, AI improves administrative efficiency and operational workflows in healthcare, streamlining processes and reducing costs. AI-powered remote monitoring and telehealth solutions expand access to healthcare, particularly in underserved areas. Challenges and Limitations Despite the significant promise of AI in healthcare, several challenges persist. Ensuring the reliability and consistency of AI-driven outcomes is crucial. Privacy and security concerns must be navigated carefully, particularly in handling sensitive patient data. 
Ethical considerations, including bias and fairness in AI algorithms, need to be addressed to prevent unintended consequences. Overcoming these challenges is critical for the ethical and successful integration of AI in healthcare. Conclusion The integration of AI into healthcare is advancing rapidly, offering substantial benefits in improving patient care and operational efficiency. However, addressing the associated challenges is essential to fully realize the transformative potential of AI in healthcare. Future efforts should focus on enhancing the reliability, transparency, and ethical standards of AI technologies to ensure they contribute positively to global health outcomes.
Affiliation(s)
- Zifang Shang
- Guangdong Engineering Technological Research Centre of Clinical Molecular Diagnosis and Antibody Drugs, Meizhou People’s Hospital (Huangtang Hospital), Meizhou Academy of Medical Sciences, Meizhou, People’s Republic of China
| | - Varun Chauhan
- Multi-Disciplinary Research Unit, Government Institute of Medical Sciences, Greater Noida, India
| | - Kirti Devi
- Department of Medicine, Government Institute of Medical Sciences, Greater Noida, India
| | - Sandip Patil
- Department of Haematology and Oncology, Shenzhen Children’s Hospital, Shenzhen, People’s Republic of China
| |
|
19
|
Barruel D, Hilbey J, Charlet J, Chaumette B, Krebs MO, Dauriac-Le Masson V. Predicting treatment resistance in schizophrenia patients: Machine learning highlights the role of early pathophysiologic features. Schizophr Res 2024; 270:1-10. [PMID: 38823319 DOI: 10.1016/j.schres.2024.05.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 05/10/2024] [Accepted: 05/13/2024] [Indexed: 06/03/2024]
Abstract
Detecting patients with a high-risk profile for treatment-resistant schizophrenia (TRS) can be beneficial for implementing individually adapted therapeutic strategies and better understanding the TRS etiology. The aim of this study was to explore, with machine learning methods, the impact of demographic and clinical patient characteristics on TRS prediction, for already established risk factors and unexplored ones. This was a retrospective study of 500 patients admitted during 2020 to the University Hospital Group for Paris Psychiatry. We hypothesized potential TRS risk factors. The selected features were coded into structured variables in a new dataset, by processing patients' discharge summaries and medical narratives with natural-language processing methods. We compared three machine learning models (XGBoost, logistic elastic net regression, logistic regression without regularization) for predicting TRS outcome. We analysed feature impact on the models, suggesting the following factors as markers of a high-risk TRS profile: early age at first contact with psychiatry, antipsychotic treatment interruptions due to non-adherence, absence of positive symptoms at baseline, educational problems and adolescent mental disorders in the personal psychiatric history. Specifically, we found a significant association with TRS outcome for age at first contact with psychiatry and medication non-adherence. Our findings on TRS risk factors are consistent with the review of the literature and suggest potential in using early pathophysiologic features for TRS prediction. Results were encouraging with the use of natural-language processing techniques to leverage raw data provided by discharge summaries, combined with machine learning models. These findings are a promising step for helping clinicians adapt their guidelines to early detection of TRS.
Affiliation(s)
- David Barruel
- GHU Paris Psychiatrie et Neurosciences, Hôpital Sainte Anne, 1, rue Cabanis, 75014 Paris, France.
| | - Jacques Hilbey
- Sorbonne Université, Paris, France; Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, LIMICS, Paris, France
| | - Jean Charlet
- Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, LIMICS, Paris, France; Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Boris Chaumette
- GHU Paris Psychiatrie et Neurosciences, Hôpital Sainte Anne, 1, rue Cabanis, 75014 Paris, France; Université de Paris, Institute of Psychiatry and Neuroscience of Paris (IPNP), INSERM, U1266 Paris, France; Department of Psychiatry, McGill University, Montréal, QC, Canada
| | - Marie-Odile Krebs
- GHU Paris Psychiatrie et Neurosciences, Hôpital Sainte Anne, 1, rue Cabanis, 75014 Paris, France; Université de Paris, Institute of Psychiatry and Neuroscience of Paris (IPNP), INSERM, U1266 Paris, France
| | | |
|
20
|
Luo Y, Mao C, Sanchez‐Pinto LN, Ahmad FS, Naidech A, Rasmussen L, Pacheco JA, Schneider D, Mithal LB, Dresden S, Holmes K, Carson M, Shah SJ, Khan S, Clare S, Wunderink RG, Liu H, Walunas T, Cooper L, Yue F, Wehbe F, Fang D, Liebovitz DM, Markl M, Michelson KN, McColley SA, Green M, Starren J, Ackermann RT, D'Aquila RT, Adams J, Lloyd‐Jones D, Chisholm RL, Kho A. Northwestern University resource and education development initiatives to advance collaborative artificial intelligence across the learning health system. Learn Health Syst 2024; 8:e10417. [PMID: 39036530 PMCID: PMC11257059 DOI: 10.1002/lrh2.10417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 02/22/2024] [Accepted: 02/26/2024] [Indexed: 07/23/2024]
Abstract
Introduction The rapid development of artificial intelligence (AI) in healthcare has exposed the unmet need for growing a multidisciplinary workforce that can collaborate effectively in the learning health systems. Maximizing the synergy among multiple teams is critical for Collaborative AI in Healthcare. Methods We have developed a series of data, tools, and educational resources for cultivating the next generation of multidisciplinary workforce for Collaborative AI in Healthcare. We built bulk-natural language processing pipelines to extract structured information from clinical notes and stored them in common data models. We developed multimodal AI/machine learning (ML) tools and tutorials to enrich the toolbox of the multidisciplinary workforce to analyze multimodal healthcare data. We have created a fertile ground to cross-pollinate clinicians and AI scientists and train the next generation of AI health workforce to collaborate effectively. Results Our work has democratized access to unstructured health information, AI/ML tools and resources for healthcare, and collaborative education resources. From 2017 to 2022, this has enabled studies in multiple clinical specialties resulting in 68 peer-reviewed publications. In 2022, our cross-discipline efforts converged and were institutionalized into the Center for Collaborative AI in Healthcare. Conclusions Our Collaborative AI in Healthcare initiatives have created valuable educational and practical resources. They have enabled more clinicians, scientists, and hospital administrators to successfully apply AI methods in their daily research and practice, develop closer collaborations, and advance the institution-level learning health system.
Affiliation(s)
- Yuan Luo
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Chengsheng Mao
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Lazaro N. Sanchez‐Pinto
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Division of Critical Care, Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Stanley Manne Children's Research Institute, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, Illinois, USA
| | - Faraz S. Ahmad
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Division of Cardiology, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Andrew Naidech
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Division of Neurocritical Care, Department of Neurology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Luke Rasmussen
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Jennifer A. Pacheco
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Daniel Schneider
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
| | - Leena B. Mithal
- Stanley Manne Children's Research Institute, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, Illinois, USA
- Division of Infectious Diseases, Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Scott Dresden
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Department of Emergency Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Kristi Holmes
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Galter Health Sciences Library, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Matthew Carson
- Galter Health Sciences Library, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Sanjiv J. Shah
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Division of Cardiology, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Seema Khan
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Department of Surgery, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Susan Clare
- Department of Surgery, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Richard G. Wunderink
- Division of Critical Care, Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Pulmonary and Critical Care Division, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Huiping Liu
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Department of Pharmacology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Division of Hematology and Oncology, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Theresa Walunas
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Division of General Internal Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Center for Health Information Partnerships, Institute for Public Health and Medicine, Northwestern University, Chicago, Illinois, USA
- Department of Microbiology‐Immunology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Lee Cooper
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Feng Yue
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Department of Biochemistry and Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Firas Wehbe
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Department of Surgery, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Deyu Fang
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - David M. Liebovitz
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Division of General Internal Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Center for Health Information Partnerships, Institute for Public Health and Medicine, Northwestern University, Chicago, Illinois, USA
| | - Michael Markl
- Department of Radiology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Kelly N. Michelson
- Division of Critical Care, Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Stanley Manne Children's Research Institute, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, Illinois, USA
- Center for Bioethics and Medical Humanities, Institute for Public Health and Medicine, Northwestern University, Chicago, Illinois, USA
| | - Susanna A. McColley
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Stanley Manne Children's Research Institute, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, Illinois, USA
- Division of Pulmonary and Sleep Medicine, Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Marianne Green
- Division of General Internal Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Justin Starren
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Ronald T. Ackermann
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Division of General Internal Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Institute for Public Health and Medicine, Northwestern University, Chicago, Illinois, USA
| | - Richard T. D'Aquila
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Division of Infectious Diseases, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - James Adams
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Department of Emergency Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Donald Lloyd‐Jones
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Division of Epidemiology, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Rex L. Chisholm
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Department of Surgery, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Center for Health Information Partnerships, Institute for Public Health and Medicine, Northwestern University, Chicago, Illinois, USA
| | - Abel Kho
- Northwestern University Clinical and Translational Sciences Institute, Chicago, Illinois, USA
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, Illinois, USA
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Division of General Internal Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Center for Health Information Partnerships, Institute for Public Health and Medicine, Northwestern University, Chicago, Illinois, USA
| |
|
21
|
Medhi D, Kamidi SR, Mamatha Sree KP, Shaikh S, Rasheed S, Thengu Murichathil AH, Nazir Z. Artificial Intelligence and Its Role in Diagnosing Heart Failure: A Narrative Review. Cureus 2024; 16:e59661. [PMID: 38836155 PMCID: PMC11148729 DOI: 10.7759/cureus.59661] [Accepted: 05/04/2024] [Indexed: 06/06/2024]
Abstract
Heart failure (HF) is prevalent globally. It is a dynamic disease with varying definitions and classifications due to multiple pathophysiologies and etiologies. As a result, the diagnosis, clinical staging, and treatment of HF become complex and subjective, impacting patient prognosis and mortality. Technological advancements such as artificial intelligence (AI) have played a significant role in medicine and are increasingly used in cardiovascular medicine to transform drug discovery, clinical care, risk prediction, diagnosis, and treatment. Medical and surgical interventions specific to HF patients rely significantly on early identification of HF. Hospitalization and treatment costs for HF are high, with readmissions increasing the burden. AI can help improve diagnostic accuracy by recognizing patterns and using them in multiple areas of HF management. AI has shown promise in offering early detection and precise diagnoses with the help of ECG analysis, advanced cardiac imaging, biomarkers, and cardiopulmonary stress testing. However, its challenges include data access, model interpretability, ethical concerns, and generalizability across diverse populations. Despite these challenges, ongoing efforts to refine AI models suggest a promising future for HF diagnosis. We searched PubMed, Google Scholar, and the Cochrane Library and, after applying inclusion and exclusion criteria, found 150 relevant papers. This review focuses on AI's significant contribution to HF diagnosis in recent years, which has drastically altered HF treatment and outcomes.
Affiliation(s)
- Diptiman Medhi
- Internal Medicine, Gauhati Medical College and Hospital, Guwahati, Guwahati, IND
| | | | | | - Shifa Shaikh
- Cardiology, SMBT Institute of Medical Sciences and Research Centre, Igatpuri, IND
| | - Shanida Rasheed
- Emergency Medicine, East Sussex Healthcare NHS Trust, Eastbourne, GBR
| | | | - Zahra Nazir
- Internal Medicine, Combined Military Hospital, Quetta, Quetta, PAK
| |
|
22
|
Mashima Y, Tanigawa M, Yokoi H. Information heterogeneity between progress notes by physicians and nurses for inpatients with digestive system diseases. Sci Rep 2024; 14:7656. [PMID: 38561333 PMCID: PMC10984979 DOI: 10.1038/s41598-024-56324-7] [Received: 06/08/2023] [Accepted: 03/05/2024] [Indexed: 04/04/2024]
Abstract
This study focused on the heterogeneity in progress notes written by physicians or nurses. A total of 806 days of progress notes written by physicians or nurses from 83 randomly selected patients hospitalized in the Gastroenterology Department at Kagawa University Hospital from January to December 2021 were analyzed. We extracted symptoms as the International Classification of Diseases (ICD) Chapter 18 (R00-R99, hereinafter R codes) from each progress note using MedNER-J natural language processing software and counted the days one or more symptoms were extracted to calculate the extraction rate. The R-code extraction rate was significantly higher from progress notes by nurses than by physicians (physicians 68.5% vs. nurses 75.2%; p = 0.00112), regardless of specialty. By contrast, the R-code subcategory R10-R19 for digestive system symptoms (44.2 vs. 37.5%, respectively; p = 0.00299) and many chapters of ICD codes for disease names, as represented by Chapter 11 K00-K93 (68.4 vs. 30.9%, respectively; p < 0.001), were frequently extracted from the progress notes by physicians, reflecting their specialty. We believe that understanding the information heterogeneity of medical documents, which can be the basis of medical artificial intelligence, is crucial, and this study is a pioneering step in that direction.
Affiliation(s)
- Yukinori Mashima
- Clinical Research Support Center, Kagawa University Hospital, 1750-1 Ikenobe, Miki-cho, Kita-gun, Kagawa, 761-0793, Japan.
- Department of Medical Informatics, Faculty of Medicine, Kagawa University, Kagawa, Japan.
| | - Masatoshi Tanigawa
- Clinical Research Support Center, Kagawa University Hospital, 1750-1 Ikenobe, Miki-cho, Kita-gun, Kagawa, 761-0793, Japan
| | - Hideto Yokoi
- Clinical Research Support Center, Kagawa University Hospital, 1750-1 Ikenobe, Miki-cho, Kita-gun, Kagawa, 761-0793, Japan
- Department of Medical Informatics, Faculty of Medicine, Kagawa University, Kagawa, Japan
| |
|
23
|
Deng Y, Pacheco JA, Ghosh A, Chung A, Mao C, Smith JC, Zhao J, Wei WQ, Barnado A, Dorn C, Weng C, Liu C, Cordon A, Yu J, Tedla Y, Kho A, Ramsey-Goldman R, Walunas T, Luo Y. Natural language processing to identify lupus nephritis phenotype in electronic health records. BMC Med Inform Decis Mak 2024; 22:348. [PMID: 38433189 PMCID: PMC10910523 DOI: 10.1186/s12911-024-02420-7] [Received: 04/09/2021] [Accepted: 01/09/2024] [Indexed: 03/05/2024]
Abstract
BACKGROUND Systemic lupus erythematosus (SLE) is a rare autoimmune disorder characterized by an unpredictable course of flares and remission with diverse manifestations. Lupus nephritis, one of the major disease manifestations of SLE for organ damage and mortality, is a key component of lupus classification criteria. Accurately identifying lupus nephritis in electronic health records (EHRs) would therefore benefit large cohort observational studies and clinical trials where characterization of the patient population is critical for recruitment, study design, and analysis. Lupus nephritis can be recognized through procedure codes and structured data, such as laboratory tests. However, other critical information documenting lupus nephritis, such as histologic reports from kidney biopsies and prior medical history narratives, requires sophisticated text processing to mine information from pathology reports and clinical notes. In this study, we developed algorithms to identify lupus nephritis with and without natural language processing (NLP) using EHR data from the Northwestern Medicine Enterprise Data Warehouse (NMEDW). METHODS We developed five algorithms: a rule-based algorithm using only structured data (baseline algorithm) and four algorithms using different NLP models. The first NLP model applied simple regular expressions for keyword search combined with structured data. The other three NLP models were based on regularized logistic regression and used different sets of features: positive mentions of concept unique identifiers (CUIs), the number of appearances of CUIs, and a mixture of three components (i.e., a curated list of CUIs, regular expression concepts, and structured data), respectively. The baseline algorithm and the best performing NLP algorithm were externally validated on a dataset from Vanderbilt University Medical Center (VUMC).
RESULTS Our best performing NLP model incorporated features from structured data, regular expression concepts, and mapped concept unique identifiers (CUIs) and showed an improved F-measure compared to the baseline lupus nephritis algorithm in both the NMEDW (baseline 0.41 vs NLP 0.79) and VUMC (baseline 0.52 vs NLP 0.93) datasets. CONCLUSION Our NLP MetaMap mixed model improved the F-measure greatly compared to the structured-data-only algorithm in both internal and external validation datasets. The NLP algorithms can serve as powerful tools to accurately identify the lupus nephritis phenotype in EHRs for clinical research and better targeted therapies.
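As a rough illustration of the simplest NLP variant described above (keyword search by regular expression combined with structured data) and of the F-measure used to compare algorithms, a minimal Python sketch follows. The keyword pattern, the function names, and the choice to combine the two evidence sources with a logical OR are all assumptions made for illustration; the abstract does not specify the study's actual expressions or combination rule.

```python
import re

# Hypothetical keyword pattern; the study's real expressions are not given in the abstract.
KEYWORDS = re.compile(
    r"\blupus nephritis\b|\bclass (?:III|IV|V) (?:lupus )?nephritis\b",
    re.IGNORECASE,
)


def rule_based_flag(note_text: str, has_structured_evidence: bool) -> bool:
    """Flag a patient when a keyword matches or structured evidence exists.

    Combining the two sources with OR is an assumption, and this sketch
    does no negation handling (e.g. "no lupus nephritis" would still match).
    """
    return bool(KEYWORDS.search(note_text)) or has_structured_evidence


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Counting true positives, false positives, and false negatives against chart-reviewed labels and passing them to `f_measure` would give a score comparable in kind to the 0.41 to 0.93 values reported, though the exact evaluation protocol is not described here.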
Affiliation(s)
- Yu Deng
- Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, USA
| | - Jennifer A Pacheco
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, USA
| | - Anika Ghosh
- Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, USA
| | - Anh Chung
- Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, USA
- Department of Medicine/Rheumatology, Feinberg School of Medicine, Northwestern University, Chicago, USA
| | - Chengsheng Mao
- Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, USA
| | - Joshua C Smith
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, USA
| | - Juan Zhao
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, USA
| | - Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, USA
| | - April Barnado
- Department of Medicine, Vanderbilt University Medical Center, Nashville, USA
| | - Chad Dorn
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, USA
| | - Chunhua Weng
- Department of Biomedical Informatics, Columbia University, New York City, USA
| | - Cong Liu
- Department of Biomedical Informatics, Columbia University, New York City, USA
| | - Adam Cordon
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, USA
| | - Jingzhi Yu
- Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, USA
| | - Yacob Tedla
- Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, USA
| | - Abel Kho
- Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, USA
| | - Rosalind Ramsey-Goldman
- Department of Medicine/Rheumatology, Feinberg School of Medicine, Northwestern University, Chicago, USA
| | - Theresa Walunas
- Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, USA.
| | - Yuan Luo
- Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, Chicago, USA.
| |
|
24
|
Farzan R. Artificial intelligence in Immuno-genetics. Bioinformation 2024; 20:29-35. [PMID: 38352901 PMCID: PMC10859949 DOI: 10.6026/973206300200029] [Received: 01/01/2024] [Revised: 01/31/2024] [Accepted: 01/31/2024] [Indexed: 02/16/2024]
Abstract
Rapid advancements in the field of artificial intelligence (AI) have opened up unprecedented opportunities to revolutionize various scientific domains, including immunology and genetics. Therefore, it is of interest to explore the emerging applications of AI in immunology and genetics, with the objective of enhancing our understanding of the dynamic intricacies of the immune system, disease etiology, and genetic variations. Accordingly, this article reviews the use of AI methodologies on immunological and genetic datasets and how they facilitate the development of innovative approaches to diagnosis, treatment, and personalized medicine.
Affiliation(s)
- Raed Farzan
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Saud University, Riyadh - 11433, Saudi Arabia
- Center of Excellence in Biotechnology Research, King Saud University, Riyadh - 11433, Saudi Arabia
- Medical and Molecular Genetics Research, King Saud University, Riyadh-11433, Saudi Arabia
| |
|
25
|
Yu J, Yang X, Deng Y, Krefman AE, Pool LR, Zhao L, Mi X, Ning H, Wilkins J, Lloyd-Jones DM, Petito LC, Allen NB. Incorporating longitudinal history of risk factors into atherosclerotic cardiovascular disease risk prediction using deep learning. Sci Rep 2024; 14:2554. [PMID: 38296982 PMCID: PMC10830564 DOI: 10.1038/s41598-024-51685-5] [Received: 10/02/2023] [Accepted: 01/08/2024] [Indexed: 02/02/2024]
Abstract
It is increasingly clear that longitudinal risk factor levels and trajectories are related to risk for atherosclerotic cardiovascular disease (ASCVD) above and beyond single measures. Currently used in clinical care, the Pooled Cohort Equations (PCE) are based on regression methods that predict ASCVD risk from cross-sectional risk factor levels. Deep learning (DL) models have been developed to incorporate longitudinal data for risk prediction, but their benefit for ASCVD risk prediction relative to the traditional PCE remains unknown. Our study included 15,565 participants from four cardiovascular disease cohorts who were free of baseline ASCVD and were followed for adjudicated ASCVD. Ten-year ASCVD risk was calculated in the training set using our benchmark, the PCE, and a longitudinal DL model, Dynamic-DeepHit. Predictors included those incorporated in the PCE: sex, race, age, total cholesterol, high-density lipoprotein cholesterol, systolic and diastolic blood pressure, diabetes, hypertension treatment, and smoking. The discrimination and calibration performance of the two models was evaluated in an overall hold-out testing dataset. Of the 15,565 participants in our dataset, 2170 (13.9%) developed ASCVD. The performance of the longitudinal DL model that incorporated 8 years of longitudinal risk factor data improved upon that of the PCE [AUROC: 0.815 (CI 0.782-0.844) vs 0.792 (CI 0.760-0.825)], and the net reclassification index was 0.385. The Brier score for the DL model was 0.0514 compared with 0.0542 for the PCE. Incorporating longitudinal risk factors in ASCVD risk prediction using DL can improve model discrimination and calibration.
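For readers unfamiliar with the Brier score reported above, the following Python sketch shows the basic definition: the mean squared difference between predicted probabilities and observed binary outcomes, where lower values indicate better predictions. This is the plain, uncensored form; the study may well have used a time-dependent variant suited to survival data, so the function below is an illustrative assumption and not the authors' evaluation code.

```python
from typing import Sequence


def brier_score(predicted: Sequence[float], observed: Sequence[int]) -> float:
    """Mean squared error between predicted risks and 0/1 outcomes.

    Plain (uncensored) Brier score; censoring-aware variants used in
    survival analysis are not implemented in this sketch.
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted, observed)]
    return sum(squared_errors) / len(squared_errors)
```

A perfect forecaster scores 0, and a forecaster that always predicts 0.5 scores 0.25, which is one way to put the reported values of 0.0514 and 0.0542 in context.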
Affiliation(s)
- Jingzhi Yu
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Xiaoyun Yang
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Yu Deng
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Amy E Krefman
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Lindsay R Pool
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Lihui Zhao
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Xinlei Mi
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Hongyan Ning
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - John Wilkins
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Donald M Lloyd-Jones
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Lucia C Petito
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Norrina B Allen
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
| |
|
26
|
Lyu T, Liang C. Computational Phenotyping of OMOP CDM Normalized EHR for Prenatal and Postpartum Episodes: An Informatics Framework and Clinical Implementation on All of Us. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2024; 2023:1096-1104. [PMID: 38222375 PMCID: PMC10785883] [Indexed: 01/16/2024]
Abstract
The use of Electronic Health Records (EHR) in pregnancy care and obstetrics-gynecology (OB/GYN) research has increased in recent years. In pregnancy, timing is important because clinical characteristics, risks, and patient management are different in each stage of pregnancy. However, the difficulty of accurately differentiating pregnancy episodes and temporal information of clinical events presents unique challenges for EHR phenotyping. In this work, we introduced the concept of time relativity and proposed a comprehensive framework of computational phenotyping for prenatal and postpartum episodes based on the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). We implemented it on the All of Us national EHR database and identified 6,280 pregnancies with accurate start and end dates among 5,399 female patients. With the ability to identify different episodes in pregnancy care, this framework provides new opportunities for phenotyping complex clinical events and gestational morbidities for pregnant women, thus improving maternal and infant health.
Affiliation(s)
- Tianchu Lyu
- University of South Carolina, Columbia, South Carolina, USA
| | - Chen Liang
- University of South Carolina, Columbia, South Carolina, USA
- National Institutes of Health, Bethesda, Maryland, USA
| |
|
27
|
Zhang Y, Wang M, Zhang E, Wu Y. Artificial Intelligence in the Screening, Diagnosis, and Management of Aortic Stenosis. Rev Cardiovasc Med 2024; 25:31. [PMID: 39077660 PMCID: PMC11262349 DOI: 10.31083/j.rcm2501031] [Received: 07/31/2023] [Revised: 08/30/2023] [Accepted: 09/13/2023] [Indexed: 07/31/2024]
Abstract
The integration of artificial intelligence (AI) into clinical management of aortic stenosis (AS) has redefined our approach to the assessment and management of this heterogenous valvular heart disease (VHD). While the large-scale early detection of valvular conditions is limited by socioeconomic constraints, AI offers a cost-effective alternative solution for screening by utilizing conventional tools, including electrocardiograms and community-level auscultations, thereby facilitating early detection, prevention, and treatment of AS. Furthermore, AI sheds light on the varied nature of AS, once considered a uniform condition, allowing for more nuanced, data-driven risk assessments and treatment plans. This presents an opportunity to re-evaluate the complexity of AS and to refine treatment using data-driven risk stratification beyond traditional guidelines. AI can be used to support treatment decisions including device selection, procedural techniques, and follow-up surveillance of transcatheter aortic valve replacement (TAVR) in a reproducible manner. While recognizing notable AI achievements, it is important to remember that AI applications in AS still require collaboration with human expertise due to potential limitations such as its susceptibility to bias, and the critical nature of healthcare. This synergy underpins our optimistic view of AI's promising role in the AS clinical pathway.
Affiliation(s)
- Yuxuan Zhang
- Department of Cardiology, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, 100037 Beijing, China
- Center for Structural Heart Diseases, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, 100037 Beijing, China
| | - Moyang Wang
- Department of Cardiology, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, 100037 Beijing, China
- Center for Structural Heart Diseases, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, 100037 Beijing, China
| | - Erli Zhang
- Department of Cardiology, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, 100037 Beijing, China
- Center for Structural Heart Diseases, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, 100037 Beijing, China
| | - Yongjian Wu
- Department of Cardiology, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, 100037 Beijing, China
- Center for Structural Heart Diseases, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, 100037 Beijing, China
| |
Collapse
|
28
|
Chafjiri FMA, Reece L, Voke L, Landschaft A, Clark J, Kimia AA, Loddenkemper T. Natural language processing for identification of refractory status epilepticus in children. Epilepsia 2023; 64:3227-3237. [PMID: 37804085 DOI: 10.1111/epi.17789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2023] [Revised: 10/03/2023] [Accepted: 10/03/2023] [Indexed: 10/08/2023]
Abstract
OBJECTIVE Pediatric status epilepticus is one of the most frequent pediatric emergencies, with high mortality and morbidity. Utilizing electronic health records (EHRs) permits analysis of care approaches and disease outcomes at a lower cost than prospective research. However, reviewing EHR manually is time intensive. We aimed to compare refractory status epilepticus (rSE) cases identified by human EHR review with a natural language processing (NLP)-assisted rSE screen followed by a manual review. METHODS We used the NLP screening tool Document Review Tool (DrT) to generate regular expressions, trained a bag-of-words NLP classifier on EHRs from 2017 to 2019, and then tested our algorithm on data from February to December 2012. We compared results from manual review to NLP-assisted search followed by manual review. RESULTS Our algorithm identified 1528 notes in the test set. After removing notes pertaining to the same event by DrT, the user reviewed a total number of 400 notes to find patients with rSE. Within these 400 notes, we identified 31 rSE cases, including 12 new cases not found in manual review, and 19 of the 20 previously identified cases. The NLP-assisted model found 31 of 32 cases, with a sensitivity of 96.88% (95% CI = 82%-99.84%), whereas manual review identified 20 of 32 cases, with a sensitivity of 62.5% (95% CI = 43.75%-78.34%). SIGNIFICANCE DrT provided a highly sensitive model compared to human review and an increase in patient identification through EHRs. The use of DrT is a suitable application of NLP for identifying patients with a history of recent rSE, which ultimately contributes to the implementation of monitoring techniques and treatments in near real time.
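The sensitivity figures in this abstract follow directly from the reported case counts. A minimal Python sketch of that arithmetic, assuming the 32 cases found by either method serve as the reference total; the function name `sensitivity` is illustrative and not taken from the study:

```python
def sensitivity(true_positives: int, total_cases: int) -> float:
    """Fraction of reference cases that a review method identified."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return true_positives / total_cases


# Counts taken from the abstract: 32 rSE cases in total across both methods.
nlp_assisted = sensitivity(31, 32)  # NLP-assisted screen followed by manual review
manual_only = sensitivity(20, 32)   # manual EHR review alone

print(f"NLP-assisted: {nlp_assisted:.2%}")
print(f"Manual only:  {manual_only:.2%}")
```

The confidence intervals quoted in the abstract are not reproduced here, because the abstract does not state which interval method was used.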
Affiliation(s)
- Fatemeh Mohammad Alizadeh Chafjiri
  - Department of Neurology, Division of Epilepsy and Clinical Neurophysiology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Latania Reece
  - Department of Neurology, Division of Epilepsy and Clinical Neurophysiology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
  - Nexamp, Boston, Massachusetts, USA
- Lillian Voke
  - Department of Neurology, Division of Epilepsy and Clinical Neurophysiology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Justice Clark
  - Department of Neurology, Division of Epilepsy and Clinical Neurophysiology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Amir A Kimia
  - Department of Medicine, Division of Emergency Medicine, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
  - Connecticut Children's Hospital, Hartford, Connecticut, USA
- Tobias Loddenkemper
  - Department of Neurology, Division of Epilepsy and Clinical Neurophysiology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA

29
Alsentzer E, Rasmussen MJ, Fontoura R, Cull AL, Beaulieu-Jones B, Gray KJ, Bates DW, Kovacheva VP. Zero-shot interpretable phenotyping of postpartum hemorrhage using large language models. NPJ Digit Med 2023; 6:212. [PMID: 38036723 PMCID: PMC10689487 DOI: 10.1038/s41746-023-00957-x] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 06/02/2023] [Accepted: 11/01/2023] [Indexed: 12/02/2023] Open
Abstract
Many areas of medicine would benefit from deeper, more accurate phenotyping, but there are limited approaches for phenotyping using clinical notes without substantial annotated data. Large language models (LLMs) have demonstrated immense potential to adapt to novel tasks with no additional training by specifying task-specific instructions. Here we report the performance of a publicly available LLM, Flan-T5, in phenotyping patients with postpartum hemorrhage (PPH) using discharge notes from electronic health records (n = 271,081). The language model achieves strong performance in extracting 24 granular concepts associated with PPH. Identifying these granular concepts accurately allows the development of interpretable, complex phenotypes and subtypes. The Flan-T5 model achieves high fidelity in phenotyping PPH (positive predictive value of 0.95), identifying 47% more patients with this complication compared to the current standard of using claims codes. This LLM pipeline can be used reliably for subtyping PPH and outperforms a claims-based approach on the three most common PPH subtypes associated with uterine atony, abnormal placentation, and obstetric trauma. The advantage of this approach to subtyping is its interpretability, as each concept contributing to the subtype determination can be evaluated. Moreover, as definitions may change over time due to new guidelines, using granular concepts to create complex phenotypes enables prompt and efficient updating of the algorithm. Using this language modelling approach enables rapid phenotyping without the need for any manually annotated training data across multiple clinical use cases.
Affiliation(s)
- Emily Alsentzer
  - Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
- Matthew J Rasmussen
  - Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Romy Fontoura
  - Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Alexis L Cull
  - Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Brett Beaulieu-Jones
  - Section of Biomedical Data Science, Department of Medicine, University of Chicago, Chicago, IL, USA
- Kathryn J Gray
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Division of Maternal-Fetal Medicine, Brigham and Women's Hospital, Boston, MA, USA
- David W Bates
  - Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
  - Department of Health Care Policy and Management, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Vesela P Kovacheva
  - Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women's Hospital, Boston, MA, USA.

30
Hua Y, Wang L, Nguyen V, Rieu-Werden M, McDowell A, Bates DW, Foer D, Zhou L. A deep learning approach for transgender and gender diverse patient identification in electronic health records. J Biomed Inform 2023; 147:104507. [PMID: 37778672 PMCID: PMC10687838 DOI: 10.1016/j.jbi.2023.104507] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/05/2023] [Revised: 09/18/2023] [Accepted: 09/22/2023] [Indexed: 10/03/2023]
Abstract
BACKGROUND Although accurate identification of gender identity in the electronic health record (EHR) is crucial for providing equitable health care, particularly for transgender and gender diverse (TGD) populations, it remains a challenging task due to incomplete gender information in structured EHR fields. OBJECTIVE Using TGD identification as a case study, this research uses NLP and deep learning to build an accurate patient gender identity predictive model, aiming to tackle the challenges of identifying relevant patient-level information from EHR data and reducing annotation work. METHODS This study included adult patients in a large healthcare system in Boston, MA, between 4/1/2017 and 4/1/2022. To identify relevant information from massive clinical notes, we compiled a list of gender-related keywords through expert curation, literature review, and expansion via a fine-tuned BioWordVec model. This keyword list was used to pre-screen potential TGD individuals and create two datasets for model training, testing, and validation. Dataset I was a balanced dataset that contained clinician-confirmed TGD patients and cases without keywords. Dataset II contained cases with keywords. The performance of the deep learning model was compared to traditional machine learning and rule-based algorithms. RESULTS The final keyword list consists of 109 keywords, of which 58 (53.2%) were expanded by the BioWordVec model. Dataset I contained 3,150 patients (50% TGD) while Dataset II contained 200 patients (90% TGD). On Dataset I the deep learning model achieved an F1 score of 0.917, a sensitivity of 0.854, and a precision of 0.980; on Dataset II it achieved an F1 score of 0.969, a sensitivity of 0.967, and a precision of 0.972. The deep learning model significantly outperformed rule-based algorithms. CONCLUSION This is the first study to show that deep learning-integrated NLP algorithms can accurately identify gender identity using EHR data. Future work should leverage and evaluate additional diverse data sources to generate more generalizable algorithms.
Affiliation(s)
- Yining Hua
  - Division of General Internal Medicine and Primary Care, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Department of Epidemiology, Harvard T.H Chan School of Public Health, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
- Liqin Wang
  - Division of General Internal Medicine and Primary Care, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Vi Nguyen
  - Division of General Internal Medicine and Primary Care, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Meghan Rieu-Werden
  - Division of General Medicine, Massachusetts General Hospital, Boston, MA, USA.
- Alex McDowell
  - Health Policy Research Institute, Mongan Institute, Massachusetts General Hospital, Boston, MA, USA; Department of Health Care Policy, Harvard Medical School, Boston, MA, USA.
- David W Bates
  - Division of General Internal Medicine and Primary Care, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Dinah Foer
  - Division of General Internal Medicine and Primary Care, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Division of Allergy and Clinical Immunology, Department of Medicine, Brigham and Women's Hospital, USA.
- Li Zhou
  - Division of General Internal Medicine and Primary Care, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.

31
Yu J, Yang X, Deng Y, Krefman AE, Pool LR, Zhao L, Mi X, Ning H, Wilkins J, Lloyd-Jones DM, Petito LC, Allen NB. Incorporating longitudinal history of risk factors into atherosclerotic cardiovascular disease risk prediction using deep learning. RESEARCH SQUARE 2023:rs.3.rs-3405388. [PMID: 37886463 PMCID: PMC10602136 DOI: 10.21203/rs.3.rs-3405388/v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/28/2023]
Abstract
Background It is increasingly clear that longitudinal risk factor levels and trajectories are related to risk for atherosclerotic cardiovascular disease (ASCVD) above and beyond single measures. Currently used in clinical care, the Pooled Cohort Equations (PCE) are based on regression methods that predict ASCVD risk based on cross-sectional risk factor levels. Deep learning (DL) models have been developed to incorporate longitudinal data for risk prediction, but their benefit for ASCVD risk prediction relative to the traditional PCE remains unknown. Objective To develop an ASCVD risk prediction model that incorporates longitudinal risk factors using deep learning. Methods Our study included 15,565 participants from four cardiovascular disease cohorts free of baseline ASCVD who were followed for adjudicated ASCVD. Ten-year ASCVD risk was calculated in the training set using our benchmark, the PCE, and a longitudinal DL model, Dynamic-DeepHit. Predictors included those incorporated in the PCE: sex, race, age, total cholesterol, high-density lipoprotein cholesterol, systolic and diastolic blood pressure, diabetes, hypertension treatment and smoking. The discrimination and calibration performance of the two models was evaluated in an overall hold-out testing dataset. Results Of the 15,565 participants in our dataset, 2,170 (13.9%) developed ASCVD. The performance of the longitudinal DL model that incorporated 8 years of longitudinal risk factor data improved upon that of the PCE [AUROC: 0.815 (CI: 0.782-0.844) vs 0.792 (CI: 0.760-0.825)] and the net reclassification index was 0.385. The Brier score for the DL model was 0.0514 compared with 0.0542 in the PCE. Conclusion Incorporating longitudinal risk factors in ASCVD risk prediction using DL can improve model discrimination and calibration.
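The Brier score cited in this abstract is the mean squared difference between a predicted risk and the observed 0/1 outcome, so lower values are better. A minimal Python sketch of the calculation, using toy numbers that are purely illustrative and are not data from the study:

```python
def brier_score(predicted: list[float], observed: list[int]) -> float:
    """Mean squared difference between predicted risk and observed 0/1 outcome."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted, observed)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical 10-year risk estimates and outcomes for four participants.
risks = [0.10, 0.80, 0.30, 0.05]
events = [0, 1, 0, 0]

score = brier_score(risks, events)
print(f"Brier score: {score:.4f}")
```

A model that always predicts the observed outcome scores 0, which is why the smaller value reported for the DL model indicates the better fit.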

32
Liang J, He Y, Xie J, Fan X, Liu Y, Wen Q, Shen D, Xu J, Gu S, Lei J. Mining electronic health records using artificial intelligence: Bibliometric and content analyses for current research status and product conversion. J Biomed Inform 2023; 146:104480. [PMID: 37657713 DOI: 10.1016/j.jbi.2023.104480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Revised: 07/16/2023] [Accepted: 08/27/2023] [Indexed: 09/03/2023]
Abstract
BACKGROUND The use of Electronic Health Records is the most important milestone in the digitization and intelligence of the entire medical industry. AI can effectively mine the immense medical information contained in EHRs, potentially assisting doctors in reducing many medical errors. OBJECTIVE This article aims to summarize the research status and trends in using AI to mine medical information from EHRs for the past thirteen years and investigate its information application. METHODS A systematic search was carried out in 5 databases, including Web of Science Core Collection and PubMed, to identify research using AI to mine medical information from EHRs for the past thirteen years. Furthermore, bibliometric and content analyses were used to explore the research hotspots and trends, and systematically analyze the conversion rate of research resources in this field. RESULTS A total of 631 articles were included and analyzed. The number of published articles has increased rapidly after 2017, with an average annual growth rate of 55.73%. The US (41.68%) and China (19.65%) publish the most articles, but there is a lack of international cooperation. The extraction of disease lesions is a hot topic at present, and the research topic is gradually shifting from disease risk grading to disease risk prediction. Classification (66%) and regression (15%) are the main implemented AI tasks. For AI algorithms, deep learning (31.70%), the decision tree algorithm family (26.47%), and the regression algorithm family (17.43%) are used most frequently. The funding rate for publications is 69.26%, and the input-output conversion rate is 21.05%. CONCLUSIONS Over the past decade, the use of AI to mine medical information from EHRs has been developing rapidly. However, it is necessary to strengthen international cooperation, improve EHR data availability, focus on interpretable AI algorithms, and improve the resource conversion rate in future research.
Affiliation(s)
- Jun Liang
  - IT Center, Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang Province, China; Center for Health Policy Studies, School of Public Health, Zhejiang University, Hangzhou, Zhejiang Province, China; Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, School of Medicine, Zhejiang University, Hangzhou, Zhejiang Province, China; School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Hangzhou, Zhejiang Province, China
- Yunfan He
  - Center for Health Policy Studies, School of Public Health, Zhejiang University, Hangzhou, Zhejiang Province, China
- Jun Xie
  - Information Technology Center, West China Hospital of Sichuan University/Engineering Research Center of Medical Information Technology, Ministry of Education, Chengdu, Sichuan Province, China
- Xianming Fan
  - Department of Respiratory and Critical Care Medicine, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Yiqi Liu
  - Department of Infectious Disease, Center for Liver Disease, Peking University First Hospital, Beijing, China
- Qinglian Wen
  - Department of Oncology, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Dongxia Shen
  - Editorial Department of Journal of Practical Oncology, Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang Province, China
- Jie Xu
  - IT Center, Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang Province, China
- Shuo Gu
  - Hainan Provincial Center for Neurological Diseases, Department of Pediatric Neurosurgery of The First Affiliated Hospital, Hainan Medical University, Haikou, Hainan Province, China.
- Jianbo Lei
  - Clinical Research Center, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China; School of Medical Information and Engineering, SouthWest Medical University, Luzhou, Sichuan Province, China; Institute of Medical Technology, Health Science Center, Peking University, Beijing, China.

33
Sebro RA, Kahn CE. Automated detection of causal relationships among diseases and imaging findings in textual radiology reports. J Am Med Inform Assoc 2023; 30:1701-1706. [PMID: 37381076 PMCID: PMC10531499 DOI: 10.1093/jamia/ocad119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Revised: 06/10/2023] [Accepted: 06/16/2023] [Indexed: 06/30/2023] Open
Abstract
OBJECTIVE Textual radiology reports contain a wealth of information that may help understand associations among diseases and imaging observations. This study evaluated the ability to detect causal associations among diseases and imaging findings from their co-occurrence in radiology reports. MATERIALS AND METHODS This IRB-approved and HIPAA-compliant study analyzed 1 702 462 consecutive reports of 1 396 293 patients; patient consent was waived. Reports were analyzed for positive mention of 16 839 entities (disorders and imaging findings) of the Radiology Gamuts Ontology (RGO). Entities that occurred in fewer than 25 patients were excluded. A Bayesian network structure-learning algorithm was applied at P < 0.05 threshold: edges were evaluated as possible causal relationships. RGO and/or physician consensus served as ground truth. RESULTS 2742 of 16 839 RGO entities were included, 53 849 patients (3.9%) had at least one included entity. The algorithm identified 725 pairs of entities as causally related; 634 were confirmed by reference to RGO or physician review (87% precision). As shown by its positive likelihood ratio, the algorithm increased detection of causally associated entities 6876-fold. DISCUSSION Causal relationships among diseases and imaging findings can be detected with high precision from textual radiology reports. CONCLUSION This approach finds causal relationships among diseases and imaging findings with high precision from textual radiology reports, despite the fact that causally related entities represent only 0.039% of all pairs of entities. Applying this approach to larger report text corpora may help detect unspecified or heretofore unrecognized associations.
Affiliation(s)
- Ronnie A Sebro
  - Department of Radiology, Department of Orthopedic Surgery, and Center for Augmented Intelligence, Mayo Clinic, Jacksonville, Florida, USA
- Charles E Kahn
  - Department of Radiology and Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA

34
Solarte-Pabón O, Montenegro O, García-Barragán A, Torrente M, Provencio M, Menasalvas E, Robles V. Transformers for extracting breast cancer information from Spanish clinical narratives. Artif Intell Med 2023; 143:102625. [PMID: 37673566 DOI: 10.1016/j.artmed.2023.102625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 05/11/2023] [Accepted: 07/08/2023] [Indexed: 09/08/2023]
Abstract
The wide adoption of electronic health records (EHRs) offers immense potential as a source of support for clinical research. However, previous studies focused on extracting only a limited set of medical concepts to support information extraction in the cancer domain for the Spanish language. Building on the success of deep learning for processing natural language texts, this paper proposes a transformer-based approach to extract named entities from breast cancer clinical notes written in Spanish and compares several language models. To facilitate this approach, a schema for annotating clinical notes with breast cancer concepts is presented, and a corpus for breast cancer is developed. Results indicate that both BERT-based and RoBERTa-based language models demonstrate competitive performance in clinical Named Entity Recognition (NER). Specifically, BETO and multilingual BERT achieve F-scores of 93.71% and 94.63%, respectively. Additionally, RoBERTa Biomedical attains an F-score of 95.01%, while RoBERTa BNE achieves an F-score of 94.54%. The findings suggest that transformers can feasibly extract information in the clinical domain in the Spanish language, with the use of models trained on biomedical texts contributing to enhanced results. The proposed approach takes advantage of transfer learning techniques by fine-tuning language models to automatically represent text features and avoiding the time-consuming feature engineering process.
Affiliation(s)
- Oswaldo Solarte-Pabón
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain; Escuela de Ingeniería de Sistemas, Universidad del Valle, Cali, Colombia.
- Orlando Montenegro
  - Escuela de Ingeniería de Sistemas, Universidad del Valle, Cali, Colombia
- Maria Torrente
  - Hospital Universitario Puerta de Hierro de Madrid, Madrid, Spain
- Ernestina Menasalvas
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain
- Víctor Robles
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain

35
Edwards TL, Greene CA, Piekos JA, Hellwege JN, Hampton G, Jasper EA, Velez Edwards DR. Challenges and Opportunities for Data Science in Women's Health. Annu Rev Biomed Data Sci 2023; 6:23-45. [PMID: 37040736 PMCID: PMC10877578 DOI: 10.1146/annurev-biodatasci-020722-105958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/13/2023]
Abstract
The intersection of women's health and data science is a field of research that has historically trailed other fields, but more recently it has gained momentum. This growth is being driven not only by new investigators who are moving into this area but also by the significant opportunities that have emerged in new methodologies, resources, and technologies in data science. Here, we describe some of the resources and methods being used by women's health researchers today to meet challenges in biomedical data science. We also describe the opportunities and limitations of applying these approaches to advance women's health outcomes and the future of the field, with emphasis on repurposing existing methodologies for women's health.
Affiliation(s)
- Todd L Edwards
  - Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Catherine A Greene
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Division of Quantitative Sciences, Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Jacqueline A Piekos
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Division of Quantitative Sciences, Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Jacklyn N Hellwege
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Gabrielle Hampton
  - Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Elizabeth A Jasper
  - Division of Quantitative Sciences, Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Center for Precision Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Digna R Velez Edwards
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Division of Quantitative Sciences, Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA

36
Mao C, Xu J, Rasmussen L, Li Y, Adekkanattu P, Pacheco J, Bonakdarpour B, Vassar R, Shen L, Jiang G, Wang F, Pathak J, Luo Y. AD-BERT: Using pre-trained language model to predict the progression from mild cognitive impairment to Alzheimer's disease. J Biomed Inform 2023; 144:104442. [PMID: 37429512 PMCID: PMC11131134 DOI: 10.1016/j.jbi.2023.104442] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 03/24/2023] [Revised: 06/13/2023] [Accepted: 07/07/2023] [Indexed: 07/12/2023]
Abstract
OBJECTIVE We develop a deep learning framework based on the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model using unstructured clinical notes from electronic health records (EHRs) to predict the risk of disease progression from Mild Cognitive Impairment (MCI) to Alzheimer's Disease (AD). METHODS We identified 3657 patients diagnosed with MCI together with their progress notes from Northwestern Medicine Enterprise Data Warehouse (NMEDW) between 2000 and 2020. The progress notes no later than the first MCI diagnosis were used for the prediction. We first preprocessed the notes by deidentification, cleaning and splitting into sections, and then pre-trained a BERT model for AD (named AD-BERT) based on the publicly available Bio+Clinical BERT on the preprocessed notes. All sections of a patient were embedded into a vector representation by AD-BERT and then combined by global MaxPooling and a fully connected network to compute the probability of MCI-to-AD progression. For validation, we conducted a similar set of experiments on 2563 MCI patients identified at Weill Cornell Medicine (WCM) during the same timeframe. RESULTS Compared with the 7 baseline models, the AD-BERT model achieved the best performance on both datasets, with Area Under receiver operating characteristic Curve (AUC) of 0.849 and F1 score of 0.440 on NMEDW dataset, and AUC of 0.883 and F1 score of 0.680 on WCM dataset. CONCLUSION The use of EHRs for AD-related research is promising, and AD-BERT shows superior predictive performance in modeling MCI-to-AD progression prediction. Our study demonstrates the utility of pre-trained language models and clinical notes in predicting MCI-to-AD progression, which could have important implications for improving early detection and intervention for AD.
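The aggregation step described in this abstract, where per-section embeddings are combined by global max pooling and passed through a fully connected network, can be sketched in plain Python. This is an illustration under stated assumptions rather than the authors' implementation: the embedding size, the number of sections, the single-layer head, and the random untrained weights are all hypothetical stand-ins for AD-BERT outputs and trained parameters.

```python
import math
import random

random.seed(0)

# Hypothetical sizes: 5 note sections, each embedded as an 8-dimensional vector.
N_SECTIONS, EMBED_DIM = 5, 8
section_embeddings = [
    [random.gauss(0.0, 1.0) for _ in range(EMBED_DIM)] for _ in range(N_SECTIONS)
]

# Global max pooling: for each feature, keep the largest value across sections.
patient_vector = [max(feature) for feature in zip(*section_embeddings)]

# A single fully connected layer with a sigmoid output (untrained, random weights).
weights = [random.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
bias = 0.0
logit = sum(w * x for w, x in zip(weights, patient_vector)) + bias
probability = 1.0 / (1.0 + math.exp(-logit))

print(f"Pooled vector length: {len(patient_vector)}")
print(f"Illustrative progression probability: {probability:.3f}")
```

Max pooling yields a fixed-length patient vector regardless of how many note sections a patient has, which is what lets a single classifier head handle records of varying length.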
Affiliation(s)
- Chengsheng Mao
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
- Jie Xu
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, United States; Weill Cornell Medicine, New York, NY, United States
- Luke Rasmussen
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
- Yikuan Li
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
- Jennifer Pacheco
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
- Borna Bonakdarpour
  - Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
- Robert Vassar
  - Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
- Li Shen
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, United States
- Fei Wang
  - Weill Cornell Medicine, New York, NY, United States
- Yuan Luo
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States.

37
Wen A, He H, Fu S, Liu S, Miller K, Wang L, Roberts KE, Bedrick SD, Hersh WR, Liu H. The IMPACT framework and implementation for accessible in silico clinical phenotyping in the digital era. NPJ Digit Med 2023; 6:132. [PMID: 37479735 PMCID: PMC10362064 DOI: 10.1038/s41746-023-00878-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Accepted: 07/13/2023] [Indexed: 07/23/2023] Open
Abstract
Clinical phenotyping is often a foundational requirement for obtaining datasets necessary for the development of digital health applications. Traditionally done via manual abstraction, this task is often a bottleneck in development due to time and cost requirements, therefore raising significant interest in accomplishing this task via in-silico means. Nevertheless, current in-silico phenotyping development tends to be focused on a single phenotyping task resulting in a dearth of reusable tools supporting cross-task generalizable in-silico phenotyping. In addition, in-silico phenotyping remains largely inaccessible for a substantial portion of potentially interested users. Here, we highlight the barriers to the usage of in-silico phenotyping and potential solutions in the form of a framework of several desiderata as observed during our implementation of such tasks. In addition, we introduce an example implementation of said framework as a software application, with a focus on ease of adoption, cross-task reusability, and facilitating the clinical phenotyping algorithm development process.
Affiliation(s)
- Andrew Wen
- Department of AI & Informatics, Mayo Clinic, Rochester, MN, 55905, USA
- School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Huan He
- Department of AI & Informatics, Mayo Clinic, Rochester, MN, 55905, USA
| | - Sunyang Fu
- Department of AI & Informatics, Mayo Clinic, Rochester, MN, 55905, USA
- School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Sijia Liu
- Department of AI & Informatics, Mayo Clinic, Rochester, MN, 55905, USA
| | - Kurt Miller
- Department of AI & Informatics, Mayo Clinic, Rochester, MN, 55905, USA
| | - Liwei Wang
- Department of AI & Informatics, Mayo Clinic, Rochester, MN, 55905, USA
- School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Kirk E Roberts
- School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Steven D Bedrick
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, 97239, USA
| | - William R Hersh
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, 97239, USA
| | - Hongfang Liu
- Department of AI & Informatics, Mayo Clinic, Rochester, MN, 55905, USA.
- School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA.
| |
|
38
|
Keloth VK, Banda JM, Gurley M, Heider PM, Kennedy G, Liu H, Liu F, Miller T, Natarajan K, V Patterson O, Peng Y, Raja K, Reeves RM, Rouhizadeh M, Shi J, Wang X, Wang Y, Wei WQ, Williams AE, Zhang R, Belenkaya R, Reich C, Blacketer C, Ryan P, Hripcsak G, Elhadad N, Xu H. Representing and utilizing clinical textual data for real world studies: An OHDSI approach. J Biomed Inform 2023; 142:104343. [PMID: 36935011 PMCID: PMC10428170 DOI: 10.1016/j.jbi.2023.104343] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/25/2022] [Revised: 01/21/2023] [Accepted: 03/13/2023] [Indexed: 03/19/2023]
Abstract
Clinical documentation in electronic health records contains crucial narratives and details about patients and their care. Natural language processing (NLP) can unlock the information conveyed in clinical notes and reports, and thus plays a critical role in real-world studies. The NLP Working Group at the Observational Health Data Sciences and Informatics (OHDSI) consortium was established to develop methods and tools to promote the use of textual data and NLP in real-world observational studies. In this paper, we describe a framework for representing and utilizing textual data in real-world evidence generation, including representations of information from clinical text in the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), the workflow and tools that were developed to extract, transform and load (ETL) data from clinical notes into tables in OMOP CDM, as well as current applications and specific use cases of the proposed OHDSI NLP solution at large consortia and individual institutions with English textual data. Challenges faced and lessons learned during the process are also discussed to provide valuable insights for researchers who are planning to implement NLP solutions in real-world studies.
Affiliation(s)
- Vipina K Keloth
- Section of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
| | - Juan M Banda
- Department of Computer Science, Georgia State University, Atlanta, GA, USA
| | - Michael Gurley
- Lurie Cancer Center, Northwestern University, Chicago, Illinois, USA
| | - Paul M Heider
- Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC, USA
| | - Georgina Kennedy
- Ingham Institute for Applied Medical Research, Sydney, Australia
| | - Hongfang Liu
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, USA
| | - Feifan Liu
- Department of Population and Quantitative Health Sciences, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Timothy Miller
- Computational Health Informatics Program, Boston Children's Hospital, and Department of Pediatrics, Harvard Medical School, Boston, MA, USA
| | - Karthik Natarajan
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
| | - Olga V Patterson
- VA Informatics and Computing Infrastructure, Department of Veterans Affairs Salt Lake City Health Care System, Salt Lake City, Utah, USA; Division of Epidemiology, Department of Internal Medicine, School of Medicine, University of Utah, Salt Lake City, Utah, USA; Verily Life Sciences, Mountain View, CA, USA
| | - Yifan Peng
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
| | - Kalpana Raja
- Section of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
| | - Ruth M Reeves
- TN Valley Healthcare System, U.S. Department of Veterans Affairs, Nashville, TN, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Masoud Rouhizadeh
- Department of Pharmaceutical Outcomes & Policy, University of Florida, Gainesville, FL, USA; Biomedical Informatics and Data Science, Johns Hopkins University, Baltimore, MD, USA
| | - Jianlin Shi
- VA Informatics and Computing Infrastructure, Department of Veterans Affairs Salt Lake City Health Care System, Salt Lake City, Utah, USA; Division of Epidemiology, Department of Internal Medicine, School of Medicine, University of Utah, Salt Lake City, Utah, USA; Department of Biomedical Informatics, University of Utah, Salt Lake City, USA
| | - Xiaoyan Wang
- Sema4 Mount Sinai Genomics Incorporation, Stamford, CT, USA
| | - Yanshan Wang
- Department of Health Information Management, Department of Biomedical Informatics, and Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA
| | - Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | | | - Rui Zhang
- Institute for Health Informatics, and Department of Pharmaceutical Care & Health Systems, University of Minnesota, Minneapolis, MN, USA
| | | | | | - Clair Blacketer
- Janssen Pharmaceutical Research and Development LLC, Titusville, NJ, USA; Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, the Netherlands
| | - Patrick Ryan
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA; Janssen Pharmaceutical Research and Development LLC, Titusville, NJ, USA
| | - George Hripcsak
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
| | - Noémie Elhadad
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA.
| | - Hua Xu
- Section of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA.
| |
|
39
|
Bikdeli B, Lo YC, Khairani CD, Bejjani A, Jimenez D, Barco S, Mahajan S, Caraballo C, Secemsky EA, Klok FA, Hunsaker AR, Aghayev A, Muriel A, Wang Y, Hussain MA, Appah-Sampong A, Lu Y, Lin Z, Aneja S, Khera R, Goldhaber SZ, Zhou L, Monreal M, Krumholz HM, Piazza G. Developing Validated Tools to Identify Pulmonary Embolism in Electronic Databases: Rationale and Design of the PE-EHR+ Study. Thromb Haemost 2023; 123:649-662. [PMID: 36809777 PMCID: PMC11200175 DOI: 10.1055/a-2039-3222] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 02/23/2023]
Abstract
BACKGROUND Contemporary pulmonary embolism (PE) research, in many cases, relies on data from electronic health records (EHRs) and administrative databases that use International Classification of Diseases (ICD) codes. Natural language processing (NLP) tools can be used for automated chart review and patient identification. However, there remains uncertainty with the validity of ICD-10 codes or NLP algorithms for patient identification. METHODS The PE-EHR+ study has been designed to validate ICD-10 codes as Principal Discharge Diagnosis, or Secondary Discharge Diagnoses, as well as NLP tools set out in prior studies to identify patients with PE within EHRs. Manual chart review by two independent abstractors by predefined criteria will be the reference standard. Sensitivity, specificity, and positive and negative predictive values will be determined. We will assess the discriminatory function of code subgroups for intermediate- and high-risk PE. In addition, accuracy of NLP algorithms to identify PE from radiology reports will be assessed. RESULTS A total of 1,734 patients from the Mass General Brigham health system have been identified. These include 578 with ICD-10 Principal Discharge Diagnosis codes for PE, 578 with codes in the secondary position, and 578 without PE codes during the index hospitalization. Patients within each group were selected randomly from the entire pool of patients at the Mass General Brigham health system. A smaller subset of patients will also be identified from the Yale-New Haven Health System. Data validation and analyses will be forthcoming. CONCLUSIONS The PE-EHR+ study will help validate efficient tools for identification of patients with PE in EHRs, improving the reliability of efficient observational studies or randomized trials of patients with PE using electronic databases.
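The four accuracy measures named in this abstract (sensitivity, specificity, positive and negative predictive values) can be sketched as below. This is a minimal illustration, not the study's analysis code: the names `ConfusionCounts`, `tally`, and `diagnostic_metrics` are hypothetical, and the inputs are assumed to be per-patient booleans where `predicted` is the ICD-10 or NLP flag and `reference` is the chart-review label.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of agreement between a flag and the reference standard."""
    tp: int  # flagged, and PE confirmed by chart review
    fp: int  # flagged, but PE not confirmed
    fn: int  # not flagged, but PE confirmed
    tn: int  # not flagged, and PE not confirmed


def tally(predicted: Iterable[bool], reference: Iterable[bool]) -> ConfusionCounts:
    """Count true/false positives and negatives over paired labels."""
    tp = fp = fn = tn = 0
    for pred, ref in zip(predicted, reference):
        if pred and ref:
            tp += 1
        elif pred and not ref:
            fp += 1
        elif not pred and ref:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def diagnostic_metrics(c: ConfusionCounts) -> Dict[str, Optional[float]]:
    """Compute the four measures from a 2x2 table."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
    }
```

Calling `diagnostic_metrics(tally(flags, chart_labels))` returns a dictionary of the four values. Note that predictive values depend on how the sample was drawn, so in a design that samples fixed numbers of coded and uncoded patients they would normally need reweighting to the source population before being reported.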
Affiliation(s)
- Behnood Bikdeli
- Division of Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
- Thrombosis Research Group, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
- YNHH/Yale Center for Outcomes Research and Evaluation (CORE), New Haven, Connecticut, United States
- Cardiovascular Research Foundation (CRF), New York, New York, United States
| | - Ying-Chih Lo
- Division of General Internal Medicine and Primary Care, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
| | - Candrika D Khairani
- Thrombosis Research Group, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
| | - Antoine Bejjani
- Thrombosis Research Group, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
| | - David Jimenez
- Respiratory Department, Hospital Ramón y Cajal and Medicine Department, Universidad de Alcalá (Instituto de Ramón y Cajal de Investigación Sanitaria), Centro de Investigación Biomédica en Red de Enfermedades Respiratorias, Madrid, Spain
| | - Stefano Barco
- Department of Angiology, University Hospital Zurich, Zurich, Switzerland
- Center for Thrombosis and Hemostasis, Johannes Gutenberg University Mainz, Mainz, Germany
| | - Shiwani Mahajan
- YNHH/Yale Center for Outcomes Research and Evaluation (CORE), New Haven, Connecticut, United States
- Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut, United States
| | - César Caraballo
- YNHH/Yale Center for Outcomes Research and Evaluation (CORE), New Haven, Connecticut, United States
| | - Eric A Secemsky
- Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States
- Harvard Medical School, Boston, Massachusetts, United States
- Division of Cardiology, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States
| | - Frederikus A Klok
- Department of Medicine - Thrombosis and Hemostasis, Leiden University Medical Centre, Leiden, The Netherlands
| | - Andetta R Hunsaker
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
| | - Ayaz Aghayev
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
| | - Alfonso Muriel
- Clinical Biostatistics Unit. Hospital Universitario Ramón y Cajal. IRYCIS, CIBERESP: Universidad de Alcalá. Madrid, Spain
| | - Yun Wang
- YNHH/Yale Center for Outcomes Research and Evaluation (CORE), New Haven, Connecticut, United States
- Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States
| | - Mohamad A Hussain
- Division of Vascular and Endovascular Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
- Centre for Surgery and Public Health, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
| | - Abena Appah-Sampong
- Department of Surgery, Brigham and Women's Hospital, Boston, Massachusetts, United States
| | - Yuan Lu
- YNHH/Yale Center for Outcomes Research and Evaluation (CORE), New Haven, Connecticut, United States
| | - Zhenqiu Lin
- YNHH/Yale Center for Outcomes Research and Evaluation (CORE), New Haven, Connecticut, United States
| | - Sanjay Aneja
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, Connecticut, United States
| | - Rohan Khera
- YNHH/Yale Center for Outcomes Research and Evaluation (CORE), New Haven, Connecticut, United States
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut, United States
| | - Samuel Z Goldhaber
- Division of Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
- Thrombosis Research Group, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
| | - Li Zhou
- Division of General Internal Medicine and Primary Care, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
| | - Manuel Monreal
- Cátedra de Enfermedad Tromboembólica, Universidad Católica de Murcia, Murcia, Spain
| | - Harlan M Krumholz
- YNHH/Yale Center for Outcomes Research and Evaluation (CORE), New Haven, Connecticut, United States
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut, United States
- Department of Health Policy and Management, Yale School of Public Health, New Haven, Connecticut, United States
| | - Gregory Piazza
- Division of Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
- Thrombosis Research Group, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
| |
|
40
|
Alsentzer E, Rasmussen MJ, Fontoura R, Cull AL, Beaulieu-Jones B, Gray KJ, Bates DW, Kovacheva VP. Zero-shot Interpretable Phenotyping of Postpartum Hemorrhage Using Large Language Models. medRxiv 2023:2023.05.31.23290753. [PMID: 37398230 PMCID: PMC10312824 DOI: 10.1101/2023.05.31.23290753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Many areas of medicine would benefit from deeper, more accurate phenotyping, but there are limited approaches for phenotyping using clinical notes without substantial annotated data. Large language models (LLMs) have demonstrated immense potential to adapt to novel tasks with no additional training by specifying task-specific instructions. We investigated the performance of a publicly available LLM, Flan-T5, in phenotyping patients with postpartum hemorrhage (PPH) using discharge notes from electronic health records (n = 271,081). The language model achieved strong performance in extracting 24 granular concepts associated with PPH. Identifying these granular concepts accurately allowed the development of interpretable, complex phenotypes and subtypes. The Flan-T5 model achieved high fidelity in phenotyping PPH (positive predictive value of 0.95), identifying 47% more patients with this complication compared to the current standard of using claims codes. This LLM pipeline can be used reliably for subtyping PPH and outperformed a claims-based approach on the three most common PPH subtypes associated with uterine atony, abnormal placentation, and obstetric trauma. The advantage of this approach to subtyping is its interpretability, as each concept contributing to the subtype determination can be evaluated. Moreover, as definitions may change over time due to new guidelines, using granular concepts to create complex phenotypes enables prompt and efficient updating of the algorithm. Using this language modelling approach enables rapid phenotyping without the need for any manually annotated training data across multiple clinical use cases.
|
41
|
He T, Belouali A, Patricoski J, Lehmann H, Ball R, Anagnostou V, Kreimeyer K, Botsis T. Trends and opportunities in computable clinical phenotyping: A scoping review. J Biomed Inform 2023; 140:104335. [PMID: 36933631 DOI: 10.1016/j.jbi.2023.104335] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/09/2022] [Revised: 03/07/2023] [Accepted: 03/09/2023] [Indexed: 03/18/2023]
Abstract
Identifying patient cohorts meeting the criteria of specific phenotypes is essential in biomedicine and particularly timely in precision medicine. Many research groups deliver pipelines that automatically retrieve and analyze data elements from one or more sources to automate this task and deliver high-performing computable phenotypes. We applied a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines to conduct a thorough scoping review on computable clinical phenotyping. Five databases were searched using a query that combined the concepts of automation, clinical context, and phenotyping. Subsequently, four reviewers screened 7960 records (after removing over 4000 duplicates) and selected 139 that satisfied the inclusion criteria. This dataset was analyzed to extract information on target use cases, data-related topics, phenotyping methodologies, evaluation strategies, and portability of developed solutions. Most studies supported patient cohort selection without discussing the application to specific use cases, such as precision medicine. Electronic Health Records were the primary source in 87.1 % (N = 121) of all studies, and International Classification of Diseases codes were heavily used in 55.4 % (N = 77) of all studies, however, only 25.9 % (N = 36) of the records described compliance with a common data model. In terms of the presented methods, traditional Machine Learning (ML) was the dominant method, often combined with natural language processing and other approaches, while external validation and portability of computable phenotypes were pursued in many cases. These findings revealed that defining target use cases precisely, moving away from sole ML strategies, and evaluating the proposed solutions in the real setting are essential opportunities for future work. 
There is also momentum and an emerging need for computable phenotyping to support clinical and epidemiological research and precision medicine.
Affiliation(s)
- Ting He
- Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
| | - Anas Belouali
- Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Jessica Patricoski
- Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Harold Lehmann
- Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Robert Ball
- Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US FDA, Silver Spring, MD, USA
| | - Valsamo Anagnostou
- Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Kory Kreimeyer
- Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Taxiarchis Botsis
- Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
| |
|
42
|
Chng SY, Tern PJW, Kan MRX, Cheng LTE. Automated labelling of radiology reports using natural language processing: Comparison of traditional and newer methods. Health Care Science 2023; 2:120-128. [PMID: 38938764 PMCID: PMC11080679 DOI: 10.1002/hcs2.40] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Revised: 01/31/2023] [Accepted: 02/23/2023] [Indexed: 06/29/2024]
Abstract
Automated labelling of radiology reports using natural language processing allows for the labelling of ground truth for large datasets of radiological studies that are required for training of computer vision models. This paper explains the necessary data preprocessing steps, reviews the main methods for automated labelling and compares their performance. There are four main methods of automated labelling, namely: (1) rules-based text-matching algorithms, (2) conventional machine learning models, (3) neural network models and (4) Bidirectional Encoder Representations from Transformers (BERT) models. Rules-based labellers perform a brute force search against manually curated keywords and are able to achieve high F1 scores. However, they require proper handling of negative words. Machine learning models require preprocessing that involves tokenization and vectorization of text into numerical vectors. Multilabel classification approaches are required in labelling radiology reports and conventional models can achieve good performance if they have large enough training sets. Deep learning models make use of connected neural networks, often a long short-term memory network, and are similarly able to achieve good performance if trained on a large data set. BERT is a transformer-based model that utilizes attention. Pretrained BERT models only require fine-tuning with small data sets. In particular, domain-specific BERT models can achieve superior performance compared with the other methods for automated labelling.
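The first method in this abstract, a rules-based labeller that searches for curated keywords and handles negative words, can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the paper's implementation: the keyword table, the negation cues, the five-token look-back window, and the function names are all hypothetical choices made for the example.

```python
import re
from typing import Dict, List

# Hypothetical keyword table: label -> phrases that indicate the finding.
KEYWORDS: Dict[str, List[str]] = {
    "pneumothorax": ["pneumothorax"],
    "effusion": ["pleural effusion", "effusion"],
}

# Hypothetical single-word negation cues checked before each keyword match.
NEGATION_CUES = {"no", "not", "without", "negative", "absent"}


def _is_negated(preceding_tokens: List[str]) -> bool:
    """True if any negation cue appears among the preceding tokens."""
    return any(token in NEGATION_CUES for token in preceding_tokens)


def label_report(
    report: str,
    keywords: Dict[str, List[str]] = KEYWORDS,
    window: int = 5,
) -> Dict[str, bool]:
    """Label a report positive for a finding if any mention is not negated.

    Negation is scoped to the same sentence and to the `window` tokens
    immediately before the keyword.
    """
    labels = {label: False for label in keywords}
    # Split on sentence-like boundaries so negation does not leak across them.
    for sentence in re.split(r"[.;\n]+", report.lower()):
        for label, terms in keywords.items():
            for term in terms:
                pattern = rf"\b{re.escape(term)}\b"
                for match in re.finditer(pattern, sentence):
                    before = re.findall(r"[a-z]+", sentence[: match.start()])
                    if not _is_negated(before[-window:]):
                        labels[label] = True
    return labels
```

For example, `label_report("No pneumothorax. Small left pleural effusion is present.")` marks `effusion` as present and `pneumothorax` as absent. A fixed look-back window is a deliberate simplification: it misses negation that follows the keyword and can over-negate long sentences, which is part of why the abstract stresses proper handling of negative words.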
Affiliation(s)
- Seo Yi Chng
- Department of Paediatrics, National University of Singapore, Singapore, Singapore
| | - Paul J. W. Tern
- Department of Cardiology, National Heart Centre, Singapore, Singapore
| | | | - Lionel T. E. Cheng
- Department of Diagnostic Radiology, Singapore General Hospital, Singapore, Singapore
| |
|
43
|
Hossain E, Rana R, Higgins N, Soar J, Barua PD, Pisani AR, Turner K. Natural Language Processing in Electronic Health Records in relation to healthcare decision-making: A systematic review. Comput Biol Med 2023; 155:106649. [PMID: 36805219 DOI: 10.1016/j.compbiomed.2023.106649] [Citation(s) in RCA: 82] [Impact Index Per Article: 41.0] [Received: 10/03/2022] [Revised: 01/04/2023] [Accepted: 02/07/2023] [Indexed: 02/12/2023]
Abstract
BACKGROUND Natural Language Processing (NLP) is widely used to extract clinical insights from Electronic Health Records (EHRs). However, the lack of annotated data, automated tools, and other challenges hinder the full utilisation of NLP for EHRs. Various Machine Learning (ML), Deep Learning (DL) and NLP techniques are studied and compared to understand the limitations and opportunities in this space comprehensively. METHODOLOGY After screening 261 articles from 11 databases, we included 127 papers for full-text review covering seven categories of articles: (1) medical note classification, (2) clinical entity recognition, (3) text summarisation, (4) deep learning (DL) and transfer learning architecture, (5) information extraction, (6) Medical language translation and (7) other NLP applications. This study follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. RESULT AND DISCUSSION EHR was the most commonly used data type among the selected articles, and the datasets were primarily unstructured. Various ML and DL methods were used, with prediction or classification being the most common application of ML or DL. The most common use cases were: the International Classification of Diseases, Ninth Revision (ICD-9) classification, clinical note analysis, and named entity recognition (NER) for clinical descriptions and research on psychiatric disorders. CONCLUSION We find that the adopted ML models were not adequately assessed. In addition, the data imbalance problem is quite important, yet we must find techniques to address this underlining problem. Future studies should address key limitations in studies, primarily identifying Lupus Nephritis, Suicide Attempts, perinatal self-harmed and ICD-9 classification.
Affiliation(s)
- Elias Hossain
- School of Engineering & Physical Sciences, North South University, Dhaka 1229, Bangladesh.
| | - Rajib Rana
- School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield Central QLD 4300, Australia
| | - Niall Higgins
- School of Management and Enterprise, University of Southern Queensland, Darling Heights QLD 4350, Australia; School of Nursing, Queensland University of Technology, Kelvin Grove, Brisbane, QLD 4000, Australia; Metro North Mental Health, Herston QLD 4029, Australia
| | - Jeffrey Soar
- School of Business, University of Southern Queensland, Springfield Central QLD 4300, Australia
| | - Prabal Datta Barua
- School of Business, University of Southern Queensland, Springfield Central QLD 4300, Australia
| | - Anthony R Pisani
- Center for the Study and Prevention of Suicide, University of Rochester, Rochester, NY, United States
| | - Kathryn Turner
- School of Nursing, Queensland University of Technology, Kelvin Grove, Brisbane, QLD 4000, Australia
| |
|
44
|
Baron JM. Artificial Intelligence in the Clinical Laboratory: An Overview with Frequently Asked Questions. Clin Lab Med 2023; 43:1-16. [PMID: 36764803 DOI: 10.1016/j.cll.2022.09.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022]
Abstract
This article provides an overview of machine learning fundamentals and some applications of machine learning to clinical laboratory diagnostics and patient management. A key goal of this article is to provide a basic foundation in clinical machine learning for readers with clinical laboratory experience that will set them up for more in-depth study of the topic and/or to become a better collaborator with computational colleagues in the development and deployment of machine learning-based solutions.
Affiliation(s)
- Jason M Baron
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, 55 Fruit Street, Boston, MA 02114-2696, USA.
| |
|
45
|
Khope SR, Elias S. Strategies of Predictive Schemes and Clinical Diagnosis for Prognosis Using MIMIC-III: A Systematic Review. Healthcare (Basel) 2023; 11:710. [PMID: 36900715 PMCID: PMC10001415 DOI: 10.3390/healthcare11050710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Revised: 02/18/2023] [Accepted: 02/21/2023] [Indexed: 03/05/2023] Open
Abstract
The prime purpose of the proposed study is to construct a novel predictive scheme for assisting in the prognosis of criticality using the MIMIC-III dataset. With the adoption of various analytics and advanced computing in the healthcare system, there is an increasing trend toward developing an effective prognostication mechanism. Predictive-based modeling is the best alternative to work in this direction. This paper discusses various scientific contributions using desk research methodology towards the Medical Information Mart for Intensive Care (MIMIC-III). This open-access dataset is meant to help predict patient trajectories for various purposes ranging from mortality forecasting to treatment planning. With a dominant machine learning approach in this perspective, there is a need to discover the effectiveness of existing predictive methods. The resultant outcome of this paper offers an inclusive discussion about various available predictive schemes and clinical diagnoses using MIMIC-III in order to contribute toward better information associated with its strengths and weaknesses. Therefore, the paper provides a clear visualization of existing schemes for clinical diagnosis using a systematic review approach.
Affiliation(s)
| | - Susan Elias
- School of Electronics Engineering, Vellore Institute of Technology, Chennai 600127, Tamil Nadu, India
| |
|
46
|
Brandt PS, Kho A, Luo Y, Pacheco JA, Walunas TL, Hakonarson H, Hripcsak G, Liu C, Shang N, Weng C, Walton N, Carrell DS, Crane PK, Larson EB, Chute CG, Kullo IJ, Carroll R, Denny J, Ramirez A, Wei WQ, Pathak J, Wiley LK, Richesson R, Starren JB, Rasmussen LV. Characterizing variability of electronic health record-driven phenotype definitions. J Am Med Inform Assoc 2023; 30:427-437. [PMID: 36474423 PMCID: PMC9933077 DOI: 10.1093/jamia/ocac235] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/21/2022] [Revised: 10/19/2022] [Accepted: 11/23/2022] [Indexed: 12/12/2022] Open
Abstract
OBJECTIVE The aim of this study was to analyze a publicly available sample of rule-based phenotype definitions to characterize and evaluate the variability of logical constructs used. MATERIALS AND METHODS A sample of 33 preexisting phenotype definitions used in research that are represented using Fast Healthcare Interoperability Resources and Clinical Quality Language (CQL) was analyzed using automated analysis of the computable representation of the CQL libraries. RESULTS Most of the phenotype definitions include narrative descriptions and flowcharts, while few provide pseudocode or executable artifacts. Most use 4 or fewer medical terminologies. The number of codes used ranges from 5 to 6865, and value sets from 1 to 19. We found that the most common expressions used were literal, data, and logical expressions. Aggregate and arithmetic expressions are the least common. Expression depth ranges from 4 to 27. DISCUSSION Despite the range of conditions, we found that all of the phenotype definitions consisted of logical criteria, representing both clinical and operational logic, and tabular data, consisting of codes from standard terminologies and keywords for natural language processing. The total number and variety of expressions are low, which may be to simplify implementation, or authors may limit complexity due to data availability constraints. CONCLUSIONS The phenotype definitions analyzed show significant variation in specific logical, arithmetic, and other operators but are all composed of the same high-level components, namely tabular data and logical expressions. A standard representation for phenotype definitions should support these formats and be modular to support localization and shared logic.
Affiliation(s)
- Pascal S Brandt
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, USA
- Abel Kho
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Yuan Luo
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Jennifer A Pacheco
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Theresa L Walunas
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Hakon Hakonarson
  - Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Cong Liu
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Ning Shang
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Nephi Walton
  - Intermountain Precision Genomics, Intermountain Healthcare, St George, Utah, USA
- David S Carrell
  - Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA
- Paul K Crane
  - Department of Medicine, University of Washington, Seattle, Washington, USA
- Eric B Larson
  - Department of Medicine, University of Washington, Seattle, Washington, USA
  - Department of Health Services, University of Washington, Seattle, Washington, USA
- Christopher G Chute
  - Schools of Medicine, Public Health, and Nursing, Johns Hopkins University, Baltimore, Maryland, USA
- Iftikhar J Kullo
  - Department of Cardiovascular Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Robert Carroll
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Josh Denny
  - All of Us Research Program, National Institutes of Health, Bethesda, Maryland, USA
- Andrea Ramirez
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Wei-Qi Wei
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Jyoti Pathak
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
- Laura K Wiley
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Rachel Richesson
  - Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, Michigan, USA
- Justin B Starren
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Luke V Rasmussen
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
|
47
|
Pacheco JA, Rasmussen LV, Wiley K, Person TN, Cronkite DJ, Sohn S, Murphy S, Gundelach JH, Gainer V, Castro VM, Liu C, Mentch F, Lingren T, Sundaresan AS, Eickelberg G, Willis V, Furmanchuk A, Patel R, Carrell DS, Deng Y, Walton N, Satterfield BA, Kullo IJ, Dikilitas O, Smith JC, Peterson JF, Shang N, Kiryluk K, Ni Y, Li Y, Nadkarni GN, Rosenthal EA, Walunas TL, Williams MS, Karlson EW, Linder JE, Luo Y, Weng C, Wei W. Evaluation of the portability of computable phenotypes with natural language processing in the eMERGE network. Sci Rep 2023; 13:1971. [PMID: 36737471 PMCID: PMC9898520 DOI: 10.1038/s41598-023-27481-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/15/2022] [Accepted: 01/03/2023] [Indexed: 02/05/2023] Open
Abstract
The electronic Medical Records and Genomics (eMERGE) Network assessed the feasibility of deploying portable phenotype rule-based algorithms with natural language processing (NLP) components added to improve performance of existing algorithms using electronic health records (EHRs). Based on scientific merit and predicted difficulty, eMERGE selected six existing phenotypes to enhance with NLP. We assessed performance, portability, and ease of use. We summarized lessons learned by: (1) challenges; (2) best practices to address challenges based on existing evidence and/or eMERGE experience; and (3) opportunities for future research. Adding NLP resulted in improved, or the same, precision and/or recall for all but one algorithm. Portability, phenotyping workflow/process, and technology were major themes. With NLP, development and validation took longer. Besides portability of NLP technology and algorithm replicability, factors to ensure success include privacy protection, technical infrastructure setup, intellectual property agreement, and efficient communication. Workflow improvements can improve communication and reduce implementation time. NLP performance varied mainly due to clinical document heterogeneity; therefore, we suggest using semi-structured notes, comprehensive documentation, and customization options. NLP portability is possible with improved phenotype algorithm performance, but careful planning and architecture of the algorithms is essential to support local customizations.
Affiliation(s)
- Ken Wiley
  - National Human Genome Research Institute, Bethesda, USA
- David J Cronkite
  - Kaiser Permanente Washington Health Research Institute, Seattle, USA
- Cong Liu
  - Columbia University, New York, USA
- Frank Mentch
  - Children's Hospital of Philadelphia, Philadelphia, USA
- Todd Lingren
  - Cincinnati Children's Hospital Medical Center, Cincinnati, USA
- David S Carrell
  - Kaiser Permanente Washington Health Research Institute, Seattle, USA
- Yu Deng
  - Northwestern University, Evanston, USA
- Yizhao Ni
  - Cincinnati Children's Hospital Medical Center, Cincinnati, USA
- Yikuan Li
  - Northwestern University, Evanston, USA
- Yuan Luo
  - Northwestern University, Evanston, USA
- WeiQi Wei
  - Vanderbilt University Medical Center, Nashville, USA
|
48
|
Deng Y, Liu L, Jiang H, Peng Y, Wei Y, Zhou Z, Zhong Y, Zhao Y, Yang X, Yu J, Lu Z, Kho A, Ning H, Allen NB, Wilkins JT, Liu K, Lloyd-Jones DM, Zhao L. Comparison of State-of-the-Art Neural Network Survival Models with the Pooled Cohort Equations for Cardiovascular Disease Risk Prediction. BMC Med Res Methodol 2023; 23:22. [PMID: 36694118 PMCID: PMC9872364 DOI: 10.1186/s12874-022-01829-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/15/2022] [Accepted: 12/23/2022] [Indexed: 01/26/2023] Open
Abstract
BACKGROUND The Pooled Cohort Equations (PCEs) are race- and sex-specific Cox proportional hazards (PH)-based models used for 10-year atherosclerotic cardiovascular disease (ASCVD) risk prediction with acceptable discrimination. In recent years, neural network models have gained increasing popularity with their success in image recognition and text classification. Various survival neural network models have been proposed by combining survival analysis and neural network architecture to take advantage of the strengths from both. However, the performance of these survival neural network models compared to each other and to PCEs in ASCVD prediction is unknown. METHODS In this study, we used 6 cohorts from the Lifetime Risk Pooling Project (with 5 cohorts as training/internal validation and one cohort as external validation) and compared the performance of the PCEs in 10-year ASCVD risk prediction with an all two-way interactions Cox PH model (Cox PH-TWI) and three state-of-the-art neural network survival models including Nnet-survival, Deepsurv, and Cox-nnet. For all the models, we used the same 7 covariates as used in the PCEs. We fitted each of the aforementioned models in white females, white males, black females, and black males, respectively. We evaluated models' internal and external discrimination power and calibration. RESULTS The training/internal validation sample comprised 23216 individuals. The average age at baseline was 57.8 years old (SD = 9.6); 16% developed ASCVD during average follow-up of 10.50 (SD = 3.02) years. Based on 10 × 10 cross-validation, the method that had the highest C-statistics was Deepsurv (0.7371) for white males, Deepsurv and Cox PH-TWI (0.7972) for white females, PCE (0.6981) for black males, and Deepsurv (0.7886) for black females. In the external validation dataset, Deepsurv (0.7032), Cox-nnet (0.7282), PCE (0.6811), and Deepsurv (0.7316) had the highest C-statistics for white male, white female, black male, and black female population, respectively. Calibration plots showed that in 10 × 10 validation, all models had good calibration in all race and sex groups. In external validation, all models overestimated the risk for 10-year ASCVD. CONCLUSIONS We demonstrated the use of the state-of-the-art neural network survival models in ASCVD risk prediction. Neural network survival models had similar if not superior discrimination and calibration compared to PCEs.
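The abstract above compares models by their C-statistics. For readers unfamiliar with the measure, here is a minimal Python sketch of Harrell's concordance statistic for right-censored data. It is only an assumption that this variant matches what the authors computed, and the function name and the four-subject toy data are invented for illustration.

```python
def c_statistic(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs that are ranked concordantly.

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's recorded time. The pair is concordant
    when the earlier-failing subject has the higher risk score.
    Ties in risk score count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented): follow-up time in years, event indicator, predicted risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]          # third subject is censored at year 6
risks = [0.9, 0.4, 0.5, 0.1]

print("C-statistic:", c_statistic(times, events, risks))
```

The quadratic pair loop is chosen for readability; on cohorts the size of the one described above, a library implementation would be the practical choice.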
Affiliation(s)
- Yu Deng
  - Center for Health Information Partnerships, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Lei Liu
  - Division of Biostatistics, Washington University in St. Louis, St. Louis, MO, USA
- Hongmei Jiang
  - Department of Statistics and Data Science, Northwestern University, Chicago, IL, USA
- Yifan Peng
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
- Yishu Wei
  - Department of Statistics and Data Science, Northwestern University, Chicago, IL, USA
- Zhiyang Zhou
  - Department of Statistics, University of Manitoba, Winnipeg, MB, Canada
- Yizhen Zhong
  - Center for Health Information Partnerships, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Yun Zhao
  - Department of Computer Science, University of California, Santa Barbara, CA, USA
- Xiaoyun Yang
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Jingzhi Yu
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Zhiyong Lu
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Abel Kho
  - Center for Health Information Partnerships, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Hongyan Ning
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Norrina B Allen
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- John T Wilkins
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Kiang Liu
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Donald M Lloyd-Jones
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Lihui Zhao
  - Center for Health Information Partnerships, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
|
49
|
Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023; 30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022] Open
Abstract
OBJECTIVE Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used. MATERIALS AND METHODS We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies. RESULTS Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions. DISCUSSION Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released. CONCLUSION Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.
Affiliation(s)
- Siyue Yang
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Ellen Stephenson
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Karen Tu
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Jessica Gronsbell
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
|
50
|
Obeid JS, Khalifa A, Xavier B, Bou-Daher H, Rockey DC. An AI Approach for Identifying Patients With Cirrhosis. J Clin Gastroenterol 2023; 57:82-88. [PMID: 34238846 PMCID: PMC8741865 DOI: 10.1097/mcg.0000000000001586] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/28/2021] [Accepted: 06/05/2021] [Indexed: 02/05/2023]
Abstract
GOAL The goal of this study was to evaluate an artificial intelligence approach, namely deep learning, on clinical text in electronic health records (EHRs) to identify patients with cirrhosis. BACKGROUND AND AIMS Accurate identification of cirrhosis in EHR is important for epidemiological, health services, and outcomes research. Currently, such efforts depend on International Classification of Diseases (ICD) codes, with limited success. MATERIALS AND METHODS We trained several machine learning models using discharge summaries from patients with known cirrhosis from a patient registry and random controls without cirrhosis or its complications based on ICD codes. Models were validated on patients for whom discharge summaries were manually reviewed and used as the gold standard test set. We tested Naive Bayes and Random Forest as baseline models and a deep learning model using word embedding and a convolutional neural network (CNN). RESULTS The training set included 446 cirrhosis patients and 689 controls, while the gold standard test set included 139 cirrhosis patients and 152 controls. Among the machine learning models, the CNN achieved the highest area under the receiver operating characteristic curve (0.993), with a precision of 0.965 and recall of 0.978, compared with 0.879 and 0.981 for the Naive Bayes and Random Forest, respectively (precision 0.787 and 0.958, and recalls 0.878 and 0.827). The precision by ICD codes for cirrhosis was 0.883 and recall was 0.978. CONCLUSIONS A CNN model trained on discharge summaries identified cirrhosis patients with high precision and recall. This approach for phenotyping cirrhosis in the EHR may provide a more accurate assessment of disease burden in a variety of studies.
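The abstract above reports precision and recall against a manually reviewed gold standard. A minimal Python sketch of how these two figures are derived from paired gold and predicted labels follows. The function name and the eight-patient toy labels are invented for illustration and do not reproduce the study's data.

```python
def precision_recall(gold, predicted):
    """Return (precision, recall) for binary labels where 1 means 'has phenotype'."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)          # true positives
    fp = sum(1 for g, p in pairs if not g and p)      # false positives
    fn = sum(1 for g, p in pairs if g and not p)      # false negatives
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Toy labels (invented): chart-review result vs. classifier output per patient.
gold_labels = [1, 1, 1, 1, 0, 0, 0, 0]
predicted_labels = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall = precision_recall(gold_labels, predicted_labels)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Precision answers "of the patients flagged, how many truly have the condition", while recall answers "of the patients who truly have it, how many were flagged", which is why a code-based definition can match a model on recall yet trail it on precision.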
Affiliation(s)
- Jihad S. Obeid
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina, USA
- Ali Khalifa
  - Division of Gastroenterology and Hepatology, Medical University of South Carolina, Charleston, South Carolina, USA
- Brandon Xavier
  - Division of Gastroenterology and Hepatology, Medical University of South Carolina, Charleston, South Carolina, USA
- Halim Bou-Daher
  - Division of Gastroenterology and Hepatology, Medical University of South Carolina, Charleston, South Carolina, USA
- Don C. Rockey
  - Division of Gastroenterology and Hepatology, Medical University of South Carolina, Charleston, South Carolina, USA
  - Medical University of South Carolina Digestive Disease Research Center, Medical University of South Carolina, Charleston, South Carolina, USA
|