1
Retamales J, Retamales JP, Demarchi AM, Gonzalez M, Lopez C, Ramirez N, Retamal T, Sun V. Leveraging Artificial Intelligence to Uncover Symptom Burden in Palliative Care: Analysis of Nonscheduled Visits Using a Phi-3 Small Language Model. JCO Glob Oncol 2025; 11:e2400432. [PMID: 40184565 DOI: 10.1200/go-24-00432] [Received: 08/30/2024] [Revised: 02/08/2025] [Accepted: 02/24/2025] [Indexed: 04/06/2025]
Abstract
PURPOSE This study aimed to differentiate nonscheduled visits (NSVs) in an outpatient palliative care setting that are driven by or accompanied by uncontrolled symptoms from those that are administrative or routine, such as prescription refills and examination readings. A small language model (SLM) was used to enhance the detection and management of symptoms, thus improving health care resource allocation.
METHODS A retrospective analysis was performed on 25,867 patient visits to an outpatient palliative care unit, including 7,036 NSVs. A stratified random sample of 384 NSVs was reviewed to determine the presence of symptoms, using physician audits as the gold standard. A Phi-3-based SLM was validated against these audits to assess its accuracy in detecting symptoms. The validated SLM was then applied to the entire NSV data set to identify symptom patterns. Multivariate linear regression was used to analyze the association of age, cancer type, and insurance category with the presence of symptoms.
RESULTS The SLM demonstrated high sensitivity (99.4%) and accuracy (95.3%) in identifying symptom-driven NSVs. The analysis revealed that 85.7% of the NSVs were driven by symptoms, indicating a significant hidden burden of unmanaged symptoms. Certain demographic and clinical factors, including younger age groups and specific cancer types, were significantly associated with an increased symptom burden.
CONCLUSION This study highlights the substantial burden of symptom-driven NSVs in palliative care and demonstrates the effectiveness of using an SLM to identify and manage symptoms. Implementing such models in clinical practice can improve patient care by optimizing the allocation of health care resources and tailoring interventions to the needs of patients with advanced illnesses.
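Illustrative sketch (not from the study): the sensitivity and accuracy figures reported above are standard confusion-matrix ratios, computed by comparing the model's labels with the physician audit. The abstract does not give the underlying counts, so the numbers below are hypothetical and chosen only to show the arithmetic; the names `ConfusionCounts`, `sensitivity`, and `accuracy` are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of model labels versus the reference (audit) labels."""
    tp: int  # symptom-driven visit, flagged by the model
    fp: int  # administrative visit, wrongly flagged
    tn: int  # administrative visit, correctly not flagged
    fn: int  # symptom-driven visit, missed by the model


def sensitivity(c: ConfusionCounts) -> float:
    """Share of truly symptom-driven visits that the model detected."""
    return c.tp / (c.tp + c.fn)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all audited visits that the model labelled correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Hypothetical counts for illustration only; these are NOT the study's data.
example = ConfusionCounts(tp=90, fp=8, tn=10, fn=2)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.3f}")
    print(f"accuracy    = {accuracy(example):.3f}")
```

A model tuned for high sensitivity accepts more false positives, which is why accuracy can sit below sensitivity, as it does in the reported results.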
Affiliation(s)
- Javier Retamales
  - Hospital Sotero del Río, Santiago, Chile
  - Grupo Oncologico Cooperativo Chileno de Investigacion-GOCCHI, Santiago, Chile
- Virginia Sun
  - City of Hope Comprehensive Cancer Center, Duarte, CA
2
Li J, Chang C, Li Y, Cui S, Yuan F, Li Z, Wang X, Li K, Feng Y, Wang Z, Wei Z, Jian F. Large Language Models' Responses to Spinal Cord Injury: A Comparative Study of Performance. J Med Syst 2025; 49:39. [PMID: 40128385 DOI: 10.1007/s10916-025-02170-7] [Received: 11/13/2024] [Accepted: 03/16/2025] [Indexed: 03/26/2025]
Abstract
With the increasing application of large language models (LLMs) in the medical field, their potential in patient education and clinical decision support is becoming increasingly prominent. Given the complex pathogenesis, diverse treatment options, and lengthy rehabilitation periods of spinal cord injury (SCI), patients are increasingly turning to advanced online resources to obtain relevant medical information. This study analyzed responses from four LLMs (ChatGPT-4o, Claude-3.5 Sonnet, Gemini-1.5 Pro, and Llama-3.1) to 37 SCI-related questions spanning pathogenesis, risk factors, clinical features, diagnostics, treatments, and prognosis. Quality and readability were assessed using the Ensuring Quality Information for Patients (EQIP) tool and Flesch-Kincaid metrics, respectively. Accuracy was independently scored by three senior spine surgeons using consensus scoring. Performance varied among the models. Gemini ranked highest in EQIP scores, suggesting superior information quality. Although the readability of all four LLMs was generally low, requiring a college-level reading comprehension ability, they were all able to effectively simplify complex content. Notably, ChatGPT led in accuracy, achieving significantly higher "Good" ratings (83.8%) compared to Claude (78.4%), Gemini (54.1%), and Llama (62.2%). Comprehensiveness scores were high across all models. Furthermore, the LLMs exhibited strong self-correction abilities. After being prompted for revision, the accuracy of ChatGPT's and Claude's responses improved by 100% and 50%, respectively; both Gemini and Llama improved by 67%. This study represents the first systematic comparison of leading LLMs in the context of SCI. While Gemini excelled in response quality, ChatGPT provided the most accurate and comprehensive responses.
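Illustrative sketch (not from the study): the abstract refers to "Flesch-Kincaid metrics" without saying which variants or which tool were used. The two standard formulas, Flesch Reading Ease and Flesch-Kincaid Grade Level, both depend only on word, sentence, and syllable counts. The function names are assumptions for this example, and the counts are taken as given because syllable counting is itself tool-dependent.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade needed."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Hypothetical passage: 100 words, 5 sentences, 150 syllables.
    print(round(flesch_reading_ease(100, 5, 150), 2))
    print(round(flesch_kincaid_grade(100, 5, 150), 2))
```

Longer sentences and more syllables per word both push the grade level up, which is consistent with dense medical answers scoring at a college reading level.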
Affiliation(s)
- Jinze Li
  - Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, China
  - Spine Center, China International Neuroscience Institute (CHINA-INI), Beijing, China
- Chao Chang
  - Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, China
  - Spine Center, China International Neuroscience Institute (CHINA-INI), Beijing, China
- Yanqiu Li
  - Center for Integrative Medicine, Beijing Ditan Hospital, Capital Medical University, Beijing, China
- Shengyu Cui
  - Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, China
  - Spine Center, China International Neuroscience Institute (CHINA-INI), Beijing, China
- Fan Yuan
  - Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, China
  - Spine Center, China International Neuroscience Institute (CHINA-INI), Beijing, China
- Zhuojun Li
  - School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China
- Xinyu Wang
  - Baylor College of Medicine, Houston, TX, USA
- Kang Li
  - Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, China
  - Spine Center, China International Neuroscience Institute (CHINA-INI), Beijing, China
- Yuxin Feng
  - Capital Medical University, Beijing, China
- Zuowei Wang
  - Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, China
  - Spine Center, China International Neuroscience Institute (CHINA-INI), Beijing, China
- Zhijian Wei
  - Department of Orthopaedics, Qilu Hospital of Shandong University, Shandong University, No. 107 Wenhua West Road, Lixia District, 250012, Jinan, China
- Fengzeng Jian
  - Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, China
  - Spine Center, China International Neuroscience Institute (CHINA-INI), Beijing, China
3
Fung MMH, Tang EHM, Wu T, Luk Y, Au ICH, Liu X, Lee VHF, Wong CK, Wei Z, Cheng WY, Tai ICY, Ho JWK, Wong JWH, Lang BHH, Leung KSM, Wong ZSY, Wu JT, Wong CKH. Developing a named entity framework for thyroid cancer staging and risk level classification using large language models. NPJ Digit Med 2025; 8:134. [PMID: 40025285 PMCID: PMC11873034 DOI: 10.1038/s41746-025-01528-y] [Received: 09/27/2024] [Accepted: 02/19/2025] [Indexed: 03/04/2025]
Abstract
We developed a named entity (NE) framework for information extraction from semi-structured clinical notes retrieved from The Cancer Genome Atlas-Thyroid Cancer (TCGA-THCA) database and examined large language model (LLM) strategies for classifying the 8th edition American Joint Committee on Cancer (AJCC) stage and the American Thyroid Association (ATA) risk category for patients with well-differentiated thyroid cancer. The NE framework consisted of annotation guidelines development, ground truth labelling, prompting approaches, and evaluation codes. Four LLMs (Mistral-7B-Instruct, Llama-3.1-8B-Instruct, Gemma-2-9B-Instruct, and Qwen2.5-7B-Instruct) were run offline for information extraction, and their outputs were compared with expert-curated ground truth. Our framework was developed using 50 TCGA-THCA pathology notes. For validation, 289 TCGA-THCA notes and 35 pseudo-clinical cases were used. An ensemble-like majority-vote strategy achieved satisfactory performance for AJCC and ATA in both development and validation sets. Our framework and ensemble classifier optimised efficiency and accuracy of classifying stage and risk category in thyroid cancer patients.
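Illustrative sketch (not the authors' code): the abstract describes an "ensemble-like majority-vote strategy" across four LLMs but does not specify how ties are resolved. The function below shows one plausible reading, in which the most frequent label wins and a tie for first place returns `None` so the case can be escalated to a human. The function name, the tie rule, and the example labels are all assumptions.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(labels: Sequence[str]) -> Optional[str]:
    """Return the most frequent label, or None if there is no single winner."""
    if not labels:
        return None
    ranked = Counter(labels).most_common()
    # A tie for first place means the models disagree too much to decide.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


if __name__ == "__main__":
    # Hypothetical stage labels from four models for one pathology note.
    print(majority_vote(["I", "I", "II", "I"]))
    print(majority_vote(["I", "I", "II", "II"]))
```

With an even number of voters, ties are possible, so any real pipeline needs an explicit rule for them; returning `None` here is only one option.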
Affiliation(s)
- Matrix M H Fung
  - Division of Endocrine Surgery, Department of Surgery, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Eric H M Tang
  - Laboratory of Data Discovery for Health (D²4H), Hong Kong Science Park, Hong Kong SAR, China
  - Department of Family Medicine and Primary Care, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Tingting Wu
  - Laboratory of Data Discovery for Health (D²4H), Hong Kong Science Park, Hong Kong SAR, China
- Yan Luk
  - Division of Endocrine Surgery, Department of Surgery, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Ivan C H Au
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Xiaodong Liu
  - Division of Endocrine Surgery, Department of Surgery, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
  - Laboratory of Data Discovery for Health (D²4H), Hong Kong Science Park, Hong Kong SAR, China
- Victor H F Lee
  - Department of Clinical Oncology, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Chun Ka Wong
  - Department of Medicine, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Zhili Wei
  - Laboratory of Data Discovery for Health (D²4H), Hong Kong Science Park, Hong Kong SAR, China
- Wing Yiu Cheng
  - Laboratory of Data Discovery for Health (D²4H), Hong Kong Science Park, Hong Kong SAR, China
- Isaac C Y Tai
  - Department of Orthopaedics and Traumatology, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Joshua W K Ho
  - Laboratory of Data Discovery for Health (D²4H), Hong Kong Science Park, Hong Kong SAR, China
  - School of Biomedical Science, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Jason W H Wong
  - School of Biomedical Science, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Brian H H Lang
  - Division of Endocrine Surgery, Department of Surgery, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Kathy S M Leung
  - Laboratory of Data Discovery for Health (D²4H), Hong Kong Science Park, Hong Kong SAR, China
  - The Hong Kong Jockey Club Global Health Institute, Hong Kong SAR, China
  - WHO Collaborating Centre for Infectious Disease Epidemiology and Control, School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
  - The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
- Zoie S Y Wong
  - The Kirby Institute, University of New South Wales, Sydney, Australia
  - Biomedical Informatics and Digital Health, School of Medical Sciences, The University of Sydney, Sydney, Australia
  - Graduate School of Public Health, St. Luke's International University, Tokyo, Japan
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Joseph T Wu
  - Laboratory of Data Discovery for Health (D²4H), Hong Kong Science Park, Hong Kong SAR, China
  - The Hong Kong Jockey Club Global Health Institute, Hong Kong SAR, China
  - WHO Collaborating Centre for Infectious Disease Epidemiology and Control, School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
  - The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
- Carlos K H Wong
  - Laboratory of Data Discovery for Health (D²4H), Hong Kong Science Park, Hong Kong SAR, China
  - Department of Family Medicine and Primary Care, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
  - The Hong Kong Jockey Club Global Health Institute, Hong Kong SAR, China
  - Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, UK
4
Xin Y, Grabowska ME, Gangireddy S, Krantz MS, Kerchberger VE, Dickson AL, Feng Q, Yin Z, Wei WQ. Improving topic modeling performance on social media through semantic relationships within biomedical terminology. PLoS One 2025; 20:e0318702. [PMID: 39982945 PMCID: PMC11845042 DOI: 10.1371/journal.pone.0318702] [Received: 09/18/2024] [Accepted: 01/20/2025] [Indexed: 02/23/2025]
Abstract
Topic modeling utilizes unsupervised machine learning to detect underlying themes within texts and has been deployed routinely to analyze social media for insights into healthcare issues. However, the inherent messiness of social media hinders the full realization of this technique's potential. As such, we hypothesized that restricting medical concepts in social media texts to specific related semantic types and applying topic modeling to these concepts could be a feasible approach to overcome the challenge of traditional topic modeling for social media texts. Therefore, we developed a semantic-type-based topic modeling pipeline to discover self-reported health-related topics. This pipeline integrated semantic type information and Systematized Nomenclature of Medicine (SNOMED) precoordinated expressions into a traditional topic modeling approach to enhance effectiveness in clustering meaningful, distinct topics. Using social media texts regarding statins for illustration, we evaluated the efficacy of this new approach and validated a newly identified topic using real-world clinical data. Based on expert evaluations, this approach resulted in more novel, distinguishable, and meaningful health-related topics compared to traditional topic modeling. In addition, our electronic health record validation for a newly identified topic in two real-world clinical databases indicated that statin users had a higher prevalence of depression or anxiety compared to matched non-users. Our results indicate that this new topic modeling pipeline can improve the extraction of themes from noisy online discussions, thereby contributing to deeper insights for healthcare research.
Affiliation(s)
- Yi Xin
  - Department of Computer Science, Vanderbilt University, Nashville, Tennessee, United States of America
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Monika E. Grabowska
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Srushti Gangireddy
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Matthew S. Krantz
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- V. Eric Kerchberger
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Alyson L. Dickson
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Qiping Feng
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Zhijun Yin
  - Department of Computer Science, Vanderbilt University, Nashville, Tennessee, United States of America
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Wei-Qi Wei
  - Department of Computer Science, Vanderbilt University, Nashville, Tennessee, United States of America
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
5
Rajwal S, Zhang Z, Chen Y, Rogers H, Sarker A, Xiao Y. Applications of Natural Language Processing and Large Language Models for Social Determinants of Health: Protocol for a Systematic Review. JMIR Res Protoc 2025; 14:e66094. [PMID: 39836952 PMCID: PMC11795155 DOI: 10.2196/66094] [Received: 09/03/2024] [Revised: 12/04/2024] [Accepted: 12/26/2024] [Indexed: 01/23/2025]
Abstract
BACKGROUND In recent years, the intersection of natural language processing (NLP) and public health has opened innovative pathways for investigating social determinants of health (SDOH) in textual datasets. Despite the promise of NLP in the SDOH domain, the literature is dispersed across various disciplines, and there is a need to consolidate existing knowledge, identify knowledge gaps in the literature, and inform future research directions in this emerging field.
OBJECTIVE This research protocol describes a systematic review to identify and highlight NLP techniques, including large language models, used for SDOH-related studies.
METHODS A search strategy will be executed across PubMed, Web of Science, IEEE Xplore, Scopus, PsycINFO, HealthSource: Academic Nursing, and ACL Anthology to find studies published in English between 2014 and 2024. Three reviewers (SR, ZZ, and YC) will independently screen the studies to avoid voting bias, and two additional reviewers (AS and YX) will resolve any conflicts during the screening process. We will further screen studies that cited the included studies (forward search). Following title, abstract, and full-text screening, the characteristics and main findings of the included studies and resources will be tabulated, visualized, and summarized.
RESULTS The search strategy was formulated and run across the 7 databases in August 2024. We expect the results to be submitted for peer-reviewed publication in early 2025. As of December 2024, the title and abstract screening was underway.
CONCLUSIONS This systematic review aims to provide a comprehensive study of existing research on the application of NLP for various SDOH tasks across multiple textual datasets. By rigorously evaluating the methodologies, tools, and outcomes of eligible studies, the review will identify gaps in current knowledge and suggest directions for future research in the form of specific research questions. The findings will be instrumental in developing more effective NLP models for SDOH, ultimately contributing to improved health outcomes and a better understanding of social determinants in diverse populations.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) DERR1-10.2196/66094.
Affiliation(s)
- Swati Rajwal
  - Department of Computer Science, Emory University, Atlanta, GA, United States
- Ziyuan Zhang
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States
- Yankai Chen
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, United States
- Hannah Rogers
  - Woodruff Health Sciences Center Library, Emory University, Atlanta, GA, United States
- Abeed Sarker
  - Department of Biomedical Informatics, Emory University, Atlanta, GA, United States
- Yunyu Xiao
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, United States
6
Li X, Peng L, Wang YP, Zhang W. Open challenges and opportunities in federated foundation models towards biomedical healthcare. BioData Min 2025; 18:2. [PMID: 39755653 DOI: 10.1186/s13040-024-00414-9] [Received: 05/19/2024] [Accepted: 12/09/2024] [Indexed: 01/06/2025]
Abstract
This survey explores the transformative impact of foundation models (FMs) in artificial intelligence, focusing on their integration with federated learning (FL) in biomedical research. Foundation models such as ChatGPT, LLaMa, and CLIP, which are trained on vast datasets through methods including unsupervised pretraining, self-supervised learning, instructed fine-tuning, and reinforcement learning from human feedback, represent significant advancements in machine learning. These models, with their ability to generate coherent text and realistic images, are crucial for biomedical applications that require processing diverse data forms such as clinical reports, diagnostic images, and multimodal patient interactions. The incorporation of FL with these sophisticated models presents a promising strategy to harness their analytical power while safeguarding the privacy of sensitive medical data. This approach not only enhances the capabilities of FMs in medical diagnostics and personalized treatment but also addresses critical concerns about data privacy and security in healthcare. This survey reviews the current applications of FMs in federated settings, underscores the challenges, and identifies future research directions including scaling FMs, managing data diversity, and enhancing communication efficiency within FL frameworks. The objective is to encourage further research into the combined potential of FMs and FL, laying the groundwork for healthcare innovations.
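Illustrative sketch (not from the survey): the abstract does not name a specific aggregation rule. To make the privacy argument concrete, the example below uses federated averaging (FedAvg), a widely used baseline in which each site trains locally and shares only model parameters, which a server then averages weighted by local sample count. The function name, the flat-list parameter representation, and the numbers are assumptions for this example.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local data set size."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


if __name__ == "__main__":
    # Two hypothetical hospitals; raw patient records never leave either site.
    hospital_a = [1.0, 2.0]   # parameters after local training, 1 sample
    hospital_b = [3.0, 4.0]   # parameters after local training, 3 samples
    print(federated_average([hospital_a, hospital_b], [1, 3]))
```

The communication cost of this step grows with the number of parameters, which is one reason the survey lists communication efficiency as an open problem when the shared model is a large foundation model.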
Affiliation(s)
- Xingyu Li
  - Department of Computer Science, Tulane University, New Orleans, LA, USA
- Lu Peng
  - Department of Computer Science, Tulane University, New Orleans, LA, USA
- Yu-Ping Wang
  - Department of Biomedical Engineering, Tulane University, New Orleans, LA, USA
- Weihua Zhang
  - School of Computer Science, Fudan University, Shanghai, China
7
Murnan AW, Tscholl JJ, Ganta R, Duah HO, Qasem I, Sezgin E. Identification of Child Survivors of Sex Trafficking From Electronic Health Records: An Artificial Intelligence Guided Approach. Child Maltreatment 2024; 29:601-611. [PMID: 37545138 PMCID: PMC11000265 DOI: 10.1177/10775595231194599] [Indexed: 08/08/2023]
Abstract
Survivors of child sex trafficking (SCST) experience high rates of adverse health outcomes. During their victimization, survivors regularly seek healthcare yet fail to be identified. This study sought to utilize artificial intelligence (AI) to identify SCST and describe the elements of their healthcare presentation. An AI-supported keyword search was conducted to identify SCST within the electronic medical records (EMR) of approximately 1.5 million patients at a large midwestern pediatric hospital. Descriptive analyses were used to evaluate associated diagnoses and clinical presentation. A sex trafficking-related keyword was identified in 0.18% of patient charts. Among this cohort, the most common associated diagnostic codes were for Confirmed Sexual/Physical Assault; Trauma and Stress-Related Disorders; Depressive Disorders; Anxiety Disorders; and Suicidal Ideation. Our findings are consistent with the myriad of known adverse physical and psychological outcomes among SCST and illuminate the future potential of AI technology to improve screening and research efforts surrounding all aspects of this vulnerable population.
Affiliation(s)
- Aaron W Murnan
  - College of Nursing, University of Cincinnati, Cincinnati, OH, USA
- Jennifer J Tscholl
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
  - Division of Child and Family Advocacy, Center for Family Safety and Healing, Nationwide Children's Hospital, Columbus, OH, USA
- Rajesh Ganta
  - Information Technology Research and Innovation, Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
- Henry O Duah
  - College of Nursing, University of Cincinnati, Cincinnati, OH, USA
- Islam Qasem
  - College of Nursing, University of Cincinnati, Cincinnati, OH, USA
- Emre Sezgin
  - Information Technology Research and Innovation, Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
  - Center for Biobehavioral Health, The Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
8
Guo Y, Ovadje A, Al-Garadi MA, Sarker A. Evaluating large language models for health-related text classification tasks with public social media data. J Am Med Inform Assoc 2024; 31:2181-2189. [PMID: 39121174 DOI: 10.1093/jamia/ocae210] [Received: 05/13/2024] [Revised: 07/08/2024] [Accepted: 08/07/2024] [Indexed: 08/11/2024]
Abstract
OBJECTIVES Large language models (LLMs) have demonstrated remarkable success in natural language processing (NLP) tasks. This study aimed to evaluate their performance on social media-based health-related text classification tasks.
MATERIALS AND METHODS We benchmarked 1 Support Vector Machine (SVM), 3 supervised pretrained language models (PLMs), and 2 LLM-based classifiers across 6 text classification tasks. We developed 3 approaches for leveraging LLMs: employing LLMs as zero-shot classifiers, using LLMs as data annotators, and utilizing LLMs with few-shot examples for data augmentation.
RESULTS Across all tasks, the mean (SD) F1 score differences for RoBERTa, BERTweet, and SocBERT trained on human-annotated data were 0.24 (±0.10), 0.25 (±0.11), and 0.23 (±0.11), respectively, compared to those trained on the data annotated using GPT3.5, and were 0.16 (±0.07), 0.16 (±0.08), and 0.14 (±0.08) using GPT4, respectively. The GPT3.5 and GPT4 zero-shot classifiers outperformed SVMs in 1 and 5 of the 6 tasks, respectively. When leveraging LLMs for data augmentation, the RoBERTa models trained on GPT4-augmented data demonstrated superior or comparable performance compared to those trained on human-annotated data alone.
DISCUSSION The results revealed that using LLM-annotated data only for training supervised classification models was ineffective. However, employing the LLM as a zero-shot classifier exhibited the potential to outperform traditional SVM models and achieved a higher recall than the advanced transformer-based model RoBERTa. Additionally, our results indicated that utilizing GPT3.5 for data augmentation could potentially harm model performance. In contrast, data augmentation with GPT4 demonstrated improved model performance, showcasing the potential of LLMs in reducing the need for extensive training data.
CONCLUSIONS By leveraging the data augmentation strategy, we can harness the power of LLMs to develop smaller, more effective domain-specific NLP models. Using LLM-annotated data without human guidance for training lightweight supervised classification models is an ineffective strategy. However, an LLM used as a zero-shot classifier shows promise in excluding false negatives and potentially reducing the human effort required for data annotation.
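Illustrative sketch (not from the study): the results above are reported as differences in F1 score, the harmonic mean of precision and recall. The function below computes all three from raw counts for a single positive class. The abstract does not say which averaging scheme was used across classes or tasks, so this covers only the basic binary definition; the function name and the counts are assumptions.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one positive class.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
    print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

A classifier with high recall but low precision, the pattern noted for the zero-shot LLMs, will miss few positives while still earning a modest F1.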
Affiliation(s)
- Yuting Guo
  - Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, United States
- Anthony Ovadje
  - Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, United States
- Mohammed Ali Al-Garadi
  - Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37235, United States
- Abeed Sarker
  - Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, United States
  - Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, United States
9
Xie F, Lee MS, Allahwerdy S, Getahun D, Wessler B, Chen W. Identifying the Severity of Heart Valve Stenosis and Regurgitation Among a Diverse Population Within an Integrated Health Care System: Natural Language Processing Approach. JMIR Cardio 2024; 8:e60503. [PMID: 39348175 PMCID: PMC11474122 DOI: 10.2196/60503] [Received: 05/13/2024] [Revised: 09/04/2024] [Accepted: 09/09/2024] [Indexed: 10/01/2024]
Abstract
BACKGROUND Valvular heart disease (VHD) is a leading cause of cardiovascular morbidity and mortality that poses a substantial health care and economic burden on health care systems. Administrative diagnostic codes for ascertaining VHD diagnosis are incomplete.
OBJECTIVE This study aimed to develop a natural language processing (NLP) algorithm to identify patients with aortic, mitral, tricuspid, and pulmonic valve stenosis and regurgitation from transthoracic echocardiography (TTE) reports within a large integrated health care system.
METHODS We used reports from echocardiograms performed in the Kaiser Permanente Southern California (KPSC) health care system between January 1, 2011, and December 31, 2022. Related terms/phrases of aortic, mitral, tricuspid, and pulmonic stenosis and regurgitation and their severities were compiled from the literature and enriched with input from clinicians. An NLP algorithm was iteratively developed and refined via multiple rounds of chart review, followed by adjudication. The developed algorithm was applied to 200 annotated echocardiography reports to assess its performance and then to the study echocardiography reports.
RESULTS A total of 1,225,270 TTE reports were extracted from KPSC electronic health records during the study period. In these reports, valve lesions identified included 111,300 (9.08%) aortic stenosis, 20,246 (1.65%) mitral stenosis, 397 (0.03%) tricuspid stenosis, 2585 (0.21%) pulmonic stenosis, 345,115 (28.17%) aortic regurgitation, 802,103 (65.46%) mitral regurgitation, 903,965 (73.78%) tricuspid regurgitation, and 286,903 (23.42%) pulmonic regurgitation. Among the valves, 50,507 (4.12%), 22,656 (1.85%), 1685 (0.14%), and 1767 (0.14%) were identified as prosthetic aortic valves, mitral valves, tricuspid valves, and pulmonic valves, respectively. Mild and moderate were the most common severity levels of heart valve stenosis, while trace and mild were the most common severity levels of regurgitation. Males had a higher frequency of aortic stenosis and of regurgitation in all 4 valves, while females had more mitral, tricuspid, and pulmonic stenosis. Non-Hispanic Whites had the highest frequency of stenosis and regurgitation in all 4 valves. The distribution of valvular stenosis and regurgitation severity was similar across race/ethnicity groups. Frequencies of aortic stenosis, mitral stenosis, and regurgitation of all 4 heart valves increased with age. In TTE reports with stenosis detected, younger patients were more likely to have mild aortic stenosis, while older patients were more likely to have severe aortic stenosis. The pattern for mitral stenosis was the opposite (milder in older patients and more severe in younger patients). In TTE reports with regurgitation detected, younger patients had a higher frequency of severe/very severe aortic regurgitation. In comparison, older patients had higher frequencies of mild aortic regurgitation and severe mitral/tricuspid regurgitation. Validation of the NLP algorithm against the 200 annotated TTE reports showed excellent precision, recall, and F1-scores.
CONCLUSIONS The proposed computerized algorithm could effectively identify heart valve stenosis and regurgitation, as well as the severity of valvular involvement, with significant implications for pharmacoepidemiological studies and outcomes research.
Affiliation(s)
- Fagen Xie
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Ming-Sum Lee
  - Department of Cardiology, Los Angeles Medical Center, Kaiser Permanente Southern California, Pasadena, CA, United States
- Salam Allahwerdy
  - Department of Clinical Science, Kaiser Permanente Bernard J Tyson School of Medicine, Pasadena, CA, United States
- Darios Getahun
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Benjamin Wessler
  - Division of Cardiology, Tufts Medical Center, Boston, MA, United States
- Wansu Chen
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States

10
Abedian Kalkhoran H, Zwaveling J, van Hunsel F, Kant A. An innovative method to strengthen evidence for potential drug safety signals using Electronic Health Records. J Med Syst 2024; 48:51. [PMID: 38753223 PMCID: PMC11098892 DOI: 10.1007/s10916-024-02070-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 04/25/2024] [Indexed: 05/19/2024]
Abstract
Reports from spontaneous reporting systems (SRS) are hypothesis generating. Additional evidence such as more reports is required to determine whether the generated drug-event associations are in fact safety signals. However, underreporting of adverse drug reactions (ADRs) delays signal detection. Through the use of natural language processing, different sources of real-world data can be used to proactively collect additional evidence for potential safety signals. This study aims to explore the feasibility of using Electronic Health Records (EHRs) to identify additional cases based on initial indications from spontaneous ADR reports, with the goal of strengthening the evidence base for potential safety signals. For two confirmed and two potential signals generated by the SRS of the Netherlands Pharmacovigilance Centre Lareb, targeted searches in the EHR of the Leiden University Medical Centre were performed using a text-mining based tool, CTcue. The search for additional cases was done by constructing and running queries in the structured and free-text fields of the EHRs. We identified at least five additional cases for the confirmed signals and one additional case for each potential safety signal. The majority of the identified cases for the confirmed signals were documented in the EHRs before signal detection by the Dutch Medicines Evaluation Board. The identified cases for the potential signals were reported to Lareb as further evidence for signal detection. Our findings highlight the feasibility of performing targeted searches in the EHR based on an underlying hypothesis to provide further evidence for signal generation.
Affiliation(s)
- H Abedian Kalkhoran
  - Department of Clinical Pharmacology and Toxicology, Leiden University Medical Centre, Leiden, the Netherlands
  - Department of Pharmacy, Haga Teaching Hospital, The Hague, the Netherlands
- J Zwaveling
  - Department of Clinical Pharmacology and Toxicology, Leiden University Medical Centre, Leiden, the Netherlands
- F van Hunsel
  - The Netherlands Pharmacovigilance Centre Lareb, 's-Hertogenbosch, the Netherlands
- A Kant
  - Department of Clinical Pharmacology and Toxicology, Leiden University Medical Centre, Leiden, the Netherlands
  - The Netherlands Pharmacovigilance Centre Lareb, 's-Hertogenbosch, the Netherlands

11
Kaliush PR, Conradt E, Kerig PK, Williams PG, Crowell SE. A multilevel developmental psychopathology model of childbirth and the perinatal transition. Dev Psychopathol 2024; 36:533-544. [PMID: 36700362 PMCID: PMC10368796 DOI: 10.1017/s0954579422001389] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/27/2023]
Abstract
Despite recent applications of a developmental psychopathology perspective to the perinatal period, these conceptualizations have largely ignored the role that childbirth plays in the perinatal transition. Thus, we present a conceptual model of childbirth as a bridge between prenatal and postnatal health. We argue that biopsychosocial factors during pregnancy influence postnatal health trajectories both directly and indirectly through childbirth experiences, and we focus our review on those indirect effects. In order to frame our model within a developmental psychopathology lens, we first describe "typical" biopsychosocial aspects of pregnancy and childbirth. Then, we explore ways in which these processes may deviate from the norm to result in adverse or traumatic childbirth experiences. We briefly describe early postnatal health trajectories that may follow from these birth experiences, including those which are adaptive despite traumatic childbirth, and we conclude with implications for research and clinical practice. We intend for our model to illuminate the importance of including childbirth in multilevel perinatal research. This advancement is critical for reducing perinatal health disparities and promoting health and well-being among birthing parents and their children.
Affiliation(s)
- Parisa R. Kaliush
  - Department of Psychology, University of Utah, 380 South 1530 East, BEH S 502, Salt Lake City, UT 84112, USA
- Elisabeth Conradt
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC 27701, USA
- Patricia K. Kerig
  - Department of Psychology, University of Utah, 380 South 1530 East, BEH S 502, Salt Lake City, UT 84112, USA
- Paula G. Williams
  - Department of Psychology, University of Utah, 380 South 1530 East, BEH S 502, Salt Lake City, UT 84112, USA
- Sheila E. Crowell
  - Department of Psychology, University of Utah, 380 South 1530 East, BEH S 502, Salt Lake City, UT 84112, USA
  - Department of Psychiatry, University of Utah, Salt Lake City, UT 84108, USA
  - Department of Obstetrics and Gynecology, University of Utah, Salt Lake City, UT 84132, USA

12
Macieira TGR, Yao Y, Marcelle C, Mena N, Mino MM, Huynh TML, Chiampou C, Garcia AL, Montoya N, Sargent L, Keenan GM. Standardizing nursing data extracted from electronic health records for integration into a statewide clinical data research network. Int J Med Inform 2024; 183:105325. [PMID: 38176094 PMCID: PMC11018263 DOI: 10.1016/j.ijmedinf.2023.105325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 12/06/2023] [Accepted: 12/24/2023] [Indexed: 01/06/2024]
Abstract
BACKGROUND Care plans documented by nurses in electronic health records (EHR) are a rich source of data to generate knowledge and measure the impact of nursing care. Unfortunately, there is a lack of integration of these data in clinical data research networks (CDRN) data trusts, due in large part to nursing care being documented with local vocabulary, resulting in non-standardized data. The absence of high-quality nursing care plan data in data trusts limits the investigation of interdisciplinary care aimed at improving patient outcomes. OBJECTIVE To map local nursing care plan terms for patients' problems and goals in the EHR of one large health system to the standardized nursing terminologies (SNTs), NANDA International (NANDA-I), and Nursing Outcomes Classification (NOC). METHODS We extracted local problems and goals used by nurses to document care plans from two hospitals. After removing duplicates, the terms were independently mapped to NANDA-I and NOC by five mappers. Four nurses who regularly use the local vocabulary validated the mapping. RESULTS 83% of local problem terms were mapped to NANDA-I labels and 93% of local goal terms were mapped to NOC labels. The nurses agreed with 95% of the mapping. Local terms not mapped to labels were mapped to the domains or classes of the respective terminologies. CONCLUSION Mapping local vocabularies used by nurses in EHRs to SNTs is a foundational step to making interoperable nursing data available for research and other secondary purposes in large data trusts. This study is the first phase of a larger project building, for the first time, a pipeline to standardize, harmonize, and integrate nursing care plan data from multiple Florida hospitals into the statewide CDRN OneFlorida+ Clinical Research Network data trust.
Affiliation(s)
- Tamara G R Macieira
  - Department of Family, Community and Health System Science, College of Nursing, University of Florida, PO Box 100197, Gainesville, FL 32610, United States
- Yingwei Yao
  - Department of Biobehavioral Nursing Science, College of Nursing, University of Florida, PO Box 100197, Gainesville, FL 32610, United States
- Cassie Marcelle
  - University of Florida Health Information Technology, 3011 SW Williston Rd, Gainesville, FL 32608, United States
- Nathan Mena
  - University of Florida Health, 1600 SW Archer Rd, Gainesville, FL 32608, United States
- Mikayla M Mino
  - College of Nursing, University of Florida, PO Box 100197, Gainesville, FL 32610, United States
- Trieu M L Huynh
  - College of Nursing, University of Florida, PO Box 100197, Gainesville, FL 32610, United States
- Caitlin Chiampou
  - College of Nursing, University of Florida, PO Box 100197, Gainesville, FL 32610, United States
- Amanda L Garcia
  - College of Nursing, University of Florida, PO Box 100197, Gainesville, FL 32610, United States
- Noelle Montoya
  - University of Florida Health, 1600 SW Archer Rd, Gainesville, FL 32608, United States
- Laura Sargent
  - University of Florida Health, 1600 SW Archer Rd, Gainesville, FL 32608, United States
- Gail M Keenan
  - Department of Family, Community and Health System Science, College of Nursing, University of Florida, PO Box 100197, Gainesville, FL 32610, United States

13
Xie F, Chang J, Luong T, Wu B, Lustigova E, Shrader E, Chen W. Identifying Symptoms Prior to Pancreatic Ductal Adenocarcinoma Diagnosis in Real-World Care Settings: Natural Language Processing Approach. JMIR AI 2024; 3:e51240. [PMID: 38875566 PMCID: PMC11041417 DOI: 10.2196/51240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 12/08/2023] [Accepted: 12/16/2023] [Indexed: 06/16/2024]
Abstract
BACKGROUND Pancreatic cancer is the third leading cause of cancer deaths in the United States. Pancreatic ductal adenocarcinoma (PDAC) is the most common form of pancreatic cancer, accounting for up to 90% of all cases. Patient-reported symptoms are often the triggers of cancer diagnosis and therefore, understanding the PDAC-associated symptoms and the timing of symptom onset could facilitate early detection of PDAC. OBJECTIVE This paper aims to develop a natural language processing (NLP) algorithm to capture symptoms associated with PDAC from clinical notes within a large integrated health care system. METHODS We used unstructured data within 2 years prior to PDAC diagnosis between 2010 and 2019 and among matched patients without PDAC to identify 17 PDAC-related symptoms. Related terms and phrases were first compiled from publicly available resources and then recursively reviewed and enriched with input from clinicians and chart review. A computerized NLP algorithm was iteratively developed and fine-trained via multiple rounds of chart review followed by adjudication. Finally, the developed algorithm was applied to the validation data set to assess performance and to the study implementation notes. RESULTS A total of 408,147 and 709,789 notes were retrieved from 2611 patients with PDAC and 10,085 matched patients without PDAC, respectively. In descending order, the symptom distribution of the study implementation notes ranged from 4.98% for abdominal or epigastric pain to 0.05% for upper extremity deep vein thrombosis in the PDAC group, and from 1.75% for back pain to 0.01% for pale stool in the non-PDAC group. Validation of the NLP algorithm against adjudicated chart review results of 1000 notes showed that precision ranged from 98.9% (jaundice) to 84% (upper extremity deep vein thrombosis), recall ranged from 98.1% (weight loss) to 82.8% (epigastric bloating), and F1-scores ranged from 0.97 (jaundice) to 0.86 (depression). 
CONCLUSIONS The developed and validated NLP algorithm could be used for the early detection of PDAC.
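For readers unfamiliar with the validation metrics reported in this abstract, the following is a minimal sketch of how precision, recall, and F1 are conventionally computed from per-symptom counts when algorithm output is compared with adjudicated chart review. The `Counts` class and the example numbers are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Per-symptom counts from comparing NLP output with chart review (hypothetical)."""
    tp: int  # NLP flagged the symptom and chart review agreed
    fp: int  # NLP flagged the symptom but chart review did not
    fn: int  # chart review found the symptom but NLP missed it


def precision(c: Counts) -> float:
    # Share of NLP-flagged notes that chart review confirmed.
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def recall(c: Counts) -> float:
    # Share of chart-review positives that the NLP algorithm found.
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def f1(c: Counts) -> float:
    # Harmonic mean of precision and recall.
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Illustrative counts for a single symptom; not data from the paper.
example = Counts(tp=90, fp=10, fn=5)
print(f"precision={precision(example):.3f} "
      f"recall={recall(example):.3f} f1={f1(example):.3f}")
```

Computing the metrics per symptom, as the abstract reports them, keeps rare symptoms from being masked by common ones.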
Affiliation(s)
- Fagen Xie
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Jenny Chang
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Tiffany Luong
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Bechien Wu
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Eva Lustigova
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Eva Shrader
  - Pancreatic Cancer Action Network, Manhattan Beach, CA, United States
- Wansu Chen
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States

14
Tomlin HR, Wissing M, Tanikella S, Kaur P, Tabas L. Challenges and Opportunities for Professional Medical Publications Writers to Contribute to Plain Language Summaries (PLS) in an AI/ML Environment - A Consumer Health Informatics Systematic Review. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2024; 2023:709-717. [PMID: 38222388 PMCID: PMC10785924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
Professional medical publications writers (PMWs) cover a wide range of biomedical writing activities that recently includes translation of biomedical publications to plain language summaries (PLS). The consumer health informatics literature (CHI) consistently describes the importance of incorporating health literacy principles in any natural language processing (NLP) app designed to communicate medical information to lay audiences, particularly patients. In this stepwise systematic review, we searched PubMed indexed literature for CHI NLP-based apps that have the potential to assist PMWs in developing text based PLS. Results showed that available apps are limited to patient portals and other technologies used to communicate medical text and reports from electronic health records. PMWs can apply the lessons learned from CHI NLP-based apps to supervise development of tools specific to text simplification and summarization for PLS from biomedical publications.
Affiliation(s)
- Holly R Tomlin
  - Certara Synchrogenix, Wilmington, DE, USA
  - Consumer Health Informatics Lab (CHIL), Section of Biostatistics and Data Sciences, Yale School of Medicine, New Haven, CT
  - Weill Cornell Medicine, Department of Population Health Sciences, Division of Health Analytics, New York, NY

15
Sim JA, Huang X, Horan MR, Stewart CM, Robison LL, Hudson MM, Baker JN, Huang IC. Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review. Artif Intell Med 2023; 146:102701. [PMID: 38042599 PMCID: PMC10693655 DOI: 10.1016/j.artmed.2023.102701] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 04/06/2023] [Revised: 09/30/2023] [Accepted: 10/29/2023] [Indexed: 12/04/2023]
Abstract
OBJECTIVE Natural language processing (NLP) combined with machine learning (ML) techniques are increasingly used to process unstructured/free-text patient-reported outcome (PRO) data available in electronic health records (EHRs). This systematic review summarizes the literature reporting NLP/ML systems/toolkits for analyzing PROs in clinical narratives of EHRs and discusses the future directions for the application of this modality in clinical care. METHODS We searched PubMed, Scopus, and Web of Science for studies written in English between 1/1/2000 and 12/31/2020. Seventy-nine studies meeting the eligibility criteria were included. We abstracted and summarized information related to the study purpose, patient population, type/source/amount of unstructured PRO data, linguistic features, and NLP systems/toolkits for processing unstructured PROs in EHRs. RESULTS Most of the studies used NLP/ML techniques to extract PROs from clinical narratives (n = 74) and mapped the extracted PROs into specific PRO domains for phenotyping or clustering purposes (n = 26). Some studies used NLP/ML to process PROs for predicting disease progression or onset of adverse events (n = 22) or developing/validating NLP/ML pipelines for analyzing unstructured PROs (n = 19). Studies used different linguistic features, including lexical, syntactic, semantic, and contextual features, to process unstructured PROs. Among the 25 NLP systems/toolkits we identified, 15 used rule-based NLP, 6 used hybrid NLP, and 4 used non-neural ML algorithms embedded in NLP. CONCLUSIONS This study supports the potential utility of different NLP/ML techniques in processing unstructured PROs available in EHRs for clinical care. Though using annotation rules for NLP/ML to analyze unstructured PROs is dominant, deploying novel neural ML-based methods is warranted.
Affiliation(s)
- Jin-Ah Sim
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States
  - School of AI Convergence, Hallym University, Chuncheon, Republic of Korea
- Xiaolei Huang
  - Department of Computer Science, University of Memphis, Memphis, TN, United States
- Madeline R Horan
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States
- Christopher M Stewart
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- Leslie L Robison
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States
- Melissa M Hudson
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States
  - Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN, United States
- Justin N Baker
  - Department of Pediatrics, Stanford University, Stanford, CA, United States
- I-Chan Huang
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States

16
Kugic A, Pfeifer B, Schulz S, Kreuzthaler M. Embedding-based terminology expansion via secondary use of large clinical real-world datasets. J Biomed Inform 2023; 147:104497. [PMID: 37777164 DOI: 10.1016/j.jbi.2023.104497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Revised: 06/06/2023] [Accepted: 09/08/2023] [Indexed: 10/02/2023]
Abstract
A log-likelihood based co-occurrence analysis of ∼1.9 million de-identified ICD-10 codes and related short textual problem list entries generated possible term candidates at a significance level of p<0.01. These top 10 term candidates, consisting of 1 to 5-grams, were used as seed terms for an embedding based nearest neighbor approach to fetch additional synonyms, hypernyms and hyponyms in the respective n-gram embedding spaces by leveraging two different language models. This was done to analyze the lexicality of the resulting term candidates and to compare the term classifications of both models. We found no difference in system performance during the processing of lexical and non-lexical content, i.e. abbreviations, acronyms, etc. Additionally, an application-oriented analysis of the SapBERT (Self-Alignment Pretraining for Biomedical Entity Representations) language model indicates suitable performance for the extraction of all term classifications such as synonyms, hypernyms, and hyponyms.
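The abstract does not give the exact statistic used, so the following is only a sketch under the assumption that a standard log-likelihood ratio (G²) on a 2x2 contingency table is meant. It scores how strongly a candidate term co-occurs with a code and compares the score with the chi-square critical value for p = 0.01 at one degree of freedom. The function name `g2` and the example counts are hypothetical.

```python
import math

# Chi-square critical value for 1 degree of freedom at p = 0.01.
CRITICAL_P01 = 6.635


def g2(k11: int, k12: int, k21: int, k22: int) -> float:
    """Log-likelihood ratio (G^2) for a 2x2 co-occurrence table.

    k11: entries with the code and the term
    k12: entries with the code but not the term
    k21: entries with the term but not the code
    k22: entries with neither
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing (0 * ln 0 is taken as 0)
                expected = rows[i] * cols[j] / total
                stat += o * math.log(o / expected)
    return 2.0 * stat


# Illustrative counts only; not data from the study.
score = g2(50, 5, 5, 940)
print(f"G2 = {score:.1f}, candidate at p<0.01: {score > CRITICAL_P01}")
```

Candidates passing the threshold would then serve as seed terms for the embedding-based nearest-neighbor search the abstract describes.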
Affiliation(s)
- Amila Kugic
  - Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria
- Bastian Pfeifer
  - Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria
- Stefan Schulz
  - Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria
- Markus Kreuzthaler
  - Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria

17
Stamer T, Steinhäuser J, Flägel K. Artificial Intelligence Supporting the Training of Communication Skills in the Education of Health Care Professions: Scoping Review. J Med Internet Res 2023; 25:e43311. [PMID: 37335593 PMCID: PMC10337453 DOI: 10.2196/43311] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 10/07/2022] [Revised: 03/10/2023] [Accepted: 04/26/2023] [Indexed: 06/21/2023]
Abstract
BACKGROUND Communication is a crucial element of every health care profession, rendering communication skills training in all health care professions as being of great importance. Technological advances such as artificial intelligence (AI) and particularly machine learning (ML) may support this cause: it may provide students with an opportunity for easily accessible and readily available communication training. OBJECTIVE This scoping review aimed to summarize the status quo regarding the use of AI or ML in the acquisition of communication skills in academic health care professions. METHODS We conducted a comprehensive literature search across the PubMed, Scopus, Cochrane Library, Web of Science Core Collection, and CINAHL databases to identify articles that covered the use of AI or ML in communication skills training of undergraduate students pursuing health care profession education. Using an inductive approach, the included studies were organized into distinct categories. The specific characteristics of the studies, methods and techniques used by AI or ML applications, and main outcomes of the studies were evaluated. Furthermore, supporting and hindering factors in the use of AI and ML for communication skills training of health care professionals were outlined. RESULTS The titles and abstracts of 385 studies were identified, of which 29 (7.5%) underwent full-text review. Of the 29 studies, based on the inclusion and exclusion criteria, 12 (3.1%) were included. The studies were organized into 3 distinct categories: studies using AI and ML for text analysis and information extraction, studies using AI and ML and virtual reality, and studies using AI and ML and the simulation of virtual patients, each within the academic training of the communication skills of health care professionals. Within these thematic domains, AI was also used for the provision of feedback. The motivation of the involved agents played a major role in the implementation process. 
Reported barriers to the use of AI and ML in communication skills training revolved around the lack of authenticity and limited natural flow of language exhibited by the AI- and ML-based virtual patient systems. Furthermore, the use of educational AI- and ML-based systems in communication skills training for health care professionals is currently limited to only a few cases, topics, and clinical domains. CONCLUSIONS The use of AI and ML in communication skills training for health care professionals is clearly a growing and promising field with a potential to render training more cost-effective and less time-consuming. Furthermore, it may serve learners as an individualized and readily available exercise method. However, in most cases, the outlined applications and technical solutions are limited in terms of access, possible scenarios, the natural flow of a conversation, and authenticity. These issues still stand in the way of any widespread implementation ambitions.
Affiliation(s)
- Tjorven Stamer
  - Institute of Family Medicine, University Hospital Schleswig-Holstein Luebeck Campus, Luebeck, Germany
- Jost Steinhäuser
  - Institute of Family Medicine, University Hospital Schleswig-Holstein Luebeck Campus, Luebeck, Germany
- Kristina Flägel
  - Institute of Family Medicine, University Hospital Schleswig-Holstein Luebeck Campus, Luebeck, Germany

18
Shelest-Szumilas O, Wozniak M. The Fears and Hopes of Ukrainian Migrant Workers in Poland in the Pandemic Era. JOURNAL OF INTERNATIONAL MIGRATION AND INTEGRATION 2023; 24:1-23. [PMID: 37360639 PMCID: PMC10209937 DOI: 10.1007/s12134-023-01051-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/02/2023] [Indexed: 06/28/2023]
Abstract
Due to the COVID-19 pandemic, many immigrants found themselves in extremely unstable situations. The recent contributions show that employment decline in the first several months of the lockdown was higher for migrant workers than for natives. At the same time, migrants were less likely to find new employment in the recovery months. Such circumstances may result in an increased level of anxiety about one's economic situation. On the other hand, an unfavorable environment may induce resources that could help to overcome it. The paper aims to reveal migrants' concerns together with ambitions connected with the economic activity during the pandemic. The study is based on 30 individual in-depth interviews with Ukrainian migrant workers from Poland. The research approach was based on Natural Language Processing techniques. We employed sentiment analysis algorithms, and on a basis of selected lexicons, we extracted fears and hopes that appear in migrants' narrations. We also identified major topics and associated them with specific sentiments. Pandemic induced several matters connected with e.g., the stability of employment, discrimination, relationships, family, and financial situation. These affairs are usually connected on the basis of a cause-and-effect relationship. In addition, while several topics were common for both male and female participants, some of them were specific for each group.
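The abstract does not name the lexicons or algorithms used, so the following is only a rough sketch of what lexicon-based tagging of "fears" and "hopes" in interview text can look like. The two word lists and the function `tag_sentence` are hypothetical placeholders; a real analysis would rely on validated lexicons and would handle negation and multi-word expressions.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons for illustration only; the study's lexicons
# are not listed in the abstract.
FEAR_TERMS = {"afraid", "fear", "worried", "anxious", "uncertain"}
HOPE_TERMS = {"hope", "plan", "better", "opportunity", "confident"}


def tag_sentence(sentence: str) -> Counter:
    """Count fear-related and hope-related lexicon hits in one sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    counts = Counter()
    for token in tokens:
        if token in FEAR_TERMS:
            counts["fear"] += 1
        if token in HOPE_TERMS:
            counts["hope"] += 1
    return counts


sample = "I was afraid of losing my job, but I hope for a better one."
print(dict(tag_sentence(sample)))
```

Aggregating such counts per interview or per topic gives a simple way to associate topics with the sentiments expressed around them.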
Affiliation(s)
- Olena Shelest-Szumilas
  - Department of Education and Personnel Development, Poznan University of Economics and Business, Poznan, Poland

19
Das D, Kumar N, Longjam LA, Sinha R, Deb Roy A, Mondal H, Gupta P. Assessing the Capability of ChatGPT in Answering First- and Second-Order Knowledge Questions on Microbiology as per Competency-Based Medical Education Curriculum. Cureus 2023; 15:e36034. [PMID: 37056538 PMCID: PMC10086829 DOI: 10.7759/cureus.36034] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Accepted: 03/11/2023] [Indexed: 03/13/2023]
Abstract
Background and objective ChatGPT is an artificial intelligence (AI) language model that has been trained to process and respond to questions across a wide range of topics. It is also capable of solving problems in medical educational topics. However, the capability of ChatGPT to accurately answer first- and second-order knowledge questions in the field of microbiology has not been explored so far. Hence, in this study, we aimed to analyze the capability of ChatGPT in answering first- and second-order questions on the subject of microbiology. Materials and methods Based on the competency-based medical education (CBME) curriculum of the subject of microbiology, we prepared a set of first-order and second-order questions. For the total of eight modules in the CBME curriculum for microbiology, we prepared six first-order and six second-order knowledge questions according to the National Medical Commission-recommended CBME curriculum, amounting to a total of (8 x 12) 96 questions. The questions were checked for content validity by three expert microbiologists. These questions were used to converse with ChatGPT by a single user and responses were recorded for further analysis. The answers were scored by three microbiologists on a rating scale of 0-5. The average of three scores was taken as the final score for analysis. As the data were not normally distributed, we used a non-parametric statistical test. The overall scores were tested by a one-sample median test with hypothetical values of 4 and 5. The scores of answers to first-order and second-order questions were compared by the Mann-Whitney U test. Module-wise responses were tested by the Kruskal-Wallis test followed by the post hoc test for pairwise comparisons.
Results The overall score of 96 answers was 4.04 ±0.37 (median: 4.17, Q1-Q3: 3.88-4.33) with the mean score of answers to first-order knowledge questions being 4.07 ±0.32 (median: 4.17, Q1-Q3: 4-4.33) and that of answers to second-order knowledge questions being 3.99 ±0.43 (median: 4, Q1-Q3: 3.67-4.33) (Mann-Whitney p=0.4). The score was significantly below the score of 5 (one-sample median test p<0.0001) and similar to 4 (one-sample median test p=0.09). Overall, there was a variation in median scores obtained in eight categories of topics in microbiology, indicating inconsistent performance in different topics. Conclusion The results of the study indicate that ChatGPT is capable of answering both first- and second-order knowledge questions related to the subject of microbiology. The model achieved an accuracy of approximately 80% and there was no difference between the model's capability of answering first-order questions and second-order knowledge questions. The findings of this study suggest that ChatGPT has the potential to be an effective tool for automated question-answering in the field of microbiology. However, continued improvement in the training and development of language models is necessary to enhance their performance and make them suitable for academic use.

20
Ten Considerations for Integrating Patient-Reported Outcomes into Clinical Care for Childhood Cancer Survivors. Cancers (Basel) 2023; 15:cancers15041024. [PMID: 36831370 PMCID: PMC9954048 DOI: 10.3390/cancers15041024] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/04/2023] [Revised: 01/28/2023] [Accepted: 02/01/2023] [Indexed: 02/08/2023] Open
Abstract
Patient-reported outcome measures (PROMs) are subjective assessments of health status or health-related quality of life. In childhood cancer survivors, PROMs can be used to evaluate the adverse effects of cancer treatment and guide cancer survivorship care. However, there are barriers to integrating PROMs into clinical practice, such as constraints in clinical validity, meaningful interpretation, and technology-enabled administration of the measures. This article discusses these barriers and proposes 10 important considerations for appropriate PROM integration into clinical care for choosing the right measure (considering the purpose of using a PROM, health profile vs. health preference approaches, measurement properties), ensuring survivors complete the PROMs (data collection method, data collection frequency, survivor capacity, self- vs. proxy reports), interpreting the results (scoring methods, clinical meaning and interpretability), and selecting a strategy for clinical response (integration into the clinical workflow). An example framework for integrating novel patient-reported outcome (PRO) data collection into the clinical workflow for childhood cancer survivorship care is also discussed. As we continuously improve the clinical validity of PROMs and address implementation barriers, routine PRO assessment and monitoring in pediatric cancer survivorship offer opportunities to facilitate clinical decision making and improve the quality of survivorship care.

21
Omranian S, Zolnoori M, Huang M, Campos-Castillo C, McRoy S. Predicting Patient Satisfaction With Medications for Treating Opioid Use Disorder: Case Study Applying Natural Language Processing to Reviews of Methadone and Buprenorphine/Naloxone on Health-Related Social Media. JMIR INFODEMIOLOGY 2023; 3:e37207. [PMID: 37113381 PMCID: PMC9987197 DOI: 10.2196/37207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2022] [Revised: 07/06/2022] [Accepted: 12/30/2022] [Indexed: 04/29/2023]
Abstract
Background Medication-assisted treatment (MAT) is an effective method for treating opioid use disorder (OUD), which combines behavioral therapies with one of three Food and Drug Administration-approved medications: methadone, buprenorphine, and naloxone. While MAT has been shown to be effective initially, there is a need for more information from the patient perspective about the satisfaction with medications. Existing research focuses on patient satisfaction with the entirety of the treatment, making it difficult to determine the unique role of medication and overlooking the views of those who may lack access to treatment due to being uninsured or concerns over stigma. Studies focusing on patients' perspectives are also limited by the lack of scales that can efficiently collect self-reports across domains of concerns. Objective A broad survey of patients' viewpoints can be obtained through social media and drug review forums, which are then assessed using automated methods to discover factors associated with medication satisfaction. Because the text is unstructured, it may contain a mix of formal and informal language. The primary aim of this study was to use natural language processing methods on text posted on health-related social media to detect patients' satisfaction with two well-studied OUD medications: methadone and buprenorphine/naloxone. Methods We collected 4353 patient reviews of methadone and buprenorphine/naloxone from 2008 to 2021 posted on WebMD and Drugs.com. To build our predictive models for detecting patient satisfaction, we first employed different analyses to build four input feature sets using the vectorized text, topic models, duration of treatment, and biomedical concepts by applying MetaMap. We then developed six prediction models: logistic regression, Elastic Net, least absolute shrinkage and selection operator, random forest classifier, Ridge classifier, and extreme gradient boosting to predict patients' satisfaction. 
Lastly, we compared the prediction models' performance over different feature sets. Results Topics discovered included oral sensation, side effects, insurance, and doctor visits. Biomedical concepts included symptoms, drugs, and illnesses. The F-score of the predictive models across all methods ranged from 89.9% to 90.8%. The Ridge classifier model, a regression-based method, outperformed the other models. Conclusions Assessment of patients' satisfaction with opioid dependency treatment medication can be predicted using automated text analysis. Adding biomedical concepts such as symptoms, drug name, and illness, along with the duration of treatment and topic models, had the most benefits for improving the prediction performance of the Elastic Net model compared to other models. Some of the factors associated with patient satisfaction overlap with domains covered in medication satisfaction scales (eg, side effects) and qualitative patient reports (eg, doctors' visits), while others (insurance) are overlooked, thereby underscoring the value added from processing text on online health forums to better understand patient adherence.
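The F-score used to compare the models combines precision and recall. The abstract does not say which variant was used, so the sketch below assumes the common F1 form (beta = 1). The function name and the confusion counts are hypothetical.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for a "satisfied" class (illustration only).
print(round(f_score(tp=90, fp=10, fn=10), 3))
```

Because the score is a harmonic mean, it stays high only when both precision and recall are high. That makes it a reasonable single figure for comparing classifiers on review data where the classes may be imbalanced.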
Affiliation(s)
- Samaneh Omranian
  - Department of Electrical Engineering and Computer Science, College of Engineering & Applied Science, University of Wisconsin-Milwaukee, Milwaukee, WI, United States
- Maryam Zolnoori
  - School of Nursing, Columbia University, New York, NY, United States
- Ming Huang
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, United States
- Celeste Campos-Castillo
  - Department of Media and Information, Michigan State University, East Lansing, MI, United States
- Susan McRoy
  - Department of Electrical Engineering and Computer Science, College of Engineering & Applied Science, University of Wisconsin-Milwaukee, Milwaukee, WI, United States

22
Textual emotion detection in health: Advances and applications. J Biomed Inform 2023; 137:104258. [PMID: 36528329 DOI: 10.1016/j.jbi.2022.104258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 11/24/2022] [Accepted: 11/27/2022] [Indexed: 12/23/2022]
Abstract
Textual Emotion Detection (TED) is a rapidly growing area in Natural Language Processing (NLP) that aims to detect emotions expressed through text. In this paper, we provide a review of the latest research and development in TED as applied in health and medicine. We focus on medical and non-medical data types, use cases, and methods where TED has been integral in supporting decision-making. The application of NLP technologies in health, and particularly TED, requires high confidence that these technologies and technology-aided treatment will first, do no harm. Therefore, this review also aims to assess the accuracy of TED systems and provide an update on the state of the technology. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines were used in this review. With a specific focus on the identification of different human emotions in text, the more general sentiment analysis studies that only recognize the polarity of text were excluded. A total of 66 papers met the inclusion criteria. This review found that TED in health and medicine is mainly used in the detection of depression, suicidal ideation, and the mental status of patients with asthma, Alzheimer's disease, cancer, and diabetes with major data sources of social media, healthcare services, and counseling centers. Approximately, 44% of the research in the domain is related to COVID-19, investigating the public health response to vaccinations and the emotional response of the public. In most cases, deep learning-based NLP techniques were found to be preferred over other methods due to their superior performance. Developing methods for implementing and evaluating dimensional emotional models, resolving annotation challenges by utilizing health-related lexicons, and using deep learning techniques for multi-faceted and real-time applications were found to be among the main avenues for further development of TED applications in health.

23
Inteligencia artificial al servicio de la salud del futuro [Artificial intelligence in the service of the health of the future]. REVISTA MÉDICA CLÍNICA LAS CONDES 2023. [DOI: 10.1016/j.rmclc.2022.12.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023] Open

24
Azizi S, Hier DB, Wunsch II DC. Enhanced neurologic concept recognition using a named entity recognition model based on transformers. Front Digit Health 2022; 4:1065581. [PMID: 36569804 PMCID: PMC9772022 DOI: 10.3389/fdgth.2022.1065581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2022] [Accepted: 11/21/2022] [Indexed: 12/12/2022] Open
Abstract
Although deep learning has been applied to the recognition of diseases and drugs in electronic health records and the biomedical literature, relatively little study has been devoted to the utility of deep learning for the recognition of signs and symptoms. The recognition of signs and symptoms is critical to the success of deep phenotyping and precision medicine. We have developed a named entity recognition model that uses deep learning to identify text spans containing neurological signs and symptoms and then maps these text spans to the clinical concepts of a neuro-ontology. We compared a model based on convolutional neural networks to one based on bidirectional encoder representation from transformers. Models were evaluated for accuracy of text span identification on three text corpora: physician notes from an electronic health record, case histories from neurologic textbooks, and clinical synopses from an online database of genetic diseases. Both models performed best on the professionally-written clinical synopses and worst on the physician-written clinical notes. Both models performed better when signs and symptoms were represented as shorter text spans. Consistent with prior studies that examined the recognition of diseases and drugs, the model based on bidirectional encoder representations from transformers outperformed the model based on convolutional neural networks for recognizing signs and symptoms. Recall for signs and symptoms ranged from 59.5% to 82.0% and precision ranged from 61.7% to 80.4%. With further advances in NLP, fully automated recognition of signs and symptoms in electronic health records and the medical literature should be feasible.
Affiliation(s)
- Sima Azizi
  - Applied Computational Intelligence Laboratory, Department of Electrical & Computer Engineering, Missouri University of Science & Technology, Rolla, MO, United States
- Daniel B. Hier
  - Applied Computational Intelligence Laboratory, Department of Electrical & Computer Engineering, Missouri University of Science & Technology, Rolla, MO, United States
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Donald C. Wunsch II
  - Applied Computational Intelligence Laboratory, Department of Electrical & Computer Engineering, Missouri University of Science & Technology, Rolla, MO, United States
  - National Science Foundation, ECCS Division, Arlington, VA, United States

25
Guo Y, Ge Y, Yang YC, Al-Garadi MA, Sarker A. Comparison of Pretraining Models and Strategies for Health-Related Social Media Text Classification. Healthcare (Basel) 2022; 10:healthcare10081478. [PMID: 36011135 PMCID: PMC9408372 DOI: 10.3390/healthcare10081478] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/05/2022] [Revised: 07/29/2022] [Accepted: 08/02/2022] [Indexed: 11/24/2022] Open
Abstract
Pretrained contextual language models proposed in the recent past have been reported to achieve state-of-the-art performance in many natural language processing (NLP) tasks, including those involving health-related social media data. We sought to evaluate the effectiveness of different pretrained transformer-based models for social media-based health-related text classification tasks. An additional objective was to explore and propose effective pretraining strategies to improve machine learning performance on such datasets and tasks. We benchmarked six transformer-based models that were pretrained with texts from different domains and sources (BERT, RoBERTa, BERTweet, TwitterBERT, BioClinical_BERT, and BioBERT) on 22 social media-based health-related text classification tasks. For the top-performing models, we explored the possibility of further boosting performance by comparing several pretraining strategies: domain-adaptive pretraining (DAPT), source-adaptive pretraining (SAPT), and a novel approach called topic-specific pretraining (TSPT). We also attempted to interpret the impacts of distinct pretraining strategies by visualizing document-level embeddings at different stages of the training process. RoBERTa outperformed BERTweet on most tasks, and both performed better than the other models. BERT, TwitterBERT, BioClinical_BERT and BioBERT consistently underperformed. For pretraining strategies, SAPT performed better than or comparably to the off-the-shelf models, and significantly outperformed DAPT. SAPT + TSPT showed consistently high performance, with statistically significant improvement in three tasks. Our findings demonstrate that RoBERTa and BERTweet are excellent off-the-shelf models for health-related social media text classification, and extended pretraining using SAPT and TSPT can further improve performance.
Affiliation(s)
- Yuting Guo
  - Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, USA
- Yao Ge
  - Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, USA
- Yuan-Chi Yang
  - Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, USA
- Mohammed Ali Al-Garadi
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37240, USA
- Abeed Sarker
  - Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, USA

26
Skeen SJ, Jones SS, Cruse CM, Horvath KJ. Integrating Natural Language Processing and Interpretive Thematic Analyses to Gain Human-Centered Design Insights on HIV Mobile Health: Proof-of-Concept Analysis. JMIR Hum Factors 2022; 9:e37350. [PMID: 35862171 PMCID: PMC9353680 DOI: 10.2196/37350] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/16/2022] [Revised: 06/13/2022] [Accepted: 06/13/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND HIV mobile health (mHealth) interventions often incorporate interactive peer-to-peer features. The user-generated content (UGC) created by these features can offer valuable design insights by revealing what topics and life events are most salient for participants, which can serve as targets for subsequent interventions. However, unstructured, textual UGC can be difficult to analyze. Interpretive thematic analyses can preserve rich narratives and latent themes but are labor-intensive and therefore scale poorly. Natural language processing (NLP) methods scale more readily but often produce only coarse descriptive results. Recent calls to advance the field have emphasized the untapped potential of combined NLP and qualitative analyses toward advancing user attunement in next-generation mHealth. OBJECTIVE In this proof-of-concept analysis, we gain human-centered design insights by applying hybrid consecutive NLP-qualitative methods to UGC from an HIV mHealth forum. METHODS UGC was extracted from Thrive With Me, a web app intervention for men living with HIV that includes an unstructured peer-to-peer support forum. In Python, topics were modeled by latent Dirichlet allocation. Rule-based sentiment analysis scored interactions by emotional valence. Using a novel ranking standard, the experientially richest and most emotionally polarized segments of UGC were condensed and then analyzed thematically in Dedoose. Design insights were then distilled from these themes. RESULTS The refined topic model detected K=3 topics: A: disease coping; B: social adversities; C: salutations and check-ins. Strong intratopic themes included HIV medication adherence, survivorship, and relationship challenges. Negative UGC often involved strong negative reactions to external media events. Positive UGC often focused on gratitude for survival, well-being, and fellow users' support. 
CONCLUSIONS With routinization, hybrid NLP-qualitative methods may be viable to rapidly characterize UGC in mHealth environments. Design principles point toward opportunities to align mHealth intervention features with the organically occurring uses captured in these analyses, for example, by foregrounding inspiring personal narratives and expressions of gratitude, or de-emphasizing anger-inducing media.
Affiliation(s)
- Simone J Skeen
  - Department of Social, Behavioral, and Population Sciences, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, United States
  - Department of Psychology, Hunter College, City University of New York, New York, NY, United States
- Stephen Scott Jones
  - Department of Psychology, Hunter College, City University of New York, New York, NY, United States
- Carolyn Marie Cruse
  - Department of Psychology, Hunter College, City University of New York, New York, NY, United States
- Keith J Horvath
  - Department of Psychology, San Diego State University, San Diego, CA, United States

27
Abstract
The use of artificial intelligence in healthcare has led to debates about the role of human clinicians in the increasingly technological contexts of medicine. Some researchers have argued that AI will augment the capacities of physicians and increase their availability to provide empathy and other uniquely human forms of care to their patients. The human vulnerabilities experienced in the healthcare context raise the stakes of new technologies such as AI, and the human dimensions of AI in healthcare have particular significance for research in the humanities. This article explains four key areas of concern relating to AI and the role that medical/health humanities research can play in addressing them: definition and regulation of "medical" versus "health" data and apps; social determinants of health; narrative medicine; and technological mediation of care. Issues include data privacy and trust, flawed datasets and algorithmic bias, racial discrimination, and the rhetoric of humanism and disability. Through a discussion of potential humanities contributions to these emerging intersections with AI, this article will suggest future scholarly directions for the field.
Affiliation(s)
- Kirsten Ostherr
  - Medical Humanities Program and Department of English, Rice University, 6100 Main St., MS-30, Houston, TX, 77005, USA

28
Sarker A, Al-Garadi MA, Ge Y, Nataraj N, Jones CM, Sumner SA. Signals of increasing co-use of stimulants and opioids from online drug forum data. Harm Reduct J 2022; 19:51. [PMID: 35614501 PMCID: PMC9131693 DOI: 10.1186/s12954-022-00628-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/12/2022] [Accepted: 05/10/2022] [Indexed: 11/21/2022] Open
Abstract
Background Despite recent rises in fatal overdoses involving multiple substances, there is a paucity of knowledge about stimulant co-use patterns among people who use opioids (PWUO) or people being treated with medications for opioid use disorder (PTMOUD). A better understanding of the timing and patterns in stimulant co-use among PWUO based on mentions of these substances on social media can help inform prevention programs, policy, and future research directions. This study examines stimulant co-mention trends among PWUO/PTMOUD on social media over multiple years. Methods We collected publicly available data from 14 forums on Reddit (subreddits) that focused on prescription and illicit opioids, and medications for opioid use disorder (MOUD). Collected data ranged from 2011 to 2020, and we also collected timelines comprising past posts from a sample of Reddit users (Redditors) on these forums. We applied natural language processing to generate lexical variants of all included prescription and illicit opioids and stimulants and detect mentions of them on the chosen subreddits. Finally, we analyzed and described trends and patterns in co-mentions. Results Posts collected for 13,812 Redditors showed that 12,306 (89.1%) mentioned at least 1 opioid, opioid-related medication, or stimulant. Analyses revealed that the number and proportion of Redditors mentioning both opioids and/or opioid-related medications and stimulants steadily increased over time. Relative rates of co-mentions by the same Redditor of heroin and methamphetamine, the substances most commonly co-mentioned, decreased in recent years, while co-mentions of both fentanyl and MOUD with methamphetamine increased. Conclusion Our analyses reflect increasing mentions of stimulants, particularly methamphetamine, among PWUO/PTMOUD, which closely resembles the growth in overdose deaths involving both opioids and stimulants. 
These findings are consistent with recent reports suggesting increasing stimulant use among people receiving treatment for opioid use disorder. These data offer insights on emerging trends in the overdose epidemic and underscore the importance of scaling efforts to address co-occurring opioid and stimulant use including harm reduction and comprehensive healthcare access spanning mental-health services and substance use disorder treatment. Supplementary Information The online version contains supplementary material available at 10.1186/s12954-022-00628-2.
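Counting co-mentions per user, as described in the methods above, can be sketched with a small term lexicon. This is only an illustration. The study generated lexical variants automatically, whereas the hand-picked `LEXICON`, the function names, and the example posts here are all hypothetical.

```python
import re

# Hypothetical lexicon; the study's automatically generated variants are not shown.
LEXICON = {
    "opioid": {"heroin", "fentanyl", "oxycodone", "oxy"},
    "stimulant": {"methamphetamine", "meth", "cocaine"},
}


def mentioned_classes(text):
    """Return the substance classes whose terms appear as whole words in text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {cls for cls, terms in LEXICON.items() if tokens & terms}


def co_mention_rate(posts_by_user):
    """Share of users whose posts mention both an opioid and a stimulant."""
    if not posts_by_user:
        return 0.0
    co_users = 0
    for posts in posts_by_user.values():
        classes = set()
        for post in posts:
            classes |= mentioned_classes(post)
        if {"opioid", "stimulant"} <= classes:
            co_users += 1
    return co_users / len(posts_by_user)


# Hypothetical timelines (illustration only).
timelines = {
    "u1": ["Tried meth last week", "back on heroin"],
    "u2": ["oxy only"],
    "u3": ["nothing relevant"],
    "u4": ["Cocaine and fentanyl together"],
}
print(co_mention_rate(timelines))
```

Aggregating over a user's whole timeline, rather than over single posts, matches the abstract's framing of co-mentions "by the same Redditor". Tracking the rate per year would then give the kind of trend the authors describe.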
Affiliation(s)
- Abeed Sarker
  - Department of Biomedical Informatics, School of Medicine, Emory University, 101 Woodruff Circle, Suite 4101, Atlanta, GA, 30322, USA
- Mohammed Ali Al-Garadi
  - Department of Biomedical Informatics, School of Medicine, Emory University, 101 Woodruff Circle, Suite 4101, Atlanta, GA, 30322, USA
- Yao Ge
  - Department of Biomedical Informatics, School of Medicine, Emory University, 101 Woodruff Circle, Suite 4101, Atlanta, GA, 30322, USA
- Nisha Nataraj
  - National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, GA, 30341, USA
- Christopher M Jones
  - National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, GA, 30341, USA
- Steven A Sumner
  - National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, GA, 30341, USA

29
Walsh J, Dwumfour C, Cave J, Griffiths F. Spontaneously generated online patient experience data - how and why is it being used in health research: an umbrella scoping review. BMC Med Res Methodol 2022; 22:139. [PMID: 35562661 PMCID: PMC9106384 DOI: 10.1186/s12874-022-01610-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2021] [Accepted: 04/13/2022] [Indexed: 11/10/2022] Open
Abstract
PURPOSE Social media has led to fundamental changes in the way that people look for and share health-related information. There is increasing interest in using this spontaneously generated online patient experience (SGOPE) data as a data source for health research. The aim was to summarise the state of the art regarding how and why SGOPE data has been used in health research. We determined the sites and platforms used as data sources, the purposes of the studies, the tools and methods being used, and any identified research gaps. METHODS A scoping umbrella review was conducted looking at review papers from 2015 to Jan 2021 that studied the use of SGOPE data for health research. Using keyword searches we identified 1759 papers, from which we included 58 relevant studies in our review. RESULTS Data was used from many individual general or health-specific platforms, although Twitter was the most widely used data source. The most frequent purposes were surveillance based: tracking infectious disease, adverse event identification and mental health triaging. Despite the developments in machine learning, the reviews included many small qualitative studies. Most NLP used supervised methods for sentiment analysis and classification. The field is at a very early stage and methods need development; methods are often not explained. There are disciplinary differences, with some work focused on accuracy tweaks and other work on application. There is little evidence of any work that either compares the results of both methods on the same data set or brings the ideas together. CONCLUSION Tools, methods, and techniques are still at an early stage of development, but strong consensus exists that this data source will become very important to patient-centred health research.
Affiliation(s)
- Julia Walsh
  - Warwick Medical School, University of Warwick, Coventry, UK
- Jonathan Cave
  - Department of Economics, University of Warwick, Coventry, UK
- Frances Griffiths
  - Warwick Medical School, University of Warwick, Coventry, UK
  - Centre for Health Policy, University of the Witwatersrand, Johannesburg, South Africa

30
Ng JY, Abdelkader W, Lokker C. Tracking discussions of complementary, alternative, and integrative medicine in the context of the COVID-19 pandemic: a month-by-month sentiment analysis of Twitter data. BMC Complement Med Ther 2022; 22:105. [PMID: 35418205 PMCID: PMC9006490 DOI: 10.1186/s12906-022-03586-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2021] [Accepted: 04/07/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Coronavirus disease 2019 (COVID-19) is a novel infectious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Despite the paucity of evidence, various complementary, alternative and integrative medicines (CAIMs) have been touted as both preventative and curative. We conducted sentiment and emotion analysis with the intent of understanding CAIM content related to COVID-19 being generated on Twitter across 9 months. METHODS Tweets relating to CAIM and COVID-19 were extracted from the George Washington University Libraries Dataverse Coronavirus tweets dataset from March 03 to November 30, 2020. We trained and tested a machine learning classifier using a large, pre-labelled Twitter dataset, which was applied to predict the sentiment of each CAIM-related tweet, and we used a natural language processing package to identify the emotions based on the words contained in the tweets. RESULTS Our dataset included 28 713 English-language tweets. The number of CAIM-related tweets during the study period peaked in May 2020, then dropped off sharply over the subsequent three months; the fewest CAIM-related tweets were collected during August 2020, and the number remained low for the remainder of the collection period. Most tweets (n = 15 612, 54%) were classified as positive, 31% were neutral (n = 8803) and 15% were classified as negative (n = 4298). The most frequent emotions expressed across tweets were trust, followed by fear, while surprise and disgust were the least frequent. Though the volume of tweets decreased over the 9 months of the study, the expressed sentiments and emotions remained constant. CONCLUSION The results of this sentiment analysis enabled us to establish key CAIMs being discussed at the intersection of COVID-19 across a 9-month period on Twitter. Overall, the majority of our subset of tweets were positive, as were the emotions associated with the words found within them. 
This may be interpreted as public support for CAIM; however, further qualitative investigation is warranted. Such future directions may be used to combat misinformation and improve public health strategies surrounding the use of social media information.
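The sentiment split reported in the results can be checked with simple arithmetic. The sketch below takes the three tweet counts from the abstract. The helper name `shares` is an assumption introduced for illustration.

```python
# Tweet counts per sentiment class, as reported in the abstract.
counts = {"positive": 15612, "neutral": 8803, "negative": 4298}


def shares(label_counts):
    """Percentage of the total contributed by each label."""
    total = sum(label_counts.values())
    return {label: 100.0 * n / total for label, n in label_counts.items()}


percentages = shares(counts)
print(sum(counts.values()))
print({label: round(p) for label, p in percentages.items()})
```

The three classes sum to the stated dataset size of 28 713 tweets, and the rounded shares agree with the 54%, 31% and 15% given in the text.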
Affiliation(s)
- Jeremy Y Ng
  - Department of Health Research Methods, Evidence, and Impact, Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
- Wael Abdelkader
  - Department of Health Research Methods, Evidence, and Impact, Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
- Cynthia Lokker
  - Department of Health Research Methods, Evidence, and Impact, Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada

31
Watanabe T, Yada S, Aramaki E, Yajima H, Kizaki H, Hori S. Extracting Multiple Worries from Breast Cancer Patient Blogs Using Multi-Label Classification with a Natural Language-Processing Model BERT (Bidirectional Encoder Representations from Transformers): Infodemiology Study of Blogs (Preprint). JMIR Cancer 2022; 8:e37840. [PMID: 35657664 PMCID: PMC9206207 DOI: 10.2196/37840] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/11/2022] [Revised: 05/10/2022] [Accepted: 05/23/2022] [Indexed: 12/26/2022] Open
Abstract
Background Patients with breast cancer have a variety of worries and need multifaceted information support. Their accumulated posts on social media contain rich descriptions of their daily worries concerning issues such as treatment, family, and finances. It is important to identify these issues to help patients with breast cancer to resolve their worries and obtain reliable information. Objective This study aimed to extract and classify multiple worries from text generated by patients with breast cancer using Bidirectional Encoder Representations From Transformers (BERT), a context-aware natural language processing model. Methods A total of 2272 blog posts by patients with breast cancer in Japan were collected. Five worry labels, “treatment,” “physical,” “psychological,” “work/financial,” and “family/friends,” were defined and assigned to each post. Multiple labels were allowed. To assess the label criteria, 50 blog posts were randomly selected and annotated by two researchers with medical knowledge. After the interannotator agreement had been assessed by means of Cohen kappa, one researcher annotated all the blogs. A multilabel classifier that simultaneously predicts five worries in a text was developed using BERT. This classifier was fine-tuned by using the posts as input and adding a classification layer to the pretrained BERT. The performance was evaluated for precision using the average of 5-fold cross-validation results. Results Among the blog posts, 477 included “treatment,” 1138 included “physical,” 673 included “psychological,” 312 included “work/financial,” and 283 included “family/friends.” The interannotator agreement values were 0.67 for “treatment,” 0.76 for “physical,” 0.56 for “psychological,” 0.73 for “work/financial,” and 0.73 for “family/friends,” indicating a high degree of agreement. Among all blog posts, 544 contained no label, 892 contained one label, and 836 contained multiple labels. 
It was found that the worries varied from user to user, and the worries posted by the same user changed over time. The model performed well, though prediction performance differed for each label. The values of precision were 0.59 for “treatment,” 0.82 for “physical,” 0.64 for “psychological,” 0.67 for “work/financial,” and 0.58 for “family/friends.” The higher the interannotator agreement and the greater the number of posts, the higher the precision tended to be. Conclusions This study showed that the BERT model can extract multiple worries from text generated from patients with breast cancer. This is the first application of a multilabel classifier using the BERT model to extract multiple worries from patient-generated text. The results will be helpful to identify breast cancer patients’ worries and give them timely social support.
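The interannotator agreement step described in this abstract can be illustrated with a short sketch of Cohen's kappa for one label, where each annotator marks a post as carrying the label (1) or not (0). The function name `cohen_kappa` and the ten example annotations are hypothetical and are not taken from the study; for a multilabel scheme such as the five worry labels, the same calculation would be repeated once per label.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: computed from each rater's own label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations of ten posts for a single label (1 = present).
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Because kappa corrects the raw agreement rate for agreement expected by chance, two annotators who agree on 8 of 10 posts can still receive a kappa well below 0.8.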
Affiliation(s)
- Tomomi Watanabe
  - Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan
- Shuntaro Yada
  - Nara Institute of Science and Technology, Nara, Japan
- Eiji Aramaki
  - Nara Institute of Science and Technology, Nara, Japan
- Hayato Kizaki
  - Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan
- Satoko Hori
  - Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan

32
Šćepanović S, Aiello LM, Barrett D, Quercia D. Epidemic dreams: dreaming about health during the COVID-19 pandemic. Royal Society Open Science 2022; 9:211080. [PMID: 35116145 PMCID: PMC8790359 DOI: 10.1098/rsos.211080] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/13/2021] [Accepted: 12/09/2021] [Indexed: 05/04/2023]
Abstract
The continuity hypothesis of dreams suggests that the content of dreams is continuous with the dreamer's waking experiences. Given the unprecedented nature of the experiences during COVID-19, we studied the continuity hypothesis in the context of the pandemic. We implemented a deep-learning algorithm that can extract mentions of medical conditions from text and applied it to two datasets collected during the pandemic: 2888 dream reports (dreaming life experiences), and 57 million tweets (waking life experiences) mentioning the pandemic. The health expressions common to both sets were typical COVID-19 symptoms (e.g. cough, fever and anxiety), suggesting that dreams reflected people's real-world experiences. The health expressions that distinguished the two sets reflected differences in thought processes: expressions in waking life reflected a linear and logical thought process and, as such, described realistic symptoms or related disorders (e.g. nasal pain, SARS, H1N1); those in dreaming life reflected a thought process closer to the visual and emotional spheres and, as such, described either conditions unrelated to the virus (e.g. maggots, deformities, snake bites), or conditions of surreal nature (e.g. teeth falling out, body crumbling into sand). Our results confirm that dream reports represent an understudied yet valuable source of people's health experiences in the real world.
Affiliation(s)
- Deirdre Barrett
  - Harvard Medical School, 352 Harvard Street, Cambridge, MA 02138, USA
- Daniele Quercia
  - Nokia Bell Labs, 21 JJ Thomson Avenue, Cambridge CB3 0FA, UK
  - CUSP, King's College London, Strand, London, WC2R 2LS, UK

33
Miller RA, Shortliffe EH. The roles of the US National Library of Medicine and Donald A.B. Lindberg in revolutionizing biomedical and health informatics. J Am Med Inform Assoc 2021; 28:2728-2737. [PMID: 34741510 PMCID: PMC8633636 DOI: 10.1093/jamia/ocab245] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/04/2021] [Revised: 10/19/2021] [Accepted: 10/20/2021] [Indexed: 01/16/2023]
Abstract
Over a 31-year span as Director of the US National Library of Medicine (NLM), Donald A.B. Lindberg, MD, and his extraordinary NLM colleagues fundamentally changed the field of biomedical and health informatics, with a resulting impact on biomedicine that is much broader than its influence on any single subfield. This article provides substance to bolster that claim. The review is based in part on the informatics section of a new book, "Transforming biomedical informatics and health information access: Don Lindberg and the US National Library of Medicine" (IOS Press, forthcoming 2021). After providing insights into selected aspects of the book's informatics-related contents, the authors discuss the broader context in which Dr. Lindberg and the NLM accomplished their transformative work.
Affiliation(s)
- Randolph A Miller
  - Emeritus Professor of Biomedical Informatics, Vanderbilt University School of Medicine, Alexandria, Virginia, USA
- Edward H Shortliffe
  - Department of Biomedical Informatics, Columbia University in the City of New York, New York, New York, USA

34
Kesler SR, Henneghan AM, Thurman W, Rao V. Identifying themes for assessing cancer-related cognitive impairment identified by topic modeling and qualitative content analysis of public online comments (Preprint). JMIR Cancer 2021; 8:e34828. [PMID: 35612878 PMCID: PMC9178450 DOI: 10.2196/34828] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/09/2021] [Revised: 04/28/2022] [Accepted: 05/01/2022] [Indexed: 11/28/2022]
Abstract
Background Cancer-related cognitive impairment (CRCI) is a common and significant adverse effect of cancer and its therapies. However, its definition and assessment remain difficult due to limitations of currently available measurement tools. Objective This study aims to evaluate qualitative themes related to the cognitive effects of cancer to help guide development of assessments that are more specific than what is currently available. Methods We applied topic modeling and inductive qualitative content analysis to 145 public online comments related to cognitive effects of cancer. Results Topic modeling revealed 2 latent topics that we interpreted as representing internal and external factors related to cognitive effects. These findings lead us to hypothesize regarding the potential contribution of locus of control to CRCI. Content analysis suggested several major themes including symptoms, emotional/psychological impacts, coping, “chemobrain” is real, change over time, and function. There was some conceptual overlap between the 2 methods regarding internal and external factors related to patient experiences of cognitive effects. Conclusions Our findings indicate that coping mechanisms and locus of control may be important themes to include in assessments of CRCI. Future directions in this field include prospective acquisition of free-text responses to guide development of assessments that are more sensitive and specific to cognitive function in patients with cancer.
Affiliation(s)
- Shelli R Kesler
  - School of Nursing, University of Texas at Austin, Austin, TX, United States
- Ashley M Henneghan
  - School of Nursing, University of Texas at Austin, Austin, TX, United States
- Whitney Thurman
  - School of Nursing, University of Texas at Austin, Austin, TX, United States
- Vikram Rao
  - School of Nursing, University of Texas at Austin, Austin, TX, United States

35
Lu Z, Sim JA, Wang JX, Forrest CB, Krull KR, Srivastava D, Hudson MM, Robison LL, Baker JN, Huang IC. Natural Language Processing and Machine Learning Methods to Characterize Unstructured Patient-Reported Outcomes: Validation Study. J Med Internet Res 2021; 23:e26777. [PMID: 34730546 PMCID: PMC8600437 DOI: 10.2196/26777] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 12/25/2020] [Revised: 03/20/2021] [Accepted: 08/12/2021] [Indexed: 01/22/2023]
Abstract
BACKGROUND Assessing patient-reported outcomes (PROs) through interviews or conversations during clinical encounters provides insightful information about survivorship. OBJECTIVE This study aims to test the validity of natural language processing (NLP) and machine learning (ML) algorithms in identifying different attributes of pain interference and fatigue symptoms experienced by child and adolescent survivors of cancer versus the judgment by PRO content experts as the gold standard to validate NLP/ML algorithms. METHODS This cross-sectional study focused on child and adolescent survivors of cancer, aged 8 to 17 years, and caregivers, from whom 391 meaning units in the pain interference domain and 423 in the fatigue domain were generated for analyses. Data were collected from the After Completion of Therapy Clinic at St. Jude Children's Research Hospital. Experienced pain interference and fatigue symptoms were reported through in-depth interviews. After verbatim transcription, analyzable sentences (ie, meaning units) were semantically labeled by 2 content experts for each attribute (physical, cognitive, social, or unclassified). Two NLP/ML methods were used to extract and validate the semantic features: bidirectional encoder representations from transformers (BERT) and Word2vec plus one of the ML methods, the support vector machine or extreme gradient boosting. Receiver operating characteristic and precision-recall curves were used to evaluate the accuracy and validity of the NLP/ML methods. RESULTS Compared with Word2vec/support vector machine and Word2vec/extreme gradient boosting, BERT demonstrated higher accuracy in both symptom domains, with 0.931 (95% CI 0.905-0.957) and 0.916 (95% CI 0.887-0.941) for problems with cognitive and social attributes on pain interference, respectively, and 0.929 (95% CI 0.903-0.953) and 0.917 (95% CI 0.891-0.943) for problems with cognitive and social attributes on fatigue, respectively. 
In addition, BERT yielded superior areas under the receiver operating characteristic curve for cognitive attributes on pain interference and fatigue domains (0.923, 95% CI 0.879-0.997; 0.948, 95% CI 0.922-0.979) and superior areas under the precision-recall curve for cognitive attributes on pain interference and fatigue domains (0.818, 95% CI 0.735-0.917; 0.855, 95% CI 0.791-0.930). CONCLUSIONS The BERT method performed better than the other methods. As an alternative to using standard PRO surveys, collecting unstructured PROs via interviews or conversations during clinical encounters and applying NLP/ML methods can facilitate PRO assessment in child and adolescent cancer survivors.
Affiliation(s)
- Zhaohua Lu
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, United States
- Jin-Ah Sim
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States
  - School of AI Convergence, Hallym University, Chuncheon, Republic of Korea
- Jade X Wang
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, United States
- Christopher B Forrest
  - Roberts Center for Pediatric Research, Children's Hospital of Philadelphia, Philadelphia, PA, United States
- Kevin R Krull
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States
- Deokumar Srivastava
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, United States
- Melissa M Hudson
  - Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN, United States
- Leslie L Robison
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States
- Justin N Baker
  - Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN, United States
- I-Chan Huang
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States

36
Walsh J, Cave J, Griffiths F. Spontaneously Generated Online Patient Experience of Modafinil: A Qualitative and NLP Analysis. Front Digit Health 2021; 3:598431. [PMID: 34713085 PMCID: PMC8521895 DOI: 10.3389/fdgth.2021.598431] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/24/2020] [Accepted: 01/27/2021] [Indexed: 11/16/2022]
Abstract
Objective: To compare the findings from a qualitative and a natural language processing (NLP) based analysis of online patient experience posts on patient experience of the effectiveness and impact of the drug Modafinil. Methods: Posts (n = 260) from 5 online social media platforms where posts were publicly available formed the dataset/corpus. Three platforms asked posters to give a numerical rating of Modafinil. Thematic analysis: data was coded and themes generated. Data were categorized into PreModafinil, Acquisition, Dosage, and PostModafinil and compared to identify each poster's own view of whether taking Modafinil was linked to an identifiable outcome. We classified this as positive, mixed, negative, or neutral and compared this with numerical ratings. NLP: Corpus text was speech tagged and keywords and key terms extracted. We identified the following entities: drug names, condition names, symptoms, actions, and side-effects. We searched for simple relationships, collocations, and co-occurrences of entities. To identify causal text, we split the corpus into PreModafinil and PostModafinil and used n-gram analysis. To evaluate sentiment, we calculated the polarity of each post between −1 (negative) and +1 (positive). NLP results were mapped to qualitative results. Results: Posters had used Modafinil for 33 different primary conditions. Eight themes were identified: the reason for taking (condition or symptom), impact of symptoms, acquisition, dosage, side effects, other interventions tried or compared to, effectiveness of Modafinil, and quality of life outcomes. Posters reported perceived effectiveness as follows: 68% positive, 12% mixed, 18% negative. Our classification was consistent with poster ratings. Of the most frequent 100 keywords/keyterms identified by term extraction 88/100 keywords and 84/100 keyterms mapped directly to the eight themes. Seven keyterms indicated negation and temporal states. 
Sentiment was as follows: 72% positive, 4% neutral, and 24% negative. Matching of sentiment between the qualitative and NLP methods was accurate in 64.2% of posts. If we allow for a difference of one category, matching was accurate in 85% of posts. Conclusions: User-generated patient experience is a rich resource for evaluating real-world effectiveness, understanding patient perspectives, and identifying research gaps. Both methods successfully identified the entities and topics contained in the posts. In contrast to current evidence, posters with a wide range of other conditions found Modafinil effective. Perceived causality and effectiveness were identified by both methods, demonstrating the potential to augment existing knowledge.
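The comparison of sentiment categories reported in this abstract, exact matching versus matching within one category, can be sketched as follows. This is only an illustration under stated assumptions: the three-level ordering, the `neutral_band` threshold used to turn a polarity score in [-1, +1] into a category, and the four example posts are all hypothetical, and the abstract does not say how the authors mapped scores or handled the "mixed" category.

```python
# Assumed ordinal scale; the abstract does not define one.
ORDER = {"negative": 0, "neutral": 1, "positive": 2}


def polarity_to_category(score, neutral_band=0.05):
    """Map a polarity score in [-1, 1] to a category (threshold is hypothetical)."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("polarity must lie between -1 and +1")
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


def agreement_rates(manual, automatic):
    """Return (exact match rate, rate of matches within one category)."""
    if len(manual) != len(automatic) or not manual:
        raise ValueError("need two equal-length, non-empty category lists")
    n = len(manual)
    exact = sum(m == a for m, a in zip(manual, automatic)) / n
    within_one = sum(
        abs(ORDER[m] - ORDER[a]) <= 1 for m, a in zip(manual, automatic)
    ) / n
    return exact, within_one


# Hypothetical data: manual categories and NLP polarity scores for four posts.
manual_labels = ["positive", "negative", "neutral", "positive"]
polarity_scores = [0.6, 0.0, 0.0, -0.4]
nlp_labels = [polarity_to_category(s) for s in polarity_scores]
exact_rate, within_one_rate = agreement_rates(manual_labels, nlp_labels)
print(exact_rate, within_one_rate)
```

The within-one rate can never be lower than the exact rate, which is why relaxing the criterion raises the reported agreement.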
Affiliation(s)
- Julia Walsh
  - Warwick Medical School, University of Warwick, Coventry, United Kingdom
- Jonathan Cave
  - Department of Economics, University of Warwick, Coventry, United Kingdom
- Frances Griffiths
  - Warwick Medical School, University of Warwick, Coventry, United Kingdom

37
Fairie P, Zhang Z, D'Souza AG, Walsh T, Quan H, Santana MJ. Categorising patient concerns using natural language processing techniques. BMJ Health Care Inform 2021; 28:e100274. [PMID: 34193519 PMCID: PMC8246286 DOI: 10.1136/bmjhci-2020-100274] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/30/2020] [Accepted: 05/20/2021] [Indexed: 11/23/2022]
Abstract
OBJECTIVES Patient feedback is critical to identify and resolve patient safety and experience issues in healthcare systems. However, large volumes of unstructured text data can pose problems for manual (human) analysis. This study reports the results of using a semiautomated, computational topic-modelling approach to analyse a corpus of patient feedback. METHODS Patient concerns were received by Alberta Health Services between 2011 and 2018 (n=76 163), regarding 806 care facilities in 163 municipalities, including hospitals, clinics, community care centres and retirement homes, in a province of 4.4 million. Their existing framework requires manual labelling of pre-defined categories. We applied an automated latent Dirichlet allocation (LDA)-based topic modelling algorithm to identify the topics present in these concerns, and thereby produce a framework-free categorisation. RESULTS The LDA model produced 40 topics which, following manual interpretation by researchers, were reduced to 28 coherent topics. The most frequent topics identified were communication issues causing delays (frequency: 10.58%), community care for elderly patients (8.82%), interactions with nurses (8.80%) and emergency department care (7.52%). Many patient concerns were categorised into multiple topics. Some were more specific versions of categories from the existing framework (eg, communication issues causing delays), while others were novel (eg, smoking in inappropriate settings). DISCUSSION LDA-generated topics were more nuanced than the manually labelled categories. For example, LDA found that concerns with community care were related to concerns about nursing for seniors, providing opportunities for insight and action. CONCLUSION Our findings outline the range of concerns patients share in a large health system and demonstrate the usefulness of using LDA to identify categories of patient concerns.
Affiliation(s)
- Paul Fairie
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Alberta Strategy for Patient-Oriented Research Patient Engagement Platform, Calgary, Alberta, Canada
- Zilong Zhang
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Adam G D'Souza
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Alberta Health Services, Calgary, Alberta, Canada
- Tara Walsh
  - Alberta Health Services, Calgary, Alberta, Canada
- Hude Quan
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Maria J Santana
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Alberta Strategy for Patient-Oriented Research Patient Engagement Platform, Calgary, Alberta, Canada
  - Department of Pediatrics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada

38
Gaur M, Aribandi V, Alambo A, Kursuncu U, Thirunarayan K, Beich J, Pathak J, Sheth A. Characterization of time-variant and time-invariant assessment of suicidality on Reddit using C-SSRS. PLoS One 2021; 16:e0250448. [PMID: 33999927 PMCID: PMC8128252 DOI: 10.1371/journal.pone.0250448] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/31/2020] [Accepted: 04/06/2021] [Indexed: 11/19/2022]
Abstract
Suicide is the 10th leading cause of death in the U.S. (1999-2019). However, predicting when someone will attempt suicide has been nearly impossible. In the modern world, many individuals suffering from mental illness seek emotional support and advice on well-known and easily accessible social media platforms such as Reddit. While prior artificial intelligence research has demonstrated the ability to extract valuable information from social media on suicidal thoughts and behaviors, these efforts have not considered both severity and temporality of risk. The insights made possible by access to such data have enormous clinical potential, most dramatically envisioned as a trigger to employ timely and targeted interventions (i.e., voluntary and involuntary psychiatric hospitalization) to save lives. In this work, we address this knowledge gap by developing deep learning algorithms to assess suicide risk in terms of severity and temporality from Reddit data based on the Columbia Suicide Severity Rating Scale (C-SSRS). In particular, we employ two deep learning approaches, time-variant and time-invariant modeling, for user-level suicide risk assessment, and evaluate their performance against a clinician-adjudicated gold standard Reddit corpus annotated based on the C-SSRS. Our results suggest that the time-variant approach outperforms the time-invariant method in the assessment of suicide-related ideations and supportive behaviors (AUC: 0.78), while the time-invariant model performed better in predicting suicide-related behaviors and suicide attempt (AUC: 0.64). The proposed approach can be integrated with clinical diagnostic interviews for improving suicide risk assessments.
Affiliation(s)
- Manas Gaur
  - Artificial Intelligence Institute, University of South Carolina, Columbia, SC, United States of America
- Vamsi Aribandi
  - Kno.e.sis Center, Wright State University, Dayton, OH, United States of America
- Amanuel Alambo
  - Kno.e.sis Center, Wright State University, Dayton, OH, United States of America
- Ugur Kursuncu
  - Artificial Intelligence Institute, University of South Carolina, Columbia, SC, United States of America
- Jonathan Beich
  - Department of Psychiatry, Wright State University, Dayton, OH, United States of America
- Jyotishman Pathak
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, United States of America
- Amit Sheth
  - Kno.e.sis Center, Wright State University, Dayton, OH, United States of America

39
Afshar M, Sharma B, Bhalla S, Thompson HM, Dligach D, Boley RA, Kishen E, Simmons A, Perticone K, Karnik NS. External validation of an opioid misuse machine learning classifier in hospitalized adult patients. Addict Sci Clin Pract 2021; 16:19. [PMID: 33731210 PMCID: PMC7967783 DOI: 10.1186/s13722-021-00229-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 11/03/2020] [Accepted: 03/10/2021] [Indexed: 12/17/2022]
Abstract
BACKGROUND Opioid misuse screening in hospitals is resource-intensive and rarely done. Many hospitalized patients are never offered opioid treatment. An automated approach leveraging routinely captured electronic health record (EHR) data may be easier for hospitals to institute. We previously derived and internally validated an opioid classifier in a separate hospital setting. The aim is to externally validate our previously published and open-source machine-learning classifier at a different hospital for identifying cases of opioid misuse. METHODS An observational cohort of 56,227 adult hospitalizations was examined between October 2017 and December 2019 during a hospital-wide substance use screening program with manual screening. Manually completed Drug Abuse Screening Test served as the reference standard to validate a convolutional neural network (CNN) classifier with coded word embedding features from the clinical notes of the EHR. The opioid classifier utilized all notes in the EHR and sensitivity analysis was also performed on the first 24 h of notes. Calibration was performed to account for the lower prevalence than in the original cohort. RESULTS Manual screening for substance misuse was completed in 67.8% (n = 56,227) with 1.1% (n = 628) identified with opioid misuse. The data for external validation included 2,482,900 notes with 67,969 unique clinical concept features. The opioid classifier had an AUC of 0.99 (95% CI 0.99-0.99) across the encounter and 0.98 (95% CI 0.98-0.99) using only the first 24 h of notes. In the calibrated classifier, the sensitivity and positive predictive value were 0.81 (95% CI 0.77-0.84) and 0.72 (95% CI 0.68-0.75). For the first 24 h, they were 0.75 (95% CI 0.71-0.78) and 0.61 (95% CI 0.57-0.64). CONCLUSIONS Our opioid misuse classifier had good discrimination during external validation. 
Our model may provide a comprehensive and automated approach to opioid misuse identification that augments current workflows and overcomes manual screening barriers.
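The sensitivity and positive predictive value quoted in this abstract are both ratios over a confusion matrix built against the reference standard. The sketch below shows the two definitions; the function name `sensitivity_and_ppv` and the counts are hypothetical round numbers chosen for illustration, not figures from the study.

```python
def sensitivity_and_ppv(true_pos, false_pos, false_neg):
    """Return (sensitivity, positive predictive value) from confusion counts."""
    if min(true_pos, false_pos, false_neg) < 0:
        raise ValueError("counts must be non-negative")
    if true_pos + false_neg == 0:
        raise ValueError("no reference-positive cases: sensitivity undefined")
    if true_pos + false_pos == 0:
        raise ValueError("no predicted-positive cases: PPV undefined")
    # Sensitivity: share of reference-positive cases the classifier flags.
    sensitivity = true_pos / (true_pos + false_neg)
    # PPV: share of flagged cases that are truly positive.
    ppv = true_pos / (true_pos + false_pos)
    return sensitivity, ppv


# Hypothetical counts: 100 reference positives, of which 80 are flagged,
# plus 30 flagged encounters that the reference standard calls negative.
sens, ppv = sensitivity_and_ppv(true_pos=80, false_pos=30, false_neg=20)
print(round(sens, 3), round(ppv, 3))
```

Sensitivity depends only on the reference-positive cases, whereas PPV also depends on how many negatives are flagged, so PPV tends to fall when the condition is rare in the screened population.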
Affiliation(s)
- Majid Afshar
  - Division of Health Informatics and Data Science, Loyola University Chicago, Maywood, IL, USA
  - Department of Medicine, University of Wisconsin, 1685 Highland Avenue, Madison, WI, 53705, USA
- Brihat Sharma
  - Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, USA
- Sameer Bhalla
  - Rush Medical College, Rush University, Chicago, IL, USA
- Hale M Thompson
  - Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, USA
- Dmitriy Dligach
  - Department of Computer Science, Loyola University Chicago, Chicago, IL, USA
- Randy A Boley
  - Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, USA
- Ekta Kishen
  - Clinical Research Analytics, Research Core, Rush University Medical Center, Chicago, IL, USA
- Alan Simmons
  - Clinical Research Analytics, Research Core, Rush University Medical Center, Chicago, IL, USA
- Kathryn Perticone
  - Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, USA
- Niranjan S Karnik
  - Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, USA

40
Jouffroy J, Feldman SF, Lerner I, Rance B, Burgun A, Neuraz A. Hybrid Deep Learning for Medication-Related Information Extraction From Clinical Texts in French: MedExt Algorithm Development Study. JMIR Med Inform 2021; 9:e17934. [PMID: 33724196 PMCID: PMC8077811 DOI: 10.2196/17934] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/23/2020] [Revised: 12/29/2020] [Accepted: 01/20/2021] [Indexed: 11/16/2022]
Abstract
BACKGROUND Information related to patient medication is crucial for health care; however, up to 80% of the information resides solely in unstructured text. Manual extraction is difficult and time-consuming, and there is not a lot of research on natural language processing extracting medical information from unstructured text from French corpora. OBJECTIVE We aimed to develop a system to extract medication-related information from clinical text written in French. METHODS We developed a hybrid system combining an expert rule-based system, contextual word embedding (embedding for language model) trained on clinical notes, and a deep recurrent neural network (bidirectional long short term memory-conditional random field). The task consisted of extracting drug mentions and their related information (eg, dosage, frequency, duration, route, condition). We manually annotated 320 clinical notes from a French clinical data warehouse to train and evaluate the model. We compared the performance of our approach to those of standard approaches: rule-based or machine learning only and classic word embeddings. We evaluated the models using token-level recall, precision, and F-measure. RESULTS The overall F-measure was 89.9% (precision 90.8; recall: 89.2) when combining expert rules and contextualized embeddings, compared to 88.1% (precision 89.5; recall 87.2) without expert rules or contextualized embeddings. The F-measures for each category were 95.3% for medication name, 64.4% for drug class mentions, 95.3% for dosage, 92.2% for frequency, 78.8% for duration, and 62.2% for condition of the intake. CONCLUSIONS Associating expert rules, deep contextualized embedding, and deep neural networks improved medication information extraction. Our results revealed a synergy when associating expert knowledge and latent knowledge.
Affiliation(s)
- Jordan Jouffroy
  - Department of Biomedical Informatics, Necker-Enfants malades Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - UMRS 1138 team 22, Institut National de la Santé et de la Recherche Médicale, Université de Paris, Paris, France
- Sarah F Feldman
  - Department of Biomedical Informatics, Necker-Enfants malades Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - UMRS 1138 team 22, Institut National de la Santé et de la Recherche Médicale, Université de Paris, Paris, France
- Ivan Lerner
  - Department of Biomedical Informatics, Necker-Enfants malades Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - UMRS 1138 team 22, Institut National de la Santé et de la Recherche Médicale, Université de Paris, Paris, France
- Bastien Rance
  - UMRS 1138 team 22, Institut National de la Santé et de la Recherche Médicale, Université de Paris, Paris, France
  - Department of Biomedical Informatics, Georges Pompidou European Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
- Anita Burgun
  - Department of Biomedical Informatics, Necker-Enfants malades Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - UMRS 1138 team 22, Institut National de la Santé et de la Recherche Médicale, Université de Paris, Paris, France
- Antoine Neuraz
  - Department of Biomedical Informatics, Necker-Enfants malades Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - UMRS 1138 team 22, Institut National de la Santé et de la Recherche Médicale, Université de Paris, Paris, France

41
Newman-Griffis D, Fosler-Lussier E. Automated Coding of Under-Studied Medical Concept Domains: Linking Physical Activity Reports to the International Classification of Functioning, Disability, and Health. Front Digit Health 2021; 3:620828. [PMID: 33791684 PMCID: PMC8009547 DOI: 10.3389/fdgth.2021.620828] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 10/23/2020] [Accepted: 02/16/2021] [Indexed: 11/13/2022]
Abstract
Linking clinical narratives to standardized vocabularies and coding systems is a key component of unlocking the information in medical text for analysis. However, many domains of medical concepts, such as functional outcomes and social determinants of health, lack well-developed terminologies that can support effective coding of medical text. We present a framework for developing natural language processing (NLP) technologies for automated coding of medical information in under-studied domains, and demonstrate its applicability through a case study on physical mobility function. Mobility function is a component of many health measures, from post-acute care and surgical outcomes to chronic frailty and disability, and is represented as one domain of human activity in the International Classification of Functioning, Disability, and Health (ICF). However, mobility and other types of functional activity remain under-studied in the medical informatics literature, and neither the ICF nor commonly-used medical terminologies capture functional status terminology in practice. We investigated two data-driven paradigms, classification and candidate selection, to link narrative observations of mobility status to standardized ICF codes, using a dataset of clinical narratives from physical therapy encounters. Recent advances in language modeling and word embedding were used as features for established machine learning models and a novel deep learning approach, achieving a macro-averaged F-1 score of 84% on linking mobility activity reports to ICF codes. Both classification and candidate selection approaches present distinct strengths for automated coding in under-studied domains, and we highlight that the combination of (i) a small annotated data set; (ii) expert definitions of codes of interest; and (iii) a representative text corpus is sufficient to produce high-performing automated coding systems. 
This research has implications for continued development of language technologies to analyze functional status information, and the ongoing growth of NLP tools for a variety of specialized applications in clinical care and research.
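The macro-averaged F-1 score reported above weights every code equally, regardless of how often it occurs. A minimal sketch of that metric follows; the function name `macro_f1` and the example labels are illustrative assumptions, not the authors' evaluation code or data.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Unweighted mean of per-label F1 scores (single-label classification)."""
    labels = sorted(set(gold) | set(predicted))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical gold and predicted code labels for four mobility reports.
gold = ["d450", "d450", "d455", "d410"]
pred = ["d450", "d455", "d455", "d410"]
score = macro_f1(gold, pred)
```

Because each label contributes equally, a rare code that is always missed pulls the macro average down as much as a frequent one would.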
Affiliation(s)
- Denis Newman-Griffis
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
  - Epidemiology & Biostatistics Section, Rehabilitation Medicine Department, National Institutes of Health Clinical Center, Bethesda, MD, United States
- Eric Fosler-Lussier
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, United States
|
42
|
Abstract
Machine learning (ML) has been slowly entering every aspect of our lives and its positive impact has been astonishing. To accelerate embedding ML in more applications and incorporating it in real-world scenarios, automated machine learning (AutoML) is emerging. The main purpose of AutoML is to provide seamless integration of ML in various industries, which will facilitate better outcomes in everyday tasks. In healthcare, AutoML has already been applied to simpler settings with structured data such as tabular lab data. However, there is still a need for applying AutoML to the interpretation of medical text, which is being generated at a tremendous rate. For this to happen, a promising method is AutoML for clinical notes analysis, which is an unexplored research area representing a gap in ML research. The main objective of this paper is to fill this gap and provide a comprehensive survey and analytical study of AutoML for clinical notes. To that end, we first introduce the AutoML technology and review its various tools and techniques. We then survey the literature on AutoML in the healthcare industry and discuss the developments specific to clinical settings, as well as those using general AutoML tools for healthcare applications. With this background, we then discuss challenges of working with clinical notes and highlight the benefits of developing AutoML for medical notes processing. Next, we survey relevant ML research for clinical notes and analyze the literature and the field of AutoML in the healthcare industry. Furthermore, we propose future research directions and shed light on the challenges and opportunities this emerging field holds. With this, we aim to assist the community with the implementation of an AutoML platform for medical notes, which, if realized, can revolutionize patient outcomes.
|
43
|
So W, Bogucka EP, Scepanovic S, Joglekar S, Zhou K, Quercia D. Humane Visual AI: Telling the Stories Behind a Medical Condition. IEEE Transactions on Visualization and Computer Graphics 2021; 27:678-688. [PMID: 33048711 DOI: 10.1109/tvcg.2020.3030391] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
A biological understanding is key for managing medical conditions, yet psychological and social aspects matter too. The main problem is that these two aspects are hard to quantify and inherently difficult to communicate. To quantify psychological aspects, this work mined around half a million Reddit posts in the sub-communities specialised in 14 medical conditions, and it did so with a new deep-learning framework. In so doing, it was able to associate mentions of medical conditions with those of emotions. To then quantify social aspects, this work designed a probabilistic approach that mines open prescription data from the National Health Service in England to compute the prevalence of drug prescriptions, and to relate such a prevalence to census data. To finally visually communicate each medical condition's biological, psychological, and social aspects through storytelling, we designed a narrative-style layered Martini Glass visualization. In a user study involving 52 participants, after interacting with our visualization, a considerable number of them changed their mind on previously held opinions: 10% gave more importance to the psychological aspects of medical conditions, and 27% were more favourable to the use of social media data in healthcare, suggesting the importance of persuasive elements in interactive visualizations.
|
44
|
Afshar M, Dligach D, Sharma B, Cai X, Boyda J, Birch S, Valdez D, Zelisko S, Joyce C, Modave F, Price R. Development and application of a high throughput natural language processing architecture to convert all clinical documents in a clinical data warehouse into standardized medical vocabularies. J Am Med Inform Assoc 2021; 26:1364-1369. [PMID: 31145455 DOI: 10.1093/jamia/ocz068] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/15/2019] [Revised: 04/18/2019] [Accepted: 04/24/2019] [Indexed: 12/23/2022] Open
Abstract
OBJECTIVE Natural language processing (NLP) engines such as the clinical Text Analysis and Knowledge Extraction System are a solution for processing notes for research, but optimizing their performance for a clinical data warehouse remains a challenge. We aim to develop a high throughput NLP architecture using the clinical Text Analysis and Knowledge Extraction System and present a predictive model use case. MATERIALS AND METHODS The clinical data warehouse (CDW) comprised 1 103 038 patients across 10 years. The architecture was constructed using the Hadoop data repository for source data and 3 large-scale symmetric processing servers for NLP. Each named entity mention in a clinical document was mapped to the Unified Medical Language System concept unique identifier (CUI). RESULTS The NLP architecture processed 83 867 802 clinical documents in 13.33 days and produced 37 721 886 606 CUIs across 8 standardized medical vocabularies. Performance of the architecture exceeded 500 000 documents per hour across 30 parallel instances of the clinical Text Analysis and Knowledge Extraction System including 10 instances dedicated to documents greater than 20 000 bytes. In a use-case example for predicting 30-day hospital readmission, a CUI-based model had similar discrimination to n-grams with an area under the curve receiver operating characteristic of 0.75 (95% CI, 0.74-0.76). DISCUSSION AND CONCLUSION Our health system's high throughput NLP architecture may serve as a benchmark for large-scale clinical research using a CUI-based approach.
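The totals in the RESULTS can be turned into average rates with a few lines of arithmetic. The sketch below uses only the figures quoted in the abstract; the function name `average_rates` is an assumption for illustration. Note that a wall-clock average over the whole 13.33-day run (roughly 262,000 documents per hour) is a different quantity from the reported processing rate of more than 500 000 documents per hour, and the abstract does not say how the latter was measured.

```python
# Figures quoted in the abstract.
DOCUMENTS = 83_867_802
CUIS = 37_721_886_606
DAYS = 13.33


def average_rates(documents, cuis, days):
    """Return (documents per wall-clock hour, CUIs per document)."""
    hours = days * 24
    return documents / hours, cuis / documents


docs_per_hour, cuis_per_doc = average_rates(DOCUMENTS, CUIS, DAYS)
print(f"{docs_per_hour:,.0f} documents/hour on average")
print(f"{cuis_per_doc:.1f} CUIs per document on average")
```

On these figures, each document yields about 450 concept identifiers on average.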
Affiliation(s)
- Majid Afshar
  - Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
  - Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois, USA
- Dmitriy Dligach
  - Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
  - Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois, USA
  - Department of Computer Science, Loyola University, Chicago, Illinois, USA
- Brihat Sharma
  - Department of Computer Science, Loyola University, Chicago, Illinois, USA
- Xiaoyuan Cai
  - Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
- Jason Boyda
  - Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
- Steven Birch
  - Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
- Daniel Valdez
  - Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
- Suzan Zelisko
  - Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
- Cara Joyce
  - Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
  - Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois, USA
- François Modave
  - Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
  - Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois, USA
- Ron Price
  - Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
  - Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
|
45
|
Advancing Biomarker Development Through Convergent Engagement: Summary Report of the 2nd International Danube Symposium on Biomarker Development, Molecular Imaging and Applied Diagnostics; March 14-16, 2018; Vienna, Austria. Mol Imaging Biol 2021; 22:47-65. [PMID: 31049831 DOI: 10.1007/s11307-019-01361-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/19/2022]
Abstract
Here, we report on the outcome of the 2nd International Danube Symposium on advanced biomarker development that was held in Vienna, Austria, in early 2018. During the meeting, cross-speciality participants assessed critical aspects of non-invasive, quantitative biomarker development in view of the need to expand our understanding of disease mechanisms and the definition of appropriate strategies both for molecular diagnostics and personalised therapies. More specifically, panelists addressed the main topics, including the current status of disease characterisation by means of non-invasive imaging, histopathology and liquid biopsies as well as strategies of gaining new understanding of disease formation, modulation and plasticity to large-scale molecular imaging as well as integrative multi-platform approaches. Highlights of the 2018 meeting included dedicated sessions on non-invasive disease characterisation, development of disease and therapeutic tailored biomarkers, standardisation and quality measures in biospecimens, new therapeutic approaches and socio-economic challenges of biomarker developments. The scientific programme was accompanied by a roundtable discussion on identification and implementation of sustainable strategies to address the educational needs in the rapidly evolving field of molecular diagnostics. The central theme that emanated from the 2nd Donau Symposium was the importance of the conceptualisation and implementation of a convergent approach towards a disease characterisation beyond lesion-counting "lumpology" for a cost-effective and patient-centric diagnosis, therapy planning, guidance and monitoring. This involves a judicious choice of diagnostic means, the adoption of clinical decision support systems and, above all, a new way of communication involving all stakeholders across modalities and specialities. 
Moreover, complex diseases require a comprehensive diagnosis by converging parameters from different disciplines, which will ultimately lead to precise therapeutic guidance and outcome prediction. While it is attractive to focus on technical advances alone, it is important to develop a patient-centric approach, thus asking "What can we do with our expertise to help patients?"
|
46
|
Wu S, Roberts K, Datta S, Du J, Ji Z, Si Y, Soni S, Wang Q, Wei Q, Xiang Y, Zhao B, Xu H. Deep learning in clinical natural language processing: a methodical review. J Am Med Inform Assoc 2021; 27:457-470. [PMID: 31794016 DOI: 10.1093/jamia/ocz200] [Citation(s) in RCA: 198] [Impact Index Per Article: 49.5] [Received: 07/03/2019] [Revised: 10/15/2019] [Accepted: 11/09/2019] [Indexed: 02/07/2023] Open
Abstract
OBJECTIVE This article methodically reviews the literature on deep learning (DL) for natural language processing (NLP) in the clinical domain, providing quantitative analysis to answer 3 research questions concerning methods, scope, and context of current research. MATERIALS AND METHODS We searched MEDLINE, EMBASE, Scopus, the Association for Computing Machinery Digital Library, and the Association for Computational Linguistics Anthology for articles using DL-based approaches to NLP problems in electronic health records. After screening 1,737 articles, we collected data on 25 variables across 212 papers. RESULTS DL in clinical NLP publications more than doubled each year, through 2018. Recurrent neural networks (60.8%) and word2vec embeddings (74.1%) were the most popular methods; the information extraction tasks of text classification, named entity recognition, and relation extraction were dominant (89.2%). However, there was a "long tail" of other methods and specific tasks. Most contributions were methodological variants or applications, but 20.8% were new methods of some kind. The earliest adopters were in the NLP community, but the medical informatics community was the most prolific. DISCUSSION Our analysis shows growing acceptance of deep learning as a baseline for NLP research, and of DL-based NLP in the medical community. A number of common associations were substantiated (eg, the preference of recurrent neural networks for sequence-labeling named entity recognition), while others were surprisingly nuanced (eg, the scarcity of French language clinical NLP with deep learning). CONCLUSION Deep learning has not yet fully penetrated clinical NLP and is growing rapidly. This review highlighted both the popular and unique trends in this active field.
Affiliation(s)
- Stephen Wu
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Kirk Roberts
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Surabhi Datta
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Jingcheng Du
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Zongcheng Ji
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Yuqi Si
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Sarvesh Soni
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Qiong Wang
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Qiang Wei
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Yang Xiang
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Bo Zhao
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Hua Xu
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
|
47
|
Kersloot MG, van Putten FJP, Abu-Hanna A, Cornet R, Arts DL. Natural language processing algorithms for mapping clinical text fragments onto ontology concepts: a systematic review and recommendations for future studies. J Biomed Semantics 2020; 11:14. [PMID: 33198814 PMCID: PMC7670625 DOI: 10.1186/s13326-020-00231-z] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 07/17/2020] [Accepted: 11/03/2020] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Free-text descriptions in electronic health records (EHRs) can be of interest for clinical research and care optimization. However, free text cannot be readily interpreted by a computer and, therefore, has limited value. Natural Language Processing (NLP) algorithms can make free text machine-interpretable by attaching ontology concepts to it. However, implementations of NLP algorithms are not evaluated consistently. Therefore, the objective of this study was to review the current methods used for developing and evaluating NLP algorithms that map clinical text fragments onto ontology concepts. To standardize the evaluation of algorithms and reduce heterogeneity between studies, we propose a list of recommendations. METHODS Two reviewers examined publications indexed by Scopus, IEEE, MEDLINE, EMBASE, the ACM Digital Library, and the ACL Anthology. Publications reporting on NLP for mapping clinical text from EHRs to ontology concepts were included. Year, country, setting, objective, evaluation and validation methods, NLP algorithms, terminology systems, dataset size and language, performance measures, reference standard, generalizability, operational use, and source code availability were extracted. The studies' objectives were categorized by way of induction. These results were used to define recommendations. RESULTS Two thousand three hundred fifty five unique studies were identified. Two hundred fifty six studies reported on the development of NLP algorithms for mapping free text to ontology concepts. Seventy-seven described development and evaluation. Twenty-two studies did not perform a validation on unseen data and 68 studies did not perform external validation. Of 23 studies that claimed that their algorithm was generalizable, 5 tested this by external validation. 
A list of sixteen recommendations regarding the usage of NLP systems and algorithms, usage of data, evaluation and validation, presentation of results, and generalizability of results was developed. CONCLUSION We found many heterogeneous approaches to the reporting on the development and evaluation of NLP algorithms that map clinical text to ontology concepts. Over one-fourth of the identified publications did not perform an evaluation. In addition, over one-fourth of the included studies did not perform a validation, and 88% did not perform external validation. We believe that our recommendations, alongside an existing reporting standard, will increase the reproducibility and reusability of future studies and NLP algorithms in medicine.
Affiliation(s)
- Martijn G. Kersloot
  - Amsterdam UMC, University of Amsterdam, Department of Medical Informatics, Amsterdam Public Health Research Institute Castor EDC, Room J1B-109, PO Box 22700, 1100 DE Amsterdam, The Netherlands
  - Castor EDC, Amsterdam, The Netherlands
- Florentien J. P. van Putten
  - Amsterdam UMC, University of Amsterdam, Department of Medical Informatics, Amsterdam Public Health Research Institute Castor EDC, Room J1B-109, PO Box 22700, 1100 DE Amsterdam, The Netherlands
- Ameen Abu-Hanna
  - Amsterdam UMC, University of Amsterdam, Department of Medical Informatics, Amsterdam Public Health Research Institute Castor EDC, Room J1B-109, PO Box 22700, 1100 DE Amsterdam, The Netherlands
- Ronald Cornet
  - Amsterdam UMC, University of Amsterdam, Department of Medical Informatics, Amsterdam Public Health Research Institute Castor EDC, Room J1B-109, PO Box 22700, 1100 DE Amsterdam, The Netherlands
- Derk L. Arts
  - Amsterdam UMC, University of Amsterdam, Department of Medical Informatics, Amsterdam Public Health Research Institute Castor EDC, Room J1B-109, PO Box 22700, 1100 DE Amsterdam, The Netherlands
  - Castor EDC, Amsterdam, The Netherlands
|
48
|
Castillo-Sánchez G, Marques G, Dorronzoro E, Rivera-Romero O, Franco-Martín M, De la Torre-Díez I. Suicide Risk Assessment Using Machine Learning and Social Networks: a Scoping Review. J Med Syst 2020; 44:205. [PMID: 33165729 PMCID: PMC7649702 DOI: 10.1007/s10916-020-01669-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 07/28/2020] [Accepted: 10/25/2020] [Indexed: 12/16/2022]
Abstract
According to a World Health Organization (WHO) report from 2016, around 800,000 individuals died by suicide. Moreover, suicide is the second leading cause of unnatural death in people between 15 and 29 years. This paper reviews the state of the art in the literature on the use of machine learning methods for suicide detection on social networks. To that end, the objectives, data collection techniques, development process and validation metrics used for suicide detection on social networks are analyzed. The authors conducted a scoping review using the methodology proposed by Arksey and O'Malley, and the PRISMA protocol was adopted to select the relevant studies. This scoping review aims to identify the machine learning techniques used to predict suicide risk based on information posted on social networks. The databases used are PubMed, Science Direct, IEEE Xplore and Web of Science. In total, 50% of the included studies (8/16) explicitly report the use of data mining techniques for feature extraction, feature detection or entity identification. The most commonly reported method was the Linguistic Inquiry and Word Count (4/8, 50%), followed by Latent Dirichlet Allocation, Latent Semantic Analysis, and Word2vec (2/8, 25%). Non-negative Matrix Factorization and Principal Component Analysis were each used in only one of the included studies (12.5%). In total, 3 out of 8 research papers (37.5%) combined more than one of those techniques. Support Vector Machine was implemented in 10 out of the 16 included studies (62.5%). Finally, 75% of the analyzed studies implement machine learning-based models using Python.
Affiliation(s)
- Gema Castillo-Sánchez
  - Department of Signal Theory and Communications, and Telematics Engineering, Universidad de Valladolid, Paseo de Belén 15, 47011 Valladolid, Spain
- Gonçalo Marques
  - Department of Signal Theory and Communications, and Telematics Engineering, Universidad de Valladolid, Paseo de Belén 15, 47011 Valladolid, Spain
  - Polytechnic of Coimbra, ESTGOH, Rua General Santos Costa, 3400-124 Oliveira do Hospital, Portugal
- Enrique Dorronzoro
  - Electronic Technology Department, Universidad de Sevilla, Sevilla, Spain
- Isabel De la Torre-Díez
  - Department of Signal Theory and Communications, and Telematics Engineering, Universidad de Valladolid, Paseo de Belén 15, 47011 Valladolid, Spain
|
49
|
Wulff A, Mast M, Hassler M, Montag S, Marschollek M, Jack T. Designing an openEHR-Based Pipeline for Extracting and Standardizing Unstructured Clinical Data Using Natural Language Processing. Methods Inf Med 2020; 59:e64-e78. [PMID: 33058101 PMCID: PMC7725544 DOI: 10.1055/s-0040-1716403] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/21/2022]
Abstract
Background
Merging disparate and heterogeneous datasets from clinical routine in a standardized and semantically enriched format to enable multiple uses of the data also means incorporating unstructured data such as medical free texts. Although the extraction of structured data from texts, known as natural language processing (NLP), has been researched extensively, at least for the English language, it is not enough to get a structured output in any format. NLP techniques need to be used together with clinical information standards such as openEHR to be able to reuse and exchange still unstructured data sensibly.
Objectives
The aim of the study is to automatically extract crucial information from medical free texts and to transform this unstructured clinical data into a standardized and structured representation by designing and implementing an exemplary pipeline for the processing of pediatric medical histories.
Methods
We constructed a pipeline that allows reusing medical free texts such as pediatric medical histories in a structured and standardized way by (1) selecting and modeling appropriate openEHR archetypes as standard clinical information models, (2) defining a German dictionary with crucial text markers serving as expert knowledge base for an NLP pipeline, and (3) creating mapping rules between the NLP output and the archetypes. The approach was evaluated in a first pilot study by using 50 manually annotated medical histories from the pediatric intensive care unit of the Hannover Medical School.
Results
We successfully reused 24 existing international archetypes to represent the most crucial elements of unstructured pediatric medical histories in a standardized form. The self-developed NLP pipeline was constructed by defining 3,055 text marker entries, 132 text events, 66 regular expressions, and a text corpus consisting of 776 entries for automatic correction of spelling mistakes. A total of 123 mapping rules were implemented to transform the extracted snippets to an openEHR-based representation to be able to store them together with other structured data in an existing openEHR-based data repository. In the first evaluation, the NLP pipeline yielded 97% precision and 94% recall.
Conclusion
The use of NLP and openEHR archetypes was demonstrated as a viable approach for extracting and representing important information from pediatric medical histories in a structured and semantically enriched format. We designed a promising approach with potential to be generalized, and implemented a prototype that is extensible and reusable for other use cases concerning German medical free texts. In the long term, this will harness unstructured clinical data for further research purposes such as the design of clinical decision support systems. Together with structured data already integrated in openEHR-based representations, we aim to develop an interoperable openEHR-based application that is capable of automatically assessing a patient's risk status based on the patient's medical history at time of admission.
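The three-part design described in the Methods (text markers, regular expressions, mapping rules to a structured target) can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the authors' pipeline: the two markers, the temperature pattern, and the `demo/...` target paths are hypothetical placeholders rather than real dictionary entries or openEHR archetype identifiers, and negation ("kein Husten") is deliberately not handled.

```python
import re

# Hypothetical text markers; the study's German dictionary is far larger.
TEXT_MARKERS = {"fieber": "symptom.fever", "husten": "symptom.cough"}

# Hypothetical pattern for a temperature reading such as "39,2 °C".
TEMPERATURE = re.compile(r"(\d{2}(?:[.,]\d)?)\s*°\s*C")

# Hypothetical mapping rules from extraction keys to structured paths.
MAPPING_RULES = {
    "symptom.fever": "demo/symptom/fever",
    "symptom.cough": "demo/symptom/cough",
    "vital.temperature": "demo/vital/body_temperature",
}


def extract(text):
    """Return a dict of structured paths to values found in free text."""
    record = {}
    lowered = text.lower()
    for marker, key in TEXT_MARKERS.items():
        # Whole-word match on the lower-cased text.
        if re.search(r"\b" + re.escape(marker) + r"\b", lowered):
            record[MAPPING_RULES[key]] = True
    match = TEMPERATURE.search(text)
    if match:
        # Accept the German decimal comma as well as a decimal point.
        value = float(match.group(1).replace(",", "."))
        record[MAPPING_RULES["vital.temperature"]] = value
    return record


result = extract("Seit gestern Fieber bis 39,2 °C und Husten.")
```

Keeping the markers, patterns, and mapping rules in separate tables mirrors the separation the abstract describes, so each can be extended without touching the others.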
Affiliation(s)
- Antje Wulff
  - Peter L. Reichertz Institute for Medical Informatics, TU Braunschweig and Hannover Medical School, Hannover, Germany
- Marcel Mast
  - Peter L. Reichertz Institute for Medical Informatics, TU Braunschweig and Hannover Medical School, Hannover, Germany
- Marcus Hassler
  - Econob, Informationsdienstleistungs GmbH, Klagenfurt am Wörthersee, Austria
- Sara Montag
  - Peter L. Reichertz Institute for Medical Informatics, TU Braunschweig and Hannover Medical School, Hannover, Germany
- Michael Marschollek
  - Peter L. Reichertz Institute for Medical Informatics, TU Braunschweig and Hannover Medical School, Hannover, Germany
- Thomas Jack
  - Department of Pediatric Cardiology and Intensive Care Medicine, Hannover Medical School, Hannover, Germany
|
50
|
Bitton Y, Cohen R, Schifter T, Bachmat E, Elhadad M, Elhadad N. Cross-lingual Unified Medical Language System entity linking in online health communities. J Am Med Inform Assoc 2020; 27:1585-1592. [PMID: 32910823 PMCID: PMC7566404 DOI: 10.1093/jamia/ocaa150] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/12/2020] [Revised: 05/25/2020] [Accepted: 07/02/2020] [Indexed: 12/17/2022] Open
Abstract
OBJECTIVE In Hebrew online health communities, participants commonly write medical terms that appear as transliterated forms of a source term in English. Such transliterations introduce high variability in text and challenge text-analytics methods. To reduce their variability, medical terms must be normalized, such as by linking them to Unified Medical Language System (UMLS) concepts. We present a method to identify both transliterated and translated Hebrew medical terms and link them with UMLS entities. MATERIALS AND METHODS We investigate the effect of linking terms in Camoni, a popular Israeli online health community in Hebrew. Our method, MDTEL (Medical Deep Transliteration Entity Linking), includes (1) an attention-based recurrent neural network encoder-decoder to transliterate words and map UMLS from English to Hebrew, (2) an unsupervised method for creating a transliteration dataset in any language without manually labeled data, and (3) an efficient way to identify and link medical entities in the Hebrew corpus to UMLS concepts, by producing a high-recall list of candidate medical terms in the corpus, and then filtering the candidates to relevant medical terms. RESULTS We carry out experiments on 3 disease-specific communities: diabetes, multiple sclerosis, and depression. MDTEL tagging and normalizing on Camoni posts achieved 99% accuracy, 92% recall, and 87% precision. When tagging and normalizing terms in queries from the Camoni search logs, UMLS-normalized queries improved search results in 46% of the cases. CONCLUSIONS Cross-lingual UMLS entity linking from Hebrew is possible and improves search performance across communities. Annotated datasets, annotation guidelines, and code are made available online (https://github.com/yonatanbitton/mdtel).
Affiliation(s)
- Yonatan Bitton
  - Department of Computer Science, Ben Gurion University, Beer Sheva, Israel
- Raphael Cohen
  - Department of Computer Science, Ben Gurion University, Beer Sheva, Israel
- Tamar Schifter
  - Gertner Institute for Epidemiology and Health Policy Research, Tel HaShomer, Israel
- Eitan Bachmat
  - Department of Computer Science, Ben Gurion University, Beer Sheva, Israel
- Michael Elhadad
  - Department of Computer Science, Ben Gurion University, Beer Sheva, Israel
- Noémie Elhadad
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
|