1. van Diessen E, van Amerongen RA, Zijlmans M, Otte WM. Potential merits and flaws of large language models in epilepsy care: A critical review. Epilepsia 2024; 65:873-886. [PMID: 38305763 DOI: 10.1111/epi.17907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 12/30/2023] [Accepted: 01/19/2024] [Indexed: 02/03/2024]
Abstract
The current pace of development and applications of large language models (LLMs) is unprecedented and will impact future medical care significantly. In this critical review, we provide the background to better understand these novel artificial intelligence (AI) models and how LLMs can be of future use in the daily care of people with epilepsy. Considering the importance of clinical history taking in diagnosing and monitoring epilepsy, combined with the established use of electronic health records, great potential exists to integrate LLMs in epilepsy care. We present the currently available LLM studies in epilepsy. Furthermore, we highlight and compare the most commonly used LLMs and elaborate on how these models can be applied in epilepsy. We further discuss important drawbacks and risks of LLMs, and we provide recommendations for overcoming these limitations.
Affiliation(s)
- Eric van Diessen
  - Department of Child Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
  - Department of Pediatrics, Franciscus Gasthuis & Vlietland, Rotterdam, The Netherlands
- Ramon A van Amerongen
  - Faculty of Science, Bioinformatics and Biocomplexity, Utrecht University, Utrecht, The Netherlands
- Maeike Zijlmans
  - Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
  - Stichting Epilepsie Instellingen Nederland, Heemstede, The Netherlands
- Willem M Otte
  - Department of Child Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
2. Xie K, Gallagher RS, Shinohara RT, Xie SX, Hill CE, Conrad EC, Davis KA, Roth D, Litt B, Ellis CA. Long-term epilepsy outcome dynamics revealed by natural language processing of clinic notes. Epilepsia 2023; 64:1900-1909. [PMID: 37114472 PMCID: PMC10523917 DOI: 10.1111/epi.17633] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/16/2022] [Revised: 04/26/2023] [Accepted: 04/26/2023] [Indexed: 04/29/2023]
Abstract
OBJECTIVE: Electronic medical records allow for retrospective clinical research with large patient cohorts. However, epilepsy outcomes are often contained in free text notes that are difficult to mine. We recently developed and validated novel natural language processing (NLP) algorithms to automatically extract key epilepsy outcome measures from clinic notes. In this study, we assessed the feasibility of extracting these measures to study the natural history of epilepsy at our center.
METHODS: We applied our previously validated NLP algorithms to extract seizure freedom, seizure frequency, and date of most recent seizure from outpatient visits at our epilepsy center from 2010 to 2022. We examined the dynamics of seizure outcomes over time using Markov model-based probability and Kaplan-Meier analyses.
RESULTS: Performance of our algorithms on classifying seizure freedom was comparable to that of human reviewers (algorithm F1 = .88 vs. human annotator κ = .86). We extracted seizure outcome data from 55 630 clinic notes from 9510 unique patients written by 53 unique authors. Of these, 30% were classified as seizure-free since the last visit, 48% of non-seizure-free visits contained a quantifiable seizure frequency, and 47% of all visits contained the date of most recent seizure occurrence. Among patients with at least five visits, the probabilities of seizure freedom at the next visit ranged from 12% to 80% in patients having seizures or seizure-free at the prior three visits, respectively. Only 25% of patients who were seizure-free for 6 months remained seizure-free after 10 years.
SIGNIFICANCE: Our findings demonstrate that epilepsy outcome measures can be extracted accurately from unstructured clinical note text using NLP. At our tertiary center, the disease course often followed a remitting and relapsing pattern. This method represents a powerful new tool for clinical research with many potential uses and extensions to other clinical questions.
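The idea of a probability of seizure freedom at the next visit, conditioned on the prior three visits, can be illustrated with a simple counting sketch. This is a minimal illustration, not the study's model: the visit sequences are invented toy data, `next_visit_probability` is a hypothetical helper, and the authors' actual Markov analysis may differ.

```python
from collections import defaultdict


def next_visit_probability(sequences, history=3):
    """Estimate P(seizure-free at next visit | states at the prior `history` visits).

    Each sequence holds one boolean per visit (True = seizure-free).
    Returns a dict mapping each observed history tuple to an empirical probability.
    Hypothetical helper for illustration only.
    """
    counts = defaultdict(lambda: [0, 0])  # history -> [seizure-free next, total]
    for visits in sequences:
        for i in range(history, len(visits)):
            key = tuple(visits[i - history:i])
            counts[key][0] += int(visits[i])
            counts[key][1] += 1
    return {key: free / total for key, (free, total) in counts.items()}


# Invented toy data for illustration only (True = seizure-free visit).
toy_patients = [
    [True, True, True, True, False],
    [False, False, False, False, True],
    [True, True, True, True, True],
]
probabilities = next_visit_probability(toy_patients)
print(probabilities[(True, True, True)])
print(probabilities[(False, False, False)])
```

Each toy patient has five visits, mirroring the abstract's restriction to patients with at least five visits; the printed values describe only the toy data and say nothing about the study's results.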
Affiliation(s)
- Kevin Xie
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Ryan S. Gallagher
  - Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Russell T. Shinohara
  - Penn Statistics in Imaging and Visualization Center, Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Center for Biomedical Image Computing and Analytics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Sharon X. Xie
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Chloe E. Hill
  - Department of Neurology, University of Michigan, Ann Arbor, MI, 48109, USA
- Erin C. Conrad
  - Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Kathryn A. Davis
  - Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Dan Roth
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Brian Litt
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Colin A. Ellis
  - Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA, 19104, USA
3. Decker BM, Turco A, Xu J, Terman SW, Kosaraju N, Jamil A, Davis KA, Litt B, Ellis CA, Khankhanian P, Hill CE. Development of a natural language processing algorithm to extract seizure types and frequencies from the electronic health record. Seizure 2022; 101:48-51. [PMID: 35882104 PMCID: PMC9547963 DOI: 10.1016/j.seizure.2022.07.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/22/2022] [Revised: 07/16/2022] [Accepted: 07/18/2022] [Indexed: 11/21/2022] Open
Abstract
OBJECTIVE: To develop a natural language processing (NLP) algorithm to abstract seizure types and frequencies from electronic health records (EHR).
BACKGROUND: Seizure frequency measurement is an epilepsy quality metric. Yet, abstraction of seizure frequency from the EHR is laborious. We present an NLP algorithm to extract seizure data from unstructured text of clinic notes. Algorithm performance was assessed at two epilepsy centers.
METHODS: We developed a rules-based NLP algorithm to recognize terms related to seizures and frequency within the text of an outpatient encounter. Algorithm output (e.g. number of seizures of a particular type within a time interval) was compared to seizure data manually annotated by two expert reviewers ("gold standard"). The algorithm was developed from 150 clinic notes from institution #1 (development set), then tested on a separate set of 219 notes from institution #1 (internal test set) with 248 unique seizure frequency elements. The algorithm was separately applied to 100 notes from institution #2 (external test set) with 124 unique seizure frequency elements. Algorithm performance was measured by recall (sensitivity), precision (positive predictive value), and F1 score (geometric mean of precision and recall).
RESULTS: In the internal test set, the algorithm demonstrated 70% recall (173/248), 95% precision (173/182), and 0.82 F1 score compared to manual review. Algorithm performance in the external test set was lower with 22% recall (27/124), 73% precision (27/37), and 0.40 F1 score.
CONCLUSIONS: These results suggest NLP extraction of seizure types and frequencies is feasible, though not without challenges in generalizability for large-scale implementation.
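The reported metrics can be recomputed from the counts given in the results. The sketch below is illustrative only: `scores` is a hypothetical helper, not part of the study. It computes the geometric mean that the abstract describes and, for comparison, the harmonic mean that is the conventional definition of the F1 score; the two diverge when precision and recall are far apart.

```python
import math


def scores(true_pos: int, gold_total: int, predicted_total: int) -> dict:
    """Recall, precision, and two ways of averaging them.

    Hypothetical helper for illustration only.
    true_pos: elements the algorithm extracted correctly
    gold_total: elements in the manual annotation
    predicted_total: elements the algorithm extracted
    """
    recall = true_pos / gold_total
    precision = true_pos / predicted_total
    return {
        "recall": recall,
        "precision": precision,
        "geometric_mean": math.sqrt(precision * recall),
        "harmonic_mean": 2 * precision * recall / (precision + recall),
    }


# Counts taken from the abstract: 173/248 and 173/182 (internal), 27/124 and 27/37 (external).
internal = scores(173, 248, 182)
external = scores(27, 124, 37)
for name, result in (("internal", internal), ("external", external)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```

With these counts, the external-set geometric mean rounds to 0.40, matching the reported score, while the harmonic mean is about 0.34. For the internal set the geometric mean is about 0.81 and the harmonic mean about 0.80, both close to the reported 0.82.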
Affiliation(s)
- Barbara M Decker
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
  - Department of Neurological Sciences, University of Vermont Medical Center, Burlington, VT, United States
- Alexandra Turco
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
- Jian Xu
  - Department of Neurology, Henry Ford Health System, Detroit, MI, United States
- Samuel W Terman
  - Department of Neurology, University of Michigan, Ann Arbor, MI, United States
- Nikitha Kosaraju
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
- Alisha Jamil
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
- Kathryn A Davis
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
- Brian Litt
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
- Colin A Ellis
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States
- Chloe E Hill
  - Department of Neurology, University of Michigan, Ann Arbor, MI, United States
4. Granit V, Grignon AL, Wuu J, Katz J, Walk D, Hussain S, Hernandez J, Jackson C, Caress J, Yosick T, Smider N, Benatar M. Harnessing the power of the electronic health record for ALS research and quality improvement: CReATe CAPTURE-ALS and the ALS Toolkit. Muscle Nerve 2022; 65:154-161. [PMID: 34730240 PMCID: PMC8752483 DOI: 10.1002/mus.27454] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/08/2021] [Revised: 10/26/2021] [Accepted: 10/31/2021] [Indexed: 02/03/2023]
Abstract
The electronic health record (EHR) is designed principally to support the provision and documentation of clinical care, as well as billing and insurance claims. Broad implementation of the EHR, however, also yields an opportunity to use EHR data for other purposes, including research and quality improvement. Indeed, effective use of clinical data for research purposes has been a long-standing goal of physicians who provide care for patients with ALS, but the quality and completeness of clinical data, as well as the burden of double data entry into the EHR and into a research database, have been persistent barriers. These factors provided motivation for the development of the ALS Toolkit, a set of interactive digital forms within the EHR that enable easy, consistent, and structured capture of information relevant to ALS patient care (as well as research and quality improvement) during clinical encounters. Routine use of the ALS Toolkit within the context of the CReATe Consortium's institutional review board-approved Clinical Procedures to Support Research in ALS (CAPTURE-ALS) study protocol permits aggregation of structured ALS patient data, with the goals of empowering research and driving quality improvement. Widespread use of the ALS Toolkit through the CAPTURE-ALS protocol will help to ensure that ALS clinics become a driving force for collecting and aggregating clinical data in a way that reflects the true diversity of the populations affected by this disease, rather than the restricted subset of patients who currently participate in dedicated research studies.
Affiliation(s)
- Volkan Granit
  - Department of Neurology, University of Miami, Miami, Florida
- Joanne Wuu
  - Department of Neurology, University of Miami, Miami, Florida
- Jonathan Katz
  - California Pacific Medical Center, San Francisco, California
- David Walk
  - Department of Neurology, University of Minnesota, Minneapolis, Minnesota
- Sumaira Hussain
  - Department of Neurology, University of Miami, Miami, Florida
- Carlayne Jackson
  - Department of Neurology, University of Texas Health Science Center San Antonio, San Antonio, Texas
- James Caress
  - Department of Neurology, Wake Forest School of Medicine, Winston-Salem, NC
- Michael Benatar
  - Department of Neurology, University of Miami, Miami, Florida
5. Crema C, Attardi G, Sartiano D, Redolfi A. Natural language processing in clinical neuroscience and psychiatry: A review. Front Psychiatry 2022; 13:946387. [PMID: 36186874 PMCID: PMC9515453 DOI: 10.3389/fpsyt.2022.946387] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/17/2022] [Accepted: 08/22/2022] [Indexed: 11/13/2022] Open
Abstract
Natural language processing (NLP) is rapidly becoming an important topic in the medical community. The ability to automatically analyze any type of medical document could be the key factor to fully exploit the data it contains. Cutting-edge artificial intelligence (AI) architectures, particularly machine learning and deep learning, have begun to be applied to this topic and have yielded promising results. Our literature search identified 1,024 papers that used NLP technology in neuroscience and psychiatry from 2010 to early 2022. After a selection process, 115 papers were evaluated. Each publication was classified into one of three categories: information extraction, classification, and data inference. Automated understanding of clinical reports in electronic health records has the potential to improve healthcare delivery. Overall, the performance of NLP applications is high, with an average F1-score and AUC above 85%. We also derived a composite measure in the form of Z-scores to better compare the performance of NLP models and their different classes as a whole. No statistical differences were found in the unbiased comparison. Strong asymmetry between English and non-English models, difficulty in obtaining high-quality annotated data, and training biases causing low generalizability are the main limitations. This review suggests that NLP could be an effective tool to help clinicians gain insights from medical reports, clinical research forms, and more, making NLP an effective tool to improve the quality of healthcare services.
Affiliation(s)
- Claudio Crema
  - Laboratory of Neuroinformatics, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
- Daniele Sartiano
  - Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa, Italy
- Alberto Redolfi
  - Laboratory of Neuroinformatics, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy