1
Sivarajkumar S, Mohammad HA, Oniani D, Roberts K, Hersh W, Liu H, He D, Visweswaran S, Wang Y. Clinical Information Retrieval: A Literature Review. Journal of Healthcare Informatics Research 2024; 8:313-352. [PMID: 38681755 PMCID: PMC11052968 DOI: 10.1007/s41666-024-00159-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 12/07/2023] [Accepted: 01/08/2024] [Indexed: 05/01/2024]
Abstract
Clinical information retrieval (IR) plays a vital role in modern healthcare by facilitating efficient access and analysis of medical literature for clinicians and researchers. This scoping review aims to offer a comprehensive overview of the current state of clinical IR research and identify gaps and potential opportunities for future studies in this field. The main objective was to assess and analyze the existing literature on clinical IR, focusing on the methods, techniques, and tools employed for effective retrieval and analysis of medical information. Adhering to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we conducted an extensive search across databases such as Ovid Embase, Ovid Medline, Scopus, ACM Digital Library, IEEE Xplore, and Web of Science, covering publications from January 1, 2010, to January 4, 2023. The rigorous screening process led to the inclusion of 184 papers in our review. Our findings provide a detailed analysis of the clinical IR research landscape, covering aspects like publication trends, data sources, methodologies, evaluation metrics, and applications. The review identifies key research gaps in clinical IR methods such as indexing, ranking, and query expansion, offering insights and opportunities for future studies in clinical IR, thus serving as a guiding framework for upcoming research efforts in this rapidly evolving field. The study also underscores an imperative for innovative research on advanced clinical IR systems capable of fast semantic vector search and adoption of neural IR techniques for effective retrieval of information from unstructured electronic health records (EHRs). Supplementary Information The online version contains supplementary material available at 10.1007/s41666-024-00159-4.
Affiliation(s)
- David Oniani
- Department of Health Information Management, University of Pittsburgh, Pittsburgh, PA USA
- Kirk Roberts
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA
- William Hersh
- Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, Portland, OR USA
- Hongfang Liu
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA
- Daqing He
- Department of Information Science, University of Pittsburgh, Pittsburgh, PA USA
- Shyam Visweswaran
- Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA USA
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA USA
- Clinical and Translational Science Institute, University of Pittsburgh, Pittsburgh, PA USA
- Yanshan Wang
- Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA USA
- Department of Health Information Management, University of Pittsburgh, Pittsburgh, PA USA
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA USA
- Clinical and Translational Science Institute, University of Pittsburgh, Pittsburgh, PA USA
2
Bazoge A, Morin E, Daille B, Gourraud PA. Applying Natural Language Processing to Textual Data From Clinical Data Warehouses: Systematic Review. JMIR Med Inform 2023; 11:e42477. [PMID: 38100200 PMCID: PMC10757232 DOI: 10.2196/42477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Revised: 01/16/2023] [Accepted: 09/07/2023] [Indexed: 12/17/2023] Open
Abstract
BACKGROUND In recent years, health data collected during the clinical care process have been often repurposed for secondary use through clinical data warehouses (CDWs), which interconnect disparate data from different sources. A large amount of information of high clinical value is stored in unstructured text format. Natural language processing (NLP), which implements algorithms that can operate on massive unstructured textual data, has the potential to structure the data and make clinical information more accessible. OBJECTIVE The aim of this review was to provide an overview of studies applying NLP to textual data from CDWs. It focuses on identifying the (1) NLP tasks applied to data from CDWs and (2) NLP methods used to tackle these tasks. METHODS This review was performed according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. We searched for relevant articles in 3 bibliographic databases: PubMed, Google Scholar, and ACL Anthology. We reviewed the titles and abstracts and included articles according to the following inclusion criteria: (1) focus on NLP applied to textual data from CDWs, (2) articles published between 1995 and 2021, and (3) written in English. RESULTS We identified 1353 articles, of which 194 (14.34%) met the inclusion criteria. Among all identified NLP tasks in the included papers, information extraction from clinical text (112/194, 57.7%) and the identification of patients (51/194, 26.3%) were the most frequent tasks. To address the various tasks, symbolic methods were the most common NLP methods (124/232, 53.4%), showing that some tasks can be partially achieved with classical NLP techniques, such as regular expressions or pattern matching that exploit specialized lexica, such as drug lists and terminologies. Machine learning (70/232, 30.2%) and deep learning (38/232, 16.4%) have been increasingly used in recent years, including the most recent approaches based on transformers. 
NLP methods were mostly applied to English language data (153/194, 78.9%). CONCLUSIONS CDWs are central to the secondary use of clinical texts for research purposes. Although the use of NLP on data from CDWs is growing, there remain challenges in this field, especially with regard to languages other than English. Clinical NLP is an effective strategy for accessing, extracting, and transforming data from CDWs. Information retrieved with NLP can assist in clinical research and have an impact on clinical practice.
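The symbolic approach mentioned in this abstract (regular expressions or pattern matching over a specialized lexicon such as a drug list) can be illustrated with a minimal sketch. The lexicon contents, the function name `find_drug_mentions`, and the sample note are all hypothetical and are not taken from the review; a real system would load a curated terminology.

```python
import re

# Hypothetical mini-lexicon (assumption); a real system would load a curated
# drug list or terminology instead of three hard-coded names.
DRUG_LEXICON = ["aspirin", "metformin", "rituximab"]

# One alternation pattern with word boundaries. Longer terms are placed first
# so that multi-word entries would take precedence over their prefixes.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(t) for t in sorted(DRUG_LEXICON, key=len, reverse=True))
    + r")\b",
    flags=re.IGNORECASE,
)


def find_drug_mentions(text):
    """Return (start, end, normalized_term) for each lexicon hit in text."""
    return [(m.start(), m.end(), m.group(1).lower()) for m in _PATTERN.finditer(text)]


note = "Patient on Metformin 500 mg; aspirin stopped last week."
mentions = find_drug_mentions(note)
```

Exact lexicon matching of this kind is cheap and transparent, which may be one reason such methods remain common, but it does not handle misspellings, abbreviations, or negation.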
Affiliation(s)
- Adrien Bazoge
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
- Nantes Université, CHU de Nantes, Pôle Hospitalo-Universitaire 11: Santé Publique, Clinique des données, INSERM, CIC 1413, F-44000 Nantes, France
- Emmanuel Morin
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
- Béatrice Daille
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
- Pierre-Antoine Gourraud
- Nantes Université, CHU de Nantes, Pôle Hospitalo-Universitaire 11: Santé Publique, Clinique des données, INSERM, CIC 1413, F-44000 Nantes, France
- Nantes Université, INSERM, CHU de Nantes, École Centrale Nantes, Centre de Recherche Translationnelle en Transplantation et Immunologie, CR2TI, F-44000 Nantes, France
3
Langlais T, Louis E, Badina A, Vialle R, Pannier S, Le Hanneur M, Fitoussi F. "Unhappy triad" of the trauma elbow in children: Diagnosis, classification, and mid-term outcomes. J Child Orthop 2023; 17:581-589. [PMID: 38050602 PMCID: PMC10693846 DOI: 10.1177/18632521231211643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/06/2023] Open
Abstract
Background The aim of this study was to describe the epidemiology, physiopathology, and outcomes of elbow "unhappy triad" trauma in children, combining a posterior dislocation, a proximal radius fracture, and a third lesion (i.e. bony or capsuloligamentous injury). Methods A retrospective bicentric study was conducted between 1999 and 2020. All skeletally immature children who presented to the emergency department and underwent surgery for a proximal radius injury were selected. Among this selection, only patients with two associated ipsilateral elbow injuries (i.e. posterior elbow dislocation and a bony and/or capsuloligamentous injury) were included. Active elbow ranges of motion, Mayo Elbow Performance Score and Quick-Disabilities of the Arm, Shoulder and Hand scores, and standard radiographs were recorded at last follow-up. Results Twenty-one patients met the inclusion criteria (mean age at surgery = 11.4 years) among 737 selected. The "unhappy triad" diagnosis was made preoperatively in nine cases (bone lesion only), intraoperatively in nine cases, and postoperatively in one case. The third lesion was surgically treated when it was a bony fracture or if the elbow remained unstable between 60° and 90° of flexion (i.e. capsuloligamentous injury). Twenty patients were reviewed (mean follow-up = 5.8 years). The complication and re-operation rates were 10%. Conclusion The "unhappy triad" of the child's elbow is a rare injury whose preoperative diagnosis is frequently missed and which leads to complications and re-operations in 10% of cases. Level of evidence: level III.
Affiliation(s)
- Tristan Langlais
- Department of Pediatric Orthopedics, Purpan Children Hospital, Toulouse University, Toulouse, France
- Department of Pediatric Orthopedics, Necker Hospital, Paris Cité University, Paris, France
- Department of Pediatric Orthopedics, Armand-Trousseau Hospital, Sorbonne University, Paris, France
- Emmanuelle Louis
- Department of Pediatric Orthopedics, Armand-Trousseau Hospital, Sorbonne University, Paris, France
- Alina Badina
- Department of Pediatric Orthopedics, Necker Hospital, Paris Cité University, Paris, France
- Raphael Vialle
- Department of Pediatric Orthopedics, Armand-Trousseau Hospital, Sorbonne University, Paris, France
- Stéphanie Pannier
- Department of Pediatric Orthopedics, Necker Hospital, Paris Cité University, Paris, France
- Malo Le Hanneur
- Department of Pediatric Orthopedics, Armand-Trousseau Hospital, Sorbonne University, Paris, France
- Hand to Shoulder Mediterranean Center, ELSAN, Clinique Bouchard, Marseille, France
- Franck Fitoussi
- Department of Pediatric Orthopedics, Armand-Trousseau Hospital, Sorbonne University, Paris, France
4
Kusa W, Mendoza ÓE, Knoth P, Pasi G, Hanbury A. Effective matching of patients to clinical trials using entity extraction and neural re-ranking. J Biomed Inform 2023; 144:104444. [PMID: 37451494 DOI: 10.1016/j.jbi.2023.104444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 06/30/2023] [Accepted: 07/08/2023] [Indexed: 07/18/2023]
Abstract
INTRODUCTION Clinical trials (CTs) often fail due to inadequate patient recruitment. Finding eligible patients involves comparing the patient's information with the CT eligibility criteria. Automated patient matching offers the promise of improving the process, yet the main difficulties of CT retrieval lie in the semantic complexity of matching unstructured patient descriptions with semi-structured, multi-field CT documents and in capturing the meaning of negation coming from the eligibility criteria. OBJECTIVES This paper tackles the challenges of CT retrieval by presenting an approach that addresses the patient-to-trials paradigm. Our approach involves two key components in a pipeline-based model: (i) a data enrichment technique for enhancing both queries and documents during the first retrieval stage, and (ii) a novel re-ranking schema that uses a Transformer network in a setup adapted to this task by leveraging the structure of the CT documents. METHODS We use named entity recognition and negation detection in both patient description and the eligibility section of CTs. We further classify patient descriptions and CT eligibility criteria into current, past, and family medical conditions. This extracted information is used to boost the importance of disease and drug mentions in both query and index for lexical retrieval. Furthermore, we propose a two-step training schema for the Transformer network used to re-rank the results from the lexical retrieval. The first step focuses on matching patient information with the descriptive sections of trials, while the second step aims to determine eligibility by matching patient information with the criteria section. RESULTS Our findings indicate that the inclusion criteria section of the CT has a great influence on the relevance score in lexical models, and that the enrichment techniques for queries and documents improve the retrieval of relevant trials. 
The re-ranking strategy, based on our training schema, consistently enhances CT retrieval and shows improved performance by 15% in terms of precision at retrieving eligible trials. CONCLUSION The results of our experiments suggest the benefit of making use of extracted entities. Moreover, our proposed re-ranking schema shows promising effectiveness compared to larger neural models, even with limited training data. These findings offer valuable insights for improving methods for retrieval of clinical documents.
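The first-stage idea in this abstract, boosting disease and drug mentions during lexical retrieval while taking negation into account, can be illustrated with a toy term-overlap scorer. This is only a sketch under stated assumptions and is not the authors' implementation: the boost value, the rule that negated mentions are simply skipped, the function names, and the example trials are all hypothetical, and the Transformer re-ranking stage is not shown.

```python
from collections import Counter

ENTITY_BOOST = 2.0  # hypothetical weight for disease/drug mentions (assumption)


def tokenize(text):
    """Lowercase whitespace tokenizer; a real system would use a proper analyzer."""
    return text.lower().split()


def score(query_terms, query_entities, negated_entities, doc_text):
    """Sum term frequencies in the document, weighting entity terms more heavily.

    Assumption: query terms flagged as negated (e.g. "no asthma") are skipped,
    so they cannot pull in trials about a condition the patient does not have.
    """
    doc_tf = Counter(tokenize(doc_text))
    total = 0.0
    for term in query_terms:
        if term in negated_entities:
            continue
        weight = ENTITY_BOOST if term in query_entities else 1.0
        total += weight * doc_tf[term]
    return total


# Hypothetical patient description: has diabetes, takes metformin, no asthma.
query_terms = ["diabetes", "metformin", "asthma", "adults"]
query_entities = {"diabetes", "metformin", "asthma"}
negated_entities = {"asthma"}

# Hypothetical trial texts keyed by made-up identifiers.
trials = {
    "trial_a": "metformin for adults with diabetes",
    "trial_b": "inhaler study for adults with asthma",
}

ranked = sorted(
    trials,
    key=lambda t: score(query_terms, query_entities, negated_entities, trials[t]),
    reverse=True,
)
```

In a two-stage pipeline of the kind the abstract describes, a ranked list like `ranked` would be the candidate set handed to the neural re-ranker.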
Affiliation(s)
- Allan Hanbury
- TU Wien, Favoritenstrasse 9-11, Vienna, Austria; Complexity Science Hub, Vienna, Austria
5
Chevalier K, Genin M, Jean TP, Avouac J, Flipo RM, Georgin-Lavialle S, El Mahou S, Pertuiset E, Pham T, Servettaz A, Marotte H, Domont F, Chazerain P, Devaux M, Mekinian A, Sellam J, Fautrel B, Rouzaud D, Ebstein E, Costedoat-Chalumeau N, Richez C, Hachulla E, Mariette X, Seror R. CovAID: Identification of factors associated with severe COVID-19 in patients with inflammatory rheumatism or autoimmune diseases. Front Med (Lausanne) 2023; 10:1152587. [PMID: 37035330 PMCID: PMC10075312 DOI: 10.3389/fmed.2023.1152587] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/27/2023] [Accepted: 02/27/2023] [Indexed: 04/11/2023] Open
Abstract
Introduction Patients with autoimmune/inflammatory rheumatic diseases (AIRDs) might be at risk of severe COVID-19. However, whether this is linked to the disease or to its treatment is difficult to determine. This study aimed to identify factors associated with occurrence of severe COVID-19 in AIRD patients and to evaluate whether having an AIRD was associated with increased risk of severe COVID-19 or death. Materials and methods Two databases were analyzed: the EDS (Entrepôt des Données de Santé, Clinical Data Warehouse), including all patients followed in Paris university hospitals, and the French multi-center COVID-19 cohort [French rheumatic and musculoskeletal diseases (RMD)]. First, in a combined analysis we compared patients with severe and non-severe COVID-19 to identify factors associated with severity. Then, we performed a propensity score-matched case-control study within the EDS database to compare AIRD cases and non-AIRD controls. Results Among 1,213 patients, 195 (16.1%) experienced severe COVID-19. In multivariate analysis, older age, interstitial lung disease (ILD), arterial hypertension, obesity, sarcoidosis, vasculitis, auto-inflammatory diseases, and treatment with corticosteroids or rituximab were associated with increased risk of severe COVID-19. Among 35,741 COVID-19 patients in EDS, 316 having AIRDs were compared to 1,264 propensity score-matched controls. AIRD patients had a higher risk of severe COVID-19 [aOR = 1.43 (1.08-1.87), p = 0.01], but analysis restricted to rheumatoid arthritis and spondyloarthritis found no increased risk of severe COVID-19 [aOR = 1.11 (0.68-1.81)]. Conclusion In this multicenter study, we confirmed that AIRD patients treated with rituximab or corticosteroids and/or having vasculitis, auto-inflammatory disease, or sarcoidosis had an increased risk of severe COVID-19. Also, AIRD patients had, overall, an increased risk of severe COVID-19 compared with the general population.
Affiliation(s)
- Kevin Chevalier
- Department of Rheumatology, Université Paris-Saclay, INSERM UMR 1184: Center for Immunology of Viral Infections and Autoimmune Diseases, Assistance Publique-Hôpitaux de Paris, Hôpital Bicêtre, Le Kremlin-Bicêtre, France
- Michaël Genin
- University of Lille, CHU Lille, ULR 2694–METRICS: Evaluation des Technologies de Santé et des Pratiques Médicales, Lille, France
- Thao Pham
- Hospital Sainte Marguerite, Rheumatology, Marseille, France
- Amelie Servettaz
- Hospital Robert Debré, Internal Medicine, Infectious Diseases and Clinical Immunology, Reims, France
- Hubert Marotte
- University Hospital of Saint-Étienne, Rheumatology, Saint-Priest-en-Jarez, France
- Fanny Domont
- University Hospitals Pitié Salpêtrière - Charles Foix, Internal Medicine and Clinical Immunology, Paris, France
- Pascal Chazerain
- Hopital de la Croix Saint-Simon, Rheumatology and Internal Medicine, Paris, France
- Mathilde Devaux
- Saint-Germain-en-Laye Intercommunal Hospital Center, Internal Medicine, Poissy, France
- Arsene Mekinian
- Hospital Saint-Antoine AP-HP, Internal Medicine, Paris, France
- Jérémie Sellam
- Hospital Saint-Antoine AP-HP, Rheumatology, Paris, France
- Bruno Fautrel
- Sorbonne Universite – APHP, Pitie Salpetriere Hospital, Department of Rheumatology, Pierre Louis Institute of Epidemiology and Public Health, INSERM UMRS 1136, Paris, France
- Diane Rouzaud
- Bichat-Claude Bernard Hospital, Internal Medicine, Paris, France
- Esther Ebstein
- Bichat-Claude Bernard Hospital, Rheumatology, Paris, France
- Eric Hachulla
- Department of Internal Medicine and Clinical Immunology, Referral Centre for Centre for Rare Systemic Autoimmune Diseases North and North-West of France (CeRAINO), CHU Lille, University of Lille, INSERM, U1286 - INFINITE - Institute for Translational Research in Inflammation, Lille, France
- Xavier Mariette
- Department of Rheumatology, Université Paris-Saclay, INSERM UMR 1184: Center for Immunology of Viral Infections and Autoimmune Diseases, Assistance Publique-Hôpitaux de Paris, Hôpital Bicêtre, Le Kremlin-Bicêtre, France
- Raphaèle Seror
- Department of Rheumatology, Université Paris-Saclay, INSERM UMR 1184: Center for Immunology of Viral Infections and Autoimmune Diseases, Assistance Publique-Hôpitaux de Paris, Hôpital Bicêtre, Le Kremlin-Bicêtre, France
6
Pressat-Laffouilhère T, Balayé P, Dahamna B, Lelong R, Billey K, Darmoni SJ, Grosjean J. Evaluation of Doc'EDS: a French semantic search tool to query health documents from a clinical data warehouse. BMC Med Inform Decis Mak 2022; 22:34. [PMID: 35135538 PMCID: PMC8822768 DOI: 10.1186/s12911-022-01762-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/02/2020] [Accepted: 01/20/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Unstructured data from electronic health records represent a wealth of information. Doc'EDS is a pre-screening tool based on textual and semantic analysis. The Doc'EDS system provides a graphic user interface to search documents in French. The aim of this study was to present the Doc'EDS tool and to provide a formal evaluation of its semantic features. METHODS Doc'EDS is a search tool built on top of the clinical data warehouse developed at Rouen University Hospital. This tool is a multilevel search engine combining structured and unstructured data. It also provides basic analytical features and semantic utilities. A formal evaluation was conducted to measure the impact of Natural Language Processing algorithms. RESULTS Approximately 18.1 million narrative documents are stored in Doc'EDS. The formal evaluation was conducted in 5000 clinical concepts that were manually collected. The F-measures of negative concepts and hypothetical concepts were respectively 0.89 and 0.57. CONCLUSION In this formal evaluation, we have shown that Doc'EDS is able to deal with language subtleties to enhance an advanced full text search in French health documents. The Doc'EDS tool is currently used on a daily basis to help researchers to identify patient cohorts thanks to unstructured data.
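The F-measures reported in this abstract combine precision and recall into a single score. As a reminder of how such a figure is obtained, here is a minimal sketch that computes it from counts of true positives, false positives, and false negatives. The function name and the example counts are hypothetical and are not taken from the Doc'EDS evaluation.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score from raw counts; beta=1.0 gives the usual F1.

    precision = tp / (tp + fp), recall = tp / (tp + fn)
    F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    Returns 0.0 when the score is undefined (no positives at all).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 8 negated concepts correctly flagged, 2 flagged wrongly,
# 2 missed. Precision and recall are both 0.8 here.
example_f1 = f_measure(tp=8, fp=2, fn=2)
```

Because the F-measure is a harmonic mean, a low value such as the one reported for hypothetical concepts can result from weak precision, weak recall, or both; the abstract does not say which.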
Affiliation(s)
- Thibaut Pressat-Laffouilhère
- Department of Biomedical Informatics, Rouen University Hospital, Normandy, France; LITIS EA4108, Rouen University, Normandy, France
- Pierre Balayé
- Department of Biomedical Informatics, Rouen University Hospital, Normandy, France
- Badisse Dahamna
- Department of Biomedical Informatics, Rouen University Hospital, Normandy, France; LIMICS U1142 INSERM, Sorbonne Université & Sorbonne Paris Nord, Paris, France
- Romain Lelong
- Department of Biomedical Informatics, Rouen University Hospital, Normandy, France; LIMICS U1142 INSERM, Sorbonne Université & Sorbonne Paris Nord, Paris, France
- Kévin Billey
- Department of Biomedical Informatics, Rouen University Hospital, Normandy, France; LITIS EA4108, Rouen University, Normandy, France
- Stéfan J Darmoni
- Department of Biomedical Informatics, Rouen University Hospital, Normandy, France; LIMICS U1142 INSERM, Sorbonne Université & Sorbonne Paris Nord, Paris, France
- Julien Grosjean
- Department of Biomedical Informatics, Rouen University Hospital, Normandy, France; LIMICS U1142 INSERM, Sorbonne Université & Sorbonne Paris Nord, Paris, France
7
Touzé R, Paternoster G, Arnaud E, Khonsari RH, James S, Bremond-Gignac D, Robert MP. Ophthalmological findings in children with unicoronal craniosynostosis. Eur J Ophthalmol 2022; 32:3274-3280. [PMID: 35118895 DOI: 10.1177/11206721221077548] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/17/2022]
Abstract
INTRODUCTION Among non-syndromic, single-suture craniosynostoses, unicoronal craniosynostosis (UCS) presents the highest rate of ophthalmic manifestations requiring a visual follow-up, due to the high risk of amblyopia. After birth or during childhood, children with UCS are at high risk of presenting with aniso-astigmatism and strabismus. The aim of this study was to characterize clinical ophthalmologic findings associated with UCS in a paediatric cohort. METHODS This retrospective study included children admitted to our unit between 2015 and 2021, with isolated UCS treated in our institution and a complete ophthalmological assessment comprising visual assessment, refractive status and oculomotor examination. Children with associated craniofacial disorders were excluded. RESULTS A total of 28 children met the inclusion criteria. Median age was 62 [13-192] months with a large proportion of girls (86%) and 71% of right-sided UCS. The mean best corrected visual acuity was 0.07 (±0.13) LogMAR, including 10 (36%) children with amblyopia or a history of amblyopia. Astigmatism was significantly higher on the contralateral side of the UCS than on the ipsilateral side, with a refractive cylinder error of 0.97 (±1.06) vs 0.56 (±0.68) diopters, respectively (p = 0.03). Strabismus was observed in 20 patients (71%) with a main pattern of esotropia with a vertical component. A pseudo-superior oblique palsy was found in 13 children (65%) with a median cyclodeviation of 8.7° [-5.4° to 20.6°]. CONCLUSION Children with UCS experience a high rate of various visual manifestations. This study highlights their need for a strict ophthalmological follow-up, in order to diagnose visual complications early and prevent them.
Affiliation(s)
- Romain Touzé
- Service d'ophtalmologie, Hôpital Universitaire Necker - Enfants Malades, Paris, France; Borelli Centre, UMR 9010 CNRS-SSA-ENS Paris Saclay-Paris University, France
- Giovanna Paternoster
- Service de neurochirurgie, Unité Fonctionnelle de Chirurgie Craniofaciale, Hôpital Universitaire Necker - Enfants Malades, Paris, France
- Eric Arnaud
- Service de neurochirurgie, Unité Fonctionnelle de Chirurgie Craniofaciale, Hôpital Universitaire Necker - Enfants Malades, Paris, France; Clinique Marcel Sembat, Ramsay - Générale de Santé, Boulogne-Billancourt, France
- Roman Hossein Khonsari
- Service de chirurgie maxillo-faciale et chirurgie plastique, Hôpital Universitaire Necker - Enfants Malades, Paris, France
- Syril James
- Service de neurochirurgie, Unité Fonctionnelle de Chirurgie Craniofaciale, Hôpital Universitaire Necker - Enfants Malades, Paris, France; Clinique Marcel Sembat, Ramsay - Générale de Santé, Boulogne-Billancourt, France
- Dominique Bremond-Gignac
- Service d'ophtalmologie, Hôpital Universitaire Necker - Enfants Malades, Paris, France; INSERM, UMRS 1138, Team 17, Paris, France
- Matthieu P Robert
- Service d'ophtalmologie, Hôpital Universitaire Necker - Enfants Malades, Paris, France; Borelli Centre, UMR 9010 CNRS-SSA-ENS Paris Saclay-Paris University, France
8
The prediction of hospital length of stay using unstructured data. BMC Med Inform Decis Mak 2021; 21:351. [PMID: 34922532 PMCID: PMC8684269 DOI: 10.1186/s12911-021-01722-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/24/2021] [Accepted: 12/13/2021] [Indexed: 11/10/2022] Open
Abstract
Objective This study aimed to assess the performance improvement for machine learning-based hospital length of stay (LOS) predictions when clinical signs written in text are accounted for and compared to the traditional approach of solely considering structured information such as age, gender and major ICD diagnosis.
Methods This study was an observational retrospective cohort study and analyzed patient stays admitted between 1 January and 24 September 2019. For each stay, a patient was admitted through the Emergency Department (ED) and stayed for more than two days in the subsequent service. LOS was predicted using two random forest models. The first included unstructured text extracted from electronic health records (EHRs). A word-embedding algorithm based on UMLS terminology with exact matching restricted to patient-centric affirmation sentences was used to assess the EHR data. The second model was primarily based on structured data in the form of diagnoses coded from the International Classification of Diseases, 10th Revision (ICD-10) and triage codes (CCMU/GEMSA classifications). Variables common to both models were: age, gender, zip/postal code, LOS in the ED, recent visit flag, assigned patient ward after the ED stay and short-term ED activity. Models were trained on 80% of the data, and performance was evaluated by accuracy on the remaining 20% test data.
Results The model using unstructured data had a 75.0% accuracy compared to 74.1% for the model containing structured data. The two models produced a similar prediction in 86.6% of cases. In a secondary analysis restricted to intensive care patients, the accuracy of both models was also similar (76.3% vs 75.0%).
Conclusions LOS prediction using unstructured data had similar accuracy to using structured data and can be considered of use to accurately model LOS. Supplementary Information The online version contains supplementary material available at 10.1186/s12911-021-01722-4.
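The evaluation described in this abstract rests on three simple quantities: an 80/20 train/test split, accuracy on the held-out 20%, and the share of cases in which the two models give the same prediction. The sketch below shows how these could be computed. It does not train a random forest, and the helper names, the "short"/"long" labels, and the toy predictions are hypothetical; the abstract does not state how LOS was discretized.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and return (train, test) with the given test share."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions equal to the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def agreement(pred_a, pred_b):
    """Fraction of cases where two models predict the same label."""
    return sum(a == b for a, b in zip(pred_a, pred_b)) / len(pred_a)


# Toy held-out labels and predictions (hypothetical, for illustration only).
y_true = ["short", "long", "short", "long", "short"]
pred_text = ["short", "long", "long", "long", "short"]     # model using text
pred_struct = ["short", "short", "long", "long", "short"]  # model using codes

acc_text = accuracy(y_true, pred_text)
acc_struct = accuracy(y_true, pred_struct)
same = agreement(pred_text, pred_struct)

train_ids, test_ids = train_test_split(list(range(10)))
```

Note that two models can have near-identical accuracy while still disagreeing on individual cases, which is why the agreement rate is reported separately from the accuracies.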
9
Common Arterial Trunk Associated with Functionally Univentricular Heart: Anatomical Study and Review of the Literature. J Cardiovasc Dev Dis 2021; 8:jcdd8120175. [PMID: 34940530 PMCID: PMC8705909 DOI: 10.3390/jcdd8120175] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/27/2021] [Revised: 11/26/2021] [Accepted: 12/02/2021] [Indexed: 11/17/2022] Open
Abstract
Common arterial trunk (CAT) is a rare congenital heart disease that is commonly included in the spectrum of conotruncal heart defects. CAT is rarely associated with functionally univentricular hearts, and only a few cases have been described so far. Here, we describe the anatomical characteristics of CAT associated with a univentricular heart diagnosed in children and fetuses referred to our institution, and we completed the anatomical description of this rare condition through an extensive review of the literature. The complete cohort ultimately gathered 32 cases described in the literature completed by seven cases from our unit (seven fetuses and one child). Four types of univentricular hearts associated with CAT were observed: tricuspid atresia or hypoplastic right ventricle in 16 cases, mitral atresia or hypoplastic left ventricle in 12 cases, double-inlet left ventricle in 2 cases, and unbalanced atrioventricular septal defect in 9 cases. Our study questions the diagnosis of CAT as the exclusive consequence of an anomaly of the wedging process, following the convergence between the embryonic atrioventricular canal and the common outflow tract. We confirm that some forms of CAT can be considered to be due to an arrest of cardiac development at the stages preceding the convergence.
10
'In-Out-In' K-wires sliding in severe tibial deformities of osteogenesis imperfecta: a technical note. J Pediatr Orthop B 2021; 30:257-263. [PMID: 33767124 DOI: 10.1097/bpb.0000000000000785] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/25/2022]
Abstract
Severe infant osteogenesis imperfecta requires osteosynthesis. Intramedullary osteosynthesis of the tibia is a technical challenge given the deformity and the narrowness of the medullary canal. We describe an extramedullary technique: 'In-Out-In' K-wires sliding. We performed an anteromedial diaphyseal approach. The periosteum was released while preserving its posterior vascular attachments. To obtain a straight leg, we performed as many osteotomies as necessary. K-wires were introduced ('In') into the proximal epiphysis and the medial malleolus, ran along the cortex ('Out'), and then ('In') reached their opposite metaphysis. K-wires were cut, curved and impacted at their respective epiphyseal ends to allow a telescopic effect. All tibial fragments were strapped onto the K-wires, and the periosteum was sutured over them. Our inclusion criteria were children with osteogenesis imperfecta operated on before 6 years of age whose verticalization was impossible. Seven patients (11 tibias) were included (2006-2016), with a mean age at surgery of 3.3 ± 1.1 years. All patients received intravenous bisphosphonates preoperatively. The follow-up was 6.1 ± 2.7 years. All patients could stand up with supports, and the flexion deformity correction was 46.7 ± 14.2°. Osteosynthesis was changed in nine tibias because of arrest of telescoping with recurrence of the flexion deformity, and the mean time from the first session to revision was 3.8 ± 1.7 years. At revision, K-wire overlap had decreased by 55 ± 23%. Including all surgeries, three distal K-wire migrations were observed, and the number of surgical procedures was 2.5/tibia. No growth arrest or other complications were reported. 'In-Out-In' K-wires sliding can be considered in select cases where the absence of a medullary canal prevents the insertion of an intramedullary rod, or as a salvage or alternative mode of fixation. It can be performed in severe infant osteogenesis imperfecta under 6 years of age with few complications and good survival time.
Collapse
|
11
|
Boulouis G, Stricker S, Benichi S, Hak JF, Gariel F, Kossorotoff M, Garcelon N, Harroche A, Alias Q, Garzelli L, Bajolle F, Boddaert N, Meyer P, Blauwblomme T, Naggara O. Mortality and functional outcome after pediatric intracerebral hemorrhage: cohort study and meta-analysis. J Neurosurg Pediatr 2021; 27:661-667. [PMID: 33836498 DOI: 10.3171/2020.9.peds20608] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/12/2020] [Accepted: 09/28/2020] [Indexed: 11/06/2022]
Abstract
OBJECTIVE The clinical outcome of pediatric intracerebral hemorrhage (pICH) is rarely reported in a comprehensive way. In this cohort study, systematic review, and meta-analysis of patients with pICH, the authors aimed to describe the basic clinical outcomes of pICH. METHODS Children who received treatment for pICH at the authors' institution were prospectively enrolled in the cohort in 2008; data since 2000 were retrospectively included, and data through October 2019 were analyzed. The authors then searched PubMed and conducted a systematic review of relevant articles published since 1990. Data from the identified populations and patients from the cohort study were pooled into a multicategory meta-analysis and analyzed with regard to clinical outcomes. RESULTS Among 243 children screened for inclusion, 231 patients were included. The median (IQR) age at ictus was 9.6 (4.6-12.5) years, and 128 patients (53%) were male. After a median (IQR) follow-up of 33 (13-63) months, 132 patients (57.4%) had a favorable clinical outcome, of whom 58 (44%) had no residual symptoms. Nineteen studies were included in the meta-analysis. Overall, the proportion of children with complete recovery was 27% (95% CI 19%-36%; Q = 49.6; I2 = 76%); of those with residual deficits, the complete recovery rate was 48.1% (95% CI 40%-57%; Q = 75.3; I2 = 81%). When pooled with the cohort study, the aggregate case-fatality rate at the last follow-up was 17.3% (95% CI 12%-24%; Q = 101.6; I2 = 81%). CONCLUSIONS Here, the authors showed that 1 in 6 children died after pICH, and the majority of children had residual neurological deficits at the latest follow-up. Results from the cohort study also indicate that children with vascular lesions as the etiology of pICH had significantly better clinical functional outcomes.
Affiliation(s)
- Grégoire Boulouis
- (1) Service d'imagerie Morphologique et Fonctionnelle, GHU Paris Psychiatrie et Neurosciences, Hospitalier Sainte Anne, Institut de Psychiatrie et Neurosciences de Paris, UMR_S1266, INSERM, Université de Paris; Departments of (2) Pediatric Radiology
- Jean-François Hak
- (1) Service d'imagerie Morphologique et Fonctionnelle, GHU Paris Psychiatrie et Neurosciences, Hospitalier Sainte Anne, Institut de Psychiatrie et Neurosciences de Paris, UMR_S1266, INSERM, Université de Paris; Departments of (2) Pediatric Radiology
- Florent Gariel
- (1) Service d'imagerie Morphologique et Fonctionnelle, GHU Paris Psychiatrie et Neurosciences, Hospitalier Sainte Anne, Institut de Psychiatrie et Neurosciences de Paris, UMR_S1266, INSERM, Université de Paris; Departments of (2) Pediatric Radiology
- Annie Harroche
- (8) Hematology, Haemophilia Care Centre, Hôpital Necker Enfants Malades, AP-HP, Université de Paris
- Lorenzo Garzelli
- (1) Service d'imagerie Morphologique et Fonctionnelle, GHU Paris Psychiatrie et Neurosciences, Hospitalier Sainte Anne, Institut de Psychiatrie et Neurosciences de Paris, UMR_S1266, INSERM, Université de Paris; Departments of (2) Pediatric Radiology
- Fanny Bajolle
- (5) Unité Médico-Chirurgicale de Cardiologie Congénitale et Pédiatrique, Centre de référence Malformations Cardiaques Congénitales Complexes-M3C
- Nathalie Boddaert
- Departments of (2) Pediatric Radiology; (9) INSERM U1163, Université Paris Descartes-Sorbonne Paris Cité, Institut Imagine, and INSERM U1000
- Thomas Blauwblomme
- (3) Pediatric Neurosurgery; (10) French Center for Pediatric Stroke, Paris, France
- Olivier Naggara
- (1) Service d'imagerie Morphologique et Fonctionnelle, GHU Paris Psychiatrie et Neurosciences, Hospitalier Sainte Anne, Institut de Psychiatrie et Neurosciences de Paris, UMR_S1266, INSERM, Université de Paris; Departments of (2) Pediatric Radiology; (10) French Center for Pediatric Stroke, Paris, France
|
12
|
Klappe ES, van Putten FJP, de Keizer NF, Cornet R. Contextual property detection in Dutch diagnosis descriptions for uncertainty, laterality and temporality. BMC Med Inform Decis Mak 2021; 21:120. [PMID: 33827555 PMCID: PMC8028823 DOI: 10.1186/s12911-021-01477-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/10/2020] [Accepted: 03/24/2021] [Indexed: 11/28/2022] Open
Abstract
Background Accurate, coded problem lists are valuable for data reuse, including clinical decision support and research. However, healthcare providers frequently modify coded diagnoses by including or removing common contextual properties in free-text diagnosis descriptions: uncertainty (suspected glaucoma), laterality (left glaucoma) and temporality (glaucoma 2002). These contextual properties could cause a difference in meaning between underlying diagnosis codes and modified descriptions, inhibiting data reuse. We therefore aimed to develop and evaluate an algorithm to identify these contextual properties. Methods A rule-based algorithm called UnLaTem (Uncertainty, Laterality, Temporality) was developed using a single-center dataset, including 288,935 diagnosis descriptions, of which 73,280 (25.4%) were modified by healthcare providers. Internal validation of the algorithm was conducted with an independent sample of 980 unique records. A second validation of the algorithm was conducted with 996 records from a Dutch multicenter dataset including 175,210 modified descriptions of five hospitals. Two researchers independently annotated the two validation samples. Performance of the algorithm was determined using means of the recall and precision of the validation samples. The algorithm was applied to the multicenter dataset to determine the actual prevalence of the contextual properties within the modified descriptions per specialty. Results For the single-center dataset recall (and precision) for removal of uncertainty, uncertainty, laterality and temporality respectively were 100 (60.0), 99.1 (89.9), 100 (97.3) and 97.6 (97.6). For the multicenter dataset for removal of uncertainty, uncertainty, laterality and temporality it was 57.1 (88.9), 86.3 (88.9), 99.7 (93.5) and 96.8 (90.1). Within the modified descriptions of the multicenter dataset, 1.3% contained removal of uncertainty, 9.9% uncertainty, 31.4% laterality and 9.8% temporality. 
Conclusions We successfully developed a rule-based algorithm named UnLaTem to identify contextual properties in Dutch modified diagnosis descriptions. UnLaTem could be extended with more trigger terms, new rules and the recognition of term order to increase the performance even further. The algorithm’s rules are available as additional file 2. Implementing UnLaTem in Dutch hospital systems can improve precision of information retrieval and extraction from diagnosis descriptions, which can be used for data reuse purposes such as decision support and research. Supplementary Information The online version contains supplementary material available at 10.1186/s12911-021-01477-y.
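The rule-based approach described in this abstract can be illustrated with a minimal Python sketch. It is not the UnLaTem implementation: the published rules target Dutch and are provided in the paper's additional file, which is not reproduced here. The English trigger terms, the `TRIGGERS` table, and the function names `detect_properties` and `precision_recall` are all assumptions made for illustration, using the abstract's own examples (suspected glaucoma, left glaucoma, glaucoma 2002).

```python
import re

# Hypothetical English trigger terms, chosen only for illustration;
# the published UnLaTem rules target Dutch and are not reproduced here.
TRIGGERS = {
    "uncertainty": re.compile(r"\b(suspected|possible|probable|rule out)\b", re.IGNORECASE),
    "laterality": re.compile(r"\b(left|right|bilateral)\b", re.IGNORECASE),
    "temporality": re.compile(r"\b(19|20)\d{2}\b"),  # a four-digit year such as 2002
}


def detect_properties(description: str) -> set[str]:
    """Return the contextual properties whose trigger pattern matches the text."""
    return {name for name, pattern in TRIGGERS.items() if pattern.search(description)}


def precision_recall(predicted: list[set[str]], gold: list[set[str]], prop: str) -> tuple[float, float]:
    """Precision and recall for one property over paired predicted/annotated records."""
    tp = sum(1 for p, g in zip(predicted, gold) if prop in p and prop in g)
    fp = sum(1 for p, g in zip(predicted, gold) if prop in p and prop not in g)
    fn = sum(1 for p, g in zip(predicted, gold) if prop not in p and prop in g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Calling `detect_properties` on each description and comparing the result with manual annotations through `precision_recall` mirrors, in a much simplified form, the per-property evaluation the abstract reports.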
Affiliation(s)
- Eva S Klappe
- Department of Medical Informatics, Amsterdam Public Health Research Institute, Amsterdam UMC, University of Amsterdam, Meibergdreef 15, 1105AZ, Amsterdam, The Netherlands
- Florentien J P van Putten
- Department of Medical Informatics, Amsterdam Public Health Research Institute, Amsterdam UMC, University of Amsterdam, Meibergdreef 15, 1105AZ, Amsterdam, The Netherlands
- Nicolette F de Keizer
- Department of Medical Informatics, Amsterdam Public Health Research Institute, Amsterdam UMC, University of Amsterdam, Meibergdreef 15, 1105AZ, Amsterdam, The Netherlands
- Ronald Cornet
- Department of Medical Informatics, Amsterdam Public Health Research Institute, Amsterdam UMC, University of Amsterdam, Meibergdreef 15, 1105AZ, Amsterdam, The Netherlands
|
13
|
French FastContext: A publicly accessible system for detecting negation, temporality and experiencer in French clinical notes. J Biomed Inform 2021; 117:103733. [PMID: 33737205 DOI: 10.1016/j.jbi.2021.103733] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/13/2020] [Revised: 12/30/2020] [Accepted: 03/01/2021] [Indexed: 11/21/2022]
Abstract
The context of medical conditions is an important feature to consider when processing clinical narratives. NegEx and its extension ConText became the most well-known rule-based systems that allow determining whether a medical condition is negated, historical or experienced by someone other than the patient in English clinical text. In this paper, we present a French adaptation and enrichment of FastContext, which is the most recent, n-trie engine-based implementation of the ConText algorithm. We compiled an extensive list of French lexical cues by automatic and manual translation and enrichment. To evaluate French FastContext, we manually annotated the context of medical conditions present in two types of clinical narratives: (i) death certificates and (ii) electronic health records. Results show good performance across different context values on both types of clinical notes (on average 0.93 and 0.86 F1, respectively). Furthermore, French FastContext outperforms previously reported French systems for negation detection when compared on the same datasets and it is the first implementation of contextual temporality and experiencer identification reported for French. Finally, French FastContext has been implemented within the SIFR Annotator: a publicly accessible Web service to annotate French biomedical text data (http://bioportal.lirmm.fr/annotator). To our knowledge, this is the first implementation of a Web-based ConText-like system in a publicly accessible platform allowing non-natural-language-processing experts to both annotate and contextualize medical conditions in clinical notes.
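The trigger-and-scope idea behind NegEx-style negation detection can be sketched in a few lines of Python. This is a simplified illustration, not FastContext or French FastContext: it covers forward-scope negation only, uses plain token comparison rather than an n-trie rule engine, and ignores temporality and experiencer. The English cue lists and the function name `is_negated` are assumptions chosen for the example.

```python
import re

# Illustrative English cues (assumptions); real systems use curated lexicons.
NEGATION_TRIGGERS = ("no", "denies", "without")
TERMINATION_TERMS = ("but", "however")  # words that close a trigger's scope


def is_negated(sentence: str, condition: str) -> bool:
    """Return True if `condition` starts inside the forward scope of a negation cue."""
    tokens = re.findall(r"\w+", sentence.lower())
    target = condition.lower().split()
    if not target:
        return False
    in_scope = False
    for i, token in enumerate(tokens):
        if token in TERMINATION_TERMS:
            in_scope = False  # scope ends at a termination term
        elif token in NEGATION_TRIGGERS:
            in_scope = True   # scope opens after a negation cue
        if in_scope and tokens[i:i + len(target)] == target:
            return True
    return False
```

For the sentence "No fever but cough present", the sketch marks "fever" as negated and "cough" as not negated, because "but" closes the scope opened by "No".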
|
14
|
Shen F, Liu S, Fu S, Wang Y, Henry S, Uzuner O, Liu H. Family History Extraction From Synthetic Clinical Narratives Using Natural Language Processing: Overview and Evaluation of a Challenge Data Set and Solutions for the 2019 National NLP Clinical Challenges (n2c2)/Open Health Natural Language Processing (OHNLP) Competition. JMIR Med Inform 2021; 9:e24008. [PMID: 33502329 PMCID: PMC7875692 DOI: 10.2196/24008] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/31/2020] [Revised: 11/25/2020] [Accepted: 12/05/2020] [Indexed: 12/18/2022] Open
Abstract
Background As a risk factor for many diseases, family history (FH) captures both shared genetic variations and living environments among family members. Though there are several systems focusing on FH extraction using natural language processing (NLP) techniques, the evaluation protocol of such systems has not been standardized. Objective The n2c2/OHNLP (National NLP Clinical Challenges/Open Health Natural Language Processing) 2019 FH extraction task aims to encourage the community efforts on a standard evaluation and system development on FH extraction from synthetic clinical narratives. Methods We organized the first BioCreative/OHNLP FH extraction shared task in 2018. We continued the shared task in 2019 in collaboration with the n2c2 and OHNLP consortium, and organized the 2019 n2c2/OHNLP FH extraction track. The shared task comprises 2 subtasks. Subtask 1 focuses on identifying family member entities and clinical observations (diseases), and subtask 2 expects the association of the living status, side of the family, and clinical observations with family members to be extracted. Subtask 2 is an end-to-end task which is based on the result of subtask 1. We manually curated the first deidentified clinical narrative from FH sections of clinical notes at Mayo Clinic Rochester, the content of which is highly relevant to patients’ FH. Results A total of 17 teams from all over the world participated in the n2c2/OHNLP FH extraction shared task, where 38 runs were submitted for subtask 1 and 21 runs were submitted for subtask 2. For subtask 1, the top 3 runs were generated by Harbin Institute of Technology, ezDI, Inc., and The Medical University of South Carolina with F1 scores of 0.8745, 0.8225, and 0.8130, respectively. For subtask 2, the top 3 runs were from Harbin Institute of Technology, ezDI, Inc., and University of Florida with F1 scores of 0.681, 0.6586, and 0.6544, respectively. The workshop was held in conjunction with the AMIA 2019 Fall Symposium. 
Conclusions A wide variety of methods were used by different teams in both tasks, such as Bidirectional Encoder Representations from Transformers, convolutional neural network, bidirectional long short-term memory, conditional random field, support vector machine, and rule-based strategies. System performances show that relation extraction from FH is a more challenging task when compared to entity identification task.
Affiliation(s)
- Feichen Shen
- Division of Digital Health Sciences, Mayo Clinic, Rochester, MN, United States
- Sijia Liu
- Division of Digital Health Sciences, Mayo Clinic, Rochester, MN, United States
- Sunyang Fu
- Division of Digital Health Sciences, Mayo Clinic, Rochester, MN, United States
- Yanshan Wang
- Division of Digital Health Sciences, Mayo Clinic, Rochester, MN, United States
- Sam Henry
- Department of Information Sciences and Technology, George Mason University, Fairfax, VA, United States
- Ozlem Uzuner
- Department of Information Sciences and Technology, George Mason University, Fairfax, VA, United States; Department of Biomedical Informatics, Massachusetts Institute of Technology, Cambridge, MA, United States; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
- Hongfang Liu
- Division of Digital Health Sciences, Mayo Clinic, Rochester, MN, United States
|
15
|
Boulouis G, Stricker S, Benichi S, Hak JF, Gariel F, Alias Q, de Saint Denis T, Kossorotoff M, Bajolle F, Garzelli L, Beccaria K, Paternoster G, Bourgeois M, Garcelon N, Harroche A, Mancusi RL, Boddaert N, Puget S, Brunelle F, Blauwblomme T, Naggara O. Etiology of intracerebral hemorrhage in children: cohort study, systematic review, and meta-analysis. J Neurosurg Pediatr 2021; 27:357-363. [PMID: 33385999 DOI: 10.3171/2020.7.peds20447] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/25/2020] [Accepted: 07/16/2020] [Indexed: 11/06/2022]
Abstract
OBJECTIVE Understanding the etiological spectrum of nontraumatic pediatric intracerebral hemorrhage (pICH) is key to the diagnostic workup and care pathway. The authors aimed to evaluate the etiological spectrum of diseases underlying pICH. METHODS Children treated at the authors' institution for a pICH were included in an inception cohort initiated in 2008 and retrospectively inclusive to 2000, which was analyzed in October 2019. They then conducted a systematic review of relevant articles in PubMed published between 1990 and 2019, identifying cohorts with pICH. Identified populations and patients from the authors' cohort were pooled in a multicategory meta-analysis. RESULTS A total of 243 children with pICH were analyzed in the cohort study. The final primary diagnosis was an intracranial vascular lesion in 190 patients (78.2%), a complication of a cardiac disease in 17 (7.0%), and a coagulation disorder in 14 (5.8%). Hematological and cardiological etiologies were disproportionately more frequent in children younger than 2 years (p < 0.001). The systematic review identified 1309 children in 23 relevant records pooled in the meta-analysis. Overall, there was significant heterogeneity. The dominant etiology was vascular lesion, with an aggregate prevalence of 0.59 (95% CI 0.45-0.64; p < 0.001, Q = 302.8, I2 = 92%). In 18 studies reporting a detailed etiological spectrum, arteriovenous malformation was the dominant etiology (68.3% [95% CI 64.2%-70.9%] of all vascular causes), followed by cavernoma (15.7% [95% CI 13.0%-18.2%]). CONCLUSIONS The most frequent etiology of pICH is brain arteriovenous malformation. The probability of an underlying vascular etiology increases with age, and, conversely, hematological and cardiac causes are dominant causes in children younger than 2 years.
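The heterogeneity statistics quoted in this abstract (Cochran's Q and I2) are related by the standard Higgins formula, shown below as a worked check. The number of pooled studies, k = 24, is an assumption (the 23 records from the systematic review plus the authors' own cohort); the abstract does not state the degrees of freedom directly.

```latex
% Higgins' I^2 from Cochran's Q with df = k - 1 degrees of freedom.
% Assumption: k = 24 pooled studies (23 records plus the authors' cohort).
\[
I^2 = \max\!\left(0,\ \frac{Q - \mathrm{df}}{Q}\right) \times 100\%
\qquad\Longrightarrow\qquad
I^2 = \frac{302.8 - 23}{302.8} \times 100\% \approx 92.4\%
\]
```

Under that assumption the result agrees with the reported I2 of 92% for the pooled prevalence of vascular lesions.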
Affiliation(s)
- Grégoire Boulouis
- (1) Service d'imagerie Morphologique et Fonctionnelle, GHU Paris Psychiatrie et Neurosciences, Hospitalier Sainte Anne, Institut de Psychiatrie et Neurosciences de Paris (IPNP), UMR_S1266, INSERM, Université de Paris; (2) Pediatric Radiology Department
- Jean-François Hak
- (1) Service d'imagerie Morphologique et Fonctionnelle, GHU Paris Psychiatrie et Neurosciences, Hospitalier Sainte Anne, Institut de Psychiatrie et Neurosciences de Paris (IPNP), UMR_S1266, INSERM, Université de Paris; (2) Pediatric Radiology Department
- Florent Gariel
- (1) Service d'imagerie Morphologique et Fonctionnelle, GHU Paris Psychiatrie et Neurosciences, Hospitalier Sainte Anne, Institut de Psychiatrie et Neurosciences de Paris (IPNP), UMR_S1266, INSERM, Université de Paris; (2) Pediatric Radiology Department
- Fanny Bajolle
- (5) Unité Médico-Chirurgicale de Cardiologie Congénitale et Pédiatrique, Centre de référence Malformations Cardiaques Congénitales Complexes-M3C
- Lorenzo Garzelli
- (1) Service d'imagerie Morphologique et Fonctionnelle, GHU Paris Psychiatrie et Neurosciences, Hospitalier Sainte Anne, Institut de Psychiatrie et Neurosciences de Paris (IPNP), UMR_S1266, INSERM, Université de Paris; (2) Pediatric Radiology Department
- Annie Harroche
- (7) Department of Hematology, Haemophilia Care Centre, Hôpital Necker Enfants Malades, AP-HP, Université de Paris
- Rossella Letizia Mancusi
- (8) Délégation à la recherche clinique et à l'Innovation (DRCI), GHU Paris Psychiatrie et Neurosciences, Paris
- Nathalie Boddaert
- (2) Pediatric Radiology Department; (9) INSERM U1163, Université Paris Descartes-Sorbonne Paris Cité, Institut Imagine, and INSERM U1000, Paris, France
- Francis Brunelle
- (1) Service d'imagerie Morphologique et Fonctionnelle, GHU Paris Psychiatrie et Neurosciences, Hospitalier Sainte Anne, Institut de Psychiatrie et Neurosciences de Paris (IPNP), UMR_S1266, INSERM, Université de Paris; (2) Pediatric Radiology Department
- Olivier Naggara
- (1) Service d'imagerie Morphologique et Fonctionnelle, GHU Paris Psychiatrie et Neurosciences, Hospitalier Sainte Anne, Institut de Psychiatrie et Neurosciences de Paris (IPNP), UMR_S1266, INSERM, Université de Paris; (2) Pediatric Radiology Department
|
16
|
Neuraz A, Lerner I, Digan W, Paris N, Tsopra R, Rogier A, Baudoin D, Cohen KB, Burgun A, Garcelon N, Rance B. Natural Language Processing for Rapid Response to Emergent Diseases: Case Study of Calcium Channel Blockers and Hypertension in the COVID-19 Pandemic. J Med Internet Res 2020; 22:e20773. [PMID: 32759101 PMCID: PMC7431235 DOI: 10.2196/20773] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 06/02/2020] [Revised: 07/02/2020] [Accepted: 07/26/2020] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND A novel disease poses special challenges for informatics solutions. Biomedical informatics relies for the most part on structured data, which require a preexisting data or knowledge model; however, novel diseases do not have preexisting knowledge models. In an emergent epidemic, language processing can enable rapid conversion of unstructured text to a novel knowledge model. However, although this idea has often been suggested, no opportunity has arisen to actually test it in real time. The current coronavirus disease (COVID-19) pandemic presents such an opportunity. OBJECTIVE The aim of this study was to evaluate the added value of information from clinical text in response to emergent diseases using natural language processing (NLP). METHODS We explored the effects of long-term treatment by calcium channel blockers on the outcomes of COVID-19 infection in patients with high blood pressure during in-patient hospital stays using two sources of information: data available strictly from structured electronic health records (EHRs) and data available through structured EHRs and text mining. RESULTS In this multicenter study involving 39 hospitals, text mining increased the statistical power sufficiently to change a negative result for an adjusted hazard ratio to a positive one. Compared to the baseline structured data, the number of patients available for inclusion in the study increased by 2.95 times, the amount of available information on medications increased by 7.2 times, and the amount of additional phenotypic information increased by 11.9 times. CONCLUSIONS In our study, use of calcium channel blockers was associated with decreased in-hospital mortality in patients with COVID-19 infection. This finding was obtained by quickly adapting an NLP pipeline to the domain of the novel disease; the adapted pipeline still performed sufficiently to extract useful information. 
When that information was used to supplement existing structured data, the sample size could be increased sufficiently to see treatment effects that were not previously statistically detectable.
Affiliation(s)
- Antoine Neuraz
- Department of Biomedical Informatics, Necker-Enfant Malades Hospital, Assistance Publique - Hôpitaux de Paris (AP-HP), Paris, France
- Centre de Recherche des Cordeliers, INSERM UMRS 1138 Team 22, Université de Paris, Paris, France
- LIMSI, CNRS, Université Paris Saclay, Orsay, France
- Ivan Lerner
- Department of Biomedical Informatics, Necker-Enfant Malades Hospital, Assistance Publique - Hôpitaux de Paris (AP-HP), Paris, France
- Centre de Recherche des Cordeliers, INSERM UMRS 1138 Team 22, Université de Paris, Paris, France
- William Digan
- Centre de Recherche des Cordeliers, INSERM UMRS 1138 Team 22, Université de Paris, Paris, France
- Department of Biomedical Informatics, Georges Pompidou European Hospital, Assistance Publique - Hôpitaux de Paris (AP-HP), Paris, France
- Nicolas Paris
- DSI WIND, Assistance Publique - Hôpitaux de Paris (AP-HP), Paris, France
- Rosy Tsopra
- Centre de Recherche des Cordeliers, INSERM UMRS 1138 Team 22, Université de Paris, Paris, France
- Department of Biomedical Informatics, Georges Pompidou European Hospital, Assistance Publique - Hôpitaux de Paris (AP-HP), Paris, France
- Alice Rogier
- Centre de Recherche des Cordeliers, INSERM UMRS 1138 Team 22, Université de Paris, Paris, France
- Department of Biomedical Informatics, Georges Pompidou European Hospital, Assistance Publique - Hôpitaux de Paris (AP-HP), Paris, France
- David Baudoin
- Centre de Recherche des Cordeliers, INSERM UMRS 1138 Team 22, Université de Paris, Paris, France
- Department of Biomedical Informatics, Georges Pompidou European Hospital, Assistance Publique - Hôpitaux de Paris (AP-HP), Paris, France
- Anita Burgun
- Department of Biomedical Informatics, Necker-Enfant Malades Hospital, Assistance Publique - Hôpitaux de Paris (AP-HP), Paris, France
- Centre de Recherche des Cordeliers, INSERM UMRS 1138 Team 22, Université de Paris, Paris, France
- Department of Biomedical Informatics, Georges Pompidou European Hospital, Assistance Publique - Hôpitaux de Paris (AP-HP), Paris, France
- Nicolas Garcelon
- Centre de Recherche des Cordeliers, INSERM UMRS 1138 Team 22, Université de Paris, Paris, France
- Institut Imagine, INSERM U1163, Université Paris Descartes, Université de Paris, Paris, France
- Bastien Rance
- Centre de Recherche des Cordeliers, INSERM UMRS 1138 Team 22, Université de Paris, Paris, France
- Department of Biomedical Informatics, Georges Pompidou European Hospital, Assistance Publique - Hôpitaux de Paris (AP-HP), Paris, France
|
17
|
The "salt and pepper" pattern on renal ultrasound in a group of children with molecular-proven diagnosis of ciliopathy-related renal diseases. Pediatr Nephrol 2020; 35:1033-1040. [PMID: 32040628 DOI: 10.1007/s00467-020-04480-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/04/2019] [Revised: 12/19/2019] [Accepted: 01/13/2020] [Indexed: 10/25/2022]
Abstract
BACKGROUND While typical ultrasound patterns of ciliopathy-related cystic kidney diseases have been described in children, ultrasound findings can overlap between different diseases and atypical patterns exist. In this study, we assessed the presence of the "salt and pepper" pattern in different renal ciliopathies and looked for additional ultrasound features. METHODS This single-center, retrospective study included all patients with a molecular-proven diagnosis of renal ciliopathy, referred to our center between 2007 and 2017. Images from the first and follow-up ultrasound exams were reviewed. Basic ultrasound features were grouped into patterns and compared to genetic diagnoses. The "salt and pepper" aspect was described as enlarged kidneys with heterogeneous, increased parenchymal echogenicity. RESULTS A total of 41 children with 5 different renal ciliopathies were included (61% male; median age, 6 years [range, 3 days to 17 years]). The "salt and pepper" pattern was present in 14/15 patients with an autosomal recessive polycystic kidney disease (ARPKD). A similar pattern was found in 1/4 patients with an autosomal dominant polycystic kidney disease and in 1/11 patients with HNF1B mutation. Additional signs found were areas of cortical sparing, comet-tail artifacts, and color comet-tail artifacts. CONCLUSION Although the "salt and pepper" ultrasound pattern is predominantly found in ARPKD, it may be detected in other ciliopathies. The color comet-tail artifact is an interesting sign when suspecting a renal ciliopathy in case of enlarged hyperechoic kidneys with no detectable microcysts on B-mode grayscale ultrasound.
|
18
|
Electronic health records for the diagnosis of rare diseases. Kidney Int 2020; 97:676-686. [DOI: 10.1016/j.kint.2019.11.037] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 04/15/2019] [Revised: 11/15/2019] [Accepted: 11/22/2019] [Indexed: 01/13/2023]
|
19
|
Touzé R, Heuzé Y, Robert MP, Brémond-Gignac D, Roux CJ, James S, Paternoster G, Arnaud E, Khonsari RH. Extraocular muscle positions in anterior plagiocephaly: V-pattern strabismus explained using geometric mophometrics. Br J Ophthalmol 2019; 104:1156-1160. [PMID: 31694836 DOI: 10.1136/bjophthalmol-2019-314989] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/24/2019] [Revised: 10/02/2019] [Accepted: 10/20/2019] [Indexed: 11/04/2022]
Abstract
INTRODUCTION Ophthalmological involvement in anterior plagiocephaly (AP) due to unicoronal synostosis (UCS) raises management challenges. Two abnormalities of the extraocular muscles (EOM) are commonly reported in UCS without objective quantification: (1) excyclorotation of the eye and (2) malposition of the trochlea of the superior oblique muscle. Here we aimed to assess the positions of the EOM in AP, using geometric morphometrics based on MRI data. MATERIALS AND METHODS Patient files were listed using Dr WareHouse, a dedicated big data search engine. We included all patients with AP managed between 2013 and 2018, with an available digital preoperative MRI. MRIs from age-matched controls without craniofacial conditions were also included. We defined 13 orbital and skull base landmarks in order to model the 3D position of the EOM. Cephalometric analyses and geometric morphometrics with Procrustes superimposition and principal component analysis were used with the aim of defining specific EOM anomalies in UCS. RESULTS We included 15 preoperative and 7 postoperative MRIs from patients with UCS and 24 MRIs from age-matched controls. Cephalometric analyses, Procrustes superimposition and distance computations showed a significant shape difference for the position of the trochlea of the superior oblique muscle and an excyclorotation of the EOM. CONCLUSIONS Our results confirm that UCS-associated anomalies of the superior oblique muscle function are associated with malposition of its trochlea in the roof of the orbit. This clinical anomaly supports the importance of MRI imaging in the surgical management of strabismus in patients with UCS.
Collapse
Affiliation(s)
- Romain Touzé
- Department of Ophthalmology, Hôpital Universitaire Necker - Enfants Malades, Assistance Publique - Hôpitaux de Paris; Université de Paris, Sorbonne Paris Cité, Paris, France
| | - Yann Heuzé
- CRNS, Université de Bordeaux, MCC, PACEA, UMR5199, Pessac, France
| | - Matthieu P Robert
- Department of Ophthalmology, Hôpital Universitaire Necker - Enfants Malades, Assistance Publique - Hôpitaux de Paris; Université de Paris, Sorbonne Paris Cité, Paris, France.,COGNAC-G, UMR 8257, CNRS-SSA-Université de Paris, Paris, France
| | - Dominique Brémond-Gignac
- Department of Ophthalmology, Hôpital Universitaire Necker - Enfants Malades, Assistance Publique - Hôpitaux de Paris; Université de Paris, Sorbonne Paris Cité, Paris, France
| | - Charles-Joris Roux
- Department of Pediatric Radiology, Hôpital Universitaire Necker - Enfants Malades, Assistance Publique - Hôpitaux de Paris; Université de Paris, Sorbonne Paris Cité, Paris, France
| | - Syril James
- Department of Neurosurgery, Craniofacial surgery unit, Hôpital Universitaire Necker - Enfants Malades, Assistance Publique - Hôpitaux de Paris; Centre de Référence des Malformations Craniofaciale CRANIOST, Filière Maladies Rares TeteCou; Université Paris Descartes, Université de Paris, Paris, France.,Department of Neurosurgery, Clinique Marcel Sembat, Boulogne-Billancourt, France
| | - Giovanna Paternoster
- Department of Neurosurgery, Craniofacial surgery unit, Hôpital Universitaire Necker - Enfants Malades, Assistance Publique - Hôpitaux de Paris; Centre de Référence des Malformations Craniofaciale CRANIOST, Filière Maladies Rares TeteCou; Université Paris Descartes, Université de Paris, Paris, France
| | - Eric Arnaud
- Department of Neurosurgery, Craniofacial surgery unit, Hôpital Universitaire Necker - Enfants Malades, Assistance Publique - Hôpitaux de Paris; Centre de Référence des Malformations Craniofaciale CRANIOST, Filière Maladies Rares TeteCou; Université Paris Descartes, Université de Paris, Paris, France.,Department of Neurosurgery, Clinique Marcel Sembat, Boulogne-Billancourt, France
| | - Roman Hossein Khonsari
- Department of Neurosurgery, Craniofacial surgery unit, Hôpital Universitaire Necker - Enfants Malades, Assistance Publique - Hôpitaux de Paris; Centre de Référence des Malformations Craniofaciale CRANIOST, Filière Maladies Rares TeteCou; Université Paris Descartes, Université de Paris, Paris, France.,Department of Maxillo-Facial Surgery and Plastic Surgery, Hôpital Universitaire Necker - Enfants Malades, Assistance Publique - Hôpitaux de Paris; Centre de Référence des Malformations Rares de la Face et de la Cavité Buccale MAFACE, Filière Maladies Rares TeteCou; Université Paris Descartes, Université de Paris, Paris, France
| |
|
20
|
Sheikhalishahi S, Miotto R, Dudley JT, Lavelli A, Rinaldi F, Osmani V. Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review. JMIR Med Inform 2019; 7:e12239. [PMID: 31066697 PMCID: PMC6528438 DOI: 10.2196/12239] [Citation(s) in RCA: 198] [Impact Index Per Article: 39.6] [Received: 09/17/2018] [Revised: 03/04/2019] [Accepted: 03/24/2019] [Indexed: 01/08/2023] Open
Abstract
Background Novel approaches that complement and go beyond evidence-based medicine are required in the domain of chronic diseases, given the growing incidence of such conditions on the worldwide population. A promising avenue is the secondary use of electronic health records (EHRs), where patient data are analyzed to conduct clinical and translational research. Methods based on machine learning to process EHRs are resulting in improved understanding of patient clinical trajectories and chronic disease risk prediction, creating a unique opportunity to derive previously unknown clinical insights. However, a wealth of clinical histories remains locked behind clinical narratives in free-form text. Consequently, unlocking the full potential of EHR data is contingent on the development of natural language processing (NLP) methods to automatically transform clinical text into structured clinical data that can guide clinical decisions and potentially delay or prevent disease onset. Objective The goal of the research was to provide a comprehensive overview of the development and uptake of NLP methods applied to free-text clinical notes related to chronic diseases, including the investigation of challenges faced by NLP methodologies in understanding clinical narratives. Methods Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed and searches were conducted in 5 databases using “clinical notes,” “natural language processing,” and “chronic disease” and their variations as keywords to maximize coverage of the articles. Results Of the 2652 articles considered, 106 met the inclusion criteria. Review of the included papers resulted in identification of 43 chronic diseases, which were then further classified into 10 disease categories using the International Classification of Diseases, 10th Revision. The majority of studies focused on diseases of the circulatory system (n=38) while endocrine and metabolic diseases were fewest (n=14). 
This was due to the structure of clinical records related to metabolic diseases, which typically contain much more structured data, compared with medical records for diseases of the circulatory system, which focus more on unstructured data and consequently have seen a stronger focus of NLP. The review has shown that there is a significant increase in the use of machine learning methods compared to rule-based approaches; however, deep learning methods remain emergent (n=3). Consequently, the majority of works focus on classification of disease phenotype with only a handful of papers addressing extraction of comorbidities from the free text or integration of clinical notes with structured data. There is a notable use of relatively simple methods, such as shallow classifiers (or combination with rule-based methods), due to the interpretability of predictions, which still represents a significant issue for more complex methods. Finally, scarcity of publicly available data may also have contributed to insufficient development of more advanced methods, such as extraction of word embeddings from clinical notes. Conclusions Efforts are still required to improve (1) progression of clinical NLP methods from extraction toward understanding; (2) recognition of relations among entities rather than entities in isolation; (3) temporal extraction to understand past, current, and future clinical events; (4) exploitation of alternative sources of clinical knowledge; and (5) availability of large-scale, de-identified clinical corpora.
Affiliation(s)
- Seyedmostafa Sheikhalishahi
- eHealth Research Group, Fondazione Bruno Kessler Research Institute, Trento, Italy.,Department of Information Engineering and Computer Science, University of Trento, Trento, Italy
| | - Riccardo Miotto
- Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Joel T Dudley
- Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Alberto Lavelli
- NLP Research Group, Fondazione Bruno Kessler Research Institute, Trento, Italy
| | - Fabio Rinaldi
- Institute of Computational Linguistics, University of Zurich, Zurich, Switzerland
| | - Venet Osmani
- eHealth Research Group, Fondazione Bruno Kessler Research Institute, Trento, Italy
| |
|
21
|
Dietrich G, Krebs J, Liman L, Fette G, Ertl M, Kaspar M, Störk S, Puppe F. Replicating medication trend studies using ad hoc information extraction in a clinical data warehouse. BMC Med Inform Decis Mak 2019; 19:15. [PMID: 30658633 PMCID: PMC6339317 DOI: 10.1186/s12911-018-0729-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/27/2018] [Accepted: 12/21/2018] [Indexed: 11/16/2022] Open
Abstract
Background Medication trend studies show the changes of medication over the years and may be replicated using a clinical Data Warehouse (CDW). Even nowadays, a lot of the patient information, like medication data, in the EHR is stored in the format of free text. As the conventional approach of information extraction (IE) demands a high developmental effort, we used ad hoc IE instead. This technique queries information and extracts it on the fly from texts contained in the CDW. Methods We present a generalizable approach of ad hoc IE for pharmacotherapy (medications and their daily dosage) presented in hospital discharge letters. We added import and query features to the CDW system, like error tolerant queries to deal with misspellings and proximity search for the extraction of the daily dosage. During the data integration process in the CDW, negated, historical and non-patient context data are filtered. For the replication studies, we used a drug list grouped by ATC (Anatomical Therapeutic Chemical Classification System) codes as input for queries to the CDW. Results We achieve an F1 score of 0.983 (precision 0.997, recall 0.970) for extracting medication from discharge letters and an F1 score of 0.974 (precision 0.977, recall 0.972) for extracting the dosage. We replicated three published medical trend studies for hypertension, atrial fibrillation and chronic kidney disease. Overall, 93% of the main findings could be replicated, 68% of sub-findings, and 75% of all findings. One study could be completely replicated with all main and sub-findings. Conclusion A novel approach for ad hoc IE is presented. It is very suitable for basic medical texts like discharge letters and finding reports. Ad hoc IE is by definition more limited than conventional IE and does not claim to replace it, but it substantially exceeds the search capabilities of many CDWs and it is convenient to conduct replication studies fast and with high quality.
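The F1 scores reported above are the harmonic mean of the stated precision and recall. As a quick check, the following minimal sketch recomputes them from the figures in the abstract; the function name `f1_score` is an illustrative choice, not something taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
medication_f1 = f1_score(0.997, 0.970)
dosage_f1 = f1_score(0.977, 0.972)

print(round(medication_f1, 3))  # 0.983
print(round(dosage_f1, 3))      # 0.974
```

Both values agree with the reported F1 scores after rounding to three decimals.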
Affiliation(s)
- Georg Dietrich
- Computer Science, University of Würzburg, Am Hubland, Würzburg, 97074, Germany.
| | - Jonathan Krebs
- Computer Science, University of Würzburg, Am Hubland, Würzburg, 97074, Germany
| | - Leon Liman
- Computer Science, University of Würzburg, Am Hubland, Würzburg, 97074, Germany
| | - Georg Fette
- Computer Science, University of Würzburg, Am Hubland, Würzburg, 97074, Germany.,Comprehensive Heart Failure Center, University and University Hospital of Würzburg, Am Schwarzenberg 15, Würzburg, 97078, Germany
| | - Maximilian Ertl
- Service Center Medical Informatics, University Hospital of Würzburg, Schweinfurter Strasse 4, Würzburg, 97078, Germany
| | - Mathias Kaspar
- Comprehensive Heart Failure Center, University and University Hospital of Würzburg, Am Schwarzenberg 15, Würzburg, 97078, Germany
| | - Stefan Störk
- Comprehensive Heart Failure Center, University and University Hospital of Würzburg, Am Schwarzenberg 15, Würzburg, 97078, Germany
| | - Frank Puppe
- Computer Science, University of Würzburg, Am Hubland, Würzburg, 97074, Germany
| |
|
22
|
Cross Disciplinary Consultancy to Bridge Public Health Technical Needs and Analytic Developers: Negation Detection Use Case. Online J Public Health Inform 2018; 10:e209. [PMID: 30349627 PMCID: PMC6194092 DOI: 10.5210/ojphi.v10i2.8944] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/24/2022] Open
Abstract
This paper describes a continuing initiative of the International Society for Disease Surveillance designed to bring together public health practitioners and analytics solution developers from both academia and industry. Funded by the Defense Threat Reduction Agency, a series of consultancies have been conducted on a range of topics of pressing concern to public health (e.g. developing methods to enhance prediction of asthma exacerbation, developing tools for asyndromic surveillance from chief complaints). The topic of this final consultancy, conducted at the University of Utah in January 2017, is focused on defining a roadmap for the development of algorithms, tools, and datasets for improving the capabilities of text processing algorithms to identify negated terms (i.e. negation detection) in free-text chief complaints and triage reports.
|
23
|
Névéol A, Zweigenbaum P. Expanding the Diversity of Texts and Applications: Findings from the Section on Clinical Natural Language Processing of the International Medical Informatics Association Yearbook. Yearb Med Inform 2018; 27:193-198. [PMID: 30157523 PMCID: PMC6115241 DOI: 10.1055/s-0038-1667080] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/18/2022] Open
Abstract
Objectives:
To summarize recent research and present a selection of the best papers published in 2017 in the field of clinical Natural Language Processing (NLP).
Methods:
A survey of the literature was performed by the two editors of the NLP section of the International Medical Informatics Association (IMIA) Yearbook. Bibliographic databases PubMed and Association of Computational Linguistics (ACL) Anthology were searched for papers with a focus on NLP efforts applied to clinical texts or aimed at a clinical outcome. A total of 709 papers were automatically ranked and then manually reviewed based on title and abstract. A shortlist of 15 candidate best papers was selected by the section editors and peer-reviewed by independent external reviewers to come to the three best clinical NLP papers for 2017.
Results:
Clinical NLP best papers provide a contribution that ranges from methodological studies to the application of research results to practical clinical settings. They draw from text genres as diverse as clinical narratives across hospitals and languages or social media.
Conclusions:
Clinical NLP continued to thrive in 2017, with an increasing number of contributions towards applications compared to fundamental methods. Methodological work explores deep learning and system adaptation across language variants. Research results continue to translate into freely available tools and corpora, mainly for the English language.
|
24
|
Garcelon N, Neuraz A, Salomon R, Bahi-Buisson N, Amiel J, Picard C, Mahlaoui N, Benoit V, Burgun A, Rance B. Next generation phenotyping using narrative reports in a rare disease clinical data warehouse. Orphanet J Rare Dis 2018; 13:85. [PMID: 29855327 PMCID: PMC5984368 DOI: 10.1186/s13023-018-0830-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 10/04/2017] [Accepted: 05/23/2018] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND Secondary use of data collected in Electronic Health Records opens perspectives for increasing our knowledge of rare diseases. The clinical data warehouse (named Dr. Warehouse) at the Necker-Enfants Malades Children's Hospital contains data collected during normal care for thousands of patients. Dr. Warehouse is oriented toward the exploration of clinical narratives. In this study, we present our method to find phenotypes associated with diseases of interest. METHODS We leveraged the frequency and TF-IDF to explore the association between clinical phenotypes and rare diseases. We applied our method in six use cases: phenotypes associated with the Rett, Lowe, Silver Russell, Bardet-Biedl syndromes, DOCK8 deficiency and Activated PI3-kinase Delta Syndrome (APDS). We asked domain experts to evaluate the relevance of the top-50 (for frequency and TF-IDF) phenotypes identified by Dr. Warehouse and computed the average precision and mean average precision. RESULTS Experts concluded that between 16 and 39 phenotypes could be considered as relevant in the top-50 phenotypes ranked by descending frequency discovered by Dr. Warehouse (resp. between 11 and 41 for TF-IDF). Average precision ranges from 0.55 to 0.91 for frequency and 0.52 to 0.95 for TF-IDF. Mean average precision was 0.79. Our study suggests that phenotypes identified in clinical narratives stored in Electronic Health Record can provide rare disease specialists with candidate phenotypes that can be used in addition to the literature. CONCLUSIONS Clinical Data Warehouses can be used to perform Next Generation Phenotyping, especially in the context of rare diseases. We have developed a method to detect phenotypes associated with a group of patients using medical concepts extracted from free-text clinical narratives.
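The evaluation above relies on average precision (AP) per use case and mean average precision (MAP) across use cases. The sketch below shows one common way to compute both from expert relevance judgments over a ranked list. It assumes the standard definition in which AP is normalized by the number of relevant items found in the list; the abstract does not state which normalization the authors used, and the toy rankings are invented for illustration.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """AP of one ranked list; relevance[i] is the judgment at rank i + 1."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-list AP values (0.0 for an empty collection)."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Toy judgments (hypothetical): True marks a phenotype judged relevant.
toy_rankings = [
    [True, False, True, True],
    [False, True],
]
toy_map = mean_average_precision(toy_rankings)
```

In the study's setting, each list would hold the expert judgments for the top-50 phenotypes of one disease, ranked either by frequency or by TF-IDF.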
Affiliation(s)
- Nicolas Garcelon
- Institut Imagine, Paris Descartes-Sorbonne Paris Cité University, Paris, France
- Institut National de la Santé et de la Recherche Médicale (INSERM), Centre de Recherche des Cordeliers, UMR 1138 Equipe 22, Paris Descartes, Sorbonne Paris Cité University, Paris, France
- Imagine - Institute of Genetic Diseases, 24 boulevard du Montparnasse, 75015 Paris, France
| | - Antoine Neuraz
- Institut National de la Santé et de la Recherche Médicale (INSERM), Centre de Recherche des Cordeliers, UMR 1138 Equipe 22, Paris Descartes, Sorbonne Paris Cité University, Paris, France
- Department of Medical Informatics, Necker-Enfants Malades Hospital, Assistance Publique des Hôpitaux de Paris (AP-HP), Paris, France
| | - Rémi Salomon
- Institut Imagine, Paris Descartes-Sorbonne Paris Cité University, Paris, France
- Pediatric Nephrology, Necker Enfants Malades Hospital AP-HP, Université Paris Descartes, Paris, France
| | - Nadia Bahi-Buisson
- Institut Imagine, Paris Descartes-Sorbonne Paris Cité University, Paris, France
- Pediatric Neurology, Necker Enfants Malades Hospital AP-HP, Université Paris Descartes, Paris, France
| | - Jeanne Amiel
- Institut Imagine, Paris Descartes-Sorbonne Paris Cité University, Paris, France
- Laboratory of embryology and genetics of congenital malformations, INSERM UMR 1163, Institut Imagine, Paris, France
- Department of Genetic, Necker Enfants Malades Hospital AP-HP, Université Paris Descartes, Paris, France
| | - Capucine Picard
- Institut Imagine, Paris Descartes-Sorbonne Paris Cité University, Paris, France
- Laboratory of Lymphocyte Activation and Susceptibility to EBV infection, INSERM UMR 1163, Paris Descartes Sorbonne Paris Cité University, Imagine Institute, Paris, France
- Study center for primary immunodeficiencies (CEDI) Necker Enfants Malades Hospital AP-HP, Université Paris Descartes, Paris, France
| | - Nizar Mahlaoui
- Institut Imagine, Paris Descartes-Sorbonne Paris Cité University, Paris, France
- Laboratory of Lymphocyte Activation and Susceptibility to EBV infection, INSERM UMR 1163, Paris Descartes Sorbonne Paris Cité University, Imagine Institute, Paris, France
- French National Reference Center for Primary Immuno Deficiencies (CEREDIH), Necker Enfants Malades Hospital AP-HP, Université Paris Descartes, Paris, France
- Pediatric Immuno-Haematology and Rheumatology Necker Enfants Malades Hospital AP-HP, Université Paris Descartes, Paris, France
| | - Vincent Benoit
- Institut Imagine, Paris Descartes-Sorbonne Paris Cité University, Paris, France
| | - Anita Burgun
- Institut National de la Santé et de la Recherche Médicale (INSERM), Centre de Recherche des Cordeliers, UMR 1138 Equipe 22, Paris Descartes, Sorbonne Paris Cité University, Paris, France
- Department of Medical Informatics, Necker-Enfants Malades Hospital, Assistance Publique des Hôpitaux de Paris (AP-HP), Paris, France
- Hôpital Européen Georges Pompidou, AP-HP, Université Paris Descartes, Sorbonne Paris Cité, Paris, France
| | - Bastien Rance
- Institut National de la Santé et de la Recherche Médicale (INSERM), Centre de Recherche des Cordeliers, UMR 1138 Equipe 22, Paris Descartes, Sorbonne Paris Cité University, Paris, France
- Hôpital Européen Georges Pompidou, AP-HP, Université Paris Descartes, Sorbonne Paris Cité, Paris, France
| |
|
25
|
Dietrich G, Krebs J, Fette G, Ertl M, Kaspar M, Störk S, Puppe F. Ad Hoc Information Extraction for Clinical Data Warehouses. Methods Inf Med 2018; 57:e22-e29. [PMID: 29801178 PMCID: PMC6193399 DOI: 10.3414/me17-02-0010] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/27/2022]
Abstract
Background:
Clinical Data Warehouses (CDW) reuse electronic health records (EHR) to make their data retrievable for research purposes or patient recruitment for clinical trials. However, much information is hidden in unstructured data like discharge letters. These texts can be preprocessed and converted to structured data via information extraction (IE), which is unfortunately a laborious task and therefore usually not available for most of the text data in a CDW.
Objectives:
The goal of our work is to provide an ad hoc IE service that allows users to query text data ad hoc in a manner similar to querying structured data in a CDW. While search engines just return text snippets, our system also returns frequencies (e.g. how many patients exist with “heart failure” including textual synonyms or how many patients have an LVEF < 45) based on the content of discharge letters or textual reports for special investigations like heart echo. Three subtasks are addressed: (1) to recognize and exclude negations and their scopes, (2) to extract concepts, i.e. Boolean values, and (3) to extract numerical values.
Methods:
We implemented an extended version of the NegEx-algorithm for German texts that detects negations and determines their scope. Furthermore, our document oriented CDW PaDaWaN was extended with query functions, e.g. context sensitive queries and regex queries, and an extraction mode for computing the frequencies for Boolean and numerical values.
Results:
Evaluations in chest X-ray reports and in discharge letters showed high F1-scores for the three subtasks: Detection of negated concepts in chest X-ray reports with an F1-score of 0.99 and in discharge letters with 0.97; of Boolean values in chest X-ray reports about 0.99, and of numerical values in chest X-ray reports and discharge letters also around 0.99 with the exception of the concept age.
Discussion:
The advantages of an ad hoc IE over a standard IE are the low development effort (just entering the concept with its variants), the promptness of the results and the adaptability by the user to his or her particular question. Disadvantages are usually lower accuracy and confidence.
This ad hoc information extraction approach is novel and exceeds existing systems: Roogle [1] extracts predefined concepts from texts at preprocessing and makes them retrievable at runtime. Dr. Warehouse [2] applies negation detection and indexes the produced subtexts which include affirmed findings. Our approach combines negation detection and the extraction of concepts. But the extraction does not take place during preprocessing, but at runtime. That provides an ad hoc, dynamic, interactive and adjustable information extraction of random concepts and even their values on the fly at runtime.
Conclusions:
We developed an ad hoc information extraction query feature for Boolean and numerical values within a CDW with high recall and precision based on a pipeline that detects and removes negations and their scope in clinical texts.
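The negation step described in the abstract follows the NegEx idea: a trigger phrase opens a negation scope that runs forward until a terminating token or a window limit. The sketch below is a heavily simplified illustration of that idea, not the authors' extended German implementation. The trigger list, terminator list, window size, and English example sentence are all assumptions made for this example.

```python
import re

# Hypothetical, deliberately tiny lexicons; a real NegEx list is far larger.
NEGATION_TRIGGERS = ("no", "not", "without", "denies")
SCOPE_TERMINATORS = ("but", "however", ".", ",", ";")
MAX_SCOPE_TOKENS = 5  # assumed window size


def tokenize(text):
    """Lowercase the text and split it into words and punctuation marks."""
    return re.findall(r"\w+|[.,;]", text.lower())


def negated_token_indices(tokens):
    """Return the indices of tokens that fall inside a negation scope."""
    negated = set()
    i = 0
    while i < len(tokens):
        if tokens[i] in NEGATION_TRIGGERS:
            j = i + 1
            # The scope runs forward until a terminator or the window limit.
            while (j < len(tokens) and j <= i + MAX_SCOPE_TOKENS
                   and tokens[j] not in SCOPE_TERMINATORS):
                negated.add(j)
                j += 1
            i = j
        else:
            i += 1
    return negated


def is_affirmed(text, concept):
    """True if the single-word concept occurs outside every negation scope."""
    tokens = tokenize(text)
    negated = negated_token_indices(tokens)
    return any(tok == concept and idx not in negated
               for idx, tok in enumerate(tokens))


report = "No pneumothorax, but effusion on the left."
print(is_affirmed(report, "pneumothorax"))  # False
print(is_affirmed(report, "effusion"))      # True
```

Counting the patients for whom `is_affirmed` holds would give the kind of Boolean frequency the abstract mentions; multi-word concepts and synonyms would need additional matching logic.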
Affiliation(s)
- Georg Dietrich
- Computer Science, University of Wuerzburg, Wuerzburg, Germany
- Correspondence to: Georg Dietrich, University of Wuerzburg, Computer Science, Am Hubland, 97070 Wuerzburg, Germany
| | - Jonathan Krebs
- Computer Science, University of Wuerzburg, Wuerzburg, Germany
| | - Georg Fette
- Computer Science, University of Wuerzburg, Wuerzburg, Germany
- Comprehensive Heart Failure Center (CHFC), University Hospital of Wuerzburg, Wuerzburg, Germany
| | - Maximilian Ertl
- Service Center Medical Informatics, University Hospital of Wuerzburg, Wuerzburg, Germany
| | - Mathias Kaspar
- Comprehensive Heart Failure Center (CHFC), University Hospital of Wuerzburg, Wuerzburg, Germany
| | - Stefan Störk
- Comprehensive Heart Failure Center (CHFC), University Hospital of Wuerzburg, Wuerzburg, Germany
| | - Frank Puppe
- Computer Science, University of Wuerzburg, Wuerzburg, Germany
| |
|
26
|
Garcelon N, Neuraz A, Salomon R, Faour H, Benoit V, Delapalme A, Munnich A, Burgun A, Rance B. A clinician friendly data warehouse oriented toward narrative reports: Dr. Warehouse. J Biomed Inform 2018; 80:52-63. [DOI: 10.1016/j.jbi.2018.02.019] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 09/21/2017] [Revised: 02/22/2018] [Accepted: 02/28/2018] [Indexed: 01/26/2023]
|
27
|
Levasseur J, Nysjö J, Sandy R, Britto JA, Garcelon N, Haber S, Picard A, Corre P, Odri GA, Khonsari RH. Orbital volume and shape in Treacher Collins syndrome. J Craniomaxillofac Surg 2018; 46:305-311. [DOI: 10.1016/j.jcms.2017.11.028] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 08/20/2017] [Revised: 11/01/2017] [Accepted: 11/30/2017] [Indexed: 01/22/2023] Open
|
28
|
Scheurwegs E, Sushil M, Tulkens S, Daelemans W, Luyckx K. Counting trees in Random Forests: Predicting symptom severity in psychiatric intake reports. J Biomed Inform 2017; 75S:S112-S119. [PMID: 28602906 PMCID: PMC5705466 DOI: 10.1016/j.jbi.2017.06.007] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 02/01/2017] [Revised: 05/31/2017] [Accepted: 06/05/2017] [Indexed: 11/29/2022]
Abstract
The CEGS N-GRID 2016 Shared Task (Filannino et al., 2017) in Clinical Natural Language Processing introduces the assignment of a severity score to a psychiatric symptom, based on a psychiatric intake report. We present a method that employs the inherent interview-like structure of the report to extract relevant information from the report and generate a representation. The representation consists of a restricted set of psychiatric concepts (and the context they occur in), identified using medical concepts defined in UMLS that are directly related to the psychiatric diagnoses present in the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition (DSM-IV) ontology. Random Forests provides a generalization of the extracted, case-specific features in our representation. The best variant presented here scored an inverse mean absolute error (MAE) of 80.64%. A concise concept-based representation, paired with identification of concept certainty and scope (family, patient), shows a robust performance on the task.
Affiliation(s)
- Elyne Scheurwegs
- University of Antwerp, Computational Linguistics and Psycholinguistics (CLiPS) Research Center, Lange Winkelstraat 40-42, B-2000 Antwerp, Belgium; University of Antwerp, Advanced Database Research and Modelling Research Group (ADReM), Middelheimlaan 1, B-2020 Antwerp, Belgium; Antwerp University Hospital, ICT Department, Wilrijkstraat 10, B-2650 Edegem, Belgium.
| | - Madhumita Sushil
- University of Antwerp, Computational Linguistics and Psycholinguistics (CLiPS) Research Center, Lange Winkelstraat 40-42, B-2000 Antwerp, Belgium; Antwerp University Hospital, ICT Department, Wilrijkstraat 10, B-2650 Edegem, Belgium
| | - Stéphan Tulkens
- University of Antwerp, Computational Linguistics and Psycholinguistics (CLiPS) Research Center, Lange Winkelstraat 40-42, B-2000 Antwerp, Belgium
| | - Walter Daelemans
- University of Antwerp, Computational Linguistics and Psycholinguistics (CLiPS) Research Center, Lange Winkelstraat 40-42, B-2000 Antwerp, Belgium
| | - Kim Luyckx
- Antwerp University Hospital, ICT Department, Wilrijkstraat 10, B-2650 Edegem, Belgium
| |
|
29
|
Finding patients using similarity measures in a rare diseases-oriented clinical data warehouse: Dr. Warehouse and the needle in the needle stack. J Biomed Inform 2017; 73:51-61. [PMID: 28754522 DOI: 10.1016/j.jbi.2017.07.016] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 02/24/2017] [Revised: 07/05/2017] [Accepted: 07/24/2017] [Indexed: 11/22/2022]
Abstract
OBJECTIVE In the context of rare diseases, it may be helpful to detect patients with similar medical histories, diagnoses and outcomes from a large number of cases with automated methods. To reduce the time to find new cases, we developed a method to find similar patients given an index case, leveraging data from the electronic health records. MATERIALS AND METHODS We used the clinical data warehouse of a children's academic hospital in Paris, France (Necker-Enfants Malades), containing about 400,000 patients. Our model was based on a vector space model (VSM) to compute the similarity distance between an index patient and all the patients of the data warehouse. The dimensions of the VSM were built upon Unified Medical Language System concepts extracted from clinical narratives stored in the clinical data warehouse. The VSM was enhanced using three parameters: a pertinence score (TF-IDF of the concepts), the polarity of the concept (negated/not negated) and the minimum number of concepts in common. We evaluated this model by displaying the most similar patients for five different rare diseases: Lowe Syndrome (LOWE), Dystrophic Epidermolysis Bullosa (DEB), Activated PI3K delta Syndrome (APDS), Rett Syndrome (RETT) and Dowling Meara (EBS-DM), from the clinical data warehouse representing 18, 103, 21, 84 and 7 patients respectively. RESULTS The percentages of index patients returning at least one true positive similar patient in the Top30 similar patients were 94% for LOWE, 97% for DEB, 86% for APDS, 71% for EBS-DM and 99% for RETT. The mean number of patients with the exact same genetic diseases among the 30 returned patients was 51%. CONCLUSION This tool offers new perspectives in a translational context to identify patients for genetic research. Moreover, when new molecular bases are discovered, our strategy will help to identify additional eligible patients for genetic screening.
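The vector space model described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each patient is a list of `(concept, negated)` pairs so that polarity becomes part of the dimension, weights are plain term frequency times `log(N / df)`, and similarity is cosine similarity. The patient identifiers, concept labels, and function names are all hypothetical.

```python
import math
from collections import Counter


def idf(patients):
    """Inverse document frequency of each (concept, negated) dimension."""
    n = len(patients)
    df = Counter()
    for concepts in patients.values():
        df.update(set(concepts))
    return {c: math.log(n / d) for c, d in df.items()}


def tfidf_vector(concepts, idf_weights):
    """Sparse TF-IDF vector of one patient."""
    tf = Counter(concepts)
    return {c: count * idf_weights.get(c, 0.0) for c, count in tf.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[c] for c, w in u.items() if c in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(index_id, patients, min_shared=1, top_k=30):
    """Rank the other patients by similarity to the index patient."""
    weights = idf(patients)
    index_concepts = set(patients[index_id])
    index_vec = tfidf_vector(patients[index_id], weights)
    scored = []
    for pid, concepts in patients.items():
        if pid == index_id:
            continue
        # Enforce the minimum number of concepts in common.
        if len(index_concepts & set(concepts)) < min_shared:
            continue
        scored.append((pid, cosine(index_vec, tfidf_vector(concepts, weights))))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy warehouse (hypothetical): the second tuple element is the negation flag.
toy_patients = {
    "p1": [("A", False), ("B", False)],
    "p2": [("A", False), ("B", False), ("C", False)],
    "p3": [("A", True), ("D", False)],
    "p4": [("D", False), ("E", False)],
}
ranking = most_similar("p1", toy_patients)
```

In this toy data, patient `p3` mentions concept `A` only in negated form, so it shares no dimension with `p1` and is filtered out by the minimum-overlap rule.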
|