Reference Citation Analysis

For: Deléger L, Namer F, Zweigenbaum P. Morphosemantic parsing of medical compound words: transferring a French analyzer to English. Int J Med Inform 2008;78 Suppl 1:S48-55. [PMID: 18801700 DOI: 10.1016/j.ijmedinf.2008.07.016] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 02/20/2008] [Revised: 06/24/2008] [Accepted: 07/30/2008] [Indexed: 11/25/2022]

Cited by Other Article(s)

Cassim N, Mapundu M, Olago V, Celik T, George JA, Glencross DK. Using text mining techniques to extract prostate cancer predictive information (Gleason score) from semi-structured narrative laboratory reports in the Gauteng province, South Africa. BMC Med Inform Decis Mak 2021;21:330. [PMID: 34823522 PMCID: PMC8614040 DOI: 10.1186/s12911-021-01697-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2021] [Accepted: 11/18/2021] [Indexed: 12/24/2022]

Abstract

Background

Prostate cancer (PCa) is the leading male neoplasm in South Africa with an age-standardised incidence rate of 68.0 per 100,000 population in 2018. The Gleason score (GS) is the strongest predictive factor for PCa treatment and is embedded within semi-structured prostate biopsy narrative reports. The manual extraction of the GS is labour-intensive. The objective of our study was to explore the use of text mining techniques to automate the extraction of the GS from irregularly reported text-intensive patient reports.

Methods

We used the associated Systematized Nomenclature of Medicine clinical terms morphology and topography codes to identify prostate biopsies with a PCa diagnosis for men aged > 30 years between 2006 and 2016 in the Gauteng Province, South Africa. We developed a text mining algorithm to extract the GS from 1000 biopsy reports with a PCa diagnosis from the National Health Laboratory Service database and validated the algorithm using 1000 biopsies from the private sector. The logical steps for the algorithm were data acquisition, pre-processing, feature extraction, feature value representation, feature selection, information extraction, classification, and discovered knowledge. We evaluated the algorithm using precision, recall and F-score. The GS was manually coded by two experts for both datasets. The top five GS were reported, with the remaining scores categorised as “Other” for both datasets. The percentage of biopsies with a high-risk GS (≥ 8) was also reported.
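The evaluation step named above (precision, recall and F-score against manually coded values) can be illustrated with a minimal Python sketch. The function name, the per-report dictionaries and the counting convention (a wrong extraction counts as both a false positive and a false negative) are assumptions made for illustration; the abstract does not say how the authors computed these figures.

```python
def precision_recall_f1(predicted, gold):
    """Compare extracted values with manually coded values, report by report.

    predicted / gold: dicts mapping a report id to the extracted / coded value,
    or None when nothing was extracted / coded (hypothetical data layout).
    """
    # True positive: a value was extracted and it matches the manual code.
    tp = sum(1 for rid, p in predicted.items()
             if p is not None and gold.get(rid) == p)
    # False positive: a value was extracted but it does not match.
    fp = sum(1 for rid, p in predicted.items()
             if p is not None and gold.get(rid) != p)
    # False negative: a manual code exists but was missed or extracted wrongly.
    fn = sum(1 for rid, g in gold.items()
             if g is not None and predicted.get(rid) != g)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy example with made-up report ids and values.
predicted = {"r1": "3+4=7", "r2": "4+4=8", "r3": None, "r4": "3+3=6"}
gold = {"r1": "3+4=7", "r2": "4+3=7", "r3": "5+4=9", "r4": "3+3=6"}
precision, recall, f1 = precision_recall_f1(predicted, gold)
```

The F-score here is the harmonic mean of precision and recall (F1); other weightings are possible.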

Results

The first output reported an F-score of 0.99 that improved to 1.00 after the algorithm was amended (the GS reported in clinical history was ignored). For the validation dataset, an F-score of 0.99 was reported. In the first dataset, the most commonly reported GS were 5 + 4 = 9 (17.6%), 3 + 3 = 6 (17.5%), 4 + 3 = 7 (16.4%), 3 + 4 = 7 (14.7%) and 4 + 4 = 8 (14.2%). For the validation dataset, the most commonly reported GS were: (i) 3 + 3 = 6 (37.7%), (ii) 3 + 4 = 7 (19.4%), (iii) 4 + 3 = 7 (14.9%), (iv) 4 + 4 = 8 (10.0%) and (v) 4 + 5 = 9 (7.4%). A high-risk GS was reported for 31.8% of biopsies in the first dataset, compared to 17.4% in the validation dataset.
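As a rough illustration of how scores written in the "primary + secondary = total" form shown above might be pulled out of free text, here is a small regular-expression sketch in Python. It is not the authors' algorithm: the pattern, the function names and the sample sentences are assumptions, and it does not reproduce the amendment that ignores a GS quoted in the clinical history.

```python
import re

# Hypothetical pattern: two grades (1-5), a plus sign, an equals sign, a total.
GLEASON_RE = re.compile(
    r"(?<!\d)([1-5])\s*\+\s*([1-5])\s*=\s*(\d{1,2})(?!\d)"
)


def extract_gleason(text):
    """Return (primary, secondary, total) for the first arithmetically
    consistent match in the text, or None if there is none."""
    for match in GLEASON_RE.finditer(text):
        primary, secondary, total = (int(g) for g in match.groups())
        if primary + secondary == total:
            return primary, secondary, total
    return None


def is_high_risk(score):
    """High-risk threshold taken from the abstract: total GS of 8 or more."""
    return score is not None and score[2] >= 8


# Made-up report fragments for illustration only.
first = extract_gleason("Gleason score 4 + 3 = 7 in 3 of 12 cores")
second = extract_gleason("Adenocarcinoma, Gleason 5+4=9.")
third = extract_gleason("No malignancy identified.")
```

Checking that the two grades add up to the stated total is a cheap guard against matching unrelated arithmetic in a report.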

Conclusions

We demonstrated reliable extraction of information about GS from narrative text-based patient reports using an in-house developed text mining algorithm. A secondary outcome was that late presentation could be assessed.

Bousquet C, Souvignet J, Sadou É, Jaulent MC, Declerck G. Ontological and Non-Ontological Resources for Associating Medical Dictionary for Regulatory Activities Terms to SNOMED Clinical Terms With Semantic Properties. Front Pharmacol 2019;10:975. [PMID: 31551780 PMCID: PMC6747929 DOI: 10.3389/fphar.2019.00975] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/18/2018] [Accepted: 07/31/2019] [Indexed: 11/20/2022]

Abstract

Background: Formal definitions allow selecting terms (e.g., identifying all terms related to “Infectious disease” using the query “has causative agent organism”) and terminological reasoning (e.g., “hepatitis B” is a “hepatitis” and is an “infectious disease”). However, the standard international terminology Medical Dictionary for Regulatory Activities (MedDRA), used for coding adverse drug reactions in pharmacovigilance databases, does not benefit from such formal definitions. Our objective was to evaluate the potential for reusing ontological and non-ontological resources to generate such definitions for MedDRA.

Methods: We developed several methods that collectively allow a semiautomatic semantic enrichment of MedDRA: 1) using MedDRA-to-SNOMED Clinical Terms (SNOMED CT) mappings (available in the Unified Medical Language System metathesaurus or other mapping resources, e.g., the MedDRA preferred term “hepatitis B” is mapped to the SNOMED CT concept “type B viral hepatitis”) to extract term definitions (e.g., “hepatitis B” is associated with the following properties: has finding site liver structure, has associated morphology inflammation morphology, and has causative agent hepatitis B virus); 2) using MedDRA labels and lexical/syntactic methods for automatic decomposition of complex MedDRA terms (e.g., the MedDRA system organ class “blood and lymphatic system disorders” is decomposed into blood system disorders and lymphatic system disorders) or automatic suggestion of properties (e.g., the string “cyclic” in the preferred term “cyclic neutropenia” leads to the property has clinical course cyclic).
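The two label-based examples in the Methods above (splitting a coordinated term and suggesting a property from a cue word) can be mimicked with a very small Python sketch. This is an illustration under stated assumptions, not the authors' lexical/syntactic method: the single "X and Y head" pattern, the one-entry cue table and all function names are hypothetical, and a real term list would need far more careful handling.

```python
import re

# Hypothetical pattern for labels of the form "<left> and <right> <shared head>".
COORD_RE = re.compile(r"^(?P<left>\w+) and (?P<right>\w+) (?P<head>.+)$")

# Hypothetical cue table: token -> (relation, value). Only the example from
# the abstract is included.
PROPERTY_CUES = {"cyclic": ("has clinical course", "cyclic")}


def decompose_coordinated(label):
    """Split 'A and B head' into ['A head', 'B head']; otherwise return the
    label unchanged in a one-element list."""
    match = COORD_RE.match(label.strip())
    if not match:
        return [label]
    head = match["head"]
    return [f"{match['left']} {head}", f"{match['right']} {head}"]


def suggest_properties(label):
    """Suggest (relation, value) pairs for cue tokens found in the label."""
    tokens = label.lower().split()
    return [PROPERTY_CUES[t] for t in tokens if t in PROPERTY_CUES]


parts = decompose_coordinated("blood and lymphatic system disorders")
suggestions = suggest_properties("cyclic neutropenia")
```

A pattern this naive would also split labels where "and" does not coordinate two modifiers of a shared head, which is one reason such suggestions are typically reviewed by a curator.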

Results: The Unified Medical Language System metathesaurus was the main ontological resource reusable for generating formal definitions for MedDRA terms. The non-ontological resources (another mapping resource provided by Nadkarni and Darer in 2010 and MedDRA labels) allowed only a few additional preferred terms to be defined. While the Ci4SeR tool helped the curator to define 1,935 terms by suggesting potential supplemental relations based on the parents’ and siblings’ semantic definitions, manually defining all MedDRA terms remains time-consuming.

Discussion: Several ontological and non-ontological resources are available for associating MedDRA terms with SNOMED CT concepts with semantic properties, but providing manual definitions is still necessary. The ontology of adverse events is a possible alternative but does not cover all MedDRA terms either. Future work is to implement more efficient techniques for finding more logical relations between SNOMED CT and MedDRA in an automated way.

Névéol A, Dalianis H, Velupillai S, Savova G, Zweigenbaum P. Clinical Natural Language Processing in languages other than English: opportunities and challenges. J Biomed Semantics 2018;9:12. [PMID: 29602312 PMCID: PMC5877394 DOI: 10.1186/s13326-018-0179-8] [Citation(s) in RCA: 91] [Impact Index Per Article: 15.2] [Received: 05/22/2017] [Accepted: 02/14/2018] [Indexed: 01/22/2023]

Souvignet J, Declerck G, Asfari H, Jaulent MC, Bousquet C. OntoADR a semantic resource describing adverse drug reactions to support searching, coding, and information retrieval. J Biomed Inform 2016;63:100-107. [PMID: 27369567 DOI: 10.1016/j.jbi.2016.06.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 02/26/2016] [Revised: 06/25/2016] [Accepted: 06/27/2016] [Indexed: 10/21/2022]

Abstract

INTRODUCTION

Efficient searching and coding in databases that use terminological resources requires that they support efficient data retrieval. The Medical Dictionary for Regulatory Activities (MedDRA) is a reference terminology for several countries and organizations to code adverse drug reactions (ADRs) for pharmacovigilance. Ontologies that are available in the medical domain provide several advantages such as reasoning to improve data retrieval. The field of pharmacovigilance does not yet benefit from a fully operational ontology to formally represent the MedDRA terms. Our objective was to build a semantic resource based on formal description logic to improve MedDRA term retrieval and aid the generation of on-demand custom groupings by appropriately and efficiently selecting terms: OntoADR.

METHODS

The method consists of the following steps: (1) mapping between MedDRA terms and SNOMED-CT, (2) generation of semantic definitions using semi-automatic methods, (3) storage of the resource and (4) manual curation by pharmacovigilance experts.

RESULTS

We built a semantic resource for ADRs enabling a new type of semantics-based term search. OntoADR adds new search capabilities relative to previous approaches, overcoming the usual limitations of computation using lightweight description logic, such as the intractability of unions or negation queries, bringing it closer to user needs. Our automated approach for defining MedDRA terms enabled the association of at least one defining relationship with 67% of preferred terms. The curation work performed on our sample showed an error level of 14% for this automated approach. We tested OntoADR in practice, which allowed us to build custom groupings for several medical topics of interest.
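To make the idea of semantics-based term search with union and negation concrete, the following Python sketch runs such queries over term definitions stored as plain sets of (relation, value) pairs. It is only an illustration and makes no claim about how OntoADR is implemented: the storage layout, the function names and the two toy definitions (borrowed from examples quoted in another abstract on this page) are assumptions.

```python
# Toy definitions: term -> set of (relation, value) pairs. Illustrative only.
DEFINITIONS = {
    "hepatitis B": {
        ("has finding site", "liver structure"),
        ("has associated morphology", "inflammation morphology"),
        ("has causative agent", "hepatitis B virus"),
    },
    "cyclic neutropenia": {
        ("has clinical course", "cyclic"),
    },
}


def terms_with(prop, definitions=DEFINITIONS):
    """Terms whose stored definition contains the given (relation, value)."""
    return {term for term, props in definitions.items() if prop in props}


def union_query(props, definitions=DEFINITIONS):
    """Terms matching at least one of the given properties."""
    result = set()
    for prop in props:
        result |= terms_with(prop, definitions)
    return result


def negation_query(prop, definitions=DEFINITIONS):
    """Terms whose stored definition lacks the given property.

    This treats a missing property as false (closed-world assumption).
    """
    return set(definitions) - terms_with(prop, definitions)


liver_or_cyclic = union_query([
    ("has finding site", "liver structure"),
    ("has clinical course", "cyclic"),
])
not_liver = negation_query(("has finding site", "liver structure"))
```

Evaluating the query over stored definitions, rather than asking a reasoner to classify a union or complement expression, is one simple way such searches can stay cheap; whether it matches the resource described above is not stated in the abstract.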

DISCUSSION

The methods we describe in this article could be adapted and extended to other terminologies which do not benefit from a formal semantic representation, thus enabling better data retrieval performance. Our custom groupings of MedDRA terms were used while performing signal detection, which suggests that the graphical user interface we are currently implementing to process OntoADR could be usefully integrated into specialized pharmacovigilance software that relies on MedDRA.

Nadkarni PM, Ohno-Machado L, Chapman WW. Natural language processing: an introduction. J Am Med Inform Assoc 2012;18:544-51. [PMID: 21846786 DOI: 10.1136/amiajnl-2011-000464] [Citation(s) in RCA: 440] [Impact Index Per Article: 36.7] [Indexed: 01/25/2023]

Demner-Fushman D, Chapman WW, McDonald CJ. What can natural language processing do for clinical decision support? J Biomed Inform 2009;42:760-72. [PMID: 19683066 PMCID: PMC2757540 DOI: 10.1016/j.jbi.2009.08.007] [Citation(s) in RCA: 266] [Impact Index Per Article: 17.7] [Received: 11/21/2008] [Revised: 08/10/2009] [Accepted: 08/11/2009] [Indexed: 11/29/2022]