1.
Cajamarca G, Proust V, Herskovic V, Cádiz RF, Verdezoto N, Fernández FJ. Technologies for Managing the Health of Older Adults with Multiple Chronic Conditions: A Systematic Literature Review. Healthcare (Basel) 2023; 11:2897. [PMID: 37958041; PMCID: PMC10648176; DOI: 10.3390/healthcare11212897]
Abstract
Multimorbidity is defined as the presence of two or more chronic medical conditions in a person, whether physical, mental, or long-term infectious. It is especially common in older populations, affecting their quality of life and emotionally impacting their caregivers and families. Technology can allow for monitoring, managing, and motivating older adults in their self-care, as well as supporting their caregivers. However, when several conditions are present at once, it may be necessary to manage several types of technologies, or for technology to manage the interaction between conditions. This work aims to understand and describe the technologies used to support the management of multimorbidity in older adults. We conducted a systematic review of ten years of scientific literature from four online databases, screening a corpus of 681 research papers and ultimately including 25 in our review. The technologies used most frequently by older adults with multimorbidity are mobile applications and websites, and they are mostly focused on communication and connectivity. We then propose opportunities for future research on addressing the challenges of managing several simultaneous health conditions, potentially creating a better approach than treating each condition as if it were independent.
Affiliation(s)
- Gabriela Cajamarca
- School of Mathematical and Computational Sciences, Yachay Tech University, San Miguel de Urcuquí 100119, Ecuador
- Valentina Proust
- Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA 19104, USA
- Valeria Herskovic
- Department of Computer Science, School of Engineering, Pontificia Universidad Católica de Chile, Santiago 7820436, Chile
- Rodrigo F. Cádiz
- Department of Electrical Engineering, School of Engineering, and Music Institute, Faculty of Arts, Pontificia Universidad Católica de Chile, Santiago 7820436, Chile
- Nervo Verdezoto
- School of Computer Science and Informatics, Cardiff University, Cardiff CF24 4AG, UK
- Francisco J. Fernández
- Faculty of Communication, Pontificia Universidad Católica de Chile, Santiago 8320000, Chile
2.
Perlman-Arrow S, Loo N, Bobrovitz N, Yan T, Arora RK. A real-world evaluation of the implementation of NLP technology in abstract screening of a systematic review. Res Synth Methods 2023. [PMID: 37230483; DOI: 10.1002/jrsm.1636]
Abstract
The laborious and time-consuming nature of systematic review production hinders the dissemination of up-to-date evidence synthesis. Well-performing natural language processing (NLP) tools for systematic reviews have been developed, showing promise to improve efficiency. However, the feasibility and value of these technologies have not been comprehensively demonstrated in a real-world review. We developed an NLP-assisted abstract screening tool that provides text inclusion recommendations, keyword highlights, and visual context cues. We evaluated this tool in a living systematic review on SARS-CoV-2 seroprevalence, conducting a quality improvement assessment of screening with and without the tool. We evaluated changes to abstract screening speed, screening accuracy, characteristics of included texts, and user satisfaction. The tool improved efficiency, reducing screening time per abstract by 45.9% and decreasing inter-reviewer conflict rates. The tool conserved precision of article inclusion (positive predictive value; 0.92 with tool vs. 0.88 without) and recall (sensitivity; 0.90 vs. 0.81). The summary statistics of included studies were similar with and without the tool. Users were satisfied with the tool (mean satisfaction score of 4.2/5). We evaluated an abstract screening process where one human reviewer was replaced with the tool's votes, finding that this maintained recall (0.92 one-person, one-tool vs. 0.90 two tool-assisted humans) and precision (0.91 vs. 0.92) while reducing screening time by 70%. Implementing an NLP tool in this living systematic review improved efficiency, maintained accuracy, and was well-received by researchers, demonstrating the real-world effectiveness of NLP in expediting evidence synthesis.
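The precision (positive predictive value) and recall (sensitivity) figures reported above are standard screening metrics. As a quick illustrative sketch (the confusion-matrix counts below are hypothetical, not taken from the study), they can be computed from the tool's inclusion decisions against a gold standard:

```python
def screening_metrics(tp, fp, fn):
    """Precision (PPV) and recall (sensitivity) for abstract inclusion decisions.

    tp: relevant abstracts correctly included
    fp: irrelevant abstracts wrongly included
    fn: relevant abstracts wrongly excluded
    """
    precision = tp / (tp + fp)  # fraction of included abstracts that were truly relevant
    recall = tp / (tp + fn)     # fraction of truly relevant abstracts that were included
    return precision, recall

# Hypothetical counts for a tool-assisted screen
precision, recall = screening_metrics(tp=90, fp=8, fn=10)
print(f"PPV={precision:.2f}, sensitivity={recall:.2f}")  # PPV=0.92, sensitivity=0.90
```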
Affiliation(s)
- Sara Perlman-Arrow
- School of Population and Global Health, McGill University, Quebec, Canada
- Noel Loo
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Niklas Bobrovitz
- Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Tingting Yan
- Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Rahul K Arora
- Centre for Health Informatics, University of Calgary, Calgary, Alberta, Canada
- Institute of Biomedical Engineering, University of Oxford, Oxford, UK
3.
Oliveira Dos Santos Á, Sergio da Silva E, Machado Couto L, Valadares Labanca Reis G, Silva Belo V. The use of artificial intelligence for automating or semi-automating biomedical literature analyses: a scoping review. J Biomed Inform 2023; 142:104389. [PMID: 37187321; DOI: 10.1016/j.jbi.2023.104389]
Abstract
OBJECTIVE Evidence-based medicine (EBM) is a decision-making process based on the conscious and judicious use of the best available scientific evidence. However, the exponential increase in the amount of information currently available likely exceeds the capacity of human-only analysis. In this context, artificial intelligence (AI) and its branches such as machine learning (ML) can be used to facilitate human efforts in analyzing the literature to foster EBM. The present scoping review aimed to examine the use of AI in the automation of biomedical literature survey and analysis with a view to establishing the state-of-the-art and identifying knowledge gaps. MATERIALS AND METHODS Comprehensive searches of the main databases were performed for articles published up to June 2022 and studies were selected according to inclusion and exclusion criteria. Data were extracted from the included articles and the findings categorized. RESULTS The total number of records retrieved from the databases was 12,145, of which 273 were included in the review. Classification of the studies according to the use of AI in evaluating the biomedical literature revealed three main application groups, namely assembly of scientific evidence (n=127; 47%), mining the biomedical literature (n=112; 41%) and quality analysis (n=34; 12%). Most studies addressed the preparation of systematic reviews, while articles focusing on the development of guidelines and evidence synthesis were the least frequent. The biggest knowledge gap was identified within the quality analysis group, particularly regarding methods and tools that assess the strength of recommendation and consistency of evidence. 
CONCLUSION Our review shows that, despite significant progress in the automation of biomedical literature surveys and analyses in recent years, intense research is needed to fill knowledge gaps on more difficult aspects of ML, deep learning and natural language processing, and to consolidate the use of automation by end-users (biomedical researchers and healthcare professionals).
Affiliation(s)
- Eduardo Sergio da Silva
- Federal University of São João del-Rei, Campus Centro-Oeste Dona Lindu, Divinópolis, Minas Gerais, Brazil
- Letícia Machado Couto
- Federal University of São João del-Rei, Campus Centro-Oeste Dona Lindu, Divinópolis, Minas Gerais, Brazil
- Vinícius Silva Belo
- Federal University of São João del-Rei, Campus Centro-Oeste Dona Lindu, Divinópolis, Minas Gerais, Brazil
4.
Fortunato Costa K, Almeida Araújo F, Morais J, Lisboa Frances CR, Ramos RTJ. Text mining for identification of biological entities related to antibiotic resistant organisms. PeerJ 2022; 10:e13351. [PMID: 35539017; PMCID: PMC9080439; DOI: 10.7717/peerj.13351]
Abstract
Antimicrobial resistance is a significant public health problem worldwide. In recent years, the scientific community has been intensifying efforts to combat this problem; many experiments have been developed, and many articles are published in this area. However, the growing volume of biological literature increases the difficulty of the biocuration process due to the cost and time required. Modern text mining tools that adopt artificial intelligence technology can help research evolve. In this article, we propose a text mining model capable of identifying and prioritizing scientific articles in the context of antimicrobial resistance. We retrieved scientific articles from the PubMed database, adopted machine learning techniques to generate vector representations of the retrieved articles, and measured their similarity with the context. As a result of this process, we obtained a dataset labeled "Relevant" and "Irrelevant" and used it to train a supervised learning algorithm to classify new records. The model's overall accuracy reached 90%, and the F-measure (the harmonic mean of precision and recall) reached 82% for the positive class and 93% for the negative class, showing quality in the identification of scientific articles relevant to the context. The dataset, scripts and models are available at https://github.com/engbiopct/TextMiningAMR.
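The core idea of the pipeline described above (vectorize retrieved abstracts, then score their similarity to the antimicrobial-resistance context) can be sketched with plain bag-of-words cosine similarity; this is a simplification of the authors' machine-learning representation, and the texts below are invented:

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Context description of the topic of interest
context = bow("antimicrobial resistance antibiotic resistant bacteria")

# Invented stand-ins for retrieved abstracts
abstracts = {
    "a1": "antibiotic resistance genes found in resistant bacteria isolates",
    "a2": "plant growth responses under prolonged drought stress",
}
scores = {pmid: cosine(context, bow(text)) for pmid, text in abstracts.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # most context-similar first
```

In the actual pipeline, similarity scores like these seeded the "Relevant"/"Irrelevant" labels that were then used to train the supervised classifier.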
Affiliation(s)
- Kelle Fortunato Costa
- Programa de pós-graduação em Engenharia Elétrica, Universidade Federal do Pará, Belém, Pará, Brazil
- Fabrício Almeida Araújo
- Biological Science Institute, Universidade Federal do Pará, Belém, Pará, Brazil
- Universidade Federal Rural da Amazônia, Belém, Pará, Brazil
- Rommel T. J. Ramos
- Biological Science Institute, Universidade Federal do Pará, Belém, Pará, Brazil
5.
Schmidt L, Finnerty Mutlu AN, Elmore R, Olorisade BK, Thomas J, Higgins JPT. Data extraction methods for systematic review (semi)automation: Update of a living systematic review. F1000Res 2021; 10:401. [PMID: 34408850; PMCID: PMC8361807; DOI: 10.12688/f1000research.51117.2]
Abstract
Background: The reliable and usable (semi)automation of data extraction can support the field of systematic review by reducing the workload required to gather information about the conduct and results of the included studies. This living systematic review examines published approaches for data extraction from reports of clinical studies. Methods: We systematically and continually search PubMed, ACL Anthology, arXiv, OpenAlex via EPPI-Reviewer, and the dblp computer science bibliography. Full text screening and data extraction are conducted within an open-source living systematic review application created for the purpose of this review. This living review update includes publications up to December 2022 and OpenAlex content up to March 2023. Results: 76 publications are included in this review. Of these, 64 (84%) of the publications addressed extraction of data from abstracts, while 19 (25%) used full texts. A total of 71 (93%) publications developed classifiers for randomised controlled trials. Over 30 entities were extracted, with PICOs (population, intervention, comparator, outcome) being the most frequently extracted. Data are available from 25 (33%) publications, and code from 30 (39%). Six (8%) implemented publicly available tools. Conclusions: This living systematic review presents an overview of (semi)automated data-extraction literature of interest to different types of literature review. We identified a broad evidence base of publications describing data extraction for interventional reviews and a small number of publications extracting epidemiological or diagnostic accuracy data. Between review updates, trends for sharing data and code increased strongly: in the base-review, data and code were available for 13% and 19% respectively; these numbers increased to 78% and 87% within the 23 new publications.
Compared with the base-review, we observed another research trend, away from straightforward data extraction and towards additionally extracting relations between entities or automatic text summarisation. With this living review we aim to review the literature continually.
Affiliation(s)
- Lena Schmidt
- NIHR Innovation Observatory, Newcastle University, Newcastle upon Tyne, NE4 5TG, UK
- Sciome LLC, Research Triangle Park, North Carolina, 27713, USA
- Bristol Medical School, University of Bristol, Bristol, BS8 2PS, UK
- Rebecca Elmore
- Sciome LLC, Research Triangle Park, North Carolina, 27713, USA
- Babatunde K. Olorisade
- Bristol Medical School, University of Bristol, Bristol, BS8 2PS, UK
- Evaluate Ltd, London, SE1 2RE, UK
- Cardiff School of Technologies, Cardiff Metropolitan University, Cardiff, CF5 2YB, UK
- James Thomas
- UCL Social Research Institute, University College London, London, WC1H 0AL, UK
6.
Schmidt L, Finnerty Mutlu AN, Elmore R, Olorisade BK, Thomas J, Higgins JPT. Data extraction methods for systematic review (semi)automation: A living systematic review. F1000Res 2021; 10:401. [PMID: 34408850; PMCID: PMC8361807; DOI: 10.12688/f1000research.51117.1]
Abstract
Background: The reliable and usable (semi)automation of data extraction can support the field of systematic review by reducing the workload required to gather information about the conduct and results of the included studies. This living systematic review examines published approaches for data extraction from reports of clinical studies. Methods: We systematically and continually search MEDLINE, Institute of Electrical and Electronics Engineers (IEEE), arXiv, and the dblp computer science bibliography databases. Full text screening and data extraction are conducted within an open-source living systematic review application created for the purpose of this review. This iteration of the living review includes publications up to a cut-off date of 22 April 2020. Results: In total, 53 publications are included in this version of our review. Of these, 41 (77%) of the publications addressed extraction of data from abstracts, while 14 (26%) used full texts. A total of 48 (90%) publications developed and evaluated classifiers that used randomised controlled trials as the main target texts. Over 30 entities were extracted, with PICOs (population, intervention, comparator, outcome) being the most frequently extracted. A description of their datasets was provided by 49 publications (94%), but only seven (13%) made the data publicly available. Code was made available by 10 (19%) publications, and five (9%) implemented publicly available tools. Conclusions: This living systematic review presents an overview of (semi)automated data-extraction literature of interest to different types of systematic review. We identified a broad evidence base of publications describing data extraction for interventional reviews and a small number of publications extracting epidemiological or diagnostic accuracy data. 
The lack of publicly available gold-standard data for evaluation, and lack of application thereof, makes it difficult to draw conclusions on which is the best-performing system for each data extraction target. With this living review we aim to review the literature continually.
Affiliation(s)
- Lena Schmidt
- NIHR Innovation Observatory, Newcastle University, Newcastle upon Tyne, NE4 5TG, UK
- Sciome LLC, Research Triangle Park, North Carolina, 27713, USA
- Bristol Medical School, University of Bristol, Bristol, BS8 2PS, UK
- Rebecca Elmore
- Sciome LLC, Research Triangle Park, North Carolina, 27713, USA
- Babatunde K. Olorisade
- Bristol Medical School, University of Bristol, Bristol, BS8 2PS, UK
- Evaluate Ltd, London, SE1 2RE, UK
- Cardiff School of Technologies, Cardiff Metropolitan University, Cardiff, CF5 2YB, UK
- James Thomas
- UCL Social Research Institute, University College London, London, WC1H 0AL, UK
7.
Afzal M, Alam F, Malik KM, Malik GM. Clinical Context-Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Validation. J Med Internet Res 2020; 22:e19810. [PMID: 33095174; PMCID: PMC7647812; DOI: 10.2196/19810]
Abstract
Background Automatic text summarization (ATS) enables users to retrieve meaningful evidence from big data of biomedical repositories to make complex clinical decisions. Deep neural and recurrent networks outperform traditional machine-learning techniques in areas of natural language processing and computer vision; however, they are yet to be explored in the ATS domain, particularly for medical text summarization. Objective Traditional approaches in ATS for biomedical text suffer from fundamental issues such as an inability to capture clinical context, quality of evidence, and purpose-driven selection of passages for the summary. We aimed to circumvent these limitations through achieving precise, succinct, and coherent information extraction from credible published biomedical resources, and to construct a simplified summary containing the most informative content that can offer a review particular to clinical needs. Methods In our proposed approach, we introduce a novel framework, termed Biomed-Summarizer, that provides quality-aware Patient/Problem, Intervention, Comparison, and Outcome (PICO)-based intelligent and context-enabled summarization of biomedical text. Biomed-Summarizer integrates the prognosis quality recognition model with a clinical context-aware model to locate text sequences in the body of a biomedical article for use in the final summary. First, we developed a deep neural network binary classifier for quality recognition to acquire scientifically sound studies and filter out others. Second, we developed a bidirectional long short-term memory (bi-LSTM) recurrent neural network as a clinical context-aware classifier, which was trained on semantically enriched features generated using a word-embedding tokenizer for identification of meaningful sentences representing PICO text sequences.
Third, we calculated the similarity between query and PICO text sequences using Jaccard similarity with semantic enrichments, where the semantic enrichments are obtained using medical ontologies. Last, we generated a representative summary from the high-scoring PICO sequences aggregated by study type, publication credibility, and freshness score. Results Evaluation of the prognosis quality recognition model using a large dataset of biomedical literature related to intracranial aneurysm showed an accuracy of 95.41% (2562/2686) in terms of recognizing quality articles. The clinical context–aware multiclass classifier outperformed the traditional machine-learning algorithms, including support vector machine, gradient boosted tree, linear regression, K-nearest neighbor, and naïve Bayes, by achieving 93% (16127/17341) accuracy for classifying five categories: aim, population, intervention, results, and outcome. The semantic similarity algorithm achieved a significant Pearson correlation coefficient of 0.61 (0-1 scale) on a well-known BIOSSES dataset (with 100 pair sentences) after semantic enrichment, representing an improvement of 8.9% over baseline Jaccard similarity. Finally, we found a highly positive correlation among the evaluations performed by three domain experts concerning different metrics, suggesting that the automated summarization is satisfactory. Conclusions By employing the proposed method Biomed-Summarizer, high accuracy in ATS was achieved, enabling seamless curation of research evidence from the biomedical literature to use for clinical decision-making.
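The semantic-enrichment step can be pictured with a toy example: expand each token set with canonical concept identifiers from an ontology lookup, then take the Jaccard overlap. The two-entry concept map below is a made-up stand-in for a UMLS-style lookup, not the paper's actual resource:

```python
def jaccard(a, b):
    """Jaccard similarity: |intersection| / |union| of two sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def enrich(tokens, concept_map):
    """Union each token with its mapped concept IDs (ontology stand-in)."""
    out = set(tokens)
    for t in tokens:
        out |= concept_map.get(t, set())
    return out

# Hypothetical map: both surface forms point to the same CUI-style concept ID
CONCEPTS = {"mi": {"C0027051"}, "infarction": {"C0027051"}}

query = {"mi", "risk"}
pico = {"infarction", "risk"}
plain = jaccard(query, pico)                                         # 1/3: only "risk" overlaps
enriched = jaccard(enrich(query, CONCEPTS), enrich(pico, CONCEPTS))  # 1/2: shared concept ID now counts
```

The enriched score rises because synonymous surface forms are bridged through the shared concept, which is the effect the 8.9% improvement over baseline Jaccard reflects.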
Affiliation(s)
- Muhammad Afzal
- Department of Software, Sejong University, Seoul, Republic of Korea
- Department of Computer Science & Engineering, School of Engineering and Computer Science, Oakland University, Rochester, MI, United States
- Fakhare Alam
- Department of Computer Science & Engineering, School of Engineering and Computer Science, Oakland University, Rochester, MI, United States
- Khalid Mahmood Malik
- Department of Computer Science & Engineering, School of Engineering and Computer Science, Oakland University, Rochester, MI, United States
- Ghaus M Malik
- Department of Neurosurgery, Henry Ford Hospital, Detroit, MI, United States
8.
Cajamarca G, Herskovic V, Rossel PO. Enabling Older Adults' Health Self-Management through Self-Report and Visualization-A Systematic Literature Review. Sensors (Basel) 2020; 20:E4348. [PMID: 32759801; PMCID: PMC7436010; DOI: 10.3390/s20154348]
Abstract
Aging is associated with a progressive decline in health, resulting in increased medical care and costs. Mobile technology may facilitate health self-management, thus increasing the quality of care and reducing costs. Although the development of technology offers opportunities in monitoring the health of older adults, it is not clear whether these technologies allow older adults to manage their health data themselves. This paper presents a review of the literature on mobile health technologies for older adults, focusing on whether these technologies enable the visualization of monitored data and the self-reporting of additional information by the older adults. The systematic search considered studies published between 2009 and 2019 in five online databases. We screened 609 articles and identified 95 that met our inclusion and exclusion criteria. Smartphones and tablets are the most frequently reported technologies that allow older adults to enter data in addition to what is monitored automatically. The recorded information is displayed on the monitoring device and on the screens of external devices such as computers. Future designs of mobile health technology should allow older users to enter additional information and visualize data; this could enable them to understand their own data as well as improve their experience with technology.
Affiliation(s)
- Gabriela Cajamarca
- Department of Computer Science, Pontificia Universidad Católica de Chile, Santiago 7820436, Chile
- Valeria Herskovic
- Department of Computer Science, Pontificia Universidad Católica de Chile, Santiago 7820436, Chile
- Pedro O. Rossel
- Department of Computer Science, Universidad Católica de la Santísima Concepción, Concepción 4090541, Chile
- Centro de Investigación en Biodiversidad y Ambientes Sustentables (CIBAS), Universidad Católica de la Santísima Concepción, Concepción 4090541, Chile
9.
Jin D, Szolovits P. Advancing PICO element detection in biomedical text via deep neural networks. Bioinformatics 2020; 36:3856-3862. [PMID: 32311009; DOI: 10.1093/bioinformatics/btaa256]
Abstract
MOTIVATION In evidence-based medicine, defining a clinical question in terms of the specific patient problem helps physicians efficiently identify appropriate resources and search for the best available evidence for medical treatment. To formulate a well-defined, focused clinical question, a framework called PICO is widely used, which identifies the sentences in a given medical text that belong to the four components typically reported in clinical trials: Participants/Problem (P), Intervention (I), Comparison (C) and Outcome (O). In this work, we propose a novel deep learning model for recognizing PICO elements in biomedical abstracts. Based on the previous state-of-the-art bidirectional long short-term memory (bi-LSTM) plus conditional random field architecture, we add another layer of bi-LSTM upon the sentence representation vectors so that contextual information from surrounding sentences can be gathered to help infer the interpretation of the current one. In addition, we propose two methods to further generalize and improve the model: adversarial training and unsupervised pre-training over large corpora. RESULTS We tested our proposed approach on two benchmark datasets. On the PubMed-PICO dataset, our best results outperform the previous best by 5.5%, 7.9% and 5.8% in F1 score for the P, I and O elements, respectively. On the NICTA-PIBOSO dataset, the improvements for the P/I/O elements are 3.9%, 15.6% and 1.3% in F1 score, respectively. Overall, our proposed deep learning model obtains unprecedented PICO element detection accuracy while avoiding the need for any manual feature selection. AVAILABILITY AND IMPLEMENTATION Code is available at https://github.com/jind11/Deep-PICO-Detection.
Affiliation(s)
- Di Jin
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Peter Szolovits
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
10.
Zhang X, Geng P, Zhang T, Lu Q, Gao P, Mei J. Aceso: PICO-guided Evidence Summarization on Medical Literature. IEEE J Biomed Health Inform 2020; PP:2663-2670. [PMID: 32275627; DOI: 10.1109/jbhi.2020.2984704]
Abstract
Evidence-Based Medicine (EBM) aims to apply the best available evidence gained from scientific methods to clinical decision making. A generally accepted criterion to formulate evidence is to use the PICO framework, where PICO stands for Problem/Population, Intervention, Comparison, and Outcome. Automatic extraction of PICO-related sentences from medical literature is crucial to the success of many EBM applications. In this work, we present our Aceso system, which automatically generates PICO-based evidence summaries from medical literature. In Aceso, we adopt an active learning paradigm, which helps to minimize the cost of manual labeling and to optimize the quality of summarization with limited labeled data. A UMLS2Vec model is proposed to learn a vector representation of medical concepts in UMLS, and we fuse the embedding of medical knowledge with textual features in summarization. The evaluation shows that our approach outperforms state-of-the-art methods at identifying PICO sentences and baseline methods at producing high-quality evidence summaries.
11.
Afzal M, Hussain M, Malik KM, Lee S. Impact of Automatic Query Generation and Quality Recognition Using Deep Learning to Curate Evidence From Biomedical Literature: Empirical Study. JMIR Med Inform 2019; 7:e13430. [PMID: 31815673; PMCID: PMC6928703; DOI: 10.2196/13430]
Abstract
BACKGROUND The quality of health care is continuously improving and is expected to improve further because of the advancement of machine learning and knowledge-based techniques along with innovation and availability of wearable sensors. With these advancements, health care professionals are now becoming more interested and involved in seeking scientific research evidence from external sources for decision making relevant to medical diagnosis, treatments, and prognosis. Not much work has been done to develop methods for unobtrusive and seamless curation of data from the biomedical literature. OBJECTIVE This study aimed to design a framework that can enable bringing quality publications intelligently to the users' desk to assist medical practitioners in answering clinical questions and fulfilling their informational needs. METHODS The proposed framework consists of methods for efficient biomedical literature curation, including the automatic construction of a well-built question, the recognition of evidence quality by proposing extended quality recognition model (E-QRM), and the ranking and summarization of the extracted evidence. RESULTS Unlike previous works, the proposed framework systematically integrates the echelons of biomedical literature curation by including methods for searching queries, content quality assessments, and ranking and summarization. Using an ensemble approach, our high-impact classifier E-QRM obtained significantly improved accuracy than the existing quality recognition model (1723/1894, 90.97% vs 1462/1894, 77.21%). CONCLUSIONS Our proposed methods and evaluation demonstrate the validity and rigorousness of the results, which can be used in different applications, including evidence-based medicine, precision medicine, and medical education.
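The "well-built question" step can be pictured as assembling structured PICO components into a boolean search string. This sketch is purely illustrative of the idea; the combinators and the example terms are assumptions, not the paper's actual construction method:

```python
def build_pico_query(population, intervention, comparison=None, outcome=None):
    """Combine PICO components into a boolean query string.

    Components are ANDed together; a missing comparison or outcome
    is simply skipped rather than left as an empty clause.
    """
    parts = [f'"{population}"', f'"{intervention}"']
    if comparison:
        parts.append(f'"{comparison}"')
    if outcome:
        parts.append(f'"{outcome}"')
    return " AND ".join(parts)

# Example clinical question with no explicit comparison arm
q = build_pico_query("adults with type 2 diabetes", "metformin",
                     outcome="HbA1c reduction")
# '"adults with type 2 diabetes" AND "metformin" AND "HbA1c reduction"'
```

A real system would additionally map each phrase to controlled vocabulary terms and database-specific field tags before submitting the query.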
Affiliation(s)
- Muhammad Afzal
- Department of Software, Sejong University, Seoul, Republic of Korea; Department of Computer Science and Engineering, Oakland University, Rochester, MI, United States
- Maqbool Hussain
- Department of Software, Sejong University, Seoul, Republic of Korea
- Khalid Mahmood Malik
- Department of Computer Science and Engineering, Oakland University, Rochester, MI, United States
- Sungyoung Lee
- Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea
12
Bashir R, Surian D, Dunn AG. The risk of conclusion change in systematic review updates can be estimated by learning from a database of published examples. J Clin Epidemiol 2019; 110:42-49. [PMID: 30849512 DOI: 10.1016/j.jclinepi.2019.02.015] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2018] [Revised: 01/25/2019] [Accepted: 02/26/2019] [Indexed: 01/11/2023]
Abstract
OBJECTIVES To determine which systematic review characteristics are needed to estimate the risk of conclusion change in systematic review updates. STUDY DESIGN AND SETTING We applied classification trees (a machine learning method) to model the risk of conclusion change in systematic review updates, using pairs of systematic reviews and their updates as samples. The classifiers were constructed using a set of features extracted from systematic reviews and the relevant trials added in published updates. Model performance was measured by recall, precision, and area under the receiver operating characteristic curve (AUC). RESULTS We identified 63 pairs of systematic reviews and updates, of which 20 (32%) exhibited a change in conclusion in their updates. A classifier using information about new trials exhibited the highest performance (AUC: 0.71; recall: 0.75; precision: 0.43) compared to a classifier that used fewer features (AUC: 0.65; recall: 0.75; precision: 0.39). CONCLUSION When estimating the risk of conclusion change in systematic review updates, information about the sizes of trials that will be added in an update is most useful. Future tools aimed at signaling conclusion change risks would benefit from complementary tools that automate the screening of relevant trials.
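The recall, precision, and AUC figures reported above can be computed from raw predictions with a few lines of stdlib Python. This is a generic sketch of the metrics themselves, not of the paper's classification-tree models:

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = conclusion changed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t]
    neg = [s for t, s in zip(y_true, scores) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The rank-based AUC formulation used here is equivalent to the area under the ROC curve and avoids explicitly sweeping thresholds.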
Affiliation(s)
- Rabia Bashir
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, New South Wales 2109, Australia
- Didi Surian
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, New South Wales 2109, Australia
- Adam G Dunn
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, New South Wales 2109, Australia; Computational Health Informatics Program, Boston Children's Hospital, Boston, MA 02115, USA
13
Fast and scalable neural embedding models for biomedical sentence classification. BMC Bioinformatics 2018; 19:541. [PMID: 30577747 PMCID: PMC6303852 DOI: 10.1186/s12859-018-2496-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2018] [Accepted: 11/16/2018] [Indexed: 11/24/2022] Open
Abstract
Background Biomedical literature is expanding rapidly, and tools that help locate information of interest are needed. To this end, a multitude of different approaches for classifying sentences in biomedical publications according to their coarse semantic and rhetoric categories (e.g., Background, Methods, Results, Conclusions) have been devised, with recent state-of-the-art results reported for a complex deep learning model. Recent evidence showed that shallow and wide neural models such as fastText can provide results that are competitive with or superior to complex deep learning models while requiring drastically lower training times and having better scalability. We analyze the efficacy of the fastText model in the classification of biomedical sentences in the PubMed 200k RCT benchmark, and introduce a simple pre-processing step that enables the application of fastText to sentence sequences. Furthermore, we explore the utility of two unsupervised pre-training approaches in scenarios where labeled training data are limited. Results Our fastText-based methodology yields a state-of-the-art F1 score of 0.917 on the PubMed 200k benchmark when sentence ordering is taken into account, with a training time of only 73 s on standard hardware. Applying fastText to single sentences, without taking sentence ordering into account, yielded an F1 score of 0.852 (training time 13 s). Unsupervised pre-training of N-gram vectors greatly improved the results for small training set sizes, with an increase in F1 score from 0.21 to 0.74 when trained on only 1000 randomly picked sentences without taking sentence ordering into account. Conclusions Because of its ease of use and performance, fastText should be among the first choices of tools when tackling biomedical text classification problems with large corpora. Unsupervised pre-training of N-gram vectors on domain-specific corpora also makes it possible to apply fastText when labeled training data are limited.
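A core reason fastText is fast is that it represents text as a bag of words plus word n-grams mapped to a fixed number of hash buckets, whose embeddings are then averaged and fed to a linear classifier. The featurization step alone can be sketched in stdlib Python (the bucket count is illustrative; a CRC32 hash stands in for fastText's internal hash function):

```python
import zlib

def ngram_buckets(text, n=2, num_buckets=1000):
    """Map words and word n-grams to hash buckets, as in fastText's hashing trick."""
    words = text.lower().split()
    grams = words + [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    # CRC32 gives a deterministic hash across runs, unlike Python's built-in hash().
    return sorted({zlib.crc32(g.encode()) % num_buckets for g in grams})
```

The fixed bucket count bounds the model size regardless of vocabulary, which is what makes training on large corpora like PubMed 200k RCT cheap.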
14
Giaquinta Aranda A, Fernández Araque A, Curbelo Rodriguez R, Rojo Aragues A. Glaucoma y antioxidantes: revisión sistemática [Glaucoma and antioxidants: a systematic review]. Revista Mexicana de Oftalmología 2017. [DOI: 10.1016/j.mexoft.2016.03.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023] Open
15
Bui DDA, Del Fiol G, Hurdle JF, Jonnalagadda S. Extractive text summarization system to aid data extraction from full text in systematic review development. J Biomed Inform 2016; 64:265-272. [PMID: 27989816 PMCID: PMC5362293 DOI: 10.1016/j.jbi.2016.10.014] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2016] [Revised: 10/24/2016] [Accepted: 10/25/2016] [Indexed: 01/30/2023]
Abstract
OBJECTIVES Extracting data from publication reports is a standard process in systematic review (SR) development. However, the data extraction process still relies heavily on manual effort, which is slow, costly, and subject to human error. In this study, we developed a text summarization system aimed at enhancing productivity and reducing errors in the traditional data extraction process. METHODS We developed a computer system that used machine learning and natural language processing approaches to automatically generate summaries of full-text scientific publications. The summaries at the sentence and fragment levels were evaluated for finding common clinical SR data elements such as sample size, group size, and PICO values. We compared the computer-generated summaries with human-written summaries (title and abstract) in terms of the presence of the information necessary for data extraction, as presented in Cochrane reviews' study characteristics tables. RESULTS At the sentence level, the computer-generated summaries covered more of this information than the human-written summaries (recall 91.2% vs. 83.8%, p<0.001). They also had a higher density of relevant sentences (precision 59% vs. 39%, p<0.001). At the fragment level, the ensemble approach combining rule-based, concept mapping, and dictionary-based methods performed better than the individual methods alone, achieving an 84.7% F-measure. CONCLUSION Computer-generated summaries are a potential alternative information source for data extraction in systematic review development. Machine learning and natural language processing are promising approaches to the development of such an extractive summarization system.
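Extractive summarization, as described above, selects existing sentences from the full text rather than generating new ones. The paper's system uses trained models; a deliberately simplified frequency-based scorer illustrates only the general select-and-rank idea:

```python
import re
from collections import Counter

def extractive_summary(text, k=2):
    """Score sentences by average word frequency; return the top-k in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z]+", text.lower()))
    def score(s):
        toks = re.findall(r"[a-z]+", s.lower())
        return sum(freq[t] for t in toks) / (len(toks) or 1)
    top = sorted(sentences, key=score, reverse=True)[:k]
    return [s for s in sentences if s in top]
```

A real system for SR data extraction would score sentences with features targeting sample size, group size, and PICO mentions instead of raw word frequency.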
Affiliation(s)
- Duy Duc An Bui
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA; Division of Health and Biomedical Informatics, Northwestern University, Chicago, IL, USA
- Guilherme Del Fiol
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA
- John F Hurdle
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA
16
Bui DDA, Del Fiol G, Jonnalagadda S. PDF text classification to leverage information extraction from publication reports. J Biomed Inform 2016; 61:141-8. [PMID: 27044929 DOI: 10.1016/j.jbi.2016.03.026] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2015] [Revised: 03/22/2016] [Accepted: 03/31/2016] [Indexed: 11/19/2022]
Abstract
OBJECTIVES Data extraction from original study reports is a time-consuming, error-prone process in systematic review development. Information extraction (IE) systems have the potential to assist humans in the extraction task; however, the majority of IE systems were not designed to work on Portable Document Format (PDF) documents, an important and common extraction source for systematic reviews. In a PDF document, narrative content is often mixed with publication metadata or semi-structured text, which adds challenges for the underlying natural language processing algorithms. Our goal is to categorize PDF texts for strategic use by IE systems. METHODS We used an open-source tool to extract raw text from a PDF document and developed a text classification algorithm that follows a multi-pass sieve framework to automatically classify PDF text snippets (for brevity, texts) into TITLE, ABSTRACT, BODYTEXT, SEMISTRUCTURE, and METADATA categories. To validate the algorithm, we developed a gold standard of PDF reports that were included in the development of previous systematic reviews by the Cochrane Collaboration. In a two-step procedure, we evaluated (1) classification performance, comparing it with machine learning classifiers, and (2) the effects of the algorithm on an IE system that extracts clinical outcome mentions. RESULTS The multi-pass sieve algorithm achieved an accuracy of 92.6%, which was 9.7% (p<0.001) higher than the best-performing machine learning classifier, which used a logistic regression algorithm. F-measure improvements were observed in the classification of TITLE (+15.6%), ABSTRACT (+54.2%), BODYTEXT (+3.7%), SEMISTRUCTURE (+34%), and METADATA (+14.2%). In addition, use of the algorithm to filter semi-structured texts and publication metadata improved performance of the outcome extraction system (F-measure +4.1%, p=0.002). It also reduced the number of sentences to be processed by 44.9% (p<0.001), which corresponds to a processing time reduction of 50% (p=0.005). CONCLUSIONS The rule-based multi-pass sieve framework can be used effectively to categorize texts extracted from PDF documents. Text classification is an important prerequisite step to leverage information extraction from PDF documents.
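A multi-pass sieve applies rule sets in priority order: each pass handles the snippets its rule matches, and the remainder falls through to the next pass, with a default label at the end. The structure can be sketched as follows; these rules are illustrative only, not the paper's actual sieve passes:

```python
import re

# Passes run in priority order; the first matching rule assigns the label.
SIEVE = [
    ("METADATA", lambda t: re.search(r"doi:|©|\bpmid\b", t, re.I)),
    ("TITLE", lambda t: len(t.split()) < 15 and t.istitle()),
    ("ABSTRACT", lambda t: t.lower().startswith("abstract")),
    ("SEMISTRUCTURE", lambda t: t.count("\t") > 1 or t.count("  ") > 2),
]

def classify(text):
    """Classify a PDF text snippet; unmatched snippets default to BODYTEXT."""
    for label, rule in SIEVE:
        if rule(text):
            return label
    return "BODYTEXT"
```

Ordering the passes from most to least reliable is what gives the sieve its precision: high-confidence rules claim their snippets before looser rules can misfire on them.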
Affiliation(s)
- Duy Duc An Bui
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA; Department of Preventive Medicine-Health and Biomedical Informatics, Northwestern University, Chicago, IL, USA
- Guilherme Del Fiol
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA
- Siddhartha Jonnalagadda
- Department of Preventive Medicine-Health and Biomedical Informatics, Northwestern University, Chicago, IL, USA
17
Jonnalagadda SR, Goyal P, Huffman MD. Automating data extraction in systematic reviews: a systematic review. Syst Rev 2015; 4:78. [PMID: 26073888 PMCID: PMC4514954 DOI: 10.1186/s13643-015-0066-7] [Citation(s) in RCA: 101] [Impact Index Per Article: 11.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/20/2015] [Accepted: 05/21/2015] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND Automation of parts of the systematic review process, specifically the data extraction step, may be an important strategy to reduce the time necessary to complete a systematic review. However, the state of the science of automatically extracting data elements from full texts has not been well described. This paper presents a systematic review of published and unpublished methods to automate data extraction for systematic reviews. METHODS We systematically searched PubMed, IEEEXplore, and the ACM Digital Library to identify potentially relevant articles. We included reports that met the following criteria: 1) the methods or results section described which entities were or needed to be extracted, and 2) at least one entity was automatically extracted, with evaluation results presented for that entity. We also reviewed the citations of included reports. RESULTS Out of a total of 1190 unique citations that met our search criteria, we found 26 published reports describing automatic extraction of at least one of more than 52 potential data elements used in systematic reviews. For 25 (48 %) of these data elements, various researchers had attempted to extract the information automatically from the publication text. Of these, 14 (27 %) data elements were completely extracted, but the highest number of data elements extracted automatically by a single study was 7. Most of the data elements were extracted with F-scores (the harmonic mean of sensitivity and positive predictive value) of over 70 %. CONCLUSIONS We found no unified information extraction framework tailored to the systematic review process, and published reports focused on a limited number (1-7) of data elements. Biomedical natural language processing techniques have not been fully utilized to automate, even partially, the data extraction step of systematic reviews.
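Many of the data elements this review catalogs, such as sample size, are often extractable directly from surface text. A minimal regex-based extractor illustrates the idea; the patterns are assumptions for illustration, not drawn from any of the reviewed systems:

```python
import re

def extract_sample_size(text):
    """Pull candidate sample sizes like 'n = 120' or '120 patients' from text."""
    patterns = [
        r"\bn\s*=\s*(\d+)",                             # e.g., "n = 120"
        r"\b(\d+)\s+(?:patients|participants|subjects)\b",  # e.g., "120 patients"
    ]
    hits = []
    for p in patterns:
        hits += [int(m) for m in re.findall(p, text, re.I)]
    return hits
```

Pattern-based extraction like this typically yields high precision but limited recall, which is one reason the review finds statistical NLP approaches reported for the harder data elements.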
Affiliation(s)
- Siddhartha R Jonnalagadda
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, 750 North Lake Shore Drive, 11th Floor, Chicago, IL, 60611, USA
- Pawan Goyal
- Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur, 721302, West Bengal, India
- Mark D Huffman
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, USA
18
Application of text mining in the biomedical domain. Methods 2015; 74:97-106. [PMID: 25641519 DOI: 10.1016/j.ymeth.2015.01.015] [Citation(s) in RCA: 76] [Impact Index Per Article: 8.4] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2014] [Revised: 01/21/2015] [Accepted: 01/23/2015] [Indexed: 12/12/2022] Open
Abstract
In recent years, the amount of experimental data produced in biomedical research and the number of papers published in this field have grown rapidly. To keep up to date with developments in their field of interest and to interpret the outcomes of experiments in light of all available literature, researchers increasingly turn to automated literature mining. As a consequence, text mining tools have evolved considerably in number and quality and can nowadays be used to address a variety of research questions, ranging from de novo drug target discovery to enhanced biological interpretation of the results from high-throughput experiments. In this paper, we introduce the most important text mining techniques and give an overview of the text mining tools currently in use and the types of problems to which they are typically applied.