Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Hunter L, Cohen KB. Biomedical language processing: what's beyond PubMed? Mol Cell 2006;21:589-94. [PMID: 16507357 PMCID: PMC1702322 DOI: 10.1016/j.molcel.2006.02.012] [Citation(s) in RCA: 122] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]

For:	Hunter L, Cohen KB. Biomedical language processing: what's beyond PubMed? Mol Cell 2006;21:589-94. [PMID: 16507357 PMCID: PMC1702322 DOI: 10.1016/j.molcel.2006.02.012] [Citation(s) in RCA: 122] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]

Number

Cited by Other Article(s)

Wang W, Shuai Y, Zeng M, Fan W, Li M. DPFunc: accurately predicting protein function via deep learning with domain-guided structure information. Nat Commun 2025;16:70. [PMID: 39746897 PMCID: PMC11697396 DOI: 10.1038/s41467-024-54816-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2024] [Accepted: 11/21/2024] [Indexed: 01/04/2025] Open

Li PH, Sun YY, Juan HF, Chen CY, Tsai HK, Huang JH. A large language model framework for literature-based disease-gene association prediction. Brief Bioinform 2024;26:bbaf070. [PMID: 39998433 PMCID: PMC11851487 DOI: 10.1093/bib/bbaf070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2024] [Revised: 01/09/2025] [Accepted: 02/06/2025] [Indexed: 02/26/2025] Open

Metzger VT, Cannon DC, Yang JJ, Mathias SL, Bologa CG, Waller A, Schürer SC, Vidović D, Kelleher KJ, Sheils TK, Jensen LJ, Lambert CG, Oprea TI, Edwards JS. TIN-X version 3: update with expanded dataset and modernized architecture for enhanced illumination of understudied targets. PeerJ 2024;12:e17470. [PMID: 38948230 PMCID: PMC11212617 DOI: 10.7717/peerj.17470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2023] [Accepted: 05/06/2024] [Indexed: 07/02/2024] Open

Abstract

TIN-X (Target Importance and Novelty eXplorer) is an interactive visualization tool for illuminating associations between diseases and potential drug targets and is publicly available at newdrugtargets.org. TIN-X uses natural language processing to identify disease and protein mentions within PubMed content using previously published tools for named entity recognition (NER) of gene/protein and disease names. Target data is obtained from the Target Central Resource Database (TCRD). Two important metrics, novelty and importance, are computed from this data and when plotted as log(importance) vs. log(novelty), aid the user in visually exploring the novelty of drug targets and their associated importance to diseases. TIN-X Version 3.0 has been significantly improved with an expanded dataset, modernized architecture including a REST API, and an improved user interface (UI). The dataset has been expanded to include not only PubMed publication titles and abstracts, but also full-text articles when available. This results in approximately 9-fold more target/disease associations compared to previous versions of TIN-X. Additionally, the TIN-X database containing this expanded dataset is now hosted in the cloud via Amazon RDS. Recent enhancements to the UI focuses on making it more intuitive for users to find diseases or drug targets of interest while providing a new, sortable table-view mode to accompany the existing plot-view mode. UI improvements also help the user browse the associated PubMed publications to explore and understand the basis of TIN-X's predicted association between a specific disease and a target of interest. While implementing these upgrades, computational resources are balanced between the webserver and the user's web browser to achieve adequate performance while accommodating the expanded dataset. Together, these advances aim to extend the duration that users can benefit from TIN-X while providing both an expanded dataset and new features that researchers can use to better illuminate understudied proteins.

Collapse

Affiliation(s)

Vincent T. Metzger Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States
Daniel C. Cannon Elevato Digital, Columbia, Missouri, United States
Jeremy J. Yang Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States
Stephen L. Mathias Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States
Cristian G. Bologa Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States
Anna Waller Center for Molecular Discovery, University of New Mexico Comprehensive Cancer Center, Albuquerque, New Mexico, United States
Stephan C. Schürer Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, Miami, Florida, United States
Dušica Vidović Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, Miami, Florida, United States
Keith J. Kelleher National Center for Advancing Translational Science, Rockville, Maryland, United States
Timothy K. Sheils National Center for Advancing Translational Science, Rockville, Maryland, United States
Lars Juhl Jensen Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
Christophe G. Lambert Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States
Tudor I. Oprea Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States
Jeremy S. Edwards Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, United States Department of Chemistry and Chemical Biology, University of New Mexico, Albuquerque, New Mexico, United States

Collapse

Narvaez-Rojas A, Arnaout MM, Hoz SS, Agrawal A, Lee A, Moscote-Salazar LR, Deora H. Info-pollution: a word of caution for the neurosurgical community. EGYPTIAN JOURNAL OF NEUROSURGERY 2022. [DOI: 10.1186/s41984-022-00179-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022] Open

Su Y, Wang M, Wang P, Zheng C, Liu Y, Zeng X. Deep learning joint models for extracting entities and relations in biomedical: a survey and comparison. Brief Bioinform 2022;23:6686739. [PMID: 36125190 DOI: 10.1093/bib/bbac342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2022] [Revised: 07/20/2022] [Accepted: 07/25/2022] [Indexed: 12/14/2022] Open

Cheng X, Cao Q, Liao SS. An overview of literature on COVID-19, MERS and SARS: Using text mining and latent Dirichlet allocation. J Inf Sci 2022;48:304-320. [PMID: 38603038 PMCID: PMC7464068 DOI: 10.1177/0165551520954674] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]

Auto-generated database of semiconductor band gaps using ChemDataExtractor. Sci Data 2022;9:193. [PMID: 35504897 PMCID: PMC9065101 DOI: 10.1038/s41597-022-01294-6] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2021] [Accepted: 02/08/2022] [Indexed: 11/16/2022] Open

Song B, Li F, Liu Y, Zeng X. Deep learning methods for biomedical named entity recognition: a survey and qualitative comparison. Brief Bioinform 2021;22:6326536. [PMID: 34308472 DOI: 10.1093/bib/bbab282] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2021] [Revised: 06/07/2021] [Accepted: 07/02/2021] [Indexed: 11/13/2022] Open

de Boer ML. Epistemic in/justice in patient participation. A discourse analysis of the Dutch ME/CFS Health Council advisory process. SOCIOLOGY OF HEALTH & ILLNESS 2021;43:1335-1354. [PMID: 34137042 PMCID: PMC8453904 DOI: 10.1111/1467-9566.13301] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/05/2020] [Revised: 03/25/2021] [Accepted: 05/05/2021] [Indexed: 05/28/2023]

Henry S, Wijesinghe DS, Myers A, McInnes BT. Using Literature Based Discovery to Gain Insights Into the Metabolomic Processes of Cardiac Arrest. Front Res Metr Anal 2021;6:644728. [PMID: 34250435 PMCID: PMC8267364 DOI: 10.3389/frma.2021.644728] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2020] [Accepted: 05/07/2021] [Indexed: 12/19/2022] Open

Zhang Y, Zhang Y, Qi P, Manning CD, Langlotz CP. Biomedical and clinical English model packages for the Stanza Python NLP library. J Am Med Inform Assoc 2021;28:1892-1899. [PMID: 34157094 PMCID: PMC8363782 DOI: 10.1093/jamia/ocab090] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2021] [Revised: 04/05/2021] [Accepted: 05/03/2021] [Indexed: 11/13/2022] Open

Liu C, Peres Kury FS, Li Z, Ta C, Wang K, Weng C. Doc2Hpo: a web application for efficient and accurate HPO concept curation. Nucleic Acids Res 2020;47:W566-W570. [PMID: 31106327 PMCID: PMC6602487 DOI: 10.1093/nar/gkz386] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2019] [Revised: 04/26/2019] [Accepted: 04/30/2019] [Indexed: 01/18/2023] Open

Crichton G, Baker S, Guo Y, Korhonen A. Neural networks for open and closed Literature-based Discovery. PLoS One 2020;15:e0232891. [PMID: 32413059 PMCID: PMC7228051 DOI: 10.1371/journal.pone.0232891] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2019] [Accepted: 04/23/2020] [Indexed: 12/18/2022] Open

Zhu H, Zeng Y, Wang D, Huangfu C. Species Classification for Neuroscience Literature Based on Span of Interest Using Sequence-to-Sequence Learning Model. Front Hum Neurosci 2020;14:128. [PMID: 32372933 PMCID: PMC7187631 DOI: 10.3389/fnhum.2020.00128] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2020] [Accepted: 03/19/2020] [Indexed: 11/13/2022] Open

Abstract

Large-scale neuroscience literature call for effective methods to mine the knowledge from species perspective to link the brain and neuroscience communities, neurorobotics, computing devices, and AI research communities. Structured knowledge can motivate researchers to better understand the functionality and structure of the brain and link the related resources and components. However, the abstracts of massive scientific works do not explicitly mention the species. Therefore, in addition to dictionary-based methods, we need to mine species using cognitive computing models that are more like the human reading process, and these methods can take advantage of the rich information in the literature. We also enable the model to automatically distinguish whether the mentioned species is the main research subject. Distinguishing the two situations can generate value at different levels of knowledge management. We propose SpecExplorer project which is used to explore the knowledge associations of different species for brain and neuroscience. This project frees humans from the tedious task of classifying neuroscience literature by species. Species classification task belongs to the multi-label classification which is more complex than the single-label classification due to the correlation between labels. To resolve this problem, we present the sequence-to-sequence classification framework to adaptively assign multiple species to the literature. To model the structure information of documents, we propose the hierarchical attentive decoding (HAD) to extract span of interest (SOI) for predicting each species. We create three datasets from PubMed and PMC corpora. We present two versions of annotation criteria (mention-based annotation and semantic-based annotation) for species research. Experiments demonstrate that our approach achieves improvements in the final results. Finally, we perform species-based analysis of brain diseases, brain cognitive functions, and proteins related to the hippocampus and provide potential research directions for certain species.

Collapse

Bani Baker Q, Nuser MS. Bioinformatics in Jordan: Status, challenges, and future directions. PLoS Comput Biol 2019;15:e1007202. [PMID: 31513576 PMCID: PMC6742231 DOI: 10.1371/journal.pcbi.1007202] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022] Open

Cañada A, Capella-Gutierrez S, Rabal O, Oyarzabal J, Valencia A, Krallinger M. LimTox: a web tool for applied text mining of adverse event and toxicity associations of compounds, drugs and genes. Nucleic Acids Res 2019;45:W484-W489. [PMID: 28531339 PMCID: PMC5570141 DOI: 10.1093/nar/gkx462] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2017] [Accepted: 05/16/2017] [Indexed: 01/03/2023] Open

Yadav S, Ekbal A, Saha S, Kumar A, Bhattacharyya P. Feature assisted stacked attentive shortest dependency path based Bi-LSTM model for protein–protein interaction. Knowl Based Syst 2019. [DOI: 10.1016/j.knosys.2018.11.020] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]

Kirschnick J, Thomas P, Roller R, Hennig L. SIA: a scalable interoperable annotation server for biomedical named entities. J Cheminform 2018;10:63. [PMID: 30552534 PMCID: PMC6755617 DOI: 10.1186/s13321-018-0319-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2018] [Accepted: 12/05/2018] [Indexed: 11/29/2022] Open

Literature-based automated discovery of tumor suppressor p53 phosphorylation and inhibition by NEK2. Proc Natl Acad Sci U S A 2018;115:10666-10671. [PMID: 30266789 PMCID: PMC6196525 DOI: 10.1073/pnas.1806643115] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022] Open

Vilar S, Friedman C, Hripcsak G. Detection of drug-drug interactions through data mining studies using clinical sources, scientific literature and social media. Brief Bioinform 2018;19:863-877. [PMID: 28334070 PMCID: PMC6454455 DOI: 10.1093/bib/bbx010] [Citation(s) in RCA: 90] [Impact Index Per Article: 12.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2016] [Revised: 12/28/2016] [Indexed: 11/13/2022] Open

Koci O, Logan M, Svolos V, Russell RK, Gerasimidis K, Ijaz UZ. An automated identification and analysis of ontological terms in gastrointestinal diseases and nutrition-related literature provides useful insights. PeerJ 2018;6:e5047. [PMID: 30065857 PMCID: PMC6064635 DOI: 10.7717/peerj.5047] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2018] [Accepted: 05/31/2018] [Indexed: 12/20/2022] Open

Baladrón C, Santos-Lozano A, Aguiar JM, Lucia A, Martín-Hernández J. Tool for filtering PubMed search results by sample size. J Am Med Inform Assoc 2018;25:774-779. [PMID: 29409012 DOI: 10.1093/jamia/ocx155] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2017] [Accepted: 12/20/2017] [Indexed: 11/13/2022] Open

Research Trend Visualization by MeSH Terms from PubMed. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2018;15:ijerph15061113. [PMID: 29848974 PMCID: PMC6025283 DOI: 10.3390/ijerph15061113] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/17/2018] [Revised: 05/28/2018] [Accepted: 05/29/2018] [Indexed: 11/17/2022]

Chen L, Friedman C, Finkelstein J. Automated Metabolic Phenotyping of Cytochrome Polymorphisms Using PubMed Abstract Mining. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018;2017:535-544. [PMID: 29854118 PMCID: PMC5977704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]

Cannon DC, Yang JJ, Mathias SL, Ursu O, Mani S, Waller A, Schürer SC, Jensen LJ, Sklar LA, Bologa CG, Oprea TI. TIN-X: target importance and novelty explorer. Bioinformatics 2018;33:2601-2603. [PMID: 28398460 PMCID: PMC5870731 DOI: 10.1093/bioinformatics/btx200] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2017] [Accepted: 04/06/2017] [Indexed: 11/14/2022] Open

Jiang X, Ringwald M, Blake J, Shatkay H. Effective biomedical document classification for identifying publications relevant to the mouse Gene Expression Database (GXD). DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2017;2017:3084695. [PMID: 28365740 PMCID: PMC5467553 DOI: 10.1093/database/bax017] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/01/2016] [Accepted: 02/13/2017] [Indexed: 12/16/2022]

Seco de Herrera AG, Schaer R, Müller H. Shangri-La: A medical case-based retrieval tool. J Assoc Inf Sci Technol 2017. [DOI: 10.1002/asi.23858] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Henry S, McInnes BT. Literature Based Discovery: Models, methods, and trends. J Biomed Inform 2017;74:20-32. [PMID: 28838802 DOI: 10.1016/j.jbi.2017.08.011] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2017] [Revised: 07/21/2017] [Accepted: 08/20/2017] [Indexed: 01/25/2023]

Almeida H, Jean-Louis L, Meurs MJ. An open source and modular search engine for biomedical literature retrieval. Comput Intell 2017. [DOI: 10.1111/coin.12125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Harris DR, Kavuluru R, Jaromczyk JW, Johnson TR. Rapid and Reusable Text Visualization and Exploration Development with DELVE. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2017;2017:139-148. [PMID: 28815123 PMCID: PMC5543346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/01/2022]

Larsson K, Baker S, Silins I, Guo Y, Stenius U, Korhonen A, Berglund M. Text mining for improved exposure assessment. PLoS One 2017;12:e0173132. [PMID: 28257498 PMCID: PMC5336247 DOI: 10.1371/journal.pone.0173132] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2016] [Accepted: 02/15/2017] [Indexed: 01/24/2023] Open

Roos A, Hedlund T. Using the domain analytical approach in the study of information practices in biomedicine. JOURNAL OF DOCUMENTATION 2016. [DOI: 10.1108/jd-11-2015-0139] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]

Ahmed Z, Zeeshan S, Dandekar T. Mining biomedical images towards valuable information retrieval in biomedical and life sciences. Database (Oxford) 2016;2016:baw118. [PMID: 27538578 PMCID: PMC4990152 DOI: 10.1093/database/baw118] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2015] [Revised: 06/07/2016] [Accepted: 07/19/2016] [Indexed: 12/22/2022]

Gupta R, Mantri SS. Biomolecular Relationships Discovered from Biological Labyrinth and Lost in Ocean of Literature: Community Efforts Can Rescue Until Automated Artificial Intelligence Takes Over. Front Genet 2016;7:46. [PMID: 27066067 PMCID: PMC4814459 DOI: 10.3389/fgene.2016.00046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2016] [Accepted: 03/15/2016] [Indexed: 11/30/2022] Open

Ahmed Z, Dandekar T. MSL: Facilitating automatic and physical analysis of published scientific literature in PDF format. F1000Res 2015;4:1453. [PMID: 29721305 PMCID: PMC5897790 DOI: 10.12688/f1000research.7329.3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 03/26/2018] [Indexed: 01/12/2023] Open

Supporting systematic reviews using LDA-based document representations. Syst Rev 2015;4:172. [PMID: 26612232 PMCID: PMC4662004 DOI: 10.1186/s13643-015-0117-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/14/2015] [Accepted: 08/18/2015] [Indexed: 12/04/2022] Open

Yu M, Selvaraj SK, Liang-Chu MMY, Aghajani S, Busse M, Yuan J, Lee G, Peale F, Klijn C, Bourgon R, Kaminker JS, Neve RM. A resource for cell line authentication, annotation and quality control. Nature 2015;520:307-11. [PMID: 25877200 DOI: 10.1038/nature14397] [Citation(s) in RCA: 161] [Impact Index Per Article: 16.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2014] [Accepted: 03/09/2015] [Indexed: 01/25/2023]

Gobeill J, Gaudinat A, Pasche E, Vishnyakova D, Gaudet P, Bairoch A, Ruch P. Deep Question Answering for protein annotation. Database (Oxford) 2015;2015:bav081. [PMID: 26384372 PMCID: PMC4572360 DOI: 10.1093/database/bav081] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2014] [Revised: 07/06/2015] [Accepted: 08/08/2015] [Indexed: 11/14/2022]

Abstract

Biomedical professionals have access to a huge amount of literature, but when they use a search engine, they often have to deal with too many documents to efficiently find the appropriate information in a reasonable time. In this perspective, question-answering (QA) engines are designed to display answers, which were automatically extracted from the retrieved documents. Standard QA engines in literature process a user question, then retrieve relevant documents and finally extract some possible answers out of these documents using various named-entity recognition processes. In our study, we try to answer complex genomics questions, which can be adequately answered only using Gene Ontology (GO) concepts. Such complex answers cannot be found using state-of-the-art dictionary- and redundancy-based QA engines. We compare the effectiveness of two dictionary-based classifiers for extracting correct GO answers from a large set of 100 retrieved abstracts per question. In the same way, we also investigate the power of GOCat, a GO supervised classifier. GOCat exploits the GOA database to propose GO concepts that were annotated by curators for similar abstracts. This approach is called deep QA, as it adds an original classification step, and exploits curated biological data to infer answers, which are not explicitly mentioned in the retrieved documents. We show that for complex answers such as protein functional descriptions, the redundancy phenomenon has a limited effect. Similarly usual dictionary-based approaches are relatively ineffective. In contrast, we demonstrate how existing curated data, beyond information extraction, can be exploited by a supervised classifier, such as GOCat, to massively improve both the quantity and the quality of the answers with a +100% improvement for both recall and precision. Database URL: http://eagl.unige.ch/DeepQA4PA/.

Collapse

Hirschberg J, Manning CD. Advances in natural language processing. Science 2015;349:261-6. [PMID: 26185244 DOI: 10.1126/science.aaa8685] [Citation(s) in RCA: 356] [Impact Index Per Article: 35.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]

Cohen PR. DARPA's Big Mechanism program. Phys Biol 2015;12:045008. [DOI: 10.1088/1478-3975/12/4/045008] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]

Zheng JG, Howsmon D, Zhang B, Hahn J, McGuinness D, Hendler J, Ji H. Entity linking for biomedical literature. BMC Med Inform Decis Mak 2015;15 Suppl 1:S4. [PMID: 26045232 PMCID: PMC4460707 DOI: 10.1186/1472-6947-15-s1-s4] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023] Open

Abstract

BACKGROUND

The Entity Linking (EL) task links entity mentions from an unstructured document to entities in a knowledge base. Although this problem is well-studied in news and social media, this problem has not received much attention in the life science domain. One outcome of tackling the EL problem in the life sciences domain is to enable scientists to build computational models of biological processes with more efficiency. However, simply applying a news-trained entity linker produces inadequate results.

METHODS

Since existing supervised approaches require a large amount of manually-labeled training data, which is currently unavailable for the life science domain, we propose a novel unsupervised collective inference approach to link entities from unstructured full texts of biomedical literature to 300 ontologies. The approach leverages the rich semantic information and structures in ontologies for similarity computation and entity ranking.

RESULTS

Without using any manual annotation, our approach significantly outperforms state-of-the-art supervised EL method (9% absolute gain in linking accuracy). Furthermore, the state-of-the-art supervised EL method requires 15,000 manually annotated entity mentions for training. These promising results establish a benchmark for the EL task in the life science domain. We also provide in depth analysis and discussion on both challenges and opportunities on automatic knowledge enrichment for scientific literature.

CONCLUSIONS

In this paper, we propose a novel unsupervised collective inference approach to address the EL problem in a new domain. We show that our unsupervised approach is able to outperform a current state-of-the-art supervised approach that has been trained with a large amount of manually labeled data. Life science presents an underrepresented domain for applying EL techniques. By providing a small benchmark data set and identifying opportunities, we hope to stimulate discussions across natural language processing and bioinformatics and motivate others to develop techniques for this largely untapped domain.

Collapse

Huang CC, Lu Z. Community challenges in biomedical text mining over 10 years: success, failure and the future. Brief Bioinform 2015;17:132-44. [PMID: 25935162 DOI: 10.1093/bib/bbv024] [Citation(s) in RCA: 107] [Impact Index Per Article: 10.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2015] [Indexed: 11/13/2022] Open

Blanc X, Collet TH, Auer R, Iriarte P, Krause J, Légaré F, Cornuz J, Clair C. Retrieval of publications addressing shared decision making: an evaluation of full-text searches on medical journal websites. JMIR Res Protoc 2015;4:e38. [PMID: 25854180 PMCID: PMC4405619 DOI: 10.2196/resprot.3615] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2014] [Revised: 01/21/2015] [Accepted: 02/04/2015] [Indexed: 11/18/2022] Open

Abstract

BACKGROUND

Full-text searches of articles increase the recall, defined by the proportion of relevant publications that are retrieved. However, this method is rarely used in medical research due to resource constraints. For the purpose of a systematic review of publications addressing shared decision making, a full-text search method was required to retrieve publications where shared decision making does not appear in the title or abstract.

OBJECTIVE

The objective of our study was to assess the efficiency and reliability of full-text searches in major medical journals for identifying shared decision making publications.

METHODS

A full-text search was performed on the websites of 15 high-impact journals in general internal medicine to look up publications of any type from 1996-2011 containing the phrase "shared decision making". The search method was compared with a PubMed search of titles and abstracts only. The full-text search was further validated by requesting all publications from the same time period from the individual journal publishers and searching through the collected dataset.

RESULTS

The full-text search for "shared decision making" on journal websites identified 1286 publications in 15 journals compared to 119 through the PubMed search. The search within the publisher-provided publications of 6 journals identified 613 publications compared to 646 with the full-text search on the respective journal websites. The concordance rate was 94.3% between both full-text searches.

CONCLUSIONS

Full-text searching on medical journal websites is an efficient and reliable way to identify relevant articles in the field of shared decision making for review or other purposes. It may be more widely used in biomedical research in other fields in the future, with the collaboration of publishers and journals toward open-access data.

Collapse

Ma Y, Dong M, Mita C, Sun S, Peng CK, Yang AC. Publication analysis on insomnia: how much has been done in the past two decades? Sleep Med 2015;16:820-6. [PMID: 25979182 DOI: 10.1016/j.sleep.2014.12.028] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/10/2014] [Revised: 12/07/2014] [Accepted: 12/29/2014] [Indexed: 12/19/2022]

Almeida H, Meurs MJ, Kosseim L, Butler G, Tsang A. Machine learning for biomedical literature triage. PLoS One 2014;9:e115892. [PMID: 25551575 PMCID: PMC4281078 DOI: 10.1371/journal.pone.0115892] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2014] [Accepted: 11/27/2014] [Indexed: 11/30/2022] Open

Nim HT, Boyd SE, Rosenthal NA. Systems approaches in integrative cardiac biology: illustrations from cardiac heterocellular signalling studies. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2014;117:69-77. [PMID: 25499442 DOI: 10.1016/j.pbiomolbio.2014.11.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/20/2014] [Revised: 11/26/2014] [Accepted: 11/28/2014] [Indexed: 12/27/2022]

Wu C, Schwartz JM, Brabant G, Nenadic G. Molecular profiling of thyroid cancer subtypes using large-scale text mining. BMC Med Genomics 2014;7 Suppl 3:S3. [PMID: 25521965 PMCID: PMC4290788 DOI: 10.1186/1755-8794-7-s3-s3] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

Micropublications: a semantic model for claims, evidence, arguments and annotations in biomedical communications. J Biomed Semantics 2014;5:28. [PMID: 26261718 PMCID: PMC4530550 DOI: 10.1186/2041-1480-5-28] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2013] [Accepted: 06/16/2014] [Indexed: 11/10/2022] Open

Abstract

Background

Scientific publications are documentary representations of defeasible arguments, supported by data and repeatable methods. They are the essential mediating artifacts in the ecosystem of scientific communications. The institutional “goal” of science is publishing results. The linear document publication format, dating from 1665, has survived transition to the Web.

Intractable publication volumes; the difficulty of verifying evidence; and observed problems in evidence and citation chains suggest a need for a web-friendly and machine-tractable model of scientific publications. This model should support: digital summarization, evidence examination, challenge, verification and remix, and incremental adoption. Such a model must be capable of expressing a broad spectrum of representational complexity, ranging from minimal to maximal forms.

Results

The micropublications semantic model of scientific argument and evidence provides these features. Micropublications support natural language statements; data; methods and materials specifications; discussion and commentary; challenge and disagreement; as well as allowing many kinds of statement formalization.

The minimal form of a micropublication is a statement with its attribution. The maximal form is a statement with its complete supporting argument, consisting of all relevant evidence, interpretations, discussion and challenges brought forward in support of or opposition to it. Micropublications may be formalized and serialized in multiple ways, including in RDF. They may be added to publications as stand-off metadata.

An OWL 2 vocabulary for micropublications is available at http://purl.org/mp. A discussion of this vocabulary along with RDF examples from the case studies, appears as OWL Vocabulary and RDF Examples in Additional file 1.

Conclusion

Micropublications, because they model evidence and allow qualified, nuanced assertions, can play essential roles in the scientific communications ecosystem in places where simpler, formalized and purely statement-based models, such as the nanopublications model, will not be sufficient. At the same time they will add significant value to, and are intentionally compatible with, statement-based formalizations.

We suggest that micropublications, generated by useful software tools supporting such activities as writing, editing, reviewing, and discussion, will be of great value in improving the quality and tractability of biomedical communications.

Collapse

Xu R, Li L, Wang Q. dRiskKB: a large-scale disease-disease risk relationship knowledge base constructed from biomedical text. BMC Bioinformatics 2014;15:105. [PMID: 24725842 PMCID: PMC3998061 DOI: 10.1186/1471-2105-15-105] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2013] [Accepted: 04/07/2014] [Indexed: 11/14/2022] Open

Abstract

BACKGROUND

Discerning the genetic contributions to complex human diseases is a challenging mandate that demands new types of data and calls for new avenues for advancing the state-of-the-art in computational approaches to uncovering disease etiology. Systems approaches to studying observable phenotypic relationships among diseases are emerging as an active area of research for both novel disease gene discovery and drug repositioning. Currently, systematic study of disease relationships on a phenome-wide scale is limited due to the lack of large-scale machine understandable disease phenotype relationship knowledge bases. Our study innovates a semi-supervised iterative pattern learning approach that is used to build an precise, large-scale disease-disease risk relationship (D1 → D2) knowledge base (dRiskKB) from a vast corpus of free-text published biomedical literature.

RESULTS

21,354,075 MEDLINE records comprised the text corpus under study. First, we used one typical disease risk-specific syntactic pattern (i.e. "D1 due to D2") as a seed to automatically discover other patterns specifying similar semantic relationships among diseases. We then extracted D1 → D2 risk pairs from MEDLINE using the learned patterns. We manually evaluated the precisions of the learned patterns and extracted pairs. Finally, we analyzed the correlations between disease-disease risk pairs and their associated genes and drugs. The newly created dRiskKB consists of a total of 34,448 unique D1 → D2 pairs, representing the risk-specific semantic relationships among 12,981 diseases with each disease linked to its associated genes and drugs. The identified patterns are highly precise (average precision of 0.99) in specifying the risk-specific relationships among diseases. The precisions of extracted pairs are 0.919 for those that are exactly matched and 0.988 for those that are partially matched. By comparing the iterative pattern approach starting from different seeds, we demonstrated that our algorithm is robust in terms of seed choice. We show that diseases and their risk diseases as well as diseases with similar risk profiles tend to share both genes and drugs.

CONCLUSIONS

This unique dRiskKB, when combined with existing phenotypic, genetic, and genomic datasets, can have profound implications in our deeper understanding of disease etiology and in drug repositioning.

Collapse

Das S, McCaffrey PG, Talkington MWT, Andrews NA, Corlosquet S, Ivinson AJ, Clark T. Pain Research Forum: application of scientific social media frameworks in neuroscience. Front Neuroinform 2014;8:21. [PMID: 24653693 PMCID: PMC3949323 DOI: 10.3389/fninf.2014.00021] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2013] [Accepted: 02/19/2014] [Indexed: 01/21/2023] Open

Abstract

BACKGROUND

Social media has the potential to accelerate the pace of biomedical research through online collaboration, discussions, and faster sharing of information. Focused web-based scientific social collaboratories such as the Alzheimer Research Forum have been successful in engaging scientists in open discussions of the latest research and identifying gaps in knowledge. However, until recently, tools to rapidly create such communities and provide high-bandwidth information exchange between collaboratories in related fields did not exist.

METHODS

We have addressed this need by constructing a reusable framework to build online biomedical communities, based on Drupal, an open-source content management system. The framework incorporates elements of Semantic Web technology combined with social media. Here we present, as an exemplar of a web community built on our framework, the Pain Research Forum (PRF) (http://painresearchforum.org). PRF is a community of chronic pain researchers, established with the goal of fostering collaboration and communication among pain researchers.

RESULTS

Launched in 2011, PRF has over 1300 registered members with permission to submit content. It currently hosts over 150 topical news articles on research; more than 30 active or archived forum discussions and journal club features; a webinar series; an editor-curated weekly updated listing of relevant papers; and several other resources for the pain research community. All content is licensed for reuse under a Creative Commons license; the software is freely available. The framework was reused to develop other sites, notably the Multiple Sclerosis Discovery Forum (http://msdiscovery.org) and StemBook (http://stembook.org).

DISCUSSION

Web-based collaboratories are a crucial integrative tool supporting rapid information transmission and translation in several important research areas. In this article, we discuss the success factors, lessons learned, and ongoing challenges in using PRF as a driving force to develop tools for online collaboration in neuroscience. We also indicate ways these tools can be applied to other areas and uses.

Collapse