Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Uzun A, Triche EW, Schuster J, Dewan AT, Padbury JF. dbPEC: a comprehensive literature-based database for preeclampsia related genes and phenotypes. Database (Oxford) 2016;2016:baw006. [PMID: 26946289 PMCID: PMC4779341 DOI: 10.1093/database/baw006] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 11/19/2015] [Revised: 12/28/2015] [Accepted: 01/12/2016] [Indexed: 01/08/2023]

Cited by Other Article(s)

Moufarrej MN, Vorperian SK, Wong RJ, Campos AA, Quaintance CC, Sit RV, Tan M, Detweiler AM, Mekonen H, Neff NF, Baruch-Gravett C, Litch JA, Druzin ML, Winn VD, Shaw GM, Stevenson DK, Quake SR. Early prediction of preeclampsia in pregnancy with cell-free RNA. Nature 2022;602:689-694. [PMID: 35140405 PMCID: PMC8971130 DOI: 10.1038/s41586-022-04410-z] [Citation(s) in RCA: 85] [Impact Index Per Article: 42.5] [Received: 03/10/2021] [Accepted: 01/06/2022] [Indexed: 12/30/2022]

Li X, Liu L, Whitehead C, Li J, Thierry B, Le TD, Winter M. OUP accepted manuscript. Brief Funct Genomics 2022;21:296-309. [PMID: 35484822 PMCID: PMC9328024 DOI: 10.1093/bfgp/elac006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2021] [Revised: 03/11/2022] [Accepted: 03/18/2022] [Indexed: 11/24/2022]

DeWan AT. Gene-Gene and Gene-Environment Interactions. Methods Mol Biol 2019;1793:89-110. [PMID: 29876893 DOI: 10.1007/978-1-4939-7868-7_7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/30/2022]

Schuster J, Superdock M, Agudelo A, Stey P, Padbury J, Sarkar IN, Uzun A. Machine learning approach to literature mining for the genetics of complex diseases. Database (Oxford) 2019;2019:baz124. [PMID: 31768545 PMCID: PMC6877776 DOI: 10.1093/database/baz124] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/09/2019] [Revised: 09/03/2019] [Accepted: 09/23/2019] [Indexed: 11/14/2022]

Abstract

To generate a parsimonious gene set for understanding the mechanisms underlying complex diseases, we reasoned it was necessary to combine the curation of public literature, review of experimental databases and interpolation of pathway-associated genes. Using this strategy, we previously built the following two databases for reproductive disorders: The Database for Preterm Birth (dbPTB) and The Database for Preeclampsia (dbPEC). The completeness and accuracy of these databases are essential for supporting our understanding of these complex conditions. Given the exponential increase in biomedical literature, it is becoming increasingly difficult to manually maintain these databases. Using our curated databases as reference data sets, we implemented a machine learning-based approach to optimize article selection for manual curation. We used logistic regression, random forests and neural networks as our machine learning algorithms to classify articles. We examined features derived from abstract text, annotations and metadata that we hypothesized would best classify articles with genetically relevant content associated with the disorder of interest. Combinations of these features were used to build the classifiers, and the performance of these feature sets was compared to a standard 'Bag-of-Words'. Several combinations of these genetics-based feature sets outperformed 'Bag-of-Words' at a threshold such that 95% of the curated gene set obtained from the original manual curation of all articles was extracted from the articles classified by machine learning as 'considered'. The performance was superior in terms of the reduction of required manual curation and two measures of the harmonic mean of precision and recall. The reduction in workload ranged from 0.814 to 0.846 for the dbPTB and 0.301 to 0.371 for the dbPEC. Additionally, a database of metadata and annotations is generated that allows rapid querying of individual features. Our results demonstrate that machine learning algorithms can identify articles with relevant data for databases of genes associated with complex diseases.
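
The thresholding idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy scores, the placeholder gene labels, and the definition of workload reduction as one minus the fraction of articles passed to curators are all assumptions made for illustration. The sketch ranks articles by classifier score, lowers the cut-off until at least 95% of the curated gene set is covered, and reports the resulting reduction alongside a general F-beta score (a weighted harmonic mean of precision and recall).

```python
from typing import Dict, List, Set, Tuple


def pick_threshold(
    scores: Dict[str, float],
    genes_by_article: Dict[str, Set[str]],
    min_gene_recall: float = 0.95,
) -> Tuple[float, List[str], float]:
    """Return (threshold, considered_articles, workload_reduction).

    Assumes `scores` is non-empty. Workload reduction is taken here to be
    1 - (articles considered / all articles), which is an assumption.
    """
    all_genes: Set[str] = set().union(*genes_by_article.values())
    needed = min_gene_recall * len(all_genes)

    # Walk articles from highest to lowest classifier score.
    ranked = sorted(scores, key=lambda a: scores[a], reverse=True)
    covered: Set[str] = set()
    threshold = scores[ranked[-1]]
    for article in ranked:
        covered |= genes_by_article.get(article, set())
        if len(covered) >= needed:
            threshold = scores[article]
            break

    # Keep every article at or above the cut-off, including score ties.
    considered = [a for a in ranked if scores[a] >= threshold]
    reduction = 1 - len(considered) / len(scores)
    return threshold, considered, reduction


def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy, made-up data: five articles with hypothetical scores and gene labels.
toy_scores = {"a1": 0.9, "a2": 0.8, "a3": 0.4, "a4": 0.2, "a5": 0.1}
toy_genes = {
    "a1": {"GENE_A", "GENE_B"},
    "a2": {"GENE_C"},
    "a3": {"GENE_A"},
    "a4": set(),
    "a5": set(),
}

threshold, considered, reduction = pick_threshold(toy_scores, toy_genes)
print(threshold, considered, round(reduction, 3))
print(round(f_beta(0.5, 1.0), 3))
```

In this toy case the two highest-scoring articles already cover all three placeholder genes, so the remaining three articles would not need manual review.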
