Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Gordon MD, Lindsay RK. Toward discovery support systems: A replication, re-examination, and extension of Swanson's work on literature-based discovery of a connection between Raynaud's and fish oil. J Am Soc Inf Sci 1996. [DOI: 10.1002/(sici)1097-4571(199602)47:2<116::aid-asi3>3.0.co;2-1] [Citation(s) in RCA: 103] [Impact Index Per Article: 3.7] [Indexed: 11/09/2022]

Cited by Other Article(s)

Henry S, Wijesinghe DS, Myers A, McInnes BT. Using Literature Based Discovery to Gain Insights Into the Metabolomic Processes of Cardiac Arrest. Front Res Metr Anal 2021;6:644728. [PMID: 34250435 PMCID: PMC8267364 DOI: 10.3389/frma.2021.644728] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/21/2020] [Accepted: 05/07/2021] [Indexed: 12/19/2022] Open

Scientometric analysis and knowledge mapping of literature-based discovery (1986–2020). Scientometrics 2021. [DOI: 10.1007/s11192-020-03811-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/12/2022]

Crichton G, Baker S, Guo Y, Korhonen A. Neural networks for open and closed Literature-based Discovery. PLoS One 2020;15:e0232891. [PMID: 32413059 PMCID: PMC7228051 DOI: 10.1371/journal.pone.0232891] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/02/2019] [Accepted: 04/23/2020] [Indexed: 12/18/2022] Open

Examining drug and side effect relation using author–entity pair bipartite networks. J Informetr 2020. [DOI: 10.1016/j.joi.2019.100999] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]

Visualizing a field of research: A methodology of systematic scientometric reviews. PLoS One 2019;14:e0223994. [PMID: 31671124 PMCID: PMC6822756 DOI: 10.1371/journal.pone.0223994] [Citation(s) in RCA: 342] [Impact Index Per Article: 68.4] [Received: 06/12/2019] [Accepted: 10/02/2019] [Indexed: 12/14/2022] Open

Henry S, McInnes BT. Indirect association and ranking hypotheses for literature based discovery. BMC Bioinformatics 2019;20:425. [PMID: 31416434 PMCID: PMC6694578 DOI: 10.1186/s12859-019-2989-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/24/2018] [Accepted: 07/09/2019] [Indexed: 11/10/2022] Open

Preiss J. Is automatic detection of hidden knowledge an anomaly? BMC Bioinformatics 2019;20:251. [PMID: 31138105 PMCID: PMC6538538 DOI: 10.1186/s12859-019-2815-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open

Gopalakrishnan V, Jha K, Jin W, Zhang A. A survey on literature based discovery approaches in biomedical domain. J Biomed Inform 2019;93:103141. [PMID: 30857950 DOI: 10.1016/j.jbi.2019.103141] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 07/24/2018] [Revised: 02/17/2019] [Accepted: 02/19/2019] [Indexed: 02/06/2023]

Thilakaratne M, Falkner K, Atapattu T. A systematic review on literature-based discovery workflow. PeerJ Comput Sci 2019;5:e235. [PMID: 33816888 PMCID: PMC7924697 DOI: 10.7717/peerj-cs.235] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 07/08/2019] [Accepted: 10/17/2019] [Indexed: 05/02/2023]

Henry S, McInnes BT. Literature Based Discovery: Models, methods, and trends. J Biomed Inform 2017;74:20-32. [PMID: 28838802 DOI: 10.1016/j.jbi.2017.08.011] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 03/03/2017] [Revised: 07/21/2017] [Accepted: 08/20/2017] [Indexed: 01/25/2023]

Preiss J, Stevenson M. Quantifying and filtering knowledge generated by literature based discovery. BMC Bioinformatics 2017;18:249. [PMID: 28617217 PMCID: PMC5471938 DOI: 10.1186/s12859-017-1641-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 02/03/2023] Open

Emerging approaches in literature-based discovery: techniques and performance review. KNOWL ENG REV 2017. [DOI: 10.1017/s0269888917000042] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 11/07/2022]

Kastrin A, Rindflesch TC, Hristovski D. Link Prediction on a Network of Co-occurring MeSH Terms: Towards Literature-based Discovery. Methods Inf Med 2016;55:340-6. [PMID: 27435341 DOI: 10.3414/me15-01-0108] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 08/17/2015] [Accepted: 05/19/2016] [Indexed: 12/24/2022]

Kajikawa Y, Abe K, Noda S. Filling the gap between researchers studying different materials and different methods: a proposal for structured keywords. J Inf Sci 2016. [DOI: 10.1177/0165551506067125] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 11/16/2022]

Bawden D. The three worlds of health information. J Inf Sci 2016. [DOI: 10.1177/016555150202800106] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/16/2022]

Preiss J, Stevenson M, Gaizauskas R. Exploring relation types for literature-based discovery. J Am Med Inform Assoc 2015;22:987-92. [PMID: 25971437 PMCID: PMC4986660 DOI: 10.1093/jamia/ocv002] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 08/13/2014] [Accepted: 12/26/2014] [Indexed: 11/25/2022] Open

Abstract

OBJECTIVE

Literature-based discovery (LBD) aims to identify “hidden knowledge” in the medical literature by: (1) analyzing documents to identify pairs of explicitly related concepts (terms), then (2) hypothesizing novel relations between pairs of unrelated concepts that are implicitly related via a shared concept to which both are explicitly related. Many LBD approaches use simple techniques to identify semantically weak relations between concepts, for example, document co-occurrence. These generate huge numbers of hypotheses, difficult for humans to assess. More complex techniques rely on linguistic analysis, for example, shallow parsing, to identify semantically stronger relations. Such approaches generate fewer hypotheses, but may miss hidden knowledge. The authors investigate this trade-off in detail, comparing techniques for identifying related concepts to discover which are most suitable for LBD.

MATERIALS AND METHODS

A generic LBD system that can utilize a range of relation types was developed. Experiments were carried out comparing a number of techniques for identifying relations. Two approaches were used for evaluation: replication of existing discoveries and the “time slicing” approach.¹

RESULTS

Previous LBD discoveries could be replicated using relations based either on document co-occurrence or linguistic analysis. Using relations based on linguistic analysis generated many fewer hypotheses, but a significantly greater proportion of them were candidates for hidden knowledge.

DISCUSSION AND CONCLUSION

The use of linguistic analysis-based relations improves accuracy of LBD without overly damaging coverage. LBD systems often generate huge numbers of hypotheses, which are infeasible to manually review. Improving their accuracy has the potential to make these systems significantly more usable.
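The two-step procedure described in this abstract (find explicitly related pairs, then hypothesize links through a shared concept) can be sketched as follows. This is a minimal illustration that uses document co-occurrence as the relation; it is not the authors' system. The function names and the toy documents are hypothetical, and the toy data is made up for illustration rather than taken from any corpus.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_pairs(documents):
    """Step 1: map each concept to the concepts it co-occurs with in any document."""
    related = defaultdict(set)
    for concepts in documents:
        for a, b in combinations(set(concepts), 2):
            related[a].add(b)
            related[b].add(a)
    return related

def open_discovery(documents, start):
    """Step 2: propose concepts C that never co-occur with `start`
    but share at least one linking concept B with it."""
    related = cooccurrence_pairs(documents)
    hypotheses = defaultdict(set)
    for b in related[start]:
        for c in related[b]:
            if c != start and c not in related[start]:
                hypotheses[c].add(b)  # remember which B links start to C
    return dict(hypotheses)

# Toy documents (illustrative only): each is the set of concepts it mentions.
docs = [
    {"raynaud", "blood viscosity"},
    {"raynaud", "platelet aggregation"},
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"aspirin", "platelet aggregation"},
]
result = open_discovery(docs, "raynaud")
```

Every concept reachable through any shared term becomes a hypothesis, which is why co-occurrence-based systems tend to produce very large candidate lists; a stricter relation type would shrink `related` and with it the output.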


Cameron D, Kavuluru R, Rindflesch TC, Sheth AP, Thirunarayan K, Bodenreider O. Context-driven automatic subgraph creation for literature-based discovery. J Biomed Inform 2015;54:141-57. [PMID: 25661592 PMCID: PMC4888806 DOI: 10.1016/j.jbi.2015.01.014] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Received: 09/16/2014] [Revised: 01/21/2015] [Accepted: 01/25/2015] [Indexed: 01/29/2023]

Abstract

BACKGROUND

Literature-based discovery (LBD) is characterized by uncovering hidden associations in non-interacting scientific literature. Prior approaches to LBD include use of: (1) domain expertise and structured background knowledge to manually filter and explore the literature, (2) distributional statistics and graph-theoretic measures to rank interesting connections, and (3) heuristics to help eliminate spurious connections. However, manual approaches to LBD are not scalable and purely distributional approaches may not be sufficient to obtain insights into the meaning of poorly understood associations. While several graph-based approaches have the potential to elucidate associations, their effectiveness has not been fully demonstrated. A considerable degree of a priori knowledge, heuristics, and manual filtering is still required.

OBJECTIVES

In this paper we implement and evaluate a context-driven, automatic subgraph creation method that captures multifaceted complex associations between biomedical concepts to facilitate LBD. Given a pair of concepts, our method automatically generates a ranked list of subgraphs, which provide informative and potentially unknown associations between such concepts.

METHODS

To generate subgraphs, the set of all MEDLINE articles that contain either of the two specified concepts (A, C) is first collected. Then binary relationships or assertions, which are automatically extracted from the MEDLINE articles, called semantic predications, are used to create a labeled directed predications graph. In this predications graph, a path is represented as a sequence of semantic predications. The hierarchical agglomerative clustering (HAC) algorithm is then applied to cluster paths that are bounded by the two concepts (A, C). HAC relies on implicit semantics captured through Medical Subject Heading (MeSH) descriptors, and explicit semantics from the MeSH hierarchy, for clustering. Paths that exceed a threshold of semantic relatedness are clustered into subgraphs based on their shared context. Finally, the automatically generated clusters are provided as a ranked list of subgraphs.

RESULTS

The subgraphs generated using this approach facilitated the rediscovery of 8 out of 9 existing scientific discoveries. In particular, they directly (or indirectly) led to the recovery of several intermediates (or B-concepts) between A- and C-terms, while also providing insights into the meaning of the associations. Such meaning is derived from predicates between the concepts, as well as the provenance of the semantic predications in MEDLINE. Additionally, by generating subgraphs on different thematic dimensions (such as Cellular Activity, Pharmaceutical Treatment and Tissue Function), the approach may enable a broader understanding of the nature of complex associations between concepts. Finally, in a statistical evaluation to determine the interestingness of the subgraphs, it was observed that an arbitrary association is mentioned in only approximately 4 articles in MEDLINE on average.

CONCLUSION

These results suggest that leveraging the implicit and explicit semantics provided by manually assigned MeSH descriptors is an effective representation for capturing the underlying context of complex associations, along multiple thematic dimensions in LBD situations.
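The path-then-cluster pipeline in the Methods paragraph can be illustrated with a small sketch. It is not the authors' implementation: Jaccard overlap between per-path descriptor sets stands in for the paper's MeSH-based semantic relatedness, the clustering is a plain average-linkage agglomerative loop, and the graph, predicate names, and descriptor labels (`d1` to `d4`) are hypothetical placeholders.

```python
def simple_paths(graph, source, target, max_len=3):
    """Enumerate cycle-free paths from source to target as lists of
    (subject, predicate, object) triples, at most max_len triples long."""
    paths = []

    def walk(node, path, visited):
        if node == target and path:
            paths.append(list(path))
            return
        if len(path) == max_len:
            return
        for predicate, nxt in graph.get(node, []):
            if nxt not in visited:
                path.append((node, predicate, nxt))
                walk(nxt, path, visited | {nxt})
                path.pop()

    walk(source, [], {source})
    return paths

def jaccard(x, y):
    """Set overlap in [0, 1]; used here as a stand-in relatedness measure."""
    return len(x & y) / len(x | y) if x | y else 0.0

def cluster_paths(context, threshold=0.5):
    """Average-linkage agglomerative clustering over path indices.
    context[i] is the descriptor set attached to path i (an assumption)."""
    clusters = [[i] for i in range(len(context))]

    def sim(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(jaccard(context[i], context[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        score, i, j = max(
            ((sim(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda t: t[0],
        )
        if score < threshold:
            break  # no remaining pair is related enough to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical predications graph: node -> [(predicate, neighbor), ...]
graph = {
    "A": [("AFFECTS", "B1"), ("INHIBITS", "B2"), ("TREATS", "B3")],
    "B1": [("CAUSES", "C")],
    "B2": [("CAUSES", "C")],
    "B3": [("ASSOCIATED_WITH", "C")],
}
paths = simple_paths(graph, "A", "C")
context = [{"d1", "d2"}, {"d1", "d2", "d3"}, {"d4"}]  # one descriptor set per path
clusters = cluster_paths(context)
```

Each resulting cluster corresponds to one candidate subgraph: the union of the paths whose indices it contains. Ranking the clusters, which the abstract mentions, is left out of this sketch.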


Cheng L, Lin H, Zhou F, Yang Z, Wang J. Enhancing the accuracy of knowledge discovery: a supervised learning method. BMC Bioinformatics 2014;15 Suppl 12:S9. [PMID: 25474584 PMCID: PMC4243114 DOI: 10.1186/1471-2105-15-s12-s9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022] Open

Kazmer MM, Lustria MLA, Cortese J, Burnett G, Kim JH, Ma J, Frost J. Distributed knowledge in an online patient support community: Authority and discovery. J Assoc Inf Sci Technol 2014. [DOI: 10.1002/asi.23064] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Indexed: 12/11/2022]

Workman TE, Fiszman M, Rindflesch TC, Nahl D. Framing serendipitous information-seeking behavior for facilitating literature-based discovery: A proposed model. J Assoc Inf Sci Technol 2014. [DOI: 10.1002/asi.22999] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]

Jenkin TA, Chan YE, Skillicorn DB, Rogers KW. Individual Exploration, Sensemaking, and Innovation: A Design for the Discovery of Novel Information. DECISION SCIENCES 2013. [DOI: 10.1111/deci.12042] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/27/2022]

Entitymetrics: measuring the impact of entities. PLoS One 2013;8:e71416. [PMID: 24009660 PMCID: PMC3756961 DOI: 10.1371/journal.pone.0071416] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Received: 04/15/2013] [Accepted: 06/29/2013] [Indexed: 12/29/2022] Open

Blake C. Text mining. Annu Rev Inf Sci Technol 2013. [DOI: 10.1002/aris.2011.1440450110] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 11/11/2022]

Cameron D, Bodenreider O, Yalamanchili H, Danh T, Vallabhaneni S, Thirunarayan K, Sheth AP, Rindflesch TC. A graph-based recovery and decomposition of Swanson's hypothesis using semantic predications. J Biomed Inform 2012;46:238-51. [PMID: 23026233 DOI: 10.1016/j.jbi.2012.09.004] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 03/22/2012] [Revised: 09/05/2012] [Accepted: 09/08/2012] [Indexed: 11/29/2022]

Abstract

OBJECTIVES

This paper presents a methodology for recovering and decomposing Swanson's Raynaud Syndrome-Fish Oil hypothesis semi-automatically. The methodology leverages the semantics of assertions extracted from biomedical literature (called semantic predications) along with structured background knowledge and graph-based algorithms to semi-automatically capture the informative associations originally discovered manually by Swanson. Demonstrating that Swanson's manually intensive techniques can be undertaken semi-automatically paves the way for fully automatic semantics-based hypothesis generation from scientific literature.

METHODS

Semantic predications obtained from biomedical literature allow the construction of labeled directed graphs which contain various associations among concepts from the literature. By aggregating such associations into informative subgraphs, some of the relevant details originally articulated by Swanson have been uncovered. However, by leveraging background knowledge to bridge important knowledge gaps in the literature, a methodology for semi-automatically capturing the detailed associations originally explicated in natural language by Swanson has been developed.

RESULTS

Our methodology not only recovered the three associations commonly recognized as Swanson's hypothesis, but also decomposed them into an additional 16 detailed associations, formulated as chains of semantic predications. Altogether, 14 out of the 19 associations that can be attributed to Swanson were retrieved using our approach. To the best of our knowledge, such an in-depth recovery and decomposition of Swanson's hypothesis has never been attempted.

CONCLUSION

In this work, we therefore presented a methodology to semi-automatically recover and decompose Swanson's RS-DFO hypothesis using semantic representations and graph algorithms. Our methodology provides new insights into potential prerequisites for semantics-driven Literature-Based Discovery (LBD). Based on our observations, three critical aspects of LBD include: (1) the need for more expressive representations beyond Swanson's ABC model; (2) an ability to accurately extract semantic information from text; and (3) the semantic integration of scientific literature and structured background knowledge.


Lekka E, Deftereos SN, Persidis A, Persidis A, Andronis C. Literature analysis for systematic drug repurposing: a case study from Biovista. Drug Discov Today Ther Strateg 2011. [DOI: 10.1016/j.ddstr.2011.06.005] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]

Bisgin H, Liu Z, Fang H, Xu X, Tong W. Mining FDA drug labels using an unsupervised learning technique--topic modeling. BMC Bioinformatics 2011;12 Suppl 10:S11. [PMID: 22166012 PMCID: PMC3236833 DOI: 10.1186/1471-2105-12-s10-s11] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

The Food and Drug Administration (FDA) approved drug labels contain a broad array of information, ranging from adverse drug reactions (ADRs) to drug efficacy, risk-benefit consideration, and more. However, the labeling language used to describe this information is free text, often containing ambiguous semantic descriptions, which poses a great challenge in retrieving useful information from the labeling text in a consistent and accurate fashion for comparative analysis across drugs. Consequently, this task has largely relied on the manual reading of the full text by experts, which is time consuming and labor intensive.

METHOD

In this study, a novel unsupervised text mining method called topic modeling was applied to the drug labeling with the goal of discovering "topics" that group together drugs with similar safety concerns and/or therapeutic uses. A total of 794 FDA-approved drug labels were used in this study. First, the three labeling sections (i.e., Boxed Warning, Warnings and Precautions, Adverse Reactions) of each drug label were processed by the Medical Dictionary for Regulatory Activities (MedDRA) to convert the free text of each label to the standard ADR terms. Next, the topic modeling approach with latent Dirichlet allocation (LDA) was applied to generate 100 topics, each associated with a set of drugs grouped together based on the probability analysis. Lastly, the efficacy of the topic modeling was evaluated based on known information about the therapeutic uses and safety data of drugs.

RESULTS

The results demonstrate that drugs grouped by topics are associated with the same safety concerns and/or therapeutic uses with statistical significance (P<0.05). The identified topics have distinct context that can be directly linked to specific adverse events (e.g., liver injury or kidney injury) or therapeutic application (e.g., antiinfectives for systemic use). We were also able to identify potential adverse events that might arise from specific medications via topics.

CONCLUSIONS

The successful application of topic modeling on the FDA drug labeling demonstrates its potential utility as a hypothesis generation means to infer hidden relationships of concepts such as, in this study, drug safety and therapeutic use in the study of biomedical documents.


Kostoff RN, Block JA, Solka JL, Briggs MB, Rushenberg RL, Stump JA, Johnson D, Lyons TJ, Wyatt JR. Literature-related discovery. Annu Rev Inf Sci Technol 2011. [DOI: 10.1002/aris.2009.1440430112] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 01/14/2023]

Deftereos SN, Andronis C, Friedla EJ, Persidis A, Persidis A. Drug repurposing and adverse event prediction using high-throughput literature analysis. WILEY INTERDISCIPLINARY REVIEWS-SYSTEMS BIOLOGY AND MEDICINE 2011;3:323-34. [PMID: 21416632 DOI: 10.1002/wsbm.147] [Citation(s) in RCA: 80] [Impact Index Per Article: 6.2] [Indexed: 01/06/2023]

Ijaz AZ, Song M, Lee D. MKEM: a Multi-level Knowledge Emergence Model for mining undiscovered public knowledge. BMC Bioinformatics 2010;11 Suppl 2:S3. [PMID: 20406501 PMCID: PMC3165192 DOI: 10.1186/1471-2105-11-s2-s3] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/03/2022] Open

Cohen T, Schvaneveldt R, Widdows D. Reflective Random Indexing and indirect inference: a scalable method for discovery of implicit connections. J Biomed Inform 2009;43:240-56. [PMID: 19761870 DOI: 10.1016/j.jbi.2009.09.003] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Received: 04/24/2009] [Revised: 09/09/2009] [Accepted: 09/09/2009] [Indexed: 10/20/2022]

Towards an explanatory and computational theory of scientific discovery. J Informetr 2009. [DOI: 10.1016/j.joi.2009.03.004] [Citation(s) in RCA: 218] [Impact Index Per Article: 14.5] [Indexed: 11/21/2022]

Patents and publications as sources of novel and inventive knowledge. Scientometrics 2009. [DOI: 10.1007/s11192-007-2041-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 10/21/2022]

Hu X, Zhang X, Yoo I, Wang X, Feng J. Mining hidden connections among biomedical concepts from disjoint biomedical literature sets through semantic-based association rule. INT J INTELL SYST 2009. [DOI: 10.1002/int.20396] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/05/2022]

A new evaluation methodology for literature-based discovery systems. J Biomed Inform 2008;42:633-43. [PMID: 19124086 DOI: 10.1016/j.jbi.2008.12.001] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Received: 07/06/2008] [Revised: 12/03/2008] [Accepted: 12/05/2008] [Indexed: 11/21/2022]

Literature-Based Knowledge Discovery using Natural Language Processing. LITERATURE-BASED DISCOVERY 2008. [DOI: 10.1007/978-3-540-68690-3_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]

Information Retrieval in Literature-Based Discovery. LITERATURE-BASED DISCOVERY 2008. [DOI: 10.1007/978-3-540-68690-3_10] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/28/2023]

Zhou X, Liu B, Wu Z, Feng Y. Integrative mining of traditional Chinese medicine literature and MEDLINE for functional gene networks. Artif Intell Med 2007;41:87-104. [PMID: 17804209 DOI: 10.1016/j.artmed.2007.07.007] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Received: 12/01/2006] [Revised: 07/24/2007] [Accepted: 07/24/2007] [Indexed: 01/17/2023]

Abstract

OBJECTIVE

The amount of biomedical data in different disciplines is growing at an exponential rate. Integrating these significant knowledge sources to generate novel hypotheses for systems biology research is difficult. Traditional Chinese medicine (TCM) is a completely different discipline, and is a complementary knowledge system to modern biomedical science. This paper uses a significant TCM bibliographic literature database in China, together with MEDLINE, to help discover novel gene functional knowledge.

MATERIALS AND METHODS

We present an integrative mining approach to uncover the functional gene relationships from MEDLINE and TCM bibliographic literature. This paper introduces TCM literature (about 50,000 records) as one knowledge source for constructing literature-based gene networks. We use the TCM diagnosis, TCM syndrome, to automatically congregate the related genes. The syndrome-gene relationships are discovered based on the syndrome-disease relationships extracted from TCM literature and the disease-gene relationships in MEDLINE. Based on the bubble-bootstrapping and relation weight computing methods, we have developed a prototype system called MeDisco/3S, which has named entity and relation extraction, and online analytical processing (OLAP) capabilities, to perform the integrative mining process.

RESULTS

We obtained about 200,000 syndrome-gene relations, which could help generate syndrome-based gene networks and help analyze the functional knowledge of genes from a syndrome perspective. We take the gene network of Kidney-Yang Deficiency syndrome (KYD syndrome) and the functional analysis of some genes, such as CRH (corticotropin releasing hormone), PTH (parathyroid hormone), PRL (prolactin), BRCA1 (breast cancer 1, early onset) and BRCA2 (breast cancer 2, early onset), to demonstrate the preliminary results. The underlying hypothesis is that the related genes of the same syndrome will have some biological functional relationships, and will constitute a functional network.

CONCLUSION

This paper presents an approach to integrate TCM literature and modern biomedical data to discover novel gene networks and functional knowledge of genes. The preliminary results show that novel gene functional knowledge and gene networks, which are worthy of further investigation, could be generated by integrating the two complementary biomedical data sources. Integrative mining of TCM and modern life science literature is a promising research field.
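The core inference in this abstract, deriving syndrome-gene relations from syndrome-disease and disease-gene relations, amounts to composing two relations over the shared disease. A minimal sketch follows; it does not reproduce MeDisco/3S or its bubble-bootstrapping and relation-weighting methods. The weight used here (the number of distinct linking diseases) is an assumption chosen for simplicity, and the identifiers `S1`, `D1`, `G1` and so on are placeholders, not real syndromes, diseases, or genes.

```python
from collections import defaultdict

def compose_relations(syndrome_disease, disease_gene):
    """Link each syndrome to each gene that shares a disease with it.
    Returns {(syndrome, gene): number of distinct linking diseases}."""
    genes_by_disease = defaultdict(set)
    for disease, gene in disease_gene:
        genes_by_disease[disease].add(gene)

    support = defaultdict(set)
    for syndrome, disease in syndrome_disease:
        for gene in genes_by_disease.get(disease, ()):
            support[(syndrome, gene)].add(disease)

    return {pair: len(diseases) for pair, diseases in support.items()}

# Placeholder relations standing in for the two extracted sources.
syndrome_disease = [("S1", "D1"), ("S1", "D2"), ("S2", "D2")]
disease_gene = [("D1", "G1"), ("D2", "G1"), ("D2", "G2")]
weights = compose_relations(syndrome_disease, disease_gene)
```

Grouping the resulting pairs by syndrome gives the gene set from which a syndrome-based gene network could be built.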


Tulipano PK, Tao Y, Millar WS, Zanzonico P, Kolbert K, Xu H, Yu H, Chen L, Lussier YA, Friedman C. Natural language processing and visualization in the molecular imaging domain. J Biomed Inform 2006;40:270-81. [PMID: 17084109 DOI: 10.1016/j.jbi.2006.08.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 12/14/2005] [Revised: 08/25/2006] [Accepted: 08/29/2006] [Indexed: 11/16/2022]

Bekhuis T. Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy. BIOMEDICAL DIGITAL LIBRARIES 2006;3:2. [PMID: 16584552 PMCID: PMC1459187 DOI: 10.1186/1742-5581-3-2] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Received: 09/26/2005] [Accepted: 04/03/2006] [Indexed: 11/10/2022]

Swanson DR. Atrial fibrillation in athletes: Implicit literature-based connections suggest that overtraining and subsequent inflammation may be a contributory mechanism. Med Hypotheses 2006;66:1085-92. [PMID: 16504414 DOI: 10.1016/j.mehy.2006.01.006] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.1] [Received: 01/10/2006] [Accepted: 01/10/2006] [Indexed: 11/21/2022]

Abstract

Research on atrial fibrillation (AF), a common heart arrhythmia in the elderly, over many decades has resulted in a literature of more than 16,000 articles indexed in Medline. An exploratory Medline search was conducted in which the subheadings for epidemiology and etiology of AF were combined to form a small subset of the initial records. Further computer-assisted selection led to a few articles that reported an unexpectedly high prevalence of AF in groups of otherwise healthy middle-aged endurance runners and other athletes. Why athletes should be unusually susceptible to AF is mysterious and puzzling. Because relatively few articles are about both AF and endurance exercise, a computer was used first to create a list of important terms that these two separate literatures had in common. Several inflammation-related terms, including C-reactive protein (CRP) and interleukin-6, were on that list. Further searching and literature analysis revealed that excessive endurance exercise or overtraining can lead to chronic systemic inflammation and, separately, that there is a solid association between CRP and AF and that anti-inflammatory agents have been reported to lower CRP and ameliorate AF. No articles were found that brought together all three concepts - AF, inflammation, and exercise. The following hypothesis is plausible, readily testable, and apparently novel: Older athletes diagnosed with AF but otherwise healthy who have engaged in rigorous aerobic endurance exercise for more than a decade will have CRP levels that are higher than those of a similar population of athletes without AF. Corroboration of this hypothesis would then justify a prospective clinical trial of anti-inflammation therapy. It is of particular interest to extend recent studies of inflammation in AF to athletes; athletic behavior that can induce inflammation may contribute to understanding the origins of AF.
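The step in which a computer lists terms that two separate literatures have in common can be sketched as below. The abstract does not say how "important" terms were selected, so the ranking here (by the smaller of the two document frequencies) is an assumption, as are the function name `shared_terms`, the stopword list, and the titles, which are invented strings rather than real Medline records.

```python
from collections import Counter

def shared_terms(literature_a, literature_c, stopwords=frozenset()):
    """Return (term, score) pairs for terms present in both literatures,
    scored by the smaller of the two document frequencies."""
    def doc_freq(literature):
        counts = Counter()
        for text in literature:
            # Count each term once per document, ignoring stopwords.
            counts.update(set(text.lower().split()) - stopwords)
        return counts

    df_a, df_c = doc_freq(literature_a), doc_freq(literature_c)
    common = {t: min(df_a[t], df_c[t]) for t in df_a.keys() & df_c.keys()}
    return sorted(common.items(), key=lambda kv: (-kv[1], kv[0]))

# Invented titles for illustration only.
af_titles = [
    "inflammation markers in atrial fibrillation",
    "crp and atrial fibrillation",
]
exercise_titles = [
    "overtraining and systemic inflammation",
    "crp after endurance exercise",
    "inflammation in endurance runners",
]
linking = shared_terms(af_titles, exercise_titles,
                       stopwords=frozenset({"and", "in", "after"}))
```

The output is only a candidate list; as the abstract describes, further searching and reading are needed to decide whether a shared term reflects a plausible mechanism.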

Dumais ST. Latent semantic analysis. ACTA ACUST UNITED AC 2005. [DOI: 10.1002/aris.1440380105] [Citation(s) in RCA: 493] [Impact Index Per Article: 25.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Hristovski D, Peterlin B, Mitchell JA, Humphrey SM. Using literature-based discovery to identify disease candidate genes. Int J Med Inform 2005;74:289-98. [PMID: 15694635 DOI: 10.1016/j.ijmedinf.2004.04.024] [Citation(s) in RCA: 169] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2003] [Revised: 04/06/2004] [Accepted: 04/20/2004] [Indexed: 11/27/2022]

MacMullen WJ, Denn SO. Information problems in molecular biology and bioinformatics. ACTA ACUST UNITED AC 2005. [DOI: 10.1002/asi.20134] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Kostoff RN, Block JA, Stump JA, Pfeil KM. Information content in Medline record fields. Int J Med Inform 2004;73:515-27. [PMID: 15171980 DOI: 10.1016/j.ijmedinf.2004.02.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2003] [Revised: 11/07/2003] [Accepted: 02/27/2004] [Indexed: 11/30/2022]

Abstract

BACKGROUND

The authors have been conducting text mining analyses (extraction of useful information from text) of Medline records, using Abstracts as the main data source. For literature-based discovery, and other text mining applications as well, all records in a discipline need to be evaluated for determining prior art. Many Medline records do not contain Abstracts, but typically contain Titles and Mesh terms. Substitution of these fields for Abstracts in the non-Abstract records would restore the missing literature to some degree.

OBJECTIVES

Determine how well the information content of Title and Mesh fields approximates that of Abstracts in Medline records.

APPROACH

Select historical Medline records related to Raynaud's Phenomenon that contain Abstracts. Determine the information content in the Abstract fields through text mining. Then, determine the information content in the Title fields, the Mesh fields, and the combined Title-Mesh fields, and compare with the information content in the Abstracts.

RESULTS

Four metrics were used to compare the information content related to Raynaud's Phenomenon in the different fields: total number of phrases; number of unique phrases; content of factors from factor analyses; content of clusters from multi-link clustering. The Abstract field contains almost an order of magnitude more phrases than the other fields, and slightly more than an order of magnitude more unique phrases than the other fields. Each field used a factor matrix with 14 factors, and the combination of all 56 factors for the four fields represented 27 separate, but not unique, themes. These themes could be placed in two major categories, with two sub-categories per major category: Auto-immunity (antibodies, inflammation) and circulation (peripheral vessel circulation, coronary vessel circulation). All four sub-categories included representation from each field. Thus, while the focus of the representation of each field in each sub-category was moderately different, the four sub-category structure could be identified by analyzing the total factors in each field. In the cluster comparison phase of the study, the phrases used to create the clusters were the most important phrases identified for each factor. Thus, the factor matrix served as a filter for words used for clustering. While clusters were generated for all four fields, the Title hierarchy tended to be fragmented due to sparsity of the co-occurrence matrix that underlies the clusters. Therefore, the Title clusters were examined at only the lower levels of aggregation. The Abstract, Mesh, and Mesh + Title fields had the same first level taxonomy categories, auto-immunity and circulation. At the second level, the Abstract, Mesh, and Mesh + Title fields had the autoimmune diseases and antibodies sub-category in common. 
The Abstract and Mesh fields shared fascia inflammation as the other auto-immunity sub-category, while the other Mesh + Title sub-category focuses on vinyl chloride poisoning from industrial contact, and consequences of antineoplastic agents. However, in both cases, even though the words may be different, inflammation may be the common theme.

CONCLUSIONS

For taxonomy generation, especially at the higher levels, each of the four fields has a similar thematic structure. At very detailed levels, the Mesh and Title fields run out of phrases relative to the Abstract field. Therefore, selection of field(s) to be employed for taxonomy generation depends on the objectives of the study, particularly the level of categorization required for the taxonomy. For information retrieval, or literature-based discovery, selection of the appropriate field again depends on the study objectives. If large queries, or large numbers of concepts or themes are desired, then the field with the largest number of technical phrases would be desirable. If queries or concepts represented by the more accepted popular terminology are adequate, then the smaller fields may be sufficient. Because of its established and controlled vocabulary, the Mesh field lags the Title or Abstract fields in currency. Thus, the Title or Abstract fields would retrieve records with the most explicitly stated current concepts, but the Mesh field would capture a larger swath of fields that contained a concept of interest but perhaps had a wider range of specific terminology in the Abstract or Title text. In addition, this study provides the first validated estimate of the disparity in information retrieved through text mining limited to Titles and Mesh terms relative to entire Abstracts. As much of the older biomedical literature was entered into electronic databases without associated Abstracts, literature-based discovery exercises that search the older medical literature may miss a substantial proportion of relevant information. On the basis of this study, it may be estimated that up to a log order more information may be retrieved when complete Abstracts are searched.

Text Mining for Finding Functional Community of Related Genes Using TCM Knowledge. ACTA ACUST UNITED AC 2004. [DOI: 10.1007/978-3-540-30116-5_42] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register]

Srinivasan P. Text mining: Generating hypotheses from MEDLINE. ACTA ACUST UNITED AC 2004. [DOI: 10.1002/asi.10389] [Citation(s) in RCA: 168] [Impact Index Per Article: 8.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Kostoff RN. Overcoming Specialization. Bioscience 2002. [DOI: 10.1641/0006-3568(2002)052[0937:os]2.0.co;2] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022] Open

Viator JA, Pestorius FM. Investigating trends in acoustics research from 1970-1999. J Acoust Soc Am 2001;109:1779-83. [PMID: 11386532 DOI: 10.1121/1.1366711] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]

Relationships in the Organization of Knowledge: An Overview. ACTA ACUST UNITED AC 2001. [DOI: 10.1007/978-94-015-9696-1_1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register]

Weeber M, Klein H, de Jong-van den Berg LT, Vos R. Using concepts in literature-based discovery: Simulating Swanson's Raynaud-fish oil and migraine-magnesium discoveries. ACTA ACUST UNITED AC 2001. [DOI: 10.1002/asi.1104] [Citation(s) in RCA: 143] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]