1
Jing X. The Unified Medical Language System at 30 Years and How It Is Used and Published: Systematic Review and Content Analysis. JMIR Med Inform 2021; 9:e20675. [PMID: 34236337 PMCID: PMC8433943 DOI: 10.2196/20675] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 05/25/2020] [Revised: 11/25/2020] [Accepted: 07/02/2021] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND: The Unified Medical Language System (UMLS) has been a critical tool in biomedical and health informatics, and the year 2021 marks its 30th anniversary. The UMLS brings together many broadly used vocabularies and standards in the biomedical field to facilitate interoperability among different computer systems and applications.
OBJECTIVE: Despite its longevity, there is no comprehensive publication analysis of the use of the UMLS. This review and analysis therefore provides an overview of the UMLS and a comprehensive account of how it has been used in English-language peer-reviewed publications over the last 30 years.
METHODS: PubMed, ACM Digital Library, and the Nursing & Allied Health Database were used to search for studies. The primary search strategy was as follows: UMLS was used as a Medical Subject Headings term or a keyword or appeared in the title or abstract. Only English-language publications were considered. The publications were screened first, then coded and categorized iteratively, following grounded theory. The review process followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines.
RESULTS: A total of 943 publications were included in the final analysis. Because 32 of them were each assigned to 2 categories, the category counts sum to 975. After analysis and categorization of the publications, the UMLS was found to be used in the following emerging themes or areas (the number of publications and their respective percentages are given in parentheses): natural language processing (230/975, 23.6%), information retrieval (125/975, 12.8%), terminology study (90/975, 9.2%), ontology and modeling (80/975, 8.2%), medical subdomains (76/975, 7.8%), other language studies (53/975, 5.4%), artificial intelligence tools and applications (46/975, 4.7%), patient care (35/975, 3.6%), data mining and knowledge discovery (25/975, 2.6%), medical education (20/975, 2.1%), degree-related theses (13/975, 1.3%), digital library (5/975, 0.5%), and the UMLS itself (150/975, 15.4%), as well as the UMLS for other purposes (27/975, 2.8%).
CONCLUSIONS: The UMLS has been used successfully in patient care, medical education, digital libraries, and software development, as originally planned, as well as in degree-related theses, the building of artificial intelligence tools, data mining and knowledge discovery, foundational work in methodology, and middle layers that may lead to advanced products. Natural language processing, the UMLS itself, and information retrieval are the 3 most common themes that emerged among the included publications. The results, although largely related to academia, demonstrate that the UMLS achieves its intended uses successfully, in addition to achieving uses broadly beyond its original intentions.
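The arithmetic in the results can be re-derived from the reported numbers alone. The short Python sketch below is an illustration rather than anything from the review: the dictionary simply transcribes the counts and percentages quoted in the abstract, and the names `THEMES`, `INCLUDED`, `DOUBLE_COUNTED`, and `check_counts` are made up for this example. It checks that the category counts sum to 943 + 32 = 975 and that each percentage matches its count to one decimal place.

```python
# Counts and percentages transcribed from the abstract: theme -> (count, reported %).
THEMES = {
    "natural language processing": (230, 23.6),
    "information retrieval": (125, 12.8),
    "terminology study": (90, 9.2),
    "ontology and modeling": (80, 8.2),
    "medical subdomains": (76, 7.8),
    "other language studies": (53, 5.4),
    "artificial intelligence tools and applications": (46, 4.7),
    "patient care": (35, 3.6),
    "data mining and knowledge discovery": (25, 2.6),
    "medical education": (20, 2.1),
    "degree-related theses": (13, 1.3),
    "digital library": (5, 0.5),
    "the UMLS itself": (150, 15.4),
    "UMLS for other purposes": (27, 2.8),
}

INCLUDED = 943        # unique publications in the final analysis
DOUBLE_COUNTED = 32   # publications assigned to 2 categories


def check_counts(themes):
    """Return the summed category count and any theme whose reported
    percentage differs from count/total by 0.05 points or more."""
    total = sum(count for count, _ in themes.values())
    mismatches = [
        name
        for name, (count, pct) in themes.items()
        if abs(100 * count / total - pct) >= 0.05
    ]
    return total, mismatches


total, mismatches = check_counts(THEMES)
print(total, mismatches)
```

An empty mismatch list means every reported percentage is consistent with its count and the 975 denominator.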
Affiliation(s)
- Xia Jing
  - Department of Public Health Sciences, College of Behavioral, Social and Health Sciences, Clemson University, Clemson, SC, United States
2
Yan Q, Sun SY, Yuan S, Wang XQ, Zhang ZC. Inhibition of microRNA-9-5p and microRNA-128-3p can inhibit ischemic stroke-related cell death in vitro and in vivo. IUBMB Life 2020; 72:2382-2390. [PMID: 32797712 DOI: 10.1002/iub.2357] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 06/11/2020] [Revised: 07/13/2020] [Accepted: 07/13/2020] [Indexed: 01/02/2023]
Abstract
Ischemic stroke is the major form of stroke and is accentuated by multiple comorbidities. It has been previously shown that different microRNAs (miRNAs) regulate separate aspects of ischemic stroke. Differential miRNA expression analysis in cerebrospinal fluid of stroke patients has revealed upregulation of miR-124-3p, miR-9-3p, miR-9-5p, and miR-128-3p. However, whether this overexpression is correlative or causative was not known. Here, using an in vitro oxygen-glucose deprivation/reoxygenation (OGD/R) neuronal cell model, we found that OGD/R-induced injury was associated with significant upregulation of these four miRNAs. Target gene prediction using in silico algorithms and gene set enrichment analysis revealed significant enrichment of FOXO and Relaxin signaling pathways and regulatory processes associated with endothelial cell migration, which are all known to associate with apoptotic pathways. In silico protein-protein interaction network analysis confirmed the findings of gene set enrichment analysis. TUNEL analysis showed that OGD/R-induced injury resulted in significant apoptosis, which was significantly inhibited in neuronal cells pretransfected with inhibitors of either miR-9-5p or miR-128-3p. Further testing in an in vivo middle cerebral artery occlusion (MCAO) mouse model of ischemic stroke showed that inhibiting miR-9-5p or miR-128-3p significantly decreased MCAO-induced infarct volume and inhibited the apoptotic response, as revealed by decreased cleaved Caspase-3 protein expression in immunohistochemical analysis. Combined inhibition of miR-9-5p and miR-128-3p resulted in a synergistic decrease in cell death and infarct volume in vitro and in vivo, respectively. Cumulatively, our results provide critical knowledge about the mechanism by which elevated miR-9-5p and miR-128-3p cause brain damage in ischemic stroke and provide evidence that they are attractive therapeutic targets.
Affiliation(s)
- Qi Yan
  - Department of Neurology, Lanzhou University Second Hospital, Lanzhou, China
- Shou-Yuan Sun
  - Department of Neurosurgery, Lanzhou University Second Hospital, Lanzhou, China
- Shuai Yuan
  - Department of Neurosurgery, Lanzhou University Second Hospital, Lanzhou, China
- Xiao-Qing Wang
  - Neurosurgery Laboratory, Lanzhou University Second Hospital, Lanzhou, China
- Zhen-Chang Zhang
  - Department of Neurology, Lanzhou University Second Hospital, Lanzhou, China
3
Systematic identification of genetic systems associated with phenotypes in patients with rare genomic copy number variations. Hum Genet 2020; 140:457-475. [PMID: 32778951 DOI: 10.1007/s00439-020-02214-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/14/2020] [Accepted: 07/30/2020] [Indexed: 01/02/2023]
Abstract
Copy number variation (CNV) related disorders tend to show complex phenotypic profiles that do not match known diseases. This makes it difficult to ascertain their underlying molecular basis. A potential solution is to compare the affected genomic regions for multiple patients that share a pathological phenotype, looking for commonalities. Here, we present a novel approach to associate phenotypes with functional systems (FunSys), in terms of GO categories and KEGG and Reactome pathways, based on patient data. The approach uses genomic and phenomic data from the same patients, finding shared genomic regions between patients with similar phenotypes. These regions are mapped to genes to find associated functional systems. We applied the approach to analyse patients in the DECIPHER database with de novo CNVs, finding functional systems associated with most phenotypes, often due to mutations affecting related genes in the same genomic region. Manual inspection of the ten top-scoring phenotypes found multiple FunSys connections supported by previous studies for seven of them. The workflow also produces reports focussed on the genes and FunSys connected to the different phenotypes, alongside patient-specific reports, which give details of the associated genes and FunSys for each individual in the cohort. These can be run in "confidential" mode, preserving patient confidentiality. The workflow presented here can be used to associate phenotypes with functional systems using data at the level of a whole cohort of patients, identifying important connections that could not be found when considering patients individually. The full workflow is available for download, enabling it to be run on any patient cohort for which phenotypic and CNV data are available.
4
Yoon S, Lee D. Meta-path Based Prioritization of Functional Drug Actions with Multi-Level Biological Networks. Sci Rep 2019; 9:5469. [PMID: 30940832 PMCID: PMC6445150 DOI: 10.1038/s41598-019-41814-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/14/2018] [Accepted: 03/14/2019] [Indexed: 11/09/2022] Open
Abstract
Functional drug actions refer to drug-affected GO terms. They aid in the investigation of drug effects that are therapeutic or adverse. Previous studies have utilized the linkage information between drugs and functions in molecular-level biological networks. Since the current knowledge of molecular-level mechanisms of biological functions is still limited, such previous studies were incomplete. We expected that multi-level biological networks would allow us to investigate functional drug actions more completely. We constructed multi-level biological networks with genes, GO terms, and diseases. Meta-paths were utilized to extract the features of each GO term. We trained 39 SVM models, one per drug, to prioritize the functional drug actions of 39 drugs. Through the multi-level networks, more functional drug actions were utilized for the 39 models and inferred by the models. Multi-level based features improved the performance of the models, and the average AUROC value in the cross-validation was 0.86. Moreover, 60% of the candidates were true.
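To make the notion of a meta-path feature concrete, the following Python sketch counts path instances along a sequence of node types in a toy typed network. It is a generic illustration of meta-path counting and not the authors' implementation: the edge data, the identifiers (`drugA`, `g1`, `GO_x`, and so on), and the function `count_meta_path` are all hypothetical, and the abstract does not specify how the features were actually computed.

```python
from collections import defaultdict

# Toy typed edges; every identifier here is invented for illustration.
EDGES = {
    ("drug", "gene"): [("drugA", "g1"), ("drugA", "g2")],
    ("gene", "go"): [("g1", "GO_x"), ("g2", "GO_x"), ("g2", "GO_y")],
    ("gene", "disease"): [("g1", "d1")],
    ("disease", "go"): [("d1", "GO_y")],
}


def count_meta_path(start, meta_path, edges=EDGES):
    """Count path instances that begin at `start` and follow the
    node-type sequence `meta_path`; returns {end_node: path_count}."""
    frontier = {start: 1}
    for src_type, dst_type in zip(meta_path, meta_path[1:]):
        reached = defaultdict(int)
        for src, dst in edges.get((src_type, dst_type), []):
            if src in frontier:
                # Each path reaching `src` extends along this edge to `dst`.
                reached[dst] += frontier[src]
        frontier = dict(reached)
    return frontier


# One feature per meta-path: a molecular-level path and a multi-level path
# that passes through a disease node.
molecular = count_meta_path("drugA", ["drug", "gene", "go"])
multi_level = count_meta_path("drugA", ["drug", "gene", "disease", "go"])
print(molecular, multi_level)
```

In this toy network the path through the disease node reaches a GO term by a route the gene-only path does not use, which is the kind of extra signal multi-level features are meant to contribute; per-term counts like these could then be fed to a classifier.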
Affiliation(s)
- Seyeol Yoon
  - Department of Bio and Brain Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
  - Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
- Doheon Lee
  - Department of Bio and Brain Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
  - Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
5
Reeves PA, Richards CM. Biases induced by using geography and environment to guide ex situ conservation. Conserv Genet 2018. [DOI: 10.1007/s10592-018-1098-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/28/2022]
6
Garcia-Gathright JI, Matiasz NJ, Adame C, Sarma KV, Sauer L, Smedley NF, Spiegel ML, Strunck J, Garon EB, Taira RK, Aberle DR, Bui AAT. Evaluating Casama: Contextualized semantic maps for summarization of lung cancer studies. Comput Biol Med 2018; 92:55-63. [PMID: 29149658 PMCID: PMC5762403 DOI: 10.1016/j.compbiomed.2017.10.034] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/20/2017] [Revised: 10/28/2017] [Accepted: 10/29/2017] [Indexed: 01/15/2023]
Abstract
OBJECTIVE: It is crucial for clinicians to stay up to date on current literature in order to apply recent evidence to clinical decision making. Automatic summarization systems can help clinicians quickly view an aggregated summary of literature on a topic. Casama, a representation and summarization system based on "contextualized semantic maps," captures the findings of biomedical studies as well as the contexts associated with patient population and study design. This paper presents a user-oriented evaluation of Casama in comparison to a context-free representation, SemRep.
MATERIALS AND METHODS: The effectiveness of the representation was evaluated by presenting users with manually annotated Casama and SemRep summaries of ten articles on driver mutations in cancer. Automatic annotations were evaluated on a collection of articles on EGFR mutation in lung cancer. Seven users completed a questionnaire rating the summarization quality for various topics and applications.
RESULTS: Casama had higher median scores than SemRep for the majority of the topics (p ≤ 0.00032), all of the applications (p ≤ 0.00089), and in overall summarization quality (p ≤ 1.5e-05). Casama's manual annotations outperformed Casama's automatic annotations (p = 0.00061).
DISCUSSION: Casama performed particularly well in the representation of strength of evidence, which was highly rated both quantitatively and qualitatively. Users noted that Casama's less granular, more targeted representation improved usability compared to SemRep.
CONCLUSION: This evaluation demonstrated the benefits of a contextualized representation for summarizing biomedical literature on cancer. Iteration on specific areas of Casama's representation, further development of its algorithms, and a clinically-oriented evaluation are warranted.
Affiliation(s)
- Jean I Garcia-Gathright
  - University of California, Los Angeles, Department of Bioengineering, 924 Westwood Boulevard, Suite 420, Los Angeles, CA, 90024, USA
- Nicholas J Matiasz
  - University of California, Los Angeles, Department of Bioengineering, 924 Westwood Boulevard, Suite 420, Los Angeles, CA, 90024, USA
- Carlos Adame
  - University of California, Los Angeles, Department of Medicine - Division of Hematology-Oncology, 924 Westwood Boulevard, Suite 200, Los Angeles, CA, 90024, USA
- Karthik V Sarma
  - University of California, Los Angeles, Department of Bioengineering, 924 Westwood Boulevard, Suite 420, Los Angeles, CA, 90024, USA
- Lauren Sauer
  - University of California, Los Angeles, Department of Medicine - Division of Hematology-Oncology, 924 Westwood Boulevard, Suite 200, Los Angeles, CA, 90024, USA
- Nova F Smedley
  - University of California, Los Angeles, Department of Bioengineering, 924 Westwood Boulevard, Suite 420, Los Angeles, CA, 90024, USA
- Marshall L Spiegel
  - University of California, Los Angeles, Department of Medicine - Division of Hematology-Oncology, 924 Westwood Boulevard, Suite 200, Los Angeles, CA, 90024, USA
- Jennifer Strunck
  - University of California, Los Angeles, Department of Medicine - Division of Hematology-Oncology, 924 Westwood Boulevard, Suite 200, Los Angeles, CA, 90024, USA
- Edward B Garon
  - University of California, Los Angeles, Department of Medicine - Division of Hematology-Oncology, 924 Westwood Boulevard, Suite 200, Los Angeles, CA, 90024, USA
- Ricky K Taira
  - University of California, Los Angeles, Department of Bioengineering, 924 Westwood Boulevard, Suite 420, Los Angeles, CA, 90024, USA
  - University of California, Los Angeles, Department of Radiological Sciences, 924 Westwood Boulevard, Suite 420, Los Angeles, CA, 90024, USA
- Denise R Aberle
  - University of California, Los Angeles, Department of Bioengineering, 924 Westwood Boulevard, Suite 420, Los Angeles, CA, 90024, USA
  - University of California, Los Angeles, Department of Radiological Sciences, 924 Westwood Boulevard, Suite 420, Los Angeles, CA, 90024, USA
- Alex A T Bui
  - University of California, Los Angeles, Department of Bioengineering, 924 Westwood Boulevard, Suite 420, Los Angeles, CA, 90024, USA
  - University of California, Los Angeles, Department of Radiological Sciences, 924 Westwood Boulevard, Suite 420, Los Angeles, CA, 90024, USA
7
Yu H, Jung J, Yoon S, Kwon M, Bae S, Yim S, Lee J, Kim S, Kang Y, Lee D. CODA: Integrating multi-level context-oriented directed associations for analysis of drug effects. Sci Rep 2017; 7:7519. [PMID: 28790372 PMCID: PMC5548804 DOI: 10.1038/s41598-017-07448-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 04/04/2017] [Accepted: 07/13/2017] [Indexed: 11/09/2022] Open
Abstract
In silico network-based methods have shown promising results in the field of drug development. Yet most networks used in previous research have not included context information, even though biological associations appear in specific contexts. Here, we reconstruct an anatomical context-specific network by assigning contexts to biological associations using protein expression data and scientific literature. Furthermore, we employ the context-specific network for the analysis of drug effects with a proximity measure between drug targets and diseases. Distinct from previous context-specific networks, intercellular associations and phenomic-level entities such as biological processes are included in our network to represent the human body. Performance in inferring drug-disease associations increases when context information and phenomic-level entities are added. In particular, hypertension, a disease related to multiple organs and associated with several phenomic-level entities, is analyzed in detail to investigate how our network facilitates the inference of drug-disease associations. Our results indicate that the inclusion of context information, intercellular associations, and phenomic-level entities can contribute towards a better prediction of drug-disease associations and provide detailed insight into how drugs affect diseases in the human body.
Affiliation(s)
- Hasun Yu
  - Department of Bio and Brain Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea
  - Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, 305-701, Daejeon, Republic of Korea
- Jinmyung Jung
  - Department of Bio and Brain Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea
  - Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, 305-701, Daejeon, Republic of Korea
- Seyeol Yoon
  - Department of Bio and Brain Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea
  - Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, 305-701, Daejeon, Republic of Korea
- Mijin Kwon
  - Department of Bio and Brain Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea
  - Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, 305-701, Daejeon, Republic of Korea
- Sunghwa Bae
  - Department of Bio and Brain Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea
  - Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, 305-701, Daejeon, Republic of Korea
- Soorin Yim
  - Department of Bio and Brain Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea
  - Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, 305-701, Daejeon, Republic of Korea
- Jaehyun Lee
  - Department of Bio and Brain Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea
  - Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, 305-701, Daejeon, Republic of Korea
- Seunghyun Kim
  - Department of Bio and Brain Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea
  - Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, 305-701, Daejeon, Republic of Korea
- Yeeok Kang
  - SD Genomics Co., Ltd., 619 Gaepo-ro, Gangnam-gu, Seoul, Republic of Korea
- Doheon Lee
  - Department of Bio and Brain Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea
  - Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, 305-701, Daejeon, Republic of Korea
8
Zhu H, Dai M, Chen X, Chen X, Qin S, Dai S. Integrated analysis of the potential roles of miRNA‑mRNA networks in triple negative breast cancer. Mol Med Rep 2017. [PMID: 28627677 PMCID: PMC5561991 DOI: 10.3892/mmr.2017.6750] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 02/03/2023] Open
Abstract
Triple negative breast cancer (TNBC) is a type of breast cancer where the tumor cells are negative for the estrogen, progesterone and human epidermal growth factor 2 receptors. To date, expression profiling of microRNA (miRNA/miR) and mRNA sequences has been widely applied for the diagnosis of TNBC. In the present study, an integrated analysis of miRNA‑mRNA profiling arrays was performed. A total of five dysregulated miRNAs in patients with TNBC were identified, including upregulated miR‑558 expression and downregulated miR‑320d‑1, miR‑548v, miR‑99a and miR‑21 expression. In addition, 49 potential target mRNA sequences were identified. Bioinformatics analyses were performed on the identified miRNAs and mRNAs, including gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes pathway and miRNA‑mRNA network analyses. A total of 31 GO terms and three signaling pathways were identified. The results indicated that the differentially expressed miRNAs and their potential target mRNAs may affect the pathogenesis of TNBC, and may therefore be considered promising biomarkers for the early diagnosis and targeted therapy of patients with TNBC.
Affiliation(s)
- Huiru Zhu
  - Department of Galactophore, The Third Affiliated Hospital of Guangxi University of Chinese Medicine, Liuzhou, Guangxi 545001, P.R. China
- Meiyu Dai
  - Department of Clinical Laboratory, The Fourth Affiliated Hospital of Guangxi Medical University, Liuzhou, Guangxi 545005, P.R. China
- Xiaoli Chen
  - Department of Clinical Laboratory, The Fourth Affiliated Hospital of Guangxi Medical University, Liuzhou, Guangxi 545005, P.R. China
- Xiang Chen
  - Department of Clinical Laboratory, The Fourth Affiliated Hospital of Guangxi Medical University, Liuzhou, Guangxi 545005, P.R. China
- Shini Qin
  - Department of Clinical Laboratory, The Fourth Affiliated Hospital of Guangxi Medical University, Liuzhou, Guangxi 545005, P.R. China
- Shengming Dai
  - Department of Clinical Laboratory, The Fourth Affiliated Hospital of Guangxi Medical University, Liuzhou, Guangxi 545005, P.R. China
9
Brbić M, Piškorec M, Vidulin V, Kriško A, Šmuc T, Supek F. The landscape of microbial phenotypic traits and associated genes. Nucleic Acids Res 2016; 44:10074-10090. [PMID: 27915291 PMCID: PMC5137458 DOI: 10.1093/nar/gkw964] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 06/12/2016] [Revised: 09/21/2016] [Accepted: 10/11/2016] [Indexed: 12/31/2022] Open
Abstract
Bacteria and Archaea display a variety of phenotypic traits and can adapt to diverse ecological niches. However, systematic annotation of prokaryotic phenotypes is lacking. We have therefore developed ProTraits, a resource containing ∼545 000 novel phenotype inferences, spanning 424 traits assigned to 3046 bacterial and archaeal species. These annotations were assigned by a computational pipeline that associates microbes with phenotypes by text-mining the scientific literature and the broader World Wide Web, while also being able to define novel concepts from unstructured text. Moreover, the ProTraits pipeline assigns phenotypes by drawing extensively on comparative genomics, capturing patterns in gene repertoires, codon usage biases, proteome composition and co-occurrence in metagenomes. Notably, we find that gene synteny is highly predictive of many phenotypes, and highlight examples of gene neighborhoods associated with spore-forming ability. A global analysis of trait interrelatedness outlined clusters in the microbial phenotype network, suggesting common genetic underpinnings. Our extended set of phenotype annotations allows detection of 57 088 high confidence gene-trait links, which recover many known associations involving sporulation, flagella, catalase activity, aerobicity, photosynthesis and other traits. Over 99% of the commonly occurring gene families are involved in genetic interactions conditional on at least one phenotype, suggesting that epistasis has a major role in shaping microbial gene content.
Affiliation(s)
- Maria Brbić
  - Division of Electronics, Ruder Boskovic Institute, 10000 Zagreb, Croatia
- Matija Piškorec
  - Division of Electronics, Ruder Boskovic Institute, 10000 Zagreb, Croatia
- Vedrana Vidulin
  - Division of Electronics, Ruder Boskovic Institute, 10000 Zagreb, Croatia
- Anita Kriško
  - Mediterranean Institute of Life Sciences, 21000 Split, Croatia
- Tomislav Šmuc
  - Division of Electronics, Ruder Boskovic Institute, 10000 Zagreb, Croatia
- Fran Supek
  - Division of Electronics, Ruder Boskovic Institute, 10000 Zagreb, Croatia
  - EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), 08002 Barcelona, Spain
10
Zhang X, Peng Y, Jin Z, Huang W, Cheng Y, Liu Y, Feng X, Yang M, Huang Y, Zhao Z, Wang L, Wei Y, Fan X, Zheng D, Meltzer SJ. Integrated miRNA profiling and bioinformatics analyses reveal potential causative miRNAs in gastric adenocarcinoma. Oncotarget 2016; 6:32878-89. [PMID: 26460735 PMCID: PMC4741736 DOI: 10.18632/oncotarget.5419] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 06/04/2015] [Accepted: 09/25/2015] [Indexed: 12/16/2022] Open
Abstract
Gastric cancer (GC) is one of the leading causes of cancer-related deaths throughout China and worldwide. The discovery of microRNAs (miRNAs) has provided a new opportunity for developing diagnostic biomarkers and effective therapeutic targets in GC. By performing microarray analyses of benign and malignant gastric epithelial cell lines (HFE145, NCI-N87, MKN28, RF1, KATO III and RF48), 16 significantly dysregulated miRNAs were found. 11 of these were validated by real-time qRT-PCR. Based on miRWalk online database scans, 703 potential mRNA targets of the 16 miRNAs were identified. Bioinformatic analyses suggested that these dysregulated miRNAs and their predicted targets were principally involved in tumor pathogenesis, MAPK signaling, and apoptosis. Finally, miRNA-gene network analyses identified miRNA-125b as a crucial miRNA in GC development. Taken together, these results develop a comprehensive expression and functional profile of differentially expressed miRNAs related to gastric oncogenesis. This profile may serve as a potential tool for biomarker and therapeutic target identification in GC patients.
Affiliation(s)
- Xiaojing Zhang
  - Department of Pathology, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
  - Shenzhen Key Laboratory of Translational Medicine of Tumor, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
- Yin Peng
  - Department of Pathology, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
  - Department of Pathology, Wuhan University School of Basic Medical Sciences, Hubei, People's Republic of China
- Zhe Jin
  - Department of Pathology, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
  - Shenzhen Key Laboratory of Micromolecule Innovatal Drugs, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
  - Shenzhen Key Laboratory of Translational Medicine of Tumor, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
  - Laboratory of Chemical Genomics, School of Chemical Biology and Biotechnology, Peking University Shenzhen Graduate School, Shenzhen, Guangdong, People's Republic of China
- Weiling Huang
  - Department of Pathology, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
- Yulan Cheng
  - Department of Medicine/GI Division, Johns Hopkins University and Sidney Kimmel Cancer Center, Baltimore, MD, USA
- Yudan Liu
  - School of Pharmacy, China Medical University, Shenyang, Liaoning, People's Republic of China
- Xianling Feng
  - Department of Pathology, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
- Mengting Yang
  - Department of Pathology, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
- Yong Huang
  - Department of Pathology, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
- Zhenfu Zhao
  - Department of Pathology, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
- Liang Wang
  - Department of Pathology, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
  - Shenzhen Key Laboratory of Translational Medicine of Tumor, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
- Yanjie Wei
  - Center for High Performance Computing, Shenzhen Institutes of Advanced Technology, Shenzhen, Guangdong, People's Republic of China
- Xinmin Fan
  - Department of Pathology, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
- Duo Zheng
  - Department of Pathology, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
  - Shenzhen Key Laboratory of Translational Medicine of Tumor, The Shenzhen University School of Medicine, Shenzhen, Guangdong, People's Republic of China
- Stephen J Meltzer
  - Department of Medicine/GI Division, Johns Hopkins University and Sidney Kimmel Cancer Center, Baltimore, MD, USA
11
Ananiadou S, Thompson P, Nawaz R, McNaught J, Kell DB. Event-based text mining for biology and functional genomics. Brief Funct Genomics 2015; 14:213-30. [PMID: 24907365 PMCID: PMC4499874 DOI: 10.1093/bfgp/elu015] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 02/07/2023] Open
Abstract
The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of 'events', i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research.
|
12
|
Collier N, Oellrich A, Groza T. Toward knowledge support for analysis and interpretation of complex traits. Genome Biol 2013; 14:214. [PMID: 24079802 PMCID: PMC4053827 DOI: 10.1186/gb-2013-14-9-214] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 12/02/2022] Open
Abstract
The systematic description of complex traits, from the organism to the cellular level, is important for hypothesis generation about underlying disease mechanisms. We discuss how intelligent algorithms might provide support, leading to faster throughput.
|
13
|
Assessment of curated phenotype mining in neuropsychiatric disorder literature. Methods 2014; 74:90-6. [PMID: 25484337 DOI: 10.1016/j.ymeth.2014.11.022] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/25/2014] [Revised: 11/25/2014] [Accepted: 11/27/2014] [Indexed: 12/14/2022] Open
Abstract
Clinical evaluation of patients and diagnosis of disorder is crucial to make decisions on appropriate therapies. In addition, in the case of genetic disorders resulting from gene abnormalities, phenotypic effects may guide basic research on the mechanisms of a disorder to find the mutated gene and therefore to propose novel targets for drug therapy. However, this approach is complicated by two facts. First, the relationship between genes and disorders is not simple: one gene may be related to multiple disorders and a disorder may be caused by mutations in different genes. Second, recognizing relevant phenotypes might be difficult for clinicians working with patients of closely related complex disorders. Neuropsychiatric disorders best illustrate these difficulties since phenotypes range from metabolic to behavioral aspects, the latter extremely complex. Based on our clinical expertise on five neurodegenerative disorders, and from the wealth of bibliographical data on neuropsychiatric disorders, we have built a resource to infer associations between genes, chemicals, phenotypes for a total of 31 disorders. An initial step of automated text mining of the literature related to 31 disorders returned thousands of enriched terms. Fewer relevant phenotypic terms were manually selected by clinicians as relevant to the five neural disorders of their expertise and used to analyze the complete set of disorders. Analysis of the data indicates general relationships between neuropsychiatric disorders, which can be used to classify and characterize them. Correlation analyses allowed us to propose novel associations of genes and drugs with disorders. 
More generally, the results led us to uncovering mechanisms of disease that span multiple neuropsychiatric disorders, for example that genes related to synaptic transmission and receptor functions tend to be involved in many disorders, whereas genes related to sensory perception and channel transport functions are associated with fewer disorders. Our study shows that starting from expertise covering a limited set of neurological disorders and using text and data mining methods, meaningful and novel associations regarding genes, chemicals and phenotypes can be derived for an expanded set of neuropsychiatric disorders. Our results are intended for clinicians to help them evaluate patients, and for basic scientists to propose new gene targets for drug therapies. This strategy can be extended to virtually all diseases and takes advantage of the ever increasing amount of biomedical literature.
|
14
|
Lee YJ, Boyd AD, Li JJ, Gardeux V, Kenost C, Saner D, Li H, Abraham I, Krishnan JA, Lussier YA. COPD Hospitalization Risk Increased with Distinct Patterns of Multiple Systems Comorbidities Unveiled by Network Modeling. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2014; 2014:855-64. [PMID: 25954392 PMCID: PMC4419951] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
Earlier studies on hospitalization risk are largely based on regression models. To our knowledge, network modeling of multiple comorbidities is novel and inherently enables multidimensional scoring and unbiased feature reduction. Network modeling was conducted using an independent validation design starting from 38,695 patients, 1,446,581 visits, and 430 distinct clinical facilities/hospitals. Odds ratios (OR) were calculated for every pair of comorbidities using patient counts, and their tendency was compared with hospitalization rates and ED visits. Network topology analyses were performed, defining significant comorbidity associations as having OR≥5 and False-Discovery-Rate≤10⁻⁷. Four COPD-associated comorbidity sub-networks emerged, incorporating multiple clinical systems: (i) metabolic syndrome, (ii) substance abuse and mental disorder, (iii) pregnancy-associated conditions, and (iv) fall-related injury. The latter two have not been reported yet. Features prioritized from the network are predictive of hospitalizations in an independent set (p<0.004). Therefore, we suggest that network topology is a scalable and generalizable method predictive of hospitalization.
Affiliation(s)
- Young Ji Lee
- Department of Medicine, University of Illinois at Chicago, Chicago, IL
- Andrew D Boyd
- Institute for Translational Health Informatics, University of Illinois at Chicago, Chicago, IL; Departments of Biomedical and Health Information Sciences, University of Illinois at Chicago, Chicago, IL; University of Illinois Hospital and Health Science System, University of Illinois at Chicago, Chicago, IL
- Jianrong John Li
- Department of Medicine, The University of Arizona, Tucson, AZ, USA
- Vincent Gardeux
- Department of Medicine, The University of Arizona, Tucson, AZ, USA
- Colleen Kenost
- Department of Medicine, The University of Arizona, Tucson, AZ, USA; Biomedical Informatics Service Group, Arizona Health Science Center, The University of Arizona, Tucson, AZ, USA
- Don Saner
- Cancer Center, The University of Arizona, Tucson, AZ, USA; Biomedical Informatics Service Group, Arizona Health Science Center, The University of Arizona, Tucson, AZ, USA
- Haiquan Li
- Department of Medicine, The University of Arizona, Tucson, AZ, USA
- Ivo Abraham
- Department of Pharmacy Practice and Science, The University of Arizona, Tucson, AZ, USA
- Jerry A Krishnan
- Department of Medicine, University of Illinois at Chicago, Chicago, IL; University of Illinois Hospital and Health Science System, University of Illinois at Chicago, Chicago, IL
- Yves A Lussier
- Department of Medicine, The University of Arizona, Tucson, AZ, USA; Cancer Center, The University of Arizona, Tucson, AZ, USA; Biomedical Informatics Service Group, Arizona Health Science Center, The University of Arizona, Tucson, AZ, USA; Interdisciplinary Program in Statistics, The University of Arizona, Tucson, AZ, USA
|
15
|
McGuire MF, Iyengar MS, Mercer DW. Computational approaches for translational clinical research in disease progression. J Investig Med 2011; 59:893-903. [PMID: 21712727 DOI: 10.2310/jim.0b013e318224d8cc] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 01/18/2023]
Abstract
Today, there is an ever-increasing amount of biological and clinical data available that could be used to enhance a systems-based understanding of disease progression through innovative computational analysis. In this article, we review a selection of published research regarding computational methods, primarily from systems biology, which support translational research from the molecular level to the bedside, with a focus on applications in trauma and critical care. Trauma is the leading cause of mortality in Americans younger than 45 years, and its rapid progression offers both opportunities and challenges for computational analysis of trends in molecular patterns associated with outcomes and therapeutic interventions. This review presents methods and domain-specific examples that may inspire the development of new algorithms and computational methods that use both molecular and clinical data for diagnosis, prognosis, and therapy in disease progression.
Affiliation(s)
- Mary F McGuire
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, TX 77030, USA.
|
16
|
McGuire MF, Iyengar MS, Mercer DW. Computational approaches for translational clinical research in disease progression. J Investig Med 2011; 59. [PMID: 21712727 PMCID: PMC3196807 DOI: 10.2310/jim.0b013e318224d8cc] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/17/2022]
Abstract
Today, there is an ever-increasing amount of biological and clinical data available that could be used to enhance a systems-based understanding of disease progression through innovative computational analysis. In this article, we review a selection of published research regarding computational methods, primarily from systems biology, which support translational research from the molecular level to the bedside, with a focus on applications in trauma and critical care. Trauma is the leading cause of mortality in Americans younger than 45 years, and its rapid progression offers both opportunities and challenges for computational analysis of trends in molecular patterns associated with outcomes and therapeutic interventions. This review presents methods and domain-specific examples that may inspire the development of new algorithms and computational methods that use both molecular and clinical data for diagnosis, prognosis, and therapy in disease progression.
Affiliation(s)
- Mary F. McGuire
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston TX USA. Contact: Mary F. McGuire, School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin, #600, Houston, TX 77030 USA, 1-832-364-6734
- M. Sriram Iyengar
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston TX USA
- David W. Mercer
- Department of Surgery, University of Nebraska Medical Center, Omaha NE USA
|
17
|
Shen E, Diao X, Wang X, Chen R, Hu B. MicroRNAs involved in the mitogen-activated protein kinase cascades pathway during glucose-induced cardiomyocyte hypertrophy. The American Journal of Pathology 2011; 179:639-50. [PMID: 21704010 DOI: 10.1016/j.ajpath.2011.04.034] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.8] [Received: 12/21/2010] [Revised: 04/05/2011] [Accepted: 04/26/2011] [Indexed: 01/12/2023]
Abstract
Cardiac hypertrophy is a key structural feature of diabetic cardiomyopathy in the late stage of diabetes. Recent studies show that microRNAs (miRNAs) are involved in the pathogenesis of cardiac hypertrophy in diabetic mice, but more novel miRNAs remain to be investigated. In this study, diabetic cardiomyopathy, characterized by hypertrophy, was induced in mice by streptozotocin injection. Using microarray analysis of myocardial tissue, we were able to identify changes in expression in 19 miRNA, of which 16 miRNAs were further validated by real-time PCR and a total of 3212 targets mRNA were predicted. Further analysis showed that 31 GO functions and 16 KEGG pathways were enriched in the diabetic heart. Of these, MAPK signaling pathway was prominent. In vivo and in vitro studies have confirmed that three major subgroups of MAPK including ERK1/2, JNK, and p38, are specifically upregulated in cardiomyocyte hypertrophy during hyperglycemia. To further explore the potential involvement of miRNAs in the regulation of glucose-induced cardiomyocyte hypertrophy, neonatal rat cardiomyocytes were exposed to high glucose and transfected with miR-373 mimic. Overexpression of miR-373 decreased the cell size, and also reduced the level of its target gene MEF2C, and miR-373 expression was regulated by p38. Our data highlight an important role of miRNAs in diabetic cardiomyopathy, and implicate the reliability of bioinformatics analysis in shedding light on the mechanisms underlying diabetic cardiomyopathy.
Affiliation(s)
- E Shen
- Department of Ultrasound in Medicine, Shanghai Jiaotong University Affiliated 6th People's Hospital, Shanghai, China
|
18
|
Holmes AB, Hawson A, Liu F, Friedman C, Khiabanian H, Rabadan R. Discovering disease associations by integrating electronic clinical data and medical literature. PLoS One 2011; 6:e21132. [PMID: 21731656 PMCID: PMC3121722 DOI: 10.1371/journal.pone.0021132] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Received: 02/01/2011] [Accepted: 05/20/2011] [Indexed: 11/25/2022] Open
Abstract
Electronic health record (EHR) systems offer an exceptional opportunity for studying many diseases and their associated medical conditions within a population. The increasing number of clinical record entries that have become available electronically provides access to rich, large sets of patients' longitudinal medical information. By integrating and comparing relations found in the EHRs with those already reported in the literature, we are able to verify existing and to identify rare or novel associations. Of particular interest is the identification of rare disease co-morbidities, where the small numbers of diagnosed patients make robust statistical analysis difficult. Here, we introduce ADAMS, an Application for Discovering Disease Associations using Multiple Sources, which contains various statistical and language processing operations. We apply ADAMS to the New York-Presbyterian Hospital's EHR to combine the information from the relational diagnosis tables and textual discharge summaries with those from PubMed and Wikipedia in order to investigate the co-morbidities of the rare diseases Kaposi sarcoma, toxoplasmosis, and Kawasaki disease. In addition to finding well-known characteristics of diseases, ADAMS can identify rare or previously unreported associations. In particular, we report a statistically significant association between Kawasaki disease and diagnosis of autistic disorder.
Affiliation(s)
- Antony B. Holmes
- Department of Biomedical Informatics, Columbia University College of Physicians and Surgeons, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University College of Physicians and Surgeons, New York, New York, United States of America
- Alexander Hawson
- Center for Computational Biology and Bioinformatics, Columbia University College of Physicians and Surgeons, New York, New York, United States of America
- Department of Medicine, Columbia University College of Physicians and Surgeons, New York, New York, United States of America
- Feng Liu
- Department of Biomedical Informatics, Columbia University College of Physicians and Surgeons, New York, New York, United States of America
- Carol Friedman
- Department of Biomedical Informatics, Columbia University College of Physicians and Surgeons, New York, New York, United States of America
- Hossein Khiabanian
- Department of Biomedical Informatics, Columbia University College of Physicians and Surgeons, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University College of Physicians and Surgeons, New York, New York, United States of America
- Raul Rabadan
- Department of Biomedical Informatics, Columbia University College of Physicians and Surgeons, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University College of Physicians and Surgeons, New York, New York, United States of America
|
19
|
Abstract
Prioritization of most likely etiological genes entails predicting and defining a set of characteristics that are most likely to fit the underlying disease gene and scoring candidates according to their fit to this "perfect disease gene" profile. This requires a full understanding of the disease phenotype, characteristics, and any available data on the underlying genetics of the disease. Public databases provide enormous and ever-growing amounts of information that can be relevant to the prioritization of etiological genes. Computational approaches allow this information to be retrieved in an automated and exhaustive way and can therefore facilitate the comprehensive mining of this information, including its combination with sets of empirically generated data, in the process of identifying most likely candidate disease genes.
Affiliation(s)
- Nicki Tiffin
- The South African National Bioinformatics Institute, University of the Western Cape, 7925, Bellville, Cape Town, South Africa.
|
20
|
Ananiadou S, Pyysalo S, Tsujii J, Kell DB. Event extraction for systems biology by text mining the literature. Trends Biotechnol 2010; 28:381-90. [PMID: 20570001 DOI: 10.1016/j.tibtech.2010.04.005] [Citation(s) in RCA: 101] [Impact Index Per Article: 6.7] [Received: 01/22/2010] [Revised: 04/20/2010] [Accepted: 04/26/2010] [Indexed: 01/08/2023]
Abstract
Systems biology recognizes in particular the importance of interactions between biological components and the consequences of these interactions. Such interactions and their downstream effects are known as events. To computationally mine the literature for such events, text mining methods that can detect, extract and annotate them are required. This review summarizes the methods that are currently available, with a specific focus on protein-protein interactions and pathway or network reconstruction. The approaches described will be of considerable value in associating particular pathways and their components with higher-order physiological properties, including disease states.
|
21
|
Borlawsky TB, Li J, Shagina L, Crowson MG, Liu Y, Friedman C, Lussier YA. Evaluation of an Ontology-anchored Natural Language-based Approach for Asserting Multi-scale Biomolecular Networks for Systems Medicine. Summit on Translational Bioinformatics 2010; 2010:6-10. [PMID: 21347135 PMCID: PMC3041541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
The ability to adequately and efficiently integrate unstructured, heterogeneous datasets, which are incumbent to systems biology and medicine, is one of the primary limitations to their comprehensive analysis. Natural language processing (NLP) and biomedical ontologies are automated methods for capturing, standardizing and integrating information across diverse sources, including narrative text. We have utilized the BioMedLEE NLP system to extract and encode, using standard ontologies (e.g., Cell Type Ontology, Mammalian Phenotype, Gene Ontology), biomolecular mechanisms and clinical phenotypes from the scientific literature. We subsequently applied semantic processing techniques to the structured BioMedLEE output to determine the relationships between these biomolecular and clinical phenotype concepts. We conducted an evaluation that shows an average precision and recall of BioMedLEE with respect to annotating phrases comprised of cell type, anatomy/disease, and gene/protein concepts were 86% and 78%, respectively. The precision of the asserted phenotype-molecular relationships was 75%.
Affiliation(s)
- Tara B. Borlawsky
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH
- Jianrong Li
- Center for Biomedical Informatics, Dept. of Medicine, The University of Chicago, IL
- Lyudmila Shagina
- Department of Biomedical Informatics, Columbia University, New York, NY
- Matthew G. Crowson
- Center for Biomedical Informatics, Dept. of Medicine, The University of Chicago, IL
- Yang Liu
- Center for Biomedical Informatics, Dept. of Medicine, The University of Chicago, IL
- Carol Friedman
- Department of Biomedical Informatics, Columbia University, New York, NY (corresponding author)
- Yves A. Lussier
- Center for Biomedical Informatics, Dept. of Medicine, The University of Chicago, IL (corresponding author)
|
22
|
Baker CJO, Rebholz-Schuhmann D. Between proteins and phenotypes: annotation and interpretation of mutations. BMC Bioinformatics 2009; 10 Suppl 8:I1. [PMID: 19758463 PMCID: PMC2745581 DOI: 10.1186/1471-2105-10-s8-i1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022] Open
|
23
|
Tiffin N, Andrade-Navarro MA, Perez-Iratxeta C. Linking genes to diseases: it's all in the data. Genome Med 2009; 1:77. [PMID: 19678910 PMCID: PMC2768963 DOI: 10.1186/gm77] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Indexed: 11/10/2022] Open
Abstract
Genome-wide association analyses on large patient cohorts are generating large sets of candidate disease genes. This is coupled with the availability of ever-increasing genomic databases and a rapidly expanding repository of biomedical literature. Computational approaches to disease-gene association attempt to harness these data sources to identify the most likely disease gene candidates for further empirical analysis by translational researchers, resulting in efficient identification of genes of diagnostic, prognostic and therapeutic value. Existing computational methods analyze gene structure and sequence, functional annotation of candidate genes, characteristics of known disease genes, gene regulatory networks, protein-protein interactions, data from animal models and disease phenotype. To date, a few studies have successfully applied computational analysis of clinical phenotype data for specific diseases and shown genetic associations. In the near future, computational strategies will be facilitated by improved integration of clinical and computational research, and by increased availability of clinical phenotype data in a format accessible to computational approaches.
Affiliation(s)
- Nicki Tiffin
- MRC/UWC/SANBI Bioinformatics Capacity Development Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville 7535, South Africa.
|
24
|
Butte AJ, Sarkar IN, Ramoni M, Lussier Y, Troyanskaya O. Selected proceedings of the First Summit on Translational Bioinformatics 2008. BMC Bioinformatics 2009; 10 Suppl 2:I1. [PMID: 19208183 PMCID: PMC2646246 DOI: 10.1186/1471-2105-10-s2-i1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
|