1
Riveros-Gomez I, Vasquez-Marin J, Huerta-Garcia EX, Camargo-Ayala PA, Rivera C. Aphthous stomatitis - computational biology suggests external biotic stimulus and immunogenic cell death involved. BMC Oral Health 2024; 24:1154. [PMID: 39343890 PMCID: PMC11440928 DOI: 10.1186/s12903-024-04917-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Accepted: 09/16/2024] [Indexed: 10/01/2024]
Abstract
BACKGROUND The exact cause of recurrent aphthous stomatitis is still unknown, making it a challenge to develop effective treatments. This study employs computational biology to investigate the molecular basis of recurrent aphthous stomatitis, aiming to identify the nature of the stimuli triggering these ulcers and the type of cell death involved. METHODS To understand the molecular underpinnings of recurrent aphthous stomatitis, we used the Génie tool for gene identification, targeting those associated with cell death in recurrent aphthous stomatitis. The ToppGene Suite was employed for functional enrichment analysis. We also used Reactome and InteractiVenn for protein integration and prioritization against a PANoptosis gene list, enabling the construction of a protein-protein interaction network to pinpoint key proteins in recurrent aphthous stomatitis pathogenesis. RESULTS The study's computational approach identified 1,375 protein-coding genes linked to recurrent aphthous stomatitis. Critical among these were proteins responsive to bacterial stimuli, especially high mobility group protein B1 (HMGB1), toll-like receptor 2 (TLR2), and toll-like receptor 4 (TLR4). The enrichment analysis suggested an external biotic factor, likely bacterial, as a triggering agent in recurrent aphthous stomatitis. The protein interaction network highlighted the roles of tumor necrosis factor (TNF), NF-kappa-B essential modulator (IKBKG), and tumor necrosis factor receptor superfamily member 1A (TNFRSF1A), indicating an immunogenic cell death mechanism, potentially PANoptosis, in recurrent aphthous stomatitis. CONCLUSION The findings propose that bacterial stimuli could trigger recurrent aphthous stomatitis through a PANoptosis-related cell death pathway. This new understanding of recurrent aphthous stomatitis pathogenesis underscores the significance of oral microbiota in the condition. 
Future experimental validation and therapeutic strategy development based on these findings are necessary.
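The prioritization step described in this abstract, comparing a disease-associated gene list against a PANoptosis gene list, amounts to a set intersection. A minimal Python sketch, assuming both lists are available as plain gene symbols; the symbols and the function name `prioritize` are hypothetical placeholders, not data or code from the study:

```python
def prioritize(disease_genes, reference_genes):
    """Return the genes present in both lists, sorted for stable output."""
    return sorted(set(disease_genes) & set(reference_genes))


# Hypothetical placeholder symbols, for illustration only.
disease_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
panoptosis_genes = ["GENE_C", "GENE_X", "GENE_A"]

shared = prioritize(disease_genes, panoptosis_genes)
print(shared)  # ['GENE_A', 'GENE_C']
```

The shared genes would then be the candidates carried forward into a protein-protein interaction network.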
Affiliation(s)
- Ignacio Riveros-Gomez
- Laboratorio de Histopatología Oral y Maxilofacial, Unidad de Medicina Oral y Patología Oral, Departamento de Estomatología, Facultad de Odontología, Universidad de Talca, Avenida Lircay S/N, Campus Norte Universidad de Talca, Edificio de Ciencias Biomédicas, Oficina N°4, Talca, 3460000, Región del Maule, Chile
- Joaquin Vasquez-Marin
- Laboratorio de Histopatología Oral y Maxilofacial, Unidad de Medicina Oral y Patología Oral, Departamento de Estomatología, Facultad de Odontología, Universidad de Talca, Avenida Lircay S/N, Campus Norte Universidad de Talca, Edificio de Ciencias Biomédicas, Oficina N°4, Talca, 3460000, Región del Maule, Chile
- Elisa Ximena Huerta-Garcia
- Laboratorio de Histopatología Oral y Maxilofacial, Unidad de Medicina Oral y Patología Oral, Departamento de Estomatología, Facultad de Odontología, Universidad de Talca, Avenida Lircay S/N, Campus Norte Universidad de Talca, Edificio de Ciencias Biomédicas, Oficina N°4, Talca, 3460000, Región del Maule, Chile
- Paola Andrea Camargo-Ayala
- Laboratorio de Histopatología Oral y Maxilofacial, Unidad de Medicina Oral y Patología Oral, Departamento de Estomatología, Facultad de Odontología, Universidad de Talca, Avenida Lircay S/N, Campus Norte Universidad de Talca, Edificio de Ciencias Biomédicas, Oficina N°4, Talca, 3460000, Región del Maule, Chile
- Cesar Rivera
- Laboratorio de Histopatología Oral y Maxilofacial, Unidad de Medicina Oral y Patología Oral, Departamento de Estomatología, Facultad de Odontología, Universidad de Talca, Avenida Lircay S/N, Campus Norte Universidad de Talca, Edificio de Ciencias Biomédicas, Oficina N°4, Talca, 3460000, Región del Maule, Chile
2
Luo ZH, Zhu LD, Wang YM, Hu Qian S, Li M, Zhang W, Chen ZX. DSEATM: drug set enrichment analysis uncovering disease mechanisms by biomedical text mining. Brief Bioinform 2022; 23:6605028. [PMID: 35679594 DOI: 10.1093/bib/bbac228] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/02/2022] [Revised: 05/09/2022] [Accepted: 05/17/2022] [Indexed: 11/13/2022]
Abstract
Disease pathogenesis is always a major topic in biomedical research. With the exponential growth of biomedical information, drug effect analysis for specific phenotypes has shown great promise in uncovering disease-associated pathways. However, this method has only been applied to a limited number of drugs. Here, we extracted the data of 4634 diseases, 3671 drugs, 112 809 disease-drug associations and 81 527 drug-gene associations by text mining of 29 168 919 publications. On this basis, we proposed a 'Drug Set Enrichment Analysis by Text Mining (DSEATM)' pipeline and applied it to 3250 diseases, where it outperformed the state-of-the-art method. Furthermore, disease pathways enriched by DSEATM were similar to those obtained using the TCGA cancer RNA-seq differentially expressed genes. In addition, the drug number, which showed a remarkable positive correlation of 0.73 with the AUC, plays a determining role in the performance of DSEATM. Taken together, DSEATM is an auspicious and accurate disease research tool that offers fresh insights.
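The abstract does not state which statistic DSEATM uses. As a stand-in, set enrichment is often tested with a one-sided hypergeometric (over-representation) test. The sketch below assumes that test; the function name `enrichment_p_value` and the toy counts are hypothetical and not taken from the paper:

```python
from math import comb


def enrichment_p_value(overlap, population, annotated, drawn):
    """One-sided hypergeometric tail: P(X >= overlap).

    population: total number of items (e.g. all drugs considered)
    annotated:  items linked to the pathway of interest
    drawn:      items linked to the disease of interest
    overlap:    items linked to both
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    favorable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / total


# Toy counts (assumed): 20 drugs, 5 pathway-linked, 4 disease-linked, 2 shared.
p = enrichment_p_value(overlap=2, population=20, annotated=5, drawn=4)
print(p)
```

A small p-value would suggest that the disease-linked drugs hit the pathway more often than chance alone would explain.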
Affiliation(s)
- Zhi-Hui Luo
- Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
- Li-Da Zhu
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
- Ya-Min Wang
- Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
- Sheng Hu Qian
- Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
- Menglu Li
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
- Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
- Zhen-Xia Chen
- Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
3
Multi-Omic Meta-Analysis of Transcriptomes and the Bibliome Uncovers Novel Hypoxia-Inducible Genes. Biomedicines 2021; 9:582. [PMID: 34065451 PMCID: PMC8160971 DOI: 10.3390/biomedicines9050582] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/30/2021] [Revised: 05/15/2021] [Accepted: 05/19/2021] [Indexed: 12/30/2022]
Abstract
Hypoxia is a condition in which cells, tissues, or organisms are deprived of sufficient oxygen supply. Aerobic organisms have a hypoxic response system, represented by hypoxia-inducible factor 1-α (HIF1A), to adapt to this condition. Due to publication bias, there has been little focus on genes other than well-known signature hypoxia-inducible genes. Therefore, in this study, we performed a meta-analysis to identify novel hypoxia-inducible genes. We searched publicly available transcriptome databases to obtain hypoxia-related experimental data, retrieved the metadata, and manually curated it. We selected the genes that are differentially expressed by hypoxic stimulation, and evaluated their relevance in hypoxia by performing enrichment analyses. Next, we performed a bibliometric analysis using gene2pubmed data to examine genes that have not been well studied in relation to hypoxia. Gene2pubmed data provides information about the relationship between genes and publications. We calculated and evaluated the number of reports and similarity coefficients of each gene to HIF1A, which is a representative gene in hypoxia studies. In this data-driven study, we report that several genes that were not known to be associated with hypoxia, including the G protein-coupled receptor 146 gene, are upregulated by hypoxic stimulation.
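The bibliometric step, scoring each gene by how much its publication set overlaps with that of HIF1A, can be sketched as follows. The abstract does not name the similarity coefficient, so the Jaccard index is used here as an assumption. The `(gene, pmid)` rows are made-up toy data, and gene symbols stand in for the numeric gene identifiers that gene2pubmed actually uses:

```python
from collections import defaultdict


def build_gene_to_pmids(rows):
    """Group (gene, pmid) pairs into a gene -> set-of-PMIDs mapping."""
    gene_to_pmids = defaultdict(set)
    for gene, pmid in rows:
        gene_to_pmids[gene].add(pmid)
    return gene_to_pmids


def jaccard(a, b):
    """Jaccard index of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy rows (assumed): PMIDs 1-9 are placeholders, not real publications.
rows = [
    ("HIF1A", 1), ("HIF1A", 2), ("HIF1A", 3),
    ("VEGFA", 2), ("VEGFA", 3), ("VEGFA", 4),
    ("GPR146", 9),
]
pubs = build_gene_to_pmids(rows)
scores = {
    gene: jaccard(pmids, pubs["HIF1A"])
    for gene, pmids in pubs.items()
    if gene != "HIF1A"
}
print(scores)  # {'VEGFA': 0.5, 'GPR146': 0.0}
```

In this toy setup, a gene that is differentially expressed under hypoxia but has a low score would be a candidate "understudied" gene.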
4
Cui H, Zuo S, Liu Z, Liu H, Wang J, You T, Zheng Z, Zhou Y, Qian X, Yao H, Xie L, Liu T, Sham PC, Yu Y, Li MJ. The support of genetic evidence for cardiovascular risk induced by antineoplastic drugs. Science Advances 2020; 6:eabb8543. [PMID: 33055159 PMCID: PMC7556838 DOI: 10.1126/sciadv.abb8543] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/23/2020] [Accepted: 08/28/2020] [Indexed: 05/04/2023]
Abstract
Cardiovascular dysfunction is one of the most common complications of long-term cancer treatment. Growing evidence has shown that antineoplastic drugs can increase cardiovascular risk during cancer therapy, seriously affecting patient survival. However, little is known about the genetic factors associated with the cardiovascular risk of antineoplastic drugs. We established a compendium of genetic evidence that supports cardiovascular risk induced by antineoplastic drugs. Most of this genetic evidence is attributed to causal alleles altering the expression of cardiovascular disease genes. We found that antineoplastic drugs predicted to induce cardiovascular risk are significantly enriched in drugs associated with cardiovascular adverse reactions, including many first-line cancer treatments. Functional experiments validated that retinoid X receptor agonists can reduce triglyceride lipolysis, thus modulating cardiovascular risk. Our results establish a link between the causal allele of cardiovascular disease genes and the direction of pharmacological modulation, which could facilitate cancer drug discovery and clinical trial design.
Affiliation(s)
- Hui Cui
- School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Key Laboratory of Food Safety Research, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute for Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Shengkai Zuo
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Zipeng Liu
- Centre for PanorOmic Sciences-Genomics and Bioinformatics Cores, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Huanhuan Liu
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Jianhua Wang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Tianyi You
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Zhanye Zheng
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Yao Zhou
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Xinyi Qian
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Hongcheng Yao
- Centre for PanorOmic Sciences-Genomics and Bioinformatics Cores, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Lu Xie
- Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai, China
- Tong Liu
- Tianjin Key Laboratory of Ionic-Molecular Function of Cardiovascular Disease, Department of Cardiology, Tianjin Institute of Cardiology, Second Hospital of Tianjin Medical University, Tianjin, China
- Pak Chung Sham
- Centre for PanorOmic Sciences-Genomics and Bioinformatics Cores, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Ying Yu
- School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Key Laboratory of Food Safety Research, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute for Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Mulin Jun Li
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
5
Oral lichen planus interactome reveals CXCR4 and CXCL12 as candidate therapeutic targets. Sci Rep 2020; 10:5454. [PMID: 32214134 PMCID: PMC7096434 DOI: 10.1038/s41598-020-62258-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/12/2019] [Accepted: 03/12/2020] [Indexed: 01/03/2023]
Abstract
Today, we face difficulty in generating new hypotheses and understanding oral lichen planus due to the large amount of biomedical information available. In this research, we have used an integrated bioinformatics approach assimilating information from data mining, gene ontologies, protein–protein interaction and network analysis to predict candidate genes related to oral lichen planus. A detailed pathway analysis led us to propose two promising therapeutic targets: the stromal cell derived factor 1 (CXCL12) and the C-X-C type 4 chemokine receptor (CXCR4). We further validated our predictions and found that CXCR4 was upregulated in all oral lichen planus tissue samples. Our bioinformatics data cumulatively support the pathological role of chemokines and chemokine receptors in oral lichen planus. From a clinical perspective, we suggest a drug (plerixafor) and two therapeutic targets for future research.
6
Multidimensional informatic deconvolution defines gender-specific roles of hypothalamic GIT2 in aging trajectories. Mech Ageing Dev 2019; 184:111150. [PMID: 31574270 DOI: 10.1016/j.mad.2019.111150] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/10/2019] [Revised: 08/20/2019] [Accepted: 09/26/2019] [Indexed: 12/13/2022]
Abstract
In most species, females live longer than males. An understanding of this female longevity advantage will likely uncover novel anti-aging therapeutic targets. Here we investigated the transcriptomic responses in the hypothalamus - a key organ for somatic aging control - to the introduction of a simple aging-related molecular perturbation, i.e. GIT2 heterozygosity. Our previous work has demonstrated that GIT2 acts as a network controller of aging. A similar number of both total (1079-female, 1006-male) and gender-unique (577-female, 527-male) transcripts were significantly altered in response to GIT2 heterozygosity in early life-stage (2 month-old) mice. Despite a similar volume of transcriptomic disruption in females and males, a considerably stronger dataset coherency and functional annotation representation was observed for females. It was also evident that female mice possessed a greater resilience to pro-aging signaling pathways compared to males. Using a highly data-dependent natural language processing informatics pipeline, we identified novel functional data clusters that were connected by a coherent group of multifunctional transcripts. From these it was clear that females prioritized metabolic activity preservation compared to males to mitigate this pro-aging perturbation. These findings were corroborated by somatic metabolism analyses of living animals, demonstrating the efficacy of our new informatics pipeline.
7
Thilakaratne M, Falkner K, Atapattu T. A systematic review on literature-based discovery workflow. PeerJ Comput Sci 2019; 5:e235. [PMID: 33816888 PMCID: PMC7924697 DOI: 10.7717/peerj-cs.235] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 07/08/2019] [Accepted: 10/17/2019] [Indexed: 05/02/2023]
Abstract
As scientific publication rates increase, knowledge acquisition and the research development process have become more complex and time-consuming. Literature-Based Discovery (LBD), which supports automated knowledge discovery, facilitates this process by eliciting novel knowledge from the analysis of existing scientific literature. This systematic review provides a comprehensive overview of the LBD workflow by answering nine research questions related to the major components of the LBD workflow (i.e., input, process, output, and evaluation). With regard to the input component, we discuss the data types and data sources used in the literature. The process component presents filtering techniques, ranking/thresholding techniques, domains, generalisability levels, and resources. Subsequently, the output component focuses on the visualisation techniques used in the LBD discipline. As for the evaluation component, we outline the evaluation techniques, their generalisability, and the quantitative measures used to validate results. To conclude, we summarise the findings of the review for each component by highlighting the possible future research directions.
Affiliation(s)
- Menasha Thilakaratne
- Faculty of Engineering, Computer and Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Katrina Falkner
- Faculty of Engineering, Computer and Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Thushari Atapattu
- Faculty of Engineering, Computer and Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
8
Luo P, Tian LP, Ruan J, Wu FX. Disease Gene Prediction by Integrating PPI Networks, Clinical RNA-Seq Data and OMIM Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:222-232. [PMID: 29990218 DOI: 10.1109/tcbb.2017.2770120] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 06/08/2023]
Abstract
Disease gene prediction is a challenging task that has a variety of applications such as early diagnosis and drug development. The existing machine learning methods suffer from the imbalanced sample issue because the number of known disease genes (positive samples) is much smaller than that of unknown genes, which are typically considered to be negative samples. In addition, most methods have not utilized clinical data from patients with a specific disease to predict disease genes. In this study, we propose a disease gene prediction algorithm (called dgSeq) that combines protein-protein interaction (PPI) networks, clinical RNA-Seq data, and Online Mendelian Inheritance in Man (OMIM) data. Our dgSeq constructs differential networks based on rewiring information calculated from clinical RNA-Seq data. To select balanced sets of non-disease genes (negative samples), a disease-gene network is also constructed from OMIM data. After features are extracted from the PPI networks and differential networks, logistic regression classifiers are trained. Our dgSeq obtains AUC values of 0.88, 0.83, and 0.80 for identifying breast cancer genes, thyroid cancer genes, and Alzheimer's disease genes, respectively, which indicates its superiority to three other competing methods. Both gene set enrichment analysis and predicted results demonstrate that dgSeq can effectively predict new disease genes.
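The AUC values reported above summarize how well a classifier ranks known disease genes above non-disease genes. A minimal sketch of that metric, computed as the fraction of positive/negative pairs ranked correctly (ties count as half); the labels and scores are toy values, and this is not the dgSeq implementation:

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (assumed): 1 = known disease gene, 0 = sampled non-disease gene.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

Because the metric depends only on the ranking of positives against negatives, it is less sensitive to class imbalance than plain accuracy, which is one reason it suits this kind of task.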
9
Martin B, Wang R, Cong WN, Daimon CM, Wu WW, Ni B, Becker KG, Lehrmann E, Wood WH, Zhang Y, Etienne H, van Gastel J, Azmi A, Janssens J, Maudsley S. Altered learning, memory, and social behavior in type 1 taste receptor subunit 3 knock-out mice are associated with neuronal dysfunction. J Biol Chem 2017; 292:11508-11530. [PMID: 28522608 DOI: 10.1074/jbc.m116.773820] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 12/22/2016] [Revised: 05/03/2017] [Indexed: 12/19/2022]
Abstract
The type 1 taste receptor member 3 (T1R3) is a G protein-coupled receptor involved in sweet-taste perception. Besides the tongue, the T1R3 receptor is highly expressed in brain areas implicated in cognition, including the hippocampus and cortex. As cognitive decline is often preceded by significant metabolic or endocrinological dysfunctions regulated by the sweet-taste perception system, we hypothesized that a disruption of the sweet-taste perception in the brain could have a key role in the development of cognitive dysfunction. To assess the importance of the sweet-taste receptors in the brain, we conducted transcriptomic and proteomic analyses of cortical and hippocampal tissues isolated from T1R3 knock-out (T1R3KO) mice. The effect of an impaired sweet-taste perception system on cognitive functions was examined by analyzing synaptic integrity and performing behavioral tests on T1R3KO mice. Although T1R3KO mice did not present a metabolically disrupted phenotype, bioinformatic interpretation of the high-dimensionality data indicated a strong neurodegenerative signature associated with significant alterations in pathways involved in neuritogenesis, dendritic growth, and synaptogenesis. Furthermore, a significantly reduced dendritic spine density was observed in T1R3KO mice together with alterations in learning and memory functions as well as sociability deficits. Taken together, our data suggest that the sweet-taste receptor system plays an important neurotrophic role in the extralingual central nervous tissue that underpins synaptic function, memory acquisition, and social behavior.
Affiliation(s)
- Bronwen Martin
- From the Metabolism Unit, NIA, National Institutes of Health, Baltimore, Maryland 21224
- Rui Wang
- From the Metabolism Unit, NIA, National Institutes of Health, Baltimore, Maryland 21224
- Wei-Na Cong
- From the Metabolism Unit, NIA, National Institutes of Health, Baltimore, Maryland 21224
- Caitlin M Daimon
- From the Metabolism Unit, NIA, National Institutes of Health, Baltimore, Maryland 21224
- Wells W Wu
- From the Metabolism Unit, NIA, National Institutes of Health, Baltimore, Maryland 21224
- Bin Ni
- the Receptor Pharmacology Unit, NIA, National Institutes of Health, Baltimore, Maryland 21224
- Kevin G Becker
- the Gene Expression and Genomics Unit, NIA, National Institutes of Health, Baltimore, Maryland 21224
- Elin Lehrmann
- the Gene Expression and Genomics Unit, NIA, National Institutes of Health, Baltimore, Maryland 21224
- William H Wood
- the Gene Expression and Genomics Unit, NIA, National Institutes of Health, Baltimore, Maryland 21224
- Yongqing Zhang
- the Gene Expression and Genomics Unit, NIA, National Institutes of Health, Baltimore, Maryland 21224
- Harmonie Etienne
- the Translational Neurobiology Group, VIB Department of Molecular Genetics, University of Antwerp, AN-2610 Antwerp, Belgium, and the Department of Biomedical Sciences, University of Antwerp, AN-2610 Antwerp, Belgium
- Jaana van Gastel
- the Translational Neurobiology Group, VIB Department of Molecular Genetics, University of Antwerp, AN-2610 Antwerp, Belgium, and the Department of Biomedical Sciences, University of Antwerp, AN-2610 Antwerp, Belgium
- Abdelkrim Azmi
- the Translational Neurobiology Group, VIB Department of Molecular Genetics, University of Antwerp, AN-2610 Antwerp, Belgium, and the Department of Biomedical Sciences, University of Antwerp, AN-2610 Antwerp, Belgium
- Jonathan Janssens
- the Translational Neurobiology Group, VIB Department of Molecular Genetics, University of Antwerp, AN-2610 Antwerp, Belgium, and the Department of Biomedical Sciences, University of Antwerp, AN-2610 Antwerp, Belgium
- Stuart Maudsley
- the Receptor Pharmacology Unit, NIA, National Institutes of Health, Baltimore, Maryland 21224, the Translational Neurobiology Group, VIB Department of Molecular Genetics, University of Antwerp, AN-2610 Antwerp, Belgium, and the Department of Biomedical Sciences, University of Antwerp, AN-2610 Antwerp, Belgium
10
Mao Y, Lu Z. MeSH Now: automatic MeSH indexing at PubMed scale via learning to rank. J Biomed Semantics 2017; 8:15. [PMID: 28412964 PMCID: PMC5392968 DOI: 10.1186/s13326-017-0123-3] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 06/30/2016] [Accepted: 03/16/2017] [Indexed: 01/15/2023]
Abstract
BACKGROUND MeSH indexing is the task of assigning relevant MeSH terms based on a manual reading of scholarly publications by human indexers. The task is highly important for improving literature retrieval and many other scientific investigations in biomedical research. Unfortunately, given its manual nature, the process of MeSH indexing is both time-consuming (new articles are not indexed until 2 or 3 months later) and costly (approximately ten dollars per article). In response, automatic indexing by computers has been previously proposed and attempted but remains challenging. In order to advance the state of the art in automatic MeSH indexing, a community-wide shared task called BioASQ was recently organized. METHODS We propose MeSH Now, an integrated approach that first uses multiple strategies to generate a combined list of candidate MeSH terms for a target article. Through a novel learning-to-rank framework, MeSH Now then ranks the list of candidate terms based on their relevance to the target article. Finally, MeSH Now selects the highest-ranked MeSH terms via a post-processing module. RESULTS We assessed MeSH Now on two separate benchmarking datasets using traditional precision, recall and F1-score metrics. In both evaluations, MeSH Now consistently achieved over 0.60 in F-score, ranging from 0.610 to 0.612. Furthermore, additional experiments show that MeSH Now can be optimized by parallel computing in order to process MEDLINE documents on a large scale. CONCLUSIONS We conclude that MeSH Now is a robust approach with state-of-the-art performance for automatic MeSH indexing and that MeSH Now is capable of processing PubMed scale documents within a reasonable time frame. AVAILABILITY http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/MeSHNow/.
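The precision, recall, and F1 metrics named in the RESULTS can be illustrated for a single article by comparing a predicted set of MeSH terms against the indexer-assigned set. This is a per-article sketch under that assumption; the benchmark's exact averaging scheme is not given in the abstract, and the term sets below are illustrative only:

```python
def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 for one article's index terms."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative term sets (assumed), not output from MeSH Now.
predicted = {"Humans", "MEDLINE", "Algorithms", "Abstracting and Indexing"}
gold = {"Humans", "MEDLINE", "Medical Subject Headings",
        "Abstracting and Indexing", "PubMed"}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(precision, recall, round(f1, 3))
```

Here three of four predicted terms are correct (precision 0.75) and three of five gold terms are recovered (recall 0.6).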
Affiliation(s)
- Yuqing Mao
- Nanjing University of Chinese Medicine, 138 Xianlin Avenue, Nanjing, Jiangsu, 210023, China
- National Center for Biotechnology Information (NCBI), 8600 Rockville Pike, Bethesda, MD, 20894, USA
- Zhiyong Lu
- National Center for Biotechnology Information (NCBI), 8600 Rockville Pike, Bethesda, MD, 20894, USA
11
ElShal S, Tranchevent LC, Sifrim A, Ardeshirdavani A, Davis J, Moreau Y. Beegle: from literature mining to disease-gene discovery. Nucleic Acids Res 2016; 44:e18. [PMID: 26384564 PMCID: PMC4737179 DOI: 10.1093/nar/gkv905] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 03/03/2015] [Revised: 08/25/2015] [Accepted: 08/29/2015] [Indexed: 01/06/2023]
Abstract
Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/.
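The "recall in the top 100 returned genes" figure quoted above is a recall-at-k measure: the share of known disease genes that appear among the first k results. A minimal sketch with placeholder gene identifiers (hypothetical, not Beegle output):

```python
def recall_at_k(ranked_genes, known_genes, k):
    """Fraction of known genes found among the first k ranked results."""
    if not known_genes:
        raise ValueError("known_genes must not be empty")
    top_k = set(ranked_genes[:k])
    return len(top_k & set(known_genes)) / len(known_genes)


# Placeholder identifiers (assumed) standing in for a ranked search result.
ranked = ["G1", "G2", "G3", "G4", "G5"]
known = {"G2", "G5", "G9"}

print(recall_at_k(ranked, known, k=3))
print(recall_at_k(ranked, known, k=5))
```

Raising k can only keep recall the same or increase it, so the cutoff (100 in the abstract) should always be reported alongside the figure.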
Affiliation(s)
- Sarah ElShal
- Department of Electrical Engineering (ESAT) STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics Department, KU Leuven, Leuven 3001, Belgium
- iMinds Future Health Department, KU Leuven, Leuven 3001, Belgium
- Léon-Charles Tranchevent
- Department of Electrical Engineering (ESAT) STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics Department, KU Leuven, Leuven 3001, Belgium
- iMinds Future Health Department, KU Leuven, Leuven 3001, Belgium
- Inserm UMR-S1052, CNRS UMR5286, Cancer Research Centre of Lyon, Lyon, France
- Université de Lyon 1, Villeurbanne, France
- Centre Léon Bérard, Lyon, France
- Alejandro Sifrim
- Department of Electrical Engineering (ESAT) STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics Department, KU Leuven, Leuven 3001, Belgium
- iMinds Future Health Department, KU Leuven, Leuven 3001, Belgium
- Wellcome Trust Genome Campus, Hinxton, Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK
- Amin Ardeshirdavani
- Department of Electrical Engineering (ESAT) STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics Department, KU Leuven, Leuven 3001, Belgium
- iMinds Future Health Department, KU Leuven, Leuven 3001, Belgium
- Jesse Davis
- Department of Computer Science (DTAI), KU Leuven, Leuven 3001, Belgium
- Yves Moreau
- Department of Electrical Engineering (ESAT) STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics Department, KU Leuven, Leuven 3001, Belgium
- iMinds Future Health Department, KU Leuven, Leuven 3001, Belgium
|
12
|
Daimon CM, Jasien JM, Wood WH, Zhang Y, Becker KG, Silverman JL, Crawley JN, Martin B, Maudsley S. Hippocampal Transcriptomic and Proteomic Alterations in the BTBR Mouse Model of Autism Spectrum Disorder. Front Physiol 2015; 6:324. [PMID: 26635614 PMCID: PMC4656818 DOI: 10.3389/fphys.2015.00324] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 07/15/2015] [Accepted: 10/27/2015] [Indexed: 12/25/2022] Open
Abstract
Autism spectrum disorders (ASD) are complex heterogeneous neurodevelopmental disorders of an unclear etiology, and no cure currently exists. Prior studies have demonstrated that the black and tan, brachyury (BTBR) T+ Itpr3tf/J mouse strain displays a behavioral phenotype with ASD-like features. BTBR T+ Itpr3tf/J mice (referred to simply as BTBR) display deficits in social functioning, lack of communication ability, and engagement in stereotyped behavior. Despite extensive behavioral phenotypic characterization, little is known about the genes and proteins responsible for the presentation of the ASD-like phenotype in the BTBR mouse model. In this study, we employed bioinformatics techniques to gain a wide-scale understanding of the transcriptomic and proteomic changes associated with the ASD-like phenotype in BTBR mice. We found a number of genes and proteins to be significantly altered in BTBR mice compared to C57BL/6J (B6) control mice, such as BDNF, Shank3, and ERK1, which are highly relevant to prior investigations of ASD. Furthermore, we identified distinct functional pathways altered in BTBR mice compared to B6 controls that have previously been shown to be altered in both mouse models of ASD and some human clinical populations, and that have been suggested as a possible etiological mechanism of ASD, including “axon guidance” and “regulation of actin cytoskeleton.” In addition, our wide-scale bioinformatics approach also discovered several previously unidentified genes and proteins associated with the ASD phenotype in BTBR mice, such as Caskin1, suggesting that bioinformatics could be an avenue by which novel therapeutic targets for ASD are uncovered. As a result, we believe that informed use of synergistic bioinformatics applications represents an invaluable tool for elucidating the etiology of complex disorders like ASD.
Affiliation(s)
- Caitlin M Daimon
- Metabolism Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Joan M Jasien
- Metabolism Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- William H Wood
- Gene Expression and Genomics Unit, National Institutes of Health, Baltimore, MD, USA
- Yongqing Zhang
- Gene Expression and Genomics Unit, National Institutes of Health, Baltimore, MD, USA
- Kevin G Becker
- Gene Expression and Genomics Unit, National Institutes of Health, Baltimore, MD, USA
- Jill L Silverman
- Laboratory of Behavioral Neuroscience, Intramural Research Program, National Institute of Mental Health, Bethesda, MD, USA
- MIND Institute, University of California Davis School of Medicine, Sacramento, CA, USA
- Jacqueline N Crawley
- Laboratory of Behavioral Neuroscience, Intramural Research Program, National Institute of Mental Health, Bethesda, MD, USA
- MIND Institute, University of California Davis School of Medicine, Sacramento, CA, USA
- Bronwen Martin
- Metabolism Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Stuart Maudsley
- Receptor Pharmacology Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Translational Neurobiology Group, VIB Department of Molecular Genetics, University of Antwerp, Antwerp, Belgium
- Laboratory of Neurogenetics, Institute Born-Bunge, University of Antwerp, Antwerpen, Belgium
|
13
|
Cornish AJ, Filippis I, David A, Sternberg MJE. Exploring the cellular basis of human disease through a large-scale mapping of deleterious genes to cell types. Genome Med 2015; 7:95. [PMID: 26330083 PMCID: PMC4557825 DOI: 10.1186/s13073-015-0212-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 02/12/2015] [Accepted: 07/31/2015] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Each cell type found within the human body performs a diverse and unique set of functions, the disruption of which can lead to disease. However, there currently exists no systematic mapping between cell types and the diseases they can cause. METHODS In this study, we integrate protein-protein interaction data with high-quality cell-type-specific gene expression data from the FANTOM5 project to build the largest collection of cell-type-specific interactomes created to date. We develop a novel method, called gene set compactness (GSC), that contrasts the relative positions of disease-associated genes across 73 cell-type-specific interactomes to map genes associated with 196 diseases to the cell types they affect. We conduct text-mining of the PubMed database to produce an independent resource of disease-associated cell types, which we use to validate our method. RESULTS The GSC method successfully identifies known disease-cell-type associations, as well as highlighting associations that warrant further study. This includes mast cells and multiple sclerosis, a cell population currently being targeted in a multiple sclerosis phase 2 clinical trial. Furthermore, we build a cell-type-based diseasome using the cell types identified as manifesting each disease, offering insight into diseases linked through etiology. CONCLUSIONS The data set produced in this study represents the first large-scale mapping of diseases to the cell types in which they are manifested and will therefore be useful in the study of disease systems. Overall, we demonstrate that our approach links disease-associated genes to the phenotypes they produce, a key goal within systems medicine.
Affiliation(s)
- Alex J Cornish
- Department of Life Sciences, Imperial College London, Exhibition Road, London, SW7 2AZ, UK.
- Ioannis Filippis
- Department of Life Sciences, Imperial College London, Exhibition Road, London, SW7 2AZ, UK.
- Alessia David
- Department of Life Sciences, Imperial College London, Exhibition Road, London, SW7 2AZ, UK.
- Michael J E Sternberg
- Department of Life Sciences, Imperial College London, Exhibition Road, London, SW7 2AZ, UK.
|
14
|
Martin B, Chadwick W, Janssens J, Premont RT, Schmalzigaug R, Becker KG, Lehrmann E, Wood WH, Zhang Y, Siddiqui S, Park SS, Cong WN, Daimon CM, Maudsley S. GIT2 Acts as a Systems-Level Coordinator of Neurometabolic Activity and Pathophysiological Aging. Front Endocrinol (Lausanne) 2015; 6:191. [PMID: 26834700 PMCID: PMC4716144 DOI: 10.3389/fendo.2015.00191] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 08/05/2015] [Accepted: 12/14/2015] [Indexed: 01/08/2023] Open
Abstract
Aging represents one of the most complicated and highly integrated somatic processes. Healthy aging is suggested to rely upon the coherent regulation of hormonal and neuronal communication between the central nervous system and peripheral tissues. The hypothalamus is one of the main structures in the body responsible for sustaining an efficient interaction between energy balance and neurological activity and therefore likely coordinates multiple systems in the aging process. We previously identified, in hypothalamic and peripheral tissues, the G protein-coupled receptor kinase interacting protein 2 (GIT2) as a stress response and aging regulator. As metabolic status profoundly affects aging trajectories, we investigated the role of GIT2 in regulating metabolic activity. We found that genomic deletion of GIT2 alters hypothalamic transcriptomic signatures related to diabetes and metabolic pathways. Deletion of GIT2 reduced whole animal respiratory exchange ratios away from those related to primary glucose usage for energy homeostasis. GIT2 knockout (GIT2KO) mice demonstrated lower insulin secretion levels, disruption of pancreatic islet beta cell mass, elevated plasma glucose, and insulin resistance. High-dimensionality transcriptomic signatures from islets isolated from GIT2KO mice indicated a disruption of beta cell development. Additionally, GIT2 expression was prematurely elevated in pancreatic and hypothalamic tissues from diabetic-state mice (db/db), compared to age-matched wild type (WT) controls, further supporting the role of GIT2 in metabolic regulation and aging. We also found that the physical interaction of pancreatic GIT2 with the insulin receptor and insulin receptor substrate 2 was diminished in db/db mice compared to WT mice. Therefore, GIT2 appears to exert a multidimensional "keystone" role in regulating the aging process by coordinating somatic responses to energy deficits.
Affiliation(s)
- Bronwen Martin
- Metabolism Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Wayne Chadwick
- Receptor Pharmacology Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Jonathan Janssens
- Translational Neurobiology Group, VIB Department of Molecular Genetics, University of Antwerp, Antwerp, Belgium
- Laboratory of Neurogenetics, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium
- Richard T. Premont
- Department of Medicine, Gastroenterology Division, Duke University, Durham, NC, USA
- Robert Schmalzigaug
- Department of Medicine, Gastroenterology Division, Duke University, Durham, NC, USA
- Kevin G. Becker
- Gene Expression and Genomics Unit, National Institutes of Health, Baltimore, MD, USA
- Elin Lehrmann
- Gene Expression and Genomics Unit, National Institutes of Health, Baltimore, MD, USA
- William H. Wood
- Gene Expression and Genomics Unit, National Institutes of Health, Baltimore, MD, USA
- Yongqing Zhang
- Gene Expression and Genomics Unit, National Institutes of Health, Baltimore, MD, USA
- Sana Siddiqui
- Receptor Pharmacology Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Sung-Soo Park
- Receptor Pharmacology Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Wei-na Cong
- Metabolism Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Caitlin M. Daimon
- Metabolism Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Stuart Maudsley
- Receptor Pharmacology Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Translational Neurobiology Group, VIB Department of Molecular Genetics, University of Antwerp, Antwerp, Belgium
- Laboratory of Neurogenetics, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium
- *Correspondence: Stuart Maudsley
|
15
|
Macintyre G, Jimeno Yepes A, Ong CS, Verspoor K. Associating disease-related genetic variants in intergenic regions to the genes they impact. PeerJ 2014; 2:e639. [PMID: 25374782 PMCID: PMC4217187 DOI: 10.7717/peerj.639] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 05/23/2014] [Accepted: 10/07/2014] [Indexed: 11/20/2022] Open
Abstract
We present a method to assist in interpretation of the functional impact of intergenic disease-associated SNPs that is not limited to search strategies proximal to the SNP. The method builds on two sources of external knowledge: the growing understanding of three-dimensional spatial relationships in the genome, and the substantial repository of information about relationships among genetic variants, genes, and diseases captured in the published biomedical literature. We integrate chromatin conformation capture data (HiC) with literature support to rank putative target genes of intergenic disease-associated SNPs. We demonstrate that this hybrid method outperforms a genomic distance baseline on a small test set of expression quantitative trait loci, as well as either method individually. In addition, we show the potential for this method to uncover relationships between intergenic SNPs and target genes across chromosomes. With more extensive chromatin conformation capture data becoming readily available, this method provides a way forward towards functional interpretation of SNPs in the context of the three dimensional structure of the genome in the nucleus.
Affiliation(s)
- Geoff Macintyre
- Department of Computing and Information Systems, The University of Melbourne, VIC, Australia
- Centre for Neural Engineering, The University of Melbourne, VIC, Australia
- Antonio Jimeno Yepes
- Department of Computing and Information Systems, The University of Melbourne, VIC, Australia
- Cheng Soon Ong
- Department of Electrical and Electronic Engineering, The University of Melbourne, VIC, Australia
- Machine Learning Group, NICTA Canberra Research Laboratory, Australia
- Research School of Computer Science, Australian National University, Australia
- Karin Verspoor
- Department of Computing and Information Systems, The University of Melbourne, VIC, Australia
- Health and Biomedical Informatics Centre, The University of Melbourne, VIC, Australia
|
16
|
Liu RL, Shih CC. Identification of highly related references about gene-disease association. BMC Bioinformatics 2014; 15:286. [PMID: 25155502 PMCID: PMC4162969 DOI: 10.1186/1471-2105-15-286] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 12/07/2013] [Accepted: 08/12/2014] [Indexed: 02/03/2023] Open
Abstract
BACKGROUND Curation of gene-disease associations published in the literature should be based on careful and frequent survey of the references that are highly related to specific gene-disease associations. Retrieval of the references is thus essential for timely and complete curation. RESULTS We present a technique, CRFref (Conclusive, Rich, and Focused References), that, given a gene-disease pair <g, d>, ranks high those biomedical references that are likely to provide conclusive, rich, and focused results about g and d. Such references are expected to be highly related to the association between g and d. CRFref ranks candidate references based on their scores. To estimate the score of a reference r, CRFref estimates and integrates three measures: degree of conclusiveness, degree of richness, and degree of focus of r with respect to <g, d>. To evaluate CRFref, experiments are conducted on over one hundred thousand references for over one thousand gene-disease pairs. Experimental results show that CRFref performs significantly better than several typical types of baselines in ranking high those references that expert curators select to develop the summaries for specific gene-disease associations. CONCLUSION CRFref is a good technique to rank high those references that are highly related to specific gene-disease associations. It can be incorporated into existing search engines to prioritize biomedical references for curators and researchers, as well as into text mining systems that aim at the study of gene-disease associations.
Affiliation(s)
- Rey-Long Liu
- Department of Medical Informatics, Tzu Chi University, Hualien, Taiwan.
|
17
|
Ono T, Kuhara S. A novel method for gathering and prioritizing disease candidate genes based on construction of a set of disease-related MeSH® terms. BMC Bioinformatics 2014; 15:179. [PMID: 24917541 PMCID: PMC4068192 DOI: 10.1186/1471-2105-15-179] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 11/12/2013] [Accepted: 06/02/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Understanding the molecular mechanisms involved in disease is critical for the development of more effective and individualized strategies for prevention and treatment. The amount of disease-related literature, including new genetic information on the molecular mechanisms of disease, is rapidly increasing. Extracting beneficial information from literature can be facilitated by computational methods such as the knowledge-discovery approach. Several computational methods for mining gene-disease relationships have been developed; however, there has been a lack of research evaluating specific disease candidate genes. RESULTS We present a novel method for gathering and prioritizing specific disease candidate genes. Our approach involved the construction of a set of Medical Subject Headings (MeSH) terms for the effective retrieval of publications related to a disease candidate gene. Information regarding the relationships between genes and publications was obtained from the gene2pubmed database. The set of genes was prioritized using a "weighted literature score" based on the number of publications and weighted by the number of genes occurring in a publication. Using our method for the disease states of pain and Alzheimer's disease, a total of 1101 pain candidate genes and 2810 Alzheimer's disease candidate genes were gathered and prioritized. The precision was 0.30 and the recall was 0.89 in the case study of pain. The precision was 0.04 and the recall was 0.6 in the case study of Alzheimer's disease. The precision-recall curve indicated that the performance of our method was superior to that of other publicly available tools. CONCLUSIONS Our method, which involved the use of a set of MeSH terms related to disease candidate genes and a novel weighted literature score, improved the accuracy of gathering and prioritizing candidate genes by focusing on a specific disease.
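As a rough illustration of the weighted literature score described in this abstract, the Python sketch below credits a gene with 1/n for every publication that mentions it, where n is the number of distinct genes linked to that publication. The abstract does not state the exact formula, so the 1/n weighting, the function names, and the toy identifiers are assumptions made for illustration and not the authors' implementation.

```python
from collections import defaultdict


def weighted_literature_scores(pub_to_genes):
    """Score genes from a {publication_id: [gene, ...]} mapping.

    Assumption: a publication linked to n distinct genes contributes 1/n
    to each of them, so papers listing many genes count for less.
    """
    scores = defaultdict(float)
    for genes in pub_to_genes.values():
        unique_genes = set(genes)
        if not unique_genes:
            continue  # a publication with no gene links carries no evidence
        weight = 1.0 / len(unique_genes)
        for gene in unique_genes:
            scores[gene] += weight
    return dict(scores)


def rank_genes(scores):
    """Return genes sorted by descending score, ties broken alphabetically."""
    return sorted(scores, key=lambda gene: (-scores[gene], gene))


if __name__ == "__main__":
    # Hypothetical publication-to-gene links in the spirit of gene2pubmed.
    toy_links = {
        "pub1": ["GENE_A"],
        "pub2": ["GENE_A", "GENE_B"],
        "pub3": ["GENE_B", "GENE_C", "GENE_D", "GENE_E"],
    }
    toy_scores = weighted_literature_scores(toy_links)
    for gene in rank_genes(toy_scores):
        print(gene, round(toy_scores[gene], 3))
```

Under this weighting, a gene that is the sole subject of a paper outranks a gene that appears only in large multi-gene lists, which is one plausible reading of "weighted by the number of genes occurring in a publication".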
Affiliation(s)
- Satoru Kuhara
- Department of Genetic Resources Technology, Faculty of Agriculture, Kyushu University, 6-10-1 Hakozaki Higashi-ku, Fukuoka 812-8581, Japan.
|
18
|
Abstract
While genomics-derived discoveries promise benefits to basic research and health care, the speed and affordability of sequencing following recent technological advances have further aggravated the data deluge. Seamless integration of the ever-increasing clinical, genomic, and experimental data, together with efficient mining for knowledge extraction that delivers actionable insight and generates testable hypotheses, is therefore critical for the needs of biomedical research. For instance, high-throughput techniques are frequently applied to detect disease candidate genes. Experimental validation of these candidates, however, is both time-consuming and expensive. Hence, several computational approaches based on literature and data mining have been developed to identify the most promising candidates for follow-up studies. Based on the "guilt by association" principle, most of these methods use prior knowledge about a disease of interest to discover and rank novel candidate genes. In this chapter, we provide a brief overview of recent advances made in literature- and data-mining-based approaches for candidate gene prioritization. As a case study, we focus on a Web-based computational approach that uses integrated heterogeneous data sources, including gene-literature associations, for ranking disease candidate genes and explain how to run typical queries using this system.
|
19
|
Cheung WA, Ouellette BFF, Wasserman WW. Compensating for literature annotation bias when predicting novel drug-disease relationships through Medical Subject Heading Over-representation Profile (MeSHOP) similarity. BMC Med Genomics 2013; 6 Suppl 2:S3. [PMID: 23819887 PMCID: PMC3654871 DOI: 10.1186/1755-8794-6-s2-s3] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Using annotations to the articles in MEDLINE®/PubMed®, over six thousand chemical compounds with pharmacological actions have been tracked since 1996. Medical Subject Heading Over-representation Profiles (MeSHOPs) quantitatively leverage the literature associated with biological entities such as diseases or drugs, providing the opportunity to reposition known compounds towards novel disease applications. METHODS A MeSHOP is constructed by counting the number of times each medical subject term is assigned to an entity-related research publication in the MEDLINE database and calculating the significance of the count by comparing against the count of the term in a background set of publications. Based on the expectation that drugs suitable for treatment of a disease (or disease symptom) will have similar annotation properties to the disease, we successfully predict drug-disease associations by comparing MeSHOPs of diseases and drugs. RESULTS The MeSHOP comparison approach delivers an 11% improvement over bibliometric baselines. However, novel drug-disease associations are observed to be biased towards drugs and diseases with more publications. To account for the annotation biases, a correction procedure is introduced and evaluated. CONCLUSIONS By explicitly accounting for the annotation bias, unexpectedly similar drug-disease pairs are highlighted as candidates for drug repositioning research. MeSHOPs are shown to provide a literature-supported perspective for discovery of new links between drugs and diseases based on pre-existing knowledge.
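The profile construction described in the methods of this abstract (count each term among an entity's publications, then judge that count against a background set) can be sketched in Python as follows. The abstract does not name the significance test, so the one-sided hypergeometric tail used here is a stand-in, and the function names, the toy counts, and the requirement that the entity's publications be a subset of the background are all assumptions for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible outcomes contribute nothing to the sum.
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def over_representation_profile(entity_counts, n_entity_pubs,
                                background_counts, n_background_pubs):
    """Map each term to a p-value for its over-representation.

    entity_counts[term]     : entity publications annotated with the term
    background_counts[term] : background publications annotated with the term
    Assumption: the entity's publications are a subset of the background.
    """
    return {
        term: hypergeom_upper_tail(
            count, n_background_pubs, background_counts[term], n_entity_pubs
        )
        for term, count in entity_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical counts: 10 entity publications within 100 background ones.
    profile = over_representation_profile(
        entity_counts={"Inflammation": 5, "Humans": 9},
        n_entity_pubs=10,
        background_counts={"Inflammation": 10, "Humans": 90},
        n_background_pubs=100,
    )
    for term, p_value in sorted(profile.items(), key=lambda item: item[1]):
        print(term, p_value)
```

In this toy case a term that is rare in the background but frequent for the entity receives a small p-value, while a term that is common everywhere does not, which is the behaviour an over-representation profile is meant to capture.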
Affiliation(s)
- Warren A Cheung
- Centre for Molecular Medicine and Therapeutics at Child and Family Research Institute, University of British Columbia, Vancouver, BC, Canada
|
20
|
Gillis J, Pavlidis P. Assessing identity, redundancy and confounds in Gene Ontology annotations over time. Bioinformatics 2013; 29:476-82. [PMID: 23297035 DOI: 10.1093/bioinformatics/bts727] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Indexed: 11/14/2022]
Abstract
MOTIVATION The Gene Ontology (GO) is heavily used in systems biology, but the potential for redundancy, confounds with other data sources and problems with stability over time have been little explored. RESULTS We report that GO annotations are stable over short periods, with 3% of genes not being most semantically similar to themselves between monthly GO editions. However, we find that genes can alter their 'functional identity' over time, with 20% of genes not matching to themselves (by semantic similarity) after 2 years. We further find that annotation bias in GO, in which some genes are more characterized than others, has declined in yeast, but generally increased in humans. Finally, we discovered that many entries in protein interaction databases are owing to the same published reports that are used for GO annotations, with 66% of assessed GO groups exhibiting this confound. We provide a case study to illustrate how this information can be used in analyses of gene sets and networks. AVAILABILITY Data available at http://chibi.ubc.ca/assessGO.
Affiliation(s)
- Jesse Gillis
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, 192B Genome Research Center, 500 Sunnyside Boulevard, Woodbury, NY 11797, USA
|
21
|
Andrade-Navarro MA. Mining the literature: new methods to exploit keyword profiles. Genome Med 2012; 4:81. [PMID: 23114100 PMCID: PMC3580450 DOI: 10.1186/gm382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Abstract
Bibliographic records in the PubMed database of biomedical literature are annotated with Medical Subject Headings (MeSH) by curators, which summarize the content of the articles. Two recent publications explain how to generate profiles of MeSH terms for a set of bibliographic records and to use them to define any given concept by its associated literature. These concepts can then be related by their keyword profiles, and this can be used, for example, to detect new associations between genes and inherited diseases. See related research articles: http://www.biomedcentral.com/1471-2105/13/249/abstract and http://genomemedicine.com/content/4/9/75/abstract
|
22
|
Cheung WA, Ouellette BF, Wasserman WW. Inferring novel gene-disease associations using Medical Subject Heading Over-representation Profiles. Genome Med 2012; 4:75. [PMID: 23021552 PMCID: PMC3580445 DOI: 10.1186/gm376] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 04/28/2012] [Revised: 09/11/2012] [Accepted: 09/28/2012] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND MEDLINE®/PubMed® currently indexes over 18 million biomedical articles, providing unprecedented opportunities and challenges for text analysis. Using Medical Subject Heading Over-representation Profiles (MeSHOPs), an entity of interest can be robustly summarized, quantitatively identifying associated biomedical terms and predicting novel indirect associations. METHODS A procedure is introduced for quantitative comparison of MeSHOPs derived from a group of MEDLINE® articles for a biomedical topic (for example, articles for a specific gene or disease). Similarity scores are computed to compare MeSHOPs of genes and diseases. RESULTS Similarity scores successfully infer novel associations between diseases and genes. The number of papers addressing a gene or disease has a strong influence on predicted associations, revealing an important bias for gene-disease relationship prediction. Predictions derived from comparisons of MeSHOPs achieve a mean 8% AUC improvement in the identification of gene-disease relationships compared to gene-independent baseline properties. CONCLUSIONS MeSHOP comparisons are demonstrated to provide predictive capacity for novel relationships between genes and human diseases. We demonstrate the impact of literature bias on the performance of gene-disease prediction methods. MeSHOPs provide a rich source of annotation to facilitate relationship discovery in biomedical informatics.
Affiliation(s)
- Warren A Cheung
- Bioinformatics Graduate Program, Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, University of British Columbia, 980 W. 28th Ave, Vancouver, V5Z 4H4, Canada
- B F Francis Ouellette
- Department of Cells and Systems Biology, Ontario Institute for Cancer Research, University of Toronto, 101 College Street, Toronto, M5G 0A3, Canada
- Wyeth W Wasserman
- Department of Medical Genetics, Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, University of British Columbia, 980 W. 28th Ave, Vancouver, V5Z 4H4, Canada
|