1
|
Wang C, Liu Y, Tan Y, Xu F, Wang M, Tang Y, Nie G, Chi X, Xu Z, Xu Y, An B, Tian G, Qi D, Yao C. HOGA1 Suppresses Renal Cell Carcinoma Growth via Inhibiting the Wnt/β-Catenin Signalling Pathway. J Cell Mol Med 2025; 29:e70490. [PMID: 40100076 PMCID: PMC11917137 DOI: 10.1111/jcmm.70490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2024] [Revised: 02/26/2025] [Accepted: 03/05/2025] [Indexed: 03/20/2025] Open
Abstract
Changes in hydroxyproline metabolism are reported to promote tumorigenesis. HOGA1 is a useful marker for diagnosing primary hyperoxaluria 3, catalysing the final step of mitochondrial hydroxyproline metabolism from 4-hydroxy-2-oxoglutarate (HOG) to glyoxylate and pyruvate; however, its specific mechanism in RCC remains unclear. This study investigated the role of HOGA1 in the pathogenesis of ccRCC. The results showed that HOGA1 was decreased significantly in tumour tissues, with this low expression associated with a poor prognosis in patients with ccRCC. QTL mapping showed that Hoga1 was cis-regulated. Gene enrichment analyses showed that Hoga1 co-expressed genes were enriched in the Wnt/β-catenin signalling pathway. Furthermore, in vitro and in vivo assays demonstrated that HOGA1 significantly inhibited the proliferation, invasion and migration of renal carcinoma cells via the Wnt/β-catenin-c-Myc/CyclinD1 axis, probably via regulating the level of HOG. In conclusion, this study demonstrates that HOGA1 has a tumour suppressor role by inhibiting the Wnt/β-catenin signalling pathway. This finding provides new insights into the function of HOGA1 in ccRCC.
Affiliation(s)
- Congmin Wang
- School of Pharmacy, Binzhou Medical University, Yantai, China
| | - Yu Liu
- School of Pharmacy, Binzhou Medical University, Yantai, China
| | - Ying Tan
- School of Pharmacy, Binzhou Medical University, Yantai, China
| | - Fuyi Xu
- School of Pharmacy, Binzhou Medical University, Yantai, China
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Yantai, China
| | - Mingyao Wang
- School of Pharmacy, Binzhou Medical University, Yantai, China
| | - Yiming Tang
- The Second School of Clinical Medicine, Binzhou Medical University, Yantai, China
| | - Guofeng Nie
- The First School of Clinical Medicine, Binzhou Medical University, Yantai, China
| | - Xiaodong Chi
- School of Pharmacy, Binzhou Medical University, Yantai, China
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Yantai, China
| | - Zhaowei Xu
- School of Pharmacy, Binzhou Medical University, Yantai, China
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Yantai, China
| | - Yuxue Xu
- School of Pharmacy, Binzhou Medical University, Yantai, China
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Yantai, China
| | - Baijiao An
- School of Pharmacy, Binzhou Medical University, Yantai, China
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Yantai, China
| | - Geng Tian
- School of Pharmacy, Binzhou Medical University, Yantai, China
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Yantai, China
| | - Donglai Qi
- School of Pharmacy, Binzhou Medical University, Yantai, China
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Yantai, China
| | - Cuifang Yao
- School of Pharmacy, Binzhou Medical University, Yantai, China
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Yantai, China
| |
|
2
|
Zhang D, Shang X, Ji Q, Niu L. Exploring genetic mapping and co-expression patterns to illuminate significance of Tbx20 in cardiac biology. Transgenic Res 2025; 34:5. [PMID: 39777589 DOI: 10.1007/s11248-024-00423-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2024] [Accepted: 11/04/2024] [Indexed: 01/11/2025]
Abstract
The transcription factor Tbx20 is integral to heart development and plays a significant role in various cardiac diseases. Despite its established importance, the regulatory mechanisms and functional significance of Tbx20 remain incompletely understood. To elucidate these mechanisms, we initially conducted eQTL mapping to identify genetic loci associated with Tbx20 expression in heart tissue from BXD mice. Co-expression and enrichment analyses revealed pathways linked to Tbx20, including dilated cardiomyopathy, hypertrophic cardiomyopathy, and FoxO signaling. Additionally, protein-protein interaction studies identified essential cardiac proteins, such as Myl2 and Myl7, along with upstream regulators like Mef2c. To validate our bioinformatic findings, we performed quantitative reverse transcription polymerase chain reaction (qRT-PCR) to assess the relative mRNA expression levels of TBX20 and Mef2c in the heart tissues of BXD mice compared to their parental strains (B6 and D2). Our results demonstrated significant up-regulation of both TBX20 and Mef2c in the BXD group relative to the parental strains. Conversely, both genes were down-regulated in B6, D2, Control, and Treatment groups when compared to BXD mice. These findings confirm the predicted regulatory roles of TBX20 and Mef2c in cardiac development as suggested by our initial analyses. This study not only reinforces the critical role of Tbx20 in cardiac gene regulation but also highlights its potential as a therapeutic target for cardiovascular disorders. Further investigations into Tbx20 and its interactions will enhance our understanding of heart biology and contribute to the development of targeted therapies for heart diseases.
Affiliation(s)
- Dezhong Zhang
- Department of Cardiothoracic Surgery, Children's Hospital of Fudan University (Xiamen Branch), Xiamen Children's Hospital, Xiamen, 361000, China
| | - Xiao Shang
- Department of Cardiology, The People's Hospital of Zouping City, No 22 Huangshan Second Road, Zouping, 256200, Shandong, China
| | - Quanquan Ji
- Department of Geriatrics, Qingdao Chengyang District People's Hospital, No.600, Great Wall Street, Qingdao, 266109, Shandong, China
| | - Li Niu
- Department of Cadre Health Care, Qingdao Municipal Hospital, No.1 Jiaozhou Street, Qingdao, 266011, Shandong, China.
| |
|
3
|
Zhang L, Huang T, He H, Xu F, Yang C, Lu L, Tian G, Wang L, Mi J. Unraveling the molecular mechanisms of Ace2-mediated post-COVID-19 cognitive dysfunction through systems genetics approach. Exp Neurol 2024; 381:114921. [PMID: 39142369 DOI: 10.1016/j.expneurol.2024.114921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 08/03/2024] [Accepted: 08/10/2024] [Indexed: 08/16/2024]
Abstract
The dysregulation of angiotensin-converting enzyme 2 (ACE2) in the central nervous system is believed to be associated with COVID-19-induced cognitive dysfunction. However, the detailed mechanism remains largely unknown. In this study, we performed a comprehensive systems genetics analysis of hippocampal ACE2 based on the BXD mouse panel. Expression quantitative trait loci (eQTL) mapping showed that Ace2 was strongly trans-regulated, and that elevated Ace2 expression was significantly correlated with impaired cognitive function. Further gene co-expression analysis showed that Ace2 may be correlated with membrane proteins in the calcium signaling pathway. qRT-PCR then confirmed that SARS-CoV-2 spike S1 protein upregulated ACE2 expression together with eight membrane proteins in the calcium signaling pathway. Moreover, this elevation can be attenuated by recombinant ACE2. Collectively, our findings reveal a potential mechanism of Ace2 in cognitive dysfunction, which could benefit the prevention and potential treatment of COVID-19-induced cognitive dysfunction.
Affiliation(s)
- Liyuan Zhang
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Shandong, Yantai 264003, China
| | - Tingting Huang
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Shandong, Yantai 264003, China
| | - Hongjie He
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Shandong, Yantai 264003, China
| | - Fuyi Xu
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Shandong, Yantai 264003, China
| | - Chunhua Yang
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Shandong, Yantai 264003, China
| | - Lu Lu
- University of Tennessee Health Science Center, Memphis, TN 38163, USA
| | - Geng Tian
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Shandong, Yantai 264003, China
| | - Lei Wang
- Harbin Medical University, Harbin 150086, Heilongjiang Province, China.
| | - Jia Mi
- Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, Binzhou Medical University, Shandong, Yantai 264003, China.
| |
|
4
|
Roy S, Morrell S, Zhao L, Homayouni R. Large-scale identification of social and behavioral determinants of health from clinical notes: comparison of Latent Semantic Indexing and Generative Pretrained Transformer (GPT) models. BMC Med Inform Decis Mak 2024; 24:296. [PMID: 39390479 PMCID: PMC11465786 DOI: 10.1186/s12911-024-02705-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Accepted: 09/30/2024] [Indexed: 10/12/2024] Open
Abstract
BACKGROUND Social and behavioral determinants of health (SBDH) are associated with a variety of health and utilization outcomes, yet these factors are not routinely documented in the structured fields of electronic health records (EHR). The objective of this study was to evaluate different machine learning approaches for detection of SBDH from the unstructured clinical notes in the EHR. METHODS Latent Semantic Indexing (LSI) was applied to 2,083,180 clinical notes corresponding to 46,146 patients in the MIMIC-III dataset. Using LSI, patients were ranked based on conceptual relevance to a set of keywords (lexicons) pertaining to 15 different SBDH categories. For Generative Pretrained Transformer (GPT) models, API requests were made with a Python script to connect to the OpenAI services in Azure, using gpt-3.5-turbo-1106 and gpt-4-1106-preview models. Prediction of SBDH categories was performed using a logistic regression model that included age, gender, race and SBDH ICD-9 codes. RESULTS LSI retrieved patients according to 15 SBDH domains, with an overall average PPV ≥ 83%. Using manually curated gold standard (GS) sets for nine SBDH categories, the macro-F1 score of LSI (0.74) was better than ICD-9 (0.71) and GPT-3.5 (0.54), but lower than GPT-4 (0.80). Due to document size limitations, only a subset of the GS cases could be processed by GPT-3.5 (55.8%) and GPT-4 (94.2%), compared to LSI (100%). Using common GS subsets for nine different SBDH categories, the macro-F1 of ICD-9 combined with either LSI (mean 0.88, 95% CI 0.82-0.93), GPT-3.5 (0.86, 0.82-0.91) or GPT-4 (0.88, 0.83-0.94) was not significantly different. After including age, gender, race and ICD-9 in a logistic regression model, the AUC for prediction of six out of the nine SBDH categories was higher for LSI compared to GPT-4.0.
CONCLUSIONS These results demonstrate that the LSI approach performs comparably to more recent large language models, such as GPT-3.5 and GPT-4.0, when using the same set of documents. Importantly, LSI is robust, deterministic, and does not have document-size limitations or cost implications, which makes it more amenable to real-world applications in health systems.
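The macro-F1 score reported in this abstract averages the per-category F1 scores with equal weight, so rare categories count as much as common ones. The following is a minimal sketch of that calculation, not the authors' code: the category names and the true-positive, false-positive and false-negative counts are hypothetical values chosen only for illustration.

```python
def f1_score(tp, fp, fn):
    """F1 for one category: harmonic mean of precision (PPV) and recall."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0  # no predictions and no gold cases for this category
    return 2 * tp / denominator


def macro_f1(counts_by_category):
    """Unweighted mean of the per-category F1 scores."""
    scores = [f1_score(*counts) for counts in counts_by_category.values()]
    return sum(scores) / len(scores)


# Hypothetical (tp, fp, fn) counts per category; not taken from the study.
example_counts = {
    "category_a": (8, 2, 2),
    "category_b": (5, 5, 0),
}

per_category = {name: f1_score(*c) for name, c in example_counts.items()}
overall = macro_f1(example_counts)
print(per_category, overall)
```

With these illustrative counts, the first category scores 0.8, the second about 0.667, and the macro average is about 0.733.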
Affiliation(s)
- Sujoy Roy
- Foundational Medical Studies, Population Health Informatics, Oakland University William Beaumont School of Medicine, Oakland University, 586 Pioneer Dr, 460 O'Dowd Hall, Rochester, MI, 48309-4482, USA
| | | | - Lili Zhao
- Biostatistics, Beaumont Research Institute, Corewell Health, Royal Oak, Michigan, USA
| | - Ramin Homayouni
- Foundational Medical Studies, Population Health Informatics, Oakland University William Beaumont School of Medicine, Oakland University, 586 Pioneer Dr, 460 O'Dowd Hall, Rochester, MI, 48309-4482, USA.
- Population Health & Health Equity Research, Beaumont Research Institute, Corewell Health, Royal Oak, Michigan, USA.
| |
|
5
|
Pan L, Cho KS, Wei X, Xu F, Lennikov A, Hu G, Tang J, Guo S, Chen J, Kriukov E, Kyle R, Elzaridi F, Jiang S, Dromel PA, Young M, Baranov P, Do CW, Williams RW, Chen J, Lu L, Chen DF. IGFBPL1 is a master driver of microglia homeostasis and resolution of neuroinflammation in glaucoma and brain tauopathy. Cell Rep 2023; 42:112889. [PMID: 37527036 PMCID: PMC10528709 DOI: 10.1016/j.celrep.2023.112889] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/05/2022] [Revised: 03/08/2023] [Accepted: 07/12/2023] [Indexed: 08/03/2023] Open
Abstract
Microglia shift toward an inflammatory phenotype during aging that is thought to exacerbate age-related neurodegeneration. The molecular and cellular signals that resolve neuroinflammation post-injury are largely undefined. Here, we exploit systems genetics methods based on the extended BXD murine reference family and identify IGFBPL1 as an upstream cis-regulator of microglia-specific genes to switch off inflammation. IGFBPL1 is expressed by mouse and human microglia, and higher levels of its expression resolve lipopolysaccharide-induced neuroinflammation by resetting the transcriptome signature back to a homeostatic state via IGF1R signaling. Conversely, IGFBPL1 deficiency or selective deletion of IGF1R in microglia shifts these cells to an inflammatory landscape and induces early manifestation of brain tauopathy and retinal neurodegeneration. Therapeutic administration of IGFBPL1 drives pro-homeostatic microglia and prevents glaucomatous neurodegeneration and vision loss in mice. These results identify IGFBPL1 as a master driver of the counter-inflammatory microglial modulator that presents an endogenous resolution of neuroinflammation to prevent neurodegeneration in eye and brain.
Affiliation(s)
- Li Pan
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA; School of Optometry, The Hong Kong Polytechnic University, Hong Kong 999077, China
| | - Kin-Sang Cho
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
| | - Xin Wei
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA; Department of Ophthalmology, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
| | - Fuyi Xu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA; Shandong Technology Innovation Center of Molecular Targeting and Intelligent Diagnosis and Treatment, School of Pharmacy, Binzhou Medical University, Yantai, Shandong 264003, China
| | - Anton Lennikov
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
| | - Guangan Hu
- Koch Institute for Integrative Cancer Research and Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Jing Tang
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA; Department of Ophthalmology, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
| | - Shuai Guo
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
| | - Julie Chen
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
| | - Emil Kriukov
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
| | - Robert Kyle
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
| | - Farris Elzaridi
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
| | - Shuhong Jiang
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
| | - Pierre A Dromel
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
| | - Michael Young
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
| | - Petr Baranov
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
| | - Chi-Wai Do
- School of Optometry, The Hong Kong Polytechnic University, Hong Kong 999077, China
| | - Robert W Williams
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
| | - Jianzhu Chen
- Koch Institute for Integrative Cancer Research and Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Lu Lu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA.
| | - Dong Feng Chen
- Schepens Eye Research Institute of Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA.
| |
|
6
|
Gu Q, Orgil BO, Bajpai AK, Chen Y, Ashbrook DG, Starlard-Davenport A, Towbin JA, Lebeche D, Purevjav E, Sheng H, Lu L. Expression Levels of the Tnni3k Gene in the Heart Are Highly Associated with Cardiac and Glucose Metabolism-Related Phenotypes and Functional Pathways. Int J Mol Sci 2023; 24:12759. [PMID: 37628941 PMCID: PMC10454158 DOI: 10.3390/ijms241612759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 08/06/2023] [Accepted: 08/10/2023] [Indexed: 08/27/2023] Open
Abstract
BACKGROUND Troponin-I interacting kinase, encoded by the TNNI3K gene, is expressed in nuclei and Z-discs of cardiomyocytes. Mutations in TNNI3K were identified in patients with cardiac conduction diseases, arrhythmias, and cardiomyopathy. METHODS We performed cardiac gene expression, whole genome sequencing (WGS), and cardiac function analysis in 40 strains of BXD recombinant inbred mice derived from C57BL/6J (B6) and DBA/2J (D2) strains. Expression quantitative trait loci (eQTL) mapping and gene enrichment analyses were performed, followed by validation of candidate Tnni3k-regulatory genes. RESULTS WGS identified compound splicing and missense T659I Tnni3k variants in the D2 parent and some BXD strains (D allele), and these strains had significantly lower Tnni3k expression than those carrying wild-type Tnni3k (B allele). Expression levels of Tnni3k significantly correlated with multiple cardiac (heart rate, wall thickness, PR duration, and T amplitude) and metabolic (glucose levels and insulin resistance) phenotypes in BXDs. A significant cis-eQTL on chromosome 3 was identified for the regulation of Tnni3k expression. Furthermore, Tnni3k-correlated genes were primarily involved in cardiac and glucose metabolism-related functions and pathways. Genes Nodal, Gnas, Nfkb1, Bmpr2, Bmp7, Smad7, Acvr1b, Acvr2b, Chrd, Tgfb3, Irs1, and Ppp1cb were differentially expressed between the B and D alleles. CONCLUSIONS Compound splicing and T659I Tnni3k variants reduce cardiac Tnni3k expression, and Tnni3k levels are associated with cardiac and glucose metabolism-related phenotypes.
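Several abstracts in this list describe a gene as cis-regulated or trans-regulated. By the usual convention, an eQTL is called cis when its peak maps close to the gene whose expression it influences, and trans when it maps elsewhere in the genome. The sketch below illustrates that rule only; it is not taken from any of the cited studies. The 10 Mb window, the `Locus` type, the `classify_eqtl` helper and all coordinates are assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed cis window in megabases; studies choose their own threshold.
CIS_WINDOW_MB = 10.0


@dataclass
class Locus:
    chromosome: str
    position_mb: float


def classify_eqtl(gene: Locus, peak: Locus, window_mb: float = CIS_WINDOW_MB) -> str:
    """Return 'cis' if the eQTL peak lies near the gene itself, else 'trans'."""
    same_chromosome = gene.chromosome == peak.chromosome
    if same_chromosome and abs(gene.position_mb - peak.position_mb) <= window_mb:
        return "cis"
    return "trans"


# Hypothetical coordinates, for illustration only.
gene = Locus("3", 154.0)
near_peak = Locus("3", 155.2)
distant_peak = Locus("11", 40.0)

print(classify_eqtl(gene, near_peak))     # peak close to the gene on the same chromosome
print(classify_eqtl(gene, distant_peak))  # peak on a different chromosome
```

A peak on the same chromosome but far outside the window is also classified as trans under this rule.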
Affiliation(s)
- Qingqing Gu
- Department of Cardiology, Affiliated Hospital of Nantong University, Nantong 226001, China; (Q.G.); (Y.C.)
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA; (A.K.B.); (D.G.A.); (A.S.-D.)
| | - Buyan-Ochir Orgil
- The Heart Institute, Department of Pediatrics, University of Tennessee Health Science Center, Memphis, TN 38103, USA; (B.-O.O.); (J.A.T.); (E.P.)
- Children’s Foundation Research Institute, Le Bonheur Children’s Hospital, Memphis, TN 38105, USA
| | - Akhilesh Kumar Bajpai
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA; (A.K.B.); (D.G.A.); (A.S.-D.)
| | - Yufeng Chen
- Department of Cardiology, Affiliated Hospital of Nantong University, Nantong 226001, China; (Q.G.); (Y.C.)
| | - David G. Ashbrook
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA; (A.K.B.); (D.G.A.); (A.S.-D.)
| | - Athena Starlard-Davenport
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA; (A.K.B.); (D.G.A.); (A.S.-D.)
| | - Jeffrey A. Towbin
- The Heart Institute, Department of Pediatrics, University of Tennessee Health Science Center, Memphis, TN 38103, USA; (B.-O.O.); (J.A.T.); (E.P.)
- Children’s Foundation Research Institute, Le Bonheur Children’s Hospital, Memphis, TN 38105, USA
- Pediatric Cardiology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Djamel Lebeche
- Department of Physiology, College of Medicine, University of Tennessee Health Science Center, Memphis, TN 38163, USA;
| | - Enkhsaikhan Purevjav
- The Heart Institute, Department of Pediatrics, University of Tennessee Health Science Center, Memphis, TN 38103, USA; (B.-O.O.); (J.A.T.); (E.P.)
- Children’s Foundation Research Institute, Le Bonheur Children’s Hospital, Memphis, TN 38105, USA
| | - Hongzhuan Sheng
- Department of Cardiology, Affiliated Hospital of Nantong University, Nantong 226001, China; (Q.G.); (Y.C.)
| | - Lu Lu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA; (A.K.B.); (D.G.A.); (A.S.-D.)
| |
|
7
|
Armenta-Medina D, Brambila-Tapia AJL, Miranda-Jiménez S, Rodea-Montero ER. A Web Application for Biomedical Text Mining of Scientific Literature Associated with Coronavirus-Related Syndromes: Coronavirus Finder. Diagnostics (Basel) 2022; 12:887. [PMID: 35453935 PMCID: PMC9028729 DOI: 10.3390/diagnostics12040887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Revised: 02/10/2022] [Accepted: 02/11/2022] [Indexed: 12/10/2022] Open
Abstract
In this study, a web application was developed that comprises scientific literature associated with the Coronaviridae family, specifically for those viruses that are members of the genus Betacoronavirus, responsible for emerging diseases with a great impact on human health: Middle East Respiratory Syndrome-Related Coronavirus (MERS-CoV) and Severe Acute Respiratory Syndrome-Related Coronavirus (SARS-CoV, SARS-CoV-2). The information compiled on this webserver is intended to support understanding of the basics of these viruses' infection and the nature of their pathogenesis, enabling the identification of molecular and cellular components that may function as potential targets in the design and development of successful treatments for the diseases associated with the Coronaviridae family. Some of the web application's primary functions are searching for keywords within the scientific literature, natural language processing for the extraction of genes and words, and the generation and visualization of gene networks associated with viral diseases, derived from the analysis of latent semantic space and cosine similarity measures. Interestingly, our gene association analysis reveals drug targets under study, as well as new targets suggested in the scientific literature to treat coronavirus.
Affiliation(s)
- Dagoberto Armenta-Medina
- Consejo Nacional de Ciencia y Tecnología (CONACyT), Ciudad de México 03940, Mexico;
- Centro de Investigación e Innovación en Tecnologías de la Información y Comunicación (INFOTEC), Aguascalientes 20326, Mexico
| | | | - Sabino Miranda-Jiménez
- Consejo Nacional de Ciencia y Tecnología (CONACyT), Ciudad de México 03940, Mexico;
- Centro de Investigación e Innovación en Tecnologías de la Información y Comunicación (INFOTEC), Aguascalientes 20326, Mexico
| | | |
|
8
|
Siamese Neural Networks for Damage Detection and Diagnosis of Jacket-Type Offshore Wind Turbine Platforms. MATHEMATICS 2022. [DOI: 10.3390/math10071131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2022]
Abstract
Offshore wind energy is increasingly being realized at deeper ocean depths, where jacket foundations are now the preferred choice for dealing with the hostile environment. The structural stability of these undersea constructions is critical. This paper presents a methodology to detect and classify damage in a jacket-type support structure for offshore wind turbines. Because of the existence of unknown external disturbances (wind and waves), standard structural health monitoring technologies, such as guided waves, cannot be used directly in this application. Therefore, using vibration-response-only accelerometer measurements, a methodology based on two in-cascade Siamese convolutional neural networks is proposed. The first Siamese network detects the damage (discerns whether the structure is healthy or damaged). Then, if damage is detected, a second Siamese network determines the damage diagnosis (classifies the type of damage). The main results and claims of the proposed methodology are the following: (i) it is solely dependent on accelerometer sensor output vibration data, (ii) it detects damage and classifies the type of damage, (iii) it operates in all wind turbine regions of operation, (iv) it requires less data to train since it is built on Siamese convolutional neural networks, which can learn from very little data compared to standard machine/deep learning algorithms, (v) it is validated in a scaled-down experimental laboratory setup, and (vi) its feasibility is demonstrated, as all computed metrics (accuracy, precision, recall, and F1 score) for the obtained results remain above 96%.
|
9
|
Gu Q, Xu F, Orgil BO, Khuchua Z, Munkhsaikhan U, Johnson JN, Alberson NR, Pierre JF, Black DD, Dong D, Brennan JA, Cathey BM, Efimov IR, Towbin JA, Purevjav E, Lu L. Systems genetics analysis defines importance of TMEM43/LUMA for cardiac- and metabolic-related pathways. Physiol Genomics 2022; 54:22-35. [PMID: 34766515 PMCID: PMC8721901 DOI: 10.1152/physiolgenomics.00066.2021] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/18/2021] [Revised: 10/07/2021] [Accepted: 11/08/2021] [Indexed: 12/31/2022] Open
Abstract
Broad cellular functions and diseases including muscular dystrophy, arrhythmogenic right ventricular cardiomyopathy (ARVC5) and cancer are associated with transmembrane protein 43 (TMEM43/LUMA). The study aimed to investigate biological roles of TMEM43 through genetic regulation, gene pathways and gene networks, candidate interacting genes, and up- or downstream regulators. Cardiac transcriptomes from 40 strains of recombinant inbred BXD mice and two parental strains, representing a murine genetic reference population (GRP), were used for genetic correlation, functional enrichment, and coexpression network analysis using a systems genetics approach. The results were validated in a newly created knock-in Tmem43-S358L mutation mouse model (Tmem43S358L) that displayed signs of cardiac dysfunction, resembling the ARVC5 phenotype seen in humans. We found high Tmem43 levels among BXDs with broad variability in expression. Expression of Tmem43 highly negatively correlated with heart mass and heart rate among BXDs, whereas levels of Tmem43 highly positively correlated with plasma high-density lipoproteins (HDL). Through finding differentially expressed genes (DEGs) between Tmem43S358L mutant and wild-type (Tmem43WT) lines, 18 pathways (out of 42 found in the BXD GRP) that are involved in ARVC, hypertrophic cardiomyopathy, dilated cardiomyopathy, nonalcoholic fatty liver disease, Alzheimer's disease, Parkinson's disease, and Huntington's disease were verified. We further constructed a Tmem43-mediated gene network, in which Ctnna1, Adcy6, Gnas, Ndufs6, and Uqcrc2 were significantly altered in Tmem43S358L mice versus Tmem43WT controls. Our study defined the importance of Tmem43 for cardiac- and metabolism-related pathways, suggesting that cardiovascular disease-relevant risk factors may also increase risk of metabolic and neurodegenerative diseases via TMEM43-mediated pathways.
Affiliation(s)
- Qingqing Gu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee
- Department of Cardiology, The Affiliated Hospital of Nantong University, Nantong, China
| | - Fuyi Xu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee
- School of Pharmacy, Binzhou Medical University, Yantai, Shandong, China
| | - Buyan-Ochir Orgil
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, Tennessee
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, Tennessee
| | - Zaza Khuchua
- The Heart Institute, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- Department of Biochemistry, Sechenov University, Moscow, Russia
| | - Undral Munkhsaikhan
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, Tennessee
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, Tennessee
| | - Jason N Johnson
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, Tennessee
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, Tennessee
| | - Neely R Alberson
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, Tennessee
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, Tennessee
| | - Joseph F Pierre
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, Tennessee
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, Tennessee
| | - Dennis D Black
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, Tennessee
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, Tennessee
| | - Deli Dong
- Department of Pharmacology, College of Pharmacy, Harbin Medical University, Harbin, China
| | - Jaclyn A Brennan
- Department of Biomedical Engineering, The George Washington University, Washington, District of Columbia
| | - Brianna M Cathey
- Department of Biomedical Engineering, The George Washington University, Washington, District of Columbia
| | - Igor R Efimov
- Department of Biomedical Engineering, The George Washington University, Washington, District of Columbia
| | - Jeffrey A Towbin
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, Tennessee
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, Tennessee
- Department of Pediatric Cardiology, St. Jude Children's Research Hospital, Memphis, Tennessee
| | - Enkhsaikhan Purevjav
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, Tennessee
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, Tennessee
| | - Lu Lu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee
| |
|
10
|
Abstract
Similarity has always been a key aspect in computer science and statistics. Any time two element vectors are compared, many different similarity approaches can be used, depending on the final goal of the comparison (Euclidean distance, Pearson correlation coefficient, Spearman's rank correlation coefficient, and others). But if the comparison has to be applied to more complex data samples, with features having different dimensionality and types which might need compression before processing, these measures would be unsuitable. In these cases, a siamese neural network may be the best choice: it consists of two identical artificial neural networks, each capable of learning the hidden representation of an input vector. The two neural networks are both feedforward perceptrons and employ error back-propagation during training; they work in parallel and compare their outputs at the end, usually through a cosine distance. The output generated by a siamese neural network execution can be considered the semantic similarity between the projected representations of the two input vectors. In this overview we first describe the siamese neural network architecture, and then we outline its main applications in a number of computational fields since its appearance in 1994. Additionally, we list the programming languages, software packages, tutorials, and guides that can be practically used by readers to implement this powerful machine learning model.
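The twin-network comparison described in this abstract can be illustrated with a minimal sketch. It assumes a single tanh layer per branch, random untrained weights, and hypothetical function names (`make_shared_weights`, `embed`, `siamese_similarity`); back-propagation training is omitted, so this only shows the forward pass in which both branches share the same weights and their outputs are compared by cosine similarity (cosine distance would be one minus this value).

```python
import math
import random


def make_shared_weights(n_inputs, n_hidden, seed=0):
    """Create one weight matrix; both twin branches reuse it (weight sharing)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
            for _ in range(n_hidden)]


def embed(x, weights):
    """Single feedforward layer with tanh activation (illustrative assumption)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def siamese_similarity(x1, x2, weights):
    """Pass both inputs through the same network and compare the embeddings."""
    return cosine_similarity(embed(x1, weights), embed(x2, weights))


weights = make_shared_weights(n_inputs=3, n_hidden=4)
print(siamese_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], weights))
print(siamese_similarity([1.0, 2.0, 3.0], [-3.0, 0.5, 1.0], weights))
```

Because both inputs go through the same weights, identical inputs always map to identical embeddings, which is the property that makes the final cosine comparison meaningful as a similarity score.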
|
11
|
Xu F, Gao J, Munkhsaikhan U, Li N, Gu Q, Pierre JF, Starlard-Davenport A, Towbin JA, Cui Y, Purevjav E, Lu L. The Genetic Dissection of Ace2 Expression Variation in the Heart of Murine Genetic Reference Population. Front Cardiovasc Med 2020; 7:582949. [PMID: 33330645 PMCID: PMC7714829 DOI: 10.3389/fcvm.2020.582949] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/13/2020] [Accepted: 09/02/2020] [Indexed: 12/12/2022] Open
Abstract
Background: A high inflammatory and cytokine burden that induces vascular inflammation, myocarditis, cardiac arrhythmias, and myocardial injury is associated with a lethal outcome in COVID-19. The SARS-CoV-2 virus utilizes the ACE2 receptor for cell entry in a similar way to SARS-CoV. This study investigates the regulation, gene network, and associated pathways of ACE2 that may be involved in inflammatory and cardiovascular complications of COVID-19. Methods: Cardiovascular traits were determined in one of the largest mouse genetic reference populations, the BXD recombinant inbred strains, using blood pressure, electrocardiography, and echocardiography measurements. Expression quantitative trait locus (eQTL) mapping, genetic correlation, and functional enrichment analysis were used to identify Ace2 regulation, gene pathway, and co-expression networks. Results: A wide range of variation was found in expression of Ace2 among the BXD strains. Levels of Ace2 expression are negatively correlated with cardiovascular traits, including systolic and diastolic blood pressure and P wave duration and amplitude. Ace2 co-expressed genes are significantly involved in cardiac- and inflammatory-related pathways. The eQTL mapping revealed that Cyld is a candidate upstream regulator for Ace2. Moreover, the protein-protein interaction (PPI) network analysis inferred several potential key regulators (Cul3, Atf2, Vcp, Jun, Ppp1cc, Npm1, Mapk8, Set, Dlg1, Mapk14, and Hspa1b) for Ace2 co-expressed genes in the heart. Conclusions: Ace2 is associated with blood pressure, atrial morphology, and sinoatrial conduction in BXD mice. Ace2 co-varies with Atf2, Cyld, Jun, Mapk8, and Mapk14 and is enriched in the RAS, TGFβ, TNFα, and p38α signaling pathways, involved in inflammation and cardiac damage.
We suggest that all these novel Ace2-associated genes and pathways may be targeted for preventive, diagnostic, and therapeutic purposes in cardiovascular damage in patients with systemic inflammation, including COVID-19 patients.
Affiliation(s)
- Fuyi Xu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, United States
| | - Jun Gao
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, United States
| | - Undral Munkhsaikhan
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, TN, United States
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, TN, United States
| | - Ning Li
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, TN, United States
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, TN, United States
- Department of Cardiology, Second Affiliated Hospital, Harbin Medical University, Harbin, China
| | - Qingqing Gu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, United States
- Department of Cardiology, The Affiliated Hospital of Nantong University, Nantong, China
| | - Joseph F. Pierre
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, TN, United States
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, TN, United States
| | - Athena Starlard-Davenport
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, United States
| | - Jeffrey A. Towbin
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, TN, United States
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, TN, United States
- Pediatric Cardiology, St. Jude Children's Research Hospital, Memphis, TN, United States
| | - Yan Cui
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, United States
| | - Enkhsaikhan Purevjav
- Department of Pediatrics, University of Tennessee Health Science Center, Memphis, TN, United States
- Children's Foundation Research Institute, Le Bonheur Children's Hospital, Memphis, TN, United States
| | - Lu Lu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, United States
| |
|
12
|
Roy S, Zaman KI, Williams RW, Homayouni R. Evaluation of Sirtuin-3 probe quality and co-expressed genes using literature cohesion. BMC Bioinformatics 2019; 20:104. [PMID: 30871457 PMCID: PMC6419539 DOI: 10.1186/s12859-019-2621-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND: Gene co-expression studies can provide important insights into molecular and cellular signaling pathways. The GeneNetwork database is a unique resource for co-expression analysis using data from a variety of tissues across genetically distinct inbred mice. However, extraction of biologically meaningful co-expressed gene sets is challenging due to variability in microarray platforms, probe quality, normalization methods, and confounding biological factors. In this study, we tested whether literature-derived functional cohesion could be used as an objective metric in lieu of 'ground truth' to evaluate the quality of probes and microarray datasets. RESULTS: We examined Sirtuin-3 (Sirt3) co-expressed gene sets extracted from either liver or brain tissues of BXD recombinant inbred mice in the GeneNetwork database. Depending on the microarray platform, there were as many as 26 probes that targeted different regions of the Sirt3 primary transcript. Co-expressed gene sets (ranging from 100 to 1000 genes) associated with each Sirt3 probe were evaluated using the previously developed literature-derived cohesion p-value (LPv) and benchmarked against 'gold standards' derived from proteomic studies or Gene Ontology classifications. We found that the maximal F-measure was obtained at an average window size of 535 genes. Using a set size of 500 genes, the Pearson correlations between LPv and F-measure as well as between LPv and mitochondrial gene enrichment p-values were 0.90 and 0.93, respectively. Importantly, we found that the LPv approach can distinguish high quality Sirt3 probes. Analysis of the most functionally cohesive Sirt3 co-expressed gene set revealed core metabolic pathways that were shared between hippocampus and liver as well as distinct pathways which were unique to each tissue. These results are consistent with other studies that suggest Sirt3 is a key metabolic regulator and has distinct functions in energy-producing vs. energy-demanding tissues.
CONCLUSIONS: Our results provide proof-of-concept that literature cohesion analysis is useful for evaluating the quality of probes and microarray datasets, particularly when experimentally derived gold standards are unavailable. Our approach would enable researchers to rapidly identify biologically meaningful co-expressed gene sets and facilitate discovery from high throughput genomic data.
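As a rough illustration of how a co-expressed gene set might be benchmarked against a gold standard with an F-measure, the sketch below computes precision, recall, and their harmonic mean. The abstract does not state which F-measure variant was used, so the balanced F1 form is an assumption here, and the gene identifiers (`g1`, `g2`, ...) and the function name `f_measure` are hypothetical placeholders.

```python
def f_measure(predicted, gold):
    """Balanced F-measure (F1) of a predicted gene set against a gold-standard set.

    Assumption: F1 = 2 * precision * recall / (precision + recall).
    Returns 0.0 when the two sets share no genes.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # fraction of predictions that are correct
    recall = true_positives / len(gold)          # fraction of the gold standard recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical co-expressed set and gold standard (placeholder gene names).
co_expressed = ["g1", "g2", "g3", "g4"]
gold_standard = ["g2", "g3", "g5"]
print(f_measure(co_expressed, gold_standard))
```

Sweeping the size of the co-expressed set (for example, the top N correlated genes) and recording this score at each size is one way a "maximal F-measure" window could be located.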
Affiliation(s)
- Sujoy Roy
- Bioinformatics Program, University of Memphis, Memphis, 38152 USA
- Center for Translational Informatics, University of Memphis, Memphis, 38152 USA
| | - Kazi I. Zaman
- Bioinformatics Program, University of Memphis, Memphis, 38152 USA
| | - Robert W. Williams
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, 38163 USA
| | - Ramin Homayouni
- Bioinformatics Program, University of Memphis, Memphis, 38152 USA
- Center for Translational Informatics, University of Memphis, Memphis, 38152 USA
- Department of Biology, University of Memphis, Memphis, 38152 USA
| |
|
13
|
Zhou D, Zhao Y, Hook M, Zhao W, Starlard-Davenport A, Cook MN, Jones BC, Hamre KM, Lu L. Ethanol's Effect on Coq7 Expression in the Hippocampus of Mice. Front Genet 2018; 9:602. [PMID: 30564271 PMCID: PMC6288283 DOI: 10.3389/fgene.2018.00602] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/04/2018] [Accepted: 11/16/2018] [Indexed: 01/16/2023] Open
Abstract
Coenzyme Q (CoQ) is a well-studied molecule, present in every cell membrane in the body, best known for its roles as a mitochondrial electron transporter and a potent membrane anti-oxidant. Much of the previous work was done in vitro in yeast; more recent work has suggested that CoQ may have additional roles, prompting calls for a re-assessment of its role using in vivo systems in mammals. Here we investigated the putative role of Coenzyme Q in ethanol-induced effects in vivo using BXD RI mice. We examined hippocampal expression of Coq7 in saline controls and after an acute ethanol treatment, noting enriched biologic processes and pathways following ethanol administration. We also identified 45 ethanol-related phenotypes that were significantly correlated with Coq7 expression, including six phenotypes related to conditioned taste aversion and ethanol preference. This analysis highlights the need for further investigation of Coq7 and related genes in vivo as well as previously unrecognized roles that it may play in the hippocampus.
Affiliation(s)
- Diana Zhou
- Department of Genetics, Genomics and Informatics, The University of Tennessee Health Science Center, Memphis, TN, United States
| | - Yinghong Zhao
- Department of Neurology, Affiliated Hospital of Nantong University, Nantong, China
| | - Michael Hook
- Department of Genetics, Genomics and Informatics, The University of Tennessee Health Science Center, Memphis, TN, United States
| | - Wenyuan Zhao
- Department of Genetics, Genomics and Informatics, The University of Tennessee Health Science Center, Memphis, TN, United States
| | - Athena Starlard-Davenport
- Department of Genetics, Genomics and Informatics, The University of Tennessee Health Science Center, Memphis, TN, United States
| | - Melloni N Cook
- Department of Genetics, Genomics and Informatics, The University of Tennessee Health Science Center, Memphis, TN, United States
- Department of Psychology, The University of Memphis, Memphis, TN, United States
| | - Byron C Jones
- Department of Genetics, Genomics and Informatics, The University of Tennessee Health Science Center, Memphis, TN, United States
| | - Kristin M Hamre
- Department of Anatomy and Neurobiology, The University of Tennessee Health Science Center, Memphis, TN, United States
| | - Lu Lu
- Department of Genetics, Genomics and Informatics, The University of Tennessee Health Science Center, Memphis, TN, United States
| |
|
14
|
van Gastel J, Hendrickx JO, Leysen H, Santos-Otte P, Luttrell LM, Martin B, Maudsley S. β-Arrestin Based Receptor Signaling Paradigms: Potential Therapeutic Targets for Complex Age-Related Disorders. Front Pharmacol 2018; 9:1369. [PMID: 30546309 PMCID: PMC6280185 DOI: 10.3389/fphar.2018.01369] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Received: 08/22/2018] [Accepted: 11/07/2018] [Indexed: 12/14/2022] Open
Abstract
G protein coupled receptors (GPCRs) were first characterized as signal transducers that elicit downstream effects through modulation of guanine (G) nucleotide-binding proteins. The pharmacotherapeutic exploitation of this signaling paradigm has created a drug-based field covering nearly 50% of the current pharmacopeia. Since the groundbreaking discoveries of the late 1990s to the present day, it is now clear however that GPCRs can also generate productive signaling cascades through the modulation of β-arrestin functionality. β-Arrestins were first thought to only regulate receptor desensitization and internalization - exemplified by the action of visual arrestin with respect to rhodopsin desensitization. Nearly 20 years ago, it was found that rather than controlling GPCR signal termination, productive β-arrestin dependent GPCR signaling paradigms were highly dependent on multi-protein complex formation and generated long-lasting cellular effects, in contrast to G protein signaling which is transient and functions through soluble second messenger systems. β-Arrestin signaling was then first shown to activate mitogen activated protein kinase signaling in a G protein-independent manner and eventually initiate protein transcription - thus controlling expression patterns of downstream proteins. While the possibility of developing β-arrestin biased or functionally selective ligands is now being investigated, no additional research has been performed on its possible contextual specificity in treating age-related disorders. The ability of β-arrestin-dependent signaling to control complex and multidimensional protein expression patterns makes this therapeutic strategy feasible, as treating complex age-related disorders will likely require therapeutics that can exert network-level efficacy profiles. 
It is our understanding that therapeutically targeting G protein-independent effectors such as β-arrestin will aid in the development of precision medicines with tailored efficacy profiles for disease/age-specific contextualities.
Affiliation(s)
- Jaana van Gastel
- Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Translational Neurobiology Group, Centre for Molecular Neuroscience, VIB, Antwerp, Belgium
| | - Jhana O Hendrickx
- Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Translational Neurobiology Group, Centre for Molecular Neuroscience, VIB, Antwerp, Belgium
| | - Hanne Leysen
- Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Translational Neurobiology Group, Centre for Molecular Neuroscience, VIB, Antwerp, Belgium
| | - Paula Santos-Otte
- Institute of Biophysics, Humboldt University of Berlin, Berlin, Germany
| | - Louis M Luttrell
- Division of Endocrinology, Diabetes and Medical Genetics, Medical University of South Carolina, Charleston, SC, United States
| | - Bronwen Martin
- Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Stuart Maudsley
- Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Translational Neurobiology Group, Centre for Molecular Neuroscience, VIB, Antwerp, Belgium
| |
|
15
|
Luo Z, Xie R, Chen W, Ye Z. Automatic domain terminology extraction and its evaluation for domain knowledge graph construction. WEB INTELLIGENCE 2018. [DOI: 10.3233/web-180385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Affiliation(s)
- Zhiwei Luo
- School of Automation, Huazhong University of Science and Technology, 1037 Luoyu Road, Wuhan 430074, China
| | - Rong Xie
- School of Computer Science, Wuhan University, 39 Luoyu Road, Wuhan 430079, China
| | - Wen Chen
- Shanghai Key Laboratory of Aerospace Intelligent Control Technology, Shanghai Institute of Spaceflight Control Technology, Shanghai 201109, China
| | - Zetao Ye
- International School of Software, Wuhan University, 39 Luoyu Road, Wuhan 430079, China
| |
|
16
|
Roy S, Yun D, Madahian B, Berry MW, Deng LY, Goldowitz D, Homayouni R. Navigating the Functional Landscape of Transcription Factors via Non-Negative Tensor Factorization Analysis of MEDLINE Abstracts. Front Bioeng Biotechnol 2017; 5:48. [PMID: 28894735 PMCID: PMC5581332 DOI: 10.3389/fbioe.2017.00048] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 05/18/2017] [Accepted: 07/31/2017] [Indexed: 01/09/2023] Open
Abstract
In this study, we developed and evaluated a novel text-mining approach, using non-negative tensor factorization (NTF), to simultaneously extract and functionally annotate transcriptional modules consisting of sets of genes, transcription factors (TFs), and terms from MEDLINE abstracts. A sparse 3-mode term × gene × TF tensor was constructed that contained weighted frequencies of 106,895 terms in 26,781 abstracts shared among 7,695 genes and 994 TFs. The tensor was decomposed into sub-tensors using non-negative tensor factorization (NTF) across 16 different approximation ranks. Dominant entries of each of 2,861 sub-tensors were extracted to form term–gene–TF annotated transcriptional modules (ATMs). More than 94% of the ATMs were found to be enriched in at least one KEGG pathway or GO category, suggesting that the ATMs are functionally relevant. One advantage of this method is that it can discover potentially new gene–TF associations from the literature. Using a set of microarray and ChIP-Seq datasets as gold standard, we show that the precision of our method for predicting gene–TF associations is significantly higher than chance. In addition, we demonstrate that the terms in each ATM can be used to suggest new GO classifications to genes and TFs. Taken together, our results indicate that NTF is useful for simultaneous extraction and functional annotation of transcriptional regulatory networks from unstructured text, as well as for literature based discovery. A web tool called Transcriptional Regulatory Modules Extracted from Literature (TREMEL), available at http://binf1.memphis.edu/tremel, was built to enable browsing and searching of ATMs.
Affiliation(s)
- Sujoy Roy
- Bioinformatics Program, University of Memphis, Memphis, TN, United States
- Center for Translational Informatics, University of Memphis, Memphis, TN, United States
| | - Daqing Yun
- Computer and Information Sciences Program, Harrisburg University of Science and Technology, Harrisburg, PA, United States
| | - Behrouz Madahian
- Department of Mathematical Sciences, University of Memphis, Memphis, TN, United States
| | - Michael W Berry
- Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, United States
| | - Lih-Yuan Deng
- Department of Mathematical Sciences, University of Memphis, Memphis, TN, United States
| | - Daniel Goldowitz
- Center for Molecular Medicine and Therapeutics, University of British Columbia, Vancouver, BC, Canada
| | - Ramin Homayouni
- Bioinformatics Program, University of Memphis, Memphis, TN, United States
- Center for Translational Informatics, University of Memphis, Memphis, TN, United States
- Department of Biological Sciences, University of Memphis, Memphis, TN, United States
| |
|
17
|
Zhou DX, Zhao Y, Baker JA, Gu Q, Hamre KM, Yue J, Jones BC, Cook MN, Lu L. The effect of alcohol on the differential expression of cluster of differentiation 14 gene, associated pathways, and genetic network. PLoS One 2017; 12:e0178689. [PMID: 28575045 PMCID: PMC5456352 DOI: 10.1371/journal.pone.0178689] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/14/2016] [Accepted: 05/17/2017] [Indexed: 12/13/2022] Open
Abstract
Alcohol consumption affects human health in part by compromising the immune system. In this study, we examined the expression of the Cd14 (cluster of differentiation 14) gene, which is involved in the immune system through a proinflammatory cascade. Expression was evaluated in BXD mice treated with saline or acute 1.8 g/kg i.p. ethanol (12.5% v/v). Hippocampal gene expression data were generated to examine differential expression and to perform systems genetics analyses. The Cd14 gene expression showed significant changes among the BXD strains after ethanol treatment, and eQTL mapping revealed that Cd14 is a cis-regulated gene. We also identified eighteen ethanol-related phenotypes correlated with Cd14 expression related to either ethanol responses or ethanol consumption. Pathway analysis was performed to identify possible biological pathways involved in the response to ethanol and Cd14. We also constructed a genetic network for Cd14 using the top 20 correlated genes and present several genes possibly involved in Cd14 and ethanol responses based on differential gene expression. In conclusion, we found Cd14, along with several other genes and pathways, to be involved in ethanol responses in the hippocampus, such as increased susceptibility to lipopolysaccharides and neuroinflammation.
Affiliation(s)
- Diana X. Zhou
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America
| | - Yinghong Zhao
- Department of Neurology, Affiliated Hospital of Nantong University, Nantong, China
| | - Jessica A. Baker
- Department of Anatomy and Neurobiology, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America
| | - Qingqing Gu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America
- Department of Cardiology, Affiliated Hospital of Nantong University, Nantong, China
| | - Kristin M. Hamre
- Department of Anatomy and Neurobiology, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America
| | - Junming Yue
- Department of Pathology, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America
| | - Byron C. Jones
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America
| | - Melloni N. Cook
- Department of Psychology, University of Memphis, Memphis, Tennessee, United States of America
| | - Lu Lu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America
- Department of Neurology, Affiliated Hospital of Nantong University, Nantong, China
| |
|
18
|
Abstract
The goal of systems genetics is to understand the impact of genetic variation across all levels of biological organization, from mRNAs, proteins, and metabolites, to higher-order physiological and behavioral traits. This approach requires the accumulation and integration of many types of data, and also requires the use of many types of statistical tools to extract relevant patterns of covariation and causal relations as a function of genetics, environment, stage, and treatment. In this protocol we explain how to use the GeneNetwork web service, a powerful and free online resource for systems genetics. We provide workflows and methods to navigate massive multiscalar data sets and we explain how to use an extensive systems genetics toolkit for analysis and synthesis. Finally, we provide two detailed case studies that take advantage of human and mouse cohorts to evaluate linkage between gene variants, addiction, and aging.
|
19
|
Saigi-Morgui N, Quteineh L, Bochud PY, Crettol S, Kutalik Z, Wojtowicz A, Bibert S, Beckmann S, Mueller NJ, Binet I, van Delden C, Steiger J, Mohacsi P, Stirnimann G, Soccal PM, Pascual M, Eap CB, the Swiss Transplant Cohort Study. Weighted Genetic Risk Scores and Prediction of Weight Gain in Solid Organ Transplant Populations. PLoS One 2016; 11:e0164443. [PMID: 27788139 PMCID: PMC5082801 DOI: 10.1371/journal.pone.0164443] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 05/31/2016] [Accepted: 09/26/2016] [Indexed: 12/18/2022] Open
Abstract
Background: Polygenic obesity in Solid Organ Transplant (SOT) populations is considered a risk factor for the development of metabolic abnormalities and graft survival. Few studies to date have studied the genetics of weight gain in SOT recipients. We aimed to determine whether weighted genetic risk scores (w-GRS) integrating genetic polymorphisms from GWAS studies (SNP group#1 and SNP group#2) and from Candidate Gene studies (SNP group#3) influence BMI in SOT populations and if they predict ≥10% weight gain (WG) one year after transplantation. To do so, two samples (nA = 995, nB = 156) were obtained from naturalistic studies and three w-GRS were constructed and tested for association with BMI over time. Prediction of 10% WG at one year after transplantation was assessed with models containing genetic and clinical factors. Results: w-GRS were associated with BMI in sample A and B combined (BMI increased by 0.14 and 0.11 units per additional risk allele in SNP group#1 and #2, respectively, p-values < 0.008). w-GRS of SNP group#3 showed an effect of 0.01 kg/m² per additional risk allele when combining sample A and B (p-value 0.04). Models with genetic factors performed better than models without in predicting 10% WG at one year after transplantation. Conclusions: This is the first study in SOT evaluating extensively the association of w-GRS with BMI and the influence of clinical and genetic factors on 10% of WG one year after transplantation, showing the importance of integrating genetic factors in the final model. Genetics of obesity among SOT recipients remains an important issue and can contribute to treatment personalization and prediction of WG after transplantation.
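The abstract does not give the formula used for the weighted genetic risk score, so the following is only a sketch of one common construction: each SNP's risk-allele count (0, 1, or 2) is multiplied by a per-SNP weight, typically a published effect size, and the products are summed. The function name `weighted_grs`, the three-SNP example, and the weights are all hypothetical and are not taken from the study.

```python
def weighted_grs(risk_allele_counts, weights):
    """Weighted genetic risk score: sum of weight_i * risk_allele_count_i.

    Assumptions (not from the source): counts are 0, 1, or 2 risk alleles
    per SNP, and weights are per-SNP effect sizes supplied by the caller.
    """
    if len(risk_allele_counts) != len(weights):
        raise ValueError("counts and weights must have the same length")
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("risk allele counts must be 0, 1, or 2")
    return sum(w * c for w, c in zip(weights, risk_allele_counts))


# Hypothetical individual genotyped at three SNPs with made-up weights.
counts = [0, 1, 2]
effect_sizes = [0.1, 0.2, 0.3]
print(weighted_grs(counts, effect_sizes))
```

An unweighted score is the special case where every weight equals 1, which reduces the score to a simple count of risk alleles.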
Affiliation(s)
- Núria Saigi-Morgui
- Unit of Pharmacogenetics and Clinical Psychopharmacology, Department of Psychiatry, Lausanne University Hospital, Prilly, Switzerland
| | - Lina Quteineh
- Unit of Pharmacogenetics and Clinical Psychopharmacology, Department of Psychiatry, Lausanne University Hospital, Prilly, Switzerland
| | - Pierre-Yves Bochud
- Service of Infectious Diseases, University Hospital and University of Lausanne, Lausanne, Switzerland
| | - Severine Crettol
- Unit of Pharmacogenetics and Clinical Psychopharmacology, Department of Psychiatry, Lausanne University Hospital, Prilly, Switzerland
| | - Zoltán Kutalik
- Institute of Social and Preventive Medicine (IUMSP), Lausanne University Hospital, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Agnieszka Wojtowicz
- Service of Infectious Diseases, University Hospital and University of Lausanne, Lausanne, Switzerland
| | - Stéphanie Bibert
- Service of Infectious Diseases, University Hospital and University of Lausanne, Lausanne, Switzerland
| | - Sonja Beckmann
- Institute of Nursing Science, University of Basel, Basel, Switzerland
| | - Nicolas J Mueller
- Division of Infectious Diseases and Hospital Epidemiology, University Hospital, Zurich, Switzerland
| | - Isabelle Binet
- Service of Nephrology and Transplantation Medicine, Kantonsspital, St Gallen, Switzerland
| | | | - Jürg Steiger
- Service of Nephrology, University Hospital, Basel, Switzerland
| | - Paul Mohacsi
- Swiss Cardiovascular Center Bern, University Hospital, Bern, Switzerland
| | - Guido Stirnimann
- University Clinic of Visceral Surgery and Medicine, Inselspital, Bern, Switzerland
| | - Paola M. Soccal
- Service of Pulmonary Medicine, University Hospital, Geneva, Switzerland
| | - Manuel Pascual
- Transplant Center, Lausanne University Hospital, Lausanne, Switzerland
| | - Chin B Eap
- Unit of Pharmacogenetics and Clinical Psychopharmacology, Department of Psychiatry, Lausanne University Hospital, Prilly, Switzerland
- School of Pharmaceutical Sciences, University of Geneva, University of Lausanne, Geneva, Switzerland
| | | |
|
20
|
Roy S, Curry BC, Madahian B, Homayouni R. Prioritization, clustering and functional annotation of MicroRNAs using latent semantic indexing of MEDLINE abstracts. BMC Bioinformatics 2016; 17:350. [PMID: 27766940 PMCID: PMC5073981 DOI: 10.1186/s12859-016-1223-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 02/07/2023] Open
Abstract
Background: The amount of scientific information about MicroRNAs (miRNAs) is growing exponentially, making it difficult for researchers to interpret experimental results. In this study, we present an automated text mining approach using Latent Semantic Indexing (LSI) for prioritization, clustering and functional annotation of miRNAs. Results: For approximately 900 human miRNAs indexed in miRBase, text documents were created by concatenating titles and abstracts of MEDLINE citations which refer to the miRNAs. The documents were parsed and a weighted term-by-miRNA frequency matrix was created, which was subsequently factorized via singular value decomposition to extract pair-wise cosine values between the term (keyword) and miRNA vectors in reduced rank semantic space. LSI enables derivation of both explicit and implicit associations between entities based on word usage patterns. Using miR2Disease as a gold standard, we found that LSI identified keyword-to-miRNA relationships with high accuracy. In addition, we demonstrate that pair-wise associations between miRNAs can be used to group them into categories which are functionally aligned. Finally, term ranking by querying the LSI space with a group of miRNAs enabled annotation of the clusters with functionally related terms. Conclusions: LSI modeling of MEDLINE abstracts provides a robust and automated method for miRNA related knowledge discovery. The latest collection of miRNA abstracts and LSI model can be accessed through the web tool miRNA Literature Network (miRLiN) at http://bioinfo.memphis.edu/mirlin.
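The pipeline summarized in this abstract (weighted term-by-document matrix, singular value decomposition, cosines in a reduced-rank space) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy counts, the miRNA labels, the log-TF times IDF weighting, the rank `k = 2`, and the function name `lsi_term_doc_cosines` are all assumptions made for illustration, and scaling conventions for LSI vectors vary between implementations.

```python
import numpy as np


def lsi_term_doc_cosines(counts: np.ndarray, k: int) -> np.ndarray:
    """Return a (terms x documents) matrix of cosines in a rank-k LSI space.

    Assumes every term occurs in at least one document and in fewer than
    all documents, so that the IDF weights are finite and non-zero.
    """
    n_docs = counts.shape[1]
    # Assumed weighting: log-scaled term frequency times inverse document frequency.
    doc_freq = np.count_nonzero(counts, axis=1)
    idf = np.log(n_docs / doc_freq)
    weighted = np.log1p(counts) * idf[:, None]

    # Singular value decomposition, truncated to the k largest singular values.
    u, s, vt = np.linalg.svd(weighted, full_matrices=False)
    term_vecs = u[:, :k] * s[:k]    # one row per term
    doc_vecs = vt[:k, :].T * s[:k]  # one row per document

    # Cosine between every term vector and every document vector.
    term_unit = term_vecs / np.linalg.norm(term_vecs, axis=1, keepdims=True)
    doc_unit = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return term_unit @ doc_unit.T


# Hypothetical toy data: rows are keywords, columns are per-miRNA documents.
terms = ["apoptosis", "tumor", "cardiac", "hypertrophy"]
mirnas = ["miR-A", "miR-B", "miR-C"]
counts = np.array(
    [
        [3, 2, 0],
        [2, 3, 0],
        [0, 0, 4],
        [0, 1, 3],
    ],
    dtype=float,
)

sim = lsi_term_doc_cosines(counts, k=2)
for term, row in zip(terms, sim):
    ranked = sorted(zip(mirnas, row), key=lambda pair: -pair[1])
    print(term, [(name, round(float(score), 2)) for name, score in ranked])
```

In this toy setting, ranking the columns of `sim` for a keyword corresponds to the prioritization step, and the same vectors could be compared document-to-document for clustering.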
Affiliation(s)
- Sujoy Roy
- Bioinformatics Program, University of Memphis, Memphis, 38152, USA; Center for Translational Informatics, University of Memphis, Memphis, 38152, USA
| | - Brandon C Curry
- Bioinformatics Program, University of Memphis, Memphis, 38152, USA
| | - Behrouz Madahian
- Department of Mathematical Sciences, University of Memphis, Memphis, 38152, USA
| | - Ramin Homayouni
- Bioinformatics Program, University of Memphis, Memphis, 38152, USA; Center for Translational Informatics, University of Memphis, Memphis, 38152, USA; Department of Biology, University of Memphis, Memphis, 38152, USA
|
21
|
Makrehchi M. Predicting political conflicts from polarized social media. WEB INTELLIGENCE 2016. [DOI: 10.3233/web-160333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Affiliation(s)
- Masoud Makrehchi
- University of Ontario Institute of Technology (UOIT), 2000 Simcoe Street North, Oshawa, Ontario L1H 7K4, Canada
|
22
|
Ailem M, Role F, Nadif M, Demenais F. Unsupervised text mining for assessing and augmenting GWAS results. J Biomed Inform 2016; 60:252-9. [PMID: 26911523 DOI: 10.1016/j.jbi.2016.02.008] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 10/10/2015] [Revised: 12/21/2015] [Accepted: 02/14/2016] [Indexed: 12/31/2022]
Abstract
Text mining can assist in the analysis and interpretation of large-scale biomedical data, helping biologists to quickly and cheaply gain confirmation of hypothesized relationships between biological entities. We examine this in the context of genome-wide association studies (GWAS), an actively emerging field that has helped identify many genes associated with multifactorial diseases. These studies identify groups of genes associated with the same phenotype, but provide no information about the relationships between these genes. Therefore, our objective is to augment GWAS results with unsupervised text mining techniques, using text-based cosine similarity comparisons and clustering applied to candidate and random gene vectors. We propose a generic framework, which we used to characterize the relationships between 10 genes reported to be associated with asthma by a previous GWAS. The results of this experiment showed that the similarities between these 10 genes were significantly stronger than would be expected by chance (one-sided p-value < 0.01). Clustering the observed and randomly selected genes also generated hypotheses about potential functional relationships between these genes and thus contributed to the discovery of new candidate genes for asthma.
Affiliation(s)
- Melissa Ailem
- LIPADE, Université Paris Descartes, Sorbonne Paris Cité, Paris F-75006, France
| | - François Role
- LIPADE, Université Paris Descartes, Sorbonne Paris Cité, Paris F-75006, France
| | - Mohamed Nadif
- LIPADE, Université Paris Descartes, Sorbonne Paris Cité, Paris F-75006, France
| | - Florence Demenais
- INSERM, Genetic Variation and Human Diseases Unit, UMR-946, Paris F-75010, France; Institut Universitaire d'Hématologie, Université Paris Diderot, Sorbonne Paris Cité, Paris F-75010, France
|
23
|
Lu L, Pandey AK, Houseal MT, Mulligan MK. The Genetic Architecture of Murine Glutathione Transferases. PLoS One 2016; 11:e0148230. [PMID: 26829228 PMCID: PMC4734686 DOI: 10.1371/journal.pone.0148230] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 02/20/2014] [Accepted: 01/14/2016] [Indexed: 12/17/2022] Open
Abstract
Glutathione S-transferase (GST) genes play a protective role against oxidative stress and may influence disease risk and drug pharmacokinetics. In this study, massive multiscalar trait profiling across a large population of mice derived from a cross between C57BL/6J (B6) and DBA/2J (D2), the BXD family, was combined with linkage and bioinformatic analyses to characterize mechanisms controlling GST expression and to identify downstream consequences of this variation. Similar to humans, mice show a wide range in expression of GST family members. Variation in the expression of Gsta4, Gstt2, Gstz1, Gsto1, and Mgst3 is modulated by local expression QTLs (eQTLs) in several tissues. Higher expression of Gsto1 in brain and liver of BXD strains is strongly associated (P < 0.01) with inheritance of the B6 parental allele, whereas higher expression of Gsta4 and Mgst3 in brain and liver, and of Gstt2 and Gstz1 in brain, is strongly associated with inheritance of the D2 parental allele. Allele-specific assays confirmed that expression of Gsto1, Gsta4, and Mgst3 is modulated by sequence variants within or near each gene locus. We exploited this endogenous variation to identify coexpression networks and downstream targets in mouse and human. Through a combined systems genetics approach, we provide new insight into the biological role of naturally occurring variants in GST genes.
Affiliation(s)
- Lu Lu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, 38106, United States of America
- Co-innovation Center of Neuroregeneration, Nantong University, Nantong, Jiangsu Province, 226001, China
| | - Ashutosh K. Pandey
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, 38106, United States of America
| | - M. Trevor Houseal
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, 38106, United States of America
| | - Megan K. Mulligan
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, 38106, United States of America
|
24
|
Xu J, Dai A, Chen Q, Liu X, Zhang Y, Wang H, Li H, Chen Y, Cao M. Genetic regulation analysis reveals involvement of tumor necrosis factor and alpha-induced protein 3 in stress response in mice. Gene 2016; 576:528-36. [PMID: 26546835 DOI: 10.1016/j.gene.2015.10.071] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/02/2015] [Revised: 10/29/2015] [Accepted: 10/31/2015] [Indexed: 11/30/2022]
Abstract
To study whether Tnfaip3 is related to the stress response and to characterize its genetic regulation, we used C57BL/6J (B6) and DBA/2 (D2) mice to build a model of chronic unpredictable mild stress. RT-PCR, Western blotting and immunohistochemistry were used to study the variation in Tnfaip3 expression in hippocampal tissue of B6 and D2 mice after stress. Tnfaip3 expression was significantly higher in the chronic unpredictable stress models than in untreated mice (P<0.05), indicating that Tnfaip3 might take part in the stress response. The expression of Tnfaip3 is regulated by a cis-acting quantitative trait locus (cis-eQTL). We identified 5 genes that are controlled by Tnfaip3 and 64 genes whose expression is highly associated with Tnfaip3, 9 of which have previously been implicated in stress-related pathways. To assess the relationship between Tnfaip3 and its downstream genes or network members, we transfected SH-SY5Y cells with Tnfaip3 siRNA, leading to down-regulation of Tnfaip3 mRNA. We confirmed a significant influence of Tnfaip3 depletion on the expression of Tsc22d3, Pex7, Rap2a, Slc2a3, and Gap43. These validated downstream genes and members of the Tnfaip3 gene network provide new insight into the biological mechanisms of Tnfaip3 in chronic unpredictable stress.
Affiliation(s)
- Jian Xu
- Department of Neurology, Nantong University Affiliated Mental Health Center, Jiangsu, Nantong 226001, China
| | - Aihua Dai
- Department of neurology, Affiliated Hospital of Nantong University, Jiangsu, Nantong 226001, China
| | - Qi Chen
- Department of neurology, Affiliated Hospital of Nantong University, Jiangsu, Nantong 226001, China
| | - Xiaorong Liu
- Department of neurology, Affiliated Hospital of Nantong University, Jiangsu, Nantong 226001, China
| | - Yu Zhang
- Department of neurology, Affiliated Hospital of Nantong University, Jiangsu, Nantong 226001, China
| | - Hongmei Wang
- Department of neurology, Affiliated Hospital of Nantong University, Jiangsu, Nantong 226001, China
| | - Haizhen Li
- Department of neurology, Affiliated Hospital of Nantong University, Jiangsu, Nantong 226001, China
| | - Ying Chen
- Department of Histology and Embryology, Medical College, Nantong University, Jiangsu, Nantong 226001, China
| | - Maohong Cao
- Department of neurology, Affiliated Hospital of Nantong University, Jiangsu, Nantong 226001, China.
|
25
|
Rouchka EC, Chariker JH, Harrison BJ. Proceedings of the Fourteenth Annual UT-KBRIN Bioinformatics Summit 2015. BMC Bioinformatics 2015; 16 Suppl 15:I1-P21. [PMID: 26510995 PMCID: PMC4625115 DOI: 10.1186/1471-2105-16-s15-i1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
|
26
|
Daimon CM, Jasien JM, Wood WH, Zhang Y, Becker KG, Silverman JL, Crawley JN, Martin B, Maudsley S. Hippocampal Transcriptomic and Proteomic Alterations in the BTBR Mouse Model of Autism Spectrum Disorder. Front Physiol 2015; 6:324. [PMID: 26635614 PMCID: PMC4656818 DOI: 10.3389/fphys.2015.00324] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 07/15/2015] [Accepted: 10/27/2015] [Indexed: 12/25/2022] Open
Abstract
Autism spectrum disorders (ASD) are complex heterogeneous neurodevelopmental disorders of unclear etiology, and no cure currently exists. Prior studies have demonstrated that the black and tan, brachyury (BTBR) T+ Itpr3tf/J mouse strain displays a behavioral phenotype with ASD-like features. BTBR T+ Itpr3tf/J mice (referred to simply as BTBR) display deficits in social functioning, lack of communication ability, and engagement in stereotyped behavior. Despite extensive behavioral phenotypic characterization, little is known about the genes and proteins responsible for the presentation of the ASD-like phenotype in the BTBR mouse model. In this study, we employed bioinformatics techniques to gain a wide-scale understanding of the transcriptomic and proteomic changes associated with the ASD-like phenotype in BTBR mice. We found a number of genes and proteins, such as BDNF, Shank3, and ERK1, to be significantly altered in BTBR mice compared to C57BL/6J (B6) control mice; these are highly relevant to prior investigations of ASD. Furthermore, we identified distinct functional pathways altered in BTBR mice compared to B6 controls, including “axon guidance” and “regulation of actin cytoskeleton,” that have previously been shown to be altered in mouse models of ASD and in some human clinical populations, and that have been suggested as possible etiological mechanisms of ASD. Our wide-scale bioinformatics approach also discovered several previously unidentified genes and proteins associated with the ASD phenotype in BTBR mice, such as Caskin1, suggesting that bioinformatics could be an avenue by which novel therapeutic targets for ASD are uncovered. As a result, we believe that informed use of synergistic bioinformatics applications represents an invaluable tool for elucidating the etiology of complex disorders like ASD.
Affiliation(s)
- Caitlin M Daimon
- Metabolism Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
| | - Joan M Jasien
- Metabolism Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
| | - William H Wood
- Gene Expression and Genomics Unit, National Institutes of Health, Baltimore, MD, USA
| | - Yongqing Zhang
- Gene Expression and Genomics Unit, National Institutes of Health, Baltimore, MD, USA
| | - Kevin G Becker
- Gene Expression and Genomics Unit, National Institutes of Health, Baltimore, MD, USA
| | - Jill L Silverman
- Laboratory of Behavioral Neuroscience, Intramural Research Program, National Institute of Mental Health, Bethesda, MD, USA; MIND Institute, University of California Davis School of Medicine, Sacramento, CA, USA
| | - Jacqueline N Crawley
- Laboratory of Behavioral Neuroscience, Intramural Research Program, National Institute of Mental Health, Bethesda, MD, USA; MIND Institute, University of California Davis School of Medicine, Sacramento, CA, USA
| | - Bronwen Martin
- Metabolism Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
| | - Stuart Maudsley
- Receptor Pharmacology Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA; Translational Neurobiology Group, VIB Department of Molecular Genetics, University of Antwerp, Antwerp, Belgium; Laboratory of Neurogenetics, Institute Born-Bunge, University of Antwerp, Antwerpen, Belgium
|
27
|
Madahian B, Roy S, Bowman D, Deng LY, Homayouni R. A Bayesian approach for inducing sparsity in generalized linear models with multi-category response. BMC Bioinformatics 2015; 16 Suppl 13:S13. [PMID: 26423345 PMCID: PMC4597416 DOI: 10.1186/1471-2105-16-s13-s13] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: The dimension and complexity of high-throughput gene expression data create many challenges for downstream analysis. Several approaches exist to reduce the number of variables with respect to small sample sizes. In this study, we utilized the Generalized Double Pareto (GDP) prior to induce sparsity in a Bayesian Generalized Linear Model (GLM) setting. The approach was evaluated using a publicly available microarray dataset containing 99 samples corresponding to four different prostate cancer subtypes. RESULTS: A hierarchical Sparse Bayesian GLM using GDP prior (SBGG) was developed to take into account the progressive nature of the response variable. We obtained an average overall classification accuracy between 82.5% and 94%, which was higher than Support Vector Machine, Random Forest or a Sparse Bayesian GLM using double exponential priors. Additionally, SBGG outperforms the other 3 methods in correctly identifying pre-metastatic stages of cancer progression, which can prove extremely valuable for therapeutic and diagnostic purposes. Importantly, using Geneset Cohesion Analysis Tool, we found that the top 100 genes produced by SBGG had an average functional cohesion p-value of 2.0E-4 compared to 0.007 to 0.131 produced by the other methods. CONCLUSIONS: Using GDP in a Bayesian GLM model applied to cancer progression data results in better subclass prediction. In particular, the method identifies pre-metastatic stages of prostate cancer with substantially better accuracy and produces more functionally relevant gene sets.
|
28
|
Chicco D, Masseroli M. Software Suite for Gene and Protein Annotation Prediction and Similarity Search. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:837-43. [PMID: 26357324 DOI: 10.1109/tcbb.2014.2382127] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 05/26/2023]
Abstract
In the computational biology community, machine learning algorithms are key instruments for many applications, including the prediction of gene functions based upon the available biomolecular annotations. They may also be employed to compute similarity between genes or proteins. Here, we describe and discuss a software suite we developed to implement and make publicly available some such prediction methods and a computational technique based upon Latent Semantic Indexing (LSI), which leverages both inferred and available annotations to search for semantically similar genes. The suite consists of three components. BioAnnotationPredictor is a computational software module to predict new gene functions based upon Singular Value Decomposition of available annotations. SimilBio is a Web module that leverages annotations available or predicted by BioAnnotationPredictor to discover similarities between genes via LSI. The suite also includes SemSim, a new Web service built upon these modules to allow accessing them programmatically. We integrated SemSim in the Bio Search Computing framework (http://www.bioinformatics.deib.polimi.it/bio-seco/seco/), where users can exploit the Search Computing technology to run multi-topic complex queries on multiple integrated Web services. Accordingly, researchers may obtain ranked answers involving the computation of the functional similarity between genes in support of biomedical knowledge discovery.
|
29
|
Xu J, Cai R, Lu L, Duan C, Tao X, Chen D, Liu Y, Wang X, Cao M, Chen Y. Genetic regulatory network analysis reveals that low density lipoprotein receptor-related protein 11 is involved in stress responses in mice. Psychiatry Res 2014; 220:1131-7. [PMID: 25262641 DOI: 10.1016/j.psychres.2014.09.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 04/25/2014] [Revised: 08/22/2014] [Accepted: 09/06/2014] [Indexed: 11/29/2022]
Abstract
To study whether Lrp11 is involved in the stress response and to identify its expression regulatory network, a stress model was built using C57BL/6J (B6) and DBA/2 (D2) mice. Western blotting, qPCR and immunohistochemistry were used to investigate the variation in Lrp11 expression in amygdala tissue after exposure to stress. Lrp11 expression was more pronounced in the stress models than in normal mice (P<0.05), which suggests that Lrp11 might participate in the stress response. The expression of Lrp11 is controlled by a cis-acting quantitative trait locus (cis-eQTL). We identified four genes that are regulated by Lrp11 and 66 genes whose expression is highly correlated with Lrp11, seven of which have previously been implicated in stress pathways. To evaluate the relationship between Lrp11 and its downstream genes or network members, we transfected HEK 293T cells and SH-SY5Y cells with Lrp11 siRNA, leading to down-regulation of Lrp11 mRNA, and were able to confirm a significant influence of Lrp11 depletion on the expression of Xpnpep1, Maneal, Pgap1 and Uprt. These validated downstream targets and members of the Lrp11 gene network provide new insight into the biological role of Lrp11, which may be an important risk factor in the development of stress.
Affiliation(s)
- Jian Xu
- Department of Neurology, Nantong University Affiliated Mental Health Center, Jiangsu, Nantong 226001, China
| | - Rixin Cai
- Department of Histology and Embryology, Medical College, Nantong University, Jiangsu, Nantong 226001, China; Jiangsu Province Key Laboratory for Inflammation and Molecular Drug Target, Jiangsu, Nantong 226001, China
| | - Lu Lu
- Department of Histology and Embryology, Medical College, Nantong University, Jiangsu, Nantong 226001, China
| | - Chengwei Duan
- Jiangsu Province Key Laboratory for Inflammation and Molecular Drug Target, Jiangsu, Nantong 226001, China
| | - Xuelei Tao
- Department of Histology and Embryology, Medical College, Nantong University, Jiangsu, Nantong 226001, China
| | - Dongjian Chen
- Department of neurology, Affiliated Hospital of Nantong University, Jiangsu, Nantong 226001, China
| | - Yonghua Liu
- Jiangsu Province Key Laboratory for Inflammation and Molecular Drug Target, Jiangsu, Nantong 226001, China
| | - Xiaodong Wang
- Department of Histology and Embryology, Medical College, Nantong University, Jiangsu, Nantong 226001, China
| | - Maohong Cao
- Department of neurology, Affiliated Hospital of Nantong University, Jiangsu, Nantong 226001, China
| | - Ying Chen
- Department of Histology and Embryology, Medical College, Nantong University, Jiangsu, Nantong 226001, China.
|
30
|
|
31
|
Wu JS, Kao EF, Lee CN. Discovering hidden connections among diseases, genes and drugs based on microarray expression profiles with negative-term filtering. PLoS One 2014; 9:e98826. [PMID: 24915461 PMCID: PMC4051596 DOI: 10.1371/journal.pone.0098826] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/07/2013] [Accepted: 05/06/2014] [Indexed: 11/18/2022] Open
Abstract
Microarrays based on gene expression profiles (GEPs) can be tailored specifically for a variety of topics to provide a precise and efficient means with which to discover hidden information. This study proposes a novel means of employing existing GEPs to reveal hidden relationships among diseases, genes, and drugs within a rich biomedical database, PubMed. Unlike the co-occurrence method, which considers only the appearance of keywords, the proposed method also takes into account negative relationships and non-relationships among keywords, the importance of which has been demonstrated in previous studies. Three scenarios were conducted to verify the efficacy of the proposed method. In Scenario 1, disease and drug GEPs (disease: lymphoma cancer, lymph node cancer, and drug: cyclophosphamide) were used to obtain lists of disease- and drug-related genes. Fifteen hidden connections were identified between the diseases and the drug. In Scenario 2, we adopted different diseases and drug GEPs (disease: AML-ALL dataset and drug: Gefitinib) to obtain lists of important diseases and drug-related genes. In this case, ten hidden connections were identified. In Scenario 3, we obtained a list of disease-related genes from the disease-related GEP (liver cancer) and the drug (Capecitabine) on the PharmGKB website, resulting in twenty-two hidden connections. Experimental results demonstrate the efficacy of the proposed method in uncovering hidden connections among diseases, genes, and drugs. Following implementation of the weight function in the proposed method, a large number of the documents obtained in each of the scenarios were judged to be related: 834 of 4028 documents, 789 of 1216 documents, and 1928 of 3791 documents in Scenarios 1, 2, and 3, respectively. The negative-term filtering scheme also uncovered a large number of negative relationships as well as non-relationships among these connections: 97 of 834, 38 of 789, and 202 of 1928 in Scenarios 1, 2, and 3, respectively.
Affiliation(s)
- Jain-Shing Wu
- Department of Computer Science and Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan
| | - E-Fong Kao
- Department of Medical Imaging and Radiological Sciences, Kaohsiung Medical University, Kaohsiung, Taiwan
| | - Chung-Nan Lee
- Department of Computer Science and Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan
- Department of Medical Imaging and Radiological Sciences, Kaohsiung Medical University, Kaohsiung, Taiwan
|
32
|
Xiang Z, Qin T, Qin ZS, He Y. A genome-wide MeSH-based literature mining system predicts implicit gene-to-gene relationships and networks. BMC SYSTEMS BIOLOGY 2013; 7 Suppl 3:S9. [PMID: 24555475 PMCID: PMC3852244 DOI: 10.1186/1752-0509-7-s3-s9] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 12/24/2022]
Abstract
Background: The large amount of literature in the post-genomics era enables the study of gene interactions and networks using all available articles published for a specific organism. MeSH is a controlled vocabulary of medical and scientific terms that is used by biomedical scientists to manually index articles in the PubMed literature database. We hypothesized that genome-wide gene-MeSH term associations from the PubMed literature database could be used to predict implicit gene-to-gene relationships and networks. While the gene-MeSH associations have been used to detect gene-gene interactions in some studies, different methods have not been well compared, and such a strategy has not been evaluated for a genome-wide literature analysis. Genome-wide literature mining of gene-to-gene interactions allows ranking of the best gene interactions and investigation of comprehensive biological networks at a genome level. Results: The genome-wide GenoMesh literature mining algorithm was developed by sequentially generating a gene-article matrix, a normalized gene-MeSH term matrix, and a gene-gene matrix. The gene-gene matrix relies on the calculation of pairwise gene dissimilarities based on gene-MeSH relationships. An optimized dissimilarity score was identified from six well-studied functions based on a receiver operating characteristic (ROC) analysis. Based on the studies with well-studied Escherichia coli and less-studied Brucella spp., GenoMesh was found to accurately identify gene functions using weighted MeSH terms, predict gene-gene interactions not reported in the literature, and cluster all the genes studied from an organism using the MeSH-based gene-gene matrix. A web-based GenoMesh literature mining program is also available at: http://genomesh.hegroup.org. GenoMesh also predicts gene interactions and networks among genes associated with specific MeSH terms or user-selected gene lists.
Conclusions: The GenoMesh algorithm and web program provide the first genome-wide, MeSH-based literature mining system that effectively predicts implicit gene-gene interaction relationships and networks in a genome-wide scope.
|
33
|
Abate F, Acquaviva A, Ficarra E, Piva R, Macii E. Gelsius: a literature-based workflow for determining quantitative associations between genes and biological processes. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013; 10:619-631. [PMID: 24091396 DOI: 10.1109/tcbb.2013.11] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/02/2023]
Abstract
An effective knowledge extraction and quantification methodology from biomedical literature would allow the researcher to organize and analyze the results of high-throughput experiments on microarrays and next-generation sequencing technologies. Despite the large amount of raw information available on the web, a tool able to extract a measure of the correlation between a list of genes and biological processes is not yet available. In this paper, we present Gelsius, a workflow that incorporates biomedical literature to quantify the correlation between genes and terms describing biological processes. To achieve this, we built different modules focusing on query expansion and document canonicalization. In this way, we improved the measurement of correlation, which is performed using a latent semantic analysis approach. To the best of our knowledge, this is the first complete tool able to extract a measure of gene-biological process correlation from the literature. We demonstrate the effectiveness of the proposed workflow on six biological processes and a set of genes, by showing that correlation results for known relationships are in accordance with definitions of gene functions provided by the NCI Thesaurus. In addition, the tool is able to propose new candidate relationships for later experimental validation. The tool is available at http://bioeda1.polito.it:8080/medSearchServlet/.
|
34
|
Chen H, Martin B, Daimon CM, Siddiqui S, Luttrell LM, Maudsley S. Textrous!: extracting semantic textual meaning from gene sets. PLoS One 2013; 8:e62665. [PMID: 23646135 PMCID: PMC3639949 DOI: 10.1371/journal.pone.0062665] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 10/31/2012] [Accepted: 03/22/2013] [Indexed: 11/19/2022] Open
Abstract
The unbiased and reproducible interpretation of high-content gene sets from large-scale genomic experiments is crucial to the understanding of biological themes, validation of experimental data, and the eventual development of plans for future experimentation. To derive biomedically relevant information from simple gene lists, a mathematical association to scientific language and meaningful words or sentences is crucial. Unfortunately, existing software for deriving meaningful and easily appreciable scientific textual ‘tokens’ from large gene sets either relies on controlled vocabularies (Medical Subject Headings, Gene Ontology, BioCarta) or employs Boolean text searching and co-occurrence models that are incapable of detecting indirect links in the literature. As an improvement to existing web-based informatic tools, we have developed Textrous!, a web-based framework for the extraction of biomedical semantic meaning from a given input gene set of arbitrary length. Textrous! employs natural language processing techniques, including latent semantic indexing (LSI), sentence splitting, word tokenization, parts-of-speech tagging, and noun-phrase chunking, to mine MEDLINE abstracts, PubMed Central articles, articles from the Online Mendelian Inheritance in Man (OMIM), and Mammalian Phenotype annotation obtained from Jackson Laboratories. Textrous! has the ability to generate meaningful output data with even very small input datasets, using two different text extraction methodologies (collective and individual) for the selecting, ranking, clustering, and visualization of English words obtained from the user data. Textrous!, therefore, is able to facilitate the output of quantitatively significant and easily appreciable semantic words and phrases linked to both individual gene and batch genomic data.
Affiliation(s)
- Hongyu Chen
  - Receptor Pharmacology Unit, Laboratory of Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Bronwen Martin
  - Metabolism Unit, Laboratory of Clinical Investigation, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Caitlin M. Daimon
  - Metabolism Unit, Laboratory of Clinical Investigation, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Sana Siddiqui
  - Receptor Pharmacology Unit, Laboratory of Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Louis M. Luttrell
  - Division of Endocrinology, Diabetes & Medical Genetics, Department of Medicine, Medical University of South Carolina, Charleston, South Carolina, United States of America
- Stuart Maudsley
  - Receptor Pharmacology Unit, Laboratory of Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
|
35
|
Cashion A, Stanfill A, Thomas F, Xu L, Sutter T, Eason J, Ensell M, Homayouni R. Expression levels of obesity-related genes are associated with weight change in kidney transplant recipients. PLoS One 2013; 8:e59962. [PMID: 23544116 PMCID: PMC3609773 DOI: 10.1371/journal.pone.0059962] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 02/24/2012] [Accepted: 02/24/2013] [Indexed: 11/23/2022] Open
Abstract
Background: The aim of this study was to investigate the association of gene expression profiles in subcutaneous adipose tissue with weight change in kidney transplant recipients and to gain insights into the underlying mechanisms of weight gain. Methodology/Principal Findings: A secondary data analysis was done on a subgroup (n = 26) of existing clinical and gene expression data from a larger prospective longitudinal study examining factors contributing to weight gain in transplant recipients. Measurements taken included adipose tissue gene expression profiles at time of transplant, baseline and six-month weight, and demographic data. Using multivariate linear regression analysis controlled for race and gender, expression levels of 1553 genes were significantly (p<0.05) associated with weight change. Functional analysis using Gene Ontology and Kyoto Encyclopedia of Genes and Genomes classifications identified metabolic pathways that were enriched in this dataset. Furthermore, GeneIndexer literature mining analysis identified a subset of genes that are highly associated with obesity in the literature and Ingenuity pathway analysis revealed several significant gene networks associated with metabolism and endocrine function. Polymorphisms in several of these genes have previously been linked to obesity. Conclusions/Significance: We have successfully identified a set of molecular pathways that taken together may provide insights into the mechanisms of weight gain in kidney transplant recipients. Future work will be done to determine how these pathways may contribute to weight gain.
Affiliation(s)
- Ann Cashion
  - Department of Acute and Chronic Care, College of Nursing, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America.
|
36
|
Chen H, Martin B, Daimon CM, Maudsley S. Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications. Front Physiol 2013; 4:8. [PMID: 23386833 PMCID: PMC3558626 DOI: 10.3389/fphys.2013.00008] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 10/12/2012] [Accepted: 01/09/2013] [Indexed: 11/13/2022] Open
Abstract
Text mining is rapidly becoming an essential technique for the annotation and analysis of large biological data sets. Biomedical literature currently increases at a rate of several thousand papers per week, making automated information retrieval methods the only feasible method of managing this expanding corpus. With the increasing prevalence of open-access journals and constant growth of publicly-available repositories of biomedical literature, literature mining has become much more effective with respect to the extraction of biomedically-relevant data. In recent years, text mining of popular databases such as MEDLINE has evolved from basic term-searches to more sophisticated natural language processing techniques, indexing and retrieval methods, structural analysis and integration of literature with associated metadata. In this review, we will focus on Latent Semantic Indexing (LSI), a computational linguistics technique increasingly used for a variety of biological purposes. It is noted for its ability to consistently outperform benchmark Boolean text searches and co-occurrence models at information retrieval and its power to extract indirect relationships within a data set. LSI has been used successfully to formulate new hypotheses, generate novel connections from existing data, and validate empirical data.
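As a rough illustration of the latent semantic indexing idea described in this abstract, the following Python sketch factorizes a toy term-by-document matrix with a truncated SVD and compares documents in the reduced space. The matrix values, the term labels, and the helper names `lsi_document_vectors` and `cosine` are invented for illustration and are not taken from the reviewed paper; practical systems usually also apply term weighting before the decomposition.

```python
import numpy as np

def lsi_document_vectors(term_doc, k):
    """Return one k-dimensional latent vector per document (truncated SVD)."""
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Scale the top-k right singular vectors by their singular values.
    return (s[:k, None] * vt[:k]).T

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical counts. Rows: insulin, glucose, diabetes, obesity, neuron, synapse.
# Columns: documents A, B, C, D. A and C share no term; B overlaps with both.
term_doc = np.array([
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 2],
    [0, 0, 0, 2],
], dtype=float)

docs = lsi_document_vectors(term_doc, k=2)

raw_ac = cosine(term_doc[:, 0], term_doc[:, 2])  # direct term overlap of A and C
latent_ac = cosine(docs[0], docs[2])             # A vs C in the latent space
latent_ad = cosine(docs[0], docs[3])             # A vs the unrelated document D

print(raw_ac, latent_ac, latent_ad)
```

In this toy case documents A and C have no terms in common, so a Boolean or co-occurrence comparison sees no link, while the latent space places them close together because both overlap with document B. That is the kind of indirect relationship the abstract attributes to LSI.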
Affiliation(s)
- Hongyu Chen
  - Laboratory of Neuroscience, Receptor Pharmacology Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Bronwen Martin
  - Laboratory of Clinical Investigation, Metabolism Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Caitlin M. Daimon
  - Laboratory of Clinical Investigation, Metabolism Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Stuart Maudsley
  - Laboratory of Neuroscience, Receptor Pharmacology Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
|
37
|
Xu L, Cheng C, George EO, Homayouni R. Literature aided determination of data quality and statistical significance threshold for gene expression studies. BMC Genomics 2012; 13 Suppl 8:S23. [PMID: 23282414 PMCID: PMC3535704 DOI: 10.1186/1471-2164-13-s8-s23] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022] Open
Abstract
Background: Gene expression data are noisy due to technical and biological variability. Consequently, analysis of gene expression data is complex. Different statistical methods produce distinct sets of genes. In addition, selection of expression p-value (EPv) threshold is somewhat arbitrary. In this study, we aimed to develop novel literature based approaches to integrate functional information in analysis of gene expression data. Methods: Functional relationships between genes were derived by Latent Semantic Indexing (LSI) of Medline abstracts and used to calculate the function cohesion of gene sets. In this study, literature cohesion was applied in two ways. First, Literature-Based Functional Significance (LBFS) method was developed to calculate a p-value for the cohesion of differentially expressed genes (DEGs) in order to objectively evaluate the overall biological significance of the gene expression experiments. Second, Literature Aided Statistical Significance Threshold (LASST) was developed to determine the appropriate expression p-value threshold for a given experiment. Results: We tested our methods on three different publicly available datasets. LBFS analysis demonstrated that only two experiments were significantly cohesive. For each experiment, we also compared the LBFS values of DEGs generated by four different statistical methods. We found that some statistical tests produced more functionally cohesive gene sets than others. However, no statistical test was consistently better for all experiments. This reemphasizes that a statistical test must be carefully selected for each expression study. Moreover, LASST analysis demonstrated that the expression p-value thresholds for some experiments were considerably lower (p < 0.02 and 0.01), suggesting that the arbitrary p-values and false discovery rate thresholds that are commonly used in expression studies may not be biologically sound. Conclusions: We have developed robust and objective literature-based methods to evaluate the biological support for gene expression experiments and to determine the appropriate statistical significance threshold. These methods will assist investigators to more efficiently extract biologically meaningful insights from high throughput gene expression experiments.
Affiliation(s)
- Lijing Xu
  - Bioinformatics Program, Memphis, TN 38152, USA
|
38
|
Levin MC, Lee S, Gardner LA, Shin Y, Douglas JN, Groover CJ. Pathogenic mechanisms of neurodegeneration based on the phenotypic expression of progressive forms of immune-mediated neurologic disease. Degener Neurol Neuromuscul Dis 2012; 2:175-187. [PMID: 30890887 PMCID: PMC6065584 DOI: 10.2147/dnnd.s38353] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/23/2022] Open
Abstract
Considering there are no treatments for progressive forms of multiple sclerosis (MS), a comprehensive understanding of the role of neurodegeneration in the pathogenesis of MS should lead to novel therapeutic strategies to treat it. Many studies have implicated viral triggers as a cause of MS, yet no single virus has been exclusively shown to cause MS. Given this, human and animal viral models of MS are used to study its pathogenesis. One example is human T-lymphotropic virus type 1-associated myelopathy/tropical spastic paraparesis (HAM/TSP). Importantly, HAM/TSP is similar clinically, pathologically, and immunologically to progressive MS. Interestingly, both MS and HAM/TSP patients were found to make antibodies to heterogeneous nuclear ribonucleoprotein (hnRNP) A1, an RNA-binding protein overexpressed in neurons. Anti-hnRNP A1 antibodies reduced neuronal firing and caused neurodegeneration in neuronal cell lines, suggesting the autoantibodies are pathogenic. Further, microarray analyses of neurons exposed to anti-hnRNP A1 antibodies revealed novel pathways of neurodegeneration related to alterations of RNA levels of the spinal paraplegia genes (SPGs). Mutations in SPGs cause hereditary spastic paraparesis, genetic disorders clinically indistinguishable from progressive MS and HAM/TSP. Thus, there is a strong association between involvement of SPGs in neurodegeneration and the clinical phenotype of progressive MS and HAM/TSP patients, who commonly develop spastic paraparesis. Taken together, these data begin to clarify mechanisms of neurodegeneration related to the clinical presentation of patients with chronic immune-mediated neurological disease of the central nervous system, which will give insights into the design of novel therapies to treat these neurological diseases.
Affiliation(s)
- Michael C Levin
  - Veterans Administration Medical Center, Memphis, TN, USA
  - Departments of Neurology
  - Neuroscience, University of Tennessee Health Science Center, Memphis, TN, USA
- Sangmin Lee
  - Veterans Administration Medical Center, Memphis, TN, USA
  - Departments of Neurology
- Lidia A Gardner
  - Veterans Administration Medical Center, Memphis, TN, USA
  - Departments of Neurology
- Yoojin Shin
  - Veterans Administration Medical Center, Memphis, TN, USA
  - Departments of Neurology
- Joshua N Douglas
  - Veterans Administration Medical Center, Memphis, TN, USA
  - Neuroscience, University of Tennessee Health Science Center, Memphis, TN, USA
- Chassidy J Groover
  - Veterans Administration Medical Center, Memphis, TN, USA
  - Departments of Neurology
|
39
|
Kennedy LH, Sutter CH, Leon Carrion S, Tran QT, Bodreddigari S, Kensicki E, Mohney RP, Sutter TR. 2,3,7,8-Tetrachlorodibenzo-p-dioxin-mediated production of reactive oxygen species is an essential step in the mechanism of action to accelerate human keratinocyte differentiation. Toxicol Sci 2012; 132:235-49. [PMID: 23152189 DOI: 10.1093/toxsci/kfs325] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.8] [Indexed: 01/01/2023] Open
Abstract
Chloracne is commonly observed in humans exposed to 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD); yet, the mechanism of toxicity is not well understood. Using normal human epidermal keratinocytes, we investigated the mechanism of TCDD-mediated enhancement of epidermal differentiation by integrating functional genomic, metabolomic, and biochemical analyses. TCDD increased the expression of 40% of the genes of the epidermal differentiation complex found on chromosome 1q21 and 75% of the genes required for de novo ceramide biosynthesis. Lipid analysis demonstrated that eight of the nine classes of ceramides were increased by TCDD, altering the ratio of ceramides to free fatty acids. TCDD decreased the expression of the glucose transporter, SLC2A1, and most of the glycolytic transcripts, followed by decreases in glycolytic intermediates, including pyruvate. NADH and Krebs cycle intermediates were decreased, whereas NAD(+) was increased. Mitochondrial glutathione (GSH) reductase activity and the GSH/glutathione disulfide ratio were decreased by TCDD, ultimately leading to mitochondrial dysfunction, characterized by decreased inner mitochondrial membrane potential and ATP production, and increased production of the reactive oxygen species (ROS), hydrogen peroxide. Aryl hydrocarbon receptor (AHR) antagonists blocked the response of many transcripts to TCDD, and the endpoints of decreased ATP production and differentiation, suggesting regulation by the AHR. Cotreatment of cells with chemical antioxidants or the enzyme catalase blocked the TCDD-mediated acceleration of keratinocyte cornified envelope formation, an endpoint of terminal differentiation. Thus, TCDD-mediated ROS production is a critical step in the mechanism of this chemical to accelerate keratinocyte differentiation.
|
40
|
Andreux PA, Williams EG, Koutnikova H, Houtkooper RH, Champy MF, Henry H, Schoonjans K, Williams RW, Auwerx J. Systems genetics of metabolism: the use of the BXD murine reference panel for multiscalar integration of traits. Cell 2012; 150:1287-99. [PMID: 22939713 DOI: 10.1016/j.cell.2012.08.012] [Citation(s) in RCA: 176] [Impact Index Per Article: 13.5] [Received: 02/21/2012] [Revised: 06/06/2012] [Accepted: 08/03/2012] [Indexed: 01/22/2023]
Abstract
Metabolic homeostasis is achieved by complex molecular and cellular networks that differ significantly among individuals and are difficult to model with genetically engineered lines of mice optimized to study single gene function. Here, we systematically acquired metabolic phenotypes by using the EUMODIC EMPReSS protocols across a large panel of isogenic but diverse strains of mice (BXD type) to study the genetic control of metabolism. We generated and analyzed 140 classical phenotypes and deposited these in an open-access web service for systems genetics (www.genenetwork.org). Heritability, influence of sex, and genetic modifiers of traits were examined singly and jointly by using quantitative-trait locus (QTL) and expression QTL-mapping methods. Traits and networks were linked to loci encompassing both known variants and novel candidate genes, including alkaline phosphatase (ALPL), here linked to hypophosphatasia. The assembled and curated phenotypes provide key resources and exemplars that can be used to dissect complex metabolic traits and disorders.
Affiliation(s)
- Pénélope A Andreux
  - Laboratory of Integrative and Systems Physiology, School of Life Sciences, École Polytechnique Fédérale de Lausanne 1015, Switzerland
|
41
|
Abstract
Applications of clustering algorithms in biomedical research are ubiquitous, with typical examples including gene expression data analysis, genomic sequence analysis, biomedical document mining, and MRI image analysis. However, due to the diversity of cluster analysis, the differing terminologies, goals, and assumptions underlying different clustering algorithms can be daunting. Thus, determining the right match between clustering algorithms and biomedical applications has become particularly important. This paper is presented to provide biomedical researchers with an overview of the status quo of clustering algorithms, to illustrate examples of biomedical applications based on cluster analysis, and to help biomedical researchers select the most suitable clustering algorithms for their own applications.
Affiliation(s)
- Rui Xu
  - Industrial Artificial Intelligence Laboratory, GE Global Research Center, Niskayuna, NY 12309, USA.
|
42
|
Tran QT, Kennedy LH, Leon Carrion S, Bodreddigari S, Goodwin SB, Sutter CH, Sutter TR. EGFR regulation of epidermal barrier function. Physiol Genomics 2012; 44:455-69. [PMID: 22395315 PMCID: PMC3339861 DOI: 10.1152/physiolgenomics.00176.2011] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Indexed: 01/01/2023] Open
Abstract
Keratinocyte terminal differentiation is the process that ultimately forms the epidermal barrier that is essential for mammalian survival. This process is controlled, in part, by signal transduction and gene expression mechanisms, and the epidermal growth factor receptor (EGFR) is known to be an important regulator of multiple epidermal functions. Using microarray analysis of a confluent cell density-induced model of keratinocyte differentiation, we identified 2,676 genes that are regulated by epidermal growth factor (EGF), a ligand of the EGFR. We further discovered, and separately confirmed by functional assays, that EGFR activation abrogates all of the known essential processes of keratinocyte differentiation by 1) decreasing the expression of lipid matrix biosynthetic enzymes, 2) regulating numerous genes forming the cornified envelope, and 3) suppressing the expression of tight junction proteins. In organotypic cultures of skin, EGF acted to impair epidermal barrier integrity, as shown by increased transepidermal water loss. As defective epidermal differentiation and disruption of barrier function are primary features of many human skin diseases, we used bioinformatic analyses to identify genes that are known to be associated with skin diseases. Compared with non-EGF-regulated genes, EGF-regulated genes were significantly enriched for skin disease genes. These results provide a systems-level understanding of the actions of EGFR signaling to inhibit keratinocyte differentiation, providing new insight into the role of EGFR imbalance in skin pathogenesis.
Affiliation(s)
- Quynh T Tran
  - Department of Biological Sciences, The University of Memphis, Memphis, Tennessee 38152, USA
|
43
|
Hossain MS, Gresock J, Edmonds Y, Helm R, Potts M, Ramakrishnan N. Connecting the dots between PubMed abstracts. PLoS One 2012; 7:e29509. [PMID: 22235301 PMCID: PMC3250456 DOI: 10.1371/journal.pone.0029509] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 07/26/2011] [Accepted: 11/29/2011] [Indexed: 11/23/2022] Open
Abstract
Background: There are now a multitude of articles published in a diversity of journals providing information about genes, proteins, pathways, and diseases. Each article investigates subsets of a biological process, but to gain insight into the functioning of a system as a whole, we must integrate information from multiple publications. Particularly, unraveling relationships between extra-cellular inputs and downstream molecular response mechanisms requires integrating conclusions from diverse publications. Methodology: We present an automated approach to biological knowledge discovery from PubMed abstracts, suitable for “connecting the dots” across the literature. We describe a storytelling algorithm that, given a start and end publication, typically with little or no overlap in content, identifies a chain of intermediate publications from one to the other, such that neighboring publications have significant content similarity. The quality of discovered stories is measured using local criteria such as the size of supporting neighborhoods for each link and the strength of individual links connecting publications, as well as global metrics of dispersion. To ensure that the story stays coherent as it meanders from one publication to another, we demonstrate the design of novel coherence and overlap filters for use as post-processing steps. Conclusions: We demonstrate the application of our storytelling algorithm to three case studies: i) a many-one study exploring relationships between multiple cellular inputs and a molecule responsible for cell-fate decisions, ii) a many-many study exploring the relationships between multiple cytokines and multiple downstream transcription factors, and iii) a one-to-one study to showcase the ability to recover a cancer related association, viz. the Warburg effect, from past literature. The storytelling pipeline helps narrow down a scientist's focus from several hundreds of thousands of relevant documents to only around a hundred stories. We argue that our approach can serve as a valuable discovery aid for hypothesis generation and connection exploration in large unstructured biological knowledge bases.
Affiliation(s)
- M Shahriar Hossain
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America.
|
44
|
Roy S, Heinrich K, Phan V, Berry MW, Homayouni R. Latent Semantic Indexing of PubMed abstracts for identification of transcription factor candidates from microarray derived gene sets. BMC Bioinformatics 2011; 12 Suppl 10:S19. [PMID: 22165960 PMCID: PMC3236841 DOI: 10.1186/1471-2105-12-s10-s19] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 01/16/2023] Open
Abstract
Background: Identification of transcription factors (TFs) responsible for modulation of differentially expressed genes is a key step in deducing gene regulatory pathways. Most current methods identify TFs by searching for presence of DNA binding motifs in the promoter regions of co-regulated genes. However, this strategy may not always be useful as presence of a motif does not necessarily imply a regulatory role. Conversely, motif presence may not be required for a TF to regulate a set of genes. Therefore, it is imperative to include functional (biochemical and molecular) associations, such as those found in the biomedical literature, into algorithms for identification of putative regulatory TFs that might be explicitly or implicitly linked to the genes under investigation. Results: In this study, we present a Latent Semantic Indexing (LSI) based text mining approach for identification and ranking of putative regulatory TFs from microarray derived differentially expressed genes (DEGs). Two LSI models were built using different term weighting schemes to devise pair-wise similarities between 21,027 mouse genes annotated in the Entrez Gene repository. Amongst these genes, 433 were designated TFs in the TRANSFAC database. The LSI derived TF-to-gene similarities were used to calculate TF literature enrichment p-values and rank the TFs for a given set of genes. We evaluated our approach using five different publicly available microarray datasets focusing on TFs Rel, Stat6, Ddit3, Stat5 and Nfic. In addition, for each of the datasets, we constructed gold standard TFs known to be functionally relevant to the study in question. Receiver Operating Characteristics (ROC) curves showed that the log-entropy LSI model outperformed the tf-normal LSI model and a benchmark co-occurrence based method for four out of five datasets, as well as motif searching approaches, in identifying putative TFs. Conclusions: Our results suggest that our LSI based text mining approach can complement existing approaches used in systems biology research to decipher gene regulatory networks by providing putative lists of ranked TFs that might be explicitly or implicitly associated with sets of DEGs derived from microarray experiments. In addition, unlike motif searching approaches, LSI based approaches can reveal TFs that may indirectly regulate genes.
Affiliation(s)
- Sujoy Roy
  - Bioinformatics Program, University of Memphis, Memphis, TN 38152, USA
|
45
|
|
46
|
A potential link between autoimmunity and neurodegeneration in immune-mediated neurological disease. J Neuroimmunol 2011; 235:56-69. [PMID: 21570130 DOI: 10.1016/j.jneuroim.2011.02.007] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Received: 11/22/2010] [Revised: 01/11/2011] [Accepted: 02/08/2011] [Indexed: 01/08/2023]
Abstract
Multiple sclerosis (MS) patients make antibodies to heterogeneous nuclear ribonuclear protein A1 (hnRNP-A1), a nucleocytoplasmic protein. We hypothesized this autoimmune reaction might contribute to neurodegeneration. Antibodies from MS patients reacted with hnRNP-A1-'M9', its nuclear translocation sequence. Transfection of anti-M9 antibodies into neurons resulted in neuronal injury and changes in transcripts related to hnRNP-A1 function. Importantly, RNA levels for the spinal paraplegia genes (SPGs) decreased. Changes in SPG RNA levels were confirmed in neurons purified from MS brains. Also, we show molecular interactions between spastin (the encoded protein of SPG4) and hnRNP-A1. These data suggest a link between autoimmunity, clinical phenotype and neurodegeneration in MS.
|
47
|
Xu L, Furlotte N, Lin Y, Heinrich K, Berry MW, George EO, Homayouni R. Functional cohesion of gene sets determined by latent semantic indexing of PubMed abstracts. PLoS One 2011; 6:e18851. [PMID: 21533142 PMCID: PMC3077411 DOI: 10.1371/journal.pone.0018851] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 11/10/2010] [Accepted: 03/21/2011] [Indexed: 12/31/2022] Open
Abstract
High-throughput genomic technologies enable researchers to identify genes that are co-regulated with respect to specific experimental conditions. Numerous statistical approaches have been developed to identify differentially expressed genes. Because each approach can produce distinct gene sets, it is difficult for biologists to determine which statistical approach yields biologically relevant gene sets and is appropriate for their study. To address this issue, we implemented Latent Semantic Indexing (LSI) to determine the functional coherence of gene sets. An LSI model was built using over 1 million Medline abstracts for over 20,000 mouse and human genes annotated in Entrez Gene. The gene-to-gene LSI-derived similarities were used to calculate a literature cohesion p-value (LPv) for a given gene set using a Fisher's exact test. We tested this method against genes in more than 6,000 functional pathways annotated in Gene Ontology (GO) and found that approximately 75% of gene sets in GO biological process category and 90% of the gene sets in GO molecular function and cellular component categories were functionally cohesive (LPv<0.05). These results indicate that the LPv methodology is both robust and accurate. Application of this method to previously published microarray datasets demonstrated that LPv can be helpful in selecting the appropriate feature extraction methods. To enable real-time calculation of LPv for mouse or human gene sets, we developed a web tool called Gene-set Cohesion Analysis Tool (GCAT). GCAT can complement other gene set enrichment approaches by determining the overall functional cohesion of data sets, taking into account both explicit and implicit gene interactions reported in the biomedical literature.
Affiliation(s)
- Lijing Xu
  - Bioinformatics Program, University of Memphis, Memphis, Tennessee, United States of America
  - Department of Mathematical Sciences, University of Memphis, Memphis, Tennessee, United States of America
- Nicholas Furlotte
  - Bioinformatics Program, University of Memphis, Memphis, Tennessee, United States of America
- Yunyue Lin
  - Department of Computer Science, University of Memphis, Memphis, Tennessee, United States of America
- Kevin Heinrich
  - Computable Genomix, Memphis, Tennessee, United States of America
- Michael W. Berry
  - Department of Electrical and Computer Engineering, University of Tennessee, Knoxville, Tennessee, United States of America
- Ebenezer O. George
  - Bioinformatics Program, University of Memphis, Memphis, Tennessee, United States of America
  - Department of Mathematical Sciences, University of Memphis, Memphis, Tennessee, United States of America
- Ramin Homayouni
  - Bioinformatics Program, University of Memphis, Memphis, Tennessee, United States of America
  - Department of Biological Sciences, University of Memphis, Memphis, Tennessee, United States of America
|
48
|
Towards the Development of an Integrated Framework for Enhancing Enterprise Search Using Latent Semantic Indexing. 2011. [DOI: 10.1007/978-3-642-22688-5_29] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1]
|
49
|
Jelier R, Goeman JJ, Hettne KM, Schuemie MJ, den Dunnen JT, 't Hoen PAC. Literature-aided interpretation of gene expression data with the weighted global test. Brief Bioinform 2010; 12:518-29. [PMID: 21183478 DOI: 10.1093/bib/bbq082] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022] Open
Abstract
Most methods for the interpretation of gene expression profiling experiments rely on the categorization of genes, as provided by the Gene Ontology (GO) and pathway databases. Due to the manual curation process, such databases are never up-to-date and tend to be limited in focus and coverage. Automated literature mining tools provide an attractive, alternative approach. We review how they can be employed for the interpretation of gene expression profiling experiments. We illustrate that their comprehensive scope aids the interpretation of data from domains poorly covered by GO or alternative databases, and allows for the linking of gene expression with diseases, drugs, tissues and other types of concepts. A framework for proper statistical evaluation of the associations between gene expression values and literature concepts was lacking and is now implemented in a weighted extension of global test. The weights are the literature association scores and reflect the importance of a gene for the concept of interest. In a direct comparison with classical GO-based gene sets, we show that use of literature-based associations results in the identification of much more specific GO categories. We demonstrate the possibilities for linking of gene expression data to patient survival in breast cancer and the action and metabolism of drugs. Coupling with online literature mining tools ensures transparency and allows further study of the identified associations. Literature mining tools are therefore powerful additions to the toolbox for the interpretation of high-throughput genomics data.
Affiliation(s)
- Rob Jelier
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
|
50
|
Tjioe E, Berry MW, Homayouni R. Discovering gene functional relationships using FAUN (Feature Annotation Using Nonnegative matrix factorization). BMC Bioinformatics 2010; 11 Suppl 6:S14. [PMID: 20946597 PMCID: PMC3026361 DOI: 10.1186/1471-2105-11-s6-s14] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 01/19/2023]
Abstract
BACKGROUND: Searching the enormous amount of information available in biomedical literature to extract novel functional relationships among genes remains a challenge in the field of bioinformatics. While numerous (software) tools have been developed to extract and identify gene relationships from biological databases, few effectively deal with extracting new (or implied) gene relationships, a process which is useful in the interpretation of discovery-oriented genome-wide experiments.
RESULTS: In this study, we develop a Web-based bioinformatics software environment called FAUN, or Feature Annotation Using Nonnegative matrix factorization (NMF), to facilitate both the discovery and classification of functional relationships among genes. Both the computational complexity and parameterization of NMF for processing gene sets are discussed. FAUN is tested on three manually constructed gene document collections. Its utility and performance as a knowledge discovery tool are demonstrated using a set of genes associated with autism.
CONCLUSIONS: FAUN not only helps researchers use biomedical literature efficiently, but also provides utilities for knowledge discovery. This Web-based software environment may be useful for the validation and analysis of functional associations in gene subsets identified by high-throughput experiments.
Affiliation(s)
- Elina Tjioe
- Department of Electrical Engineering and Computer Science and Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37996, USA
|