1
Gottlieb S, Zeliff D, O'Rourke B, Rogers WD, Miles MF. GSK3B inhibition partially reverses brain ethanol-induced transcriptomic changes in C57BL/6J mice: Expression network co-analysis with human genome-wide association studies. bioRxiv 2025:2025.04.03.647116. [PMID: 40235963 PMCID: PMC11996488 DOI: 10.1101/2025.04.03.647116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2025]
Abstract
Alcohol use disorder (AUD) is a chronic behavioral disease with greater than 50% of its risk due to complex genetic contributions. Existing pharmacological and behavioral treatments for AUD are minimally effective and underutilized. Animal model behavioral genetics and human genome-wide association studies have begun to identify individual genes contributing to the progressive compulsive consumption of ethanol that occurs with AUD, promising possible new therapeutic targets. Our laboratory has previously identified Gsk3b as a central member in a network of ethanol-responsive genes in mouse prefrontal cortex, which altered ethanol consumption with genetic manipulation and was also significantly associated with risk for alcohol dependence in human genome-wide association studies. Here we perform detailed brain RNA sequencing transcriptomic studies to characterize a highly specific and clinically available GSK3B pharmacological inhibitor, tideglusib, as a possible therapeutic for clinical trials on treatment of AUD. A model of chronic intermittent ethanol consumption was used to study gene expression changes in prefrontal cortex and nucleus accumbens in the presence or absence of tideglusib treatment. Multivariate analysis of differentially expressed genes showed that tideglusib largely reversed ethanol-induced expression changes for two prominent clusters of genes in both prefrontal cortex and nucleus accumbens. Bioinformatic analysis showed these genes to have prominent roles in neuronal functioning and synaptic activity. Additionally, mouse brain differential gene expression data was analyzed together with human protein-protein interaction and genome-wide association studies on AUD to derive networks responding to tideglusib and relevant to human genetic risk for alcohol dependence. These studies identified discrete networks significantly enriched with genes provisionally associated with AUD, and provide key information on central hubs of such networks.
Together these studies document tideglusib as a major modulator of chronic ethanol consumption-evoked brain gene expression signatures, and identify possible new targets for therapeutic modulation of AUD.
2
Doostparast Torshizi A, Truong DT, Hou L, Smets B, Whelan CD, Li S. Proteogenomic network analysis reveals dysregulated mechanisms and potential mediators in Parkinson's disease. Nat Commun 2024; 15:6430. [PMID: 39080267 PMCID: PMC11289099 DOI: 10.1038/s41467-024-50718-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 07/18/2024] [Indexed: 08/02/2024]
Abstract
Parkinson's disease is highly heterogeneous across disease symptoms, clinical manifestations and progression trajectories, hampering the identification of therapeutic targets. Despite knowledge gleaned from genetics analysis, dysregulated proteome mechanisms stemming from genetic aberrations remain underexplored. In this study, we develop a three-phase system-level proteogenomic analytical framework to characterize disease-associated proteins and dysregulated mechanisms. Proteogenomic analysis identified 577 proteins that enrich for Parkinson's disease-related pathways, such as cytokine receptor interactions and lysosomal function. Converging lines of evidence identified nine proteins, including LGALS3, CSNK2A1, SMPD3, STX4, APOA2, PAFAH1B3, LDLR, HSPB1, BRK1, with potential roles in disease pathogenesis. This study leverages the largest population-scale proteomics dataset, the UK Biobank Pharma Proteomics Project, to characterize genetically-driven protein disturbances associated with Parkinson's disease. Taken together, our work contributes to better understanding of genome-proteome dynamics in Parkinson's disease and sets a paradigm to identify potential indirect mediators connected to GWAS signals for complex neurodegenerative disorders.
Affiliation(s)
- Abolfazl Doostparast Torshizi
  - Population Analytics & Insights, AI/ML, Data Science & Digital Health, Janssen Research & Development, LLC, Spring House, PA, USA
- Dongnhu T Truong
  - Population Analytics & Insights, AI/ML, Data Science & Digital Health, Janssen Research & Development, LLC, Spring House, PA, USA
- Liping Hou
  - Population Analytics & Insights, AI/ML, Data Science & Digital Health, Janssen Research & Development, LLC, Spring House, PA, USA
- Bart Smets
  - Neuroscience Data Science, Janssen Pharmaceutica NV, Beerse, Belgium
- Christopher D Whelan
  - Neuroscience Data Science, Janssen Research & Development, LLC, Cambridge, MA, USA
- Shuwei Li
  - Population Analytics & Insights, AI/ML, Data Science & Digital Health, Janssen Research & Development, LLC, Spring House, PA, USA
3
Sanders KL, Manuel AM, Liu A, Leng B, Chen X, Zhao Z. Unveiling Gene Interactions in Alzheimer's Disease by Integrating Genetic and Epigenetic Data with a Network-Based Approach. Epigenomes 2024; 8:14. [PMID: 38651367 PMCID: PMC11036294 DOI: 10.3390/epigenomes8020014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 03/26/2024] [Accepted: 03/28/2024] [Indexed: 04/25/2024]
Abstract
Alzheimer's Disease (AD) is a complex disease and the leading cause of dementia in older people. We aimed to uncover aspects of AD's pathogenesis that may contribute to drug repurposing efforts by integrating DNA methylation and genetic data. Implementing the network-based tool, a dense module search of genome-wide association studies (dmGWAS), we integrated a large-scale GWAS dataset with DNA methylation data to identify gene network modules associated with AD. Our analysis yielded 286 significant gene network modules. Notably, the foremost module included the BIN1 gene, showing the largest GWAS signal, and the GNAS gene, the most significantly hypermethylated. We conducted Web-based Cell-type-Specific Enrichment Analysis (WebCSEA) on genes within the top 10% of dmGWAS modules, highlighting monocyte as the most significant cell type (p < 5 × 10⁻¹²). Functional enrichment analysis revealed Gene Ontology Biological Process terms relevant to AD pathology (adjusted p < 0.05). Additionally, drug target enrichment identified five FDA-approved targets (p-value = 0.03) for further research. In summary, dmGWAS integration of genetic and epigenetic signals unveiled new gene interactions related to AD, offering promising avenues for future studies.
Affiliation(s)
- Keith L. Sanders
  - Center for Precision Health, McWilliams School of Biomedical Informatics, Houston, TX 77030, USA
- Astrid M. Manuel
  - Center for Precision Health, McWilliams School of Biomedical Informatics, Houston, TX 77030, USA
- Andi Liu
  - Center for Precision Health, McWilliams School of Biomedical Informatics, Houston, TX 77030, USA
  - Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, Houston, TX 77030, USA
- Boyan Leng
  - Center for Precision Health, McWilliams School of Biomedical Informatics, Houston, TX 77030, USA
- Xiangning Chen
  - Center for Precision Health, McWilliams School of Biomedical Informatics, Houston, TX 77030, USA
- Zhongming Zhao
  - Center for Precision Health, McWilliams School of Biomedical Informatics, Houston, TX 77030, USA
  - Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, Houston, TX 77030, USA
4
Zhang L, Lu D, Bi X, Zhao K, Yu G, Quan N. Predicting disease genes based on multi-head attention fusion. BMC Bioinformatics 2023; 24:162. [PMID: 37085750 PMCID: PMC10122338 DOI: 10.1186/s12859-023-05285-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Accepted: 04/12/2023] [Indexed: 04/23/2023]
Abstract
BACKGROUND The identification of disease-related genes is of great significance for the diagnosis and treatment of human disease. Most studies have focused on developing efficient and accurate computational methods to predict disease-causing genes. Due to the sparsity and complexity of biomedical data, it is still a challenge to develop an effective multi-feature fusion model to identify disease genes. RESULTS This paper proposes an approach to predict the pathogenic gene based on multi-head attention fusion (MHAGP). Firstly, the heterogeneous biological information networks of disease genes are constructed by integrating multiple biomedical knowledge databases. Secondly, two graph representation learning algorithms are used to capture the feature vectors of gene-disease pairs from the network, and the features are fused by introducing multi-head attention. Finally, multi-layer perceptron model is used to predict the gene-disease association. CONCLUSIONS The MHAGP model outperforms all of other methods in comparative experiments. Case studies also show that MHAGP is able to predict genes potentially associated with diseases. In the future, more biological entity association data, such as gene-drug, disease phenotype-gene ontology and so on, can be added to expand the information in heterogeneous biological networks and achieve more accurate predictions. In addition, MHAGP with strong expansibility can be used for potential tasks such as gene-drug association and drug-disease association prediction.
Affiliation(s)
- Linlin Zhang
  - College of Software Engineering, Xinjiang University, Urumqi, China
- Dianrong Lu
  - College of Information Science and Engineering, Xinjiang University, Urumqi, China
- Xuehua Bi
  - Medical Engineering and Technology College, Xinjiang Medical University, Urumqi, China
- Kai Zhao
  - College of Information Science and Engineering, Xinjiang University, Urumqi, China
- Guanglei Yu
  - Medical Engineering and Technology College, Xinjiang Medical University, Urumqi, China
- Na Quan
  - College of Information Science and Engineering, Xinjiang University, Urumqi, China
5
Liu A, Manuel AM, Dai Y, Fernandes BS, Enduru N, Jia P, Zhao Z. Identifying candidate genes and drug targets for Alzheimer's disease by an integrative network approach using genetic and brain region-specific proteomic data. Hum Mol Genet 2022; 31:3341-3354. [PMID: 35640139 PMCID: PMC9523561 DOI: 10.1093/hmg/ddac124] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/07/2022] [Revised: 05/04/2022] [Accepted: 05/24/2022] [Indexed: 02/02/2023]
Abstract
Genome-wide association studies (GWAS) have identified more than 75 genetic variants associated with Alzheimer's disease (AD). However, how these variants function and impact protein expression in brain regions remain elusive. Large-scale proteomic datasets of AD postmortem brain tissues have become available recently. In this study, we used these datasets to investigate brain region-specific molecular pathways underlying AD pathogenesis and explore their potential drug targets. We applied our new network-based tool, Edge-Weighted Dense Module Search of GWAS (EW_dmGWAS), to integrate AD GWAS statistics of 472 868 individuals with proteomic profiles from two brain regions from two large-scale AD cohorts [parahippocampal gyrus (PHG), sample size n = 190; dorsolateral prefrontal cortex (DLPFC), n = 192]. The resulting network modules were evaluated using a scale-free network index, followed by a cross-region consistency evaluation. Our EW_dmGWAS analyses prioritized 52 top module genes (TMGs) specific in PHG and 58 TMGs in DLPFC, of which four genes (CLU, PICALM, PRRC2A and NDUFS3) overlapped. Those four genes were significantly associated with AD (GWAS gene-level false discovery rate < 0.05). To explore the impact of these genetic components on TMGs, we further examined their differentially co-expressed genes at the proteomic level and compared them with investigational drug targets. We pinpointed three potential drug target genes, APP, SNCA and VCAM1, specifically in PHG. Gene set enrichment analyses of TMGs in PHG and DLPFC revealed region-specific biological processes, tissue-cell type signatures and enriched drug signatures, suggesting potential region-specific drug repurposing targets for AD.
Affiliation(s)
- Andi Liu
  - Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, Houston, TX 77030, USA
  - Center for Precision Health, School of Biomedical Informatics, Houston, TX 77030, USA
- Astrid M Manuel
  - Center for Precision Health, School of Biomedical Informatics, Houston, TX 77030, USA
- Yulin Dai
  - Center for Precision Health, School of Biomedical Informatics, Houston, TX 77030, USA
- Brisa S Fernandes
  - Center for Precision Health, School of Biomedical Informatics, Houston, TX 77030, USA
- Nitesh Enduru
  - Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, Houston, TX 77030, USA
  - Center for Precision Health, School of Biomedical Informatics, Houston, TX 77030, USA
- Peilin Jia
  - Center for Precision Health, School of Biomedical Informatics, Houston, TX 77030, USA
- Zhongming Zhao
  - To whom correspondence should be addressed at: Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA. Tel: +1 7135003631
6
Chimusa ER, Defo J. Dissecting Meta-Analysis in GWAS Era: Bayesian Framework for Gene/Subnetwork-Specific Meta-Analysis. Front Genet 2022; 13:838518. [PMID: 35664319 PMCID: PMC9159898 DOI: 10.3389/fgene.2022.838518] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/17/2021] [Accepted: 04/07/2022] [Indexed: 11/13/2022]
Abstract
Over the past decades, advanced high-throughput technologies have continuously contributed to genome-wide association studies (GWASs). GWAS meta-analysis has been increasingly adopted, has cross-ancestry replicability, and has power to illuminate the genetic architecture of complex traits, informing about the reliability of estimation effects and their variability across human ancestries. However, detecting genetic variants that have low disease risk still poses a challenge. Designing a meta-analysis approach that combines the effect of various SNPs within genes or genes within pathways from multiple independent population GWASs may be helpful in identifying associations with small effect sizes and increasing the association power. Here, we proposed ancMETA, a Bayesian graph-based framework, to perform the gene/pathway-specific meta-analysis by combining the effect size of multiple SNPs within genes, and genes within subnetwork/pathways across multiple independent population GWASs to deconvolute the interactions between genes underlying the pathogenesis of complex diseases across human populations. We assessed the proposed framework on simulated datasets, and the results show that the proposed model holds promise for increasing statistical power for meta-analysis of genetic variants underlying the pathogenesis of complex diseases. To illustrate the proposed meta-analysis framework, we leverage seven different European bipolar disorder (BD) cohorts, and we identify variants in the angiotensinogen (AGT) gene to be significantly associated with BD across all 7 studies. We detect a commonly significant BD-specific subnetwork with the ESR1 gene as the main hub of a subnetwork, associated with neurotrophin signaling (p = 4e−14) and myometrial relaxation and contraction (p = 3e−08) pathways. 
ancMETA provides a new contribution to post-GWAS methodologies and holds promise for comprehensively examining interactions between genes underlying the pathogenesis of genetic diseases and also underlying ethnic differences.
7
You Y, Lai X, Pan Y, Zheng H, Vera J, Liu S, Deng S, Zhang L. Artificial intelligence in cancer target identification and drug discovery. Signal Transduct Target Ther 2022; 7:156. [PMID: 35538061 PMCID: PMC9090746 DOI: 10.1038/s41392-022-00994-0] [Citation(s) in RCA: 142] [Impact Index Per Article: 47.3] [Received: 07/04/2021] [Revised: 03/14/2022] [Accepted: 04/05/2022] [Indexed: 02/08/2023]
Abstract
Artificial intelligence is an advanced method to identify novel anticancer targets and discover novel drugs from biology networks because the networks can effectively preserve and quantify the interaction between components of cell systems underlying human diseases such as cancer. Here, we review and discuss how to employ artificial intelligence approaches to identify novel anticancer targets and discover drugs. First, we describe the scope of artificial intelligence biology analysis for novel anticancer target investigations. Second, we review and discuss the basic principles and theory of commonly used network-based and machine learning-based artificial intelligence algorithms. Finally, we showcase the applications of artificial intelligence approaches in cancer target identification and drug discovery. Taken together, the artificial intelligence models have provided us with a quantitative framework to study the relationship between network characteristics and cancer, thereby leading to the identification of potential anticancer targets and the discovery of novel drug candidates.
Affiliation(s)
- Yujie You
  - College of Computer Science, Sichuan University, Chengdu, 610065, China
- Xin Lai
  - Laboratory of Systems Tumor Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Universitätsklinikum Erlangen, Erlangen, 91052, Germany
- Yi Pan
  - Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Room D513, 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen, 518055, China
- Huiru Zheng
  - School of Computing, Ulster University, Belfast, BT15 1ED, UK
- Julio Vera
  - Laboratory of Systems Tumor Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Universitätsklinikum Erlangen, Erlangen, 91052, Germany
- Suran Liu
  - College of Computer Science, Sichuan University, Chengdu, 610065, China
- Senyi Deng
  - Institute of Thoracic Oncology, Department of Thoracic Surgery, West China Hospital, Sichuan University, Chengdu, 610065, China
- Le Zhang
  - College of Computer Science, Sichuan University, Chengdu, 610065, China
  - Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou, 310024, China
  - Key Laboratory of Systems Health Science of Zhejiang Province, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China
8
Guo S, Li T, Xu D, Xu J, Wang H, Li J, Bi X, Cao M, Xu Z, Xia Q, Cui Y, Li K. Prognostic Implications and Immune Infiltration Characteristics of Chromosomal Instability-Related Dysregulated CeRNA in Lung Adenocarcinoma. Front Mol Biosci 2022; 9:843640. [PMID: 35419410 PMCID: PMC8995899 DOI: 10.3389/fmolb.2022.843640] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/26/2021] [Accepted: 02/22/2022] [Indexed: 12/14/2022]
Abstract
An accumulating body of research indicates that long-noncoding RNAs (lncRNAs) regulate the target genes and act as competitive endogenous RNAs (ceRNAs) playing an indispensable role in lung adenocarcinoma (LUAD). LUAD is frequently accompanied by the feature of chromosomal instability (CIN); however, CIN-related ceRNAs have not been investigated yet. We systematically analyzed and integrated CIN-related dysregulated ceRNAs characteristics in LUAD samples for the first time. In TCGA LUAD cohort, CIN in tumor samples was significantly higher than that in those of adjacent, and patients with high CIN risk tended to have worse clinical outcomes. We constructed a double-weighted CIN-related dysregulated ceRNA network, in which edge weight and node weight represented the disorder extent of ceRNA and the correlation of RNA expression level and prognosis, respectively. After module mining and analysis, a potential prognostic biomarker composed of 12 RNAs (8 mRNAs and 4 lncRNAs) named CIN-related dysregulated ceRNAs (CRDC) was obtained. The CRDC risk score had a positive relation with clinical stage and CIN, and patients with high CRDC risk scores exhibited poor prognosis. Moreover, CRDC tended to be an independent risk factor with high robustness to overcome the effect of multicollinearity among other explanatory variables for disease-specific survival (DSS) in TCGA and two GEO cohorts. The result of functional analysis indicated that CRDC was involved in multiple cancer progresses, especially immune-related pathways. The patients with lower CRDC risk had higher B cell, T cell CD4+, T cell CD8+, neutrophil, macrophage, and myeloid dendritic cell infiltration than the patients with higher CRDC risk. Meanwhile, patients with lower CRDC risk could get more benefits from immunological therapy. The results suggested that the CRDC could be a potential prognostic biomarker and an immunotherapy predictor for lung adenocarcinoma.
Affiliation(s)
- Shengnan Guo
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Institute of Nephrology Second Affiliated Hospital and Hainan General Hospital, Hainan Medical University, Haikou, China
- Tianhao Li
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Institute of Nephrology Second Affiliated Hospital and Hainan General Hospital, Hainan Medical University, Haikou, China
- Dahua Xu
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Institute of Nephrology Second Affiliated Hospital and Hainan General Hospital, Hainan Medical University, Haikou, China
- Jiankai Xu
  - College of Bioinformatics Science and Technology, Cancer Hospital, Harbin Medical University, Harbin, China
- Hong Wang
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Institute of Nephrology Second Affiliated Hospital and Hainan General Hospital, Hainan Medical University, Haikou, China
- Jian Li
  - College of Bioinformatics Science and Technology, Cancer Hospital, Harbin Medical University, Harbin, China
- Xiaoman Bi
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Institute of Nephrology Second Affiliated Hospital and Hainan General Hospital, Hainan Medical University, Haikou, China
- Meng Cao
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Institute of Nephrology Second Affiliated Hospital and Hainan General Hospital, Hainan Medical University, Haikou, China
- Zhizhou Xu
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Institute of Nephrology Second Affiliated Hospital and Hainan General Hospital, Hainan Medical University, Haikou, China
- Qianfeng Xia
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, NHC Key Laboratory of Control of Tropical Diseases, School of Tropical Medicine, The Second Affiliated Hospital, Hainan Medical University, Haikou, China
  - *Correspondence: Qianfeng Xia; Ying Cui; Kongning Li
- Ying Cui
  - College of Bioinformatics Science and Technology, Cancer Hospital, Harbin Medical University, Harbin, China
  - *Correspondence: Qianfeng Xia; Ying Cui; Kongning Li
- Kongning Li
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Institute of Nephrology Second Affiliated Hospital and Hainan General Hospital, Hainan Medical University, Haikou, China
  - *Correspondence: Qianfeng Xia; Ying Cui; Kongning Li
9
Talarico F, Costa GO, Ota VK, Santoro ML, Noto C, Gadelha A, Bressan R, Azevedo H, Belangero SI. Systems-Level Analysis of Genetic Variants Reveals Functional and Spatiotemporal Context in Treatment-resistant Schizophrenia. Mol Neurobiol 2022; 59:3170-3182. [DOI: 10.1007/s12035-022-02794-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2021] [Accepted: 03/06/2022] [Indexed: 10/18/2022]
10
Wang S, Li J, Wang Y. M2PP: a novel computational model for predicting drug-targeted pathogenic proteins. BMC Bioinformatics 2022; 23:7. [PMID: 34983358 PMCID: PMC8728953 DOI: 10.1186/s12859-021-04522-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2021] [Accepted: 12/07/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Detecting pathogenic proteins is the origin way to understand the mechanism and resist the invasion of diseases, making pathogenic protein prediction develop into an urgent problem to be solved. Prediction for genome-wide proteins may be not necessarily conducive to rapidly cure diseases as developing new drugs specifically for the predicted pathogenic protein always need major expenditures on time and cost. In order to facilitate disease treatment, computational method to predict pathogenic proteins which are targeted by existing drugs should be exploited. RESULTS In this study, we proposed a novel computational model to predict drug-targeted pathogenic proteins, named as M2PP. Three types of features were presented on our constructed heterogeneous network (including target proteins, diseases and drugs), which were based on the neighborhood similarity information, drug-inferred information and path information. Then, a random forest regression model was trained to score unconfirmed target-disease pairs. Five-fold cross-validation experiment was implemented to evaluate model's prediction performance, where M2PP achieved advantageous results compared with other state-of-the-art methods. In addition, M2PP accurately predicted high ranked pathogenic proteins for common diseases with public biomedical literature as supporting evidence, indicating its excellent ability. CONCLUSIONS M2PP is an effective and accurate model to predict drug-targeted pathogenic proteins, which could provide convenience for the future biological researches.
Affiliation(s)
- Shiming Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China
- Jie Li
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China
- Yadong Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China
11
Castano-Duque L, Gilbert MK, Mack BM, Lebar MD, Carter-Wientjes CH, Sickler CM, Cary JW, Rajasekaran K. Flavonoids Modulate the Accumulation of Toxins From Aspergillus flavus in Maize Kernels. Front Plant Sci 2021; 12:761446. [PMID: 34899785 PMCID: PMC8662736 DOI: 10.3389/fpls.2021.761446] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/19/2021] [Accepted: 10/15/2021] [Indexed: 06/14/2023]
Abstract
Aspergillus flavus is an opportunistic fungal pathogen capable of producing aflatoxins, potent carcinogenic toxins that accumulate in maize kernels after infection. To better understand the molecular mechanisms of maize resistance to A. flavus growth and aflatoxin accumulation, we performed a high-throughput transcriptomic study in situ using maize kernels infected with A. flavus strain 3357. Three maize lines were evaluated: aflatoxin-contamination resistant line TZAR102, semi-resistant MI82, and susceptible line Va35. A modified genotype-environment association method (GEA) used to detect loci under selection via redundancy analysis (RDA) was used with the transcriptomic data to detect genes significantly influenced by maize line, fungal treatment, and duration of infection. Gene ontology enrichment analysis of genes highly expressed in infected kernels identified molecular pathways associated with defense responses to fungi and other microbes such as production of pathogenesis-related (PR) proteins and lipid bilayer formation. To further identify novel genes of interest, we incorporated genomic and phenotypic field data from a genome wide association analysis with gene expression data, allowing us to detect significantly expressed quantitative trait loci (eQTL). These results identified significant association between flavonoid biosynthetic pathway genes and infection by A. flavus. In planta fungal infections showed that the resistant line, TZAR102, has a higher fold increase of the metabolites naringenin and luteolin than the susceptible line, Va35, when comparing untreated and fungal infected plants. These results suggest flavonoids contribute to plant resistance mechanisms against aflatoxin contamination through modulation of toxin accumulation in maize kernels.
12
Identification of early and intermediate biomarkers for ARDS mortality by multi-omic approaches. Sci Rep 2021; 11:18874. [PMID: 34556700 PMCID: PMC8460799 DOI: 10.1038/s41598-021-98053-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/25/2021] [Accepted: 08/26/2021] [Indexed: 12/29/2022]
Abstract
The lack of successful clinical trials in acute respiratory distress syndrome (ARDS) has highlighted the unmet need for biomarkers predicting ARDS mortality and for novel therapeutics to reduce ARDS mortality. We utilized a systems biology multi-“omics” approach to identify predictive biomarkers for ARDS mortality. Integrating analyses were designed to differentiate ARDS non-survivors and survivors (568 subjects, 27% overall 28-day mortality) using datasets derived from multiple “omics” studies in a multi-institution ARDS cohort (54% European descent, 40% African descent). “Omics” data were available for each subject and included genome-wide association studies (GWAS, n = 297), RNA sequencing (n = 93), DNA methylation data (n = 61), and selective proteomic network analysis (n = 240). Integration of available “omic” data identified a 9-gene set (TNPO1, NUP214, HDAC1, HNRNPA1, GATAD2A, FOSB, DDX17, PHF20, CREBBP) that differentiated ARDS survivors/non-survivors, results that were validated utilizing a longitudinal transcription dataset. Pathway analysis identified TP53-, HDAC1-, TGF-β-, and IL-6-signaling pathways to be associated with ARDS mortality. Predictive biomarker discovery identified transcription levels of the 9-gene set (AUC = 0.83) and Day 7 angiopoietin 2 protein levels (AUC = 0.70) as potential candidate predictors of ARDS mortality. These results underscore the value of utilizing integrated “multi-omics” approaches in underpowered datasets from racially diverse ARDS subjects.
|
13
|
Manuel AM, Dai Y, Freeman LA, Jia P, Zhao Z. An integrative study of genetic variants with brain tissue expression identifies viral etiology and potential drug targets of multiple sclerosis. Mol Cell Neurosci 2021; 115:103656. [PMID: 34284104 PMCID: PMC8802913 DOI: 10.1016/j.mcn.2021.103656] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/11/2021] [Revised: 07/05/2021] [Accepted: 07/13/2021] [Indexed: 11/18/2022]
Abstract
Multiple sclerosis (MS) is a neuroinflammatory disorder leading to chronic disability. Brain lesions in MS commonly arise in normal-appearing white matter (NAWM). Genome-wide association studies (GWAS) have identified genetic variants associated with MS. Transcriptome alterations have been observed in case-control studies of NAWM. We developed a Cross-Dataset Evaluation (CDE) function for our network-based tool, Edge-Weighted Dense Module Search of GWAS (EW_dmGWAS). We applied CDE to integrate publicly available MS GWAS summary statistics of 41,505 cases and controls with collectively 38 NAWM expression samples, using the human protein interactome as the reference network, to investigate biological underpinnings of MS etiology. We validated the resulting modules with colocalization of GWAS and expression quantitative trait loci (eQTL) signals, using GTEx Consortium expression data for MS-relevant tissues: 14 brain tissues and 4 immune-related tissues. Other network assessments included a drug target query and functional gene set enrichment analysis. CDE prioritized an MS NAWM network containing 55 unique genes. The gene list was enriched (p-value = 2.34 × 10⁻⁷) with GWAS-eQTL colocalized genes: CDK4, IFITM3, MAPK1, MAPK3, METTL12B and PIK3R2. The resultant network also included drug signatures of FDA-approved medications. Gene set enrichment analysis revealed the top functional term "intracellular transport of virus", among other viral pathways. We prioritize critical genes from the resultant network: CDK4, IFITM3, MAPK1, MAPK3, METTL12B and PIK3R2. Enriched drug signatures suggest potential drug targets and drug repositioning strategies for MS. Finally, we propose mechanisms of potential MS viral onset, based on the prioritized gene set and functional enrichment analysis.
Affiliation(s)
- Astrid M Manuel
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
- Yulin Dai
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
- Leorah A Freeman
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA.
- Peilin Jia
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
- Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA.
|
14
|
Castano-Duque L, Ghosal S, Quilloy FA, Mitchell-Olds T, Dixit S. An epigenetic pathway in rice connects genetic variation to anaerobic germination and seedling establishment. PLANT PHYSIOLOGY 2021; 186:1042-1059. [PMID: 33638990 PMCID: PMC8195528 DOI: 10.1093/plphys/kiab100] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 09/16/2020] [Accepted: 02/09/2021] [Indexed: 06/12/2023]
Abstract
Rice production is shifting from transplanting seedlings to direct sowing of seeds. Following heavy rains, directly sown seeds may need to germinate under anaerobic environments, but most rice (Oryza sativa) genotypes cannot survive these conditions. To identify the genetic architecture of complex traits, we quantified percentage anaerobic germination (AG) in 2,700 (wet-season) and 1,500 (dry-season) sequenced rice genotypes and performed genome-wide association studies (GWAS) using 693,502 single nucleotide polymorphisms. This was followed by post-GWAS analysis with a generalized SNP-to-gene set analysis, meta-analysis, and network analysis. We determined that percentage AG is intermediate-to-high among indica subpopulations, and AG is a polygenic trait associated with transcription factors linked to ethylene responses or genes involved in metabolic processes that are known to be associated with AG. Our post-GWAS analysis identified several genes involved in a wide variety of metabolic processes. We subsequently performed functional analysis focused on the small RNA and methylation pathways. We selected CLASSY 1 (CLSY1), a gene involved in the RNA-directed DNA methylation (RdDm) pathway, for further analyses under AG and found several lines of evidence that CLSY1 influences AG. We propose that the RdDm pathway plays a role in rice responses to water status during germination and seedling establishment developmental stages.
Affiliation(s)
- Sharmistha Ghosal
- Rice Breeding Platform, International Rice Research Institute. Pili Drive, Los Baños, Laguna 4031, Philippines
- Fergie A Quilloy
- Rice Breeding Platform, International Rice Research Institute. Pili Drive, Los Baños, Laguna 4031, Philippines
- Shalabh Dixit
- Rice Breeding Platform, International Rice Research Institute. Pili Drive, Los Baños, Laguna 4031, Philippines
|
15
|
Jia P, Manuel AM, Fernandes BS, Dai Y, Zhao Z. Distinct effect of prenatal and postnatal brain expression across 20 brain disorders and anthropometric social traits: a systematic study of spatiotemporal modularity. Brief Bioinform 2021; 22:6291943. [PMID: 34086851 DOI: 10.1093/bib/bbab214] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/18/2021] [Revised: 04/30/2021] [Accepted: 05/15/2021] [Indexed: 02/06/2023]
Abstract
Different spatiotemporal abnormalities have been implicated in different neuropsychiatric disorders and anthropometric social traits, yet an investigation of temporal network modularity with brain tissue transcriptomics has been lacking. We developed a supervised network approach to investigate genome-wide association study (GWAS) results in spatial and temporal contexts and demonstrated it in 20 brain disorders and anthropometric social traits. BrainSpan transcriptome profiles were used to discover significant modules enriched with trait susceptibility genes in a developmental stage-stratified manner. We investigated whether, and in which developmental stages, GWAS-implicated genes are coordinately expressed in the brain transcriptome. We identified significant network modules for each disorder and trait at different developmental stages, providing a systematic view of network modularity at specific developmental stages for a myriad of brain disorders and traits. Specifically, we observed a strong pattern of fetal origin for most psychiatric disorders and traits [such as schizophrenia (SCZ), bipolar disorder, obsessive-compulsive disorder and neuroticism], whereas increased co-expression activities of genes were more strongly associated with neurological diseases [such as Alzheimer's disease (AD) and amyotrophic lateral sclerosis] and anthropometric traits (such as college completion, education and subjective well-being) in postnatal brains. Further analyses revealed enriched cell types and functional features that supported and corroborated prior knowledge in specific brain disorders, such as clathrin-mediated endocytosis in AD, myelin sheath in multiple sclerosis and regulation of synaptic plasticity in both college completion and education. Our study provides a landscape view of the spatiotemporal features in a myriad of brain-related disorders and traits.
Affiliation(s)
- Peilin Jia
- Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA
- Astrid M Manuel
- Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA
- Brisa S Fernandes
- Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA
- Yulin Dai
- Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA
- Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA
|
16
|
Kulkarni O, Sugier PE, Guibon J, Boland-Augé A, Lonjou C, Bacq-Daian D, Olaso R, Rubino C, Souchard V, Rachedi F, Lence-Anta JJ, Ortiz RM, Xhaard C, Laurent-Puig P, Mulot C, Guizard AV, Schvartz C, Boutron-Ruault MC, Ostroumova E, Kesminiene A, Deleuze JF, Guénel P, De Vathaire F, Truong T, Lesueur F. Gene network and biological pathways associated with susceptibility to differentiated thyroid carcinoma. Sci Rep 2021; 11:8932. [PMID: 33903625 PMCID: PMC8076215 DOI: 10.1038/s41598-021-88253-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/18/2021] [Accepted: 04/09/2021] [Indexed: 12/11/2022]
Abstract
Variants identified in earlier genome-wide association studies (GWAS) on differentiated thyroid carcinoma (DTC) explain about 10% of the overall estimated genetic contribution and could not provide complete insights into biological mechanisms involved in DTC susceptibility. Integrating systems biology information from model organisms, genome-wide expression data from tumor and matched normal tissue, and GWAS data could help identify DTC-associated genes and the pathways or functional networks in which they are involved. We performed data mining of GWAS data of the EPITHYR consortium (1551 cases and 1957 controls) using various pathway and protein-protein interaction (PPI) annotation databases and gene expression data from The Cancer Genome Atlas. We identified eight DTC-associated genes at known loci 2q35 (DIRC3), 8p12 (NRG1), 9q22 (FOXE1, TRMO, HEMGN, ANP32B, NANS) and 14q13 (MBIP). Using the EW_dmGWAS approach we found that gene networks related to glycogenolysis, glycogen metabolism, insulin metabolism and signal transduction pathways associated with muscle contraction were overrepresented with association signals (false discovery rate adjusted p-value < 0.05). Additionally, suggestive association of 21 KEGG and 75 REACTOME pathways with DTC indicates a link between DTC susceptibility and functions related to metabolism of cholesterol, amino sugar and nucleotide sugar metabolism, steroid biosynthesis, and downregulation of ERBB2 signaling pathways. Together, our results provide novel insights into biological mechanisms contributing to DTC risk.
Affiliation(s)
- Om Kulkarni
- Inserm, U900, Institut Curie, PSL University, Mines ParisTech, 75248, Paris, France
- Julie Guibon
- Inserm, U900, Institut Curie, PSL University, Mines ParisTech, 75248, Paris, France
- Université Paris-Saclay, UVSQ, Gustave Roussy, Inserm, CESP, 94807, Villejuif, France
- Anne Boland-Augé
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine, 91057, Evry, France
- Christine Lonjou
- Inserm, U900, Institut Curie, PSL University, Mines ParisTech, 75248, Paris, France
- Delphine Bacq-Daian
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine, 91057, Evry, France
- Robert Olaso
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine, 91057, Evry, France
- Carole Rubino
- Université Paris-Saclay, UVSQ, Gustave Roussy, Inserm, CESP, 94807, Villejuif, France
- Vincent Souchard
- Université Paris-Saclay, UVSQ, Gustave Roussy, Inserm, CESP, 94807, Villejuif, France
- Frédérique Rachedi
- Centre Hospitalier Territorial de Polynésie Française, CHTPF, Pirae, Tahiti, 98713, Papeete, French Polynesia
- Rosa Maria Ortiz
- Instituto Nacional de Oncologia y de Radiobiologia, INOR, La Havana, Cuba
- Constance Xhaard
- Université Paris-Saclay, UVSQ, Gustave Roussy, Inserm, CESP, 94807, Villejuif, France
- University of Lorraine, INSERM CIC 1433, Nancy CHRU, Inserm U1116, FCRIN, INI-CRCT, 54000, Nancy, France
- Pierre Laurent-Puig
- Centre de Recherche des Cordeliers, INSERM, Sorbonne Université, USPC, Université Paris Descartes, Université Paris Diderot, EPIGENETEC, 75006, Paris, France
- Claire Mulot
- Centre de Recherche des Cordeliers, INSERM, Sorbonne Université, USPC, Université Paris Descartes, Université Paris Diderot, EPIGENETEC, 75006, Paris, France
- Anne-Valérie Guizard
- Registre Général des Tumeurs du Calvados, Centre François Baclesse, 14000, Caen, France
- Inserm U1086-UCNB, Cancers and Prevention, 14000, Caen, France
- Claire Schvartz
- Registre des Cancers Thyroïdiens, Institut Jean Godinot, 51100, Reims, France
- Evgenia Ostroumova
- Environment and Radiation Section, International Agency for Research on Cancer, 69008, Lyon, France
- Ausrele Kesminiene
- Environment and Radiation Section, International Agency for Research on Cancer, 69008, Lyon, France
- Jean-François Deleuze
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine, 91057, Evry, France
- Pascal Guénel
- Université Paris-Saclay, UVSQ, Gustave Roussy, Inserm, CESP, 94807, Villejuif, France
- Florent De Vathaire
- Université Paris-Saclay, UVSQ, Gustave Roussy, Inserm, CESP, 94807, Villejuif, France
- Thérèse Truong
- Université Paris-Saclay, UVSQ, Gustave Roussy, Inserm, CESP, 94807, Villejuif, France
- Fabienne Lesueur
- Inserm, U900, Institut Curie, PSL University, Mines ParisTech, 75248, Paris, France.
|
17
|
Luo P, Chen B, Liao B, Wu F. Predicting disease‐associated genes: Computational methods, databases, and evaluations. WIRES DATA MINING AND KNOWLEDGE DISCOVERY 2021; 11. [DOI: 10.1002/widm.1383] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/28/2019] [Accepted: 06/13/2020] [Indexed: 09/09/2024]
Abstract
Complex diseases are associated with a set of genes (called disease genes), the identification of which can help scientists uncover the mechanisms of diseases and develop new drugs and treatment strategies. Due to the huge cost and time of experimental identification techniques, many computational algorithms have been proposed to predict disease genes. Although several review publications in recent years have discussed many computational methods, some of them focus on cancer driver genes while others focus on biomolecular networks, which only cover a specific aspect of existing methods. In this review, we summarize existing methods and classify them into three categories based on their rationales. Then, the algorithms, biological data, and evaluation methods used in the computational prediction are discussed. Finally, we highlight the limitations of existing methods and point out some future directions for improving these algorithms. This review could help investigators understand the principles of existing methods, and thus develop new methods to advance the computational prediction of disease genes.
This article is categorized under: Technologies > Machine Learning; Technologies > Prediction; Algorithmic Development > Biological Data Mining
Affiliation(s)
- Ping Luo
- Division of Biomedical Engineering University of Saskatchewan Saskatoon Canada
- Princess Margaret Cancer Centre University Health Network Toronto Canada
- Bolin Chen
- School of Computer Science and Technology Northwestern Polytechnical University China
- Bo Liao
- School of Mathematics and Statistics Hainan Normal University Haikou China
- Fang‐Xiang Wu
- Department of Mechanical Engineering and Department of Computer Science University of Saskatchewan Saskatoon Canada
|
18
|
Dai Y, O'Brien TD, Pei G, Zhao Z, Jia P. Characterization of genome-wide association study data reveals spatiotemporal heterogeneity of mental disorders. BMC Med Genomics 2020; 13:192. [PMID: 33371872 PMCID: PMC7771094 DOI: 10.1186/s12920-020-00832-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/18/2020] [Accepted: 11/23/2020] [Indexed: 12/15/2022]
Abstract
Background: Psychiatric disorders such as schizophrenia (SCZ), bipolar disorder (BIP), major depressive disorder (MDD), attention deficit-hyperactivity disorder (ADHD), and autism spectrum disorder (ASD) are often related to brain development. Both shared and unique biological and neurodevelopmental processes have been reported to be involved in these disorders. Methods: In this work, we developed an integrative analysis framework to seek the sensitive spatiotemporal point during brain development underlying each disorder. Specifically, we first identified spatiotemporal gene co-expression modules for four brain regions and three developmental stages (prenatal, birth to 11 years old, and older than 13 years), totaling 12 spatiotemporal sites. By integrating GWAS summary statistics and the spatiotemporal co-expression modules, we characterized the risk genes and their co-expression partners for five disorders. Results: We found that SCZ and BIP, and ASD and ADHD, tend to cluster with each other and keep a distance from other psychiatric disorders. At the gene level, we identified several genes that were shared among the most significant modules, such as CTNNB1 and LNX1, and a hub gene, ATF2, in multiple modules. Moreover, we pinpointed two spatiotemporal points in the prenatal stage with active expression activities and highlighted one postnatal point for BIP. Further functional analysis of the disorder-related modules highlighted the apoptotic signaling pathway for ASD and the immune-related and cell-cell adhesion functions for SCZ. Conclusion: Our study demonstrated the dynamic changes of disorder-related genes at the network level, shedding light on the spatiotemporal regulation during brain development.
Affiliation(s)
- Yulin Dai
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA
- Timothy D O'Brien
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA
- Guangsheng Pei
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA
- Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA; Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA; MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX, 77030, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, 37203, USA.
- Peilin Jia
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA.
|
19
|
Yan F, Jia P, Yoshioka H, Suzuki A, Iwata J, Zhao Z. A developmental stage-specific network approach for studying dynamic co-regulation of transcription factors and microRNAs during craniofacial development. Development 2020; 147:226075. [PMID: 33234712 DOI: 10.1242/dev.192948] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 05/15/2020] [Accepted: 11/10/2020] [Indexed: 12/21/2022]
Abstract
Craniofacial development is regulated through dynamic and complex mechanisms that involve various signaling cascades and gene regulations. Disruption of such regulations can result in craniofacial birth defects. Here, we propose the first developmental stage-specific network approach by integrating two crucial regulators, transcription factors (TFs) and microRNAs (miRNAs), to study their co-regulation during craniofacial development. Specifically, we used TFs, miRNAs and non-TF genes to form feed-forward loops (FFLs) using genomic data covering mouse embryonic days E10.5 to E14.5. We identified key novel regulators (TFs Foxm1, Hif1a, Zbtb16, Myog, Myod1 and Tcf7, and miRNAs miR-340-5p and miR-129-5p) and target genes (Col1a1, Sgms2 and Slc8a3) expression of which changed in a developmental stage-dependent manner. We found that the Wnt-FoxO-Hippo pathway (from E10.5 to E11.5), tissue remodeling (from E12.5 to E13.5) and miR-129-5p-mediated Col1a1 regulation (from E10.5 to E14.5) might play crucial roles in craniofacial development. Enrichment analyses further suggested their functions. Our experiments validated the regulatory roles of miR-340-5p and Foxm1 in the Wnt-FoxO-Hippo subnetwork, as well as the role of miR-129-5p in the miR-129-5p-Col1a1 subnetwork. Thus, our study helps understand the comprehensive regulatory mechanisms for craniofacial development.
Affiliation(s)
- Fangfang Yan
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Peilin Jia
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Hiroki Yoshioka
- Department of Diagnostic and Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX 77054, USA; Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX 77054, USA
- Akiko Suzuki
- Department of Diagnostic and Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX 77054, USA; Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX 77054, USA
- Junichi Iwata
- Department of Diagnostic and Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX 77054, USA; Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX 77054, USA; MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX 77030, USA
- Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX 77030, USA; Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
|
20
|
Yu H, Chen D, Oyebamiji O, Zhao YY, Guo Y. Expression correlation attenuates within and between key signaling pathways in chronic kidney disease. BMC Med Genomics 2020; 13:134. [PMID: 32957963 PMCID: PMC7504859 DOI: 10.1186/s12920-020-00772-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/13/2022]
Abstract
BACKGROUND: Compared to the conventional differential expression approach, differential coexpression analysis offers a different yet complementary perspective on diseased transcriptomes. In particular, global loss of transcriptome correlation was previously observed in aging mice, and a recent study found that genetic and environmental perturbations in human subjects tended to cause universal attenuation of transcriptome coherence. While methodological progress in differential coexpression has helped research on several human diseases, coexpression disruptions in chronic kidney disease (CKD) have not yet been investigated. METHODS: RNA-seq was performed on total RNA from kidney tissue samples of 140 CKD patients. A combination of differential coexpression methods was employed to analyze the transcriptome transition in CKD from the early, mild phase to the late, severe kidney damage phase. RESULTS: We discovered a global expression correlation attenuation in CKD progression, with the pathway "Regulation of nuclear SMAD2/3 signaling" demonstrating the most remarkable intra-pathway correlation rewiring. Moreover, the pathway "Signaling events mediated by focal adhesion kinase" displayed significantly weakened crosstalk with seven pathways, including "Regulation of nuclear SMAD2/3 signaling". Well-known relevant genes, such as ACTN4, were characterized by widespread correlation disassociation with partners from a wide array of signaling pathways. CONCLUSIONS: Altogether, our analysis reported a global expression correlation attenuation within and between key signaling pathways in chronic kidney disease, and presented a list of vanishing hub genes and disrupted correlations within and between key signaling pathways, illuminating the pathophysiological mechanisms of CKD progression.
Affiliation(s)
- Hui Yu
- Department of Internal Medicine, University of New Mexico, Albuquerque, NM 87131 USA
- Danqian Chen
- Key Laboratory of Resource Biology and Biotechnology in Western China, School of Life Sciences, Northwest University, Xi’an, 710069 Shaanxi China
- Ying-Yong Zhao
- Key Laboratory of Resource Biology and Biotechnology in Western China, School of Life Sciences, Northwest University, Xi’an, 710069 Shaanxi China
- Yan Guo
- Department of Internal Medicine, University of New Mexico, Albuquerque, NM 87131 USA
|
21
|
Nakashima S, Nacher JC, Song J, Akutsu T. An Overview of Bioinformatics Methods for Analyzing Autism Spectrum Disorders. Curr Pharm Des 2020; 25:4552-4559. [PMID: 31713477 DOI: 10.2174/1381612825666191111154837] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/27/2019] [Accepted: 11/07/2019] [Indexed: 02/06/2023]
Abstract
Autism Spectrum Disorders (ASD) are a group of neurodevelopmental disorders and are well recognized to be biologically heterogeneous in which various factors are associated, including genetic, metabolic, and environmental ones. Despite its high prevalence, only a few drugs have been approved for the treatment of ASD. Therefore, extensive studies have been conducted to identify ASD risk genes and novel drug targets. Since many genes and many other factors are associated with ASD, various bioinformatics methods have also been developed for the analysis of ASD. In this paper, we review bioinformatics methods for analyzing ASD data with the focus on computational aspects. We classify existing methods into two categories: (i) methods based on genomic variants and gene expression data, and (ii) methods using biological networks, which include gene co-expression networks and protein-protein interaction networks. Next, for each method, we provide an overall flow and elaborate on the computational techniques used. We also briefly review other approaches and discuss possible future directions and strategies for developing bioinformatics approaches to analyze ASD.
Affiliation(s)
- Shogo Nakashima
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
- Jose C Nacher
- Department of Information Science, Faculty of Science, Toho University, Kyoto, Japan
- Jiangning Song
- Monash Biomedicine Discovery Institute, Monash University, Clayton VIC 3800, Australia
- Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
|
22
|
Manuel AM, Dai Y, Freeman LA, Jia P, Zhao Z. Dense module searching for gene networks associated with multiple sclerosis. BMC Med Genomics 2020; 13:48. [PMID: 32241259 PMCID: PMC7118851 DOI: 10.1186/s12920-020-0674-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 02/07/2023]
Abstract
BACKGROUND: Multiple sclerosis (MS) is a complex disease in which the immune system attacks the central nervous system. The molecular mechanisms contributing to the etiology of MS remain poorly understood. Genome-wide association studies (GWAS) of MS have identified a small number of genetic loci significant at the genome level, but they are mainly non-coding variants. Network-assisted analysis may help better interpret the functional roles of the variants with association signals and their potential translational medicine applications. The Dense Module Searching of GWAS tool (dmGWAS version 2.4) developed in our team is applied to 2 MS GWAS datasets (GeneMSA and IMSGC GWAS) using the human protein interactome as the reference network. A dual evaluation strategy is used to generate reproducible results. RESULTS: Approximately 7500 significant network modules were identified for each independent GWAS dataset, and 20 significant modules were identified from the dual evaluation. The top modules included GRB2, HDAC1, JAK2, MAPK1, and STAT3 as central genes. Top module genes were enriched with functional terms such as "regulation of glial cell differentiation" (adjusted p-value = 2.58 × 10⁻³), "T-cell costimulation" (adjusted p-value = 2.11 × 10⁻⁶) and "virus receptor activity" (adjusted p-value = 1.67 × 10⁻³). Interestingly, top gene networks included several FDA-approved MS drug target genes: HDAC1, IL2RA, KEAP1, and RELA. CONCLUSIONS: Our dmGWAS network analyses highlighted several genes (GRB2, HDAC1, IL2RA, JAK2, KEAP1, MAPK1, RELA and STAT3) in top modules that are promising for interpreting GWAS signals and linking them to MS drug targets. The genes enriched with glial cell differentiation are important for understanding neurodegenerative processes in MS and for remyelination therapy investigation. Importantly, our identified genetic signals enriched in T cell costimulation and viral receptor activity supported the viral infection onset hypothesis for MS.
Affiliation(s)
- Astrid M. Manuel
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030 USA
- Yulin Dai
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030 USA
- Leorah A. Freeman
- Department of Neurology, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030 USA
- Peilin Jia
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030 USA
- Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030 USA
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030 USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203 USA
|
23
|
Yan F, Dai Y, Iwata J, Zhao Z, Jia P. An integrative, genomic, transcriptomic and network-assisted study to identify genes associated with human cleft lip with or without cleft palate. BMC Med Genomics 2020; 13:39. [PMID: 32241273 PMCID: PMC7118807 DOI: 10.1186/s12920-020-0675-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/27/2022]
Abstract
BACKGROUND Cleft lip with or without cleft palate (CL/P) is one of the most common congenital human birth defects. A combination of genetic and epidemiology studies has contributed to a better knowledge of CL/P-associated candidate genes and environmental risk factors. However, the etiology of CL/P is still not fully understood. In this study, to identify new CL/P-associated genes, we conducted an integrative analysis using our in-house network tools, dmGWAS [dense module search for Genome-Wide Association Studies (GWAS)] and EW_dmGWAS (Edge-Weighted dmGWAS), in combination with GWAS data, the human protein-protein interaction (PPI) network, and differential gene expression profiles. RESULTS A total of 87 genes were consistently detected in both European and Asian ancestries in dmGWAS. Of these, 31.0% (27/87) showed nominal significance with CL/P (gene-based p < 0.05), with three genes showing strong association signals: KIAA1598, GPR183, and ZMYND11 (p < 1 × 10⁻³). In EW_dmGWAS, we identified 253 and 245 module genes associated with CL/P for the European and Asian ancestries, respectively. Functional enrichment analysis demonstrated that these genes were involved in cell adhesion, protein localization to the plasma membrane, the regulation of the apoptotic signaling pathway, and other pathological conditions. A small proportion of genes (5.1% for European ancestry; 2.4% for Asian ancestry) had prior evidence in CL/P as annotated in the CleftGeneDB database. Our analysis highlighted nine novel CL/P candidate genes (BRD1, CREBBP, CSK, DNM1L, LOR, PTPN18, SND1, TGS1, and VIM) and 17 previously reported genes in the top modules. CONCLUSIONS The genes identified through superimposing GWAS signals and differential gene expression profiles onto the human PPI network, as well as their functional features, helped our understanding of the etiology of CL/P. Our multi-omics integrative analyses revealed nine novel candidate genes involved in CL/P.
Affiliation(s)
- Fangfang Yan
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX, 77030, USA
- Yulin Dai
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX, 77030, USA
- Junichi Iwata
  - Department of Diagnostic and Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
  - Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX, 77030, USA
  - Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, 37203, USA
- Peilin Jia
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX, 77030, USA
|
24
|
Abstract
BACKGROUND Disease gene prediction is a critical and challenging task. Many computational methods have been developed to predict disease genes, which can reduce the money and time spent on experimental validation. Since proteins (products of genes) usually work together to achieve a specific function, biomolecular networks, such as the protein-protein interaction (PPI) network and gene co-expression networks, are widely used to predict disease genes by analyzing the relationships between known disease genes and other genes in the networks. However, existing methods commonly use a universal static PPI network, which ignores the fact that PPIs are dynamic and that PPIs in different patients should also differ. RESULTS To address these issues, we develop an ensemble algorithm to predict disease genes from clinical sample-based networks (EdgCSN). The algorithm first constructs single sample-based networks for each case sample of the disease under study. Then, these single sample-based networks are merged into several fused networks based on the clustering results of the samples. After that, logistic models are trained with centrality features extracted from the fused networks, and an ensemble strategy is used to predict the final probability of each gene being disease-associated. EdgCSN is evaluated on breast cancer (BC), thyroid cancer (TC) and Alzheimer's disease (AD) and obtains AUC values of 0.970, 0.971 and 0.966, respectively, which are much better than those of the competing algorithms. Subsequent de novo validations also demonstrate the ability of EdgCSN to predict new disease genes. CONCLUSIONS In this study, we propose EdgCSN, which is an ensemble learning algorithm for predicting disease genes with models trained on centrality features extracted from clinical sample-based networks. Results of the leave-one-out cross-validation show that EdgCSN performs much better than the competing algorithms in predicting BC-associated, TC-associated and AD-associated genes. De novo validations also show that EdgCSN is valuable for identifying new disease genes.
Affiliation(s)
- Ping Luo
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, S7N 5A9 Canada
- Li-Ping Tian
  - School of Information, Beijing Wuzi University, Beijing, 101149 China
- Bolin Chen
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072 China
- Qianghua Xiao
  - School of Mathematics and Physics, University of South China, HengYang, 421001 China
- Fang-Xiang Wu
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, S7N 5A9 Canada
  - Department of Computer Science, University of Saskatchewan, Saskatoon, S7N 5C9 Canada
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, 571158 China
  - Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, S7N 5A9 Canada
|
25
|
Liu C, Ma Y, Zhao J, Nussinov R, Zhang YC, Cheng F, Zhang ZK. Computational network biology: Data, models, and applications. Physics Reports 2020; 846:1-66. [DOI: 10.1016/j.physrep.2019.12.004] [Citation(s) in RCA: 86] [Impact Index Per Article: 17.2] [Indexed: 01/03/2025]
|
26
|
Shi H, Sun H, Li J, Bai Z, Wu J, Li X, Lv Y, Zhang G. Systematic analysis of lncRNA and microRNA dynamic features reveals diagnostic and prognostic biomarkers of myocardial infarction. Aging (Albany NY) 2020; 12:945-964. [PMID: 31927529 PMCID: PMC6977700 DOI: 10.18632/aging.102667] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/22/2019] [Accepted: 12/24/2019] [Indexed: 12/14/2022]
Abstract
Analyses of long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) implicated in myocardial infarction (MI) have increased our understanding of gene regulatory mechanisms in MI. However, it is not known how their expression fluctuates over the different stages of MI progression. In this study, we used time-series gene expression data to examine global lncRNA and miRNA expression patterns during the acute phase of MI and at three different time points thereafter. We observed that the largest expression peak for mRNAs, lncRNAs, and miRNAs occurred during the acute phase of MI and involved mainly protein-coding, rather than non-coding RNAs. Functional analysis indicated that the lncRNAs and miRNAs most sensitive to MI and most unstable during MI progression were usually related to fewer biological functions. Additionally, we developed a novel computational method for identifying dysregulated competing endogenous lncRNA-miRNA-mRNA triplets (LmiRM-CTs) during MI onset and progression. As a result, a new panel of candidate diagnostic biomarkers defined by seven lncRNAs was suggested to have high classification performance for patients with or without MI, and a new panel of prognostic biomarkers defined by two lncRNAs evidenced high discriminatory capability for MI patients who developed heart failure from those who did not.
Affiliation(s)
- Hongbo Shi
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Haoran Sun
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Jiayao Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Ziyi Bai
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Jie Wu
  - Laboratory of Medical Genetics, Harbin Medical University, Harbin, Heilongjiang, China
- Xiuhong Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Yingli Lv
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Guangde Zhang
  - Department of Cardiology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang, China
|
27
|
Abstract
The abundance of high-throughput data and technical refinements in graph theories have allowed network analysis to become an effective approach for various medical fields. This chapter introduces co-expression, Bayesian, and regression-based network construction methods, which are the basis of network analysis. Various methods in network topology analysis are explained, along with their unique features and applications in biomedicine. Furthermore, we explain the role of network embedding in reducing the dimensionality of networks and outline several popular algorithms used by researchers today. Current literature has implemented different combinations of topology analysis and network embedding techniques, and we outline several studies in the fields of genetic-based disease prediction, drug-target identification, and multi-level omics integration.
|
28
|
Yoon S, Nguyen HCT, Yoo YJ, Kim J, Baik B, Kim S, Kim J, Kim S, Nam D. Efficient pathway enrichment and network analysis of GWAS summary data using GSA-SNP2. Nucleic Acids Res 2019; 46:e60. [PMID: 29562348 PMCID: PMC6007455 DOI: 10.1093/nar/gky175] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 03/09/2017] [Accepted: 03/13/2018] [Indexed: 01/19/2023]
Abstract
Pathway-based analysis in genome-wide association studies (GWAS) is widely used to uncover novel multi-genic functional associations. Many of these pathway-based methods have been used to test the enrichment of the associated genes in the pathways, but they exhibited low power and were highly affected by free parameters. We present the novel method and software GSA-SNP2 for pathway enrichment analysis of GWAS P-value data. GSA-SNP2 provides high power, decent type I error control and fast computation by incorporating the random set model and a SNP-count adjusted gene score. In a comparative study using simulated and real GWAS data, GSA-SNP2 exhibited high power and best prioritized gold standard positive pathways compared with six existing enrichment-based methods and two self-contained methods (an alternative pathway analysis approach). Based on these results, the difference between pathway analysis approaches was investigated and the effects of the gene correlation structures on the pathway enrichment analysis were also discussed. In addition, GSA-SNP2 is able to visualize protein interaction networks within and across the significant pathways so that the user can prioritize the core subnetworks for further studies. GSA-SNP2 is freely available at https://sourceforge.net/projects/gsasnp2.
Affiliation(s)
- Sora Yoon
  - School of Life Sciences, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Hai C T Nguyen
  - School of Life Sciences, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Yun J Yoo
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, South Korea
  - Department of Mathematics Education, Seoul National University, Seoul 08826, Republic of Korea
- Jinhwan Kim
  - School of Life Sciences, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Bukyung Baik
  - School of Life Sciences, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Sounkou Kim
  - School of Life Sciences, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Jin Kim
  - SK Telecom, Seoul 04539, Republic of Korea
- Sangsoo Kim
  - School of Systems Biomedical Science, Soongsil University, Seoul 06978, Republic of Korea
- Dougu Nam
  - School of Life Sciences, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
  - Department of Mathematical Sciences, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
|
29
|
Tenenbaum JD, Bhuvaneshwar K, Gagliardi JP, Fultz Hollis K, Jia P, Ma L, Nagarajan R, Rakesh G, Subbian V, Visweswaran S, Zhao Z, Rozenblit L. Translational bioinformatics in mental health: open access data sources and computational biomarker discovery. Brief Bioinform 2019; 20:842-856. [PMID: 29186302 PMCID: PMC6585382 DOI: 10.1093/bib/bbx157] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/01/2017] [Revised: 10/24/2017] [Indexed: 12/12/2022]
Abstract
Mental illness is increasingly recognized as both a significant cost to society and a significant area of opportunity for biological breakthrough. As -omics and imaging technologies enable researchers to probe molecular and physiological underpinnings of multiple diseases, opportunities arise to explore the biological basis for behavioral health and disease. From individual investigators to large international consortia, researchers have generated rich data sets in the area of mental health, including genomic, transcriptomic, metabolomic, proteomic, clinical and imaging resources. General data repositories such as the Gene Expression Omnibus (GEO) and Database of Genotypes and Phenotypes (dbGaP) and mental health (MH)-specific initiatives, such as the Psychiatric Genomics Consortium, MH Research Network and PsychENCODE represent a wealth of information yet to be gleaned. At the same time, novel approaches to integrate and analyze data sets are enabling important discoveries in the area of mental and behavioral health. This review will discuss and catalog into an organizing framework the increasingly diverse set of MH data resources available, using schizophrenia as a focus area, and will describe novel and integrative approaches to molecular biomarker discovery that make use of mental health data.
Affiliation(s)
- Jessica D Tenenbaum
  - Department of Biostatistics and Bioinformatics at the Duke University School of Medicine
- Kate Fultz Hollis
  - Department of Biomedical Informatics and Clinical Epidemiology at Oregon Health and Science University
- Peilin Jia
  - University of Texas Health Science Center at Houston
- Liang Ma
  - Bioinformatics and Systems Medicine Laboratory (BSML), Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston
- Vignesh Subbian
  - Department of Biomedical Engineering and the Department of Systems and Industrial Engineering at the University of Arizona
|
30
|
Mignogna KM, Bacanu SA, Riley BP, Wolen AR, Miles MF. Cross-species alcohol dependence-associated gene networks: Co-analysis of mouse brain gene expression and human genome-wide association data. PLoS One 2019; 14:e0202063. [PMID: 31017905 PMCID: PMC6481773 DOI: 10.1371/journal.pone.0202063] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 07/23/2018] [Accepted: 04/07/2019] [Indexed: 01/06/2023]
Abstract
Genome-wide association studies on alcohol dependence, by themselves, have yet to account for the estimated heritability of the disorder and provide incomplete mechanistic understanding of this complex trait. Integrating brain ethanol-responsive gene expression networks from model organisms with human genetic data on alcohol dependence could aid in identifying dependence-associated genes and functional networks in which they are involved. This study used a modification of the Edge-Weighted Dense Module Searching for genome-wide association studies (EW-dmGWAS) approach to co-analyze whole-genome gene expression data from ethanol-exposed mouse brain tissue, human protein-protein interaction databases and alcohol dependence-related genome-wide association studies. Results revealed novel ethanol-responsive and alcohol dependence-associated gene networks in prefrontal cortex, nucleus accumbens, and ventral tegmental area. Three of these networks were overrepresented with genome-wide association signals from an independent dataset. These networks were significantly overrepresented for gene ontology categories involving several mechanisms, including actin filament-based activity, transcript regulation, Wnt and Syndecan-mediated signaling, and ubiquitination. Together, these studies provide novel insight for brain mechanisms contributing to alcohol dependence.
Affiliation(s)
- Kristin M. Mignogna
  - Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, Virginia, United States of America
  - VCU Alcohol Research Center, Virginia Commonwealth University, Richmond, Virginia, United States of America
  - VCU Center for Clinical & Translational Research, Virginia Commonwealth University, Richmond, Virginia, United States of America
- Silviu A. Bacanu
  - Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, Virginia, United States of America
  - VCU Alcohol Research Center, Virginia Commonwealth University, Richmond, Virginia, United States of America
- Brien P. Riley
  - Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, Virginia, United States of America
  - VCU Alcohol Research Center, Virginia Commonwealth University, Richmond, Virginia, United States of America
- Aaron R. Wolen
  - Department of Human and Molecular Genetics, Virginia Commonwealth University, Richmond, Virginia, United States of America
- Michael F. Miles
  - VCU Alcohol Research Center, Virginia Commonwealth University, Richmond, Virginia, United States of America
  - Department of Pharmacology and Toxicology, Virginia Commonwealth University, Richmond, Virginia, United States of America
|
31
|
Dai Y, Pei G, Zhao Z, Jia P. A Convergent Study of Genetic Variants Associated With Crohn's Disease: Evidence From GWAS, Gene Expression, Methylation, eQTL and TWAS. Front Genet 2019; 10:318. [PMID: 31024628 PMCID: PMC6467075 DOI: 10.3389/fgene.2019.00318] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 12/22/2018] [Accepted: 03/21/2019] [Indexed: 12/12/2022]
Abstract
Crohn’s Disease (CD) is one of the predominant forms of inflammatory bowel disease (IBD). A combination of genetic and non-genetic risk factors has been reported to contribute to the development of CD. Many high-throughput omics studies, such as genome-wide association studies (GWAS) and next generation sequencing studies, have been conducted to identify disease-associated risk variants that might contribute to CD. A pressing need remains to prioritize and characterize candidate genes that underlie the etiology of CD. In this study, we collected comprehensive multi-dimensional data from GWAS, gene expression, and methylation studies and generated transcriptome-wide association study (TWAS) data to further interpret the GWAS association results. We applied our previously developed method called mega-analysis of Odds Ratio (MegaOR) to prioritize CD candidate genes (CDgenes). As a result, we identified consensus sets of CDgenes (62–235 genes) based on the evidence matrix. We demonstrated that these CDgenes interacted with each other significantly more frequently than randomly expected. Functional annotation of these genes highlighted critical immune-related processes such as immune response, MHC class II receptor activity, and immunological disorders. In particular, the constitutive photomorphogenesis 9 (COP9) signalosome related genes were found to be significantly enriched in CDgenes, implying a potential role of the COP9 signalosome in the pathogenesis of CD. Finally, we found that some of the CDgenes shared biological functions with known drug targets of CD, such as the regulation of inflammatory response and leukocyte adhesion to vascular endothelial cells. In summary, we identified high-confidence CDgenes from multi-dimensional evidence, providing insights for the understanding of CD etiology.
Affiliation(s)
- Yulin Dai
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Guangsheng Pei
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
  - Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, United States
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
- Peilin Jia
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
|
32
|
Luo P, Li Y, Tian LP, Wu FX. Enhancing the prediction of disease–gene associations with multimodal deep learning. Bioinformatics 2019; 35:3735-3742. [DOI: 10.1093/bioinformatics/btz155] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 10/29/2018] [Revised: 02/11/2019] [Accepted: 02/27/2019] [Indexed: 12/20/2022]
Abstract
Motivation
Computationally predicting disease genes helps scientists optimize the in-depth experimental validation and accelerates the identification of real disease-associated genes. Modern high-throughput technologies have generated a vast amount of omics data, and integrating them is expected to improve the accuracy of computational prediction. As an integrative model, multimodal deep belief net (DBN) can capture cross-modality features from heterogeneous datasets to model a complex system. Studies have shown its power in image classification and tumor subtype prediction. However, multimodal DBN has not been used in predicting disease–gene associations.
Results
In this study, we propose a method to predict disease–gene associations by multimodal DBN (dgMDL). Specifically, latent representations of protein-protein interaction networks and gene ontology terms are first learned by two DBNs independently. Then, a joint DBN is used to learn cross-modality representations from the two sub-models by taking the concatenation of their obtained latent representations as the multimodal input. Finally, disease–gene associations are predicted with the learned cross-modality representations. The proposed method is compared with two state-of-the-art algorithms in terms of 5-fold cross-validation on a set of curated disease–gene associations. dgMDL achieves an AUC of 0.969 which is superior to the competing algorithms. Further analysis of the top-10 unknown disease–gene pairs also demonstrates the ability of dgMDL in predicting new disease–gene associations.
Availability and implementation
Prediction results and a reference implementation of dgMDL in Python are available at https://github.com/luoping1004/dgMDL.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ping Luo
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, Canada
- Yuanyuan Li
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, Canada
  - School of Mathematics and Physics, Wuhan Institute of Technology, Wuhan, China
- Li-Ping Tian
  - School of Information, Beijing Wuzi University, Beijing, China
- Fang-Xiang Wu
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, Canada
  - Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, Canada
  - Department of Computer Science, University of Saskatchewan, Saskatoon, Canada
|
33
|
Luo P, Tian LP, Ruan J, Wu FX. Disease Gene Prediction by Integrating PPI Networks, Clinical RNA-Seq Data and OMIM Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:222-232. [PMID: 29990218 DOI: 10.1109/tcbb.2017.2770120] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 06/08/2023]
Abstract
Disease gene prediction is a challenging task that has a variety of applications such as early diagnosis and drug development. Existing machine learning methods suffer from the imbalanced sample issue because the number of known disease genes (positive samples) is much smaller than that of unknown genes, which are typically considered to be negative samples. In addition, most methods have not utilized clinical data from patients with a specific disease to predict disease genes. In this study, we propose a disease gene prediction algorithm (called dgSeq) that combines the protein-protein interaction (PPI) network, clinical RNA-Seq data, and Online Mendelian Inheritance in Man (OMIM) data. dgSeq constructs differential networks based on rewiring information calculated from clinical RNA-Seq data. To select balanced sets of non-disease genes (negative samples), a disease-gene network is also constructed from OMIM data. After features are extracted from the PPI networks and differential networks, logistic regression classifiers are trained. dgSeq obtains AUC values of 0.88, 0.83, and 0.80 for identifying breast cancer genes, thyroid cancer genes, and Alzheimer's disease genes, respectively, which indicates its superiority to three other competing methods. Both gene set enrichment analysis and predicted results demonstrate that dgSeq can effectively predict new disease genes.
|
34
|
Shi H, Li J, Song Q, Cheng L, Sun H, Fan W, Li J, Wang Z, Zhang G. Systematic identification and analysis of dysregulated miRNA and transcription factor feed-forward loops in hypertrophic cardiomyopathy. J Cell Mol Med 2018; 23:306-316. [PMID: 30338905 PMCID: PMC6307764 DOI: 10.1111/jcmm.13928] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 04/27/2018] [Accepted: 08/30/2018] [Indexed: 12/22/2022]
Abstract
Hypertrophic cardiomyopathy (HCM) is the most common genetic cardiovascular disease. Although some genes and miRNAs related to HCM have been studied, the molecular regulatory mechanisms between miRNAs and transcription factors (TFs) in HCM have not been systematically elucidated. In this study, we proposed a novel method for identifying dysregulated miRNA-TF feed-forward loops (FFLs) by integrating sample-matched miRNA and gene expression profiles and experimentally verified interactions of TF-target gene and miRNA-target gene. We identified 316 dysregulated miRNA-TF FFLs in HCM, which were confirmed to be closely related to HCM from various perspectives. Subpathway enrichment analysis demonstrated that the method outperformed the existing method. Furthermore, we systematically analysed the global architecture and features of gene regulation by miRNAs and TFs in HCM, and the FFL composed of hsa-miR-17-5p, FASN and STAT3 was inferred to play critical roles in HCM. Additionally, we identified two panels of biomarkers defined by three TFs (CEBPB, HIF1A, and STAT3) and four miRNAs (hsa-miR-155-5p, hsa-miR-17-5p, hsa-miR-20a-5p, and hsa-miR-181a-5p) in a discovery cohort of 126 samples, which could differentiate HCM patients from healthy controls with better performance. Our work provides HCM-related dysregulated miRNA-TF FFLs for further experimental study, and provides candidate biomarkers for HCM diagnosis and treatment.
Collapse
Affiliation(s)
- Hongbo Shi
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Jiayao Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Qiong Song
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Haoran Sun
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Wenjing Fan
- Department of Cardiology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Jianfei Li
- Emergency Cardiovascular Medicine, Inner Mongolia Autonomous Region People's Hospital, Hohhot, China
| | - Zhenzhen Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Guangde Zhang
- Department of Cardiology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
|
35
|
Foraita R, Dijkstra L, Falkenberg F, Garling M, Linder R, Pflock R, Rizkallah MR, Schwaninger M, Wright MN, Pigeot I. [Detection of drug risks after approval: Methods development for the use of routine statutory health insurance data]. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz 2018; 61:1075-1081. [PMID: 30027343 DOI: 10.1007/s00103-018-2786-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/26/2022]
Abstract
Adverse drug reactions are among the leading causes of death. Pharmacovigilance aims to monitor drugs after they have been released to the market in order to detect potential risks. Data sources commonly used to this end are spontaneous reports sent in by doctors or pharmaceutical companies. Such reports alone are of limited use for detecting potential health risks. Routine statutory health insurance data, however, are a richer source, since they not only provide a detailed picture of the patients' wellbeing over time but also contain information on concomitant medication and comorbidities. To take advantage of their potential and to increase drug safety, we will further develop statistical methods, drawing inspiration from methods that have shown their merit in other fields. A plethora of methods have been proposed over the years for spontaneous reporting data; a comprehensive comparison of these methods and their potential use for longitudinal data should be explored. In addition, we show how methods from machine learning could aid in identifying rare risks. We discuss these so-called enrichment analyses and how utilizing pharmaceutical similarities between drugs and similarities between comorbidities could help to construct risk profiles of the patients prone to experience an adverse drug event. Summarizing these methods will further advance drug safety research based on healthcare claims data from German health insurers, which, due to their size, longitudinal coverage, and timeliness, form an excellent basis for investigating adverse effects of drugs.
Affiliation(s)
- Ronja Foraita
- Leibniz-Institut für Präventionsforschung und Epidemiologie - BIPS, Achterstr. 30, 28359, Bremen, Deutschland.
| | - Louis Dijkstra
- Leibniz-Institut für Präventionsforschung und Epidemiologie - BIPS, Achterstr. 30, 28359, Bremen, Deutschland
| | - Felix Falkenberg
- Wissenschaftliches Institut der Techniker Krankenkasse für Nutzen und Effizienz im Gesundheitswesen (WINEG TK), Hamburg, Deutschland
| | - Marco Garling
- Wissenschaftliches Institut der Techniker Krankenkasse für Nutzen und Effizienz im Gesundheitswesen (WINEG TK), Hamburg, Deutschland
| | - Roland Linder
- Wissenschaftliches Institut der Techniker Krankenkasse für Nutzen und Effizienz im Gesundheitswesen (WINEG TK), Hamburg, Deutschland
| | - René Pflock
- Institut für Experimentelle und Klinische Pharmakologie und Toxikologie, Universität zu Lübeck, Lübeck, Deutschland
| | - Mariam R Rizkallah
- Leibniz-Institut für Präventionsforschung und Epidemiologie - BIPS, Achterstr. 30, 28359, Bremen, Deutschland
| | - Markus Schwaninger
- Institut für Experimentelle und Klinische Pharmakologie und Toxikologie, Universität zu Lübeck, Lübeck, Deutschland
| | - Marvin N Wright
- Leibniz-Institut für Präventionsforschung und Epidemiologie - BIPS, Achterstr. 30, 28359, Bremen, Deutschland
| | - Iris Pigeot
- Leibniz-Institut für Präventionsforschung und Epidemiologie - BIPS, Achterstr. 30, 28359, Bremen, Deutschland
|
36
|
O'Brien TD, Jia P, Caporaso NE, Landi MT, Zhao Z. Weak sharing of genetic association signals in three lung cancer subtypes: evidence at the SNP, gene, regulation, and pathway levels. Genome Med 2018; 10:16. [PMID: 29486777 PMCID: PMC5828003 DOI: 10.1186/s13073-018-0522-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 11/12/2017] [Accepted: 02/13/2018] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND There are two main types of lung cancer: small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC). NSCLC has many subtypes, but the two most common are lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC). These subtypes are mainly classified by physiological and pathological characteristics, although there is increasing evidence of genetic and molecular differences as well. Although some work has been done at the somatic level to explore the genetic and biological differences among subtypes, little work has been done that interrogates these differences at the germline level to characterize the unique and shared susceptibility genes for each subtype. METHODS We used single-nucleotide polymorphisms (SNPs) from a genome-wide association study (GWAS) of European samples to interrogate the similarity of the subtypes at the SNP, gene, pathway, and regulatory levels. We expanded these genotyped SNPs to include all SNPs in linkage disequilibrium (LD) using data from the 1000 Genomes Project. We mapped these SNPs to several lung tissue expression quantitative trait loci (eQTL) and enhancer datasets to identify regulatory SNPs and their target genes. We used these genes to perform a biological pathway analysis for each subtype. RESULTS We identified 8295, 8734, and 8361 SNPs with moderate association signals for LUAD, LUSC, and SCLC, respectively. Those SNPs had p < 1 × 10⁻³ in the original GWAS or were within LD (r² > 0.8, Europeans) to the genotyped SNPs. We identified 215, 320, and 172 disease-associated genes for LUAD, LUSC, and SCLC, respectively. Only five genes (CHRNA5, IDH3A, PSMA4, RP11-650L12.2, and TBC1D2B) overlapped all subtypes. Furthermore, we observed only two pathways from the Kyoto Encyclopedia of Genes and Genomes shared by all subtypes. At the regulatory level, only three eQTL target genes and two enhancer target genes overlapped between all subtypes.
CONCLUSIONS Our results suggest that the three lung cancer subtypes do not share much genetic signal at the SNP, gene, pathway, or regulatory level, which differs from the common subtype classification based upon histology. However, three (CHRNA5, IDH3A, and PSMA4) of the five genes shared between the subtypes are well-known lung cancer genes that may act as general lung cancer genes regardless of subtype.
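The SNP selection and cross-subtype overlap described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the in-memory data layout, the function names `select_snps` and `shared_by_all`, and any SNP IDs passed in are hypothetical assumptions; only the two cutoffs (p < 1 × 10⁻³ and r² > 0.8) come from the abstract.

```python
from typing import Dict, Set, Tuple

P_THRESHOLD = 1e-3   # moderate-association cutoff quoted in the abstract
R2_THRESHOLD = 0.8   # LD cutoff quoted in the abstract


def select_snps(pvalues: Dict[str, float],
                ld_r2: Dict[Tuple[str, str], float]) -> Set[str]:
    """Return genotyped SNPs with p < 1e-3 plus SNPs in LD (r2 > 0.8) with them.

    `pvalues` maps a SNP ID to its GWAS p-value; `ld_r2` maps an unordered
    SNP pair to its r2. Both layouts are assumptions made for illustration.
    """
    seeds = {snp for snp, p in pvalues.items() if p < P_THRESHOLD}
    selected = set(seeds)
    for (a, b), r2 in ld_r2.items():
        if r2 > R2_THRESHOLD:
            # Add the partner of any seed SNP; pairs without a seed are ignored.
            if a in seeds:
                selected.add(b)
            if b in seeds:
                selected.add(a)
    return selected


def shared_by_all(gene_sets: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes present in every subtype's gene set."""
    sets = list(gene_sets.values())
    return set.intersection(*sets) if sets else set()
```

Calling `shared_by_all` on per-subtype gene sets for LUAD, LUSC, and SCLC would yield the kind of overlap list the abstract reports; the expansion here is a single LD step from the genotyped seed SNPs, which is one reading of the abstract rather than a confirmed detail.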
Affiliation(s)
- Timothy D O'Brien
- Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA
| | - Peilin Jia
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA
| | - Neil E Caporaso
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
| | - Maria Teresa Landi
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
| | - Zhongming Zhao
- Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA; Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
|
37
|
Zhao J, Cheng F, Jia P, Cox N, Denny JC, Zhao Z. An integrative functional genomics framework for effective identification of novel regulatory variants in genome-phenome studies. Genome Med 2018; 10:7. [PMID: 29378629 PMCID: PMC5789733 DOI: 10.1186/s13073-018-0513-x] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 06/19/2017] [Accepted: 01/04/2018] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Genome-phenome studies have identified thousands of variants that are statistically associated with disease or traits; however, their functional roles are largely unclear. A comprehensive investigation of regulatory mechanisms and the gene regulatory networks between phenome-wide association study (PheWAS) and genome-wide association study (GWAS) is needed to identify novel regulatory variants contributing to risk for human diseases. METHODS In this study, we developed an integrative functional genomics framework that maps 215,107 significant single nucleotide polymorphism (SNP) traits generated from the PheWAS Catalog and 28,870 genome-wide significant SNP traits collected from the GWAS Catalog into a global human genome regulatory map via incorporating various functional annotation data, including transcription factor (TF)-based motifs, promoters, enhancers, and expression quantitative trait loci (eQTLs) generated from four major functional genomics databases: FANTOM5, ENCODE, NIH Roadmap, and Genotype-Tissue Expression (GTEx). In addition, we performed a tissue-specific regulatory circuit analysis through the integration of the identified regulatory variants and tissue-specific gene expression profiles in 7051 samples across 32 tissues from GTEx. RESULTS We found that the disease-associated loci in both the PheWAS and GWAS Catalogs were significantly enriched with functional SNPs. The integration of functional annotations significantly improved the power of detecting novel associations in PheWAS, through which we found a number of functional associations with strong regulatory evidence in the PheWAS Catalog. Finally, we constructed tissue-specific regulatory circuits for several complex traits: mental diseases, autoimmune diseases, and cancer, via exploring tissue-specific TF-promoter/enhancer-target gene interaction networks. We uncovered several promising tissue-specific regulatory TFs or genes for Alzheimer's disease (e.g. ZIC1 and STX1B) and asthma (e.g. CSF3 and IL1RL1). CONCLUSIONS This study offers powerful tools for exploring the functional consequences of variants generated from genome-phenome association studies in terms of their mechanisms on affecting multiple complex diseases and traits.
Affiliation(s)
- Junfei Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA
| | - Feixiong Cheng
- Center for Cancer Systems Biology and Department of Cancer Biology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, 02215, USA
- Center for Complex Networks Research, Northeastern University, Boston, MA, 02215, USA
| | - Peilin Jia
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA
| | - Nancy Cox
- Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN, 37232, USA
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
| | - Joshua C Denny
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
| | - Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA.
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.
|
38
|
Gumpinger AC, Roqueiro D, Grimm DG, Borgwardt KM. Methods and Tools in Genome-wide Association Studies. Methods Mol Biol 2018; 1819:93-136. [PMID: 30421401 DOI: 10.1007/978-1-4939-8618-7_5] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022]
Abstract
Many traits, such as height, the response to a given drug, or the susceptibility to certain diseases are presumably co-determined by genetics. Especially in the field of medicine, it is of major interest to identify genetic aberrations that alter an individual's risk to develop a certain phenotypic trait. Addressing this question requires the availability of comprehensive, high-quality genetic datasets. The technological advancements and the decreasing cost of genotyping in the last decade led to an increase in such datasets. Parallel to and in line with this technological progress, an analysis framework under the name of genome-wide association studies was developed to properly collect and analyze these data. Genome-wide association studies aim at finding statistical dependencies, or associations, between a trait of interest and point mutations in the DNA. The statistical models used to detect such associations are diverse, spanning the whole range from the frequentist to the Bayesian setting. Since genetic datasets are inherently high-dimensional, the search for associations poses not only a statistical but also a computational challenge. As a result, a variety of toolboxes and software packages have been developed, each implementing different statistical methods while using various optimizations and mathematical techniques to enhance the computations. This chapter is devoted to the discussion of widely used methods and tools in genome-wide association studies. We present the different statistical models and the assumptions on which they are based, explain peculiarities of the data that have to be accounted for and, most importantly, introduce commonly used tools and software packages for the different tasks in a genome-wide association study, complemented with examples for their application.
Affiliation(s)
- Anja C Gumpinger
- Machine Learning and Computational Biology Lab, D-BSSE, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Damian Roqueiro
- Machine Learning and Computational Biology Lab, D-BSSE, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Dominik G Grimm
- Machine Learning and Computational Biology Lab, D-BSSE, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Karsten M Borgwardt
- Machine Learning and Computational Biology Lab, D-BSSE, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
39
|
Nikolayeva I, Guitart Pla O, Schwikowski B. Network module identification-A widespread theoretical bias and best practices. Methods 2017; 132:19-25. [PMID: 28941788 DOI: 10.1016/j.ymeth.2017.08.008] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/23/2017] [Revised: 08/14/2017] [Accepted: 08/18/2017] [Indexed: 10/18/2022] Open
Abstract
Biological processes often manifest themselves as coordinated changes across modules, i.e., sets of interacting genes. Commonly, the high dimensionality of genome-scale data prevents the visual identification of such modules, and straightforward computational search through a set of known pathways is a limited approach. Therefore, tools for the data-driven, computational, identification of modules in gene interaction networks have become popular components of visualization and visual analytics workflows. However, many such tools are known to result in modules that are large, and therefore hard to interpret biologically. Here, we show that the empirically known tendency towards large modules can be attributed to a statistical bias present in many module identification tools, and discuss possible remedies from a mathematical perspective. In the current absence of a straightforward practical solution, we outline our view of best practices for the use of the existing tools.
Affiliation(s)
- Iryna Nikolayeva
- Systems Biology Lab, Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, Paris, France; Functional Genetics of Infectious Diseases Unit, Department Genomes and Genetics, Institut Pasteur, Paris, France; Université Paris-Descartes, Sorbonne Paris Cité, Paris, France
| | - Oriol Guitart Pla
- Systems Biology Lab, Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, Paris, France
| | - Benno Schwikowski
- Systems Biology Lab, Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, Paris, France.
|
40
|
Bogenpohl JW, Mignogna KM, Smith ML, Miles MF. Integrative Analysis of Genetic, Genomic, and Phenotypic Data for Ethanol Behaviors: A Network-Based Pipeline for Identifying Mechanisms and Potential Drug Targets. Methods Mol Biol 2017; 1488:531-549. [PMID: 27933543 PMCID: PMC5152688 DOI: 10.1007/978-1-4939-6427-7_26] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/06/2023]
Abstract
Complex behavioral traits, such as alcohol abuse, are caused by an interplay of genetic and environmental factors, producing deleterious functional adaptations in the central nervous system. The long-term behavioral consequences of such changes are of substantial cost to both the individual and society. Substantial progress has been made in the last two decades in understanding elements of brain mechanisms underlying responses to ethanol in animal models and risk factors for alcohol use disorder (AUD) in humans. However, treatments for AUD remain largely ineffective and few medications for this disease state have been licensed. Genome-wide genetic polymorphism analysis (GWAS) in humans, behavioral genetic studies in animal models and brain gene expression studies produced by microarrays or RNA-seq have the potential to produce nonbiased and novel insight into the underlying neurobiology of AUD. However, the complexity of such information, both statistical and informational, has slowed progress toward identifying new targets for intervention in AUD. This chapter describes one approach for integrating behavioral, genetic, and genomic information across animal model and human studies. The goal of this approach is to identify networks of genes functioning in the brain that are most relevant to the underlying mechanisms of a complex disease such as AUD. We illustrate an example of how genomic studies in animal models can be used to produce robust gene networks that have functional implications, and to integrate such animal model genomic data with human genetic studies such as GWAS for AUD. We describe several useful analysis tools for such studies: ComBAT, WGCNA, and EW_dmGWAS. The end result of this analysis is a ranking of gene networks and identification of their cognate hub genes, which might provide eventual targets for future therapeutic development. Furthermore, this combined approach may also improve our understanding of basic mechanisms underlying gene x environmental interactions affecting brain functioning in health and disease.
Affiliation(s)
- James W Bogenpohl
- Department of Pharmacology and Toxicology, VCU Alcohol Research Center, Virginia Commonwealth University, 980613, Richmond, VA, 23298, USA
| | - Kristin M Mignogna
- Department of Psychiatry, VCU Alcohol Research Center, Virginia Commonwealth University, Richmond, VA, 23298, USA
| | - Maren L Smith
- Department of Human and Molecular Genetics, VCU Alcohol Research Center, Virginia Commonwealth University, Richmond, VA, 23298, USA
| | - Michael F Miles
- Department of Pharmacology and Toxicology, VCU Alcohol Research Center, Virginia Commonwealth University, 980613, Richmond, VA, 23298, USA.
|
41
|
Luo P, Tian LP, Ruan J, Wu FX. Identifying disease genes from PPI networks weighted by gene expression under different conditions. 2016 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM) 2016:1259-1264. [DOI: 10.1109/bibm.2016.7822699] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/03/2025]
|
42
|
Shu L, Zhao Y, Kurt Z, Byars SG, Tukiainen T, Kettunen J, Orozco LD, Pellegrini M, Lusis AJ, Ripatti S, Zhang B, Inouye M, Mäkinen VP, Yang X. Mergeomics: multidimensional data integration to identify pathogenic perturbations to biological systems. BMC Genomics 2016; 17:874. [PMID: 27814671 PMCID: PMC5097440 DOI: 10.1186/s12864-016-3198-9] [Citation(s) in RCA: 75] [Impact Index Per Article: 8.3] [Received: 08/25/2016] [Accepted: 10/25/2016] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Complex diseases are characterized by multiple subtle perturbations to biological processes. New omics platforms can detect these perturbations, but translating the diverse molecular and statistical information into testable mechanistic hypotheses is challenging. Therefore, we set out to create a public tool that integrates these data across multiple datasets, platforms, study designs and species in order to detect the most promising targets for further mechanistic studies. RESULTS We developed Mergeomics, a computational pipeline consisting of independent modules that 1) leverage multi-omics association data to identify biological processes that are perturbed in disease, and 2) overlay the disease-associated processes onto molecular interaction networks to pinpoint hubs as potential key regulators. Unlike existing tools that are mostly dedicated to specific data type or settings, the Mergeomics pipeline accepts and integrates datasets across platforms, data types and species. We optimized and evaluated the performance of Mergeomics using simulation and multiple independent datasets, and benchmarked the results against alternative methods. We also demonstrate the versatility of Mergeomics in two case studies that include genome-wide, epigenome-wide and transcriptome-wide datasets from human and mouse studies of total cholesterol and fasting glucose. In both cases, the Mergeomics pipeline provided statistical and contextual evidence to prioritize further investigations in the wet lab. The software implementation of Mergeomics is freely available as a Bioconductor R package. CONCLUSION Mergeomics is a flexible and robust computational pipeline for multidimensional data integration. It outperforms existing tools, and is easily applicable to datasets from different studies, species and omics data types for the study of complex traits.
Affiliation(s)
- Le Shu
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Yuqi Zhao
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Zeyneb Kurt
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Sean Geoffrey Byars
- Center for Systems Genomics, University of Melbourne, Melbourne, Australia; School of BioSciences, University of Melbourne, Melbourne, Australia
| | | | | | - Luz D Orozco
- Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Matteo Pellegrini
- Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Aldons J Lusis
- Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | | | - Bin Zhang
- Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Michael Inouye
- Center for Systems Genomics, University of Melbourne, Melbourne, Australia; School of BioSciences, University of Melbourne, Melbourne, Australia; Department of Pathology, University of Melbourne, Melbourne, Australia
| | - Ville-Petteri Mäkinen
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, USA; South Australian Health and Medical Research Institute, Adelaide, Australia; School of Biological Sciences, University of Adelaide, Adelaide, Australia; Computational Medicine, Faculty of Medicine, University of Oulu and Biocenter Oulu, Oulu, Finland
| | - Xia Yang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, USA; Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, Los Angeles, CA, USA
|
43
|
Hillenmeyer S, Davis LK, Gamazon ER, Cook EH, Cox NJ, Altman RB. STAMS: STRING-assisted module search for genome wide association studies and application to autism. Bioinformatics 2016; 32:3815-3822. [PMID: 27542772 PMCID: PMC5167061 DOI: 10.1093/bioinformatics/btw530] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 04/11/2016] [Revised: 06/29/2016] [Accepted: 08/09/2016] [Indexed: 01/17/2023] Open
Abstract
Motivation: Analyzing genome wide association data in the context of biological pathways helps us understand how genetic variation influences phenotype and increases power to find associations. However, the utility of pathway-based analysis tools is hampered by undercuration and reliance on a distribution of signal across all of the genes in a pathway. Methods that combine genome wide association results with genetic networks to infer the key phenotype-modulating subnetworks combat these issues, but have primarily been limited to network definitions with yes/no labels for gene-gene interactions. A recent method (EW_dmGWAS) incorporates a biological network with weighted edge probability by requiring a secondary phenotype-specific expression dataset. In this article, we combine an algorithm for weighted-edge module searching and a probabilistic interaction network in order to develop a method, STAMS, for recovering modules of genes with strong associations to the phenotype and probable biologic coherence. Our method builds on EW_dmGWAS but does not require a secondary expression dataset and performs better in six test cases. Results: We show that our algorithm improves over EW_dmGWAS and standard gene-based analysis by measuring precision and recall of each method on separately identified associations. In the Wellcome Trust Rheumatoid Arthritis study, STAMS-identified modules were more enriched for separately identified associations than EW_dmGWAS (STAMS P-value = 3.0 × 10⁻⁴; EW_dmGWAS P-value = 0.8). We demonstrate that the area under the Precision-Recall curve is 5.9 times higher with STAMS than EW_dmGWAS run on the Wellcome Trust Type 1 Diabetes data. Availability and Implementation: STAMS is implemented as an R package and is freely available at https://simtk.org/projects/stams. Contact: rbaltman@stanford.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sara Hillenmeyer
- Biomedical Informatics Training Program, Stanford University, Stanford, CA, USA
| | - Lea K Davis
- Vanderbilt Genetics Institute; Division of Genetic Medicine, Department of Medicine, Vanderbilt University, Nashville, TN, USA
| | - Eric R Gamazon
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University, Nashville, TN, USA; Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
| | - Edwin H Cook
- Institute for Juvenile Research, Department of Psychiatry, University of Illinois at Chicago, Chicago, IL, USA
| | - Nancy J Cox
- Vanderbilt Genetics Institute; Division of Genetic Medicine, Department of Medicine, Vanderbilt University, Nashville, TN, USA
| | - Russ B Altman
- Departments of Bioengineering and Genetics, Stanford University, Stanford, CA, USA
|
44
|
Becker K, Siegert S, Toliat MR, Du J, Casper R, Dolmans GH, Werker PM, Tinschert S, Franke A, Gieger C, Strauch K, Nothnagel M, Nürnberg P, Hennies HC, German Dupuytren Study Group. Meta-Analysis of Genome-Wide Association Studies and Network Analysis-Based Integration with Gene Expression Data Identify New Suggestive Loci and Unravel a Wnt-Centric Network Associated with Dupuytren's Disease. PLoS One 2016; 11:e0158101. [PMID: 27467239 PMCID: PMC4965170 DOI: 10.1371/journal.pone.0158101] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 01/05/2016] [Accepted: 06/12/2016] [Indexed: 11/18/2022] Open
Abstract
Dupuytren's disease, a fibromatosis of the connective tissue in the palm, is a common complex disease with a strong genetic component. To date, nine genetic loci have been found to be associated with the disease. Six of these loci contain genes that code for Wnt signalling proteins. In spite of this striking first insight into the genetic factors in Dupuytren's disease, much of the inherited risk in Dupuytren's disease still needs to be discovered. The already identified loci jointly explain ~1% of the heritability in this disease. To further elucidate the genetic basis of Dupuytren's disease, we performed a genome-wide meta-analysis combining three genome-wide association study (GWAS) data sets, comprising 1,580 cases and 4,480 controls. We corroborated all nine previously identified loci, six of these with genome-wide significance (p-value < 5 × 10⁻⁸). In addition, we identified 14 new suggestive loci (p-value < 10⁻⁵). Intriguingly, several of these new loci contain genes associated with Wnt signalling and therefore represent excellent candidates for replication. Next, we compared whole-transcriptome data between patient- and control-derived tissue samples and found the Wnt/β-catenin pathway to be the top deregulated pathway in patient samples. We then conducted network and pathway analyses in order to identify protein networks that are enriched for genes highlighted in the GWAS meta-analysis and expression data sets. We found further evidence that the Wnt signalling pathways in conjunction with other pathways may play a critical role in Dupuytren's disease.
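The two significance tiers quoted in this abstract (genome-wide significance below 5 × 10⁻⁸ and suggestive association below 10⁻⁵) can be written as a short classification rule. The sketch below is illustrative only: the function name `classify_locus` and the tier labels are assumptions, and it does not reproduce the meta-analysis itself.

```python
GENOME_WIDE = 5e-8   # genome-wide significance threshold quoted in the abstract
SUGGESTIVE = 1e-5    # suggestive threshold quoted in the abstract


def classify_locus(p_value: float) -> str:
    """Assign a locus to a significance tier from its lead p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    # Strict inequalities, matching the "p-value <" wording in the abstract.
    if p_value < GENOME_WIDE:
        return "genome-wide significant"
    if p_value < SUGGESTIVE:
        return "suggestive"
    return "not significant"
```

Because the comparisons are strict, a locus with a p-value of exactly 5 × 10⁻⁸ falls into the suggestive tier.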
Affiliation(s)
- Kerstin Becker
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
  - Cluster of Excellence on Cellular Stress Responses in Aging-associated Diseases, University of Cologne, Cologne, Germany
- Sabine Siegert
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
- Juanjiangmeng Du
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
  - Cluster of Excellence on Cellular Stress Responses in Aging-associated Diseases, University of Cologne, Cologne, Germany
- Ramona Casper
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
- Guido H. Dolmans
  - University of Groningen and University Medical Center Groningen, Dept. of Plastic Surgery, Groningen, the Netherlands
- Paul M. Werker
  - University of Groningen and University Medical Center Groningen, Dept. of Plastic Surgery, Groningen, the Netherlands
- Sigrid Tinschert
  - Div. of Human Genetics and Dept. of Dermatology, Medical University of Innsbruck, Innsbruck, Austria
  - Inst. of Clinical Genetics, Dresden University of Technology, Dresden, Germany
- Andre Franke
  - Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, University Hospital Schleswig-Holstein, Kiel, Germany
- Christian Gieger
  - Research Unit Molecular Epidemiology, Helmholtz Zentrum München, Neuherberg, Germany
  - Institute of Epidemiologie II, Helmholtz Zentrum München, Neuherberg, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
- Konstantin Strauch
  - Institute of Genetic Epidemiology, Helmholtz Zentrum München, Neuherberg, Germany
- Michael Nothnagel
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
- Peter Nürnberg
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
  - Cluster of Excellence on Cellular Stress Responses in Aging-associated Diseases, University of Cologne, Cologne, Germany
- Hans Christian Hennies
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
  - Cluster of Excellence on Cellular Stress Responses in Aging-associated Diseases, University of Cologne, Cologne, Germany
  - Div. of Human Genetics and Dept. of Dermatology, Medical University of Innsbruck, Innsbruck, Austria
  - Dept. of Biological Sciences, University of Huddersfield, Huddersfield, United Kingdom
|
45
|
Xiao X, Hao J, Wen Y, Wang W, Guo X, Zhang F. Genome-wide association studies and gene expression profiles of rheumatoid arthritis: An analysis. Bone Joint Res 2016; 5:314-9. [PMID: 27445359 PMCID: PMC5005471 DOI: 10.1302/2046-3758.57.2000502] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 07/16/2015] [Accepted: 06/07/2016] [Indexed: 11/05/2022]
Abstract
OBJECTIVES The molecular mechanism of rheumatoid arthritis (RA) remains elusive. We conducted a protein-protein interaction network-based integrative analysis of genome-wide association studies (GWAS) and gene expression profiles of RA. METHODS We first performed a dense search of RA-associated gene modules by integrating a large GWAS meta-analysis dataset (containing 5539 RA patients and 20 169 healthy controls), protein interaction network and gene expression profiles of RA synovium and peripheral blood mononuclear cells (PBMCs). Gene ontology (GO) enrichment analysis was conducted by DAVID. The protein association networks of gene modules were generated by STRING. RESULTS For RA synovium, the top-ranked gene module is HLA-A, containing TAP2, HLA-A, HLA-C, TAPBP and LILRB1 genes. For RA PBMCs, the top-ranked gene module is GRB7, consisting of HLA-DRB5, HLA-DRA, GRB7, CD63 and KIT genes. Functional enrichment analysis identified three significant GO terms for RA synovium, including antigen processing and presentation of peptide antigen via major histocompatibility complex class I (false discovery rate (FDR) = 4.86 × 10⁻⁴), antigen processing and presentation of peptide antigen (FDR = 2.33 × 10⁻³) and eukaryotic translation initiation factor 4F complex (FDR = 2.52 × 10⁻²). CONCLUSION This study reported several RA-associated gene modules and their functional association networks. Cite this article: X. Xiao, J. Hao, Y. Wen, W. Wang, X. Guo, F. Zhang. Genome-wide association studies and gene expression profiles of rheumatoid arthritis: an analysis. Bone Joint Res 2016;5:314-319. DOI: 10.1302/2046-3758.57.2000502.
Affiliation(s)
- X Xiao
  - Key Laboratory of Trace Elements and Endemic Diseases of National Health and Family Planning Commission, School of Public Health, Health Science Center, Xi'an Jiaotong University, Yanta West Road 76, Xi'an, Shaanxi, China
- J Hao
  - Key Laboratory of Trace Elements and Endemic Diseases of National Health and Family Planning Commission, School of Public Health, Health Science Center, Xi'an Jiaotong University, Yanta West Road 76, Xi'an, Shaanxi, China
- Y Wen
  - Key Laboratory of Trace Elements and Endemic Diseases of National Health and Family Planning Commission, School of Public Health, Health Science Center, Xi'an Jiaotong University, Yanta West Road 76, Xi'an, Shaanxi, China
- W Wang
  - Key Laboratory of Trace Elements and Endemic Diseases of National Health and Family Planning Commission, School of Public Health, Health Science Center, Xi'an Jiaotong University, Yanta West Road 76, Xi'an, Shaanxi, China
- X Guo
  - Key Laboratory of Trace Elements and Endemic Diseases of National Health and Family Planning Commission, School of Public Health, Health Science Center, Xi'an Jiaotong University, Yanta West Road 76, Xi'an, Shaanxi, China
- F Zhang
  - Key Laboratory of Trace Elements and Endemic Diseases of National Health and Family Planning Commission, School of Public Health, Health Science Center, Xi'an Jiaotong University, Yanta West Road 76, Xi'an, Shaanxi, China
|
46
|
Jiang W, Mitra R, Lin CC, Wang Q, Cheng F, Zhao Z. Systematic dissection of dysregulated transcription factor-miRNA feed-forward loops across tumor types. Brief Bioinform 2015; 17:996-1008. [PMID: 26655252 PMCID: PMC5142013 DOI: 10.1093/bib/bbv107] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 07/23/2015] [Revised: 10/23/2015] [Indexed: 02/07/2023]
Abstract
Transcription factors and microRNAs (miRNAs) can mutually regulate each other and jointly regulate their shared target genes to form feed-forward loops (FFLs). While there are many studies of dysregulated FFLs in a specific cancer, a systematic investigation of dysregulated FFLs across multiple tumor types (pan-cancer FFLs) has not yet been performed. In this study, using The Cancer Genome Atlas data, we identified 26 pan-cancer FFLs, which were dysregulated in at least five tumor types. These pan-cancer FFLs could communicate with each other and form functionally consistent subnetworks, such as an epithelial-to-mesenchymal transition-related subnetwork. Many proteins and miRNAs in each subnetwork belong to the same protein and miRNA family, respectively. Importantly, cancer-associated genes and drug targets were enriched in these pan-cancer FFLs, in which the genes and miRNAs also tended to be hubs and bottlenecks. Finally, we identified potential anticancer indications for existing drugs with novel mechanisms of action. Collectively, this study highlights the potential of pan-cancer FFLs as a novel paradigm in elucidating the pathogenesis of cancer and developing anticancer drugs.
Affiliation(s)
- Wei Jiang
- *These authors contributed equally to this work
|
47
|
Chimusa ER, Mbiyavanga M, Mazandu GK, Mulder NJ. ancGWAS: a post genome-wide association study method for interaction, pathway and ancestry analysis in homogeneous and admixed populations. Bioinformatics 2015; 32:549-56. [PMID: 26508762 DOI: 10.1093/bioinformatics/btv619] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 12/31/2014] [Accepted: 10/16/2015] [Indexed: 12/17/2022]
Abstract
MOTIVATION Despite numerous successful Genome-wide Association Studies (GWAS), detecting variants that have low disease risk still poses a challenge. GWAS may miss disease genes with weak genetic effects or strong epistatic effects due to the single-marker testing approach commonly used. GWAS may thus generate false negative or inconclusive results, suggesting the need for novel methods to combine effects of single nucleotide polymorphisms within a gene to increase the likelihood of fully characterizing the susceptibility gene. RESULTS We developed ancGWAS, an algebraic graph-based centrality measure that accounts for linkage disequilibrium in identifying significant disease sub-networks by integrating the association signal from GWAS data sets into the human protein-protein interaction (PPI) network. We validated ancGWAS using an association study result from a breast cancer data set and the simulation of interactive disease loci in the simulation of a complex admixed population, as well as pathway-based GWAS simulation. This new approach holds promise for deconvoluting the interactions between genes underlying the pathogenesis of complex diseases. Results obtained yield a novel central breast cancer sub-network of the human interactome implicated in the proteoglycan syndecan-mediated signaling events pathway which is known to play a major role in mesenchymal tumor cell proliferation, thus providing further insights into breast cancer pathogenesis. AVAILABILITY AND IMPLEMENTATION The ancGWAS package and documents are available at http://www.cbio.uct.ac.za/~emile/software.html.
Affiliation(s)
- Emile R Chimusa
  - Computational Biology Group, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Medical School, 7925, Observatory, South Africa
- Mamana Mbiyavanga
  - Computational Biology Group, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Medical School, 7925, Observatory, South Africa
  - African Institute for Mathematical Sciences, 7945 Muizenberg, Cape Town, South Africa
- Gaston K Mazandu
  - Computational Biology Group, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Medical School, 7925, Observatory, South Africa
  - African Institute for Mathematical Sciences, 7945 Muizenberg, Cape Town, South Africa
- Nicola J Mulder
  - Computational Biology Group, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Medical School, 7925, Observatory, South Africa
|