1
Fischer EK, Song Y, Zhou W, Hoke KL. Flexibility in gene coexpression at developmental and evolutionary timescales. bioRxiv 2024:2024.12.10.627761. [PMID: 39713302 PMCID: PMC11661222 DOI: 10.1101/2024.12.10.627761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2024]
Abstract
The explosion of next-generation sequencing technologies has allowed researchers to move from studying single genes, to thousands of genes, and thereby to also consider the relationships within gene networks. Like others, we are interested in understanding how developmental and evolutionary forces shape the expression of individual genes, as well as the interactions among genes. To this end, we characterized the effects of genetic background and developmental environment on brain gene coexpression in two parallel, independent evolutionary lineages of Trinidadian guppies (Poecilia reticulata). We asked whether connectivity patterns among genes differed based on genetic background and rearing environment, and whether a gene's connectivity predicted its propensity for expression divergence. In pursuing these questions, we confronted the central challenge that standard approaches fail to control the Type I error and/or have low power in the presence of high dimensionality (i.e., large number of genes) and small sample size, as in many gene expression studies. Using our data as a case study, we detail central challenges, discuss sample size guidelines, and provide rigorous statistical approaches for exploring coexpression differences with small sample sizes. Using these approaches, we find evidence that coexpression relationships differ based on both genetic background and rearing environment. We report greater expression divergence in less connected genes and suggest this pattern may arise and be reinforced by selection.
Affiliation(s)
- Eva K Fischer
  - Department of Neurobiology, Physiology and Behavior, University of California Davis, Davis, CA 95616, USA
- Youngseok Song
  - Department of Statistics, West Virginia University, Morgantown, WV 26506, USA
- Wen Zhou
  - Department of Biostatistics, School of Global Public Health, New York University, New York, NY 10003, USA
- Kim L Hoke
  - Department of Biology, Colorado State University, Fort Collins, CO 80523, USA

2
Olascoaga S, Castañeda-Sánchez JI, Königsberg M, Gutierrez H, López-Diazguerrero NE. Oxidative stress-induced gene expression changes in prostate epithelial cells in vitro reveal a robust signature of normal prostatic senescence and aging. Biogerontology 2024; 25:1145-1169. [PMID: 39162979 PMCID: PMC11486819 DOI: 10.1007/s10522-024-10126-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Accepted: 08/02/2024] [Indexed: 08/21/2024]
Abstract
Oxidative stress has long been postulated to play an essential role in aging mechanisms, and numerous forms of molecular damage associated with oxidative stress have been well documented. However, the extent to which changes in gene expression in direct response to oxidative stress are related to actual cellular aging, senescence, and age-related functional decline remains unclear. Here, we ask whether H2O2-induced oxidative stress and resulting gene expression alterations in prostate epithelial cells in vitro reveal gene regulatory changes typically observed in naturally aging prostate tissue and age-related prostate disease. While a broad range of significant changes is observed in the expression of non-coding transcripts implicated in senescence-related responses, we also note an overrepresentation of gene-splicing events among differentially expressed protein-coding genes (DEGs) induced by H2O2. Additionally, the collective expression of these H2O2-induced DEGs is linked to age-related pathological dysfunction, with their protein products exhibiting a dense network of protein-protein interactions. In contrast, co-expression analysis of available gene expression data reveals a naturally occurring highly coordinated expression of H2O2-induced DEGs in normally aging prostate tissue. Furthermore, we find that oxidative stress-induced DEGs statistically overrepresent well-known senescence-related signatures. Our results show that oxidative stress-induced gene expression in prostate epithelial cells in vitro reveals gene regulatory changes typically observed in naturally aging prostate tissue and age-related prostate disease.
Affiliation(s)
- Samael Olascoaga
  - Posgrado en Biología Experimental, DCBS, Universidad Autónoma Metropolitana Iztapalapa, Mexico City, Mexico
  - Laboratorio de Bioenergética y Envejecimiento Celular, Departamento de Ciencias de la Salud, Universidad Autónoma Metropolitana (UAM), Mexico City, Mexico
- Jorge I Castañeda-Sánchez
  - División de Ciencias Biológicas y de la Salud, Departamento de Sistemas Biológicos, Universidad Autónoma Metropolitana-Xochimilco (UAM-X), Mexico City, Mexico
- Mina Königsberg
  - Laboratorio de Bioenergética y Envejecimiento Celular, Departamento de Ciencias de la Salud, Universidad Autónoma Metropolitana (UAM), Mexico City, Mexico
- Norma Edith López-Diazguerrero
  - Laboratorio de Bioenergética y Envejecimiento Celular, Departamento de Ciencias de la Salud, Universidad Autónoma Metropolitana (UAM), Mexico City, Mexico

3
Zheng X, Lim PK, Mutwil M, Wang Y. A method for mining condition-specific co-expressed genes in Camellia sinensis based on k-means clustering. BMC Plant Biol 2024; 24:373. [PMID: 38714965 PMCID: PMC11077725 DOI: 10.1186/s12870-024-05086-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Accepted: 04/30/2024] [Indexed: 05/12/2024]
Abstract
BACKGROUND As one of the world's most important beverage crops, tea plants (Camellia sinensis) are renowned for their unique flavors and numerous beneficial secondary metabolites, attracting researchers to investigate the formation of tea quality. With the increasing availability of transcriptome data on tea plants in public databases, conducting large-scale co-expression analyses has become feasible to meet the demand for functional characterization of tea plant genes. However, as the multidimensional noise increases, larger-scale co-expression analyses are not always effective. Analyzing a subset of samples generated by effectively downsampling and reorganizing the global sample set often leads to more accurate results in co-expression analysis. Meanwhile, global-based co-expression analyses are more likely to overlook condition-specific gene interactions, which may be more important and worthy of exploration and research. RESULTS Here, we employed the k-means clustering method to organize and classify the global samples of tea plants, resulting in clustered samples. Metadata annotations were then performed on these clustered samples to determine the "conditions" represented by each cluster. Subsequently, we conducted gene co-expression network analysis (WGCNA) separately on the global samples and the clustered samples, resulting in global modules and cluster-specific modules. Comparative analyses of global modules and cluster-specific modules have demonstrated that cluster-specific modules exhibit higher accuracy in co-expression analysis. To measure the degree of condition specificity of genes within condition-specific clusters, we introduced the correlation difference value (CDV). By incorporating the CDV into co-expression analyses, we can assess the condition specificity of genes. 
This approach proved instrumental in identifying a series of high CDV transcription factor encoding genes upregulated during sustained cold treatment in Camellia sinensis leaves and buds, and pinpointing a pair of genes that participate in the antioxidant defense system of tea plants under sustained cold stress. CONCLUSIONS To summarize, downsampling and reorganizing the sample set improved the accuracy of co-expression analysis. Cluster-specific modules were more accurate in capturing condition-specific gene interactions. The introduction of CDV allowed for the assessment of condition specificity in gene co-expression analyses. Using this approach, we identified a series of high CDV transcription factor encoding genes related to sustained cold stress in Camellia sinensis. This study highlights the importance of considering condition specificity in co-expression analysis and provides insights into the regulation of the cold stress in Camellia sinensis.
Affiliation(s)
- Xinghai Zheng
  - Tea Research Institute, Zhejiang University, Hangzhou, 310058, Zhejiang, China
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore
- Peng Ken Lim
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore
- Marek Mutwil
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore
- Yuefei Wang
  - Tea Research Institute, Zhejiang University, Hangzhou, 310058, Zhejiang, China

4
Marino A, Sinaimeri B, Tronci E, Calamoneri T. STARGATE-X: a Python package for statistical analysis on the REACTOME network. J Integr Bioinform 2023; 20:jib-2022-0029. [PMID: 37732505 PMCID: PMC10757075 DOI: 10.1515/jib-2022-0029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Accepted: 01/24/2023] [Indexed: 09/22/2023]
Abstract
Many important aspects of biological knowledge at the molecular level can be represented by pathways. Through their analysis, we gain mechanistic insights and interpret lists of interesting genes from experiments (usually omics and functional genomic experiments). As a result, pathways play a central role in the development of bioinformatics methods and tools for computing predictions from known molecular-level mechanisms. Qualitative as well as quantitative knowledge about pathways can be effectively represented through biochemical networks linking the biochemical reactions and the compounds (e.g., proteins) occurring in the considered pathways. So, repositories providing biochemical networks for known pathways play a central role in bioinformatics and in systems biology. Here we focus on Reactome, a free, comprehensive, and widely used repository for biochemical networks and pathways. In this paper, we: (1) introduce a tool StARGate-X (STatistical Analysis of the Reactome multi-GrAph Through nEtworkX) to carry out an automated analysis of the connectivity properties of Reactome biochemical reaction network and of its biological hierarchy (i.e., cell compartments, namely, the closed parts within the cytosol, usually surrounded by a membrane); the code is freely available at https://github.com/marinoandrea/stargate-x; (2) show the effectiveness of our tool by providing an analysis of the Reactome network, in terms of centrality measures, with respect to in- and out-degree. As an example of usage of StARGate-X, we provide a detailed automated analysis of the Reactome network, in terms of centrality measures. We focus both on the subgraphs induced by single compartments and on the graph whose nodes are the strongly connected components. To the best of our knowledge, this is the first freely available tool that enables automatic analysis of the large biochemical network within Reactome through easy-to-use APIs (Application Programming Interfaces).
Affiliation(s)
- Andrea Marino
  - Computer Science Department, Sapienza University of Rome, Rome, Italy
- Enrico Tronci
  - Computer Science Department, Sapienza University of Rome, Rome, Italy

5
Federico A, Kern J, Varelas X, Monti S. Structure Learning for Gene Regulatory Networks. PLoS Comput Biol 2023; 19:e1011118. [PMID: 37200395 DOI: 10.1371/journal.pcbi.1011118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Revised: 05/31/2023] [Accepted: 04/20/2023] [Indexed: 05/20/2023]
Abstract
Inference of biological network structures is often performed on high-dimensional data, yet is hindered by the limited sample size of high throughput "omics" data typically available. To overcome this challenge, often referred to as the "small n, large p problem," we exploit known organizing principles of biological networks that are sparse, modular, and likely share a large portion of their underlying architecture. We present SHINE (Structure Learning for Hierarchical Networks), a framework for defining data-driven structural constraints and incorporating a shared learning paradigm for efficiently learning multiple Markov networks from high-dimensional data at large p/n ratios not previously feasible. We evaluated SHINE on Pan-Cancer data comprising 23 tumor types, and found that learned tumor-specific networks exhibit expected graph properties of real biological networks, recapture previously validated interactions, and recapitulate findings in literature. Application of SHINE to the analysis of subtype-specific breast cancer networks identified key genes and biological processes for tumor maintenance and survival as well as potential therapeutic targets for modulating known breast cancer disease genes.
Affiliation(s)
- Anthony Federico
  - Section of Computational Biomedicine, Boston University School of Medicine, Boston, Massachusetts, United States of America
  - Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
- Joseph Kern
  - Department of Biochemistry, Boston University School of Medicine, Boston, Massachusetts, United States of America
- Xaralabos Varelas
  - Department of Biochemistry, Boston University School of Medicine, Boston, Massachusetts, United States of America
- Stefano Monti
  - Section of Computational Biomedicine, Boston University School of Medicine, Boston, Massachusetts, United States of America
  - Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America

6
Cai H, Des Marais DL. Revisiting regulatory coherence: accounting for temporal bias in plant gene co-expression analyses. New Phytol 2023; 238:16-24. [PMID: 36617750 DOI: 10.1111/nph.18720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Accepted: 12/16/2022] [Indexed: 06/17/2023]
Affiliation(s)
- Haoran Cai
  - Department of Civil and Environmental Engineering, MIT, 15 Vassar St., Cambridge, MA, 02139, USA
- David L Des Marais
  - Department of Civil and Environmental Engineering, MIT, 15 Vassar St., Cambridge, MA, 02139, USA

7
Uncovering Oncogenic Mechanisms of Tumor Suppressor Genes in Breast Cancer Multi-Omics Data. Int J Mol Sci 2022; 23:ijms23179624. [PMID: 36077026 PMCID: PMC9455665 DOI: 10.3390/ijms23179624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Revised: 08/16/2022] [Accepted: 08/19/2022] [Indexed: 11/17/2022]
Abstract
Tumor suppressor genes (TSGs) are essential genes in the development of cancer. While they have many roles in normal cells, mutation and dysregulation of the TSGs result in aberrant molecular processes in cancer cells. Therefore, understanding TSGs and their roles in the oncogenic process is crucial for prevention and treatment of cancer. In this research, multi-omics breast cancer data were used to identify molecular mechanisms of TSGs in breast cancer. Differentially expressed genes and differentially coexpressed genes were identified in four large-scale transcriptomics data from public repositories and multi-omics data analyses of copy number, methylation and gene expression were performed. The results of the analyses were integrated using enrichment analysis and meta-analysis of a p-value summation method. The integrative analysis revealed that TSGs have a significant relationship with genes of gene ontology terms that are related to cell cycle, genome stability, RNA processing and metastasis, indicating the regulatory mechanisms of TSGs on cancer cells. The analysis frame and research results will provide valuable information for the further identification of TSGs in different types of cancers.

8
Pu J, Yu H, Guo Y. A Novel Strategy to Identify Prognosis-Relevant Gene Sets in Cancers. Genes (Basel) 2022; 13:862. [PMID: 35627247 PMCID: PMC9141699 DOI: 10.3390/genes13050862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Revised: 05/06/2022] [Accepted: 05/09/2022] [Indexed: 11/16/2022]
Abstract
Molecular prognosis markers hold promise for improved prediction of patient survival, and a pathway or gene set may add mechanistic interpretation to their prognostic prediction power. In this study, we demonstrated a novel strategy to identify prognosis-relevant gene sets in cancers. Our study consists of a first round of gene-level analyses and a second round of gene-set-level analyses, in which the Composite Gene Expression Score summarizes a surrogate expression value at the gene set level and a permutation procedure is applied to assess the prognostic significance of gene sets. An optional differential coexpression module is appended to the two phases of survival analyses to corroborate and refine prognostic gene sets. Our strategy was demonstrated in 33 cancer types across 32,234 gene sets. We found that oncogenic gene sets accounted for an increased proportion among the final gene sets, and that genes involved in DNA replication and DNA repair have ubiquitous prognostic value for multiple cancer types. In summary, we carried out the largest gene set based prognosis study to date. Compared to previous similar studies, our approach offered multiple improvements in design and methodology implementation. Functionally relevant gene sets of ubiquitous prognostic significance in multiple cancer types were identified.
Affiliation(s)
- Junyi Pu
  - School of Life Sciences, Northwest University, Xi’an 710069, China
- Hui Yu
  - Comprehensive Cancer Center, New Mexico University, Albuquerque, NM 87131, USA
- Yan Guo
  - Comprehensive Cancer Center, New Mexico University, Albuquerque, NM 87131, USA

9
Sun P, Wu Y, Yin C, Jiang H, Xu Y, Sun H. Molecular Subtyping of Cancer Based on Distinguishing Co-Expression Modules and Machine Learning. Front Genet 2022; 13:866005. [PMID: 35586568 PMCID: PMC9108363 DOI: 10.3389/fgene.2022.866005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/30/2022] [Accepted: 03/07/2022] [Indexed: 02/05/2023]
Abstract
Molecular subtyping of cancer is recognized as a critical and challenging step towards individualized therapy. Most existing computational methods solve this problem via multi-classification of gene-expressions of cancer samples. Although these methods, especially deep learning, perform well in data classification, they usually require large amounts of data for model training and have limitations in interpretability. Besides, as cancer is a complex systemic disease, the phenotypic difference between cancer samples can hardly be fully understood by only analyzing single molecules, and differential expression-based molecular subtyping methods are reportedly not conserved. To address the above issues, we present here a new framework for molecular subtyping of cancer through identifying a robust specific co-expression module for each subtype of cancer, generating network features for each sample by perturbing correlation levels of specific edges, and then training a deep neural network for multi-class classification. When applied to breast cancer (BRCA) and stomach adenocarcinoma (STAD) molecular subtyping, it has superior classification performance over existing methods. In addition to improving classification performance, we consider the specific co-expressed modules selected for subtyping to be biologically meaningful, which potentially offers new insight for diagnostic biomarker design, mechanistic studies of cancer, and individualized treatment plan selection.
Affiliation(s)
- Peishuo Sun
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Ying Wu
  - Phase I Clinical Trials Center, The First Affiliated Hospital, China Medical University, Shenyang, China
- Chaoyi Yin
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Hongyang Jiang
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Ying Xu
  - Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, University of Georgia, Athens, GA, United States
  - *Correspondence: Huiyan Sun; Ying Xu
- Huiyan Sun
  - School of Artificial Intelligence, Jilin University, Changchun, China
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
  - *Correspondence: Huiyan Sun; Ying Xu

10
Variation in the co-expression profile highlights a loss of miRNA-mRNA regulation in multiple cancer types. Noncoding RNA Res 2022; 7:98-105. [PMID: 35387279 PMCID: PMC8958468 DOI: 10.1016/j.ncrna.2022.03.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/18/2022] [Revised: 03/15/2022] [Accepted: 03/16/2022] [Indexed: 01/01/2023]
Abstract
Recent research provides insight into the ability of miRNA to regulate various pathways in several cancer types. Despite their involvement in the regulation of the mRNA via targeting the 3′UTR, there are relatively few studies examining the changes in these regulatory mechanisms specific to single cancer types or shared between different cancer types. We analyzed samples where both miRNA and mRNA expression had been measured and performed a thorough correlation analysis on 7494 experimentally validated human miRNA-mRNA target-gene pairs in both healthy and tumoral samples. We show how more than 90% of these miRNA-mRNA interactions show a loss of regulation in the tumoral samples compared with their healthy counterparts. As expected, we found shared miRNA-mRNA dysregulated pairs among different tumors of the same tissue. However, anatomically different cancers also share multiple dysregulated interactions, suggesting that some cancer-related mechanisms are not tumor-specific. 2865 unique miRNA-mRNA pairs were identified across 13 cancer types, ≈ 40% of these pairs showed a loss of correlation in the tumoral samples in at least 2 out of the 13 analyzed cancers. Specifically, miR-200 family, miR-155 and miR-1 were identified, based on the computational analysis described below, as the miRNAs that potentially lose the highest number of interactions across different samples (only literature-based interactions were used for this analysis). Moreover, the miR-34a/ALDH2 and miR-9/MTHFD2 pairs show a switch in their correlation between healthy and tumor kidney samples suggesting a possible change in the regulation exerted by the miRNAs. Interestingly, the expression of these mRNAs is also associated with the overall survival. The disruption of miRNA regulation on its target, therefore, suggests the possible involvement of these pairs in cell malignant functions. 
The analysis reported here shows how the regulation of miRNA-mRNA interactions strongly differs between healthy and tumoral cells, based on the strong variation in correlation between each miRNA and its target that we obtained by analyzing the expression data of healthy and tumor tissue in highly reliable miRNA-target pairs. Finally, a GO term enrichment analysis shows that the critical pairs identified are involved in cellular adhesion, proliferation, and migration.
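The healthy-versus-tumor correlation comparison described in this abstract can be illustrated with a minimal sketch, assuming paired expression vectors for one miRNA and one target mRNA in each condition. The function names and the two cutoffs (`strong`, `weak`) are hypothetical and are not taken from the paper; the authors' actual procedure may differ.

```python
import numpy as np


def correlation_shift(mirna_healthy, mrna_healthy, mirna_tumor, mrna_tumor):
    """Pearson r of one miRNA-mRNA pair in each condition, plus the change."""
    r_healthy = float(np.corrcoef(mirna_healthy, mrna_healthy)[0, 1])
    r_tumor = float(np.corrcoef(mirna_tumor, mrna_tumor)[0, 1])
    return r_healthy, r_tumor, r_tumor - r_healthy


def lost_regulation(r_healthy, r_tumor, strong=-0.5, weak=-0.2):
    """Flag a pair that is anti-correlated in healthy tissue but not in tumor.

    The cutoffs are illustrative placeholders, not values from the study.
    """
    return r_healthy <= strong and r_tumor >= weak


# Toy data (made up): the mRNA falls as the miRNA rises in healthy samples only.
mirna = [1.0, 2.0, 3.0, 4.0]
r_h, r_t, delta = correlation_shift(
    mirna, [4.0, 3.0, 2.0, 1.0],    # healthy: perfectly anti-correlated
    mirna, [1.0, -1.0, -1.0, 1.0],  # tumor: no linear relationship
)
print(r_h, r_t, delta, lost_regulation(r_h, r_t))
```

In practice a significance test on each correlation and a multiple-testing correction across all pairs would be needed before calling a pair dysregulated; the sketch omits both.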

11
Luo Y, Liang H. Convergent Usage of Amino Acids in Human Cancers as A Reversed Process of Tissue Development. Genomics Proteomics Bioinformatics 2022; 20:147-162. [PMID: 34492340 PMCID: PMC9510935 DOI: 10.1016/j.gpb.2021.08.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/21/2020] [Revised: 07/13/2021] [Accepted: 08/26/2021] [Indexed: 01/01/2023]
Abstract
Genome- and transcriptome-wide amino acid usage preference across different species is a well-studied phenomenon in molecular evolution, but its characteristics and implication in cancer evolution and therapy remain largely unexplored. Here, we analyzed large-scale transcriptome/proteome profiles, such as The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression (GTEx), and the Clinical Proteomic Tumor Analysis Consortium (CPTAC), and found that compared to normal tissues, different cancer types showed a convergent pattern toward using biosynthetically low-cost amino acids. Such a pattern can be accurately captured by a single index based on the average biosynthetic energy cost of amino acids, termed energy cost per amino acid (ECPA). With this index, we further compared the trends of amino acid usage and the contributing genes in cancer and tissue development, and revealed their reversed patterns. Finally, focusing on the liver, a tissue with a dramatic increase in ECPA during development, we found that ECPA represents a powerful biomarker that could distinguish liver tumors from normal liver samples consistently across 11 independent patient cohorts and outperforms any index based on single genes. Our study reveals an important principle underlying cancer evolution and suggests the global amino acid usage as a system-level biomarker for cancer diagnosis.
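As a rough illustration of an index built on "the average biosynthetic energy cost of amino acids", the sketch below computes an expression-weighted mean cost per residue. The expression weighting, the function names, and the cost table are all assumptions made for this example: the cost values are placeholders rather than measured biosynthetic costs, and the paper's exact ECPA definition may differ.

```python
# Placeholder per-residue costs for three amino acids; NOT real biosynthetic costs.
PLACEHOLDER_COST = {"G": 1.0, "A": 2.0, "W": 10.0}


def mean_cost(sequence, cost):
    """Average cost per residue of a single protein sequence."""
    return sum(cost[residue] for residue in sequence) / len(sequence)


def ecpa(expression, sequences, cost):
    """Expression-weighted average cost per amino acid across genes.

    expression: gene -> expression level (assumed non-negative)
    sequences:  gene -> protein sequence (one-letter codes present in `cost`)
    """
    total_cost = 0.0
    total_residues = 0.0
    for gene, level in expression.items():
        seq = sequences[gene]
        total_cost += level * sum(cost[residue] for residue in seq)
        total_residues += level * len(seq)
    return total_cost / total_residues


# Toy sample: a cheap protein at low expression, a costly one at high expression.
sample = ecpa({"g1": 1.0, "g2": 3.0}, {"g1": "GG", "g2": "WW"}, PLACEHOLDER_COST)
print(sample)
```

Under this weighting, shifting expression toward proteins rich in cheap residues lowers the index, which is the direction of change the abstract reports for tumors relative to normal tissue.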
Affiliation(s)
- Yikai Luo
  - Graduate Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Han Liang
  - Graduate Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
  - Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA

12
Abstract
DNA microarrays are widely used to investigate gene expression. Even though the classical analysis of microarray data is based on the study of differentially expressed genes, it is well known that genes do not act individually. Network analysis can be applied to study association patterns of the genes in a biological system. Moreover, it finds wide application in differential coexpression analysis between different systems. Network-based coexpression studies have, for example, been used in (complex) disease gene prioritization, disease subtyping, and patient stratification. In this chapter we provide an overview of the methods and tools used to create networks from microarray data and describe multiple methods for analyzing a single network or a group of networks. The described methods range from topological metrics and functional group identification to data integration strategies, topological pathway analysis, and graphical models.
Affiliation(s)
- Alisa Pavel
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - BioMediTech Institute, Tampere University, Tampere, Finland
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
- Angela Serra
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - BioMediTech Institute, Tampere University, Tampere, Finland
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
- Luca Cattelani
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - BioMediTech Institute, Tampere University, Tampere, Finland
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
- Antonio Federico
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - BioMediTech Institute, Tampere University, Tampere, Finland
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
- Dario Greco
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - BioMediTech Institute, Tampere University, Tampere, Finland
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
  - Institute of Biotechnology, University of Helsinki, Helsinki, Finland

13
González-Espinoza A, Zamora-Fuentes J, Hernández-Lemus E, Espinal-Enríquez J. Gene Co-Expression in Breast Cancer: A Matter of Distance. Front Oncol 2021; 11:726493. [PMID: 34868919 PMCID: PMC8636045 DOI: 10.3389/fonc.2021.726493] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/17/2021] [Accepted: 10/26/2021] [Indexed: 01/16/2023]
Abstract
Gene regulatory and signaling phenomena are known to underlie the establishment of cellular phenotypes. It is also known that such regulatory programs are disrupted in cancer, leading to the onset and development of malignant phenotypes. Gene co-expression matrices have allowed us to compare and analyze complex phenotypes such as breast cancer (BrCa) and their control counterparts. Global co-expression patterns have revealed, for instance, that the strongest gene-gene co-expression interactions often occur between genes on the same chromosome (cis-), whereas inter-chromosome (trans-) interactions are scarce and have lower correlation values. Furthermore, the strength of cis- correlations has been shown to decay with the chromosomal distance between the genes of a couple. Although this loss of long-distance co-expression has been clearly identified, it has been observed only in a small fraction of the whole co-expression landscape, namely the most significant interactions. For that reason, an approach that takes the whole interaction set into account is appealing. In this work, we developed a hybrid method to analyze whole-chromosome Pearson correlation matrices for the four BrCa subtypes (Luminal A, Luminal B, HER2+ and Basal), as well as matrices derived from adjacent normal breast tissue. We implemented a systematic method for clustering gene couples using eigenvalue spectral decomposition and the k-medoids algorithm, allowing us to determine a number of clusters without removing any interaction. With this method we compared, for each chromosome in the five phenotypes: a) whether or not gene-gene co-expression decays with distance in the breast cancer subtypes, b) the chromosome location of cis- clusters of gene couples, and c) whether or not the loss of long-distance co-expression is observed in the whole range of interactions.
We found that in the correlation matrix for the control phenotype, positive and negative Pearson correlations deviate from a random null model independently of the distance between couples. Conversely, for all BrCa subtypes, in all chromosomes, positive correlations decay with distance, and negative correlations do not differ from the null model. We also found that BrCa clusters are distance-dependent, whereas for the control phenotype, chromosome location does not determine the clustering. To our knowledge, this is the first time that a dependence on distance is reported for gene clusters in breast cancer. Since this method uses the whole cis- interaction gene set, combining it with other -omics approaches may provide further evidence for understanding, in a more integrative fashion, the mechanisms that disrupt gene regulation in cancer.
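
The clustering step named in this abstract can be illustrated with a minimal k-medoids sketch. It is not the authors' pipeline: the eigenvalue spectral decomposition is omitted, the features (chromosomal distance in Mb, Pearson r) and the toy data are hypothetical, and the function name `k_medoids` and its deterministic initialization are assumptions made for illustration.

```python
from math import dist


def k_medoids(points, k, max_iter=100):
    """Voronoi-iteration k-medoids; returns (medoid indices, cluster labels)."""
    n = len(points)
    # Pairwise Euclidean distances between all points.
    d = [[dist(p, q) for q in points] for p in points]
    medoids = list(range(k))  # assumption: start from the first k points

    def assign(meds):
        # Label each point with the index of its nearest medoid.
        return [min(range(k), key=lambda c: d[i][meds[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            updated.append(min(members, key=lambda m: sum(d[m][j] for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(medoids)


# Hypothetical gene couples: (chromosomal distance in Mb, Pearson r).
couples = [(1.0, 0.9), (2.0, 0.8), (1.2, 0.88), (50.0, 0.1), (52.0, 0.05), (60.0, 0.0)]
medoids, labels = k_medoids(couples, k=2)
```

Unlike k-means, each cluster centre here is an actual gene couple, which keeps the clusters interpretable in terms of observed pairs.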
Affiliation(s)
- Alfredo González-Espinoza
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, United States; Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Jose Zamora-Fuentes
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Enrique Hernández-Lemus
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico; Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Jesús Espinal-Enríquez
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico; Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Mexico City, Mexico

14
Yu H, Wang L, Chen D, Li J, Guo Y. Conditional transcriptional relationships may serve as cancer prognostic markers. BMC Med Genomics 2021; 14:101. [PMID: 34856998 PMCID: PMC8638091 DOI: 10.1186/s12920-021-00958-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/28/2021] [Accepted: 04/08/2021] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: While most differential coexpression (DC) methods quantify only a single correlation value for a gene pair across multiple samples, a newly devised approach named Correlation by Individual Level Product (CILP) projects the summary correlation value onto individual product correlation values for separate samples. CILP greatly widens the opportunities for DC analysis by allowing the integration of non-compromised statistical methods. METHODS: Here, we performed a study to verify our hypothesis that conditional relationships, i.e., gene pairs with remarkable differential coexpression, may serve as quantitative prognostic markers for human cancers. Alongside the search for prognostic gene links in a pan-cancer setting, we also examined whether a trend of global expression correlation loss appeared in a wide panel of cancer types and revisited the controversial subject of the mutual relationship between the DE approach and the DC approach. RESULTS: By integrating CILP with classical univariate survival analysis, we identified up to 244 conditional gene links as potential prognostic markers in five cancer types. In particular, five prognostic gene links for kidney renal papillary cell carcinoma tended to condense around the cancer gene ESPL1, and the transcriptional synchrony between ESPL1 and PTTG1 tended to be elevated in patients with adverse prognosis. In addition, we extended the observation of a global trend of correlation loss to more than ten cancer types and showed empirically that DC analysis results were independent of gene differential expression in five cancer types. CONCLUSIONS: Combining the power of CILP and classical survival analysis, we identified conditional transcriptional relationships that conferred prognostic power for five cancer types.
Despite a general trend of global correlation loss in tumor transcriptomes, most of these prognostic conditional links demonstrated stronger expression correlation in tumors, and their stronger coexpression was associated with poor survival.
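
One way to read "individual level product" is through the identity that Pearson's r equals the mean of the per-sample products of z-scores (using the population standard deviation). The sketch below illustrates that identity only; it is an assumption about the idea behind CILP, not the authors' implementation, and the function names and toy data are hypothetical.

```python
from statistics import pstdev


def zscores(values):
    """Standardize values with the population standard deviation."""
    mu = sum(values) / len(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]


def individual_products(x, y):
    """One product of z-scores per sample; their mean is Pearson's r."""
    return [a * b for a, b in zip(zscores(x), zscores(y))]


def pearson_from_products(x, y):
    products = individual_products(x, y)
    return sum(products) / len(products)


# Hypothetical expression values of two genes across five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 4.0, 6.0, 8.0, 10.0]
products = individual_products(gene_a, gene_b)
r = pearson_from_products(gene_a, gene_b)
```

Because each sample carries its own product value, those values can in principle be passed to per-sample methods such as survival analysis, which is the kind of integration the abstract describes.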
Affiliation(s)
- Hui Yu
  - Department of Internal Medicine, University of New Mexico, Albuquerque, NM 87131 USA
- Limei Wang
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, Hainan Medical University, Haikou, Hainan 571199 China
  - College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, 150001 Heilongjiang China
- Danqian Chen
  - Key Laboratory of Resource Biology and Biotechnology in Western China, School of Life Sciences, Northwest University, Xi’an, 710069 Shaanxi China
- Jin Li
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, Hainan Medical University, Haikou, Hainan 571199 China
- Yan Guo
  - Department of Internal Medicine, University of New Mexico, Albuquerque, NM 87131 USA

15
Guillaume B, Jérôme T, Philippe L, Eduardo C, François-Joseph L, Eric B. Aging at evolutionary crossroads: longitudinal gene co-expression network analyses of proximal and ultimate causes of aging in bats. Mol Biol Evol 2021; 39:6400255. [PMID: 34662394 PMCID: PMC8763092 DOI: 10.1093/molbev/msab302] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022] Open
Abstract
How, when, and why organisms, their tissues, and their cells age remain challenging questions, although researchers have identified multiple mechanistic causes of aging, and three major evolutionary theories have been developed to unravel the ultimate causes of organismal aging. A central hypothesis of these theories is that the strength of natural selection decreases with age. However, empirical evidence on when, why, and how organisms age is phylogenetically limited, especially in natural populations. Here, we developed generic comparisons of gene co-expression networks that quantify and dissect the heterogeneity of gene co-expression in conspecific individuals from different age-classes to provide topological evidence about some mechanical and fundamental causes of organismal aging. We applied this approach to investigate the complexity of some proximal and ultimate causes of aging phenotypes in a natural population of the greater mouse-eared bat Myotis myotis, a remarkably long-lived species given its body size and metabolic rate, for which longitudinal blood transcriptomes are available. M. myotis gene co-expression networks become increasingly fragmented with age, suggesting an erosion of the strength of natural selection and a general dysregulation of gene co-expression in aging bats. However, selective pressures remain sufficiently strong to allow the successive emergence of homogeneous age-specific gene co-expression patterns for at least 7 years. Thus, older individuals from long-lived species appear to sit at an evolutionary crossroads: as they age, they experience both a decrease in the strength of natural selection and a targeted selection for very specific biological processes, which invites refinement of a central hypothesis in evolutionary aging theories.
Affiliation(s)
- Bernard Guillaume
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, Paris, 75005, France
- Teulière Jérôme
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, Paris, 75005, France
- Lopez Philippe
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, Paris, 75005, France
- Corel Eduardo
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, Paris, 75005, France
- Lapointe François-Joseph
  - Département de sciences biologiques, Complexe des sciences, 1375 avenue Thérèse-Lavoie-Roux, Université de Montréal, Montréal (Québec), H2V 0B3, Canada
- Bapteste Eric
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, Paris, 75005, France

16
Petti M, Verrienti A, Paci P, Farina L. SEaCorAl: Identifying and contrasting the regulation-correlation bias in RNA-Seq paired expression data of patient groups. Comput Biol Med 2021; 135:104567. [PMID: 34174761 DOI: 10.1016/j.compbiomed.2021.104567] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/28/2021] [Revised: 05/27/2021] [Accepted: 06/08/2021] [Indexed: 01/11/2023]
Abstract
The Cancer Genome Atlas database offers the possibility of analyzing genome-wide RNA-Seq cancer expression data using paired counts, that is, studies where expression data are collected in pairs of normal and cancer cells by taking samples from the same individual. Correlation of gene expression profiles is the most common analysis for studying co-expression groups, and it is used to find biological interpretations of -omics big data. The aim of the paper is threefold. First, we show, for the first time, the presence of a "regulation-correlation bias" in RNA-Seq paired expression data, that is, an artifactual link between the expression status (up- or down-regulation) of a gene pair and the sign of the corresponding correlation coefficient. Second, we provide a statistical model that explains theoretically why such a bias arises. Third, we present a bias-removal algorithm, called SEaCorAl, that effectively reduces bias effects and improves the biological significance of correlation analysis. The SEaCorAl algorithm is validated by showing a significant increase in the ability to detect biologically meaningful associations of positive correlations and a significant increase in the modularity of the resulting unbiased correlation network.
Affiliation(s)
- Manuela Petti
  - Department of Computer, Control and Management Engineering, Sapienza University of Rome, Italy
- Antonella Verrienti
  - Department of Translational and Precision Medicine, Sapienza University of Rome, Italy
- Paola Paci
  - Department of Computer, Control and Management Engineering, Sapienza University of Rome, Italy
- Lorenzo Farina
  - Department of Computer, Control and Management Engineering, Sapienza University of Rome, Italy

17
Lemoine GG, Scott-Boyer MP, Ambroise B, Périn O, Droit A. GWENA: gene co-expression networks analysis and extended modules characterization in a single Bioconductor package. BMC Bioinformatics 2021; 22:267. [PMID: 34034647 PMCID: PMC8152313 DOI: 10.1186/s12859-021-04179-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 12/22/2020] [Accepted: 05/07/2021] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND: Network-based analysis of gene expression through co-expression networks can be used to investigate modular relationships between genes performing different biological functions. An extended description of each network module is therefore a critical step towards understanding the underlying processes that contribute to a disease or a phenotype. Biological integration, topology study and condition comparison (e.g., wild type vs. mutant) are the main methods for doing so, but to date no tool combines them all in a single pipeline. RESULTS: Here we present GWENA, a new R package that integrates gene co-expression network construction with full characterization of the detected modules through gene set enrichment, phenotypic association, hub gene detection, topological metric computation, and differential co-expression. To demonstrate its performance, we applied GWENA to two skeletal muscle datasets from young and old patients in the GTEx study. Remarkably, we prioritized a gene whose involvement in muscle development and growth was unknown. Moreover, new insights into the variations in patterns of co-expression were identified. The known phenomenon of connectivity loss associated with aging was found to be coupled with a global reorganization of the relationships, leading to the expression of known aging-related functions. CONCLUSION: GWENA is an R package available through Bioconductor ( https://bioconductor.org/packages/release/bioc/html/GWENA.html ) that has been developed to perform extended analysis of gene co-expression networks. Thanks to biological and topological information as well as differential co-expression, the package helps to dissect the role of gene relationships in disease conditions or targeted phenotypes. GWENA goes beyond existing packages that perform co-expression analysis by including new tools to fully characterize modules, such as differential co-expression, additional enrichment databases, and network visualization.
Affiliation(s)
- Gwenaëlle G. Lemoine
  - Département de médecine moléculaire, Faculté de médecine, Université Laval, 2325 rue de l’Université, Québec, G1V 0A6 Canada
- Marie-Pier Scott-Boyer
  - Centre de recherche du Chu de Quebec-Université Laval, 2705 boulevard Laurier Québec, Québec, G1V 4G2 Canada
- Bathilde Ambroise
  - L’Oréal Research and Innovation, 15 rue Pierre Dreyfus, 92110 Clichy, France
- Olivier Périn
  - L’Oréal Research and Innovation, 15 rue Pierre Dreyfus, 92110 Clichy, France
- Arnaud Droit
  - Département de médecine moléculaire, Faculté de médecine, Université Laval, 2325 rue de l’Université, Québec, G1V 0A6 Canada
  - Centre de recherche du Chu de Quebec-Université Laval, 2705 boulevard Laurier Québec, Québec, G1V 4G2 Canada

18
Gene expression barcode values reveal a potential link between Parkinson's disease and gastric cancer. Aging (Albany NY) 2021; 13:6171-6181. [PMID: 33596182 PMCID: PMC7950232 DOI: 10.18632/aging.202623] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/12/2020] [Accepted: 01/22/2021] [Indexed: 12/11/2022]
Abstract
Gastric cancer is a disease that develops from the lining of the stomach, whereas Parkinson’s disease is a long-term degenerative disorder of the central nervous system that mainly affects the motor system. Although these two diseases seem to be distinct from each other, increasing evidence suggests that they might be linked. To explore the linkage between these two diseases, differentially expressed genes between patients and their normal controls were identified using the barcode algorithm. This algorithm transforms actual gene expression values into barcode values composed of 1’s (expressed genes) and 0’s (silenced genes). Once the overlapping differentially expressed genes were identified, their biological relevance was investigated. Thus, using gene expression profiles and bioinformatics methods, we demonstrate that Parkinson’s disease and gastric cancer are indeed linked. This research may serve as a pilot study and stimulate further research into the relationship between gastric cancer and Parkinson’s disease from the perspective of gene profiles and their functions.
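
The barcode idea described above (expression values turned into 1's and 0's, then compared between groups and across diseases) can be sketched as follows. The abstract does not say how the expressed/silenced cutoffs are obtained, so the fixed per-gene thresholds, the `min_diff` criterion, the gene names and all data below are hypothetical placeholders rather than the published procedure.

```python
def barcode(expression, thresholds):
    """Binarize expression: 1 if a value exceeds the gene's threshold, else 0."""
    return {
        gene: [1 if v > thresholds[gene] else 0 for v in values]
        for gene, values in expression.items()
    }


def expressed_fraction(codes):
    """Share of samples in which the gene is called expressed."""
    return sum(codes) / len(codes)


def differential_genes(cases, controls, min_diff=0.5):
    """Genes whose expressed fraction differs between groups by at least min_diff."""
    return {
        gene
        for gene in cases
        if gene in controls
        and abs(expressed_fraction(cases[gene]) - expressed_fraction(controls[gene])) >= min_diff
    }


thresholds = {"GENE_A": 5.0, "GENE_B": 5.0}  # placeholder cutoffs

# Hypothetical disease 1: cases vs. controls.
d1_cases = barcode({"GENE_A": [8, 9, 7], "GENE_B": [2, 3, 6]}, thresholds)
d1_controls = barcode({"GENE_A": [1, 2, 3], "GENE_B": [2, 6, 1]}, thresholds)

# Hypothetical disease 2: cases vs. controls.
d2_cases = barcode({"GENE_A": [6, 7, 8], "GENE_B": [9, 9, 9]}, thresholds)
d2_controls = barcode({"GENE_A": [1, 1, 6], "GENE_B": [1, 1, 1]}, thresholds)

# Genes called differential in both diseases.
shared = differential_genes(d1_cases, d1_controls) & differential_genes(d2_cases, d2_controls)
```

The set intersection at the end corresponds to the "overlapping differentially expressed genes" whose biological relevance the study then investigates.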

19
Portes LL, Small M. Navigating differential structures in complex networks. Phys Rev E 2021; 102:062301. [PMID: 33466036 DOI: 10.1103/physreve.102.062301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2020] [Accepted: 11/20/2020] [Indexed: 11/07/2022]
Abstract
Structural changes in a network representation of a system, due to different experimental conditions, different connectivity across layers, or its time evolution, can provide insight into its organization and function and into how it responds to external perturbations. The deeper understanding of how gene networks cope with diseases and treatments is perhaps the most incisive demonstration of the gains obtained from this differential network analysis point of view, which has led to an explosion of new numerical techniques in the last decade. However, deciding where to focus one's attention, or how to navigate the differential structures in large networks, can be overwhelming even for a few experimental conditions. In this paper, we propose a theory and a methodological implementation for characterizing shared "structural roles" of nodes simultaneously within and between networks. Inspired by recent methodological advances in chaotic phase synchronization analysis, we show how the information about the shared structures of a set of networks can be split and organized automatically in scenarios with very different (i) community sizes and (ii) total numbers of communities, and (iii) even when as many as 100 networks are compared, using numerical benchmarks generated by a stochastic block model. Then, we investigate how the network size, number of networks, and mean community size influence the method's performance in a series of Monte Carlo experiments. To illustrate its potential use in a more challenging scenario with real-world data, we show evidence that the method can still split and organize the structural information of a set of four gene coexpression networks obtained from two cell types × two treatments (interferon-β stimulated or control).
Aside from its potential use as an automatic feature-extraction and preprocessing tool, we argue that another strength of the method is its "story-telling"-like characterization of the information encoded in a set of networks, which can be used to pinpoint unexpected shared structure, leading to further investigations and providing new insights. Finally, the method is flexible enough to address different research-field-specific questions, because it does not restrict which scientifically meaningful characteristic (or relevant feature) of a node is used.
Affiliation(s)
- Leonardo L Portes
  - Complex Systems Group, Department of Mathematics and Statistics, University of Western Australia, Nedlands, Perth, WA 6009, Australia
- Michael Small
  - Complex Systems Group, Department of Mathematics and Statistics, University of Western Australia, Nedlands, Perth, WA 6009, Australia; Mineral Resources, CSIRO, Kensington, Perth, WA 6151, Australia

20
Savino A, Provero P, Poli V. Differential Co-Expression Analyses Allow the Identification of Critical Signalling Pathways Altered during Tumour Transformation and Progression. Int J Mol Sci 2020; 21:E9461. [PMID: 33322692 PMCID: PMC7764314 DOI: 10.3390/ijms21249461] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 10/30/2020] [Revised: 12/02/2020] [Accepted: 12/09/2020] [Indexed: 02/02/2023] Open
Abstract
Biological systems respond to perturbations through the rewiring of molecular interactions, organised in gene regulatory networks (GRNs). Among these, the increasingly high availability of transcriptomic data makes gene co-expression networks the most exploited ones. Differential co-expression networks are useful tools to identify changes in response to an external perturbation, such as mutations predisposing to cancer development, and leading to changes in the activity of gene expression regulators or signalling. They can help explain the robustness of cancer cells to perturbations and identify promising candidates for targeted therapy, moreover providing higher specificity with respect to standard co-expression methods. Here, we comprehensively review the literature about the methods developed to assess differential co-expression and their applications to cancer biology. Via the comparison of normal and diseased conditions and of different tumour stages, studies based on these methods led to the definition of pathways involved in gene network reorganisation upon oncogenes' mutations and tumour progression, often converging on immune system signalling. A relevant implementation still lagging behind is the integration of different data types, which would greatly improve network interpretability. Most importantly, performance and predictivity evaluation of the large variety of mathematical models proposed would urgently require experimental validations and systematic comparisons. We believe that future work on differential gene co-expression networks, complemented with additional omics data and experimentally tested, will considerably improve our insights into the biology of tumours.
Affiliation(s)
- Aurora Savino
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Turin, Via Nizza 52, 10126 Turin, Italy
- Paolo Provero
  - Department of Neurosciences “Rita Levi Montalcini”, University of Turin, Corso Massimo D’Ázeglio 52, 10126 Turin, Italy
  - Center for Omics Sciences, Ospedale San Raffaele IRCCS, Via Olgettina 60, 20132 Milan, Italy
- Valeria Poli
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Turin, Via Nizza 52, 10126 Turin, Italy

21
Simpson CM, Gnad F. Applying graph database technology for analyzing perturbed co-expression networks in cancer. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2020; 2020:6029398. [PMID: 33306799 PMCID: PMC7731929 DOI: 10.1093/database/baaa110] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/17/2020] [Revised: 09/20/2020] [Accepted: 11/30/2020] [Indexed: 11/13/2022]
Abstract
Graph representations provide an elegant solution to capture and analyze complex molecular mechanisms in the cell. Co-expression networks are undirected graph representations of transcriptional co-behavior indicating (co-)regulations, functional modules or even physical interactions between the corresponding gene products. The growing avalanche of available RNA sequencing (RNAseq) data fuels the construction of such networks, which are usually stored in relational databases like most other biological data. Inferring linkage by recursive multiple-join statements, however, is computationally expensive and complex to design in relational databases. In contrast, graph databases store and represent complex interconnected data as nodes, edges and properties, making it fast and intuitive to query and analyze relationships. While graph-based database technologies are on their way from a fringe domain to going mainstream, there are only a few studies reporting their application to biological data. We used the graph database management system Neo4j to store and analyze co-expression networks derived from RNAseq data from The Cancer Genome Atlas. Comparing co-expression in tumors versus healthy tissues in six cancer types revealed significant perturbation tracing back to erroneous or rewired gene regulation. Applying centrality, community detection and pathfinding graph algorithms uncovered the destruction or creation of central nodes, modules and relationships in co-expression networks of tumors. Given the speed, accuracy and straightforwardness of managing these densely connected networks, we conclude that graph databases are ready for entering the arena of biological data.
Affiliation(s)
- Claire M Simpson
  - Department of Bioinformatics and Data Science, Cell Signaling Technology Inc., 3 Trask Lane, Danvers, MA 01923, USA
- Florian Gnad
  - Department of Bioinformatics and Data Science, Cell Signaling Technology Inc., 3 Trask Lane, Danvers, MA 01923, USA

22
Castillo SP, Keymer JE, Marquet PA. Do microenvironmental changes disrupt multicellular organisation with ageing, enacting and favouring the cancer cell phenotype? Bioessays 2020; 43:e2000126. [PMID: 33184914 DOI: 10.1002/bies.202000126] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/28/2020] [Revised: 10/05/2020] [Accepted: 10/06/2020] [Indexed: 12/13/2022]
Abstract
Cancer is a singular cellular state, the emergence of which destabilises the homeostasis reached through the evolution to multicellularity. We present the idea that the onset of the cellular disobedience to the metazoan functional and structural architecture, known as the cancer phenotype, is triggered by changes in the cell's external environment that occur with ageing: what ensues is a breach of the social contract of multicellular life characteristic of metazoans. By integrating old ideas with new evidence, we propose that with ageing the environmental information that maintains a multicellular organisation is eroded, rewiring internal processes of the cell, and resulting in an internal shift towards an ancestral condition resulting in the pseudo-multicellular cancer phenotype. Once that phenotype emerges, a new local social contract is built, different from the homeostatic one, leading to tumour formation and the foundation of a novel local ecosystem.
Affiliation(s)
- Simon P Castillo
  - Departamento de Ecología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Santiago, Chile; Instituto de Ecología y Biodiversidad de Chile (IEB) Chile, Santiago, Chile
- Juan E Keymer
  - Departamento de Ecología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Santiago, Chile; Instituto de Física, Pontificia Universidad Católica de Chile, Santiago, Chile; Departamento de Ciencias Naturales y Tecnología, Universidad de Aysén, Coyhaique, Chile
- Pablo A Marquet
  - Departamento de Ecología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Santiago, Chile; Instituto de Ecología y Biodiversidad de Chile (IEB) Chile, Santiago, Chile; Instituto de Sistemas Complejos de Valparaíso (ISCV), Valparaíso, Chile; Santa Fe Institute, Santa Fe, New Mexico, USA

23
Mayer P. Modelling bioactivities of combinations of whole extracts of edibles with a simplified theoretical framework reveals the statistical role of molecular diversity and system complexity in their mode of action and their nearly certain safety. PLoS One 2020; 15:e0239841. [PMID: 32986750 PMCID: PMC7521709 DOI: 10.1371/journal.pone.0239841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2020] [Accepted: 09/14/2020] [Indexed: 11/25/2022] Open
Abstract
Network pharmacology and polypharmacology are emerging as novel drug discovery paradigms. The many discovery, safety and regulatory issues they raise may become tractable with polypharmacological combinations of natural compounds found in whole extracts of edibles and mixes thereof. The primary goal of this work is to gain general insights into the innocuity and the emergence of beneficial and toxic activities of combinations of many compounds in general and of edibles in particular. A simplified model of compounds’ interactions with an organism and of their desired and undesired effects is constructed by considering the departure from equilibrium of interconnected biological features. This model makes it possible to compute how the probability of significant effects scales with nutritional diversity, organism complexity and the synergy resulting from mixing compounds and edibles. It also makes it possible to characterize massive indirect perturbation mode of action drugs as a potential novel multi-compound-multi-target pharmaceutical class, coined Ediceuticals when based on edibles. Their mode of action may readily and differentially target organisms’ system robustness as such, based on differential complexity, for discovering nearly certainly safe novel antimicrobial, antiviral and anti-cancer treatments. This very general model also provides a theoretical framework for several pharmaceutical and nutritional observations. In particular, it characterizes two classes of undesirable effects of drugs, and may question the interpretation of undesirable effects in healthy subjects. It also formalizes nutritional diversity itself as a novel statistical supra-chemical parameter that may help guide nutritional health interventions. Finally, a similar formalism may be further applicable to modelling whole ecosystems in general.

24
Chowdhury HA, Bhattacharyya DK, Kalita JK. (Differential) Co-Expression Analysis of Gene Expression: A Survey of Best Practices. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:1154-1173. [PMID: 30668502 DOI: 10.1109/tcbb.2019.2893170] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 06/09/2023]
Abstract
Analysis of gene expression data is widely used in transcriptomic studies to understand the functions of molecules inside a cell and the interactions among them. Differential co-expression analysis studies diseases and phenotypic variations by finding modules of genes whose co-expression patterns vary across conditions. We review best practices in gene expression data analysis in terms of (differential) co-expression analysis, co-expression networks, differential networking, and differential connectivity, considering both microarray and RNA-seq data, along with comparisons. We highlight hurdles in analyzing RNA-seq data with methods developed for microarrays. We discuss the necessary tools for gene expression analysis throughout the paper. In addition, we shed light on scRNA-seq data analysis by covering preprocessing and the use of scRNA-seq in co-expression analysis, along with useful tools specific to scRNA-seq. To support insight, biological interpretation and functional profiling are included. Finally, we provide guidelines for the analyst, along with research issues and challenges that should be addressed.
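
As a concrete illustration of what "co-expression patterns vary across conditions" can mean for a single gene pair, the sketch below compares two Pearson correlations with the standard Fisher z-transformation test. It is one common textbook approach, not a method attributed to this survey; the function names and data are hypothetical, and it assumes more than three samples per condition and |r| < 1.

```python
from math import atanh, sqrt
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def differential_correlation(x1, y1, x2, y2):
    """Fisher z-test for a difference between two independent correlations."""
    r1, r2 = pearson(x1, y1), pearson(x2, y2)
    # Standard error of the difference of Fisher-transformed correlations.
    se = sqrt(1 / (len(x1) - 3) + 1 / (len(x2) - 3))
    z = (atanh(r1) - atanh(r2)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, p


# Hypothetical expression of a gene pair in two conditions (six samples each).
gene_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_y_control = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]  # tightly co-expressed
gene_y_disease = [3.0, 1.0, 4.0, 1.0, 5.0, 2.0]  # co-expression largely lost
z, p = differential_correlation(gene_x, gene_y_control, gene_x, gene_y_disease)
```

Applied to every gene pair, such a test needs multiple-testing correction, and with sample sizes this small its power is low; the example only shows the mechanics.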

25
Liu W, Gan C, Wang W, Liao L, Li C, Xu L, Li E. Identification of lncRNA-associated differential subnetworks in oesophageal squamous cell carcinoma by differential co-expression analysis. J Cell Mol Med 2020; 24:4804-4818. [PMID: 32164040 PMCID: PMC7176870 DOI: 10.1111/jcmm.15159] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/24/2019] [Revised: 02/21/2020] [Accepted: 02/25/2020] [Indexed: 02/06/2023] Open
Abstract
Differential expression analysis has led to the identification of important biomarkers in oesophageal squamous cell carcinoma (ESCC). Despite its enormous contributions, it has not harnessed the full potential of gene expression data, such as interactions among genes. Differential co-expression analysis has emerged as an effective tool that complements differential expression analysis, providing better insight into dysregulated mechanisms and indicating key driver genes. Here, we analysed the differential co-expression of lncRNAs and protein-coding genes (PCGs) between normal oesophageal tissue and ESCC tissues, and constructed a lncRNA-PCG differential co-expression network (DCN). The DCN was characterized as a scale-free, small-world network with modular organization. Focusing on lncRNAs, a total of 107 differential lncRNA-PCG subnetworks were identified from the DCN by integrating both differential expression and differential co-expression. These differential subnetworks provide a valuable source for revealing lncRNA functions and the associated dysfunctional regulatory networks in ESCC. Their consistent discrimination suggests that they may have important roles in ESCC and could serve as robust subnetwork biomarkers. In addition, two tumour suppressor genes (AL121899.1 and ELMO2), identified in the core modules, were validated by functional experiments. The proposed method can be easily used to investigate differential subnetworks of other molecules in other cancers.
Affiliation(s)
- Wei Liu
  - The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China
  - Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, China
  - Department of Mathematics, Heilongjiang Institute of Technology, Harbin, China
- Cai‐Yan Gan
  - The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China
  - Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, China
- Wei Wang
  - Department of Mathematics, Heilongjiang Institute of Technology, Harbin, China
- Lian‐Di Liao
  - The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China
  - Institute of Oncologic Pathology, Shantou University Medical College, Shantou, China
- Chun‐Quan Li
  - Department of Medical Informatics, Harbin Medical University‐Daqing, Daqing, China
- Li‐Yan Xu
  - The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China
  - Institute of Oncologic Pathology, Shantou University Medical College, Shantou, China
- En‐Min Li
  - The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China
  - Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, China
|
26
|
Jin T, Wang C, Tian S. Feature selection based on differentially correlated gene pairs reveals the mechanism of IFN-β therapy for multiple sclerosis. PeerJ 2020; 8:e8812. [PMID: 32211244 PMCID: PMC7081782 DOI: 10.7717/peerj.8812] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/25/2019] [Accepted: 02/27/2020] [Indexed: 12/22/2022]
Abstract
Multiple sclerosis (MS) is one of the most common neurological disabilities of the central nervous system. Immune-modulatory therapy with Interferon-β (IFN-β) is a commonly used first-line treatment to prevent MS patients from relapses. Nevertheless, a large proportion of MS patients on IFN-β therapy experience their first relapse within 2 years of treatment initiation. Feature selection, a machine learning strategy, is routinely used in the fields of bioinformatics and computational biology to determine which subset of genes is most relevant to an outcome of interest. The majority of feature selection methods focus on alterations in gene expression levels. In this study, we sought to determine which genes are most relevant to relapse of MS patients on IFN-β therapy. Rather than the usual focus on alterations in gene expression levels, we devised a feature selection method based on alterations in gene-to-gene interactions. In this study, we applied the proposed method to a longitudinal microarray dataset and evaluated the IFN-β effect on MS patients to identify gene pairs with differentially correlated edges that are consistent over time in the responder group compared to the non-responder group. The resulting gene list had a good predictive ability on an independent validation set and explicit biological implications related to MS. To conclude, it is anticipated that the proposed method will gain widespread interest and application in personalized treatment research to facilitate prediction of which patients may respond to a specific regimen.
Affiliation(s)
- Tao Jin
  - Department of Neurology and Neuroscience Center, The First Hospital of Jilin University, Changchun, China
- Chi Wang
  - Department of Biostatistics and Markey Cancer Center, University of Kentucky, Lexington, KY, USA
- Suyan Tian
  - Division of Clinical Research, The First Hospital of Jilin University, Changchun, Jilin, China
|
27
|
Kataka E, Zaucha J, Frishman G, Ruepp A, Frishman D. Edgetic perturbation signatures represent known and novel cancer biomarkers. Sci Rep 2020; 10:4350. [PMID: 32152446 PMCID: PMC7062722 DOI: 10.1038/s41598-020-61422-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/24/2019] [Accepted: 02/20/2020] [Indexed: 02/07/2023]
Abstract
Isoform switching is a recently characterized hallmark of cancer, and often translates to the loss or gain of domains mediating protein interactions and thus, the re-wiring of the interactome. Recent computational tools leverage domain-domain interaction data to resolve the condition-specific interaction networks from RNA-Seq data accounting for the domain content of the primary transcripts expressed. Here, we used The Cancer Genome Atlas RNA-Seq datasets to generate 642 patient-specific pairs of interactomes corresponding to both the tumor and the healthy tissues across 13 cancer types. The comparison of these interactomes provided a list of patient-specific edgetic perturbations of the interactomes associated with the cancerous state. We found that among the identified perturbations, select sets are robustly shared between patients at the multi-cancer, cancer-specific and cancer sub-type specific levels. Interestingly, the majority of the alterations do not directly involve significantly mutated genes, nevertheless, they strongly correlate with patient survival. The findings (available at EdgeExplorer: “http://webclu.bio.wzw.tum.de/EdgeExplorer”) are a new source of potential biomarkers for classifying cancer types and the proteins we identified are potential anti-cancer therapy targets.
Affiliation(s)
- Evans Kataka
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354, Freising, Germany
- Jan Zaucha
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354, Freising, Germany
- Goar Frishman
  - Institute of Experimental Genetics (IEG), Helmholtz Zentrum München-German Research Center for Environmental Health (GmbH), Ingolstädter Landstrasse 1, 85764, Neuherberg, Germany
- Andreas Ruepp
  - Institute of Experimental Genetics (IEG), Helmholtz Zentrum München-German Research Center for Environmental Health (GmbH), Ingolstädter Landstrasse 1, 85764, Neuherberg, Germany
- Dmitrij Frishman
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354, Freising, Germany
  - Laboratory of Bioinformatics, RASA Research Center, St Petersburg State Polytechnic University, St Petersburg, 195251, Russia
|
28
|
Dalgıç E, Konu Ö, Öz ZS, Chan C. Lower connectivity of tumor coexpression networks is not specific to cancer. In Silico Biol 2019; 13:41-53. [PMID: 31156157 PMCID: PMC6597990 DOI: 10.3233/isb-190472] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 02/06/2023]
Abstract
Global level network analysis of molecular links is necessary for systems level view of complex diseases like cancer. Using genome-wide expression datasets, we constructed and compared gene co-expression based specific networks of pre-cancerous tumors (adenoma) and cancerous tumors (carcinoma) with paired normal networks to assess for any possible changes in network connectivity. Previously, loss of connectivity was reported as a characteristic of cancer samples. Here, we observed that pre-cancerous conditions also had significantly less connections than paired normal samples. We observed a loss of connectivity trend for colorectal adenoma, aldosterone producing adenoma and uterine leiomyoma. We also showed that the loss of connectivity trend is not specific to positive or negative correlation based networks. Differential hub genes, which were the most highly differentially less connected genes in tumor, were mostly different between different datasets. No common gene list could be defined which underlies the lower connectivity of tumor specific networks. Connectivity of colorectal cancer methylation targets was different from other genes. Extracellular space related terms were enriched in negative correlation based differential hubs and common methylation targets of colorectal carcinoma. Our results indicate a systems level change of lower connectivity as cells transform to not only cancer but also pre-cancerous conditions. This systems level behavior could not be attributed to a group of genes.
Affiliation(s)
- Ertuğrul Dalgıç
  - Department of Medical Biology, Zonguldak Bülent Ecevit University School of Medicine, Zonguldak, Turkey
- Özlen Konu
  - Department of Molecular Biology and Genetics, Bilkent University, Ankara, Turkey
- Zehra Safi Öz
  - Department of Medical Biology, Zonguldak Bülent Ecevit University School of Medicine, Zonguldak, Turkey
- Christina Chan
  - Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI, USA
|
29
|
Bhuva DD, Cursons J, Smyth GK, Davis MJ. Differential co-expression-based detection of conditional relationships in transcriptional data: comparative analysis and application to breast cancer. Genome Biol 2019; 20:236. [PMID: 31727119 PMCID: PMC6857226 DOI: 10.1186/s13059-019-1851-8] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 06/04/2019] [Accepted: 10/02/2019] [Indexed: 01/05/2023]
Abstract
BACKGROUND Elucidation of regulatory networks, including identification of regulatory mechanisms specific to a given biological context, is a key aim in systems biology. This has motivated the move from co-expression to differential co-expression analysis and numerous methods have been developed subsequently to address this task; however, evaluation of methods and interpretation of the resulting networks has been hindered by the lack of known context-specific regulatory interactions. RESULTS In this study, we develop a simulator based on dynamical systems modelling capable of simulating differential co-expression patterns. With the simulator and an evaluation framework, we benchmark and characterise the performance of inference methods. Defining three different levels of "true" networks for each simulation, we show that accurate inference of causation is difficult for all methods, compared to inference of associations. We show that a z-score-based method has the best general performance. Further, analysis of simulation parameters reveals five network and simulation properties that explained the performance of methods. The evaluation framework and inference methods used in this study are available in the dcanr R/Bioconductor package. CONCLUSIONS Our analysis of networks inferred from simulated data show that hub nodes are more likely to be differentially regulated targets than transcription factors. Based on this observation, we propose an interpretation of the inferred differential network that can reconstruct a putative causal network.
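Illustrative sketch (not taken from the paper or from dcanr): the abstract above reports that a z-score-based method performed best overall. One common formulation of such a score, assumed here for illustration only, applies the Fisher transformation to the correlation of a gene pair in each condition and divides the difference by its approximate standard error. The function names `pearson` and `diff_coexpression_z` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def diff_coexpression_z(x1, y1, x2, y2):
    """Z-score and two-sided p-value for a change in correlation of one gene pair.

    (x1, y1) are the expression values of the two genes in condition 1,
    (x2, y2) the same genes in condition 2. Assumes independent samples and
    more than three samples per condition.
    """
    n1, n2 = len(x1), len(x2)
    if n1 <= 3 or n2 <= 3:
        raise ValueError("need more than three samples per condition")
    eps = 1e-12  # keep |r| < 1 so that atanh stays finite
    r1 = max(-1 + eps, min(1 - eps, pearson(x1, y1)))
    r2 = max(-1 + eps, min(1 - eps, pearson(x2, y2)))
    # Fisher transformation: atanh(r) is approximately normal with variance 1/(n - 3)
    z = (math.atanh(r1) - math.atanh(r2)) / math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Toy data: the pair is positively correlated in condition 1, negatively in condition 2.
gene_a = [1, 2, 3, 4, 5, 6]
gene_b_cond1 = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
gene_b_cond2 = list(reversed(gene_b_cond1))
z, p = diff_coexpression_z(gene_a, gene_b_cond1, gene_a, gene_b_cond2)
print(f"z = {z:.2f}, p = {p:.3g}")
```

In a genome-wide setting this score would be computed for every gene pair, so the p-values would need multiple-testing correction before any edge is called differential.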
Affiliation(s)
- Dharmesh D Bhuva
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
  - School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Melbourne, VIC, 3010, Australia
- Joseph Cursons
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
  - Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, VIC, 3010, Australia
- Gordon K Smyth
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
  - School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Melbourne, VIC, 3010, Australia
- Melissa J Davis
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
  - Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, VIC, 3010, Australia
  - Department of Clinical Pathology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, VIC, 3010, Australia
|
30
|
Sabir JSM, El Omri A, Shaik NA, Banaganapalli B, Al-Shaeri MA, Alkenani NA, Hajrah NH, Awan ZA, Zrelli H, Elango R, Khan M. Identification of key regulatory genes connected to NF-κB family of proteins in visceral adipose tissues using gene expression and weighted protein interaction network. PLoS One 2019; 14:e0214337. [PMID: 31013288 PMCID: PMC6478283 DOI: 10.1371/journal.pone.0214337] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 01/13/2019] [Accepted: 03/11/2019] [Indexed: 12/12/2022]
Abstract
Obesity is connected to the activation of chronic inflammatory pathways in both adipocytes and macrophages located in adipose tissues. The nuclear factor (NF)-κB is a central molecule involved in inflammatory pathways linked to the pathology of different complex metabolic disorders. Investigating the gene expression data in the adipose tissue would potentially unravel disease relevant gene interactions. The present study is aimed at creating a signature molecular network and at prioritizing the potential biomarkers interacting with NF-κB family of proteins in obesity using system biology approaches. The dataset GSE88837 associated with obesity was downloaded from Gene Expression Omnibus (GEO) database. Statistical analysis represented the differential expression of a total of 2650 genes in adipose tissues (p = <0.05). Using concepts like correlation, semantic similarity, and theoretical graph parameters we narrowed down genes to a network of 23 genes strongly connected with NF-κB family with higher significance. Functional enrichment analysis revealed 21 of 23 target genes of NF-κB were found to have a critical role in the pathophysiology of obesity. Interestingly, GEM and PPP1R13L were predicted as novel genes which may act as potential target or biomarkers of obesity as they occur with other 21 target genes with known obesity relationship. Our study concludes that NF-κB and prioritized target genes regulate the inflammation in adipose tissues through several molecular signaling pathways like NF-κB, PI3K-Akt, glucocorticoid receptor regulatory network, angiogenesis and cytokine pathways. This integrated system biology approaches can be applied for elucidating functional protein interaction networks of NF-κB protein family in different complex diseases. Our integrative and network-based approach for finding therapeutic targets in genomic data could accelerate the identification of novel drug targets for obesity.
Affiliation(s)
- Jamal S. M. Sabir
  - Center of Excellence in Bionanoscience Research, King Abdulaziz University, Jeddah, Saudi Arabia
  - Genomics and Biotechnology Section and Research Group, Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
- Abdelfatteh El Omri
  - Center of Excellence in Bionanoscience Research, King Abdulaziz University, Jeddah, Saudi Arabia
  - Genomics and Biotechnology Section and Research Group, Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
  - * E-mail: (MK); (AEO)
- Noor A. Shaik
  - Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Babajan Banaganapalli
  - Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Majed A. Al-Shaeri
  - Center of Excellence in Bionanoscience Research, King Abdulaziz University, Jeddah, Saudi Arabia
- Naser A. Alkenani
  - Biology-Zoology Division, Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
- Nahid H. Hajrah
  - Center of Excellence in Bionanoscience Research, King Abdulaziz University, Jeddah, Saudi Arabia
  - Genomics and Biotechnology Section and Research Group, Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
- Zuhier A. Awan
  - Department of Clinical Biochemistry, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Houda Zrelli
  - Center of Excellence in Bionanoscience Research, King Abdulaziz University, Jeddah, Saudi Arabia
  - Genomics and Biotechnology Section and Research Group, Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
- Ramu Elango
  - Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Muhummadh Khan
  - Center of Excellence in Bionanoscience Research, King Abdulaziz University, Jeddah, Saudi Arabia
  - Genomics and Biotechnology Section and Research Group, Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
  - * E-mail: (MK); (AEO)
|
31
|
Lea A, Subramaniam M, Ko A, Lehtimäki T, Raitoharju E, Kähönen M, Seppälä I, Mononen N, Raitakari OT, Ala-Korpela M, Pajukanta P, Zaitlen N, Ayroles JF. Genetic and environmental perturbations lead to regulatory decoherence. eLife 2019; 8:e40538. [PMID: 30834892 PMCID: PMC6400502 DOI: 10.7554/elife.40538] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 07/29/2018] [Accepted: 02/14/2019] [Indexed: 01/24/2023]
Abstract
Correlation among traits is a fundamental feature of biological systems that remains difficult to study. To address this problem, we developed a flexible approach that allows us to identify factors associated with inter-individual variation in correlation. We use data from three human cohorts to study the effects of genetic and environmental variation on correlations among mRNA transcripts and among NMR metabolites. We first show that environmental exposures (infection and disease) lead to a systematic loss of correlation, which we define as 'decoherence'. Using longitudinal data, we show that decoherent metabolites are better predictors of whether someone will develop metabolic syndrome than metabolites commonly used as biomarkers of this disease. Finally, we demonstrate that correlation itself is under genetic control by mapping hundreds of 'correlation quantitative trait loci (QTLs)'. Together, this work furthers our understanding of how and why coordinated biological processes break down, and points to a potential role for decoherence in disease. Editorial note This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed (see decision letter).
Affiliation(s)
- Amanda Lea
  - Department of Ecology and Evolution, Princeton University, Princeton, United States
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, United States
- Meena Subramaniam
  - Department of Medicine, Lung Biology Center, University of California, San Francisco, San Francisco, United States
- Arthur Ko
  - Department of Medicine, David Geffen School of Medicine at UCLA, University of California, Los Angeles, Los Angeles, United States
- Terho Lehtimäki
  - Department of Clinical Chemistry, Fimlab Laboratories, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - Finnish Cardiovascular Research Center, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Emma Raitoharju
  - Finnish Cardiovascular Research Center, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Mika Kähönen
  - Finnish Cardiovascular Research Center, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - Department of Clinical Physiology, Tampere University, Tampere University Hospital, Tampere, Finland
- Ilkka Seppälä
  - Finnish Cardiovascular Research Center, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Nina Mononen
  - Finnish Cardiovascular Research Center, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Olli T Raitakari
  - Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, Turku, Finland
  - Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, Turku, Finland
- Mika Ala-Korpela
  - Systems Epidemiology, Baker Heart and Diabetes Institute, Melbourne, Australia
  - Computational Medicine, Faculty of Medicine, Biocenter Oulu, University of Oulu, Oulu, Finland
  - NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland
  - Population Health Science, Bristol Medical School, University of Bristol, Bristol, United Kingdom
  - Medical Research Council Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
  - Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, The Alfred Hospital, Monash University, Melbourne, Australia
- Päivi Pajukanta
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, Los Angeles, United States
- Noah Zaitlen
  - Department of Medicine, Lung Biology Center, University of California, San Francisco, San Francisco, United States
- Julien F Ayroles
  - Department of Ecology and Evolution, Princeton University, Princeton, United States
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, United States
|
32
|
Wu J, Gu Y, Xiao Y, Xia C, Li H, Kang Y, Sun J, Shao Z, Lin Z, Zhao X. Characterization of DNA Methylation Associated Gene Regulatory Networks During Stomach Cancer Progression. Front Genet 2019; 9:711. [PMID: 30778372 PMCID: PMC6369581 DOI: 10.3389/fgene.2018.00711] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 10/16/2018] [Accepted: 12/18/2018] [Indexed: 01/11/2023]
Abstract
DNA methylation plays a critical role in tumorigenesis through regulating oncogene activation and tumor suppressor gene silencing. Although extensively analyzed, the implication of DNA methylation in gene regulatory network is less characterized. To address this issue, in this study we performed an integrative analysis on the alteration of DNA methylation patterns and the dynamics of gene regulatory network topology across distinct stages of stomach cancer. We found the global DNA methylation patterns in different stages are generally conserved, whereas some significantly differentially methylated genes were exclusively observed in the early stage of stomach cancer. Integrative analysis of DNA methylation and network topology alteration yielded several genes which have been reported to be involved in the progression of stomach cancer, such as IGF2, ERBB2, GSTP1, MYH11, TMEM59, and SST. Finally, we demonstrated that inhibition of SST promotes cell proliferation, suggesting that DNA methylation-associated SST suppression possibly contributes to the gastric cancer progression. Taken together, our study suggests the DNA methylation-associated regulatory network analysis could be used for identifying cancer-related genes. This strategy can facilitate the understanding of gene regulatory network in cancer biology and provide a new insight into the study of DNA methylation at system level.
Affiliation(s)
- Jun Wu
  - School of Life Sciences, East China Normal University, Shanghai, China
- Yunzhao Gu
  - Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Yawen Xiao
  - Department of Automation, Shanghai Jiao Tong University, Shanghai, China
- Chao Xia
  - Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Hua Li
  - Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Yani Kang
  - Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Jielin Sun
  - Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai, China
- Zhifeng Shao
  - Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Zongli Lin
  - Charles L. Brown Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA, United States
- Xiaodong Zhao
  - Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai, China
33
|
Genetic diversity of strawberry germplasm using metabolomic biomarkers. Sci Rep 2018; 8:14386. [PMID: 30258188 PMCID: PMC6158285 DOI: 10.1038/s41598-018-32212-9] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 02/20/2018] [Accepted: 08/23/2018] [Indexed: 12/22/2022]
Abstract
High-throughput metabolomics technologies can provide the quantification of metabolites levels across various biological processes in different tissues, organs and species, allowing the identification of genes underpinning these complex traits. Information about changes of metabolites during strawberry development and ripening processes is key to aiding the development of new approaches to improve fruit attributes. We used network-based methods and multivariate statistical approaches to characterize and investigate variation in the primary and secondary metabolism of seven domesticated and seven wild strawberry fruit accessions at three different fruit development and ripening stages. Our results demonstrated that Fragaria sub-species can be identified solely based on the gathered metabolic profiles. We also showed that domesticated accessions displayed highly similar metabolic changes due to shared domestication history. Differences between domesticated and wild accessions were detected at the level of metabolite associations which served to rank metabolites whose regulation was mostly altered in the process of domestication. The discovery of comprehensive metabolic variation among strawberry accessions offers opportunities to probe into the genetic basis of variation, providing insights into the pathways to relate metabolic variation with important traits.
|
34
|
van Dam S, Võsa U, van der Graaf A, Franke L, de Magalhães JP. Gene co-expression analysis for functional classification and gene-disease predictions. Brief Bioinform 2018; 19:575-592. [PMID: 28077403 PMCID: PMC6054162 DOI: 10.1093/bib/bbw139] [Citation(s) in RCA: 471] [Impact Index Per Article: 67.3] [Received: 09/12/2016] [Revised: 12/01/2016] [Indexed: 01/06/2023]
Abstract
Gene co-expression networks can be used to associate genes of unknown function with biological processes, to prioritize candidate disease genes or to discern transcriptional regulatory programmes. With recent advances in transcriptomics and next-generation sequencing, co-expression networks constructed from RNA sequencing data also enable the inference of functions and disease associations for non-coding genes and splice variants. Although gene co-expression networks typically do not provide information about causality, emerging methods for differential co-expression analysis are enabling the identification of regulatory genes underlying various phenotypes. Here, we introduce and guide researchers through a (differential) co-expression analysis. We provide an overview of methods and tools used to create and analyse co-expression networks constructed from gene expression data, and we explain how these can be used to identify genes with a regulatory role in disease. Furthermore, we discuss the integration of other data types with co-expression networks and offer future perspectives of co-expression analysis.
Affiliation(s)
- Sipko van Dam
  - Department of Genetics, UMCG HPC CB50, RB Groningen, Netherlands
- Urmo Võsa
  - Department of Genetics, UMCG HPC CB50, RB Groningen, Netherlands
- Lude Franke
  - Department of Genetics, UMCG HPC CB50, RB Groningen, Netherlands
|
35
|
Abstract
Wnt signaling is important for breast development and remodeling during pregnancy and lactation. Epigenetic modifications change expression levels of components of the Wnt pathway, underlying oncogenic transformation. However, no clear Wnt component increasing expression universally across breast cancer (BC) or its most Wnt-dependent triple-negative BC (TNBC) subgroup has been identified, delaying development of targeted therapies. Here we perform network correlation analysis of expression of >100 Wnt pathway components in hundreds of healthy and cancerous breast tissues. Varying in expression levels among people, Wnt components remarkably coordinate their production; this coordination is dramatically decreased in BC. Clusters with coordinated gene expression exist within the healthy cohort, highlighting Wnt signaling subtypes. Different BC subgroups are identified, characterized by different remaining Wnt signaling signatures, providing the rationale for patient stratification for personalizing the therapeutic applications. Key pairwise interactions within the Wnt pathway (some inherited and some established de novo) emerge as targets for future drug discovery against BC.
|
36
|
Li C, Liu L, Dinu V. Pathways of topological rank analysis (PoTRA): a novel method to detect pathways involved in hepatocellular carcinoma. PeerJ 2018; 6:e4571. [PMID: 29666752 PMCID: PMC5896492 DOI: 10.7717/peerj.4571] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/03/2017] [Accepted: 03/14/2018] [Indexed: 01/01/2023]
Abstract
Complex diseases such as cancer are usually the result of a combination of environmental factors and one or several biological pathways consisting of sets of genes. Each biological pathway exerts its function by transmitting signals through the gene network. Theoretically, a pathway is supposed to have a robust topological structure under normal physiological conditions. However, the pathway's topological structure could be altered under some pathological condition. It is well known that a normal biological network includes a small number of well-connected hub nodes and a large number of nodes that are non-hubs. In addition, it is reported that the loss of connectivity is a common topological trait of cancer networks, which is an assumption of our method. Hence, from normal to cancer, the process of the network losing connectivity might be the process of disrupting the structure of the network; namely, the number of hub genes might be altered in cancer compared to that in normal, or the distribution of topological ranks of genes might be altered. Based on this, we propose a new PageRank-based method called Pathways of Topological Rank Analysis (PoTRA) to detect pathways involved in cancer. We use PageRank to measure the relative topological ranks of genes in each biological pathway, then select hub genes for each pathway, and use Fisher's exact test to test if the number of hub genes in each pathway is altered from normal to cancer. Alternatively, if the distribution of topological ranks of genes in a pathway is altered between normal and cancer, this pathway might also be involved in cancer. Hence, we use the Kolmogorov-Smirnov test to detect pathways that have an altered distribution of topological ranks of genes between two phenotypes. We apply PoTRA to study hepatocellular carcinoma (HCC) and several subtypes of HCC. Interestingly, we find that all significant pathways in HCC are generally cancer-associated, while several significant pathways in subtypes of HCC are specifically HCC subtype-associated. In conclusion, PoTRA is a new approach to explore and discover pathways involved in cancer. PoTRA can be used as a complement to other existing methods to broaden our understanding of the biological mechanisms behind cancer at the systems level.
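The hub-counting step described in this abstract can be illustrated with a minimal Python sketch. It is not the PoTRA implementation: the hub rule (PageRank score above 1.5 times the pathway mean), the toy star and chain networks, and the function names `pagerank`, `count_hubs` and `fisher_exact_two_sided` are all assumptions made for illustration. The Kolmogorov-Smirnov step is omitted.

```python
from math import comb


def pagerank(edges, nodes, damping=0.85, iters=100):
    """Power-iteration PageRank on an undirected network given as an edge list."""
    neighbors = {v: set() for v in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Rank held by isolated nodes is redistributed uniformly.
        dangling = sum(rank[v] for v in nodes if not neighbors[v])
        updated = {}
        for v in nodes:
            incoming = sum(rank[u] / len(neighbors[u]) for u in neighbors[v])
            updated[v] = (1 - damping) / n + damping * (incoming + dangling / n)
        rank = updated
    return rank


def count_hubs(rank, factor=1.5):
    """Count genes whose rank exceeds factor * mean rank (hypothetical hub rule)."""
    mean_rank = sum(rank.values()) / len(rank)
    return sum(1 for r in rank.values() if r > factor * mean_rank)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Toy pathway: a star-shaped "normal" network and a chain-shaped "cancer" network.
genes = ["g1", "g2", "g3", "g4", "g5"]
normal_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g1", "g5")]
cancer_edges = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g4", "g5")]

hubs_normal = count_hubs(pagerank(normal_edges, genes))
hubs_cancer = count_hubs(pagerank(cancer_edges, genes))
p_value = fisher_exact_two_sided(hubs_normal, len(genes) - hubs_normal,
                                 hubs_cancer, len(genes) - hubs_cancer)
print(hubs_normal, hubs_cancer, p_value)
```

With only five genes the test has essentially no power; the toy input only shows how the contingency table of hub versus non-hub counts is formed for the two phenotypes.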
Affiliation(s)
- Chaoxing Li
  - School of Life Sciences, Arizona State University, Tempe, AZ, United States of America
- Li Liu
  - Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ, United States of America
- Valentin Dinu
  - Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ, United States of America

37
Han R, Huang G, Wang Y, Xu Y, Hu Y, Jiang W, Wang T, Xiao T, Zheng D. Increased gene expression noise in human cancers is correlated with low p53 and immune activities as well as late stage cancer. Oncotarget 2018; 7:72011-72020. [PMID: 27713130 PMCID: PMC5342140 DOI: 10.18632/oncotarget.12457] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 07/26/2016] [Accepted: 09/29/2016] [Indexed: 01/19/2023] Open
Abstract
Gene expression in metazoans is delicately organized. As genetic information transmits from DNA to RNA and protein, expression noise is inevitably generated. Recent studies have begun to unveil the mechanisms of gene expression noise control, but the changes of gene expression precision in pathologic conditions like cancers are unknown. Here we analyzed the transcriptomic data of human breast, liver, lung and colon cancers, and found that the expression noise of more than 74.9% of genes was increased in cancer tissues as compared to adjacent normal tissues. This suggested that the control of gene expression precision collapsed during cancer development. A set of 269 genes with noise increased more than 2-fold was identified across different cancer types. These genes were involved in cell adhesion, catalytic and metabolic functions, implying the vulnerability of these processes to deregulation in cancers. We also observed a tendency of increased expression noise in patients with low p53 and immune activity in breast, liver and lung cancers but not in colon cancers, which indicated the contributions of p53 signaling and host immune surveillance to gene expression noise in cancers. Moreover, more than 53.7% of genes had increased noise in patients with late-stage cancers compared with early-stage cancers, suggesting that gene expression precision was associated with cancer outcome. Together, these results provided genomic-scale explorations of gene expression noise control in human cancers.
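The tumor-versus-normal noise comparison can be sketched as follows. The abstract does not state how noise was quantified, so this sketch assumes the coefficient of variation (standard deviation divided by mean) across samples; the function names and the toy expression values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Noise proxy (an assumption here): sample standard deviation over the mean."""
    return stdev(values) / mean(values)


def fraction_with_increased_noise(tumor, normal, fold=1.0):
    """Fraction of shared genes whose tumor noise exceeds fold * normal noise."""
    shared = sorted(set(tumor) & set(normal))
    increased = [
        gene for gene in shared
        if coefficient_of_variation(tumor[gene])
        > fold * coefficient_of_variation(normal[gene])
    ]
    return len(increased) / len(shared)


# Hypothetical expression values per gene across three samples.
tumor_expr = {"geneA": [1.0, 5.0, 9.0], "geneB": [4.0, 5.0, 6.0]}
normal_expr = {"geneA": [4.0, 5.0, 6.0], "geneB": [4.0, 5.0, 6.0]}

print(fraction_with_increased_noise(tumor_expr, normal_expr))
print(fraction_with_increased_noise(tumor_expr, normal_expr, fold=2.0))
```

Setting `fold=2.0` corresponds to the "more than 2-fold" criterion mentioned in the abstract. Genes with a mean expression of zero would need to be filtered out first, since the ratio is undefined for them.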
Affiliation(s)
- Rongfei Han
  - Shenzhen Key Laboratory of Translational Medicine of Tumor, Department of Cell Biology and Genetics, Shenzhen University Health Sciences Center, Shenzhen, Guangdong, 518060, P.R.China
- Guanqun Huang
  - Shenzhen Key Laboratory of Translational Medicine of Tumor, Department of Cell Biology and Genetics, Shenzhen University Health Sciences Center, Shenzhen, Guangdong, 518060, P.R.China
- Yejun Wang
  - Shenzhen Key Laboratory of Translational Medicine of Tumor, Department of Cell Biology and Genetics, Shenzhen University Health Sciences Center, Shenzhen, Guangdong, 518060, P.R.China
- Yafei Xu
  - Shenzhen Key Laboratory of Translational Medicine of Tumor, Department of Cell Biology and Genetics, Shenzhen University Health Sciences Center, Shenzhen, Guangdong, 518060, P.R.China
- Yueming Hu
  - Shenzhen Key Laboratory of Translational Medicine of Tumor, Department of Cell Biology and Genetics, Shenzhen University Health Sciences Center, Shenzhen, Guangdong, 518060, P.R.China
- Wenqi Jiang
  - Shenzhen Key Laboratory of Translational Medicine of Tumor, Department of Cell Biology and Genetics, Shenzhen University Health Sciences Center, Shenzhen, Guangdong, 518060, P.R.China
- Tianfu Wang
  - Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Health Sciences Center, Shenzhen, Guangdong, 518060, P.R.China
- Tian Xiao
  - Shenzhen Key Laboratory of Translational Medicine of Tumor, Department of Cell Biology and Genetics, Shenzhen University Health Sciences Center, Shenzhen, Guangdong, 518060, P.R.China
- Duo Zheng
  - Shenzhen Key Laboratory of Translational Medicine of Tumor, Department of Cell Biology and Genetics, Shenzhen University Health Sciences Center, Shenzhen, Guangdong, 518060, P.R.China

38
Shilts J, Chen G, Hughey JJ. Evidence for widespread dysregulation of circadian clock progression in human cancer. PeerJ 2018; 6:e4327. [PMID: 29404219 PMCID: PMC5797448 DOI: 10.7717/peerj.4327] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Received: 09/04/2017] [Accepted: 01/15/2018] [Indexed: 12/12/2022] Open
Abstract
The ubiquitous daily rhythms in mammalian physiology are guided by progression of the circadian clock. In mice, systemic disruption of the clock can promote tumor growth. In vitro, multiple oncogenes can disrupt the clock. However, due to the difficulties of studying circadian rhythms in solid tissues in humans, whether the clock is disrupted within human tumors has remained unknown. We sought to determine the state of the circadian clock in human cancer using publicly available transcriptome data. We developed a method, called the clock correlation distance (CCD), to infer circadian clock progression in a group of samples based on the co-expression of 12 clock genes. Our method can be applied to modestly sized datasets in which samples are not labeled with time of day and coverage of the circadian cycle is incomplete. We used the method to define a signature of clock gene co-expression in healthy mouse organs, then validated the signature in healthy human tissues. By then comparing human tumor and non-tumor samples from twenty datasets of a range of cancer types, we discovered that clock gene co-expression in tumors is consistently perturbed. Subsequent analysis of data from clock gene knockouts in mice suggested that perturbed clock gene co-expression in human cancer is not caused solely by the inactivation of clock genes. Furthermore, focusing on lung cancer, we found that human lung tumors showed systematic changes in expression in a large set of genes previously inferred to be rhythmic in healthy lung. Our findings suggest that clock progression is dysregulated in many solid human cancers and that this dysregulation could have broad effects on circadian physiology within tumors. In addition, our approach opens the door to using publicly available data to infer circadian clock progression in a multitude of human phenotypes.
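A distance between clock-gene co-expression patterns, in the spirit of the CCD described above, can be sketched in Python. The abstract does not give the formula, so this sketch assumes the distance is the Euclidean distance between the pairwise Spearman correlations of a reference group and a test group; the gene labels and values are hypothetical.

```python
from math import sqrt


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))


def correlation_distance(reference, test, genes):
    """Euclidean distance between the two groups' pairwise gene correlations."""
    total = 0.0
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            diff = (spearman(reference[gene_a], reference[gene_b])
                    - spearman(test[gene_a], test[gene_b]))
            total += diff ** 2
    return sqrt(total)


# Hypothetical expression of two clock genes across four samples per group.
reference_expr = {"clockA": [1, 2, 3, 4], "clockB": [2, 4, 6, 9]}
tumor_expr = {"clockA": [1, 2, 3, 4], "clockB": [9, 6, 4, 2]}
print(correlation_distance(reference_expr, tumor_expr, ["clockA", "clockB"]))
```

Because only correlations between genes are compared, the samples do not need time-of-day labels, which matches the setting the abstract describes.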
Affiliation(s)
- Jarrod Shilts
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, United States of America
- Guanhua Chen
  - Department of Biostatistics & Medical Informatics, University of Wisconsin-Madison, Madison, WI, United States of America
- Jacob J Hughey
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, United States of America
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN, United States of America

39
Gonzalez-Valbuena EE, Treviño V. Metrics to estimate differential co-expression networks. BioData Min 2017; 10:32. [PMID: 29151892 PMCID: PMC5681815 DOI: 10.1186/s13040-017-0152-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 05/25/2017] [Accepted: 10/30/2017] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND: Detecting the differences in gene expression data is important for understanding the underlying molecular mechanisms. Although the differentially expressed genes are a large component, differences in correlation are becoming an interesting approach to achieving deeper insights. However, diverse metrics have been used to detect differential correlation, making selection and use of a single metric difficult. In addition, available implementations are metric-specific, complicating their use in different contexts. Moreover, because the analyses in the literature have been performed on real data, there are uncertainties regarding the performance of metrics and procedures. RESULTS: In this work, we compare four novel and two previously proposed metrics to detect differential correlations. We generated well-controlled datasets into which differences in correlations were carefully introduced by controlled multivariate normal correlation networks and addition of noise. The comparisons were performed on three datasets derived from real tumor data. Our results show that metrics differ in their detection performance and computational time. No single metric was the best in all datasets, but trends show that three metrics are highly correlated and are very good candidates for real data analysis. In contrast, other metrics proposed in the literature seem to show low performance and different detections. Overall, our results suggest that metrics that do not filter correlations perform better. We also show an additional analysis of TCGA breast cancer subtypes. CONCLUSIONS: We show a methodology to generate controlled datasets for the objective evaluation of differential correlation pipelines, and compare the performance of several metrics. We implemented in R a package called DifCoNet that can provide easy-to-use functions for differential correlation analyses.
Affiliation(s)
- Víctor Treviño
  - Cátedra de Bioinformática, Escuela de Medicina, Tecnológico de Monterrey, 64710 Monterrey, Nuevo León, Mexico

40
Dong LY, Zhou WZ, Ni JW, Xiang W, Hu WH, Yu C, Li HY. Identifying the optimal gene and gene set in hepatocellular carcinoma based on differential expression and differential co-expression algorithm. Oncol Rep 2016; 37:1066-1074. [DOI: 10.3892/or.2016.5333] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/23/2016] [Accepted: 08/10/2016] [Indexed: 11/06/2022] Open

41
He F, Maslov S. Pan- and core- network analysis of co-expression genes in a model plant. Sci Rep 2016; 6:38956. [PMID: 27982071 PMCID: PMC5159811 DOI: 10.1038/srep38956] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 08/09/2016] [Accepted: 11/14/2016] [Indexed: 01/18/2023] Open
Abstract
Genome-wide gene expression experiments have been performed using the model plant Arabidopsis during the last decade. Some studies involved construction of coexpression networks, a popular technique used to identify groups of co-regulated genes, to infer unknown gene functions. One approach is to construct a single coexpression network by combining multiple expression datasets generated in different labs. We advocate a complementary approach in which we construct a large collection of 134 coexpression networks based on expression datasets reported in individual publications. To this end we reanalyzed public expression data. To describe this collection of networks we introduced the concepts of 'pan-network' and 'core-network', representing the union and intersection, respectively, of a sizeable fraction of the individual networks. We showed that these two types of networks are different both in terms of their topology and the biological function of interacting genes. For example, the modules of the pan-network are enriched in regulatory and signaling functions, while the modules of the core-network tend to include components of large macromolecular complexes such as ribosomes and photosynthetic machinery. Our analysis aims to help the plant research community better explore the information contained within the existing vast collection of gene expression data in Arabidopsis.
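One way to read the pan-network and core-network definitions is in terms of how often each edge recurs across the collection of networks. The sketch below is an interpretation under that assumption, not the authors' procedure: an edge enters the pan-network if it appears in at least a small fraction of networks and the core-network if it appears in at least a large fraction. The thresholds, the function name `pan_and_core` and the toy networks are hypothetical.

```python
from collections import Counter


def pan_and_core(networks, pan_fraction, core_fraction):
    """Split edges by how often they recur across a collection of networks.

    networks: list of edge collections; each edge is an unordered gene pair.
    Returns (pan_edges, core_edges) as sets of frozensets.
    """
    counts = Counter()
    for edges in networks:
        # Deduplicate within a network and ignore edge direction.
        counts.update({frozenset(edge) for edge in edges})
    total = len(networks)
    pan = {edge for edge, c in counts.items() if c / total >= pan_fraction}
    core = {edge for edge, c in counts.items() if c / total >= core_fraction}
    return pan, core


# Four toy coexpression networks over genes a-d.
toy_networks = [
    [("a", "b"), ("b", "c")],
    [("b", "a"), ("c", "d")],
    [("a", "b")],
    [("a", "b"), ("b", "c")],
]
pan_edges, core_edges = pan_and_core(toy_networks, pan_fraction=0.25,
                                     core_fraction=0.75)
print(sorted(sorted(edge) for edge in pan_edges))
print(sorted(sorted(edge) for edge in core_edges))
```

Whenever `core_fraction` is at least `pan_fraction`, the core-network is a subset of the pan-network, which mirrors the intersection-versus-union relationship in the abstract.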
Affiliation(s)
- Fei He
  - Biology Department, Brookhaven National Laboratory, Upton, NY 11973, USA
- Sergei Maslov
  - Biology Department, Brookhaven National Laboratory, Upton, NY 11973, USA
  - Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA

42
Wang L, Ma H, Zhu L, Ma L, Cao L, Wei H, Xu J. Screening for the optimal gene and functional gene sets related to breast cancer using differential co-expression and differential expression analysis. Cancer Biomark 2016; 17:463-471. [PMID: 27802197 DOI: 10.3233/cbm-160663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
OBJECTIVE: To investigate novel gene sets related to breast cancer (BC) using differential co-expression and differential expression (DECODE). METHODS: The t statistic was used to quantify the degree of DE of each gene, and the Z statistic was then used to quantify the correlation difference between the expression levels of two genes. Two optimal thresholds for defining substantial change in DE and DC were selected for each gene using chi-square maximization, and the corresponding gene was defined as the optimal gene. Based on the optimal thresholds, genes were categorized into four partitions with either high or low DC and DE characteristics. Finally, we evaluated the functional relevance of a gene partition with high DE and high DC, and the gene set with the best association was considered the optimal functional gene set. RESULTS: The optimal thresholds for DC and DE were 2.254 and 1.616, respectively, and the optimal gene was UBE2Q2L. Based on the optimal thresholds, genes were divided into four partitions: HDE-HDC (875 genes), HDE-LDC (8038 genes), LDE-HDC (678 genes), and LDE-LDC (10516 genes). The best associated gene set was "fatty acid catabolic process" with 34 HDC and HDE partitions. Among these partitions, UBE2Q2L attained the highest minimum FI gain of 18.973. CONCLUSION: UBE2Q2L and fatty acid catabolic process might be potentially useful signatures for diagnostic purposes in BC.
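The Z statistic for a correlation difference and the four-way partition can be sketched as follows. The abstract does not give the formula for Z, so this sketch assumes the standard Fisher z-transformation test for two independent correlations. How per-pair Z values are aggregated into a per-gene DC score is also not stated, so `partition` simply takes a DE score and a DC score supplied by the caller. Only the two default thresholds come from the abstract.

```python
from math import atanh, sqrt


def correlation_difference_z(r1, n1, r2, n2):
    """Z score for the difference between two independent correlations.

    Uses the Fisher z-transformation (an assumption; the abstract only names
    the statistic "Z"). r1 and r2 are correlations of the same gene pair in two
    conditions; n1 and n2 are the sample sizes, each of which must exceed 3.
    """
    return (atanh(r1) - atanh(r2)) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))


def partition(de_score, dc_score, de_threshold=1.616, dc_threshold=2.254):
    """Label a gene HDE/LDE and HDC/LDC using the thresholds from the abstract."""
    de_label = "HDE" if abs(de_score) >= de_threshold else "LDE"
    dc_label = "HDC" if abs(dc_score) >= dc_threshold else "LDC"
    return f"{de_label}-{dc_label}"


# Hypothetical gene pair: strongly correlated in tumors, weakly in controls.
z_score = correlation_difference_z(0.8, 28, 0.2, 28)
print(round(z_score, 3))
print(partition(de_score=2.0, dc_score=z_score))
```

The defaults make the labels directly comparable with the partition names in the abstract; different data would call for re-deriving both thresholds rather than reusing these values.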
Affiliation(s)
- Lei Wang
  - Department of Science and Education, The People's Hospital of Zhangqiu, Zhangqiu, Shandong, China
- Hong Ma
  - Pharmacy Intravenous Admixture Service, The People's Hospital of Zhangqiu, Zhangqiu, Shandong, China
- Lixia Zhu
  - Department of Neurosurgery, The People's Hospital of Zhangqiu, Zhangqiu, Shandong, China
- Liping Ma
  - Department of Science and Education, The People's Hospital of Zhangqiu, Zhangqiu, Shandong, China
- Lanting Cao
  - Department of Cardiology, The People's Hospital of Zhangqiu, Zhangqiu, Shandong, China
- Hui Wei
  - Department of General Surgery, The People's Hospital of Zhangqiu, Zhangqiu, Shandong, China
- Jumei Xu
  - Department of General Surgery, The People's Hospital of Zhangqiu, Zhangqiu, Shandong, China

43
Koumakis L, Kanterakis A, Kartsaki E, Chatzimina M, Zervakis M, Tsiknakis M, Vassou D, Kafetzopoulos D, Marias K, Moustakis V, Potamias G. MinePath: Mining for Phenotype Differential Sub-paths in Molecular Pathways. PLoS Comput Biol 2016; 12:e1005187. [PMID: 27832067 PMCID: PMC5104320 DOI: 10.1371/journal.pcbi.1005187] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 02/29/2016] [Accepted: 10/10/2016] [Indexed: 01/04/2023] Open
Abstract
Pathway analysis methodologies couple traditional gene expression analysis with knowledge encoded in established molecular pathway networks, offering a promising approach towards the biological interpretation of phenotype-differentiating genes. Early pathway analysis methodologies, known as gene set analysis (GSA), view pathways just as plain lists of genes without taking into account either the underlying pathway network topology or the involved gene regulatory relations. These approaches, even if they achieve computational efficiency and simplicity, consider pathways that involve the same genes as equivalent in terms of their gene enrichment characteristics. More recent pathway analysis approaches take into account the underlying gene regulatory relations by examining their consistency with gene expression profiles and computing a score for each profile. Even with this approach, assessing and scoring single relations limits the ability to reveal key gene regulation mechanisms hidden in longer pathway sub-paths. We introduce MinePath, a pathway analysis methodology that addresses and overcomes the aforementioned problems. MinePath facilitates the decomposition of pathways into their constituent sub-paths. Decomposition leads to the transformation of single relations to complex regulation sub-paths. Regulation sub-paths are then matched with gene expression sample profiles in order to evaluate their functional status and to assess phenotype differential power. Assessment of differential power supports the identification of the most discriminant profiles. In addition, MinePath assesses the significance of the pathways as a whole, ranking them by their p-values. Comparison results with state-of-the-art pathway analysis systems are indicative of the soundness and reliability of the MinePath approach. In contrast with many pathway analysis tools, MinePath is a web-based system (www.minepath.org) offering dynamic and rich pathway visualization functionality, with the unique characteristic to color regulatory relations between genes and reveal their phenotype inclination. This unique characteristic makes MinePath a valuable tool for in silico molecular biology experimentation as it serves the biomedical researchers' exploratory needs to reveal and interpret the regulatory mechanisms that underlie and putatively govern the expression of target phenotypes.
Affiliation(s)
- Lefteris Koumakis
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
- Alexandros Kanterakis
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
- Evgenia Kartsaki
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
- Maria Chatzimina
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
- Michalis Zervakis
  - School of Electrical and Computer Engineering, Technical University of Crete, Greece
- Manolis Tsiknakis
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
  - Department of Informatics Engineering, Technological Educational Institute of Crete, Greece
- Despoina Vassou
  - Institute of Molecular Biology & Biotechnology, FORTH, Heraklion, Crete, Greece
- Kostas Marias
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
- Vassilis Moustakis
  - School of Production Engineering & Management, Technical University of Crete, Greece
- George Potamias
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece

44
Differential Regulatory Analysis Based on Coexpression Network in Cancer Research. BIOMED RESEARCH INTERNATIONAL 2016; 2016:4241293. [PMID: 27597964 PMCID: PMC4997028 DOI: 10.1155/2016/4241293] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 04/14/2016] [Revised: 06/09/2016] [Accepted: 06/12/2016] [Indexed: 12/15/2022]
Abstract
With the rapid development of high-throughput techniques and the accumulation of big transcriptomic data, many computational methods and algorithms, such as differential analysis and network analysis, have been proposed to explore genome-wide gene expression characteristics. These efforts aim to transform underlying genomic information into valuable knowledge in biological and medical research fields. Recently, many integrative research methods have been dedicated to interpreting the development and progression of neoplastic diseases, and differential regulatory analysis (DRA) based on gene coexpression network (GCN) increasingly serves as a robust complement to regular differential expression analysis in revealing regulatory functions of cancer-related genes, such as evading growth suppressors and resisting cell death. Differential regulatory analysis based on GCN is promising and plays an essential role in discovering the system properties of carcinogenesis features. Here we briefly review the paradigm of differential regulatory analysis based on GCN. We also focus on the applications of differential regulatory analysis based on GCN in cancer research and point out that DRA is necessary and exceptionally useful for revealing underlying molecular mechanisms in large-scale carcinogenesis studies.

45
He F, Karve AA, Maslov S, Babst BA. Large-Scale Public Transcriptomic Data Mining Reveals a Tight Connection between the Transport of Nitrogen and Other Transport Processes in Arabidopsis. FRONTIERS IN PLANT SCIENCE 2016; 7:1207. [PMID: 27563305 PMCID: PMC4981021 DOI: 10.3389/fpls.2016.01207] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/20/2016] [Accepted: 07/29/2016] [Indexed: 05/29/2023]
Abstract
Movement of nitrogen to the plant tissues where it is needed for growth is an important contribution to nitrogen use efficiency. However, we have very limited knowledge about the mechanisms of nitrogen transport. Loading of nitrogen into the xylem and/or phloem by transporter proteins is likely important, but there are several families of genes that encode transporters of nitrogenous molecules (collectively referred to as N transporters here), each comprised of many gene members. In this study, we leveraged publicly available microarray data of Arabidopsis to investigate the gene networks of N transporters to elucidate their possible biological roles. First, we showed that tissue-specificity of nitrogen (N) transporters was well reflected among the public microarray data. Then, we built coexpression networks of N transporters, which showed relationships between N transporters and particular aspects of plant metabolism, such as phenylpropanoid biosynthesis and carbohydrate metabolism. Furthermore, genes associated with several biological pathways were found to be tightly coexpressed with N transporters in different tissues. Our coexpression networks provide information at the systems-level that will serve as a resource for future investigation of nitrogen transport systems in plants, including candidate gene clusters that may work together in related biological roles.
Affiliation(s)
- Fei He
  - Biological, Environmental and Climate Sciences Department, Brookhaven National Laboratory, Upton, NY, USA
- Abhijit A. Karve
  - Biological, Environmental and Climate Sciences Department, Brookhaven National Laboratory, Upton, NY, USA
  - Purdue Research Foundation, West Lafayette, IN, USA
- Sergei Maslov
  - Biological, Environmental and Climate Sciences Department, Brookhaven National Laboratory, Upton, NY, USA
  - Department of Bioengineering, Carl R. Woese Institute for Genomic Biology, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Benjamin A. Babst
  - Biological, Environmental and Climate Sciences Department, Brookhaven National Laboratory, Upton, NY, USA
  - Arkansas Forest Resources Center, The University of Arkansas at Monticello, Monticello, AR, USA

46
Creanza TM, Liguori M, Liuni S, Nuzziello N, Ancona N. Meta-Analysis of Differential Connectivity in Gene Co-Expression Networks in Multiple Sclerosis. Int J Mol Sci 2016; 17:E936. [PMID: 27314336 PMCID: PMC4926469 DOI: 10.3390/ijms17060936] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/01/2016] [Revised: 05/09/2016] [Accepted: 05/24/2016] [Indexed: 12/20/2022] Open
Abstract
Differential gene expression analyses to investigate multiple sclerosis (MS) molecular pathogenesis cannot detect genes harboring genetic and/or epigenetic modifications that change the gene functions without affecting their expression. Differential co-expression network approaches may capture changes in functional interactions resulting from these alterations. We re-analyzed 595 mRNA arrays from publicly available datasets by studying changes in gene co-expression networks in MS and in response to interferon (IFN)-β treatment. Interestingly, MS networks show a reduced connectivity relative to the healthy condition, and the treatment activates the transcription of genes and increases their connectivity in MS patients. Importantly, the analysis of changes in gene connectivity in MS patients provides new evidence of association for genes already implicated in MS by single-nucleotide polymorphism studies and that do not show differential expression. This is the case of amiloride-sensitive cation channel 1 neuronal (ACCN1) that shows a reduced number of interacting partners in MS networks, and it is known for its role in synaptic transmission and central nervous system (CNS) development. Furthermore, our study confirms a deregulation of the vitamin D system: among the transcription factors that potentially regulate the deregulated genes, we find TCF3 and SP1 that are both involved in vitamin D3-induced p27Kip1 expression. Unveiling differential network properties allows us to gain systems-level insights into disease mechanisms and may suggest putative targets for the treatment.
Affiliation(s)
- Teresa Maria Creanza
  - Institute of Intelligent Systems for Automation, National Research Council of Italy, 70126 Bari, Italy.
  - Center for Complex Systems in Molecular Biology and Medicine, University of Turin, 10123 Turin, Italy.
- Maria Liguori
  - Institute of Biomedical Technologies, National Research Council of Italy, 70126 Bari, Italy.
- Sabino Liuni
  - Institute of Biomedical Technologies, National Research Council of Italy, 70126 Bari, Italy.
- Nicoletta Nuzziello
  - Institute of Biomedical Technologies, National Research Council of Italy, 70126 Bari, Italy.
  - Department of Basic Medical Sciences, Neuroscience and Sense Organs, University of Bari, 70126 Bari, Italy.
- Nicola Ancona
  - Institute of Intelligent Systems for Automation, National Research Council of Italy, 70126 Bari, Italy.

47
Gomez-Rueda H, Palacios-Corona R, Gutiérrez-Hermosillo H, Trevino V. A robust biomarker of differential correlations improves the diagnosis of cytologically indeterminate thyroid cancers. Int J Mol Med 2016; 37:1355-62. [PMID: 27035928 DOI: 10.3892/ijmm.2016.2534] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 10/03/2015] [Accepted: 02/23/2016] [Indexed: 11/05/2022] Open
Abstract
The fine-needle aspiration of thyroid nodules and subsequent cytological analysis is unable to determine the diagnosis in 15 to 30% of thyroid cancer cases; patients with indeterminate cytological results undergo diagnostic surgery which is potentially unnecessary. Current gene expression biomarkers based on well-determined cytology are complex and their accuracy is inconsistent across public datasets. In the present study, we identified a robust biomarker using the differences in gene expression values specifically from cytologically indeterminate thyroid tumors and a powerful multivariate search tool coupled with a nearest centroid classifier. The biomarker is based on differences in the expression of the following genes: CCND1, CLDN16, CPE, LRP1B, MAGI3, MAPK6, MATN2, MPPED2, PFKFB2, PTPRE, PYGL, SEMA3D, SERGEF, SLC4A4 and TIMP1. This 15-gene biomarker exhibited superior accuracy independently of the cytology in six datasets, including The Cancer Genome Atlas (TCGA) thyroid dataset. In addition, this biomarker exhibited differences in the correlation coefficients between benign and malignant samples that indicate its discriminatory power, and these 15 genes have been previously related to cancer in the literature. Thus, this 15-gene biomarker provides advantages in clinical practice for the effective diagnosis of thyroid cancer.
Affiliation(s)
- Hugo Gomez-Rueda
  - Bioinformatics Research Group, Department of Research and Innovation, Medical School, Tecnológico de Monterrey, Colonia Los Doctores, 64710 Monterrey, Nuevo León, Mexico
- Rebeca Palacios-Corona
  - Northeastern Biomedical Research Center, Instituto Mexicano del Seguro Social, Colonia Independencia, 64720 Monterrey, Nuevo León, Mexico
- Hugo Gutiérrez-Hermosillo
  - Department of Geriatrics, UMAE 1 CMN del Bajío, Instituto Mexicano del Seguro Social, Hospital Aranda de la Parra, Colonia Centro, 37000 León, Guanajuato, Mexico
- Victor Trevino
  - Bioinformatics Research Group, Department of Research and Innovation, Medical School, Tecnológico de Monterrey, Colonia Los Doctores, 64710 Monterrey, Nuevo León, Mexico

48
Kaushik A, Bhatia Y, Ali S, Gupta D. Gene Network Rewiring to Study Melanoma Stage Progression and Elements Essential for Driving Melanoma. PLoS One 2015; 10:e0142443. [PMID: 26558755 PMCID: PMC4641706 DOI: 10.1371/journal.pone.0142443] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 07/20/2015] [Accepted: 10/21/2015] [Indexed: 01/19/2023] Open
Abstract
Metastatic melanoma patients have a poor prognosis, mainly attributable to the underlying heterogeneity in melanoma driver genes and altered gene expression profiles. These characteristics of melanoma also make the development of drugs and identification of novel drug targets for metastatic melanoma a daunting task. Systems biology offers an alternative approach to re-explore the genes or gene sets that display dysregulated behaviour without being differentially expressed. In this study, we have performed systems biology studies to enhance our knowledge about the conserved property of disease genes or gene sets among mutually exclusive datasets representing melanoma progression. We meta-analysed 642 microarray samples to generate melanoma reconstructed networks representing four different stages of melanoma progression to extract genes with altered molecular circuitry wiring as compared to a normal cellular state. Intriguingly, a majority of the melanoma network-rewired genes are not differentially expressed, and the disease genes involved in melanoma progression consistently modulate their activity by rewiring network connections. We found that the shortlisted disease genes in the study show strong and abnormal network connectivity, which increases with disease progression. Moreover, the deviated network properties of the disease gene sets allow ranking/prioritization of different enriched, dysregulated and conserved pathway terms in metastatic melanoma, in agreement with previous findings. Our analysis also reveals the presence of distinct network hubs in different stages of metastasizing tumor for the same set of pathways in the statistically conserved gene sets. The study results are also presented as a freely available database at http://bioinfo.icgeb.res.in/m3db/. The web-based database resource consists of results from the analysis presented here, integrated with cytoscape web and user-friendly tools for visualization, retrieval and further analysis.
Affiliation(s)
- Abhinav Kaushik: Bioinformatics Laboratory, Structural and Computational Biology Group, International Centre for Genetic Engineering and Biotechnology, New Delhi, 110067, India
- Yashuma Bhatia: Bioinformatics Laboratory, Structural and Computational Biology Group, International Centre for Genetic Engineering and Biotechnology, New Delhi, 110067, India
- Shakir Ali: Department of Biochemistry, Faculty of Science, Jamia Hamdard, New Delhi, 110062, India
- Dinesh Gupta: Bioinformatics Laboratory, Structural and Computational Biology Group, International Centre for Genetic Engineering and Biotechnology, New Delhi, 110067, India
49
Xu J, Li Y, Lu J, Pan T, Ding N, Wang Z, Shao T, Zhang J, Wang L, Li X. The mRNA related ceRNA-ceRNA landscape and significance across 20 major cancer types. Nucleic Acids Res 2015; 43:8169-82. [PMID: 26304537 PMCID: PMC4787795 DOI: 10.1093/nar/gkv853] [Citation(s) in RCA: 149] [Impact Index Per Article: 14.9] [Received: 02/12/2015] [Accepted: 08/11/2015] [Indexed: 12/14/2022]
Abstract
Cross-talk between competitive endogenous RNAs (ceRNAs) through shared miRNAs represents a novel layer of gene regulation that plays important roles in the physiology and development of cancers. However, a global view of their system-level properties across various types of cancers is still lacking. Here, we constructed the mRNA-related ceRNA–ceRNA interaction landscape across 20 cancer types by systematically analyzing molecular profiles of 5203 tumors and miRNA regulations. Our study highlights the conserved features shared across cancers and the higher similarity among cancers with a similar cell type of origin. Moreover, a core ceRNA network was identified. Functional analysis identified cancer hallmarks as a common theme, although they exhibit phenotype-specific connectivity patterns. In addition, we found a marked rewiring in the ceRNA program between various cancers, and further revealed conserved and rewired network ceRNA hubs in each cancer, whose intensely competitive interactions constitute conserved and cancer-specific modules. By providing mechanistic linkage between known cancer miRNAs, their mediated ceRNA–ceRNA interactions, and the associations with known cancer hallmarks, the inferred cancer ceRNA–ceRNA interaction landscape will serve as a powerful public resource for further biological discoveries of tumorigenesis.
Affiliation(s)
- Juan Xu: College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yongsheng Li: College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Jianping Lu: College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Tao Pan: College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Na Ding: College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Zishan Wang: College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Tingting Shao: College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Jinwen Zhang: College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Lihua Wang: Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin 150081, Heilongjiang Province, China
- Xia Li: College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
50
Lui TWH, Tsui NBY, Chan LWC, Wong CSC, Siu PMF, Yung BYM. DECODE: an integrated differential co-expression and differential expression analysis of gene expression data. BMC Bioinformatics 2015; 16:182. [PMID: 26026612 PMCID: PMC4449974 DOI: 10.1186/s12859-015-0582-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 10/13/2014] [Accepted: 04/22/2015] [Indexed: 01/30/2023]
Abstract
BACKGROUND: Both differential expression (DE) and differential co-expression (DC) analyses are appreciated as useful tools in understanding gene regulation related to complex diseases. The performance of integrating DE and DC, however, remains unexplored.
RESULTS: In this study, we proposed a novel analytical approach called DECODE (Differential Co-expression and Differential Expression) to integrate DC and DE analyses of gene expression data. DECODE allows one to study the combined features of DC and DE of each transcript between two conditions. By incorporating information on the dependency between DC and DE variables, two optimal thresholds for defining substantial change in expression and co-expression are systematically defined for each gene based on chi-square maximization. By using these thresholds, genes can be categorized into four groups with either high or low DC and DE characteristics. In this study, DECODE was applied to a large breast cancer microarray data set consisting of two thousand tumor samples. By identifying genes with high DE and high DC, we demonstrated that DECODE could improve the detection of some functional gene sets, such as those related to the immune system, metastasis, and lipid and glucose metabolism. Further investigation of the identified genes and the associated functional pathways would provide an additional level of understanding of complex disease mechanisms.
CONCLUSIONS: By complementing the recent DC and the traditional DE analyses, DECODE is a valuable methodology for investigating biological functions of genes exhibiting disease-associated DE and DC combined characteristics, which may not be easily revealed through a DC or DE approach alone. DECODE is available at the Comprehensive R Archive Network (CRAN): http://cran.r-project.org/web/packages/decode/index.html.
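The threshold selection described in this abstract can be illustrated with a minimal Python sketch. It is not the DECODE package's code (DECODE itself is an R package), and the abstract gives no implementation detail, so everything below is an assumption: the inputs are taken to be two equal-length score vectors (`de` and `dc`, hypothetical names), candidate thresholds are simply the observed absolute values, and the function names `chi_square_2x2`, `best_thresholds` and `categorise` are invented for illustration. The sketch searches for the pair of cut-offs whose high/low split gives the largest Pearson chi-square statistic on the resulting 2x2 table, then assigns each item to one of the four groups.

```python
import numpy as np


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # an empty row or column: no association can be measured
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def best_thresholds(de, dc):
    """Exhaustively search the (DE, DC) cut-offs that maximise chi-square.

    Returns (statistic, de_threshold, dc_threshold). Candidate cut-offs are
    the observed absolute scores, which is an assumption of this sketch.
    """
    de = np.abs(np.asarray(de, dtype=float))
    dc = np.abs(np.asarray(dc, dtype=float))
    best = (0.0, None, None)
    for t_de in np.unique(de):
        hi_de = de >= t_de
        for t_dc in np.unique(dc):
            hi_dc = dc >= t_dc
            # Cell counts of the high/low DE x high/low DC table.
            a = int(np.sum(hi_de & hi_dc))
            b = int(np.sum(hi_de & ~hi_dc))
            c = int(np.sum(~hi_de & hi_dc))
            d = int(np.sum(~hi_de & ~hi_dc))
            stat = chi_square_2x2(a, b, c, d)
            if stat > best[0]:
                best = (stat, float(t_de), float(t_dc))
    return best


def categorise(de, dc, t_de, t_dc):
    """Label each item as high/low DE and high/low DC using the cut-offs."""
    de = np.abs(np.asarray(de, dtype=float))
    dc = np.abs(np.asarray(dc, dtype=float))
    return [
        f"{'high' if x >= t_de else 'low'} DE, {'high' if y >= t_dc else 'low'} DC"
        for x, y in zip(de, dc)
    ]


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    de_scores = [0.1, 0.2, 0.3, 2.0, 2.5, 3.0]
    dc_scores = [0.1, 0.2, 0.1, 1.5, 2.0, 1.8]
    stat, t_de, t_dc = best_thresholds(de_scores, dc_scores)
    print(stat, t_de, t_dc)
    print(categorise(de_scores, dc_scores, t_de, t_dc))
```

The exhaustive grid search is quadratic in the number of distinct scores, which is acceptable for a toy example; a real analysis over thousands of genes would likely need a coarser candidate grid or the published R implementation.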
Affiliation(s)
- Thomas W H Lui: Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
- Nancy B Y Tsui: Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
- Lawrence W C Chan: Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
- Cesar S C Wong: Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
- Parco M F Siu: Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
- Benjamin Y M Yung: Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong