1
Piergiorge RM, da Silva Francisco Junior R, de Vasconcelos ATR, Santos-Rebouças CB. Multi-layered transcriptomic analysis reveals a pivotal role of FMR1 and other developmental genes in Alzheimer's disease-associated brain ceRNA network. Comput Biol Med 2023; 166:107494. [PMID: 37769462 DOI: 10.1016/j.compbiomed.2023.107494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 09/05/2023] [Accepted: 09/15/2023] [Indexed: 09/30/2023]
Abstract
Alzheimer's disease (AD) is an increasingly prevalent neurodegenerative disorder that causes progressive cognitive decline and memory impairment. Despite extensive research, the underlying causes of late-onset AD (LOAD) remain incompletely understood. This study aimed to establish a network of competing regulatory interactions involving circular RNAs (circRNAs), microRNAs (miRNAs), RNA-binding proteins (RBPs), and messenger RNAs (mRNAs) connected to LOAD. A systematic analysis of publicly available expression data was conducted to identify integrated differentially expressed genes (DEGs) from the hippocampus of LOAD patients. Subsequently, gene co-expression analysis identified modules comprising highly expressed DEGs that act cooperatively. The competition between co-expressed DEGs and miRNAs/RBPs and the simultaneous interactions between circRNA and miRNA/RBP revealed a complex ceRNA network responsible for post-transcriptional regulation in LOAD. Hippocampal expression data for miRNAs, circRNAs, and RBPs were used to filter relevant relationships for AD. An integrated topological score was used to identify the highly connected hub gene, from which a brain core ceRNA subnetwork was generated. The Fragile X Messenger Ribonucleoprotein 1 (FMR1) gene, coding for the RBP FMRP, emerged as the prominent driver gene in this subnetwork. FMRP has been previously related to AD but not in a ceRNA network context. Also, the substantial number of neurodevelopmental genes in the ceRNA subnetwork and their related biological pathways strengthen the view that AD shares common pathological mechanisms with developmental conditions. Our results enhance the current knowledge about the convergent ceRNA regulatory pathways underlying AD and provide potential targets for identifying early biomarkers and developing novel therapeutic interventions.
Affiliation(s)
- Rafael Mina Piergiorge
- Department of Genetics, Institute of Biology Roberto Alcantara Gomes, State University of Rio de Janeiro, Rio de Janeiro, Brazil
- Cíntia Barros Santos-Rebouças
- Department of Genetics, Institute of Biology Roberto Alcantara Gomes, State University of Rio de Janeiro, Rio de Janeiro, Brazil.
2
Huang CH, Zaenudin E, Tsai JJ, Kurubanjerdjit N, Ng KL. Network subgraph-based approach for analyzing and comparing molecular networks. PeerJ 2022; 10:e13137. [PMID: 35529499 PMCID: PMC9074881 DOI: 10.7717/peerj.13137] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/19/2021] [Accepted: 02/28/2022] [Indexed: 01/12/2023]
Abstract
Molecular networks are built up from genetic elements that exhibit feedback interactions. Here, we studied the problem of measuring the similarity of directed networks by proposing a novel alignment-free approach: the network subgraph-based approach. Our approach does not make use of randomized networks to determine modular patterns embedded in a network, and this method differs from the network motif and graphlet methods. Network similarity was quantified by gauging the difference between the subgraph frequency distributions of two networks using Jensen-Shannon entropy. We applied the subgraph approach to study three types of molecular networks, i.e., cancer networks, signal transduction networks, and cellular process networks, which exhibit diverse molecular functions. We compared the performance of our subgraph detection algorithm with other algorithms, and the results were consistent, but other algorithms could not address the issue of subgraphs/motifs embedded within a subgraph/motif. To evaluate the effectiveness of the subgraph-based method, we applied the method along with the Jensen-Shannon entropy to classify six network models, and it achieves a 100% accuracy of classification. The proposed information-theoretic approach allows us to determine the structural similarity of two networks regardless of node identity and network size. We demonstrated the effectiveness of the subgraph approach to cluster molecular networks that exhibit similar regulatory interaction topologies. As an illustration, our method can identify (i) common subgraph-mediated signal transduction and/or cellular processes in AML and pancreatic cancer, and (ii) scaffold proteins in gastric cancer and hepatocellular carcinoma; thus, the results suggested that there are common regulation modules for cancer formation. 
We also found that the underlying substructures of the molecular networks are dominated by irreducible subgraphs; this feature is valid for the three classes of molecular networks we studied. The subgraph-based approach provides a systematic framework for analyzing, comparing, and classifying molecular networks with diverse functionalities.
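The abstract's similarity measure, comparing the subgraph frequency distributions of two networks with a Jensen-Shannon measure, can be illustrated with a minimal sketch. It uses the standard base-2 Jensen-Shannon divergence, which may differ in detail from the paper's formulation. The subgraph labels and counts below are hypothetical placeholders, and the subgraph detection step itself is not shown.

```python
import math


def normalize(counts):
    """Turn raw subgraph counts into a frequency distribution."""
    total = sum(counts.values())
    return {label: c / total for label, c in counts.items()}


def js_divergence(p, q):
    """Base-2 Jensen-Shannon divergence between two discrete distributions.

    p and q map subgraph labels to probabilities; labels missing from one
    distribution are treated as having probability 0. The result lies in [0, 1].
    """
    labels = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in labels}

    def kl(a):
        # Kullback-Leibler divergence of a from the mixture distribution.
        return sum(
            a[k] * math.log2(a[k] / mix[k]) for k in a if a[k] > 0.0
        )

    return 0.5 * kl(p) + 0.5 * kl(q)


# Hypothetical subgraph counts for two directed networks (illustrative only).
network_a = normalize({"feed_forward": 12, "cascade": 30, "feedback": 8})
network_b = normalize({"feed_forward": 10, "cascade": 25, "bi_fan": 15})

print(js_divergence(network_a, network_b))
```

Because the comparison works on label frequencies only, it does not depend on node identity or on the two networks having the same size, which matches the property claimed in the abstract.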
Affiliation(s)
- Chien-Hung Huang
- Department of Computer Science and Information Engineering, National Formosa University, Yun-Lin, Taiwan
- Efendi Zaenudin
- National Research and Innovation Agency, Bandung, Jawa Barat, Republic of Indonesia; Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
- Jeffrey J.P. Tsai
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
- Ka-Lok Ng
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan; Center for Artificial Intelligence and Precision Medicine Research, Asia University, Taichung, Taiwan; Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
3
Wu G, Li X, Guo W, Wei Z, Hu T, Shan Y, Gu J. JEBIN: analyzing gene co-expressions across multiple datasets by joint network embedding. Brief Bioinform 2022; 23:6519533. [PMID: 35134135 DOI: 10.1093/bib/bbab603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2021] [Revised: 12/15/2021] [Accepted: 12/27/2021] [Indexed: 11/13/2022]
Abstract
The inference of gene co-expression associations is one of the fundamental tasks for large-scale transcriptomic data analysis. Due to the high dimensionality and high noise in transcriptomic data, it is difficult to infer stable gene co-expression associations from a single dataset. Meta-analysis of multisource data can effectively tackle this problem. We proposed Joint Embedding of multiple BIpartite Networks (JEBIN) to learn the low-dimensional consensus representation for genes by integrating multiple expression datasets. JEBIN infers gene co-expression associations in a nonlinear and global similarity manner and can integrate datasets with different distributions with time complexity linear in the number of genes and the total sample size. The effectiveness and scalability of JEBIN were verified by simulation experiments, and its superiority over the commonly used integration methods was demonstrated by three indices on real biological datasets. Then, JEBIN was applied to study the gene co-expression patterns of hepatocellular carcinoma (HCC) based on multiple expression datasets of HCC and adjacent normal tissues, and further on the latest HCC single-cell RNA-seq data. Results show that gene co-expressions are highly different between bulk and single-cell datasets. Finally, many differentially co-expressed ligand-receptor pairs were discovered by comparing HCC with adjacent normal data, providing candidate HCC targets for abnormal cell-cell communications.
Affiliation(s)
- Guiying Wu
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
- Xiangyu Li
- School of Software Engineering, Beijing Jiaotong University, Beijing 100044, China
- Wenbo Guo
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
- Zheng Wei
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
- Tao Hu
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
- Yiran Shan
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
- Jin Gu
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
4
Li S, Han F, Qi N, Wen L, Li J, Feng C, Wang Q. Determination of a six-gene prognostic model for cervical cancer based on WGCNA combined with LASSO and Cox-PH analysis. World J Surg Oncol 2021; 19:277. [PMID: 34530829 PMCID: PMC8447612 DOI: 10.1186/s12957-021-02384-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/02/2021] [Accepted: 08/30/2021] [Indexed: 02/06/2023]
Abstract
AIM This study aimed to establish a risk model of hub genes to evaluate the prognosis of patients with cervical cancer. METHODS Based on TCGA and GTEx databases, the differentially expressed genes (DEGs) were screened and then analyzed using GO and KEGG analyses. Weighted gene co-expression network analysis (WGCNA) was then used to perform modular analysis of DEGs. Univariate Cox regression analysis combined with LASSO and Cox-PH was used to select the prognostic genes. Then, multivariate Cox regression analysis was used to screen the hub genes. The risk model was established based on hub genes and evaluated by risk curve, survival state, Kaplan-Meier curve, and receiver operating characteristic (ROC) curve. RESULTS We screened 1265 DEGs between cervical cancer and normal samples, of which 620 were downregulated and 645 were upregulated. GO and KEGG analyses revealed that most of the upregulated genes were related to the metastasis of cancer cells, while the downregulated genes mostly acted on the cell cycle. Then, WGCNA mined six modules (red, blue, green, brown, yellow, and gray), and the brown module, which contained the most DEGs and was related to multiple cancers, was selected for the follow-up study. Eight genes were identified by univariate Cox regression analysis combined with the LASSO Cox-PH model. Then, six hub genes (SLC25A5, ENO1, ANLN, RIBC2, PTTG1, and MCM5) were screened by multivariate Cox regression analysis, and SLC25A5, ANLN, RIBC2, and PTTG1 could be used as independent prognostic factors. Finally, we determined that the risk model established by the six hub genes was effective and stable. CONCLUSIONS This study demonstrates the prognostic value of the risk model and identifies promising new targets for cervical cancer treatment; their biological functions need to be further explored.
Affiliation(s)
- Shiyan Li
- Department of Gynecology, Heilongjiang University of Traditional Chinese Medicine, Harbin, PR China
- Fengjuan Han
- Department of Gynecology, Heilongjiang University of Traditional Chinese Medicine, Harbin, PR China.
- Na Qi
- Department of Gynecology, Heilongjiang University of Traditional Chinese Medicine, Harbin, PR China
- Liyang Wen
- Department of Acupuncture and Moxibustion, Heilongjiang University of Traditional Chinese Medicine, Harbin, P.R. China
- Jia Li
- Department of Gynecology, Heilongjiang University of Traditional Chinese Medicine, Harbin, PR China
- Cong Feng
- Department of Gynecology, Heilongjiang University of Traditional Chinese Medicine, Harbin, PR China
- Qingling Wang
- Department of Gynecology, Shenzhen Nanshan Maternal and Child Health Care Hospital, Shenzhen, P.R. China.
5
Jiang X, Pan W, Chen M, Wang W, Song W, Lin GN. Integrative enrichment analysis of gene expression based on an artificial neuron. BMC Med Genomics 2021; 14:173. [PMID: 34433483 PMCID: PMC8386081 DOI: 10.1186/s12920-021-00988-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2020] [Accepted: 05/18/2021] [Indexed: 11/28/2022]
Abstract
BACKGROUND Huntington's disease is a chronic progressive neurodegenerative disease with complex pathogenic mechanisms. To date, the pathogenesis of Huntington's disease is still not fully understood, and there has been no effective treatment. The rapid development of high-throughput sequencing technologies makes it possible to explore the molecular mechanisms at the transcriptome level. Our previous studies on Huntington's disease have shown that it is difficult to distinguish disease-associated genes from non-disease genes. Meanwhile, recent progress in biomedicine shows that the molecular origin of chronic complex diseases may not exist in the diseased tissue, and differentially expressed genes between different tissues may be helpful to reveal the molecular origin of chronic diseases. Therefore, developing integrative computational analysis methods for multi-tissue gene expression data, and exploring the relationship between the disease and genes differentially expressed in different tissues, can greatly accelerate the molecular discovery process. METHODS To analyze intra- and inter-tissue differentially expressed genes, we designed an integrative enrichment analysis method based on an artificial neuron (IEAAN). First, we calculated the differential expression scores of each gene, which serve as that gene's features, using a fold-change approach with intra- and inter-tissue gene expression data. Then, we computed a weighted sum of all the differential expression scores and passed it through a sigmoid function to obtain a differential expression enrichment score. Finally, we ranked the genes according to the enrichment score. Top-ranking genes are taken to be the potential disease-associated genes. RESULTS In this study, we conducted a large number of experiments to analyze intra- and inter-tissue differentially expressed genes.
Experimental results showed that genes differentially expressed between different tissues are more likely to be Huntington's disease-associated genes. Five disease-associated genes were selected in this study, two of which have been reported to be implicated in Huntington's disease. CONCLUSIONS We proposed a novel integrative enrichment analysis method based on an artificial neuron (IEAAN), which displays better prediction precision for disease-associated genes than state-of-the-art statistical methods. Our comprehensive evaluation suggests that genes differentially expressed between striatum and liver tissues of healthy individuals are more likely to be Huntington's disease-associated genes.
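The scoring step described in METHODS (a weighted sum of per-gene differential expression scores passed through a sigmoid, followed by ranking) can be sketched as below. This is only an illustration of the single-neuron idea. The gene names, feature values, weights, and zero bias are hypothetical, and the abstract does not say how the weights are chosen.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def enrichment_score(features, weights, bias=0.0):
    """Single artificial neuron: weighted sum of features, then sigmoid."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(z)


def rank_genes(gene_features, weights, bias=0.0):
    """Return (gene, score) pairs sorted from highest to lowest score."""
    scores = {
        gene: enrichment_score(feats, weights, bias)
        for gene, feats in gene_features.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical features: [intra-tissue score, inter-tissue score] per gene,
# e.g. absolute log2 fold changes (an assumption, not taken from the paper).
gene_features = {
    "GENE_A": [2.0, 1.5],
    "GENE_B": [0.1, 0.2],
    "GENE_C": [1.0, 3.0],
}
weights = [0.5, 0.5]  # hypothetical equal weighting

for gene, score in rank_genes(gene_features, weights):
    print(gene, round(score, 3))
```

Because the sigmoid is monotonic, the ranking is determined by the weighted sum alone; the sigmoid only rescales the scores into the interval (0, 1).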
Affiliation(s)
- Xue Jiang
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China
- Weihao Pan
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China
- Miao Chen
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China
- Weidi Wang
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China
- Weichen Song
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China
- Guan Ning Lin
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China
- Shanghai Key Laboratory of Psychotic Disorders, Shanghai, 200030 China
6
Schulc K, Nagy ZT, Kamp S, Molnár J, Veres DV, Csermely P, Kovács BM. Modular Reorganization of Signaling Networks during the Development of Colon Adenoma and Carcinoma. J Phys Chem B 2021; 125:1716-1726. [PMID: 33562960 PMCID: PMC8023713 DOI: 10.1021/acs.jpcb.0c09307] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/09/2022]
Abstract
Network science is an emerging tool in systems biology and oncology, providing novel, system-level insight into the development of cancer. The aim of this project was to study the signaling networks in the process of oncogenesis to explore the adaptive mechanisms taking part in the cancerous transformation of healthy cells. For this purpose, colon cancer proved to be an excellent candidate, as its preliminary phase, adenoma, has a long evolution time. In our work, transcriptomic data were collected from normal colon, colon adenoma, and colon cancer samples to calculate link (i.e., network edge) weights as approximate proxies for protein abundances, and link weights were included in the Human Cancer Signaling Network. Here we show that the adenoma phase clearly differs from the normal and cancer states in terms of a more scattered link weight distribution and enlarged network diameter. Modular analysis shows the rearrangement of the apoptosis- and the cell-cycle-related modules, whose pathway enrichment analysis supports the relevance of targeted therapy. Our work enriches the system-wide assessment of cancer development, showing specific changes for the adenoma state.
Affiliation(s)
- Klára Schulc
- Department of Molecular Biology, Semmelweis University, Budapest 1085, Hungary
- Zsolt T Nagy
- Department of Molecular Biology, Semmelweis University, Budapest 1085, Hungary
- Daniel V Veres
- Department of Molecular Biology, Semmelweis University, Budapest 1085, Hungary; Turbine Ltd, Budapest, Hungary
- Peter Csermely
- Department of Molecular Biology, Semmelweis University, Budapest 1085, Hungary
- Borbála M Kovács
- Department of Molecular Biology, Semmelweis University, Budapest 1085, Hungary
7
Dang H, Ye Y, Zhao X, Zeng Y. Identification of candidate genes in ischemic cardiomyopathy by gene expression omnibus database. BMC Cardiovasc Disord 2020; 20:320. [PMID: 32631246 PMCID: PMC7336680 DOI: 10.1186/s12872-020-01596-w] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 12/26/2019] [Accepted: 06/24/2020] [Indexed: 12/24/2022]
Abstract
BACKGROUND Ischemic cardiomyopathy (ICM) is one of the most common causes of death worldwide. This study aimed to find candidate genes for ICM. METHODS We studied differentially expressed genes (DEGs) in ICM compared to healthy controls. Based on these DEGs, we carried out functional annotation and constructed protein-protein interaction (PPI) and transcriptional regulatory networks. The expression of selected candidate genes was confirmed using a published dataset and quantitative real-time polymerase chain reaction (qRT-PCR). RESULTS From three Gene Expression Omnibus (GEO) datasets, we acquired 1081 DEGs (578 up-regulated and 503 down-regulated genes) between ICM and healthy controls. The functional annotation analysis revealed that cardiac muscle contraction, hypertrophic cardiomyopathy, arrhythmogenic right ventricular cardiomyopathy and dilated cardiomyopathy were significantly enriched pathways in ICM. SNRPB, BLM, RRS1, CDK2, BCL6, BCL2L1, FKBP5, IPO7, TUBB4B and ATP1A1 were considered the hub proteins. PALLD, THBS4, ATP1A1, NFASC, FKBP5, ECM2 and BCL2L1 were top six transcription factors (TFs) with the most downstream genes. The expression of 6 DEGs (MYH6, THBS4, BCL6, BLM, IPO7 and SERPINA3) was consistent with our integration analysis and GSE116250 validation results. CONCLUSIONS The candidate DEGs and TFs may be related to the ICM process. This study provides a novel perspective for understanding the mechanism of ICM and developing new therapeutic approaches.
Affiliation(s)
- Haiming Dang
- Department of cardiac surgery, Capital medical university, Beijing Anzhen hospital, Beijing, China
- Yicong Ye
- Department of cardiology, Capital medical university, Beijing Anzhen hospital, No.2, Anzhen Road, Chaoyan District, Beijing, 100029, China
- Xiliang Zhao
- Department of cardiology, Capital medical university, Beijing Anzhen hospital, No.2, Anzhen Road, Chaoyan District, Beijing, 100029, China
- Yong Zeng
- Department of cardiology, Capital medical university, Beijing Anzhen hospital, No.2, Anzhen Road, Chaoyan District, Beijing, 100029, China.
8
Gene Coexpression Network and Module Analysis across 52 Human Tissues. BIOMED RESEARCH INTERNATIONAL 2020; 2020:6782046. [PMID: 32462012 PMCID: PMC7232734 DOI: 10.1155/2020/6782046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2020] [Revised: 03/31/2020] [Accepted: 04/08/2020] [Indexed: 01/30/2023]
Abstract
Gene coexpression analysis is widely used to infer gene modules associated with diseases and other clinical traits. However, a systematic view and comparison of gene coexpression networks and modules across a cohort of tissues has been largely ignored. In this study, we first construct gene coexpression networks and modules of 52 GTEx tissues and cell lines. The network modules are enriched in many tissue-common functions, such as organelle membrane, and in tissue-specific functions. We then study the correlation of tissues from the network point of view. As a result, the network modules of most tissues are significantly correlated, indicating a generally similar network pattern across tissues. However, the level of similarity among the tissues is different. Tissues that are physically close to each other seem to be more similar in their coexpression networks. For example, the two adjacent tissues fallopian tube and bladder have the most significant Fisher's exact test p value (8.54E-291) among all tissue pairs. It is known that immune-associated modules are frequently identified among coexpression modules. In this study, we found immune modules in many tissues like liver, kidney cortex, lung, uterus, adipose subcutaneous, and adipose visceral omentum. However, not all tissues have immune-associated modules, for example, brain cerebellum. Finally, by the clique analysis, we identify the largest clique of modules, in which the genes in each module overlap significantly with those in other modules. As a result, we are able to find a clique of size 40 (out of 52 tissues), indicating a strong correlation of modules across tissues. It is not surprising that the 40 modules are most commonly enriched in immune-related functions.
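The module comparison mentioned in the abstract relies on Fisher's exact test for the overlap between gene sets. A minimal sketch of a one-sided version is given below, computed directly from the hypergeometric distribution with the standard library. The choice of a one-sided test, the universe size, and the gene identifiers are assumptions made for illustration; the abstract does not state these details.

```python
from math import comb


def overlap_pvalue(universe_size, module_a, module_b):
    """One-sided Fisher's exact test p value for the overlap of two gene sets.

    Returns the probability of observing an overlap at least as large as the
    actual one when len(module_b) genes are drawn at random, without
    replacement, from a universe that contains len(module_a) marked genes.
    """
    a, b = set(module_a), set(module_b)
    observed = len(a & b)
    n_a, n_b = len(a), len(b)
    total = comb(universe_size, n_b)
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(observed, min(n_a, n_b) + 1)
    )
    return tail / total


# Hypothetical example: 20,000 background genes, two modules of 100 genes
# each that share 40 genes (integer ids stand in for gene symbols).
module_liver = range(0, 100)
module_lung = range(60, 160)
print(overlap_pvalue(20000, module_liver, module_lung))
```

Exact integer arithmetic is used for the binomial coefficients, so the final division stays accurate even when the p value is extremely small, although values below the smallest representable float will print as 0.0.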
9
Zhang J, Ju S. Identifying genuine protein-protein interactions within communities of gene co-expression networks using a deconvolution method. IET Syst Biol 2019; 13:290-296. [PMID: 31778125 PMCID: PMC8687158 DOI: 10.1049/iet-syb.2019.0060] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/09/2019] [Revised: 06/24/2019] [Accepted: 07/09/2019] [Indexed: 11/20/2022]
Abstract
Direct relationships between biological molecules connected in a gene co-expression network tend to reflect real biological activities such as gene regulation, protein-protein interactions (PPIs), and metabolisation. As correlation-based networks contain numerous indirect connections, those direct relationships are always 'hidden' in them. Compared with the global network, network communities carry more biological significance for predicting protein function, detecting protein complexes and studying network evolution. Therefore, identifying direct relationships in communities is a pervasive and important topic in the biological sciences. Unfortunately, this field has not been well studied. A major thrust of this study is to apply a deconvolution algorithm to communities stemming from different gene co-expression networks, which are constructed with different fixed thresholds for robustness analysis. Using the fifth Dialogue on Reverse Engineering Assessment and Methods challenge (DREAM5) framework, the authors demonstrate that nearly all new communities extracted from a 'deconvolution filter' contain more genuine PPIs than before deconvolution.
Affiliation(s)
- Jin Zhang
- School of Information Science and Engineering, University of Jinan, Jinan 250022, People's Republic of China.
- Shan Ju
- School of International Trade and Economics, Shandong University of Finance and Economics, Jinan 250014, People's Republic of China
10
Li W, Li L, Zhang S, Zhang C, Huang H, Li Y, Hu E, Deng G, Guo S, Wang Y, Li W, Chen L. Identification of potential genes for human ischemic cardiomyopathy based on RNA-Seq data. Oncotarget 2018; 7:82063-82073. [PMID: 27852050 PMCID: PMC5347674 DOI: 10.18632/oncotarget.13331] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 08/02/2016] [Accepted: 10/07/2016] [Indexed: 12/30/2022]
Abstract
Ischemic cardiomyopathy (ICM) is an important cause of heart failure, yet no ICM disease genes are stored in any public database. Mutations of genes provided by RNA-Seq data could set a foundation for a variety of biological processes. This also made it possible to elucidate the mechanism and identify potential genes for ICM. In this paper, an integrated co-expression network was constructed using univariate and bivariate canonical correlation analysis for RNA-Seq data of human ICM samples. Three ICM-related modules were recognized after comparing the Pearson correlation coefficients of ICM samples and normal controls. Furthermore, 32 potential ICM genes were identified from ICM-related modules considering protein-protein interactions. Most of these genes were verified, through OMIM and the literature, to be involved in ICM and the diseases that cause it. Our study could provide a novel perspective on potential gene identification and the pathogenesis of ICM and other complex diseases.
Affiliation(s)
- Wan Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Liansheng Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Shiying Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Ce Zhang
- Department of internal medicine, Heilongjiang Commercial Hospital, Harbin, Heilongjiang, China
- Hao Huang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Yiran Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Erqiang Hu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Gui Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Shanshan Guo
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Yahui Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Weimin Li
- Department of Cardiology, the First Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang, China
- Lina Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
11
Pham NC, Haibe-Kains B, Bellot P, Bontempi G, Meyer PE. Study of Meta-analysis strategies for network inference using information-theoretic approaches. BioData Min 2017; 10:15. [PMID: 28484519 PMCID: PMC5420410 DOI: 10.1186/s13040-017-0136-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/03/2017] [Accepted: 04/20/2017] [Indexed: 11/10/2022]
Abstract
Background Reverse engineering of gene regulatory networks (GRNs) from gene expression data is a classical challenge in systems biology. Thanks to high-throughput technologies, a massive amount of gene-expression data has been accumulated in the public repositories. Modelling GRNs from multiple experiments (also called integrative analysis) has therefore naturally become a standard procedure in modern computational biology. Indeed, such analysis is usually more robust than the traditional approaches, which analyse individual datasets and so suffer from experimental biases and low sample numbers. To date, there are mainly two strategies for the problem of interest: the first one (“data merging”) merges all datasets together and then infers a GRN, whereas the other (“networks ensemble”) infers GRNs from every dataset separately and then aggregates them using some ensemble rules (such as ranksum or weightsum). Unfortunately, a thorough comparison of these two approaches is lacking. Results In this work, we present another meta-analysis approach for inferring GRNs from multiple studies. Our proposed meta-analysis approach, adapted to methods based on pairwise measures such as correlation or mutual information, consists of two steps: aggregating matrices of the pairwise measures from every dataset, followed by extracting the network from the meta-matrix. Afterwards, we evaluate the performance of the two commonly used approaches mentioned above and our presented approach with a systematic set of experiments based on in silico benchmarks. Conclusions We proposed a first systematic evaluation of different strategies for reverse engineering GRNs from multiple datasets. Experimental results strongly suggest that assembling matrices of pairwise dependencies is a better strategy for network inference than the two commonly used ones.
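The two-step strategy in the abstract (aggregate the per-dataset matrices of pairwise measures, then extract the network from the meta-matrix) might be sketched as follows. This is an illustrative simplification rather than the authors' procedure: absolute Pearson correlation stands in for the pairwise measure, a plain average is the assumed aggregation rule, and a fixed threshold is the assumed extraction step. The gene names and expression values are made up.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x > 0 and sd_y > 0 else 0.0


def pairwise_matrix(dataset):
    """Map each gene pair to its absolute correlation within one dataset."""
    return {
        (g1, g2): abs(pearson(dataset[g1], dataset[g2]))
        for g1, g2 in combinations(sorted(dataset), 2)
    }


def meta_matrix(matrices):
    """Step 1: average each pair's score over all datasets containing it."""
    shared = set.intersection(*(set(m) for m in matrices))
    return {pair: sum(m[pair] for m in matrices) / len(matrices) for pair in shared}


def extract_network(meta, threshold):
    """Step 2: keep gene pairs whose aggregated score reaches the threshold."""
    return sorted(pair for pair, score in meta.items() if score >= threshold)


# Two hypothetical expression datasets (gene -> values across samples).
study_1 = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, -1, 1, -1]}
study_2 = {"g1": [4, 3, 2, 1], "g2": [8, 6, 4, 2], "g3": [1, -1, -1, 1]}

meta = meta_matrix([pairwise_matrix(study_1), pairwise_matrix(study_2)])
print(extract_network(meta, threshold=0.8))
```

Aggregating at the matrix level means each dataset contributes its own dependency estimates before any edge is called, in contrast to merging the raw data or combining already-inferred networks.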
Affiliation(s)
- Ngoc C Pham
- Bioinformatics and Systems Biology (BioSys) Lab, Université de Liège, Liège, Belgium
- Benjamin Haibe-Kains
- Princess Margaret Cancer Center, Toronto, ON, Canada; Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; Ontario Institute of Cancer Research, Toronto, ON, Canada
- Pau Bellot
- Image Processing group, Technical University of Catalonia, Barcelona, Spain
- Gianluca Bontempi
- Machine Learning Group, Interuniversity Institute of Bioinformatics in Brussels (IB)², Université Libre de Bruxelles, Bruxelles, Belgium
- Patrick E Meyer
- Bioinformatics and Systems Biology (BioSys) Lab, Université de Liège, Liège, Belgium
12
Zhu L, Ding Y, Chen CY, Wang L, Huo Z, Kim S, Sotiriou C, Oesterreich S, Tseng GC. MetaDCN: meta-analysis framework for differential co-expression network detection with an application in breast cancer. Bioinformatics 2017; 33:1121-1129. [PMID: 28031185 PMCID: PMC6041767 DOI: 10.1093/bioinformatics/btw788] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 09/14/2016] [Revised: 11/11/2016] [Accepted: 12/07/2016] [Indexed: 01/01/2023]
Abstract
MOTIVATION Gene co-expression network analysis from transcriptomic studies can elucidate gene-gene interactions and regulatory mechanisms. Differential co-expression analysis further helps detect alterations of regulatory activities in case/control comparisons. Co-expression networks estimated from a single transcriptomic study are often unstable and not generalizable due to cohort bias and limited sample size. With the rapid accumulation of publicly available transcriptomic studies, co-expression analysis combining multiple transcriptomic studies can provide more accurate and robust results. RESULTS In this paper, we propose a meta-analytic framework for detecting differentially co-expressed networks (MetaDCN). Differentially co-expressed seed modules are first detected by optimizing an energy function via simulated annealing. Basic modules sharing common pathways are merged into pathway-centric supermodules and a Cytoscape plug-in (MetaDCNExplorer) is developed to visualize and explore the findings. We applied MetaDCN to two breast cancer applications: ER+/ER- comparison using five training and three testing studies, and ILC/IDC comparison with two training and two testing studies. We identified 20 and 4 supermodules for ER+/ER- and ILC/IDC comparisons, respectively. Ranking atop are 'immune response pathway' and 'complement cascades pathway' for ER comparison, and 'extracellular matrix pathway' for ILC/IDC comparison. Without the need for prior information, the results from MetaDCN confirm existing disease mechanisms and discover novel ones in a systems manner. AVAILABILITY AND IMPLEMENTATION R package 'MetaDCN' and Cytoscape App 'MetaDCNExplorer' are available at http://tsenglab.biostat.pitt.edu/software.htm. CONTACT ctseng@pitt.edu. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Li Zhu
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Ying Ding
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Cho-Yi Chen
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Genome and Systems Biology Degree Program, National Taiwan University, Taipei, Taiwan
- Lin Wang
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Zhiguang Huo
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- SungHwan Kim
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Christos Sotiriou
- Breast Cancer Translational Research Laboratory, J. C. Heuson, Institut Jules Bordet, Université Libre de Bruxelles, Brussels, Belgium
- George C Tseng
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
13
Quantitative epigenetic co-variation in CpG islands and co-regulation of developmental genes. Sci Rep 2014; 3:2576. [PMID: 23999385 PMCID: PMC6505400 DOI: 10.1038/srep02576] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 04/08/2013] [Accepted: 08/16/2013] [Indexed: 12/21/2022]
Abstract
The genome-wide variation of multiple epigenetic modifications in CpG islands (CGIs) and the interactions between them are of great interest. Here, we optimized an entropy-based strategy to quantify variation of epigenetic modifications and explored their interaction across mouse embryonic stem cells, neural precursor cells and brain. Our results showed that four epigenetic modifications (DNA methylation, H3K4me2, H3K4me3 and H3K27me3) of CGIs in the mouse genome undergo combinatorial variation during neuron differentiation. DNA methylation variation was positively correlated with H3K27me3 variation, and negatively correlated with H3K4me2/3 variation. We identified 5,194 CGIs differentially modified by epigenetic modifications (DEM-CGIs). Among them, the differentially DNA methylated CGIs overlapped significantly with the CGIs differentially modified by H3K27me3. Moreover, DEM-CGIs may contribute to co-regulation of related developmental genes including core transcription factors. Our entropy-based strategy provides an effective way of investigating dynamic cross-talk among epigenetic modifications in various biological processes at the macro scale.
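The abstract does not give the formula behind its entropy-based strategy. As a rough illustration of how an entropy can quantify variation of one modification across cell types, the sketch below uses plain Shannon entropy of the normalised signal profile and reports one minus the normalised entropy, so that a flat profile scores 0 and a signal confined to one cell type scores 1. This scoring rule, the function names, and the example values are assumptions for illustration, not the authors' optimized method.

```python
import math

def shannon_entropy(levels):
    """Shannon entropy (bits) of a signal profile normalised to sum to 1."""
    total = sum(levels)
    if total <= 0:
        raise ValueError("signal levels must have a positive sum")
    probs = [x / total for x in levels]
    return -sum(p * math.log2(p) for p in probs if p > 0)

def variation_score(levels):
    """1 - normalised entropy: 0 for a flat profile, 1 for a single-stage signal."""
    return 1.0 - shannon_entropy(levels) / math.log2(len(levels))

# Hypothetical signal of one CGI in ES cells, neural precursor cells and brain.
flat = variation_score([1.0, 1.0, 1.0])      # same level everywhere
specific = variation_score([3.0, 0.0, 0.0])  # present in one cell type only
mixed = variation_score([4.0, 1.0, 1.0])     # enriched in one cell type
print(flat, specific, mixed)
```

Scores computed this way for two modifications over the same set of CGIs could then be correlated, which is the kind of comparison the abstract reports between DNA methylation and histone-mark variation.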
14
Hong S, Chen X, Jin L, Xiong M. Canonical correlation analysis for RNA-seq co-expression networks. Nucleic Acids Res 2013; 41:e95. [PMID: 23460206 PMCID: PMC3632131 DOI: 10.1093/nar/gkt145] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Indexed: 12/31/2022]
Abstract
Digital transcriptome analysis by next-generation sequencing discovers substantial mRNA variants. Variation in gene expression underlies many biological processes and holds a key to unravelling the mechanisms of common diseases. However, the current methods for constructing co-expression networks from overall gene expression were originally designed for microarray expression data, and they overlook a large amount of variation in gene expression. To use information on exon-level, genomic-position-level and allele-specific expression, we develop novel component-based methods, single and bivariate canonical correlation analysis, for the construction of co-expression networks with RNA-seq data. To evaluate the performance of our methods for co-expression network inference with RNA-seq data, we apply them to lung squamous cell cancer expression data from the TCGA database and to our bipolar disorder and schizophrenia RNA-seq study. The preliminary results demonstrate that the co-expression networks constructed by canonical correlation analysis and RNA-seq data provide rich genetic and molecular information to gain insight into biological processes and disease mechanisms. Our new methods substantially outperform the current statistical methods for co-expression network construction with microarray expression data or RNA-seq data based on overall gene expression levels.
Affiliation(s)
- Shengjun Hong
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences and Institutes of Biomedical Sciences, Fudan University, Shanghai 200433, China
15
Amin MS, Finley RL, Jamil HM. Top-k similar graph matching using TraM in biological networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2012; 9:1790-1804. [PMID: 22732692 DOI: 10.1109/tcbb.2012.90] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/01/2023]
Abstract
Many emerging database applications entail sophisticated graph-based query manipulation, predominantly evident in large-scale scientific applications. To access the information embedded in graphs, efficient graph matching tools and algorithms have become of prime importance. Although the prohibitively expensive time complexity associated with exact subgraph isomorphism techniques has limited their efficacy in the application domain, approximate yet efficient graph matching techniques have received much attention due to their pragmatic applicability. Since public domain databases are noisy and incomplete in nature, inexact graph matching techniques have proven to be more promising in terms of inferring knowledge from numerous structural data repositories. In this paper, we propose a novel technique called TraM for approximate graph matching that off-loads a significant amount of its processing onto the database, making the approach viable for large graphs. Moreover, the vector space embedding of the graphs and efficient filtration of the search space enable computation of approximate graph similarity at a throw-away cost. We annotate nodes of the query graphs by means of their global topological properties and compare them with neighborhood-biased segments of the data graph for proper matches. We have conducted experiments on several real data sets, and have demonstrated the effectiveness and efficiency of the proposed method.
Affiliation(s)
- Mohammad Shafkat Amin
- Department of Computer Science, Wayne State University, 555 E Washington Ave, Apt 1807, Sunnyvale, CA 94086, USA.
16
Towfic F, Gupta S, Honavar V, Subramaniam S. B-cell ligand processing pathways detected by large-scale comparative analysis. Genomics Proteomics & Bioinformatics 2012; 10:142-52. [PMID: 22917187 PMCID: PMC5054497 DOI: 10.1016/j.gpb.2012.03.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/11/2011] [Revised: 03/05/2012] [Accepted: 03/07/2012] [Indexed: 11/03/2022]
Abstract
The initiation of B-cell ligand recognition is a critical step for the generation of an immune response against foreign bodies. We sought to identify the biochemical pathways involved in the B-cell ligand recognition cascade and sets of ligands that trigger similar immunological responses. We utilized several comparative approaches to analyze the gene coexpression networks generated from a set of microarray experiments spanning 33 different ligands. First, we compared the degree distributions of the generated networks. Second, we utilized a pairwise network alignment algorithm, BiNA, to align the networks based on the hubs in the networks. Third, we aligned the networks based on a set of KEGG pathways. We summarized our results by constructing a consensus hierarchy of pathways that are involved in B cell ligand recognition. The resulting pathways were further validated through literature for their common physiological responses. Collectively, the results based on our comparative analyses of degree distributions, alignment of hubs, and alignment based on KEGG pathways provide a basis for molecular characterization of the immune response states of B-cells and demonstrate the power of comparative approaches (e.g., gene coexpression network alignment algorithms) in elucidating biochemical pathways involved in complex signaling events in cells.
Affiliation(s)
- Fadi Towfic
- Bioinformatics and Computational Biology Graduate Program, Iowa State University, Ames, IA 50010, USA.
17
Mueller LAJ, Kugler KG, Graber A, Emmert-Streib F, Dehmer M. Structural measures for network biology using QuACN. BMC Bioinformatics 2011; 12:492. [PMID: 22195644 PMCID: PMC3293850 DOI: 10.1186/1471-2105-12-492] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 08/05/2011] [Accepted: 12/24/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND Structural measures for networks have been extensively developed, but many of them have not yet demonstrated their sustainability. That is, it often remains unclear whether a particular measure is useful and feasible for solving a particular problem in network biology. An example is the classification of complex biological networks, for which structural measures are used to achieve a minimal classification error. Hence, there is a strong need to provide freely available software packages to calculate structural graph measures and demonstrate their appropriate usage in network biology. RESULTS Here, we discuss topological network descriptors that are implemented in the R-package QuACN and demonstrate their behavior and characteristics by applying them to a set of example graphs. Moreover, we show a representative application to illustrate their capabilities for classifying biological networks. In particular, we infer gene regulatory networks from microarray data and classify them by methods provided by QuACN. Note that QuACN is the first freely available software written in R containing a large number of structural graph measures. CONCLUSION The R package QuACN is under ongoing development and we continuously add promising groups of topological network descriptors. The package can be used to answer intriguing research questions in network biology, e.g., classifying biological data or identifying meaningful biological features, by analyzing the topology of biological networks.
Affiliation(s)
- Laurin A J Mueller
- Institute for Bioinformatics and Translational Research, Department of Biomedical Sciences and Engineering, University for Health Sciences, Medical Informatics and Technology (UMIT), EWZ 1, Hall in Tirol, Austria
18
Mueller LAJ, Kugler KG, Netzer M, Graber A, Dehmer M. A network-based approach to classify the three domains of life. Biol Direct 2011; 6:53. [PMID: 21995640 PMCID: PMC3226542 DOI: 10.1186/1745-6150-6-53] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 05/27/2011] [Accepted: 10/13/2011] [Indexed: 11/22/2022]
Abstract
Background Identifying group-specific characteristics in metabolic networks can provide better insight into evolutionary developments. Here, we present an approach to classify the three domains of life using topological information about the underlying metabolic networks. These networks have been shown to share domain-independent structural similarities, which pose a special challenge for our endeavour. We quantify specific structural information by using topological network descriptors to classify this set of metabolic networks. Such measures quantify the structural complexity of the underlying networks. In this study, we use such measures to capture domain-specific structural features of the metabolic networks to classify the data set. So far, it has been a challenging undertaking to examine what kind of structural complexity such measures do detect. In this paper, we apply two groups of topological network descriptors to metabolic networks and evaluate their classification performance. Moreover, we combine the two groups to perform a feature selection to estimate the structural features with the highest classification ability in order to optimize the classification performance. Results By combining the two groups, we can identify seven topological network descriptors that show a group-specific characteristic by ANOVA. A multivariate analysis using feature selection and supervised machine learning leads to a reasonable classification performance with a weighted F-score of 83.7% and an accuracy of 83.9%. We further demonstrate that our approach outperforms alternative methods. Also, our results reveal that entropy-based descriptors show the highest classification ability for this set of networks. Conclusions Our results show that these particular topological network descriptors are able to capture domain-specific structural characteristics for classifying metabolic networks between the three domains of life.
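The abstract reports a weighted F-score alongside accuracy. For readers unfamiliar with the distinction, the sketch below computes both from scratch, taking the weighted F-score to be the support-weighted mean of per-class F1 scores, which is the usual definition and is assumed here to be the one the authors used. The labels and predictions are invented toy values, not data from the paper.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def weighted_f_score(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's true count."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)

# Toy labels for the three domains of life (hypothetical, for illustration only).
y_true = ["archaea", "archaea", "bacteria", "bacteria", "bacteria", "eukarya"]
y_pred = ["archaea", "bacteria", "bacteria", "bacteria", "bacteria", "eukarya"]
acc = accuracy(y_true, y_pred)
wf = weighted_f_score(y_true, y_pred)
print(round(acc, 4), round(wf, 4))
```

Because the weights follow class size, the weighted F-score can differ from accuracy when classes are imbalanced, which is why the abstract reports both figures.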
Affiliation(s)
- Laurin A J Mueller
- Institute for Bioinformatics and Translational Research, Department of Biomedical Sciences and Engineering, University for Health Sciences, Medical Informatics and Technology (UMIT), Austria