1
|
Synergy between the Levels of Methylation of microRNA Gene Sets in Primary Tumors and Metastases of Ovarian Cancer Patients. Bull Exp Biol Med 2022; 173:87-91. [PMID: 35622253 DOI: 10.1007/s10517-022-05499-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Indexed: 10/18/2022]
Abstract
We studied correlations between the methylation levels of 21 microRNA genes in 99 primary tumors and 29 macroscopic peritoneal metastases of ovarian cancer. Quantitative methylation-specific PCR detected co-methylation for 13 pairs of microRNA genes in primary tumors and for 22 pairs in metastases. Pairs of microRNA genes that show significant co-methylation may be involved in common processes and pathways of gene regulation and interaction and may share target genes. The results are highly significant, and these pairs of microRNA genes can be proposed as new potential markers for the diagnosis and prognosis of ovarian cancer metastasis.
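A pairwise co-methylation screen of this kind can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not name the correlation statistic or the significance test, so the Spearman rank correlation, the fixed cutoff `threshold`, and the gene labels and values in the toy data are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def comethylated_pairs(levels, threshold=0.5):
    """levels maps a gene label to its methylation levels, one per sample.

    Returns (gene1, gene2, rho) for every pair with rho >= threshold.
    The cutoff stands in for a proper significance test (assumption).
    """
    pairs = []
    for g1, g2 in combinations(sorted(levels), 2):
        rho = spearman(levels[g1], levels[g2])
        if rho >= threshold:
            pairs.append((g1, g2, rho))
    return pairs


# Toy data with hypothetical gene labels and made-up values.
toy_levels = {
    "gene_A": [1, 2, 3, 4, 5],
    "gene_B": [2, 4, 5, 9, 10],
    "gene_C": [5, 3, 4, 1, 2],
}
toy_pairs = comethylated_pairs(toy_levels)
```

With 21 genes there are 210 candidate pairs, so a real analysis would also need a multiple-testing correction on the per-pair p-values.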
|
2
|
Bhadra T, Mallik S, Sohel A, Zhao Z. Unsupervised Feature Selection Using an Integrated Strategy of Hierarchical Clustering With Singular Value Decomposition: An Integrative Biomarker Discovery Method With Application to Acute Myeloid Leukemia. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:1354-1364. [PMID: 34495838 DOI: 10.1109/tcbb.2021.3110989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
In this article, we propose a novel unsupervised feature selection method that combines hierarchical feature clustering with singular value decomposition (SVD). The proposed algorithm first generates several feature clusters by applying hierarchical clustering to the feature space and then applies SVD to each of these feature clusters to find the feature that contributes most to the SVD-entropy. The proposed feature selection method selects an optimal feature subset that not only minimizes the mutual dependency among the selected features but also maximizes the mutual dependency of the selected features against their nearest neighbor non-selected features to some extent. Each of the selected features also contributes the maximum SVD-entropy among all features of the same feature cluster. The experimental results demonstrate that the proposed algorithm performs well against several state-of-the-art methods of feature selection in terms of various evaluation criteria such as classification accuracy, redundancy rate, and representation entropy. The superiority of the proposed algorithm is demonstrated through analysis of Acute Myeloid Leukemia (AML) multi-omics data that consist of five datasets: gene expression, exon expression, methylation, microRNA, and pathway activity dataset (paradigm IPLs) from The Cancer Genome Atlas (TCGA). Our analysis pinpoints a candidate gene marker, EREG, for AML with integrative omics evidence. EREG is targeted by two top-ranked microRNAs, hsa-miR-1286 and hsa-miR-1976, in these datasets. The method and results will be useful for biomarker discovery in the era of precision medicine.
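The two-stage idea (cluster the features, then keep the feature with the largest SVD-entropy contribution in each cluster) can be sketched roughly as below. This is an illustration under stated assumptions, not the published algorithm: the distance `1 - |correlation|`, average linkage, the leave-one-out definition of a feature's entropy contribution, and the function names `svd_entropy` and `select_features` are choices made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def svd_entropy(X):
    """Normalized entropy of the squared singular values of X (0.0 if < 2 values)."""
    s = np.linalg.svd(X, compute_uv=False)
    if len(s) < 2:
        return 0.0
    v = s ** 2
    v = v / v.sum()
    v = v[v > 0]  # drop zeros so that log is defined
    return float(-(v * np.log(v)).sum() / np.log(len(s)))


def select_features(X, n_clusters):
    """X has shape (samples, features). Returns one feature index per cluster."""
    # Stage 1: hierarchical clustering of the features (columns).
    corr = np.corrcoef(X, rowvar=False)
    dist = np.clip(1.0 - np.abs(corr), 0.0, None)
    condensed = dist[np.triu_indices_from(dist, k=1)]  # row-major upper triangle
    tree = linkage(condensed, method="average")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")

    # Stage 2: within each cluster, keep the feature whose removal
    # lowers the SVD-entropy the most (leave-one-out contribution).
    selected = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        if len(idx) == 1:
            selected.append(int(idx[0]))
            continue
        block = X[:, idx]
        full = svd_entropy(block)
        contrib = [full - svd_entropy(np.delete(block, j, axis=1))
                   for j in range(len(idx))]
        selected.append(int(idx[int(np.argmax(contrib))]))
    return sorted(selected)


# Toy data: features 0-2 follow one latent signal, features 3-5 another.
rng = np.random.default_rng(0)
a, b = rng.normal(size=60), rng.normal(size=60)
cols = [a + 0.05 * rng.normal(size=60) for _ in range(3)]
cols += [b + 0.05 * rng.normal(size=60) for _ in range(3)]
X_toy = np.column_stack(cols)
chosen = select_features(X_toy, n_clusters=2)
```

Clustering first keeps the selected features mutually non-redundant, while the entropy step picks a representative from each group.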
|
3
|
Mallick K, Mallik S, Bandyopadhyay S, Chakraborty S. A Novel Graph Topology-Based GO-Similarity Measure for Signature Detection From Multi-Omics Data and its Application to Other Problems. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:773-785. [PMID: 32866101 DOI: 10.1109/tcbb.2020.3020537] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/11/2023]
Abstract
Large-scale multi-omics data analysis and signature prediction have been a topic of interest in the last two decades. Although various traditional clustering/correlation-based methods have been proposed, the overall prediction is not always satisfactory. To address these challenges, we propose a new approach that leverages Gene Ontology (GO) similarity combined with multi-omics data. A new GO similarity measure, ModSchlicker, is proposed, and the effectiveness of the proposed measure along with other standardized measures is reviewed using various graph topology-based Information Content (IC) values of GO terms. The proposed measure is deployed for PPI prediction. Furthermore, by involving GO similarity, we propose a new framework for stronger disease-based gene signature detection from multi-omics data. For the first objective, we predict interactions from various benchmark PPI datasets of yeast and human. For the latter, gene expression and methylation profiles are used to identify Differentially Expressed and Methylated (DEM) genes. Thereafter, the GO similarity score along with a statistical method is used to determine the potential gene signature. The proposed method performs better (0.9 average accuracy and 0.95 AUC) than other existing related methods in classifying the participating features (genes) of the signature. Moreover, the proposed method is highly useful in other prediction/classification problems for any kind of large-scale omics data.
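The abstract does not define ModSchlicker, so the sketch below shows only the original Schlicker relevance similarity that the name refers to, computed with a topology-based term probability. The toy ontology, its term names, and the choice of p(c) as the share of terms that are c or one of its descendants are assumptions for illustration.

```python
from math import log

# Hypothetical toy ontology: each term maps to its list of parents.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "B1": ["B"],
}


def ancestors(term):
    """All ancestors of a term, including the term itself."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_probability(term):
    """Topology-based p(c): fraction of terms that are c or a descendant of c."""
    n_desc = sum(1 for t in PARENTS if term in ancestors(t))
    return n_desc / len(PARENTS)


def sim_rel(c1, c2):
    """Schlicker relevance similarity.

    max over common ancestors c of
    2*log p(c) / (log p(c1) + log p(c2)) * (1 - p(c)).
    """
    denom = log(term_probability(c1)) + log(term_probability(c2))
    if denom == 0:  # both terms are the root
        return 0.0
    best = 0.0
    for c in ancestors(c1) & ancestors(c2):
        p = term_probability(c)
        best = max(best, 2 * log(p) / denom * (1 - p))
    return best


siblings = sim_rel("A1", "A2")   # share the specific ancestor "A"
unrelated = sim_rel("A1", "B1")  # share only the root
```

The factor `1 - p(c)` down-weights matches whose most informative common ancestor is a very general term, which is why terms that meet only at the root score zero.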
|
4
|
Levy JJ, Chen Y, Azizgolshani N, Petersen CL, Titus AJ, Moen EL, Vaickus LJ, Salas LA, Christensen BC. MethylSPWNet and MethylCapsNet: Biologically Motivated Organization of DNAm Neural Networks, Inspired by Capsule Networks. NPJ Syst Biol Appl 2021; 7:33. [PMID: 34417465 PMCID: PMC8379254 DOI: 10.1038/s41540-021-00193-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/17/2020] [Accepted: 07/01/2021] [Indexed: 02/07/2023] Open
Abstract
DNA methylation (DNAm) alterations have been heavily implicated in carcinogenesis and the pathophysiology of diseases through upstream regulation of gene expression. DNAm deep-learning approaches are able to capture features associated with aging, cell type, and disease progression, but lack incorporation of prior biological knowledge. Here, we present modular, user-friendly deep-learning methodology and software, MethylCapsNet and MethylSPWNet, that group CpGs into biologically relevant capsules (such as gene promoter context, CpG island relationship, or user-defined groupings) and relate them to diagnostic and prognostic outcomes. We demonstrate these models' utility on 3,897 individuals in the classification of central nervous system (CNS) tumors. MethylCapsNet and MethylSPWNet provide an opportunity to increase DNAm deep-learning analyses' interpretability by enabling a flexible organization of DNAm data into biologically relevant capsules.
Affiliation(s)
- Joshua J Levy
  - Program in Quantitative Biomedical Sciences, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
  - Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
  - Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, USA
- Youdinghuan Chen
  - Program in Quantitative Biomedical Sciences, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
  - Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Nasim Azizgolshani
  - Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Curtis L Petersen
  - Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
  - The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA
- Alexander J Titus
  - Department of Life Sciences, University of New Hampshire, Manchester, NH, USA
- Erika L Moen
  - The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA
  - Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Louis J Vaickus
  - Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, USA
- Lucas A Salas
  - Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
  - Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Brock C Christensen
  - Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
  - Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
  - Department of Community and Family Medicine, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
|
5
|
Xing H, Wu Y, Zhang MQ, Chen Y. Deciphering hierarchical organization of topologically associated domains through change-point testing. BMC Bioinformatics 2021; 22:183. [PMID: 33838653 PMCID: PMC8037919 DOI: 10.1186/s12859-021-04113-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/25/2020] [Accepted: 03/30/2021] [Indexed: 12/20/2022] Open
Abstract
Background: The nucleus of eukaryotic cells spatially packages chromosomes into a hierarchical and distinct segregation that plays critical roles in maintaining transcription regulation. High-throughput methods of chromosome conformation capture, such as Hi-C, have revealed topologically associating domains (TADs) that are defined by biased chromatin interactions within them. Results: We introduce a novel method, HiCKey, to decipher hierarchical TAD structures in Hi-C data and compare them across samples. We first derive a generalized likelihood-ratio (GLR) test for detecting change-points in an interaction matrix that follows a negative binomial distribution or general mixture distribution. We then employ several optimal search strategies to decipher hierarchical TADs with p values calculated by the GLR test. Large-scale validations of simulation data show that HiCKey has good precision in recalling known TADs and is robust against random collisions of chromatin interactions. By applying HiCKey to Hi-C data of seven human cell lines, we identified multiple layers of TAD organization among them, but the vast majority had no more than four layers. In particular, we found that TAD boundaries are significantly enriched in active chromosomal regions compared to repressed regions. Conclusions: HiCKey is optimized for processing large matrices constructed from high-resolution Hi-C experiments. The method and theoretical result of the GLR test provide a general framework for significance testing of similar experimental chromatin interaction data that may not fully follow negative binomial distributions but rather more general mixture distributions. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04113-8.
Affiliation(s)
- Haipeng Xing
  - Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, 100 Nicolls Rd, Stony Brook, NY, 11794, USA
- Yingru Wu
  - Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, 100 Nicolls Rd, Stony Brook, NY, 11794, USA
- Michael Q Zhang
  - Center for System Biology, University of Texas at Dallas, 800 W Campbell Rd, Richardson, TX, 75080, USA
- Yong Chen
  - Department of Molecular and Cellular Biosciences, Rowan University, 201 Mullica Hill Rd, Glassboro, NJ, 08028, USA
|
6
|
Lu K, Yang K, Niyongabo E, Shu Z, Wang J, Chang K, Zou Q, Jiang J, Jia C, Liu B, Zhou X. Integrated network analysis of symptom clusters across disease conditions. J Biomed Inform 2020; 107:103482. [PMID: 32535270 DOI: 10.1016/j.jbi.2020.103482] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/24/2019] [Revised: 05/18/2020] [Accepted: 06/08/2020] [Indexed: 10/24/2022]
Abstract
Identifying symptom clusters (two or more related symptoms) with shared underlying molecular mechanisms is a vital analysis task for advancing symptom science and precision health. Related studies have applied clustering algorithms (e.g., k-means, latent class models) to detect symptom clusters, mostly from various kinds of clinical data. In addition, they focused on identifying symptom clusters (SCs) for a specific disease and were mainly concerned with the clinical regularities relevant to symptom management. Here, we utilized a network-based clustering algorithm (i.e., BigCLAM) to obtain 208 typical SCs across disease conditions on a large-scale symptom network derived from integrated high-quality disease-symptom associations. Furthermore, we evaluated the underlying shared molecular mechanisms for SCs, i.e., shared genes, protein-protein interactions (PPIs), and gene functional annotations, using integrated networks and similarity measures. We found that symptoms in the same SC tend to share more genes and PPIs and have higher functional homogeneity. In addition, we found that most SCs have related symptoms with shared underlying molecular mechanisms (e.g., enriched pathways) across different disease conditions. Our work demonstrated that the integrated network analysis method can be used to identify robust SCs and investigate their molecular mechanisms, which would be valuable for symptom science and precision health.
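BigCLAM models overlapping communities by giving each node a vector of nonnegative community strengths and treating an edge between nodes u and v as present with probability 1 - exp(-F_u · F_v). The sketch below only evaluates that log-likelihood for a given assignment; it does not fit the model, and the toy graph, the strength values, and the function name `bigclam_loglik` are assumptions for illustration rather than anything taken from the study.

```python
from itertools import combinations
from math import exp, log


def bigclam_loglik(n_nodes, edges, F, eps=1e-10):
    """Log-likelihood of an undirected graph under the BigCLAM model.

    F[u] lists the nonnegative community strengths of node u.
    Edges contribute log(1 - exp(-F_u . F_v)); non-edges contribute -F_u . F_v.
    eps keeps the logarithm finite when two linked nodes share no community.
    """
    edge_set = {frozenset(e) for e in edges}
    ll = 0.0
    for u, v in combinations(range(n_nodes), 2):
        dot = sum(a * b for a, b in zip(F[u], F[v]))
        if frozenset((u, v)) in edge_set:
            ll += log(max(1.0 - exp(-dot), eps))
        else:
            ll -= dot
    return ll


# Toy symptom network: two triangles that overlap in node 2.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
s = 1.5  # arbitrary strength chosen for the example

# Node 2 belongs to both communities (overlapping assignment).
overlapping = [[s, 0], [s, 0], [s, s], [0, s], [0, s]]
# Every node forced into one community.
single = [[s, 0]] * 5

ll_overlapping = bigclam_loglik(5, toy_edges, overlapping)
ll_single = bigclam_loglik(5, toy_edges, single)
```

The overlapping assignment scores higher because the single-community assignment is penalized for the four missing links between the two triangles; fitting BigCLAM amounts to searching for the strengths that maximize this quantity.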
Affiliation(s)
- Kezhi Lu
  - Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Kuo Yang
  - Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Edouard Niyongabo
  - Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Zixin Shu
  - Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Jingjing Wang
  - Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Kai Chang
  - Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Qunsheng Zou
  - Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Jiyue Jiang
  - Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Caiyan Jia
  - Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Baoyan Liu
  - Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Xuezhong Zhou
  - Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
  - Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
|