1
Zhong Y, Li J, He J, Gao Y, Liu J, Wang J, Shang X, Hu J. Twadn: an efficient alignment algorithm based on time warping for pairwise dynamic networks. BMC Bioinformatics 2020; 21:385. [PMID: 32938373 PMCID: PMC7495832 DOI: 10.1186/s12859-020-03672-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]
Abstract
Background: Network alignment is an efficient computational framework for predicting protein function and phylogenetic relationships in systems biology. However, most existing alignment methods align protein-protein interaction (PPI) networks using a static network model, although these networks are dynamic in real-world systems. The dynamic character of PPI networks is essential for understanding evolution and regulation mechanisms at the molecular level, and there is still much room to improve alignment quality in dynamic networks. Results: In this paper, we propose a novel alignment algorithm, Twadn, which aligns dynamic PPI networks using a time-warping strategy. We compare Twadn with the existing dynamic network alignment algorithms DynaMAGNA++ and DynaWAVE, using the area under the receiver operating characteristic curve and the area under the precision-recall curve as evaluation indicators. The experimental results show that Twadn is superior to DynaMAGNA++ and DynaWAVE. In addition, we use the protein interaction network of Drosophila to compare Twadn with the static network alignment algorithm NetCoffee2; the results show that, unlike NetCoffee2, Twadn is able to capture timing information. Conclusions: Twadn is a versatile and efficient alignment tool that can be applied to dynamic networks. Hopefully, its application can benefit the research community in the fields of molecular function and evolution.
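The abstract does not spell out how the time-warping strategy works, so the following is only a minimal sketch of classic dynamic time warping (DTW), the general technique the name refers to, and not Twadn's actual procedure. It assumes each protein is summarized by one number per network snapshot; the degree sequences `degrees_u` and `degrees_v` are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: advance in a, advance in b, advance in both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical per-snapshot degrees of one node in each dynamic network.
degrees_u = [1, 2, 3, 3, 2]
degrees_v = [1, 1, 2, 3, 2]
print(dtw_distance(degrees_u, degrees_v))  # 0.0: same profile, shifted in time
```

A plain position-by-position comparison would penalize the two profiles above for being out of step, whereas the warped alignment treats them as identical. That tolerance to timing shifts is the property a static comparison lacks.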
Affiliation(s)
- Yuanke Zhong
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China
- Jing Li
  - Xi'an Mingde Institute of Technology, Fenghe Campus, Xi'an, 710124, China
- Junhao He
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China
- Yiqun Gao
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China
- Jie Liu
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China
- Jingru Wang
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China
- Jialu Hu
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China
  - Centre of Multidisciplinary Convergence Computing, School of Computer Science, Northwestern Polytechnical University, 1 Dong Xiang Road, Xi'an, 710129, China
2
Peng J, Guan J, Hui W, Shang X. A novel subnetwork representation learning method for uncovering disease-disease relationships. Methods 2020; 192:77-84. [PMID: 32946974 DOI: 10.1016/j.ymeth.2020.09.002] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 06/17/2020] [Revised: 08/20/2020] [Accepted: 09/07/2020] [Indexed: 12/12/2022]
Abstract
Analyzing disease-disease relationships plays an important role in understanding disease mechanisms and finding alternative uses for a drug. A disease is usually the result of abnormal states of multiple molecular processes. Since biological networks can model the interplay of multiple molecular processes, network-based methods have recently been proposed to uncover disease-disease relationships. Given a disease and a network, the disease can be represented as a subnetwork constructed from the disease genes present in the given network, named the disease subnetwork. Because it is difficult to learn feature representations of disease subnetworks, most existing methods are unsupervised and do not use labeled information. To fill this gap, we propose a novel method named SubNet2vec to learn the feature vectors of diseases from their corresponding subnetworks in the biological network. By utilizing the feature representation of disease subnetworks, we can analyze disease-disease relationships in a supervised fashion. The evaluation results show that the proposed framework outperforms some state-of-the-art approaches by a large margin on disease-disease and disease-drug association prediction. The source code and data are available at https://github.com/MedicineBiology-AI/SubNet2vec.git.
Affiliation(s)
- Jiajie Peng
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
- Jiaojiao Guan
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
- Weiwei Hui
  - Vivo mobile communications (Hang Zhou) co. LTD, China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
3
Patra S, Mohapatra A. Review of tools and algorithms for network motif discovery in biological networks. IET Syst Biol 2020; 14:171-189. [PMID: 32737276 PMCID: PMC8687426 DOI: 10.1049/iet-syb.2020.0004] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/13/2022]
Abstract
Network motifs are recurrent and over‐represented patterns having biological relevance. This is one of the important local properties of biological networks. Network motif discovery finds important applications in many areas such as functional analysis of biological components, the validity of network composition, classification of networks, disease discovery, and identification of unique subunits. The discovery of network motifs is a computationally challenging task due to the large size of real networks, and the exponential increase of search space with respect to network size and motif size. This problem also includes the subgraph isomorphism check, which is Nondeterministic Polynomial (NP)‐complete. Several tools and algorithms have been designed in the last few years to address this problem with encouraging results. These tools and algorithms can be classified into various categories based on exact census, mapping, pattern growth, and so on. In this study, critical aspects of network motif discovery, design principles of background algorithms, and their functionality have been reviewed with their strengths and limitations. The performance of state‐of‐the‐art algorithms is discussed in terms of runtime efficiency, scalability, and space requirement. The future scope, research direction, and challenges of the existing algorithms are presented at the end of the study.
Affiliation(s)
- Sabyasachi Patra
  - Department of Computer Science, IIIT Bhubaneswar, Odisha, India
4
Qin G, Mallik S, Mitra R, Li A, Jia P, Eischen CM, Zhao Z. MicroRNA and transcription factor co-regulatory networks and subtype classification of seminoma and non-seminoma in testicular germ cell tumors. Sci Rep 2020; 10:852. [PMID: 31965022 PMCID: PMC6972857 DOI: 10.1038/s41598-020-57834-w] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 11/07/2019] [Accepted: 12/24/2019] [Indexed: 12/11/2022]
Abstract
Recent studies have revealed that feed-forward loops (FFLs) as regulatory motifs have synergistic roles in cellular systems and their disruption may cause diseases including cancer. FFLs may include two regulators such as transcription factors (TFs) and microRNAs (miRNAs). In this study, we extensively investigated TF and miRNA regulation pairs, their FFLs, and TF-miRNA mediated regulatory networks in two major types of testicular germ cell tumors (TGCT): seminoma (SE) and non-seminoma (NSE). Specifically, we identified differentially expressed mRNA genes and miRNAs in 103 tumors using the transcriptomic data from The Cancer Genome Atlas. Next, we determined significantly correlated TF-gene/miRNA and miRNA-gene/TF pairs with regulation direction. Subsequently, we determined 288 and 664 dysregulated TF-miRNA-gene FFLs in SE and NSE, respectively. By constructing dysregulated FFL networks, we found that many hub nodes (12 out of 30 for SE and 8 out of 32 for NSE) in the top ranked FFLs could predict subtype-classification (Random Forest classifier, average accuracy ≥90%). These hub molecules were validated by an independent dataset. Our network analysis pinpointed several SE-specific dysregulated miRNAs (miR-200c-3p, miR-25-3p, and miR-302a-3p) and genes (EPHA2, JUN, KLF4, PLXDC2, RND3, SPI1, and TIMP3) and NSE-specific dysregulated miRNAs (miR-367-3p, miR-519d-3p, and miR-96-5p) and genes (NR2F1 and NR2F2). This study is the first systematic investigation of TF and miRNA regulation and their co-regulation in two major TGCT subtypes.
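As an illustration of what a feed-forward loop looks like in a regulatory graph, the sketch below enumerates triples (a, b, c) in which a regulates b, b regulates c, and a also regulates c directly. It is a generic enumeration over a directed edge list, not the study's pipeline (which also involves differential expression and correlation filtering); the node names are hypothetical.

```python
def find_feed_forward_loops(edges):
    """Return sorted (regulator, mediator, target) triples.

    A triple (a, b, c) is reported when the directed edges a->b, b->c
    and a->c are all present and the three nodes are distinct.
    """
    successors = {}
    for source, target in set(edges):
        successors.setdefault(source, set()).add(target)

    loops = []
    for a, a_targets in successors.items():
        for b in a_targets:
            for c in successors.get(b, ()):
                if len({a, b, c}) == 3 and c in a_targets:
                    loops.append((a, b, c))
    return sorted(loops)


# Hypothetical regulatory edges: a TF, a miRNA and two target genes.
edges = [
    ("TF_A", "miR_X"),
    ("miR_X", "GENE_1"),
    ("TF_A", "GENE_1"),
    ("TF_A", "GENE_2"),
]
print(find_feed_forward_loops(edges))  # [('TF_A', 'miR_X', 'GENE_1')]
```

Storing successors as sets keeps the closing-edge check (`c in a_targets`) constant time, so the cost is driven by the number of two-step paths rather than by all node triples.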
Affiliation(s)
- Guimin Qin
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, China
- Saurav Mallik
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Ramkrishna Mitra
  - Department of Cancer Biology, Sidney Kimmel Cancer Center, Thomas Jefferson University, Philadelphia, PA, USA
- Aimin Li
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, Shaanxi, China
- Peilin Jia
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Christine M Eischen
  - Department of Cancer Biology, Sidney Kimmel Cancer Center, Thomas Jefferson University, Philadelphia, PA, USA
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
5
Hu J, He J, Li J, Gao Y, Zheng Y, Shang X. A novel algorithm for alignment of multiple PPI networks based on simulated annealing. BMC Genomics 2019; 20:932. [PMID: 31881842 PMCID: PMC6933650 DOI: 10.1186/s12864-019-6302-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 02/07/2023]
Abstract
Proteins play essential roles in almost all life processes. The prediction of protein function is of significance for the understanding of molecular function and evolution. Network alignment provides a fast and effective framework to automatically identify functionally conserved proteins in a systematic way. However, due to the fast growing genomic data, interactions and annotation data, there is an increasing demand for more accurate and efficient tools to deal with multiple PPI networks. Here, we present a novel global alignment algorithm NetCoffee2 based on graph feature vectors to discover functionally conserved proteins and predict function for unknown proteins. To test the algorithm performance, NetCoffee2 and three other notable algorithms were applied on eight real biological datasets. Functional analyses were performed to evaluate the biological quality of these alignments. Results show that NetCoffee2 is superior to existing algorithms IsoRankN, NetCoffee and multiMAGNA++ in terms of both coverage and consistency. The binary and source code are freely available under the GNU GPL v3 license at https://github.com/screamer/NetCoffee2.
Affiliation(s)
- Jialu Hu
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
  - Centre of Multidisciplinary Convergence Computing, School of Computer Science, Northwestern Polytechnical University, 1 Dong Xiang Road, Xi’an, 710129 China
- Junhao He
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
- Jing Li
  - Ming De College, Northwestern Polytechnical University, Feng He Campus, Xi’an, 710124 China
- Yiqun Gao
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
- Yan Zheng
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
6
Hu J, Gao Y, Li J, Zheng Y, Wang J, Shang X. A novel algorithm based on bi-random walks to identify disease-related lncRNAs. BMC Bioinformatics 2019; 20:569. [PMID: 31760932 PMCID: PMC6876073 DOI: 10.1186/s12859-019-3128-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]
Abstract
Background: There is evidence to suggest that lncRNAs are associated with distinct and diverse biological processes. The dysfunction or mutation of lncRNAs is implicated in a wide range of diseases. An accurate computational model can benefit the diagnosis of diseases and help us gain a better understanding of the molecular mechanism. Although many related algorithms have been proposed, there is still much room to improve their accuracy. Results: We developed a novel algorithm, BiWalkLDA, to predict disease-related lncRNAs in three real datasets, which have 528 lncRNAs, 545 diseases and 1216 interactions in total. To compare performance with other algorithms, a leave-one-out validation test was performed for BiWalkLDA and three other existing algorithms: SIMCLDA, LDAP and LRLSLDA. Additional tests were designed to analyze the effects of the parameters α, β, l and r, which can help users select the best values of these parameters for their own applications. In a case study of prostate cancer, eight of the top ten disease-related lncRNAs reported by BiWalkLDA were previously confirmed in the literature. Conclusions: In this paper, we develop an algorithm, BiWalkLDA, to predict lncRNA-disease associations by using bi-random walks. It constructs a lncRNA-disease network by integrating interaction profiles and gene ontology information, and it addresses the cold-start problem by using neighbors’ interaction profile information. Bi-random walks were then applied to three real biological datasets. Results show that our method outperforms other algorithms in predicting lncRNA-disease associations in terms of both accuracy and specificity. Availability: https://github.com/screamer/BiwalkLDA
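To make the idea of a bi-random walk concrete, here is a minimal sketch of the generic scheme: association scores are propagated over the lncRNA similarity network (left walk) and over the disease similarity network (right walk), each step is blended with the known associations, and the two walks are averaged. This is an assumption-laden illustration, not BiWalkLDA itself: the row normalization, the restart weight `alpha`, and the step limits `left_steps` and `right_steps` are choices made for this example, the abstract's β parameter is omitted, and the toy matrices are hypothetical.

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def row_normalize(m):
    """Scale each row to sum to 1 (all-zero rows stay zero)."""
    return [[v / sum(row) if sum(row) else 0.0 for v in row] for row in m]


def blend(walked, prior, alpha):
    """Element-wise alpha * walked + (1 - alpha) * prior."""
    return [[alpha * w + (1 - alpha) * p for w, p in zip(wr, pr)]
            for wr, pr in zip(walked, prior)]


def bi_random_walk(lnc_sim, dis_sim, known, alpha=0.8, left_steps=2, right_steps=2):
    """Score lncRNA-disease pairs; rows are lncRNAs, columns are diseases."""
    total = sum(map(sum, known))
    if total == 0:
        raise ValueError("at least one known association is required")
    prior = [[v / total for v in row] for row in known]
    w_lnc = row_normalize(lnc_sim)
    w_dis = row_normalize(dis_sim)
    scores = prior
    for step in range(1, max(left_steps, right_steps) + 1):
        walks = []
        if step <= left_steps:   # walk on the lncRNA similarity network
            walks.append(blend(matmul(w_lnc, scores), prior, alpha))
        if step <= right_steps:  # walk on the disease similarity network
            walks.append(blend(matmul(scores, w_dis), prior, alpha))
        # Average whichever walks were taken at this step.
        scores = [[sum(vals) / len(walks) for vals in zip(*rows)]
                  for rows in zip(*walks)]
    return scores


# Hypothetical toy data: two similar lncRNAs, one disease, one known link.
lnc_sim = [[1.0, 0.5], [0.5, 1.0]]
dis_sim = [[1.0]]
known = [[1], [0]]
scores = bi_random_walk(lnc_sim, dis_sim, known)
print(scores[1][0] > 0)  # True: the unlinked lncRNA inherits some score
```

The second lncRNA has no known association, yet it receives a positive score because it is similar to one that does. This is the mechanism by which a walk-based method can rank candidates that have no direct evidence.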
Affiliation(s)
- Jialu Hu
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
  - Centre for Multidisciplinary Convergence Computing, School of Computer Science, Northwestern Polytechnical University, Xi'an, 710129, China
- Yiqun Gao
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
- Jing Li
  - Ming De College, Northwestern Polytechnical University, Xi'an, 710124, China
- Yan Zheng
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
- Jingru Wang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
7
Wang L, Zhao H, Li J, Xu Y, Lan Y, Yin W, Liu X, Yu L, Lin S, Du MY, Li X, Xiao Y, Zhang Y. Identifying functions and prognostic biomarkers of network motifs marked by diverse chromatin states in human cell lines. Oncogene 2019; 39:677-689. [PMID: 31537905 PMCID: PMC6962092 DOI: 10.1038/s41388-019-1005-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/23/2019] [Revised: 07/30/2019] [Accepted: 08/15/2019] [Indexed: 12/15/2022]
Abstract
Epigenetic modifications play critical roles in modulating gene expression, yet their roles in regulatory networks in human cell lines remain poorly characterized. We integrated multiomics data to construct directed regulatory networks with nodes and edges labeled with chromatin states in human cell lines. We observed extensive association between diverse chromatin states and network motifs. The gene expression analysis showed that diverse chromatin states of coherent type-1 feedforward loops (C1-FFL) and incoherent type-1 feedforward loops (I1-FFL) contributed to the dynamic expression patterns of targets. Notably, diverse chromatin state compositions could help C1- or I1-FFLs to control a large number of distinct biological functions in human cell lines; for example, four different types of chromatin state compositions cooperated with K562-associated C1-FFLs to control “regulation of cytokinesis,” “G1/S transition of mitotic cell cycle,” “DNA recombination,” and “telomere maintenance,” respectively. Remarkably, we identified six chromatin state-marked C1-FFL instances (HCFC1-NFYA-ABL1, THAP1-USF1-BRCA2, ZNF263-USF1-UBA52, MYC-ATF1-UBA52, ELK1-EGR1-CCT4, and YY1-EGR1-INO80C) that could act as prognostic biomarkers of acute myelogenous leukemia through influencing cancer-related biological functions, such as cell proliferation, telomere maintenance, and DNA recombination. Our results provide novel insight for better understanding of chromatin state-mediated gene regulation and facilitate the identification of novel diagnostic and therapeutic biomarkers of human cancers.
Affiliation(s)
- Li Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, 150081, Harbin, China
- Hongying Zhao
  - College of Bioinformatics Science and Technology, Harbin Medical University, 150081, Harbin, China
- Jing Li
  - Department of Ultrasonic medicine, The First Affiliated Hospital of Heilongjiang University of Chinese Medicine, 150040, Harbin, China
- Yingqi Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, 150081, Harbin, China
- Yujia Lan
  - College of Bioinformatics Science and Technology, Harbin Medical University, 150081, Harbin, China
- Wenkang Yin
  - College of Bioinformatics Science and Technology, Harbin Medical University, 150081, Harbin, China
- Xiaoqin Liu
  - College of Bioinformatics Science and Technology, Harbin Medical University, 150081, Harbin, China
- Lei Yu
  - College of Bioinformatics Science and Technology, Harbin Medical University, 150081, Harbin, China
- Shihua Lin
  - College of Bioinformatics Science and Technology, Harbin Medical University, 150081, Harbin, China
- Michael Yifei Du
  - Weston High School of Massachusetts, 444 Wellesley street, Weston, MA, 02493, USA
- Xia Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, 150081, Harbin, China
- Yun Xiao
  - College of Bioinformatics Science and Technology, Harbin Medical University, 150081, Harbin, China
- Yunpeng Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, 150081, Harbin, China
8
Hu J, Wang J, Lin J, Liu T, Zhong Y, Liu J, Zheng Y, Gao Y, He J, Shang X. MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites. BMC Bioinformatics 2019; 20:200. [PMID: 31074373 PMCID: PMC6509868 DOI: 10.1186/s12859-019-2735-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/28/2022]
Abstract
Background: Transcription factors (TFs) play important roles in the regulation of gene expression. They can activate or block transcription of downstream genes by binding to specific genomic sequences. Therefore, motif discovery of these binding preference patterns is of central significance for understanding molecular regulation mechanisms. Many algorithms have been proposed for the identification of transcription factor binding sites. However, it remains a challenging problem. Results: Here, we propose a novel motif discovery algorithm based on support vector machines (MD-SVM) to learn a discriminative model for TF binding sites. MD-SVM first obtains a position weight matrix (PWM) from a set of training datasets. It then translates the motif discovery (MD) problem into the computational framework of multiple instance learning (MIL). It was applied to several real biological datasets. Results show that our algorithm outperforms MI-SVM in terms of both accuracy and specificity. Conclusions: In this paper, we modeled the TF motif discovery problem as a MIL optimization problem. The SVM algorithm was adapted to discriminate positive and negative bags of instances. Compared to other SVM-based algorithms, MD-SVM shows its superiority in terms of ROC AUC. Hopefully, it could be of benefit to the research community in understanding the molecular functions of DNA functional elements and transcription factors.
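The position weight matrix step can be illustrated with a short sketch. It builds a log-odds PWM from aligned binding sites and then scores every window of a longer sequence, which mirrors the multiple-instance view in which a sequence is a bag and its windows are instances. This covers only a generic PWM, not the SVM training or MD-SVM's actual procedure; the pseudocount, the uniform background of 0.25, and the example sites are assumptions chosen for illustration.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length DNA sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        # log2 of (smoothed base frequency / background frequency)
        pwm.append({base: math.log2((column.count(base) + pseudocount) / total / background)
                    for base in "ACGT"})
    return pwm


def score(pwm, window):
    """Sum of per-position log-odds for a window as wide as the PWM."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_window(pwm, sequence):
    """Start index of the highest-scoring window (the best instance in the bag)."""
    width = len(pwm)
    return max(range(len(sequence) - width + 1),
               key=lambda i: score(pwm, sequence[i:i + width]))


# Hypothetical aligned binding sites for one transcription factor.
sites = ["TGACG", "TGACG", "TGACA", "TGATG"]
pwm = build_pwm(sites)
print(best_window(pwm, "CCTGACGAA"))  # 2
```

The pseudocount keeps unseen bases from producing a log of zero, so a window with one mismatch is penalized instead of being ruled out entirely.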
Affiliation(s)
- Jialu Hu
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
  - Centre of Multidisciplinary Convergence Computing, School of Computer Science, Northwestern Polytechnical University, 1 Dong Xiang Road, Xi’an, 710129 China
- Jingru Wang
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
- Jianan Lin
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
- Tianwei Liu
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
- Yuanke Zhong
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
- Jie Liu
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
- Yan Zheng
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
- Yiqun Gao
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
- Junhao He
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
9
MiteFinderII: a novel tool to identify miniature inverted-repeat transposable elements hidden in eukaryotic genomes. BMC Med Genomics 2018; 11:101. [PMID: 30453969 PMCID: PMC6245586 DOI: 10.1186/s12920-018-0418-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/15/2022]
Abstract
Background: Miniature inverted-repeat transposable elements (MITEs) are class II non-autonomous transposable elements that play a crucial role in evolution. There is an urgent need for bioinformatics tools that effectively identify MITEs on a genome-wide scale. However, most existing tools cope poorly with large eukaryotic genomes. Methods: In this paper, we propose a novel tool, MiteFinderII, adapted from our previous algorithm MiteFinder, to efficiently detect MITEs in genomic sequences. It has six major steps: (1) building a k-mer index and searching for inverted repeats; (2) filtering out inverted repeats with low complexity; (3) merging inverted repeats; (4) filtering out candidates with low scores; (5) selecting final MITE sequences; (6) selecting representative sequences. Results: To test the performance, MiteFinderII and three other existing algorithms were applied to identify MITEs in the whole genome of Oryza sativa. Results suggest that MiteFinderII outperforms existing popular tools in terms of both specificity and recall. Additionally, it is much faster and more memory-efficient than the other tools. Conclusion: MiteFinderII is an accurate and effective tool for detecting MITEs hidden in eukaryotic genomes. The source code is freely accessible at https://github.com/screamer/miteFinder.
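Step (1) above can be sketched as follows: index every k-mer by position, then look up each k-mer's reverse complement to find a matching arm further downstream. This is a simplified illustration of the idea rather than MiteFinderII's implementation; it assumes an uppercase A/C/G/T sequence, requires exact matches, and the parameter `max_span` (maximum total length of the candidate element) and the toy sequence are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def build_kmer_index(genome, k):
    """Map each k-mer to the sorted list of its start positions."""
    index = {}
    for start in range(len(genome) - k + 1):
        index.setdefault(genome[start:start + k], []).append(start)
    return index


def find_inverted_repeats(genome, k, max_span):
    """Return sorted (left_start, right_start) pairs of inverted-repeat arms.

    The right arm is the reverse complement of the left arm, starts after
    the left arm ends, and the whole element is at most max_span long.
    """
    index = build_kmer_index(genome, k)
    hits = []
    for kmer, left_starts in index.items():
        right_starts = index.get(reverse_complement(kmer), [])
        for left in left_starts:
            for right in right_starts:
                if right >= left + k and right + k - left <= max_span:
                    hits.append((left, right))
    return sorted(hits)


# Toy sequence: arm GGCTTA, a spacer, then its reverse complement TAAGCC.
genome = "TTGGCTTACCCCCTAAGCCTT"
print(find_inverted_repeats(genome, k=6, max_span=20))  # [(2, 13)]
```

Looking up the reverse complement in the index avoids comparing every pair of positions directly. The later steps (complexity filtering, merging, scoring) would then operate on these candidate pairs.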
10
Hu J, Gao Y, He J, Zheng Y, Shang X. WebNetCoffee: a web-based application to identify functionally conserved proteins from Multiple PPI networks. BMC Bioinformatics 2018; 19:422. [PMID: 30419809 PMCID: PMC6233501 DOI: 10.1186/s12859-018-2443-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/04/2018] [Accepted: 10/22/2018] [Indexed: 12/29/2022]
Abstract
Background: The discovery of functionally conserved proteins is a difficult and important task in systems biology. Global network alignment provides a systematic framework to search for these proteins in multiple protein-protein interaction (PPI) networks. Although many web servers for network alignment exist, none allows users to perform global multiple network alignment on their own test datasets. Results: Here, we developed a web server, WebNetCoffee, based on the NetCoffee algorithm, to search for a global network alignment of multiple networks. To build a series of online test datasets, we manually collected 218,339 proteins, 4,009,541 interactions and many other associated protein annotations from several public databases. All these datasets and alignment results are available for download, which supports algorithm comparison and downstream analyses. Conclusion: WebNetCoffee provides a versatile, interactive and user-friendly interface for running alignment tasks on both online datasets and users' test datasets, managing submitted jobs and visualizing the alignment results through a web browser. Additionally, our web server facilitates graphical visualization of induced subnetworks for a given protein and its neighborhood. To the best of our knowledge, it is the first web server that supports global alignment of multiple PPI networks. Availability: http://www.nwpu-bioinformatics.com/WebNetCoffee.
Affiliation(s)
- Jialu Hu
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072 China
  - Centre for Multidisciplinary Convergence Computing, School of Computer Science, Northwestern Polytechnical University, Xi’an, 710129 China
- Yiqun Gao
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072 China
- Junhao He
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072 China
- Yan Zheng
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072 China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072 China
11
Peng J, Xue H, Hui W, Lu J, Chen B, Jiang Q, Shang X, Wang Y. An online tool for measuring and visualizing phenotype similarities using HPO. BMC Genomics 2018; 19:571. [PMID: 30367579 PMCID: PMC6101067 DOI: 10.1186/s12864-018-4927-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Abstract
Background: The Human Phenotype Ontology (HPO) is one of the most popular bioinformatics resources. Recently, HPO-based phenotype semantic similarity has been effectively applied to model patient phenotype data. However, the existing tools are adapted from Gene Ontology (GO)-based term similarity, and the design of their models is not optimized for the unique features of HPO. In addition, existing tools only allow HPO terms as input and only provide pure text-based outputs. Results: We present PhenoSimWeb, a web application that allows researchers to measure HPO-based phenotype semantic similarities using four approaches borrowed from GO-based similarity measurements. In addition, we provide an approach that considers the unique properties of HPO. PhenoSimWeb also accepts text that describes phenotypes as input, since clinical phenotype data is always in text form, and provides a graphical interface to visualize the resulting phenotype network. Conclusions: PhenoSimWeb is an easy-to-use and functional online application. Researchers can use it to calculate phenotype similarity conveniently, predict phenotype-associated genes or diseases, and visualize the network of phenotype interactions. PhenoSimWeb is available at http://120.77.47.2:8080.
Affiliation(s)
- Jiajie Peng
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
- Hansheng Xue
  - Department of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, 518055, China
- Weiwei Hui
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
- Junya Lu
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
- Bolin Chen
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
- Qinghua Jiang
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
- Yadong Wang
  - Department of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, 518055, China
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
12
A Similarity Regression Fusion Model for Integrating Multi-Omics Data to Identify Cancer Subtypes. Genes (Basel) 2018; 9:314. [PMID: 29933539 PMCID: PMC6070922 DOI: 10.3390/genes9070314] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/31/2018] [Revised: 06/08/2018] [Accepted: 06/11/2018] [Indexed: 12/26/2022]
Abstract
The identification of cancer subtypes is crucial to cancer diagnosis and treatments. A number of methods have been proposed to identify cancer subtypes by integrating multi-omics data in recent years. However, the existing methods rarely consider the biases of similarity between samples and weights of different omics data in integration. More accurate and flexible integration approaches need to be developed to comprehensively investigate cancer subtypes. In this paper, we propose a simple and flexible similarity fusion model for integrating multi-omics data to identify cancer subtypes. We consider the similarity biases between samples in each omics data and predict corrected similarities between samples using a generalized linear model. We integrate the corrected similarity information from multi-omics data according to different data-view weights. Based on the integrative similarity information, we cluster patient samples into different subtype groups. Comprehensive experiments demonstrate that the proposed approach obtains more significant results than the state-of-the-art integrative methods. In conclusion, our approach provides an effective and flexible tool to investigate subtypes in cancer by integrating multi-omics data.
|
13
|
Hu J, Gao Y, Zheng Y, Shang X. KF-finder: identification of key factors from host-microbial networks in cervical cancer. BMC Systems Biology 2018; 12:54. [PMID: 29745858 PMCID: PMC5998879 DOI: 10.1186/s12918-018-0566-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 02/08/2023]
Abstract
Background The human body is colonized by a vast number of microbes. Microbiota can benefit many normal life processes, but can also cause many diseases by interfering with regular metabolism and the immune system. Recent studies have demonstrated that the microbial community is closely associated with various types of cell carcinoma. The search for key factors, also referred to as cancer-causing agents, can provide an important clue for understanding the regulatory mechanism of microbiota in uterine cervix cancer. Results In this paper, we investigated microbiota composition and gene expression data for 58 squamous and adenosquamous cell carcinomas. A host-microbial covariance network was constructed based on the 16S rRNA and gene expression data of the samples; it consists of 259 abundant microbes and 738 differentially expressed genes (DEGs). To search for risk factors in host-microbial networks, the method of bi-partite betweenness centrality (BpBC) was used to measure the risk of a given node to a certain biological process in hosts. A web-based tool, KF-finder, was developed, which can efficiently query and visualize the knowledge of microbiota and DEGs in the network. Conclusions Our results suggest that Prevotellaceae, Tissierellaceae and Fusobacteriaceae are the most abundant microbes in cervical carcinoma, and that the microbial community in cervical cancer is less diverse than that of any other body site in health. A set of key risk factors, Anaerococcus, Hydrogenophilaceae, Eubacterium, PSMB10, KCNIP1 and KRT13, has been identified; these are thought to be involved in the regulation of viral response, cell cycle and epithelial cell differentiation in cervical cancer. It can be concluded that permanent changes of microbiota composition could be a major force for chromosomal instability, which subsequently enables the effect of key risk factors in cancer.
All our results described in this paper can be freely accessed from our website at http://www.nwpu-bioinformatics.com/KF-finder/.
Affiliation(s)
- Jialu Hu
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China; Centre of Multidisciplinary Convergence Computing, School of Computer Science, Northwestern Polytechnical University, Dong Xiang Road 1, Xi'an, 710129, China
| | - Yiqun Gao
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China
| | - Yan Zheng
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China
| | - Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China.
| |
|
14
|
Abstract
BACKGROUND Measuring phenotype similarity has recently begun to play an important role in disease diagnosis, and researchers have started to develop phenotype similarity measures. However, existing methods ignore the interactions between phenotype-associated proteins, which may lead to inaccurate phenotype similarity. RESULTS We proposed a network-based method, PhenoNet, to calculate the similarity between phenotypes. We localized phenotypes in the network and calculated the similarity between phenotype-associated modules by modeling both the inter- and intra-similarity. CONCLUSIONS PhenoNet was evaluated on two independent evaluation datasets: gene ontology and gene expression data. The results show that PhenoNet performs better than the state-of-the-art methods on all evaluation tests.
Affiliation(s)
- Jiajie Peng
- School of Computer Science, Northwestern Polytechnical University, Xi’an, China
| | - Weiwei Hui
- School of Computer Science, Northwestern Polytechnical University, Xi’an, China
| | - Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, Xi’an, China
| |
|