1
Ai Y, Xie X, Ma X. Graph Contrastive Learning for Tracking Dynamic Communities in Temporal Networks. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE 2024; 8:3422-3435. [DOI: 10.1109/tetci.2024.3386844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2025]
Affiliation(s)
- Yun Ai
  - School of Computer Science and Technology, Xidian University, Xi'an, China
- Xianghua Xie
  - Department of Computer Science, Swansea University, Swansea, U.K.
- Xiaoke Ma
  - School of Computer Science and Technology, Xidian University, Xi'an, China
2
Huang L, Wang CD, Yu PS. Higher Order Connection Enhanced Community Detection in Adversarial Multiview Networks. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:3060-3074. [PMID: 34767522 DOI: 10.1109/tcyb.2021.3125227] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Community detection in multiview networks has drawn an increasing amount of attention in recent years. Many approaches have been developed from different perspectives. Despite the success, the problem of community detection in adversarial multiview networks remains largely unsolved. An adversarial multiview network is a multiview network that suffers an adversarial attack on community detection in which the attackers may deliberately remove some critical edges so as to hide the underlying community structure, leading to the performance degeneration of the existing approaches. To address this problem, we propose a novel approach, called higher order connection enhanced multiview modularity (HCEMM). The main idea lies in enhancing the intracommunity connection of each view by means of utilizing the higher order connection structure. The first step is to discover the view-specific higher order Microcommunities (VHM-communities) from the higher order connection structure. Then, for each view of the original multiview network, additional edges are added to make the nodes in each of its VHM-communities fully connected like a clique, by which the intracommunity connection of the multiview network can be enhanced. Therefore, the proposed approach is able to discover the underlying community structure in a multiview network while recovering the missing edges. Extensive experiments conducted on 16 real-world datasets confirm the effectiveness of the proposed approach.
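The edge-enhancement step described in this abstract (adding edges until each micro-community is fully connected) can be illustrated with a minimal sketch. It assumes the micro-communities of one view are already known and passed in as sets of node ids; how HCEMM discovers them from the higher order connection structure is not covered here, and the names `complete_communities`, `view_edges`, and `micro_communities` are hypothetical.

```python
from itertools import combinations


def complete_communities(edges, communities):
    """Return the undirected edge set after turning each community into a clique."""
    # Store every undirected edge as a sorted tuple so (u, v) and (v, u) coincide.
    enhanced = {tuple(sorted(edge)) for edge in edges}
    for members in communities:
        # Add every node pair inside the community (hypothetical input, see lead-in).
        for u, v in combinations(sorted(members), 2):
            enhanced.add((u, v))
    return enhanced


# Toy view: node pair (1, 3) is missing inside the first micro-community.
view_edges = [(1, 2), (2, 3), (4, 5)]
micro_communities = [{1, 2, 3}, {4, 5}]

enhanced_edges = complete_communities(view_edges, micro_communities)
added_edges = enhanced_edges - {tuple(sorted(edge)) for edge in view_edges}
```

In this toy view the only pair missing inside a micro-community is `(1, 3)`, so that is the single edge the sketch adds.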
3
Yu Z, Zhang G, Chen J, Chen H, Zhang D, Yang Q, Shao J. Toward Noise-Resistant Graph Embedding With Subspace Clustering Information. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:2980-2992. [PMID: 34793312 DOI: 10.1109/tcyb.2021.3124274] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
Most existing approaches to attributed network embedding combine topology and attribute information based on the homophily assumption. In many real-world networks, this assumption does not hold because the nodes are usually associated with many noisy or irrelevant attributes. To tackle this issue, we propose a noise-resistant graph embedding method, called NGE, that leverages subspace clustering information (i.e., the formation of communities is driven by different latent features in distinct subspaces). Specifically, we first construct a tensor to represent a given attributed network and then map it into different feature subspaces to capture community structure via tensor decomposition. For structure embedding, link-level and community-level constraints are imposed. For attribute embedding, a feature-selection constraint is used to reinforce the relationship between topology and noise-removed attributes. By learning structure and attribute embeddings with subspace clustering information, NGE can benefit community detection, link prediction, and node classification. Extensive experimental results demonstrate the superiority of NGE over many state-of-the-art approaches.
4
Wen T, Cao J, Cheong KH. Gravity-Based Community Vulnerability Evaluation Model in Social Networks: GBCVE. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:2467-2479. [PMID: 34793311 DOI: 10.1109/tcyb.2021.3123081] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
The usage of social media around the world is ever-increasing. Social media statistics from 2019 show that there are 3.5 billion social media users worldwide. However, the existence of community structure renders the network vulnerable to attacks and large-scale losses. How does one comprehensively consider the multiple information sources and effectively evaluate the vulnerability of the community? To answer this question, we design a gravity-based community vulnerability evaluation (GBCVE) model that takes multiple information sources into account. Specifically, we construct the community network by the Jensen-Shannon divergence and log-sigmoid transition function to show the relationship between communities. The numbers of edges inside and outside each community, together with the gravity index, are the three important factors used in this model for evaluating the community vulnerability. These three factors correspond to the interior information of the community, small-scale interaction relationship, and large-scale interaction relationship, respectively. A fuzzy ranking algorithm is then used to describe the vulnerability relationship between different communities, and the sensitivity of different weighting parameters is then analyzed by Sobol' indices. We validate and demonstrate the applicability of our proposed community vulnerability evaluation method via three real-world complex network test examples. Our proposed model can be applied to find vulnerable components in a network to mitigate the influence of public opinions or natural disasters in real time. The community vulnerability evaluation results from our proposed model are expected to shed light on other properties of communities within social networks and have real-world applications across network science.
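For readers unfamiliar with the Jensen-Shannon divergence named in this abstract, the following sketch computes it for two discrete probability distributions using base-2 logarithms. The abstract does not say which distributions GBCVE compares or how the log-sigmoid transition is applied, so the inputs below are purely illustrative and the function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL divergence to the midpoint distribution."""
    midpoint = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, midpoint) + 0.5 * kl_divergence(q, midpoint)


# Illustrative inputs only: two distributions over the same three outcomes.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
divergence = js_divergence(p, q)
```

With base-2 logarithms the value lies between 0 (identical distributions) and 1 (distributions with disjoint support), and it is symmetric in its two arguments.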
5
Li D, Ma X, Gong M. Joint Learning of Feature Extraction and Clustering for Large-Scale Temporal Networks. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:1653-1666. [PMID: 34495863 DOI: 10.1109/tcyb.2021.3107679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
Abstract
Temporal networks are ubiquitous in nature and society, and tracking the dynamics of networks is fundamental for investigating the mechanisms of systems. Dynamic communities in temporal networks simultaneously reflect the topology of the current snapshot (clustering accuracy) and historical ones (clustering drift). Current algorithms are criticized for their inability to characterize the dynamics of networks at the vertex level, independence of feature extraction and clustering, and high time complexity. In this study, we solve these problems by proposing a novel joint learning model for dynamic community detection in temporal networks (also known as jLMDC) via joining feature extraction and clustering. This model is formulated as a constrained optimization problem. Vertices are classified into dynamic and static groups by exploring the topological structure of temporal networks to fully exploit their dynamics at each time step. Then, jLMDC updates the features of dynamic vertices by preserving features of static ones during optimization. The advantage of jLMDC is that features are extracted under the guidance of clustering, promoting performance, and saving the running time of the algorithm. Finally, we extend jLMDC to detect the overlapping dynamic community in temporal networks. The experimental results on 11 temporal networks demonstrate that jLMDC improves accuracy up to 8.23% and saves 24.89% of running time on average compared to state-of-the-art methods.
6
Ji C, Chen H, Wang R, Cai Y, Wu H. Smoothness Sensor: Adaptive Smoothness-Transition Graph Convolutions for Attributed Graph Clustering. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:12771-12784. [PMID: 34398775 DOI: 10.1109/tcyb.2021.3088880] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/13/2023]
Abstract
Clustering techniques attempt to group objects with similar properties into a cluster. Clustering the nodes of an attributed graph, in which each node is associated with a set of feature attributes, has attracted significant attention. Graph convolutional networks (GCNs) represent an effective approach for integrating the two complementary factors of node attributes and structural information for attributed graph clustering. Smoothness is an indicator for assessing the degree of similarity of feature representations among nearby nodes in a graph. Oversmoothing in GCNs, caused by unnecessarily high orders of graph convolution, produces indistinguishable representations of nodes, such that the nodes in a graph tend to be grouped into fewer clusters, and pose a challenge due to the resulting performance drop. In this study, we propose a smoothness sensor for attributed graph clustering based on adaptive smoothness-transition graph convolutions, which senses the smoothness of a graph and adaptively terminates the current convolution once the smoothness is saturated to prevent oversmoothing. Furthermore, as an alternative to graph-level smoothness, a novel fine-grained nodewise-level assessment of smoothness is proposed, in which smoothness is computed in accordance with the neighborhood conditions of a given node at a certain order of graph convolution. In addition, a self-supervision criterion is designed considering both the tightness within clusters and the separation between clusters to guide the entire neural network training process. The experiments show that the proposed methods significantly outperform 13 other state-of-the-art baselines in terms of different metrics across five benchmark datasets. In addition, an extensive study reveals the reasons for their effectiveness and efficiency.
7
Pan X, Hu L, Hu P, You ZH. Identifying Protein Complexes From Protein-Protein Interaction Networks Based on Fuzzy Clustering and GO Semantic Information. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2882-2893. [PMID: 34242171 DOI: 10.1109/tcbb.2021.3095947] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]
Abstract
Protein complexes are of great significance for providing valuable insights into the mechanisms of the biological processes of proteins. A variety of computational algorithms have thus been proposed to identify protein complexes in a protein-protein interaction network. However, few of them take into account both network topology and protein attribute information in a unified fuzzy-based clustering framework. Since proteins in the same complex are similar in terms of their attribute information, and fuzzy clustering also makes it possible to identify overlapping complexes, we propose such a fuzzy-based clustering framework, namely FCAN-PCI, to improve identification accuracy. To do so, the semantic similarity between the attribute information of proteins is calculated and then integrated into a well-established fuzzy clustering model together with the network topology. After that, a momentum method is adopted to accelerate the clustering procedure. Finally, FCAN-PCI applies a heuristic search strategy to identify overlapping protein complexes. Extensive experiments have been conducted to evaluate the performance of FCAN-PCI by comparing it with state-of-the-art identification algorithms, and the results demonstrate the promising performance of FCAN-PCI.
8
Fang X, Hu Y, Zhou P, Wu DO. Unbalanced Incomplete Multi-View Clustering Via the Scheme of View Evolution: Weak Views are Meat; Strong Views Do Eat. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE 2022. [DOI: 10.1109/tetci.2021.3077909] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
Affiliation(s)
- Xiang Fang
  - School of Computer Science and Technology, Key Laboratory of Information Storage System Ministry of Education of China, Huazhong University of Science and Technology, Wuhan, China
- Yuchong Hu
  - School of Computer Science and Technology, Key Laboratory of Information Storage System Ministry of Education of China, Huazhong University of Science and Technology, Wuhan, China
- Pan Zhou
  - Hubei Engineering Research Center on Big Data Security, School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan, China
- Dapeng Oliver Wu
  - Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, USA
9
Chen Z, Lin P, Chen Z, Ye D, Wang S. Diversity Embedding Deep Matrix Factorization for Multi-view Clustering. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.07.177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
10
He T, Bai L, Ong YS. Vicinal Vertex Allocation for Matrix Factorization in Networks. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:8047-8060. [PMID: 33600331 DOI: 10.1109/tcyb.2021.3051606] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/12/2023]
Abstract
In this article, we present a novel matrix-factorization-based model, labeled here as Vicinal vertex allocated matrix factorization (VVAMo), for uncovering clusters in network data. Different from the past related efforts of network clustering, which consider the edge structure, vertex features, or both in their design, the proposed model includes the additional detail on vertex inclinations with respect to topology and features into the learning. In particular, by taking the latent preferences between vicinal vertices into consideration, VVAMo is then able to uncover network clusters composed of proximal vertices that share analogous inclinations, and correspondingly high structural and feature correlations. To ensure such clusters are effectively uncovered, we propose a unified likelihood function for VVAMo and derive an alternating algorithm for optimizing the proposed function. Subsequently, we provide the theoretical analysis of VVAMo, including the convergence proof and computational complexity analysis. To investigate the effectiveness of the proposed model, a comprehensive empirical study of VVAMo is conducted using extensive commonly used realistic network datasets. The results obtained show that VVAMo attained superior performances over existing classical and state-of-the-art approaches.
11
Wang S, Chen Y, Yi S, Chao G. Frobenius norm-regularized robust graph learning for multi-view subspace clustering. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03816-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/02/2022]
12
Su X, Hu L, You Z, Hu P, Zhao B. Attention-based Knowledge Graph Representation Learning for Predicting Drug-drug Interactions. Brief Bioinform 2022; 23:6572660. [PMID: 35453147 DOI: 10.1093/bib/bbac140] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 12/22/2021] [Revised: 03/02/2022] [Accepted: 03/27/2022] [Indexed: 02/06/2023] Open
Abstract
Drug-drug interactions (DDIs) are known as the main cause of life-threatening adverse events, and their identification is a key task in drug development. Existing computational algorithms mainly solve this problem by using advanced representation learning techniques. Though effective, few of them are capable of operating on biomedical knowledge graphs (KGs), which provide more detailed information about drug attributes and drug-related triple facts. In this work, an attention-based KG representation learning framework, namely DDKG, is proposed to fully utilize the information of KGs for improved performance of DDI prediction. In particular, DDKG first initializes the representations of drugs with embeddings derived from drug attributes by an encoder-decoder layer, and then learns the representations of drugs by recursively propagating and aggregating first-order neighboring information along top-ranked network paths determined by neighboring node embeddings and triple facts. Finally, DDKG estimates the probability that a pair of drugs interacts from their representations in an end-to-end manner. To evaluate the effectiveness of DDKG, extensive experiments have been conducted on two practical datasets of different sizes, and the results demonstrate that DDKG is superior to state-of-the-art algorithms on the DDI prediction task in terms of different evaluation metrics across all datasets.
Affiliation(s)
- Xiaorui Su
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Lun Hu
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Zhuhong You
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
- Pengwei Hu
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Bowei Zhao
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
13
Wang R, Ma H, Wang C. An Ensemble Learning Framework for Detecting Protein Complexes From PPI Networks. Front Genet 2022; 13:839949. [PMID: 35281831 PMCID: PMC8908451 DOI: 10.3389/fgene.2022.839949] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/20/2021] [Accepted: 01/31/2022] [Indexed: 11/14/2022] Open
Abstract
Detecting protein complexes is one of the keys to understanding the principles of cellular organization and processes. With the development of high-throughput experiments and computing science, it has become possible to detect protein complexes by computational methods. However, most computational methods are based on either unsupervised learning or supervised learning. Unsupervised learning-based methods do not need training datasets, but they can only detect protein complexes with one or a few topological structures. Supervised learning-based methods can detect protein complexes with different topological structures. However, they are usually based on a single type of training model, and the generalization of a single model is poor. Therefore, we propose an Ensemble Learning Framework for Detecting Protein Complexes (ELF-DPC) within protein-protein interaction (PPI) networks to address these challenges. ELF-DPC first constructs the weighted PPI network by combining topological and biological information. Second, it mines protein complex cores using the protein complex core mining strategy we designed. Third, it obtains an ensemble learning model by integrating structural modularity and a trained voting regressor model. Finally, it extends the protein complex cores and forms protein complexes by a graph heuristic search strategy. The experimental results demonstrate that ELF-DPC performs better than twelve state-of-the-art approaches. Moreover, functional enrichment analysis shows that ELF-DPC can detect biologically meaningful protein complexes. The code/dataset is available for free download from https://github.com/RongquanWang/ELF-DPC.
Affiliation(s)
- Rongquan Wang
  - School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
- Huimin Ma
  - School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
  - *Correspondence: Huimin Ma,
- Caixia Wang
  - School of International Economics, China Foreign Affairs University, Beijing, China
14
Zhang X, Wang J, Xue X, Sun H, Zhang J. Confidence level auto-weighting robust multi-view subspace clustering. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.12.029] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/19/2022]
15
Jiang H, Huang Y. An effective drug-disease associations prediction model based on graphic representation learning over multi-biomolecular network. BMC Bioinformatics 2022; 23:9. [PMID: 34983364 PMCID: PMC8726520 DOI: 10.1186/s12859-021-04553-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/28/2021] [Accepted: 12/29/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Drug-disease associations (DDAs) can provide important information for exploring the potential efficacy of drugs. However, few DDAs have so far been verified by experiments. Previous evidence indicates that combining information sources is conducive to the discovery of new DDAs. How to integrate different biological data sources and identify the most effective drugs for a certain disease based on drug-disease coupled mechanisms is still a challenging problem. RESULTS: In this paper, we propose a novel computational model for DDA prediction based on graph representation learning over a multi-biomolecular network (GRLMN). More specifically, we first constructed a large-scale molecular association network (MAN) by integrating the associations among drugs, diseases, proteins, miRNAs, and lncRNAs. Then, a graph embedding model was used to learn vector representations for all drugs and diseases in the MAN. Finally, the combined features were fed to a random forest (RF) model to predict new DDAs. The proposed model was evaluated on the SCMFDD-S data set using five-fold cross-validation. Experimental results showed that the GRLMN model achieved an area under the ROC curve (AUC) of 87.9%, outperforming all previous works in terms of both accuracy and AUC on the benchmark dataset. To further verify the performance of GRLMN, we carried out case studies for two common diseases. In the ranking of drugs predicted to be related to these diseases (kidney disease and fever), 15 of the top 20 drugs have been experimentally confirmed. CONCLUSIONS: The experimental results show that our model performs well in the prediction of DDAs. GRLMN is an effective prioritization tool for screening reliable DDAs for follow-up studies concerning their role in drug repositioning.
Affiliation(s)
- Hanjing Jiang
  - Key Laboratory of Image Information Processing and Intelligent Control of Education Ministry of China, Institute of Artificial Intelligence, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, 430074, China
- Yabing Huang
  - Department of Pathology, Renmin Hospital of Wuhan University, Wuhan, 430060, Hubei, China
16
Su X, Hu L, You Z, Hu P, Wang L, Zhao B. A deep learning method for repurposing antiviral drugs against new viruses via multi-view nonnegative matrix factorization and its application to SARS-CoV-2. Brief Bioinform 2021; 23:6489102. [PMID: 34965582 DOI: 10.1093/bib/bbab526] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 09/16/2021] [Revised: 10/20/2021] [Accepted: 11/14/2021] [Indexed: 12/15/2022] Open
Abstract
The outbreak of COVID-19, caused by SARS-coronavirus (CoV)-2, has led to millions of deaths since 2019. Although a variety of computational methods have been proposed to repurpose drugs for treating SARS-CoV-2 infections, it is still a challenging task for new viruses, as there are no verified virus-drug associations (VDAs) between them and existing drugs. To efficiently solve the cold-start problem posed by new viruses, a novel constrained multi-view nonnegative matrix factorization (CMNMF) model is designed by jointly utilizing multiple sources of biological information. With the CMNMF model, the similarities of drugs and viruses can be preserved from their own perspectives when they are projected onto a unified latent feature space. Based on the CMNMF model, we propose a deep learning method, namely VDA-DLCMNMF, for repurposing drugs against new viruses. VDA-DLCMNMF first initializes the node representations of drugs and viruses with their corresponding latent feature vectors to avoid a random initialization and then applies a graph convolutional network to optimize their representations. Given an arbitrary drug, its probability of being associated with a new virus is computed according to their representations. To evaluate the performance of VDA-DLCMNMF, we have conducted a series of experiments on three VDA datasets created for SARS-CoV-2. Experimental results demonstrate the promising prediction accuracy of VDA-DLCMNMF. Moreover, incorporating the CMNMF model into deep learning yields new insight into drug repurposing for SARS-CoV-2, as the results of molecular docking experiments reveal that four antiviral drugs identified by VDA-DLCMNMF have the potential to treat SARS-CoV-2 infections.
Affiliation(s)
- Xiaorui Su
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi, 830011, China
- Lun Hu
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi, 830011, China
- Zhuhong You
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710129, China
- Pengwei Hu
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi, 830011, China
- Lei Wang
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Science, Nanning, 530007, China
- Bowei Zhao
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi, 830011, China
17
Ji BY, You ZH, Wang Y, Li ZW, Wong L. DANE-MDA: Predicting microRNA-disease associations via deep attributed network embedding. iScience 2021; 24:102455. [PMID: 34041455 PMCID: PMC8141887 DOI: 10.1016/j.isci.2021.102455] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 12/12/2020] [Revised: 03/02/2021] [Accepted: 04/19/2021] [Indexed: 12/24/2022] Open
Abstract
Predicting microRNA-disease associations with computational methods is conducive to improving the efficiency of costly and laborious traditional bio-experiments. In this study, we propose a computational machine learning-based method (DANE-MDA) that preserves integrated structure and attribute features via deep attributed network embedding to predict potential miRNA-disease associations. Specifically, the integrated features are extracted by applying a deep stacked auto-encoder to matrices of diverse orders containing structure and attribute information, and are then used to train a random forest classifier. Under 5-fold cross-validation experiments, DANE-MDA yielded average accuracy, sensitivity, and AUC of 85.59%, 84.23%, and 0.9264 on the HMDD v3.0 dataset, and 83.21%, 80.39%, and 0.9113 on the HMDD v2.0 dataset, respectively. Additionally, case studies on breast, colon, and lung neoplasms show that 47, 47, and 46 of the top 50 predicted miRNAs, respectively, can be retrieved in another database.
Affiliation(s)
- Bo-Ya Ji
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of the Chinese Academy of Sciences, Beijing 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Zhu-Hong You
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of the Chinese Academy of Sciences, Beijing 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Yi Wang
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Zheng-Wei Li
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
- Leon Wong
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of the Chinese Academy of Sciences, Beijing 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
18
Yi HC, You ZH, Wang L, Su XR, Zhou X, Jiang TH. In silico drug repositioning using deep learning and comprehensive similarity measures. BMC Bioinformatics 2021; 22:293. [PMID: 34074242 PMCID: PMC8170943 DOI: 10.1186/s12859-020-03882-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/10/2020] [Accepted: 11/13/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Drug repositioning, meaning finding new uses for existing drugs, can accelerate new drug research and development. Various computational methods have been presented to predict novel drug-disease associations for drug repositioning based on similarity measures among drugs and diseases. However, some known associations between drugs and diseases were not utilized in previous studies. METHODS: In this work, we develop a deep gated recurrent units model to predict potential drug-disease interactions using comprehensive similarity measures and the Gaussian interaction profile kernel. More specifically, the similarity measure is used to extract discriminative features for drugs based on their chemical fingerprints. Meanwhile, the Gaussian interaction profile kernel is employed to obtain efficient features of diseases based on known disease-disease associations. Then, a deep gated recurrent units model is developed to predict potential drug-disease interactions. RESULTS: The performance of the proposed model is evaluated on two benchmark datasets under tenfold cross-validation. To further verify its predictive ability, case studies predicting new potential indications of drugs were carried out. CONCLUSION: The experimental results show that the proposed model is a useful tool for predicting new indications for drugs or new treatments for diseases, and that it can accelerate drug repositioning and related drug research and discovery.
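The Gaussian interaction profile kernel mentioned in the METHODS can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes each entity is described by a row of a binary association matrix (its "interaction profile") and uses the commonly seen bandwidth normalisation, in which a parameter `gamma_prime` is divided by the mean squared norm of the profiles. The function name, the `gamma_prime` default, and the toy matrix are assumptions introduced here, since the abstract gives no formula.

```python
import math
from typing import List, Sequence


def gip_kernel(
    profiles: Sequence[Sequence[float]], gamma_prime: float = 1.0
) -> List[List[float]]:
    """Gaussian interaction profile kernel between the rows of `profiles`.

    K[i][j] = exp(-gamma * ||profile_i - profile_j||^2), where
    gamma = gamma_prime / mean_k(||profile_k||^2)  (assumed normalisation).
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are zero")
    gamma = gamma_prime / mean_sq_norm

    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        [math.exp(-gamma * sq_dist(profiles[i], profiles[j])) for j in range(n)]
        for i in range(n)
    ]


# Toy association matrix: three entities (rows) against three partners (columns).
toy_profiles = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
similarity = gip_kernel(toy_profiles)
```

Entities with identical profiles receive similarity 1, and the similarity decays as profiles diverge; each row of the resulting matrix could then serve as a feature vector for a downstream model.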
Affiliation(s)
- Hai-Cheng Yi
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Zhu-Hong You
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
| | - Lei Wang
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
| | - Xiao-Rui Su
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Xi Zhou
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
| | - Tong-Hai Jiang
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
| |
|
19
|
Su XR, You ZH, Hu L, Huang YA, Wang Y, Yi HC. An Efficient Computational Model for Large-Scale Prediction of Protein-Protein Interactions Based on Accurate and Scalable Graph Embedding. Front Genet 2021; 12:635451. [PMID: 33719344 PMCID: PMC7953052 DOI: 10.3389/fgene.2021.635451] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/30/2020] [Accepted: 01/25/2021] [Indexed: 11/23/2022]
Abstract
Protein–protein interactions (PPIs) underlie the molecular mechanisms of living cells. Although traditional experiments are able to detect PPIs accurately, they are often costly and time-consuming. As a result, computational methods have been used to predict PPIs to avoid these problems. Graph structures, as important and pervasive data carriers, are considered the most suitable way to represent biomedical entities and relationships. Although graph embedding is the most popular approach for graph representation learning, it usually suffers from high computational and space cost, especially on large-scale graphs. Therefore, developing a framework that can accelerate graph embedding and improve the accuracy of the embedding results is important for large-scale PPI prediction. In this paper, we propose a multi-level model, LPPI, to improve both the quality and speed of large-scale PPI prediction. Firstly, basic protein information is collected as attributes, including positional gene sets, motif gene sets, and immunological signatures. Secondly, we construct a weighted graph by using protein attributes to calculate node similarity. Then GraphZoom is used to accelerate the embedding process by reducing the size of the weighted graph. Next, graph embedding methods are used to learn graph topology features from the reconstructed graph. Finally, a linear Logistic Regression (LR) model is used to predict the probability of interaction between two proteins. LPPI achieved a high accuracy of 0.99997 and 0.9979 on the PPI network dataset and GraphSAGE-PPI dataset, respectively. Our further results show that LPPI is promising for large-scale PPI prediction in both accuracy and efficiency, which is beneficial to the detection of other large-scale biomolecular interactions.
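The second step of the pipeline described above, building a weighted graph from protein attributes, can be illustrated with a small sketch. The abstract does not state which similarity measure LPPI uses, so Jaccard similarity over attribute sets is an assumption made purely for illustration; the function names, the `min_weight` threshold, and the protein and gene-set labels are likewise hypothetical.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two attribute sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_weighted_graph(
    attributes: Dict[str, Set[str]], min_weight: float = 0.0
) -> Dict[Tuple[str, str], float]:
    """Return undirected edges {(u, v): weight} for u < v.

    An edge is kept only if its similarity exceeds `min_weight`
    (an assumed sparsification rule, not taken from the paper).
    """
    edges: Dict[Tuple[str, str], float] = {}
    for u, v in combinations(sorted(attributes), 2):
        weight = jaccard(attributes[u], attributes[v])
        if weight > min_weight:
            edges[(u, v)] = weight
    return edges


# Hypothetical proteins annotated with hypothetical gene-set memberships.
toy_attributes = {
    "P1": {"setA", "setB"},
    "P2": {"setB", "setC"},
    "P3": {"setD"},
}
graph = build_weighted_graph(toy_attributes)
```

Comparing all pairs is quadratic in the number of proteins, which is one reason a coarsening step such as the GraphZoom stage named in the abstract matters at large scale.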
Affiliation(s)
- Xiao-Rui Su
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- University of Chinese Academy of Sciences, Beijing, China
- Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
| | - Zhu-Hong You
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- University of Chinese Academy of Sciences, Beijing, China
- Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
| | - Lun Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- University of Chinese Academy of Sciences, Beijing, China
- Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
| | - Yu-An Huang
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
| | - Yi Wang
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- University of Chinese Academy of Sciences, Beijing, China
- Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
| | - Hai-Cheng Yi
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- University of Chinese Academy of Sciences, Beijing, China
- Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
| |
|
20
|
Christoforidis G, Kefalas P, Papadopoulos AN, Manolopoulos Y. RELINE: point-of-interest recommendations using multiple network embeddings. Knowl Inf Syst 2021. [DOI: 10.1007/s10115-020-01541-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 10/22/2022]
|
21
|
Chaturvedi I, Su CL, Welsch RE. Fuzzy Aggregated Topology Evolution for Cognitive Multi-tasks. Cognit Comput 2021. [DOI: 10.1007/s12559-020-09807-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/22/2022]
|