101
Abstract
There is an urgent need for effective medication against severe acute respiratory syndrome coronavirus 2 (SARS CoV-2), which is causing the COVID-19 pandemic across the world. Its main protease (Mpro) represents an attractive pharmacological target due to its involvement in essential viral functions. The crystal structure of free Mpro shows a large structural resemblance with the main protease of SARS CoV (nowadays known as SARS CoV-1). Here, we report that, on average, SARS CoV-2 Mpro is 1900% more sensitive than SARS CoV-1 Mpro in transmitting tiny structural changes across the whole protein through long-range interactions. The largest sensitivity of Mpro to structural perturbations is located exactly around the catalytic site Cys-145 and coincides with the binding site of strong inhibitors. These findings, based on a simplified representation of the protein as a residue network, may help in designing potent inhibitors of SARS CoV-2 Mpro.
102
Rashid MA, Ahmad S, Siddiqui MK, Jahanbani A, Sheikholeslami SM, Shao Z. New Bounds for the Estrada Index of Phenylenes. Polycycl Aromat Compd 2020. [DOI: 10.1080/10406638.2020.1765815] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/24/2022]
Affiliation(s)
- Muhammad Aamer Rashid
  - Department of Mathematics, COMSATS University Islamabad, Lahore Campus, Lahore, Pakistan
- Sarfraz Ahmad
  - Department of Mathematics, COMSATS University Islamabad, Lahore Campus, Lahore, Pakistan
- Akbar Jahanbani
  - Department of Mathematics, Azarbaijan Shahid Madani University, Tabriz, Iran
- Zehui Shao
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou, China
103
Zhao B, Hu S, Liu X, Xiong H, Han X, Zhang Z, Li X, Wang L. A Novel Computational Approach for Identifying Essential Proteins From Multiplex Biological Networks. Front Genet 2020; 11:343. [PMID: 32373163 PMCID: PMC7186452 DOI: 10.3389/fgene.2020.00343] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/12/2020] [Accepted: 03/23/2020] [Indexed: 11/13/2022] Open
Abstract
The identification of essential proteins can help in understanding the minimum requirements for cell survival and development. Ever-increasing amounts of high-throughput data provide us with opportunities to detect essential proteins from protein interaction networks (PINs). Existing network-based approaches are limited by the poor quality of the underlying PIN data, which exhibit high rates of false positive and false negative results. To overcome this problem, researchers have focused on the prediction of essential proteins by combining PINs with other biological data, which has led to the emergence of various types of interactions between proteins. It remains challenging, however, to use aggregated multiplex interactions within a single analysis framework to identify essential proteins. In this study, we created a multiplex biological network (MON) by initially integrating PINs, protein domains, and gene expression profiles. Next, we proposed a new approach to discover essential proteins by extending the random walk with restart algorithm to the tensor, which provides a data model representation of the MON. In contrast to existing approaches, the proposed MON approach accounts for both the importance of nodes and the different types of interactions between proteins during the iteration. MON was implemented to identify essential proteins within two yeast PINs. Our comprehensive experimental results demonstrated that MON outperformed 11 other state-of-the-art approaches in terms of precision-recall curve, jackknife curve, and other criteria.
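For orientation, the random walk with restart that this abstract builds on can be sketched on a single, ordinary network. This is only the standard single-layer algorithm, not the tensor extension the authors propose; the function name, the restart probability of 0.3, and the four-node toy network are all illustrative assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Visiting probabilities of a walker that jumps back to `seeds` with probability `restart`."""
    n = len(adj)
    # Column-normalise the adjacency matrix: trans[i][j] = P(step from j to i).
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
             for i in range(n)]
    # Restart distribution: uniform over the seed nodes.
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(trans[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy network (assumption): a path 0-1-2-3, walker restarts at node 0.
toy_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(toy_adj, seeds={0})
```

Nodes closer to the seed receive higher scores, which is the property that makes such walks usable as a ranking signal.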
Affiliation(s)
- Bihai Zhao
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
  - Hunan Provincial Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, China
  - Hunan Provincial Key Laboratory of Nutrition and Quality Control of Aquatic Animals, Changsha University, Changsha, China
- Sai Hu
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Xiner Liu
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Huijun Xiong
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Xiao Han
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Zhihong Zhang
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
  - Hunan Provincial Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, China
- Xueyong Li
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Lei Wang
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
  - Hunan Provincial Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, China
104
Ran M, Bai X. Vehicle Cooperative Network Model Based on Hypergraph in Vehicular Fog Computing. Sensors 2020; 20:s20082269. [PMID: 32316327 PMCID: PMC7219052 DOI: 10.3390/s20082269] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/26/2020] [Revised: 04/07/2020] [Accepted: 04/07/2020] [Indexed: 01/16/2023]
Abstract
In this paper, we propose an optimization framework for vehicular fog computing and a cooperative vehicular network model. We aim to improve the performance of vehicular fog computing and to address the difficulty of obtaining data on the vehicle collaborative network. This paper applies hypergraph theory to study the underlying structure, considering the social characteristics of the vehicles and vehicle communication. Since the vehicles join the network according to a Poisson process, the model is analyzed using the Poisson stochastic process and mean field theory. This paper uses MATLAB to simulate the evolution of cooperative networks. The results show that the vehicle's super-degree in vehicular fog computing has scale-free characteristics. Through this model, the vehicle cooperation situation can be analyzed, and the vehicle dynamics can be accurately predicted to further improve the performance of vehicular fog computing. The model can be transformed into some complex network models by adjusting the parameters. It is broadly applicable and offers a useful reference for research on the related characteristics of VANETs and for theoretical research on cooperative networks.
105
Zhang Z, Luo Y, Hu S, Li X, Wang L, Zhao B. A novel method to predict essential proteins based on tensor and HITS algorithm. Hum Genomics 2020; 14:14. [PMID: 32252824 PMCID: PMC7137323 DOI: 10.1186/s40246-020-00263-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/14/2019] [Accepted: 03/05/2020] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Essential proteins are an important part of the cell and closely related to the life activities of the cell. Hitherto, Protein-Protein Interaction (PPI) networks have been adopted by many computational methods to predict essential proteins. Most of the current approaches focus mainly on the topological structure of PPI networks. However, methods relying solely on the PPI network have low detection accuracy for essential proteins. Therefore, it is necessary to integrate the PPI network with other biological information to identify essential proteins.
RESULTS In this paper, we proposed a novel random walk method for identifying essential proteins, called HEPT. A three-dimensional tensor is constructed first by combining the PPI network of Saccharomyces cerevisiae with multiple biological data such as gene ontology annotations and protein domains. Then, based on the newly constructed tensor, we extended the Hyperlink-Induced Topic Search (HITS) algorithm from a two-dimensional to a three-dimensional tensor model that can be utilized to infer essential proteins. Unlike in existing state-of-the-art methods, both the importance of proteins and the types of interactions contribute to the essential protein prediction. To evaluate the performance of our newly proposed HEPT method, proteins are ranked in descending order based on the ranking scores computed by our method and by other competitive methods. After that, a certain number of the top-ranked proteins are selected as candidates for essential proteins. According to the list of known essential proteins, the number of true essential proteins is used to judge the performance of each method. Experimental results show that our method achieves better prediction performance than nine other state-of-the-art methods in identifying essential proteins.
CONCLUSIONS Analysis and experimental results show that HEPT effectively improves the prediction accuracy of essential proteins through the use of the HITS algorithm and the combination of network topology with gene ontology annotations and protein domains, which provides new insight into multi-data source fusion.
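As background for the abstract above, the classical two-dimensional HITS iteration can be sketched as follows. This is the textbook hub/authority update on one directed graph, not the three-dimensional tensor version that HEPT introduces; the function name, the fixed iteration count, and the toy graph are illustrative assumptions.

```python
import math


def hits(adj, iterations=50):
    """Classical HITS: returns (hub_scores, authority_scores) for a 0/1 adjacency matrix."""
    n = len(adj)
    hubs = [1.0] * n
    auths = [1.0] * n
    for _ in range(iterations):
        # Authority of j: sum of hub scores of the nodes that link to j.
        auths = [sum(adj[i][j] * hubs[i] for i in range(n)) for j in range(n)]
        norm = math.sqrt(sum(a * a for a in auths)) or 1.0
        auths = [a / norm for a in auths]
        # Hub of i: sum of authority scores of the nodes that i links to.
        hubs = [sum(adj[i][j] * auths[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(h * h for h in hubs)) or 1.0
        hubs = [h / norm for h in hubs]
    return hubs, auths


# Hypothetical toy graph (assumption): nodes 0 and 1 point to node 2; node 2 points to node 3.
toy = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
hub_scores, auth_scores = hits(toy)
```

In this toy graph node 2 ends up as the dominant authority and nodes 0 and 1 as equally strong hubs. On an undirected PPI network the adjacency matrix is symmetric, so hub and authority scores coincide.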
Affiliation(s)
- Zhihong Zhang
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410022 China
- Yingchun Luo
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410022 China
  - Department of Ultrasound, Hunan Province Women and Children’s Hospital, Changsha, 410008 China
- Sai Hu
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410022 China
- Xueyong Li
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410022 China
- Lei Wang
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410022 China
- Bihai Zhao
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410022 China
  - Hunan Provincial Key Laboratory of Nutrition and Quality Control of Aquatic Animals, Department of Biological and Environmental Engineering, Changsha University, Changsha, 410022 China
106
Li X, Li W, Zeng M, Zheng R, Li M. Network-based methods for predicting essential genes or proteins: a survey. Brief Bioinform 2020; 21:566-583. [PMID: 30776072 DOI: 10.1093/bib/bbz017] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Received: 10/31/2018] [Revised: 01/21/2019] [Accepted: 01/22/2019] [Indexed: 01/03/2025] Open
Abstract
Genes that are thought to be critical for the survival of organisms or cells are called essential genes. The prediction of essential genes and their products (essential proteins) is of great value in exploring the mechanism of complex diseases, studying the minimal genome required for living cells, and developing new drug targets. As laboratory methods are often complicated, costly and time-consuming, a great many computational methods have been proposed to identify essential genes/proteins at the network level, building on a deeper understanding of network biology and the rapid development of biotechnologies. By analyzing the topological characteristics of essential genes/proteins in protein-protein interaction networks (PINs), integrating biological information and considering the dynamic features of PINs, network-based methods have proven effective in the identification of essential genes/proteins. In this paper, we survey the advanced methods for network-based prediction of essential genes/proteins and present the challenges and directions for future research.
Affiliation(s)
- Xingyi Li
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Wenkai Li
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Min Zeng
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Ruiqing Zheng
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
107
Lei X, Yang X, Wu FX. Artificial Fish Swarm Optimization Based Method to Identify Essential Proteins. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:495-505. [PMID: 30113899 DOI: 10.1109/tcbb.2018.2865567] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 06/08/2023]
Abstract
It is well known that essential proteins play an extremely important role in controlling cellular activities in living organisms. Identifying essential proteins from protein-protein interaction (PPI) networks is conducive to the understanding of cellular functions and molecular mechanisms. Many methods for detecting essential proteins have been proposed. Nevertheless, the existing identification methods are not satisfactory because of low efficiency and low sensitivity to noisy data. This paper presents a novel computational approach based on artificial fish swarm optimization for essential protein prediction in PPI networks (called AFSO_EP). In AFSO_EP, a subset of known essential proteins is first randomly chosen as artificial fishes carrying prior knowledge. Essential proteins are then detected by imitating four principal biological behaviors of artificial fishes when searching for food or companions: foraging, following, swarming, and random behavior. Throughout this process, the network topology, gene expression, gene ontology (GO) annotation, and subcellular localization information are utilized. To evaluate the performance of AFSO_EP, we conduct experiments on two species (Saccharomyces cerevisiae and Drosophila melanogaster). The experimental results show that AFSO_EP identifies essential proteins better than several other well-known identification methods, which confirms its effectiveness.
108
Abstract
The analysis of folding trajectories for proteins is an open challenge. One of the problems is how to describe the amount of folded secondary structure in a protein. We extend the use of Estrada's folding degree (Bioinformatics 2002, 18, 697) to the analysis of the evolution of the folding stage during molecular dynamics (MD) simulation. It is shown that the residue contribution to the total folding degree is a predominantly local property, well-defined by the backbone dihedral angles at the given residue, without significant contribution from the backbone conformation of other residues. Moreover, the magnitude of this residue contribution can be quite easily associated with characteristic motifs of protein secondary structure, such as the α-helix and β-sheet (hairpin), by means of a Ramachandran-like plot as a function of the backbone dihedral angles φ,ψ. Additionally, the understanding of the free energy profile associated with the folding process becomes much simpler. Often a 1D profile is sufficient to locate global minima and the corresponding structure for short peptides.
Affiliation(s)
- Vladimir Sladek
  - Institute of Chemistry - Centre for Glycomics, Dubravska cesta 9, 84538 Bratislava, Slovakia
  - Agency for Medical Research and Development (AMED), Chiyoda-ku, Japan
- Ryuhei Harada
  - Center for Computational Sciences, University of Tsukuba, Tennodai 1-1-1, Tsukuba, Ibaraki 305-8577, Japan
- Yasuteru Shigeta
  - Center for Computational Sciences, University of Tsukuba, Tennodai 1-1-1, Tsukuba, Ibaraki 305-8577, Japan
109
Network Embedding the Protein-Protein Interaction Network for Human Essential Genes Identification. Genes (Basel) 2020; 11:genes11020153. [PMID: 32023848 PMCID: PMC7074227 DOI: 10.3390/genes11020153] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 12/17/2019] [Revised: 01/27/2020] [Accepted: 01/29/2020] [Indexed: 11/18/2022] Open
Abstract
Essential genes are a group of genes that are indispensable for cell survival and cell fertility. Studying human essential genes not only helps scientists reveal the underlying biological mechanisms of a human cell but also guides disease treatment. Recently, the publication of human essential gene data has made it possible for researchers to train a machine-learning classifier using features of the known human essential genes and to use the classifier to predict new human essential genes. Previous studies have found that the essentiality of genes closely relates to their properties in the protein–protein interaction (PPI) network. In this work, we propose a novel supervised method to predict human essential genes by network embedding of the PPI network. Our approach runs a biased random walk on the network to obtain the network context of each node. Then, the node pairs are input into an artificial neural network to learn representation vectors that maximally preserve the network structure and the properties of the nodes in the network. Finally, the features are fed into an SVM classifier to predict human essential genes. The prediction results on two human PPI networks show that our method achieves better performance than methods that use either the genes' sequence information or the genes' centrality properties in the network as input features. Moreover, it also outperforms methods that represent the PPI network by other previous approaches.
110
Abstract
Identifying critical nodes in complex networks is significant for studying the survivability and robustness of networks. Previous studies on structural hole theory showed that structural holes are gaps between groups of indirectly connected nodes, and that intermediaries fill these holes and serve as brokers for information exchange. In this paper, we leverage the structural hole property to design a heuristic algorithm, based on local information of the network topology, that identifies node importance in undirected and unweighted networks, whose adjacency matrices are symmetric. In the algorithm, a node with a larger degree and a greater number of associated structural holes achieves a higher importance ranking. Six real networks are used as test data. The experimental results show that the proposed method not only has low computational complexity but also outperforms degree centrality, the k-shell method, mapping entropy centrality, the collective influence algorithm, the DDN algorithm (which is based on node degree and neighbors), and random ranking in identifying node importance for network connectivity in complex networks.
111
Curado M, Escolano F, Lozano MA, Hancock ER. Seeking affinity structure: Strategies for improving m-best graph matching. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2019.09.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
112
Lauw HW, Wong RCW, Ntoulas A, Lim EP, Ng SK, Pan SJ. Node Conductance: A Scalable Node Centrality Measure on Big Networks. Advances in Knowledge Discovery and Data Mining 2020. [PMCID: PMC7206264 DOI: 10.1007/978-3-030-47436-2_40] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
Node centralities such as Degree and Betweenness help detect influential nodes from a local or global view. Existing global centrality measures suffer from high computational complexity and unrealistic assumptions, limiting their use in real-world applications. In this paper, we propose a new centrality measure, Node Conductance, to effectively detect spanning structural hole nodes and predict the formation of new edges. Node Conductance is the sum of the probabilities that node i is revisited at the r-th step, where r is an integer between 1 and infinity. Moreover, with the help of node embedding techniques, Node Conductance can be approximated effectively and efficiently on big networks. Thorough experiments show the differences between existing centralities and Node Conductance, its outstanding ability to detect influential nodes on both static and dynamic networks, and its superior efficiency compared with other global centralities.
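The quantity described in this abstract, a sum of r-step return probabilities, can be illustrated with a brute-force sketch. It assumes a simple unbiased random walk and truncates the sum at a hypothetical `max_steps`; it is not the embedding-based approximation the authors describe, and the function name and star-graph example are assumptions made for illustration.

```python
def return_probability_sum(adj, max_steps=10):
    """For each node i, sum over r = 1..max_steps of the probability that a walk from i is back at i at step r."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    # Row-stochastic transition matrix of a simple random walk.
    trans = [[adj[i][j] / degree[i] if degree[i] else 0.0 for j in range(n)]
             for i in range(n)]
    power = [row[:] for row in trans]          # holds P^r, starting with r = 1
    totals = [power[i][i] for i in range(n)]
    for _ in range(max_steps - 1):
        # Advance from P^r to P^(r+1) by one matrix multiplication.
        power = [[sum(power[i][k] * trans[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
        for i in range(n):
            totals[i] += power[i][i]
    return totals


# Hypothetical star graph (assumption): node 0 is the centre, nodes 1-3 are leaves.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
totals = return_probability_sum(star, max_steps=4)
```

With four steps the centre accumulates a return-probability sum of 2 and each leaf 2/3, so the centre ranks highest, as expected for the node that every walk must pass through. Repeated dense matrix products are impractical for large graphs, which is consistent with the abstract's motivation for an approximate computation.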
Affiliation(s)
- Hady W. Lauw
  - School of Information Systems, Singapore Management University, Singapore, Singapore
- Raymond Chi-Wing Wong
  - Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, Hong Kong
- Alexandros Ntoulas
  - Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens, Greece
- Ee-Peng Lim
  - School of Information Systems, Singapore Management University, Singapore, Singapore
- See-Kiong Ng
  - Institute of Data Science, National University of Singapore, Singapore, Singapore
- Sinno Jialin Pan
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
113
Habibi M, Khosravi P. Disruption of Protein Complexes from Weighted Complex Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:102-109. [PMID: 30047895 DOI: 10.1109/tcbb.2018.2859952] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/08/2023]
Abstract
Essential proteins are indispensable units for living organisms. Removing them disrupts protein complexes and causes lethality. Recently, theoretical methods have been presented to detect essential proteins in protein interaction networks. In these methods, an essential protein is predicted as a high-degree vertex of the protein interaction network. However, interaction data are usually incomplete, so an essential protein may not appear highly connected because of data deficiency. It is therefore critical to design informative networks from other biological data sources. In this paper, we defined a minimal set of proteins that disrupts the maximum number of protein complexes. We constructed a weighted graph using a set of given complexes. We proposed a more appropriate method based on betweenness values to identify a minimal set of proteins whose removal would disrupt protein complexes. The effectiveness of the proposed method was benchmarked using a given dataset of complexes. The results of our method were compared to the results of other methods in terms of the number of disrupted complexes. The results also indicated significant superiority of the minimal set of proteins in the massive disruption of complexes. Finally, we investigated the performance of our method on yeast and human datasets and analyzed biological properties of the selected proteins. Our algorithm and some examples are freely available from http://bs.ipm.ac.ir/softwares/DPC/DPC.zip.
114
Key Node Ranking in Complex Networks: A Novel Entropy and Mutual Information-Based Approach. Entropy 2019; 22:e22010052. [PMID: 33285827 PMCID: PMC7516483 DOI: 10.3390/e22010052] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 10/04/2019] [Revised: 12/26/2019] [Accepted: 12/27/2019] [Indexed: 11/30/2022]
Abstract
Numerous problems in many fields can be solved effectively through the approach of modeling by complex network analysis. Finding key nodes is one of the most important and challenging problems in network analysis. In previous studies, methods have been proposed to identify key nodes. However, they rely mainly on a limited field of local information, lack large-scale access to global information, and are also usually NP-hard. In this paper, a novel entropy and mutual information-based centrality approach (EMI) is proposed, which attempts to capture a far wider range and a greater abundance of information for assessing how vital a node is. We have developed countermeasures to assess the influence of nodes: EMI is no longer confined to neighbor nodes, and both topological and digital network characteristics are taken into account. We employ mutual information to fix a flaw that exists in many methods. Experiments on real-world connected networks demonstrate the outstanding performance of the proposed approach in both correctness and efficiency as compared with previous approaches.
115
Zeng M, Li M, Wu FX, Li Y, Pan Y. DeepEP: a deep learning framework for identifying essential proteins. BMC Bioinformatics 2019; 20:506. [PMID: 31787076 PMCID: PMC6886168 DOI: 10.1186/s12859-019-3076-y] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Indexed: 11/20/2022] Open
Abstract
Background: Essential proteins are crucial for cellular life, and thus the identification of essential proteins is an important topic and a challenging problem for researchers. Recently, many computational approaches have been proposed to handle this problem. However, traditional centrality methods cannot fully represent the topological features of biological networks. In addition, identifying essential proteins is an imbalanced learning problem, but few current shallow machine learning-based methods are designed to handle the imbalanced characteristics.
Results: We develop DeepEP based on a deep learning framework that uses the node2vec technique, multi-scale convolutional neural networks and a sampling technique to identify essential proteins. In DeepEP, the node2vec technique is applied to automatically learn topological and semantic features for each protein in the protein-protein interaction (PPI) network. Gene expression profiles are treated as images, and multi-scale convolutional neural networks are applied to extract their patterns. In addition, DeepEP uses a sampling method to alleviate the imbalanced characteristics. The sampling method draws the same number of majority and minority samples in each training epoch, so that training is not biased toward either class. The experimental results show that DeepEP outperforms traditional centrality methods. Moreover, DeepEP is better than shallow machine learning-based methods. Detailed analyses show that the dense vectors generated by the node2vec technique contribute substantially to the improved performance. It is clear that the node2vec technique effectively captures the topological and semantic properties of the PPI network. The sampling method also improves the performance of identifying essential proteins.
Conclusion: We demonstrate that DeepEP improves the prediction performance by integrating multiple deep learning techniques and a sampling method. DeepEP is more effective than existing methods.
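The per-epoch balancing idea mentioned in this abstract can be sketched as follows. The abstract states only that equal numbers of majority and minority samples are drawn in each epoch; the choice here to keep every minority sample and undersample the majority class afresh each epoch is an assumption, as are the function name and the toy labels.

```python
import random


def balanced_epoch_indices(labels, rng):
    """Return shuffled sample indices with equal counts of class 0 and class 1 for one epoch."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    # Order the two index lists by size: smaller class first.
    minority, majority = sorted((positives, negatives), key=len)
    # Draw (without replacement) as many majority examples as there are minority examples.
    picked = rng.sample(majority, len(minority))
    epoch = minority + picked
    rng.shuffle(epoch)
    return epoch


# Hypothetical labels (assumption): 2 essential (1) and 8 non-essential (0) proteins.
labels = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
rng = random.Random(0)
epoch = balanced_epoch_indices(labels, rng)
```

Calling the function again for the next epoch draws a different subset of the majority class, so over many epochs most majority samples are still seen while each individual epoch stays balanced.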
Affiliation(s)
- Min Zeng
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, People's Republic of China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, People's Republic of China
- Fang-Xiang Wu
  - Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
- Yaohang Li
  - Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Yi Pan
  - Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA
116
117
Ghadermarzi S, Li X, Li M, Kurgan L. Sequence-Derived Markers of Drug Targets and Potentially Druggable Human Proteins. Front Genet 2019; 10:1075. [PMID: 31803227 PMCID: PMC6872670 DOI: 10.3389/fgene.2019.01075] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/30/2019] [Accepted: 10/09/2019] [Indexed: 12/16/2022] Open
Abstract
Recent research shows that the majority of the druggable human proteome is yet to be annotated and explored. Accurate identification of these unexplored druggable proteins would facilitate development, screening, repurposing, and repositioning of drugs, as well as prediction of new drug–protein interactions. We contrast the current drug targets against datasets of non-druggable and possibly druggable proteins to formulate markers that could be used to identify druggable proteins. We focus on markers that can be extracted from protein sequences or names/identifiers to ensure that they can be applied across the entire human proteome. These markers quantify key features covered in past works (topological features of PPIs, cellular functions, and subcellular locations) and several novel factors (intrinsic disorder, residue-level conservation, alternative splicing isoforms, domains, and sequence-derived solvent accessibility). We find that the possibly druggable proteins have a significantly higher abundance of alternative splicing isoforms, a relatively large number of domains, a higher degree of centrality in the protein-protein interaction networks, and lower numbers of conserved and surface residues, when compared with the non-druggable proteins. We show that the current drug targets and possibly druggable proteins share involvement in catalytic and signaling functions. However, unlike the drug targets, the possibly druggable proteins participate in metabolic and biosynthesis processes, are enriched in intrinsic disorder, interact with proteins and nucleic acids, and are localized across the cell. To sum up, we formulate several markers that can help with finding novel druggable human proteins and provide interesting insights into the cellular functions and subcellular locations of the current drug targets and potentially druggable proteins.
Affiliation(s)
- Sina Ghadermarzi
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Xingyi Li
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
118
Identification of key regulators in prostate cancer from gene expression datasets of patients. Sci Rep 2019; 9:16420. [PMID: 31712650 PMCID: PMC6848149 DOI: 10.1038/s41598-019-52896-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 07/08/2019] [Accepted: 10/15/2019] [Indexed: 12/20/2022] Open
Abstract
Identification of key regulators and regulatory pathways is an important step in the discovery of genes involved in cancer. Here, we propose a method to identify key regulators in prostate cancer (PCa) from a network constructed from gene expression datasets of PCa patients. Overexpressed genes with a mutational status according to COSMIC were identified using BioXpress, followed by the construction of a PCa interactome network using the curated genes. The topological parameters of the network exhibited power-law behavior, indicating hierarchical scale-free properties and five levels of organization. The highest-degree hubs (k ≥ 65) were selected from the PCa network and traced, and 19 of them were identified as novel key regulators, as they participated at all network levels, serving as a backbone. Of the 19 hubs, some have been reported in the literature to be associated with PCa and other cancers. Based on participation coefficient values, most of these are connector or kinless hubs, suggesting significant roles in modular linkage. The observation of non-monotonicity in the rich-club formation suggested the importance of intermediate hubs in network integration, and they may play crucial roles in network stabilization. The network was self-organized, as evident from the fractal nature of its topological parameters, and lacked a central control mechanism.
|
119
|
Li G, Li M, Peng W, Li Y, Pan Y, Wang J. A novel extended Pareto Optimality Consensus model for predicting essential proteins. J Theor Biol 2019; 480:141-149. [PMID: 31398315 DOI: 10.1016/j.jtbi.2019.08.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 02/07/2019] [Revised: 08/02/2019] [Accepted: 08/06/2019] [Indexed: 12/11/2022]
Abstract
Essential proteins have vital functions; when they are destroyed, cells die or stop reproducing. It is therefore very important to identify essential proteins among a large number of other proteins. Because biological experimental methods are time-consuming, expensive, and inefficient, computational methods have become increasingly popular for recognizing them. Early methods relied mainly on protein-protein interaction (PPI) information, which limited their discovery capacity. Researchers have since developed novel methods that fuse multiple sources of information to improve prediction accuracy. These methods build on three observations: essential proteins are more important and more conserved during evolution, their neighbors in PPI networks are usually likely to be essential, and PPI data contain many false positives. Whether a protein is essential can therefore be assessed by the importance of the protein itself, the relevance of its neighbors, and the reliability of its PPIs. The importance of neighbors and the reliability of PPIs can be further integrated into a neighborhood feature. In the study, orthologous information, edge-clustering coefficient, and gene expression information are used to measure the importance of a protein itself, the importance of its neighbors, and the reliability of PPIs, respectively. Then, a novel expanded POC model, E_POC, is proposed to fuse this information, and a weighted PPI network is constructed. The proteins ranked high according to their weights are treated as candidate essential proteins. E_POC outperforms the existing classical methods on S. cerevisiae and E. coli data.
Affiliation(s)
- Gaoshi Li
- School of Computer Science and Engineering, Central South University, Changsha 410083, China; Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, Guangxi 541004, China.
| | - Min Li
- School of Computer Science and Engineering, Central South University, Changsha 410083, China.
| | - Wei Peng
- Computer Center/ Faculty of Information Engineering and Automation of Kunming University of Science and Technology, Kunming, Yunnan 650093, China
| | - Yaohang Li
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA.
| | - Yi Pan
- Department of Computer Science, Georgia State University, Atlanta, GA 30302-4110, USA.
| | - Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China.
| |
|
120
|
|
121
|
Systematic comparison between methods for the detection of influential spreaders in complex networks. Sci Rep 2019; 9:15095. [PMID: 31641200 PMCID: PMC6805897 DOI: 10.1038/s41598-019-51209-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 04/25/2019] [Accepted: 09/18/2019] [Indexed: 12/04/2022] Open
Abstract
Influence maximization is the problem of finding the set of nodes of a network that maximizes the size of the outbreak of a spreading process occurring on the network. Solutions to this problem are important for strategic decisions in marketing and political campaigns. The typical setting consists of identifying small sets of initial spreaders in very large networks. This setting makes the optimization problem computationally infeasible for standard greedy optimization algorithms that account simultaneously for information about network topology and spreading dynamics, leaving room only for heuristic methods based on the drastic approximation of relying on the geometry of the network alone. The literature on the subject is full of purely topological methods for the identification of influential spreaders in networks. However, it is unclear how far these methods are from being optimal. Here, we perform a systematic test of the performance of a multitude of heuristic methods for the identification of influential spreaders. We quantify the performance of the various methods on a corpus of 100 real-world networks; the corpus consists of networks small enough for the application of greedy optimization, so that results from this algorithm are used as the baseline needed for the analysis of the performance of the other methods on the same corpus of networks. We find that relatively simple network metrics, such as adaptive degree or closeness centralities, are able to achieve performances very close to the baseline value, thus providing good support for the use of these metrics in large-scale problem settings. Also, we show that a further 2–5% improvement towards the baseline performance is achievable by hybrid algorithms that combine two or more topological metrics. This final result is validated on a small collection of large graphs where greedy optimization is not applicable.
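As an illustration of the "adaptive degree" heuristic named in this abstract, the following is a minimal sketch, assuming the usual reading of the term: pick the highest-degree node, delete it from the graph, recompute degrees, and repeat. The function name, the toy graph, and the tie-breaking rule are assumptions of this sketch and are not taken from the paper.

```python
def adaptive_degree_seeds(adjacency, k):
    """Pick k seed nodes by repeatedly taking the highest-degree node of the
    remaining graph and deleting it before the next pick."""
    # Work on a copy so the caller's graph is left untouched.
    graph = {node: set(neighbours) for node, neighbours in adjacency.items()}
    seeds = []
    for _ in range(min(k, len(graph))):
        # Ties are broken by node label (assumption) to keep results deterministic.
        best = max(sorted(graph), key=lambda node: len(graph[node]))
        seeds.append(best)
        # Remove the chosen node, which lowers the degree of its neighbours.
        for neighbour in graph.pop(best):
            graph[neighbour].discard(best)
    return seeds


# Hypothetical toy network: two hubs ("a" and "e") joined by one edge.
toy = {
    "a": {"b", "c", "d", "e"},
    "b": {"a"}, "c": {"a"}, "d": {"a"},
    "e": {"a", "f", "g"},
    "f": {"e"}, "g": {"e"},
}
print(adaptive_degree_seeds(toy, 2))
```

Unlike a static degree ranking, the degrees are recomputed after every pick, so a node whose neighbours have already been selected drops down the ranking.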
|
122
|
Abstract
With the rapid development of Internet technology, the social network has gradually become an indispensable platform for users to release, obtain, and share information. Users are not only receivers of information, but also publishers and disseminators of information. How to select a certain number of users whose influence achieves the maximum dissemination of information has become a hot topic at home and abroad. Rapid and accurate identification of influential nodes in the network is of great practical significance, for example for the rapid dissemination or suppression of social network information and for the smooth operation of the network. Therefore, from the perspective of improving computational accuracy and efficiency, we propose an influential node identification method based on effective distance, named KDEC. By quantifying the effective distance between nodes and combining the position of the node in the network with its local structure, the influence of the node in the network is obtained, which is used as an indicator to evaluate the influence of the node. Experimental analysis of many real-world networks shows that the method can quickly and accurately identify the influential nodes in the network, and that it performs better than some classical algorithms and some recently proposed algorithms.
|
123
|
Abstract
In this paper, we propose weighted h-index $h_w$ and h-index strength $s_h$ to measure spreading capability and identify the most influential spreaders. Experimental results on twelve real networks reveal that $s_h$ was more accurate and more monotonic than $h_w$ and four previous measures in ranking the spreading influence of a node evaluated by the single seed SIR spreading model. We point out that the questions of how to improve monotonicity and how to determine a proper neighborhood range are two interesting future directions.
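The abstract does not define the weighted h-index or the h-index strength, so they are not reproduced here. As background only, the sketch below computes the classical unweighted h-index of a node (the largest h such that the node has at least h neighbours of degree at least h), which is the quantity such measures are usually built on. The function name and the toy graph are assumptions of this sketch.

```python
def node_h_index(adjacency, node):
    """Classical (unweighted) h-index of a node: the largest h such that the
    node has at least h neighbours whose degree is at least h."""
    neighbour_degrees = sorted(
        (len(adjacency[n]) for n in adjacency[node]), reverse=True
    )
    h = 0
    for rank, degree in enumerate(neighbour_degrees, start=1):
        if degree >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical toy network: a triangle a-b-c with a pendant node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print({node: node_h_index(toy, node) for node in sorted(toy)})
```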
|
124
|
Oldham S, Fulcher B, Parkes L, Arnatkevic̆iūtė A, Suo C, Fornito A. Consistency and differences between centrality measures across distinct classes of networks. PLoS One 2019; 14:e0220061. [PMID: 31348798 PMCID: PMC6660088 DOI: 10.1371/journal.pone.0220061] [Citation(s) in RCA: 99] [Impact Index Per Article: 16.5] [Received: 02/06/2019] [Accepted: 07/08/2019] [Indexed: 11/20/2022] Open
Abstract
The roles of different nodes within a network are often understood through centrality analysis, which aims to quantify the capacity of a node to influence, or be influenced by, other nodes via its connection topology. Many different centrality measures have been proposed, but the degree to which they offer unique information, and whether it is advantageous to use multiple centrality measures to define node roles, is unclear. Here we calculate correlations between 17 different centrality measures across 212 diverse real-world networks, examine how these correlations relate to variations in network density and global topology, and investigate whether nodes can be clustered into distinct classes according to their centrality profiles. We find that centrality measures are generally positively correlated to each other, the strength of these correlations varies across networks, and network modularity plays a key role in driving these cross-network variations. Data-driven clustering of nodes based on centrality profiles can distinguish different roles, including topological cores of highly central nodes and peripheries of less central nodes. Our findings illustrate how network topology shapes the pattern of correlations between centrality measures and demonstrate how a comparative approach to network centrality can inform the interpretation of nodal roles in complex networks.
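To make the idea of correlating centrality measures concrete, here is a minimal sketch that computes two of the simplest measures (degree and closeness) on a toy graph and then their rank correlation. The choice of Spearman correlation, the two measures, and the five-node path graph are assumptions of this sketch; the abstract only states that correlations between 17 measures were calculated.

```python
from collections import deque


def degree_centrality(adjacency):
    return {node: len(adjacency[node]) for node in adjacency}


def closeness_centrality(adjacency):
    """(reachable nodes) / (sum of shortest-path lengths), via breadth-first search."""
    scores = {}
    for source in adjacency:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in distance:
                    distance[v] = distance[u] + 1
                    queue.append(v)
        total = sum(distance.values())
        scores[source] = (len(distance) - 1) / total if total else 0.0
    return scores


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Pearson correlation of the rank vectors (assumes neither is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy network: the path 1-2-3-4-5.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
nodes = sorted(path)
degree = degree_centrality(path)
closeness = closeness_centrality(path)
rho = spearman([degree[n] for n in nodes], [closeness[n] for n in nodes])
print(round(rho, 3))
```

On the path graph the two measures agree on the end nodes but not on the middle ones (degree cannot separate nodes 2, 3, and 4, closeness can), so the correlation is positive but below 1.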
Affiliation(s)
- Stuart Oldham
- The Turner Institute for Brain and Mental Health, School of Psychological Sciences and Monash Biomedical Imaging, Monash University, Clayton, Victoria, Australia
| | - Ben Fulcher
- The Turner Institute for Brain and Mental Health, School of Psychological Sciences and Monash Biomedical Imaging, Monash University, Clayton, Victoria, Australia
- School of Physics, The University of Sydney, Sydney, New South Wales, Australia
| | - Linden Parkes
- The Turner Institute for Brain and Mental Health, School of Psychological Sciences and Monash Biomedical Imaging, Monash University, Clayton, Victoria, Australia
| | - Aurina Arnatkevic̆iūtė
- The Turner Institute for Brain and Mental Health, School of Psychological Sciences and Monash Biomedical Imaging, Monash University, Clayton, Victoria, Australia
| | - Chao Suo
- The Turner Institute for Brain and Mental Health, School of Psychological Sciences and Monash Biomedical Imaging, Monash University, Clayton, Victoria, Australia
| | - Alex Fornito
- The Turner Institute for Brain and Mental Health, School of Psychological Sciences and Monash Biomedical Imaging, Monash University, Clayton, Victoria, Australia
| |
|
125
|
Donato C, Lo Giudice P, Marretta R, Ursino D, Virgili L. A well-tailored centrality measure for evaluating patents and their citations. JOURNAL OF DOCUMENTATION 2019. [DOI: 10.1108/jd-10-2018-0168] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/17/2022]
Abstract
Purpose
The development of innovations in all the research and development (R&D) fields is leading to a huge increase in patent data. Therefore, it is reasonable to foresee that, in the near future, Big Data-centered techniques will be necessary to fully exploit the potential of this kind of data. In this context, network analysis-based approaches are extremely promising. The purpose of this paper is to provide a contribution to this setting. In fact, the authors propose a well-tailored centrality measure for evaluating patents and their citations.
Design/methodology/approach
The authors preliminarily introduce a suitable support directed network representing patents and their citations. After this, the authors present the centrality measures, namely, “Naive Patent Degree” and “Refined Patent Degree.” Then, the authors show why they are well tailored to capture the specificities of the patent scenario and why classical centrality measures fail to fully reach this purpose.
Findings
The authors present three possible applications of the measures, namely: the computation of a patent “scope” allowing the evaluation of the width and the strength of the influence of a patent on a given R&D field; the computation of a patent lifecycle; and the detection of the so-called “power patents,” i.e., the most relevant patents, and the investigation of the importance, for a patent, to be cited by a power patent.
Originality/value
None of the approaches proposing the application of centrality measures to patent citation networks consider the main peculiarity of this scenario, i.e., that, if a patent $p_i$ cites a patent $p_j$, then the value of $p_i$ decreases. So, unlike the classical scientific paper citation scenario, in this one performing a citation has a cost for the citing entity. This fact is not considered by any of the approaches conceived to investigate paper citations. Nevertheless, this feature represents the core of the patent citation scenario. The approach has been explicitly conceived to capture this feature.
|
126
|
Zaman S, Lee WC. Real-space visualization of quantum phase transitions by network topology. Phys Rev E 2019; 100:012304. [PMID: 31499793 DOI: 10.1103/physreve.100.012304] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/12/2019] [Indexed: 06/10/2023]
Abstract
We demonstrate that with an appropriate quantum correlation function, a real-space network model can be constructed to study the phase transitions in quantum systems. For a three-dimensional bosonic system, a single-particle density matrix is adopted to construct an adjacency matrix. We show that a Bose-Einstein condensate transition can be interpreted as a transition into a small-world network, which is accurately captured by a small-world coefficient. For a one-dimensional disordered system, using the electron diffusion operator to build the adjacency matrix, we find that Anderson localized states create many weakly linked subgraphs, which significantly reduces the clustering coefficient and lengthens the shortest path. We show that the crossover from delocalized to localized regimes as a function of the disorder strength can be identified as a loss of global connection, which is revealed by the small-world coefficient as well as other independent measures such as robustness, efficiency, and algebraic connectivity. Our results suggest that quantum phase transitions can be visualized in real space and characterized by network analysis with suitable choices of quantum correlation functions.
Affiliation(s)
- Shehtab Zaman
- Department of Physics, Applied Physics, and Astronomy, Binghamton University-State University of New York, Binghamton, New York 13902, USA
| | - Wei-Cheng Lee
- Department of Physics, Applied Physics, and Astronomy, Binghamton University-State University of New York, Binghamton, New York 13902, USA
| |
|
127
|
Li M, Ni P, Chen X, Wang J, Wu FX, Pan Y. Construction of Refined Protein Interaction Network for Predicting Essential Proteins. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:1386-1397. [PMID: 28186903 DOI: 10.1109/tcbb.2017.2665482] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 06/06/2023]
Abstract
Identification of essential proteins based on protein interaction networks (PINs) is a very important and hot topic in the post-genome era. Up to now, a number of network-based essential protein discovery methods have been proposed. Generally, a static protein interaction network is constructed by using the protein-protein interactions obtained from different experiments or databases. Unfortunately, most of the network-based essential protein discovery methods are sensitive to the reliability of the constructed PIN. In this paper, we propose a new method for constructing a refined PIN by using gene expression profiles and subcellular location information. The basic idea behind refining the PIN is that two proteins should have a higher probability of physically interacting with each other if they appear together at the same subcellular location and are active together at at least one time point in the cell cycle. The original static PIN is denoted by S-PIN, while the final PIN refined by our method is denoted by TS-PIN. To evaluate whether the constructed TS-PIN is more suitable for the identification of essential proteins, 10 network-based essential protein discovery methods (DC, EC, SC, BC, CC, IC, LAC, NC, BN, and DMNC) are applied to it to identify essential proteins. TS-PIN is compared with two other networks, S-PIN and NF-APIN (a noise-filtered active PIN constructed by using gene expression data and S-PIN), on the prediction of essential proteins by using these ten network-based methods. The comparison results show that all 10 network-based methods achieve better results when applied to TS-PIN than when applied to S-PIN and NF-APIN.
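The refinement rule stated in this abstract (keep an interaction only if both proteins share a subcellular location and are active together at at least one time point) can be sketched as a simple edge filter. This is a minimal sketch under stated assumptions: the function name, the input layout (sets of locations and sets of active time points per protein), and the protein identifiers are hypothetical, and how "active" is derived from gene expression profiles is not reproduced here.

```python
def refine_network(edges, locations, active_times):
    """Keep an edge (u, v) only if u and v share at least one subcellular
    location and are both active at at least one common time point."""
    refined = []
    for u, v in edges:
        shared_location = locations.get(u, set()) & locations.get(v, set())
        shared_time = active_times.get(u, set()) & active_times.get(v, set())
        if shared_location and shared_time:
            refined.append((u, v))
    return refined


# Hypothetical static network with made-up annotations.
static_edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
locations = {
    "P1": {"nucleus"},
    "P2": {"nucleus", "cytoplasm"},
    "P3": {"cytoplasm"},
    "P4": {"membrane"},
}
# Indices of cell-cycle time points at which each protein counts as active.
active_times = {"P1": {0, 1}, "P2": {1, 2}, "P3": {5}, "P4": {5}}

print(refine_network(static_edges, locations, active_times))
```

In this toy input, P2-P3 share a location but are never active together, and P3-P4 are active together but share no location, so only P1-P2 survives.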
|
128
|
Zhang Z, Ruan J, Gao J, Wu FX. Predicting essential proteins from protein-protein interactions using order statistics. J Theor Biol 2019; 480:274-283. [PMID: 31251944 DOI: 10.1016/j.jtbi.2019.06.022] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 07/26/2018] [Revised: 03/24/2019] [Accepted: 06/24/2019] [Indexed: 12/11/2022]
Abstract
Many computational methods have been proposed to predict essential proteins from protein-protein interaction (PPI) networks. However, it is still challenging to improve the prediction accuracy. In this study, we propose a new method, esPOS (essential proteins Predictor using Order Statistics), to predict essential proteins from PPI networks. Firstly, we refine the networks by using gene expression information and subcellular localization information. Secondly, we design some new features, which combine the predicted protein secondary structure with the PPI network. We show that these new features are useful for predicting essential proteins. Thirdly, we optimize these features by using a greedy method, and combine the optimized features using an order statistics method. Our method achieves a prediction accuracy of 0.76-0.79 on two network datasets. The proposed method is available at https://sourceforge.net/projects/espos/.
Affiliation(s)
- Zhaopeng Zhang
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, China.
| | - Jishou Ruan
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, China.
| | - Jianzhao Gao
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, China.
| | - Fang-Xiang Wu
- Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada.
| |
|
129
|
Deng Y, Wu J, Qi M, Tan Y. Optimal disintegration strategy in spatial networks with disintegration circle model. CHAOS (WOODBURY, N.Y.) 2019; 29:061102. [PMID: 31266330 DOI: 10.1063/1.5093201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2019] [Accepted: 05/28/2019] [Indexed: 06/09/2023]
Abstract
The problem of network disintegration, such as suppression of an epidemic spread and destabilization of terrorist networks, has broad applications and has recently received increasing attention. In this study, we concentrate on the problem of network disintegration in spatial networks, in which the nodes and edges are embedded in space. For such a network, it is crucial to include spatial information in the search for an optimal disintegration strategy. We first build an optimization model with multiple disintegration circles in the spatial network and introduce a tabu search to seek the optimal disintegration strategy. We demonstrate that the "best" disintegration strategy can be identified through global searches in the spatial network. The optimal disintegration strategy of the spatial network tends to place the disintegration circles so that they cover more nodes which are closer to the average degree to achieve a more destructive effect. Our understanding of the optimal disintegration strategy in spatial networks may also provide insight into network protection, e.g., identification of the weakest part, which deserves further study.
Affiliation(s)
- Ye Deng
- College of Systems Engineering, National University of Defense Technology, Changsha, Hunan 410073, People's Republic of China
| | - Jun Wu
- International Academic Center of Complex Systems, Beijing Normal University, Zhuhai 519087, Guangdong, People's Republic of China
| | - Mingze Qi
- College of Systems Engineering, National University of Defense Technology, Changsha, Hunan 410073, People's Republic of China
| | - Yuejin Tan
- College of Systems Engineering, National University of Defense Technology, Changsha, Hunan 410073, People's Republic of China
| |
|
130
|
Chu D. A Fast Frequent Directions Algorithm for Low Rank Approximation. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2019; 41:1279-1293. [PMID: 29993709 DOI: 10.1109/tpami.2018.2839198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Recently, a deterministic method, frequent directions (FD), was proposed to solve the high dimensional low rank approximation problem. It works well in practice but incurs a high computational cost. In this paper, we establish a fast frequent directions algorithm for the low rank approximation problem, which embeds a randomized algorithm, sparse subspace embedding (SpEmb), in FD. This new algorithm makes use of FD's natural block structure and sends more information through SpEmb to each block in FD. We prove that our new algorithm produces a good low rank approximation with a sketch of size linear in the rank approximated. Its effectiveness and efficiency are demonstrated by the experimental results on both synthetic and real world datasets, as well as applications in network analysis.
|
131
|
Rentería-Ramos R, Hurtado-Heredia R, Urdinola BP. Morbi-Mortality of the Victims of Internal Conflict and Poor Population in the Risaralda Province, Colombia. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2019; 16:E1644. [PMID: 31083523 PMCID: PMC6540234 DOI: 10.3390/ijerph16091644] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/31/2018] [Revised: 04/28/2019] [Accepted: 04/29/2019] [Indexed: 12/18/2022]
Abstract
This work studies the health status of two populations similar in most social and environmental interactions but one: the individuals from one population are victims of an internal armed conflict. Both populations are located in the Risaralda province, Colombia, and the data for this study result from a combination of administrative records from the health system between 2011 and 2016. We implemented a methodology based on graph theory that defines the system as a set of heterogeneous social actors, including individuals as well as organizations, embedded in a biological environment. The model of analysis uses the diagnoses in medical records to detect morbidity and mortality patterns for each individual (ego-networks), and assumes that these patterns contain relevant information about the effects of the actions of social actors, in a given environment, on health status. The analysis of the diagnoses and causes of specific mortality, following the Social Network Analysis framework, shows similar morbidity and mortality rates for both populations. However, the diagnosis patterns show that victims exhibit broader interactions between diagnoses, including mental and behavioral disorders, due to the hardships of this population.
Affiliation(s)
- Rafael Rentería-Ramos
- Departments of Physics and Statistics, Universidad Nacional de Colombia, Cra 45 Bogotá, Colombia.
- School of Basic Sciences, Technologies and Engineering, Universidad Nacional Abierta y a Distancia de Colombia, 111321 Bogotá, Colombia.
| | | | - B Piedad Urdinola
- Department of Statistics, Universidad Nacional de Colombia, Cra 45 Bogotá, Colombia.
| |
Collapse
|
132
|
|
133
|
Rasti S, Vogiatzis C. A survey of computational methods in protein–protein interaction networks. ANNALS OF OPERATIONS RESEARCH 2019; 276:35-87. [DOI: 10.1007/s10479-018-2956-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 01/03/2025]
|
134
|
Abstract
Background:
Essential proteins play important roles in the survival or reproduction of an organism and support the stability of the system. Essential proteins are the minimum set of proteins absolutely required to maintain a living cell. The identification of essential proteins is a very important topic not only for a better comprehension of the minimal requirements for cellular life, but also for a more efficient discovery of human disease genes and drug targets. Traditionally, as the experimental identification of essential proteins is complex, it usually requires great time and expense. With the accumulation of high-throughput experimental data, many computational methods that usefully complement experimental methods have been proposed to identify essential proteins. In addition, the ability to rapidly and precisely identify essential proteins is of great significance for discovering disease genes and for drug design, and has great potential for applications in basic and synthetic biology research.
Objective:
The aim of this paper is to provide a review on the identification of essential proteins and genes, focusing on the current developments of different types of computational methods, to point out some progress and limitations of existing methods, and to discuss the challenges and directions for further research.
Affiliation(s)
- Ming Fang
- School of Computer Science, Shaanxi Normal University, Xi'an 710119, China
| | - Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, Xi'an 710119, China
| | - Ling Guo
- College of Life Sciences, Shaanxi Normal University, Xi'an 710119, China
| |
|
135
|
Xu B, Guan J, Wang Y, Wang Z. Essential Protein Detection by Random Walk on Weighted Protein-Protein Interaction Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:377-387. [PMID: 28504946 DOI: 10.1109/tcbb.2017.2701824] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 06/07/2023]
Abstract
Essential proteins are critical to the development and survival of cells. Identification of essential proteins is helpful for understanding the minimal set of required genes in a living cell and for designing new drugs. To detect essential proteins, various computational methods have been proposed based on protein-protein interaction (PPI) networks. However, protein interaction data obtained by high-throughput experiments usually contain many false positives, which negatively impact the accuracy of essential protein detection. Moreover, most existing studies have focused on the local information of proteins in PPI networks, while ignoring the influence of indirect protein interactions on essentiality. In this paper, we propose a novel method, called Essentiality Ranking (EssRank for short), to boost the accuracy of essential protein detection. To deal with the inaccuracy of PPI data, confidence scores of interactions are evaluated by integrating various biological information. A weighted edge clustering coefficient (WECC), considering both interaction confidence scores and network topology, is proposed to calculate edge weights in PPI networks. The weight of each node is evaluated by the sum of WECC values of its linking edges. A random walk method, making use of both direct and indirect protein interactions, is then employed to calculate protein essentiality iteratively. Experimental results on the yeast PPI network show that EssRank outperforms most existing methods, including the most commonly used centrality measures (SC, DC, BC, CC, IC, and EC), topology-based methods (DMNC and NC), and the data-integrating method IEW.
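The node-weighting step described in this abstract (node weight = sum of the edge clustering coefficients of its incident edges) can be illustrated with the plain, unweighted edge clustering coefficient. This is a minimal sketch under stated assumptions: it uses the commonly cited definition ECC(u, v) = (number of common neighbours) / min(deg(u) - 1, deg(v) - 1), it omits the interaction confidence scores that the paper's WECC also includes, and it does not reproduce the random walk step. Function names and the toy graph are hypothetical.

```python
def edge_clustering_coefficient(adjacency, u, v):
    """Unweighted ECC: triangles through edge (u, v) divided by the maximum
    number of triangles the edge could take part in."""
    common_neighbours = len(adjacency[u] & adjacency[v])
    limit = min(len(adjacency[u]) - 1, len(adjacency[v]) - 1)
    # An edge ending in a degree-1 node cannot be part of any triangle.
    return common_neighbours / limit if limit > 0 else 0.0


def node_weights(adjacency):
    """Weight of a node = sum of the ECC values of its incident edges."""
    return {
        u: sum(edge_clustering_coefficient(adjacency, u, v) for v in adjacency[u])
        for u in adjacency
    }


# Hypothetical toy network: a triangle a-b-c with a pendant node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(node_weights(toy))
```

The pendant edge c-d contributes nothing because it lies in no triangle, so d receives weight zero while the triangle nodes each receive weight 2.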
|
136
|
Lei X, Wang S, Wu F. Identification of Essential Proteins Based on Improved HITS Algorithm. Genes (Basel) 2019; 10:E177. [PMID: 30823614 PMCID: PMC6409685 DOI: 10.3390/genes10020177] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 11/27/2018] [Revised: 02/09/2019] [Accepted: 02/19/2019] [Indexed: 11/16/2022] Open
Abstract
Essential proteins are critical to the development and survival of cells. Identifying and analyzing essential proteins is vital to understanding the molecular mechanisms of living cells and to designing new drugs. With the development of high-throughput technologies, many protein–protein interaction (PPI) data are available, which facilitates the study of essential proteins at the network level. Up to now, although various computational methods have been proposed, the prediction precision still needs to be improved. In this paper, we propose a novel method, named HSEP, that applies Hyperlink-Induced Topic Search (HITS) to weighted PPI networks to detect essential proteins. First, an original undirected PPI network is transformed into a bidirectional PPI network. Then, both biological information and network topological characteristics are taken into account to weight the PPI networks. The biological information includes gene expression data, Gene Ontology (GO) annotation, and subcellular localization. The edge clustering coefficient represents the network topological characteristics and measures the closeness of two connected nodes. We conducted experiments on two species, namely Saccharomyces cerevisiae and Drosophila melanogaster, and the experimental results show that HSEP outperformed some state-of-the-art essential protein detection techniques.
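For readers unfamiliar with HITS, the following is a minimal sketch of the standard, unweighted algorithm applied to an undirected network that has first been made bidirectional, as the abstract describes. It is not the improved, weighted variant (HSEP) proposed in the paper; the edge weights derived from biological information are omitted, and the function names, iteration count, and toy graph are assumptions of this sketch.

```python
def to_bidirectional(undirected_edges):
    """Turn each undirected edge {u, v} into the two arcs u->v and v->u."""
    out_links = {}
    for u, v in undirected_edges:
        out_links.setdefault(u, set()).add(v)
        out_links.setdefault(v, set()).add(u)
    return out_links


def normalise(scores):
    """Scale scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = sum(value * value for value in scores.values()) ** 0.5 or 1.0
    return {node: value / norm for node, value in scores.items()}


def hits(out_links, iterations=50):
    """Standard HITS power iteration; returns (hub, authority) score dicts."""
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority update: sum of hub scores of the nodes pointing at each node.
        auth = {node: 0.0 for node in nodes}
        for source, targets in out_links.items():
            for target in targets:
                auth[target] += hub[source]
        auth = normalise(auth)
        # Hub update: sum of authority scores of the nodes each node points at.
        hub = normalise(
            {node: sum(auth[t] for t in out_links.get(node, ())) for node in nodes}
        )
    return hub, auth


# Hypothetical toy network: a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
hub, auth = hits(to_bidirectional(edges))
ranking = sorted(auth, key=auth.get, reverse=True)
print(ranking)
```

Nodes would then be ranked by their scores, with the top-ranked nodes treated as candidates; how HSEP combines hub and authority scores is not stated in the abstract.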
Affiliation(s)
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an 710119, China.
- Siguo Wang
  - School of Computer Science, Shaanxi Normal University, Xi'an 710119, China.
- Fangxiang Wu
  - Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada.
|
137
|
Abstract
Assume one has the capability of determining whether a node in a network is infectious or not by probing it. Then the problem of optimizing sentinel surveillance in networks is to identify the nodes to probe such that an emerging disease outbreak can be discovered early or reliably. Whether the emphasis should be on early or reliable detection depends on the scenario in question. We investigate three objective measures from the literature quantifying the performance of nodes in sentinel surveillance: the time to detection or extinction, the time to detection, and the frequency of detection. As a basis for the comparison, we use the susceptible-infectious-recovered (SIR) model on static and temporal networks of human contacts. We show that, for some regions of parameter space, the three objective measures can rank the nodes very differently. This means sentinel surveillance is a class of problems, and solutions need to choose an objective measure for the particular scenario in question. As opposed to other problems in network epidemiology, we draw similar conclusions from the static and temporal networks. Furthermore, we do not find one type of network structure that predicts the objective measures; which structure is predictive depends on both the data set and the SIR parameter values.
Affiliation(s)
- Petter Holme
  - Institute of Innovative Research, Tokyo Institute of Technology, Nagatsuta-cho 4259, Midori-ku, Yokohama, Kanagawa 226-8503, Japan
|
138
|
Chen X, Gong L, Li Q, Hu J, Liu X, Wang Y, Bai J, Ran X, Wu J, Ge Q, Li R, Xiao X, Li X, Zhang J, Wang Z. The appropriate remodeling of extracellular matrix is the key molecular signature in subcutaneous adipose tissue following Roux-en-Y gastric bypass. Life Sci 2019; 218:265-273. [DOI: 10.1016/j.lfs.2018.12.051] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/03/2018] [Revised: 12/24/2018] [Accepted: 12/29/2018] [Indexed: 12/12/2022]
|
139
|
Space‒Time Evolution Analysis of the Nanjing Metro Network Based on a Complex Network. SUSTAINABILITY 2019. [DOI: 10.3390/su11020523] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 11/16/2022]
Abstract
Many cities in China have opened subways, which have become an important part of urban public transport. How metro lines form a metro network, and how that network then changes the urban traffic pattern, is a problem worthy of attention. From 2005 to 2018, 10 metro lines were opened in Nanjing, which provides important reference data for studying the spatial and temporal evolution of the metro network. In this study, space L and space P models are established with the complex network method, following the opening sequence of the 10 metro lines in Nanjing. To track the evolution of the metro network, four parameters are proposed for evaluation: network density, network centrality, network clustering coefficient, and network average distance. To track changes in the spatial structure of the metro network, this study uses the concept of node degree in a complex network, analyzes the starting points, terminal points, and intersection points of the metro lines, and puts forward the concepts of star structure and ring structure. The analysis of the space‒time evolution of the Nanjing metro network shows that, with the gradual opening of metro lines, the network takes on a more complex structure; line connections tend toward important nodes and gradually outline the city’s commercial space pattern.
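The two representations mentioned in this abstract follow the usual definitions in transport network analysis: in space L two stations are linked if they are consecutive stops on some line, and in space P two stations are linked if they share a line at all. The sketch below builds both from a list of lines and computes two of the four parameters (density and average distance). The two-line network is a made-up example, not Nanjing data, and the function names are illustrative.

```python
from collections import deque
from itertools import combinations

def space_l(lines):
    """Link stations that are consecutive stops on at least one line."""
    return {frozenset(pair) for stops in lines.values()
            for pair in zip(stops, stops[1:])}

def space_p(lines):
    """Link every pair of stations served by a common line."""
    return {frozenset(pair) for stops in lines.values()
            for pair in combinations(stops, 2)}

def density(nodes, edges):
    """Fraction of possible undirected edges that are present."""
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1))

def average_distance(nodes, edges):
    """Mean shortest-path length over reachable ordered pairs (BFS)."""
    adj = {n: set() for n in nodes}
    for edge in edges:
        a, b = tuple(edge)
        adj[a].add(b)
        adj[b].add(a)
    total = pairs = 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            cur = queue.popleft()
            for nxt in adj[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# Hypothetical two-line network crossing at station C.
lines = {"line1": ["A", "B", "C", "D"], "line2": ["E", "C", "F"]}
stations = sorted({s for stops in lines.values() for s in stops})
l_edges, p_edges = space_l(lines), space_p(lines)
l_density, p_density = density(stations, l_edges), density(stations, p_edges)
l_dist, p_dist = average_distance(stations, l_edges), average_distance(stations, p_edges)
```

In space P the average distance counts the number of lines a trip uses (transfers plus one), which is why it is shorter than the stop-by-stop distance in space L.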
|
140
|
Zhang F, Peng W, Yang Y, Dai W, Song J. A Novel Method for Identifying Essential Genes by Fusing Dynamic Protein-Protein Interactive Networks. Genes (Basel) 2019; 10:genes10010031. [PMID: 30626157 PMCID: PMC6356314 DOI: 10.3390/genes10010031] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/13/2018] [Revised: 12/24/2018] [Accepted: 01/02/2019] [Indexed: 11/16/2022] Open
Abstract
Essential genes play an indispensable role in supporting the life of an organism. Identification of essential genes helps us to understand the underlying mechanism of cell life. The essential genes of bacteria are potential drug targets for some diseases. Recently, several computational methods have been proposed to detect essential genes based on static protein-protein interactive (PPI) networks. However, these methods ignore the fact that essential genes play essential roles under certain conditions. In this work, a novel method, called FDP, is proposed for the identification of essential proteins by fusing the dynamic PPI networks of different time points. First, the active PPI network of each time point is constructed, and the networks are then fused into a final network according to their similarities. Finally, a novel centrality method is designed to assign each gene in the final network a ranking score, considering its orthologous property and its global and local topological properties in the network. The model was applied to two different yeast data sets. The results show that FDP achieves better performance in essential gene prediction than existing methods based on either static PPI networks or dynamic networks.
Affiliation(s)
- Fengyu Zhang
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650093, China.
- Wei Peng
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650093, China.
  - Computer Center of Kunming University of Science and Technology, Kunming 650093, China.
- Yunfei Yang
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650093, China.
- Wei Dai
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650093, China.
- Junrong Song
  - Faculty of Management and Economics, Kunming University of Science and Technology, Kunming 650093, China.
|
141
|
Qi M, Deng Y, Deng H, Wu J. Optimal disintegration strategy in multiplex networks. CHAOS (WOODBURY, N.Y.) 2018; 28:121104. [PMID: 30599519 DOI: 10.1063/1.5078449] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 10/25/2018] [Accepted: 11/28/2018] [Indexed: 06/09/2023]
Abstract
Network disintegration comprises the problem of identifying the critical nodes or edges whose removal will lead to a network collapse. The solution of this problem is significant for strategies for dismantling terrorist organizations and for immunization in disease spreading. Network disintegration has received considerable attention in isolated networks. Here, we consider the generalization of optimal disintegration strategy problems to multiplex networks and propose a disintegration strategy based on tabu search. Experiments show that the disintegration effect of our strategy is clearly superior to those of typical disintegration strategies. Moreover, our approach sheds light on the properties of the nodes within the optimal disintegration strategies.
Affiliation(s)
- Mingze Qi
  - College of Systems Engineering, National University of Defense Technology, Changsha, Hunan 410073, People's Republic of China
- Ye Deng
  - College of Systems Engineering, National University of Defense Technology, Changsha, Hunan 410073, People's Republic of China
- Hongzhong Deng
  - College of Systems Engineering, National University of Defense Technology, Changsha, Hunan 410073, People's Republic of China
- Jun Wu
  - College of Systems Engineering, National University of Defense Technology, Changsha, Hunan 410073, People's Republic of China
|
142
|
Elahi A, Babamir SM. Identification of essential proteins based on a new combination of topological and biological features in weighted protein-protein interaction networks. IET Syst Biol 2018; 12:247-257. [PMID: 30472688 PMCID: PMC8687241 DOI: 10.1049/iet-syb.2018.5024] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 01/29/2018] [Revised: 04/23/2018] [Accepted: 04/30/2018] [Indexed: 02/01/2023] Open
Abstract
The identification of essential proteins in protein-protein interaction (PPI) networks is not only important in understanding the process of cellular life but also useful in diagnosis and drug design. Centrality measures based on network topology are sensitive to network noise. Moreover, these measures cannot detect low-connectivity essential proteins. The authors propose a new method that combines topological centrality measures with biological features, based on statistical analyses of essential proteins and protein complexes. Incomplete PPI networks pose the challenge of false-positive interactions. To remove these interactions, the PPI networks are weighted by gene ontology. Furthermore, the authors use a combination of classifiers, including the newly proposed measures and traditional weighted centrality measures, to improve the precision of identification. This combination is evaluated using the logistic regression model in terms of significance levels. The proposed method has been implemented and compared to both previous and more recent efficient computational methods using six statistical standards. The results show that the proposed method is more precise in identifying essential proteins than the previous methods. This level of precision was obtained on four different data sets: YHQ-W, YMBD-W, YDIP-W and YMIPS-W.
Affiliation(s)
- Abdolkarim Elahi
  - Department of Software Engineering, University of Kashan, Kashan, Iran
|
143
|
Graph Energies of Egocentric Networks and Their Correlation with Vertex Centrality Measures. ENTROPY 2018; 20:e20120916. [PMID: 33266640 PMCID: PMC7512502 DOI: 10.3390/e20120916] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/05/2018] [Revised: 11/08/2018] [Accepted: 11/27/2018] [Indexed: 11/17/2022]
Abstract
Graph energy is the energy of the matrix representation of the graph, where the energy of a matrix is the sum of singular values of the matrix. Depending on the definition of a matrix, one can contemplate graph energy, Randić energy, Laplacian energy, distance energy, and many others. Although theoretical properties of various graph energies have been investigated in the past in the areas of mathematics, chemistry, physics, or graph theory, these explorations have been limited to relatively small graphs representing chemical compounds or theoretical graph classes with strictly defined properties. In this paper we investigate the usefulness of the concept of graph energy in the context of large, complex networks. We show that when graph energies are applied to local egocentric networks, the values of these energies correlate strongly with vertex centrality measures. In particular, for some generative network models graph energies tend to correlate strongly with the betweenness and the eigencentrality of vertices. As the exact computation of these centrality measures is expensive and requires global processing of a network, our research opens the possibility of devising efficient algorithms for the estimation of these centrality measures based only on local information.
|
144
|
Han S, Yang H, Han Y, Zhang H. Genes and transcription factors related to the adverse effects of maternal type I diabetes mellitus on fetal development. Mol Cell Probes 2018; 43:64-71. [PMID: 30447278 DOI: 10.1016/j.mcp.2018.11.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2018] [Revised: 10/15/2018] [Accepted: 11/13/2018] [Indexed: 11/24/2022]
Abstract
PURPOSE: Maternal type I diabetes mellitus (T1DM) increases the risk of adverse pregnancy outcomes, but the corresponding mechanism is unclear. This study aims to investigate the mechanism underlying the adverse pregnancy outcomes of maternal T1DM. METHODS: A gene expression microarray dataset (GSE51546) was downloaded from the Gene Expression Omnibus. This dataset included 12 umbilical cord samples from the newborns of T1DM mothers (T1DM group, n = 6) and non-diabetic mothers (control group, n = 6). RESULTS: A total of 1051 differentially expressed genes (DEGs) were found between the two groups. The up-regulated DEGs were enriched in 30 KEGG pathways. HLA-DPA1, HLA-DMA, HLA-DMB, HLA-DQA1, HLA-DQA2 and HLA-DRA were enriched in "Type I diabetes mellitus". This pathway was strongly related to 14 pathways, most of which were associated with diseases. Then, a protein-protein interaction network was constructed, and 45 potential key DEGs were identified. The 45 DEGs were enriched in pathways such as "Rheumatoid arthritis", "Chemokine signaling pathway" and "Cytokine-cytokine receptor interaction" (e.g. CXCL12 and CCL5). Transcription factors (TFs) of key DEGs were predicted, and a TF-DEG regulatory network was constructed. CONCLUSIONS: Some genes (e.g. CXCL12 and CCL5) and their TFs were significantly and abnormally regulated in the umbilical cord tissue from the pregnancies of T1DM mothers compared to that from non-T1DM mothers.
Affiliation(s)
- Shuyi Han
  - Department of Clinical Laboratory, Ji'nan Central Hospital Affiliated to Shandong University, Ji'nan, 250013, China
- Huili Yang
  - Department of Obstetrics, Ji'nan Central Hospital Affiliated to Shandong University, Ji'nan, 250013, China.
- Yunhui Han
  - Department of Obstetrics, Ji'nan Central Hospital Affiliated to Shandong University, Ji'nan, 250013, China
- Hongzhi Zhang
  - Department of Gynecology, Ji'nan Central Hospital Affiliated to Shandong University, Ji'nan, 250013, China
|
145
|
Sciarra C, Chiarotti G, Laio F, Ridolfi L. A change of perspective in network centrality. Sci Rep 2018; 8:15269. [PMID: 30323242 PMCID: PMC6189051 DOI: 10.1038/s41598-018-33336-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 06/05/2018] [Accepted: 09/24/2018] [Indexed: 11/25/2022] Open
Abstract
Typing “Yesterday” into the search-bar of your browser provides a long list of websites with, in top places, a link to a video by The Beatles. The order your browser shows its search results is a notable example of the use of network centrality. Centrality measures the importance of the nodes in a network and it plays a crucial role in several fields, ranging from sociology to engineering, and from biology to economics. Many centrality metrics are available. However, these measures are generally based on ad hoc assumptions, and there is no commonly accepted way to compare the effectiveness and reliability of different metrics. Here we propose a new perspective where centrality definition arises naturally from the most basic feature of a network, its adjacency matrix. Following this perspective, different centrality measures naturally emerge, including degree, eigenvector, and hub-authority centrality. Within this theoretical framework, the effectiveness of different metrics is evaluated and compared. Tests on a large set of networks show that the standard centrality metrics perform unsatisfactorily, highlighting intrinsic limitations for describing the centrality of nodes in complex networks. More informative multi-component centrality metrics are proposed as the natural extension of standard metrics.
Affiliation(s)
- Carla Sciarra
  - Department of Environmental, Land and Infrastructure Engineering, Politecnico di Torino, Corso Duca degli Abruzzi, 24, 10129, Torino, (IT), Italy.
- Guido Chiarotti
  - Department of Environmental, Land and Infrastructure Engineering, Politecnico di Torino, Corso Duca degli Abruzzi, 24, 10129, Torino, (IT), Italy
- Francesco Laio
  - Department of Environmental, Land and Infrastructure Engineering, Politecnico di Torino, Corso Duca degli Abruzzi, 24, 10129, Torino, (IT), Italy
- Luca Ridolfi
  - Department of Environmental, Land and Infrastructure Engineering, Politecnico di Torino, Corso Duca degli Abruzzi, 24, 10129, Torino, (IT), Italy
|
146
|
Zhang J, Wang N, Xu A. Screening of genes associated with inflammatory responses in the endolymphatic sac reveals underlying mechanisms for autoimmune inner ear diseases. Exp Ther Med 2018; 16:2460-2470. [PMID: 30210597 PMCID: PMC6122540 DOI: 10.3892/etm.2018.6479] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/25/2017] [Accepted: 06/01/2018] [Indexed: 12/12/2022] Open
Abstract
The current study analyzed gene expression profiles of the endolymphatic sac (ES) in rats and identified expressed genes, present in the human and rat ES, to reveal key hubs for inflammatory responses. Microarray data (accession no. E-MEXP-3022) were obtained from the European Bioinformatics Institute database, including three biological replicates of ES plus dura tissues and three replicates of pure dura tissues from rats. Differentially expressed genes (DEGs) were screened using the Linear Model for Microarray data method, and a protein-protein interaction (PPI) network was constructed using data from the Search Tool for the Retrieval of Interacting Genes/Proteins database, followed by a module analysis via Clustering with Overlapping Neighborhood Expansion. Function enrichment analysis was performed using the Database for Annotation, Visualization and Integrated Discovery online tool. A total of 612 DEGs were identified, including 396 upregulated and 216 downregulated genes. Gene ontology term enrichment analysis indicated that DEGs were associated with cell adhesion, including α5-integrin (Itga1) and secreted phosphoprotein 1 (Spp1); T cell co-stimulation, including C-C chemokine ligand (Ccl)21 and Ccl19; and the toll-like receptor signaling pathway, including toll-like receptor (Tlr)2, Tlr7 and Tlr8. These conclusions were supported by Kyoto Encyclopedia of Genes and Genomes pathway analyses revealing extracellular matrix-receptor interaction, including Itga1 and Spp1; leukocyte transendothelial migration, including claudin-4 (Cldn4); and malaria, including Tlr2. The hub roles of Itga1, Cd24 and Spp1 were revealed by calculating three topological properties of the PPI network. Ccl21, Ccl19 and Cldn4 were demonstrated to be crucial following significant module analysis according to the corresponding threshold, which revealed they were enriched in inflammation pathways. Tlr7, Tlr2, granzyme m and Tlr8 were common genes associated with inflammatory responses in rat and human ES. In conclusion, abnormal expression of the aforementioned inflammation-associated genes may be associated with the development of autoimmune inner ear diseases.
Affiliation(s)
- Juhong Zhang
  - Department of Otolaryngology, Head and Neck Surgery, The Second Hospital of Shandong University, Jinan, Shandong 250033, P.R. China
  - Department of Otolaryngology, Shanghai Jiao Tong University, Affiliated to Sixth People's Hospital South Campus, Shanghai 201411, P.R. China
- Na Wang
  - Department of Otolaryngology, Head and Neck Surgery, The Second Hospital of Shandong University, Jinan, Shandong 250033, P.R. China
- Anting Xu
  - Department of Otolaryngology, Head and Neck Surgery, The Second Hospital of Shandong University, Jinan, Shandong 250033, P.R. China
  - Department of Otolaryngology, Affiliated Tenth People's Hospital of Tongji University, Shanghai 200072, P.R. China
|
147
|
Abstract
Suppose that G is a graph on n vertices. G has n eigenvalues (of its adjacency matrix), denoted by $\lambda_1, \lambda_2, \ldots, \lambda_n$. The Gaussian Estrada index, denoted by H(G) (Estrada et al., Chaos 27 (2017) 023109), is defined as $H(G) = \sum_{i=1}^{n} e^{-\lambda_i^{2}}$. The Gaussian Estrada index emphasizes the eigenvalues close to zero, which play an important role in chemistry, for example in molecular stability and molecular magnetic properties. In a network of particles governed by quantum mechanics, this graph-theoretic index is known to account for the information encoded in the eigenvalues of the Hamiltonian near zero by folding the graph spectrum. In this paper, we establish some new lower bounds for H(G) in terms of the number of vertices, the number of edges, and the first Zagreb index.
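For a symmetric adjacency matrix A, the sum over eigenvalues in the definition equals a matrix trace, $H(G) = \operatorname{tr}\, e^{-A^{2}} = \sum_{k \ge 0} (-1)^{k} \operatorname{tr}(A^{2k})/k!$, which allows a dependency-free sketch of the index. The code below evaluates that truncated series; it is an illustration only, reasonable for small graphs, and an eigenvalue solver would be the more natural choice for anything larger. The function names and the default of 60 terms are choices made for this example.

```python
import math

def mat_mul(x, y):
    """Multiply two square matrices given as lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def gaussian_estrada_index(adj, terms=60):
    """H(G) = tr(exp(-A^2)) via a truncated Taylor series (small graphs only)."""
    n = len(adj)
    a2 = mat_mul(adj, adj)
    # power holds (A^2)^k, starting from the identity for k = 0.
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    total = 0.0
    for k in range(terms):
        trace = sum(power[i][i] for i in range(n))
        total += (-1) ** k * trace / math.factorial(k)
        power = mat_mul(power, a2)
    return total

# Path graph on three vertices: eigenvalues are sqrt(2), 0 and -sqrt(2).
path3 = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
h_path3 = gaussian_estrada_index(path3)

# Single edge: eigenvalues are 1 and -1.
edge = [[0, 1],
        [1, 0]]
h_edge = gaussian_estrada_index(edge)
```

For the three-vertex path the definition gives $1 + 2e^{-2}$ directly, so the zero eigenvalue contributes half of the mass on its own, which is the sense in which the index emphasizes eigenvalues near zero.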
|
148
|
A systematic survey of centrality measures for protein-protein interaction networks. BMC SYSTEMS BIOLOGY 2018; 12:80. [PMID: 30064421 PMCID: PMC6069823 DOI: 10.1186/s12918-018-0598-2] [Citation(s) in RCA: 95] [Impact Index Per Article: 13.6] [Received: 11/23/2017] [Accepted: 06/22/2018] [Indexed: 12/12/2022]
Abstract
Background: Numerous centrality measures have been introduced to identify “central” nodes in large networks. The availability of a wide range of measures for ranking influential nodes leaves the user to decide which measure may best suit the analysis of a given network. The choice of a suitable measure is further complicated by the impact of the network topology on ranking influential nodes by centrality measures. To approach this problem systematically, we examined the centrality profile of nodes of yeast protein-protein interaction networks (PPINs) in order to detect which centrality measure succeeds in predicting influential proteins. We studied how different topological network features are reflected in a large set of commonly used centrality measures. Results: We used yeast PPINs to compare 27 common centrality measures. The measures characterize and assort influential nodes of the networks. We applied principal component analysis (PCA) and hierarchical clustering and found that the most informative measures depend on the network’s topology. Interestingly, some measures had a high level of contribution in comparison to others in all PPINs, namely Latora closeness, Decay, Lin, Freeman closeness, Diffusion, Residual closeness and Average distance centralities. Conclusions: The choice of a suitable set of centrality measures is crucial for inferring important functional properties of a network. We concluded that undertaking data reduction using unsupervised machine learning methods helps to choose appropriate variables (centrality measures). Hence, we proposed identifying the contribution proportions of the centrality measures with PCA as a prerequisite step of network analysis before inferring functional consequences, e.g., essentiality of a node.
|
149
|
Guo Y, Gao W, Wang D, Liu W, Liu Z. Gene alterations in monocytes are pathogenic factors for immunoglobulin A nephropathy by bioinformatics analysis of microarray data. BMC Nephrol 2018; 19:184. [PMID: 30029622 PMCID: PMC6053766 DOI: 10.1186/s12882-018-0944-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/08/2017] [Accepted: 06/07/2018] [Indexed: 11/27/2022] Open
Abstract
Background: Immunoglobulin A nephropathy (IgAN) is the most frequent primary glomerulopathy worldwide. The study aimed to provide potential molecular biomarkers for IgAN management. Methods: The public gene expression profile GSE58539 was used, which contained 17 monocyte samples (8 isolated from IgAN patients and 9 isolated from healthy blood donors). First, differentially expressed genes (DEGs) between the two kinds of samples were identified with the limma package. Next, pathway enrichment analysis was performed. A protein-protein interaction (PPI) network was then constructed, and key nodes in the PPI network were predicted using four network centrality analyses. Finally, a gene functional interaction (FI) network was constructed according to expression in each sample, and module networks were extracted from the FI network. Results: A total of 678 DEGs were screened out; of these, 72 DEGs were identified as crucial nodes in the PPI network that could distinguish IgAN and healthy samples well. In particular, IL6, TNF, IL1B, PRKACA and CCL20 were closely related to pathways such as hematopoietic cell lineage, apoptosis and the Toll-like receptor (TLR) signaling pathway. Moreover, 12 genes in the FI network belonged to the 72 identified key nodes, such as CCL20, HDAC10, FPR2 and PRKACA, which were also key genes in 4 module networks. Conclusions: Several crucial genes were identified in monocytes of IgAN patients, such as IL6, TNF, IL1B, CCL20, PRKACA, FPR2 and HDAC10. These genes might be jointly involved in pathways such as TLR and apoptosis signaling during IgAN progression.
Affiliation(s)
- Yingbo Guo
  - Department of Nephropathy, Dongfang Hospital Affiliated to Beijing University of Chinese Medicine, Beijing, 100078, China
- Wenfeng Gao
  - Department of Urology, Dongzhimen Hospital Affiliated to Beijing University of Chinese Medicine, Beijing, 100700, China
- Danyang Wang
  - Department of Nephropathy and Endocrinology, Dongzhimen Hospital Affiliated to Beijing University of Chinese Medicine, No. 5 Haiyuncang, Dongcheng District, Beijing City, 100700, China
- Weijing Liu
  - Department of Nephropathy and Endocrinology, Dongzhimen Hospital Affiliated to Beijing University of Chinese Medicine, No. 5 Haiyuncang, Dongcheng District, Beijing City, 100700, China
- Zhongjie Liu
  - Department of Nephropathy and Endocrinology, Dongzhimen Hospital Affiliated to Beijing University of Chinese Medicine, No. 5 Haiyuncang, Dongcheng District, Beijing City, 100700, China.
|
150
|
Zhong J, Sun Y, Peng W, Xie M, Yang J, Tang X. XGBFEMF: An XGBoost-Based Framework for Essential Protein Prediction. IEEE Trans Nanobioscience 2018; 17:243-250. [DOI: 10.1109/tnb.2018.2842219] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Indexed: 11/08/2022]
|