51
|
Lu H, Shang C, Zou S, Cheng L, Yang S, Wang L. A Novel Method for Predicting Essential Proteins by Integrating Multidimensional Biological Attribute Information and Topological Properties. Curr Bioinform 2022. [DOI: 10.2174/1574893617666220304201507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Background:
Essential proteins are indispensable to the maintenance of life activities and play essential roles in the areas of synthetic biology. Identification of essential proteins by computational methods has become a hot topic in recent years because of its efficiency.
Objective:
Identification of essential proteins is of important significance and practical use in the areas of synthetic biology, drug targets, and human disease genes.
Method:
In this paper, a method called EOP (Edge clustering coefficient-Orthologous-Protein) is proposed to infer potential essential proteins by combining multidimensional biological attribute information of proteins with topological properties of the protein-protein interaction network.
Results:
The simulation results on the yeast protein interaction network show that this method identifies more essential proteins than the other 12 methods (DC, IC, EC, SC, BC, CC, NC, LAC, PEC, CoEWC, POEM, DWE). Compared with DC (Degree Centrality) in particular, the sensitivity (SN) is 9% higher; when the top 1% of proteins are taken as candidates, the recognition rate is 34% higher, and for the top 5%, 10%, 15%, 20%, and 25% it is 36%, 22%, 15%, 11%, and 8% higher, respectively.
Conclusion:
Experimental results show that our method can achieve satisfactory prediction results, which may provide references for future research.
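As a rough illustration of the kind of evaluation described in this abstract, the sketch below ranks proteins by degree centrality (DC) and counts how many known essential proteins fall within the top fraction of candidates. The toy network, the protein labels, and the helper names `degree_centrality` and `top_fraction_hits` are assumptions made for illustration only; this is the DC baseline, not the EOP method.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree} for an undirected edge list."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


def top_fraction_hits(scores, essential, fraction):
    """Count known essential proteins among the top `fraction` of ranked nodes.

    Ties are broken alphabetically so the ranking is deterministic.
    Returns (number of hits, number of candidates considered).
    """
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    k = max(1, round(len(ranked) * fraction))
    hits = sum(1 for node in ranked[:k] if node in essential)
    return hits, k


# Hypothetical toy interaction network, not data from the paper.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
essential = {"A", "D"}  # hypothetical ground-truth labels

scores = degree_centrality(edges)
hits, k = top_fraction_hits(scores, essential, 0.4)
print(f"{hits} of the top {k} candidates are essential")
```

Repeating the count for several fractions (1%, 5%, 10%, and so on) on a real network gives the kind of comparison reported in the Results.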
Affiliation(s)
- Hanyu Lu
  - College of Big Data and Information Engineering, Guizhou University, Guizhou, China
- Chen Shang
  - College of Big Data and Information Engineering, Guizhou University, Guizhou, China
- Sai Zou
  - College of Big Data and Information Engineering, Guizhou University, Guizhou, China
- Lihong Cheng
  - College of Foreign Languages, Dalian Jiaotong University, China
- Shikong Yang
  - College of Big Data and Information Engineering, Guizhou University, Guizhou, China
- Lei Wang
  - College of Computer Engineering and Applied Mathematics, Changsha University, China
|
52
|
Bergermann K, Stoll M. Fast computation of matrix function-based centrality measures for layer-coupled multiplex networks. Phys Rev E 2022; 105:034305. [PMID: 35428049 DOI: 10.1103/physreve.105.034305] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/26/2021] [Accepted: 02/17/2022] [Indexed: 06/14/2023]
Abstract
Centrality measures identify and rank the most influential entities of complex networks. In this paper, we generalize matrix function-based centrality measures, which have been studied extensively for single-layer and temporal networks in recent years, to layer-coupled multiplex networks. The layers of these networks can reflect different relationships and interactions between entities or changing interactions over time. We use the supra-adjacency matrix as network representation, which has already been used to generalize eigenvector centrality to temporal and multiplex networks. With a suitable choice of edge weights, the definition of single-layer matrix function-based centrality measures in terms of walks on networks carries over naturally to the multilayer case. In contrast to other walk-based centralities, matrix function-based centralities are parameterized measures, which have been shown to interpolate between (local) degree and (global) eigenvector centrality in the single-layer case. As the explicit evaluation of the involved matrix function expressions becomes infeasible for medium to large-scale networks, we present highly efficient approximation techniques from numerical linear algebra, which rely on Krylov subspace methods, Gauss quadrature, and stochastic trace estimation. We present extensive numerical studies on synthetic and real-world multiplex transportation, communication, and collaboration networks. The comparison with established multilayer centrality measures shows that our framework produces meaningful rankings of nodes, layers, and node-layer pairs. Furthermore, our experiments corroborate the linear computational complexity of the employed numerical methods in terms of the network size that is theoretically indicated under the assumption of sparsity in the supra-adjacency matrix. This excellent scalability allows the efficient treatment of large-scale networks with the number of node-layer pairs of order 10^7 or higher.
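To make the idea of a matrix function-based centrality concrete, here is a minimal single-layer sketch of total communicability, the vector exp(βA)·1, computed with matrix-vector products only. It uses a truncated Taylor series, which is a simple stand-in chosen for illustration and not the Krylov subspace or Gauss quadrature techniques of the paper; the function name `total_communicability`, the parameter `terms`, and the toy path graph are assumptions.

```python
def total_communicability(adj, beta=1.0, terms=30):
    """Approximate exp(beta * A) @ 1 for a dense adjacency matrix `adj`.

    Uses the truncated series sum_k (beta * A)^k 1 / k!, so the matrix
    exponential itself is never formed; only matrix-vector products are used.
    """
    n = len(adj)
    term = [1.0] * n      # k = 0 term: the all-ones vector
    result = term[:]
    for k in range(1, terms):
        # Next term: (beta / k) * A @ previous term
        term = [
            beta / k * sum(adj[i][j] * term[j] for j in range(n))
            for i in range(n)
        ]
        result = [r + t for r, t in zip(result, term)]
    return result


# Toy undirected path graph 0 - 1 - 2 (hypothetical example).
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

scores = total_communicability(path, beta=1.0)
small_beta = total_communicability(path, beta=0.01)
print(scores)      # the middle node scores highest
print(small_beta)  # close to 1 + beta * degree for small beta
```

For small β the scores are approximately 1 + β·degree, which is one way to see the interpolation toward degree centrality mentioned in the abstract.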
Affiliation(s)
- Kai Bergermann
  - Department of Mathematics, Technische Universität Chemnitz, 09107 Chemnitz, Germany
- Martin Stoll
  - Department of Mathematics, Technische Universität Chemnitz, 09107 Chemnitz, Germany
|
53
|
|
54
|
Pirbaluty AM, Mehrban H, Kadkhodaei S, Ravash R, Oryan A, Ghaderi-Zefrehei M, Smith J. Network Meta-Analysis of Chicken Microarray Data following Avian Influenza Challenge-A Comparison of Highly and Lowly Pathogenic Strains. Genes (Basel) 2022; 13:435. [PMID: 35327988 PMCID: PMC8953847 DOI: 10.3390/genes13030435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2022] [Revised: 02/18/2022] [Accepted: 02/24/2022] [Indexed: 02/01/2023] Open
Abstract
The current bioinformatics study was undertaken to analyze the transcriptome of chicken (Gallus gallus) after influenza A virus challenge. A meta-analysis was carried out to explore the host expression response after challenge with lowly pathogenic avian influenza (LPAI) (H1N1, H2N3, H5N2, H5N3 and H9N2) and with highly pathogenic avian influenza (HPAI) H5N1 strains. To do so, ten microarray datasets obtained from the Gene Expression Omnibus (GEO) database were normalized and meta-analyzed for the LPAI and HPAI host response individually. Different undirected networks were constructed and their metrics determined, e.g., degree centrality, closeness centrality, harmonic centrality, subgraph centrality and eigenvector centrality. The results showed that, based on the centrality criteria, the CMTR1, EPSTI1, RNF213, HERC4L, IFIT5 and LY96 genes were the most significant during HPAI challenge, with PARD6G, HMG20A, PEX14, RNF151 and TLK1L having the lowest values. However, for LPAI challenge, the ZDHHC9, IMMP2L, COX7C, RBM18, DCTN3, and NDUFB1 genes had the largest values for the aforementioned criteria, with the GTF3C5, DROSHA, ATRX, RFWD2, MED23 and SEC23B genes having the lowest values. The results of this study can be used as a basis for future development of treatments/preventions of the effects of avian influenza in chicken.
Affiliation(s)
- Azadeh Moradi Pirbaluty
  - Department of Genetics and Animal Breeding, Faculty of Agriculture, Shahrekord University, Shahrekord 88186-34141, Iran
- Hossein Mehrban
  - Department of Genetics and Animal Breeding, Faculty of Agriculture, Shahrekord University, Shahrekord 88186-34141, Iran
- Saeid Kadkhodaei
  - Agricultural Biotechnology Research Institute of Iran (ABRII), Center of Iran, Isfahan 14968-13151, Iran
- Rudabeh Ravash
  - Department of Plant Breeding and Biotechnology, Faculty of Agriculture, Shahrekord University, Shahrekord 88186-34141, Iran
- Ahmad Oryan
  - Department of Pathology, School of Veterinary Medicine, Shiraz University, Shiraz 71557-13876, Iran
- Mostafa Ghaderi-Zefrehei
  - Department of Genetics and Animal Breeding, Faculty of Agriculture, Yasouj University, Yasouj 75918-74831, Iran
- Jacqueline Smith
  - The Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian EH25 9RG, UK
|
55
|
Freund AJ, Giabbanelli PJ. An Experimental Study on the Scalability of Recent Node Centrality Metrics in Sparse Complex Networks. Front Big Data 2022; 5:797584. [PMID: 35252851 PMCID: PMC8889076 DOI: 10.3389/fdata.2022.797584] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/18/2021] [Accepted: 01/21/2022] [Indexed: 11/20/2022] Open
Abstract
Node centrality measures are among the most commonly used analytical techniques for networks. They have long helped analysts to identify “important” nodes that hold power in a social context, where damages could have dire consequences for transportation applications, or who should be a focus for prevention in epidemiology. Given the ubiquity of network data, new measures have been proposed, occasionally motivated by emerging applications or by the ability to interpolate existing measures. Before analysts use these measures and interpret results, the fundamental question is: are these measures likely to complete within the time window allotted to the analysis? In this paper, we comprehensively examine how the time necessary to run 18 new measures (introduced from 2005 to 2020) scales as a function of the number of nodes in the network. Our focus is on giving analysts a simple and practical estimate for sparse networks. As the time consumption depends on the properties of the network, we nuance our analysis by considering whether the network is scale-free, small-world, or random. Our results identify that several metrics run in the order of O(n log n) and could scale to large networks, whereas others can require O(n^2) or O(n^3) and may become prime targets in future works for approximation algorithms or distributed implementations.
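One simple way to estimate how a measure's running time scales with the number of nodes is to fit a straight line to log(time) against log(n); the slope approximates the exponent. The sketch below shows that fit under stated assumptions: the function name `scaling_exponent` and the synthetic timings are hypothetical and are not taken from the study, whose own estimation procedure may differ.

```python
import math


def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) versus log(size).

    A slope near 2 suggests roughly quadratic growth; O(n log n) growth
    shows up as a slope slightly above 1 over a finite range of sizes.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


sizes = [100, 200, 400, 800]

# Synthetic timings (hypothetical, generated from known growth laws).
quadratic_times = [1e-6 * n ** 2 for n in sizes]
nlogn_times = [1e-6 * n * math.log(n) for n in sizes]

quadratic_slope = scaling_exponent(sizes, quadratic_times)
nlogn_slope = scaling_exponent(sizes, nlogn_times)
print(round(quadratic_slope, 3), round(nlogn_slope, 3))
```

With measured timings in place of the synthetic ones, the same fit gives a quick estimate of whether a measure is likely to finish within an analyst's time budget on a larger network.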
|
56
|
An efficient discrete differential evolution algorithm based on community structure for influence maximization. APPL INTELL 2022. [DOI: 10.1007/s10489-021-03021-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
57
|
Wang B, Zhang J, Dai J, Sheng J. Influential nodes identification using network local structural properties. Sci Rep 2022; 12:1833. [PMID: 35115582 PMCID: PMC8814008 DOI: 10.1038/s41598-022-05564-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2021] [Accepted: 01/12/2022] [Indexed: 11/08/2022] Open
Abstract
With the rapid development of information technology, the scale of complex networks is increasing, which makes the spread of diseases and rumors harder to control. Identifying influential nodes effectively and accurately is critical for predicting and controlling the network system. Some existing influential node detection algorithms do not consider the impact of edges, so their results deviate from expectations. Others consider the global structure of the network, resulting in high computational complexity. To solve these problems, we propose, based on information entropy theory, an algorithm that evaluates a node's influence from the entropy and weight distribution of the edges connected to it, accounting for the differences among edge weights and the influence of edge weights on neighbor nodes. We select eight real-world networks to verify the effectiveness and accuracy of the algorithm. Using the SIR model, we verify the infection size of each node and of the top-10 nodes according to the ranking results. In addition, the Kendall τ coefficient is used to examine the consistency of our algorithm with the SIR model. Based on the above experiments, the performance of the LENC algorithm is verified.
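The consistency check between a centrality ranking and simulated spreading influence can be sketched with a plain Kendall rank correlation. The version below is the simple tau-a variant, which assumes no tied values; the function name `kendall_tau` and the two score lists are hypothetical and do not come from the paper.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) / number of pairs.

    Assumes equal-length sequences; tied pairs count as neither
    concordant nor discordant.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        product = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical scores for four nodes: a centrality score and an
# SIR-style infection size for each node.
centrality = [1, 2, 3, 4]
infection_size = [1, 3, 2, 4]

tau = kendall_tau(centrality, infection_size)
print(round(tau, 4))
```

A value close to 1 means the two rankings order the nodes almost identically, while a value near 0 indicates little agreement.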
Affiliation(s)
- Bin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Junkai Zhang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Jinying Dai
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Jinfang Sheng
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
|
58
|
Predicting Essential Proteins Based on Integration of Local Fuzzy Fractal Dimension and Subcellular Location Information. Genes (Basel) 2022; 13:genes13020173. [PMID: 35205217 PMCID: PMC8872415 DOI: 10.3390/genes13020173] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/18/2021] [Revised: 01/08/2022] [Accepted: 01/12/2022] [Indexed: 11/17/2022] Open
Abstract
Essential proteins are indispensable to cells’ survival and development. Prediction and analysis of essential proteins are crucial for uncovering the mechanisms of cells. With the help of computer science and high-throughput technologies, forecasting essential proteins by protein–protein interaction (PPI) networks has become more efficient than traditional approaches (expensive experimental methods are generally used). Many computational algorithms were employed to predict the essential proteins; however, they have various restrictions. To improve the prediction accuracy, by introducing the Local Fuzzy Fractal Dimension (LFFD) of complex networks into the analysis of the PPI network, we propose a novel algorithm named LDS, which combines the LFFD of the PPI network with the protein subcellular location information. By testing the proposed LDS algorithm on three different yeast PPI networks, the experimental results show that LDS outperforms some state-of-the-art essential protein-prediction techniques.
|
59
|
Nirmala P, Nadarajan R. Cumulative centrality index: Centrality measures based ranking technique for molecular chemical structural graphs. J Mol Struct 2022. [DOI: 10.1016/j.molstruc.2021.131354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
60
|
Buxton JE, Abrams JF, Boulton CA, Barlow N, Rangel Smith C, Van Stroud S, Lees KJ, Lenton TM. Quantitatively monitoring the resilience of patterned vegetation in the Sahel. GLOBAL CHANGE BIOLOGY 2022; 28:571-587. [PMID: 34653310 DOI: 10.1111/gcb.15939] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/28/2021] [Accepted: 09/25/2021] [Indexed: 06/13/2023]
Abstract
Patterning of vegetation in drylands is a consequence of localized feedback mechanisms. Such feedbacks also determine ecosystem resilience, i.e., the ability to recover from perturbation. Hence, the patterning of vegetation has been hypothesized to be an indicator of resilience, that is, spots are less resilient than labyrinths. Previous studies have made this qualitative link and used models to quantitatively explore it, but few have quantitatively analysed available data to test the hypothesis. Here we provide methods for quantitatively monitoring the resilience of patterned vegetation, applied to 40 sites in the Sahel (a mix of previously identified and new ones). We show that an existing quantification of vegetation patterns in terms of a feature vector metric can effectively distinguish gaps, labyrinths, spots, and a novel category of spot-labyrinths at their maximum extent, whereas NDVI does not. The feature vector pattern metric correlates with mean precipitation. We then explored two approaches to measuring resilience. First, we treated the rainy season as a perturbation and examined the subsequent rate of decay of patterns and NDVI as possible measures of resilience. This showed faster decay rates (conventionally interpreted as greater resilience) associated with wetter, more vegetated sites. Second, we detrended the seasonal cycle and examined temporal autocorrelation and variance of the residuals as possible measures of resilience. Autocorrelation and variance of our pattern metric increase with declining mean precipitation, consistent with loss of resilience. Thus, drier sites appear less resilient, but we find no significant correlation between the mean or maximum value of the pattern metric (and associated morphological pattern types) and either of our measures of resilience.
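The second approach, using temporal autocorrelation and variance of detrended residuals as resilience indicators, can be sketched as follows. The function names `lag1_autocorrelation` and `variance` and the short residual series are assumptions for illustration; the detrending step itself is omitted, and the study's exact estimators may differ.

```python
def variance(series):
    """Population variance of a sequence of residuals."""
    mean = sum(series) / len(series)
    return sum((x - mean) ** 2 for x in series) / len(series)


def lag1_autocorrelation(series):
    """Lag-1 autocorrelation: covariance of consecutive values
    divided by the total sum of squared deviations."""
    mean = sum(series) / len(series)
    deviations = [x - mean for x in series]
    numerator = sum(a * b for a, b in zip(deviations, deviations[1:]))
    denominator = sum(d * d for d in deviations)
    return numerator / denominator


# Hypothetical detrended residuals (not data from the study).
residuals = [1.0, -1.0, 1.0, -1.0]

ac1 = lag1_autocorrelation(residuals)
var = variance(residuals)
print(ac1, var)
```

Under the usual interpretation, rising lag-1 autocorrelation and rising variance over time or across sites are read as signs of slower recovery from perturbation.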
Affiliation(s)
- Jesse F Abrams
  - Global Systems Institute, University of Exeter, Exeter, UK
  - Institute for Data Science and Artificial Intelligence, University of Exeter, Exeter, UK
- Samuel Van Stroud
  - The Alan Turing Institute, London, UK
  - Department of Physics and Astronomy, University College London, London, UK
- Kirsten J Lees
  - Global Systems Institute, University of Exeter, Exeter, UK
|
61
|
Iliopoulos AC, Papasotiriou I. Functional Complex Networks Based on Operational Architectonics: Application on Electroencephalography-Brain-computer Interface for Imagined Speech. Neuroscience 2021; 484:98-118. [PMID: 34871742 DOI: 10.1016/j.neuroscience.2021.11.045] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/12/2021] [Revised: 11/26/2021] [Accepted: 11/29/2021] [Indexed: 01/18/2023]
Abstract
A new method for analyzing brain complex dynamics and states is presented. This method constructs functional brain graphs and is comprised of two pylons: (a) Operational architectonics (OA) concept of brain and mind functioning. (b) Network neuroscience. In particular, the algorithm utilizes OA framework for a non-parametric segmentation of EEGs, which leads to the identification of change points, namely abrupt jumps in EEG amplitude, called Rapid Transition Processes (RTPs). Subsequently, the time coordinates of RTPs are used for the generation of undirected weighted complex networks fulfilling a scale-free topology criterion, from which various network metrics of brain connectivity are estimated. These metrics form feature vectors, which can be used in machine learning algorithms for classification and/or prediction. The method is tested in classification problems on an EEG-based BCI data set, acquired from individuals during imagery pronunciation tasks of various words/vowels. The classification results, based on a Naïve Bayes classifier, show that the overall accuracies were found to be above chance level in all tested cases. This method was also compared with other state-of-the-art computational approaches commonly used for functional network generation, exhibiting competitive performance. The method can be useful to neuroscientists wishing to enhance their repository of brain research algorithms.
Affiliation(s)
- A C Iliopoulos
  - Research Genetic Cancer Centre S.A. Industrial Area of Florina, 53100 Florina, Greece
- I Papasotiriou
  - Research Genetic Cancer Centre International GmbH, Zug 6300, Switzerland
|
62
|
A Bioinformatics Approach to Identifying Potential Biomarkers for Cryptosporidium parvum: A Coccidian Parasite Associated with Fetal Diarrhea. Vaccines (Basel) 2021; 9:vaccines9121427. [PMID: 34960172 PMCID: PMC8705633 DOI: 10.3390/vaccines9121427] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/26/2021] [Revised: 11/25/2021] [Accepted: 11/27/2021] [Indexed: 01/07/2023] Open
Abstract
Cryptosporidium parvum (C. parvum) is a protozoan parasite known for cryptosporidiosis in pre-weaned calves. Animals and patients with immunosuppression are at risk of developing the disease, which can cause potentially fatal diarrhoea. The present study aimed to construct a network biology framework based on the differentially expressed genes (DEGs) of C. parvum infected subjects. In this way, the gene expression profiling analysis of C. parvum infected individuals can give us a snapshot of actively expressed genes and transcripts under infection conditions. In the present study, we have analyzed microarray data sets and compared the gene expression profiles of the patients with the different data sets of the healthy control. Using a network medicine approach to identify the most influential genes in the gene interaction network, we uncovered essential genes and pathways related to C. parvum infection. We identified 164 differentially expressed genes (109 up- and 54 down-regulated DEGs) and allocated them to pathway and gene set enrichment analysis. The results underpin the identification of seven significant hub genes with high centrality values: ISG15, MX1, IFI44L, STAT1, IFIT1, OAS1, IFIT3, RSAD2, IFITM1, and IFI44. These genes are associated with diverse biological processes not limited to host interaction, type 1 interferon production, or response to IL-gamma. Furthermore, four genes (IFI44, IFIT3, IFITM1, and MX1) were also discovered to be involved in innate immunity, inflammation, apoptosis, phosphorylation, cell proliferation, and cell signaling. In conclusion, these results reinforce the development and implementation of tools based on gene profiles to identify and treat Cryptosporidium parvum-related diseases at an early stage.
|
63
|
Sladek V, Harada R, Shigeta Y. Residue Folding Degree-Relationship to Secondary Structure Categories and Use as Collective Variable. Int J Mol Sci 2021; 22:ijms222313042. [PMID: 34884847 PMCID: PMC8657879 DOI: 10.3390/ijms222313042] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/28/2021] [Revised: 11/23/2021] [Accepted: 11/29/2021] [Indexed: 11/22/2022] Open
Abstract
Recently, we have shown that the residue folding degree, a network-based measure of folded content in proteins, is able to capture backbone conformational transitions related to the formation of secondary structures in molecular dynamics (MD) simulations. In this work, we focus primarily on developing a collective variable (CV) for MD based on this residue-bound parameter to be able to trace the evolution of secondary structure in segments of the protein. We show that this CV can do just that and that the related energy profiles (potentials of mean force, PMF) and transition barriers are comparable to those found by others for particular events in the folding process of the model mini protein Trp-cage. Hence, we conclude that the relative segment folding degree (the newly proposed CV) is a computationally viable option to gain insight into the formation of secondary structures in protein dynamics. We also show that this CV can be directly used as a measure of the amount of α-helical content in a selected segment.
Affiliation(s)
- Vladimir Sladek
  - Institute of Chemistry, Slovak Academy of Sciences, 845 38 Bratislava, Slovakia
- Ryuhei Harada
  - Center for Computational Sciences, University of Tsukuba, Tsukuba 305-8577, Ibaraki, Japan
- Yasuteru Shigeta
  - Center for Computational Sciences, University of Tsukuba, Tsukuba 305-8577, Ibaraki, Japan
|
64
|
Meng X, Li W, Peng X, Li Y, Li M. Protein interaction networks: centrality, modularity, dynamics, and applications. FRONTIERS OF COMPUTER SCIENCE 2021; 15:156902. [DOI: 10.1007/s11704-020-8179-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/07/2018] [Accepted: 08/12/2020] [Indexed: 01/03/2025]
|
65
|
Matyi MA, Cioaba SM, Banich MT, Spielberg JM. Identifying brain regions supporting amygdalar functionality: Application of a novel graph theory technique. Neuroimage 2021; 244:118614. [PMID: 34571162 PMCID: PMC8802335 DOI: 10.1016/j.neuroimage.2021.118614] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/23/2021] [Accepted: 09/21/2021] [Indexed: 11/22/2022] Open
Abstract
Effective amygdalar functionality depends on the concerted activity of a complex network of regions. Thus, the role of the amygdala cannot be fully understood without identifying the set of brain structures that allow the processes performed by the amygdala to emerge. However, this identification has yet to occur, hampering our ability to understand both normative and pathological processes that rely on the amygdala. We developed and applied novel graph theory methods to diffusion-based anatomical networks in a large sample (n = 1,052, 54.28% female, mean age=28.75) to identify nodes that critically support amygdalar interactions with the larger brain network. We examined three graph properties, each indexing a different emergent aspect of amygdalar network communication: current-flow betweenness centrality (amygdalar influence on information flowing between other pairs of nodes), node communicability (clarity of communication between the amygdala and other nodes), and subgraph centrality (amygdalar influence over local network processing). Findings demonstrate that each of these aspects of amygdalar communication is associated with separable sets of regions and, in some cases, these sets map onto previously identified sub-circuits. For example, betweenness and communicability were each associated with different sub-circuits that have been identified in previous work as supporting distinct aspects of memory-guided behavior. Other regions identified span basic (e.g., visual cortex) to higher-order (e.g., insula) sensory processing and executive functions (e.g., dorsolateral prefrontal cortex). Present findings expand our current understanding of amygdalar function by showing that there is no single 'amygdala network', but rather multiple networks, each supporting different modes of amygdalar interaction with the larger brain network. 
Additionally, our novel method allowed for the identification of how such regions support the amygdala, which has not been previously explored.
Affiliation(s)
- Melanie A Matyi
  - Department of Psychological and Brain Sciences, University of Delaware, Newark, DE 19716, USA
- Sebastian M Cioaba
  - Department of Mathematical Sciences, University of Delaware, Newark, DE 19716, USA
- Marie T Banich
  - Department of Psychology and Neuroscience, University of Colorado Boulder, Boulder, CO 80309, USA
- Jeffrey M Spielberg
  - Department of Psychological and Brain Sciences, University of Delaware, Newark, DE 19716, USA
|
66
|
Zhu X, He X, Kuang L, Chen Z, Lancine C. A Novel Collaborative Filtering Model-Based Method for Identifying Essential Proteins. Front Genet 2021; 12:763153. [PMID: 34745230 PMCID: PMC8566338 DOI: 10.3389/fgene.2021.763153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2021] [Accepted: 09/13/2021] [Indexed: 11/19/2022] Open
Abstract
Considering that traditional biological experiments are expensive and time consuming, it is important to develop effective computational models to infer potential essential proteins. In this manuscript, a novel collaborative filtering model-based method called CFMM was proposed, in which, an updated protein–domain interaction (PDI) network was constructed first by applying collaborative filtering algorithm on the original PDI network, and then, through integrating topological features of PDI networks with biological features of proteins, a calculative method was designed to infer potential essential proteins based on an improved PageRank algorithm. The novelties of CFMM lie in construction of an updated PDI network, application of the commodity-customer-based collaborative filtering algorithm, and introduction of the calculation method based on an improved PageRank algorithm, which ensured that CFMM can be applied to predict essential proteins without relying entirely on known protein–domain associations. Simulation results showed that CFMM can achieve reliable prediction accuracies of 92.16, 83.14, 71.37, 63.87, 55.84, and 52.43% in the top 1, 5, 10, 15, 20, and 25% predicted candidate key proteins based on the DIP database, which are remarkably higher than 14 competitive state-of-the-art predictive models as a whole, and in addition, CFMM can achieve satisfactory predictive performances based on different databases with various evaluation measurements, which further indicated that CFMM may be a useful tool for the identification of essential proteins in the future.
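For readers unfamiliar with the ranking step, the sketch below shows the standard PageRank power iteration on a tiny directed graph. It is the textbook algorithm, not the improved PageRank variant used by CFMM, whose modifications are not described in this abstract; the function name `pagerank`, the damping value 0.85, and the toy graph are assumptions.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Standard PageRank by power iteration.

    `out_links` maps every node to the list of nodes it links to; every
    target must also appear as a key. Nodes without outgoing links spread
    their rank uniformly over all nodes.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = out_links[node] or nodes  # dangling node: link to all
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: B and C point to A, and A points to B.
graph = {"A": ["B"], "B": ["A"], "C": ["A"]}

ranks = pagerank(graph)
ordering = sorted(ranks, key=ranks.get, reverse=True)
print(ordering)
```

Sorting nodes by the resulting scores and taking the top 1%, 5%, and so on yields candidate lists of the kind evaluated in the abstract.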
Affiliation(s)
- Xianyou Zhu
  - College of Computer Science and Technology, Hengyang Normal University, Hengyang, China
  - Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang, China
- Xin He
  - College of Computer, Xiangtan University, Xiangtan, China
- Linai Kuang
  - College of Computer, Xiangtan University, Xiangtan, China
- Zhiping Chen
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Camara Lancine
  - The Social Sciences and Management University of Bamako, Bamako, Mali
|
67
|
Owen LLW, Chang TH, Manning JR. High-level cognition during story listening is reflected in high-order dynamic correlations in neural activity patterns. Nat Commun 2021; 12:5728. [PMID: 34593791 PMCID: PMC8484677 DOI: 10.1038/s41467-021-25876-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/23/2019] [Accepted: 08/24/2021] [Indexed: 02/08/2023] Open
Abstract
Our thoughts arise from coordinated patterns of interactions between brain structures that change with our ongoing experiences. High-order dynamic correlations in neural activity patterns reflect different subgraphs of the brain's functional connectome that display homologous lower-level dynamic correlations. Here we test the hypothesis that high-level cognition is reflected in high-order dynamic correlations in brain activity patterns. We develop an approach to estimating high-order dynamic correlations in timeseries data, and we apply the approach to neuroimaging data collected as human participants either listen to a ten-minute story or listen to a temporally scrambled version of the story. We train across-participant pattern classifiers to decode (in held-out data) when in the session each neural activity snapshot was collected. We find that classifiers trained to decode from high-order dynamic correlations yield the best performance on data collected as participants listened to the (unscrambled) story. By contrast, classifiers trained to decode data from scrambled versions of the story yielded the best performance when they were trained using first-order dynamic correlations or non-correlational activity patterns. We suggest that as our thoughts become more complex, they are reflected in higher-order patterns of dynamic network interactions throughout the brain.
Affiliation(s)
- Lucy L W Owen
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
| | - Thomas H Chang
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Amazon.com, Seattle, WA, USA
| | - Jeremy R Manning
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA.
| |
|
68
|
Li S, Zhang Z, Li X, Tan Y, Wang L, Chen Z. An iteration model for identifying essential proteins by combining comprehensive PPI network with biological information. BMC Bioinformatics 2021; 22:430. [PMID: 34496745 PMCID: PMC8425031 DOI: 10.1186/s12859-021-04300-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/04/2020] [Accepted: 07/08/2021] [Indexed: 11/10/2022] Open
Abstract
Background:
Essential proteins have a great impact on cell survival and development and play important roles in disease analysis and new drug design. However, because identifying essential proteins through biological experiments is inefficient and costly, there is an urgent need for automated and accurate detection methods. In recent years, the recognition of essential proteins in protein-protein interaction (PPI) networks has become a research hotspot, and many computational models for predicting essential proteins have been proposed.
Results:
To achieve higher prediction performance, this paper proposes a new prediction model called TGSO. In TGSO, a protein aggregation degree network is first constructed by adopting the node density measurement method for complex networks. In parallel, a protein co-expression interaction network is constructed by combining gene expression information with network connectivity, and a protein co-localization interaction network is constructed from subcellular localization data. These three newly constructed networks are then integrated into a comprehensive protein–protein interaction network. Finally, based on homology information, scores are calculated iteratively for the proteins and used to estimate their importance. To evaluate the identification performance of TGSO, we compared it with 13 recent competitive methods on three yeast databases. Experimental results show that TGSO achieves identification accuracies of 94%, 82% and 72% among the top 1%, 5% and 10% of candidate proteins respectively, which are to some degree superior to those of these state-of-the-art competitive models.
Conclusions:
We constructed a comprehensive interaction network based on multi-source data to reduce the noise and errors in the initial PPI network, and combined it with an iterative method to improve the accuracy of essential protein prediction. This suggests that TGSO may also be conducive to the future development of essential protein recognition.
Affiliation(s)
- Shiyuan Li
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410022, China.,Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, 410022, China
| | - Zhen Zhang
- College of Electronic Information and Electrical Engineering, Changsha University, Changsha, 410022, China
| | - Xueyong Li
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410022, China.,Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, 410022, China
| | - Yihong Tan
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410022, China.,Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, 410022, China
| | - Lei Wang
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410022, China.,Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, 410022, China
| | - Zhiping Chen
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410022, China.,Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, 410022, China
| |
|
69
|
Zhang Z, Jiang M, Wu D, Zhang W, Yan W, Qu X. A Novel Method for Identifying Essential Proteins Based on Non-negative Matrix Tri-Factorization. Front Genet 2021; 12:709660. [PMID: 34422014 PMCID: PMC8378176 DOI: 10.3389/fgene.2021.709660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Accepted: 07/06/2021] [Indexed: 11/29/2022] Open
Abstract
Identification of essential proteins is very important for understanding the basic requirements to sustain a living organism. In recent years, there has been increasing interest in using computational methods to predict essential proteins based on protein–protein interaction (PPI) networks or on the fusion of multiple kinds of biological information. However, existing PPI data have been observed to contain false negatives and false positives. Fusing multiple kinds of biological information can reduce the influence of false data in PPI networks, but it inevitably introduces more noise at the same time. In this article, we propose a novel non-negative matrix tri-factorization (NMTF)-based model (NTMEP) to predict essential proteins. First, a weighted PPI network is established using only the topological features of the network, so as to avoid introducing more noise. To reduce the influence of false data in the PPI network on the identification of essential proteins, the NMTF technique, a widely used recommendation algorithm, is applied to reconstruct an optimized PPI network with more potential protein–protein interactions. Then, we use the PageRank algorithm to compute the final ranking score of each protein, with subcellular localization and homologous information of proteins used to calculate the initial scores. Extensive experiments on publicly available datasets indicate that the NTMEP model performs better in predicting essential proteins than state-of-the-art methods. In this investigation, we demonstrated that introducing non-negative matrix tri-factorization can effectively improve the quality of the protein–protein interaction network and thereby reduce the negative impact of noise on the prediction. This finding also provides a new perspective for other applications based on protein–protein interaction networks.
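The ranking step described in this abstract (PageRank seeded with initial scores) can be illustrated with a minimal sketch. It is not the NTMEP implementation: the function name `rank_proteins`, the damping factor `alpha`, and the use of the normalized initial scores as the restart vector are all assumptions made here for illustration. The NMTF reconstruction step is not shown.

```python
def rank_proteins(edges, initial, alpha=0.85, iters=200, tol=1e-12):
    """PageRank-style ranking on an undirected weighted network.

    edges:   dict mapping (u, v) pairs to positive edge weights (hypothetical input format)
    initial: dict mapping each protein to a non-negative initial score,
             e.g. derived from localization or homology data (assumption)
    """
    nodes = sorted(initial)
    neighbors = {n: {} for n in nodes}
    for (u, v), w in edges.items():
        neighbors[u][v] = w
        neighbors[v][u] = w
    # Total edge weight at each node, used to normalize outgoing flow.
    strength = {n: sum(neighbors[n].values()) for n in nodes}
    total = sum(initial.values())
    prior = {n: initial[n] / total for n in nodes}

    score = dict(prior)
    for _ in range(iters):
        new = {}
        for n in nodes:
            # Score flowing in from neighbors, proportional to edge weight.
            flow = sum(score[m] * w / strength[m] for m, w in neighbors[n].items())
            new[n] = alpha * flow + (1 - alpha) * prior[n]
        delta = sum(abs(new[n] - score[n]) for n in nodes)
        score = new
        if delta < tol:
            break
    return score


# Toy star network: hub protein A connected to B, C and D, uniform initial scores.
toy_edges = {("A", "B"): 1.0, ("A", "C"): 1.0, ("A", "D"): 1.0}
toy_scores = rank_proteins(toy_edges, {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0})
```

In the toy network the hub receives the highest score, which is the behavior one would expect from any PageRank variant. How the real method weights edges and builds the initial vector is described only at a high level in the abstract.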
Affiliation(s)
- Zhihong Zhang
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China.,School of Information Technology and Management, Hunan University of Finance and Economics, Changsha, China
| | - Meiping Jiang
- Department of Ultrasound, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, China
| | - Dongjie Wu
- Department of Banking and Finance, Monash University, Clayton, VIC, Australia
| | - Wang Zhang
- Department of Optoelectronic Engineering, Jinan University, Guangzhou, China
| | - Wei Yan
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
| | - Xilong Qu
- School of Information Technology and Management, Hunan University of Finance and Economics, Changsha, China.,Hunan Provincial Key Laboratory of Finance and Economics Big Data Science and Technology, Hunan University of Finance and Economics, Changsha, China
| |
|
70
|
Abstract
The eigenvalues of a graph (the roots of its characteristic polynomial) are sensitive to its symmetry-related characteristics. In this study, we examined three eigenvalue-based molecular descriptors. These topological molecular descriptors, among others, capture information on the symmetry of a molecular graph. Furthermore, they are commonly employed for predicting physico-chemical properties and/or biological activities of molecules. It has been shown that these indices describe well the molecular features that depend on fine structural details. Therefore, revealing the impact of structural details on the values of the eigenvalue-based topological indices should also give insight into how physico-chemical properties depend on them. Here, the effect of a ring in a molecule on the values of the graph energy, the Estrada index and the resolvent energy of a graph is examined.
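As a small illustration of the three descriptors named above, the sketch below uses their standard definitions: graph energy as the sum of absolute adjacency eigenvalues, the Estrada index as the sum of exp(eigenvalue), and the resolvent energy as the sum of 1/(n - eigenvalue) for a graph on n vertices. To stay dependency-free it relies on the known closed-form adjacency spectra of the path P_n and the cycle C_n. Comparing a six-vertex chain with a six-vertex ring is an example chosen here, not one taken from the study.

```python
import math


def path_spectrum(n):
    """Adjacency eigenvalues of the path graph P_n: 2*cos(pi*k/(n+1)), k = 1..n."""
    return [2 * math.cos(math.pi * k / (n + 1)) for k in range(1, n + 1)]


def cycle_spectrum(n):
    """Adjacency eigenvalues of the cycle graph C_n (n >= 3): 2*cos(2*pi*k/n), k = 0..n-1."""
    return [2 * math.cos(2 * math.pi * k / n) for k in range(n)]


def graph_energy(eigs):
    """Graph energy: sum of the absolute values of the eigenvalues."""
    return sum(abs(x) for x in eigs)


def estrada_index(eigs):
    """Estrada index: sum of exp(eigenvalue)."""
    return sum(math.exp(x) for x in eigs)


def resolvent_energy(eigs):
    """Resolvent energy: sum of 1 / (n - eigenvalue), where n is the number of vertices."""
    n = len(eigs)
    return sum(1.0 / (n - x) for x in eigs)


# Open chain of six vertices versus a six-membered ring (hydrogen-suppressed skeletons).
chain, ring = path_spectrum(6), cycle_spectrum(6)
for name, f in [("energy", graph_energy), ("Estrada", estrada_index), ("resolvent", resolvent_energy)]:
    print(f"{name}: chain={f(chain):.4f} ring={f(ring):.4f}")
```

Closing the chain into a ring adds one edge and changes the spectrum, so all three values shift. General molecular graphs have no closed-form spectrum and would need a numerical eigenvalue solver instead.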
|
71
|
Vafaee R, Tavirani MR, Tavirani SR, Razzaghi M. Assessment of cancer prevention effect of exercise. Hum Antibodies 2021; 30:31-36. [PMID: 34459390 DOI: 10.3233/hab-210454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Many reports document the benefits of exercise for human health. Although evidence indicates a positive effect of exercise on disease prevention, many aspects of the underlying mechanism need further investigation. The aim of this study is to determine critical genes that affect human health. GSE156249, comprising 12 gene expression profiles of vastus lateralis muscle biopsies from healthy individuals before and after a 12-week combined exercise training intervention, was extracted from the Gene Expression Omnibus (GEO) database. The significant differentially expressed genes (DEGs) were included in an interactome using Cytoscape software and the STRING database. The network was analyzed to find the central nodes and subnetwork clusters. The nodes of the prominent cluster were assessed via gene ontology using ClueGO. Eight significant DEGs and 100 first neighbors were analyzed via network analysis. The network includes 2 clusters, and COL3A1, BGN, and LOX were determined as central DEGs. The critical DEGs were involved in cancer prevention processes.
Affiliation(s)
- Reza Vafaee
- Proteomics Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran.,Laser Application in Medical Sciences Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Mostafa Rezaei Tavirani
- Proteomics Research Center, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Sina Rezaei Tavirani
- Proteomics Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Mohammadreza Razzaghi
- Laser Application in Medical Sciences Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| |
|
72
|
Peng J, Kuang L, Zhang Z, Tan Y, Chen Z, Wang L. A Novel Model for Identifying Essential Proteins Based on Key Target Convergence Sets. Front Genet 2021; 12:721486. [PMID: 34394201 PMCID: PMC8358660 DOI: 10.3389/fgene.2021.721486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2021] [Accepted: 06/30/2021] [Indexed: 11/20/2022] Open
Abstract
In recent years, many computational models have been designed to detect essential proteins based on protein-protein interaction (PPI) networks. However, due to the incompleteness of PPI networks, the prediction accuracy of these models is still not satisfactory. In this manuscript, a novel key target convergence sets based prediction model (KTCSPM) is proposed to identify essential proteins. In KTCSPM, a weighted PPI network and a weighted domain-domain interaction (DDI) network are constructed first, based on known PPIs and protein-domain interactions (PDIs) downloaded from benchmark databases. Then, by integrating these two kinds of networks, a novel weighted PDI network is built. Next, by assigning a unique key target convergence set (KTCS) to each node in the weighted PDI network, an improved method based on the random walk with restart is designed to identify essential proteins. Finally, to evaluate the predictive performance of KTCSPM, it is compared with 12 competitive state-of-the-art models, and experimental results show that KTCSPM achieves better prediction accuracy. This satisfactory predictive performance indicates that KTCSPM might be a good supplement to future research on the prediction of essential proteins.
Affiliation(s)
- Jiaxin Peng
- College of Computer, Xiangtan University, Xiangtan, China.,College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
| | - Linai Kuang
- College of Computer, Xiangtan University, Xiangtan, China
| | - Zhen Zhang
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
| | - Yihong Tan
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
| | - Zhiping Chen
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
| | - Lei Wang
- College of Computer, Xiangtan University, Xiangtan, China.,College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
| |
|
73
|
Weng T, Wang H, Yang H, Gu C, Zhang J, Small M. Representing complex networks without connectivity via spectrum series. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.01.067] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
|
74
|
He X, Kuang L, Chen Z, Tan Y, Wang L. Method for Identifying Essential Proteins by Key Features of Proteins in a Novel Protein-Domain Network. Front Genet 2021; 12:708162. [PMID: 34267785 PMCID: PMC8276041 DOI: 10.3389/fgene.2021.708162] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2021] [Accepted: 05/31/2021] [Indexed: 11/21/2022] Open
Abstract
In recent years, due to the low accuracy and high costs of traditional biological experiments, more and more computational models have been proposed to infer potential essential proteins. In this paper, a novel prediction method called KFPM is proposed, in which a novel protein-domain heterogeneous network is first established by combining known protein-protein interactions with known associations between proteins and domains. Next, based on key topological characteristics extracted from the newly constructed protein-domain network and functional characteristics extracted from multiple kinds of biological information on proteins, a new computational method is designed that integrates these features to infer potential essential proteins using an improved PageRank algorithm. Finally, to evaluate the performance of KFPM, we compared it with 13 state-of-the-art prediction methods. Experimental results show that, among the top 1, 5, and 10% of candidate proteins predicted by KFPM, the prediction accuracy reaches 96.08, 83.14, and 70.59%, respectively, significantly outperforming all 13 competing methods. This suggests that KFPM may be a meaningful tool for the prediction of potential essential proteins in the future.
Affiliation(s)
- Xin He
- College of Computer, Xiangtan University, Xiangtan, China
| | - Linai Kuang
- College of Computer, Xiangtan University, Xiangtan, China
| | - Zhiping Chen
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
| | - Yihong Tan
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
| | - Lei Wang
- College of Computer, Xiangtan University, Xiangtan, China
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
| |
|
75
|
Gurfinkel AJ, Rikvold PA. Adjustable reach in a network centrality based on current flows. Phys Rev E 2021; 103:052308. [PMID: 34134335 DOI: 10.1103/physreve.103.052308] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/29/2020] [Accepted: 04/08/2021] [Indexed: 11/07/2022]
Abstract
Centrality, which quantifies the "importance" of individual nodes, is among the most essential concepts in modern network theory. Most prominent centrality measures can be expressed as an aggregation of influence flows between pairs of nodes. As there are many ways in which influence can be defined, many different centrality measures are in use. Parametrized centralities allow further flexibility and utility by tuning the centrality calculation to the regime most appropriate for a given purpose and network. Here we identify two categories of centrality parameters. Reach parameters control the attenuation of influence flows between distant nodes. Grasp parameters control the centrality's tendency to send influence flows along multiple, often nongeodesic paths. Combining these categories with Borgatti's centrality types [Borgatti, Soc. Networks 27, 55 (2005), doi:10.1016/j.socnet.2004.11.008], we arrive at a classification system for parametrized centralities. Using this classification, we identify the notable absence of any centrality measures that are radial, reach parametrized, and based on acyclic, conserved flows of influence. We therefore introduce the ground-current centrality, which is a measure of precisely this type. Because of its unique position in the taxonomy, the ground-current centrality differs significantly from similar centralities. We demonstrate that, compared to other conserved-flow centralities, it has a simpler mathematical description. Compared to other reach-parametrized centralities, it robustly preserves an intuitive rank ordering across a wide range of network architectures, capturing aspects of both the closeness and betweenness centralities. We also show that it produces a consistent distribution of centrality values among the nodes, neither trivially equally spread (delocalization) nor overly focused on a few nodes (localization). Other reach-parametrized centralities exhibit both of these behaviors on regular networks and hub networks, respectively. We compare the properties of the ground-current centrality with several other reach-parametrized centralities on four artificial networks and seven real-world networks.
Affiliation(s)
- Aleks J Gurfinkel
- Department of Physics, Florida State University, Tallahassee, Florida 32306-4350, USA
| | - Per Arne Rikvold
- Department of Physics, Florida State University, Tallahassee, Florida 32306-4350, USA.,PoreLab, NJORD Centre, Department of Physics, University of Oslo, P.O. Box 1048 Blindern, 0316 Oslo, Norway
| |
|
76
|
Zhong J, Tang C, Peng W, Xie M, Sun Y, Tang Q, Xiao Q, Yang J. A novel essential protein identification method based on PPI networks and gene expression data. BMC Bioinformatics 2021; 22:248. [PMID: 33985429 PMCID: PMC8120700 DOI: 10.1186/s12859-021-04175-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 08/07/2020] [Accepted: 05/06/2021] [Indexed: 02/08/2023] Open
Abstract
Background:
Some proposed methods for identifying essential proteins achieve better results by using biological information. Gene expression data is commonly used to identify essential proteins. However, gene expression data is prone to fluctuations, which may affect the accuracy of essential protein identification. Therefore, we propose an essential protein identification method based on gene expression and PPI network data that calculates the similarity of the "active" and "inactive" states of gene expression in a cluster of the PPI network. Our experiments show that the method can improve the accuracy of predicting essential proteins.
Results:
In this paper, we propose a new measure named JDC, which is based on PPI network data and gene expression data. The JDC method offers a dynamic threshold method to binarize gene expression data. After that, it combines the degree centrality and the Jaccard similarity index to calculate the JDC score for each protein in the PPI network. We benchmark the JDC method on four organisms and evaluate it using ROC analysis, modular analysis, jackknife analysis, overlapping analysis, top analysis, and accuracy analysis. The results show that the performance of JDC is better than that of DC, IC, EC, SC, BC, CC, NC, PeC, and WDC. We also compare JDC with the NF-PIN and TS-PIN methods, which predict essential proteins through active PPI networks constructed from dynamic gene expression.
Conclusions:
We demonstrate that the new centrality measure, JDC, is more efficient than state-of-the-art prediction methods with the same input. The main ideas behind JDC are as follows: (1) Essential proteins are generally densely connected clusters in the PPI network. (2) Binarizing gene expression data can screen out fluctuations in gene expression profiles. (3) The essentiality of a protein depends on the similarity of the "active" and "inactive" states of gene expression in a cluster of the PPI network.
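The general idea of combining binarized expression, Jaccard similarity and node degree can be sketched as follows. This is a loose illustration and not the published JDC formula: the per-protein mean threshold stands in for the paper's dynamic threshold, and scoring a protein by the sum of Jaccard similarities with its neighbors is an assumption made here. The function names are hypothetical.

```python
from statistics import mean


def active_timepoints(profile):
    """Indices where expression exceeds the profile mean (assumed threshold, not the paper's)."""
    threshold = mean(profile)
    return {i for i, x in enumerate(profile) if x > threshold}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jdc_like_scores(edges, expression):
    """Score each protein by the summed Jaccard similarity of its active time points
    with those of its PPI neighbors, so a higher degree and more co-activity both
    raise the score."""
    active = {p: active_timepoints(profile) for p, profile in expression.items()}
    scores = {p: 0.0 for p in expression}
    for u, v in edges:
        sim = jaccard(active[u], active[v])
        scores[u] += sim
        scores[v] += sim
    return scores


# Toy data: A and B are active at the same time points, C at the opposite ones.
toy_expression = {"A": [1, 5, 5, 1], "B": [0, 4, 6, 0], "C": [9, 1, 1, 9]}
toy_edges = [("A", "B"), ("A", "C")]
toy_scores = jdc_like_scores(toy_edges, toy_expression)
```

In the toy data, A has two neighbors but is co-active with only one of them. It therefore scores the same as B and above C, which shows how binarized co-activity tempers plain degree.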
Affiliation(s)
- Jiancheng Zhong
- School of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China.,Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Changsha, 410083, China
| | - Chao Tang
- School of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
| | - Wei Peng
- College of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500, Yunnan, China
| | - Minzhu Xie
- School of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
| | - Yusui Sun
- School of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
| | - Qiang Tang
- College of Engineering and Design, Hunan Normal University, Changsha, 410081, China
| | - Qiu Xiao
- School of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China.
| | - Jiahong Yang
- School of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China.
| |
|
77
|
Abstract
In the global health emergency caused by coronavirus disease 2019 (COVID-19), efficient and specific therapies are urgently needed. Compared with traditional small-molecular drugs, antibody therapies are relatively easy to develop; they are as specific as vaccines in targeting severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2); and they have thus attracted much attention in the past few months. This article reviews seven existing antibodies for neutralizing SARS-CoV-2 with 3D structures deposited in the Protein Data Bank (PDB). Five 3D antibody structures associated with the SARS-CoV spike (S) protein are also evaluated for their potential in neutralizing SARS-CoV-2. The interactions of these antibodies with the S protein receptor-binding domain (RBD) are compared with those between angiotensin-converting enzyme 2 and RBD complexes. Due to the orders of magnitude in the discrepancies of experimental binding affinities, we introduce topological data analysis, a variety of network models, and deep learning to analyze the binding strength and therapeutic potential of the 14 antibody-antigen complexes. The current COVID-19 antibody clinical trials, which are not limited to the S protein target, are also reviewed.
Affiliation(s)
- Jiahui Chen
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA;
| | - Kaifu Gao
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA;
| | - Rui Wang
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA;
| | - Duc Duy Nguyen
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA;
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, USA
| |
|
78
|
A network analysis of crab metamorphosis and the hypothesis of development as a process of unfolding of an intensive complexity. Sci Rep 2021; 11:9551. [PMID: 33953251 PMCID: PMC8100167 DOI: 10.1038/s41598-021-88662-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/01/2020] [Accepted: 04/15/2021] [Indexed: 02/02/2023] Open
Abstract
Development has intrigued humanity since ancient times. Today, the main paradigm in developmental biology and evolutionary developmental biology (evo-devo) is the genetic program, in which development is explained by the interplay and interaction of genes, that is, by the action of gene regulatory networks (GRNs). However, it is not even clear that a GRN, no matter how complex, can be translated into a form. Therefore, the fundamental enigma of development still remains: how is a complex organism formed from a single cell? This question unfolded the historical drama and the dialectical tension between preformation and epigenesis. In order to shed light on these issues, I studied the development of crabs (infraorder Brachyura), as representative of the subphylum Crustacea, using network theory. The external morphology of the different phases of brachyuran metamorphosis was modeled as networks and their main characteristics were analyzed. As one could expect, the parameters usually regarded as indicative of network complexity, such as modularity and hierarchy, increased during development. However, when more sophisticated complexity measures were tested, it became evident that whereas one group of complexity measures increased during development, another group decreased. This led me to consider that two kinds of complexity were being measured. I called them intensive and extensive complexity. In view of these results, I propose that crab development involves a passage from an intensive to an extensive complexity. In other words, crab development can be interpreted as a process of unfolding of an intensive, preexistent complexity.
|
79
|
Gong Y, Liu S, Bai Y. Efficient parallel computing on the game theory-aware robust influence maximization problem. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.106942] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/31/2022]
|
80
|
CEGSO: Boosting Essential Proteins Prediction by Integrating Protein Complex, Gene Expression, Gene Ontology, Subcellular Localization and Orthology Information. Interdiscip Sci 2021; 13:349-361. [PMID: 33772722 DOI: 10.1007/s12539-021-00426-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/02/2020] [Revised: 02/04/2021] [Accepted: 03/05/2021] [Indexed: 01/13/2023]
Abstract
Essential proteins are assumed to be indispensable for sustaining normal physiological function and crucial to drug design and disease diagnosis. The discovery of essential proteins is of great importance in revealing molecular mechanisms and biological processes. Because biological experiments are tedious, many numerical methods have been developed to discover key proteins by mining the features of high-throughput data. Appropriate integration of different kinds of biological information with the protein-protein interaction (PPI) network has proven useful in predicting essential proteins. The main intention of this research is to provide a comprehensive study and review of identifying essential proteins by integrating multi-source data, and to provide guidance for researchers. A detailed analysis and comparison of current essential protein prediction algorithms has been carried out and tested on benchmark PPI networks. In addition, building on the previous method TEGS (short for the network Topology, gene Expression, Gene ontology, and Subcellular localization), we improve the performance of essential protein prediction by incorporating known protein complex information, gene expression profiles, Gene Ontology (GO) term information, subcellular localization information, and protein orthology data into the PPI network. The resulting method is named CEGSO. The simulation results show that CEGSO achieves more accurate and robust results than the other compared methods on different test datasets and under various evaluation measures.
|
81
|
Meng Z, Kuang L, Chen Z, Zhang Z, Tan Y, Li X, Wang L. Method for Essential Protein Prediction Based on a Novel Weighted Protein-Domain Interaction Network. Front Genet 2021; 12:645932. [PMID: 33815480 PMCID: PMC8010314 DOI: 10.3389/fgene.2021.645932] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/24/2020] [Accepted: 02/15/2021] [Indexed: 01/04/2023] Open
Abstract
In recent years, a number of computational models based on protein-protein interaction (PPI) networks have been proposed. However, due to false positives, false negatives, and the incompleteness of PPI networks, there are still many challenges in designing computational models with satisfactory predictive accuracy when inferring key proteins. This study proposes a prediction model called WPDINM for detecting key proteins based on a novel weighted protein-domain interaction (PDI) network. In WPDINM, a weighted PPI network is constructed first by combining the gene expression data of proteins with topological information extracted from the original PPI network. Simultaneously, a weighted domain-domain interaction (DDI) network is constructed based on the original PDI network. Next, by integrating the newly obtained weighted PPI network and weighted DDI network with the original PDI network, a weighted PDI network is constructed. Then, based on topological features and biological information, including the subcellular localization and orthologous information of proteins, a novel PageRank-based iterative algorithm is designed and applied to the newly constructed weighted PDI network to estimate the criticality of proteins. Finally, to assess the prediction performance of WPDINM, we compared it with 12 competitive measures. Experimental results show that WPDINM achieves predictive accuracy rates of 90.19, 81.96, 70.72, 62.04, 55.83, and 51.13% in the top 1%, top 5%, top 10%, top 15%, top 20%, and top 25% of candidates respectively, which exceeds the prediction accuracy achieved by traditional state-of-the-art competing measures. Owing to this satisfactory identification performance, the WPDINM measure may contribute to the further development of key protein identification.
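The "accuracy in the top p%" figures quoted above are commonly computed by ranking all proteins by predicted score and counting how many of the top-ranked candidates are known essential proteins. The sketch below assumes that reading. The function name, the rounding-up of the candidate count, and the toy data are illustrative choices and are not taken from the paper.

```python
import math


def top_percent_accuracy(scores, essential, percent):
    """Fraction of known essential proteins among the top `percent`% ranked candidates.

    scores:    dict mapping protein -> predicted score (higher = more likely essential)
    essential: set of proteins known to be essential (gold standard)
    percent:   candidate cut-off in percent, e.g. 1, 5, 10
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Number of candidates at this cut-off; rounding up is an assumption.
    k = max(1, math.ceil(len(ranked) * percent / 100))
    hits = sum(1 for protein in ranked[:k] if protein in essential)
    return hits / k


# Toy ranking of ten proteins, p0 scored highest and p9 lowest.
toy_scores = {f"p{i}": 10 - i for i in range(10)}
toy_essential = {"p0", "p1", "p3"}
acc_20 = top_percent_accuracy(toy_scores, toy_essential, 20)
acc_50 = top_percent_accuracy(toy_scores, toy_essential, 50)
```

With these toy inputs the top 20% contains two candidates, both essential, and the top 50% contains five candidates, three of them essential. Accuracy therefore falls as the cut-off widens, the same pattern as in the reported figures.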
Affiliation(s)
- Zixuan Meng
- College of Computer, Xiangtan University, Xiangtan, China
| | - Linai Kuang
- College of Computer, Xiangtan University, Xiangtan, China
| | - Zhiping Chen
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
| | - Zhen Zhang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
| | - Yihong Tan
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
| | - Xueyong Li
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
| | - Lei Wang
- College of Computer, Xiangtan University, Xiangtan, China
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
|
82
|
Liu X, Wu S, Liu C, Zhang Y. Social network node influence maximization method combined with degree discount and local node optimization. Social Network Analysis and Mining 2021. [DOI: 10.1007/s13278-021-00733-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/01/2022]
|
83
|
Quality testing of spectrum-based valency descriptors for polycyclic aromatic hydrocarbons with applications. J Mol Struct 2021. [DOI: 10.1016/j.molstruc.2020.129789] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 11/18/2022]
|
84
|
Munoz G, Dequidt A, Martzel N, Blaak R, Goujon F, Devémy J, Garruchet S, Latour B, Munch E, Malfreyt P. Heterogeneity Effects in Highly Cross-Linked Polymer Networks. Polymers (Basel) 2021; 13:polym13050757. [PMID: 33671017 PMCID: PMC7957597 DOI: 10.3390/polym13050757] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/05/2021] [Revised: 02/23/2021] [Accepted: 02/24/2021] [Indexed: 11/30/2022] Open
Abstract
Despite their level of refinement, micro-mechanical, stretch-based, and invariant-based models still fail to capture and describe all aspects of the mechanical properties of the polymer networks for which they were developed. This is largely caused by the way microscopic inhomogeneities are treated. The Elastic Network Model (ENM) approach, which reintroduces spatial resolution by considering the network at the level of its topological constraints, is able to predict the macroscopic properties of polymer networks up to the point of failure. Here we demonstrate the ability of ENM to highlight the effects of topology and structure on the mechanical properties of polymer networks whose heterogeneity is characterised by spatial and topological order parameters. We quantify the macro- and microscopic effects on forces and stress caused by introducing and increasing the heterogeneity of the network. We find that significant differences in the mechanical responses arise between networks with a similar topology but different spatial structure at the time of the reticulation, whereas the dispersion of the cross-link valency has a negligible impact.
Affiliation(s)
- Gérald Munoz
- Manufacture Française des Pneumatiques Michelin, Site de Ladoux, 23 Place des Carmes Déchaux, France CEDEX 9, 63040 Clermont-Ferrand, France; (G.M.); (S.G.); (B.L.); (E.M.)
| | - Alain Dequidt
- Institut de Chimie de Clermont-Ferrand, CNRS, SIGMA Clermont, Université Clermont Auvergne, 63000 Clermont-Ferrand, France; (R.B.); (F.G.); (J.D.); (P.M.)
- Correspondence: (A.D.); (N.M.)
| | - Nicolas Martzel
- Manufacture Française des Pneumatiques Michelin, Site de Ladoux, 23 Place des Carmes Déchaux, France CEDEX 9, 63040 Clermont-Ferrand, France; (G.M.); (S.G.); (B.L.); (E.M.)
- Correspondence: (A.D.); (N.M.)
| | - Ronald Blaak
- Institut de Chimie de Clermont-Ferrand, CNRS, SIGMA Clermont, Université Clermont Auvergne, 63000 Clermont-Ferrand, France; (R.B.); (F.G.); (J.D.); (P.M.)
| | - Florent Goujon
- Institut de Chimie de Clermont-Ferrand, CNRS, SIGMA Clermont, Université Clermont Auvergne, 63000 Clermont-Ferrand, France; (R.B.); (F.G.); (J.D.); (P.M.)
| | - Julien Devémy
- Institut de Chimie de Clermont-Ferrand, CNRS, SIGMA Clermont, Université Clermont Auvergne, 63000 Clermont-Ferrand, France; (R.B.); (F.G.); (J.D.); (P.M.)
| | - Sébastien Garruchet
- Manufacture Française des Pneumatiques Michelin, Site de Ladoux, 23 Place des Carmes Déchaux, France CEDEX 9, 63040 Clermont-Ferrand, France; (G.M.); (S.G.); (B.L.); (E.M.)
| | - Benoit Latour
- Manufacture Française des Pneumatiques Michelin, Site de Ladoux, 23 Place des Carmes Déchaux, France CEDEX 9, 63040 Clermont-Ferrand, France; (G.M.); (S.G.); (B.L.); (E.M.)
| | - Etienne Munch
- Manufacture Française des Pneumatiques Michelin, Site de Ladoux, 23 Place des Carmes Déchaux, France CEDEX 9, 63040 Clermont-Ferrand, France; (G.M.); (S.G.); (B.L.); (E.M.)
| | - Patrice Malfreyt
- Institut de Chimie de Clermont-Ferrand, CNRS, SIGMA Clermont, Université Clermont Auvergne, 63000 Clermont-Ferrand, France; (R.B.); (F.G.); (J.D.); (P.M.)
|
85
|
Wang R, Chen J, Gao K, Hozumi Y, Yin C, Wei GW. Analysis of SARS-CoV-2 mutations in the United States suggests presence of four substrains and novel variants. Commun Biol 2021; 4:228. [PMID: 33589648 PMCID: PMC7884689 DOI: 10.1038/s42003-021-01754-6] [Citation(s) in RCA: 104] [Impact Index Per Article: 26.0] [Received: 07/27/2020] [Accepted: 11/13/2020] [Indexed: 02/07/2023] Open
Abstract
SARS-CoV-2 has been mutating since it was first sequenced in early January 2020. Here, we analyze 45,494 complete SARS-CoV-2 genome sequences from around the world to understand their mutations. Among them, 12,754 sequences are from the United States. Our analysis suggests the presence of four substrains and eleven top mutations in the United States. These eleven top mutations belong to 3 disconnected groups. The first and second groups consisting of 5 and 8 concurrent mutations are prevailing, while the other group with three concurrent mutations gradually fades out. Moreover, we reveal that female immune systems are more active than those of males in responding to SARS-CoV-2 infections. One of the top mutations, 27964C > T-(S24L) on ORF8, has an unusually strong gender dependence. Based on the analysis of all mutations on the spike protein, we uncover that two of the four SARS-CoV-2 substrains in the United States become potentially more infectious.
Affiliation(s)
- Rui Wang
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
| | - Jiahui Chen
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
| | - Kaifu Gao
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
| | - Yuta Hozumi
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
| | - Changchuan Yin
- Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, Chicago, IL, 60607, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA.
|
86
|
Tsuji Y, Yoshizawa K. From Infection Clusters to Metal Clusters: Significance of the Lowest Occupied Molecular Orbital (LOMO). ACS Omega 2021; 6:1339-1351. [PMID: 33490793 PMCID: PMC7818624 DOI: 10.1021/acsomega.0c04913] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/08/2020] [Accepted: 12/22/2020] [Indexed: 05/10/2023]
Abstract
In this paper, the nature of the lowest-energy electrons is detailed. The orbital occupied by such electrons can be termed the lowest occupied molecular orbital (LOMO). There is a good correspondence between the Hückel method in chemistry and graph theory in mathematics; the molecular orbital, which chemists view as the distribution of an electron with a specific energy, is to mathematicians an algebraic entity, an eigenvector. The mathematical counterpart of the LOMO is known as eigenvector centrality, a centrality measure characterizing nodes in networks. It may be instrumental in solving some problems in chemistry, and it also has implications for the challenge facing humanity today. This paper starts with a demonstration of the transmission of infectious disease in social networks, which, although unusual for a chemistry paper, may be a suitable example for understanding what the centrality (LOMO) is all about. The converged distribution of infected patients on the network coincides with the distribution of the LOMO of a molecule that shares the same network structure or topology. This is because graph theory and quantum mechanics share a common mathematical structure. Furthermore, the LOMO coefficient can be regarded as a manifestation of the centrality of atoms in an atomic assembly, indicating which atom plays the most important role in the assembly or which one has the greatest influence on the network of these atoms. Therefore, it is proposed that one can predict the binding energy of a metal atom to its cluster based on its LOMO coefficient. A possible improvement of the descriptor using a more sophisticated centrality measure is also discussed.
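The correspondence described above can be illustrated with a small sketch. In the Hückel picture the Hamiltonian has the form αI + βA with β negative, so the lowest-energy orbital is the eigenvector of the adjacency matrix A with the largest eigenvalue, which is exactly what eigenvector centrality computes. The three-atom chain and the function name below are illustrative assumptions, not material from the paper; the power iteration uses A + I, a shift that leaves the eigenvectors unchanged and avoids oscillation on bipartite graphs.

```python
import math


def eigenvector_centrality(adj, iters=500):
    """Power iteration on (A + I) for a symmetric adjacency matrix given as nested lists."""
    n = len(adj)
    x = [1.0] * n
    for _ in range(iters):
        # One "spreading" step: each node keeps its value and receives its neighbours' values.
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x


# Toy chain of three atoms, 0 - 1 - 2 (an allyl-like skeleton).
chain = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
coeffs = eigenvector_centrality(chain)
print([round(c, 4) for c in coeffs])  # approximately [0.5, 0.7071, 0.5]
```

The central atom carries the largest coefficient, which matches the intuition that repeated spreading steps on the network concentrate weight on the best-connected node.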
Affiliation(s)
- Yuta Tsuji
- Institute for Materials Chemistry and Engineering and IRCCS, Kyushu University, Nishi-ku, Fukuoka 819-0395, Japan
| | - Kazunari Yoshizawa
- Institute for Materials Chemistry and Engineering and IRCCS, Kyushu University, Nishi-ku, Fukuoka 819-0395, Japan
|
87
|
Zeng M, Li M, Fei Z, Wu FX, Li Y, Pan Y, Wang J. A Deep Learning Framework for Identifying Essential Proteins by Integrating Multiple Types of Biological Information. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:296-305. [PMID: 30736002 DOI: 10.1109/tcbb.2019.2897679] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 06/09/2023]
Abstract
Computational methods, including centrality and machine learning-based methods, have been proposed to identify essential proteins for understanding the minimum requirements of the survival and evolution of a cell. In centrality methods, researchers are required to design a score function based on prior knowledge, which is usually not sufficient to capture the complexity of biological information. In machine learning-based methods, some selected biological features cannot represent the complete properties of biological information because these methods lack a computational framework to select features automatically. To tackle these problems, we propose a deep learning framework to automatically learn biological features without prior knowledge. We use the node2vec technique to automatically learn a richer representation of protein-protein interaction (PPI) network topologies than a score function. Bidirectional long short-term memory cells are applied to capture non-local relationships in gene expression data. For subcellular localization information, we use a high-dimensional indicator vector to characterize its features. To evaluate the performance of our method, we tested it on the PPI network of S. cerevisiae. Our experimental results demonstrate that our method performs better than traditional centrality methods and is superior to existing machine learning-based methods. To explore which of the three types of biological information is the most vital element, we conduct an ablation study by removing each component in turn. Our results show that the PPI network embedding contributes most to the improvement. In addition, gene expression profiles and subcellular localization information also help to improve performance in the identification of essential proteins.
|
88
|
Sheng J, Liu C, Chen L, Wang B, Zhang J. Research on Community Detection in Complex Networks Based on Internode Attraction. Entropy (Basel, Switzerland) 2020; 22:E1383. [PMID: 33297386 PMCID: PMC7762263 DOI: 10.3390/e22121383] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/24/2020] [Accepted: 12/03/2020] [Indexed: 11/28/2022]
Abstract
With the rapid development of computer technology, research on complex networks has attracted more and more attention. At present, widely studied research directions such as cloud computing, big data, the internet of vehicles, and distributed systems are all based on complex networks. Community structure detection is an important research hotspot in complex networks. Dividing the community structure quickly and accurately on large-scale networks is a difficult task. In this paper, we put forward a new community detection approach based on internode attraction, named IACD. This algorithm starts from the important nodes of the complex network, draws on the gravitational relationship between two objects in physics to represent the forces between nodes in the network dataset, and then performs community detection. Experiments on a large number of real-world datasets and synthetic networks show that the IACD algorithm can divide the community structure quickly and accurately, and that it is superior to some classic algorithms and recently proposed algorithms.
Affiliation(s)
- Bin Wang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China; (J.S.); (C.L.); (L.C.); (J.Z.)
|
89
|
Erkol Ş, Mazzilli D, Radicchi F. Influence maximization on temporal networks. Phys Rev E 2020; 102:042307. [PMID: 33212670 DOI: 10.1103/physreve.102.042307] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/02/2020] [Accepted: 09/14/2020] [Indexed: 11/07/2022]
Abstract
We consider the optimization problem of seeding a spreading process on a temporal network so that the expected size of the resulting outbreak is maximized. We frame the problem for a spreading process following the rules of the susceptible-infected-recovered model with temporal scale equal to the one characterizing the evolution of the network topology. We perform a systematic analysis based on a corpus of 12 real-world temporal networks and quantify the performance of solutions to the influence maximization problem obtained using different levels of information about network topology and dynamics. We find that having perfect knowledge of the network topology but in a static and/or aggregated form is not helpful in solving the influence maximization problem effectively. Knowledge, even if partial, of the early stages of the network dynamics appears instead essential for the identification of quasioptimal sets of influential spreaders.
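To make the setting concrete, here is a minimal sketch of a susceptible-infected-recovered process on a temporal network stored as a list of per-step edge lists, with a brute-force choice of a single seed. The data format, the function names, and the toy contact sequence are assumptions for illustration; the paper's own seeding strategies are not reproduced here.

```python
import random


def sir_outbreak_size(snapshots, seeds, beta, mu, rng):
    """Run one SIR realization on a temporal network and return the outbreak size.

    snapshots: list of edge lists, one per time step (hypothetical toy format)
    seeds:     iterable of initially infected nodes
    beta, mu:  per-contact infection and per-step recovery probabilities
    """
    infected = set(seeds)
    recovered = set()
    for edges in snapshots:
        newly_infected = set()
        for u, v in edges:
            for src, dst in ((u, v), (v, u)):
                if src in infected and dst not in infected and dst not in recovered:
                    if rng.random() < beta:
                        newly_infected.add(dst)
        # Recovery is drawn for nodes infected at the start of the step;
        # sorting keeps the random draws reproducible.
        newly_recovered = {n for n in sorted(infected) if rng.random() < mu}
        infected = (infected | newly_infected) - newly_recovered
        recovered |= newly_recovered
    return len(infected | recovered)


def best_single_seed(snapshots, nodes, beta, mu, runs=200, seed=0):
    """Pick the single seed with the largest average outbreak size (brute force)."""
    rng = random.Random(seed)

    def mean_size(node):
        return sum(sir_outbreak_size(snapshots, [node], beta, mu, rng)
                   for _ in range(runs)) / runs

    return max(nodes, key=mean_size)


# Toy contact sequence: a-b at step 0, b-c at step 1, c-d at step 2.
contacts = [[("a", "b")], [("b", "c")], [("c", "d")]]
for node in "abcd":
    print(node, sir_outbreak_size(contacts, [node], 1.0, 0.0, random.Random(0)))
```

In the aggregated static graph of this toy example, nodes b and c are structurally equivalent, yet with certain transmission and no recovery a seed at b reaches all four nodes while a seed at c reaches only three, because the ordering of contacts matters.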
Affiliation(s)
- Şirag Erkol
- Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana 47408, USA
| | - Dario Mazzilli
- Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana 47408, USA
| | - Filippo Radicchi
- Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana 47408, USA
|
90
|
Iliopoulos A, Beis G, Apostolou P, Papasotiriou I. Complex Networks, Gene Expression and Cancer Complexity: A Brief Review of Methodology and Applications. Curr Bioinform 2020. [DOI: 10.2174/1574893614666191017093504] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/28/2022]
Abstract
In this brief survey, various aspects of cancer complexity, and how this complexity can be confronted using modern complex network theory and gene expression datasets, are described. In particular, the causes and basic features of cancer complexity, as well as the challenges it brings, are underlined, while the importance of gene expression data in cancer research and in the reverse engineering of gene co-expression networks is highlighted. In addition, an introduction to the corresponding theoretical and mathematical framework of graph theory and complex networks is provided. The basics of network reconstruction, along with the limitations of gene network inference, enrichment and survival analysis, evolution, robustness-resilience, and cascades in complex networks, are described. Finally, an indicative example of cancer gene co-expression network inference and analysis is given.
Affiliation(s)
- A.C. Iliopoulos
- Research and Development Department, Research Genetic Cancer Centre S.A., Florina, Greece
| | - G. Beis
- Research and Development Department, Research Genetic Cancer Centre S.A., Florina, Greece
| | - P. Apostolou
- Research and Development Department, Research Genetic Cancer Centre S.A., Florina, Greece
| | - I. Papasotiriou
- Research Genetic Cancer Centre International GmbH, Zug, Switzerland
|
91
|
Patankar SP, Kim JZ, Pasqualetti F, Bassett DS. Path-dependent connectivity, not modularity, consistently predicts controllability of structural brain networks. Netw Neurosci 2020; 4:1091-1121. [PMID: 33195950 PMCID: PMC7655114 DOI: 10.1162/netn_a_00157] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/15/2020] [Accepted: 07/15/2020] [Indexed: 01/03/2023] Open
Abstract
The human brain displays rich communication dynamics that are thought to be particularly well-reflected in its marked community structure. Yet, the precise relationship between community structure in structural brain networks and the communication dynamics that can emerge therefrom is not well understood. In addition to offering insight into the structure-function relationship of networked systems, such an understanding is a critical step toward the ability to manipulate the brain's large-scale dynamical activity in a targeted manner. We investigate the role of community structure in the controllability of structural brain networks. At the region level, we find that certain network measures of community structure are sometimes statistically correlated with measures of linear controllability. However, we then demonstrate that this relationship depends on the distribution of network edge weights. We highlight the complexity of the relationship between community structure and controllability by performing numerical simulations using canonical graph models with varying mesoscale architectures and edge weight distributions. Finally, we demonstrate that weighted subgraph centrality, a measure rooted in the graph spectrum, and which captures higher order graph architecture, is a stronger and more consistent predictor of controllability. Our study contributes to an understanding of how the brain's diverse mesoscale structure supports transient communication dynamics.
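As a rough illustration of the spectral measure mentioned above, subgraph centrality is commonly defined as the diagonal of the matrix exponential of the adjacency matrix, so that closed walks of every length contribute with weight 1/k!. The sketch below applies that standard definition to a small matrix using a truncated Taylor series; whether the paper applies additional normalisation to its weighted variant is not stated in the abstract, and the helper names are assumptions.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matrix_exponential(a, terms=40):
    """Truncated Taylor series exp(A) = sum_k A^k / k!; adequate for small toy matrices."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds A^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def subgraph_centrality(a):
    """Diagonal of exp(A): weighted count of closed walks starting at each node."""
    expm = matrix_exponential(a)
    return [expm[i][i] for i in range(len(a))]


# Toy path network 0 - 1 - 2; replacing the 1s with edge weights gives a weighted variant.
path = [[0.0, 1.0, 0.0],
        [1.0, 0.0, 1.0],
        [0.0, 1.0, 0.0]]
sc = subgraph_centrality(path)
print(sc)
print(math.cosh(math.sqrt(2)))  # closed-form value for the middle node of this path
```

Because every walk length contributes, the measure reflects higher-order architecture that degree or modularity-based measures do not capture.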
Affiliation(s)
| | - Jason Z. Kim
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA USA
| | - Fabio Pasqualetti
- Department of Mechanical Engineering, University of California, Riverside, CA USA
| | - Danielle S. Bassett
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA USA
- Department of Neuroscience, University of Pennsylvania, Philadelphia, PA USA
- Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA USA
- Department of Neurology, University of Pennsylvania, Philadelphia, PA USA
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA USA
- Department of Psychiatry, University of Pennsylvania, Philadelphia, PA USA
- Santa Fe Institute, Santa Fe, NM USA
|
92
|
Zhang W, Xu J, Zou X. Predicting Essential Proteins by Integrating Network Topology, Subcellular Localization Information, Gene Expression Profile and GO Annotation Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:2053-2061. [PMID: 31095490 DOI: 10.1109/tcbb.2019.2916038] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 06/09/2023]
Abstract
Essential proteins are indispensable for maintaining normal cellular functions. Identification of essential proteins from protein-protein interaction (PPI) networks has become a hot topic in recent years. Traditional approaches based on biological experiments are time-consuming and expensive. Although many computational methods have been developed in recent years, their prediction accuracy is still unsatisfactory. In this research, by introducing protein subcellular localization information, we define a new measure for characterizing a protein's subcellular localization essentiality, and we develop a new data-fusion-based method for identifying essential proteins, named TEGS, which integrates network topology, gene expression profiles, GO annotation information, and protein subcellular localization information. To demonstrate the efficiency of TEGS, we evaluate its performance on two Saccharomyces cerevisiae datasets and compare it with seven other state-of-the-art methods (DC, BC, NC, PeC, WDC, SON, and TEO) in terms of the number of true predictions, jackknife curve, and precision-recall curve. Simulation results show that TEGS outperforms the other compared methods in identifying essential proteins. The source code of TEGS is freely available at https://github.com/wzhangwhu/TEGS.
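For readers unfamiliar with the jackknife curve used in this kind of evaluation, a minimal sketch follows, assuming the usual convention: candidates are sorted by predicted score, and the curve records how many known essential proteins appear among the top k for each k. The protein identifiers and scores are invented for illustration and the function is not taken from the TEGS code base.

```python
def jackknife_curve(scores, essential):
    """Cumulative count of known essential proteins among the top-k ranked candidates.

    scores:    dict protein -> predicted essentiality score (hypothetical values)
    essential: set of proteins known to be essential (gold standard)
    """
    # Sort by descending score; ties are broken by name for reproducibility.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    curve, hits = [], 0
    for protein in ranked:
        hits += protein in essential
        curve.append(hits)
    return curve


toy_scores = {"P1": 0.9, "P2": 0.8, "P3": 0.4, "P4": 0.1}
toy_essential = {"P1", "P3"}
print(jackknife_curve(toy_scores, toy_essential))  # [1, 1, 2, 2]
```

A method whose curve rises faster places more true essential proteins near the top of its ranking, and the value at a given k is the "true predicted number" at that cut-off.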
|
93
|
Lella E, Estrada E. Communicability distance reveals hidden patterns of Alzheimer's disease. Netw Neurosci 2020; 4:1007-1029. [PMID: 33195946 PMCID: PMC7655045 DOI: 10.1162/netn_a_00143] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/14/2020] [Accepted: 04/29/2020] [Indexed: 01/18/2023] Open
Abstract
The communicability distance between pairs of regions in the human brain is used as a quantitative proxy for studying Alzheimer's disease. Using this distance, we obtain the shortest communicability path lengths between different regions of brain networks from patients with Alzheimer's disease (AD) and healthy cohorts (HC). We show that the shortest communicability path length is significantly better than the shortest topological path length in distinguishing AD patients from HC. Based on this approach, we identify 399 pairs of brain regions for which there are very significant changes in the shortest communicability path length after AD appears. We find that 42% of these regions interconnect both brain hemispheres, 28% connect regions inside the left hemisphere only, and 20% affect the connection of the vermis with the brain hemispheres. These findings clearly agree with the disconnection syndrome hypothesis of AD. Finally, we show that in 76.9% of damaged brain regions the shortest communicability path length drops in AD relative to HC. This counterintuitive finding indicates that AD transforms the brain network into a more efficient system from the perspective of the transmission of the disease, because it reduces the circulability of the disease factor around the brain regions relative to its transmissibility to other regions.
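For orientation, the quantities named above can be written out as follows, using the form in which communicability distance is usually stated for an adjacency matrix $A$; any additional scaling or normalisation used in the paper itself is not given in the abstract, so this is a sketch of the standard definitions only.

```latex
% Communicability between nodes p and q: weighted sum over walks of all lengths
G = e^{A} = \sum_{k=0}^{\infty} \frac{A^{k}}{k!}

% Communicability distance between p and q
\xi_{pq} = \sqrt{\,G_{pp} + G_{qq} - 2\,G_{pq}\,}

% Length of a path P when each edge (i, j) is weighted by \xi_{ij};
% the shortest communicability path minimises this over all paths from p to q
L(P) = \sum_{(i,j) \in P} \xi_{ij}
```

The shortest topological path, by contrast, simply minimises the number of edges, which is why the two notions can rank region pairs differently.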
Affiliation(s)
- Eufemia Lella
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Innovation Lab, Exprivia S.p.A., Molfetta, Italy
| | - Ernesto Estrada
- Institute of Applied Mathematics (IUMA), Universidad de Zaragoza, Zaragoza, Spain
- ARAID Foundation, Government of Aragón, Zaragoza, Spain
|
94
|
Mangangcha IR, Malik MZ, Kucuk O, Ali S, Singh RKB. Kinless hubs are potential target genes in prostate cancer network. Genomics 2020; 112:5227-5239. [PMID: 32976977 DOI: 10.1016/j.ygeno.2020.09.033] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 10/19/2019] [Revised: 08/28/2020] [Accepted: 09/14/2020] [Indexed: 02/06/2023]
Abstract
Complex disease networks can be studied successfully using a network-theoretical approach, which helps in finding key disease genes and associated disease modules. We studied a prostate cancer (PCa) protein-protein interaction (PPI) network constructed from patients' gene expression datasets and found that the network exhibits a hierarchical scale-free topology that does not follow the centrality-lethality rule. Knockout experiments on sets of leading hubs from the network lead to a transition from hierarchical (HN) to scale-free (SF) topology, affecting network integration and organization. This transition, HN → SF, caused by the removal of a significant number of the highest-degree hubs, leads to a relative decrease in information processing efficiency, cost effectiveness of signal propagation, compactness, clustering of nodes, and energy distributions. A systematic transition from a disassortative PCa PPI network to assortative networks after the removal of the top 50 hubs, followed by a reversion to disassortativity on further removal of hubs, was also observed, indicating the dominance of the largest hubs in PCa network integration. Further, functional classification of the hubs using within-module degrees and participation coefficients for the PCa network, together with the leading-hub knockout experiments, indicated that kinless hubs serve as the basis for establishing links among constituent modules and heterogeneous nodes to maintain network stabilization. We then checked the essentiality of the hubs in the knockout experiment by performing Fisher's exact test on the hubs, and showed that removal of kinless hubs corresponded to maximum lethality in the network. However, excessive removal of these hubs may cause network breakdown.
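The two quantities used for the hub classification can be sketched as follows, assuming the standard definitions: the within-module degree z-score compares a node's intra-module degree with that of its module peers, and the participation coefficient is P = 1 - sum over modules s of (k_s / k)^2, where k_s is the number of links to module s and k is the total degree. In the role scheme usually attributed to Guimerà and Amaral, hubs have z of at least 2.5 and kinless hubs additionally have P above 0.75; the cut-offs used in the paper are not given in the abstract. The toy network and the function name are illustrative assumptions.

```python
import math


def module_roles(adj, module):
    """Return {node: (z, P)}: within-module degree z-score and participation coefficient.

    adj:    dict node -> set of neighbours (undirected, hypothetical toy input)
    module: dict node -> module label
    """
    # Number of links each node has inside its own module.
    k_in = {n: sum(1 for m in adj[n] if module[m] == module[n]) for n in adj}
    roles = {}
    for n in adj:
        peers = [k_in[m] for m in adj if module[m] == module[n]]
        mean = sum(peers) / len(peers)
        std = math.sqrt(sum((value - mean) ** 2 for value in peers) / len(peers))
        z = (k_in[n] - mean) / std if std > 0 else 0.0
        degree = len(adj[n])
        # Count links from n into each module, then apply P = 1 - sum (k_s / k)^2.
        links = {}
        for m in adj[n]:
            links[module[m]] = links.get(module[m], 0) + 1
        p = 1.0 - sum((c / degree) ** 2 for c in links.values()) if degree else 0.0
        roles[n] = (z, p)
    return roles


# Toy network: h bridges module A (a1, a2) and module B (b1, b2).
toy_adj = {
    "h": {"a1", "a2", "b1", "b2"},
    "a1": {"h", "a2"}, "a2": {"h", "a1"},
    "b1": {"h", "b2"}, "b2": {"h", "b1"},
}
toy_module = {"h": "A", "a1": "A", "a2": "A", "b1": "B", "b2": "B"}
for node, (z, p) in sorted(module_roles(toy_adj, toy_module).items()):
    print(node, round(z, 3), round(p, 3))
```

A node whose links are spread evenly over many modules has P close to 1, which is the sense in which a kinless hub belongs to no single module.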
Affiliation(s)
- Irengbam Rocky Mangangcha
- School of Interdisciplinary Sciences and Technology, Jamia Hamdard, New Delhi 110062, India; Bioinformatics Infrastructure Facility, BIF & Department of Biochemistry, School of Chemical and Life Sciences Jamia Hamdard, New Delhi 110062, India; Department of Zoology, Deshbandhu College, University of Delhi, New Delhi 110019, India; School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
| | - Md Zubbair Malik
- School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
| | - Omer Kucuk
- Winship Cancer Institute of Emory University, 1365 Clifton Road NE, Atlanta, GA 30322, USA
| | - Shakir Ali
- Bioinformatics Infrastructure Facility, BIF & Department of Biochemistry, School of Chemical and Life Sciences Jamia Hamdard, New Delhi 110062, India
| | - R K Brojen Singh
- School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India.
|
95
|
Khorsand B, Savadi A, Naghibzadeh M. Comprehensive host-pathogen protein-protein interaction network analysis. BMC Bioinformatics 2020; 21:400. [PMID: 32912135 PMCID: PMC7488060 DOI: 10.1186/s12859-020-03706-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 02/01/2020] [Accepted: 07/31/2020] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND Infectious diseases are a cruel assassin with millions of victims around the world each year. Understanding the infection mechanisms of viruses is indispensable for their inhibition. One of the best ways of unveiling these mechanisms is to investigate the host-pathogen protein-protein interaction network. In this paper we try to disclose many properties of this network. We focus on human as host and integrate 32,859 experimental interactions between human proteins and virus proteins from several databases. We investigate different properties of human proteins targeted by virus proteins and find that most of them have considerably high centrality scores in the human intra-species protein-protein interaction network. Investigating the network properties of human proteins targeted by different virus proteins can help us design multipurpose drugs.
RESULTS As the host-pathogen protein-protein interaction network is a bipartite network and centrality measures for this type of network are scarce, we propose seven new centrality measures for analyzing bipartite networks. Applying them to different virus strains reveals the non-randomness of the attack strategies of virus proteins, which could help in drug design and hence improve quality of life. They could also be used in detecting host essential proteins. Essential proteins are those whose functions are critical for the survival of their host. One of the proposed centralities, named diversity of predators, outperforms the other existing centralities in detecting essential proteins and could be used as an optimal marker of essential proteins.
CONCLUSIONS Different centralities were applied to analyze the human protein-protein interaction network and to detect characteristics of human proteins targeted by virus proteins. Moreover, seven new centralities were proposed to analyze the host-pathogen protein-protein interaction network and to detect pathogens' favorite host protein victims. Comparing different centralities in detecting essential proteins reveals that diversity of predators (one of the proposed centralities) is the best essential protein marker.
Affiliation(s)
- Babak Khorsand
- Computer Engineering Department, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
| | - Abdorreza Savadi
- Computer Engineering Department, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Ferdowsi University of Mashhad, Azadi Square, Mashhad, 9177948974 Iran
|
96
|
Athira K, Gopakumar G. An integrated method for identifying essential proteins from multiplex network model of protein-protein interactions. J Bioinform Comput Biol 2020; 18:2050020. [PMID: 32795133 DOI: 10.1142/s0219720020500201] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/12/2022]
Abstract
Cell survival requires the presence of essential proteins. Detection of essential proteins is relevant not only because of the critical biological functions they perform but also because of their role as drug targets against pathogens. Several computational techniques exist for identifying essential proteins based on protein-protein interaction (PPI) networks. Essential protein detection using only physical interaction data of proteins is challenging due to the inherent uncertainty of such data. Hence, in this work, we propose a multiplex network-based framework that incorporates multiple protein interaction data from their physical, coexpression, and phylogenetic profiles. An extended version of eigenvector centrality, termed multiplex eigenvector centrality (MEC), is used to identify essential proteins from this network. The methodology integrates the score obtained from the multiplex analysis with subcellular localization and Gene Ontology information and is implemented using Saccharomyces cerevisiae datasets. The proposed method outperformed many recent essential protein prediction techniques in the literature.
Affiliation(s)
- K Athira
  - Department of Computer Science and Engineering, National Institute of Technology Calicut, Kozhikkode, Kerala 673601, India
- G Gopakumar
  - Department of Computer Science and Engineering, National Institute of Technology Calicut, Kozhikkode, Kerala 673601, India
|
97
|
Wang R, Chen J, Gao K, Hozumi Y, Yin C, Wei GW. Characterizing SARS-CoV-2 mutations in the United States. RESEARCH SQUARE 2020:rs.3.rs-49671. [PMID: 32818213 PMCID: PMC7430589 DOI: 10.21203/rs.3.rs-49671/v1] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 12/28/2022]
Abstract
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been mutating since it was first sequenced in early January 2020. The genetic variants have developed into a few distinct clusters with different properties. Since the United States (US) has the highest number of virus-infected patients globally, it is essential to understand the US SARS-CoV-2. Using genotyping, sequence alignment, time evolution, k-means clustering, protein-folding stability, algebraic topology, and network theory, we reveal that the US SARS-CoV-2 has four substrains and that the five top US SARS-CoV-2 mutations were first detected in China (2 cases), Singapore (2 cases), and the United Kingdom (1 case). The next three top US SARS-CoV-2 mutations were first detected in the US. These eight top mutations belong to two disconnected groups. The first group, consisting of five concurrent mutations, is prevailing, while the other group, with three concurrent mutations, is gradually fading out. We identify that one of the top mutations, 27964C>T-(S24L) on ORF8, has an unusually strong gender dependence. Based on the analysis of all mutations on the spike protein, we further uncover that three of the four US SARS-CoV-2 substrains have become more infectious. Our study calls for effective viral control and containment strategies in the US.
Affiliation(s)
- Rui Wang
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Jiahui Chen
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Kaifu Gao
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Yuta Hozumi
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Changchuan Yin
  - Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, Chicago, IL 60607, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
|
98
|
Martino A, De Santis E, Giuliani A, Rizzi A. Modelling and Recognition of Protein Contact Networks by Multiple Kernel Learning and Dissimilarity Representations. ENTROPY (BASEL, SWITZERLAND) 2020; 22:E794. [PMID: 33286565 PMCID: PMC7517365 DOI: 10.3390/e22070794] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/27/2020] [Revised: 07/13/2020] [Accepted: 07/17/2020] [Indexed: 11/26/2022]
Abstract
Multiple kernel learning is a paradigm which employs a properly constructed chain of kernel functions able to simultaneously analyse different data or different representations of the same data. In this paper, we propose a hybrid classification system based on a linear combination of multiple kernels defined over multiple dissimilarity spaces. The core of the training procedure is the joint optimisation of kernel weights and representative selection in the dissimilarity spaces. This equips the system with a two-fold knowledge discovery phase: by analysing the weights, it is possible to check which representations are more suitable for solving the classification problem, whereas the pivotal patterns selected as representatives can give further insights into the modelled system, possibly with the help of field experts. The proposed classification system is tested on real proteomic data in order to predict proteins' functional role starting from their folded structure: specifically, a set of eight representations is drawn from the graph-based description of the folded protein. The proposed multiple kernel-based system has also been benchmarked against a clustering-based classification system that is likewise able to exploit multiple dissimilarities simultaneously. Computational results show remarkable classification capabilities, and the knowledge discovery analysis is in line with current biological knowledge, suggesting the reliability of the proposed system.
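To make the idea of "a linear combination of multiple kernels defined over multiple dissimilarity spaces" concrete, here is a minimal sketch. It assumes fixed, hand-picked kernel weights and representatives, whereas the paper optimises both jointly. The toy objects, the two dissimilarity functions, the RBF kernel choice, and all function names are illustrative assumptions rather than the authors' setup.

```python
import math


def dissimilarity_embedding(objects, representatives, dissimilarity):
    """Describe each object by its dissimilarities to the representatives."""
    return [[dissimilarity(obj, rep) for rep in representatives]
            for obj in objects]


def rbf_gram(vectors, gamma=0.1):
    """Gram matrix of an RBF kernel over embedded vectors (assumed kernel)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [[math.exp(-gamma * squared_distance(a, b)) for b in vectors]
            for a in vectors]


def combine_kernels(grams, weights):
    """Convex combination of Gram matrices: K = sum_i w_i * K_i."""
    if len(grams) != len(weights):
        raise ValueError("one weight per kernel is required")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(grams[0])
    return [[sum(w * gram[i][j] for w, gram in zip(weights, grams))
             for j in range(n)] for i in range(n)]


# Toy data: three scalar "objects" and two representatives (hypothetical).
objects = [0.0, 1.0, 3.0]
representatives = [0.0, 3.0]

# Two hypothetical dissimilarity measures, giving two dissimilarity spaces.
embedding_abs = dissimilarity_embedding(
    objects, representatives, lambda a, b: abs(a - b))
embedding_sq = dissimilarity_embedding(
    objects, representatives, lambda a, b: (a - b) ** 2)

grams = [rbf_gram(embedding_abs), rbf_gram(embedding_sq)]
combined = combine_kernels(grams, [0.75, 0.25])
```

A larger weight indicates that the corresponding representation contributes more to the combined similarity, which is the kind of weight analysis the abstract refers to. The combined matrix could then be passed to any kernel classifier that accepts a precomputed Gram matrix.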
Affiliation(s)
- Alessio Martino
  - Department of Information Engineering, Electronics and Telecommunications, University of Rome “La Sapienza”, Via Eudossiana 18, 00184 Rome, Italy
- Enrico De Santis
  - Department of Information Engineering, Electronics and Telecommunications, University of Rome “La Sapienza”, Via Eudossiana 18, 00184 Rome, Italy
- Alessandro Giuliani
  - Department of Environment and Health, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161 Rome, Italy
- Antonello Rizzi
  - Department of Information Engineering, Electronics and Telecommunications, University of Rome “La Sapienza”, Via Eudossiana 18, 00184 Rome, Italy
|
99
|
Abadias L, Estrada-Rodriguez G, Estrada E. Fractional-Order Susceptible-Infected Model: Definition and Applications to the Study of COVID-19 Main Protease. FRACTIONAL CALCULUS & APPLIED ANALYSIS 2020; 23:635-655. [PMID: 34849076 PMCID: PMC8617368 DOI: 10.1515/fca-2020-0033] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/01/2020] [Indexed: 05/02/2023]
Abstract
We propose a model for the transmission of perturbations across the amino acids of a protein represented as an interaction network. The dynamics consists of a Susceptible-Infected (SI) model based on the Caputo fractional-order derivative. We find an upper bound to the analytical solution of this model which represents the worst-case scenario for the propagation of perturbations across a protein residue network. This upper bound is expressed in terms of Mittag-Leffler functions of the adjacency matrix of the network of inter-amino-acid interactions. We then apply this model to the analysis of the propagation of perturbations produced by inhibitors of the main protease of SARS-CoV-2. We find that the perturbations produced by strong inhibitors of the protease are propagated far away from the binding site, confirming the long-range nature of intra-protein communication. In contrast, the weakest inhibitors only transmit their perturbations across a close environment around the binding site. These findings may help in the design of drug candidates against this new coronavirus.
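For readers unfamiliar with the Mittag-Leffler function mentioned above: the one-parameter form is E_α(z) = Σ_{k≥0} z^k / Γ(αk + 1), which reduces to the exponential for α = 1. The sketch below evaluates only the scalar function by truncating this series. The exact form of the bound is not given in the abstract, so nothing here reproduces it. The function name, truncation length, and tolerance are assumptions, and the plain series is only suitable for moderate |z|.

```python
import math


def mittag_leffler(alpha, z, max_terms=200, tol=1e-15):
    """Truncated series for E_alpha(z) = sum_k z**k / Gamma(alpha*k + 1).

    Real z only; intended for moderate |z|, where the terms decay quickly.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    total = 0.0
    for k in range(max_terms):
        if z == 0.0:
            term = 1.0 if k == 0 else 0.0
        else:
            # Work in log space so Gamma(alpha*k + 1) cannot overflow.
            log_magnitude = k * math.log(abs(z)) - math.lgamma(alpha * k + 1.0)
            sign = -1.0 if (z < 0 and k % 2 == 1) else 1.0
            term = sign * math.exp(log_magnitude)
        total += term
        if k > 0 and abs(term) < tol * max(1.0, abs(total)):
            break
    return total


# Sanity checks against known closed forms.
exp_check = mittag_leffler(1.0, 1.0)    # E_1(z) = exp(z)
cosh_check = mittag_leffler(2.0, 1.0)   # E_2(z) = cosh(sqrt(z)) for z >= 0
```

For a symmetric adjacency matrix, a matrix function of this kind can be obtained by applying the scalar function to the eigenvalues in a spectral decomposition, which is the usual way such network matrix functions are evaluated.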
Affiliation(s)
- Luciano Abadias
  - Departamento de Matemáticas, Facultad de Ciencias Universidad de Zaragoza, 50009 Zaragoza, Spain
  - Instituto Universitario de Matemáticas y Aplicaciones, Universidad de Zaragoza, 50009 Zaragoza, Spain
- Ernesto Estrada
  - Instituto Universitario de Matemáticas y Aplicaciones, Universidad de Zaragoza, 50009 Zaragoza, Spain
  - 50018, Zaragoza, Spain
|
100
|
Li G, Li M, Wang J, Li Y, Pan Y. United Neighborhood Closeness Centrality and Orthology for Predicting Essential Proteins. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:1451-1458. [PMID: 30596582 DOI: 10.1109/tcbb.2018.2889978] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 05/06/2023]
Abstract
Identifying essential proteins plays an important role in disease study, drug design, and understanding the minimal requirement for cellular life. Computational methods for essential protein discovery overcome the disadvantages of biological experimental methods, which are often time-consuming, expensive, and inefficient. The topological features of protein-protein interaction (PPI) networks are often used to design computational prediction methods, such as Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), and Neighborhood Centrality (NC). However, the prediction accuracies of these individual methods still leave room for improvement. Studies show that additional information, such as orthologous relations, helps discover essential proteins. Many researchers have proposed methods that combine multiple information sources to improve prediction accuracy. In this study, we find that essential proteins appear in triangular structures in PPI networks significantly more often than nonessential ones. Based on this phenomenon, we propose a novel pure centrality measure called Neighborhood Closeness Centrality (NCC). Accordingly, we develop a new combination model, the Extended Pareto Optimality Consensus model (EPOC), to fuse NCC and orthology information, and we propose a novel essential protein identification method, NCCO. Compared with seven existing classic centrality methods (DC, BC, IC, CC, SC, EC, and NC) and three consensus methods (PeC, ION, and CSC), our results on S. cerevisiae and E. coli datasets show that NCCO has clear advantages. As a consensus method, EPOC also yields better performance than the random walk model.
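The observation about triangular structures can be illustrated with a short sketch that counts, for each protein, the triangles it participates in, alongside plain Degree Centrality (DC). This is not the authors' NCC measure, whose formula is not given in the abstract. The toy edge list and function names are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets; self-loops are ignored."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_centrality(adjacency):
    """Degree divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(neighbours) / (n - 1)
            for node, neighbours in adjacency.items()}


def triangle_counts(adjacency):
    """Number of triangles each node belongs to."""
    counts = {}
    for node, neighbours in adjacency.items():
        counts[node] = sum(
            1 for a, b in combinations(sorted(neighbours), 2)
            if b in adjacency[a]
        )
    return counts


# Toy PPI network: A, B, C form a triangle; D and E hang off as a chain.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
adjacency = build_adjacency(edges)
dc = degree_centrality(adjacency)
triangles = triangle_counts(adjacency)
```

In this toy network, D has the same degree as A and B but belongs to no triangle, which is the kind of distinction a triangle-aware measure can capture and plain degree cannot.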
|