1
Li Q, Newaz K, Milenković T. Towards future directions in data-integrative supervised prediction of human aging-related genes. Bioinform Adv 2022; 2:vbac081. [PMID: 36699345 PMCID: PMC9710570 DOI: 10.1093/bioadv/vbac081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Revised: 09/23/2022] [Accepted: 10/31/2022] [Indexed: 11/13/2022]
Abstract
Motivation: Identification of human genes involved in the aging process is critical because the incidence of many diseases increases with age. A state-of-the-art approach for this purpose infers a weighted dynamic aging-specific subnetwork by mapping gene expression (GE) levels at different ages onto the protein-protein interaction network (PPIN). Then, it analyzes this subnetwork in a supervised manner by training a predictive model to learn how network topologies of known aging- versus non-aging-related genes change across ages. Finally, it uses the trained model to predict novel aging-related gene candidates. However, the best current subnetwork resulting from this approach still yields suboptimal prediction accuracy. This could be because it was inferred using outdated GE and PPIN data. Here, we evaluate whether analyzing a weighted dynamic aging-specific subnetwork inferred from newer GE and PPIN data improves prediction accuracy upon analyzing the best current subnetwork inferred from outdated data.
Results: Unexpectedly, we find that not to be the case. To understand this, we perform aging-related pathway and Gene Ontology term enrichment analyses. We find that the suboptimal prediction accuracy, regardless of which GE or PPIN data are used, may be caused by the current knowledge about which genes are aging-related being incomplete, or by the current methods for inferring or analyzing an aging-specific subnetwork being unable to capture all of the aging-related knowledge. These findings can potentially guide future directions towards improving supervised prediction of aging-related genes via -omics data integration.
Availability and implementation: All data and code are available at Zenodo, DOI: 10.5281/zenodo.6995045.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Qi Li
  - Department of Computer Science and Engineering, Lucy Family Institute for Data & Society, and Eck Institute for Global Health (EIGH), University of Notre Dame, Notre Dame, IN 46556, USA
- Khalique Newaz
  - Department of Computer Science and Engineering, Lucy Family Institute for Data & Society, and Eck Institute for Global Health (EIGH), University of Notre Dame, Notre Dame, IN 46556, USA
  - Center for Data and Computing in Natural Sciences (CDCS), Institute for Computational Systems Biology, Universität Hamburg, Hamburg 20146, Germany

2
Newaz K, Piland J, Clark PL, Emrich SJ, Li J, Milenković T. Multi-layer sequential network analysis improves protein 3D structural classification. Proteins 2022; 90:1721-1731. [PMID: 35441395 PMCID: PMC9356989 DOI: 10.1002/prot.26349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2021] [Revised: 03/04/2022] [Accepted: 03/30/2022] [Indexed: 11/08/2022]
Abstract
Protein structural classification (PSC) is a supervised problem of assigning proteins into pre-defined structural (e.g., CATH or SCOPe) classes based on the proteins' sequence or 3D structural features. We recently proposed PSC approaches that model protein 3D structures as protein structure networks (PSNs) and analyze PSN-based protein features, which performed better than or comparable to state-of-the-art sequence or other 3D structure-based PSC approaches. However, existing PSN-based PSC approaches model the whole 3D structure of a protein as a static (i.e., single-layer) PSN. Because folding of a protein is a dynamic process, where some parts (i.e., sub-structures) of a protein fold before others, modeling the 3D structure of a protein as a PSN that captures the sub-structures might further help improve the existing PSC performance. Here, we propose to model 3D structures of proteins as multi-layer sequential PSNs that approximate 3D sub-structures of proteins, with the hypothesis that this will improve upon the current state-of-the-art PSC approaches that are based on single-layer PSNs (and thus upon the existing state-of-the-art sequence and other 3D structural approaches). Indeed, we confirm this on 72 datasets spanning ~44 000 CATH and SCOPe protein domains.
Affiliation(s)
- Khalique Newaz
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
  - Center for Data and Computing in Natural Sciences (CDCS), Institute for Computational Systems Biology, Universität Hamburg, Hamburg 20146, Germany
- Jacob Piland
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Patricia L. Clark
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA
- Scott J. Emrich
  - Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN 37996, USA
- Jun Li
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN 46556, USA
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA

3
Gu S, Jiang M, Guzzi PH, Milenković T. Modeling multi-scale data via a network of networks. Bioinformatics 2022; 38:2544-2553. [PMID: 35238343 PMCID: PMC9048659 DOI: 10.1093/bioinformatics/btac133] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/25/2021] [Revised: 02/01/2022] [Accepted: 02/28/2022] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Predicting node labels and predicting graph labels are prominent network science tasks. Data analyzed in these tasks are sometimes related: entities represented by nodes in a higher-level (higher scale) network can themselves be modeled as networks at a lower level. We argue that systems involving such entities should be integrated with a 'network of networks' (NoN) representation. Then, we ask whether entity label prediction using multi-level NoN data via our proposed approaches is more accurate than using each of single-level node and graph data alone, i.e. than traditional node label prediction on the higher-level network and graph label prediction on the lower-level networks. To obtain data, we develop the first synthetic NoN generator and construct a real biological NoN. We evaluate the accuracy of the considered approaches when predicting artificial labels from the synthetic NoNs and proteins' functions from the biological NoN.
RESULTS: For the synthetic NoNs, our NoN approaches outperform or are as good as node- and network-level ones depending on the NoN properties. For the biological NoN, our NoN approaches outperform the single-level approaches for just under half of the protein functions, and for 30% of the functions, only our NoN approaches make meaningful predictions, while node- and network-level ones achieve random accuracy. So, NoN-based data integration is important.
AVAILABILITY AND IMPLEMENTATION: The software and data are available at https://nd.edu/~cone/NoNs.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
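The NoN representation described above can be illustrated with a minimal data-structure sketch. This is not the authors' software: the class names, the toy protein identifiers, and the choice of features (degree in the higher-level network, edge density of the lower-level network) are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Undirected graph stored as an adjacency map: node -> set of neighbours."""
    adj: dict = field(default_factory=dict)

    def add_edge(self, u, v):
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)

    def degree(self, u):
        return len(self.adj.get(u, ()))

    def density(self):
        # Fraction of possible node pairs that are connected.
        n = len(self.adj)
        m = sum(len(nb) for nb in self.adj.values()) // 2
        return 0.0 if n < 2 else 2 * m / (n * (n - 1))


@dataclass
class NetworkOfNetworks:
    higher: Network  # higher-level network whose nodes are the entities
    lower: dict      # entity -> that entity's own lower-level Network

    def features(self, entity):
        # Hypothetical integrated feature vector: one node-level feature from
        # the higher-level network plus one graph-level feature from the
        # entity's lower-level network.
        return (self.higher.degree(entity), self.lower[entity].density())


def from_edges(edges):
    g = Network()
    for u, v in edges:
        g.add_edge(u, v)
    return g


# Toy data (hypothetical): three entities, each with a small lower-level network.
higher = from_edges([("P1", "P2"), ("P1", "P3")])
non = NetworkOfNetworks(
    higher=higher,
    lower={
        "P1": from_edges([(0, 1), (1, 2), (0, 2)]),  # triangle
        "P2": from_edges([(0, 1), (1, 2)]),          # path on three nodes
        "P3": from_edges([(0, 1), (1, 2)]),
    },
)
p1_features = non.features("P1")
p2_features = non.features("P2")
```

A feature vector of this kind could be passed to any standard classifier; the point of the sketch is only that each entity contributes information from both levels at once.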
Affiliation(s)
- Shawn Gu
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Meng Jiang
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Pietro Hiram Guzzi
  - Department of Surgical and Medical Sciences, University Magna Graecia of Catanzaro, Catanzaro 88100, Italy

4
Li Q, Newaz K, Milenković T. Improved supervised prediction of aging-related genes via weighted dynamic network analysis. BMC Bioinformatics 2021; 22:520. [PMID: 34696741 PMCID: PMC8543111 DOI: 10.1186/s12859-021-04439-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/14/2021] [Accepted: 10/12/2021] [Indexed: 12/22/2022]
Abstract
BACKGROUND: This study focuses on the task of supervised prediction of aging-related genes from -omics data. Unlike gene expression methods for this task that capture aging-specific information but ignore interactions between genes (i.e., their protein products), or protein-protein interaction (PPI) network methods for this task that account for PPIs, although those PPIs are context-unspecific, we recently integrated the two data types into an aging-specific PPI subnetwork, which yielded more accurate aging-related gene predictions. However, a dynamic aging-specific subnetwork did not improve prediction performance compared to a static aging-specific subnetwork, despite the aging process being dynamic. This could be because the dynamic subnetwork was inferred using a naive Induced subgraph approach. Instead, we recently inferred a dynamic aging-specific subnetwork using a methodologically more advanced notion of network propagation (NP), which improved upon the Induced dynamic aging-specific subnetwork in a different task, that of unsupervised analyses of the aging process.
RESULTS: Here, we evaluate whether our existing NP-based dynamic subnetwork will improve upon the dynamic as well as static subnetwork constructed by the Induced approach in the considered task of supervised prediction of aging-related genes. The existing NP-based subnetwork is unweighted, i.e., it gives equal importance to each of the aging-specific PPIs. Because accounting for aging-specific edge weights might be important, we additionally propose a weighted NP-based dynamic aging-specific subnetwork. We demonstrate that a predictive machine learning model trained and tested on the weighted subnetwork yields higher accuracy when predicting aging-related genes than predictive models run on the existing unweighted dynamic or static subnetworks, regardless of whether the existing subnetworks were inferred using NP or the Induced approach.
CONCLUSIONS: Our proposed weighted dynamic aging-specific subnetwork and its corresponding predictive model could guide with higher confidence than the existing data and models the discovery of novel aging-related gene candidates for future wet lab validation.
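As a rough illustration of the Induced subgraph idea mentioned above, the following sketch builds one network snapshot per age by keeping only those PPIs whose two endpoints are both "active" at that age. The abstract does not say how activity is defined, so the expression threshold, the function names, and the toy data are all assumptions; this is not the NP-based or weighted method the paper proposes.

```python
def induced_dynamic_subnetwork(ppi_edges, expression, threshold):
    """Return one snapshot (a set of edges) per age.

    ppi_edges:  list of (gene_a, gene_b) pairs, the static PPI network
    expression: dict age -> dict gene -> expression level
    threshold:  hypothetical cutoff at or above which a gene counts as active
    """
    snapshots = {}
    for age, levels in sorted(expression.items()):
        active = {gene for gene, level in levels.items() if level >= threshold}
        # Induced subgraph: keep a PPI only if both endpoints are active.
        snapshots[age] = {(a, b) for a, b in ppi_edges if a in active and b in active}
    return snapshots


def degree_over_time(snapshots, gene):
    """A simple dynamic topological feature: the gene's degree in each snapshot."""
    return [sum(gene in edge for edge in edges) for _, edges in sorted(snapshots.items())]


# Toy data (hypothetical): gene C becomes active only at the later age.
ppi = [("A", "B"), ("B", "C"), ("A", "C")]
expression = {
    20: {"A": 5.0, "B": 5.0, "C": 1.0},
    60: {"A": 5.0, "B": 5.0, "C": 5.0},
}
snapshots = induced_dynamic_subnetwork(ppi, expression, threshold=2.0)
a_profile = degree_over_time(snapshots, "A")
c_profile = degree_over_time(snapshots, "C")
```

Per-gene profiles such as `c_profile` are the kind of age-dependent topological feature a supervised model could be trained on, with known aging-related genes as positive labels.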
Affiliation(s)
- Qi Li
  - Department of Computer Science and Engineering, Center for Network and Data Science (CNDS), and Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556, USA
- Khalique Newaz
  - Department of Computer Science and Engineering, Center for Network and Data Science (CNDS), and Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556, USA
- Tijana Milenković
  - Department of Computer Science and Engineering, Center for Network and Data Science (CNDS), and Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556, USA

5
Gu S, Milenković T. Data-driven biological network alignment that uses topological, sequence, and functional information. BMC Bioinformatics 2021; 22:34. [PMID: 33514304 PMCID: PMC7847157 DOI: 10.1186/s12859-021-03971-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2020] [Accepted: 01/15/2021] [Indexed: 11/15/2022]
Abstract
BACKGROUND: Network alignment (NA) can transfer functional knowledge between species' conserved biological network regions. Traditional NA assumes that it is topological similarity (isomorphic-like matching) between network regions that corresponds to the regions' functional relatedness. However, we recently found that functionally unrelated proteins are as topologically similar as functionally related proteins. So, we redefined NA as a data-driven method called TARA, which learns from network and protein functional data what kind of topological relatedness (rather than similarity) between proteins corresponds to their functional relatedness. TARA used topological information (within each network) but not sequence information (between proteins across networks). Yet, TARA yielded higher protein functional prediction accuracy than existing NA methods, even those that used both topological and sequence information.
RESULTS: Here, we propose TARA++ that is also data-driven, like TARA and unlike other existing methods, but that uses across-network sequence information on top of within-network topological information, unlike TARA. To deal with the within-and-across-network analysis, we adapt social network embedding to the problem of biological NA. TARA++ outperforms existing methods in protein functional prediction accuracy.
CONCLUSIONS: As such, combining research knowledge from different domains is promising. Overall, improvements in protein functional prediction have biomedical implications, for example allowing researchers to better understand how cancer progresses or how humans age.
Affiliation(s)
- Shawn Gu
  - Department of Computer Science and Engineering, Eck Institute for Global Health, Center for Network and Data Science, University of Notre Dame, Notre Dame, IN, 46556, USA
- Tijana Milenković
  - Department of Computer Science and Engineering, Eck Institute for Global Health, Center for Network and Data Science, University of Notre Dame, Notre Dame, IN, 46556, USA

6
Newaz K, Wright G, Piland J, Li J, Clark PL, Emrich SJ, Milenković T. Network analysis of synonymous codon usage. Bioinformatics 2020; 36:4876-4884. [PMID: 32609328 DOI: 10.1093/bioinformatics/btaa603] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/06/2019] [Revised: 05/05/2020] [Accepted: 06/22/2020] [Indexed: 12/25/2022]
Abstract
MOTIVATION: Most amino acids are encoded by multiple synonymous codons, some of which are used more rarely than others. Analyses of positions of such rare codons in protein sequences revealed that rare codons can impact co-translational protein folding and that positions of some rare codons are evolutionarily conserved. Analyses of their positions in protein 3-dimensional structures, which are richer in biochemical information than sequences alone, might further explain the role of rare codons in protein folding.
RESULTS: We model protein structures as networks and use network centrality to measure the structural position of an amino acid. We first validate that amino acids buried within the structural core are network-central, and those on the surface are not. Then, we study potential differences between network centralities and thus structural positions of amino acids encoded by conserved rare, non-conserved rare and commonly used codons. We find that in 84% of proteins, the three codon categories occupy significantly different structural positions. We examine protein groups showing different codon centrality trends, i.e. different relationships between structural positions of the three codon categories. We see several cases of all proteins from our data with some structural or functional property being in the same group. Also, we see a case of all proteins in some group having the same property. Our work shows that codon usage is linked to the final protein structure and thus possibly to co-translational protein folding.
AVAILABILITY AND IMPLEMENTATION: https://nd.edu/~cone/CodonUsage/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
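The idea of modelling a protein structure as a network and reading structural position off a centrality score can be sketched as follows. The abstract does not state which centrality measure or which contact definition is used, so closeness centrality, the distance cutoff, and the toy coordinates below are all assumptions for illustration only.

```python
from collections import deque
from math import dist


def contact_network(coords, cutoff):
    """Connect residues i and j if their coordinates lie within `cutoff`.

    coords: list of (x, y, z) tuples, one per residue (hypothetical input)
    cutoff: hypothetical distance threshold in the same units as coords
    """
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def closeness(adj, source):
    """Closeness centrality of `source` over the nodes reachable from it."""
    hops = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives shortest-path hop counts
        u = queue.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    total = sum(hops.values())
    return (len(hops) - 1) / total if total else 0.0


# Toy "structure": five residues on a line, one unit apart, so the contact
# network is a simple path and the middle residue is the most central.
coords = [(float(x), 0.0, 0.0) for x in range(5)]
network = contact_network(coords, cutoff=1.5)
centrality = {i: closeness(network, i) for i in network}
```

In a real structure, buried residues would be expected to have many short paths to the rest of the network and hence higher scores than surface residues, which is the validation step the abstract describes.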
Affiliation(s)
- Khalique Newaz
  - Department of Computer Science and Engineering
  - Center for Network and Data Science
  - Eck Institute for Global Health
- Gabriel Wright
  - Department of Computer Science and Engineering
  - Eck Institute for Global Health
- Jacob Piland
  - Department of Computer Science and Engineering
  - Center for Network and Data Science
  - Eck Institute for Global Health
- Jun Li
  - Department of Applied and Computational Mathematics and Statistics
- Patricia L Clark
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA
- Scott J Emrich
  - Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN 37996, USA
- Tijana Milenković
  - Department of Computer Science and Engineering
  - Center for Network and Data Science
  - Eck Institute for Global Health

7
Abstract
In this study, we deal with the problem of biological network alignment (NA), which aims to find a node mapping between species’ molecular networks that uncovers similar network regions, thus allowing for the transfer of functional knowledge between the aligned nodes. We provide evidence that current NA methods, which assume that topologically similar nodes (i.e., nodes whose network neighborhoods are isomorphic-like) have high functional relatedness, do not actually end up aligning functionally related nodes. That is, we show that the current topological similarity assumption does not hold well. Consequently, we argue that a paradigm shift is needed in how the NA problem is approached. So, we redefine NA as a data-driven framework, called TARA (data-driven NA), which attempts to learn the relationship between topological relatedness and functional relatedness without assuming that topological relatedness corresponds to topological similarity. TARA makes no assumptions about what nodes should be aligned, distinguishing it from existing NA methods. Specifically, TARA trains a classifier to predict whether two nodes from different networks are functionally related based on their network topological patterns (features). We find that TARA is able to make accurate predictions. TARA then takes each pair of nodes that are predicted as related to be part of an alignment. Like traditional NA methods, TARA uses this alignment for the across-species transfer of functional knowledge. TARA as currently implemented uses topological but not protein sequence information for functional knowledge transfer. In this context, we find that TARA outperforms existing state-of-the-art NA methods that also use topological information, WAVE and SANA, and even outperforms or complements a state-of-the-art NA method that uses both topological and sequence information, PrimAlign. Hence, adding sequence information to TARA, which is our future work, is likely to further improve its performance. The software and data are available at http://www.nd.edu/~cone/TARA/.
Affiliation(s)
- Shawn Gu
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, United States of America
  - Center for Network and Data Science, University of Notre Dame, Notre Dame, IN, United States of America
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, United States of America
  - Center for Network and Data Science, University of Notre Dame, Notre Dame, IN, United States of America

8
Aparício D, Ribeiro P, Milenković T, Silva F. Temporal network alignment via GoT-WAVE. Bioinformatics 2020; 35:3527-3529. [PMID: 30759185 DOI: 10.1093/bioinformatics/btz119] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/20/2019] [Revised: 01/20/2019] [Accepted: 02/12/2019] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Network alignment (NA) finds conserved regions between two networks. NA methods optimize node conservation (NC) and edge conservation. Dynamic graphlet degree vectors are a state-of-the-art dynamic NC measure, used within the fastest and most accurate NA method for temporal networks: DynaWAVE. Here, we use graphlet-orbit transitions (GoTs), a different graphlet-based measure of temporal node similarity, as a new dynamic NC measure within DynaWAVE, resulting in GoT-WAVE.
RESULTS: On synthetic networks, GoT-WAVE improves DynaWAVE's accuracy by 30% and speed by 64%. On real networks, when optimizing only dynamic NC, the methods are complementary. Furthermore, only GoT-WAVE supports directed edges. Hence, GoT-WAVE is a promising new temporal NA algorithm, which efficiently optimizes dynamic NC. We provide a user-friendly interface and source code for GoT-WAVE.
AVAILABILITY AND IMPLEMENTATION: http://www.dcc.fc.up.pt/got-wave/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- David Aparício
  - CRACS & INESC-TEC, Departamento de Ciência de Computadores, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- Pedro Ribeiro
  - CRACS & INESC-TEC, Departamento de Ciência de Computadores, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- Tijana Milenković
  - Department of Computer Science and Engineering, Interdisciplinary Center for Network Science and Applications, and Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, USA
- Fernando Silva
  - CRACS & INESC-TEC, Departamento de Ciência de Computadores, Faculdade de Ciências, Universidade do Porto, Porto, Portugal

9
Newaz K, Ghalehnovi M, Rahnama A, Antsaklis PJ, Milenković T. Network-based protein structural classification. R Soc Open Sci 2020; 7:191461. [PMID: 32742675 PMCID: PMC7353965 DOI: 10.1098/rsos.191461] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/23/2019] [Accepted: 05/05/2020] [Indexed: 06/11/2023]
Abstract
Experimental determination of protein function is resource-consuming. As an alternative, computational prediction of protein function has received attention. In this context, protein structural classification (PSC) can help, by allowing for determining structural classes of currently unclassified proteins based on their features, and then relying on the fact that proteins with similar structures have similar functions. Existing PSC approaches rely on sequence-based or direct three-dimensional (3D) structure-based protein features. By contrast, we first model 3D structures of proteins as protein structure networks (PSNs). Then, we use network-based features for PSC. We propose the use of graphlets, state-of-the-art features in many research areas of network science, in the task of PSC. Moreover, because graphlets can deal only with unweighted PSNs, and because accounting for edge weights when constructing PSNs could improve PSC accuracy, we also propose a deep learning framework that automatically learns network features from weighted PSNs. When evaluated on a large set of approximately 9400 CATH and approximately 12 800 SCOP protein domains (spanning 36 PSN sets), the best of our proposed approaches are superior to existing PSC approaches in terms of accuracy, with comparable running times. Our data and code are available at https://doi.org/10.5281/zenodo.3787922.
Affiliation(s)
- Khalique Newaz
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
  - Center for Network and Data Science, University of Notre Dame, Notre Dame, IN 46556, USA
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
- Mahboobeh Ghalehnovi
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Arash Rahnama
  - Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Panos J. Antsaklis
  - Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
  - Center for Network and Data Science, University of Notre Dame, Notre Dame, IN 46556, USA
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA

10
Wright G, Rodriguez A, Li J, Clark PL, Milenković T, Emrich SJ. Analysis of computational codon usage models and their association with translationally slow codons. PLoS One 2020; 15:e0232003. [PMID: 32352987 PMCID: PMC7192439 DOI: 10.1371/journal.pone.0232003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/07/2019] [Accepted: 04/05/2020] [Indexed: 11/19/2022]
Abstract
Improved computational modeling of protein translation rates, including better prediction of where translational slowdowns along an mRNA sequence may occur, is critical for understanding co-translational folding. Because codons within a synonymous codon group are translated at different rates, many computational translation models rely on analyzing synonymous codons. Some models rely on genome-wide codon usage bias (CUB), believing that globally rare and common codons are the most informative of slow and fast translation, respectively. Others use the CUB observed only in highly expressed genes, which should be under selective pressure to be translated efficiently (and whose CUB may therefore be more indicative of translation rates). No prior work has analyzed these models for their ability to predict translational slowdowns. Here, we evaluate five models for their association with slowly translated positions as denoted by two independent ribosome footprint (RFP) count experiments from S. cerevisiae, because RFP data is often considered as a “ground truth” for translation rates across mRNA sequences. We show that all five considered models strongly associate with the RFP data and therefore have potential for estimating translational slowdowns. However, we also show that there is a weak correlation between RFP counts for the same genes originating from independent experiments, even when their experimental conditions are similar. This raises concerns about the efficacy of using current RFP experimental data for estimating translation rates and highlights a potential advantage of using computational models to understand translation rates instead.
Affiliation(s)
- Gabriel Wright
  - Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, IN, United States of America
- Anabel Rodriguez
  - Department of Chemistry & Biochemistry, University of Notre Dame, Notre Dame, IN, United States of America
- Jun Li
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN, United States of America
- Patricia L. Clark
  - Department of Chemistry & Biochemistry, University of Notre Dame, Notre Dame, IN, United States of America
- Tijana Milenković
  - Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, IN, United States of America
- Scott J. Emrich
  - Department of Electrical Engineering & Computer Science, University of Tennessee, Knoxville, TN, United States of America

11
Milano M, Milenković T, Cannataro M, Guzzi PH. L-HetNetAligner: A novel algorithm for Local Alignment of Heterogeneous Biological Networks. Sci Rep 2020; 10:3901. [PMID: 32127586 PMCID: PMC7054427 DOI: 10.1038/s41598-020-60737-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/08/2018] [Accepted: 02/11/2020] [Indexed: 11/10/2022]
Abstract
Networks are widely used for modelling and analysing a wide range of biological data. As a consequence, many different research efforts have resulted in the introduction of a large number of algorithms for analysis and comparison of networks. Many of these algorithms can deal with networks with a single class of nodes and edges, also referred to as homogeneous networks. Recently, many different approaches tried to integrate into a single model the interplay of different molecules. A possible formalism to model such a scenario comes from node/edge-coloured networks (also known as heterogeneous networks) implemented as node/edge-coloured graphs. Therefore, the need for the introduction of algorithms able to compare heterogeneous networks arises. We here focus on the local comparison of heterogeneous networks, and we formulate it as a network alignment problem. To the best of our knowledge, the local alignment of heterogeneous networks has not been explored in the past. We here propose L-HetNetAligner, a novel algorithm that receives as input two heterogeneous networks (node-coloured graphs) and builds a local alignment of them. We also implemented and tested our algorithm. Our results confirm that our method builds high-quality alignments. The following website *contains Supplementary File 1 material and the code.
Affiliation(s)
- Marianna Milano
  - Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, 88040, Italy
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana, USA
- Mario Cannataro
  - Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, 88040, Italy
  - Data Analytics Research Center, University of Catanzaro, Catanzaro, Italy
- Pietro Hiram Guzzi
  - Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, 88040, Italy
  - Data Analytics Research Center, University of Catanzaro, Catanzaro, Italy

12
Liu S, Hachen D, Lizardo O, Poellabauer C, Striegel A, Milenković T. The power of dynamic social networks to predict individuals' mental health. Pac Symp Biocomput 2020; 25:635-646. [PMID: 31797634 PMCID: PMC6924569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Precision medicine has received attention both in and outside the clinic. We focus on the latter, by exploiting the relationship between individuals' social interactions and their mental health to predict one's likelihood of being depressed or anxious from rich dynamic social network data. Existing studies differ from our work in at least one aspect: they do not model social interaction data as a network; they do so but analyze static network data; they examine "correlation" between social networks and health but without making any predictions; or they study other individual traits but not mental health. In a comprehensive evaluation, we show that our predictive model that uses dynamic social network data is superior to its static network as well as non-network equivalents when run on the same data. Supplementary material for this work is available at https://nd.edu/~cone/NetHealth/PSB_SM.pdf.
Affiliation(s)
- Shikang Liu
  - Department of Computer Science and Engineering, University of Notre Dame
- David Hachen
  - Department of Sociology, University of Notre Dame
- Omar Lizardo
  - Department of Sociology, University of California, Los Angeles
- Aaron Striegel
  - Department of Computer Science and Engineering, University of Notre Dame
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame
13
Abstract
Network alignment (NA) compares networks with the goal of finding a node mapping that uncovers highly similar (conserved) network regions. Existing NA methods are homogeneous, i.e., they can deal only with networks containing nodes and edges of one type. Due to increasing amounts of heterogeneous network data with nodes or edges of different types, we extend three recent state-of-the-art homogeneous NA methods, WAVE, MAGNA++, and SANA, to allow for heterogeneous NA for the first time. We introduce several algorithmic novelties. Namely, these existing methods compute homogeneous graphlet-based node similarities and then find high-scoring alignments with respect to these similarities, while simultaneously maximizing the number of conserved edges. Instead, we extend homogeneous graphlets to their heterogeneous counterparts, which we then use to develop a new measure of heterogeneous node similarity. Also, we extend S3, a state-of-the-art measure of edge conservation for homogeneous NA, to its heterogeneous counterpart. Then, we find high-scoring alignments with respect to our heterogeneous node similarity and edge conservation measures. In evaluations on synthetic and real-world biological networks, our proposed heterogeneous NA methods lead to higher-quality alignments and better robustness to noise in the data than their homogeneous counterparts. The software and data from this work are available at https://nd.edu/~cone/colored_graphlets/.
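The homogeneous edge conservation measure named in the abstract, S3, can be illustrated with a short sketch. It assumes the commonly cited definition of the symmetric substructure score (conserved edges divided by the edges of network 1 plus the edges of network 2 induced on the aligned nodes, minus the conserved edges). The function name `s3_score` and its input format are hypothetical, and the heterogeneous extension described above is not reproduced here.

```python
def s3_score(edges1, edges2, mapping):
    """Homogeneous S3 edge conservation of a one-to-one node mapping.

    edges1, edges2: iterables of undirected edges (u, v) without self-loops.
    mapping: dict sending every node of network 1 to a distinct node of network 2.
    Assumed definition: conserved / (|E1| + |E2 induced on image| - conserved).
    """
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("mapping must be one-to-one")

    # Edges of network 1 whose mapped endpoints are also joined in network 2.
    conserved = sum(1 for e in e1 if frozenset(mapping[n] for n in e) in e2)

    # Edges of network 2 whose endpoints are both images of network 1 nodes.
    image = set(mapping.values())
    induced = sum(1 for e in e2 if e <= image)

    denom = len(e1) + induced - conserved
    return conserved / denom if denom else 0.0
```

Aligning a three-node path to a triangle conserves both path edges but leaves one triangle edge unmatched, so the score is penalized for the extra edge in the larger network rather than reaching 1.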
Affiliation(s)
- Shawn Gu
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
- John Johnson
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
- Fazle E Faisal
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Eck Institute for Global Health and Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, 46556, USA
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Eck Institute for Global Health and Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, 46556, USA
14
Vijayan V, Milenković T. Aligning dynamic networks with DynaWAVE. Bioinformatics 2018; 34:1980. [DOI: 10.1093/bioinformatics/bty038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
15
Liu S, Hachen D, Lizardo O, Poellabauer C, Striegel A, Milenković T. Network analysis of the NetHealth data: exploring co-evolution of individuals' social network positions and physical activities. Appl Netw Sci 2018; 3:45. [PMID: 30465021 PMCID: PMC6223883 DOI: 10.1007/s41109-018-0103-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/10/2018] [Accepted: 09/25/2018] [Indexed: 05/03/2023]
Abstract
Understanding the relationship between individuals' social networks and health could help devise public health interventions for reducing incidence of unhealthy behaviors or increasing prevalence of healthy ones. In this context, we explore the co-evolution of individuals' social network positions and physical activities. We are able to do so because the NetHealth study at the University of Notre Dame has generated both high-resolution longitudinal social network (e.g., SMS) data and high-resolution longitudinal health-related behavioral (e.g., Fitbit physical activity) data. We examine trait differences between (i) users whose social network positions (i.e., centralities) change over time versus those whose centralities remain stable, (ii) users whose Fitbit physical activities change over time versus those whose physical activities remain stable, and (iii) users whose centralities and their physical activities co-evolve, i.e., correlate with each other over time. We find that centralities of a majority of all nodes change with time. These users do not show any trait difference compared to time-stable users. However, if out of all users whose centralities change with time we focus on those whose physical activities also change with time, then the resulting users are more likely to be introverted than time-stable users. Moreover, users whose centralities and physical activities both change with time and whose evolving centralities are significantly correlated (i.e., co-evolve) with evolving physical activities are more likely to be introverted as well as anxious compared to those users who are time-stable and do not have a co-evolution relationship. Our network analysis framework reveals several links between individuals' social network structure, health-related behaviors, and the other (e.g., personality) traits. In the future, our study could lead to development of a predictive model of social network structure from behavioral/trait information and vice versa.
Affiliation(s)
- Shikang Liu
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- David Hachen
  - Department of Sociology, University of Notre Dame, Notre Dame, IN 46556, USA
- Omar Lizardo
  - Department of Sociology, University of Notre Dame, Notre Dame, IN 46556, USA
- Christian Poellabauer
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Aaron Striegel
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
  - Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN 46556, USA
16
Abstract
MOTIVATION Network alignment (NA) aims to find a node mapping that conserves similar regions between compared networks. NA is applicable to many fields, including computational biology, where NA can guide the transfer of biological knowledge from well- to poorly-studied species across aligned network regions. Existing NA methods can only align static networks. However, most complex real-world systems evolve over time and should thus be modeled as dynamic networks. We hypothesize that aligning dynamic network representations of evolving systems will produce superior alignments compared to aligning the systems' static network representations, as is currently done. RESULTS For this purpose, we introduce the first ever dynamic NA method, DynaMAGNA++. This proof-of-concept dynamic NA method is an extension of a state-of-the-art static NA method, MAGNA++. Even though both MAGNA++ and DynaMAGNA++ optimize edge as well as node conservation across the aligned networks, MAGNA++ conserves static edges and similarity between static node neighborhoods, while DynaMAGNA++ conserves dynamic edges (events) and similarity between evolving node neighborhoods. For this purpose, we introduce the first ever measure of dynamic edge conservation and rely on our recent measure of dynamic node conservation. Importantly, the two dynamic conservation measures can be optimized with any state-of-the-art NA method and not just MAGNA++. We confirm our hypothesis that dynamic NA is superior to static NA, on synthetic and real-world networks, in computational biology and social domains. DynaMAGNA++ is parallelized and has a user-friendly graphical interface. AVAILABILITY AND IMPLEMENTATION http://nd.edu/~cone/DynaMAGNA++/. CONTACT tmilenko@nd.edu. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- V Vijayan
  - Department of Computer Science and Engineering, ECK Institute for Global Health, and Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, USA
- D Critchlow
  - Department of Computer Science and Engineering, ECK Institute for Global Health, and Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, USA
  - Department of Physics and Astronomy, Austin Peay State University, Clarksville, Tennessee, TN, USA
- T Milenković
  - Department of Computer Science and Engineering, ECK Institute for Global Health, and Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, USA
17
Abstract
MOTIVATION Network alignment (NA) aims to find regions of similarities between species' molecular networks. There exist two NA categories: local (LNA) and global (GNA). LNA finds small highly conserved network regions and produces a many-to-many node mapping. GNA finds large conserved regions and produces a one-to-one node mapping. Given the different outputs of LNA and GNA, when a new NA method is proposed, it is compared against existing methods from the same category. However, both NA categories have the same goal: to allow for transferring functional knowledge from well- to poorly-studied species between conserved network regions. So, which one to choose, LNA or GNA? To answer this, we introduce the first systematic evaluation of the two NA categories. RESULTS We introduce new measures of alignment quality that allow for fair comparison of the different LNA and GNA outputs, as such measures do not exist. We provide user-friendly software for efficient alignment evaluation that implements the new and existing measures. We evaluate prominent LNA and GNA methods on synthetic and real-world biological networks. We study the effect on alignment quality of using different interaction types and confidence levels. We find that the superiority of one NA category over the other is context-dependent. Further, when we contrast LNA and GNA in the application of learning novel protein functional knowledge, the two produce very different predictions, indicating their complementarity. Our results and software provide guidelines for future NA method development and evaluation. AVAILABILITY AND IMPLEMENTATION Software: http://www.nd.edu/~cone/LNA_GNA CONTACT: tmilenko@nd.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lei Meng
  - Department of Computer Science and Engineering, ECK Institute of Global Health, and Interdisciplinary Center for Network Science and Applications
- Aaron Striegel
  - Department of Computer Science and Engineering, Wireless Institute, University of Notre Dame, Notre Dame, IN 46556, USA
- Tijana Milenković
  - Department of Computer Science and Engineering, ECK Institute of Global Health, and Interdisciplinary Center for Network Science and Applications
18
Rund SSC, Yoo B, Alam C, Green T, Stephens MT, Zeng E, George GF, Sheppard AD, Duffield GE, Milenković T, Pfrender ME. Genome-wide profiling of 24 hr diel rhythmicity in the water flea, Daphnia pulex: network analysis reveals rhythmic gene expression and enhances functional gene annotation. BMC Genomics 2016; 17:653. [PMID: 27538446 PMCID: PMC4991082 DOI: 10.1186/s12864-016-2998-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 04/20/2016] [Accepted: 08/05/2016] [Indexed: 11/16/2022]
Abstract
Background Marine and freshwater zooplankton exhibit daily rhythmic patterns of behavior and physiology which may be regulated directly by the light:dark (LD) cycle and/or a molecular circadian clock. One of the best-studied zooplankton taxa, the freshwater crustacean Daphnia, has a 24 h diel vertical migration (DVM) behavior whereby the organism travels up and down through the water column daily. DVM plays a critical role in resource tracking and the behavioral avoidance of predators and damaging ultraviolet radiation. However, there is little information at the transcriptional level linking the expression patterns of genes to the rhythmic physiology/behavior of Daphnia. Results Here we analyzed genome-wide temporal transcriptional patterns from Daphnia pulex collected over a 44 h time period under a 12:12 LD cycle (diel) conditions using a cosine-fitting algorithm. We used a comprehensive network modeling and analysis approach to identify novel co-regulated rhythmic genes that have similar network topological properties and functional annotations as rhythmic genes identified by the cosine-fitting analyses. Furthermore, we used the network approach to predict with high accuracy novel gene-function associations, thus enhancing current functional annotations available for genes in this ecologically relevant model species. Our results reveal that genes in many functional groupings exhibit 24 h rhythms in their expression patterns under diel conditions. We highlight the rhythmic expression of immunity, oxidative detoxification, and sensory process genes. We discuss differences in the chronobiology of D. pulex from other well-characterized terrestrial arthropods. Conclusions This research adds to a growing body of literature suggesting the genetic mechanisms governing rhythmicity in crustaceans may be divergent from other arthropod lineages including insects. 
Lastly, these results highlight the power of using a network analysis approach to identify differential gene expression and provide novel functional annotation. Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-2998-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Samuel S C Rund
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Centre for Immunity, Infection and Evolution, Institute of Evolution, University of Edinburgh, Edinburgh, EH9 3FL, UK
  - Institute of Immunology and Infection Research, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3FL, UK
- Boyoung Yoo
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Present Address: Department of Computer Science, Stanford University, Stanford, CA, 94305, USA
- Camille Alam
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA
- Taryn Green
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
- Melissa T Stephens
  - Notre Dame Genomics and Bioinformatics Core Facility, University of Notre Dame, Notre Dame, IN, 46556, USA
- Erliang Zeng
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Notre Dame Genomics and Bioinformatics Core Facility, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Present Address: Department of Biology, University of South Dakota, Vermillion, SD, 57069, USA
  - Present Address: Department of Computer Science, University of South Dakota, Vermillion, SD, 57069, USA
- Gary F George
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA
- Aaron D Sheppard
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA
- Giles E Duffield
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA
- Tijana Milenković
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, 46556, USA
- Michael E Pfrender
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA
  - Notre Dame Environmental Change Initiative, University of Notre Dame, Notre Dame, IN, 46556, USA
19
20
Faisal FE, Meng L, Crawford J, Milenković T. The post-genomic era of biological network alignment. EURASIP J Bioinform Syst Biol 2015; 2015:3. [PMID: 28194172 PMCID: PMC5270500 DOI: 10.1186/s13637-015-0022-9] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 01/21/2015] [Accepted: 05/18/2015] [Indexed: 11/10/2022]
Abstract
Biological network alignment aims to find regions of topological and functional (dis)similarities between molecular networks of different species. Then, network alignment can guide the transfer of biological knowledge from well-studied model species to less well-studied species between conserved (aligned) network regions, thus complementing valuable insights that have already been provided by genomic sequence alignment. Here, we review computational challenges behind the network alignment problem, existing approaches for solving the problem, ways of evaluating their alignment quality, and the approaches' biomedical applications. We discuss recent innovative efforts of improving the existing view of network alignment. We conclude with open research questions in comparative biological network research that could further our understanding of principles of life, evolution, disease, and therapeutics.
Affiliation(s)
- Fazle E Faisal
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA
  - Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, 46556 USA
  - ECK Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556 USA
- Lei Meng
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA
- Joseph Crawford
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA
  - Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, 46556 USA
  - ECK Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556 USA
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA
  - Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, 46556 USA
  - ECK Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556 USA
21
Abstract
MOTIVATION With the increasing availability of temporal real-world networks, how can these data be studied efficiently? One can model a temporal network as a single aggregate static network, or as a series of time-specific snapshots, each being an aggregate static network over the corresponding time window. Then, one can use established methods for static analysis on the resulting aggregate network(s), but valuable temporal information is lost in the process, either completely or at the interface between different snapshots, respectively. Here, we develop a novel approach for studying a temporal network more explicitly, by capturing inter-snapshot relationships. RESULTS We base our methodology on well-established graphlets (subgraphs), which have been proven in numerous contexts in static network research. We develop new theory to allow for graphlet-based analyses of temporal networks. Our new notion of dynamic graphlets is different from existing dynamic network approaches that are based on temporal motifs (statistically significant subgraphs). The latter have limitations: their results depend on the choice of a null network model that is required to evaluate the significance of a subgraph, and choosing a good null model is non-trivial. Our dynamic graphlets overcome the limitations of the temporal motifs. Also, when we aim to characterize the structure and function of an entire temporal network or of individual nodes, our dynamic graphlets outperform the static graphlets. Clearly, accounting for temporal information helps. We apply dynamic graphlets to temporal age-specific molecular network data to deepen our limited knowledge about human aging. AVAILABILITY AND IMPLEMENTATION http://www.nd.edu/~cone/DG.
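The two baseline representations described in the abstract, a single aggregate static network and a series of time-window snapshots, can be sketched as follows. This is a minimal illustration assuming events are given as `(u, v, t)` triples with non-negative numeric timestamps. The function names and input format are hypothetical, and dynamic graphlets themselves are not implemented here.

```python
from collections import defaultdict


def aggregate(events):
    """Collapse all timestamped edges into one static edge set."""
    return {frozenset((u, v)) for u, v, _ in events}


def snapshots(events, window):
    """Split timestamped edges into consecutive fixed-length snapshots.

    events: iterable of (u, v, t) with numeric t >= 0.
    window: positive snapshot length.
    Returns a list of edge sets; index k covers times [k*window, (k+1)*window).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    buckets = defaultdict(set)
    for u, v, t in events:
        buckets[int(t // window)].add(frozenset((u, v)))
    if not buckets:
        return []
    # Windows with no events are kept as empty snapshots to preserve ordering.
    return [buckets.get(k, set()) for k in range(max(buckets) + 1)]
```

The aggregate view discards the order of events entirely, while the snapshot view keeps coarse ordering but says nothing about how an edge in one snapshot relates to an edge in the next, which is the gap the abstract refers to.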
Affiliation(s)
- Y Hulovatyy
  - Department of Computer Science and Engineering, Interdisciplinary Center for Network Science and Applications, and ECK Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
- H Chen
  - Department of Computer Science and Engineering, Interdisciplinary Center for Network Science and Applications, and ECK Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
- T Milenković
  - Department of Computer Science and Engineering, Interdisciplinary Center for Network Science and Applications, and ECK Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
22
Abstract
BACKGROUND Analogous to genomic sequence alignment, biological network alignment identifies conserved regions between networks of different species. Then, function can be transferred from well- to poorly-annotated species between aligned network regions. Network alignment typically encompasses two algorithmic components: node cost function (NCF), which measures similarities between nodes in different networks, and alignment strategy (AS), which uses these similarities to rapidly identify high-scoring alignments. Different methods use both different NCFs and different ASs. Thus, it is unclear whether the superiority of a method comes from its NCF, its AS, or both. We already showed on state-of-the-art methods, MI-GRAAL and IsoRankN, that combining NCF of one method and AS of another method can give a new superior method. Here, we evaluate MI-GRAAL against a newer approach, GHOST, by mixing-and-matching the methods' NCFs and ASs to potentially further improve alignment quality. While doing so, we approach important questions that have not been asked systematically thus far. First, we ask how much of the NCF information should come from protein sequence data compared to network topology data. Existing methods determine this parameter more or less arbitrarily, which could affect alignment quality. Second, when topological information is used in NCF, we ask how large the size of the neighborhoods of the compared nodes should be. Existing methods assume that the larger the neighborhood size, the better. RESULTS Our findings are as follows. MI-GRAAL's NCF is superior to GHOST's NCF, while the performance of the methods' ASs is data-dependent. Thus, for data on which GHOST's AS is superior to MI-GRAAL's AS, the combination of MI-GRAAL's NCF and GHOST's AS represents a new superior method. Also, the amount of sequence information used within NCF does not affect alignment quality, while the inclusion of topological information is crucial for producing good alignments. Finally, larger neighborhood sizes are preferred, but often, it is the second largest size that is superior. Using this size instead of the largest one would decrease computational complexity. CONCLUSION Taken together, our results represent general recommendations for a fair evaluation of network alignment methods and in particular of two-stage NCF-AS approaches.
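The parameter discussed in the abstract, how much of the node cost function should come from topology versus sequence, is often modeled as a weighted sum of two similarity scores. The sketch below assumes that linear form. The weight name `alpha`, the function name, and the dictionary input format are hypothetical and are not taken from MI-GRAAL or GHOST.

```python
def combined_similarity(topo, seq, alpha):
    """Blend topological and sequence similarity for each node pair.

    topo, seq: dicts mapping (node_in_net1, node_in_net2) -> similarity in [0, 1].
    alpha: weight on topology; (1 - alpha) goes to sequence.
    Pairs missing from one input are treated as having similarity 0 there.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    pairs = set(topo) | set(seq)
    return {
        p: alpha * topo.get(p, 0.0) + (1.0 - alpha) * seq.get(p, 0.0)
        for p in pairs
    }
```

Sweeping `alpha` from 0 to 1 and re-running an alignment strategy on each blended score table is one simple way to probe the sensitivity question the abstract raises.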
23
Abstract
Motivation: Network comparison is a computationally intractable problem with important applications in systems biology and other domains. A key challenge is to properly quantify similarity between wiring patterns of two networks in an alignment-free fashion. Also, alignment-based methods exist that aim to identify an actual node mapping between networks and as such serve a different purpose. Various alignment-free methods that use different global network properties (e.g. degree distribution) have been proposed. Methods based on small local subgraphs called graphlets perform the best in the alignment-free network comparison task, due to high level of topological detail that graphlets can capture. Among different graphlet-based methods, Graphlet Correlation Distance (GCD) was shown to be the most accurate for comparing networks. Recently, a new graphlet-based method called NetDis was proposed, which was claimed to be superior. We argue against this, as the performance of NetDis was not properly evaluated to position it correctly among the other alignment-free methods. Results: We evaluate the performance of available alignment-free network comparison methods, including GCD and NetDis. We do this by measuring accuracy of each method (in a systematic precision-recall framework) in terms of how well the method can group (cluster) topologically similar networks. By testing this on both synthetic and real-world networks from different domains, we show that GCD remains the most accurate, noise-tolerant and computationally efficient alignment-free method. That is, we show that NetDis does not outperform the other methods, as originally claimed, while it is also computationally more expensive. Furthermore, since NetDis is dependent on the choice of a network null model (unlike the other graphlet-based methods), we show that its performance is highly sensitive to the choice of this parameter. 
Finally, we find that its performance is not independent of network sizes and densities, as originally claimed. Contact: natasha@imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ömer Nebil Yaveroğlu
  - California Institute for Telecommunications and Information Technology (Calit2), University of California, Irvine, CA 92697, USA
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, IN 46556, USA
- Nataša Pržulj
  - Department of Computing, Imperial College London, London SW7 2AZ, UK
24
Vijayan V, Saraph V, Milenković T. MAGNA++: Maximizing Accuracy in Global Network Alignment via both node and edge conservation. Bioinformatics 2015; 31:2409-11. [PMID: 25792552 DOI: 10.1093/bioinformatics/btv161] [Citation(s) in RCA: 72] [Impact Index Per Article: 8.0] [Received: 11/20/2014] [Accepted: 03/14/2015] [Indexed: 11/13/2022]
Abstract
MOTIVATION Network alignment aims to find conserved regions between different networks. Existing methods aim to maximize total similarity over all aligned nodes (i.e. node conservation). Then, they evaluate alignment quality by measuring the amount of conserved edges, but only after the alignment is constructed. Thus, we recently introduced MAGNA (Maximizing Accuracy in Global Network Alignment) to directly maximize edge conservation while producing alignments and showed its superiority over the existing methods. Here, we extend the original MAGNA with several important algorithmic advances into a new MAGNA++ framework. RESULTS MAGNA++ introduces several novelties: (i) it simultaneously maximizes any one of three different measures of edge conservation (including our recent superior [Formula: see text] measure) and any desired node conservation measure, which further improves alignment quality compared with maximizing only node conservation or only edge conservation; (ii) it speeds up the original MAGNA algorithm by parallelizing it to automatically use all available resources, as well as by reimplementing the edge conservation measures more efficiently; (iii) it provides a friendly graphical user interface for easy use by domain (e.g. biological) scientists; and (iv) at the same time, MAGNA++ offers source code for easy extensibility by computational scientists. AVAILABILITY AND IMPLEMENTATION http://www.nd.edu/~cone/MAGNA++/
Affiliation(s)
- V Vijayan
  - Department of Computer Science and Engineering, ECK Institute for Global Health, Interdisciplinary Center for Network Science and Application, University of Notre Dame, IN 46556, USA
- V Saraph
  - Department of Computer Science, Brown University, Providence, RI 02912, USA
- T Milenković
  - Department of Computer Science and Engineering, ECK Institute for Global Health, Interdisciplinary Center for Network Science and Application, University of Notre Dame, IN 46556, USA
25
Sun Y, Crawford J, Tang J, Milenković T. Simultaneous Optimization of both Node and Edge Conservation in Network Alignment via WAVE. Lecture Notes in Computer Science 2015. [DOI: 10.1007/978-3-662-48221-6_2] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Indexed: 12/12/2022]
26
27
Abstract
Protein interaction networks (PINs) are often used to "learn" new biological function from their topology. Since current PINs are noisy, their computational de-noising via link prediction (LP) could improve the learning accuracy. LP uses the existing PIN topology to predict missing and spurious links. Many existing LP methods rely on shared immediate neighborhoods of the nodes to be linked. As such, they have limitations. Thus, in order to comprehensively study which topological properties of nodes in PINs dictate whether the nodes should be linked, we introduce novel sensitive LP measures that are expected to overcome the limitations of the existing methods. We systematically evaluate the new and existing LP measures by introducing "synthetic" noise into PINs and measuring how accurate the measures are in reconstructing the original PINs. Also, we use the LP measures to de-noise the original PINs, and we measure biological correctness of the de-noised PINs with respect to functional enrichment of the predicted interactions. Our main findings are: 1) LP measures that favor nodes which are both "topologically similar" and have large shared extended neighborhoods are superior; 2) using more network topology often, though not always, improves LP accuracy; and 3) LP improves biological correctness of the PINs, plus we validate a significant portion of the predicted interactions in independent, external PIN data sources. Ultimately, we are less focused on identifying a superior method and more on showing that LP improves biological correctness of PINs, which is its ultimate goal in computational biology. But we note that our new methods outperform each of the existing ones with respect to at least one evaluation criterion. Alarmingly, we find that the different criteria often disagree in identifying the best method(s), which has important implications for LP communities in any domain, including social networks.
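The class of existing methods mentioned in the abstract, those that score a candidate link by the shared immediate neighborhood of its endpoints, can be sketched with the Jaccard coefficient. This is an illustration of that baseline family only; the paper's new measures based on topological similarity and extended neighborhoods are not reproduced, and the function names are hypothetical.

```python
from itertools import combinations


def adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard_scores(edges):
    """Score every currently unlinked node pair by neighbor-set overlap.

    Returns a dict (u, v) -> |N(u) & N(v)| / |N(u) | N(v)| with u < v.
    Higher scores suggest a more plausible missing link.
    """
    adj = adjacency(edges)
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, not a candidate for a missing link
        union = adj[u] | adj[v]
        scores[(u, v)] = len(adj[u] & adj[v]) / len(union) if union else 0.0
    return scores
```

A measure like this gives a score of zero to any pair with no common neighbor, however similar their wider surroundings are, which is one way to see the limitation the abstract attributes to immediate-neighborhood methods.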
Affiliation(s)
- Yuriy Hulovatyy
- Department of Computer Science and Engineering, ECK Institute for Global Health, and Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, Indiana, United States of America
- Ryan W. Solava
- Department of Computer Science and Engineering, ECK Institute for Global Health, and Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, Indiana, United States of America
- Tijana Milenković
- Department of Computer Science and Engineering, ECK Institute for Global Health, and Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, Indiana, United States of America
28
Abstract
MOTIVATION Because susceptibility to diseases increases with age, studying aging gains importance. Analyses of gene expression or sequence data, which have been indispensable for investigating aging, have been limited to studying genes and their protein products in isolation, ignoring their connectivities. However, proteins function by interacting with other proteins, and this is exactly what biological networks (BNs) model. Thus, analyzing the proteins' BN topologies could contribute to the understanding of aging. Current methods for analyzing systems-level BNs deal with their static representations, even though cells are dynamic. For this reason, and because different data types can give complementary biological insights, we integrate current static BNs with aging-related gene expression data to construct dynamic age-specific BNs. Then, we apply sensitive measures of topology to the dynamic BNs to study cellular changes with age. RESULTS While global BN topologies do not significantly change with age, local topologies of a number of genes do. We predict such genes to be aging-related. We demonstrate credibility of our predictions by (i) observing significant overlap between our predicted aging-related genes and 'ground truth' aging-related genes; (ii) observing significant overlap between functions and diseases that are enriched in our aging-related predictions and those that are enriched in 'ground truth' aging-related data; (iii) providing evidence that diseases which are enriched in our aging-related predictions are linked to human aging; and (iv) validating our high-scoring novel predictions in the literature. AVAILABILITY AND IMPLEMENTATION Software executables are available upon request.
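The integration step described above, combining a static network with age-specific expression data, can be sketched roughly as follows. The sketch assumes that an interaction is kept at a given age only when both of its genes are expressed at that age; the abstract does not spell out the rule, so this rule, the gene names, and the function names are illustrative assumptions.

```python
def age_specific_networks(edges, expressed_by_age):
    """Build one subnetwork per age from a static network.

    Assumption: an edge survives at a given age only if both of its
    endpoint genes are expressed at that age.
    """
    return {
        age: {(u, v) for u, v in edges if u in expressed and v in expressed}
        for age, expressed in expressed_by_age.items()
    }

def degree(network, gene):
    """Number of edges in the network that touch the given gene."""
    return sum(gene in edge for edge in network)

# Hypothetical static network and expression calls at three ages.
edges = {("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G1", "G4")}
expressed_by_age = {
    20: {"G1", "G2", "G3"},
    60: {"G1", "G2", "G3", "G4"},
    80: {"G2", "G3", "G4"},
}

networks = age_specific_networks(edges, expressed_by_age)
# Local topology of one gene tracked across ages (here, just its degree).
g1_trajectory = [degree(networks[age], "G1") for age in sorted(networks)]
print(g1_trajectory)
```

Degree is used only as the simplest stand-in for a local topological measure; the abstract refers to more sensitive measures of topology.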
Affiliation(s)
- Fazle E Faisal
- Department of Computer Science and Engineering, ECK Institute for Global Health and Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN 46556, USA
- Tijana Milenković
- Department of Computer Science and Engineering, ECK Institute for Global Health and Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN 46556, USA
29
Abstract
Proteins are essential macromolecules of life that carry out most cellular processes. Since proteins aggregate to perform function, and since protein-protein interaction (PPI) networks model these aggregations, one would expect to uncover new biology from PPI network topology. Hence, using PPI networks to predict protein function and role of protein pathways in disease has received attention. A debate remains open about whether network properties of "biologically central (BC)" genes (i.e., their protein products), such as those involved in aging, cancer, infectious diseases, or signaling and drug-targeted pathways, exhibit some topological centrality compared to the rest of the proteins in the human PPI network. To help resolve this debate, we design new network-based approaches and apply them to get new insight into biological function and disease. We hypothesize that BC genes have a topologically central (TC) role in the human PPI network. We propose two different concepts of topological centrality. We design a new centrality measure to capture complex wirings of proteins in the network that identifies as TC those proteins that reside in dense extended network neighborhoods. Also, we use the notion of domination and find dominating sets (DSs) in the PPI network, i.e., sets of proteins such that every protein is either in the DS or is a neighbor of the DS. Clearly, a DS has a TC role, as it enables efficient communication between different network parts. We find statistically significant enrichment in BC genes of TC nodes and outperform the existing methods indicating that genes involved in key biological processes occupy topologically complex and dense regions of the network and correspond to its "spine" that connects all other network parts and can thus pass cellular signals efficiently throughout the network. To our knowledge, this is the first study that explores domination in the context of PPI networks.
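The definition of a dominating set given above (every node is either in the set or adjacent to a member of it) can be illustrated with a short sketch. The greedy heuristic below is a standard textbook approximation and is an assumption here; the abstract does not say which algorithm the authors used. The toy network and function names are hypothetical.

```python
def greedy_dominating_set(adj):
    """Greedy heuristic: repeatedly add the node that dominates the most
    still-undominated nodes (itself plus its neighbors)."""
    undominated = set(adj)
    ds = set()
    while undominated:
        # Iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(adj), key=lambda n: len(({n} | adj[n]) & undominated))
        ds.add(best)
        undominated -= {best} | adj[best]
    return ds

def is_dominating(adj, ds):
    """Check the definition: every node is in ds or has a neighbor in ds."""
    return all(n in ds or bool(adj[n] & ds) for n in adj)

# Hypothetical toy network: a hub "H" with three leaves and a short tail.
adj = {
    "H": {"a", "b", "c"},
    "a": {"H"},
    "b": {"H"},
    "c": {"H", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

ds = greedy_dominating_set(adj)
print(sorted(ds), is_dominating(adj, ds))
```

The greedy choice does not guarantee a minimum dominating set, but `is_dominating` verifies that whatever it returns satisfies the definition.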
Affiliation(s)
- Tijana Milenković
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana, United States of America
- Vesna Memišević
- Department of Computer Science, University of California Irvine, Irvine, California, United States of America
- Anthony Bonato
- Department of Mathematics, Ryerson University, Toronto, Ontario, Canada
- Nataša Pržulj
- Department of Computing, Imperial College London, London, United Kingdom
30
Abstract
Summary Networks are used to model real-world phenomena in various domains, including systems biology. Since proteins carry out biological processes by interacting with other proteins, it is expected that cellular functions are reflected in the structure of protein-protein interaction (PPI) networks. Similarly, the topology of residue interaction graphs (RIGs) that model proteins’ 3-dimensional structure might provide insights into protein folding, stability, and function. An important step towards understanding these networks is finding an adequate network model, since models can be exploited algorithmically as well as used for predicting missing data. Evaluating the fit of a model network to the data is a formidable challenge, since network comparisons are computationally infeasible and thus have to rely on heuristics, or “network properties.” We show that it is difficult to assess the reliability of the fit of a model using any network property alone. Thus, we present an integrative approach that feeds a variety of network properties into five machine learning classifiers to predict the best-fitting network model for PPI networks and RIGs. We confirm that geometric random graphs (GEO) are the best-fitting model for RIGs. Since GEO networks model spatial relationships between objects and are thus expected to replicate well the underlying structure of spatially packed residues in a protein, the good fit of GEO to RIGs validates our approach. Additionally, we apply our approach to PPI networks and confirm that the structure of merged data sets containing both binary and co-complex data that are of high coverage and confidence is also consistent with the structure of GEO, while the structure of less complete and lower confidence data is not. Since PPI data are noisy, we test the robustness of the five classifiers to noise and show that their robustness levels differ. 
We demonstrate that none of the classifiers predicts noisy scale-free (SF) networks as GEO, whereas noisy GEOs can be classified as SF. Thus, it is unlikely that our approach would predict a real-world network as GEO if it had a noisy SF structure. However, it could classify the data as SF if it had a noisy GEO structure. Therefore, the structure of the PPI networks is the most consistent with the structure of a noisy GEO.
31
Abstract
Summary Traditional approaches for homology detection rely on finding sufficient similarities between protein sequences. Motivated by studies demonstrating that from non-sequence based sources of biological information, such as the secondary or tertiary molecular structure, we can extract certain types of biological knowledge when sequence-based approaches fail, we hypothesize that protein-protein interaction (PPI) network topology and protein sequence might give insights into different slices of biological information. Since proteins aggregate to perform a function instead of acting in isolation, analyzing complex wirings around a protein in a PPI network could give deeper insights into the protein’s role in the inner working of the cell than analyzing sequences of individual genes. Hence, we believe that one could lose much information by focusing on sequence information alone. We examine whether the information about homologous proteins captured by PPI network topology differs and to what extent from the information captured by their sequences. We measure how similar the topology around homologous proteins in a PPI network is and show that such proteins have statistically significantly higher network similarity than nonhomologous proteins. We compare these network similarity trends of homologous proteins with the trends in their sequence identity and find that network similarities uncover almost as much homology as sequence identities. Although none of the two methods, network topology and sequence identity, seems to capture homology information in its entirety, we demonstrate that the two might give insights into somewhat different types of biological information, as the overlap of the homology information that they uncover is relatively low. 
Therefore, we conclude that similarities of proteins’ topological neighborhoods in a PPI network could be used as a complementary method to sequence-based approaches for identifying homologs, as well as for analyzing evolutionary distance and functional divergence of homologous proteins.
Affiliation(s)
- Vesna Memišević
- Department of Computer Science, University of California, Irvine, CA 92697-3435, United States of America
- Tijana Milenković
- Department of Computer Science, University of California, Irvine, CA 92697-3435, United States of America
- Nataša Pržulj
- Department of Computing, Imperial College London, London, SW7 2AZ, United Kingdom of Great Britain and Northern Ireland
32
Kuchaiev O, Milenković T, Memišević V, Hayes W, Pržulj N. Topological network alignment uncovers biological function and phylogeny. J R Soc Interface 2010; 7:1341-54. [PMID: 20236959 PMCID: PMC2894889 DOI: 10.1098/rsif.2010.0063] [Citation(s) in RCA: 227] [Impact Index Per Article: 16.2] [Received: 02/05/2010] [Accepted: 02/25/2010] [Indexed: 12/22/2022] Open
Abstract
Sequence comparison and alignment has had an enormous impact on our understanding of evolution, biology and disease. Comparison and alignment of biological networks will probably have a similar impact. Existing network alignments use information external to the networks, such as sequence, because no good algorithm for purely topological alignment has yet been devised. In this paper, we present a novel algorithm based solely on network topology, that can be used to align any two networks. We apply it to biological networks to produce by far the most complete topological alignments of biological networks to date. We demonstrate that both species phylogeny and detailed biological function of individual proteins can be extracted from our alignments. Topology-based alignments have the potential to provide a completely new, independent source of phylogenetic information. Our alignment of the protein-protein interaction networks of two very different species, yeast and human, indicates that even distant species share a surprising amount of network topology, suggesting broad similarities in internal cellular wiring across all life on Earth.
Affiliation(s)
- Oleksii Kuchaiev
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Tijana Milenković
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Vesna Memišević
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Wayne Hayes
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Department of Mathematics, Imperial College, London SW7 2AZ, UK
- Nataša Pržulj
- Department of Computing, Imperial College, London SW7 2AZ, UK
33
Kaake RM, Milenković T, Przulj N, Kaiser P, Huang L. Characterization of cell cycle specific protein interaction networks of the yeast 26S proteasome complex by the QTAX strategy. J Proteome Res 2010; 9:2016-29. [PMID: 20170199 DOI: 10.1021/pr1000175] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Indexed: 01/01/2023]
Abstract
Ubiquitin-proteasome dependent protein degradation plays a fundamental role in the regulation of the eukaryotic cell cycle. Cell cycle transitions between different phases are tightly regulated to prevent uncontrolled cell proliferation, which is characteristic of cancer cells. To understand cell cycle phase specific regulation of the 26S proteasome and reveal the molecular mechanisms underlying the ubiquitin-proteasome degradation pathway during cell cycle progression, we have carried out comprehensive characterization of cell cycle phase specific proteasome interacting proteins (PIPs) by QTAX analysis of synchronized yeast cells. Our efforts have generated specific proteasome interaction networks for the G1, S, and M phases of the cell cycle and identified a total of 677 PIPs, 266 of which were not previously identified from unsynchronized cells. On the basis of the dynamic changes of their SILAC ratios across the three cell cycle phases, we have employed a profile vector-based clustering approach and identified 20 functionally significant groups of PIPs, 3 of which are enriched with cell cycle related functions. This work presents the first step toward understanding how dynamic proteasome interactions are involved in various cellular pathways during the cell cycle.
Affiliation(s)
- Robyn M Kaake
- Departments of Physiology & Biophysics and Developmental & Cell Biology, University of California, Irvine, California 92697-4560, USA
34
Abstract
Important biological information is encoded in the topology of biological networks. Comparative analyses of biological networks are proving to be valuable, as they can lead to transfer of knowledge between species and give deeper insights into biological function, disease, and evolution. We introduce a new method that uses the Hungarian algorithm to produce optimal global alignment between two networks using any cost function. We design a cost function based solely on network topology and use it in our network alignment. Our method can be applied to any two networks, not just biological ones, since it is based only on network topology. We use our new method to align protein-protein interaction networks of two eukaryotic species and demonstrate that our alignment exposes large and topologically complex regions of network similarity. At the same time, our alignment is biologically valid, since many of the aligned protein pairs perform the same biological function. From the alignment, we predict function of yet unannotated proteins, many of which we validate in the literature. Also, we apply our method to find topological similarities between metabolic networks of different species and build phylogenetic trees based on our network alignment score. The phylogenetic trees obtained in this way bear a striking resemblance to the ones obtained by sequence alignments. Our method detects topologically similar regions in large networks that are statistically significant. It does this independent of protein sequence or any other information external to network topology.
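The abstract above describes using the Hungarian algorithm to find an optimal assignment between the nodes of two networks under an arbitrary cost function. A minimal sketch follows, using a standard O(n²m) potentials-based formulation of the algorithm. The cost function here, the absolute difference in node degree, is only an illustrative stand-in for the paper's topology-based cost, and the two toy networks and all names are hypothetical.

```python
def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns (total_cost, assignment) where assignment[i] is the column
    matched to row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0] * (n + 1)      # row potentials (1-indexed)
    v = [0] * (m + 1)      # column potentials (1-indexed)
    p = [0] * (m + 1)      # p[j] = row currently matched to column j, 0 if none
    way = [0] * (m + 1)    # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = INF
            j1 = 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [None] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return total, assignment

# Hypothetical node degrees of two small networks to be aligned.
deg_g = {"a": 1, "b": 2, "c": 1}
deg_h = {"x": 1, "y": 3, "z": 2, "w": 2}
g_nodes, h_nodes = sorted(deg_g), sorted(deg_h)

# Stand-in cost: absolute degree difference between candidate node pairs.
cost = [[abs(deg_g[g] - deg_h[h]) for h in h_nodes] for g in g_nodes]
total, assignment = hungarian(cost)
alignment = {g: h_nodes[j] for g, j in zip(g_nodes, assignment)}
print(total, alignment)
```

Because the algorithm only sees the cost matrix, any other node-to-node cost can be swapped in without touching `hungarian`, which matches the abstract's point that the method works with any cost function.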
Affiliation(s)
- Tijana Milenković
- Department of Computing, Imperial College London SW7 2AZ, UK
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Weng Leong Ng
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Wayne Hayes
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Department of Mathematics, Imperial College London SW7 2AZ, UK
- Nataša Pržulj
- Department of Computing, Imperial College London SW7 2AZ, UK
35
Ho H, Milenković T, Memisević V, Aruri J, Przulj N, Ganesan AK. Protein interaction network topology uncovers melanogenesis regulatory network components within functional genomics datasets. BMC Syst Biol 2010; 4:84. [PMID: 20550706 PMCID: PMC2904735 DOI: 10.1186/1752-0509-4-84] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 11/04/2009] [Accepted: 06/15/2010] [Indexed: 12/11/2022]
Abstract
Background RNA-mediated interference (RNAi)-based functional genomics is a systems-level approach to identify novel genes that control biological phenotypes. Existing computational approaches can identify individual genes from RNAi datasets that regulate a given biological process. However, currently available methods cannot identify which RNAi screen "hits" are novel components of well-characterized biological pathways known to regulate the interrogated phenotype. In this study, we describe a method to identify genes from RNAi datasets that are novel components of known biological pathways. We experimentally validate our approach in the context of a recently completed RNAi screen to identify novel regulators of melanogenesis. Results In this study, we utilize a PPI network topology-based approach to identify targets within our RNAi dataset that may be components of known melanogenesis regulatory pathways. Our computational approach identifies a set of screen targets that cluster topologically in a human PPI network with the known pigment regulator Endothelin receptor type B (EDNRB). Validation studies reveal that these genes impact pigment production and EDNRB signaling in pigmented melanoma cells (MNT-1) and normal melanocytes. Conclusions We present an approach that identifies novel components of well-characterized biological pathways from functional genomics datasets that could not have been identified by existing statistical and computational approaches.
Affiliation(s)
- Hsiang Ho
- Department of Biological Chemistry, University of California, Irvine, 92697-1700, USA
36
Simonovic M, Radisavljević M, Milenković T, Grbeša G. P01-159 - Influence of the mild TBI on the chronic course of the posttraumatic stress disorder. Eur Psychiatry 2010. [DOI: 10.1016/s0924-9338(10)70364-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022] Open
37
Abstract
Much attention has recently been given to the statistical significance of topological features observed in biological networks. Here, we consider residue interaction graphs (RIGs) as network representations of protein structures with residues as nodes and inter-residue interactions as edges. Degree-preserving randomized models have been widely used for this purpose in biomolecular networks. However, such a single summary statistic of a network may not be detailed enough to capture the complex topological characteristics of protein structures and their network counterparts. Here, we investigate a variety of topological properties of RIGs to find a well fitting network null model for them. The RIGs are derived from a structurally diverse protein data set at various distance cut-offs and for different groups of interacting atoms. We compare the network structure of RIGs to several random graph models. We show that 3-dimensional geometric random graphs, that model spatial relationships between objects, provide the best fit to RIGs. We investigate the relationship between the strength of the fit and various protein structural features. We show that the fit depends on protein size, structural class, and thermostability, but not on quaternary structure. We apply our model to the identification of significantly over-represented structural building blocks, i.e., network motifs, in protein structure networks. As expected, choosing geometric graphs as a null model results in the most specific identification of motifs. Our geometric random graph model may facilitate further graph-based studies of protein conformation space and have important implications for protein structure comparison and prediction. The choice of a well-fitting null model is crucial for finding structural motifs that play an important role in protein folding, stability and function. 
To our knowledge, this is the first study that addresses the challenge of finding an optimized null model for RIGs, by comparing various RIG definitions against a series of network models.
Affiliation(s)
- Tijana Milenković
- Department of Computer Science, University of California Irvine, Irvine, California, United States of America
- Michael Lappe
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Nataša Pržulj
- Department of Computer Science, University of California Irvine, Irvine, California, United States of America
38
Abstract
Motivation Proteins are essential macromolecules of life and thus understanding their function is of great importance. The number of functionally unclassified proteins is large even for simple and well studied organisms such as baker's yeast. Methods for determining protein function have shifted their focus from targeting specific proteins based solely on sequence homology to analyses of the entire proteome based on protein-protein interaction (PPI) networks. Since proteins interact to perform a certain function, analyzing structural properties of PPI networks may provide useful clues about the biological function of individual proteins, protein complexes they participate in, and even larger subcellular machines. Results We design a sensitive graph theoretic method for comparing local structures of node neighborhoods that demonstrates that in PPI networks, biological function of a node and its local network structure are closely related. The method summarizes a protein's local topology in a PPI network into the vector of graphlet degrees called the signature of the protein and computes the signature similarities between all protein pairs. We group topologically similar proteins under this measure in a PPI network and show that these protein groups belong to the same protein complexes, perform the same biological functions, are localized in the same subcellular compartments, and have the same tissue expressions. Moreover, we apply our technique on a proteome-scale network data and infer biological function of yet unclassified proteins demonstrating that our method can provide valuable guidelines for future experimental research such as disease protein prediction. Availability Data is available upon request.
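The "signature" idea above, a vector counting how many times a node touches each graphlet position, can be sketched for the smallest graphlets only. The block below covers just the four positions (orbits) found in 2- and 3-node graphlets, numbered here in the commonly used order: end of an edge, end of a 3-node path, middle of a 3-node path, and triangle. Signatures built from larger graphlets have many more entries and are not attempted here. The toy network and the function name `graphlet_signature` are hypothetical.

```python
from itertools import combinations

def graphlet_signature(adj, v):
    """Return (orbit0, orbit1, orbit2, orbit3) for node v.

    orbit0: edges touching v (its degree)
    orbit1: induced 3-node paths in which v is an end node
    orbit2: induced 3-node paths in which v is the middle node
    orbit3: triangles containing v
    """
    nbrs = adj[v]
    # Pairs of neighbors that are themselves linked close a triangle with v.
    triangles = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    # Remaining neighbor pairs form an open path with v in the middle.
    open_pairs = len(nbrs) * (len(nbrs) - 1) // 2 - triangles
    # Paths v - u - w where w is neither v nor a neighbor of v.
    path_ends = sum(len(adj[u] - nbrs - {v}) for u in nbrs)
    return (len(nbrs), path_ends, open_pairs, triangles)

# Hypothetical toy network: a triangle a-b-c with a pendant node d on c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

signatures = {node: graphlet_signature(adj, node) for node in sorted(adj)}
for node, sig in signatures.items():
    print(node, sig)
```

Nodes a and b receive identical vectors because they occupy topologically equivalent positions, which is the property that signature similarity between proteins is meant to capture.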
Affiliation(s)
- Tijana Milenković
- Department of Computer Science, University of California, Irvine, CA 92697-3435, U.S.A
- Nataša Pržulj
- Department of Computer Science, University of California, Irvine, CA 92697-3435, U.S.A
39
Abstract
BACKGROUND The recent explosion in biological and other real-world network data has created the need for improved tools for large network analyses. In addition to well established global network properties, several new mathematical techniques for analyzing local structural properties of large networks have been developed. Small over-represented subgraphs, called network motifs, have been introduced to identify simple building blocks of complex networks. Small induced subgraphs, called graphlets, have been used to develop "network signatures" that summarize network topologies. Based on these network signatures, two new highly sensitive measures of network local structural similarities were designed: the relative graphlet frequency distance (RGF-distance) and the graphlet degree distribution agreement (GDD-agreement). Finding adequate null-models for biological networks is important in many research domains. Network properties are used to assess the fit of network models to the data. Various network models have been proposed. To date, there does not exist a software tool that measures the above mentioned local network properties. Moreover, none of the existing tools compare real-world networks against a series of network models with respect to these local as well as a multitude of global network properties. RESULTS Thus, we introduce GraphCrunch, a software tool that finds well-fitting network models by comparing large real-world networks against random graph models according to various network structural similarity measures. It has unique capabilities of finding computationally expensive RGF-distance and GDD-agreement measures. In addition, it computes several standard global network measures and thus supports the largest variety of network measures thus far. 
Also, it is the first software tool that compares real-world networks against a series of network models and that has built-in parallel computing capabilities allowing for a user specified list of machines on which to perform compute intensive searches for local network properties. Furthermore, GraphCrunch is easily extendible to include additional network measures and models. CONCLUSION GraphCrunch is a software tool that implements the latest research on biological network models and properties: it compares real-world networks against a series of random graph models with respect to a multitude of local and global network properties. We present GraphCrunch as a comprehensive, parallelizable, and easily extendible software tool for analyzing and modeling large biological networks. The software is open-source and freely available at http://www.ics.uci.edu/~bio-nets/graphcrunch/. It runs under Linux, MacOS, and Windows Cygwin. In addition, it has an easy to use on-line web user interface that is available from the above web page.
Affiliation(s)
- Tijana Milenković
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA.
40
Petrovski G, Dimitrovski C, Sadikario S, Bogoev M, Milenković T. [Hypertension and diabetes mellitus]. Pril (Makedon Akad Nauk Umet Odd Med Nauki) 2004; 25:17-26. [PMID: 15735533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2023]
Abstract
AIM To evaluate hypertension in patients with Diabetes Mellitus (DM) and its correlation with age, duration of DM, Body Mass Index (BMI) and HbA1C. MATERIALS AND METHODS A retrospective study was made on 1211 patients with DM (male 554 and female 657), hospitalized at Clinic of Endocrinology between January 2001 and December 2002. Patients were divided into two groups: a Control Group (CG), subdivided into 3 groups of patients with DM type 1 (CG-1), DM type 2 on oral anti-hyperglycemic agents (CG-2) and DM type 2 on insulin therapy (CG-3), and an Examined Group (EG), with the same diabetes subgroups but including hypertension. RESULTS We found hypertension in 12.6% of patients with DM type 1, 30.5% of those with DM type 2 on oral anti-hyperglycemic agents and 33.4% of those with DM type 2 on insulin therapy. CONCLUSION Hypertension is most prevalent in DM type 2 patients (33.4%), compared with 12.6% in DM type 1. The duration of DM differs significantly (p<0.05) between patients with and without hypertension.
41
Rajić V, Zdravković D, Milenković T, Banićević M. [Ectopy of the thyroid gland as cause of neck or tongue base tumour]. SRP ARK CELOK LEK 2001; 129 Suppl 1:68-71. [PMID: 15637996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2023] Open
Abstract
One of the most important causes of congenital hypothyroidism, found in 35-42% of cases, is ectopy of the thyroid gland. The thyroid gland can be displaced to the base of the tongue, under the tongue, or under the hyoid bone. The ectopic gland is also hypoplastic and secretes insufficient amounts of thyroid hormones. Through a negative feedback mechanism, this causes elevation of TSH. Under permanent TSH stimulation the ectopic gland can enlarge, appearing as a neck or tongue base "tumour". In this way, by measuring the TSH level in a newborn, all children with thyroid gland ectopy can be detected. Ectopy of the thyroid gland as a cause of congenital hypothyroidism presented in three patients as a tumour of the neck or tongue base. After surgical removal of the "tumour", histopathologic analysis revealed that it was thyroid tissue. No patient underwent thyroid function testing or identification of thyroid tissue (ultrasound or scintigraphy) before surgery. All were born in regions of Serbia where screening for congenital hypothyroidism was not carried out at all or only temporarily. Screening of newborns for congenital hypothyroidism is based on measuring the TSH level. By this method all patients with thyroid gland ectopy can be detected. Scintigraphic examination after surgery detected no thyroid tissue, and replacement therapy with Na L-thyroxine was started.
42
Zdravković D, Milenković T, Sedlecki K, Guć-Sćekić M, Rajić V, Banićević M. [Causes of ambiguous external genitalia in neonates]. SRP ARK CELOK LEK 2001; 129:57-60. [PMID: 11534268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2023] Open
Abstract
INTRODUCTION The classification of disorders such as ambiguous genitalia in newborns is difficult because similar or identical phenotypes could have several different aetiologies. In most cases it was impossible to correlate the aetiology of the disorder and the appearance of the external genitalia [1-3]. A newborn with ambiguous genitalia needs prompt evaluation that will permit gender assignment and detection of life-threatening conditions (salt-losing crisis due to congenital adrenal hyperplasia or Wilms' tumour). We studied the causes and characteristics of ambiguous genitalia in newborn infants over the period from 1990 to 1999. PATIENTS AND METHODS The following genital phenotypes are considered as ambiguous: 1. Hypospadias with no palpable gonads; 2. Hypospadias with micropenis and no palpable gonads or one palpable gonad; 3. Newborn with female external genitalia and a gonadal mass in labia or labial fusion and/or clitoral enlargement [1, 4]. The diagnostic evaluation of newborns with ambiguous genitalia consisted of history and physical examination, determination of serum electrolytes, plasma 17-hydroxyprogesterone (17-OHP), chromosome analysis on cultured lymphocytes, sonogram of the abdomen in connection with a genitogram; and whenever it was necessary, basal plasma concentrations of testosterone and, after the stimulation with human chorionic gonadotropin (hCG), laparotomy for definitive determination of gonadal histology. All disorders with ambiguous genitalia have been classified in four groups [6]: 1. Female pseudohermaphroditism (FPH); 2. Male pseudohermaphroditism (MPH); 3. True hermaphroditism (TH); 4. Asymmetrical gonadal dysgenesis (ASGD). RESULTS The causes of sexual differentiation disorders in a group of 38 newborns with ambiguous genitalia are presented in Table 1. Main criteria for the diagnosis of FPH were normal female karyotype 46, XX, masculinization of external genitalia and no palpable gonads. 
Genitography revealed a urogenital sinus and vagina, and ultrasound examination revealed the uterus. During the initial examination, seven of 15 newborns with congenital adrenal hyperplasia (CAH) (Table 2) due to 21-hydroxylase (P450c21) deficiency (21-OHD) had clinical or laboratory signs of adrenal crisis. Two children had a simple virilizing form of 21-OHD. Female gender was chosen for these children. In the other three patients with FPH, isolated clitoral hyperplasia or labial fusion was the main reason for investigation. The common characteristics of newborns with MPH were as follows: normal male karyotype (46,XY) with normally developed or dysgenetic testes, and/or a good response to hCG stimulation. Complete androgen insensitivity (testicular feminization) was detected in two children (Table 3) with female external genitalia and palpable gonads in the labial folds, and female gender was chosen. Denys-Drash syndrome was detected in one newborn with ambiguous genitalia, no palpable gonads, and a normal response to hCG; multiple bilateral renal tumours found on ultrasound were identified as Wilms' tumour. In the other newborns with MPH, incomplete masculinization consisted of hypospadias, mostly of the perineoscrotal type, and of micropenis (penile size less than 2 cm) and/or bilateral or unilateral cryptorchidism (Table 3). Male sex was chosen for all of these children. Asymmetrical gonadal dysgenesis was detected in two newborn infants. Both children had a 46,XY/46,XX karyotype, a testis on one side of the abdomen and a streak gonad on the other, a developed vagina, uterus and unilateral Fallopian tube, and were raised as females. True hermaphroditism was established in one newborn with a 46,XX karyotype, with a testis on one side of the abdomen and an ovotestis on the other side. The parents chose male gender. 
The aetiology of ambiguous genitalia was not established in five children; in two children with a 46,XY karyotype and one with a 46,XX karyotype (with palpable gonads) the diagnostic study was not completed. CONCLUSIONS The most common cause of ambiguous genitalia in our newborn patients was CAH due to 21-hydroxylase deficiency [2, 4, 6, 7]; 87% of these patients had the salt-wasting form of the disease. In the majority of patients the appearance of the external genitalia made it possible to detect the disease immediately after birth. The relatively high incidence of adrenal crisis in our patients with CAH (38%) therefore seems unjustified. The decision on gender assignment was possible after appropriate study of the nature of the disorder. The causes of MPH are numerous and heterogeneous [1, 3, 8]. With the exception of two patients with the complete form of androgen insensitivity, male gender was assigned in all newborns with MPH. The appearance of external genitalia with severe perineoscrotal hypospadias and/or micropenis suggested the possibility of incomplete androgen resistance. If male assignment is being considered, assessment of the response of phallic size to testosterone treatment is recommended. If penile size does not reach 2.5 cm or more, male sex assignment is not advisable [1]. It is important for the paediatric surgeon to be involved in the diagnostic evaluation of these infants in order to plan the timing and techniques of surgical reconstruction [6]. The decision to raise a patient with sex chromosome mosaicism, true hermaphroditism, or mixed gonadal dysgenesis as either a male or a female was based on the appearance of the external genitalia and possible fertility [1, 9]. The parental choice of male sex in our patient with true hermaphroditism could not be considered optimal.
Affiliation(s)
- D Zdravković
- Dr. Vukan Chupitsh Mother and Child Health Care Institute of Serbia, Belgrade
|
43
|
Zdravković D, Banićević M, Maksimović R, Milenković T. [Relationship between height velocity and growth hormone secretion in normal and short prepubertal children]. SRP ARK CELOK LEK 1994; 122:127-130. [PMID: 17977406] [Indexed: 05/25/2023]
Abstract
In 60 prepubertal children of both sexes, aged between 6.4 and 15.5 years, with normal or short stature and height velocity standard deviation scores (SDS) between 2.69 and -5.75, we performed 12-h nocturnal growth hormone (GH) profiles. We found a statistically significant relationship between growth velocity and GH secretion in the whole group of 60 children, which comprised children with different degrees of GH deficiency, short normal children, and children with normal stature. The relationship was described by a logarithmic curve. In the group of 13 children with constitutional growth delay and in the group of 9 children with familial short stature, we did not find a significant relationship between growth velocity and GH secretion. In the group of short children with complete GH deficiency there were marked individual differences in growth velocity after the beginning of thyroid replacement therapy; children with secondary hypothyroidism grew significantly faster than children with normal thyrotropic function. In our opinion, height velocity is controlled by thyroid hormones, genetic influences and, possibly, other unknown factors that operate independently of GH secretion during the prepubertal years.
|