1
|
Ironside-Smith R, Noë B, Allen SM, Costello S, Turner LD. Motif discovery in hospital ward vital signs observation networks. NETWORK MODELING AND ANALYSIS IN HEALTH INFORMATICS AND BIOINFORMATICS 2024; 13:55. [PMID: 39386086 PMCID: PMC11458707 DOI: 10.1007/s13721-024-00490-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Revised: 08/27/2024] [Accepted: 09/18/2024] [Indexed: 10/12/2024]
Abstract
Vital signs observations are regular measurements used by healthcare staff to track a patient's overall health status on hospital wards. We examine the potential of re-purposing aggregated and anonymised hospital data sources surrounding vital signs recording to provide new insights into how care is managed and delivered on wards. In this paper, we conduct a retrospective longitudinal observational study of 770,720 individual vital signs recordings across 20 hospital wards in South Wales (UK) and present a network modelling framework to explore and extract behavioural patterns via analysis of the resulting network structures at a global and local level. Self-loop edges, dyad, triad, and tetrad subgraphs were extracted and evaluated against a null model to determine individual statistical significance, and then combined into ward-level feature vectors to provide the means for determining notable behaviours across wards. Modelling data as a static network, by aggregating all vital sign observation data points, resulted in high uniformity but with the loss of important information, which was better captured when modelling the static-temporal network, highlighting time's crucial role as a network element. Wards mostly followed expected patterns, with chains or stand-alone supplementary observations by clinical staff. However, observation sequences that deviate from this are revealed in five identified motif subgraphs and six anti-motif subgraphs. External ward characteristics also showed minimal impact on the relative abundance of subgraphs, indicating a 'superfamily' phenomenon similar to that seen in complex networks in other domains. Overall, the results show that network modelling effectively captured and exposed behaviours within vital signs observation data, and demonstrated uniformity across hospital wards in managing this practice.
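As an illustration of what "evaluated against a null model" can mean in practice, the following minimal sketch computes a Z-score for one subgraph type (mutual dyads) in a directed network. It is not the authors' pipeline: the abstract does not specify the null model, so the target-shuffling randomisation, the function names, and the toy edge list with integer node IDs are all assumptions made for this example.

```python
import random
import statistics


def count_mutual_dyads(edges):
    """Count unordered node pairs {u, v} with both u->v and v->u present."""
    edge_set = set(edges)  # duplicate edges are collapsed
    return sum(1 for (u, v) in edge_set if u < v and (v, u) in edge_set)


def shuffled_null(edges, rng):
    """Assumed null model: permute edge targets among the edges.

    Every node keeps its in- and out-degree (counting multi-edges);
    self-loops and duplicate edges may appear in the randomised network.
    """
    sources = [u for u, _ in edges]
    targets = [v for _, v in edges]
    rng.shuffle(targets)
    return list(zip(sources, targets))


def motif_z_score(edges, count_fn, n_random=200, seed=0):
    """Return (observed count, null mean, Z-score) for one subgraph type."""
    rng = random.Random(seed)
    observed = count_fn(edges)
    null_counts = [count_fn(shuffled_null(edges, rng)) for _ in range(n_random)]
    mean = statistics.mean(null_counts)
    std = statistics.pstdev(null_counts)
    z = (observed - mean) / std if std > 0 else float("nan")
    return observed, mean, z


# Hypothetical toy network; nodes are arbitrary integer IDs.
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 0)]
observed, null_mean, z = motif_z_score(edges, count_mutual_dyads)
print(observed, round(null_mean, 2), round(z, 2))
```

A clearly positive Z-score would mark the subgraph as a candidate motif and a clearly negative one as a candidate anti-motif; the threshold to apply is a separate choice not covered here.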
Affiliation(s)
- Rupert Ironside-Smith
- School of Computer Science and Informatics, Cardiff University, Abacws, Senghennydd Road, Cardiff, CF24 4AG UK
| | - Beryl Noë
- School of Computer Science and Informatics, Cardiff University, Abacws, Senghennydd Road, Cardiff, CF24 4AG UK
| | - Stuart M. Allen
- School of Computer Science and Informatics, Cardiff University, Abacws, Senghennydd Road, Cardiff, CF24 4AG UK
| | - Shannon Costello
- Florence Nightingale Faculty of Nursing, Midwifery and Palliative Care, King’s College London, 57 Waterloo Road, London, SE1 8WA UK
| | - Liam D. Turner
- School of Computer Science and Informatics, Cardiff University, Abacws, Senghennydd Road, Cardiff, CF24 4AG UK
| |
|
2
|
Tang H, Ma G, Zhang Y, Ye K, Guo L, Liu G, Huang Q, Wang Y, Ajilore O, Leow AD, Thompson PM, Huang H, Zhan L. A comprehensive survey of complex brain network representation. META-RADIOLOGY 2023; 1:100046. [PMID: 39830588 PMCID: PMC11741665 DOI: 10.1016/j.metrad.2023.100046] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 01/22/2025]
Abstract
Recent years have shown the great merit of neuroimaging data for understanding structural and functional changes in the brain, as well as their relationship to different neurodegenerative diseases and other clinical phenotypes. Brain networks, derived from different neuroimaging modalities, have attracted increasing attention due to their potential to provide system-level insights that characterize brain dynamics and abnormalities in neurological conditions. Traditional methods aim to pre-define multiple topological features of brain networks and relate these features to different clinical measures or demographical variables. With the enormous successes in deep learning techniques, graph learning methods have played significant roles in brain network analysis. In this survey, we first provide a brief overview of neuroimaging-derived brain networks. Then, we focus on presenting a comprehensive overview of both traditional methods and state-of-the-art deep-learning methods for brain network mining. Major models and objectives of these methods are reviewed within this paper. Finally, we discuss several promising research directions in this field.
Affiliation(s)
- Haoteng Tang
- Department of Computer Science, College of Engineering and Computer Science, University of Texas Rio Grande Valley, 1201 W University Dr, Edinburg, 78539, TX, USA
| | - Guixiang Ma
- Intel Labs, 2111 NE 25th Ave, Hillsboro, 97124, OR, USA
| | - Yanfu Zhang
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, 3700 O’Hara St., Pittsburgh, 15261, PA, USA
| | - Kai Ye
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, 3700 O’Hara St., Pittsburgh, 15261, PA, USA
| | - Lei Guo
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, 3700 O’Hara St., Pittsburgh, 15261, PA, USA
| | - Guodong Liu
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, 3700 O’Hara St., Pittsburgh, 15261, PA, USA
| | - Qi Huang
- Department of Radiology, Utah Center of Advanced Imaging, University of Utah, 729 Arapeen Drive, Salt Lake City, 84108, UT, USA
| | - Yalin Wang
- School of Computing and Augmented Intelligence, Arizona State University, 699 S Mill Ave., Tempe, 85281, AZ, USA
| | - Olusola Ajilore
- Department of Psychiatry, University of Illinois Chicago, 1601 W. Taylor St., Chicago, 60612, IL, USA
| | - Alex D. Leow
- Department of Psychiatry, University of Illinois Chicago, 1601 W. Taylor St., Chicago, 60612, IL, USA
| | - Paul M. Thompson
- Department of Neurology, University of Southern California, 2001 N. Soto St., Los Angeles, 90032, CA, USA
| | - Heng Huang
- Department of Computer Science, University of Maryland, 8125 Paint Branch Dr, College Park, 20742, MD, USA
| | - Liang Zhan
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, 3700 O’Hara St., Pittsburgh, 15261, PA, USA
| |
|
3
|
Abstract
Since the large-scale experimental characterization of protein–protein interactions (PPIs) is not possible for all species, several computational PPI prediction methods have been developed that harness existing data from other species. While PPI network prediction has been extensively used in eukaryotes, microbial network inference has lagged behind. However, bacterial interactomes can be built using the same principles and techniques; in fact, several methods are better suited to bacterial genomes. These predicted networks allow systems-level analyses in species that lack experimental interaction data. This review describes the current network inference and analysis techniques and summarizes the use of computationally-predicted microbial interactomes to date.
|
4
|
Pairwise Biological Network Alignment Based on Discrete Bat Algorithm. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2021; 2021:5548993. [PMID: 34777564 PMCID: PMC8580637 DOI: 10.1155/2021/5548993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2021] [Revised: 09/29/2021] [Accepted: 10/13/2021] [Indexed: 11/18/2022]
Abstract
The development of high-throughput technology has provided a reliable technical guarantee for an increased amount of available data on biological networks. Network alignment is used to analyze these data to identify conserved functional network modules and understand evolutionary relationships across species. Thus, an efficient computational network aligner is needed for network alignment. In this paper, the classic bat algorithm is discretized and applied to network alignment. The bat algorithm initializes the population randomly and then searches for the optimal solution iteratively. Based on the bat algorithm, the global pairwise alignment algorithm BatAlign is proposed. In BatAlign, the individual velocity and the position are represented by a discrete code. BatAlign uses a search algorithm whose objective function is the number of conserved edges. The similarity between the networks is used to initialize the population. The experimental results showed that the algorithm was able to match proteins with high functional consistency and reach a relatively high topological quality.
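The objective mentioned above, the number of conserved edges, can be sketched in a few lines. This is only an illustration of the quantity being maximised, not BatAlign itself: the function name, the undirected toy networks, and the two candidate alignments are hypothetical, and the alignment is assumed to be an injective node mapping.

```python
def conserved_edges(edges_a, edges_b, alignment):
    """Count edges (u, v) of network A whose images under `alignment`
    form an edge of network B. Both networks are treated as undirected,
    and `alignment` is assumed to map distinct nodes to distinct nodes."""
    b_edges = {frozenset(e) for e in edges_b}
    count = 0
    for u, v in edges_a:
        if u in alignment and v in alignment:
            if frozenset((alignment[u], alignment[v])) in b_edges:
                count += 1
    return count


# Hypothetical toy networks with the same wiring under a1->b1, ..., a4->b4.
edges_a = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"), ("a3", "a4")]
edges_b = [("b1", "b2"), ("b2", "b3"), ("b1", "b3"), ("b3", "b4")]

good = {"a1": "b1", "a2": "b2", "a3": "b3", "a4": "b4"}
worse = {"a1": "b1", "a2": "b2", "a3": "b4", "a4": "b3"}

print(conserved_edges(edges_a, edges_b, good))   # 4
print(conserved_edges(edges_a, edges_b, worse))  # 2
```

A search procedure such as the discretized bat algorithm would propose candidate mappings and keep those that score higher on this count.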
|
5
|
Sinha S, Bhattacharya S, Roy S. Impact of second-order network motif on online social networks. THE JOURNAL OF SUPERCOMPUTING 2021; 78:5450-5478. [PMID: 34584343 PMCID: PMC8461152 DOI: 10.1007/s11227-021-04079-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 09/05/2021] [Indexed: 06/13/2023]
Abstract
The behaviour of individual users in an online social network is a major contributing factor in determining the outcome of multiple network phenomena. Group formation, growth of the network, information propagation, and rumour blocking are some of the many network behavioural traits that are influenced by the interaction patterns of the users in the network. Network motifs capture one such interaction pattern between users in online social networks (OSNs). For this work, four second-order (two-edged) network motifs have been considered, namely, message receiving pattern, message broadcasting pattern, message passing pattern, and reciprocal message pattern, to analyse user behaviour in online social networks. This work provides and utilizes a node interaction pattern-finding algorithm to identify the frequency of the aforementioned second-order network motifs in six real-life online social networks (Facebook, GPlus, GNU, Twitter, Enron Email, and Wiki-vote). The frequency of network motifs participated in by a node is considered for the relative ranking of all nodes in the online social networks. The highest-rated nodes are considered seeds for information propagation. The performance of using network motifs for ranking nodes as seeds for information propagation is validated using the statistical metrics Z-score, concentration, and significance profile and compared with the baseline ranking methods in-degree centrality, out-degree centrality, closeness centrality, and PageRank. The comparative study shows the performance of centrality measures to be similar to or better than that of second-order network motifs as seed nodes in information diffusion. The experimental results on finding frequencies and importance of different interaction patterns provide insights on the significance and representation of each such interaction pattern and how it varies from network to network.
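To make the four two-edge patterns concrete, here is a minimal sketch that counts them per node in a directed edge list. It is not the paper's pattern-finding algorithm: the reading of each pattern around a centre node (receiving as two in-edges, broadcasting as two out-edges, passing as one in-edge followed by one out-edge to a different node, reciprocal as a mutual pair) is an assumption based on the pattern names, and the function name and toy graph are hypothetical.

```python
from collections import defaultdict


def second_order_motif_counts(edges):
    """Per-node counts of four two-edge patterns in a directed graph.

    Assumed definitions, with n as the centre node:
      receiving:    u -> n <- w   (u != w)
      broadcasting: u <- n -> w   (u != w)
      passing:      u -> n -> w   (u != w)
      reciprocal:   u -> n -> u
    Self-loops are ignored and duplicate edges are collapsed.
    """
    succ = defaultdict(set)
    pred = defaultdict(set)
    for u, v in edges:
        if u != v:
            succ[u].add(v)
            pred[v].add(u)
    nodes = set(succ) | set(pred)
    counts = {}
    for n in nodes:
        k_in, k_out = len(pred[n]), len(succ[n])
        mutual = len(pred[n] & succ[n])
        counts[n] = {
            "receiving": k_in * (k_in - 1) // 2,
            "broadcasting": k_out * (k_out - 1) // 2,
            "passing": k_in * k_out - mutual,
            "reciprocal": mutual,
        }
    return counts


# Hypothetical toy network: a and c message b; b and d message each other.
edges = [("a", "b"), ("c", "b"), ("b", "d"), ("d", "b")]
counts = second_order_motif_counts(edges)

# Rank nodes by total motif participation, highest first.
ranking = sorted(counts, key=lambda n: sum(counts[n].values()), reverse=True)
print(counts["b"], ranking[0])
```

In this toy graph node b participates in the most patterns, so it would be the first seed candidate under a ranking by total motif frequency.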
Affiliation(s)
- Sankhamita Sinha
- Sankhamita Sinha, Meghnad Saha Institute of Technology, Kolkata, India
| | | | - Sarbani Roy
- Sankhamita Sinha, Meghnad Saha Institute of Technology, Kolkata, India
| |
|
6
|
Abstract
Alignments of discrete objects can be constructed in a very general setting as super-objects from which the constituent objects are recovered by means of projections. Here, we focus on contact maps, i.e. undirected graphs with an ordered set of vertices. These serve as natural discretizations of RNA and protein structures. In the general case, the alignment problem for vertex-ordered graphs is NP-complete. In the special case of RNA secondary structures, i.e. crossing-free matchings, however, the alignments have a recursive structure. The alignment problem then can be solved by a variant of the Sankoff algorithm in polynomial time. Moreover, the tree or forest alignments of RNA secondary structure can be understood as the alignments of ordered edge sets.
Affiliation(s)
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science and Interdisciplinary Centre for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Competence Centre for Scalable Data Services and Solutions Dresden-Leipzig, Leipzig Research Centre for Civilization Diseases, and Centre for Biotechnology and Biomedicine at Leipzig University, Universität Leipzig, Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103 Leipzig, Germany; Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, 1090 Wien, Austria; Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia; Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
| |
|
7
|
Wang YXR, Li L, Li JJ, Huang H. Network Modeling in Biology: Statistical Methods for Gene and Brain Networks. Stat Sci 2021; 36:89-108. [PMID: 34305304 PMCID: PMC8296984 DOI: 10.1214/20-sts792] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022]
Abstract
The rise of network data in many different domains has offered researchers new insight into the problem of modeling complex systems and propelled the development of numerous innovative statistical methodologies and computational tools. In this paper, we primarily focus on two types of biological networks, gene networks and brain networks, where statistical network modeling has found both fruitful and challenging applications. Unlike other network examples such as social networks where network edges can be directly observed, both gene and brain networks require careful estimation of edges using covariates as a first step. We provide a discussion on existing statistical and computational methods for edge estimation and subsequent statistical inference problems in these two types of biological networks.
Affiliation(s)
- Y X Rachel Wang
- School of Mathematics and Statistics, University of Sydney, Australia
| | - Lexin Li
- Department of Biostatistics and Epidemiology, School of Public Health, University of California, Berkeley
| | | | - Haiyan Huang
- Department of Statistics, University of California, Berkeley
| |
|
8
|
Zhu L, Zhang J, Zhang Y, Lang J, Xiang J, Bai X, Yan N, Tian G, Zhang H, Yang J. NAIGO: An Improved Method to Align PPI Networks Based on Gene Ontology and Graphlets. Front Bioeng Biotechnol 2020; 8:547. [PMID: 32637398 PMCID: PMC7318716 DOI: 10.3389/fbioe.2020.00547] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/06/2020] [Accepted: 05/06/2020] [Indexed: 11/24/2022]
Abstract
With the development of high-throughput technologies, more and more protein–protein interaction (PPI) networks are available, which creates a need for efficient computational tools for network alignment. Network alignment is widely used to predict functions of certain proteins, identify conserved network modules, and study the evolutionary relationship across species or biological entities. However, network alignment is an NP-complete problem, and previous algorithms are usually slow or less accurate in aligning big networks like human vs. yeast. In this study, we proposed a fast yet accurate algorithm called Network Alignment by Integrating Biological Process (NAIGO). Specifically, we first divided the networks into subnets, taking advantage of prior knowledge such as gene ontology. For each subnet pair, we then developed a novel method to align them by considering both protein orthologous information and their local structural information. After that, we expanded the obtained local network alignments in a greedy manner. Taking the aligned pairs as seeds, we formulated the global network alignment problem as an assignment problem based on a similarity matrix, which was solved by the Hungarian method. We applied NAIGO to align the human and Saccharomyces cerevisiae S288c PPI networks and compared the results with other popular methods like IsoRank, GRAAL, SANA, and NABEECO. As a result, our method outperformed the competitors by aligning more orthologous proteins or matched interactions. In addition, we found a few potential functional orthologous proteins such as RRM2B in human and DNA2 in S. cerevisiae S288c, which are related to DNA repair. We also identified a conserved subnet with six orthologous proteins EXO1, MSH3, MSH2, MLH1, MLH3, and MSH6, and six aligned interactions. All these proteins are associated with mismatch repair. Finally, we predicted a few proteins of S. cerevisiae S288c potentially involved in certain biological processes like autophagosome assembly.
Affiliation(s)
- Lijuan Zhu
- College of Mathematics and Computer Science, Zhejiang Normal University, Jinhua, China
| | - Ju Zhang
- Institute of Infectious Diseases, Beijing Ditan Hospital, Capital Medical University, and Beijing Key Laboratory of Emerging Infectious Diseases, Beijing, China
| | - Yi Zhang
- Department of Mathematics, Hebei University of Science & Technology, Shijiazhuang, China
| | | | - Ju Xiang
- Neuroscience Research Center & Department of Basic Medical Sciences, Changsha Medical University, Changsha, China; School of Computer Science and Engineering, Central South University, Changsha, China
| | - Xiaogang Bai
- Department of Mathematics, Hebei University of Science & Technology, Shijiazhuang, China
| | - Na Yan
- Geneis Beijing Co., Ltd., Beijing, China
| | - Geng Tian
- Geneis Beijing Co., Ltd., Beijing, China
| | - Huajun Zhang
- College of Mathematics and Computer Science, Zhejiang Normal University, Jinhua, China
| | | |
|
9
|
Milano M, Milenković T, Cannataro M, Guzzi PH. L-HetNetAligner: A novel algorithm for Local Alignment of Heterogeneous Biological Networks. Sci Rep 2020; 10:3901. [PMID: 32127586 PMCID: PMC7054427 DOI: 10.1038/s41598-020-60737-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/08/2018] [Accepted: 02/11/2020] [Indexed: 11/10/2022]
Abstract
Networks are largely used for modelling and analysing a wide range of biological data. As a consequence, many different research efforts have resulted in the introduction of a large number of algorithms for analysis and comparison of networks. Many of these algorithms can deal with networks with a single class of nodes and edges, also referred to as homogeneous networks. Recently, many different approaches tried to integrate into a single model the interplay of different molecules. A possible formalism to model such a scenario comes from node/edge coloured networks (also known as heterogeneous networks) implemented as node/ edge-coloured graphs. Therefore, the need for the introduction of algorithms able to compare heterogeneous networks arises. We here focus on the local comparison of heterogeneous networks, and we formulate it as a network alignment problem. To the best of our knowledge, the local alignment of heterogeneous networks has not been explored in the past. We here propose L-HetNetAligner a novel algorithm that receives as input two heterogeneous networks (node-coloured graphs) and builds a local alignment of them. We also implemented and tested our algorithm. Our results confirm that our method builds high-quality alignments. The following website *contains Supplementary File 1 material and the code.
Affiliation(s)
- Marianna Milano
- Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, 88040, Italy
| | - Tijana Milenković
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana, USA
| | - Mario Cannataro
- Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, 88040, Italy
- Data Analytics Research Center, University of Catanzaro, Catanzaro, Italy
| | - Pietro Hiram Guzzi
- Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, 88040, Italy.
- Data Analytics Research Center, University of Catanzaro, Catanzaro, Italy.
| |
|
10
|
Sarkar A, Ren Y, Elhesha R, Kahveci T. A New Algorithm for Counting Independent Motifs in Probabilistic Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:1049-1062. [PMID: 29994098 DOI: 10.1109/tcbb.2018.2821666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Biological networks provide great potential to understand how cells function. Motifs are topological patterns which are repeated frequently in a specific network. Network motifs are key structures through which biological networks operate. However, counting independent (i.e., non-overlapping) instances of a specific motif remains a computationally hard problem. The motif counting problem becomes even harder computationally for biological networks, as biological interactions are uncertain events. The main challenge behind this problem is that different embeddings of a given motif in a network can share edges. Such edges can create complex computational dependencies between different instances of the given motif when considering the uncertainty of those edges. In this paper, we develop a novel algorithm for counting independent instances of a specific motif topology in probabilistic biological networks. We present a novel mathematical model to capture the dependency between each embedding and all the other embeddings with which it overlaps. We prove the correctness of this model. We evaluate our model on real and synthetic networks with different probability and topology models, as well as a reasonable range of network sizes. Our results demonstrate that our method counts non-overlapping embeddings in practical time for a broad range of networks.
|
11
|
Cannataro M. Big Data Analysis in Bioinformatics. ENCYCLOPEDIA OF BIG DATA TECHNOLOGIES 2019:161-180. [DOI: 10.1007/978-3-319-77525-8_139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
|
12
|
Zhang Y, Wang L, Wang L. A Comprehensive Evaluation of Graph Kernels for Unattributed Graphs. ENTROPY 2018; 20:e20120984. [PMID: 33266707 PMCID: PMC7512582 DOI: 10.3390/e20120984] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/25/2018] [Revised: 12/16/2018] [Accepted: 12/16/2018] [Indexed: 11/16/2022]
Abstract
Graph kernels are of vital importance in the field of graph comparison and classification. However, how to compare and evaluate graph kernels and how to choose an optimal kernel for a practical classification problem remain open problems. In this paper, a comprehensive evaluation framework of graph kernels is proposed for unattributed graph classification. According to the kernel design methods, the whole graph kernel family can be categorized in five different dimensions, and then several representative graph kernels are chosen from these categories to perform the evaluation. With plenty of real-world and synthetic datasets, kernels are compared by many criteria such as classification accuracy, F1 score, runtime cost, scalability and applicability. Finally, quantitative conclusions are discussed based on the analyses of the extensive experimental results. The main contribution of this paper is that a comprehensive evaluation framework of graph kernels is proposed, which is significant for graph-classification applications and the future kernel research.
Affiliation(s)
- Yi Zhang
- State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System, Luoyang 471003, China
| | - Lulu Wang
- National Innovation Institute of Defense Technology, Academy of Military Science, Beijing 100071, China
| | - Liandong Wang
- State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System, Luoyang 471003, China
| |
|
13
|
Jing F, Zhang SW, Zhang S. Brief Survey of Biological Network Alignment and a Variant with Incorporation of Functional Annotations. Curr Bioinform 2018. [DOI: 10.2174/1574893612666171020103747] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
Abstract
Background: Biological network alignment has been widely studied in the context of protein-protein interaction (PPI) networks, metabolic networks and others in bioinformatics. The topological structure of networks and genomic sequence are generally used by existing methods for achieving this task. Objective and Method: Here we briefly survey the methods generally used for this task and introduce a variant with incorporation of functional annotations based on similarity in Gene Ontology (GO). Making full use of GO information is beneficial to provide insights into precise biological network alignment. Results and Conclusion: We analyze the effect of incorporation of GO information to network alignment. Finally, we make a brief summary and discuss future directions about this topic.
Affiliation(s)
- Fang Jing
- Key Laboratory of Information Fusion Technology of Ministry of Education, College of Automation, Northwestern Polytechnical University, Xi'an 710072, China
| | - Shao-Wu Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, College of Automation, Northwestern Polytechnical University, Xi'an 710072, China
| | - Shihua Zhang
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
| |
|
14
|
Gu S, Johnson J, Faisal FE, Milenković T. From homogeneous to heterogeneous network alignment via colored graphlets. Sci Rep 2018; 8:12524. [PMID: 30131590 PMCID: PMC6104050 DOI: 10.1038/s41598-018-30831-w] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 03/05/2018] [Accepted: 08/07/2018] [Indexed: 11/19/2022]
Abstract
Network alignment (NA) compares networks with the goal of finding a node mapping that uncovers highly similar (conserved) network regions. Existing NA methods are homogeneous, i.e., they can deal only with networks containing nodes and edges of one type. Due to increasing amounts of heterogeneous network data with nodes or edges of different types, we extend three recent state-of-the-art homogeneous NA methods, WAVE, MAGNA++, and SANA, to allow for heterogeneous NA for the first time. We introduce several algorithmic novelties. Namely, these existing methods compute homogeneous graphlet-based node similarities and then find high-scoring alignments with respect to these similarities, while simultaneously maximizing the amount of conserved edges. Instead, we extend homogeneous graphlets to their heterogeneous counterparts, which we then use to develop a new measure of heterogeneous node similarity. Also, we extend S3, a state-of-the-art measure of edge conservation for homogeneous NA, to its heterogeneous counterpart. Then, we find high-scoring alignments with respect to our heterogeneous node similarity and edge conservation measures. In evaluations on synthetic and real-world biological networks, our proposed heterogeneous NA methods lead to higher-quality alignments and better robustness to noise in the data than their homogeneous counterparts. The software and data from this work is available at https://nd.edu/~cone/colored_graphlets/.
Affiliation(s)
- Shawn Gu
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
| | - John Johnson
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Fazle E Faisal
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
- Eck Institute for Global Health and Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Tijana Milenković
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA.
- Eck Institute for Global Health and Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, 46556, USA.
| |
|
15
|
|
16
|
Henkel R, Hoehndorf R, Kacprowski T, Knüpfer C, Liebermeister W, Waltemath D. Notions of similarity for systems biology models. Brief Bioinform 2018; 19:77-88. [PMID: 27742665 PMCID: PMC5862271 DOI: 10.1093/bib/bbw090] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 05/27/2016] [Revised: 08/28/2016] [Indexed: 01/23/2023]
Abstract
Systems biology models are rapidly increasing in complexity, size and numbers. When building large models, researchers rely on software tools for the retrieval, comparison, combination and merging of models, as well as for version control. These tools need to be able to quantify the differences and similarities between computational models. However, depending on the specific application, the notion of 'similarity' may greatly vary. A general notion of model similarity, applicable to various types of models, is still missing. Here we survey existing methods for the comparison of models, introduce quantitative measures for model similarity, and discuss potential applications of combined similarity measures. To frame model comparison as a general problem, we describe a theoretical approach to defining and computing similarities based on a combination of different model aspects. The six aspects that we define as potentially relevant for similarity are underlying encoding, references to biological entities, quantitative behaviour, qualitative behaviour, mathematical equations and parameters and network structure. We argue that future similarity measures will benefit from combining these model aspects in flexible, problem-specific ways to mimic users' intuition about model similarity, and to support complex model searches in databases.
Affiliation(s)
| | | | | | | | | | - Dagmar Waltemath
- Department of Systems Biology and Bioinformatics, Institute of Computer Science, University of Rostock, Rostock, Germany
| |
|
17
|
Koutra D, Faloutsos C. Individual and Collective Graph Mining: Principles, Algorithms, and Applications. ACTA ACUST UNITED AC 2017. [DOI: 10.2200/s00796ed1v01y201708dmk014] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/03/2022]
|
18
|
Martin O, Krzywicki A, Zagorski M. Drivers of structural features in gene regulatory networks: From biophysical constraints to biological function. Phys Life Rev 2016; 17:124-58. [DOI: 10.1016/j.plrev.2016.06.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 12/21/2015] [Revised: 03/25/2016] [Accepted: 04/20/2016] [Indexed: 12/23/2022]
|
19
|
Elmsallati A, Clark C, Kalita J. Global Alignment of Protein-Protein Interaction Networks: A Survey. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2016; 13:689-705. [PMID: 26336140 DOI: 10.1109/tcbb.2015.2474391] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Indexed: 06/05/2023]
Abstract
In this paper, we survey algorithms that perform global alignment of networks or graphs. Global network alignment aligns two or more given networks to find the best mapping from nodes in one network to nodes in other networks. Since graphs are a common method of data representation, graph alignment has become important with many significant applications. Protein-protein interactions can be modeled as networks and aligning these networks of protein interactions has many applications in biological research. In this survey, we review algorithms for global pairwise alignment highlighting various proposed approaches, and classify them based on their methodology. Evaluation metrics that are used to measure the quality of the resulting alignments are also surveyed. We discuss and present a comparison between selected aligners on the same datasets and evaluate using the same evaluation metrics. Finally, a quick overview of the most popular databases of protein interaction networks is presented focusing on datasets that have been used recently.
|
20
|
Henriques R, Madeira SC. BicNET: Flexible module discovery in large-scale biological networks using biclustering. Algorithms Mol Biol 2016; 11:14. [PMID: 27213009 PMCID: PMC4875761 DOI: 10.1186/s13015-016-0074-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 12/11/2015] [Accepted: 04/22/2016] [Indexed: 02/08/2023]
Abstract
BACKGROUND Despite the recognized importance of module discovery in biological networks to enhance our understanding of complex biological systems, existing methods generally suffer from two major drawbacks. First, there is a focus on modules where biological entities are strongly connected, leading to the discovery of trivial/well-known modules and to the inaccurate exclusion of biological entities with subtler yet relevant roles. Second, there is a generalized intolerance towards different forms of noise, including uncertainty associated with less-studied biological entities (in the context of literature-driven networks) and experimental noise (in the context of data-driven networks). Although state-of-the-art biclustering algorithms are able to discover modules with varying coherency and robustness to noise, their application for the discovery of non-dense modules in biological networks has been poorly explored and it is further challenged by efficiency bottlenecks. METHODS This work proposes Biclustering NETworks (BicNET), a biclustering algorithm to discover non-trivial yet coherent modules in weighted biological networks with heightened efficiency. Three major contributions are provided. First, we motivate the relevance of discovering network modules given by constant, symmetric, plaid and order-preserving biclustering models. Second, we propose an algorithm to discover these modules and to robustly handle noisy and missing interactions. Finally, we provide new searches to tackle time and memory bottlenecks by effectively exploring the inherent structural sparsity of network data. RESULTS Results in synthetic network data confirm the soundness, efficiency and superiority of BicNET. The application of BicNET on protein interaction and gene interaction networks from yeast, E. coli and Human reveals new modules with heightened biological significance. 
CONCLUSIONS BicNET is, to our knowledge, the first method enabling the efficient unsupervised analysis of large-scale network data for the discovery of coherent modules with parameterizable homogeneity.
Affiliation(s)
- Rui Henriques
- INESC-ID and Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
- Sara C. Madeira
- INESC-ID and Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
|
21
|
Mersch DP. The social mirror for division of labor: what network topology and dynamics can teach us about organization of work in insect societies. Behav Ecol Sociobiol 2016. [DOI: 10.1007/s00265-016-2104-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 10/22/2022]
|
22
|
Faisal FE, Meng L, Crawford J, Milenković T. The post-genomic era of biological network alignment. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2015; 2015:3. [PMID: 28194172 PMCID: PMC5270500 DOI: 10.1186/s13637-015-0022-9] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 01/21/2015] [Accepted: 05/18/2015] [Indexed: 11/10/2022]
Abstract
Biological network alignment aims to find regions of topological and functional (dis)similarities between molecular networks of different species. Then, network alignment can guide the transfer of biological knowledge from well-studied model species to less well-studied species between conserved (aligned) network regions, thus complementing valuable insights that have already been provided by genomic sequence alignment. Here, we review computational challenges behind the network alignment problem, existing approaches for solving the problem, ways of evaluating their alignment quality, and the approaches' biomedical applications. We discuss recent innovative efforts of improving the existing view of network alignment. We conclude with open research questions in comparative biological network research that could further our understanding of principles of life, evolution, disease, and therapeutics.
Affiliation(s)
- Fazle E Faisal
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA
- Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, 46556 USA
- ECK Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556 USA
- Lei Meng
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA
- Joseph Crawford
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA
- Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, 46556 USA
- ECK Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556 USA
- Tijana Milenković
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA
- Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, 46556 USA
- ECK Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556 USA
|
23
|
Coexpression Network Analysis of miRNA-142 Overexpression in Neuronal Cells. BIOMED RESEARCH INTERNATIONAL 2015; 2015:921517. [PMID: 26539539 PMCID: PMC4619910 DOI: 10.1155/2015/921517] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/20/2014] [Revised: 01/26/2015] [Accepted: 01/28/2015] [Indexed: 01/14/2023]
Abstract
MicroRNAs are small noncoding RNA molecules, which are differentially expressed in diverse biological processes and are also involved in the regulation of multiple genes. A number of sites in the 3′ untranslated regions (UTRs) of different mRNAs allow complementary binding for a microRNA, leading to their posttranscriptional regulation. The miRNA-142 is one of the microRNAs overexpressed in neurons that is found to regulate SIRT1 and MAOA genes. Differential analysis of gene expression data, which is focused on identifying up- or downregulated genes, ignores many relationships between genes affected by miRNA-142 overexpression in a cell. Thus, we applied a correlation network model to identify the coexpressed genes and to study the impact of miRNA-142 overexpression on this network. Combining multiple sources of knowledge is useful to infer meaningful relationships in systems biology. We applied the coexpression model to data obtained from wild-type and miR-142-overexpressing neuronal cells and integrated miRNA seed sequence mapping information to identify genes greatly affected by this overexpression. Larger differences in the enriched networks revealed that nervous system development related genes such as TEAD2, PLEKHA6, and POGLUT1 were greatly impacted by miRNA-142 overexpression.
|
24
|
Crawford J, Sun Y, Milenković T. Fair evaluation of global network aligners. Algorithms Mol Biol 2015; 10:19. [PMID: 26060505 PMCID: PMC4460690 DOI: 10.1186/s13015-015-0050-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 08/30/2014] [Accepted: 05/10/2015] [Indexed: 11/10/2022]
Abstract
BACKGROUND Analogous to genomic sequence alignment, biological network alignment identifies conserved regions between networks of different species. Then, function can be transferred from well- to poorly-annotated species between aligned network regions. Network alignment typically encompasses two algorithmic components: node cost function (NCF), which measures similarities between nodes in different networks, and alignment strategy (AS), which uses these similarities to rapidly identify high-scoring alignments. Different methods use both different NCFs and different ASs. Thus, it is unclear whether the superiority of a method comes from its NCF, its AS, or both. We already showed on state-of-the-art methods, MI-GRAAL and IsoRankN, that combining NCF of one method and AS of another method can give a new superior method. Here, we evaluate MI-GRAAL against a newer approach, GHOST, by mixing-and-matching the methods' NCFs and ASs to potentially further improve alignment quality. While doing so, we approach important questions that have not been asked systematically thus far. First, we ask how much of the NCF information should come from protein sequence data compared to network topology data. Existing methods determine this parameter more or less arbitrarily, which could affect alignment quality. Second, when topological information is used in NCF, we ask how large the size of the neighborhoods of the compared nodes should be. Existing methods assume that the larger the neighborhood size, the better. RESULTS Our findings are as follows. MI-GRAAL's NCF is superior to GHOST's NCF, while the performance of the methods' ASs is data-dependent. Thus, for data on which GHOST's AS is superior to MI-GRAAL's AS, the combination of MI-GRAAL's NCF and GHOST's AS represents a new superior method. Also, the amount of sequence information used within NCF does not affect alignment quality, while the inclusion of topological information is crucial for producing good alignments. Finally, larger neighborhood sizes are preferred, but often, it is the second largest size that is superior. Using this size instead of the largest one would decrease computational complexity. CONCLUSION Taken together, our results represent general recommendations for a fair evaluation of network alignment methods and in particular of two-stage NCF-AS approaches.
|
25
|
Sun Y, Crawford J, Tang J, Milenković T. Simultaneous Optimization of both Node and Edge Conservation in Network Alignment via WAVE. LECTURE NOTES IN COMPUTER SCIENCE 2015. [DOI: 10.1007/978-3-662-48221-6_2] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 12/12/2022]
|
26
|
Faisal FE, Zhao H, Milenković T. Global Network Alignment in the Context of Aging. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:40-52. [PMID: 26357077 DOI: 10.1109/tcbb.2014.2326862] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 06/05/2023]
Abstract
Analogous to sequence alignment, network alignment (NA) can be used to transfer biological knowledge across species between conserved network regions. NA faces two algorithmic challenges: 1) Which cost function to use to capture "similarities" between nodes in different networks? 2) Which alignment strategy to use to rapidly identify "high-scoring" alignments from all possible alignments? We "break down" existing state-of-the-art methods that use both different cost functions and different alignment strategies to evaluate each combination of their cost functions and alignment strategies. We find that a combination of the cost function of one method and the alignment strategy of another method beats the existing methods. Hence, we propose this combination as a novel superior NA method. Then, since human aging is hard to study experimentally due to long lifespan, we use NA to transfer aging-related knowledge from well annotated model species to poorly annotated human. By doing so, we produce novel human aging-related knowledge, which complements currently available knowledge about aging that has been obtained mainly by sequence alignment. We demonstrate significant similarity between topological and functional properties of our novel predictions and those of known aging-related genes. We are the first to use NA to learn more about aging.
|
27
|
He J, Wang C, Qiu K, Zhong W. An novel frequent probability pattern mining algorithm based on circuit simulation method in uncertain biological networks. BMC SYSTEMS BIOLOGY 2014; 8 Suppl 3:S6. [PMID: 25350277 PMCID: PMC4243085 DOI: 10.1186/1752-0509-8-s3-s6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022]
Abstract
BACKGROUND Motif mining has always been a hot research topic in bioinformatics. Most current research on biological networks focuses on exact motif mining. However, because of inevitable experimental error and noisy data, biological network data represented as a probability model could better reflect authenticity and biological significance; it is therefore more biologically meaningful to discover probability motifs in uncertain biological networks. One of the key steps in probability motif mining is frequent pattern discovery, which is usually based on the possible world model and has a relatively high computational complexity. METHODS In this paper, we present a novel method for detecting frequent probability patterns based on circuit simulation in uncertain biological networks. First, a partition-based efficient search is applied to non-tree-like subgraph mining, where the probability of occurrence in random networks is small. Then, a probability isomorphism algorithm based on circuit simulation is proposed. It combines the analysis of circuit topology structure with related physical properties of voltage in order to evaluate the probability isomorphism between probability subgraphs. The circuit-simulation-based approach avoids using the traditional possible world model. Finally, based on the probability subgraph isomorphism algorithm, a two-step hierarchical clustering method is used to cluster subgraphs and discover frequent probability patterns from the clusters. RESULTS Experimental results on data sets of Protein-Protein Interaction (PPI) networks and the transcriptional regulatory networks of E. coli and S. cerevisiae show that the proposed method can efficiently discover frequent probability subgraphs. The subgraphs discovered in our study contain all probability motifs reported in experiments published in other related papers.
CONCLUSIONS The circuit-simulation-based algorithm for evaluating probability graph isomorphism excludes most subgraphs that are not probability isomorphic and reduces the search space of probability isomorphic subgraphs using the mismatch values in the node voltage set. It is an innovative way to find frequent probability patterns, and it can be efficiently applied to probability motif discovery problems in further studies.
Affiliation(s)
- Jieyue He
- School of Computer Science and Engineering, Key Lab of Computer Network & Information Integration, MOE, Southeast University, Nanjing, 210018, China
- Chunyan Wang
- School of Computer Science and Engineering, Key Lab of Computer Network & Information Integration, MOE, Southeast University, Nanjing, 210018, China
- Kunpu Qiu
- School of Computer Science and Engineering, Key Lab of Computer Network & Information Integration, MOE, Southeast University, Nanjing, 210018, China
- Wei Zhong
- Division of Mathematics and Computer Science, University of South Carolina Upstate, 800 University Way, Spartanburg, SC 29303, USA
|
28
|
|
29
|
Clark C, Kalita J. A comparison of algorithms for the pairwise alignment of biological networks. Bioinformatics 2014; 30:2351-9. [PMID: 24794929 DOI: 10.1093/bioinformatics/btu307] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Indexed: 01/14/2023]
Abstract
MOTIVATION As biological inquiry produces ever more network data, such as protein-protein interaction networks, gene regulatory networks and metabolic networks, many algorithms have been proposed for the purpose of pairwise network alignment: finding a mapping from the nodes of one network to the nodes of another in such a way that the mapped nodes can be considered to correspond with respect to both their place in the network topology and their biological attributes. This technique is helpful in identifying previously undiscovered homologies between proteins of different species and revealing functionally similar subnetworks. In the past few years, a wealth of different aligners has been published, but few of them have been compared with one another, and no comprehensive review of these algorithms has yet appeared. RESULTS We present the problem of biological network alignment, provide a guide to existing alignment algorithms and comprehensively benchmark existing algorithms on both synthetic and real-world biological data, finding dramatic differences between existing algorithms in the quality of the alignments they produce. Additionally, we find that many of these tools are inconvenient to use in practice, and there remains a need for easy-to-use cross-platform tools for performing network alignment.
Affiliation(s)
- Connor Clark
- Department of Computer Science, University of Colorado Colorado Springs, Colorado Springs, CO 80918, USA
- Jugal Kalita
- Department of Computer Science, University of Colorado Colorado Springs, Colorado Springs, CO 80918, USA
|
30
|
Hsieh MF, Sze SH. Finding alignments of conserved graphlets in protein interaction networks. J Comput Biol 2014; 21:234-46. [PMID: 24506222 DOI: 10.1089/cmb.2013.0130] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 12/25/2022]
Abstract
As the amount of data describing biological interactions increases, it becomes possible to analyze the complex interactions of genes and proteins across multiple networks at the genome scale. While the most popular techniques to study conservation of patterns in biological networks are through the use of network alignment techniques or the identification of network motifs, we show that it is possible to exhaustively enumerate all graphlet alignments, which consist of at least two vertex-disjoint subgraphs that share a common topology and contain homologous proteins at the same position in the topology. We compare the performance of our algorithm to network alignment algorithms and show that our algorithm is able to cover significantly more proteins in the given networks while maintaining comparable or higher sensitivity and specificity with respect to functional enrichment.
Affiliation(s)
- Mu-Fen Hsieh
- Department of Computer Science and Engineering, Texas A&M University, College Station, Texas
|
31
|
Panni S, Rombo SE. Searching for repetitions in biological networks: methods, resources and tools. Brief Bioinform 2013; 16:118-36. [PMID: 24300112 DOI: 10.1093/bib/bbt084] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 01/11/2023]
Abstract
We present here a compact overview of the data, models and methods proposed for the analysis of biological networks based on the search for significant repetitions. In particular, we concentrate on three problems widely studied in the literature: 'network alignment', 'network querying' and 'network motif extraction'. We provide (i) details of the experimental techniques used to obtain the main types of interaction data, (ii) descriptions of the models and approaches introduced to solve such problems and (iii) pointers to both the available databases and software tools. The intent is to lay out a useful roadmap for identifying suitable strategies to analyse cellular data, possibly based on the joint use of different interaction data types or analysis techniques.
|
32
|
Smith SA, Brown JW, Hinchliff CE. Analyzing and synthesizing phylogenies using tree alignment graphs. PLoS Comput Biol 2013; 9:e1003223. [PMID: 24086118 PMCID: PMC3784503 DOI: 10.1371/journal.pcbi.1003223] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 03/28/2013] [Accepted: 07/31/2013] [Indexed: 11/17/2022]
Abstract
Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe.
Affiliation(s)
- Stephen A Smith
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
|
33
|
Winterbach W, Van Mieghem P, Reinders M, Wang H, de Ridder D. Topology of molecular interaction networks. BMC SYSTEMS BIOLOGY 2013; 7:90. [PMID: 24041013 PMCID: PMC4231395 DOI: 10.1186/1752-0509-7-90] [Citation(s) in RCA: 75] [Impact Index Per Article: 6.3] [Received: 02/19/2013] [Accepted: 08/01/2013] [Indexed: 12/23/2022]
Abstract
Molecular interactions are often represented as network models which have become the common language of many areas of biology. Graphs serve as convenient mathematical representations of network models and have themselves become objects of study. Their topology has been intensively researched over the last decade after evidence was found that they share underlying design principles with many other types of networks. Initial studies suggested that molecular interaction network topology is related to biological function and evolution. However, further whole-network analyses did not lead to a unified view on what this relation may look like, with conclusions highly dependent on the type of molecular interactions considered and the metrics used to study them. It is unclear whether global network topology drives function, as suggested by some researchers, or whether it is simply a byproduct of evolution or even an artefact of representing complex molecular interaction networks as graphs. Nevertheless, network biology has progressed significantly over the last years. We review the literature, focusing on two major developments. First, realizing that molecular interaction networks can be naturally decomposed into subsystems (such as modules and pathways), topology is increasingly studied locally rather than globally. Second, there is a move from a descriptive approach to a predictive one: rather than correlating biological network topology to generic properties such as robustness, it is used to predict specific functions or phenotypes. Taken together, this change in focus from globally descriptive to locally predictive points to new avenues of research. In particular, multi-scale approaches are developments promising to drive the study of molecular interaction networks further.
Affiliation(s)
- Wynand Winterbach
- Network Architectures and Services, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
- Delft Bioinformatics Lab, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
- Piet Van Mieghem
- Network Architectures and Services, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
- Marcel Reinders
- Delft Bioinformatics Lab, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
- Netherlands Bioinformatics Center, 6500 HB Nijmegen, The Netherlands
- Kluyver Centre for Genomics of Industrial Fermentation, 2600 GA Delft, The Netherlands
- Huijuan Wang
- Network Architectures and Services, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
- Dick de Ridder
- Delft Bioinformatics Lab, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
- Netherlands Bioinformatics Center, 6500 HB Nijmegen, The Netherlands
- Kluyver Centre for Genomics of Industrial Fermentation, 2600 GA Delft, The Netherlands
|
34
|
Abstract
Motivation: BioPAX is a standard language for representing complex cellular processes, including metabolic networks, signal transduction and gene regulation. Owing to the inherent complexity of a BioPAX model, searching for a specific type of subnetwork can be non-trivial and difficult. Results: We developed an open source and extensible framework for defining and searching graph patterns in BioPAX models. We demonstrate its use with a sample pattern that captures directed signaling relations between proteins. We provide search results for the pattern obtained from the Pathway Commons database and compare these results with the current data in signaling databases SPIKE and SignaLink. Results show that a pattern search in public pathway data can identify a substantial amount of signaling relations that do not exist in signaling databases. Availability: BioPAX-pattern software was developed in Java. Source code and documentation are freely available at http://code.google.com/p/biopax-pattern under Lesser GNU Public License. Contact: patternsearch@cbio.mskcc.org Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Özgün Babur
- Computational Biology Center, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA, Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY 10065, USA and Banting and Best Department of Medical Research, The Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
|
35
|
|
36
|
Abstract
Large amounts of protein-protein interaction (PPI) data are available. The human PPI network currently contains over 56 000 interactions between 11 100 proteins. It has been demonstrated that the structure of this network is not random and that the same wiring patterns in it underlie the same biological processes and diseases. In this paper, we ask if there exists a subnetwork of the human PPI network such that its topology is the key to disease formation and hence should be the primary object of therapeutic intervention. We demonstrate that such a subnetwork exists and can be obtained purely computationally. In particular, by successively pruning the entire human PPI network, we are left with a "core" subnetwork that is not only topologically and functionally homogeneous, but is also enriched in disease genes, drug targets, and it contains genes that are known to drive disease formation. We call this subnetwork the Core Diseasome. Furthermore, we show that the topology of the Core Diseasome is unique in the human PPI network suggesting that it may be the wiring of this network that governs the mutagenesis that leads to disease. Explaining the mechanisms behind this phenomenon and exploiting them remains a challenge.
Affiliation(s)
- Vuk Janjić
- Department of Computing, Imperial College London, London, SW7 2AZ, UK.
37
Ma X, Gao L. Biological network analysis: insights into structure and functions. Brief Funct Genomics 2012; 11:434-442. [PMID: 23184677 DOI: 10.1093/bfgp/els045] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023] Open
Abstract
In the past two decades, great effort has been devoted to extracting the dependence and interplay between structure and function in biological networks because they have strong relevance to biological processes. In this article, we reviewed recent developments in biological network analysis. In detail, we first reviewed the interactome topological properties of biological networks and the methods for structure and functional patterns.
Affiliation(s)
- Xiaoke Ma
- School of Computer Science and Technology, Xidian University, No. 2 South TaiBai Road, Xi'an, Shaanxi 710071, P.R. China
38
Abstract
Hard combinatorial optimization problems deal with the search for the minimum cost solutions (ground states) of discrete systems under strong constraints. A transformation of state variables may enhance computational tractability. It has been argued that these state encodings are to be chosen invertible to retain the original size of the state space. Here we show how redundant non-invertible encodings enhance optimization by enriching the density of low-energy states. In addition, smooth landscapes may be established on encoded state spaces to guide local search dynamics towards the ground state.
39
40
Kotera M, Tokimatsu T, Kanehisa M, Goto S. MUCHA: multiple chemical alignment algorithm to identify building block substructures of orphan secondary metabolites. BMC Bioinformatics 2011; 12 Suppl 14:S1. [PMID: 22373367 PMCID: PMC3287465 DOI: 10.1186/1471-2105-12-s14-s1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open
Abstract
Background In contrast to the increasing number of successful genome projects, there still remain many orphan metabolites whose synthesis processes are unknown. Metabolites, including these orphan metabolites, can be classified into groups that share the same core substructures, originating from the same biosynthetic pathways. It is known that many metabolites are synthesized by adding building blocks to existing metabolites. Therefore, it is proposed that, for any given group of metabolites, finding the core substructure and the branched substructures can help predict their biosynthetic pathway. There have already been many reports on multiple graph alignment techniques for finding conserved chemical substructures in relatively small molecules. However, they are optimized for ligand binding and are not suitable for metabolomic studies. Results We developed an efficient multiple graph alignment method named MUCHA (Multiple Chemical Alignment), specialized for finding metabolic building blocks. This method showed strength in finding metabolic building blocks while preserving the relative positions among the substructures, which is not achieved by simply applying frequent graph mining techniques. Compared with combined pairwise alignments, the proposed MUCHA method generally reduced computational costs while improving the quality of the alignment. Conclusions MUCHA successfully finds building blocks of secondary metabolites, and has the potential to complement other existing methods to reconstruct metabolic networks using reaction patterns.
Affiliation(s)
- Masaaki Kotera
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
41
Mernberger M, Klebe G, Hüllermeier E. SEGA: semiglobal graph alignment for structure-based protein comparison. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011; 8:1330-1343. [PMID: 21339532 DOI: 10.1109/tcbb.2011.35] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]
Abstract
Comparative analysis is a topic of utmost importance in structural bioinformatics. Recently, a structural counterpart to sequence alignment, called multiple graph alignment, was introduced as a tool for the comparison of protein structures in general and protein binding sites in particular. Using approximate graph matching techniques, this method enables the identification of approximately conserved patterns in functionally related structures. In this paper, we introduce a new method for computing graph alignments motivated by two problems of the original approach, a conceptual and a computational one. First, the existing approach is of limited usefulness for structures that only share common substructures. Second, the goal to find a globally optimal alignment leads to an optimization problem that is computationally intractable. To overcome these disadvantages, we propose a semiglobal approach to graph alignment in analogy to semiglobal sequence alignment that combines the advantages of local and global graph matching.
Affiliation(s)
- Marco Mernberger
- Department of Mathematics and Computer Science, Philipps-Universität Marburg, Hans-Meerwein-Straße 6, Marburg D-35032, Germany.
42
Pržulj N. Protein-protein interactions: making sense of networks via graph-theoretic modeling. Bioessays 2011; 33:115-23. [PMID: 21188720 DOI: 10.1002/bies.201000044] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
The emerging area of network biology is seeking to provide insights into organizational principles of life. However, despite significant collaborative efforts, there is still typically a weak link between biological and computational scientists and a lack of understanding of the research issues across the disciplines. This results in the use of simple computational techniques of limited potential that are incapable of explaining these complex data. Hence, the danger is that the community might begin to view the topological properties of network data as mere statistics, rather than rich sources of biological information. A further danger is that such views might result in the imposition of scientific doctrines, such as scale-free-centric (on the modeling side) and genome-centric (on the biological side) opinions onto this area. Here, we take a graph-theoretic perspective on protein-protein interaction networks and present a high-level overview of the area, commenting on possible challenges ahead.
Affiliation(s)
- Nataša Pržulj
- Department of Computing, Imperial College London, London, UK.
43
Ay F, Kellis M, Kahveci T. SubMAP: aligning metabolic pathways with subnetwork mappings. J Comput Biol 2011; 18:219-35. [PMID: 21385030 DOI: 10.1089/cmb.2010.0280] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
We consider the problem of aligning two metabolic pathways. Unlike traditional approaches, we do not restrict the alignment to one-to-one mappings between the molecules (nodes) of the input pathways (graphs). We follow the observation that, in nature, different organisms can perform the same or similar functions through different sets of reactions and molecules. The number and the topology of the molecules in these alternative sets often vary from one organism to another. With the motivation that an accurate biological alignment should be able to reveal these functionally similar molecule sets across different species, we develop an algorithm that first measures the similarities between different nodes using a mixture of homology and topological similarity. We combine the two metrics by employing an eigenvalue formulation. We then search for an alignment between the two input pathways that maximizes a similarity score, evaluated as the sum of the similarities of the mapped subnetworks of size at most a given integer k, and also does not contain any conflicting mappings. Here we prove that this maximization is NP-hard by a reduction from the maximum weight independent set (MWIS) problem. We then convert our problem to an instance of MWIS and use an efficient vertex-selection strategy to extract the mappings that constitute our alignment. We name our algorithm SubMAP (Subnetwork Mappings in Alignment of Pathways). We evaluate its accuracy and performance on real datasets. Our empirical results demonstrate that SubMAP can identify biologically relevant mappings that are missed by traditional alignment methods. Furthermore, we observe that SubMAP is scalable for metabolic pathways of arbitrary topology, including searching for a query pathway of size 70 against the complete KEGG database of 1,842 pathways. Implementation in C++ is available at http://bioinformatics.cise.ufl.edu/SubMAP.html.
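The abstract above does not spell out its vertex-selection strategy, so the following is only a minimal sketch of a generic greedy heuristic for maximum weight independent set: each candidate mapping is a vertex with a similarity weight, conflicting mappings are joined by an edge, and the vertex with the best weight-to-conflict ratio is picked repeatedly. The names `greedy_mwis`, `weights` and `conflicts`, and the toy values, are assumptions for illustration and are not taken from SubMAP.

```python
def greedy_mwis(weights, conflicts):
    """Greedy heuristic for maximum weight independent set.

    weights:   dict mapping a candidate mapping (vertex) to its score
    conflicts: iterable of (a, b) pairs that cannot both be selected
    Returns the selected vertices in the order they were picked.
    """
    adj = {v: set() for v in weights}
    for a, b in conflicts:
        adj[a].add(b)
        adj[b].add(a)
    remaining = set(weights)
    chosen = []
    while remaining:
        # Prefer heavy vertices that rule out few other remaining candidates.
        best = max(
            sorted(remaining),
            key=lambda v: weights[v] / (len(adj[v] & remaining) + 1),
        )
        chosen.append(best)
        remaining -= adj[best] | {best}
    return chosen


# Toy instance: mapping m1 conflicts with m2 and m3; m4 is conflict-free.
toy_weights = {"m1": 3.0, "m2": 2.0, "m3": 2.5, "m4": 1.0}
toy_conflicts = [("m1", "m2"), ("m1", "m3")]
selection = greedy_mwis(toy_weights, toy_conflicts)
```

A greedy pass like this gives no optimality guarantee, which is consistent with the problem being NP-hard; it only shows how a conflict-free set of mappings can be extracted once the alignment is cast as MWIS.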
Affiliation(s)
- Ferhat Ay
- Computer and Information Science and Engineering, University of Florida, Gainesville, Florida, USA.
44
Kuchaiev O, Stevanović A, Hayes W, Pržulj N. GraphCrunch 2: Software tool for network modeling, alignment and clustering. BMC Bioinformatics 2011; 12:24. [PMID: 21244715 PMCID: PMC3036622 DOI: 10.1186/1471-2105-12-24] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2010] [Accepted: 01/19/2011] [Indexed: 02/02/2023] Open
Abstract
Background Recent advancements in experimental biotechnology have produced large amounts of protein-protein interaction (PPI) data. The topology of PPI networks is believed to have a strong link to their function. Hence, the abundance of PPI data for many organisms stimulates the development of computational techniques for the modeling, comparison, alignment, and clustering of networks. In addition, finding representative models for PPI networks will improve our understanding of the cell just as a model of gravity has helped us understand planetary motion. To decide if a model is representative, we need quantitative comparisons of model networks to real ones. However, exact network comparison is computationally intractable and therefore several heuristics have been used instead. Some of these heuristics are easily computable "network properties," such as the degree distribution, or the clustering coefficient. An important special case of network comparison is the network alignment problem. Analogous to sequence alignment, this problem asks to find the "best" mapping between regions in two networks. It is expected that network alignment might have as strong an impact on our understanding of biology as sequence alignment has had. Topology-based clustering of nodes in PPI networks is another example of an important network analysis problem that can uncover relationships between interaction patterns and phenotype. Results We introduce the GraphCrunch 2 software tool, which addresses these problems. It is a significant extension of GraphCrunch which implements the most popular random network models and compares them with the data networks with respect to many network properties. Also, GraphCrunch 2 implements the GRAph ALigner algorithm ("GRAAL") for purely topological network alignment. GRAAL can align any pair of networks and exposes large, dense, contiguous regions of topological and functional similarities far larger than any other existing tool. 
Finally, GraphCrunch 2 implements an algorithm for clustering nodes within a network based solely on their topological similarities. Using GraphCrunch 2, we demonstrate that eukaryotic and viral PPI networks may belong to different graph model families and show that topology-based clustering can reveal important functional similarities between proteins within yeast and human PPI networks. Conclusions GraphCrunch 2 is a software tool that implements the latest research on biological network analysis. It parallelizes computationally intensive tasks to fully utilize the potential of modern multi-core CPUs. It is open-source and freely available for research use. It runs under the Windows and Linux platforms.
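The two "easily computable network properties" named in this abstract, the degree distribution and the clustering coefficient, can be sketched in a few lines. This is an illustrative stand-alone example and not GraphCrunch 2 code; the helper names and the toy edge list are assumptions.

```python
from collections import Counter
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list (self-loops ignored)."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_distribution(adj):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())


def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Toy network: a triangle (0, 1, 2) with one pendant node (3).
toy_adj = build_adjacency([(0, 1), (1, 2), (0, 2), (2, 3)])
toy_degrees = degree_distribution(toy_adj)
toy_clustering = average_clustering(toy_adj)
```

Comparing such summary statistics between a data network and networks drawn from a random model is the heuristic style of comparison the abstract describes, as opposed to exact network comparison.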
Affiliation(s)
- Oleksii Kuchaiev
- Department of Computer Science, University of California, Irvine, CA, USA
45
Emmert-Streib F, Glazko GV. Network biology: a direct approach to study biological function. WILEY INTERDISCIPLINARY REVIEWS-SYSTEMS BIOLOGY AND MEDICINE 2010; 3:379-91. [PMID: 21197659 DOI: 10.1002/wsbm.134] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
In this paper we discuss the dualism of gene networks and their role in systems biology. We argue that gene networks (1) can serve as a conceptual framework, forming a fundamental level of a phenomenological description, and (2) are a means to represent and analyze data. The latter point does not only allow a systems analysis but is even amenable for a direct approach to study biological function. Here we focus on the clarity of our main arguments and conceptual meaning of gene networks, rather than the causal inference of gene networks from data. WIREs Syst Biol Med 2011 3 379-391 DOI: 10.1002/wsbm.134
Affiliation(s)
- Frank Emmert-Streib
- Computational Biology and Machine Learning, Center for Cancer Research and Cell Biology, School of Biomedical Sciences, Queen's University Belfast, Belfast, UK.
46
Kelly WP, Stumpf MPH. Trees on networks: resolving statistical patterns of phylogenetic similarities among interacting proteins. BMC Bioinformatics 2010; 11:470. [PMID: 20854660 PMCID: PMC2955699 DOI: 10.1186/1471-2105-11-470] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2010] [Accepted: 09/20/2010] [Indexed: 11/28/2022] Open
Abstract
BACKGROUND Phylogenies capture the evolutionary ancestry linking extant species. Correlations and similarities among a set of species are mediated by and need to be understood in terms of the phylogenic tree. In a similar way it has been argued that biological networks also induce correlations among sets of interacting genes or their protein products. RESULTS We develop suitable statistical resampling schemes that can incorporate these two potential sources of correlation into a single inferential framework. To illustrate our approach we apply it to protein interaction data in yeast and investigate whether the phylogenetic trees of interacting proteins in a panel of yeast species are more similar than would be expected by chance. CONCLUSIONS While we find only negligible evidence for such increased levels of similarities, our statistical approach allows us to resolve the previously reported contradictory results on the levels of co-evolution induced by protein-protein interactions. We conclude with a discussion as to how we may employ the statistical framework developed here in further functional and evolutionary analyses of biological networks and systems.
Affiliation(s)
- William P Kelly
- Centre for Bioinformatics, Imperial College, London, UK
- Centre for Integrative Systems Biology at Imperial College (CISBIC), London, UK
- Michael PH Stumpf
- Centre for Bioinformatics, Imperial College, London, UK
- Centre for Integrative Systems Biology at Imperial College (CISBIC), London, UK
- Institute of Mathematical Sciences, Imperial College, London, UK
47
Kuchaiev O, Milenković T, Memišević V, Hayes W, Pržulj N. Topological network alignment uncovers biological function and phylogeny. J R Soc Interface 2010; 7:1341-54. [PMID: 20236959 PMCID: PMC2894889 DOI: 10.1098/rsif.2010.0063] [Citation(s) in RCA: 150] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2010] [Accepted: 02/25/2010] [Indexed: 12/22/2022] Open
Abstract
Sequence comparison and alignment has had an enormous impact on our understanding of evolution, biology and disease. Comparison and alignment of biological networks will probably have a similar impact. Existing network alignments use information external to the networks, such as sequence, because no good algorithm for purely topological alignment has yet been devised. In this paper, we present a novel algorithm based solely on network topology that can be used to align any two networks. We apply it to biological networks to produce by far the most complete topological alignments of biological networks to date. We demonstrate that both species phylogeny and detailed biological function of individual proteins can be extracted from our alignments. Topology-based alignments have the potential to provide a completely new, independent source of phylogenetic information. Our alignment of the protein-protein interaction networks of two very different species, yeast and human, indicates that even distant species share a surprising amount of network topology, suggesting broad similarities in internal cellular wiring across all life on Earth.
Affiliation(s)
- Oleksii Kuchaiev
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Tijana Milenković
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Vesna Memišević
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Wayne Hayes
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Department of Mathematics, Imperial College, London SW7 2AZ, UK
- Nataša Pržulj
- Department of Computing, Imperial College, London SW7 2AZ, UK
48
Ginoza R, Mugler A. Network motifs come in sets: correlations in the randomization process. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2010; 82:011921. [PMID: 20866662 DOI: 10.1103/physreve.82.011921] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/25/2010] [Revised: 05/05/2010] [Indexed: 05/29/2023]
Abstract
The identification of motifs--subgraphs that appear significantly more often in a particular network than in an ensemble of randomized networks--has become a ubiquitous method for uncovering potentially important subunits within networks drawn from a wide variety of fields. We find that the most common algorithms used to generate the ensemble from the real network change subgraph counts in a highly correlated manner, such that one subgraph's status as a motif may not be independent from the statuses of the other subgraphs. We demonstrate this effect for the problem of three- and four-node motif identification in the transcriptional regulatory networks of E. coli and S. cerevisiae in which randomized networks are generated via an edge-swapping algorithm. We find strong correlations among subgraph counts; for three-node subgraphs these correlations are easily interpreted, and we present an information-theoretic tool that may be used to identify correlations among subgraphs of any size. Our results suggest that single-feature statistics such as Z scores that implicitly assume independence among subgraph counts constitute an insufficient summary of the network.
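The standard procedure this abstract examines can be sketched as follows: count a subgraph in the real network, generate an ensemble by degree-preserving edge swaps, and summarize the comparison as a Z score. This is a minimal illustration under stated assumptions, not the authors' code: the feed-forward loop is chosen as the example three-node subgraph, the input is assumed to be a directed edge list without duplicates or self-loops, and the function names and default parameters are hypothetical.

```python
import random
from statistics import mean, pstdev


def count_ffl(edges):
    """Count feed-forward loops (a->b, b->c, a->c) in a directed graph."""
    succ = {}
    for u, v in set(edges):
        succ.setdefault(u, set()).add(v)
    count = 0
    for a, outs in succ.items():
        for b in outs:
            if b == a:
                continue
            for c in succ.get(b, ()):
                if c != a and c != b and c in outs:
                    count += 1
    return count


def edge_swap(edges, n_swaps, rng):
    """Randomize by swapping (a->b, c->d) into (a->d, c->b).

    Every node keeps its in-degree and out-degree; swaps that would
    create a self-loop or a duplicate edge are rejected.
    """
    edges = list(edges)
    present = set(edges)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b:
            continue
        if (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def motif_z_score(edges, n_random=200, seed=0):
    """Return (observed count, ensemble mean, Z score) for the feed-forward loop."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    counts = [
        count_ffl(edge_swap(edges, 10 * len(edges), rng)) for _ in range(n_random)
    ]
    mu, sigma = mean(counts), pstdev(counts)
    z = (observed - mu) / sigma if sigma > 0 else float("nan")
    return observed, mu, z
```

Because each swap rewires two edges at once, the counts of different subgraphs in the ensemble move together, which is the source of the correlations the abstract reports; a per-subgraph Z score such as the one returned here treats each count as if it were independent.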
Affiliation(s)
- Reid Ginoza
- Division of Natural Sciences and Mathematics, Bennington College, Bennington, Vermont 05201, USA.
49
Abstract
Important biological information is encoded in the topology of biological networks. Comparative analyses of biological networks are proving to be valuable, as they can lead to transfer of knowledge between species and give deeper insights into biological function, disease, and evolution. We introduce a new method that uses the Hungarian algorithm to produce optimal global alignment between two networks using any cost function. We design a cost function based solely on network topology and use it in our network alignment. Our method can be applied to any two networks, not just biological ones, since it is based only on network topology. We use our new method to align protein-protein interaction networks of two eukaryotic species and demonstrate that our alignment exposes large and topologically complex regions of network similarity. At the same time, our alignment is biologically valid, since many of the aligned protein pairs perform the same biological function. From the alignment, we predict function of yet unannotated proteins, many of which we validate in the literature. Also, we apply our method to find topological similarities between metabolic networks of different species and build phylogenetic trees based on our network alignment score. The phylogenetic trees obtained in this way bear a striking resemblance to the ones obtained by sequence alignments. Our method detects topologically similar regions in large networks that are statistically significant. It does this independent of protein sequence or any other information external to network topology.
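To make the assignment step concrete, here is a minimal sketch of the Hungarian algorithm applied to a node-to-node cost matrix. The abstract does not give its topological cost function, so the sketch substitutes a deliberately crude placeholder, the absolute difference in node degree; `degree_cost` and the toy graphs are assumptions for illustration only.

```python
def hungarian(cost):
    """Minimum-cost assignment of rows to columns (needs rows <= columns).

    Returns (assignment, total) where assignment[i] is the column chosen
    for row i. Runs in O(rows^2 * columns) using row and column potentials.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j]: row matched to column j (1-based, 0 = free)
    way = [0] * (m + 1)   # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # flip the augmenting path back to column 0
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [None] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return assignment, total


def degree_cost(adj_a, adj_b):
    """Placeholder topological cost: absolute difference in node degree."""
    nodes_a, nodes_b = sorted(adj_a), sorted(adj_b)
    cost = [[abs(len(adj_a[x]) - len(adj_b[y])) for y in nodes_b] for x in nodes_a]
    return nodes_a, nodes_b, cost


# Two toy path graphs: a-b-c and x-y-z.
net_a = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
net_b = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
nodes_a, nodes_b, toy_cost = degree_cost(net_a, net_b)
toy_assignment, toy_total = hungarian(toy_cost)
aligned = {nodes_a[i]: nodes_b[j] for i, j in enumerate(toy_assignment)}
```

Any cost function can be plugged in by changing how the matrix is filled; the assignment step stays the same, which matches the abstract's statement that the method accepts any cost function.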
Affiliation(s)
- Tijana Milenković
- Department of Computing, Imperial College London SW7 2AZ, UK
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Weng Leong Ng
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Wayne Hayes
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
- Department of Mathematics, Imperial College London SW7 2AZ, UK
- Nataša Pržulj
- Department of Computing, Imperial College London SW7 2AZ, UK
50
Abstract
BACKGROUND One of the most recent and important developments in drug discovery is a new drug development approach of building and analyzing networks that contain relationships among drugs and targets, diseases, genes and other components. These networks and their integrations provide useful information for finding new targets as well as new drugs. OBJECTIVE This review article aims to review recent developments in various types of networks and suggest the future direction of these network studies for drug discovery. METHODS Databases and networks are integrated into a more complete network to better present the relationships among drugs, targets, genes, phenotypes and diseases. After discussing the limitations and obstacles of the recent research, we suggest several strategies to build a successful and practical drug-target network. RESULTS/CONCLUSION A useful, integrated network can be built from various databases and networks by resolving several issues, such as limited coverage and inconsistency. This integrated network can be completed by the prediction of missing links, biological network comparison and drug target identification. Possible applications are multi-target drug development, drug repurposing, estimation of drug effect on target perturbations in the whole system and extraction of the suitable purpose of the drug-target sub-network.
Affiliation(s)
- Soyoung Lee
- KAIST, Department of Bio and Brain Engineering, 335 Gwahak-ro, Yuseong-gu, Daejeon 305-701, Republic of Korea