1
|
Menor-Flores M, Vega-Rodríguez MA. A protein-protein interaction network aligner study in the multi-objective domain. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 250:108188. [PMID: 38657382 DOI: 10.1016/j.cmpb.2024.108188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/10/2023] [Revised: 04/14/2024] [Accepted: 04/17/2024] [Indexed: 04/26/2024]
Abstract
BACKGROUND AND OBJECTIVE The protein-protein interaction (PPI) network alignment has proven to be an efficient technique in the diagnosis and prevention of certain diseases. However, the difficulty in maximizing, at the same time, the two qualities that measure the goodness of alignments (topological and biological quality) has led aligners to produce very different alignments. Thus making a comparative study among alignments of such different qualities a big challenge. Multi-objective optimization is a computer method, which is very powerful in this kind of contexts because both conflicting qualities are considered together. Analysing the alignments of each PPI network aligner with multi-objective methodologies allows you to visualize a bigger picture of the alignments and their qualities, obtaining very interesting conclusions. This paper proposes a comprehensive PPI network aligner study in the multi-objective domain. METHODS Alignments from each aligner and all aligners together were studied and compared to each other via Pareto dominance methodologies. The best alignments produced by each aligner and all aligners together for five different alignment scenarios were displayed in Pareto front graphs. Later, the aligners were ranked according to the topological, biological, and combined quality of their alignments. Finally, the aligners were also ranked based on their average runtimes. RESULTS Regarding aligners constructing the best overall alignments, we found that SAlign, BEAMS, SANA, and HubAlign are the best options. Additionally, the alignments of best topological quality are produced by: SANA, SAlign, and HubAlign aligners. On the contrary, the aligners returning the alignments of best biological quality are: BEAMS, TAME, and WAVE. However, if there are time constraints, it is recommended to select SAlign to obtain high topological quality alignments and PISwap or SAlign aligners for high biological quality alignments. CONCLUSIONS The use of the SANA aligner is recommended for obtaining the best alignments of topological quality, BEAMS for alignments of the best biological quality, and SAlign for alignments of the best combined topological and biological quality. Simultaneously, SANA and BEAMS have above-average runtimes. Therefore, it is suggested, if necessary due to time restrictions, to choose other, faster aligners like SAlign or PISwap whose alignments are also of high quality.
Collapse
Affiliation(s)
- Manuel Menor-Flores
- Escuela Politécnica, Universidad de Extremadura,(1) Campus Universitario s/n, 10003 Cáceres, Spain.
| | - Miguel A Vega-Rodríguez
- Escuela Politécnica, Universidad de Extremadura,(1) Campus Universitario s/n, 10003 Cáceres, Spain.
| |
Collapse
|
2
|
Milano M, Guzzi PH, Cannataro M. Design and Implementation of a New Local Alignment Algorithm for Multilayer Networks. ENTROPY (BASEL, SWITZERLAND) 2022; 24:1272. [PMID: 36141158 PMCID: PMC9497667 DOI: 10.3390/e24091272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/09/2022] [Revised: 09/06/2022] [Accepted: 09/07/2022] [Indexed: 06/16/2023]
Abstract
Network alignment (NA) is a popular research field that aims to develop algorithms for comparing networks. Applications of network alignment span many fields, from biology to social network analysis. NA comes in two forms: global network alignment (GNA), which aims to find a global similarity, and LNA, which aims to find local regions of similarity. Recently, there has been an increasing interest in introducing complex network models such as multilayer networks. Multilayer networks are common in many application scenarios, such as modelling of relations among people in a social network or representing the interplay of different molecules in a cell or different cells in the brain. Consequently, the need to introduce algorithms for the comparison of such multilayer networks, i.e., local network alignment, arises. Existing algorithms for LNA do not perform well on multilayer networks since they cannot consider inter-layer edges. Thus, we propose local alignment of multilayer networks (MultiLoAl), a novel algorithm for the local alignment of multilayer networks. We define the local alignment of multilayer networks and propose a heuristic for solving it. We present an extensive assessment indicating the strength of the algorithm. Furthermore, we implemented a synthetic multilayer network generator to build the data for the algorithm's evaluation.
Collapse
|
3
|
Wang S, Atkinson GRS, Hayes WB. SANA: cross-species prediction of Gene Ontology GO annotations via topological network alignment. NPJ Syst Biol Appl 2022; 8:25. [PMID: 35859153 PMCID: PMC9300714 DOI: 10.1038/s41540-022-00232-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2019] [Accepted: 05/20/2022] [Indexed: 12/31/2022] Open
Abstract
Topological network alignment aims to align two networks node-wise in order to maximize the observed common connection (edge) topology between them. The topological alignment of two protein-protein interaction (PPI) networks should thus expose protein pairs with similar interaction partners allowing, for example, the prediction of common Gene Ontology (GO) terms. Unfortunately, no network alignment algorithm based on topology alone has been able to achieve this aim, though those that include sequence similarity have seen some success. We argue that this failure of topology alone is due to the sparsity and incompleteness of the PPI network data of almost all species, which provides the network topology with a small signal-to-noise ratio that is effectively swamped when sequence information is added to the mix. Here we show that the weak signal can be detected using multiple stochastic samples of "good" topological network alignments, which allows us to observe regions of the two networks that are robustly aligned across multiple samples. The resulting network alignment frequency (NAF) strongly correlates with GO-based Resnik semantic similarity and enables the first successful cross-species predictions of GO terms based on topology-only network alignments. Our best predictions have an AUPR of about 0.4, which is competitive with state-of-the-art algorithms, even when there is no observable sequence similarity and no known homology relationship. While our results provide only a "proof of concept" on existing network data, we hypothesize that predicting GO terms from topology-only network alignments will become increasingly practical as the volume and quality of PPI network data increase.
Collapse
Affiliation(s)
- Siyue Wang
- Department of Computer Science, University of California, Irvine, CA, 92697-3435, USA
| | - Giles R S Atkinson
- Department of Computer Science, University of California, Irvine, CA, 92697-3435, USA
| | - Wayne B Hayes
- Department of Computer Science, University of California, Irvine, CA, 92697-3435, USA.
| |
Collapse
|
4
|
Nelson DR, Hazzouri KM, Lauersen KJ, Jaiswal A, Chaiboonchoe A, Mystikou A, Fu W, Daakour S, Dohai B, Alzahmi A, Nobles D, Hurd M, Sexton J, Preston MJ, Blanchette J, Lomas MW, Amiri KMA, Salehi-Ashtiani K. Large-scale genome sequencing reveals the driving forces of viruses in microalgal evolution. Cell Host Microbe 2021; 29:250-266.e8. [PMID: 33434515 DOI: 10.1016/j.chom.2020.12.005] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2020] [Revised: 10/08/2020] [Accepted: 11/18/2020] [Indexed: 01/08/2023]
Abstract
Being integral primary producers in diverse ecosystems, microalgal genomes could be mined for ecological insights, but representative genome sequences are lacking for many phyla. We cultured and sequenced 107 microalgae species from 11 different phyla indigenous to varied geographies and climates. This collection was used to resolve genomic differences between saltwater and freshwater microalgae. Freshwater species showed domain-centric ontology enrichment for nuclear and nuclear membrane functions, while saltwater species were enriched in organellar and cellular membrane functions. Further, marine species contained significantly more viral families in their genomes (p = 8e-4). Sequences from Chlorovirus, Coccolithovirus, Pandoravirus, Marseillevirus, Tupanvirus, and other viruses were found integrated into the genomes of algal from marine environments. These viral-origin sequences were found to be expressed and code for a wide variety of functions. Together, this study comprehensively defines the expanse of protein-coding and viral elements in microalgal genomes and posits a unified adaptive strategy for algal halotolerance.
Collapse
Affiliation(s)
- David R Nelson
- Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, UAE.
| | - Khaled M Hazzouri
- Khalifa Center for Genetic Engineering and Biotechnology (KCGEB), UAE University, Al Ain, Abu Dhabi, UAE; Biology Department, College of Science, UAE University, Al Ain, Abu Dhabi, UAE
| | - Kyle J Lauersen
- Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Ashish Jaiswal
- Division of Science and Math, New York University Abu Dhabi, Abu Dhabi, UAE
| | | | - Alexandra Mystikou
- Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, UAE
| | - Weiqi Fu
- Division of Science and Math, New York University Abu Dhabi, Abu Dhabi, UAE
| | - Sarah Daakour
- Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, UAE
| | - Bushra Dohai
- Division of Science and Math, New York University Abu Dhabi, Abu Dhabi, UAE
| | - Amnah Alzahmi
- Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, UAE
| | - David Nobles
- UTEX Culture Collection of Algae at the University of Texas at Austin, Austin, TX, USA
| | - Mark Hurd
- National Center for Marine Algae and Microbiota, East Boothbay, ME, USA
| | - Julie Sexton
- National Center for Marine Algae and Microbiota, East Boothbay, ME, USA
| | - Michael J Preston
- National Center for Marine Algae and Microbiota, East Boothbay, ME, USA
| | - Joan Blanchette
- National Center for Marine Algae and Microbiota, East Boothbay, ME, USA
| | - Michael W Lomas
- National Center for Marine Algae and Microbiota, East Boothbay, ME, USA
| | - Khaled M A Amiri
- Khalifa Center for Genetic Engineering and Biotechnology (KCGEB), UAE University, Al Ain, Abu Dhabi, UAE; Biology Department, College of Science, UAE University, Al Ain, Abu Dhabi, UAE
| | - Kourosh Salehi-Ashtiani
- Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, UAE; Division of Science and Math, New York University Abu Dhabi, Abu Dhabi, UAE.
| |
Collapse
|
5
|
Ovens K, Eames BF, McQuillan I. Comparative Analyses of Gene Co-expression Networks: Implementations and Applications in the Study of Evolution. Front Genet 2021; 12:695399. [PMID: 34484293 PMCID: PMC8414652 DOI: 10.3389/fgene.2021.695399] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2021] [Accepted: 07/19/2021] [Indexed: 11/13/2022] Open
Abstract
Similarities and differences in the associations of biological entities among species can provide us with a better understanding of evolutionary relationships. Often the evolution of new phenotypes results from changes to interactions in pre-existing biological networks and comparing networks across species can identify evidence of conservation or adaptation. Gene co-expression networks (GCNs), constructed from high-throughput gene expression data, can be used to understand evolution and the rise of new phenotypes. The increasing abundance of gene expression data makes GCNs a valuable tool for the study of evolution in non-model organisms. In this paper, we cover motivations for why comparing these networks across species can be valuable for the study of evolution. We also review techniques for comparing GCNs in the context of evolution, including local and global methods of graph alignment. While some protein-protein interaction (PPI) bioinformatic methods can be used to compare co-expression networks, they often disregard highly relevant properties, including the existence of continuous and negative values for edge weights. Also, the lack of comparative datasets in non-model organisms has hindered the study of evolution using PPI networks. We also discuss limitations and challenges associated with cross-species comparison using GCNs, and provide suggestions for utilizing co-expression network alignments as an indispensable tool for evolutionary studies going forward.
Collapse
Affiliation(s)
- Katie Ovens
- Augmented Intelligence & Precision Health Laboratory (AIPHL), Research Institute of the McGill University Health Centre, Montreal, QC, Canada
| | - B. Frank Eames
- Department of Anatomy, Physiology, & Pharmacology, University of Saskatchewan, Saskatoon, SK, Canada
| | - Ian McQuillan
- Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
| |
Collapse
|
6
|
Gu S, Milenković T. Data-driven biological network alignment that uses topological, sequence, and functional information. BMC Bioinformatics 2021; 22:34. [PMID: 33514304 PMCID: PMC7847157 DOI: 10.1186/s12859-021-03971-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2020] [Accepted: 01/15/2021] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND Network alignment (NA) can transfer functional knowledge between species' conserved biological network regions. Traditional NA assumes that it is topological similarity (isomorphic-like matching) between network regions that corresponds to the regions' functional relatedness. However, we recently found that functionally unrelated proteins are as topologically similar as functionally related proteins. So, we redefined NA as a data-driven method called TARA, which learns from network and protein functional data what kind of topological relatedness (rather than similarity) between proteins corresponds to their functional relatedness. TARA used topological information (within each network) but not sequence information (between proteins across networks). Yet, TARA yielded higher protein functional prediction accuracy than existing NA methods, even those that used both topological and sequence information. RESULTS Here, we propose TARA++ that is also data-driven, like TARA and unlike other existing methods, but that uses across-network sequence information on top of within-network topological information, unlike TARA. To deal with the within-and-across-network analysis, we adapt social network embedding to the problem of biological NA. TARA++ outperforms protein functional prediction accuracy of existing methods. CONCLUSIONS As such, combining research knowledge from different domains is promising. Overall, improvements in protein functional prediction have biomedical implications, for example allowing researchers to better understand how cancer progresses or how humans age.
Collapse
Affiliation(s)
- Shawn Gu
- Department of Computer Science and Engineering, Eck Institute for Global Health, Center for Network and Data Science, University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Tijana Milenković
- Department of Computer Science and Engineering, Eck Institute for Global Health, Center for Network and Data Science, University of Notre Dame, Notre Dame, IN, 46556, USA.
| |
Collapse
|
7
|
Abstract
In this study, we deal with the problem of biological network alignment (NA), which aims to find a node mapping between species' molecular networks that uncovers similar network regions, thus allowing for the transfer of functional knowledge between the aligned nodes. We provide evidence that current NA methods, which assume that topologically similar nodes (i.e., nodes whose network neighborhoods are isomorphic-like) have high functional relatedness, do not actually end up aligning functionally related nodes. That is, we show that the current topological similarity assumption does not hold well. Consequently, we argue that a paradigm shift is needed with how the NA problem is approached. So, we redefine NA as a data-driven framework, called TARA (data-driven NA), which attempts to learn the relationship between topological relatedness and functional relatedness without assuming that topological relatedness corresponds to topological similarity. TARA makes no assumptions about what nodes should be aligned, distinguishing it from existing NA methods. Specifically, TARA trains a classifier to predict whether two nodes from different networks are functionally related based on their network topological patterns (features). We find that TARA is able to make accurate predictions. TARA then takes each pair of nodes that are predicted as related to be part of an alignment. Like traditional NA methods, TARA uses this alignment for the across-species transfer of functional knowledge. TARA as currently implemented uses topological but not protein sequence information for functional knowledge transfer. In this context, we find that TARA outperforms existing state-of-the-art NA methods that also use topological information, WAVE and SANA, and even outperforms or complements a state-of-the-art NA method that uses both topological and sequence information, PrimAlign. Hence, adding sequence information to TARA, which is our future work, is likely to further improve its performance. The software and data are available at http://www.nd.edu/~cone/TARA/.
Collapse
Affiliation(s)
- Shawn Gu
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
- Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, United States of America
- Center for Network and Data Science, University of Notre Dame, Notre Dame, IN, United States of America
| | - Tijana Milenković
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
- Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, United States of America
- Center for Network and Data Science, University of Notre Dame, Notre Dame, IN, United States of America
| |
Collapse
|
8
|
Shi D, Khan F, Abagyan R. Extended Multitarget Pharmacology of Anticancer Drugs. J Chem Inf Model 2019; 59:3006-3017. [PMID: 31025863 DOI: 10.1021/acs.jcim.9b00031] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
Abstract
Multitarget pharmacology of small-molecule cancer drugs significantly contributes to their mechanism of action, side effects, and emergence of drug resistance and opens ways to repurpose, combine, or customize drug therapy. In most cases, the set of targets affected at therapeutic concentrations is not fully characterized and/or the interaction efficacy values are not accurately quantified. We collected information about multiple targets for each cancer drug along with their experimental effective concentrations or binding activities from multiple sources. All multitarget activity values for each drug then were used to build two proximity network pharmacology maps of anticancer drugs and targets of those drugs, respectively. Together with the network map, we showed that the majority of the cancer drugs had substantial multitarget pharmacology based on our current knowledge. In addition, most of the cancer drugs simultaneously affect macromolecular targets from different classes and types. The target subset can further be accentuated and personalized by patient sample-specific expression data. The network maps of cancer drugs and targets as well as all quantified activity data were integrated into a freely available database, CancerDrugMap (http://ruben.ucsd.edu/dnet/maps/drugnet.html). The identified multitarget pharmacology of cancer drugs is essential for improving the efficacy of individually prescribed drugs and drug combinations and minimization of adverse effects.
Collapse
Affiliation(s)
- Da Shi
- Skaggs School of Pharmacy and Pharmaceutical Sciences , University of California, San Diego , La Jolla , California 92093-0747 , United States
| | - Feroz Khan
- Metabolic and Structural Biology Department , CSIR-Central Institute of Medicinal and Aromatic Plants (CIMAP) , Lucknow 226015 , Uttar Pradesh , India
| | - Ruben Abagyan
- Skaggs School of Pharmacy and Pharmaceutical Sciences , University of California, San Diego , La Jolla , California 92093-0747 , United States
| |
Collapse
|