2
Hayes WB. An Introductory Guide to Aligning Networks Using SANA, the Simulated Annealing Network Aligner. Methods Mol Biol 2020; 2074:263-284. [PMID: 31583643 DOI: 10.1007/978-1-4939-9873-9_18] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/10/2023]
Abstract
Sequence alignment has had an enormous impact on our understanding of biology, evolution, and disease. The alignment of biological networks holds similar promise. Biological networks generally model interactions between biomolecules such as proteins, genes, metabolites, or mRNAs. There is strong evidence that the network topology (the "structure" of the network) is correlated with the functions performed, so that network topology can be used to help predict or understand function. However, unlike sequence comparison and alignment, which is an essentially solved problem, network comparison and alignment is an NP-complete problem for which heuristic algorithms must be used. Here we introduce SANA, the Simulated Annealing Network Aligner. SANA is one of many algorithms proposed for the arena of biological network alignment. In the context of global network alignment, SANA stands out for its speed, memory efficiency, ease-of-use, and flexibility in the arena of producing alignments between two or more networks. SANA produces better alignments in minutes on a laptop than most other algorithms can produce in hours or days of CPU time on large server-class machines. We walk the user through how to use SANA for several types of biomolecular networks.
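To make the idea of aligning networks by simulated annealing concrete, here is a minimal Python sketch. It is a generic textbook annealer, not SANA's actual implementation: the objective (edge conservation, the fraction of edges in the smaller network that land on edges of the larger one), the geometric cooling schedule, the two move types, and every function and parameter name are assumptions chosen for illustration.

```python
import math
import random

def edge_conservation(g1_edges, g2_adj, mapping):
    """Fraction of G1 edges whose mapped endpoints are adjacent in G2."""
    kept = sum(1 for u, v in g1_edges if mapping[v] in g2_adj[mapping[u]])
    return kept / len(g1_edges)

def anneal_alignment(g1_edges, g2_edges, steps=20000,
                     t_start=1.0, t_end=1e-3, seed=0):
    """Search for an injective map from G1 nodes to G2 nodes (hypothetical helper)."""
    rng = random.Random(seed)
    g1_nodes = sorted({n for e in g1_edges for n in e})
    g2_nodes = sorted({n for e in g2_edges for n in e})
    assert len(g1_nodes) <= len(g2_nodes), "G1 must be the smaller network"

    g2_adj = {n: set() for n in g2_nodes}
    for u, v in g2_edges:
        g2_adj[u].add(v)
        g2_adj[v].add(u)

    # Random injective starting alignment; leftover G2 nodes stay unused.
    targets = g2_nodes[:]
    rng.shuffle(targets)
    mapping = dict(zip(g1_nodes, targets))
    unused = targets[len(g1_nodes):]

    score = edge_conservation(g1_edges, g2_adj, mapping)
    best, best_score = dict(mapping), score

    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        u = rng.choice(g1_nodes)
        if unused and rng.random() < 0.5:
            # Move 1: re-map u to a currently unused G2 node.
            i = rng.randrange(len(unused))
            old = mapping[u]
            mapping[u], unused[i] = unused[i], old
            undo = ("change", u, i)
        else:
            # Move 2: swap the images of two G1 nodes.
            w = rng.choice(g1_nodes)
            mapping[u], mapping[w] = mapping[w], mapping[u]
            undo = ("swap", u, w)

        new_score = edge_conservation(g1_edges, g2_adj, mapping)
        accept = (new_score >= score or
                  rng.random() < math.exp((new_score - score) / temp))
        if accept:
            score = new_score
            if score > best_score:
                best, best_score = dict(mapping), score
        elif undo[0] == "change":
            _, u, i = undo
            mapping[u], unused[i] = unused[i], mapping[u]
        else:
            _, u, w = undo
            mapping[u], mapping[w] = mapping[w], mapping[u]

    return best, best_score
```

Recomputing the full score after every move keeps the sketch short; a practical aligner would update the score incrementally, touching only the edges incident to the moved nodes.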
Affiliation(s)
- Wayne B Hayes
- Department of Computer Science, University of California, Irvine, CA, USA.
3
Maharaj S, Tracy B, Hayes WB. BLANT—fast graphlet sampling tool. Bioinformatics 2019; 35:5363-5364. [DOI: 10.1093/bioinformatics/btz603] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/24/2019] [Revised: 07/19/2019] [Accepted: 07/30/2019] [Indexed: 11/13/2022] Open
Abstract
Summary
BLAST creates local sequence alignments by first building a database of small k-letter sub-sequences called k-mers. Identical k-mers from different regions provide ‘seeds’ for longer local alignments. This seed-and-extend heuristic makes BLAST extremely fast and has led to its almost exclusive use despite the existence of more accurate, but slower, algorithms. In this paper, we introduce the Basic Local Alignment for Networks Tool (BLANT). BLANT is the analog of BLAST, but for networks: given an input graph, it samples small, induced, k-node sub-graphs called k-graphlets. Graphlets have been used to classify networks, quantify structure, align networks both locally and globally, identify topology-function relationships and build taxonomic trees without the use of sequences. Given an input network, BLANT produces millions of graphlet samples in seconds, orders of magnitude faster than existing methods. BLANT offers sampled graphlets in various forms: distributions of graphlets or their orbits; graphlet degree or graphlet orbit degree vectors, the latter being compatible with ORCA; or an index to be used as the basis for seed-and-extend local alignments. We demonstrate BLANT’s usefulness by using its indexing mode to find functional similarity between yeast and human PPI networks.
Availability and implementation
BLANT is written in C and is available at https://github.com/waynebhayes/BLANT/releases.
Supplementary information
Supplementary data are available at Bioinformatics online.
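As a rough illustration of what sampling "induced, k-node sub-graphs" means, the Python sketch below grows a connected k-node set from a random start node and tallies how many edges each sampled set induces. This is a simple node-expansion heuristic assumed for illustration only; it is not BLANT's sampling method, it does not sample graphlets uniformly, and all function names are hypothetical.

```python
import random
from collections import Counter

def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def sample_connected_k_nodes(adj, k, rng):
    """Grow a connected k-node set by adding random frontier nodes.

    Returns None if the start node's component has fewer than k nodes.
    """
    nodes = {rng.choice(sorted(adj))}
    while len(nodes) < k:
        frontier = sorted({w for n in nodes for w in adj[n]} - nodes)
        if not frontier:
            return None
        nodes.add(rng.choice(frontier))
    return frozenset(nodes)

def induced_edge_count(adj, nodes):
    """Number of edges of the subgraph induced on `nodes`."""
    return sum(1 for u in nodes for v in adj[u] if v in nodes) // 2

def sample_edge_count_distribution(edges, k, n_samples, seed=0):
    """Histogram of induced edge counts over sampled connected k-node sets."""
    rng = random.Random(seed)
    adj = build_adjacency(edges)
    counts = Counter()
    for _ in range(n_samples):
        nodes = sample_connected_k_nodes(adj, k, rng)
        if nodes is not None:
            counts[induced_edge_count(adj, nodes)] += 1
    return counts
```

For k = 3 the induced edge count is enough to tell the two connected 3-node graphlets apart (2 edges is a path, 3 edges is a triangle); larger k would need a canonical form of the induced subgraph rather than an edge count.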
Affiliation(s)
- Sridevi Maharaj
- Department of Computer Science, University of California Irvine, Irvine, CA, USA
- Brennan Tracy
- Department of Computer Science, University of California Irvine, Irvine, CA, USA
- Wayne B Hayes
- Department of Computer Science, University of California Irvine, Irvine, CA, USA
4
Melckenbeeck I, Audenaert P, Van Parys T, Van De Peer Y, Colle D, Pickavet M. Optimising orbit counting of arbitrary order by equation selection. BMC Bioinformatics 2019; 20:27. [PMID: 30646859 PMCID: PMC6334470 DOI: 10.1186/s12859-018-2483-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/20/2018] [Accepted: 11/09/2018] [Indexed: 11/25/2022] Open
Abstract
Background
Graphlets are useful for bioinformatics network analysis. Based on the structure of Hočevar and Demšar’s ORCA algorithm, we have created an orbit counting algorithm, named Jesse. This algorithm, like ORCA, uses equations to count the orbits, but unlike ORCA it can count graphlets of any order. To do so, it generates the required internal structures and equations automatically. Many more redundant equations are generated, however, and Jesse’s running time is highly dependent on which of these equations are used. Therefore, this paper aims to investigate which equations are most efficient, and which factors have an effect on this efficiency.
Results
With appropriate equation selection, Jesse’s running time may be reduced by a factor of up to 2 in the best case, compared to using randomly selected equations. Which equations are most efficient depends on the density of the graph, but barely on the graph type. At low graph density, equations with terms in their right-hand side with few arguments are more efficient, whereas at high density, equations with terms with many arguments in the right-hand side are most efficient. At a density between 0.6 and 0.7, both types of equations are about equally efficient.
Conclusions
Our Jesse algorithm became up to a factor of 2 more efficient by automatically selecting the best equations based on graph density. It was adapted into a Cytoscape App that is freely available from the Cytoscape App Store to ease application by bioinformaticians.
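For readers unfamiliar with orbit counting, the following Python sketch counts the four orbits of the graphlets on two and three nodes by brute-force enumeration and computes the graph density that the abstract refers to. It deliberately does not use the equation-based approach of ORCA or Jesse; the function names are hypothetical, and the orbit numbering (0 = edge endpoint, 1 = end of a 3-node path, 2 = middle of a 3-node path, 3 = triangle) follows the usual graphlet convention.

```python
from itertools import combinations

def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def graph_density(edges):
    """Density of a simple undirected graph: 2m / (n (n - 1))."""
    adj = build_adjacency(edges)
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * m / (n * (n - 1))

def orbit_counts_up_to_3(edges):
    """Per-node counts for orbits 0..3 by checking every node triple."""
    adj = build_adjacency(edges)
    counts = {n: [0, 0, 0, 0] for n in adj}

    # Orbit 0: each incident edge touches the node once (its degree).
    for n in adj:
        counts[n][0] = len(adj[n])

    for a, b, c in combinations(sorted(adj), 3):
        present = [(x, y) for x, y in ((a, b), (a, c), (b, c)) if y in adj[x]]
        if len(present) == 3:
            # Induced triangle: all three nodes sit on orbit 3.
            for n in (a, b, c):
                counts[n][3] += 1
        elif len(present) == 2:
            # Induced path: the node on both edges is the middle (orbit 2),
            # the other two are the ends (orbit 1).
            for n in (a, b, c):
                on_edges = sum(n in e for e in present)
                counts[n][2 if on_edges == 2 else 1] += 1
    return counts
```

Checking every triple costs cubic time in the number of nodes, which is exactly the cost that equation-based counters are designed to avoid on larger graphs and higher graphlet orders.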
Affiliation(s)
- Ine Melckenbeeck
- Ghent University - imec, IDLab, Technologiepark 15, Ghent, 9052, Belgium
- Pieter Audenaert
- Ghent University - imec, IDLab, Technologiepark 15, Ghent, 9052, Belgium; Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Thomas Van Parys
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium; Department of Plant Systems Biology, VIB, Technologiepark 927, Ghent, 9052, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, Ghent, 9052, Belgium
- Yves Van De Peer
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium; Department of Plant Systems Biology, VIB, Technologiepark 927, Ghent, 9052, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, Ghent, 9052, Belgium; Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria 0028, South Africa
- Didier Colle
- Ghent University - imec, IDLab, Technologiepark 15, Ghent, 9052, Belgium; Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Mario Pickavet
- Ghent University - imec, IDLab, Technologiepark 15, Ghent, 9052, Belgium; Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium