1
|
Matejek B, Wei D, Chen T, Tsourakakis CE, Mitzenmacher M, Pfister H. Edge-colored directed subgraph enumeration on the connectome. Sci Rep 2022; 12:11349. [PMID: 35790766 PMCID: PMC9256670 DOI: 10.1038/s41598-022-15027-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2022] [Accepted: 06/16/2022] [Indexed: 11/24/2022] Open
Abstract
Following significant advances in image acquisition, synapse detection, and neuronal segmentation in connectomics, researchers have extracted an increasingly diverse set of wiring diagrams from brain tissue. Neuroscientists frequently represent these wiring diagrams as graphs with nodes corresponding to a single neuron and edges indicating synaptic connectivity. The edges can contain "colors" or "labels", indicating excitatory versus inhibitory connections, among other things. By representing the wiring diagram as a graph, we can begin to identify motifs, the frequently occurring subgraphs that correspond to specific biological functions. Most analyses on these wiring diagrams have focused on hypothesized motifs-those we expect to find. However, one of the goals of connectomics is to identify biologically-significant motifs that we did not previously hypothesize. To identify these structures, we need large-scale subgraph enumeration to find the frequencies of all unique motifs. Exact subgraph enumeration is a computationally expensive task, particularly in the edge-dense wiring diagrams. Furthermore, most existing methods do not differentiate between types of edges which can significantly affect the function of a motif. We propose a parallel, general-purpose subgraph enumeration strategy to count motifs in the connectome. Next, we introduce a divide-and-conquer community-based subgraph enumeration strategy that allows for enumeration per brain region. Lastly, we allow for differentiation of edges by types to better reflect the underlying biological properties of the graph. We demonstrate our results on eleven connectomes and publish for future analyses extensive overviews for the 26 trillion subgraphs enumerated that required approximately 9.25 years of computation time.
Collapse
Affiliation(s)
- Brian Matejek
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA.
- Computer Science Laboratory, SRI International, Washington, DC, USA.
| | - Donglai Wei
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
- Department of Computer Science, Boston College, Chestnut Hill, MA, USA
| | - Tianyi Chen
- Department of Computer Science, Boston University, Boston, MA, USA
| | - Charalampos E Tsourakakis
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
- Department of Computer Science, Boston University, Boston, MA, USA
- ISI Foundation, Turin, Italy
| | - Michael Mitzenmacher
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
| | - Hanspeter Pfister
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
| |
Collapse
|
2
|
Biswas S, Ray S, Bandyopadhyay S. Colored Network Motif Analysis by Dynamic Programming Approach: An Application in Host Pathogen Interaction Network. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:550-561. [PMID: 31217126 DOI: 10.1109/tcbb.2019.2923173] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Network motifs are subgraphs of a network which are found with significantly higher frequency than that expected in similar random networks. Motifs are small building blocks of a network and they have emerged as a way to uncover topological properties of complex networks. A special yet not much explored type of motif is the 'colored motif' where color (type) of each node, and hence the edges, in the motif is distinguishable from each other. A traditional motif is defined as a recurring structure in a network, whereas colored motif introduces detailed information about the color of the nodes. G-trie is a data structure to efficiently store a given set of subgraphs by exploiting the topological overlaps within them. In this article we have implemented a modified g-trie to store colored subgraphs and developed a method to discover colored motifs. Our method uses an approximate enumeration for counting the subgraphs to reduce the runtime. We have applied our method to find colored motifs of size three in a host pathogen protein-protein interaction network having two types of proteins namely HIV-1 and human proteins, and four types of edges. Here, we have discovered eight motifs, six of which contain both HIV-1 and human proteins, while the remaining two contain only human proteins.
Collapse
|
3
|
Huang CH, Zaenudin E, Tsai JJP, Kurubanjerdjit N, Dessie EY, Ng KL. Dissecting molecular network structures using a network subgraph approach. PeerJ 2020; 8:e9556. [PMID: 33005483 PMCID: PMC7512139 DOI: 10.7717/peerj.9556] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2020] [Accepted: 06/25/2020] [Indexed: 11/20/2022] Open
Abstract
Biological processes are based on molecular networks, which exhibit biological functions through interactions of genetic elements or proteins. This study presents a graph-based method to characterize molecular networks by decomposing the networks into directed multigraphs: network subgraphs. Spectral graph theory, reciprocity and complexity measures were used to quantify the network subgraphs. Graph energy, reciprocity and cyclomatic complexity can optimally specify network subgraphs with some degree of degeneracy. Seventy-one molecular networks were analyzed from three network types: cancer networks, signal transduction networks, and cellular processes. Molecular networks are built from a finite number of subgraph patterns and subgraphs with large graph energies are not present, which implies a graph energy cutoff. In addition, certain subgraph patterns are absent from the three network types. Thus, the Shannon entropy of the subgraph frequency distribution is not maximal. Furthermore, frequently-observed subgraphs are irreducible graphs. These novel findings warrant further investigation and may lead to important applications. Finally, we observed that cancer-related cellular processes are enriched with subgraph-associated driver genes. Our study provides a systematic approach for dissecting biological networks and supports the conclusion that there are organizational principles underlying molecular networks.
Collapse
Affiliation(s)
- Chien-Hung Huang
- Department of Computer Science and Information Engineering, National Formosa University, Yunlin, Taiwan
| | - Efendi Zaenudin
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan.,Research Center for Informatics, Indonesian Institute of Sciences, Bandung, Indonesia
| | - Jeffrey J P Tsai
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
| | | | - Eskezeia Y Dessie
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
| | - Ka-Lok Ng
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan.,Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
| |
Collapse
|
4
|
|
5
|
|