1
|
Bai L, Li Z, Tang C, Song C, Hu F. Hypergraph-based analysis of weighted gene co-expression hypernetwork. Front Genet 2025; 16:1560841. [PMID: 40255486 PMCID: PMC12006133 DOI: 10.3389/fgene.2025.1560841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2025] [Accepted: 03/19/2025] [Indexed: 04/22/2025] Open
Abstract
Background With the rapid advancement of gene sequencing technologies, traditional weighted gene co-expression network analysis (WGCNA), which relies on pairwise gene relationships, struggles to capture higher-order interactions and exhibits low computational efficiency when handling large, complex datasets. Methods To overcome these challenges, we propose a novel Weighted Gene Co-expression Hypernetwork Analysis (WGCHNA) based on a weighted hypergraph, where genes are modeled as nodes and samples as hyperedges. By calculating the hypergraph Laplacian matrix, WGCHNA generates a topological overlap matrix for module identification through hierarchical clustering. Results Results on four gene expression datasets show that WGCHNA outperforms WGCNA in module identification and functional enrichment. WGCHNA identifies biologically relevant modules with greater complexity, particularly in processes like neuronal energy metabolism linked to Alzheimer's disease. Additionally, functional enrichment analysis uncovers more comprehensive pathway hierarchies, revealing potential regulatory relationships and novel targets. Conclusion WGCHNA effectively addresses WGCNA's limitations, providing superior accuracy in detecting gene modules and deeper insights for disease research, making it a powerful tool for analyzing complex biological systems.
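The hypergraph Laplacian step named in this abstract can be illustrated with a minimal sketch. It assumes a binary incidence matrix (rows are genes, columns are samples) and the widely used normalized form L = I - Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2); the abstract does not say which Laplacian or weighting WGCHNA actually uses, so the function name `hypergraph_laplacian` and the toy data are illustrative assumptions only.

```python
import math


def hypergraph_laplacian(incidence, edge_weights):
    """Normalized hypergraph Laplacian L = I - Dv^-1/2 H W De^-1 H^T Dv^-1/2.

    incidence[i][e] is 1 if node i belongs to hyperedge e, else 0.
    Assumes every node lies in at least one hyperedge and no hyperedge is empty.
    """
    n = len(incidence)
    m = len(edge_weights)
    # Hyperedge degree: number of nodes in each hyperedge.
    edge_deg = [sum(incidence[i][e] for i in range(n)) for e in range(m)]
    # Node degree: total weight of the hyperedges containing each node.
    node_deg = [
        sum(edge_weights[e] * incidence[i][e] for e in range(m)) for i in range(n)
    ]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            theta = sum(
                incidence[i][e] * edge_weights[e] * incidence[j][e] / edge_deg[e]
                for e in range(m)
            ) / math.sqrt(node_deg[i] * node_deg[j])
            lap[i][j] = (1.0 if i == j else 0.0) - theta
    return lap


# Toy example (hypothetical data): 4 genes as nodes, 2 samples as hyperedges.
H = [[1, 0], [1, 1], [0, 1], [0, 1]]
w = [1.0, 1.0]
L = hypergraph_laplacian(H, w)
```

In this form the matrix is symmetric and annihilates the vector of square-rooted node degrees, which gives a quick sanity check on any implementation.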
Affiliation(s)
- Libing Bai
  - Computer College of Qinghai Normal University, Xining, Qinghai, China
  - The State Key Laboratory of Tibetan Intelligence, Qinghai, Xining, China
- Zongjin Li
  - College of Science, North China University of Science and Technology, Tangshan, China
- Chunyang Tang
  - Computer College of Qinghai Normal University, Xining, Qinghai, China
  - The State Key Laboratory of Tibetan Intelligence, Qinghai, Xining, China
- Changxin Song
  - Department of Mechanical Engineering and Information, Shanghai Urban Construction Vocational College, Shanghai, China
- Feng Hu
  - Computer College of Qinghai Normal University, Xining, Qinghai, China
  - The State Key Laboratory of Tibetan Intelligence, Qinghai, Xining, China
|
2
|
Shah AM, Aral AM, Zamora R, Gharpure N, El-Dehaibi F, Zor F, Kulahci Y, Karagoz H, Barclay DA, Yin J, Breidenbach W, Tuder D, Gorantla VS, Vodovotz Y. Peripheral nerve repair is associated with augmented cross-tissue inflammation following vascularized composite allotransplantation. Front Immunol 2023; 14:1151824. [PMID: 37251389 PMCID: PMC10213935 DOI: 10.3389/fimmu.2023.1151824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 04/20/2023] [Indexed: 05/31/2023] Open
Abstract
Introduction Vascularized composite allotransplantation (VCA), with nerve repair/coaptation (NR) and tacrolimus (TAC) immunosuppressive therapy, is used to repair devastating traumatic injuries but is often complicated by inflammation spanning multiple tissues. We identified the parallel upregulation of transcriptional pathways involving chemokine signaling, T-cell receptor signaling, Th17, Th1, and Th2 pathways in skin and nerve tissue in complete VCA rejection compared to baseline in 7 human hand transplants and defined increasing complexity of protein-level dynamic networks involving chemokine, Th1, and Th17 pathways as a function of rejection severity in 5 of these patients. We next hypothesized that neural mechanisms may regulate the complex spatiotemporal evolution of rejection-associated inflammation post-VCA. Methods For mechanistic and ethical reasons, protein-level inflammatory mediators in tissues from Lewis rats (8 per group) receiving either syngeneic (Lewis) or allogeneic (Brown-Norway) orthotopic hind limb transplants in combination with TAC, with and without sciatic NR, were compared to human hand transplant samples using computational methods. Results In cross-correlation analyses of these mediators, VCA tissues from human hand transplants (which included NR) were most similar to those from rats undergoing VCA + NR. Based on dynamic hypergraph analyses, NR following either syngeneic or allogeneic transplantation in rats was associated with greater trans-compartmental localization of early inflammatory mediators vs. no-NR, and impaired downregulation of mediators including IL-17A at later times. Discussion Thus, NR, while considered necessary for restoring graft function, may also result in dysregulated and mis-compartmentalized inflammation post-VCA and therefore necessitate mitigation strategies. Our novel computational pipeline may also yield translational, spatiotemporal insights in other contexts.
Affiliation(s)
- Ashti M. Shah
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
- Ali Mubin Aral
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
- Ruben Zamora
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
- Nitin Gharpure
  - Department of Surgery, Wake Forest Institute for Regenerative Medicine, Wake Forest Baptist Medical Center, Winston Salem, NC, United States
- Fayten El-Dehaibi
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
- Fatih Zor
  - Department of Surgery, Wake Forest Institute for Regenerative Medicine, Wake Forest Baptist Medical Center, Winston Salem, NC, United States
- Yalcin Kulahci
  - Department of Surgery, Wake Forest Institute for Regenerative Medicine, Wake Forest Baptist Medical Center, Winston Salem, NC, United States
- Huseyin Karagoz
  - Department of Surgery, Wake Forest Institute for Regenerative Medicine, Wake Forest Baptist Medical Center, Winston Salem, NC, United States
- Derek A. Barclay
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
- Jinling Yin
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
- Dmitry Tuder
  - Plastic Surgery, San Antonio Military Medical Center, Fort Sam Houston, San Antonio, TX, United States
- Vijay S. Gorantla
  - Department of Surgery, Wake Forest Institute for Regenerative Medicine, Wake Forest Baptist Medical Center, Winston Salem, NC, United States
- Yoram Vodovotz
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
  - Center for Inflammation and Regeneration Modeling, McGowan Institute for Regenerative Medicine, University of Pittsburgh, Pittsburgh, PA, United States
|
3
|
Zamora R, Forsberg JA, Shah AM, Unselt D, Grey S, Lisboa FA, Billiar TR, Schobel SA, Potter BK, Elster EA, Vodovotz Y. Central role for neurally dysregulated IL-17A in dynamic networks of systemic and local inflammation in combat casualties. Sci Rep 2023; 13:6618. [PMID: 37095162 PMCID: PMC10126120 DOI: 10.1038/s41598-023-33623-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/06/2022] [Accepted: 04/15/2023] [Indexed: 04/26/2023] Open
Abstract
Dynamic Network Analysis (DyNA) and Dynamic Hypergraphs (DyHyp) were used to define protein-level inflammatory networks at the local (wound effluent) and systemic circulation (serum) levels from 140 active-duty, injured service members (59 with TBI and 81 non-TBI). Interleukin (IL)-17A was the only biomarker elevated significantly in both serum and effluent in TBI vs. non-TBI casualties, and the mediator with the most DyNA connections in TBI wounds. DyNA combining serum and effluent data to define cross-compartment correlations suggested that IL-17A bridges local and systemic circulation at late time points. DyHyp suggested that systemic IL-17A upregulation in TBI patients was associated with tumor necrosis factor-α, while IL-17A downregulation in non-TBI patients was associated with interferon-γ. Correlation analysis suggested differential upregulation of pathogenic Th17 cells, non-pathogenic Th17 cells, and memory/effector T cells. This was associated with reduced procalcitonin in both effluent and serum of TBI patients, in support of an antibacterial effect of Th17 cells in TBI patients. Dysregulation of Th17 responses following TBI may drive cross-compartment inflammation following combat injury, counteracting wound infection at the cost of elevated systemic inflammation.
Affiliation(s)
- Ruben Zamora
  - Department of Surgery, University of Pittsburgh, W944 Starzl Biomedical Sciences Tower, 200 Lothrop St., Pittsburgh, PA, 15213, USA
  - Center for Inflammation and Regeneration Modeling, McGowan Institute for Regenerative Medicine, Pittsburgh, PA, 15219, USA
  - Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Jonathan A Forsberg
  - Department of Surgery, Uniformed Services University of Health Sciences and Walter Reed National Military Medical Center, Bethesda, MD, 20814, USA
- Ashti M Shah
  - Department of Surgery, University of Pittsburgh, W944 Starzl Biomedical Sciences Tower, 200 Lothrop St., Pittsburgh, PA, 15213, USA
- Desiree Unselt
  - Department of Surgery, Uniformed Services University of Health Sciences and Walter Reed National Military Medical Center, Bethesda, MD, 20814, USA
  - Surgical Critical Care Initiative (SC2i), Uniformed Services University of Health Sciences, Bethesda, MD, 20814, USA
  - The Henry M Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Scott Grey
  - Department of Surgery, Uniformed Services University of Health Sciences and Walter Reed National Military Medical Center, Bethesda, MD, 20814, USA
  - Surgical Critical Care Initiative (SC2i), Uniformed Services University of Health Sciences, Bethesda, MD, 20814, USA
  - The Henry M Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Felipe A Lisboa
  - Department of Surgery, Uniformed Services University of Health Sciences and Walter Reed National Military Medical Center, Bethesda, MD, 20814, USA
  - Surgical Critical Care Initiative (SC2i), Uniformed Services University of Health Sciences, Bethesda, MD, 20814, USA
  - The Henry M Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Timothy R Billiar
  - Department of Surgery, University of Pittsburgh, W944 Starzl Biomedical Sciences Tower, 200 Lothrop St., Pittsburgh, PA, 15213, USA
  - Center for Inflammation and Regeneration Modeling, McGowan Institute for Regenerative Medicine, Pittsburgh, PA, 15219, USA
  - Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Seth A Schobel
  - Department of Surgery, Uniformed Services University of Health Sciences and Walter Reed National Military Medical Center, Bethesda, MD, 20814, USA
  - Surgical Critical Care Initiative (SC2i), Uniformed Services University of Health Sciences, Bethesda, MD, 20814, USA
  - The Henry M Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Benjamin K Potter
  - Department of Surgery, Uniformed Services University of Health Sciences and Walter Reed National Military Medical Center, Bethesda, MD, 20814, USA
  - Surgical Critical Care Initiative (SC2i), Uniformed Services University of Health Sciences, Bethesda, MD, 20814, USA
- Eric A Elster
  - Department of Surgery, Uniformed Services University of Health Sciences and Walter Reed National Military Medical Center, Bethesda, MD, 20814, USA
  - Surgical Critical Care Initiative (SC2i), Uniformed Services University of Health Sciences, Bethesda, MD, 20814, USA
- Yoram Vodovotz
  - Department of Surgery, University of Pittsburgh, W944 Starzl Biomedical Sciences Tower, 200 Lothrop St., Pittsburgh, PA, 15213, USA
  - Center for Inflammation and Regeneration Modeling, McGowan Institute for Regenerative Medicine, Pittsburgh, PA, 15219, USA
  - Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, PA, 15213, USA
|
4
|
Ban Y, Lao H, Li B, Su W, Zhang X. Diagnosis of Alzheimer's disease using hypergraph p-Laplacian regularized multi-task feature learning. J Biomed Inform 2023; 140:104326. [PMID: 36870585 DOI: 10.1016/j.jbi.2023.104326] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/08/2022] [Revised: 02/01/2023] [Accepted: 03/01/2023] [Indexed: 03/06/2023]
Abstract
Multimodal data-based classification methods have been widely used in the diagnosis of Alzheimer's disease (AD) and have achieved better performance than single-modal-based methods. However, most classification methods based on multimodal data tend to consider only the correlation between different modal data and ignore the inherent non-linear higher-order relationships between similar data, which can improve the robustness of the model. Therefore, this study proposes a hypergraph p-Laplacian regularized multi-task feature selection (HpMTFS) method for AD classification. Specifically, feature selection for each modal data is considered as a distinct task, and the common features of multimodal data are extracted jointly by a group-sparsity regularizer. In particular, two regularization terms are introduced in this study, namely (1) a hypergraph p-Laplacian regularization term to retain higher-order structural information for similar data, and (2) a Frobenius norm regularization term to improve the noise immunity of the model. Finally, a multi-kernel support vector machine is used to fuse multimodal features and perform the final classification. We used baseline sMRI, FDG-PET, and AV-45 PET imaging data from 528 subjects in the Alzheimer's Disease Neuroimaging Initiative (ADNI) to evaluate our approach. Experimental results show that our HpMTFS method outperforms existing multimodal-based classification methods.
Affiliation(s)
- Yanjiao Ban
  - School of Computer, Electronics and Information, Guangxi University, Nanning 530004, Guangxi, China
- Huan Lao
  - Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin 541004, Guangxi, China; School of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, Guangxi, China
- Bin Li
  - School of Computer, Electronics and Information, Guangxi University, Nanning 530004, Guangxi, China
- Wenjun Su
  - School of Computer, Electronics and Information, Guangxi University, Nanning 530004, Guangxi, China
- Xuejun Zhang
  - School of Computer, Electronics and Information, Guangxi University, Nanning 530004, Guangxi, China; Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning 530004, Guangxi, China
|
5
|
HyperNTF: A hypergraph regularized nonnegative tensor factorization for dimensionality reduction. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.09.036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
|
6
|
Shah AM, Zamora R, Korff S, Barclay D, Yin J, El-Dehaibi F, Billiar TR, Vodovotz Y. Inferring Tissue-Specific, TLR4-Dependent Type 17 Immune Interactions in Experimental Trauma/Hemorrhagic Shock and Resuscitation Using Computational Modeling. Front Immunol 2022; 13:908618. [PMID: 35663944 PMCID: PMC9160183 DOI: 10.3389/fimmu.2022.908618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 04/21/2022] [Indexed: 11/13/2022] Open
Abstract
Trauma/hemorrhagic shock followed by resuscitation (T/HS-R) results in multi-system inflammation and organ dysfunction, in part driven by binding of damage-associated molecular pattern molecules to Toll-like Receptor 4 (TLR4). We carried out experimental T/HS-R (pseudo-fracture plus 2 h of shock followed by 0-22 h of resuscitation) in C57BL/6 (wild type [WT]) and TLR4-null (TLR4-/-) mice, and then defined the dynamics of 20 protein-level inflammatory mediators in the heart, gut, lung, liver, spleen, kidney, and systemic circulation. Cross-correlation and Principal Component Analysis (PCA) on data from the 7 tissues sampled suggested that TLR4-/- samples express multiple inflammatory mediators in a small subset of tissue compartments as compared to the WT samples, in which many inflammatory mediators were localized non-specifically to nearly all compartments. We and others have previously defined a central role for type 17 immune cells in human trauma. Accordingly, correlations between IL-17A and GM-CSF (indicative of pathogenic Th17 cells); between IL-17A and IL-10 (indicative of non-pathogenic Th17 cells); and between IL-17A and TNF (indicative of memory/effector T cells) were assessed across all tissues studied. In both WT and TLR4-/- mice, positive correlations were observed between IL-17A and GM-CSF, IL-10, and TNF in the kidney and gut. In contrast, the variable and dynamic presence of both pathogenic and non-pathogenic Th17 cells was inferred in the systemic circulation of TLR4-/- mice over time, suggesting a role for TLR4 in efflux of these cells into peripheral tissues. Hypergraph analysis (used to define dynamic, cross-compartment networks), in concert with PCA, suggested that IL-17A was present persistently in all tissues at all sampled time points except for its absence in the plasma at 0.5 h in the WT group, supporting the hypothesis that T/HS-R induces efflux of Th17 cells from the circulation and into specific tissues. These analyses suggest a complex, context-specific role for TLR4 and type 17 immunity following T/HS-R.
Affiliation(s)
- Ashti M Shah
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
- Ruben Zamora
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
  - Center for Inflammation and Regeneration Modeling, McGowan Institute for Regenerative Medicine, Pittsburgh, PA, United States
- Sebastian Korff
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
- Derek Barclay
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
- Jinling Yin
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
- Fayten El-Dehaibi
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
- Timothy R Billiar
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
- Yoram Vodovotz
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, United States
  - Center for Inflammation and Regeneration Modeling, McGowan Institute for Regenerative Medicine, Pittsburgh, PA, United States
  - Center for Systems Immunology, University of Pittsburgh, Pittsburgh, PA, United States
|
7
|
Gao Y, Zhang Z, Lin H, Zhao X, Du S, Zou C. Hypergraph Learning: Methods and Practices. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:2548-2566. [PMID: 33211654 DOI: 10.1109/tpami.2020.3039374] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 06/11/2023]
Abstract
Hypergraph learning is a technique for conducting learning on a hypergraph structure. In recent years, hypergraph learning has attracted increasing attention due to its flexibility and capability in modeling complex data correlation. In this paper, we first systematically review existing literature regarding hypergraph generation, including distance-based, representation-based, attribute-based, and network-based approaches. Then, we introduce the existing learning methods on a hypergraph, including transductive hypergraph learning, inductive hypergraph learning, hypergraph structure updating, and multi-modal hypergraph learning. After that, we present a tensor-based dynamic hypergraph representation and learning framework that can effectively describe high-order correlation in a hypergraph. To study the effectiveness and efficiency of hypergraph generation and learning methods, we conduct comprehensive evaluations on several typical applications, including object and action recognition, Microblog sentiment prediction, and clustering. In addition, we contribute a hypergraph learning development toolkit called THU-HyperG.
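Of the generation strategies listed in this abstract, the distance-based one is the simplest to illustrate. The sketch below assumes the common k-nearest-neighbour construction, in which each point and its k closest neighbours form one hyperedge. The function name `knn_hyperedges` and the sample points are hypothetical, and the code does not use or imitate the THU-HyperG API.

```python
import math


def knn_hyperedges(points, k):
    """Build one hyperedge per point: the point itself plus its k nearest neighbours.

    points is a list of equal-length coordinate tuples; hyperedges are returned
    as frozensets of point indices, in the same order as the input points.
    """
    edges = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort by Euclidean distance, breaking ties by index for determinism.
        others.sort(key=lambda j: (math.dist(p, points[j]), j))
        edges.append(frozenset([i] + others[:k]))
    return edges


# Hypothetical data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
edges = knn_hyperedges(points, k=1)
```

With k = 1 each hyperedge degenerates to an ordinary edge; larger k yields hyperedges that group whole neighbourhoods, which is where the higher-order structure comes from.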
|
8
|
Fast hypergraph regularized nonnegative tensor ring decomposition based on low-rank approximation. Appl Intell 2022. [DOI: 10.1007/s10489-022-03346-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/02/2022]
|
9
|
The Application of Directed Hyper-Graphs for Analysis of Models of Information Systems. Mathematics 2022. [DOI: 10.3390/math10050759] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/27/2023]
Abstract
Hyper-graphs offer the opportunity to formulate logical statements about their components, for example, using Horn clauses. Several models of Information Systems can be represented using hyper-graphs, such as workflows, i.e., business processes. During the modeling of Information Systems, many constraints should be maintained during the development process. The models of Information Systems are complex objects; for this reason, the analysis of algorithms and graph structures that can support the consistency and integrity of models is an essential issue. A set of interdependencies between models and components of architecture can be formulated by functional dependencies and can be investigated via algorithmic methods. Information Systems can be perceived as overarching documents that include data collections; documents to be processed; and representations of business processes, activities, and services. When selecting and working out an appropriate method for encoding artifacts in Information Systems, the complex structure can be represented using hyper-graphs. This representation enables the application of various model-checking, verification, and validation tools that are based on formal approaches. This paper describes the proposed representations in different situations using hyper-graphs and, moreover, the formal, algorithmic-based model-checking methods that are coupled with the representations. The model-checking methods are realized by algorithms that are grounded in graph-theoretical approaches and tailored to the specificity of hyper-graphs. Finally, the possible applications in a real-life enterprise environment are outlined.
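The link between directed hyper-graphs and Horn clauses mentioned in this abstract can be sketched as follows. Each directed hyperedge is treated as a rule whose tail is the clause body and whose head is the single conclusion, and forward chaining computes everything derivable from a set of starting facts. The rule set and the name `forward_closure` are hypothetical illustrations, not the paper's model-checking algorithms.

```python
def forward_closure(hyperedges, facts):
    """Return every node derivable from `facts` by forward chaining.

    Each directed hyperedge is a pair (tail, head): a Horn clause whose body is
    the set `tail` and whose conclusion is the single node `head`.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for tail, head in hyperedges:
            # A rule fires once all nodes in its tail have been derived.
            if head not in derived and set(tail) <= derived:
                derived.add(head)
                changed = True
    return derived


# Hypothetical business-process rules encoded as directed hyperedges.
rules = [
    ({"order_received", "stock_checked"}, "order_confirmed"),
    ({"order_confirmed"}, "invoice_issued"),
    ({"invoice_issued", "payment_received"}, "order_shipped"),
]
closure = forward_closure(rules, {"order_received", "stock_checked"})
```

Here `order_shipped` stays underivable because `payment_received` is never established, which is the kind of consistency question a reachability check over the hyper-graph can answer.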
|
10
|
Abstract
The modeling of the graphical representation of business processes (BP) or workflows in enterprise information systems (IS) is often used to represent various activities, entities, relations, and functions, and the communication between them, in an enterprise to achieve the major goal of operational support. In this work, we decided to use graph representation approaches, especially hypergraphs, to depict the complex relationships that exist among the artifacts and constituents of BP for more efficient and accurate manipulations. We used bipartite and further hypergraph formats for storing and curating data. We have investigated the various descriptive languages and representation models of BP, such as process modeling, workflow and process integration, and object-oriented (OO) languages. We have carried out experiments using different approach combinations, but for observing quilted representation, we focused on the main consistencies of “DBP”. As the final approach, we used the “DBP” stream and data schemes that are defined by us to proceed with using pure Python for manually generating data and external Python libraries to store, curate, and visualize “DBP”.
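The bipartite storage format mentioned in this abstract can be illustrated in pure Python. The sketch assumes a hypergraph held as a mapping from hyperedge name to node set and flattens it into (node, hyperedge) pairs, which is the edge list of the equivalent bipartite graph. The names `to_bipartite`, `from_bipartite`, and the sample process data are hypothetical and are not taken from the authors' "DBP" schemes.

```python
def to_bipartite(hyperedges):
    """Flatten {edge_name: set_of_nodes} into a sorted list of (node, edge_name) pairs."""
    return sorted(
        (node, edge) for edge, nodes in hyperedges.items() for node in nodes
    )


def from_bipartite(pairs):
    """Rebuild {edge_name: set_of_nodes} from (node, edge_name) pairs."""
    hyperedges = {}
    for node, edge in pairs:
        hyperedges.setdefault(edge, set()).add(node)
    return hyperedges


# Hypothetical process: each activity (hyperedge) groups the artifacts it touches.
process = {
    "approve_order": {"order", "customer", "credit_check"},
    "ship_order": {"order", "warehouse"},
}
pairs = to_bipartite(process)
restored = from_bipartite(pairs)
```

The pair list is convenient for storage in a relational table or a CSV file, while the mapping form is more convenient for set operations on hyperedges.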
|
11
|
Chodrow PS, Veldt N, Benson AR. Generative hypergraph clustering: From blockmodels to modularity. Science Advances 2021; 7:eabh1303. [PMID: 34233880 PMCID: PMC11559555 DOI: 10.1126/sciadv.abh1303] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 02/17/2021] [Accepted: 05/24/2021] [Indexed: 06/13/2023]
Abstract
Hypergraphs are a natural modeling paradigm for networked systems with multiway interactions. A standard task in network analysis is the identification of closely related or densely interconnected nodes. We propose a probabilistic generative model of clustered hypergraphs with heterogeneous node degrees and edge sizes. Approximate maximum likelihood inference in this model leads to a clustering objective that generalizes the popular modularity objective for graphs. From this, we derive an inference algorithm that generalizes the Louvain graph community detection method, and a faster, specialized variant in which edges are expected to lie fully within clusters. Using synthetic and empirical data, we demonstrate that the specialized method is highly scalable and can detect clusters where graph-based methods fail. We also use our model to find interpretable higher-order structure in school contact networks, U.S. congressional bill cosponsorship and committees, product categories in copurchasing behavior, and hotel locations from web browsing sessions.
Affiliation(s)
- Philip S Chodrow
  - Department of Mathematics, University of California, Los Angeles, 520 Portola Plaza, Los Angeles, CA 90095, USA
- Nate Veldt
  - Center for Applied Mathematics, Cornell University, 657 Frank H.T. Rhodes Hall, Ithaca, NY 14853, USA
- Austin R Benson
  - Department of Computer Science, Cornell University, 413B Gates Hall, Ithaca, NY 14853, USA
|
12
|
Schwob MR, Zhan J, Dempsey A. Modeling Cell Communication with Time-Dependent Signaling Hypergraphs. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:1151-1163. [PMID: 31449029 DOI: 10.1109/tcbb.2019.2937033] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 06/10/2023]
Abstract
Signaling pathways describe a group of molecules in a cell that collaborate to control one or more cell functions, such as cell division or cell death. The pathways communicate by sending signals between molecules, and this process is repeated until the terminal molecule is activated and the cell function is executed. Signaling pathways are often represented as directed graphs, which does not provide enough information when modeling cell functions and reactions. Recently, directed hypergraphs have been proposed to more accurately represent reactions such as protein activation and interaction. To further improve the representation of signaling pathways, time dependency must be considered to improve the representation of cell signaling at any given time. In this paper, the importance of time dependency in modeling signaling pathways is presented. An algorithm that finds the shortest a priori path using time-dependent hypergraphs to more robustly model signaling pathways is adopted. The shortest time-dependent hyperpaths representing signaling pathways are an improvement to the recent adoption of hypergraphs representing these pathways. The results display the improved representation of signaling pathways and motivate the adoption of time-dependent signaling hypergraphs.
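The idea of time dependency in path finding can be shown on a deliberately simplified case. The sketch below is a time-dependent Dijkstra search on an ordinary directed graph, not on a hypergraph, and it is not the a priori hyperpath algorithm adopted in the paper. It assumes each edge carries a delay function of the departure time and that delays are FIFO (leaving later never means arriving earlier). All molecule names and delay values are hypothetical.

```python
import heapq


def earliest_arrival(graph, source, target, start_time):
    """Earliest arrival time at `target` when leaving `source` at `start_time`.

    graph[u] is a list of (v, delay_fn) pairs, where delay_fn(t) is the time
    needed to traverse the edge u -> v when departing from u at time t.
    Returns None if the target cannot be reached.
    """
    best = {source: start_time}
    heap = [(start_time, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == target:
            return t
        if t > best.get(u, float("inf")):
            continue  # Stale queue entry; a better arrival was already found.
        for v, delay in graph.get(u, []):
            arrival = t + delay(t)
            if arrival < best.get(v, float("inf")):
                best[v] = arrival
                heapq.heappush(heap, (arrival, v))
    return None


# Hypothetical signaling chain; the receptor -> kinase step slows down after t = 5.
graph = {
    "ligand": [("receptor", lambda t: 1.0)],
    "receptor": [
        ("kinase", lambda t: 2.0 if t < 5 else 4.0),
        ("tf", lambda t: 10.0),
    ],
    "kinase": [("tf", lambda t: 1.0)],
}
early = earliest_arrival(graph, "ligand", "tf", 0.0)
late = earliest_arrival(graph, "ligand", "tf", 5.0)
```

The same start and end nodes give different total times depending on when the signal departs, which is the effect a static graph cannot express.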
|
13
|
Li Z, Song T, Yong J, Kuang R. Imputation of spatially-resolved transcriptomes by graph-regularized tensor completion. PLoS Comput Biol 2021; 17:e1008218. [PMID: 33826608 PMCID: PMC8055040 DOI: 10.1371/journal.pcbi.1008218] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 07/30/2020] [Revised: 04/19/2021] [Accepted: 03/19/2021] [Indexed: 12/02/2022] Open
Abstract
High-throughput spatial-transcriptomics RNA sequencing (sptRNA-seq) based on in-situ capturing technologies has recently been developed to spatially resolve transcriptome-wide mRNA expressions mapped to the captured locations in a tissue sample. Due to the low RNA capture efficiency by in-situ capturing and the complication of tissue section preparation, sptRNA-seq data often only provides an incomplete profiling of the gene expressions over the spatial regions of the tissue. In this paper, we introduce a graph-regularized tensor completion model for imputing the missing mRNA expressions in sptRNA-seq data, namely FIST, Fast Imputation of Spatially-resolved transcriptomes by graph-regularized Tensor completion. We first model sptRNA-seq data as a 3-way sparse tensor in genes (p-mode) and the (x, y) spatial coordinates (x-mode and y-mode) of the observed gene expressions, and then consider the imputation of the unobserved entries or fibers as a tensor completion problem in Canonical Polyadic Decomposition (CPD) form. To improve the imputation of highly sparse sptRNA-seq data, we also introduce a protein-protein interaction network to add prior knowledge of gene functions, and a spatial graph to capture the spatial relations among the capture spots. The tensor completion model is then regularized by a Cartesian product graph of protein-protein interaction network and the spatial graph to capture the high-order relations in the tensor. In the experiments, FIST was tested on ten 10x Genomics Visium spatial transcriptomic datasets of different tissue sections with cross-validation among the known entries in the imputation. FIST significantly outperformed the state-of-the-art methods for single-cell RNAseq data imputation. We also demonstrate that both the spatial graph and PPI network play an important role in improving the imputation.
In a case study, we further analyzed the gene clusters obtained from the imputed gene expressions to show that the imputations by FIST indeed capture the spatial characteristics in the gene expressions and reveal functions that are highly relevant to three different kinds of tissues in mouse kidney. Biological tissues are composed of different types of structurally organized cell units playing distinct functional roles. The exciting new spatial gene expression profiling methods have enabled the analysis of spatially resolved transcriptomes to understand the spatial and functional characteristics of these cells in the context of eco-environment of tissue. Due to the technical limitations, spatial transcriptomics data suffers from only sparsely measured mRNAs by in-situ capture and possibly missing spots in tissue regions that entirely failed fixing and permeabilizing RNAs. Our method, FIST (Fast Imputation of Spatially-resolved transcriptomes by graph-regularized Tensor completion), focuses on the spatial and high-sparsity nature of spatial transcriptomics data by modeling the data as a 3-way gene-by-(x, y)-location tensor and a product graph of a spatial graph and a protein-protein interaction network. Our comprehensive evaluation of FIST on ten 10x Genomics Visium spatial genomics datasets and comparison with the methods for single-cell RNA sequencing data imputation demonstrate that FIST is a better method more suitable for spatial gene expression imputation. Overall, we found FIST a useful new method for analyzing spatially resolved gene expressions based on novel modeling of spatial and functional information.
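The Cartesian product graph used as a regularizer in this abstract has a compact algebraic form: for two graphs with Laplacians L1 and L2, the Laplacian of their Cartesian product is the Kronecker sum L1 ⊗ I + I ⊗ L2. The sketch below demonstrates that identity for two factors in pure Python. It is not the FIST implementation, and the helper names and the two-node path graphs are illustrative assumptions.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def product_laplacian(l1, l2):
    """Laplacian of the Cartesian product graph: L1 (x) I + I (x) L2."""
    left = kron(l1, identity(len(l2)))
    right = kron(identity(len(l1)), l2)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(left, right)]


# Two path graphs on 2 nodes each; their Cartesian product is a 4-cycle.
path2 = [[1, -1], [-1, 1]]
L_prod = product_laplacian(path2, path2)
```

Each node of the product graph is a pair (one node from each factor), so in a gene-by-location setting a single product node would stand for one gene at one spot, connected both to interacting genes at the same spot and to the same gene at neighbouring spots.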
Affiliation(s)
- Zhuliu Li
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, Minnesota, United States of America
- Tianci Song
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, Minnesota, United States of America
- Jeongsik Yong
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota Twin Cities, Minneapolis, Minnesota, United States of America
- Rui Kuang
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, Minnesota, United States of America
14
Petegrosso R, Song T, Kuang R. Hierarchical Canonical Correlation Analysis Reveals Phenotype, Genotype, and Geoclimate Associations in Plants. PLANT PHENOMICS (WASHINGTON, D.C.) 2020; 2020:1969142. [PMID: 33313545 PMCID: PMC7706319 DOI: 10.34133/2020/1969142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2019] [Accepted: 03/05/2020] [Indexed: 06/12/2023]
Abstract
The local environment of the geographical origin of plants shaped their genetic variations through environmental adaptation. While the characteristics of the local environment correlate with the genotypes and other genomic features of the plants, they can also be indicative of genotype-phenotype associations providing additional information relevant to environmental dependence. In this study, we investigate how the geoclimatic features from the geographical origin of the Arabidopsis thaliana accessions can be integrated with genomic features for phenotype prediction and association analysis using advanced canonical correlation analysis (CCA). In particular, we propose a novel method called hierarchical canonical correlation analysis (HCCA) to combine mutations, gene expressions, and DNA methylations with geoclimatic features for informative coprojections of the features. HCCA uses a condition number of the cross-covariance between pairs of datasets to infer a hierarchical structure for applying CCA to combine the data. In the experiments on Arabidopsis thaliana data from 1001 Genomes and 1001 Epigenomes projects and climatic, atmospheric, and soil environmental variables combined by CLIMtools, HCCA provided a joint representation of the genomic data and geoclimate data for better prediction of the special flowering time at 10°C (FT10) of Arabidopsis thaliana. We also extended HCCA with information from a protein-protein interaction (PPI) network to guide the feature learning by imposing network modules onto the genomic features, which are shown to be useful for identifying genes with more coherent functions correlated with the geoclimatic features. The findings in this study suggest that environmental data comprise an important component in plant phenotype analysis. 
HCCA is a useful data integration technique for phenotype prediction, and a better understanding of the interactions between gene functions and environment as more useful functional information is introduced by coprojections of multiple genomic datasets.
Affiliation(s)
- Raphael Petegrosso
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN, USA
- Tianci Song
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN, USA
- Rui Kuang
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN, USA
15
16
Zu C, Gao Y, Munsell B, Kim M, Peng Z, Cohen JR, Zhang D, Wu G. Identifying disease-related subnetwork connectome biomarkers by sparse hypergraph learning. Brain Imaging Behav 2019; 13:879-892. [PMID: 29948906 PMCID: PMC6513717 DOI: 10.1007/s11682-018-9899-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 02/06/2023]
Abstract
The functional brain network has gained increased attention in the neuroscience community because of its ability to reveal the underlying architecture of the human brain. In general, most work on functional network connectivity is based on the correlations between discrete-time-series signals that link only two different brain regions. However, these simple region-to-region connectivity models do not capture complex connectivity patterns between three or more brain regions that form a connectivity subnetwork, or subnetwork for short. To overcome this limitation, a hypergraph learning-based method is proposed to identify subnetwork differences between two different cohorts. To achieve our goal, a hypergraph is constructed, where each vertex represents a subject and each hyperedge encodes a subnetwork with similar functional connectivity patterns between different subjects. Unlike previous learning-based methods, our approach is designed to jointly optimize the weights for all hyperedges such that the learned representation is in consensus with the distribution of phenotype data, i.e. clinical labels. In order to suppress spurious subnetwork biomarkers, we further enforce a sparsity constraint on the hyperedge weights, where a larger hyperedge weight indicates a subnetwork with the capability of identifying the disorder condition. We apply our hypergraph learning-based method to identify subnetwork biomarkers in Autism Spectrum Disorder (ASD) and Attention Deficit Hyperactivity Disorder (ADHD). A comprehensive quantitative and qualitative analysis is performed, and the results show that our approach can correctly classify ASD and ADHD subjects from normal controls with accuracies of 87.65% and 65.08%, respectively.
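As a rough illustration of how weighted hyperedges enter hypergraph learning, the sketch below builds the widely used normalized hypergraph Laplacian, I - Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2), from an incidence matrix H and hyperedge weights. This is an assumption about the general technique, not the authors' exact objective; the function name `hypergraph_laplacian` and the toy hypergraph are hypothetical.

```python
import numpy as np

def hypergraph_laplacian(incidence, edge_weights):
    """Normalized hypergraph Laplacian from a vertex-by-hyperedge incidence matrix.

    incidence[v, e] is 1 if vertex v belongs to hyperedge e, else 0.
    Every vertex must belong to at least one hyperedge with positive weight,
    otherwise its degree is zero and the normalization is undefined.
    """
    h = np.asarray(incidence, dtype=float)
    w = np.asarray(edge_weights, dtype=float)
    vertex_degree = h @ w            # weighted number of hyperedges per vertex
    edge_degree = h.sum(axis=0)      # number of vertices per hyperedge
    scaled = h / np.sqrt(vertex_degree)[:, None]   # Dv^(-1/2) H
    theta = scaled @ np.diag(w / edge_degree) @ scaled.T
    return np.eye(h.shape[0]) - theta

# Toy hypergraph (assumed data): 4 subjects, hyperedges {0, 1, 2} and {2, 3}.
incidence = np.array([[1, 0], [1, 0], [1, 1], [0, 1]])
weights = np.array([1.0, 0.5])
lap = hypergraph_laplacian(incidence, weights)
```

Under a sparsity constraint some hyperedge weights may shrink to zero; in this sketch a vertex covered only by zero-weight hyperedges would need separate handling.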
Affiliation(s)
- Chen Zu
- Department of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
- Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Yue Gao
- School of Software, Tsinghua University, Beijing, China
- Brent Munsell
- Department of Computer Science, College of Charleston, Charleston, SC, USA
- Minjeong Kim
- Department of Computer Science, University of North Carolina, Greensboro, NC, USA
- Ziwen Peng
- Centre for Studies of Psychological Application, School of Psychology, South China Normal University, Guangzhou, China
- Jessica R Cohen
- Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Daoqiang Zhang
- Department of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
- Guorong Wu
- Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
17
Barrot CC, Woillard JB, Picard N. Big data in pharmacogenomics: current applications, perspectives and pitfalls. Pharmacogenomics 2019; 20:609-620. [PMID: 31190620 DOI: 10.2217/pgs-2018-0184] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/19/2023] Open
Abstract
The efficiency of new generation sequencing methods and the reduction of their cost have led pharmacogenomics to gradually supplant pharmacogenetics, leading to new applications in personalized medicine along with new perspectives in drug design or identification of drug response factors. The amount of data generated in genomics fits the definition of big data, and needs specific bioinformatics processing following standard steps: data collection, processing, analysis and interpretation. Pitfalls of pharmacogenomics studies are directly related to these steps. This review aims to describe these steps from a pharmacogenomic point of view, focusing on bioinformatics aspects.
Affiliation(s)
- Claire-Cécile Barrot
- INSERM, IPPRITT, U1248, F-87000, Limoges, France; Univ. Limoges, IPPRITT, F-87000 Limoges, France
- Jean-Baptiste Woillard
- INSERM, IPPRITT, U1248, F-87000, Limoges, France; Univ. Limoges, IPPRITT, F-87000 Limoges, France
- Nicolas Picard
- INSERM, IPPRITT, U1248, F-87000, Limoges, France; Univ. Limoges, IPPRITT, F-87000 Limoges, France
18
Irimia A, Wei S, Lu N, Moore CM, Kennedy DN. Mobile Monitoring of Traumatic Brain Injury in Older Adults: Challenges and Opportunities. Neuroinformatics 2019; 15:227-230. [PMID: 28748392 DOI: 10.1007/s12021-017-9335-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/27/2023]
Affiliation(s)
- Andrei Irimia
- Ethel Percy Andrus Gerontology Center, Leonard Davis School of Gerontology, University of Southern California, 3715 McClintock Avenue, Los Angeles, CA, 90089, USA
- Susan Wei
- Division of Biostatistics, School of Public Health, University of Minnesota, 420 Delaware Street SE, Minneapolis, MN, 55455, USA
- Nanshu Lu
- Department of Aerospace Engineering and Engineering Mechanics, Cockrell School of Engineering, University of Texas, 210 East 24th Street, Austin, TX, 78705, USA
- Constance M Moore
- Center for Comparative Neuroimaging & Department of Psychiatry, University of Massachusetts Medical School, 365 Plantation Street, Biotech One, Worcester, MA, 01605, USA
- David N Kennedy
- Eunice Kennedy Shriver Center & Department of Psychiatry, University of Massachusetts Medical School, 55 Lake Avenue North, Worcester, MA, 01605, USA
19
Zhang Z, Lin H, Zhao X, Ji R, Gao Y. Inductive Multi-Hypergraph Learning and Its Application on View-Based 3D Object Classification. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 27:5957-5968. [PMID: 30072328 DOI: 10.1109/tip.2018.2862625] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 06/08/2023]
Abstract
The wide use of 3D applications has led to an increasing amount of 3D object data, and thus effective 3D object classification techniques have become an urgent requirement. One important and challenging task for 3D object classification is how to formulate the 3D data correlation and exploit it. Most previous works focus on learning an optimal pairwise distance metric for object comparison, which may lose the global correlation among 3D objects. Recently, transductive hypergraph learning has been investigated for classification, which can jointly explore the correlation among multiple objects, including both the labeled and unlabeled data. Although these methods have shown better performance, they are still limited because 1) a considerable amount of testing data may not be available in practice and 2) the computational cost of testing newly arriving data is high. To handle this problem, considering the multi-modal representations of 3D objects in practice, we propose an inductive multi-hypergraph learning algorithm, which targets learning an optimal projection for the multi-modal training data. In this method, all the training data are formulated in a multi-hypergraph based on the features, and the inductive learning is conducted to learn the projection matrices and the optimal multi-hypergraph combination weights simultaneously. Different from transductive learning on hypergraphs, the high-cost training process is off-line, and the testing process is very efficient for inductive learning on hypergraphs. We have conducted experiments on two 3D benchmarks, i.e., the NTU and the ModelNet40 data sets, and compared the proposed algorithm with the state-of-the-art methods and traditional transductive multi-hypergraph learning methods. Experimental results have demonstrated that the proposed method can achieve effective and efficient classification performance. We also note that the proposed method is a general framework and has the potential to be applied in other applications in practice.
20
Wang Y, Zhu L, Qian X, Han J. Joint Hypergraph Learning for Tag-Based Image Retrieval. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 27:4437-4451. [PMID: 29897870 DOI: 10.1109/tip.2018.2837219] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/08/2023]
Abstract
As image sharing websites like Flickr become more and more popular, many researchers have concentrated on tag-based image retrieval. It is one of the important ways to find images contributed by social users. In this research field, tag information and diverse visual features have been investigated. However, most existing methods use these visual features separately or sequentially. In this paper, we propose a global and local visual feature fusion approach to learn the relevance of images using a hypergraph. A hypergraph is constructed first by utilizing global and local visual features and tag information. Then, we propose a pseudo-relevance feedback mechanism to obtain the pseudo-positive images. Finally, with the hypergraph and pseudo-relevance feedback, we adopt the hypergraph learning algorithm to calculate the relevance score of each image to the query. Experimental results demonstrate the effectiveness of the proposed approach.
21
Gaudelet T, Malod-Dognin N, Pržulj N. Higher-order molecular organization as a source of biological function. Bioinformatics 2018; 34:i944-i953. [PMID: 30423061 PMCID: PMC6129285 DOI: 10.1093/bioinformatics/bty570] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 12/27/2022] Open
Abstract
Motivation: Molecular interactions have widely been modelled as networks. The local wiring patterns around molecules in molecular networks are linked with their biological functions. However, networks model only pairwise interactions between molecules and cannot explicitly and directly capture the higher-order molecular organization, such as protein complexes and pathways. Hence, we ask if hypergraphs (hypernetworks), which directly capture entire complexes and pathways along with protein-protein interactions (PPIs), carry additional functional information beyond what can be uncovered from networks of pairwise molecular interactions. The mathematical formalism of a hypergraph has long been known, but not often used in studying molecular networks due to the lack of sophisticated algorithms for mining the underlying biological information hidden in the wiring patterns of molecular systems modelled as hypernetworks. Results: We propose a new, multi-scale, protein interaction hypernetwork model that utilizes hypergraphs to capture different scales of protein organization, including PPIs, protein complexes and pathways. In analogy to graphlets, we introduce hypergraphlets, small, connected, non-isomorphic, induced sub-hypergraphs of a hypergraph, to quantify the local wiring patterns of these multi-scale molecular hypergraphs and to mine them for new biological information. We apply them to model the multi-scale protein networks of baker's yeast and human and show that the higher-order molecular organization captured by these hypergraphs is strongly related to the underlying biology. Importantly, we demonstrate that our new models and data mining tools reveal different, but complementary biological information compared with classical PPI networks. We apply our hypergraphlets to successfully predict biological functions of uncharacterized proteins. Availability and implementation: Code and data are available online at http://www0.cs.ucl.ac.uk/staff/natasa/hypergraphlets.
Affiliation(s)
- Thomas Gaudelet
- Department of Computer Science, University College London, London, UK
- Noël Malod-Dognin
- Department of Computer Science, University College London, London, UK
- Nataša Pržulj
- Department of Computer Science, University College London, London, UK
22
Ye J, Jin Z. Hyper-graph regularized discriminative concept factorization for data representation. Soft comput 2018. [DOI: 10.1007/s00500-017-2636-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/24/2022]
23
Nie L, Zhang L, Yan Y, Chang X, Liu M, Shaoling L. Multiview Physician-Specific Attributes Fusion for Health Seeking. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:3680-3691. [PMID: 27337733 DOI: 10.1109/tcyb.2016.2577590] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 06/06/2023]
Abstract
Community-based health services have risen as important online resources for resolving users' health concerns. Despite their value, the gap between what health seekers with specific health needs require and what busy physicians with specific attitudes and expertise can offer is widening. To bridge this gap, we present a question routing scheme that is able to connect health seekers to the right physicians. In this scheme, we first bridge the expertise matching gap via a probabilistic fusion of the physician-expertise distribution and the expertise-question distribution. The distributions are calculated by hypergraph-based learning and kernel density estimation. We then measure physicians' attitudes toward answering general questions from the perspectives of activity, responsibility, reputation, and willingness. Finally, we adaptively fuse the expertise modeling and attitude modeling by considering the personal needs of the health seekers. Extensive experiments have been conducted on a real-world dataset to validate our proposed scheme.
24
Zhang W, Chien J, Yong J, Kuang R. Network-based machine learning and graph theory algorithms for precision oncology. NPJ Precis Oncol 2017; 1:25. [PMID: 29872707 PMCID: PMC5871915 DOI: 10.1038/s41698-017-0029-7] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Received: 02/02/2017] [Revised: 06/28/2017] [Accepted: 06/29/2017] [Indexed: 01/07/2023] Open
Abstract
Network-based analytics plays an increasingly important role in precision oncology. Growing evidence in recent studies suggests that cancer can be better understood through mutated or dysregulated pathways or networks rather than individual mutations and that the efficacy of repositioned drugs can be inferred from disease modules in molecular networks. This article reviews network-based machine learning and graph theory algorithms for integrative analysis of personal genomic data and biomedical knowledge bases to identify tumor-specific molecular mechanisms, candidate targets and repositioned drugs for personalized treatment. The review focuses on the algorithmic design and mathematical formulation of these methods to facilitate applications and implementations of network-based analysis in the practice of precision oncology. We review the methods applied in three scenarios to integrate genomic data and network models in different analysis pipelines, and we examine three categories of network-based approaches for repositioning drugs in drug-disease-gene networks. In addition, we perform a comprehensive subnetwork/pathway analysis of mutations in 31 cancer genome projects in the Cancer Genome Atlas and present a detailed case study on ovarian cancer. Finally, we discuss interesting observations, potential pitfalls and future directions in network-based precision oncology.
Affiliation(s)
- Wei Zhang
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN USA
- Jeremy Chien
- Department of Cancer Biology, University of Kansas Medical Center, Kansas City, KS USA
- Jeongsik Yong
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN USA
- Rui Kuang
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN USA
25
Greenbaum G, Fefferman NH. Application of network methods for understanding evolutionary dynamics in discrete habitats. Mol Ecol 2017; 26:2850-2863. [DOI: 10.1111/mec.14059] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 11/15/2016] [Revised: 02/03/2017] [Accepted: 02/06/2017] [Indexed: 02/02/2023]
Affiliation(s)
- Gili Greenbaum
- Department of Solar Energy and Environmental Physics and Mitrani Department of Desert Ecology; The Jacob Blaustein Institutes for Desert Research; Ben-Gurion University of the Negev; Midreshet Ben-Gurion 84990 Israel
- Nina H. Fefferman
- Department of Ecology and Evolutionary Biology; University of Tennessee; Knoxville 37996 TN USA
26
A Meta-Path-Based Prediction Method for Human miRNA-Target Association. BIOMED RESEARCH INTERNATIONAL 2016; 2016:7460740. [PMID: 27703979 PMCID: PMC5040835 DOI: 10.1155/2016/7460740] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 06/30/2016] [Revised: 08/14/2016] [Accepted: 08/21/2016] [Indexed: 01/21/2023]
Abstract
MicroRNAs (miRNAs) are short noncoding RNAs that play important roles in regulating gene expression, and perturbed miRNAs are often associated with development and tumorigenesis as they have effects on their target mRNA. Predicting potential miRNA-target associations from multiple types of genomic data is a considerable problem in bioinformatics research. However, most of the existing methods did not fully use the experimentally validated miRNA-mRNA interactions. Here, we developed RMLM and RMLMSe to predict the relationship between miRNAs and their targets. RMLM and RMLMSe are global approaches as they can reconstruct the missing associations for all miRNA-target pairs simultaneously, and RMLMSe demonstrates that the integration of sequence information can improve the performance of RMLM. In RMLM, we use the RM measure to evaluate different relatedness between a miRNA and its target based on different meta-paths; logistic regression and the MLE method are employed to estimate the weights of different meta-paths. In RMLMSe, sequence information is utilized to improve the performance of RMLM. Here, we carry out fivefold cross validation and pathway enrichment analysis to demonstrate the performance of our methods. The fivefold experiments show that our methods have higher AUC scores compared with other methods and the integration of sequence information can improve the performance of miRNA-target association prediction.
27
Yan W, Xue W, Chen J, Hu G. Biological Networks for Cancer Candidate Biomarkers Discovery. Cancer Inform 2016; 15:1-7. [PMID: 27625573 PMCID: PMC5012434 DOI: 10.4137/cin.s39458] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 04/18/2016] [Revised: 06/06/2016] [Accepted: 06/16/2016] [Indexed: 12/16/2022] Open
Abstract
Due to its extraordinary heterogeneity and complexity, cancer is often proposed as a model case of a systems biology disease or network disease. There is a critical need for effective biomarkers for cancer diagnosis and/or outcome prediction from system-level analyses. Methods based on integrating omics data into networks have the potential to revolutionize the identification of cancer biomarkers. Deciphering the biological networks underlying cancer is undoubtedly important for understanding the molecular mechanisms of the disease and identifying effective biomarkers. In this review, the networks constructed for cancer biomarker discovery based on different omics level data are described and illustrated from recent advances in the field.
Affiliation(s)
- Wenying Yan
- Center for Systems Biology, Soochow University, Suzhou, Jiangsu, China
- Wenjin Xue
- Department of Electrical Engineering, Technician College of Taizhou, Taizhou, Jiangsu, China
- Jiajia Chen
- School of Chemistry, Biology and Material Engineering, Suzhou University of Science and Technology, Suzhou, China
- Guang Hu
- Center for Systems Biology, Soochow University, Suzhou, Jiangsu, China
28
Predicting miRNA Targets by Integrating Gene Regulatory Knowledge with Expression Profiles. PLoS One 2016; 11:e0152860. [PMID: 27064982 PMCID: PMC4827848 DOI: 10.1371/journal.pone.0152860] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 12/09/2015] [Accepted: 03/21/2016] [Indexed: 11/19/2022] Open
Abstract
MOTIVATION: MicroRNAs (miRNAs) play crucial roles in post-transcriptional gene regulation of both plants and mammals, and dysfunctions of miRNAs are often associated with tumorigenesis and development through the effects on their target messenger RNAs (mRNAs). Identifying miRNA functions is critical for understanding cancer mechanisms and determining the efficacy of drugs. Computational methods analyzing high-throughput data offer great assistance in understanding the diverse and complex relationships between miRNAs and mRNAs. However, most of the existing methods do not fully utilise the available knowledge in biology to reduce the uncertainty in the modeling process. Therefore it is desirable to develop a method that can seamlessly integrate existing biological knowledge and high-throughput data into the process of discovering miRNA regulation mechanisms. RESULTS: In this article we present an integrative framework, CIDER (Causal miRNA target Discovery with Expression profile and Regulatory knowledge), to predict miRNA targets. CIDER is able to utilise a variety of gene regulation knowledge, including transcriptional and post-transcriptional knowledge, and to exploit gene expression data for the discovery of miRNA-mRNA regulatory relationships. The benefits of our framework are demonstrated by both a simulation study and the analysis of the epithelial-to-mesenchymal transition (EMT) and the breast cancer (BRCA) datasets. Our results reveal that even a limited amount of either Transcription Factor (TF)-miRNA or miRNA-mRNA regulatory knowledge improves the performance of miRNA target prediction, and the combination of the two types of knowledge enhances the improvement further. Another useful property of the framework is that its performance increases monotonically with the increase of regulatory knowledge.
29
Network regularised Cox regression and multiplex network models to predict disease comorbidities and survival of cancer. Comput Biol Chem 2015; 59 Pt B:15-31. [DOI: 10.1016/j.compbiolchem.2015.08.010] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 06/13/2015] [Revised: 08/21/2015] [Accepted: 08/25/2015] [Indexed: 12/17/2022]
30
Network-Based Logistic Classification with an Enhanced L1/2 Solver Reveals Biomarker and Subnetwork Signatures for Diagnosing Lung Cancer. BIOMED RESEARCH INTERNATIONAL 2015; 2015:713953. [PMID: 26185761 PMCID: PMC4488258 DOI: 10.1155/2015/713953] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 10/24/2014] [Revised: 04/05/2015] [Accepted: 04/30/2015] [Indexed: 01/05/2023]
Abstract
Identifying biomarkers and signaling pathways is a critical step in genomic studies, in which regularization is a widely used feature extraction approach. However, most regularizers are based on the L1-norm; their results are not good enough in terms of sparsity and interpretation, and they are asymptotically biased, especially in genomic research. Recently, we gained a large amount of molecular interaction information about disease-related biological processes and gathered it through various databases, which focus on many aspects of biological systems. In this paper, we use an enhanced L1/2 penalized solver to penalize a network-constrained logistic regression model, called an enhanced L1/2 net, where the predictors are based on gene-expression data with biologic network knowledge. Extensive simulation studies showed that our proposed approach outperforms L1 regularization, the old L1/2 penalized solver, and the Elastic net approaches in terms of classification accuracy and stability. Furthermore, we applied our method to lung cancer data analysis and found that it achieves higher predictive accuracy than L1 regularization, the old L1/2 penalized solver, and the Elastic net approaches, while selecting fewer but informative biomarkers and pathways.
31
Wei B, Cheng M, Wang C, Li J. Combinative hypergraph learning for semi-supervised image classification. Neurocomputing 2015. [DOI: 10.1016/j.neucom.2014.11.028] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
32
Yu J, Rui Y, Tang YY, Tao D. High-order distance-based multiview stochastic learning in image classification. IEEE TRANSACTIONS ON CYBERNETICS 2014; 44:2431-2442. [PMID: 25415948 DOI: 10.1109/tcyb.2014.2307862] [Citation(s) in RCA: 129] [Impact Index Per Article: 11.7] [Indexed: 06/04/2023]
Abstract
How do we find all images in a larger set of images which have a specific content? Or estimate the position of a specific object relative to the camera? Image classification methods, like support vector machine (supervised) and transductive support vector machine (semi-supervised), are invaluable tools for the applications of content-based image retrieval, pose estimation, and optical character recognition. However, these methods can only handle images represented by a single feature. In many cases, different features (or multiview data) can be obtained, and how to efficiently utilize them is a challenge. It is inappropriate for the traditional concatenation schema to link features of different views into a long vector, because each view has its specific statistical property and physical interpretation. In this paper, we propose a high-order distance-based multiview stochastic learning (HD-MSL) method for image classification. HD-MSL effectively combines varied features into a unified representation and integrates the labeling information based on a probabilistic framework. In comparison with the existing strategies, our approach adopts the high-order distance obtained from the hypergraph to replace pairwise distance in estimating the probability matrix of data distribution. In addition, the proposed approach can automatically learn a combination coefficient for each view, which plays an important role in utilizing the complementary information of multiview data. An alternative optimization is designed to solve the objective functions of HD-MSL and obtain different views on coefficients and classification scores simultaneously. Experiments on two real-world datasets demonstrate the effectiveness of HD-MSL in image classification.
|
33
|
An effective haplotype assembly algorithm based on hypergraph partitioning. J Theor Biol 2014; 358:85-92. [DOI: 10.1016/j.jtbi.2014.05.034] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 01/13/2014] [Revised: 05/08/2014] [Accepted: 05/25/2014] [Indexed: 11/20/2022]
|
34
|
Ritz A, Tegge AN, Kim H, Poirel CL, Murali TM. Signaling hypergraphs. Trends Biotechnol 2014; 32:356-62. [PMID: 24857424 PMCID: PMC4299695 DOI: 10.1016/j.tibtech.2014.04.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 10/07/2013] [Revised: 04/01/2014] [Accepted: 04/04/2014] [Indexed: 01/10/2023]
Abstract
Signaling pathways function as the information-passing mechanisms of cells. A number of databases with extensive manual curation represent the current knowledge base for signaling pathways. These databases motivate the development of computational approaches for prediction and analysis. Such methods require an accurate and computable representation of signaling pathways. Pathways are often described as sets of proteins or as pairwise interactions between proteins. However, many signaling mechanisms cannot be described using these representations. In this opinion, we highlight a representation of signaling pathways that is underutilized: the hypergraph. We demonstrate the usefulness of hypergraphs in this context and discuss challenges and opportunities for the scientific community.
Affiliation(s)
- Anna Ritz
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- Allison N Tegge
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- Hyunju Kim
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- T M Murali
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA; ICTAS Center for Systems Biology of Engineered Tissues, Virginia Tech, Blacksburg, VA, USA.
|
35
|
Kim KJ, Cho SB. Meta-classifiers for high-dimensional, small sample classification for gene expression analysis. Pattern Anal Appl 2014. [DOI: 10.1007/s10044-014-0369-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 10/25/2022]
|
36
|
Yu J, Rui Y, Tao D. Click prediction for web image reranking using multimodal sparse coding. IEEE Trans Image Process 2014; 23:2019-2032. [PMID: 24710402 DOI: 10.1109/tip.2014.2311377] [Citation(s) in RCA: 260] [Impact Index Per Article: 23.6] [Indexed: 06/03/2023]
Abstract
Image reranking is effective for improving the performance of a text-based image search. However, existing reranking algorithms are limited for two main reasons: 1) the textual meta-data associated with images is often mismatched with their actual visual content and 2) the extracted visual features do not accurately describe the semantic similarities between images. Recently, user click information has been used in image reranking, because clicks have been shown to more accurately describe the relevance of retrieved images to search queries. However, a critical problem for click-based methods is the lack of click data, since only a small number of web images have actually been clicked on by users. Therefore, we aim to solve this problem by predicting image clicks. We propose a multimodal hypergraph learning-based sparse coding method for image click prediction, and apply the obtained click data to the reranking of images. We adopt a hypergraph to build a group of manifolds, which explore the complementarity of different features through a group of weights. Unlike a graph that has an edge between two vertices, a hyperedge in a hypergraph connects a set of vertices, and helps preserve the local smoothness of the constructed sparse codes. An alternating optimization procedure is then performed, and the weights of different modalities and the sparse codes are simultaneously obtained. Finally, a voting strategy is used to describe the predicted click as a binary event (click or no click), from the images' corresponding sparse codes. Thorough empirical studies on a large-scale database including nearly 330 K images demonstrate the effectiveness of our approach for click prediction when compared with several other methods. Additional image reranking experiments on real-world data show the use of click prediction is beneficial to improving the performance of prominent graph-based image reranking algorithms.
|
37
|
Hwang TH, Atluri G, Kuang R, Kumar V, Starr T, Silverstein KAT, Haverty PM, Zhang Z, Liu J. Large-scale integrative network-based analysis identifies common pathways disrupted by copy number alterations across cancers. BMC Genomics 2013; 14:440. [PMID: 23822816 PMCID: PMC3703268 DOI: 10.1186/1471-2164-14-440] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 07/04/2012] [Accepted: 06/26/2013] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Many large-scale studies analyzed high-throughput genomic data to identify altered pathways essential to the development and progression of specific types of cancer. However, no previous study has been extended to provide a comprehensive analysis of pathways disrupted by copy number alterations across different human cancers. Towards this goal, we propose a network-based method to integrate copy number alteration data with human protein-protein interaction networks and pathway databases to identify pathways that are commonly disrupted in many different types of cancer. RESULTS We applied our approach to a data set of 2,172 cancer patients across 16 different types of cancers, and discovered a set of commonly disrupted pathways, which are likely essential for tumor formation in majority of the cancers. We also identified pathways that are only disrupted in specific cancer types, providing molecular markers for different human cancers. Analysis with independent microarray gene expression datasets confirms that the commonly disrupted pathways can be used to identify patient subgroups with significantly different survival outcomes. We also provide a network view of disrupted pathways to explain how copy number alterations affect pathways that regulate cell growth, cycle, and differentiation for tumorigenesis. CONCLUSIONS In this work, we demonstrated that the network-based integrative analysis can help to identify pathways disrupted by copy number alterations across 16 types of human cancers, which are not readily identifiable by conventional overrepresentation-based and other pathway-based methods. All the results and source code are available at http://compbio.cs.umn.edu/NetPathID/.
Affiliation(s)
- Tae Hyun Hwang
- Masonic Cancer Center, University of Minnesota – Twin Cities, Minneapolis, MN, USA
- Quantitative Biomedical Research Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Simmons Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Gowtham Atluri
- Department of Computer Science and Engineering, University of Minnesota – Twin Cities, Minneapolis, MN, USA
- Rui Kuang
- Department of Computer Science and Engineering, University of Minnesota – Twin Cities, Minneapolis, MN, USA
- Vipin Kumar
- Department of Computer Science and Engineering, University of Minnesota – Twin Cities, Minneapolis, MN, USA
- Timothy Starr
- Masonic Cancer Center, University of Minnesota – Twin Cities, Minneapolis, MN, USA
- Department of Obstetrics, Gynecology & Women’s Health, University of Minnesota, Minneapolis, MN, USA
- Kevin AT Silverstein
- Masonic Cancer Center, University of Minnesota – Twin Cities, Minneapolis, MN, USA
- Peter M Haverty
- Department of Bioinformatics and Computational Biology, Genentech Inc, South San Francisco, CA, USA
- Zemin Zhang
- Department of Bioinformatics and Computational Biology, Genentech Inc, South San Francisco, CA, USA
- Jinfeng Liu
- Department of Bioinformatics and Computational Biology, Genentech Inc, South San Francisco, CA, USA
|
38
|
Zhang W, Ota T, Shridhar V, Chien J, Wu B, Kuang R. Network-based survival analysis reveals subnetwork signatures for predicting outcomes of ovarian cancer treatment. PLoS Comput Biol 2013; 9:e1002975. [PMID: 23555212 PMCID: PMC3605061 DOI: 10.1371/journal.pcbi.1002975] [Citation(s) in RCA: 123] [Impact Index Per Article: 10.3] [Received: 05/10/2012] [Accepted: 01/23/2013] [Indexed: 11/24/2022] Open
Abstract
Cox regression is commonly used to predict the outcome by the time to an event of interest and in addition, identify relevant features for survival analysis in cancer genomics. Due to the high-dimensionality of high-throughput genomic data, existing Cox models trained on any particular dataset usually generalize poorly to other independent datasets. In this paper, we propose a network-based Cox regression model called Net-Cox and applied Net-Cox for a large-scale survival analysis across multiple ovarian cancer datasets. Net-Cox integrates gene network information into the Cox's proportional hazard model to explore the co-expression or functional relation among high-dimensional gene expression features in the gene network. Net-Cox was applied to analyze three independent gene expression datasets including the TCGA ovarian cancer dataset and two other public ovarian cancer datasets. Net-Cox with the network information from gene co-expression or functional relations identified highly consistent signature genes across the three datasets, and because of the better generalization across the datasets, Net-Cox also consistently improved the accuracy of survival prediction over the baseline regularized Cox models. This study focused on analyzing the death and recurrence outcomes in the treatment of ovarian carcinoma to identify signature genes that can more reliably predict the events. The signature genes comprise dense protein-protein interaction subnetworks, enriched by extracellular matrix receptors and modulators or by nuclear signaling components downstream of extracellular signal-regulated kinases.
In the laboratory validation of the signature genes, a tumor array experiment by protein staining on an independent patient cohort from Mayo Clinic showed that the protein expression of the signature gene FBN1 is a biomarker significantly associated with the early recurrence after 12 months of the treatment in the ovarian cancer patients who are initially sensitive to chemotherapy. Net-Cox toolbox is available at http://compbio.cs.umn.edu/Net-Cox/. Network-based computational models are attracting increasing attention in studying cancer genomics because molecular networks provide valuable information on the functional organizations of molecules in cells. Survival analysis mostly with the Cox proportional hazard model is widely used to predict or correlate gene expressions with time to an event of interest (outcome) in cancer genomics. Surprisingly, network-based survival analysis has not received enough attention. In this paper, we studied resistance to chemotherapy in ovarian cancer with a network-based Cox model, called Net-Cox. The experiments confirm that networks representing gene co-expression or functional relations can be used to improve the accuracy and the robustness of survival prediction of outcome in ovarian cancer treatment. The study also revealed subnetwork signatures that are enriched by extracellular matrix receptors and modulators and the downstream nuclear signaling components of extracellular signal-regulators, respectively. In particular, FBN1, which was detected as a signature gene of high confidence by Net-Cox with network information, was validated as a biomarker for predicting early recurrence in platinum-sensitive ovarian cancer patients in laboratory.
Affiliation(s)
- Wei Zhang
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, Minnesota, United States of America
- Takayo Ota
- Department of Laboratory Medicine and Experimental Pathology, Mayo Clinic College of Medicine, Rochester, Minnesota, United States of America
- Viji Shridhar
- Department of Laboratory Medicine and Experimental Pathology, Mayo Clinic College of Medicine, Rochester, Minnesota, United States of America
- Jeremy Chien
- Department of Laboratory Medicine and Experimental Pathology, Mayo Clinic College of Medicine, Rochester, Minnesota, United States of America
- Baolin Wu
- Division of Biostatistics, School of Public Health, University of Minnesota Twin Cities, Minneapolis, Minnesota, United States of America
- Rui Kuang
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, Minnesota, United States of America
|
39
|
Lavi O, Dror G, Shamir R. Network-induced classification kernels for gene expression profile analysis. J Comput Biol 2012; 19:694-709. [PMID: 22697242 DOI: 10.1089/cmb.2012.0065] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 11/13/2022] Open
Abstract
Computational classification of gene expression profiles into distinct disease phenotypes has been highly successful to date. Still, robustness, accuracy, and biological interpretation of the results have been limited, and it was suggested that use of protein interaction information jointly with the expression profiles can improve the results. Here, we study three aspects of this problem. First, we show that interactions are indeed relevant by showing that co-expressed genes tend to be closer in the network of interactions. Second, we show that the improved performance of one extant method utilizing expression and interactions is not really due to the biological information in the network, while in another method this is not the case. Finally, we develop a new kernel method--called NICK--that integrates network and expression data for SVM classification, and demonstrate that overall it achieves better results than extant methods while running two orders of magnitude faster.
Affiliation(s)
- Ofer Lavi
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel.
|
40
|
Fang X, Netzer M, Baumgartner C, Bai C, Wang X. Genetic network and gene set enrichment analysis to identify biomarkers related to cigarette smoking and lung cancer. Cancer Treat Rev 2012; 39:77-88. [PMID: 22789435 DOI: 10.1016/j.ctrv.2012.06.001] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 04/24/2012] [Revised: 06/03/2012] [Accepted: 06/06/2012] [Indexed: 10/28/2022]
Abstract
OBJECTIVES Cigarette smoking is the most demonstrated risk factor for the development of lung cancer, while the related genetic mechanisms are still unclear. METHODS The preprocessed microarray expression dataset was downloaded from Gene Expression Omnibus database. Samples were classified according to the disease state, stage and smoking state. A new computational strategy was applied for the identification and biological interpretation of new candidate genes in lung cancer and smoking by coupling a network-based approach with gene set enrichment analysis. MEASUREMENTS Network analysis was performed by pair-wise comparison according to the disease states (tumor or normal), smoking states (current smokers or nonsmokers or former smokers), or the disease stage (stages I-IV). The most activated metabolic pathways were identified by gene set enrichment analysis. RESULTS Panels of top ranked gene candidates in smoking or cancer development were identified, including genes involved in cell proliferation and drug metabolism like cytochrome P450 and WW domain containing transcription regulator 1. Semaphorin 5A and protein phosphatase 1F are the common genes represented as major hubs in both the smoking and cancer related network. Six pathways, e.g. cell cycle, DNA replication, RNA transport, protein processing in endoplasmic reticulum, vascular smooth muscle contraction and endocytosis were commonly involved in smoking and lung cancer when comparing the top ten selected pathways. CONCLUSION New approach of bioinformatics for biomarker identification and validation can probe into deep genetic relationships between cigarette smoking and lung cancer. Our studies indicate that disease-specific network biomarkers, interaction between genes/proteins, or cross-talking of pathways provide more specific values for the development of precision therapies for lung.
Affiliation(s)
- Xiaocong Fang
- Department of Pulmonary Medicine, Zhongshan Hospital, Fudan University, Shanghai, China.
|
41
|
Yu J, Tao D, Wang M. Adaptive hypergraph learning and its application in image classification. IEEE Trans Image Process 2012; 21:3262-3272. [PMID: 22410334 DOI: 10.1109/tip.2012.2190083] [Citation(s) in RCA: 134] [Impact Index Per Article: 10.3] [Indexed: 05/31/2023]
Abstract
Recent years have witnessed a surge of interest in graph-based transductive image classification. Existing simple graph-based transductive learning methods only model the pairwise relationship of images, however, and they are sensitive to the radius parameter used in similarity calculation. Hypergraph learning has been investigated to solve both difficulties. It models the high-order relationship of samples by using a hyperedge to link multiple samples. Nevertheless, the existing hypergraph learning methods face two problems, i.e., how to generate hyperedges and how to handle a large set of hyperedges. This paper proposes an adaptive hypergraph learning method for transductive image classification. In our method, we generate hyperedges by linking images and their nearest neighbors. By varying the size of the neighborhood, we are able to generate a set of hyperedges for each image and its visual neighbors. Our method simultaneously learns the labels of unlabeled images and the weights of hyperedges. In this way, we can automatically modulate the effects of different hyperedges. Thorough empirical studies show the effectiveness of our approach when compared with representative baselines.
Affiliation(s)
- Jun Yu
- Department of Computer Science, Xiamen University, Xiamen 361005, China
|
42
|
Effectiveness of Partition and Graph Theoretic Clustering Algorithms for Multiple Source Partial Discharge Pattern Classification Using Probabilistic Neural Network and Its Adaptive Version: A Critique Based on Experimental Studies. J Electr Comput Eng 2012. [DOI: 10.1155/2012/479696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Partial discharge (PD) is a major cause of failure of power apparatus and hence its measurement and analysis have emerged as a vital field in assessing the condition of the insulation system. Several efforts have been undertaken by researchers to classify PD pulses utilizing artificial intelligence techniques. Recently, the focus has shifted to the identification of multiple sources of PD since it is often encountered in real-time measurements. Studies have indicated that classification of multi-source PD becomes difficult with the degree of overlap and that several techniques such as mixed Weibull functions, neural networks, and wavelet transformation have been attempted with limited success. Since digital PD acquisition systems record data for a substantial period, the database becomes large, posing considerable difficulties during classification. This research work aims firstly at analyzing aspects concerning classification capability during the discrimination of multisource PD patterns. Secondly, it attempts at extending the previous work of the authors in utilizing the novel approach of probabilistic neural network versions for classifying moderate sets of PD sources to that of large sets. The third focus is on comparing the ability of partition-based algorithms, namely, the labelled (learning vector quantization) and unlabelled (K-means) versions, with that of a novel hypergraph-based clustering method in providing parsimonious sets of centers during classification.
|
43
|
Nan X, Fu G, Zhao Z, Liu S, Patel RY, Liu H, Daga PR, Doerksen RJ, Dang X, Chen Y, Wilkins D. Leveraging domain information to restructure biological prediction. BMC Bioinformatics 2011; 12 Suppl 10:S22. [PMID: 22166097 PMCID: PMC3236845 DOI: 10.1186/1471-2105-12-s10-s22] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
Abstract
Background It is commonly believed that including domain knowledge in a prediction model is desirable. However, representing and incorporating domain information in the learning process is, in general, a challenging problem. In this research, we consider domain information encoded by discrete or categorical attributes. A discrete or categorical attribute provides a natural partition of the problem domain, and hence divides the original problem into several non-overlapping sub-problems. In this sense, the domain information is useful if the partition simplifies the learning task. The goal of this research is to develop an algorithm to identify discrete or categorical attributes that maximally simplify the learning task. Results We consider restructuring a supervised learning problem via a partition of the problem space using a discrete or categorical attribute. A naive approach exhaustively searches all the possible restructured problems. It is computationally prohibitive when the number of discrete or categorical attributes is large. We propose a metric to rank attributes according to their potential to reduce the uncertainty of a classification task. It is quantified as a conditional entropy achieved using a set of optimal classifiers, each of which is built for a sub-problem defined by the attribute under consideration. To avoid high computational cost, we approximate the solution by the expected minimum conditional entropy with respect to random projections. This approach is tested on three artificial data sets, three cheminformatics data sets, and two leukemia gene expression data sets. Empirical results demonstrate that our method is capable of selecting a proper discrete or categorical attribute to simplify the problem, i.e., the performance of the classifier built for the restructured problem always beats that of the original problem. 
Conclusions The proposed conditional entropy based metric is effective in identifying good partitions of a classification problem, hence enhancing the prediction performance.
Affiliation(s)
- Xiaofei Nan
- Department of Computer and Information Science, University of Mississippi, USA
|
44
|
Wang YC, Chen BS. A network-based biomarker approach for molecular investigation and diagnosis of lung cancer. BMC Med Genomics 2011; 4:2. [PMID: 21211025 PMCID: PMC3027087 DOI: 10.1186/1755-8794-4-2] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Received: 07/29/2010] [Accepted: 01/06/2011] [Indexed: 12/24/2022] Open
Abstract
Background Lung cancer is the leading cause of cancer deaths worldwide. Many studies have investigated the carcinogenic process and identified the biomarkers for signature classification. However, based on the research dedicated to this field, there is no highly sensitive network-based method for carcinogenesis characterization and diagnosis from the systems perspective. Methods In this study, a systems biology approach integrating microarray gene expression profiles and protein-protein interaction information was proposed to develop a network-based biomarker for molecular investigation into the network mechanism of lung carcinogenesis and diagnosis of lung cancer. The network-based biomarker consists of two protein association networks constructed for cancer samples and non-cancer samples. Results Based on the network-based biomarker, a total of 40 significant proteins in lung carcinogenesis were identified with carcinogenesis relevance values (CRVs). In addition, the network-based biomarker, acting as the screening test, proved to be effective in diagnosing smokers with signs of lung cancer. Conclusions A network-based biomarker using constructed protein association networks is a useful tool to highlight the pathways and mechanisms of the lung carcinogenic process and, more importantly, provides potential therapeutic targets to combat cancer.
Affiliation(s)
- Yu-Chao Wang
- Laboratory of Control and Systems Biology, Department of Electrical Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan
|
45
|
Tian Z, Kuang R. Integrative classification and analysis of multiple arrayCGH datasets with probe alignment. Bioinformatics 2010; 26:2313-20. [DOI: 10.1093/bioinformatics/btq428] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] Open
|