1
Kim JH, Koo B, Kim S. PONYTA: prioritization of phenotype-related genes from mouse KO events using PU learning on a biological network. Bioinformatics 2024; 40:btae634. [PMID: 39432684 PMCID: PMC11561041 DOI: 10.1093/bioinformatics/btae634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2024] [Revised: 09/16/2024] [Accepted: 10/18/2024] [Indexed: 10/23/2024] Open
Abstract
MOTIVATION Transcriptome data from gene knock-out (KO) experiments in mice provide crucial insights into the intricate interactions between genotype and phenotype. Differentially expressed gene (DEG) analysis and network propagation (NP) are well-established methods for analysing transcriptome data. To determine genes related to phenotype changes from a KO experiment, we need to choose a cutoff value for the corresponding criterion based on the specific method. Using a rigorous cutoff value for DEG analysis and NP is likely to select mostly positive genes related to the phenotype, but many will be rejected as false negatives. On the other hand, using a loose cutoff value for either method is prone to include a number of genes that are not phenotype-related, which are false positives. Thus, the research problem at hand is how to deal with the trade-off between false negatives and false positives. RESULTS We propose a novel framework called PONYTA for gene prioritization via positive-unlabeled (PU) learning on biological networks. Beginning with the selection of true phenotype-related genes using a rigorous cutoff value for DEG analysis and NP, we address the issue of handling false negatives by rescuing them through PU learning. Evaluations on transcriptome data from multiple studies show that our approach has superior gene prioritization ability compared to benchmark models. Therefore, PONYTA effectively prioritizes genes related to phenotypes derived from gene KO events and guides in vitro and in vivo gene KO experiments for increased efficiency. AVAILABILITY AND IMPLEMENTATION The source code of PONYTA is available at https://github.com/Jun-Hyeong-Kim/PONYTA.
Affiliation(s)
- Jun Hyeong Kim
- Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul 08826, Republic of Korea
- Bonil Koo
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Republic of Korea
- AIGENDRUG Co., Ltd., Seoul 08758, Republic of Korea
- Sun Kim
- Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul 08826, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Republic of Korea
- AIGENDRUG Co., Ltd., Seoul 08758, Republic of Korea
- Department of Computer Science and Engineering, Seoul National University, Seoul 08826, Republic of Korea
2
Li Z, Liu G, Yang X, Shu M, Jin W, Tong Y, Liu X, Wang Y, Yuan J, Yang Y. An atlas of cell-type-specific interactome networks across 44 human tumor types. Genome Med 2024; 16:30. [PMID: 38347596 PMCID: PMC10860273 DOI: 10.1186/s13073-024-01303-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Accepted: 02/06/2024] [Indexed: 02/15/2024] Open
Abstract
BACKGROUND Biological processes are controlled by groups of genes acting in concert. Investigating gene-gene interactions within different cell types can help researchers understand the regulatory mechanisms behind complex human diseases, such as tumors. METHODS We collected extensive single-cell RNA-seq data from tumors, involving 563 patients with 44 different tumor types. Through our analysis, we identified various cell types in tumors and created an atlas of different immune cell subsets across different tumor types. Using the SCINET method, we reconstructed interactome networks specific to different cell types. Diverse functional data were then integrated to gain biological insights into the networks, including somatic mutation patterns and gene functional annotation. Additionally, genes with prognostic relevance within the networks were identified. We also examined cell-cell communications to investigate how gene interactions modulate cell-cell interactions. RESULTS We developed a data portal called CellNetdb for researchers to study cell-type-specific interactome networks. Our findings indicate that these networks can be used to identify genes with topological specificity in different cell types. We also found that prognostic genes can be deconvolved into cell types by analyzing network connectivity. Additionally, we identified commonalities and differences in cell-type-specific networks across different tumor types. Our results suggest that these networks can be used to prioritize risk genes. CONCLUSIONS This study presented CellNetdb, a comprehensive repository featuring an atlas of cell-type-specific interactome networks across 44 human tumor types. The findings underscore the utility of these networks in delineating the intricacies of tumor microenvironments and advancing the understanding of molecular mechanisms underpinning human tumors.
Affiliation(s)
- Zekun Li
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
- Gerui Liu
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
- Xiaoxiao Yang
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
- Meng Shu
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
- Wen Jin
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
- Yang Tong
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
- Xiaochuan Liu
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
- Yuting Wang
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China
- Jiapei Yuan
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology and Blood Diseases Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, 300020, China.
- Tianjin Institutes of Health Science, Tianjin, 301600, China.
- Yang Yang
- Department of Bioinformatics, School of Basic Medical Sciences, The Province and Ministry Co-Sponsored Collaborative Innovation Center for Medical Epigenetics, Center for Reproductive Medicine, The Second Hospital of Tianjin Medical University, Tianjin Key Laboratory of Inflammatory Biology, Tianjin Medical University, Tianjin, 300070, China.
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China.
3
Petti M, Farina L. Network medicine for patients' stratification: From single-layer to multi-omics. WIREs Mech Dis 2023; 15:e1623. [PMID: 37323106 DOI: 10.1002/wsbm.1623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Revised: 03/08/2023] [Accepted: 05/30/2023] [Indexed: 06/17/2023]
Abstract
Precision medicine research increasingly relies on the integrated analysis of multiple types of omics. In the era of big data, the large availability of different health-related information represents a great, but at the same time untapped, chance with a potentially fundamental role in the prevention, diagnosis and prognosis of diseases. Computational methods are needed to combine this data to create a comprehensive view of a given disease. Network science can model biomedical data in terms of relationships among molecular players of different nature and has been successfully proposed as a new paradigm for studying human diseases. Patient stratification is an open challenge aimed at identifying subtypes with different disease manifestations, severity, and expected survival time. Several stratification approaches based on high-throughput gene expression measurements have been successfully applied. However, few attempts have been proposed to exploit the integration of various genotypic and phenotypic data to discover novel sub-types or improve the detection of known groupings. This article is categorized under: Cancer > Biomedical Engineering Cancer > Computational Models Cancer > Genetics/Genomics/Epigenetics.
Affiliation(s)
- Manuela Petti
- Department of Computer, Control and Management Engineering, Sapienza University of Rome, Rome, Italy
- Lorenzo Farina
- Department of Computer, Control and Management Engineering, Sapienza University of Rome, Rome, Italy
4
Kumar N, Mukhtar MS. Ranking Plant Network Nodes Based on Their Centrality Measures. ENTROPY (BASEL, SWITZERLAND) 2023; 25:e25040676. [PMID: 37190464 PMCID: PMC10137616 DOI: 10.3390/e25040676] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/23/2023] [Revised: 04/14/2023] [Accepted: 04/16/2023] [Indexed: 05/17/2023]
Abstract
Biological networks are often large and complex, making it difficult to accurately identify the most important nodes. Node prioritization algorithms are used to identify the most influential nodes in a biological network by considering their relationships with other nodes. These algorithms can help us understand the functioning of the network and the role of individual nodes. We developed CentralityCosDist, an algorithm that ranks nodes based on a combination of centrality measures and seed nodes. We applied this and four other algorithms to protein-protein interactions and co-expression patterns in Arabidopsis thaliana using pathogen effector targets as seed nodes. The accuracy of the algorithms was evaluated through functional enrichment analysis of the top 10 nodes identified by each algorithm. Most enriched terms were similar across algorithms, except for DIAMOnD. CentralityCosDist identified more plant-pathogen interactions and related functions and pathways compared to the other algorithms.
Affiliation(s)
- Nilesh Kumar
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- M Shahid Mukhtar
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
5
Wang Y, Sun Z, He Q, Li J, Ni M, Yang M. Self-supervised graph representation learning integrates multiple molecular networks and decodes gene-disease relationships. PATTERNS (NEW YORK, N.Y.) 2022; 4:100651. [PMID: 36699743 PMCID: PMC9868676 DOI: 10.1016/j.patter.2022.100651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Revised: 05/19/2022] [Accepted: 11/07/2022] [Indexed: 12/12/2022]
Abstract
Leveraging molecular networks to discover disease-relevant modules is a long-standing challenge. With the accumulation of interactomes, there is a pressing need for powerful computational approaches to handle the inevitable noise and context-specific nature of biological networks. Here, we introduce Graphene, a two-step self-supervised representation learning framework tailored to concisely integrate multiple molecular networks and adapted to gene functional analysis via downstream re-training. In practice, we first leverage GNN (graph neural network) pre-training techniques to obtain initial node embeddings followed by re-training Graphene using a graph attention architecture, achieving superior performance over competing methods for pathway gene recovery, disease gene reprioritization, and comorbidity prediction. Graphene successfully recapitulates tissue-specific gene expression across disease spectrum and demonstrates shared heritability of common mental disorders. Graphene can be updated with new interactomes or other omics features. Graphene holds promise to decipher gene function under network context and refine GWAS (genome-wide association study) hits and offers mechanistic insights via decoding diseases from genome to networks to phenotypes.
Affiliation(s)
- Yi Wang
- MGI, BGI-Shenzhen, Shenzhen, China
- Zijun Sun
- Computer Center, Peking University, Beijing, China
- Jiwei Li
- Department of Computer Science, Zhejiang University, Hangzhou, China
- Ming Ni
- MGI, BGI-Shenzhen, Shenzhen, China
- MGI-QingDao, BGI-Shenzhen, Qingdao, China
- Meng Yang
- MGI, BGI-Shenzhen, Shenzhen, China
- Corresponding author
6
Nguyen T, Yue Z, Slominski R, Welner R, Zhang J, Chen JY. WINNER: A network biology tool for biomolecular characterization and prioritization. Front Big Data 2022; 5:1016606. [PMID: 36407327 PMCID: PMC9672476 DOI: 10.3389/fdata.2022.1016606] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/11/2022] [Accepted: 10/14/2022] [Indexed: 12/09/2024] Open
Abstract
BACKGROUND AND CONTRIBUTION In network biology, molecular functions can be characterized by network-based inference, or "guilt-by-associations." PageRank-like tools have been applied in the study of biomolecular interaction networks to obtain further the relative significance of all molecules in the network. However, there is a great deal of inherent noise in widely accessible data sets for gene-to-gene associations or protein-protein interactions. How to develop robust tests to expand, filter, and rank molecular entities in disease-specific networks remains an ad hoc data analysis process. RESULTS We describe a new biomolecular characterization and prioritization tool called Weighted In-Network Node Expansion and Ranking (WINNER). It takes the input of any molecular interaction network data and generates an optionally expanded network with all the nodes ranked according to their relevance to one another in the network. To help users assess the robustness of results, WINNER provides two different types of statistics. The first type is a node-expansion p-value, which helps evaluate the statistical significance of adding "non-seed" molecules to the original biomolecular interaction network consisting of "seed" molecules and molecular interactions. The second type is a node-ranking p-value, which helps evaluate the relative statistical significance of the contribution of each node to the overall network architecture. We validated the robustness of WINNER in ranking top molecules by spiking noises in several network permutation experiments. We have found that node degree-preservation randomization of the gene network produced normally distributed ranking scores, which outperform those made with other gene network randomization techniques. Furthermore, we validated that a more significant proportion of the WINNER-ranked genes was associated with disease biology than existing methods such as PageRank. 
We demonstrated the performance of WINNER with a few case studies, including Alzheimer's disease, breast cancer, myocardial infarctions, and triple-negative breast cancer (TNBC). In all these case studies, the expanded and top-ranked genes identified by WINNER reveal disease biology more significantly than those identified by other gene prioritizing software tools, including Ingenuity Pathway Analysis (IPA) and DIAMOnD. CONCLUSION WINNER ranking correlates strongly with other ranking methods when the network covers sufficient node and edge information, indicating a high network quality. WINNER users can use this new tool to robustly evaluate a list of candidate genes, proteins, or metabolites produced from high-throughput biology experiments, as long as there is available gene/protein/metabolic network information.
Affiliation(s)
- Thanh Nguyen
- Informatics Institute in School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
- Department of Biomedical Engineering, The University of Alabama at Birmingham, Birmingham, AL, United States
- Zongliang Yue
- Informatics Institute in School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
- Radomir Slominski
- Informatics Institute in School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
- Robert Welner
- Comprehensive Arthritis, Musculoskeletal, Bone and Autoimmunity Center (CAMBAC), School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
- Jianyi Zhang
- Department of Biomedical Engineering, The University of Alabama at Birmingham, Birmingham, AL, United States
- Jake Y. Chen
- Informatics Institute in School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
7
Zhang H, Ferguson A, Robertson G, Jiang M, Zhang T, Sudlow C, Smith K, Rannikmae K, Wu H. Benchmarking network-based gene prioritization methods for cerebral small vessel disease. Brief Bioinform 2021; 22:bbab006. [PMID: 33634312 PMCID: PMC8425308 DOI: 10.1093/bib/bbab006] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/19/2020] [Revised: 12/31/2020] [Accepted: 01/04/2021] [Indexed: 12/25/2022] Open
Abstract
Network-based gene prioritization algorithms are designed to prioritize disease-associated genes based on known ones using biological networks of protein interactions, gene-disease associations (GDAs) and other relationships between biological entities. Various algorithms have been developed based on different mechanisms, but it is not obvious which algorithm is optimal for a specific disease. To address this issue, we benchmarked multiple algorithms for their application in cerebral small vessel disease (cSVD). We curated protein-gene interactions (PGIs) and GDAs from databases and assembled PGI networks and disease-gene heterogeneous networks. A screening of algorithms resulted in seven representative algorithms to be benchmarked. Performance of algorithms was assessed using both leave-one-out cross-validation (LOOCV) and external validation with MEGASTROKE genome-wide association study (GWAS). We found that random walk with restart on the heterogeneous network (RWRH) showed best LOOCV performance, with median LOOCV rediscovery rank of 185.5 (out of 19 463 genes). The GenePanda algorithm had most GWAS-confirmable genes in top 200 predictions, while RWRH had best ranks for small vessel stroke-associated genes confirmed in GWAS. In conclusion, RWRH has overall better performance for application in cSVD despite its susceptibility to bias caused by degree centrality. Choice of algorithms should be determined before applying to specific disease. Current pure network-based gene prioritization algorithms are unlikely to find novel disease-associated genes that are not associated with known ones. The tools for implementing and benchmarking algorithms have been made available and can be generalized for other diseases.
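As a rough illustration of the random-walk-with-restart idea mentioned in this abstract, the sketch below ranks nodes of a toy network by their steady-state visiting probability from a seed node. It is a plain RWR on a single homogeneous network, not the heterogeneous-network variant (RWRH) the authors benchmarked, and the function name, node labels, and restart probability are all assumptions made for this example.

```python
def random_walk_with_restart(neighbors, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until the scores stabilise.

    neighbors: dict mapping each node to a list of adjacent nodes (undirected toy graph).
    seeds: set of known disease-associated nodes where the walker restarts.
    """
    nodes = list(neighbors)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            if not neighbors[n]:
                continue  # isolated node: nothing to propagate in this simple sketch
            share = (1.0 - restart) * p[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        if sum(abs(nxt[n] - p[n]) for n in nodes) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical five-node path network A-B-C-D-E with A as the only seed.
toy_network = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
scores = random_walk_with_restart(toy_network, seeds={"A"})
ranking = sorted(toy_network, key=scores.get, reverse=True)
```

In this toy case the non-seed scores fall off with distance from the seed, which is the intuition behind ranking candidates by proximity to known disease genes.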
Affiliation(s)
- Huayu Zhang
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Amy Ferguson
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Grant Robertson
- Institute for Adaptive and Neural Computation, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
- Muchen Jiang
- Edinburgh Medical School, University of Edinburgh, Edinburgh, United Kingdom
- Teng Zhang
- Department of Orthopaedics and Traumatology, the University of Hong Kong, Hong Kong, China
- Cathie Sudlow
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Health Data Research UK, London, United Kingdom
- Keith Smith
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Health Data Research UK, London, United Kingdom
- Kristiina Rannikmae
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Health Data Research UK, London, United Kingdom
- Honghan Wu
- Health Data Research UK, London, United Kingdom
- Institute of Health Informatics, University College London, London, United Kingdom
8
Benincasa G, Marfella R, Della Mura N, Schiano C, Napoli C. Strengths and Opportunities of Network Medicine in Cardiovascular Diseases. Circ J 2020; 84:144-152. [DOI: 10.1253/circj.cj-19-0879] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Indexed: 11/09/2022]
Affiliation(s)
- Giuditta Benincasa
- Clinical Department of Internal Medicine and Specialistics, Department of Advanced Clinical and Surgical Sciences, University of Campania “Luigi Vanvitelli”
- Raffaele Marfella
- Clinical Department of Internal Medicine and Specialistics, Department of Advanced Clinical and Surgical Sciences, University of Campania “Luigi Vanvitelli”
- Concetta Schiano
- Clinical Department of Internal Medicine and Specialistics, Department of Advanced Clinical and Surgical Sciences, University of Campania “Luigi Vanvitelli”
- Claudio Napoli
- Clinical Department of Internal Medicine and Specialistics, Department of Advanced Clinical and Surgical Sciences, University of Campania “Luigi Vanvitelli”
- IRCCS-SDN
9
Su L, Liu G, Wang J, Xu D. A rectified factor network based biclustering method for detecting cancer-related coding genes and miRNAs, and their interactions. Methods 2019; 166:22-30. [PMID: 31121299 PMCID: PMC6708461 DOI: 10.1016/j.ymeth.2019.05.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/16/2018] [Revised: 04/14/2019] [Accepted: 05/13/2019] [Indexed: 12/12/2022] Open
Abstract
Detecting cancer-related genes and their interactions is a crucial task in cancer research. For this purpose, we proposed an efficient method to detect coding genes, microRNAs (miRNAs), and their interactions related to a particular cancer or a cancer subtype using their expression data from the same set of samples. Firstly, biclusters specific to a particular type of cancer are detected based on rectified factor networks and ranked according to their associations with general cancers. Secondly, coding genes and miRNAs in each bicluster are prioritized by considering their differential expression and differential correlation values, protein-protein interaction data, and potential cancer markers. Finally, a rank fusion process is used to obtain the final comprehensive rank by combining multiple ranking results. We applied our proposed method to breast cancer datasets. Results show that our method outperforms other methods in detecting breast cancer-related coding genes and miRNAs. Furthermore, our method is very efficient in computing time and can handle tens of thousands of genes/miRNAs and hundreds of patients in hours on a desktop. This work may aid researchers in studying the genetic architecture of complex diseases and in improving the accuracy of diagnosis.
Affiliation(s)
- Lingtao Su
- Department of Computer Science and Technology, Jilin University, Changchun 130012, China; Department of Electrical Engineering & Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Guixia Liu
- Department of Computer Science and Technology, Jilin University, Changchun 130012, China
- Juexin Wang
- Department of Electrical Engineering & Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Dong Xu
- Department of Electrical Engineering & Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.
10
Conte F, Fiscon G, Licursi V, Bizzarri D, D'Antò T, Farina L, Paci P. A paradigm shift in medicine: A comprehensive review of network-based approaches. BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2019; 1863:194416. [PMID: 31382052 DOI: 10.1016/j.bbagrm.2019.194416] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 06/07/2019] [Revised: 07/19/2019] [Accepted: 07/28/2019] [Indexed: 02/01/2023]
Abstract
Network medicine is a rapidly evolving new field of medical research, which combines principles and approaches of systems biology and network science, holding the promise of uncovering the causes and revolutionizing the diagnosis and treatment of human diseases. This new paradigm reflects the fact that human diseases are not caused by single molecular defects, but driven by complex interactions among a variety of molecular mediators. The complexity of these interactions embraces different types of information: from the cellular-molecular level of protein-protein interactions to correlational studies of gene expression and regulation, to metabolic and disease pathways up to drug-disease relationships. The analysis of these complex networks can reveal new disease genes and/or disease pathways and identify possible targets for new drug development, as well as new uses for existing drugs. In this review, we offer a comprehensive overview of network types and algorithms used in the framework of network medicine. This article is part of a Special Issue entitled: Transcriptional Profiles and Regulatory Gene Networks edited by Dr. Federico Manuel Giorgi and Dr. Shaun Mahony.
Affiliation(s)
- Federica Conte
- Institute for Systems Analysis and Computer Science "Antonio Ruberti", National Research Council, Rome, Italy
- Giulia Fiscon
- Institute for Systems Analysis and Computer Science "Antonio Ruberti", National Research Council, Rome, Italy.
- Valerio Licursi
- Biology and Biotechnology Department "Charles Darwin" (BBCD), Sapienza University of Rome, Rome, Italy
- Daniele Bizzarri
- Department of Internal Medicine and Medical Specialties, Sapienza University of Rome, Rome, Italy
- Tommaso D'Antò
- Department of Computer, Control and Management Engineering, Sapienza University of Rome, Rome, Italy
- Lorenzo Farina
- Department of Computer, Control and Management Engineering, Sapienza University of Rome, Rome, Italy
- Paola Paci
- Institute for Systems Analysis and Computer Science "Antonio Ruberti", National Research Council, Rome, Italy
11
Almasi SM, Hu T. Measuring the importance of vertices in the weighted human disease network. PLoS One 2019; 14:e0205936. [PMID: 30901770 PMCID: PMC6430629 DOI: 10.1371/journal.pone.0205936] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/30/2018] [Accepted: 02/26/2019] [Indexed: 12/11/2022] Open
Abstract
Many human genetic disorders and diseases are known to be related to each other through frequently observed co-occurrences. Studying the correlations among multiple diseases provides an important avenue to better understand the common genetic background of diseases and to help develop new drugs that can treat multiple diseases. Meanwhile, network science has seen increasing applications on modeling complex biological systems, and can be a powerful tool to elucidate the correlations of multiple human diseases. In this article, known disease-gene associations were represented using a weighted bipartite network. We extracted a weighted human diseases network from such a bipartite network to show the correlations of diseases. Subsequently, we proposed a new centrality measurement for the weighted human disease network (WHDN) in order to quantify the importance of diseases. Using our centrality measurement to quantify the importance of vertices in WHDN, we were able to find a set of most central diseases. By investigating the 30 top diseases and their most correlated neighbors in the network, we identified disease linkages including known disease pairs and novel findings. Our research helps better understand the common genetic origin of human diseases and suggests top diseases that likely induce other related diseases.
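The step from a disease-gene bipartite network to a weighted disease network can be sketched as below. The abstract does not state the weighting scheme or the new centrality measure, so this example assumes the simplest choices: an edge weight equal to the number of shared genes, and node strength (the sum of incident edge weights) as a stand-in for disease importance. All disease and gene labels are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def project_diseases(disease_genes):
    """Link two diseases when they share genes; the weight is the shared-gene count (assumed)."""
    weights = {}
    for d1, d2 in combinations(sorted(disease_genes), 2):
        shared = len(disease_genes[d1] & disease_genes[d2])
        if shared:
            weights[(d1, d2)] = shared
    return weights


def node_strength(weights):
    """Sum incident edge weights per disease; a simple proxy, not the paper's centrality."""
    strength = defaultdict(int)
    for (d1, d2), w in weights.items():
        strength[d1] += w
        strength[d2] += w
    return dict(strength)


# Hypothetical disease-gene associations.
toy_associations = {
    "disease_a": {"g1", "g2", "g3"},
    "disease_b": {"g2", "g3"},
    "disease_c": {"g3", "g4"},
    "disease_d": {"g5"},
}
edges = project_diseases(toy_associations)
strength = node_strength(edges)
```

Here `disease_d` shares no gene with the others, so it has no edges and does not appear in `strength`.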
Affiliation(s)
- Ting Hu
- Department of Computer Science, Memorial University, St. John’s, NL, Canada
12
Kara S, Hanna A, Pirela-Morillo GA, Gilliam CT, Wilson GD. Molecular Interaction Network Approach (MINA) identifies association of novel candidate disease genes. MethodsX 2019; 6:1286-1291. [PMID: 31198690 PMCID: PMC6555892 DOI: 10.1016/j.mex.2019.05.031] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/09/2018] [Accepted: 05/29/2019] [Indexed: 12/03/2022] Open
Abstract
Molecular Interaction Network Approach (MINA) was used to elucidate candidate disease genes. The approach was implemented to identify novel gene associations with commonly known autoimmune diseases [1]. In MINA, we evaluated the hypothesis that “network proximity” within a whole genome molecular interaction network can be used to inform the search for multigene inheritance. There are now numerous examples of gene discoveries based upon network proximity between novel and previously identified disease genes (Yin et al., 2017 [2], Wang et al., 2011 [3], and Barrenas et al., 2009 [4]). This study extends the application of interaction networks to the interrogation of genome-wide association studies: first, by showing that a group of nine autoimmune diseases (AuD) genes, the “seed genes”, are connected in a highly non-random manner within a whole genome network; and second, by showing that the minimal set of connecting genes required to connect a maximal number of AuD candidate genes is highly enriched in candidate genes for AuD-predisposing mutations. The findings imply that a threshold number of candidate genes for any heritable disorder can be used to “seed” a molecular interaction network that:
- Serves to validate the disease status of closely associated seed genes
- Identifies genes that are highly enriched as novel candidate disease genes
- Provides a strategy for elucidation of epistatic gene x gene interactions
The method could provide a critical tool for understanding the genetic architecture of common traits and disorders.
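The idea of finding connecting genes between seed genes can be illustrated with a small Python sketch. This is a shortest-path heuristic chosen for illustration only: the abstract does not describe how MINA computes its minimal connecting set, and the function names, the `max_len` cutoff, and the toy network are all hypothetical.

```python
from collections import deque
from itertools import combinations

def shortest_path(adj, src, dst):
    """Breadth-first search; returns one shortest path as a list, or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk predecessors back to src
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nb in sorted(adj.get(node, ())):
            if nb not in prev:
                prev[nb] = node
                queue.append(nb)
    return None

def connectors(adj, seeds, max_len=2):
    """Non-seed genes lying on short paths (at most max_len edges) between seed pairs."""
    found = set()
    for a, b in combinations(sorted(seeds), 2):
        path = shortest_path(adj, a, b)
        if path and len(path) - 1 <= max_len:
            found.update(g for g in path[1:-1] if g not in seeds)
    return found

# Hypothetical undirected interaction network with three seed genes.
adj = {
    "S1": {"X"}, "X": {"S1", "S2"}, "S2": {"X", "Y"},
    "Y": {"S2", "Z"}, "Z": {"Y", "S3"}, "S3": {"Z"},
}
candidates = connectors(adj, {"S1", "S2", "S3"})
```

In this toy network only `X` sits on a path of at most two edges between two seeds, so it is the sole candidate; `Y` and `Z` are too far from a second seed under the assumed cutoff.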
|
13
|
Lau E, Venkatraman V, Thomas CT, Wu JC, Van Eyk JE, Lam MPY. Identifying High-Priority Proteins Across the Human Diseasome Using Semantic Similarity. J Proteome Res 2018; 17:4267-4278. [PMID: 30256117 PMCID: PMC6606054 DOI: 10.1021/acs.jproteome.8b00393] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/25/2022]
Abstract
Identifying the genes and proteins associated with a biological process or disease is a central goal of the biomedical research enterprise. However, relatively few systematic approaches are available that provide objective evaluation of the genes or proteins known to be important to a research topic, and hence researchers often rely on subjective evaluation by domain experts and laborious manual literature review. Computational bibliometric analysis, in conjunction with text mining and data curation, attempts to automate this process and return prioritized proteins for any given research topic. We describe here a method to identify and rank protein-topic relationships by calculating the semantic similarity between a protein and a query term in the biomedical literature while adjusting for the impact and immediacy of associated research articles. We term the calculated metric the weighted copublication distance (WCD) and show that it compares well to related approaches in predicting benchmark protein lists in multiple biological processes. We used WCD to extract prioritized "popular proteins" across multiple cell types, subanatomical regions, and standardized vocabularies containing over 20 000 human disease terms. The collection of protein-disease associations across the resulting human "diseasome" supports data analytical workflows to perform reverse protein-to-disease queries and functional annotation of experimental protein lists. We envision that the described improvement to the popular proteins strategy will be useful for annotating protein lists and guiding method development efforts as well as generating new hypotheses on understudied disease proteins using bibliometric information.
Affiliation(s)
- Edward Lau
  - Stanford Cardiovascular Institute, Stanford University, Stanford, California 94305, United States
- Vidya Venkatraman
  - Advanced Clinical Biosystems Research Institute, Department of Medicine and The Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Cody T. Thomas
  - Department of Medicine, Division of Cardiology, Consortium for Fibrosis Research and Translation, Anschutz Medical Campus, University of Colorado Denver, Aurora, Colorado 80045, United States
- Joseph C. Wu
  - Stanford Cardiovascular Institute, Stanford University, Stanford, California 94305, United States
- Jennifer E. Van Eyk
  - Advanced Clinical Biosystems Research Institute, Department of Medicine and The Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Maggie P. Y. Lam
  - Department of Medicine, Division of Cardiology, Consortium for Fibrosis Research and Translation, Anschutz Medical Campus, University of Colorado Denver, Aurora, Colorado 80045, United States
|
14
|
Fiscon G, Conte F, Farina L, Paci P. Network-Based Approaches to Explore Complex Biological Systems towards Network Medicine. Genes (Basel) 2018; 9:genes9090437. [PMID: 30200360 PMCID: PMC6162385 DOI: 10.3390/genes9090437] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 08/03/2018] [Revised: 08/25/2018] [Accepted: 08/30/2018] [Indexed: 12/14/2022] Open
Abstract
Network medicine relies on different types of networks, from the molecular level of protein–protein interactions to gene regulatory networks and correlation studies of gene expression. Among network approaches based on the analysis of the topological properties of protein–protein interaction (PPI) networks, we discuss the widespread DIAMOnD (disease module detection) algorithm. Starting from the assumption that PPI networks can be viewed as maps where diseases can be identified with localized perturbations within a specific neighborhood (i.e., disease modules), DIAMOnD performs a systematic analysis of the human PPI network to uncover new disease-associated genes by exploiting connectivity significance instead of connection density. The past few years have witnessed increasing interest in understanding the molecular mechanisms of post-transcriptional regulation, with a special emphasis on non-coding RNAs, since they are emerging as key regulators of many cellular processes in both physiological and pathological states. Recent findings show that coding genes are not the only targets that microRNAs interact with. In fact, there is a pool of different RNAs, including long non-coding RNAs (lncRNAs), competing with each other to attract microRNAs for interactions, thus acting as competing endogenous RNAs (ceRNAs). The framework of regulatory networks provides a powerful tool to gather new insights into ceRNA regulatory mechanisms. Here, we describe a data-driven model recently developed to explore the lncRNA-associated ceRNA activity in breast invasive carcinoma. On the other hand, a very promising example of a co-expression network is the one implemented by the software SWIM (switch miner), which combines topological properties of correlation networks with gene expression data in order to identify a small pool of genes, called switch genes, critically associated with drastic changes in cell phenotype.
Here, we describe the SWIM tool along with its applications to cancer research and compare its predictions with DIAMOnD disease genes.
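The contrast between connectivity significance and connection density can be made concrete with a hypergeometric tail probability: given a network of a fixed size with a known number of seed genes, how surprising is it that a gene with a given degree has at least the observed number of links to seeds? The Python sketch below is a minimal illustration of that calculation under simplifying assumptions. The function name, argument names, and numbers are hypothetical, and it omits the iterative module growth of the full DIAMOnD procedure.

```python
from math import comb

def connectivity_pvalue(n_genes, n_seeds, degree, seed_links):
    """P(X >= seed_links) for X ~ Hypergeometric(n_genes, n_seeds, degree).

    X counts how many of a gene's `degree` neighbours are seeds if the
    neighbours were drawn at random from `n_genes` genes.
    """
    total = comb(n_genes, degree)
    upper = min(degree, n_seeds)
    tail = sum(
        comb(n_seeds, x) * comb(n_genes - n_seeds, degree - x)
        for x in range(seed_links, upper + 1)
    )
    return tail / total

# Hypothetical network: 1000 genes, 20 of them seeds.
p_hub = connectivity_pvalue(1000, 20, degree=200, seed_links=5)
p_small = connectivity_pvalue(1000, 20, degree=5, seed_links=3)
```

The hub has more seed links in absolute terms (5 versus 3), but about 4 are expected by chance for a gene of degree 200, so the low-degree gene receives the smaller p-value and would be ranked first under this criterion.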
Affiliation(s)
- Giulia Fiscon
  - Institute for Systems Analysis and Computer Science "Antonio Ruberti", National Research Council, via dei Taurini 19, 00185 Rome, Italy.
  - SysBio Centre of Systems Biology, Piazza della Scienza, 3, 20126 Milan, Italy.
- Federica Conte
  - Institute for Systems Analysis and Computer Science "Antonio Ruberti", National Research Council, via dei Taurini 19, 00185 Rome, Italy.
  - SysBio Centre of Systems Biology, Piazza della Scienza, 3, 20126 Milan, Italy.
- Lorenzo Farina
  - Department of Computer, Control, and Management Engineering "Antonio Ruberti", Sapienza University of Rome, Viale Ariosto 25, 00185 Rome, Italy.
- Paola Paci
  - Institute for Systems Analysis and Computer Science "Antonio Ruberti", National Research Council, via dei Taurini 19, 00185 Rome, Italy.
  - SysBio Centre of Systems Biology, Piazza della Scienza, 3, 20126 Milan, Italy.
|
15
|
Yang F, Wu D, Lin L, Yang J, Yang T, Zhao J. The integration of weighted gene association networks based on information entropy. PLoS One 2017; 12:e0190029. [PMID: 29272314 PMCID: PMC5741255 DOI: 10.1371/journal.pone.0190029] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 04/27/2017] [Accepted: 12/06/2017] [Indexed: 01/18/2023] Open
Abstract
Constructing genome-scale weighted gene association networks (WGAN) from multiple data sources is one of the research hot spots in systems biology. In this paper, we employ information entropy to describe the degree of uncertainty of gene-gene links and propose a strategy for data integration of weighted networks. We use this method to integrate four existing human weighted gene association networks and construct a much larger WGAN, which includes richer biological information while still keeping high functional relevance between linked gene pairs. The new WGAN shows satisfactory performance in disease gene prediction, which suggests the reliability of our integration strategy. Compared with existing integration methods, our method takes advantage of the inherent characteristics of the component networks and pays less attention to the biological background of the data. It can make full use of existing biological networks with low computational effort.
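To make the notion of link uncertainty concrete, the Python sketch below computes the binary Shannon entropy of a link whose weight is read as a confidence between 0 and 1. The abstract does not give the entropy formula or the integration rule the authors use, so this interpretation of the weights, the function names, and the two toy networks are assumptions for illustration only.

```python
from math import log2

def link_entropy(p):
    """Binary Shannon entropy (in bits) of a link with confidence p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("link confidence must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0  # a certain link (or certain non-link) carries no uncertainty
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)

def mean_entropy(network):
    """Average link entropy of a network given as {(gene_a, gene_b): confidence}."""
    return sum(link_entropy(p) for p in network.values()) / len(network)

# Hypothetical component networks over the same gene pairs.
net_a = {("G1", "G2"): 0.95, ("G2", "G3"): 0.90}
net_b = {("G1", "G2"): 0.55, ("G2", "G3"): 0.60}
```

Under this reading, `net_a` has the lower mean entropy because its weights sit closer to certainty, which is the kind of inherent network characteristic an entropy-based integration scheme could exploit without reference to the biological origin of the data.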
Affiliation(s)
- Fan Yang
  - Department of Mathematics, Army Logistics University of PLA, Chongqing, China
- Duzhi Wu
  - Rongzhi College of Chongqing Technology and Business, Chongqing, China
  - * E-mail: (DW); (JZ)
- Limei Lin
  - Department of Mathematics, Army Logistics University of PLA, Chongqing, China
- Jian Yang
  - School of Pharmacy, Second Military Medical University, Shanghai, China
- Tinghong Yang
  - Department of Mathematics, Army Logistics University of PLA, Chongqing, China
- Jing Zhao
  - Institute of Interdisciplinary Complex Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - * E-mail: (DW); (JZ)
|