1
Solano LE, D’Sa NM, Nikolaidis N. PRRGO: A Tool for Visualizing and Mapping Globally Expressed Genes in Public Gene Expression Omnibus RNA-Sequencing Studies to PageRank-scored Gene Ontology Terms. bioRxiv 2024:2024.01.21.576540. [PMID: 38328158 PMCID: PMC10849496 DOI: 10.1101/2024.01.21.576540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
We herein report PageRankeR Gene Ontology (PRRGO), a downloadable web application that integrates differentially expressed gene (DEG) data from the Gene Expression Omnibus (GEO) GEO2R web tool with the Gene Ontology (GO) database [1]. Unlike existing tools, PRRGO computes the PageRank for the entire GO network and can generate both interactive GO networks on the web interface and comma-separated values (CSV) files containing the DEG statistics categorized by GO term. These hierarchical and tabular GO-DEG data are especially conducive to hypothesis generation and overlap studies when combined with PageRank data, which provide a metric of GO term centrality. We verified the tool for accuracy and reliability across nine independent heat shock (HS) studies for which the RNA-seq data were publicly available on GEO and found that the tool produced increasing concordance between study DEGs, GO terms, and select HS-specific GO terms.
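As a rough illustration of the PageRank centrality this abstract refers to, the sketch below runs a plain power iteration over a small directed graph. The term labels, the edge list, the damping factor of 0.85, and the uniform redistribution of rank from nodes without outgoing edges are all assumptions made for this example; none of it is taken from the PRRGO code or from the real GO hierarchy.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes with no outgoing edge is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical ontology fragment: each edge points from a child term
# to its parent term (labels are placeholders, not real GO identifiers).
toy_edges = [
    ("leaf_1", "mid_a"),
    ("leaf_2", "mid_a"),
    ("leaf_3", "mid_b"),
    ("mid_a", "root_term"),
    ("mid_b", "root_term"),
]

ranks = pagerank(toy_edges)
for term, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{term}: {score:.4f}")
```

With child-to-parent edges, rank flows toward broader terms, so in this toy graph the root and the parent with more children score highest. Whether a real tool orients edges this way is a design choice not stated in the abstract.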
Affiliation(s)
- Luis E. Solano
  - Department of Biological Science, Center for Applied Biotechnology Studies, and Center for Computational and Applied Mathematics, College of Natural Sciences and Mathematics, California State University Fullerton, Fullerton, CA 92834-6850
  - Center for Complex Biological Systems, University of California, Irvine, Irvine, CA
- Nicholas M. D’Sa
  - Department of Biological Science, Center for Applied Biotechnology Studies, and Center for Computational and Applied Mathematics, College of Natural Sciences and Mathematics, California State University Fullerton, Fullerton, CA 92834-6850
  - University of California, Irvine, Irvine, CA
- Nikolas Nikolaidis
  - Department of Biological Science, Center for Applied Biotechnology Studies, and Center for Computational and Applied Mathematics, College of Natural Sciences and Mathematics, California State University Fullerton, Fullerton, CA 92834-6850
2
Casotti MC, Meira DD, Alves LNR, Bessa BGDO, Campanharo CV, Vicente CR, Aguiar CC, Duque DDA, Barbosa DG, dos Santos EDVW, Garcia FM, de Paula F, Santana GM, Pavan IP, Louro LS, Braga RFR, Trabach RSDR, Louro TS, de Carvalho EF, Louro ID. Translational Bioinformatics Applied to the Study of Complex Diseases. Genes (Basel) 2023; 14:419. [PMID: 36833346 PMCID: PMC9956936 DOI: 10.3390/genes14020419] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/30/2022] [Revised: 01/29/2023] [Accepted: 01/31/2023] [Indexed: 02/10/2023]
Abstract
Translational Bioinformatics (TBI) is defined as the union of translational medicine and bioinformatics. It represents a major advance in science and technology, spanning everything from basic database discoveries to the development of algorithms for molecular and cellular analysis and their clinical applications. This technology makes it possible to access scientific evidence and apply it to clinical practice. This manuscript aims to highlight the role of TBI in the study of complex diseases, as well as its application to the understanding and treatment of cancer. An integrative literature review was carried out using articles retrieved from several sources, including PubMed, Science Direct, NCBI-PMC, Scientific Electronic Library Online (SciELO), and Google Scholar. The articles were published in English, Spanish, or Portuguese, were indexed in those databases, and addressed the following guiding question: "How does TBI provide a scientific understanding of complex diseases?" An additional aim is the dissemination, inclusion, and perpetuation of TBI knowledge from the academic environment to society, supporting the study, understanding, and elucidation of complex disease mechanisms and their treatment.
Affiliation(s)
- Matheus Correia Casotti
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Débora Dummer Meira
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Lyvia Neves Rebello Alves
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Camilly Victória Campanharo
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Creuza Rachel Vicente
  - Departamento de Medicina Social, Universidade Federal do Espírito Santo, Vitória 29040-090, Espírito Santo, Brazil
- Carla Carvalho Aguiar
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Daniel de Almeida Duque
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Débora Gonçalves Barbosa
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Fernanda Mariano Garcia
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Flávia de Paula
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Gabriel Mendonça Santana
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Isabele Pagani Pavan
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Luana Santos Louro
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Raquel Furlani Rocon Braga
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Raquel Silva dos Reis Trabach
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
- Thomas Santos Louro
  - Escola Superior de Ciências da Santa Casa de Misericórdia de Vitória (EMESCAM), Vitória 29027-502, Espírito Santo, Brazil
- Elizeu Fagundes de Carvalho
  - Instituto de Biologia Roberto Alcantara Gomes (IBRAG), Universidade do Estado do Rio de Janeiro (UERJ), Rio de Janeiro 20551-030, Rio de Janeiro, Brazil
- Iúri Drumond Louro
  - Departamento de Ciências Biológicas, Universidade Federal do Espírito Santo, Vitória 29075-010, Espírito Santo, Brazil
4
Li C, Liu L, Dinu V. Pathways of topological rank analysis (PoTRA): a novel method to detect pathways involved in hepatocellular carcinoma. PeerJ 2018; 6:e4571. [PMID: 29666752 PMCID: PMC5896492 DOI: 10.7717/peerj.4571] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/03/2017] [Accepted: 03/14/2018] [Indexed: 01/01/2023]
Abstract
Complex diseases such as cancer are usually the result of a combination of environmental factors and one or several biological pathways consisting of sets of genes. Each biological pathway exerts its function by transmitting signals through the gene network. Theoretically, a pathway is supposed to have a robust topological structure under normal physiological conditions. However, the pathway's topological structure could be altered under pathological conditions. It is well known that a normal biological network includes a small number of well-connected hub nodes and a large number of nodes that are non-hubs. In addition, it is reported that the loss of connectivity is a common topological trait of cancer networks, which is an assumption of our method. Hence, from normal to cancer, the process of the network losing connectivity might be the process of disrupting the structure of the network; that is, the number of hub genes might be altered in cancer compared with normal tissue, or the distribution of topological ranks of genes might be altered. Based on this, we propose a new PageRank-based method called Pathways of Topological Rank Analysis (PoTRA) to detect pathways involved in cancer. We use PageRank to measure the relative topological ranks of genes in each biological pathway, then select hub genes for each pathway, and use Fisher's exact test to test whether the number of hub genes in each pathway is altered from normal to cancer. Alternatively, if the distribution of topological ranks of genes in a pathway is altered between normal and cancer, this pathway might also be involved in cancer. Hence, we use the Kolmogorov-Smirnov test to detect pathways that have an altered distribution of topological ranks of genes between two phenotypes. We apply PoTRA to study hepatocellular carcinoma (HCC) and several subtypes of HCC. Notably, we find that all significant pathways in HCC are generally cancer-associated, while several significant pathways in subtypes of HCC are specifically HCC subtype-associated. In conclusion, PoTRA is a new approach to explore and discover pathways involved in cancer. PoTRA can be used as a complement to other existing methods to broaden our understanding of the biological mechanisms behind cancer at the system level.
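The two comparisons described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the PoTRA implementation: the per-gene PageRank scores are made-up numbers, the hub cutoff `HUB_CUTOFF` is a hypothetical threshold (the abstract does not say how hubs are selected), and only the Kolmogorov-Smirnov statistic is computed, not its p-value.

```python
from bisect import bisect_right
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the ECDFs."""
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    return max(
        abs(bisect_right(xs, p) / len(xs) - bisect_right(ys, p) / len(ys))
        for p in points
    )


def count_hubs(scores, cutoff):
    """Number of genes whose PageRank score exceeds the cutoff."""
    return sum(score > cutoff for score in scores)


# Made-up PageRank scores for the genes of one pathway in each phenotype.
normal_scores = [0.30, 0.22, 0.12, 0.10, 0.09, 0.09, 0.08]
cancer_scores = [0.16, 0.15, 0.15, 0.14, 0.14, 0.13, 0.13]
HUB_CUTOFF = 0.20  # hypothetical threshold, chosen only for this example

normal_hubs = count_hubs(normal_scores, HUB_CUTOFF)
cancer_hubs = count_hubs(cancer_scores, HUB_CUTOFF)

# Rows: normal, cancer. Columns: hub genes, non-hub genes.
p_hubs = fisher_exact_two_sided(
    normal_hubs, len(normal_scores) - normal_hubs,
    cancer_hubs, len(cancer_scores) - cancer_hubs,
)
d_ranks = ks_statistic(normal_scores, cancer_scores)

print(f"hubs normal={normal_hubs} cancer={cancer_hubs} Fisher p={p_hubs:.4f}")
print(f"KS statistic between score distributions: {d_ranks:.4f}")
```

The toy pathway shows why the abstract proposes both tests: a flattened score distribution in the cancer network can give a large Kolmogorov-Smirnov statistic even when the hub counts are too small for Fisher's exact test to reach significance.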
Affiliation(s)
- Chaoxing Li
  - School of Life Sciences, Arizona State University, Tempe, AZ, United States of America
- Li Liu
  - Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ, United States of America
- Valentin Dinu
  - Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ, United States of America
6
Kimmel C, Visweswaran S. KNGP: A network-based gene prioritization algorithm that incorporates multiple sources of knowledge. American Journal of Bioinformatics and Computational Biology 2015; 3:1-4. [PMID: 31245171 PMCID: PMC6594558 DOI: 10.7726/ajbcb.2015.1001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Background: Candidate gene prioritization is the process of identifying and ranking new genes as potential candidates for association with a disease or phenotype. Integrating multiple sources of biological knowledge for gene prioritization can improve performance. Results: We developed a novel network-based gene prioritization algorithm called Knowledge Network Gene Prioritization (KNGP) that can incorporate node weights in addition to the commonly used link weights. The online Web implementation of KNGP can handle small input files, while the downloadable R software package can handle larger input files. We also provide several files of coded biological knowledge that can be used by KNGP.
Affiliation(s)
- Chad Kimmel
  - Department of Biomedical Informatics, University of Pittsburgh
7
Iourov IY, Vorsanova SG, Yurov YB. In silico molecular cytogenetics: a bioinformatic approach to prioritization of candidate genes and copy number variations for basic and clinical genome research. Mol Cytogenet 2014; 7:98. [PMID: 25525469 PMCID: PMC4269961 DOI: 10.1186/s13039-014-0098-z] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 11/27/2014] [Accepted: 12/02/2014] [Indexed: 01/08/2023]
Abstract
Background: The availability of multiple in silico tools for prioritizing genetic variants widens the possibilities for converting genomic data into biological knowledge. However, in molecular cytogenetics, bioinformatic analyses are generally limited to result visualization or database mining for finding similar cytogenetic data. Obviously, the potential of bioinformatics might go beyond these applications. On the other hand, the requirements for performing successful in silico analyses (i.e. deep knowledge of computer science, statistics etc.) can hinder the implementation of bioinformatics in clinical and basic molecular cytogenetic research. Here, we propose a bioinformatic approach to prioritization of genomic variations that is able to solve these problems. Results: Selecting gene expression as an initial criterion, we have proposed a bioinformatic approach combining filtering and ranking prioritization strategies, which includes analyzing metabolome and interactome data on proteins encoded by candidate genes. To finalize the prioritization of genetic variants, genomic, epigenomic, interactomic and metabolomic data fusion has been made. Structural abnormalities and aneuploidy revealed by array CGH and FISH have been evaluated to test the approach through determining genotype-phenotype correlations, which have been found similar to those of previous studies. Additionally, we have been able to prioritize copy number variations (CNV) (i.e. differentiate between benign CNV and CNV with phenotypic outcome). Finally, the approach has been applied to prioritize genetic variants in cases of somatic mosaicism (including tissue-specific mosaicism). Conclusions: In order to provide for an in silico evaluation of molecular cytogenetic data, we have proposed a bioinformatic approach to prioritization of candidate genes and CNV. While having the disadvantage of possible unavailability of gene expression data or lack of expression variability between genes of interest, the approach provides several advantages. These are (i) the versatility due to independence from specific databases/tools or software, (ii) relative algorithm simplicity (possibility to avoid sophisticated computational/statistical methodology) and (iii) applicability to molecular cytogenetic data because of the chromosome-centric nature. In conclusion, the approach is able to become useful for increasing the yield of molecular cytogenetic techniques.
Affiliation(s)
- Ivan Y Iourov
  - Mental Health Research Center, Russian Academy of Medical Sciences, 117152 Moscow, Russia
  - Russian National Research Medical University named after N.I. Pirogov, Separated Structural Unit "Clinical Research Institute of Pediatrics", Ministry of Health of Russian Federation, 125412 Moscow, Russia
  - Department of Medical Genetics, Russian Medical Academy of Postgraduate Education, Moscow, 123995 Russia
- Svetlana G Vorsanova
  - Mental Health Research Center, Russian Academy of Medical Sciences, 117152 Moscow, Russia
  - Russian National Research Medical University named after N.I. Pirogov, Separated Structural Unit "Clinical Research Institute of Pediatrics", Ministry of Health of Russian Federation, 125412 Moscow, Russia
- Yuri B Yurov
  - Mental Health Research Center, Russian Academy of Medical Sciences, 117152 Moscow, Russia
  - Russian National Research Medical University named after N.I. Pirogov, Separated Structural Unit "Clinical Research Institute of Pediatrics", Ministry of Health of Russian Federation, 125412 Moscow, Russia
8
Abstract
Bioinformatics aids in the understanding of the biological processes of living beings and the genetic architecture of human diseases. The discovery of disease-related genes improves diagnosis and therapy design for the disease. To save the cost and time involved in the experimental verification of candidate genes, computational methods are employed to rank the genes according to their likelihood of being associated with the disease. Only top-ranked genes are then verified experimentally. Researchers have devised a variety of methods for prioritizing disease candidate genes, which differ in the data sources used or in the scoring function used to rank the genes. A review of various aspects of computational disease gene prioritization and its research issues is presented in this article. The aspects covered are the gene prioritization process, the data sources used, the types of prioritization methods, and performance assessment methods. This article provides a brief overview and acts as a quick guide for disease gene prioritization.
Affiliation(s)
- Nivit Gill
  - Punjabi University Regional Centre For IT and Management, Mohali, Punjab, India