1
Joseph S, Patil K, Rahate N, Shah J, Mukherjee S, Mahale SD. Integrated data driven analysis identifies potential candidate genes associated with PCOS. Comput Biol Chem 2024; 113:108191. [PMID: 39243549 DOI: 10.1016/j.compbiolchem.2024.108191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Revised: 07/16/2024] [Accepted: 08/28/2024] [Indexed: 09/09/2024]
Abstract
Polycystic ovary syndrome (PCOS) is one of the most common anovulatory disorders observed in women presenting with infertility. Several high- and low-throughput studies on PCOS have led to the accumulation of a vast amount of information on the condition. Despite the availability of several resources that index advances in PCOS, information on its etiology remains inadequate. Analysis of the existing information using an integrated, evidence-based approach may aid identification of novel potential candidate genes with a role in PCOS pathophysiology. This work focuses on integrating existing information on PCOS from literature and gene expression studies and evaluating the application of gene prioritization and network analysis to predict missing novel candidates. Further, it assesses the utility of evidence-based scoring to rank genes for their association with PCOS. The study identified ∼2000 plausible candidate genes associated with PCOS. In silico validation of these candidates confirmed the role of 938 genes in PCOS. Further, experimental validation was carried out for four of the potential candidate genes involved in the complement and coagulation pathway, one high-scoring (PROS1), two mid-scoring (C1QA and KNG1), and one low-scoring (VTN), by comparing protein levels in follicular fluid from women with PCOS and healthy controls. While the expression of PROS1, C1QA, and KNG1 was significantly downregulated in women with PCOS, the expression of VTN was unchanged. The findings of this study reiterate the utility of in silico approaches for identifying and prioritizing the most promising candidate genes in diseases with a complex pathophysiology such as PCOS. The study also gives clearer insight into the molecular mechanisms associated with the manifestation of the PCOS phenotype by adding to the existing repertoire of genes associated with PCOS.
Affiliation(s)
- Shaini Joseph
  - Genetic Research Center, ICMR-National Institute for Research in Reproductive and Child Health, J.M. Street, Parel, Mumbai 400012, India
- Krutika Patil
  - Department of Molecular Endocrinology, ICMR-National Institute for Research in Reproductive and Child Health, J.M. Street, Parel, Mumbai 400012, India
- Niharika Rahate
  - Genetic Research Center, ICMR-National Institute for Research in Reproductive and Child Health, J.M. Street, Parel, Mumbai 400012, India
- Jatin Shah
  - Mumbai Fertility Clinic & IVF Centre, Kamala Polyclinic and Nursing Home, Mumbai 400026, India
- Srabani Mukherjee
  - Department of Molecular Endocrinology, ICMR-National Institute for Research in Reproductive and Child Health, J.M. Street, Parel, Mumbai 400012, India
- Smita D Mahale
  - ICMR-National Institute for Research in Reproductive and Child Health, J.M. Street, Parel, Mumbai 400012, India
2
Patil N, Howe O, Cahill P, Byrne HJ. Monitoring and modelling the dynamics of the cellular glycolysis pathway: A review and future perspectives. Mol Metab 2022; 66:101635. [PMID: 36379354 PMCID: PMC9703637 DOI: 10.1016/j.molmet.2022.101635] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 09/08/2022] [Revised: 10/28/2022] [Accepted: 11/06/2022] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND: The dynamics of the cellular glycolysis pathway underpin cellular function and dysfunction, and therefore ultimately health, disease, and diagnostic and therapeutic strategies. Evolving our understanding of this fundamental process and its dynamics remains critical. SCOPE OF REVIEW: This paper reviews the medical relevance of the glycolytic pathway in depth and explores the current state of the art for monitoring and modelling the dynamics of the process. The future prospects of label-free vibrational microspectroscopic techniques for overcoming the limitations of current approaches are considered. MAJOR CONCLUSIONS: Vibrational microspectroscopic techniques can potentially fill the niche left by the limitations of other omics technologies, offering non-destructive, real-time, in vivo, label-free monitoring of glycolysis dynamics at the cellular and subcellular level.
Affiliation(s)
- Nitin Patil
  - FOCAS Research Institute, Technological University Dublin, City Campus, Camden Row, Dublin 8, Ireland
  - School of Physics and Optometric & Clinical Sciences, Technological University Dublin, City Campus, Grangegorman, Dublin 7, Ireland
- Orla Howe
  - School of Biological and Health Sciences, Technological University Dublin, City Campus, Grangegorman, Dublin 7, Ireland
- Paul Cahill
  - School of Biotechnology, Dublin City University, Glasnevin, Dublin 9, Ireland
- Hugh J Byrne
  - FOCAS Research Institute, Technological University Dublin, City Campus, Camden Row, Dublin 8, Ireland
3
Kenny SE, Antaw F, Locke WJ, Howard CB, Korbie D, Trau M. Next-Generation Molecular Discovery: From Bottom-Up In Vivo and In Vitro Approaches to In Silico Top-Down Approaches for Therapeutics Neogenesis. Life (Basel) 2022; 12:363. [PMID: 35330114 PMCID: PMC8950575 DOI: 10.3390/life12030363] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/09/2022] [Accepted: 02/23/2022] [Indexed: 12/02/2022] Open
Abstract
Protein and drug engineering comprises a major part of the medical and research industries, and yet approaches to discovering and understanding therapeutic molecular interactions in biological systems rely on trial and error. The general approach to molecular discovery involves screening large libraries of compounds, proteins, or antibodies, or in vivo antibody generation, which could be considered "bottom-up" approaches to therapeutic discovery. In these bottom-up approaches, little is known about the therapeutic at the start of the process, but through meticulous and exhaustive laboratory work, the molecule is characterised in detail. In contrast, the advent of "big data" and access to extensive online databases and machine learning technologies offer promising new avenues to understanding molecular interactions. Artificial intelligence (AI) now has the potential to predict protein structure with unprecedented accuracy using only the genetic sequence. This predictive approach to characterising molecular structure, when accompanied by high-quality experimental data for model training, has the capacity to invert the process of molecular discovery and characterisation. The process has the potential to be transformed into a top-down approach, in which new molecules are designed directly from the structure of a target and the desired function rather than by screening large libraries of molecular variants. This paper provides a brief evaluation of bottom-up approaches to discovering and characterising biological molecules and discusses recent advances towards top-down approaches and their prospects.
Affiliation(s)
- Sophie E. Kenny
  - Centre for Personalised Nanomedicine, Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Corner of College and Cooper Roads (Bldg 75), Brisbane, QLD 4072, Australia
- Fiach Antaw
  - Centre for Personalised Nanomedicine, Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Corner of College and Cooper Roads (Bldg 75), Brisbane, QLD 4072, Australia
- Warwick J. Locke
  - Molecular Diagnostic Solutions, Health and Biosecurity, Commonwealth Scientific and Industrial Research Organisation, Building 101, Clunies Ross Street, Canberra, ACT 2601, Australia
- Christopher B. Howard
  - Centre for Personalised Nanomedicine, Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Corner of College and Cooper Roads (Bldg 75), Brisbane, QLD 4072, Australia
- Darren Korbie
  - Centre for Personalised Nanomedicine, Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Corner of College and Cooper Roads (Bldg 75), Brisbane, QLD 4072, Australia
- Matt Trau
  - Centre for Personalised Nanomedicine, Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Corner of College and Cooper Roads (Bldg 75), Brisbane, QLD 4072, Australia
  - School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD 4072, Australia
4
Mahmoudian M, Venäläinen MS, Klén R, Elo LL. Stable Iterative Variable Selection. Bioinformatics 2021; 37:4810-4817. [PMID: 34270690 PMCID: PMC8665768 DOI: 10.1093/bioinformatics/btab501] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/30/2020] [Revised: 05/20/2021] [Accepted: 07/14/2021] [Indexed: 11/13/2022] Open
Abstract
Motivation: The emergence of datasets with tens of thousands of features, such as high-throughput omics biomedical data, highlights the importance of reducing the feature space to a distilled subset that truly captures the signal, helping research and industry find more effective biomarkers for the question at hand. A good feature set also facilitates building robust predictive models, with improved interpretability and convergence of the applied method owing to the smaller feature space. Results: Here, we present a robust feature selection method named Stable Iterative Variable Selection (SIVS) and assess its performance on both omics and clinical data types. As a performance assessment, we compared the number and goodness of the features selected by SIVS with those selected by Least Absolute Shrinkage and Selection Operator regression. The results suggested that the feature space selected by SIVS was, on average, 41% smaller, with no negative effect on model performance. A similar result was observed in comparison with Boruta and caret RFE. Availability and implementation: The method is implemented as an R package under the GNU General Public License v3.0 and is available from the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/package=sivs. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mehrad Mahmoudian
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
  - Department of Future Technologies, University of Turku, Turku, Finland
- Mikko S Venäläinen
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
- Riku Klén
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
- Laura L Elo
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
  - Institute of Biomedicine, University of Turku, Turku, Finland
5
Silberstein M, Nesbit N, Cai J, Lee PH. Pathway analysis for genome-wide genetic variation data: Analytic principles, latest developments, and new opportunities. J Genet Genomics 2021; 48:173-183. [PMID: 33896739 PMCID: PMC8286309 DOI: 10.1016/j.jgg.2021.01.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/04/2020] [Revised: 01/24/2021] [Accepted: 01/25/2021] [Indexed: 12/23/2022]
Abstract
Pathway analysis, also known as gene-set enrichment analysis, is a multilocus analytic strategy that integrates a priori biological knowledge into the statistical analysis of high-throughput genetics data. Originally developed for studies of gene expression data, it has become a powerful analytic procedure for in-depth mining of genome-wide genetic variation data. Astonishing discoveries have been made in recent years, uncovering genes and biological mechanisms underlying common and complex disorders. However, as massive amounts of diverse functional genomics data accrue, there is a pressing need for newer generations of pathway analysis methods that can utilize multiple layers of high-throughput genomics data. In this review, we provide an intellectual foundation for this powerful analytic strategy, as well as an update on the state of the art in recent method developments. The goal of this review is threefold: (1) to introduce the motivation and basic steps of pathway analysis for genome-wide genetic variation data; (2) to review the merits and shortcomings of classic and newly emerging integrative pathway analysis tools; and (3) to discuss remaining challenges and future directions for further method development.
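As a rough illustration of the idea behind pathway analysis, the sketch below runs a simple over-representation test: given a list of hit genes, it asks whether a pathway contains more of them than chance would predict. This is one of the simplest forms of the approach and is not taken from the review itself; the function name and all gene counts are made up for illustration.

```python
from math import comb

def hypergeom_tail(n_genes, n_pathway, n_hits, n_overlap):
    """P(X >= n_overlap) under the hypergeometric null.

    n_genes   : total genes in the background (hypothetical value below)
    n_pathway : genes annotated to the pathway
    n_hits    : genes flagged as significant in the study
    n_overlap : flagged genes that fall in the pathway
    """
    upper = min(n_pathway, n_hits)
    favourable = sum(
        comb(n_pathway, i) * comb(n_genes - n_pathway, n_hits - i)
        for i in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_genes, n_hits)

# Toy numbers only: 20000 background genes, a 100-gene pathway,
# 300 flagged genes, 8 of which sit in the pathway.
p_value = hypergeom_tail(20000, 100, 300, 8)
print(f"over-representation p-value: {p_value:.3g}")
```

In practice the test would be repeated for every pathway in an annotation database and the resulting p-values corrected for multiple testing.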
Affiliation(s)
- Micah Silberstein
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Nicholas Nesbit
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Jacquelyn Cai
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Phil H Lee
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
  - Department of Psychiatry, Harvard Medical School, Boston, MA 02115, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
6
Lynn H, Sun X, Casanova N, Gonzales-Garay M, Bime C, Garcia JGN. Genomic and Genetic Approaches to Deciphering Acute Respiratory Distress Syndrome Risk and Mortality. Antioxid Redox Signal 2019; 31:1027-1052. [PMID: 31016989 PMCID: PMC6939590 DOI: 10.1089/ars.2018.7701] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Indexed: 02/06/2023]
Abstract
Significance: Acute respiratory distress syndrome (ARDS) is a severe, highly heterogeneous critical illness with staggering mortality that is influenced by environmental factors, such as mechanical ventilation, and by genetic factors. Significant unmet needs in ARDS include the paucity of validated predictive biomarkers for ARDS risk and susceptibility, which hampers the conduct of successful clinical trials, and the complete absence of novel disease-modifying therapeutic strategies. Recent Advances: The current ARDS definition relies on clinical characteristics that fail to capture the diversity of disease pathology, severity, and mortality risk. We undertook a comprehensive survey of the available ARDS literature to identify genes and genetic variants (from candidate gene and limited genome-wide association study approaches) implicated in susceptibility to developing ARDS, in the hope of uncovering novel biomarkers for ARDS risk and mortality and potentially novel therapeutic targets. We further attempted to address the well-known health disparities in susceptibility to and mortality from ARDS. Critical Issues: Bioinformatic analyses identified 201 ARDS candidate genes, with pathway analysis indicating a strong predominance of key evolutionarily conserved inflammatory pathways, including reactive oxygen species, innate immunity-related inflammation, and endothelial vascular signaling pathways. Future Directions: Future studies employing a systems biology approach that combines clinical characteristics, genomics, transcriptomics, and proteomics may allow a better definition of biologically relevant pathways and genotype-phenotype connections and result in improved strategies for sub-phenotyping diverse ARDS patients via molecular signatures. These efforts should facilitate successful clinical trials in ARDS and yield a better fundamental understanding of ARDS pathobiology.
Affiliation(s)
- Heather Lynn
  - Department of Physiological Sciences and University of Arizona, Tucson, Arizona
  - Department of Health Sciences, University of Arizona, Tucson, Arizona
- Xiaoguang Sun
  - Department of Health Sciences, University of Arizona, Tucson, Arizona
- Nancy Casanova
  - Department of Health Sciences, University of Arizona, Tucson, Arizona
- Christian Bime
  - Department of Health Sciences, University of Arizona, Tucson, Arizona
- Joe G N Garcia
  - Department of Health Sciences, University of Arizona, Tucson, Arizona
7
Fang X, Zheng Y, Duan Y, Liu Y, Zhong W. Recent Advances in Design of Fluorescence-Based Assays for High-Throughput Screening. Anal Chem 2019; 91:482-504. [PMID: 30481456 PMCID: PMC7262998 DOI: 10.1021/acs.analchem.8b05303] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Indexed: 12/11/2022]
Affiliation(s)
- Xiaoni Fang
  - Department of Chemistry, University of California, Riverside, California 92521, United States
- Yongzan Zheng
  - Department of Chemistry, University of California, Riverside, California 92521, United States
- Yaokai Duan
  - Department of Chemistry, University of California, Riverside, California 92521, United States
- Yang Liu
  - Environmental Toxicology Graduate Program, University of California, Riverside, California 92521, United States
- Wenwan Zhong
  - Department of Chemistry, University of California, Riverside, California 92521, United States
  - Environmental Toxicology Graduate Program, University of California, Riverside, California 92521, United States
8
Wu ZY, Li JR, Huang MH, Cheng JJ, Li H, Chen JH, Lv XQ, Peng ZG, Jiang JD. Internal driving factors leading to extrahepatic manifestation of the hepatitis C virus infection. Int J Mol Med 2017; 40:1792-1802. [PMID: 29039494 PMCID: PMC5716440 DOI: 10.3892/ijmm.2017.3175] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/27/2017] [Accepted: 09/26/2017] [Indexed: 02/07/2023] Open
Abstract
Hepatitis C virus (HCV) infection is associated with various extrahepatic manifestations, which are correlated with poor outcomes and thus increase the morbidity and mortality of chronic hepatitis C (CHC). Understanding the internal linkages between systemic manifestations and HCV infection is therefore helpful for the treatment of CHC. Yet the mechanism by which the virus evokes these systemic diseases remains to be elucidated. In the present study, microarray mRNA data from HCV-infected and uninfected Huh7.5 cells were comprehensively analyzed using gene set enrichment analysis (GSEA) and signaling pathway impact analysis (SPIA), and the signaling pathways that were significantly activated or inhibited, together with molecules commonly important across those pathways, were selected. Forty signaling pathways were selected using GSEA and eight using SPIA. These pathways are associated with cancer, metabolism, environmental information processing and organismal systems, and provide important information for clarifying the intrinsic associations between the syndromes of HCV infection. Seven of the pathways had not previously been reported: basal transcription factors, pathogenic Escherichia coli infection, shigellosis, gastric acid secretion, dorso-ventral axis formation, amoebiasis and cholinergic synapse. Ten genes that may be the key internal driving molecules (SOS1, RAF1, IFNA2, IFNG, MTHFR, IGF1, CALM3, UBE2B, TP53 and BMP7) were selected using the online tool Anni 2.1. The present study thus demonstrated the internal linkages between systemic manifestations and HCV infection and presented the potential molecules that are key to those linkages.
Affiliation(s)
- Zhou-Yi Wu
  - Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Jian-Rui Li
  - Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Meng-Hao Huang
  - Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Jun-Jun Cheng
  - Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Hu Li
  - Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Jin-Hua Chen
  - Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Xiao-Qin Lv
  - Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Zong-Gen Peng
  - Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Jian-Dong Jiang
  - Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
9
Yan S, Wu G. Reorganization of gene network for degradation of polycyclic aromatic hydrocarbons (PAHs) in Pseudomonas aeruginosa PAO1 under several conditions. J Appl Genet 2017; 58:545-563. [PMID: 28685384 PMCID: PMC5655620 DOI: 10.1007/s13353-017-0402-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 11/28/2016] [Revised: 05/22/2017] [Accepted: 06/06/2017] [Indexed: 01/05/2023]
Abstract
Although polycyclic aromatic hydrocarbons (PAHs) are harmful to human health, their elimination from the environment is not easy. Biodegradation of PAHs is promising, since many bacteria can use hydrocarbons as their sole carbon and energy sources for growth. Of the various microorganisms that can degrade PAHs, Pseudomonas aeruginosa is particularly important, not only because it causes a series of diseases, including infection in cystic fibrosis patients, but also because it is a model bacterium in various studies. The genes responsible for degrading PAHs have been identified in P. aeruginosa; however, no gene acts alone, as various stresses often initiate different metabolic pathways, quorum sensing, biofilm formation, antibiotic tolerance, etc. It is therefore important to study how PAH degradation genes behave under different conditions. In this study, we apply network analysis to investigate how 46 PAH degradation genes reorganized among 5549 genes in P. aeruginosa PAO1 under nine different conditions, using publicly available gene coexpression data from GEO. The results are novel in six respects: (i) comparing the number of gene clusters before and after stresses, (ii) comparing the membership of each gene cluster before and after stresses, (iii) defining which genes changed membership together with PAH degradation genes before and after stresses, (iv) classifying membership-changed genes by category in the Pseudomonas Genome Database, (v) postulating the functions of unknown genes, and (vi) proposing new mechanisms for genes of interest. This study sheds light on the cooperative mechanisms of PAH degradation at the level of all genes in an organism and paves the way for similar studies on other genes.
Affiliation(s)
- Shaomin Yan
  - Bioscience and Technology Research Center, Guangxi Academy of Sciences, 98 Daling Road, Nanning, Guangxi, 530007, China
- Guang Wu
  - Bioscience and Technology Research Center, Guangxi Academy of Sciences, 98 Daling Road, Nanning, Guangxi, 530007, China
10
Raddatz BB, Spitzbarth I, Matheis KA, Kalkuhl A, Deschl U, Baumgärtner W, Ulrich R. Microarray-Based Gene Expression Analysis for Veterinary Pathologists: A Review. Vet Pathol 2017. [PMID: 28641485 DOI: 10.1177/0300985817709887] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 02/06/2023]
Abstract
High-throughput, genome-wide transcriptome analysis is now commonly used in all fields of life science research and is on the cusp of medical and veterinary diagnostic application. Transcriptomic methods such as microarrays and next-generation sequencing generate enormous amounts of data. The pathogenetic expertise acquired from an understanding of general pathology provides veterinary pathologists with a profound background, which is essential in translating transcriptomic data into meaningful biological knowledge, thereby leading to a better understanding of underlying disease mechanisms. The scientific literature on high-throughput data-mining techniques usually addresses mathematicians or computer scientists as the target audience. In contrast, the present review provides the reader with a clear and systematic basis from a veterinary pathologist's perspective. The aims are therefore (1) to introduce the reader to the necessary methodological background; (2) to introduce the sequential steps commonly performed in a microarray analysis, including quality control, annotation, normalization, selection of differentially expressed genes, clustering, gene ontology and pathway analysis, analysis of manually selected genes, and biomarker discovery; and (3) to provide references to publicly available and user-friendly software suites. In summary, the data analysis methods presented in this review will enable veterinary pathologists to analyze high-throughput transcriptome data obtained from their own experiments, from supplemental data accompanying scientific publications, or from public repositories in order to gain more in-depth insight into underlying disease mechanisms.
Affiliation(s)
- Barbara B Raddatz
  - Department of Pathology, University of Veterinary Medicine Hannover, Hannover, Germany
  - Center of Systems Neuroscience, Hannover, Germany
- Ingo Spitzbarth
  - Department of Pathology, University of Veterinary Medicine Hannover, Hannover, Germany
  - Center of Systems Neuroscience, Hannover, Germany
- Katja A Matheis
  - Department of Nonclinical Drug Safety, Boehringer Ingelheim Pharma GmbH & Co KG, Biberach (Riß), Germany
- Arno Kalkuhl
  - Department of Nonclinical Drug Safety, Boehringer Ingelheim Pharma GmbH & Co KG, Biberach (Riß), Germany
- Ulrich Deschl
  - Department of Nonclinical Drug Safety, Boehringer Ingelheim Pharma GmbH & Co KG, Biberach (Riß), Germany
- Wolfgang Baumgärtner
  - Department of Pathology, University of Veterinary Medicine Hannover, Hannover, Germany
  - Center of Systems Neuroscience, Hannover, Germany
- Reiner Ulrich
  - Department of Pathology, University of Veterinary Medicine Hannover, Hannover, Germany
  - Center of Systems Neuroscience, Hannover, Germany
  - Department of Experimental Animal Facilities and Biorisk Management, Friedrich-Loeffler-Institute, Greifswald, Germany
11
SO2 Emissions in China - Their Network and Hierarchical Structures. Sci Rep 2017; 7:46216. [PMID: 28387301 PMCID: PMC5384192 DOI: 10.1038/srep46216] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 10/18/2016] [Accepted: 03/13/2017] [Indexed: 11/29/2022] Open
Abstract
SO2 emissions have various harmful effects on the environment and human health. SO2 emissions in China contribute significantly to global SO2 emissions, so it is necessary to study them in great detail with a variety of methods in order to lay the foundation for policymaking to improve environmental conditions in China. Network analysis is used to analyze SO2 emissions from the power generation, industrial, residential and transportation sectors in China for 2008 and 2010, which were recently made available from 1744 ground surface monitoring stations. The results show that SO2 emissions from the power generation sector were highly individualized as small clusters; SO2 emissions from the industrial sector underwent an integration process, with one large cluster containing 1674 places and covering all industrial areas in China; SO2 emissions from the residential sector did not change over time; and SO2 emissions from the transportation sector underwent significant integration. A hierarchical structure is obtained by further combining SO2 emissions from all four sectors and is potentially useful for finding similar patterns of SO2 emissions, which can inform the understanding of the mechanisms of SO2 pollution and the design of different environmental measures to combat SO2 emissions.
12
Zhang Q, Jun SR, Leuze M, Ussery D, Nookaew I. Viral Phylogenomics Using an Alignment-Free Method: A Three-Step Approach to Determine Optimal Length of k-mer. Sci Rep 2017; 7:40712. [PMID: 28102365 PMCID: PMC5244389 DOI: 10.1038/srep40712] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 07/05/2016] [Accepted: 12/08/2016] [Indexed: 11/25/2022] Open
Abstract
The development of rapid, economical genome sequencing has shed new light on the classification of viruses. As of October 2016, the National Center for Biotechnology Information (NCBI) database contained >2 million viral genome sequences and a reference set of ~4000 viral genome sequences that cover a wide range of known viral families. Whole-genome sequences can be used to improve viral classification and provide insight into the viral "tree of life". However, due to the lack of evolutionary conservation amongst diverse viruses, it is not feasible to build a viral tree of life using traditional phylogenetic methods based on conserved proteins. In this study, we used an alignment-free method that uses k-mers as genomic features for a large-scale comparison of complete viral genomes available in RefSeq. To determine the optimal feature length, k (an essential step in constructing a meaningful dendrogram), we designed a comprehensive strategy that combines three approaches: (1) cumulative relative entropy, (2) average number of common features among genomes, and (3) the Shannon diversity index. This strategy was used to determine k for all 3,905 complete viral genomes in RefSeq. The resulting dendrogram shows consistency with the viral taxonomy of the ICTV and the Baltimore classification of viruses.
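To make the k-mer idea concrete, the following sketch counts overlapping k-mers in a sequence and computes a Shannon diversity index over their frequencies for several values of k. It is only an illustration of one of the three criteria the abstract names; the exact formulation used in the paper is not given here, and the function names and the toy sequence are assumptions.

```python
from collections import Counter
from math import log

def kmer_counts(seq, k):
    """Count overlapping k-mers of length k in a sequence string."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over the observed k-mer frequencies."""
    total = sum(counts.values())
    return -sum((n / total) * log(n / total) for n in counts.values())

# Toy sequence for illustration only, not real viral data.
genome = "ATGCGATACGCTTGAGGATCCATGCGATAC"
for k in range(1, 6):
    h = shannon_diversity(kmer_counts(genome, k))
    print(f"k={k}  H={h:.3f}")
```

Scanning H across k in this way is one possible ingredient in choosing a feature length; the other two criteria mentioned in the abstract would need their own calculations.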
Affiliation(s)
- Qian Zhang
  - UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37996, USA
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Se-Ran Jun
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
  - Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
- Michael Leuze
  - Joint Institute for Computational Sciences, University of Tennessee, Knoxville, TN 37831, USA
  - Computational Biomolecular Modeling and Bioinformatics Group, Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- David Ussery
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
  - Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
- Intawat Nookaew
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
  - Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
|
13
|
Affiliation(s)
- Jayasree Sengupta
  - Department of Physiology, All India Institute of Medical Sciences, New Delhi, India
- G. Anupa
  - Department of Physiology, All India Institute of Medical Sciences, New Delhi, India
- Muzaffer Ahmed Bhat
  - Department of Physiology, All India Institute of Medical Sciences, New Delhi, India
- Debabrata Ghosh
  - Department of Physiology, All India Institute of Medical Sciences, New Delhi, India
|
14
|
Network Analysis of Fine Particulate Matter (PM2.5) Emissions in China. Sci Rep 2016; 6:33227. [PMID: 27608625 PMCID: PMC5016853 DOI: 10.1038/srep33227] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 05/13/2016] [Accepted: 08/23/2016] [Indexed: 11/30/2022] Open
Abstract
Specifying the spatial and temporal characteristics of PM2.5 is important for understanding its adverse effects and for policymaking. We applied network analysis to the MIX dataset, which contains PM2.5 emissions recorded from 2168 monitoring stations in China in 2008 and 2010. The results showed that, for PM2.5 emissions from the industrial sector, 8 clusters were found in 2008 but had merged into one huge cluster by 2010, suggesting that the industrial sector underwent an integrating process. For PM2.5 emissions from the electricity generation sector, strong locality of clusters was revealed, implying that each region had its own electricity generation system. For PM2.5 emissions from the residential sector, the same pattern of 10 clusters was uncovered in both years, implying that household energy consumption was unchanged from 2008 to 2010. For PM2.5 emissions from the transportation sector, the same pattern of 5 clusters with many connections in between was revealed, indicating the rapid nationwide development of transportation. In addition to the known elements, mercury (Hg) surfaced as an element for particle nucleation. To our knowledge, this is the first network study in this field.
|
15
|
Abstract
The exponential growth of the Internet of Things, together with the global popularity and remarkable decline in cost of the mobile phone, is driving the digital transformation of medical practice. The rapidly maturing digital, non-medical world of mobile (wireless) devices, cloud computing and social networking is coalescing with the emerging digital medical world of omics data, biosensors and advanced imaging, which offers the increasingly realistic prospect of personalized medicine. Described as a potential “seismic” shift from the current “healthcare” model to a “wellness” paradigm that is predictive, preventative, personalized and participatory, this change is based on the development of increasingly sophisticated biosensors that can track and measure key biochemical variables in people. Additional key drivers in this shift are metabolomic and proteomic signatures, which are increasingly being reported as pre-symptomatic, diagnostic and prognostic of toxicity and disease. These advancements also have profound implications for toxicological evaluation and safety assessment of pharmaceuticals and environmental chemicals. An approach based primarily on human in vivo and high-throughput in vitro human cell-line data is a distinct possibility. This would transform current chemical safety assessment practice from one that operates in a human “data poor” environment to one that operates in a human “data rich” environment. This could also lead to a seismic shift from the current animal-based to an animal-free chemical safety assessment paradigm.
Affiliation(s)
- George D Loizou
  - Health Risks, Health and Safety Laboratory, Health and Safety Executive, Buxton, UK
|
16
|
Ning WF, Wang F, Deng HJ, Chen HH. Screening of differentially expressed genes in chronic hepatitis B patients and prediction of related biological pathways. Shijie Huaren Xiaohua Zazhi 2016; 24:2485-2491. [DOI: 10.11569/wcjd.v24.i16.2485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2023] Open
Abstract
AIM: To study the molecular mechanism of pathogenesis of chronic hepatitis B.
METHODS: Based on microarray experiment, GeneSpring software was used to screen differentially expressed genes in chronic hepatitis B patients, and GeneTrail software was used to perform enrichment analysis of related biological pathways.
RESULTS: A total of 417 differentially expressed genes were identified, of which 205 were upregulated and 212 downregulated. Significant pathways to which downregulated genes belong include ErbB, non-small cell lung cancer, mTOR, RNA degradation, T cell receptor, chronic myeloid leukemia, and renal cell carcinoma pathways. Significant pathways to which upregulated genes belong include chemokine, lysosomes, Vibrio cholerae infection, and IgG Fc receptor-mediated phagocytosis pathways.
CONCLUSION: PI3K/AKT downregulation is likely a major molecular mechanism of persistent hepatitis B.
|
17
|
Salazar J, Amri H, Noursi D, Abu-Asab M. Computational Tools for Parsimony Phylogenetic Analysis of Omics Data. OMICS: A Journal of Integrative Biology 2015; 19:471-7. [PMID: 26230532 PMCID: PMC4529085 DOI: 10.1089/omi.2015.0018] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 12/15/2022]
Abstract
High-throughput assays from genomics, proteomics, metabolomics, and next generation sequencing produce massive omics datasets that are challenging to analyze in biological or clinical contexts. Thus far, there is no publicly available program for converting quantitative omics data into input formats to be used in off-the-shelf robust phylogenetic programs. To the best of our knowledge, this is the first report on the creation of two Windows-based programs, OmicsTract and SynpExtractor, to address this gap. We note, by way of introduction to the development of these programs, that one particularly useful bioinformatics inferential model is the phylogenetic cladogram. Cladograms are multidimensional tools that show the relatedness between subgroups of healthy and diseased individuals and the latter's shared aberrations; they also reveal some characteristics of a disease that would not otherwise be apparent by other analytical methods. OmicsTract and SynpExtractor were written for the respective tasks of (1) accommodating advanced phylogenetic parsimony analysis (through the standard programs MIX [from PHYLIP] and TNT), and (2) extracting shared aberrations at the cladogram nodes. OmicsTract converts comma-delimited data tables by assigning each data point a binary value ("0" for normal states and "1" for abnormal states), then outputs the converted data tables in the proper input file formats for MIX or with embedded commands for TNT. SynpExtractor uses outfiles from MIX and TNT to extract the shared aberrations of each node of the cladogram, matching them with identifying labels from the dataset and exporting them into a comma-delimited file. Labels may be gene identifiers in gene-expression datasets or m/z values in mass spectrometry datasets.
By automating these steps, OmicsTract and SynpExtractor offer a veritable opportunity for rapid and standardized phylogenetic analyses of omics data; their model can also be extended to next generation sequencing (NGS) data. We make OmicsTract and SynpExtractor publicly and freely available for non-commercial use in order to strengthen and build capacity for the phylogenetic paradigm of omics analysis.
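The binary conversion step described in this abstract can be sketched in Python as follows. This is not the OmicsTract implementation: the abstract does not say how "normal" is determined, so the sketch assumes, purely for illustration, that a value is normal ("0") when it lies within the minimum-to-maximum range of the designated normal samples for that feature, and abnormal ("1") otherwise. The function name `binarize`, the table layout (features as rows, samples as columns), and the sample data are all hypothetical, and the output is a plain dictionary of character strings rather than the MIX or TNT input formats.

```python
import csv
import io


def binarize(table_csv: str, normal_ids: set) -> dict:
    """Map each sample to a string of 0/1 states, one character per feature row.

    Assumption (not from the source): a value is "0" if it falls inside the
    min-max range of the normal samples for that feature, otherwise "1".
    """
    rows = list(csv.reader(io.StringIO(table_csv.strip())))
    header, body = rows[0], rows[1:]
    samples = header[1:]  # first column holds the feature label
    states = {s: "" for s in samples}
    for row in body:
        values = [float(v) for v in row[1:]]
        normals = [v for s, v in zip(samples, values) if s in normal_ids]
        lo, hi = min(normals), max(normals)
        for s, v in zip(samples, values):
            states[s] += "0" if lo <= v <= hi else "1"
    return states


# Hypothetical comma-delimited table: two normal (N) and two diseased (D) samples.
table = """feature,N1,N2,D1,D2
geneA,1.0,1.2,3.5,1.1
geneB,5.0,5.5,5.2,2.0
"""
for sample, chars in binarize(table, {"N1", "N2"}).items():
    print(sample, chars)
```

Each resulting string is a row of binary characters of the kind that parsimony programs take as input; writing them out in a specific program's file format is left out because the abstract does not describe that format.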
Affiliation(s)
- Jose Salazar
  - Section of Immunopathology, Laboratory of Immunology, National Eye Institute, Bethesda, Maryland
- Hakima Amri
  - Department of Biochemistry and Cellular and Molecular Biology, Division of Integrative Physiology, Medical Center, Georgetown University, Washington, District of Columbia
- David Noursi
  - Section of Immunopathology, Laboratory of Immunology, National Eye Institute, Bethesda, Maryland
- Mones Abu-Asab
  - Section of Immunopathology, Laboratory of Immunology, National Eye Institute, Bethesda, Maryland
|