1. Wytock TP, Motter AE. Cell reprogramming design by transfer learning of functional transcriptional networks. Proc Natl Acad Sci U S A 2024; 121:e2312942121. [PMID: 38437548 PMCID: PMC10945810 DOI: 10.1073/pnas.2312942121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 01/26/2024] [Indexed: 03/06/2024] Open
Abstract
Recent developments in synthetic biology, next-generation sequencing, and machine learning provide an unprecedented opportunity to rationally design new disease treatments based on measured responses to gene perturbations and drugs to reprogram cells. The main challenges to seizing this opportunity are the incomplete knowledge of the cellular network and the combinatorial explosion of possible interventions, both of which are insurmountable by experiments. To address these challenges, we develop a transfer learning approach to control cell behavior that is pre-trained on transcriptomic data associated with human cell fates, thereby generating a model of the network dynamics that can be transferred to specific reprogramming goals. The approach combines transcriptional responses to gene perturbations to minimize the difference between a given pair of initial and target transcriptional states. We demonstrate our approach's versatility by applying it to a microarray dataset comprising >9,000 microarrays across 54 cell types and 227 unique perturbations, and an RNASeq dataset consisting of >10,000 sequencing runs across 36 cell types and 138 perturbations. Our approach reproduces known reprogramming protocols with an AUROC of 0.91 while innovating over existing methods by pre-training an adaptable model that can be tailored to specific reprogramming transitions. We show that the number of gene perturbations required to steer from one fate to another increases with decreasing developmental relatedness and that fewer genes are needed to progress along developmental paths than to regress. These findings establish a proof-of-concept for our approach to computationally design control strategies and provide insights into how gene regulatory networks govern phenotype.
Affiliation(s)
- Thomas P. Wytock
  - Department of Physics and Astronomy, Northwestern University, Evanston, IL 60208
  - Center for Network Dynamics, Northwestern University, Evanston, IL 60208
- Adilson E. Motter
  - Department of Physics and Astronomy, Northwestern University, Evanston, IL 60208
  - Center for Network Dynamics, Northwestern University, Evanston, IL 60208
  - Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, IL 60208
  - Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL 60208
  - National Institute for Theory and Mathematics in Biology, Evanston, IL 60208

2. Wytock TP, Motter AE. Cell reprogramming design by transfer learning of functional transcriptional networks. arXiv 2024:arXiv:2403.04837v1. [PMID: 38495570 PMCID: PMC10942484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
Recent developments in synthetic biology, next-generation sequencing, and machine learning provide an unprecedented opportunity to rationally design new disease treatments based on measured responses to gene perturbations and drugs to reprogram cell behavior. The main challenges to seizing this opportunity are the incomplete knowledge of the cellular network and the combinatorial explosion of possible interventions, both of which are insurmountable by experiments. To address these challenges, we develop a transfer learning approach to control cell behavior that is pre-trained on transcriptomic data associated with human cell fates to generate a model of the functional network dynamics that can be transferred to specific reprogramming goals. The approach additively combines transcriptional responses to gene perturbations (single-gene knockdowns and overexpressions) to minimize the transcriptional difference between a given pair of initial and target states. We demonstrate the flexibility of our approach by applying it to a microarray dataset comprising over 9,000 microarrays across 54 cell types and 227 unique perturbations, and an RNASeq dataset consisting of over 10,000 sequencing runs across 36 cell types and 138 perturbations. Our approach reproduces known reprogramming protocols with an average AUROC of 0.91 while innovating over existing methods by pre-training an adaptable model that can be tailored to specific reprogramming transitions. We show that the number of gene perturbations required to steer from one fate to another increases as the developmental relatedness decreases. We also show that fewer genes are needed to progress along developmental paths than to regress. Together, these findings establish a proof-of-concept for our approach to computationally design control strategies and demonstrate their ability to provide insights into the dynamics of gene regulatory networks.
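The additive combination described in this abstract can be illustrated with a minimal sketch. It is not the authors' transfer learning model: it assumes each perturbation has a fixed response vector, that responses simply add, and that perturbations are chosen greedily to shrink the squared distance to the target state. The function names, the three-gene toy vectors, and the perturbation labels are all hypothetical.

```python
from typing import Dict, List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_perturbation_set(
    initial: Sequence[float],
    target: Sequence[float],
    responses: Dict[str, Sequence[float]],
    max_perturbations: int = 3,
) -> List[str]:
    """Greedily pick perturbations whose summed responses move `initial` toward `target`."""
    state = list(initial)
    chosen: List[str] = []
    best = squared_distance(state, target)
    for _ in range(max_perturbations):
        candidate, candidate_dist = None, best
        for name, delta in responses.items():
            if name in chosen:
                continue
            # Additivity assumption: the response is added to the current state.
            trial = [s + d for s, d in zip(state, delta)]
            dist = squared_distance(trial, target)
            if dist < candidate_dist:
                candidate, candidate_dist = name, dist
        if candidate is None:  # no remaining perturbation reduces the distance
            break
        chosen.append(candidate)
        state = [s + d for s, d in zip(state, responses[candidate])]
        best = candidate_dist
    return chosen


# Toy example with three genes and made-up response vectors (hypothetical data).
initial_state = [0.0, 0.0, 0.0]
target_state = [1.0, -1.0, 0.0]
toy_responses = {
    "overexpress_A": [1.0, 0.0, 0.0],
    "knockdown_B": [0.0, -1.0, 0.0],
    "overexpress_C": [0.0, 0.0, 1.0],
}
selected = greedy_perturbation_set(initial_state, target_state, toy_responses)
print(selected)
```

In this toy case the greedy search stops after two perturbations because the third would move the state away from the target. A real application would estimate the response vectors from data and would likely need a less myopic search.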
Affiliation(s)
- Thomas P Wytock
  - Department of Physics and Astronomy, Northwestern University, Evanston, Illinois 60208, USA
  - Center for Network Dynamics, Northwestern University, Evanston, Illinois 60208, USA
- Adilson E Motter
  - Department of Physics and Astronomy, Northwestern University, Evanston, Illinois 60208, USA
  - Center for Network Dynamics, Northwestern University, Evanston, Illinois 60208, USA
  - Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, Illinois 60208, USA
  - Northwestern Institute on Complex Systems, Northwestern University, Evanston, Illinois 60208, USA
  - National Institute for Theory and Mathematics in Biology, Evanston, Illinois 60208, USA

3. Ziegler ME, Sorensen AM, Banyard DA, Sayadi LR, Chnari E, Hatch MM, Tassey J, Mirzakhanyan Y, Gershon PD, Hughes CC, Evans GR, Widgerow AD. Deconstructing Allograft Adipose and Fascia Matrix: Fascia Matrix Improves Angiogenesis, Volume Retention, and Adipogenesis in a Rodent Model. Plast Reconstr Surg 2023; 151:108-117. [PMID: 36219861 PMCID: PMC10081826 DOI: 10.1097/prs.0000000000009794] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/25/2023]
Abstract
BACKGROUND Autologous fat grafting is commonly used for soft-tissue repair (approximately 90,000 cases per year in the United States), but outcomes are limited by volume loss (20% to 80%) over time. Human allograft adipose matrix (AAM) stimulates de novo adipogenesis in vivo, but retention requires optimization. The extracellular matrix derived from superficial fascia, interstitial within the adipose layer, is typically removed during AAM processing. Thus, fascia, which contains numerous important proteins, might cooperate with AAM to stimulate de novo adipogenesis, improving long-term retention compared to AAM alone. METHODS Human AAM and fascia matrix proteins (back and upper leg regions) were identified by mass spectrometry and annotated by gene ontology. A three-dimensional in vitro angiogenesis assay was performed. Finally, AAM and/or fascia (1 mL) was implanted into 6- to 8-week-old male Fischer rats. After 8 weeks, the authors assessed graft retention by gas pycnometry and angiogenesis (CD31) and adipocyte counts (hematoxylin and eosin) histologically. RESULTS Gene ontology annotation revealed an angiogenic enrichment pattern unique to the fascia, including lactadherin, collagen alpha-3(V) chain, and tenascin-C. In vitro, AAM stimulated 1.0 ± 0.17 angiogenic sprouts per bead. The addition of fascia matrix increased sprouting by 88% (2.0 ± 0.12; P < 0.001). A similar angiogenic response (CD31) was observed in vivo. Graft retention volume was 25% (0.25 ± 0.13) for AAM, significantly increasing to 60% (0.60 ± 0.14) for AAM/fascia ( P < 0.05). De novo adipogenesis was 12% (12.4 ± 7.4) for AAM, significantly increasing to 51% (51.2 ± 8.0) for AAM/fascia ( P < 0.001) by means of adipocyte quantification. CONCLUSIONS Combining fascia matrix with AAM improves angiogenesis and adipogenesis compared to AAM alone in rats. These preliminary in vitro and pilot animal studies should be further validated before definitive clinical adoption. 
CLINICAL RELEVANCE STATEMENT When producing an off-the-shelf adipose inducing product by adding a connective tissue fascial component (that is normally discarded) to the mix of adipose matrix, vasculogenesis is increased and, thus, adipogenesis and graft survival is improved. This is a significant advance in this line of product.
Affiliation(s)
- Mary E. Ziegler
  - Center for Tissue Engineering, UC Irvine Department of Plastic Surgery, Orange, CA, USA
- Derek A. Banyard
  - Center for Tissue Engineering, UC Irvine Department of Plastic Surgery, Orange, CA, USA
- Lohrasb R. Sayadi
  - Center for Tissue Engineering, UC Irvine Department of Plastic Surgery, Orange, CA, USA
- Michaela M. Hatch
  - Department of Molecular Biology and Biochemistry, School of Biological Sciences, UC Irvine, USA
- Jade Tassey
  - Center for Tissue Engineering, UC Irvine Department of Plastic Surgery, Orange, CA, USA
- Yeva Mirzakhanyan
  - Department of Molecular Biology and Biochemistry, School of Biological Sciences, UC Irvine, USA
- Paul D. Gershon
  - Department of Molecular Biology and Biochemistry, School of Biological Sciences, UC Irvine, USA
- Christopher C.W. Hughes
  - Department of Molecular Biology and Biochemistry, School of Biological Sciences, UC Irvine, USA; Department of Biomedical Engineering, The Henry Samueli School of Engineering, UC Irvine, USA; The Edwards Lifesciences Center for Advanced Cardiovascular Technology, UC Irvine, USA
- Gregory R.D. Evans
  - Center for Tissue Engineering, UC Irvine Department of Plastic Surgery, Orange, CA, USA
- Alan D. Widgerow
  - Center for Tissue Engineering, UC Irvine Department of Plastic Surgery, Orange, CA, USA

4. Liao H, Wen X, Deng X, Wu Y, Xu J, Li X, Zhou S, Li X, Zhu C, Luo F, Ma Y, Zheng J. Integrated proteomic and metabolomic analyses reveal significant changes in chloroplasts and mitochondria of pepper (Capsicum annuum L.) during Sclerotium rolfsii infection. J Microbiol 2022; 60:511-525. [DOI: 10.1007/s12275-022-1603-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Revised: 01/26/2022] [Accepted: 02/04/2022] [Indexed: 10/18/2022]

5. Zhang HB, Ding XB, Jin J, Guo WP, Yang QL, Chen PC, Yao H, Ruan L, Tao YT, Chen X. Predicted mouse interactome and network-based interpretation of differentially expressed genes. PLoS One 2022; 17:e0264174. [PMID: 35390003 PMCID: PMC8989236 DOI: 10.1371/journal.pone.0264174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2021] [Accepted: 02/04/2022] [Indexed: 11/18/2022] Open
Abstract
The house mouse or Mus musculus has become a premier mammalian model for genetic research due to its genetic and physiological similarities to humans. It brought mechanistic insights into numerous human diseases and has been routinely used to assess drug efficiency and toxicity, as well as to predict patient responses. To facilitate molecular mechanism studies in mouse, we present the Mouse Interactome Database (MID, Version 1), which includes 155,887 putative functional associations between mouse protein-coding genes inferred from functional association evidence integrated from 9 public databases. These putative functional associations are expected to cover 19.32% of all mouse protein interactions, and 26.02% of these function associations may represent protein interactions. On top of MID, we developed a gene set linkage analysis (GSLA) web tool to annotate potential functional impacts from observed differentially expressed genes. Two case studies show that the MID/GSLA system provided precise and informative annotations that other widely used gene set annotation tools, such as PANTHER and DAVID, did not. Both MID and GSLA are accessible through the website http://mouse.biomedtzc.cn.
Affiliation(s)
- Hai-Bo Zhang
  - Institute of Big Data and Artificial Intelligence in Medicine, School of Electronics & Information Engineering, Taizhou University, Taizhou, China
- Xiao-Bao Ding
  - Institute of Big Data and Artificial Intelligence in Medicine, School of Electronics & Information Engineering, Taizhou University, Taizhou, China
- Jie Jin
  - Institute of Big Data and Artificial Intelligence in Medicine, School of Electronics & Information Engineering, Taizhou University, Taizhou, China
- Wen-Ping Guo
  - Institute of Big Data and Artificial Intelligence in Medicine, School of Electronics & Information Engineering, Taizhou University, Taizhou, China
- Qiao-Lei Yang
  - Institute of Pharmaceutical Biotechnology, School of Medicine, Zhejiang University, Hangzhou, China
- Peng-Cheng Chen
  - Institute of Pharmaceutical Biotechnology, School of Medicine, Zhejiang University, Hangzhou, China
- Heng Yao
  - Institute of Pharmaceutical Biotechnology, School of Medicine, Zhejiang University, Hangzhou, China
- Li Ruan
  - Institute of Big Data and Artificial Intelligence in Medicine, School of Electronics & Information Engineering, Taizhou University, Taizhou, China
- Yu-Tian Tao
  - Institute of Big Data and Artificial Intelligence in Medicine, School of Electronics & Information Engineering, Taizhou University, Taizhou, China
  - * E-mail: (YTT); (XC)
- Xin Chen
  - Institute of Big Data and Artificial Intelligence in Medicine, School of Electronics & Information Engineering, Taizhou University, Taizhou, China
  - Institute of Pharmaceutical Biotechnology, School of Medicine, Zhejiang University, Hangzhou, China
  - Joint Institute for Genetics and Genome Medicine between Zhejiang University and University of Toronto, Zhejiang University, Hangzhou, China
  - * E-mail: (YTT); (XC)

6. Alotaibi F, Alharbi S, Alotaibi M, Al Mosallam M, Motawei M, Alrajhi A. Wheat omics: Classical breeding to new breeding technologies. Saudi J Biol Sci 2021; 28:1433-1444. [PMID: 33613071 PMCID: PMC7878716 DOI: 10.1016/j.sjbs.2020.11.083] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/17/2020] [Revised: 11/26/2020] [Accepted: 11/29/2020] [Indexed: 12/26/2022] Open
Abstract
Wheat is an important cereal crop, and its significance is heightened by competition for dietary products worldwide. The wheat crop faces many constraints, including environmental hazards, biotic and abiotic stresses, and heavy metals, which decrease yield. Understanding the molecular mechanisms related to these factors is important for identifying the genes regulated under specific conditions. Classical breeding using hybridization has been used to increase yield but has not succeeded at the desired level. The development of newly emerging technologies in the biological sciences, i.e., marker-assisted breeding (MAB), QTL mapping, mutation breeding, proteomics, metabolomics, next-generation sequencing (NGS), RNA sequencing, transcriptomics, differentially expressed genes (DEGs), computational resources, and genome editing techniques (i.e., CRISPR-Cas9 and Cas13), has advanced the field of omics. Applying new breeding technologies generates large amounts of data, and considerable development in bioinformatics is needed to interpret them. However, the combined application of omics to address physiological questions linked with genetics is still a challenge. Moreover, viroid discovery opens a new direction for research, economics, and target specification. Comparative genomics, which is important for identifying genes of interest, is further discussed with regard to the identification of genes, genomic loci, and biochemical pathways linked with stress resilience in wheat. Furthermore, this review extensively discusses the omics approaches and their effective use. Integrated plant omics technologies have been applied to viroid genomes, in association with the CRISPR and CRISPR-associated Cas13a protein system used for engineering viroid interference, along with high-performance multidimensional phenotyping, a significant limiting factor for increasing stress resistance in wheat.
Affiliation(s)
- Fahad Alotaibi
  - King Abdulaziz City for Science and Technology (KACST), Riyadh, Saudi Arabia
- Saif Alharbi
  - King Abdulaziz City for Science and Technology (KACST), Riyadh, Saudi Arabia
- Majed Alotaibi
  - King Abdulaziz City for Science and Technology (KACST), Riyadh, Saudi Arabia
- Mobarak Al Mosallam
  - King Abdulaziz City for Science and Technology (KACST), Riyadh, Saudi Arabia
- Abdullah Alrajhi
  - King Abdulaziz City for Science and Technology (KACST), Riyadh, Saudi Arabia

7. Borham M, Oreiby A, El-Gedawy A, Hegazy Y, Hemedan A, Al-Gaabary M. Abattoir survey of bovine tuberculosis in tanta, centre of the Nile delta, with in silico analysis of gene mutations and protein-protein interactions of the involved mycobacteria. Transbound Emerg Dis 2021; 69:434-450. [PMID: 33484233 DOI: 10.1111/tbed.14001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/23/2020] [Revised: 12/18/2020] [Accepted: 01/19/2021] [Indexed: 12/31/2022]
Abstract
Bovine tuberculosis is a transboundary disease of high economic and public health burden worldwide. In this study, post-mortem examination of 750 cattle and buffalo in Tanta abattoir, Centre of the Nile Delta, revealed visible TB in 4% of animals and a true prevalence of 6.85% (95% CI: 5.3%-8.9%). Mycobacterial culture, histopathology and RT-PCR targeting all members of M. tuberculosis complex were performed, upon which 85%, 80% and 100% of the tested lesions were confirmed as TB, respectively. Mpb70-targeting PCR was conducted on ten RT-PCR positive samples for sequencing and identified nine Mycobacterium (M.) bovis strains and, interestingly, one M. tuberculosis (Mtb) strain from a buffalo. Bioinformatics tools were used for prediction of mutations, nucleotide polymorphisms, lineages, drug resistance and protein-protein interactions (PPI) of the sequenced strains. The Mtb strain was resistant to rifampicin, isoniazid and streptomycin, and to the best of our knowledge, this is the first report of multidrug resistant (MDR)-Mtb originating from buffaloes. Seven M. bovis strains were resistant to ethambutol and ethionamide. Such resistances were associated with mutations in the KatG, rpoB, rpsL, embB and ethA genes. Other mutations and nucleotide polymorphisms were also predicted; some are reported for the first time and require experimental work for validation. PPI revealed more interactions than what would be expected for a random set of proteins of similar size and had dense interactions between nodes that are biologically connected, as a group. Two M. bovis strains belonged to BOV AFRI lineage (Spoligotypes BOV 1; BOV 2) and eight strains belonged to East-Asian (Beijing) lineage.
In conclusion, visible TB was prevalent in the study area, RT-PCR is the best method to confirm the disease, MDR-Mtb is associated with buffalo TB, and mycobacteria of different lineages carry many resistance genes to chemotherapeutic agents used in treatment of human TB, constituting a major public health risk.
Affiliation(s)
- Mohamed Borham
  - Bacteriology Department, Animal Health Research Institute Matrouh Lab, Matrouh, Egypt
- Atef Oreiby
  - Department of Animal Medicine (Infectious Diseases), Faculty of Veterinary Medicine, Kafrelsheikh University, Kafr El-Sheikh, Egypt
- Attia El-Gedawy
  - Bacteriology Department, Animal Health Research Institute, Cairo, Egypt
- Yamen Hegazy
  - Department of Animal Medicine (Infectious Diseases), Faculty of Veterinary Medicine, Kafrelsheikh University, Kafr El-Sheikh, Egypt
- Ahmed Hemedan
  - Bioinformatics Core, Luxembourg Centre For Systems Biomedicine, Luxembourg University, Luxembourg, Luxembourg
- Magdy Al-Gaabary
  - Department of Animal Medicine (Infectious Diseases), Faculty of Veterinary Medicine, Kafrelsheikh University, Kafr El-Sheikh, Egypt

8. Use of Network Pharmacology to Investigate the Mechanism by Which Allicin Ameliorates Lipid Metabolism Disorder in HepG2 Cells. Evidence-Based Complementary and Alternative Medicine 2021; 2021:3956504. [PMID: 33505493 PMCID: PMC7815415 DOI: 10.1155/2021/3956504] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/25/2020] [Revised: 12/16/2020] [Accepted: 12/26/2020] [Indexed: 02/02/2023]
Abstract
Allicin has been well documented to exhibit a wide spectrum of biological activities, especially lipid-lowering activity, as a promising candidate for the management of nonalcoholic fatty liver disease (NAFLD). However, the mechanisms underlying the therapeutic effects of allicin require further investigation. It is tempting to think of combining network pharmacology and experimental validation to investigate the mechanism by which allicin ameliorates lipid metabolism disorder in HepG2 cells. We established a cell model of hepatic steatosis induced by PA to investigate the antisteatotic effects of allicin. The studies showed that allicin reduced PA-induced lipid accumulation using Nile red staining and TC and TG assays. Then, 219 potential targets of allicin were successfully predicted by PharmMapper. According to Reactome Pathway Analysis, 44 potential targets related to lipid metabolism were screened out. Molecular signaling cascades mediated by allicin included PPARA, PPARG, FABP4, and FABP6 by cytoHubba and qPCR analysis. Results revealed that allicin activated the gene expression of PPARA and FABP6 and suppressed the gene expression of FABP4 and PPARG. Thus, the present study united the methods of network pharmacology and experimental validation to investigate the protein targets of allicin on PA-induced lipid metabolism disorders to supply a reference for related application for the first time.

9. Mishra R, Kohli S, Malhotra N, Bandyopadhyay P, Mehta M, Munshi M, Adiga V, Ahuja VK, Shandil RK, Rajmani RS, Seshasayee ASN, Singh A. Targeting redox heterogeneity to counteract drug tolerance in replicating Mycobacterium tuberculosis. Sci Transl Med 2020; 11:11/518/eaaw6635. [PMID: 31723039 DOI: 10.1126/scitranslmed.aaw6635] [Citation(s) in RCA: 70] [Impact Index Per Article: 14.0] [Received: 01/15/2019] [Revised: 06/26/2019] [Accepted: 10/25/2019] [Indexed: 12/23/2022]
Abstract
The capacity of Mycobacterium tuberculosis (Mtb) to tolerate multiple antibiotics represents a major problem in tuberculosis (TB) management. Heterogeneity in Mtb populations is one of the factors that drives antibiotic tolerance during infection. However, the mechanisms underpinning this variation in bacterial population remain poorly understood. Here, we show that phagosomal acidification alters the redox physiology of Mtb to generate a population of replicating bacteria that display drug tolerance during infection. RNA sequencing of this redox-altered population revealed the involvement of iron-sulfur (Fe-S) cluster biogenesis, hydrogen sulfide (H2S) gas, and drug efflux pumps in antibiotic tolerance. The fraction of the pH- and redox-dependent tolerant population increased when Mtb infected macrophages with actively replicating HIV-1, suggesting that redox heterogeneity could contribute to high rates of TB therapy failure during HIV-TB coinfection. Pharmacological inhibition of phagosomal acidification by the antimalarial drug chloroquine (CQ) eradicated drug-tolerant Mtb, ameliorated lung pathology, and reduced postchemotherapeutic relapse in in vivo models. The pharmacological profile of CQ (Cmax and AUClast) exhibited no major drug-drug interaction when coadministered with first line anti-TB drugs in mice. Our data establish a link between phagosomal pH, redox metabolism, and drug tolerance in replicating Mtb and suggest repositioning of CQ to shorten TB therapy and achieve a relapse-free cure.
Affiliation(s)
- Richa Mishra
  - Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore 560012, India; Centre for Infectious Disease Research, Indian Institute of Science, Bangalore 560012, India
- Sakshi Kohli
  - Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore 560012, India; Centre for Infectious Disease Research, Indian Institute of Science, Bangalore 560012, India
- Nitish Malhotra
  - National Centre for Biological Sciences (NCBS), Tata Institute of Fundamental Research (TIFR), Bangalore 560065, India
- Parijat Bandyopadhyay
  - Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore 560012, India; Centre for Infectious Disease Research, Indian Institute of Science, Bangalore 560012, India
- Mansi Mehta
  - Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore 560012, India; Centre for Infectious Disease Research, Indian Institute of Science, Bangalore 560012, India
- MohamedHusen Munshi
  - Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore 560012, India; Centre for Infectious Disease Research, Indian Institute of Science, Bangalore 560012, India
- Vasista Adiga
  - Centre for Infectious Disease Research, Indian Institute of Science, Bangalore 560012, India
- Radha K Shandil
  - Foundation for Neglected Disease Research, Bangalore 560065, India
- Raju S Rajmani
  - Centre for Infectious Disease Research, Indian Institute of Science, Bangalore 560012, India
- Aswin Sai Narain Seshasayee
  - National Centre for Biological Sciences (NCBS), Tata Institute of Fundamental Research (TIFR), Bangalore 560065, India
- Amit Singh
  - Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore 560012, India

10. Ibrahim O, Sutherland HG, Maksemous N, Smith R, Haupt LM, Griffiths LR. Exploring Neuronal Vulnerability to Head Trauma Using a Whole Exome Approach. J Neurotrauma 2020; 37:1870-1879. [PMID: 32233732 PMCID: PMC7462038 DOI: 10.1089/neu.2019.6962] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022] Open
Abstract
Brain injuries are associated with oxidative stress and a need to restore neuronal homeostasis. Mutations in ion channel genes, in particular CACNA1A, have been implicated in familial hemiplegic migraine (FHM) and in the development of concussion-related symptoms in response to trivial head trauma. The aim of this study was to explore the potential role of variants in other ion channel genes in the development of such responses. We conducted whole exome sequencing (WES) on 16 individuals who developed a range of neurological and concussion-related symptoms following minor or trivial head injuries. All individuals were initially tested and shown to be negative for mutations in known FHM genes. Variants identified from the WES results were filtered to identify rare variants (minor allele frequency [MAF] <0.01) in genes related to neural processes as well as genes highly expressed in the brain using a combination of in silico prediction tools (SIFT, PolyPhen, PredictSNP, Mutation Taster, and Mutation Assessor). Rare (MAF <0.001) or novel heterozygous variants in 7 ion channel genes were identified in 37.5% (6/16) of the cases (CACNA1I, CACNA1C, ATP10A, ATP7B, KCNAB1, KCNJ10, and SLC26A4), rare variants in neurotransmitter genes were found in 2 cases (GABRG1 and GRIK1), and rare variants in 3 ubiquitin-related genes were identified in 4 cases (SQSTM1, TRIM2, and HECTD1). In this study, the largest proportion of potentially pathogenic variants in individuals with severe responses to minor head trauma were identified in genes previously implicated in migraine and seizure-related autosomal recessive neurological disorders. Together with results implicating variants in the hemiplegic migraine genes, CACNA1A and ATP1A2, in severe head trauma response, our results support a role for heterozygous deleterious mutations in genes implicated in neurological dysfunction and potentially increasing the risk of poor response to trivial head trauma.
Affiliation(s)
- Omar Ibrahim
  - Genomics Research Centre, Institute of Health and Biomedical Innovation, School of Biomedical Science, Queensland University of Technology (QUT), Kelvin Grove, Queensland, Australia
- Heidi G Sutherland
  - Genomics Research Centre, Institute of Health and Biomedical Innovation, School of Biomedical Science, Queensland University of Technology (QUT), Kelvin Grove, Queensland, Australia
- Neven Maksemous
  - Genomics Research Centre, Institute of Health and Biomedical Innovation, School of Biomedical Science, Queensland University of Technology (QUT), Kelvin Grove, Queensland, Australia
- Robert Smith
  - Genomics Research Centre, Institute of Health and Biomedical Innovation, School of Biomedical Science, Queensland University of Technology (QUT), Kelvin Grove, Queensland, Australia
- Larisa M Haupt
  - Genomics Research Centre, Institute of Health and Biomedical Innovation, School of Biomedical Science, Queensland University of Technology (QUT), Kelvin Grove, Queensland, Australia
- Lyn R Griffiths
  - Genomics Research Centre, Institute of Health and Biomedical Innovation, School of Biomedical Science, Queensland University of Technology (QUT), Kelvin Grove, Queensland, Australia

11. Contribution of brain pericytes in blood-brain barrier formation and maintenance: a transcriptomic study of cocultured human endothelial cells derived from hematopoietic stem cells. Fluids Barriers CNS 2020; 17:48. [PMID: 32723387 PMCID: PMC7385894 DOI: 10.1186/s12987-020-00208-1] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 02/04/2020] [Accepted: 07/13/2020] [Indexed: 12/22/2022] Open
Abstract
Formation, maintenance, and repair of the blood–brain barrier (BBB) are critical for central nervous system homeostasis. The interaction of endothelial cells (ECs) with brain pericytes is known to induce BBB characteristics in brain ECs during embryogenesis and can be used to differentiate human ECs from stem cell source in in vitro BBB models. However, the molecular events involved in BBB maturation are not fully understood. To this end, human ECs derived from hematopoietic stem cells were cultivated with either primary bovine or cell line-derived human brain pericytes to induce BBB formation. Subsequently, the transcriptomic profiles of solocultured vs. cocultured ECs were analysed over time by Massive Analysis of cDNA Ends (MACE) technology. This RNA sequencing method is a 3′-end targeted, tag-based, reduced representation transcriptome profiling technique, that can reliably quantify all polyadenylated transcripts including those with low expression. By analysing the generated transcriptomic profiles, we can explore the molecular processes responsible for the functional changes observed in ECs in coculture with brain pericytes (e.g. barrier tightening, changes in the expression of transporters and receptors). Our results identified several up- and downregulated genes and signaling pathways that provide a valuable data source to further delineate complex molecular processes that are involved in BBB formation and BBB maintenance. In addition, this data provides a source to identify novel targets for central nervous system drug delivery strategies.
12
Hameed A, Lai WA, Shahina M, Stothard P, Young LS, Lin SY, Sridhar KR, Young CC. Differential visible spectral influence on carbon metabolism in heterotrophic marine flavobacteria. FEMS Microbiol Ecol 2020; 96:5710931. [PMID: 31960903 DOI: 10.1093/femsec/fiaa011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/20/2018] [Accepted: 01/20/2020] [Indexed: 12/26/2022]
Abstract
The visible spectrum of solar radiation is known to stimulate photoheterotrophic bacterial carbon metabolism. However, its impact on 'strictly' heterotrophic bacteria remains less explored. Here, we show that heterotrophic flavobacteria exhibit enhanced uptake and mineralization of dissolved organic carbon with increasing wavelengths of visible light, without employing any 'known' light-harvesting mechanisms. RNA sequencing identified blue light as a major constraint in the extracellular enzymatic hydrolysis of polymeric carbohydrates and acquisition of sugars, despite acting as a stimulus for inorganic carbon sequestration. In contrast, green-red and continuous full-spectrum lights activated diverse hydrolytic enzymes and sugar transporters, but obstructed inorganic carbon fixation. This 'metabolic switching' was apparent through limited nutrient uptake, suppressed light-sensitivity, oxidative stress response and promotion of inorganic carbon sequestration pathways under blue light. The visible light impact on metabolism may be of significant ecological relevance as it appears to promote cell-mediated mineralization of organic carbon in 'green-colored' chlorophyll-rich copiotrophic coastal seawater and inorganic carbon sequestration in 'blue-colored' oligotrophic open ocean. Thus, a novel regulatory role played by light on heterotrophic metabolism and a hidden potential of flavobacteria to sense and respond differentially to monochromatic lights influencing marine carbon cycling were unraveled.
Affiliation(s)
- Asif Hameed
- Department of Soil & Environmental Sciences, College of Agriculture and Natural Resources, National Chung Hsing University, 145, XingDa Road, Taichung 40227, Taiwan
- Wei-An Lai
- Department of Soil & Environmental Sciences, College of Agriculture and Natural Resources, National Chung Hsing University, 145, XingDa Road, Taichung 40227, Taiwan
- Mariyam Shahina
- Department of Soil & Environmental Sciences, College of Agriculture and Natural Resources, National Chung Hsing University, 145, XingDa Road, Taichung 40227, Taiwan
- Paul Stothard
- Department of Agricultural, Food and Nutritional Science, University of Alberta, 1427 College Plaza, Edmonton, Alberta, Canada
- Li-Sen Young
- Tetanti AgriBiotech Inc. No. 1, Gongyequ 10th Rd., Xitun Dist., Taichung 40755, Taiwan
- Shih-Yao Lin
- Department of Soil & Environmental Sciences, College of Agriculture and Natural Resources, National Chung Hsing University, 145, XingDa Road, Taichung 40227, Taiwan
- Chiu-Chung Young
- Department of Soil & Environmental Sciences, College of Agriculture and Natural Resources, National Chung Hsing University, 145, XingDa Road, Taichung 40227, Taiwan
- Innovation and Development Center of Sustainable Agriculture, National Chung Hsing University, 145, XingDa Road, Taichung 40227, Taiwan
13
Arumugaperumal A, Paul S, Lathakumari S, Balasubramani R, Sivasubramaniam S. The draft genome of a new Verminephrobacter eiseniae strain: a nephridial symbiont of earthworms. ANN MICROBIOL 2020. [DOI: 10.1186/s13213-020-01549-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/11/2022]
Abstract
Purpose
Verminephrobacter is a genus of symbiotic bacteria that live in the nephridia of earthworms. The bacteria are recruited during the embryonic stage of the worm and transferred from generation to generation in the same manner. The worm provides shelter and food for the bacteria. The bacteria deliver micronutrients to the worm. The present study reports the genome sequence assembly and annotation of a new strain of Verminephrobacter called Verminephrobacter eiseniae msu.
Methods
We separated the sequences of a new Verminephrobacter strain from the whole genome of Eisenia fetida using the sequence of V. eiseniae EF01-2, and the bacterial genome was assembled using the CLC Workbench. The de novo-assembled genome was annotated and analyzed for the protein domains, functions, and metabolic pathways. Besides, the multigenome comparison was performed to interpret the phylogenomic relationship of the strain with other proteobacteria.
Result
The FastqSifter sifted a total of 593,130 Verminephrobacter genomic reads. The de novo assembly of the reads generated 1832 contigs with a total genome size of 4.4 Mb. The Average Nucleotide Identity denoted the bacterium belongs to the species V. eiseniae, and the 16S rRNA analysis confirmed it as a new strain of V. eiseniae. The AUGUSTUS genome annotation predicted a total of 3809 protein-coding genes; of them, 3805 genes were identified from the homology search.
Conclusion
The bioinformatics analysis confirmed the bacterium is an isolate of V. eiseniae, and it was named Verminephrobacter eiseniae msu. The whole genome of the bacteria can be utilized as a useful resource to explore the area of symbiosis further.
14
Kim J, Yoon S, Nam D. netGO: R-Shiny package for network-integrated pathway enrichment analysis. Bioinformatics 2020; 36:3283-3285. [DOI: 10.1093/bioinformatics/btaa077] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/26/2019] [Revised: 12/30/2019] [Accepted: 01/29/2020] [Indexed: 11/14/2022]
Abstract
Summary
We present an R-Shiny package, netGO, for novel network-integrated pathway enrichment analysis. The conventional Fisher’s exact test (FET) considers the extent of overlap between target genes and pathway gene-sets, while recent network-based analysis tools consider only network interactions between the two. netGO implements an intuitive framework to integrate both the overlap and networks into a single score, and adaptively resamples genes based on network degrees to assess the pathway enrichment. In benchmark tests for gene expression and genome-wide association study (GWAS) data, netGO captured the relevant gene-sets better than existing tools, especially when analyzing a small number of genes. Specifically, netGO provides user-interactive visualization of the target genes, enriched gene-set and their network interactions for both netGO and FET results for further analysis. For this visualization, we also developed a standalone R-Shiny package shinyCyJS to connect R-shiny and the JavaScript version of cytoscape.
Availability and implementation
netGO R-Shiny package is freely available from github, https://github.com/unistbig/netGO.
Supplementary information
Supplementary data are available at Bioinformatics online.
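As a point of reference for the conventional Fisher's exact test (FET) mentioned in the Summary, the overlap-only enrichment p-value can be sketched as a hypergeometric upper tail. This is a minimal illustration, not netGO's network-integrated score or its resampling scheme; the function name and argument names are hypothetical and do not come from the netGO package.

```python
from math import comb


def fisher_enrichment_p(n_background, n_pathway, n_targets, n_overlap):
    """One-sided Fisher's exact test for gene-set overlap (illustrative sketch).

    n_background: number of genes in the background universe
    n_pathway:    number of background genes annotated to the pathway
    n_targets:    number of target genes drawn from the background
    n_overlap:    number of target genes that fall in the pathway
    Returns P(overlap >= n_overlap) under the hypergeometric null.
    """
    max_overlap = min(n_pathway, n_targets)
    total = comb(n_background, n_targets)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_targets - i)
        for i in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 5 of 5 target genes fall in a 5-gene pathway
# out of a 10-gene background.
print(fisher_enrichment_p(10, 5, 5, 5))
```

The test uses only the size of the overlap, which is the limitation the abstract contrasts with network-based tools.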
Affiliation(s)
- Dougu Nam
- School of Life Sciences
- Department of Mathematical Sciences, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
15
Rentzsch R, Deneke C, Nitsche A, Renard BY. Predicting bacterial virulence factors - evaluation of machine learning and negative data strategies. Brief Bioinform 2019; 21:1596-1608. [PMID: 32978619 DOI: 10.1093/bib/bbz076] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 01/08/2019] [Revised: 05/17/2019] [Accepted: 06/01/2019] [Indexed: 11/12/2022]
Abstract
Bacterial proteins dubbed virulence factors (VFs) are a highly diverse group of sequences, whose only obvious commonality is the very property of being, more or less directly, involved in virulence. It is therefore tempting to speculate whether their prediction, based on direct sequence similarity (seqsim) to known VFs, could be enhanced or even replaced by using machine-learning methods. Specifically, when trained on a large and diverse set of VFs, such may be able to detect putative, non-trivial characteristics shared by otherwise unrelated VF families and therefore better predict novel VFs with insignificant similarity to each individual family. We therefore first reassess the performance of dimer-based Support Vector Machines, as used in the widely used MP3 method, in light of seqsim-only and seqsim/dimer-hybrid classifiers. We then repeat the analysis with a novel, considerably more diverse data set, also addressing the important problem of negative data selection. Finally, we move on to the real-world use case of proteome-wide VF prediction, outlining different approaches to estimating specificity in this scenario. We find that direct seqsim is of unparalleled importance and therefore should always be exploited. Further, we observe strikingly low correlations between different feature and classifier types when ranking proteins by VF likeness. We therefore propose a 'best of each world' approach to prioritize proteins for experimental testing, focussing on the top predictions of each classifier. Further, classifiers for individual VF families should be developed.
Affiliation(s)
- Robert Rentzsch
- Bioinformatics Unit (MF 1), Robert Koch Institute, Berlin
- Institute for Innovation and Technology (IIT), Steinplatz 1, Berlin
- Carlus Deneke
- Bioinformatics Unit (MF 1), Robert Koch Institute, Berlin
- Molecular Microbiology and Genome Analysis Unit, German Federal Institute for Risk Assessment, Berlin
- Andreas Nitsche
- Centre for Biological Threats and Special Pathogens: Highly Pathogenic Viruses (ZBS 1), Robert Koch Institute, Berlin
16
Choobdar S, Ahsen ME, Crawford J, Tomasoni M, Fang T, Lamparter D, Lin J, Hescott B, Hu X, Mercer J, Natoli T, Narayan R, DREAM Module Identification Challenge Consortium, Subramanian A, Zhang JD, Stolovitzky G, Kutalik Z, Lage K, Slonim DK, Saez-Rodriguez J, Cowen LJ, Bergmann S, Marbach D. Assessment of network module identification across complex diseases. Nat Methods 2019; 16:843-852. [PMID: 31471613 PMCID: PMC6719725 DOI: 10.1038/s41592-019-0509-5] [Citation(s) in RCA: 165] [Impact Index Per Article: 27.5] [Received: 11/29/2018] [Accepted: 07/10/2019] [Indexed: 12/11/2022]
Abstract
Many bioinformatics methods have been proposed for reducing the complexity of large gene or protein networks into relevant subnetworks or modules. Yet, how such methods compare to each other in terms of their ability to identify disease-relevant modules in different types of network remains poorly understood. We launched the 'Disease Module Identification DREAM Challenge', an open competition to comprehensively assess module identification methods across diverse protein-protein interaction, signaling, gene co-expression, homology and cancer-gene networks. Predicted network modules were tested for association with complex traits and diseases using a unique collection of 180 genome-wide association studies. Our robust assessment of 75 module identification methods reveals top-performing algorithms, which recover complementary trait-associated modules. We find that most of these modules correspond to core disease-relevant pathways, which often comprise therapeutic targets. This community challenge establishes biologically interpretable benchmarks, tools and guidelines for molecular network analysis to study human disease biology.
Affiliation(s)
- Sarvenaz Choobdar
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Mehmet E Ahsen
- Icahn Institute for Genomics and Multiscale Biology and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Jake Crawford
- Department of Computer Science, Tufts University, Medford, MA, USA
- Mattia Tomasoni
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Tao Fang
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
- David Lamparter
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Verge Genomics, San Francisco, CA, USA
- Junyuan Lin
- Department of Mathematics, Tufts University, Medford, MA, USA
- Benjamin Hescott
- College of Computer and Information Science, Northeastern University, Boston, MA, USA
- Xiaozhe Hu
- Department of Mathematics, Tufts University, Medford, MA, USA
- Johnathan Mercer
- Department of Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Stanley Center at the Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ted Natoli
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Rajiv Narayan
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jitao D Zhang
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
- Gustavo Stolovitzky
- Icahn Institute for Genomics and Multiscale Biology and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
- Zoltán Kutalik
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- University Institute of Primary Care and Public Health, University of Lausanne, Lausanne, Switzerland
- Kasper Lage
- Department of Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Stanley Center at the Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Institute for Biological Psychiatry, Mental Health Center Sct. Hans, University of Copenhagen, Roskilde, Denmark
- Donna K Slonim
- Department of Computer Science, Tufts University, Medford, MA, USA
- Department of Immunology, Tufts University School of Medicine, Boston, MA, USA
- Julio Saez-Rodriguez
- Institute for Computational Biomedicine, Faculty of Medicine, Heidelberg University, Bioquant, Heidelberg, Germany
- RWTH Aachen University, Faculty of Medicine, Joint Research Center for Computational Biomedicine, Aachen, Germany
- Lenore J Cowen
- Department of Computer Science, Tufts University, Medford, MA, USA
- Department of Mathematics, Tufts University, Medford, MA, USA
- Sven Bergmann
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Department of Integrative Biomedical Sciences, University of Cape Town, Cape Town, South Africa
- Daniel Marbach
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
Collaborators
Fabian Aicheler, Nicola Amoroso, Alex Arenas, Karthik Azhagesan, Aaron Baker, Michael Banf, Serafim Batzoglou, Anaïs Baudot, Roberto Bellotti, Sven Bergmann, Keith A Boroevich, Christine Brun, Stanley Cai, Michael Caldera, Alberto Calderone, Gianni Cesareni, Weiqi Chen, Christine Chichester, Sarvenaz Choobdar, Lenore Cowen, Jake Crawford, Hongzhu Cui, Phuong Dao, Manlio De Domenico, Andi Dhroso, Gilles Didier, Mathew Divine, Antonio Del Sol, Tao Fang, Xuyang Feng, Jose C Flores-Canales, Santo Fortunato, Anthony Gitter, Anna Gorska, Yuanfang Guan, Alain Guénoche, Sergio Gómez, Hatem Hamza, András Hartmann, Shan He, Anton Heijs, Julian Heinrich, Benjamin Hescott, Xiaozhe Hu, Ying Hu, Xiaoqing Huang, V Keith Hughitt, Minji Jeon, Lucas Jeub, Nathan T Johnson, Keehyoung Joo, InSuk Joung, Sascha Jung, Susana G Kalko, Piotr J Kamola, Jaewoo Kang, Benjapun Kaveelerdpotjana, Minjun Kim, Yoo-Ah Kim, Oliver Kohlbacher, Dmitry Korkin, Kiryluk Krzysztof, Khalid Kunji, Zoltàn Kutalik, Kasper Lage, David Lamparter, Sean Lang-Brown, Thuc Duy Le, Jooyoung Lee, Sunwon Lee, Juyong Lee, Dong Li, Jiuyong Li, Junyuan Lin, Lin Liu, Antonis Loizou, Zhenhua Luo, Artem Lysenko, Tianle Ma, Raghvendra Mall, Daniel Marbach, Tomasoni Mattia, Mario Medvedovic, Jörg Menche, Johnathan Mercer, Elisa Micarelli, Alfonso Monaco, Felix Müller, Rajiv Narayan, Oleksandr Narykov, Ted Natoli, Thea Norman, Sungjoon Park, Livia Perfetto, Dimitri Perrin, Stefano Pirrò, Teresa M Przytycka, Xiaoning Qian, Karthik Raman, Daniele Ramazzotti, Emilie Ramsahai, Balaraman Ravindran, Philip Rennert, Julio Saez-Rodriguez, Charlotta Schärfe, Roded Sharan, Ning Shi, Wonho Shin, Hai Shu, Himanshu Sinha, Donna K Slonim, Lionel Spinelli, Suhas Srinivasan, Aravind Subramanian, Christine Suver, Damian Szklarczyk, Sabina Tangaro, Suresh Thiagarajan, Laurent Tichit, Thorsten Tiede, Beethika Tripathi, Aviad Tsherniak, Tatsuhiko Tsunoda, Dénes Türei, Ehsan Ullah, Golnaz Vahedi, Alberto Valdeolivas, Jayaswal Vivek, Christian von Mering, Andra Waagmeester, Bo Wang, Yijie Wang, Barbara A Weir, Shana White, Sebastian Winkler, Ke Xu, Taosheng Xu, Chunhua Yan, Liuqing Yang, Kaixian Yu, Xiangtian Yu, Gaia Zaffaroni, Mikhail Zaslavskiy, Tao Zeng, Jitao D Zhang, Lu Zhang, Weijia Zhang, Lixia Zhang, Xinyu Zhang, Junpeng Zhang, Xin Zhou, Jiarui Zhou, Hongtu Zhu, Junjie Zhu, Guido Zuccon.
17
Ebrahimpoor M, Spitali P, Hettne K, Tsonaka R, Goeman J. Simultaneous Enrichment Analysis of all Possible Gene-sets: Unifying Self-Contained and Competitive Methods. Brief Bioinform 2019; 21:1302-1312. [PMID: 31297505 PMCID: PMC7373179 DOI: 10.1093/bib/bbz074] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 03/08/2019] [Revised: 05/28/2019] [Accepted: 05/28/2019] [Indexed: 01/23/2023]
Abstract
Studying sets of genomic features is increasingly popular in genomics, proteomics and metabolomics since analyzing at set level not only creates a natural connection to biological knowledge but also offers more statistical power. Currently, there are two gene-set testing approaches, self-contained and competitive, both of which have their advantages and disadvantages, but neither offers the final solution. We introduce simultaneous enrichment analysis (SEA), a new approach for analysis of feature sets in genomics and other omics based on a new unified null hypothesis, which includes the self-contained and competitive null hypotheses as special cases. We employ closed testing using Simes tests to test this new hypothesis. For every feature set, the proportion of active features is estimated, and a confidence bound is provided. Also, for every unified null hypothesis, a P-value is calculated, which is adjusted for family-wise error rate. SEA does not need to assume that the features are independent. Moreover, users are allowed to choose the feature set(s) of interest after observing the data. We develop a novel pipeline and apply it on RNA-seq data of dystrophin-deficient mdx mice, showcasing the flexibility of the method. Finally, the power properties of the method are evaluated through simulation studies.
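The Simes test named in the abstract combines the per-feature p-values of one set into a single p-value. The sketch below shows only that standard combination rule for a single set; it does not implement closed testing, the unified null hypothesis, or the authors' pipeline, and the function name is hypothetical.

```python
def simes_p_value(p_values):
    """Simes combination p-value for one set of per-feature p-values.

    With the p-values sorted ascending as p_(1) <= ... <= p_(m),
    the Simes p-value is min over i of m * p_(i) / i, capped at 1.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one p-value is required")
    ordered = sorted(p_values)
    combined = min(m * p / rank for rank, p in enumerate(ordered, start=1))
    return min(1.0, combined)


# Hypothetical per-gene p-values for a three-gene set.
print(simes_p_value([0.01, 0.04, 0.03]))
```

For a single p-value the rule returns that p-value unchanged, which is a quick sanity check on the implementation.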
Affiliation(s)
- Mitra Ebrahimpoor
- Medical statistics, Department of Biomedical Data Science, Leiden University Medical Center, Leiden, The Netherlands
- Pietro Spitali
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Kristina Hettne
- Medical statistics, Department of Biomedical Data Science, Leiden University Medical Center, Leiden, The Netherlands
- Roula Tsonaka
- Medical statistics, Department of Biomedical Data Science, Leiden University Medical Center, Leiden, The Netherlands
- Jelle Goeman
- Medical statistics, Department of Biomedical Data Science, Leiden University Medical Center, Leiden, The Netherlands
18
Kaur S, Baldi B, Vuong J, O'Donoghue SI. Visualization and Analysis of Epiproteome Dynamics. J Mol Biol 2019; 431:1519-1539. [PMID: 30769119 DOI: 10.1016/j.jmb.2019.01.044] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 10/11/2018] [Revised: 01/29/2019] [Accepted: 01/29/2019] [Indexed: 12/28/2022]
Abstract
The epiproteome describes the set of all post-translational modifications (PTMs) made to the proteins comprising a cell or organism. The extent of the epiproteome is still largely unknown; however, advances in experimental techniques are beginning to produce a deluge of data, tracking dynamic changes to the epiproteome in response to cellular stimuli. These data have potential to revolutionize our understanding of biology and disease. This review covers a range of recent visualization methods and tools developed specifically for dynamic epiproteome data sets. These methods have been designed primarily for data sets on phosphorylation, as this is the most studied PTM; however, most of these methods are also applicable to other types of PTMs. Unfortunately, the currently available methods are often inadequate for existing data sets; thus, realizing the potential buried in epiproteome data sets will require new, tailored bioinformatics methods that will help researchers analyze, visualize, and interactively explore these complex data sets.
Affiliation(s)
- Sandeep Kaur
- University of New South Wales (UNSW), Kensington, NSW 2052, Australia; Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia.
- Benedetta Baldi
- Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia; Data 61, CSIRO, Eveleigh, NSW 2015, Australia.
- Jenny Vuong
- Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia; Data 61, CSIRO, Eveleigh, NSW 2015, Australia.
- Seán I O'Donoghue
- University of New South Wales (UNSW), Kensington, NSW 2052, Australia; Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia; Data 61, CSIRO, Eveleigh, NSW 2015, Australia.
19
Silva DAD, Tsai SM, Chiorato AF, da Silva Andrade SC, Esteves JADF, Recchia GH, Morais Carbonell SA. Analysis of the common bean (Phaseolus vulgaris L.) transcriptome regarding efficiency of phosphorus use. PLoS One 2019; 14:e0210428. [PMID: 30657755 PMCID: PMC6338380 DOI: 10.1371/journal.pone.0210428] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 11/06/2017] [Accepted: 12/21/2018] [Indexed: 12/05/2022]
Abstract
Common bean is a highly important food in tropical regions, where most production occurs on small farms with limited use of technology and, consequently, greater vulnerability to abiotic stresses such as nutritional stress. Usually phosphorus (P) is the most limiting nutrient for crop growth in these regions. The aim of this study was to characterize the gene expression profiles of the genotypes of common bean IAC Imperador (P-responsive) and DOR 364 (P-unresponsive) under different P concentrations using RNA-seq transcriptome sequencing technology. Plants were grown hydroponically, with application of two P concentrations (4.00 mg L⁻¹ restrictive level and 8.00 mg L⁻¹ control level). Differential expression analyses, annotation, and functional classification were performed comparing genotypes within each P rate administered and comparing each genotype response to the different P levels. Considering differential expression analyses within genotypes, IAC Imperador exhibited 1538 up-regulated genes under P restriction and 1679 up-regulated genes in the control, while DOR 364 exhibited 13 up-regulated genes in the control and only 2 up-regulated genes under P restriction, strongly corroborating P-unresponsiveness of this genotype. Genes related to phosphorus restriction were identified among the differentially expressed genes, including transcription factors such as WRKY, ERF, and MYB families, phosphatase related genes such as pyrophosphatase, acid phosphatase, and purple acid phosphatase, and phosphate transporters. The enrichment test for the P restriction treatment showed 123 enriched gene ontologies (GO) for IAC Imperador, while DOR 364 enriched only 24. Also, the enriched GO correlated with P metabolism, compound metabolic processes containing phosphate, nucleoside phosphate binding, phosphorylation, and also response to stresses. Thus, this study proved to be informative to phosphorus limitation in common bean showing global changes at transcript level.
Affiliation(s)
- Daiana Alves da Silva
- Instituto Agronômico (IAC)-Centro de Grãos e Fibras-Fazenda Santa Elis, Campinas, SP, Brazil
- Siu Mui Tsai
- Centro de Energia Nuclear na Agricultura (CENA)-Av. Centenário, São Dimas-CEP-Piracicaba, SP, Brazil
- Sónia Cristina da Silva Andrade
- Universidade de São Paulo (USP)-Departamento de Genética e Biologia Evolutiva-Instituto de Biociências-Rua do Matão, Cidade Universitária-Cep-São Paulo, SP, Brazil
- Gustavo Henrique Recchia
- Centro de Energia Nuclear na Agricultura (CENA)-Av. Centenário, São Dimas-CEP-Piracicaba, SP, Brazil
20
Paul S, Heckmann LH, Sørensen JG, Holmstrup M, Arumugaperumal A, Sivasubramaniam S. Transcriptome sequencing, de novo assembly and annotation of the freeze tolerant earthworm, Dendrobaena octaedra. GENE REPORTS 2018. [DOI: 10.1016/j.genrep.2018.10.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/12/2023]
21
Paul S, Arumugaperumal A, Rathy R, Ponesakki V, Arunachalam P, Sivasubramaniam S. Data on genome annotation and analysis of earthworm Eisenia fetida. Data Brief 2018; 20:525-534. [PMID: 30191166 PMCID: PMC6126081 DOI: 10.1016/j.dib.2018.08.067] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/19/2018] [Revised: 08/12/2018] [Accepted: 08/21/2018] [Indexed: 11/02/2022]
Abstract
The present article reports the complete draft genome annotation of earthworm Eisenia fetida, obtained from the manuscript entitled "Timing and Scope of Genomic Expansion within Annelida: Evidence from Homeoboxes in the Genome of the Earthworm E. fetida" (Zwarycz et al., 2015) and provides the data on the repetitive elements, protein coding genes and noncoding RNAs present in the genome dataset of the species. The E. fetida protein coding genes were predicted from AUGUSTUS gene prediction and subsequently annotated based on their sequence similarity, Gene Ontology (GO) functional terms, InterPro domains, Clusters of Orthologous Groups (COGs) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways information. The genome wide comparison of orthologous clusters and phylogenomic analysis of the core genes were performed to understand the events of genome evolution and genomic diversity between E. fetida and its related metazoans. In addition, the genome dataset was screened to identify the crucial stem cell markers, regeneration specific genes and immune-related genes and their functionally enriched GO terms were predicted from Fisher's enrichment analysis. The E. fetida genome annotation data containing the GFF (general feature format) annotation file, predicted coding gene sequences and translated protein sequences were deposited to the figshare repository under the DOI: https://doi.org/10.6084/m9.figshare.6142322.v1.
Affiliation(s)
- Sayan Paul
- Department of Biotechnology, Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu 627012, India
- Arun Arumugaperumal
- Department of Biotechnology, Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu 627012, India
- Rashmi Rathy
- Department of Biotechnology, Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu 627012, India
- Vasanthakumar Ponesakki
- Department of Biotechnology, Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu 627012, India
- Palavesam Arunachalam
- Department of Animal Science, Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu 627012, India
- Sudhakar Sivasubramaniam
- Department of Biotechnology, Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu 627012, India
22
Ballouz S, Pavlidis P, Gillis J. Using predictive specificity to determine when gene set analysis is biologically meaningful. Nucleic Acids Res 2018; 45:e20. [PMID: 28204549 PMCID: PMC5389513 DOI: 10.1093/nar/gkw957] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 01/29/2016] [Revised: 10/04/2016] [Accepted: 10/10/2016] [Indexed: 11/14/2022]
Abstract
Gene set analysis, which translates gene lists into enriched functions, is among the most common bioinformatic methods. Yet few would advocate taking the results at face value. Not only is there no agreement on the algorithms themselves, there is no agreement on how to benchmark them. In this paper, we evaluate the robustness and uniqueness of enrichment results as a means of assessing methods even where correctness is unknown. We show that heavily annotated (‘multifunctional’) genes are likely to appear in genomics study results and drive the generation of biologically non-specific enrichment results as well as highly fragile significances. By providing a means of determining where enrichment analyses report non-specific and non-robust findings, we are able to assess where we can be confident in their use. We find significant progress in recent bias correction methods for enrichment and provide our own software implementation. Our approach can be readily adapted to any pre-existing package.
Affiliation(s)
- Sara Ballouz
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, NY 11797, USA
- Paul Pavlidis
- Department of Psychiatry and Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Jesse Gillis
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, NY 11797, USA
23
Parraga-Alava J, Dorn M, Inostroza-Ponta M. A multi-objective gene clustering algorithm guided by apriori biological knowledge with intensification and diversification strategies. BioData Min 2018; 11:16. [PMID: 30100924 PMCID: PMC6081857 DOI: 10.1186/s13040-018-0178-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 05/05/2017] [Accepted: 07/29/2018] [Indexed: 01/10/2023]
Abstract
BACKGROUND Biologists aim to understand the genetic background of diseases, metabolic disorders or any other genetic condition. Microarrays are one of the main high-throughput technologies for collecting information about the behaviour of genetic information under different conditions. To analyse these data, clustering is one of the main techniques used; it aims at finding groups of genes that have some criterion in common, such as a similar expression profile. However, the problem of finding groups is normally multidimensional, making it necessary to approach the clustering as a multi-objective problem where various cluster validity indexes are simultaneously optimised. They are usually based on criteria like compactness and separation, which may not be sufficient since they cannot guarantee the generation of clusters that have both similar expression patterns and biological coherence.
METHOD We propose a Multi-Objective Clustering algorithm Guided by a-Priori Biological Knowledge (MOC-GaPBK) to find clusters of genes with high levels of co-expression, biological coherence, and also good compactness and separation. Cluster quality indexes are used to simultaneously optimise gene relationships at the expression level and biological functionality. Our proposal also includes intensification and diversification strategies to improve the search process.
RESULTS The effectiveness of the proposed algorithm is demonstrated on four publicly available datasets. Comparative studies of the use of different objective functions and other widely used microarray clustering techniques are reported. Statistical, visual and biological significance tests are carried out to show the superiority of the proposed algorithm.
CONCLUSIONS Integrating a-priori biological knowledge into a multi-objective approach and using intensification and diversification strategies allow the proposed algorithm to find solutions with higher quality than other microarray clustering techniques available in the literature in terms of co-expression, biological coherence, compactness and separation.
Affiliation(s)
- Jorge Parraga-Alava
  - Centre for Biotechnology and Bioengineering (CeBiB), Departamento de Ingeniería Informática, Universidad de Santiago de Chile, Av. Ecuador 3659, Santiago, Chile
  - Carrera de Computación, Escuela Superior Politécnica Agropecuaria de Manabí Manuel Félix López, Campus Politécnico Sitio El Limón, Calceta, Ecuador
- Marcio Dorn
  - Instituto de Informatica, Universidade Federal do Rio Grande do Sul, Av. Bento Gonçalves 9500, Porto Alegre, 91501-970 Brasil
- Mario Inostroza-Ponta
  - Centre for Biotechnology and Bioengineering (CeBiB), Departamento de Ingeniería Informática, Universidad de Santiago de Chile, Av. Ecuador 3659, Santiago, Chile
24
Kumar KK, Devi BU, Neeraja P. Molecular activities and ligand-binding specificities of StAR-related lipid transfer domains: exploring integrated in silico methods and ensemble-docking approaches. SAR QSAR Environ Res 2018; 29:483-501. [PMID: 29688061 DOI: 10.1080/1062936x.2018.1462847] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 01/03/2018] [Accepted: 04/05/2018] [Indexed: 06/08/2023]
Abstract
In this study, the cholesterol biotransformation gene set of human steroidogenic acute regulatory protein-related lipid transfer (START) domains was evaluated using high-throughput gene screening approaches. It was shown that the STARD1, STARD3 and STARD4 proteins are more effective transporters of cholesterol than the STARD5 and STARD6 domains. Docking studies show strong agreement with gene ontology enrichment data. According to both complementary strategies, only STARD1, STARD3 and STARD4 are potentially involved in cholesterol biotransformation in mitochondria through the Ω1-loop of the C-terminal α4-helical domain. Ensemble docking assessment for a set of selected chemicals of protein-chemical networks has shown possible binding probabilities with START domains. Among those, reproductive toxicity evoked drugs (mifepristone), insecticides (rotenone), tobacco pulmonary carcinogens (benzo(a)pyrene) and endocrine disruptor chemicals (EDCs) including perfluorooctanesulfonic acid (PFOS) and aflatoxin B1 (AFB1) potentially bound with novel hotspot residues of the α4-helical domain. Compound representation space and clustering approaches reveal that the START proteins show more sensitivity to these lead scaffolds, so they could act as barriers to cholesterol and steroidogenic acute regulatory (StAR) binding and lead to adverse consequences in steroidogenesis. These findings indicate potential START domains and their binding levels with toxic chemicals; the sorted viewpoints could be a promising way to identify chemicals with related steroidogenesis impacts on human health.
Affiliation(s)
- K Kranthi Kumar
  - Department of Zoology, Sri Venkateswara University, Tirupati, 517502, A.P., India
- B Uma Devi
  - Department of Zoology, Sri Venkateswara University, Tirupati, 517502, A.P., India
- P Neeraja
  - Department of Zoology, Sri Venkateswara University, Tirupati, 517502, A.P., India
25
Das S, Shyamal S, Durica DS. Analysis of Annotation and Differential Expression Methods used in RNA-seq Studies in Crustacean Systems. Integr Comp Biol 2018; 56:1067-1079. [PMID: 27940611 DOI: 10.1093/icb/icw117] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 01/17/2023] Open
Abstract
In the field of crustacean biology, usage of RNA-seq to study gene expression is rapidly growing. Major advances in sequencing technology have contributed to the ability to examine complex patterns of genome activity in a wide range of organisms that are extensively used for comparative physiology, ecology and evolution, environmental monitoring, and commercial aquaculture. Relative to insect and vertebrate model organisms, however, information on the organization of crustacean genomes is virtually nonexistent, making de novo transcriptome assembly, annotation and quantification problematic and challenging. We present here a summary of the methodologies and software analyses employed in 23 recent publications, which describe de novo transcriptome assembly, annotation, and differential gene expression in a variety of crustacean experimental systems. We focus on establishing a series of best practices that will allow for investigators to produce datasets that are understandable, reproducible, and of general utility for related analyses and cross-study comparisons.
Affiliation(s)
- Sunetra Das
  - Department of Biology, Colorado State University, 1878 Campus Delivery, Fort Collins, CO 80523, USA
- David S Durica
  - Department of Biology, University of Oklahoma, Norman, OK 73019, USA
26
Abreu VAC, Popin RV, Alvarenga DO, Schaker PDC, Hoff-Risseti C, Varani AM, Fiore MF. Genomic and Genotypic Characterization of Cylindrospermopsis raciborskii: Toward an Intraspecific Phylogenetic Evaluation by Comparative Genomics. Front Microbiol 2018. [PMID: 29535689 PMCID: PMC5834425 DOI: 10.3389/fmicb.2018.00306] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 11/13/2022] Open
Abstract
Cylindrospermopsis raciborskii is a freshwater cyanobacterial species with increasing bloom reports worldwide that are likely due to factors related to climate change. In addition to the deleterious effects of blooms on aquatic ecosystems, the majority of ecotypes can synthesize toxic secondary metabolites causing public health issues. To overcome the harmful effects of C. raciborskii blooms, it is important to advance knowledge of diversity, genetic variation, and evolutionary processes within populations. An efficient approach to exploring this diversity and understanding the evolution of C. raciborskii is to use comparative genomics. Here, we report two new draft genomes of C. raciborskii (strains CENA302 and CENA303) from Brazilian isolates of different origins and explore their molecular diversity, phylogeny, and evolutionary diversification by comparing their genomes with sequences from other strains available in public databases. The results obtained by comparing seven C. raciborskii and the Raphidiopsis brookii D9 genomes revealed a set of conserved core genes and a variable set of accessory genes, such as those involved in the biosynthesis of natural products, heterocyte glycolipid formation, and nitrogen fixation. Gene cluster arrangements related to the biosynthesis of the antifungal cyclic glycosylated lipopeptide hassallidin were identified in four C. raciborskii genomes, including the non-nitrogen fixing strain CENA303. Shifts in gene clusters involved in toxin production according to geographic origins were observed, as well as a lack of nitrogen fixation (nif) and heterocyte glycolipid (hgl) gene clusters in some strains. Single gene phylogeny (16S rRNA sequences) was congruent with phylogeny based on 31 concatenated housekeeping protein sequences, and both analyses have shown, with high support values, that the species C. raciborskii is monophyletic. This comparative genomics study allowed a species-wide view of the biological diversity of C. raciborskii and in some cases linked genome differences to phenotype.
Affiliation(s)
- Vinicius A C Abreu
  - Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba, Brazil
- Rafael V Popin
  - Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba, Brazil
- Danillo O Alvarenga
  - Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba, Brazil
  - School of Agricultural and Veterinarian Sciences, São Paulo State University, Jaboticabal, Brazil
- Patricia D C Schaker
  - Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba, Brazil
- Caroline Hoff-Risseti
  - Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba, Brazil
- Alessandro M Varani
  - School of Agricultural and Veterinarian Sciences, São Paulo State University, Jaboticabal, Brazil
- Marli F Fiore
  - Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba, Brazil
27
Prytuliak R, Pfeiffer F, Habermann BH. SLALOM, a flexible method for the identification and statistical analysis of overlapping continuous sequence elements in sequence- and time-series data. BMC Bioinformatics 2018; 19:24. [PMID: 29373955 PMCID: PMC5787307 DOI: 10.1186/s12859-018-2020-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/23/2017] [Accepted: 01/08/2018] [Indexed: 12/30/2022] Open
Abstract
Background: Protein or nucleic acid sequences contain a multitude of associated annotations representing continuous sequence elements (CSEs). Comparing these CSEs is needed whenever we want to match identical annotations or integrate distinctive ones. Currently, there is no ready-to-use software available that provides comprehensive statistical readout for comparing two annotations of the same type with each other, which can be adapted to the application logic of the scientific question.
Results: We have developed a method, SLALOM (for StatisticaL Analysis of Locus Overlap Method), to perform comparative analysis of sequence annotations in a highly flexible way. SLALOM implements six major operation modes and a number of additional options that can answer a variety of statistical questions about a pair of input annotations of a given sequence collection. We demonstrate the results of SLALOM on three different examples from biology and economics and compare our method to already existing software. We discuss the importance of carefully choosing the application logic to address specific scientific questions.
Conclusion: SLALOM is a highly versatile, command-line based method for comparing annotations in a collection of sequences, with a statistical read-out for performance evaluation and benchmarking of predictors and gene annotation pipelines. Abstraction from sequence content even allows SLALOM to compare other kinds of positional data including, for example, data coming from time series.
Affiliation(s)
- Roman Prytuliak
  - Computational Biology Group, Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152, Martinsried, Germany
- Friedhelm Pfeiffer
  - Computational Biology Group, Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152, Martinsried, Germany
- Bianca Hermine Habermann
  - Computational Biology Group, Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152, Martinsried, Germany
  - Computational Biology Group, Aix-Marseille University & CNRS, Developmental Biology Institute of Marseille (IBDM), UMR 7288, Parc Scientifique de Luminy, 163 Avenue de Luminy, 13009, Marseille, France
28
Kim H, Choi SM, Park S. GSEH: A Novel Approach to Select Prostate Cancer-Associated Genes Using Gene Expression Heterogeneity. IEEE/ACM Trans Comput Biol Bioinform 2018; 15:129-146. [PMID: 27775535 DOI: 10.1109/tcbb.2016.2618927] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/06/2023]
Abstract
When a gene shows varying levels of expression among normal people but similar levels in disease patients or shows similar levels of expression among normal people but different levels in disease patients, we can assume that the gene is associated with the disease. By utilizing this gene expression heterogeneity, we can obtain additional information that abets discovery of disease-associated genes. In this study, we used collaborative filtering to calculate the degree of gene expression heterogeneity between classes and then scored the genes on the basis of the degree of gene expression heterogeneity to find "differentially predicted" genes. Through the proposed method, we discovered more prostate cancer-associated genes than 10 comparable methods. The genes prioritized by the proposed method are potentially significant to biological processes of a disease and can provide insight into them.
29
Ponesakki V, Paul S, Mani DKS, Rajendiran V, Kanniah P, Sivasubramaniam S. Annotation of nerve cord transcriptome in earthworm Eisenia fetida. Genom Data 2017; 14:91-105. [PMID: 29204349 PMCID: PMC5688751 DOI: 10.1016/j.gdata.2017.10.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 06/09/2017] [Revised: 09/02/2017] [Accepted: 10/07/2017] [Indexed: 11/26/2022]
Abstract
In annelid worms, the nerve cord serves as a crucial organ to control the sensory and behavioral physiology. The inadequate genome resource of earthworms has prioritized the comprehensive analysis of their transcriptome dataset to monitor the genes expressed in the nerve cord and predict their role in the neurotransmission and sensory perception of the species. The present study focuses on identifying the potential transcripts and predicting their functional features by annotating the transcriptome dataset of nerve cord tissues prepared by Gong et al., 2010 from the earthworm Eisenia fetida. In total, 9762 transcripts were successfully annotated against the NCBI nr database using the BLASTX algorithm, and among them 7680 transcripts were assigned to a total of 44,354 GO terms. The conserved domain analysis indicated overrepresentation of the P-loop NTPase domain and the calcium-binding EF-hand domain. The COG functional annotation classified 5860 transcript sequences into 25 functional categories. Further, 4502 contig sequences were found to map to 124 KEGG pathways. The annotated contig dataset exhibited 22 crucial neuropeptides having considerable matches to the marine annelid Platynereis dumerilii, suggesting their possible role in neurotransmission and neuromodulation. In addition, 108 human stem cell marker homologs were identified, including crucial epigenetic regulators, transcriptional repressors and cell cycle regulators, which may contribute to neuronal and segmental regeneration. The complete functional annotation of this nerve cord transcriptome can be further utilized to interpret genetic and molecular mechanisms associated with neuronal development, nervous system regeneration and nerve cord function.
Affiliation(s)
- Sudhakar Sivasubramaniam
  - Department of Biotechnology, Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu 627012, India
30
Singh G, Singh G, Singh P, Parmar R, Paul N, Vashist R, Swarnkar MK, Kumar A, Singh S, Singh AK, Kumar S, Sharma RK. Molecular dissection of transcriptional reprogramming of steviol glycosides synthesis in leaf tissue during developmental phase transitions in Stevia rebaudiana Bert. Sci Rep 2017; 7:11835. [PMID: 28928460 PMCID: PMC5605536 DOI: 10.1038/s41598-017-12025-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 03/08/2017] [Accepted: 09/01/2017] [Indexed: 12/18/2022] Open
Abstract
Stevia is a natural source of commercially important steviol glycosides (SGs), which share a biosynthesis route with gibberellic acids (GAs) through the plastidal MEP and cytosolic MVA pathways. Ontogeny-dependent deviation in SG biosynthesis, one of the key factors for global cultivation of Stevia, has not been studied at the transcriptional level. To dissect the underlying molecular mechanism, we followed a global transcriptome sequencing approach and generated more than 100 million reads. Annotation of 41,262 de novo assembled transcripts identified all the genes required for SG and GA biosynthesis. Differential gene expression and quantitative analysis of important pathway genes (DXS, HMGR, KA13H) and gene regulators (WRKY, MYB, NAC TFs) indicated developmental phase-dependent utilization of metabolic flux between SG and GA synthesis. Further, identification of 124 CYPs and 45 UGTs enriches the genomic resources, and their PPI network analysis with SG/GA biosynthesis proteins identifies putative candidates involved in metabolic changes, as supported by their developmental phase-dependent expression. These putative targets can expedite molecular breeding and genetic engineering efforts to enhance SG content, biomass and yield. In the future, the generated dataset will be a useful resource for development of functional molecular markers for diversity characterization, genome mapping and evolutionary studies in Stevia.
Affiliation(s)
- Gopal Singh
  - Biotechnology Department, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
  - Academy of Scientific and Innovative Research, New Delhi, India
- Gagandeep Singh
  - Biotechnology Department, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Pradeep Singh
  - Biotechnology Department, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Rajni Parmar
  - Biotechnology Department, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
  - Academy of Scientific and Innovative Research, New Delhi, India
- Navgeet Paul
  - Biotechnology Department, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Radhika Vashist
  - Biotechnology Department, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Mohit Kumar Swarnkar
  - Biotechnology Department, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Ashok Kumar
  - Agrotechnology of Medicinal, Aromatic and Commercially Important Plants, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Sanatsujat Singh
  - Agrotechnology of Medicinal, Aromatic and Commercially Important Plants, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Anil Kumar Singh
  - Biotechnology Department, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
  - ICAR-Indian Institute of Agricultural Biotechnology, PDU Campus, IINRG, Namkum, Ranchi, Jharkhand, India
- Sanjay Kumar
  - Biotechnology Department, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Ram Kumar Sharma
  - Biotechnology Department, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
  - Academy of Scientific and Innovative Research, New Delhi, India
31
Gülbakan B, Özgül RK, Yüzbaşıoğlu A, Kohl M, Deigner HP, Özgüç M. Discovery of biomarkers in rare diseases: innovative approaches by predictive and personalized medicine. EPMA J 2016; 7:24. [PMID: 27980697 PMCID: PMC5143439 DOI: 10.1186/s13167-016-0074-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 09/08/2016] [Accepted: 10/21/2016] [Indexed: 12/11/2022]
Abstract
There are more than 8000 rare diseases (RDs) that affect >5% of the world’s population. Many RDs have no effective treatment, and a lack of knowledge delays diagnosis, making management difficult. The emerging concept of personalized medicine allows for early screening, diagnosis, and individualized treatment of human diseases. In this context, the discovery of biomarkers in RDs will be of prime importance to enable timely prevention and effective treatment. Since 80% of RDs are of genetic origin, newly identified genes and causative mutations become valuable biomarkers. Furthermore, dynamic markers such as expressed genes, metabolites, and proteins are also very important for following prognosis and response to therapy. Recent advances in omics technologies and their use in combination can define pathophysiological pathways that can be drug targets. Biomarker discovery and the use of biomarkers in diagnosis are a major pillar of RD research.
Affiliation(s)
- Basri Gülbakan
  - Pediatric Metabolism Unit, Institute of Child Health, Hacettepe University, Ankara, Turkey
- Rıza Köksal Özgül
  - Pediatric Metabolism Unit, Institute of Child Health, Hacettepe University, Ankara, Turkey
- Ayşe Yüzbaşıoğlu
  - Department of Medical Biology & Biobank for Rare Disease, Faculty of Medicine, Hacettepe University, Ankara, Turkey
- Matthias Kohl
  - Institute of Precision Medicine, Medical and Life Sciences Faculty, Furtwangen University, Villingen-Schwenningen, Germany
- Hans-Peter Deigner
  - Institute of Precision Medicine, Medical and Life Sciences Faculty, Furtwangen University, Villingen-Schwenningen, Germany
  - Fraunhofer Institute IZI, EXIM Department, Rostock, Germany
- Meral Özgüç
  - Department of Medical Biology & Biobank for Rare Disease, Faculty of Medicine, Hacettepe University, Ankara, Turkey
32
Lu YJ, Swamy KBS, Leu JY. Experimental Evolution Reveals Interplay between Sch9 and Polyploid Stability in Yeast. PLoS Genet 2016; 12:e1006409. [PMID: 27812096 PMCID: PMC5094715 DOI: 10.1371/journal.pgen.1006409] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 09/02/2016] [Accepted: 10/06/2016] [Indexed: 12/20/2022] Open
Abstract
Polyploidization has crucial impacts on the evolution of different eukaryotic lineages including fungi, plants and animals. Recent genome data suggest that, for many polyploidization events, all duplicated chromosomes are maintained and genome reorganizations occur much later during evolution. However, newly-formed polyploid genomes are intrinsically unstable and often quickly degenerate into aneuploidy or diploidy. The transition between these two states remains enigmatic. In this study, laboratory evolution experiments were conducted to investigate this phenomenon. We show that robust tetraploidy is achieved in evolved yeast cells by increasing the abundance of Sch9—a protein kinase activated by the TORC1 (Target of Rapamycin Complex 1) and other signaling pathways. Overexpressing SCH9, but not TOR1, allows newly-formed tetraploids to exhibit evolved phenotypes and knocking out SCH9 diminishes the evolved phenotypes. Furthermore, when cells were challenged with conditions causing ancestral cells to evolve aneuploidy, tetraploidy was maintained in the evolved lines. Our results reveal a determinant role for Sch9 during the early stage of polyploid evolution. Polyploidy is frequently observed in eukaryotes, including in human liver cells and cancer. Evolutionary studies also suggest that polyploidy has contributed to species diversification and novel adaptation in fungi, plants and animals. However, artificially-constructed polyploids often display chromosome instability and quickly convert to aneuploids. This phenomenon conflicts with observations that many species derived from ancient genome duplications have maintained the extra number of chromosomes following polyploidization. What happened during the early stages of these polyploidy events that stabilized the duplicated genomes? We used laboratory evolution experiments to investigate this process. 
After being propagated in a rich medium at 23°C for 1000 generations, newly-constructed tetraploid yeast cells had evolved stable genomes. In addition, evolved cells acquired resistance to stresses specific to tetraploids and exhibited a more diploid-like transcriptome profile. Further analyses indicated that Sch9—the functional ortholog of mammalian S6 kinase involved in protein homeostasis, G1 progression, stress response and nutrient signaling—contributed to the evolved phenotypes. Evolved cells increased the protein abundance and stability of Sch9. Reconstitution experiments showed that overexpression of SCH9 enabled ancestral cells to display the evolved phenotypes and eliminating SCH9 diminished the evolved phenotypes. Finally, we show that evolved cells were able to maintain their genomes even under a condition that causes newly-formed tetraploids to evolve aneuploidy. Our results reveal that at the early stages after genome duplication, stable polyploidy can be achieved by fine-tuning a conserved key regulator coordinating multiple cellular processes.
Affiliation(s)
- Yi-Jin Lu
  - Department of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan
  - Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan
- Jun-Yi Leu
  - Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan
33
Das S, Mykles DL. A Comparison of Resources for the Annotation of a De Novo Assembled Transcriptome in the Molting Gland (Y-Organ) of the Blackback Land Crab, Gecarcinus lateralis. Integr Comp Biol 2016; 56:1103-1112. [PMID: 27549198 DOI: 10.1093/icb/icw107] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 02/07/2023] Open
Abstract
Next-generation sequencing technologies are revolutionizing crustacean biology. De novo assembly of RNA sequencing (RNA-seq) data allows researchers to catalog and quantify genes expressed in tissues of a species that lacks a complete genome sequence. RNA-seq has become an important tool for understanding phenotypic plasticity and the responses of organisms to environmental cues. However, there are challenges with identification of assembled contiguous sequences (contigs) without a reference genome. Thus, the selection of resources for annotating contigs is critical for the downstream analysis of gene functions. A de novo-assembled transcriptome of the Gecarcinus lateralis molting gland, or Y-organ (YO), was used to compare two functional annotation packages, Trinotate and Blast2GO. The assembled transcriptome contained 229,278 contigs derived from YOs from animals in intermolt, premolt (early, mid, and late), and postmolt stages. Gene identification using BLAST against four databases and functional annotation using Gene Ontologies were conducted. The analysis revealed two major limitations of de novo assembly: (1) assembly using Trinity generates redundant contigs and (2) transcripts that encode protein isoforms are not distinguished with current computational tools. It is recommended that the NCBI Non-Redundant, SwissProt, TrEMBL, and Uniref90 databases be used to maximize gene identification. Trinotate is preferred for assigning functions to identified genes, as the package uses multiple databases for annotation. The differences between packages to generate Gene Ontology (GO) terms are attributed to the databases used for inputs: Trinotate uses both Pfam and BLAST databases, while Blast2GO uses only the BLAST database. InterProScan was used to verify the GO terms assigned via BLAST. A comprehensive annotation of de novo assembled transcriptome is necessary for the downstream analysis of differentially expressed transcripts in the YO over the molt cycle.
Affiliation(s)
- Sunetra Das
  - Department of Biology, Colorado State University, Fort Collins, CO 80523, USA
- Donald L Mykles
  - Department of Biology, Colorado State University, Fort Collins, CO 80523, USA
34
Parente MK, Rozen R, Seeholzer SH, Wolfe JH. Integrated analysis of proteome and transcriptome changes in the mucopolysaccharidosis type VII mouse hippocampus. Mol Genet Metab 2016; 118:41-54. [PMID: 27053151 PMCID: PMC4832927 DOI: 10.1016/j.ymgme.2016.03.003] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 03/03/2016] [Accepted: 03/05/2016] [Indexed: 12/15/2022]
Abstract
Mucopolysaccharidosis type VII (MPS VII) is a lysosomal storage disease caused by the deficiency of β-glucuronidase. In this study, we compared the changes relative to normal littermates in the proteome and transcriptome of the hippocampus in the C57Bl/6 mouse model of MPS VII, which has well-documented histopathological and neurodegenerative changes. A completely different set of significant changes between normal and MPS VII littermates were found in each assay. Nevertheless, the functional annotation terms generated by the two methods showed agreement in many of the processes, which also corresponded to known pathology associated with the disease. Additionally, assay-specific changes were found, which in the proteomic analysis included mitochondria, energy generation, and cytoskeletal differences in the mutant, while the transcriptome differences included immune, vesicular, and extracellular matrix changes. In addition, the transcriptomic changes in the mutant hippocampus were concordant with those in a MPS VII mouse caused by the same mutation but on a different background inbred strain.
Affiliation(s)
- Michael K Parente
- Research Institute of the Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Ramona Rozen
- Research Institute of the Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Steven H Seeholzer
- Research Institute of the Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- John H Wolfe
- Research Institute of the Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; W. F. Goodman Center for Comparative Medical Genetics, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
35
Mayne J, Ning Z, Zhang X, Starr AE, Chen R, Deeke S, Chiang CK, Xu B, Wen M, Cheng K, Seebun D, Star A, Moore JI, Figeys D. Bottom-Up Proteomics (2013-2015): Keeping up in the Era of Systems Biology. Anal Chem 2015; 88:95-121. [PMID: 26558748 DOI: 10.1021/acs.analchem.5b04230] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Indexed: 02/07/2023]
Affiliation(s)
- Janice Mayne
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Zhibin Ning
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Xu Zhang
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Amanda E Starr
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Rui Chen
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Shelley Deeke
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Cheng-Kang Chiang
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Bo Xu
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Ming Wen
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Kai Cheng
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Deeptee Seebun
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Alexandra Star
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Jasmine I Moore
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
- Daniel Figeys
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada, K1H8M5
36
Glass K, Girvan M. Finding New Order in Biological Functions from the Network Structure of Gene Annotations. PLoS Comput Biol 2015; 11:e1004565. [PMID: 26588252 PMCID: PMC4654495 DOI: 10.1371/journal.pcbi.1004565] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 01/06/2015] [Accepted: 09/23/2015] [Indexed: 11/19/2022] Open
Abstract
The Gene Ontology (GO) provides biologists with a controlled terminology that describes how genes are associated with functions and how functional terms are related to one another. These term-term relationships encode how scientists conceive the organization of biological functions, and they take the form of a directed acyclic graph (DAG). Here, we propose that the network structure of gene-term annotations made using GO can be employed to establish an alternative approach for grouping functional terms that captures intrinsic functional relationships that are not evident in the hierarchical structure established in the GO DAG. Instead of relying on an externally defined organization for biological functions, our approach connects biological functions together if they are performed by the same genes, as indicated in a compendium of gene annotation data from numerous different sources. We show that grouping terms by this alternate scheme provides a new framework with which to describe and predict the functions of experimentally identified sets of genes.
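The idea of linking functional terms through the genes they share can be illustrated with a small sketch. This is not the authors' method: the abstract does not state how links are weighted or how terms are grouped, so the Jaccard weight, the `min_shared` threshold, and the function name `term_network` are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def term_network(annotations, min_shared=1):
    """Build a term-term network from (gene, term) annotation pairs.

    Two terms are linked when at least `min_shared` genes are annotated
    to both. The edge weight is the Jaccard index of their gene sets,
    which is an illustrative choice, not the weighting used in the paper.
    """
    genes_by_term = defaultdict(set)
    for gene, term in annotations:
        genes_by_term[term].add(gene)

    edges = {}
    # Compare every unordered pair of terms once, in a stable order.
    for a, b in combinations(sorted(genes_by_term), 2):
        shared = genes_by_term[a] & genes_by_term[b]
        if len(shared) >= min_shared:
            union = genes_by_term[a] | genes_by_term[b]
            edges[(a, b)] = len(shared) / len(union)
    return edges


# Hypothetical toy annotations: terms A and B share gene g2, C is isolated.
toy = [("g1", "A"), ("g2", "A"), ("g2", "B"), ("g3", "B"), ("g4", "C")]
edges = term_network(toy)
```

The pairwise loop is quadratic in the number of terms, which is acceptable for a toy input; a real annotation compendium would probably need a sparse bipartite projection instead.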
Affiliation(s)
- Kimberly Glass
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Physics Department, University of Maryland, College Park, Maryland, United States of America
- Michelle Girvan
- Physics Department, University of Maryland, College Park, Maryland, United States of America
- Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America
- Santa Fe Institute, Santa Fe, New Mexico, United States of America
37
Richardson JE, Bult CJ. Visual annotation display (VLAD): a tool for finding functional themes in lists of genes. Mamm Genome 2015; 26:567-73. [PMID: 26047590 PMCID: PMC4602057 DOI: 10.1007/s00335-015-9570-2] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 03/06/2015] [Accepted: 05/19/2015] [Indexed: 12/27/2022]
Abstract
Experiments that employ genome scale technology platforms frequently result in lists of tens to thousands of genes with potential significance to a specific biological process or disease. Searching for biologically relevant connections among the genes or gene products in these lists is a common data analysis task. We have implemented a software application for uncovering functional themes in sets of genes based on their annotations to bio-ontologies, such as the gene ontology and the mammalian phenotype ontology. The application, called VisuaL Annotation Display (VLAD), performs a statistical analysis to test for the enrichment of ontology terms in a set of genes submitted by a researcher. The results for each analysis using VLAD include a table of ontology terms, sorted in decreasing order of significance. Each row contains the term, statistics such as the number of annotated terms, the p value, etc., and the symbols of annotated genes. An accompanying graphical display shows portions of the ontology hierarchy, where node sizes are scaled based on p values. Although numerous ontology term enrichment programs already exist, VLAD is unique in that it allows users to upload their own annotation files and ontologies for customized term enrichment analyses, supports the analysis of multiple gene sets at once, provides interfaces to customize graphical output, and is tightly integrated with functional and biological details about mouse genes in the Mouse Genome Informatics (MGI) database. VLAD is available as a web-based application from the MGI web site (http://proto.informatics.jax.org/prototypes/vlad/).
Affiliation(s)
- Joel E Richardson
- Mouse Genome Informatics (MGI) Database Consortium, The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA.
- Carol J Bult
- Mouse Genome Informatics (MGI) Database Consortium, The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA.
38
Tsuyuzaki K, Morota G, Ishii M, Nakazato T, Miyazaki S, Nikaido I. MeSH ORA framework: R/Bioconductor packages to support MeSH over-representation analysis. BMC Bioinformatics 2015; 16:45. [PMID: 25887539 PMCID: PMC4343279 DOI: 10.1186/s12859-015-0453-z] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 07/03/2014] [Accepted: 01/08/2015] [Indexed: 11/23/2022] Open
Abstract
Background In genome-wide studies, over-representation analysis (ORA) against a set of genes is an essential step for biological interpretation. Many gene annotation resources and software platforms for ORA have been proposed. Recently, Medical Subject Headings (MeSH) terms, which are annotations of PubMed documents, have been used for ORA. MeSH enables the extraction of broader meaning from the gene lists and is expected to become an exhaustive annotation resource for ORA. However, the existing MeSH ORA software platforms are still not sufficient for several reasons. Results In this work, we developed an original MeSH ORA framework composed of six types of R packages, including MeSH.db, MeSH.AOR.db, MeSH.PCR.db, the org.MeSH.XXX.db-type packages, MeSHDbi, and meshr. Conclusions Using our framework, users can easily conduct MeSH ORA. By utilizing the enriched MeSH terms, related PubMed documents can be retrieved and saved on local machines within this framework.
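For readers unfamiliar with ORA, the following is a minimal sketch of the generic technique using a one-sided hypergeometric test. It does not reproduce the meshr API or the packages named above; the function names `hypergeom_sf` and `ora`, the toy gene identifiers, and the term labels are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which are annotated with the term (hypergeometric upper tail)."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def ora(gene_list, background, term_to_genes):
    """Return (term, overlap_size, p_value) tuples sorted by p-value.

    Raw p-values only: a real analysis would adjust them for multiple
    testing (for example with a false discovery rate procedure).
    """
    background = set(background)
    hits = set(gene_list) & background
    results = []
    for term, genes in term_to_genes.items():
        annotated = set(genes) & background
        overlap = hits & annotated
        if not overlap:
            continue  # terms with no overlap cannot be over-represented
        p = hypergeom_sf(len(overlap), len(background), len(annotated), len(hits))
        results.append((term, len(overlap), p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: 10 background genes, two annotation terms.
background = [f"g{i}" for i in range(1, 11)]
terms = {"T1": ["g1", "g2", "g3"], "T2": ["g1", "g4", "g5", "g6", "g7", "g8"]}
ranked = ora(["g1", "g2", "g3"], background, terms)
```

In this toy case all three submitted genes carry term T1, so T1 ranks first; T2 overlaps by a single gene and receives a much larger p-value.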
Affiliation(s)
- Koki Tsuyuzaki
- Department of Medical and Life Science, Faculty of Pharmaceutical Science, Tokyo University of Science, 2641 Yamazaki, Noda, 278-8510, Chiba, Japan.
- Bioinformatics Research Unit, Advanced Center for Computing and Communication, RIKEN, 2-1 Hirosawa, Wako, 351-0198, Saitama, Japan.
- Gota Morota
- Department of Animal Science, University of Nebraska-Lincoln, Lincoln, NE, USA.
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI, USA.
- Manabu Ishii
- Bioinformatics Research Unit, Advanced Center for Computing and Communication, RIKEN, 2-1 Hirosawa, Wako, 351-0198, Saitama, Japan.
- Takeru Nakazato
- Database Center for Life Science (DBCLS), Research Organization of Information and Systems (ROIS), Faculty of Engineering Building 12, The University of Tokyo, 2-11-16 Yayoi, Bunkyo-ku, 113-0032, Tokyo, Japan.
- Satoru Miyazaki
- Department of Medical and Life Science, Faculty of Pharmaceutical Science, Tokyo University of Science, 2641 Yamazaki, Noda, 278-8510, Chiba, Japan.
- Itoshi Nikaido
- Bioinformatics Research Unit, Advanced Center for Computing and Communication, RIKEN, 2-1 Hirosawa, Wako, 351-0198, Saitama, Japan.
39
Carnielli CM, Winck FV, Paes Leme AF. Functional annotation and biological interpretation of proteomics data. Biochimica et Biophysica Acta - Proteins and Proteomics 2015; 1854:46-54. [DOI: 10.1016/j.bbapap.2014.10.019] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 09/11/2014] [Revised: 10/07/2014] [Accepted: 10/21/2014] [Indexed: 12/22/2022]
40
Na D, Son H, Gsponer J. Categorizer: a tool to categorize genes into user-defined biological groups based on semantic similarity. BMC Genomics 2014; 15:1091. [PMID: 25495442 PMCID: PMC4298957 DOI: 10.1186/1471-2164-15-1091] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 06/02/2014] [Accepted: 12/04/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Commonalities between large sets of genes obtained from high-throughput experiments are often identified by searching for enrichments of genes with the same Gene Ontology (GO) annotations. The GO analysis tools used for these enrichment analyses assume that GO terms are independent and the semantic distances between all parent-child terms are identical, which is not true in a biological sense. In addition, these tools output lists of often redundant or too specific GO terms, which are difficult to interpret in the context of the biological question investigated by the user. Therefore, there is a demand for a robust and reliable method for gene categorization and enrichment analysis. RESULTS We have developed Categorizer, a tool that classifies genes into user-defined groups (categories) and calculates p-values for the enrichment of the categories. Categorizer identifies the biologically best-fit category for each gene by taking advantage of a specialized semantic similarity measure for GO terms. We demonstrate that Categorizer provides improved categorization and enrichment results of genetic modifiers of Huntington's disease compared to a classical GO Slim-based approach or categorizations using other semantic similarity measures. CONCLUSION Categorizer enables more accurate categorizations of genes than currently available methods. This new tool will help experimental and computational biologists analyze genomic and proteomic data according to their specific needs in a more reliable manner.
Affiliation(s)
- Jörg Gsponer
- Department of Biochemistry and Molecular Biology, Centre for High-throughput Biology, University of British Columbia, 2125 East Mall, Vancouver, BC V6T 1Z4, Canada.
41
Rodgers-Melnick E, Culp M, DiFazio SP. Predicting whole genome protein interaction networks from primary sequence data in model and non-model organisms using ENTS. BMC Genomics 2013; 14:608. [PMID: 24015873 PMCID: PMC3848842 DOI: 10.1186/1471-2164-14-608] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 12/05/2012] [Accepted: 09/04/2013] [Indexed: 01/10/2023] Open
Abstract
Background The large-scale identification of physical protein-protein interactions (PPIs) is an important step toward understanding how biological networks evolve and generate emergent phenotypes. However, experimental identification of PPIs is a laborious and error-prone process, and current methods of PPI prediction tend to be highly conservative or require large amounts of functional data that may not be available for newly-sequenced organisms. Results In this study we demonstrate a random-forest based technique, ENTS, for the computational prediction of protein-protein interactions based only on primary sequence data. Our approach is able to efficiently predict interactions on a whole-genome scale for any eukaryotic organism, using pairwise combinations of conserved domains and predicted subcellular localization of proteins as input features. We present the first predicted interactome for the forest tree Populus trichocarpa in addition to the predicted interactomes for Saccharomyces cerevisiae, Homo sapiens, Mus musculus, and Arabidopsis thaliana. Comparing our approach to other PPI predictors, we find that ENTS performs comparably to or better than a number of existing approaches, including several that utilize a variety of functional information for their predictions. We also find that the predicted interactions are biologically meaningful, as indicated by similarity in functional annotations and enrichment of co-expressed genes in public microarray datasets. Furthermore, we demonstrate some of the biological insights that can be gained from these predicted interaction networks. We show that the predicted interactions yield informative groupings of P. trichocarpa metabolic pathways, literature-supported associations among human disease states, and theory-supported insight into the evolutionary dynamics of duplicated genes in paleopolyploid plants. 
Conclusion We conclude that the ENTS classifier will be a valuable tool for the de novo annotation of genome sequences, providing initial clues about regulatory and metabolic network topology, and revealing relationships that are not immediately obvious from traditional homology-based annotations.
Affiliation(s)
- Eli Rodgers-Melnick
- Department of Biology, West Virginia University, Morgantown, West Virginia, 26506, USA.