201
Marsh JW, Hayward RJ, Shetty AC, Mahurkar A, Humphrys MS, Myers GSA. Bioinformatic analysis of bacteria and host cell dual RNA-sequencing experiments. Brief Bioinform 2019; 19:1115-1129. [PMID: 28535295 DOI: 10.1093/bib/bbx043] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 02/02/2017] [Indexed: 12/18/2022]
Abstract
Bacterial pathogens subvert host cells by manipulating cellular pathways for survival and replication; in turn, host cells respond to the invading pathogen through cascading changes in gene expression. Deciphering these complex temporal and spatial dynamics to identify novel bacterial virulence factors or host response pathways is crucial for improved diagnostics and therapeutics. Dual RNA sequencing (dRNA-Seq) has recently been developed to simultaneously capture host and bacterial transcriptomes from an infected cell. This approach builds on the high sensitivity and resolution of RNA sequencing technology and is applicable to any bacteria that interact with eukaryotic cells, encompassing parasitic, commensal or mutualistic lifestyles. Several laboratory protocols have been presented that outline the collection, extraction and sequencing of total RNA for dRNA-Seq experiments, but there is relatively little guidance available for the detailed bioinformatic analyses required. This protocol outlines a typical dRNA-Seq experiment, based on a Chlamydia trachomatis-infected host cell, with a detailed description of the necessary bioinformatic analyses with currently available software tools.
Affiliation(s)
- James W Marsh
- The ithree institute, University of Technology Sydney
- Amol C Shetty
- Institute for Genome Sciences at the University of Maryland, Baltimore
- Anup Mahurkar
- Institute for Genome Sciences at the University of Maryland, Baltimore
202
Liu L, Kim MH, Hyeon C. Heterogeneous Loop Model to Infer 3D Chromosome Structures from Hi-C. Biophys J 2019; 117:613-625. [PMID: 31337548 DOI: 10.1016/j.bpj.2019.06.032] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/05/2019] [Revised: 05/22/2019] [Accepted: 06/25/2019] [Indexed: 10/26/2022]
Abstract
Adapting a well-established formalism in polymer physics, we develop a minimalist approach to infer three-dimensional folding of chromatin from Hi-C data. The three-dimensional chromosome structures generated from our heterogeneous loop model (HLM) are used to visualize chromosome organizations that can substantiate the measurements from fluorescence in situ hybridization, chromatin interaction analysis by paired-end tag sequencing, and RNA-seq signals. We demonstrate the utility of the HLM with several case studies. Specifically, the HLM-generated chromosome structures, which reproduce the spatial distribution of topologically associated domains from fluorescence in situ hybridization measurement, show the phase segregation between two types of topologically associated domains explicitly. We discuss the origin of cell-type-dependent gene-expression level by modeling the chromatin globules of α-globin and SOX2 gene loci for two different cell lines. We also use the HLM to discuss how the chromatin folding and gene-expression level of Pax6 loci, associated with mouse neural development, are modulated by interactions with two enhancers. Finally, HLM-generated structures of chromosome 19 of mouse embryonic stem cells, based on single-cell Hi-C data collected over each cell-cycle phase, visualize changes in chromosome conformation along the cell cycle. Given a contact frequency map between chromatin loci supplied from Hi-C, HLM is a computationally efficient and versatile modeling tool to generate chromosome structures that can complement the interpretation of other experimental data.
Affiliation(s)
- Lei Liu
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul, Republic of Korea
- Min Hyeok Kim
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul, Republic of Korea
- Changbong Hyeon
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul, Republic of Korea
203
Alanjary M, Steinke K, Ziemert N. AutoMLST: an automated web server for generating multi-locus species trees highlighting natural product potential. Nucleic Acids Res 2019; 47:W276-W282. [PMID: 30997504 PMCID: PMC6602446 DOI: 10.1093/nar/gkz282] [Citation(s) in RCA: 246] [Impact Index Per Article: 49.2] [Received: 02/07/2019] [Revised: 03/29/2019] [Accepted: 04/10/2019] [Indexed: 12/31/2022]
Abstract
Understanding the evolutionary background of a bacterial isolate has applications for a wide range of research. However, generating an accurate species phylogeny remains challenging. Reliance on 16S rDNA for species identification remains popular. Unfortunately, this widespread method suffers from low resolution at the species level due to high sequence conservation. There is now a wealth of genomic data that can be used to yield more accurate species designations via modern phylogenetic methods and multiple genetic loci. However, these often require extensive expertise and time. The Automated Multi-Locus Species Tree (autoMLST) was thus developed to provide a rapid 'one-click' pipeline to simplify this workflow at: https://automlst.ziemertlab.com. This server utilizes Multi-Locus Sequence Analysis (MLSA) to produce high-resolution species trees; it does not perform multi-locus sequence typing (MLST), a related classification method. The resulting phylogenetic tree also includes helpful annotations, such as species clade designations and secondary metabolite counts to aid natural product prospecting. Distinct from currently available web-interfaces, autoMLST can automate selection of reference genomes and out-group organisms based on one or more query genomes. This enables a wide range of researchers to perform rigorous phylogenetic analyses more rapidly compared to manual MLSA workflows.
Affiliation(s)
- Mohammad Alanjary
- Interfaculty Institute of Microbiology and Infection Medicine Tübingen, University of Tübingen, Tübingen, Germany
- German Centre for Infection Research (DZIF), Partner Site Tübingen, Tübingen, Germany
- Katharina Steinke
- Interfaculty Institute of Microbiology and Infection Medicine Tübingen, University of Tübingen, Tübingen, Germany
- German Centre for Infection Research (DZIF), Partner Site Tübingen, Tübingen, Germany
- Nadine Ziemert
- Interfaculty Institute of Microbiology and Infection Medicine Tübingen, University of Tübingen, Tübingen, Germany
- German Centre for Infection Research (DZIF), Partner Site Tübingen, Tübingen, Germany
204
Wang C, Zhang S. Large-scale determination and characterization of cell type-specific regulatory elements in the human genome. J Mol Cell Biol 2019; 9:463-476. [PMID: 29281093 DOI: 10.1093/jmcb/mjx058] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/28/2017] [Accepted: 12/19/2017] [Indexed: 01/05/2023]
Abstract
Histone modifications are widely recognized to play vital roles in gene regulation and cell identity. The Roadmap Epigenomics Consortium generated a reference catalog of several key histone modifications across hundreds of human cell types and tissues. Decoding these epigenomes into functional regulatory elements is a challenging task in computational biology. To this end, we adopted a differential chromatin modification analysis framework to comprehensively determine and characterize cell type-specific regulatory elements (CSREs) and their histone modification codes in the human epigenomes of five histone modifications across 127 tissues or cell types. The CSREs are significantly relevant to cell type-specific biological functions, diseases and cell identity. Clustering of CSREs with their specificity signals reveals distinct histone codes, demonstrating the diversity of functional roles of CSREs within the same cell or tissue. Finally, the dynamics of CSREs from closely related cell types or tissues can give a detailed view of developmental processes such as normal tissue development and cancer occurrence.
Affiliation(s)
- Can Wang
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Shihua Zhang
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
205
Kwan HH, Culibrk L, Taylor GA, Leelakumari S, Tan R, Jackman SD, Tse K, MacLeod T, Cheng D, Chuah E, Kirk H, Pandoh P, Carlsen R, Zhao Y, Mungall AJ, Moore R, Birol I, Marra MA, Rosen DAS, Haulena M, Jones SJM. The Genome of the Steller Sea Lion (Eumetopias jubatus). Genes (Basel) 2019; 10:genes10070486. [PMID: 31248052 PMCID: PMC6678222 DOI: 10.3390/genes10070486] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/15/2019] [Revised: 06/20/2019] [Accepted: 06/21/2019] [Indexed: 11/16/2022]
Abstract
The Steller sea lion is the largest member of the Otariidae family and is found in the coastal waters of the northern Pacific Rim. Here, we present the Steller sea lion genome, determined through DNA sequencing approaches that utilized microfluidic partitioning library construction, as well as nanopore technologies. These methods constructed a highly contiguous assembly with a scaffold N50 length of over 14 megabases, a contig N50 length of over 242 kilobases and a total length of 2.404 gigabases. As a measure of completeness, 95.1% of 4104 highly conserved mammalian genes were found to be complete within the assembly. Further annotation identified 19,668 protein coding genes. The assembled genome sequence and underlying sequence data can be found at the National Center for Biotechnology Information (NCBI) under the BioProject accession number PRJNA475770.
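The scaffold and contig figures quoted in this abstract are N50 values: the length L such that sequences of length L or longer together hold at least half of the assembled bases. The following is a minimal Python sketch of that standard definition. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the Steller sea lion assembly or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of all bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half of 300, so N50 is 70
```

Applied separately to scaffold lengths and to contig lengths, the same function would give the two kinds of N50 that the abstract reports.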
Affiliation(s)
- Harwood H Kwan
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T-1Z4, Canada
- Luka Culibrk
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Department of Graduate Studies, Bioinformatics, University of British Columbia, Vancouver, BC V6T-1Z4, Canada
- Gregory A Taylor
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Sreeja Leelakumari
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Ryan Tan
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Shaun D Jackman
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Kane Tse
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Tina MacLeod
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Dean Cheng
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Eric Chuah
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Heather Kirk
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Pawan Pandoh
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Rebecca Carlsen
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Yongjun Zhao
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Andrew J Mungall
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Richard Moore
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Inanc Birol
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T-1Z4, Canada
- Marco A Marra
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T-1Z4, Canada
- David A S Rosen
- Institute for the Oceans and Fisheries, University of British Columbia, Vancouver, BC V6T-1Z4, Canada
- Vancouver Aquarium, Vancouver, BC V6G 3E2, Canada
- Steven J M Jones
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer, Vancouver, BC V5Z-4S6, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T-1Z4, Canada
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A-1S6, Canada
206
Richardson MF, Munyard K, Croft LJ, Allnutt TR, Jackling F, Alshanbari F, Jevit M, Wright GA, Cransberg R, Tibary A, Perelman P, Appleton B, Raudsepp T. Chromosome-Level Alpaca Reference Genome VicPac3.1 Improves Genomic Insight Into the Biology of New World Camelids. Front Genet 2019; 10:586. [PMID: 31293619 PMCID: PMC6598621 DOI: 10.3389/fgene.2019.00586] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 01/30/2019] [Accepted: 06/04/2019] [Indexed: 12/11/2022]
Abstract
The development of high-quality chromosomally assigned reference genomes constitutes a key feature for understanding genome architecture of a species and is critical for the discovery of the genetic blueprints of traits of biological significance. South American camelids serve people in extreme environments and are important fiber and companion animals worldwide. Despite this, the alpaca reference genome lags far behind those available for other domestic species. Here we produced a chromosome-level improved reference assembly for the alpaca genome using the DNA of the same female Huacaya alpaca as in previous assemblies. We generated 190X Illumina short-read, 8X Pacific Biosciences long-read and 60X Dovetail Chicago® chromatin interaction scaffolding data for the assembly, used testis and skin RNAseq data for annotation, and cytogenetic map data for chromosomal assignments. The new assembly VicPac3.1 contains 90% of the alpaca genome in just 103 scaffolds and 76% of all scaffolds are mapped to the 36 pairs of the alpaca autosomes and the X chromosome. Preliminary annotation of the assembly predicted 22,462 coding genes and 29,337 isoforms. Comparative analysis of selected regions of the alpaca genome, such as the major histocompatibility complex (MHC), the region involved in the Minute Chromosome Syndrome (MCS) and candidate genes for high-altitude adaptations, reveals unique features of the alpaca genome. The alpaca reference genome VicPac3.1 presents a significant improvement in completeness, contiguity and accuracy over VicPac2 and is an important tool for the advancement of genomics research in all New World camelids.
Affiliation(s)
- Mark F Richardson
- Genomics Centre, Deakin University, Geelong, VIC, Australia; Centre for Integrative Ecology, Deakin University, Geelong, VIC, Australia
- Kylie Munyard
- School of Pharmacy and Biomedical Sciences, Curtin Health Innovation Research Institute, Curtin University, Perth, WA, Australia
- Larry J Croft
- Genomics Centre, Deakin University, Geelong, VIC, Australia
- Theodore R Allnutt
- Bioinformatics Core Research Group, Deakin University, Geelong, VIC, Australia
- Felicity Jackling
- Department of Genetics, The University of Melbourne, Melbourne, VIC, Australia
- Fahad Alshanbari
- Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, United States
- Matthew Jevit
- Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, United States
- Gus A Wright
- Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, United States
- Rhys Cransberg
- School of Pharmacy and Biomedical Sciences, Curtin Health Innovation Research Institute, Curtin University, Perth, WA, Australia
- Ahmed Tibary
- Center for Reproductive Biology, Washington State University, Pullman, WA, United States
- Polina Perelman
- Institute of Molecular and Cellular Biology, Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russia
- Belinda Appleton
- Centre for Integrative Ecology, Deakin University, Geelong, VIC, Australia
- Terje Raudsepp
- Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, United States
207
Megquier K, Genereux DP, Hekman J, Swofford R, Turner-Maier J, Johnson J, Alonso J, Li X, Morrill K, Anguish LJ, Koltookian M, Logan B, Sharp CR, Ferrer L, Lindblad-Toh K, Meyers-Wallen VN, Hoffman A, Karlsson EK. BarkBase: Epigenomic Annotation of Canine Genomes. Genes (Basel) 2019; 10:E433. [PMID: 31181663 PMCID: PMC6627511 DOI: 10.3390/genes10060433] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 05/04/2019] [Revised: 05/29/2019] [Accepted: 06/03/2019] [Indexed: 12/20/2022]
Abstract
Dogs are an unparalleled natural model for investigating the genetics of health and disease, particularly for complex diseases like cancer. Comprehensive genomic annotation of regulatory elements active in healthy canine tissues is crucial both for identifying candidate causal variants and for designing functional studies needed to translate genetic associations into disease insight. Currently, canine geneticists rely primarily on annotations of the human or mouse genome that have been remapped to dog, an approach that misses dog-specific features. Here, we describe BarkBase, a canine epigenomic resource available at barkbase.org. BarkBase hosts data for 27 adult tissue types, with biological replicates, and for one sample of up to five tissues sampled at each of four carefully staged embryonic time points. RNA sequencing is complemented with whole genome sequencing and with assay for transposase-accessible chromatin using sequencing (ATAC-seq), which identifies open chromatin regions. By including replicates, we can more confidently discern tissue-specific transcripts and assess differential gene expression between tissues and timepoints. By offering data in easy-to-use file formats, through a visual browser modeled on similar genomic resources for human, BarkBase introduces a powerful new resource to support comparative studies in dogs and humans.
Affiliation(s)
- Kate Megquier
- Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Diane P Genereux
- Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Jessica Hekman
- Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Ross Swofford
- Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Jason Turner-Maier
- Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Jeremy Johnson
- Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Jacob Alonso
- Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Xue Li
- Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01655, USA.
- Kathleen Morrill
- Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01655, USA.
- Lynne J Anguish
- Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA.
- Michele Koltookian
- Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Brittney Logan
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01655, USA.
- Claire R Sharp
- School of Veterinary and Life Sciences, College of Veterinary Medicine, Murdoch University, Perth, Murdoch, WA 6150, Australia.
- Lluis Ferrer
- Departament de Medicina i Cirurgia Animals Veterinary School, Universitat Autonoma de Barcelona, 08193 Barcelona, Spain.
- Kerstin Lindblad-Toh
- Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Science for Life Laboratory, Department of Medical Biochemistry & Microbiology, Uppsala University, 751 23 Uppsala, Sweden.
- Vicki N Meyers-Wallen
- Baker Institute for Animal Health and Department of Biomedical Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY 14850, USA.
- Andrew Hoffman
- School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
- Cummings School of Veterinary Medicine, Tufts University, Grafton, MA 01536, USA.
- Elinor K Karlsson
- Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01655, USA.
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01655, USA.
208
Miller JB, McKinnon LM, Whiting MF, Ridge PG. CAM: an alignment-free method to recover phylogenies using codon aversion motifs. PeerJ 2019; 7:e6984. [PMID: 31198636 PMCID: PMC6555396 DOI: 10.7717/peerj.6984] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/09/2018] [Accepted: 04/17/2019] [Indexed: 12/20/2022]
Abstract
BACKGROUND Common phylogenomic approaches for recovering phylogenies are often time-consuming and require annotations for orthologous gene relationships that are not always available. In contrast, alignment-free phylogenomic approaches typically use structure and oligomer frequencies to calculate pairwise distances between species. We have developed an approach to quickly calculate distances between species based on codon aversion. METHODS Utilizing a novel alignment-free character state, we present CAM, an alignment-free approach to recover phylogenies by comparing differences in codon aversion motifs (i.e., the set of unused codons within each gene) across all genes within a species. Synonymous codon usage is non-random and differs between organisms, between genes, and even within a single gene, and many genes do not use all possible codons. We report a comprehensive analysis of codon aversion within 229,742,339 genes from 23,428 species across all kingdoms of life, and we provide an alignment-free framework for its use in a phylogenetic construct. For each species, we first construct a set of codon aversion motifs spanning all genes within that species. We define the pairwise distance between two species, A and B, as one minus the number of shared codon aversion motifs divided by the total codon aversion motifs of the species, A or B, containing the fewest motifs. This approach allows us to calculate pairwise distances even when substantial differences in the number of genes or a high rate of divergence between species exists. Finally, we use neighbor-joining to recover phylogenies. RESULTS Using the Open Tree of Life and NCBI Taxonomy Database as expected phylogenies, our approach compares well, recovering phylogenies that largely match expected trees and are comparable to trees recovered using maximum likelihood and other alignment-free approaches. Our technique is much faster than maximum likelihood and similar in accuracy to other alignment-free approaches. Therefore, we propose that codon aversion be considered a phylogenetically conserved character that may be used in future phylogenomic studies. AVAILABILITY CAM, documentation, and test files are freely available on GitHub at https://github.com/ridgelab/cam.
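The pairwise distance defined in the METHODS section can be illustrated with a short Python sketch. This is not the CAM implementation from the linked repository. The function names, the choice to count all 64 codons (including stop codons), and the toy coding sequences are assumptions made for illustration only, and the neighbor-joining step is not shown.

```python
from itertools import product

# Assumption: motifs are defined over all 64 codons, stop codons included.
ALL_CODONS = frozenset("".join(c) for c in product("ACGT", repeat=3))


def aversion_motif(cds):
    """Return the set of codons that a coding sequence never uses."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    used = {cds[i:i + 3] for i in range(0, usable, 3)}
    return ALL_CODONS - used  # frozenset, so it can be stored in a set


def species_motifs(genes):
    """Collect the distinct codon aversion motifs across a species' genes."""
    return {aversion_motif(gene) for gene in genes}


def cam_distance(motifs_a, motifs_b):
    """One minus shared motifs divided by the smaller motif set's size."""
    smaller = min(len(motifs_a), len(motifs_b))
    if smaller == 0:
        raise ValueError("each species needs at least one motif")
    return 1 - len(motifs_a & motifs_b) / smaller


# Hypothetical toy genes: the two species share exactly one motif.
species_a = species_motifs(["ATGAAATAA", "ATGCCCTAA"])
species_b = species_motifs(["ATGAAATAA", "ATGGGGTAA", "ATGTTTTAA"])
print(cam_distance(species_a, species_b))  # 1 - 1/2 = 0.5
```

Dividing by the size of the smaller motif set keeps the distance between 0 and 1 even when one species has far more genes than the other, which matches the abstract's stated reason for that choice of denominator.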
Affiliation(s)
- Justin B. Miller
- Department of Biology, Brigham Young University, Provo, UT, United States of America
- Lauren M. McKinnon
- Department of Biology, Brigham Young University, Provo, UT, United States of America
- Michael F. Whiting
- Department of Biology, Brigham Young University, Provo, UT, United States of America
- Brigham Young University, M.L. Bean Museum, Provo, UT, United States of America
- Perry G. Ridge
- Department of Biology, Brigham Young University, Provo, UT, United States of America
209
Abstract
Genetic, transcriptional, and post-transcriptional variations shape the transcriptome of individual cells, making it difficult to establish an exhaustive set of reference RNAs. Current reference transcriptomes, which are based on carefully curated transcripts, are lagging behind the extensive RNA variation revealed by massively parallel sequencing. Much may be missed by ignoring this unreferenced RNA diversity. There is plentiful evidence for non-reference transcripts with important phenotypic effects. Although reference transcriptomes are invaluable for gene expression analysis, they may become limiting in important medical applications. We discuss computational strategies for retrieving hidden transcript diversity.
Affiliation(s)
- Antonin Morillon
- ncRNA, Epigenetic and Genome Fluidity, CNRS UMR 3244, Sorbonne Université, PSL University, Institut Curie, Centre de Recherche, 26 rue d'Ulm, 75248, Paris, France
- Daniel Gautheret
- Institute for Integrative Biology of the Cell, CEA, CNRS, Université Paris-Sud, Université Paris Saclay, Gif sur Yvette, France
210
Brazel DM, Jiang Y, Hughey JM, Turcot V, Zhan X, Gong J, Batini C, Weissenkampen JD, Liu M, Barnes DR, Bertelsen S, Chou YL, Erzurumluoglu AM, Faul JD, Haessler J, Hammerschlag AR, Hsu C, Kapoor M, Lai D, Le N, de Leeuw CA, Loukola A, Mangino M, Melbourne CA, Pistis G, Qaiser B, Rohde R, Shao Y, Stringham H, Wetherill L, Zhao W, Agrawal A, Bierut L, Chen C, Eaton CB, Goate A, Haiman C, Heath A, Iacono WG, Martin NG, Polderman TJ, Reiner A, Rice J, Schlessinger D, Scholte HS, Smith JA, Tardif JC, Tindle HA, van der Leij AR, Boehnke M, Chang-Claude J, Cucca F, David SP, Foroud T, Howson JMM, Kardia SLR, Kooperberg C, Laakso M, Lettre G, Madden P, McGue M, North K, Posthuma D, Spector T, Stram D, Tobin MD, Weir DR, Kaprio J, Abecasis GR, Liu DJ, Vrieze S. Exome Chip Meta-analysis Fine Maps Causal Variants and Elucidates the Genetic Architecture of Rare Coding Variants in Smoking and Alcohol Use. Biol Psychiatry 2019; 85:946-955. [PMID: 30679032 PMCID: PMC6534468 DOI: 10.1016/j.biopsych.2018.11.024] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 09/15/2017] [Revised: 11/05/2018] [Accepted: 11/29/2018] [Indexed: 12/21/2022]
Abstract
BACKGROUND Smoking and alcohol use have been associated with common genetic variants in multiple loci. Rare variants within these loci hold promise in the identification of biological mechanisms in substance use. Exome arrays and genotype imputation can now efficiently genotype rare nonsynonymous and loss of function variants. Such variants are expected to have deleterious functional consequences and to contribute to disease risk. METHODS We analyzed ∼250,000 rare variants from 16 independent studies genotyped with exome arrays and augmented this dataset with imputed data from the UK Biobank. Associations were tested for five phenotypes: cigarettes per day, pack-years, smoking initiation, age of smoking initiation, and alcoholic drinks per week. We conducted stratified heritability analyses, single-variant tests, and gene-based burden tests of nonsynonymous/loss-of-function coding variants. We performed a novel fine-mapping analysis to winnow the number of putative causal variants within associated loci. RESULTS Meta-analytic sample sizes ranged from 152,348 to 433,216, depending on the phenotype. Rare coding variation explained 1.1% to 2.2% of phenotypic variance, reflecting 11% to 18% of the total single nucleotide polymorphism heritability of these phenotypes. We identified 171 genome-wide associated loci across all phenotypes. Fine mapping identified putative causal variants with double base-pair resolution at 24 of these loci, and between three and 10 variants for 65 loci. Twenty loci contained rare coding variants in the 95% credible intervals. CONCLUSIONS Rare coding variation significantly contributes to the heritability of smoking and alcohol use. Fine-mapping genome-wide association study loci identifies specific variants contributing to the biological etiology of substance use behavior.
Affiliation(s)
- David M Brazel
- Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado; Department of Molecular, Cellular, and Developmental Biology, University of Colorado Boulder, Boulder, Colorado
- Yu Jiang
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, Pennsylvania
- Jordan M Hughey
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, Pennsylvania
- Valérie Turcot
- Department of Medicine, Faculty of Medicine, Université de Montréal, Montreal, Quebec, Canada; Montreal Heart Institute, Montreal, Quebec, Canada
- Xiaowei Zhan
- Department of Clinical Science, Center for Genetics of Host Defense, University of Texas Southwestern, Dallas, Texas
- Jian Gong
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Chiara Batini
- Department of Health Sciences, University of Leicester, Leicester, United Kingdom
- J Dylan Weissenkampen
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, Pennsylvania
- MengZhen Liu
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota
- Daniel R Barnes
- Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
- Sarah Bertelsen
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York
- Yi-Ling Chou
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri
- Jessica D Faul
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, Michigan
- Jeff Haessler
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Anke R Hammerschlag
- Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, Amsterdam Neuroscience, VU University Amsterdam, University of Amsterdam, Amsterdam, the Netherlands
- Chris Hsu
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California
- Manav Kapoor
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York
- Dongbing Lai
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, Indiana
- Nhung Le
- Department of Medical Microbiology, Immunology and Cell Biology, Southern Illinois University School of Medicine, Springfield, Illinois
- Christiaan A de Leeuw
- Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, Amsterdam Neuroscience, VU University Amsterdam, University of Amsterdam, Amsterdam, the Netherlands
- Anu Loukola
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland; Department of Public Health, University of Helsinki, Helsinki, Finland
- Massimo Mangino
- Department of Twin Research and Genetic Epidemiology, King's College London, London, United Kingdom; National Institute for Health Research Biomedical Research Centre at Guy's and St Thomas' Foundation Trust, London, United Kingdom
- Carl A Melbourne
- Department of Health Sciences, University of Leicester, Leicester, United Kingdom
- Giorgio Pistis
- Istituto di Ricerca Genetica e Biomedica, Consiglio Nazionale delle Ricerche, Monserrato, Italy
- Beenish Qaiser
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland; Department of Public Health, University of Helsinki, Helsinki, Finland
- Rebecca Rohde
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Yaming Shao
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Heather Stringham
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan
- Leah Wetherill
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, Indiana
- Wei Zhao
- Department of Epidemiology, University of Michigan, Ann Arbor, Michigan
- Arpana Agrawal
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri
- Laura Bierut
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri
- Chu Chen
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington; Department of Epidemiology, Head and Neck Surgery Center, University of Washington, Seattle, Washington; Department of Otolaryngology, Head and Neck Surgery Center, University of Washington, Seattle, Washington
- Charles B Eaton
- Department of Family Medicine, Brown University, Providence, Rhode Island
- Alison Goate
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York
- Christopher Haiman
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California
- Andrew Heath
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri
- William G Iacono
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota
- Tinca J Polderman
- Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, Amsterdam Neuroscience, VU University Amsterdam, University of Amsterdam, Amsterdam, the Netherlands
- Alex Reiner
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington; Department of Epidemiology, Head and Neck Surgery Center, University of Washington, Seattle, Washington
- John Rice
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri; Department of Mathematics, Washington University in St. Louis, St. Louis, Missouri
- David Schlessinger
- National Institute on Aging, National Institutes of Health, Bethesda, Maryland
- H Steven Scholte
- Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands; Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, the Netherlands
- Jennifer A Smith
- Department of Epidemiology, University of Michigan, Ann Arbor, Michigan
- Jean-Claude Tardif
- Department of Medicine, Faculty of Medicine, Université de Montréal, Montreal, Quebec, Canada; Montreal Heart Institute, Montreal, Quebec, Canada
- Hilary A Tindle
- Department of Medicine, Vanderbilt University, Nashville, Tennessee
- Andries R van der Leij
- Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands; Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, the Netherlands
- Michael Boehnke
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan
- Jenny Chang-Claude
- Division of Cancer Epidemiology, German Cancer Research Center, Heidelberg, Germany
- Francesco Cucca
- Istituto di Ricerca Genetica e Biomedica, Consiglio Nazionale delle Ricerche, Monserrato, Italy
- Sean P David
- Department of Medicine, Stanford University, Stanford, California
- Tatiana Foroud
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, Indiana
- Joanna M M Howson
- Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
- Sharon L R Kardia
- Department of Epidemiology, University of Michigan, Ann Arbor, Michigan
- Charles Kooperberg
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Markku Laakso
- Department of Internal Medicine, Institute of Clinical Medicine, University of Eastern Finland, Kuopio, Finland
- Guillaume Lettre
- Department of Medicine, Faculty of Medicine, Université de Montréal, Montreal, Quebec, Canada; Montreal Heart Institute, Montreal, Quebec, Canada
- Pamela Madden
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri
- Matt McGue
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota
- Kari North
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Danielle Posthuma
- Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, Amsterdam Neuroscience, VU University Amsterdam, University of Amsterdam, Amsterdam, the Netherlands; Department of Clinical Genetics, VU University Medical Centre, University of Amsterdam, Amsterdam, the Netherlands
- Timothy Spector
- Department of Twin Research and Genetic Epidemiology, King's College London, London, United Kingdom
- Daniel Stram
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California
- Martin D Tobin
- Department of Health Sciences, University of Leicester, Leicester, United Kingdom
- David R Weir
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, Michigan
- Jaakko Kaprio
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland; Department of Public Health, University of Helsinki, Helsinki, Finland
- Gonçalo R Abecasis
- Regeneron Pharmaceuticals, Tarrytown, New York; Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan
- Dajiang J Liu
- Institute of Personalized Medicine, Penn State College of Medicine, Hershey, Pennsylvania
- Scott Vrieze
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota
211
Li Y, Hagen DE, Ji T, Bakhtiarizadeh MR, Frederic WM, Traxler EM, Kalish JM, Rivera RM. Altered microRNA expression profiles in large offspring syndrome and Beckwith-Wiedemann syndrome. Epigenetics 2019; 14:850-876. [PMID: 31144574 PMCID: PMC6691986 DOI: 10.1080/15592294.2019.1615357] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 12/20/2022]
Abstract
The use of assisted reproductive technologies (ART) can induce a congenital overgrowth condition in humans and ruminants, namely Beckwith-Wiedemann syndrome (BWS) and large offspring syndrome (LOS), respectively. Shared phenotypes and epigenotypes have been found between BWS and LOS. We have observed global misregulation of transcripts in bovine foetuses with LOS. microRNAs (miRNAs) are important post-transcriptional gene expression regulators. We hypothesize that there is miRNA misregulation in LOS and that this misregulation is shared with BWS. In this study, small RNA sequencing was conducted to investigate miRNA expression profiles in bovine and human samples. We detected 407 abundant known miRNAs and predicted 196 putative miRNAs from the bovine sequencing results and identified 505 abundant miRNAs in human tongue. Differentially expressed miRNAs (DE-miRNAs) were identified between control and LOS groups in all tissues analysed as well as between BWS and control human samples. DE-miRNAs were detected from several miRNA clusters, including the DLK1-DIO3 imprinted genomic cluster, in LOS and BWS. DNA hypermethylation was associated with downregulation of miRNAs in the DLK1-DIO3 cluster. mRNA targets of the DE-miRNAs were predicted, and signalling pathways associated with control of organ size (including the Hippo signalling pathway), cell proliferation, apoptosis, cell survival, cell cycle, and cell adhesion were found to be enriched with these genes. Yes-associated protein 1 (YAP1) is the core effector of the Hippo signalling pathway, and an increased level of active (non-phosphorylated) YAP1 protein was detected in skeletal muscle of LOS foetuses. Overall, our data provide evidence of miRNA misregulation in LOS and BWS.
Affiliation(s)
- Yahan Li
- Division of Animal Sciences, University of Missouri, Columbia, MO, USA
- Darren Erich Hagen
- Department of Animal and Food Science, Oklahoma State University, Stillwater, OK, USA
- Tieming Ji
- Department of Statistics, University of Missouri, Columbia, MO, USA
- Whitney M Frederic
- Division of Human Genetics, Center for Childhood Cancer Research, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Emily M Traxler
- Division of Human Genetics, Center for Childhood Cancer Research, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Jennifer M Kalish
- Division of Human Genetics, Center for Childhood Cancer Research, The Children's Hospital of Philadelphia, Philadelphia, PA, USA; Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
212
Brant JO, Boatwright JL, Davenport R, Sandoval AGW, Maden M, Barbazuk WB. Comparative transcriptomic analysis of dermal wound healing reveals de novo skeletal muscle regeneration in Acomys cahirinus. PLoS One 2019; 14:e0216228. [PMID: 31141508 PMCID: PMC6541261 DOI: 10.1371/journal.pone.0216228] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 11/30/2018] [Accepted: 04/16/2019] [Indexed: 01/14/2023]
Abstract
The African spiny mouse, Acomys spp., is capable of scar-free dermal wound healing. Here, we have performed a comprehensive analysis of gene expression throughout wound healing following full-thickness excisional dermal wounds in both Acomys cahirinus and Mus musculus. Additionally, we provide an annotated, de novo transcriptome assembly of A. cahirinus skin and skin wounds. Using a novel computational comparative RNA-Seq approach along with pathway and co-expression analyses, we identify enrichment of regeneration-associated genes as well as upregulation of genes directly related to muscle development or function. Our RT-qPCR data reveal induction of the myogenic regulatory factors, as well as upregulation of embryonic myosin, starting between days 14 and 18 post-wounding in A. cahirinus. In contrast, the myogenic regulatory factors remain downregulated, embryonic myosin is only modestly upregulated, and no new muscle fibers of the panniculus carnosus are generated in M. musculus wounds. Additionally, we show that Col6a1, a key component of the satellite cell niche, is upregulated in A. cahirinus compared to M. musculus. Our data also demonstrate that the macrophage profile and inflammatory response differ between species, with A. cahirinus expressing significantly higher levels of Il10. We also demonstrate differential expression of the upstream regulators Wnt7a, Wnt2 and Wnt6 during wound healing. Our analyses demonstrate that A. cahirinus is capable of de novo skeletal muscle regeneration of the panniculus carnosus following removal of the extracellular matrix. We believe this study represents the first detailed analysis of de novo skeletal muscle regeneration observed in an adult mammal.
Affiliation(s)
- Jason O. Brant
- Department of Biology, University of Florida, Gainesville, Florida, United States of America
- J. Lucas Boatwright
- Department of Biology, University of Florida, Gainesville, Florida, United States of America
- Ruth Davenport
- Department of Biology, University of Florida, Gainesville, Florida, United States of America
- Malcolm Maden
- Department of Biology, University of Florida, Gainesville, Florida, United States of America
- Genetics Institute, University of Florida, Gainesville, Florida, United States of America
- Corresponding author
- W. Brad Barbazuk
- Department of Biology, University of Florida, Gainesville, Florida, United States of America
- Genetics Institute, University of Florida, Gainesville, Florida, United States of America
- Corresponding author
213
Ramsauer AS, Kubacki J, Favrot C, Ackermann M, Fraefel C, Tobler K. RNA-seq analysis in equine papillomavirus type 2-positive carcinomas identifies affected pathways and potential cancer markers as well as viral gene expression and splicing events. J Gen Virol 2019; 100:985-998. [PMID: 31084699 DOI: 10.1099/jgv.0.001267] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 02/07/2023]
Abstract
Equine papillomavirus type 2 (EcPV2) was discovered only recently, but it is found consistently in the context of genital squamous cell carcinomas (SCCs). Since neither cell cultures nor animal models exist, the characterization of this potential disease agent relies on the analysis of patient materials. To analyse the host and viral transcriptome in EcPV2-affected horses, genital tissue samples were collected from horses with EcPV2-positive lesions as well as from healthy EcPV2-negative horses. It was determined by RNA-seq analysis that there were 1957 differentially expressed (DE) host genes between the SCC and control samples. These genes were most abundantly related to DNA replication, cell cycle, extracellular matrix (ECM)-receptor interaction and focal adhesion. By comparison to other cancer studies, MMP1 and IL8 appeared to be potential marker genes for the development of SCCs. Analysis of the viral reads revealed the transcriptional activity of EcPV2 in all SCC samples. While few reads mapped to the structural viral genes, the majority of reads mapped to the non-structural early (E) genes, in particular to E6, E7 and E2/E4. Within these reads a distinct pattern of splicing events, which are essential for the expression of different genes in PV infections, was observed. Additionally, in one sample the integration of EcPV2 DNA into the host genome was detected by DNA-seq and confirmed by PCR. In conclusion, while host MMP1 and IL8 expression and the presence of EcPV2 may be useful markers in genital SCCs, further research on EcPV2-related pathomechanisms may focus on cell cycle-related genes, the viral genes E6, E7 and E2/E4, and integration events.
Affiliation(s)
- Anna Sophie Ramsauer
- Dermatology Department, Vetsuisse Faculty, University of Zurich, Zurich, Switzerland; Institute of Virology, Vetsuisse Faculty, University of Zurich, Zurich, Switzerland
- Jakub Kubacki
- Institute of Virology, Vetsuisse Faculty, University of Zurich, Zurich, Switzerland
- Claude Favrot
- Dermatology Department, Vetsuisse Faculty, University of Zurich, Zurich, Switzerland
- Mathias Ackermann
- Institute of Virology, Vetsuisse Faculty, University of Zurich, Zurich, Switzerland
- Cornel Fraefel
- Institute of Virology, Vetsuisse Faculty, University of Zurich, Zurich, Switzerland
- Kurt Tobler
- Institute of Virology, Vetsuisse Faculty, University of Zurich, Zurich, Switzerland
214
Vignal A, Boitard S, Thébault N, Dayo GK, Yapi-Gnaore V, Youssao Abdou Karim I, Berthouly-Salazar C, Pálinkás-Bodzsár N, Guémené D, Thibaud-Nissen F, Warren WC, Tixier-Boichard M, Rognon X. A guinea fowl genome assembly provides new evidence on evolution following domestication and selection in galliformes. Mol Ecol Resour 2019; 19:997-1014. [PMID: 30945415 PMCID: PMC6579635 DOI: 10.1111/1755-0998.13017] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/18/2018] [Revised: 03/19/2019] [Accepted: 03/25/2019] [Indexed: 01/25/2023]
Abstract
The helmeted guinea fowl Numida meleagris belongs to the order Galliformes. Its natural range includes a large part of sub‐Saharan Africa, from Senegal to Eritrea and from Chad to South Africa. Archaeozoological and artistic evidence suggest domestication of this species may have occurred about 2,000 years BP in Mali and Sudan, primarily as a food resource. Villagers also benefit from its capacity to give loud alarm calls in case of danger and from its ability to consume parasites such as ticks and to hunt snakes, which suggests that its domestication may have resulted from a commensal association process. Today, it is still farmed in Africa, mainly as a traditional village poultry, and is also bred more intensively in other countries, mainly France and Italy. The lack of available molecular genetic markers has limited the genetic studies conducted to date on guinea fowl. We present here a first‐generation whole‐genome sequence draft assembly used as a reference for a study by a Pool‐seq approach of wild and domestic populations from Europe and Africa. We show that the domestic populations are genetically more similar to each other than to wild populations living in the same geographical area. Several genomic regions showing selection signatures putatively related to domestication or importation to Europe were detected, containing candidate genes, most notably EDNRB2, possibly explaining losses in plumage coloration phenotypes in domesticated populations.
Affiliation(s)
- Alain Vignal
- GenPhySE, INRA, INPT, INP-ENVT, Université de Toulouse, Castanet Tolosan, France
- Simon Boitard
- GenPhySE, INRA, INPT, INP-ENVT, Université de Toulouse, Castanet Tolosan, France
- Noémie Thébault
- GenPhySE, INRA, INPT, INP-ENVT, Université de Toulouse, Castanet Tolosan, France
- Francoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland
- Wesley C Warren
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri; Bond Life Sciences Center, University of Missouri, Columbia, Missouri
- Xavier Rognon
- GABI, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France
215
Schaffer LV, Millikin RJ, Miller RM, Anderson LC, Fellers RT, Ge Y, Kelleher NL, LeDuc RD, Liu X, Payne SH, Sun L, Thomas PM, Tucholski T, Wang Z, Wu S, Wu Z, Yu D, Shortreed MR, Smith LM. Identification and Quantification of Proteoforms by Mass Spectrometry. Proteomics 2019; 19:e1800361. [PMID: 31050378 PMCID: PMC6602557 DOI: 10.1002/pmic.201800361] [Citation(s) in RCA: 128] [Impact Index Per Article: 25.6] [Received: 01/28/2019] [Revised: 04/07/2019] [Indexed: 12/29/2022]
Abstract
A proteoform is a defined form of a protein derived from a given gene with a specific amino acid sequence and localized post-translational modifications. In top-down proteomic analyses, proteoforms are identified and quantified through mass spectrometric analysis of intact proteins. Recent technological developments have enabled comprehensive proteoform analyses in complex samples, and an increasing number of laboratories are adopting top-down proteomic workflows. In this review, some recent advances are outlined and current challenges and future directions for the field are discussed.
Affiliation(s)
- Leah V Schaffer
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Robert J Millikin
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Rachel M Miller
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Lissa C Anderson
- Ion Cyclotron Resonance Program, National High Magnetic Field Laboratory, Tallahassee, FL, 32310, USA
- Ryan T Fellers
- Proteomics Center of Excellence, Northwestern University, Evanston, IL, 60208, USA
- Ying Ge
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Department of Cell and Regenerative Biology and Human Proteomics Program, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Neil L Kelleher
- Proteomics Center of Excellence, Northwestern University, Evanston, IL, 60208, USA
- Department of Chemistry and Molecular Biosciences and the Division of Hematology and Oncology, Northwestern University, Evanston, IL, 60208, USA
- Richard D LeDuc
- Proteomics Center of Excellence, Northwestern University, Evanston, IL, 60208, USA
- Xiaowen Liu
- Department of BioHealth Informatics, Indiana University-Purdue University, Indianapolis, IN, 46202, USA
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Samuel H Payne
- Department of Biology, Brigham Young University, Provo, UT, 84602
- Liangliang Sun
- Department of Chemistry, Michigan State University, East Lansing, MI, 48824, USA
- Paul M Thomas
- Proteomics Center of Excellence, Northwestern University, Evanston, IL, 60208, USA
- Trisha Tucholski
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Zhe Wang
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Si Wu
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Zhijie Wu
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Dahang Yu
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Michael R Shortreed
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Lloyd M Smith
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, 53706, USA
216
Kucherov G. Evolution of biosequence search algorithms: a brief survey. Bioinformatics 2019; 35:3547-3552. [DOI: 10.1093/bioinformatics/btz272] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 12/20/2018] [Revised: 04/01/2019] [Accepted: 04/11/2019] [Indexed: 11/14/2022]
Abstract
Motivation
Although modern high-throughput biomolecular technologies produce various types of data, biosequence data remain at the core of bioinformatic analyses. However, computational techniques for dealing with these data have evolved dramatically.
Results
In this bird’s-eye review, we overview the evolution of the main algorithmic techniques for comparing and searching biological sequences. We highlight key algorithmic ideas that emerged in response to several interconnected factors: shifts in the biological analytical paradigm, the advent of new sequencing technologies and a substantial increase in the size of the available data. We discuss the expansion of alignment-free techniques that are coming to replace alignment-based algorithms in large-scale analyses. We further emphasize recently emerged and growing applications of sketching methods, which support comparison of massive datasets such as metagenomics samples. Finally, we focus on the transition to population genomics and outline associated algorithmic challenges.
Affiliation(s)
- Gregory Kucherov
- CNRS and LIGM/University of Paris-Est, Marne-la-Vallée, France
- SkolTech, Moscow, Russia
217
Lin YL, Gokcumen O. Fine-Scale Characterization of Genomic Structural Variation in the Human Genome Reveals Adaptive and Biomedically Relevant Hotspots. Genome Biol Evol 2019; 11:1136-1151. [PMID: 30887040 PMCID: PMC6475128 DOI: 10.1093/gbe/evz058] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Accepted: 03/16/2019] [Indexed: 12/25/2022]
Abstract
Genomic structural variants (SVs) are distributed nonrandomly across the human genome. The "hotspots" of SVs have been implicated in evolutionary innovations, as well as medical conditions. However, the evolutionary and biomedical features of these hotspots remain incompletely understood. Here, we analyzed data from 2,504 genomes to construct a refined map of 1,148 SV hotspots in human genomes. We confirmed that segmental duplication-related nonallelic homologous recombination is an important mechanistic driver of SV hotspot formation. However, to our surprise, we also found that a majority of SVs in hotspots do not form through such recombination-based mechanisms, suggesting diverse mechanistic and selective forces shaping hotspots. Indeed, our evolutionary analyses showed that the majority of SV hotspots are within gene-poor regions and evolve under relaxed negative selection or neutrality. However, we still found a small subset of SV hotspots harboring genes that are enriched for anthropologically crucial functions and evolve under geography-specific and balancing adaptive forces. These include two independent hotspots on different chromosomes affecting alpha and beta hemoglobin gene clusters. Biomedically, we found that the SV hotspots coincide with breakpoints of clinically relevant, large de novo SVs, significantly more often than genome-wide expectations. For example, we showed that the breakpoints of multiple large SVs, which lead to idiopathic short stature, coincide with SV hotspots. Therefore, the mutational instability in SV hotspots likely enables chromosomal breaks that lead to pathogenic structural variation formations. Overall, our study contributes to a better understanding of the mutational and adaptive landscape of the genome.
Affiliation(s)
- Yen-Lung Lin
- Department of Biological Sciences, University at Buffalo
- Omer Gokcumen
- Department of Biological Sciences, University at Buffalo
- Corresponding author
218
Batley KC, Sandoval‐Castillo J, Kemper CM, Attard CRM, Zanardo N, Tomo I, Beheregaray LB, Möller LM. Genome-wide association study of an unusual dolphin mortality event reveals candidate genes for susceptibility and resistance to cetacean morbillivirus. Evol Appl 2019; 12:718-732. [PMID: 30976305 PMCID: PMC6439501 DOI: 10.1111/eva.12747] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/12/2018] [Revised: 11/25/2018] [Accepted: 11/27/2018] [Indexed: 12/28/2022]
Abstract
Infectious diseases are significant demographic and evolutionary drivers of populations, but studies about the genetic basis of disease resistance and susceptibility are scarce in wildlife populations. Cetacean morbillivirus (CeMV) is a highly contagious disease that is increasing in both geographic distribution and incidence, causing unusual mortality events (UME) and killing tens of thousands of individuals across multiple cetacean species worldwide since the late 1980s. The largest CeMV outbreak in the Southern Hemisphere reported to date occurred in Australia in 2013, where it was a major factor in a UME, killing mainly young Indo-Pacific bottlenose dolphins (Tursiops aduncus). Using cases (nonsurvivors) and controls (putative survivors) from the most affected population, we carried out a genome-wide association study to identify candidate genes for resistance and susceptibility to CeMV. The genomic data set consisted of 278,147,988 sequence reads and 35,493 high-quality SNPs genotyped across 38 individuals. Association analyses found highly significant differences in allele and genotype frequencies among cases and controls at 65 SNPs, and Random Forests conservatively identified eight as candidates. Annotation of these SNPs identified five candidate genes (MAPK8, FBXW11, INADL, ANK3 and ACOX3) with functions associated with stress, pain and immune responses. Our findings provide the first insights into the genetic basis of host defence to this highly contagious disease, enabling the development of an applied evolutionary framework to monitor CeMV resistance across cetacean species. Biomarkers could now be established to assess potential risk factors associated with these genes in other CeMV-affected cetacean populations and species. These results could also possibly aid in the advancement of vaccines against morbilliviruses.
Affiliation(s)
- Kimberley C. Batley
- Molecular Ecology Laboratory, College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
- Cetacean Ecology, Behaviour, and Evolution Laboratory, College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
- Jonathan Sandoval‐Castillo
- Molecular Ecology Laboratory, College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
- Catherine R. M. Attard
- Molecular Ecology Laboratory, College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
- Cetacean Ecology, Behaviour, and Evolution Laboratory, College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
- Nikki Zanardo
- Molecular Ecology Laboratory, College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
- Cetacean Ecology, Behaviour, and Evolution Laboratory, College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
- Ikuko Tomo
- South Australian Museum, Adelaide, South Australia, Australia
- Luciano B. Beheregaray
- Molecular Ecology Laboratory, College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
- Luciana M. Möller
- Molecular Ecology Laboratory, College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
- Cetacean Ecology, Behaviour, and Evolution Laboratory, College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
|
219
|
Hao Y, Zhang L, Niu Y, Cai T, Luo J, He S, Zhang B, Zhang D, Qin Y, Yang F, Chen R. SmProt: a database of small proteins encoded by annotated coding and non-coding RNA loci. Brief Bioinform 2019; 19:636-643. [PMID: 28137767 DOI: 10.1093/bib/bbx005] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 11/22/2016] [Indexed: 11/12/2022] Open
Abstract
"Small proteins" is the general term for proteins shorter than 100 amino acids. Identification and functional studies of small proteins have advanced rapidly in recent years, and several studies have shown that small proteins play important roles in diverse functions including development, muscle contraction and DNA repair. Identification and characterization of previously unrecognized small proteins may contribute in important ways to cell biology and human health. Current databases are generally somewhat deficient in that they have either not collected small proteins systematically, or contain only predictions of small proteins in a limited number of tissues and species. Here, we present a specifically designed web-accessible database of small proteins (SmProt, http://bioinfo.ibp.ac.cn/SmProt). The current release of SmProt incorporates 255 010 small proteins computationally or experimentally identified in 291 cell lines/tissues derived from eight popular species. The database provides a variety of data including basic information (sequence, location, gene name, organism, etc.) as well as specific information (experiment, function, disease type, etc.). To facilitate data extraction, SmProt supports multiple search options, including species, genome location, gene names and their aliases, cell lines/tissues, ORF type, gene type, PubMed ID and SmProt ID. SmProt also incorporates a BLAST alignment search service and provides a local UCSC Genome Browser. Additionally, SmProt defines a high-confidence set of small proteins and predicts the functions of the small proteins.
Affiliation(s)
- Yajing Hao
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Lili Zhang
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Yiwei Niu
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Tanxi Cai
- Key Laboratory of Protein and Peptide Pharmaceuticals and Laboratory of Proteomics, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Jianjun Luo
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Shunmin He
- Key Laboratory of the Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Bao Zhang
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Dejiu Zhang
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Yan Qin
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Fuquan Yang
- Key Laboratory of Protein and Peptide Pharmaceuticals and Laboratory of Proteomics, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Runsheng Chen
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
|
220
|
Zhou W, Xu Y, Lv Q, Sheng YH, Chen L, Li M, Shen L, Huai C, Yi Z, Cui D, Qin S. Genetic Association of Olanzapine Treatment Response in Han Chinese Schizophrenia Patients. Front Pharmacol 2019; 10:177. [PMID: 30886581 PMCID: PMC6409308 DOI: 10.3389/fphar.2019.00177] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 09/03/2018] [Accepted: 02/11/2019] [Indexed: 12/30/2022] Open
Abstract
Olanzapine, a second-generation antipsychotic medication, plays a critical role in current treatment of schizophrenia (SCZ). Responses to olanzapine in schizophrenia treatment differ across individuals. However, predicting this individual-specific response requires in-depth knowledge of biomarkers of drug response. Here, we performed an integrative investigation of 238 Han Chinese SCZ patients to identify predictive biomarkers associated with the efficacy of olanzapine treatment. This study applied HaloPlex technology to sequence 143 genes from 79 Han Chinese SCZ patients. Our results suggested that 12 single nucleotide polymorphisms (SNPs) were significantly associated with olanzapine response in Han Chinese SCZ patients. Using the MassARRAY platform, we tested whether these 12 SNPs were also statistically significant in 159 other SCZ patients (an independent cohort) and in the combined 238 SCZ patients (composed of the two tested cohorts). This analysis showed that 2 SNPs were significantly associated with olanzapine response in both the independent cohort (rs324026, P = 0.023; rs12610827, P = 0.043) and the combined SCZ patient population (rs324026, adjusted P = 0.014; rs12610827, adjusted P = 0.012). Our study provides systematic analyses of genetic variants associated with olanzapine response in Han Chinese SCZ patients. The discovery of these novel biomarkers of olanzapine response will help advance future olanzapine treatment specific to Han Chinese SCZ patients.
Affiliation(s)
- Wei Zhou
- Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Bio-X Institutes, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Yong Xu
- Department of Psychiatry, First Hospital of Shanxi Medical University, Taiyuan, China
| | - Qinyu Lv
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | | | - Luan Chen
- Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Bio-X Institutes, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Mo Li
- Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Bio-X Institutes, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Lu Shen
- Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Bio-X Institutes, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Cong Huai
- Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Bio-X Institutes, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Zhenghui Yi
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Donghong Cui
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China.,Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
| | - Shengying Qin
- Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Bio-X Institutes, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China.,The Third Affiliated Hospital, Guangzhou Medical University, Guangzhou, China
|
221
|
A comprehensive overview of common polymorphic variants that cause missense mutations in human CYPs and UGTs. Biomed Pharmacother 2019; 111:983-992. [DOI: 10.1016/j.biopha.2019.01.024] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 11/01/2018] [Revised: 01/06/2019] [Accepted: 01/06/2019] [Indexed: 01/07/2023] Open
|
222
|
Koutelou E, Wang L, Schibler AC, Chao HP, Kuang X, Lin K, Lu Y, Shen J, Jeter CR, Salinger A, Wilson M, Chen YC, Atanassov BS, Tang DG, Dent SYR. USP22 controls multiple signaling pathways that are essential for vasculature formation in the mouse placenta. Development 2019; 146:dev.174037. [PMID: 30718289 DOI: 10.1242/dev.174037] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 11/28/2018] [Accepted: 01/24/2019] [Indexed: 12/14/2022]
Abstract
USP22, a component of the SAGA complex, is overexpressed in highly aggressive cancers, but the normal functions of this deubiquitinase are not well defined. We determined that loss of USP22 in mice results in embryonic lethality due to defects in extra-embryonic placental tissues and failure to establish proper vascular interactions with the maternal circulatory system. These phenotypes arise from abnormal gene expression patterns that reflect defective kinase signaling, including TGFβ and several receptor tyrosine kinase pathways. USP22 deletion in endothelial cells and pericytes that are induced from embryonic stem cells also hinders these signaling cascades, with detrimental effects on cell survival and differentiation as well as on the ability to form vessels. Our findings provide new insights into the functions of USP22 during development that may offer clues to its role in disease states.
Affiliation(s)
- Evangelia Koutelou
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA
| | - Li Wang
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Program in Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.,MD Anderson UTHealth Graduate School of Biomedical Sciences, University of Texas, Houston, TX 77030, USA
| | - Andria C Schibler
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,MD Anderson UTHealth Graduate School of Biomedical Sciences, University of Texas, Houston, TX 77030, USA.,Program in Genes and Development, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Hsueh-Ping Chao
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Program in Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.,MD Anderson UTHealth Graduate School of Biomedical Sciences, University of Texas, Houston, TX 77030, USA
| | - Xianghong Kuang
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA
| | - Kevin Lin
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA
| | - Yue Lu
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA
| | - Jianjun Shen
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,MD Anderson UTHealth Graduate School of Biomedical Sciences, University of Texas, Houston, TX 77030, USA
| | - Collene R Jeter
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA
| | - Andrew Salinger
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA
| | - Marenda Wilson
- The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Yi Chun Chen
- MD Anderson UTHealth Graduate School of Biomedical Sciences, University of Texas, Houston, TX 77030, USA.,Program in Genes and Development, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.,The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Boyko S Atanassov
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA
| | - Dean G Tang
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA
| | - Sharon Y R Dent
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,Center for Cancer Epigenetics, University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA.,MD Anderson UTHealth Graduate School of Biomedical Sciences, University of Texas, Houston, TX 77030, USA
|
223
|
Scarpati M, Qi Y, Govind S, Singh S. A combined computational strategy of sequence and structural analysis predicts the existence of a functional eicosanoid pathway in Drosophila melanogaster. PLoS One 2019; 14:e0211897. [PMID: 30753230 PMCID: PMC6372189 DOI: 10.1371/journal.pone.0211897] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 07/24/2018] [Accepted: 01/22/2019] [Indexed: 02/07/2023] Open
Abstract
This study reports on a putative eicosanoid biosynthesis pathway in Drosophila melanogaster and challenges the currently held view that mechanistic routes to synthesize eicosanoid or eicosanoid-like biolipids do not exist in insects, since to date, putative fly homologs of most mammalian enzymes have not been identified. Here we use systematic and comprehensive bioinformatics approaches to identify most of the mammalian eicosanoid synthesis enzymes. Sensitive sequence analysis techniques identified candidate Drosophila enzymes that share low global sequence identities with their human counterparts. Twenty Drosophila candidates were selected based upon (a) sequence identity with human enzymes of the cyclooxygenase and lipoxygenase branches, (b) similar domain architecture and structural conservation of the catalytic domain, and (c) presence of potentially equivalent functional residues. Evaluation of full-length structural models for these 20 top-scoring Drosophila candidates revealed a surprising degree of conservation in their overall folds and potential analogs for functional residues in all 20 enzymes. Although we were unable to identify any suitable candidate for lipoxygenase enzymes, we report structural homology models of three fly cyclooxygenases. Our findings predict that the D. melanogaster genome likely codes for one or more pathways for eicosanoid or eicosanoid-like biolipid synthesis. Our study suggests that classical and/or novel eicosanoid mediators must regulate biological functions in insects, predictions that can be tested with the power of Drosophila genetics. Such experimental analysis of eicosanoid biology in a simple model organism will have high relevance to human development and health.
Affiliation(s)
- Michael Scarpati
- Brooklyn College of the City University of New York, Brooklyn, New York, United States of America
- PhD program in Biology, Graduate Center of the City University of New York, New York, New York, United States of America
| | - Yan Qi
- Brooklyn College of the City University of New York, Brooklyn, New York, United States of America
- PhD program in Biology, Graduate Center of the City University of New York, New York, New York, United States of America
| | - Shubha Govind
- PhD program in Biology, Graduate Center of the City University of New York, New York, New York, United States of America
- PhD program in Biochemistry, Graduate Center of the City University of New York, New York, New York, United States of America
- The City College of the City University of New York, New York, New York, United States of America
| | - Shaneen Singh
- Brooklyn College of the City University of New York, Brooklyn, New York, United States of America
- PhD program in Biology, Graduate Center of the City University of New York, New York, New York, United States of America
- PhD program in Biochemistry, Graduate Center of the City University of New York, New York, New York, United States of America
|
224
|
Cantone M, Küspert M, Reiprich S, Lai X, Eberhardt M, Göttle P, Beyer F, Azim K, Küry P, Wegner M, Vera J. A gene regulatory architecture that controls region-independent dynamics of oligodendrocyte differentiation. Glia 2019; 67:825-843. [PMID: 30730593 DOI: 10.1002/glia.23569] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 04/20/2018] [Revised: 10/16/2018] [Accepted: 10/25/2018] [Indexed: 12/18/2022]
Abstract
Oligodendrocytes (OLs) facilitate information processing in the vertebrate central nervous system via axonal ensheathment. The structure and dynamics of the regulatory network that mediates oligodendrogenesis are poorly understood. We employed bioinformatics and meta-analysis of high-throughput datasets to reconstruct a regulatory network underpinning OL differentiation. From this network, we identified families of feedforward loops comprising the transcription factors (TFs) Olig2, Sox10, and Tcf7l2 and their targets. Among the targets, we found eight other TFs related to OL differentiation, suggesting a hierarchical architecture in which some TFs (Olig2, Sox10, and Tcf7l2) regulate via feedforward loops the expression of others (Sox2, Sox6, Sox11, Nkx2-2, Nkx6-2, Hes5, Myt1, and Myrf). Model simulations with a kinetic model reproduced the mechanisms of OL differentiation only when in the model, Sox10-mediated repression of Tcf7l2 by miR-338/miR-155 was introduced, a prediction confirmed in genetic functional experiments. Additional model simulations suggested that OLs from dorsal regions emerge through BMP/Sox9 signaling.
Affiliation(s)
- Martina Cantone
- Laboratory of Systems Tumor Immunology, Hautklinik, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.,Faculty of Mechanical Engineering, Specialty Division for Systems Biotechnology, Technische Universität München, Munich, Germany
| | - Melanie Küspert
- Institut für Biochemie, Emil-Fischer-Zentrum, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
| | - Simone Reiprich
- Institut für Biochemie, Emil-Fischer-Zentrum, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
| | - Xin Lai
- Laboratory of Systems Tumor Immunology, Hautklinik, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Martin Eberhardt
- Laboratory of Systems Tumor Immunology, Hautklinik, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Peter Göttle
- Neuroregeneration, Department of Neurology, Medical Faculty, Heinrich-Heine University, Düsseldorf, Germany
| | - Felix Beyer
- Neuroregeneration, Department of Neurology, Medical Faculty, Heinrich-Heine University, Düsseldorf, Germany
| | - Kasum Azim
- Neuroregeneration, Department of Neurology, Medical Faculty, Heinrich-Heine University, Düsseldorf, Germany
| | - Patrick Küry
- Neuroregeneration, Department of Neurology, Medical Faculty, Heinrich-Heine University, Düsseldorf, Germany
| | - Michael Wegner
- Institut für Biochemie, Emil-Fischer-Zentrum, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
| | - Julio Vera
- Laboratory of Systems Tumor Immunology, Hautklinik, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
|
225
|
Li J, Rettedal EA, van der Helm E, Ellabaan M, Panagiotou G, Sommer MOA. Antibiotic Treatment Drives the Diversification of the Human Gut Resistome. Genomics Proteomics Bioinformatics 2019; 17:39-51. [PMID: 31026582 PMCID: PMC6520913 DOI: 10.1016/j.gpb.2018.12.003] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 05/03/2018] [Revised: 10/10/2018] [Accepted: 12/17/2018] [Indexed: 01/13/2023]
Abstract
Despite the documented antibiotic-induced disruption of the gut microbiota, the impact of antibiotic intake on strain-level dynamics, evolution of resistance genes, and factors influencing resistance dissemination potential remains poorly understood. To address this gap we analyzed public metagenomic datasets from 24 antibiotic treated subjects and controls, combined with an in-depth prospective functional study with two subjects investigating the bacterial community dynamics based on cultivation-dependent and independent methods. We observed that short-term antibiotic treatment shifted and diversified the resistome composition, increased the average copy number of antibiotic resistance genes, and altered the dominant strain genotypes in an individual-specific manner. More than 30% of the resistance genes underwent strong differentiation at the single nucleotide level during antibiotic treatment. We found that the increased potential for horizontal gene transfer, due to antibiotic administration, was ∼3-fold stronger in the differentiated resistance genes than the non-differentiated ones. This study highlights how antibiotic treatment has individualized impacts on the resistome and strain level composition, and drives the adaptive evolution of the gut microbiota.
Affiliation(s)
- Jun Li
- Department of Infectious Diseases and Public Health, College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong Special Administrative Region, China; School of Data Science, City University of Hong Kong, Hong Kong Special Administrative Region, China
| | | | - Eric van der Helm
- Novo Nordisk Foundation Center for Biosustainability, DK-2900 Hørsholm, Denmark
| | - Mostafa Ellabaan
- Novo Nordisk Foundation Center for Biosustainability, DK-2900 Hørsholm, Denmark
| | - Gianni Panagiotou
- Systems Biology and Bioinformatics Unit, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, 07745 Jena, Germany; Systems Biology and Bioinformatics Group, School of Biological Sciences, Faculty of Sciences, The University of Hong Kong, Hong Kong Special Administrative Region, China; Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, China.
| | - Morten O A Sommer
- Novo Nordisk Foundation Center for Biosustainability, DK-2900 Hørsholm, Denmark.
|
226
|
Nono AD, Chen K, Liu X. Comparison of different functional prediction scores using a gene-based permutation model for identifying cancer driver genes. BMC Med Genomics 2019; 12:22. [PMID: 30704472 PMCID: PMC6357357 DOI: 10.1186/s12920-018-0452-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/27/2022] Open
Abstract
Background: Identifying cancer driver genes (CDGs) is a crucial step in cancer genomics toward the advancement of precision medicine. However, driver gene discovery is a very challenging task, because we are not only dealing with a huge amount of data but are also faced with the complexity of the disease, including the heterogeneity of the background somatic mutation rate in each cancer patient. It is generally accepted that CDGs harbor variants that confer a growth advantage on the malignant cell, are positively selected and are critical to cancer development, whereas non-driver genes harbor random mutations with no functional consequence for cancer. Based on this fact, function prediction based approaches for identifying CDGs have been proposed to interrogate the distribution of functional predictions among mutations in cancer genomes (eLS 1–16, 2016). Assuming most of the observed mutations are passenger mutations and given the quantitative predictions for the functional impact of the mutations, genes enriched for functional or deleterious mutations are more likely to be drivers. These methods have been continually refined and can therefore be applied to increase accuracy in detecting new candidate CDGs. However, current function prediction based approaches focus only on coding mutations and lack a systematic way to pick the best mutation deleteriousness prediction algorithms. Results: In this study, we propose a new function prediction based approach to discover CDGs through a gene-based permutation approach. Our method not only covers both coding and non-coding regions of the genes but also accounts for the heterogeneous mutational context in a cohort of cancer patients. The permutation model was implemented independently using seven popular deleteriousness prediction scores covering splicing regions (SPIDEX), coding regions (MetaLR and VEST3) and pan-genome (CADD, DANN, Fathmm-MKL coding and Fathmm-MKL noncoding). We applied this new approach to somatic single nucleotide variants (SNVs) from whole-genome sequences of 119 breast and 24 lung cancer patients and compared the seven deleteriousness prediction scores for their performance in this study. Conclusion: The new function prediction based approach predicted not only known cancer genes listed in the Cancer Gene Census (CGC) but also new candidate CDGs that are worth further investigation. The results showed the advantage of utilizing pan-genome deleteriousness prediction scores in function prediction based methods. Although the VEST3 score, a deleteriousness prediction score for missense mutations, has the best performance in breast cancer, it was topped by CADD and Fathmm-MKL coding, two pan-genome deleteriousness prediction scores, in lung cancer. Electronic supplementary material: The online version of this article (10.1186/s12920-018-0452-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Alice Djotsa Nono
- Human Genetics Center, UTHealth School of Public Health, Houston, TX, USA
| | - Ken Chen
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Xiaoming Liu
- Human Genetics Center, UTHealth School of Public Health, Houston, TX, USA.,Present Address: USF Genomics, College of Public Health, University of South Florida, Tampa, FL, USA.
|
227
|
Jiang Y, Qian F, Bai X, Liu Y, Wang Q, Ai B, Han X, Shi S, Zhang J, Li X, Tang Z, Pan Q, Wang Y, Wang F, Li C. SEdb: a comprehensive human super-enhancer database. Nucleic Acids Res 2019; 47:D235-D243. [PMID: 30371817 PMCID: PMC6323980 DOI: 10.1093/nar/gky1025] [Citation(s) in RCA: 132] [Impact Index Per Article: 26.4] [Received: 08/10/2018] [Revised: 10/12/2018] [Accepted: 10/17/2018] [Indexed: 12/21/2022] Open
Abstract
Super-enhancers are important for controlling and defining the expression of cell-specific genes. With research on human disease and biological processes, human H3K27ac ChIP-seq datasets are accumulating rapidly, creating an urgent need to collect and process these data comprehensively and efficiently. More importantly, many studies have shown that super-enhancer-associated single nucleotide polymorphisms (SNPs) and transcription factors (TFs) strongly influence human disease and biological processes. Here, we developed a comprehensive human super-enhancer database (SEdb, http://www.licpathway.net/sedb) that aims to provide a large number of available resources on human super-enhancers. The database is annotated with potential functions of super-enhancers in gene regulation. The current version of SEdb documents a total of 331 601 super-enhancers from 542 samples. Notably, unlike existing super-enhancer databases, we manually curated and classified 410 available H3K27ac samples from >2000 ChIP-seq samples from NCBI GEO/SRA. Furthermore, SEdb provides detailed genetic and epigenetic annotation information on super-enhancers. This information includes common SNPs, motif changes, expression quantitative trait loci (eQTLs), risk SNPs, transcription factor binding sites (TFBSs), CRISPR/Cas9 target sites and DNase I hypersensitivity sites (DHSs) for in-depth analyses of super-enhancers. SEdb will help elucidate super-enhancer-related functions and find potential biological effects.
Affiliation(s)
- Yong Jiang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Fengcui Qian
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Xuefeng Bai
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Yuejuan Liu
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Qiuyu Wang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Bo Ai
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Xiaole Han
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Shanshan Shi
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Jian Zhang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Xuecang Li
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Zhidong Tang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Qi Pan
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Yuezhu Wang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Fan Wang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Chunquan Li
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| |
|
228
|
Sugino K, Clark E, Schulmann A, Shima Y, Wang L, Hunt DL, Hooks BM, Tränkner D, Chandrashekar J, Picard S, Lemire AL, Spruston N, Hantman AW, Nelson SB. Mapping the transcriptional diversity of genetically and anatomically defined cell populations in the mouse brain. eLife 2019; 8:38619. [PMID: 30977723 PMCID: PMC6499542 DOI: 10.7554/elife.38619] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 07/20/2018] [Accepted: 04/11/2019] [Indexed: 01/27/2023] Open
Abstract
Understanding the principles governing neuronal diversity is a fundamental goal for neuroscience. Here, we provide an anatomical and transcriptomic database of nearly 200 genetically identified cell populations. By separately analyzing the robustness and pattern of expression differences across these cell populations, we identify two gene classes contributing distinctly to neuronal diversity. Short homeobox transcription factors distinguish neuronal populations combinatorially, and exhibit extremely low transcriptional noise, enabling highly robust expression differences. Long neuronal effector genes, such as channels and cell adhesion molecules, contribute disproportionately to neuronal diversity, based on their patterns rather than robustness of expression differences. By linking transcriptional identity to genetic strains and anatomical atlases, we provide an extensive resource for further investigation of mouse neuronal cell types.
Affiliation(s)
- Ken Sugino
- Janelia Research Campus, Ashburn, United States
| | | | | | | | - Lihua Wang
- Janelia Research Campus, Ashburn, United States
| | | | | | | | | | | | | | | | | | | |
|
229
|
Vandamme T, Beyens M, Boons G, Schepers A, Kamp K, Biermann K, Pauwels P, De Herder WW, Hofland LJ, Peeters M, Van Camp G, Op de Beeck K. Hotspot DAXX, PTCH2 and CYFIP2 mutations in pancreatic neuroendocrine neoplasms. Endocr Relat Cancer 2019; 26:1-12. [PMID: 30021865 DOI: 10.1530/erc-18-0120] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 07/16/2018] [Accepted: 07/18/2018] [Indexed: 12/20/2022]
Abstract
Mutations in DAXX/ATRX, MEN1 and genes involved in the phosphoinositide-3-kinase/Akt/mammalian target of rapamycin (PI3K/Akt/mTOR) pathway have been implicated in pancreatic neuroendocrine neoplasms (pNENs). However, mainly mutations present in the majority of tumor cells have been identified, while proliferation-driving mutations could be present only in small fractions of the tumor. This study aims to identify high- and low-abundance mutations in pNENs using ultra-deep targeted resequencing. Formalin-fixed paraffin-embedded matched tumor-normal tissue of 38 well-differentiated pNENs was sequenced using a HaloPlex targeted resequencing panel. Novel amplicon-based algorithms were used to identify both single nucleotide variants (SNVs) and insertion-deletions (indels) present in >10% of reads (high abundance) and in <10% of reads (low abundance). Identified variants were validated by Sanger sequencing. Sequencing resulted in 416,711,794 reads with an average target base coverage of 2663 ± 1476. Across all samples, 32 high-abundance somatic, 3 germline and 30 low-abundance mutations were retained after filtering and validation. Overall, 92% of high-abundance and 84% of low-abundance mutations were predicted to be protein damaging. Frequently mutated genes were MEN1, DAXX, ATRX, TSC2, and PI3K/Akt/mTOR and MAPK-ERK pathway-related genes. Additionally, recurrent alterations at the same genomic position, so-called hotspot mutations, were found in DAXX, PTCH2 and CYFIP2. This first ultra-deep sequencing study highlighted genetic intra-tumor heterogeneity in pNEN through the presence of low-abundance mutations. The importance of the ATRX/DAXX pathway was confirmed by the first-ever pNEN-specific protein-damaging hotspot mutation in DAXX. In this study, both novel genes, including the pro-apoptotic CYFIP2 gene and the hedgehog signaling gene PTCH2, and novel pathways, such as the MAPK-ERK pathway, were implicated in pNEN.
Affiliation(s)
- T Vandamme
- Center of Oncological Research (CORE), University of Antwerp, Antwerp, Belgium
- Section of Endocrinology, Department of Internal Medicine, Erasmus Medical Center, Rotterdam, The Netherlands
| | - M Beyens
- Center of Oncological Research (CORE), University of Antwerp, Antwerp, Belgium
| | - G Boons
- Center of Oncological Research (CORE), University of Antwerp, Antwerp, Belgium
| | - A Schepers
- Center of Medical Genetics, University of Antwerp, Antwerp, Belgium
| | - K Kamp
- Section of Endocrinology, Department of Internal Medicine, Erasmus Medical Center, Rotterdam, The Netherlands
| | - K Biermann
- Department of Pathology, Erasmus Medical Center, Rotterdam, The Netherlands
| | - P Pauwels
- Department of Pathology, University of Antwerp, Antwerp, Belgium
| | - W W De Herder
- Section of Endocrinology, Department of Internal Medicine, Erasmus Medical Center, Rotterdam, The Netherlands
| | - L J Hofland
- Section of Endocrinology, Department of Internal Medicine, Erasmus Medical Center, Rotterdam, The Netherlands
| | - M Peeters
- Center of Oncological Research (CORE), University of Antwerp, Antwerp, Belgium
| | - G Van Camp
- Center of Medical Genetics, University of Antwerp, Antwerp, Belgium
| | - K Op de Beeck
- Center of Oncological Research (CORE), University of Antwerp, Antwerp, Belgium
| |
|
230
|
Wang C, Zhang S. Reveal cell type-specific regulatory elements and their characterized histone code classes via a hidden Markov model. BMC Genomics 2018; 19:903. [PMID: 30598107 PMCID: PMC6311906 DOI: 10.1186/s12864-018-5274-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND With the maturation of next-generation sequencing technology, a huge amount of epigenomic data has been generated by several large consortia in the last decade. These plentiful resources offer an opportunity to make full use of the data to explore biological problems. RESULTS Here we developed an integrative and comparative method, CsreHMM, which is based on a hidden Markov model, to systematically reveal cell type-specific regulatory elements (CSREs) along the whole genome and simultaneously recognize the histone codes (mark combinations) characterizing them. This method also reveals the subclasses of CSREs and explicitly labels those shared by a few cell types. We applied this method to a data set of 9 cell types and 9 chromatin marks to demonstrate its effectiveness and found that the revealed CSREs relate significantly to different kinds of functional regulatory regions. Their proximal genes have consistent expression and are likely to participate in cell type-specific biological functions. CONCLUSIONS These results suggest CsreHMM has the potential to help understand cell identity and the diverse mechanisms of gene regulation.
Affiliation(s)
- Can Wang
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Shihua Zhang
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
| |
|
231
|
New antigens for the serological diagnosis of human visceral leishmaniasis identified by immunogenomic screening. PLoS One 2018; 13:e0209599. [PMID: 30571783 PMCID: PMC6301785 DOI: 10.1371/journal.pone.0209599] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 09/26/2018] [Accepted: 12/08/2018] [Indexed: 12/17/2022] Open
Abstract
Visceral leishmaniasis (VL) still represents a serious public health problem in Brazil due to the inefficiency of the control measures currently employed, which include early diagnosis and treatment of human cases, vector control, euthanasia of infected dogs and, recently approved in Brazil, treatment with the drug Milteforam. Effective clinical management depends largely on early and unequivocal diagnosis; however, cross-reactivity has been described in serological tests, especially in individuals from areas where Chagas' disease is also present. Thus, new antigens that improve the current serological tests for VL diagnosis are urgently needed. Here, we performed an immunogenomic screening strategy to identify conserved linear B-cell epitopes in the predicted L. infantum proteome using the following criteria: i) proteins expressed in the stages found in the vertebrate host, amastigote stage, and secreted/excreted, to guarantee greater exposure to the immune system; ii) divergent from proteins present in other infectious disease pathogens with incidence in endemic areas for VL, such as T. cruzi; iii) highly antigenic to humans with different genetic backgrounds, independently of the clinical stage of the disease; iv) stable and adaptable to quality-control tests to guarantee reproducibility; v) using statistical analysis to determine a suitable sample size to evaluate accuracy of diagnostic tests established by receiver operating characteristic strategy. We selected six predicted linear B-cell epitopes from three proteins of the L. infantum parasite. The results demonstrated that a mixture of peptides (Mix IV: peptides 3+6) was able to identify VL cases and simultaneously discriminate infections caused by the T. cruzi parasite with high accuracy (100.00%) and perfect agreement (Kappa index = 1.000) with direct methods performed by laboratories in Brazil.
The results also demonstrated that peptide-6, Mix III (peptides 2+6) and Mix I (peptides 2+3+6) are potential antigens that can be used in VL diagnosis, with high accuracy (Ac = 99.52%, 99.52% and 98.56%, respectively). This study presents a strategy for discovering new antigens for serological diagnosis, which will contribute to improving the diagnosis of VL and, consequently, may help in the prevention, control and treatment of the disease in endemic areas of Brazil.
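The accuracy and Kappa index quoted in this abstract are standard agreement statistics for a 2x2 comparison between a test and a reference method. A minimal sketch of how they are usually computed is given below; the function name and the counts in the usage note are hypothetical illustrations, not data or code from the study.

```python
def accuracy_and_kappa(tp, fp, fn, tn):
    """Return (accuracy, Cohen's kappa) for a 2x2 table.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives against a reference method.
    """
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("the table must contain at least one sample")
    observed = (tp + tn) / n  # observed agreement, i.e. accuracy
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

For example, a hypothetical table with no discordant results, `accuracy_and_kappa(50, 0, 0, 50)`, gives an accuracy of 1.0 and a kappa of 1.0, which is the pattern of "perfect agreement" described above.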
|
232
|
Neville MDC, Choi J, Lieberman J, Duan QL. Identification of deleterious and regulatory genomic variations in known asthma loci. Respir Res 2018; 19:248. [PMID: 30541564 PMCID: PMC6292105 DOI: 10.1186/s12931-018-0953-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/12/2018] [Accepted: 11/23/2018] [Indexed: 11/25/2022] Open
Abstract
Background: Candidate gene and genome-wide association studies have identified hundreds of asthma risk loci. The majority of associated variants, however, are not known to have any biological function and are believed to represent markers rather than true causative mutations. We hypothesized that many of these associated markers are in linkage disequilibrium (LD) with the elusive causative variants. Methods: We compiled a comprehensive list of 449 asthma-associated variants previously reported in candidate gene and genome-wide association studies. Next, we identified all sequence variants located within the 305 unique genes using whole-genome sequencing data from the 1000 Genomes Project. Then, we calculated the LD between known asthma variants and the sequence variants within each gene. LD variants identified were then annotated to determine those that are potentially deleterious and/or functional (i.e. coding or regulatory effects on the encoded transcript or protein). Results: We identified 10,130 variants in LD (r² > 0.6) with known asthma variants. Annotations of these LD variants revealed that several have potentially deleterious effects including frameshift, alternate splice site, stop-lost, and missense. Moreover, 24 of the LD variants have been reported to regulate gene expression as expression quantitative trait loci (eQTLs). Conclusions: This study is proof of concept that many of the genetic loci previously associated with complex diseases such as asthma are not causative but represent markers of disease, which are in LD with the elusive causative variants. We hereby report a number of potentially deleterious and regulatory variants that are in LD with the reported asthma loci. These reported LD variants could account for the original association signals with asthma and represent the true causative mutations at these loci.
Electronic supplementary material The online version of this article (10.1186/s12931-018-0953-2) contains supplementary material, which is available to authorized users.
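The LD threshold in this abstract refers to the squared correlation between alleles at two variants. A minimal sketch of that statistic for phased, biallelic haplotypes is shown below; the function name and the 0/1 allele encoding are assumptions made for illustration, and the abstract does not say which software the authors used.

```python
import math


def ld_r2(hap_a, hap_b):
    """Squared allelic correlation between two biallelic variants.

    hap_a and hap_b are equal-length sequences of 0/1 alleles, one
    entry per phased haplotype (hypothetical encoding for this sketch).
    Returns NaN when either variant is monomorphic.
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype lists must be non-empty and equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n  # frequency of allele 1 at variant A
    p_b = sum(hap_b) / n  # frequency of allele 1 at variant B
    # Frequency of haplotypes carrying allele 1 at both variants.
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return math.nan
    return d * d / denom
```

A pair of variants would pass the filter described above when `ld_r2(hap_a, hap_b) > 0.6`.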
Affiliation(s)
- Matthew D C Neville
- Department of Biomedical and Molecular Sciences, Queen's University, Botterell Hall, Room 530 - 18 Stuart St, Kingston, ON, K7L3N6, Canada
| | - Jihoon Choi
- Department of Biomedical and Molecular Sciences, Queen's University, Botterell Hall, Room 530 - 18 Stuart St, Kingston, ON, K7L3N6, Canada
| | - Jonathan Lieberman
- Department of Biomedical and Molecular Sciences, Queen's University, Botterell Hall, Room 530 - 18 Stuart St, Kingston, ON, K7L3N6, Canada
| | - Qing Ling Duan
- Department of Biomedical and Molecular Sciences, Queen's University, Botterell Hall, Room 530 - 18 Stuart St, Kingston, ON, K7L3N6, Canada
- School of Computing, Queen's University, 557 Goodwin Hall, Room 531, Kingston, ON, K7L 2N8, Canada
| |
|
233
|
The Genome of the North American Brown Bear or Grizzly: Ursus arctos ssp. horribilis. Genes (Basel) 2018; 9:genes9120598. [PMID: 30513700 PMCID: PMC6315469 DOI: 10.3390/genes9120598] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 10/29/2018] [Revised: 11/23/2018] [Accepted: 11/28/2018] [Indexed: 11/17/2022] Open
Abstract
The grizzly bear (Ursus arctos ssp. horribilis) represents the largest population of brown bears in North America. Its genome was sequenced using a microfluidic partitioning library construction technique, and these data were supplemented with sequencing from a nanopore-based long read platform. The final assembly was 2.33 Gb with a scaffold N50 of 36.7 Mb, and the genome is of comparable size to that of its close relative the polar bear (2.30 Gb). An analysis using 4104 highly conserved mammalian genes indicated that 96.1% were found to be complete within the assembly. An automated annotation of the genome identified 19,848 protein coding genes. Our study shows that the combination of the two sequencing modalities that we used is sufficient for the construction of highly contiguous reference quality mammalian genomes. The assembled genome sequence and the supporting raw sequence reads are available from the NCBI (National Center for Biotechnology Information) under the bioproject identifier PRJNA493656, and the assembly described in this paper is version QXTK01000000.
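The scaffold N50 reported in this abstract is a contiguity summary: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of the usual calculation follows; the function name is hypothetical and the example lengths are illustrative, not taken from the grizzly assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
```

With the illustrative lengths `[2, 2, 2, 3, 3, 4, 8, 8]` (total 32), the two longest scaffolds already cover half of the assembly, so the N50 is 8.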
|
234
|
Clark SL, Costin BN, Chan RF, Johnson AW, Xie L, Jurmain JL, Kumar G, Shabalin AA, Pandey AK, Aberg KA, Miles MF, van den Oord E. A Whole Methylome Study of Ethanol Exposure in Brain and Blood: An Exploration of the Utility of Peripheral Blood as Proxy Tissue for Brain in Alcohol Methylation Studies. Alcohol Clin Exp Res 2018; 42:2360-2368. [PMID: 30320886 DOI: 10.1111/acer.13905] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 04/20/2018] [Accepted: 10/06/2018] [Indexed: 01/04/2023]
Abstract
BACKGROUND Recent reviews have highlighted the potential use of blood-based methylation biomarkers as diagnostic and prognostic tools of current and future alcohol use and addiction. Due to the substantial overlap that often exists between methylation patterns across different tissues, including blood and brain, blood-based methylation may track methylation changes in brain; however, little work has explored the overlap in alcohol-related methylation in these tissues. METHODS To study the effects of alcohol on the brain methylome and identify possible biomarkers of these changes in blood, we performed a methylome-wide association study in brain and blood from 40 male DBA/2J mice that received either an acute ethanol (EtOH) or saline intraperitoneal injection. To investigate all 22 million CpGs in the mouse genome, we enriched for the methylated genomic fraction using methyl-CpG binding domain (MBD) protein capture followed by next-generation sequencing (MBD-seq). We performed association tests in blood and brain separately followed by enrichment testing to determine whether there was overlapping alcohol-related methylation in the 2 tissues. RESULTS The top result for brain was a CpG located in an intron of Ttc39b (p = 5.65 × 10⁻⁸), and for blood, the top result was located in Espnl (p = 5.11 × 10⁻⁸). Analyses implicated pathways involved in inflammation and neuronal differentiation, such as CXCR4, IL-7, and Wnt signaling. Enrichment tests indicated significant overlap among the top results in brain and blood. Pathway analyses of the overlapping genes converge on MAPKinase signaling (p = 5.6 × 10⁻⁵), which plays a central role in acute and chronic responses to alcohol, and on glutamate receptor pathways, which can regulate neuroplastic changes underlying addictive behavior.
CONCLUSIONS Overall, we have shown some methylation changes in brain and blood after acute EtOH administration and that the changes in blood partly mirror the changes in brain suggesting the potential for DNA methylation in blood to be biomarkers of alcohol use.
Affiliation(s)
- Shaunna L Clark
- Department of Psychology, Michigan State University, East Lansing, Michigan
- Center for Biomarker Research and Precision Medicine, Virginia Commonwealth University, Richmond, Virginia
| | - Blair N Costin
- Department of Pharmacology and Toxicology, Virginia Commonwealth University, Richmond, Virginia
| | - Robin F Chan
- Center for Biomarker Research and Precision Medicine, Virginia Commonwealth University, Richmond, Virginia
| | - Alexander W Johnson
- Department of Psychology, Michigan State University, East Lansing, Michigan
| | - Linying Xie
- Center for Biomarker Research and Precision Medicine, Virginia Commonwealth University, Richmond, Virginia
| | - Jessica L Jurmain
- Department of Pharmacology and Toxicology, Virginia Commonwealth University, Richmond, Virginia
| | - Gaurav Kumar
- Center for Biomarker Research and Precision Medicine, Virginia Commonwealth University, Richmond, Virginia
| | - Andrey A Shabalin
- Center for Biomarker Research and Precision Medicine, Virginia Commonwealth University, Richmond, Virginia
| | - Ashutosh K Pandey
- Department of Anatomy and Neurobiology, Center for Integrative and Translational Genomics, University of Tennessee Health Science Center, Memphis, Tennessee
| | - Karolina A Aberg
- Center for Biomarker Research and Precision Medicine, Virginia Commonwealth University, Richmond, Virginia
| | - Michael F Miles
- Department of Pharmacology and Toxicology, Virginia Commonwealth University, Richmond, Virginia
| | - Edwin van den Oord
- Center for Biomarker Research and Precision Medicine, Virginia Commonwealth University, Richmond, Virginia
| |
|
235
|
Gene synthesis allows biologists to source genes from farther away in the tree of life. Nat Commun 2018; 9:4425. [PMID: 30356044 PMCID: PMC6200774 DOI: 10.1038/s41467-018-06798-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 01/13/2018] [Accepted: 09/13/2018] [Indexed: 12/11/2022] Open
Abstract
Gene synthesis enables creation and modification of genetic sequences at an unprecedented pace, offering enormous potential for new biological functionality but also increasing the need for biosurveillance. In this paper, we introduce a bioinformatics technique for determining whether a gene is natural or synthetic based solely on nucleotide sequence. This technique, grounded in codon theory and machine learning, can correctly classify genes with 97.7% accuracy on a novel data set. We then classify ∼19,000 unique genes from the Addgene non-profit plasmid repository to investigate whether natural and synthetic genes have differential use in heterologous expression. Phylogenetic analysis of distance between source and expression organisms reveals that researchers are using synthesis to source genes from more genetically-distant organisms, particularly for longer genes. We provide empirical evidence that gene synthesis is leading biologists to sample more broadly across the diversity of life, and we provide a foundational tool for the biosurveillance community.
|
236
|
Pavesi A, Vianelli A, Chirico N, Bao Y, Blinkova O, Belshaw R, Firth A, Karlin D. Overlapping genes and the proteins they encode differ significantly in their sequence composition from non-overlapping genes. PLoS One 2018; 13:e0202513. [PMID: 30339683 PMCID: PMC6195259 DOI: 10.1371/journal.pone.0202513] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 01/22/2018] [Accepted: 08/03/2018] [Indexed: 11/19/2022] Open
Abstract
Overlapping genes represent a fascinating evolutionary puzzle, since they encode two functionally unrelated proteins from the same DNA sequence. They originate by a mechanism of overprinting, in which point mutations in an existing frame allow the expression (the "birth") of a completely new protein from a second frame. In viruses, in which overlapping genes are abundant, these new proteins often play a critical role in infection, yet they are frequently overlooked during genome annotation. This results in erroneous interpretation of mutational studies and in a significant waste of resources. Therefore, overlapping genes need to be correctly detected, especially since they are now thought to be abundant in eukaryotes as well. Developing better detection methods and conducting systematic evolutionary studies require a large, reliable benchmark dataset of known cases. We thus assembled a high-quality dataset of 80 viral overlapping genes whose expression is experimentally proven. Many of them were not present in databases. We found that overall, overlapping genes differ significantly from non-overlapping genes in their nucleotide and amino acid composition. In particular, the proteins they encode are enriched in high-degeneracy amino acids and depleted in low-degeneracy ones, which may alleviate the evolutionary constraints acting on overlapping genes. Principal component analysis revealed that the vast majority of overlapping genes follow a similar composition bias, despite their heterogeneity in length and function. Six proven mammalian overlapping genes also followed this bias. We propose that this apparently near-universal composition bias may favour the birth of overlapping genes and/or result from selection pressure acting on them.
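The composition bias described here is stated in terms of codon degeneracy, that is, how many codons of the standard genetic code encode each amino acid. The sketch below tallies the fraction of a protein sequence falling in each degeneracy class. The function name is hypothetical, and which classes count as "high" or "low" degeneracy is left to the reader because the abstract does not give the thresholds the authors used.

```python
from collections import Counter

# Number of codons per amino acid in the standard genetic code.
CODON_DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def degeneracy_fractions(protein):
    """Return {degeneracy class: fraction of residues} for a protein.

    Residues outside the 20 standard amino acids (e.g. 'X' or '*')
    are ignored in this sketch.
    """
    classes = [CODON_DEGENERACY[aa] for aa in protein.upper()
               if aa in CODON_DEGENERACY]
    if not classes:
        raise ValueError("no standard amino acids found in the sequence")
    counts = Counter(classes)
    return {deg: counts[deg] / len(classes) for deg in sorted(counts)}
```

For the toy sequence `"LSRMW"`, three of five residues are six-fold degenerate and two are encoded by a single codon, so the result is `{1: 0.4, 6: 0.6}`.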
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
| | - Alberto Vianelli
- Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Nicola Chirico
- Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Yiming Bao
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
| | - Olga Blinkova
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States of America
| | - Robert Belshaw
- School of Biomedical & Healthcare Sciences, Plymouth University Peninsula Schools of Medicine and Dentistry (PUPSMD), Plymouth, United Kingdom
| | - Andrew Firth
- Department of Pathology, Division of Virology, University of Cambridge, Cambridge, United Kingdom
| | - David Karlin
- Department of Zoology, University of Oxford, Oxford, United Kingdom
- Division of Structural Biology, University of Oxford, Oxford, United Kingdom
| |
|
237
|
Hughes T, Sønderby IE, Polushina T, Hansson L, Holmgren A, Athanasiu L, Melbø-Jørgensen C, Hassani S, Hoeffding LK, Herms S, Bergen SE, Karlsson R, Song J, Rietschel M, Nöthen MM, Forstner AJ, Hoffmann P, Hultman CM, Landén M, Cichon S, Werge T, Andreassen OA, Le Hellard S, Djurovic S. Elevated expression of a minor isoform of ANK3 is a risk factor for bipolar disorder. Transl Psychiatry 2018; 8:210. [PMID: 30297702 PMCID: PMC6175894 DOI: 10.1038/s41398-018-0175-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 04/16/2018] [Accepted: 04/22/2018] [Indexed: 01/16/2023] Open
Abstract
Ankyrin-3 (ANK3) is one of the few genes that have been consistently identified as associated with bipolar disorder by multiple genome-wide association studies. However, the exact molecular basis of the association remains unknown. A rare loss-of-function splice-site SNP (rs41283526*G) in a minor isoform of ANK3 (incorporating exon ENSE00001786716) was recently identified as protective against bipolar disorder and schizophrenia. This suggests that an elevated expression of this isoform may be involved in the etiology of the disorders. In this study, we used novel approaches and data sets to test this hypothesis. First, we strengthen the statistical evidence supporting the allelic association by replicating the protective effect of the minor allele of rs41283526 in three additional large independent samples (meta-analysis p-values: 6.8E-05 for bipolar disorder and 8.2E-04 for schizophrenia). Second, we confirm the hypothesis that both bipolar and schizophrenia patients have a significantly higher expression of this isoform than controls (p-values: 3.3E-05 for schizophrenia and 9.8E-04 for bipolar type I). Third, we determine the transcription start site for this minor isoform by Pacific Biosciences sequencing of full-length cDNA and show that it is primarily expressed in the corpus callosum. Finally, we combine genotype and expression data from a large Norwegian sample of psychiatric patients and controls, and show that the risk alleles in ANK3 identified by bipolar disorder GWAS are located near the transcription start site of this isoform and are significantly associated with its elevated expression. Together, these results point to the likely molecular mechanism underlying ANK3's association with bipolar disorder.
Affiliation(s)
- Timothy Hughes
- Department of Medical Genetics, Oslo University Hospital, Oslo, Norway. .,NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway.
| | - Ida E. Sønderby
- 0000 0004 0389 8485grid.55325.34Department of Medical Genetics, Oslo University Hospital, Oslo, Norway ,0000 0004 1936 8921grid.5510.1NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Tatiana Polushina
- 0000 0004 1936 7443grid.7914.bDepartment of Clinical Science, NORMENT, KG Jebsen Centre for Psychosis Research, University of Bergen, Bergen, Norway ,0000 0000 9753 1393grid.412008.fDr Einar Martens Research Group for Biological Psychiatry, Centre for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
| | - Lars Hansson
- 0000 0004 0389 8485grid.55325.34Department of Medical Genetics, Oslo University Hospital, Oslo, Norway ,0000 0004 1936 8921grid.5510.1NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Asbjørn Holmgren
- 0000 0004 0389 8485grid.55325.34Department of Medical Genetics, Oslo University Hospital, Oslo, Norway
| | - Lavinia Athanasiu
- 0000 0004 1936 8921grid.5510.1NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Christian Melbø-Jørgensen
- 0000 0004 1936 8921grid.5510.1NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Sahar Hassani
- 0000 0004 1936 8921grid.5510.1NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Louise K. Hoeffding
- 0000 0004 0646 7373grid.4973.9Institute of Biological Psychiatry, Mental Health Centre Sct. Hans, Copenhagen University Hospital, Roskilde, Denmark ,0000 0000 9817 5300grid.452548.aiPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Copenhagen, Denmark
| | - Stefan Herms
- Department of Biomedicine, Human Genomics Research Group, University of Basel, Basel, Switzerland; Institute of Human Genetics, University of Bonn, Bonn, Germany; Department of Genomics, Life & Brain Center, University of Bonn, Bonn, Germany
| | - Sarah E. Bergen
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Robert Karlsson
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Jie Song
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Marcella Rietschel
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim/Heidelberg University, Mannheim, Germany
| | - Markus M. Nöthen
- Institute of Human Genetics, University of Bonn, Bonn, Germany; Department of Genomics, Life & Brain Center, University of Bonn, Bonn, Germany
| | - Andreas J. Forstner
- Department of Biomedicine, Human Genomics Research Group, University of Basel, Basel, Switzerland; Institute of Human Genetics, University of Bonn, Bonn, Germany; Department of Genomics, Life & Brain Center, University of Bonn, Bonn, Germany; Department of Psychiatry (UPK), University of Basel, Basel, Switzerland
| | - Per Hoffmann
- Department of Biomedicine, Human Genomics Research Group, University of Basel, Basel, Switzerland; Institute of Human Genetics, University of Bonn, Bonn, Germany; Department of Genomics, Life & Brain Center, University of Bonn, Bonn, Germany; Institute of Medical Genetics and Pathology, University Hospital Basel, Basel, Switzerland
| | - Christina M. Hultman
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Mikael Landén
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden; Institute of Neuroscience and Physiology, The Sahlgrenska Academy at Gothenburg University, Gothenburg, Sweden
| | - Sven Cichon
- Department of Biomedicine, Human Genomics Research Group, University of Basel, Basel, Switzerland; Institute of Human Genetics, University of Bonn, Bonn, Germany; Department of Genomics, Life & Brain Center, University of Bonn, Bonn, Germany; Institute of Neuroscience and Medicine (INM-1), Research Center Juelich, Juelich, Germany
| | - Thomas Werge
- Institute of Biological Psychiatry, Mental Health Centre Sct. Hans, Copenhagen University Hospital, Roskilde, Denmark; iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Copenhagen, Denmark; Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Ole A. Andreassen
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway; NORMENT, KG Jebsen Centre for Psychosis Research, Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
| | - Stephanie Le Hellard
- Department of Clinical Science, NORMENT, KG Jebsen Centre for Psychosis Research, University of Bergen, Bergen, Norway; Dr Einar Martens Research Group for Biological Psychiatry, Centre for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
| | - Srdjan Djurovic
- Department of Medical Genetics, Oslo University Hospital, Oslo, Norway; Department of Clinical Science, NORMENT, KG Jebsen Centre for Psychosis Research, University of Bergen, Bergen, Norway
|
238
|
Xu Y, Zhao W, Olson SD, Prabhakara KS, Zhou X. Alternative splicing links histone modifications to stem cell fate decision. Genome Biol 2018; 19:133. [PMID: 30217220 PMCID: PMC6138936 DOI: 10.1186/s13059-018-1512-3] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 03/29/2018] [Accepted: 08/20/2018] [Indexed: 12/19/2022] Open
Abstract
Background: Understanding the embryonic stem cell (ESC) fate decision between self-renewal and proper differentiation is important for developmental biology and regenerative medicine. Attention has focused on mechanisms involving histone modifications, alternative pre-messenger RNA splicing, and cell-cycle progression. However, their intricate interrelations and joint contributions to ESC fate decision remain unclear. Results: We analyze the transcriptomes and epigenomes of human ESC and five types of differentiated cells. We identify thousands of alternatively spliced exons and reveal their development and lineage-dependent characterizations. Several histone modifications show dynamic changes in alternatively spliced exons and three are strongly associated with 52.8% of alternative splicing events upon hESC differentiation. The histone modification-associated alternatively spliced genes predominantly function in G2/M phases and ATM/ATR-mediated DNA damage response pathway for cell differentiation, whereas other alternatively spliced genes are enriched in the G1 phase and pathways for self-renewal. These results imply a potential epigenetic mechanism by which some histone modifications contribute to ESC fate decision through the regulation of alternative splicing in specific pathways and cell-cycle genes. Supported by experimental validations and extended datasets from Roadmap/ENCODE projects, we exemplify this mechanism by a cell-cycle-related transcription factor, PBX1, which regulates the pluripotency regulatory network by binding to NANOG. We suggest that the isoform switch from PBX1a to PBX1b links H3K36me3 to hESC fate determination through the PSIP1/SRSF1 adaptor, which results in the exon skipping of PBX1. Conclusion: We reveal the mechanism by which alternative splicing links histone modifications to stem cell fate decision.
Affiliation(s)
- Yungang Xu
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030 USA
- Center for Bioinformatics and Systems Biology, Wake Forest School of Medicine, Winston-Salem, NC 27157 USA
| | - Weiling Zhao
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030 USA
- Center for Bioinformatics and Systems Biology, Wake Forest School of Medicine, Winston-Salem, NC 27157 USA
| | - Scott D. Olson
- Department of Pediatric Surgery, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030 USA
| | - Karthik S. Prabhakara
- Department of Pediatric Surgery, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030 USA
| | - Xiaobo Zhou
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030 USA
- Center for Bioinformatics and Systems Biology, Wake Forest School of Medicine, Winston-Salem, NC 27157 USA
|
239
|
Shi X, Wang X, Wang TL, Hilakivi-Clarke L, Clarke R, Xuan J. SparseIso: a novel Bayesian approach to identify alternatively spliced isoforms from RNA-seq data. Bioinformatics 2018; 34:56-63. [PMID: 28968634 DOI: 10.1093/bioinformatics/btx557] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 03/05/2017] [Accepted: 09/02/2017] [Indexed: 01/01/2023] Open
Abstract
Motivation: Recent advances in high-throughput RNA sequencing (RNA-seq) technologies have made it possible to reconstruct the full transcriptome of various types of cells. It is important to accurately assemble transcripts or identify isoforms for an improved understanding of molecular mechanisms in biological systems. Results: We have developed a novel Bayesian method, SparseIso, to reliably identify spliced isoforms from RNA-seq data. A spike-and-slab prior is incorporated into the Bayesian model to enforce the sparsity for isoform identification, effectively alleviating the problem of overfitting. A Gibbs sampling procedure is further developed to simultaneously identify and quantify transcripts from RNA-seq data. With the sampling approach, SparseIso estimates the joint distribution of all candidate transcripts, resulting in a significantly improved performance in detecting lowly expressed transcripts and multiple expressed isoforms of genes. Both simulation study and real data analysis have demonstrated that the proposed SparseIso method significantly outperforms existing methods for improved transcript assembly and isoform identification. Availability and implementation: The SparseIso package is available at http://github.com/henryxushi/SparseIso. Contact: xuan@vt.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xu Shi
- Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA
| | - Xiao Wang
- Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA
| | - Tian-Li Wang
- Department of Pathology, Johns Hopkins Medical Institutions, Baltimore, MD 21231, USA
| | - Leena Hilakivi-Clarke
- Department of Oncology and Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC 20057, USA
| | - Robert Clarke
- Department of Oncology and Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC 20057, USA
| | - Jianhua Xuan
- Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA
|
240
|
Henriques D, Parejo M, Vignal A, Wragg D, Wallberg A, Webster MT, Pinto MA. Developing reduced SNP assays from whole-genome sequence data to estimate introgression in an organism with complex genetic patterns, the Iberian honeybee (Apis mellifera iberiensis). Evol Appl 2018; 11:1270-1282. [PMID: 30151039 PMCID: PMC6099811 DOI: 10.1111/eva.12623] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 11/01/2017] [Accepted: 02/11/2018] [Indexed: 01/01/2023] Open
Abstract
The most important managed pollinator, the honeybee (Apis mellifera L.), has been subject to a growing number of threats. In western Europe, one such threat is large-scale introductions of commercial strains (C-lineage ancestry), which is leading to introgressive hybridization and even the local extinction of native honeybee populations (M-lineage ancestry). Here, we developed reduced assays of highly informative SNPs from 176 whole genomes to estimate C-lineage introgression in the most diverse and evolutionarily complex subspecies in Europe, the Iberian honeybee (Apis mellifera iberiensis). We started by evaluating the effects of sample size and sampling a geographically restricted area on the number of highly informative SNPs. We demonstrated that a bias in the number of fixed SNPs (FST = 1) is introduced when the sample size is small (N ≤ 10) and when sampling only captures a small fraction of a population's genetic diversity. These results underscore the importance of having a representative sample when developing reliable reduced SNP assays for organisms with complex genetic patterns. We used a training data set to design four independent SNP assays selected from pairwise FST between the Iberian and C-lineage honeybees. The designed assays, which were validated in holdout and simulated hybrid data sets, proved to be highly accurate and can be readily used for monitoring populations not only in the native range of A. m. iberiensis in Iberia but also in the introduced range in the Balearic islands, Macaronesia and South America, in a time- and cost-effective manner. While our approach used the Iberian honeybee as model system, it has a high value in a wide range of scenarios for the monitoring and conservation of potentially hybridized domestic and wildlife populations.
Affiliation(s)
- Dora Henriques
- Mountain Research Centre (CIMO), Polytechnic Institute of Bragança, Bragança, Portugal
- Centre of Molecular and Environmental Biology (CBMA), University of Minho, Braga, Portugal
| | - Melanie Parejo
- Agroscope, Swiss Bee Research Centre, Bern, Switzerland
- Institute of Bee Health, Vetsuisse Faculty, University of Bern, Bern, Switzerland
| | - Alain Vignal
- GenPhySE, Université de Toulouse, INRA, INPT, INP-ENVT, Castanet Tolosan, France
| | - David Wragg
- The Roslin Institute, University of Edinburgh, Edinburgh, UK
| | - Andreas Wallberg
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Matthew T. Webster
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - M. Alice Pinto
- Mountain Research Centre (CIMO), Polytechnic Institute of Bragança, Bragança, Portugal
|
241
|
Uszczynska-Ratajczak B, Lagarde J, Frankish A, Guigó R, Johnson R. Towards a complete map of the human long non-coding RNA transcriptome. Nat Rev Genet 2018; 19:535-548. [PMID: 29795125 PMCID: PMC6451964 DOI: 10.1038/s41576-018-0017-y] [Citation(s) in RCA: 387] [Impact Index Per Article: 64.5] [Indexed: 02/01/2023]
Abstract
Gene maps, or annotations, enable us to navigate the functional landscape of our genome. They are a resource upon which virtually all studies depend, from single-gene to genome-wide scales and from basic molecular biology to medical genetics. Yet present-day annotations suffer from trade-offs between quality and size, with serious but often unappreciated consequences for downstream studies. This is particularly true for long non-coding RNAs (lncRNAs), which are poorly characterized compared to protein-coding genes. Long-read sequencing technologies promise to improve current annotations, paving the way towards a complete annotation of lncRNAs expressed throughout a human lifetime.
Affiliation(s)
| | - Julien Lagarde
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain
| | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain
| | - Rory Johnson
- Department of Medical Oncology, Inselspital, University Hospital and University of Bern, Bern, Switzerland
- Department of Biomedical Research (DBMR), University of Bern, Bern, Switzerland
|
242
|
Sun JH, Zhou L, Emerson DJ, Phyo SA, Titus KR, Gong W, Gilgenast TG, Beagan JA, Davidson BL, Tassone F, Phillips-Cremins JE. Disease-Associated Short Tandem Repeats Co-localize with Chromatin Domain Boundaries. Cell 2018; 175:224-238.e15. [PMID: 30173918 DOI: 10.1016/j.cell.2018.08.005] [Citation(s) in RCA: 130] [Impact Index Per Article: 21.7] [Received: 02/09/2018] [Revised: 06/11/2018] [Accepted: 08/02/2018] [Indexed: 01/15/2023]
Abstract
More than 25 inherited human disorders are caused by the unstable expansion of repetitive DNA sequences termed short tandem repeats (STRs). A fundamental unresolved question is why some STRs are susceptible to pathologic expansion, whereas thousands of repeat tracts across the human genome are relatively stable. Here, we discover that nearly all disease-associated STRs (daSTRs) are located at boundaries demarcating 3D chromatin domains. We identify a subset of boundaries with markedly higher CpG island density compared to the rest of the genome. daSTRs specifically localize to ultra-high-density CpG island boundaries, suggesting they might be hotspots for epigenetic misregulation or topological disruption linked to STR expansion. Fragile X syndrome patients exhibit severe boundary disruption in a manner that correlates with local loss of CTCF occupancy and the degree of FMR1 silencing. Our data uncover higher-order chromatin architecture as a new dimension in understanding repeat expansion disorders.
Affiliation(s)
- James H Sun
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA; Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Linda Zhou
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA; Genomics and Computational Biology Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Daniel J Emerson
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Sai A Phyo
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA; Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Katelyn R Titus
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Wanfeng Gong
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Thomas G Gilgenast
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Jonathan A Beagan
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA; Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Beverly L Davidson
- The Raymond G. Perelman Center for Cellular and Molecular Therapeutics, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Flora Tassone
- Biochemistry and Molecular Medicine, University of California-Davis, Sacramento, CA 95616, USA; MIND Institute, UC Davis, Sacramento, CA 95616, USA
| | - Jennifer E Phillips-Cremins
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA; Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA.
|
243
|
Napierala JS, Li Y, Lu Y, Lin K, Hauser LA, Lynch DR, Napierala M. Comprehensive analysis of gene expression patterns in Friedreich's ataxia fibroblasts by RNA sequencing reveals altered levels of protein synthesis factors and solute carriers. Dis Model Mech 2018; 10:1353-1369. [PMID: 29125828 PMCID: PMC5719256 DOI: 10.1242/dmm.030536] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 05/16/2017] [Accepted: 08/21/2017] [Indexed: 12/30/2022] Open
Abstract
Friedreich's ataxia (FRDA) is an autosomal recessive neurodegenerative disease usually caused by large homozygous expansions of GAA repeat sequences in intron 1 of the frataxin (FXN) gene. FRDA patients homozygous for GAA expansions have low FXN mRNA and protein levels when compared with heterozygous carriers or healthy controls. Frataxin is a mitochondrial protein involved in iron–sulfur cluster synthesis, and many FRDA phenotypes result from deficiencies in cellular metabolism due to lowered expression of FXN. Presently, there is no effective treatment for FRDA, and biomarkers to measure therapeutic trial outcomes and/or to gauge disease progression are lacking. Peripheral tissues, including blood cells, buccal cells and skin fibroblasts, can readily be isolated from FRDA patients and used to define molecular hallmarks of disease pathogenesis. For instance, FXN mRNA and protein levels as well as FXN GAA-repeat tract lengths are routinely determined using all of these cell types. However, because these tissues are not directly involved in disease pathogenesis, their relevance as models of the molecular aspects of the disease is yet to be decided. Herein, we conducted unbiased RNA sequencing to profile the transcriptomes of fibroblast cell lines derived from 18 FRDA patients and 17 unaffected control individuals. Bioinformatic analyses revealed significantly upregulated expression of genes encoding plasma membrane solute carrier proteins in FRDA fibroblasts. Conversely, the expression of genes encoding accessory factors and enzymes involved in cytoplasmic and mitochondrial protein synthesis was consistently decreased in FRDA fibroblasts. Finally, comparison of genes differentially expressed in FRDA fibroblasts to three previously published gene expression signatures defined for FRDA blood cells showed substantial overlap between the independent datasets, including correspondingly deficient expression of antioxidant defense genes. 
Together, these results indicate that gene expression profiling of cells derived from peripheral tissues can, in fact, consistently reveal novel molecular pathways of the disease. When performed on statistically meaningful sample group sizes, unbiased global profiling analyses utilizing peripheral tissues are critical for the discovery and validation of FRDA disease biomarkers. Summary: Transcriptome profiling of Friedreich's ataxia fibroblasts by RNA sequencing reveals that this peripheral tissue can be used as a disease model for gene expression biomarker discovery.
Affiliation(s)
- Jill Sergesketter Napierala
- University of Alabama at Birmingham, Department of Biochemistry and Molecular Genetics, UAB Stem Cell Institute, 1825 University Blvd., Birmingham, Alabama 35294, USA
| | - Yanjie Li
- University of Alabama at Birmingham, Department of Biochemistry and Molecular Genetics, UAB Stem Cell Institute, 1825 University Blvd., Birmingham, Alabama 35294, USA
| | - Yue Lu
- University of Texas MD Anderson Cancer Center, Department of Molecular Carcinogenesis, Center for Cancer Epigenetics, Science Park, Smithville, Texas 78957, USA
| | - Kevin Lin
- University of Texas MD Anderson Cancer Center, Department of Molecular Carcinogenesis, Center for Cancer Epigenetics, Science Park, Smithville, Texas 78957, USA
| | - Lauren A Hauser
- Departments of Neurology and Pediatrics, Children's Hospital of Philadelphia, Abramson Research Center Room 502, Philadelphia, PA 19104, USA
| | - David R Lynch
- Departments of Neurology and Pediatrics, Children's Hospital of Philadelphia, Abramson Research Center Room 502, Philadelphia, PA 19104, USA
| | - Marek Napierala
- University of Alabama at Birmingham, Department of Biochemistry and Molecular Genetics, UAB Stem Cell Institute, 1825 University Blvd., Birmingham, Alabama 35294, USA; Department of Molecular Biomedicine, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, 61-704, Poland
|
244
|
Park S, Supek F, Lehner B. Systematic discovery of germline cancer predisposition genes through the identification of somatic second hits. Nat Commun 2018; 9:2601. [PMID: 29973584 PMCID: PMC6031629 DOI: 10.1038/s41467-018-04900-7] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 11/10/2017] [Accepted: 06/04/2018] [Indexed: 01/08/2023] Open
Abstract
The genetic causes of cancer include both somatic mutations and inherited germline variants. Large-scale tumor sequencing has revolutionized the identification of somatic driver alterations but has had limited impact on the identification of cancer predisposition genes (CPGs). Here we present a statistical method, ALFRED, that tests Knudson's two-hit hypothesis to systematically identify CPGs from cancer genome data. Applied to ~10,000 tumor exomes the approach identifies known and putative CPGs - including the chromatin modifier NSD1 - that contribute to cancer through a combination of rare germline variants and somatic loss-of-heterozygosity (LOH). Rare germline variants in these genes contribute substantially to cancer risk, including to ~14% of ovarian carcinomas, ~7% of breast tumors, ~4% of uterine corpus endometrial carcinomas, and to a median of 2% of tumors across 17 cancer types.
Affiliation(s)
- Solip Park
- Systems Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain
| | - Fran Supek
- Systems Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain; Institut de Recerca Biomedica (IRB Barcelona), The Barcelona Institute of Science and Technology, 08028, Barcelona, Spain; Division of Electronics, Rudjer Boskovic Institute, 10000, Zagreb, Croatia
| | - Ben Lehner
- Systems Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Luis Companys 23, 08010, Barcelona, Spain
|
245
|
Kennedy EM, Goehring GN, Nichols MH, Robins C, Mehta D, Klengel T, Eskin E, Smith AK, Conneely KN. An integrated -omics analysis of the epigenetic landscape of gene expression in human blood cells. BMC Genomics 2018; 19:476. [PMID: 29914364 PMCID: PMC6006777 DOI: 10.1186/s12864-018-4842-3] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 01/30/2018] [Accepted: 05/30/2018] [Indexed: 01/06/2023] Open
Abstract
Background: Gene expression can be influenced by DNA methylation 1) distally, at regulatory elements such as enhancers, as well as 2) proximally, at promoters. Our current understanding of the influence of distal DNA methylation changes on gene expression patterns is incomplete. Here, we characterize genome-wide methylation and expression patterns for ~13k genes to explore how DNA methylation interacts with gene expression, throughout the genome. Results: We used a linear mixed model framework to assess the correlation of DNA methylation at ~400k CpGs with gene expression changes at ~13k transcripts in two independent datasets from human blood cells. Among CpGs at which methylation significantly associates with transcription (eCpGs), >50% are distal (>50 kb) or trans (different chromosome) to the correlated gene. Many eCpG-transcript pairs are consistent between studies and ~90% of neighboring eCpGs associate with the same gene, within studies. We find that enhancers (P < 5e-18) and microRNA genes (P = 9e-3) are overrepresented among trans eCpGs, and insulators and long intergenic non-coding RNAs are enriched among cis and distal eCpGs. Intragenic-eCpG-transcript correlations are negative in 60–70% of occurrences and are enriched for annotated gene promoters and enhancers (P < 0.002), highlighting the importance of intragenic regulation. Gene Ontology analysis indicates that trans eCpGs are enriched for transcription factor genes and chromatin modifiers, suggesting that some trans eCpGs represent the influence of gene networks and higher-order transcriptional control. Conclusions: This work sheds new light on the interplay between epigenetic changes and gene expression, and provides useful data for mining biologically-relevant results from epigenome-wide association studies. Electronic supplementary material: The online version of this article (10.1186/s12864-018-4842-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Elizabeth M Kennedy
- Genetics and Molecular Biology Program, Emory University, Atlanta, GA, USA; Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
| | - George N Goehring
- Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
| | - Michael H Nichols
- Genetics and Molecular Biology Program, Emory University, Atlanta, GA, USA; Department of Biology, Emory University, Atlanta, GA, USA
| | - Chloe Robins
- Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA; Population Biology, Ecology and Evolution Program, Emory University, Atlanta, GA, USA
| | - Divya Mehta
- School of Psychology and Counseling, Faculty of Health, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, Australia
| | - Torsten Klengel
- Department of Psychiatry, McLean Hospital, Harvard Medical School, Belmont, MA, USA
| | - Eleazar Eskin
- Department of Computer Science, University of California, Los Angeles, CA, USA
| | - Alicia K Smith
- Genetics and Molecular Biology Program, Emory University, Atlanta, GA, USA; Department of Gynecology and Obstetrics, Emory University School of Medicine, Atlanta, GA, USA; Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA, USA
| | - Karen N Conneely
- Genetics and Molecular Biology Program, Emory University, Atlanta, GA, USA; Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
|
246
|
Henriques D, Browne KA, Barnett MW, Parejo M, Kryger P, Freeman TC, Muñoz I, Garnery L, Highet F, Jonhston JS, McCormack GP, Pinto MA. High sample throughput genotyping for estimating C-lineage introgression in the dark honeybee: an accurate and cost-effective SNP-based tool. Sci Rep 2018; 8:8552. [PMID: 29867207 PMCID: PMC5986779 DOI: 10.1038/s41598-018-26932-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 01/18/2018] [Accepted: 05/17/2018] [Indexed: 11/12/2022] Open
Abstract
The natural distribution of the honeybee (Apis mellifera L.) has been changed by humans in recent decades to such an extent that the formerly widest-spread European subspecies, Apis mellifera mellifera, is threatened by extinction through introgression from highly divergent commercial strains in large tracts of its range. Conservation efforts for A. m. mellifera are underway in multiple European countries requiring reliable and cost-efficient molecular tools to identify purebred colonies. Here, we developed four ancestry-informative SNP assays for high sample throughput genotyping using the iPLEX Mass Array system. Our customized assays were tested on DNA from individual and pooled, haploid and diploid honeybee samples extracted from different tissues using a diverse range of protocols. The assays had a high genotyping success rate and yielded accurate genotypes. Performance assessed against whole-genome data showed that individual assays behaved well, although the most accurate introgression estimates were obtained for the four assays combined (117 SNPs). The best compromise between accuracy and genotyping costs was achieved when combining two assays (62 SNPs). We provide a ready-to-use cost-effective tool for accurate molecular identification and estimation of introgression levels to more effectively monitor and manage A. m. mellifera conservatories.
Affiliation(s)
- Dora Henriques
  - Mountain Research Centre (CIMO), Polytechnic Institute of Bragança, 5300-253, Bragança, Portugal
  - Centre of Molecular and Environmental Biology (CBMA), University of Minho, Campus de Gualtar, 4710-057, Braga, Portugal
- Keith A Browne
  - Department of Zoology, Ryan Institute, School of Natural Sciences, National University of Ireland Galway, Galway, Ireland
- Mark W Barnett
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Edinburgh, Midlothian, EH25 9RG, Scotland, UK
- Melanie Parejo
  - Agroscope, Swiss Bee Research Centre, 3003, Bern, Switzerland
- Per Kryger
  - Aarhus University, Department of Agroecology, Slagelse, 4200, Denmark
- Tom C Freeman
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Edinburgh, Midlothian, EH25 9RG, Scotland, UK
- Irene Muñoz
  - Área de Biología Animal, Dpto. de Zoología y Antropología Física, Universidad de Murcia, Campus de Espinardo, 30100, Murcia, Spain
- Lionel Garnery
  - Laboratoire Evolution, Génomes et Spéciation, CNRS, Gif-sur-Yvette, France
  - Saint Quentin en Yvelines, Université de Versailles, Versailles, France
- Fiona Highet
  - Science and Advice for Scottish Agriculture (SASA), Roddinglaw Road, Edinburgh, EH12 9FJ, Scotland, UK
- Grace P McCormack
  - Department of Zoology, Ryan Institute, School of Natural Sciences, National University of Ireland Galway, Galway, Ireland
- M Alice Pinto
  - Mountain Research Centre (CIMO), Polytechnic Institute of Bragança, 5300-253, Bragança, Portugal
|
247
|
Cai X, Mamun AA, Rajasekaran S. Efficient Algorithms for Finding the Closest l-mers in Biological Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018:1-1. [PMID: 29993557 DOI: 10.1109/tcbb.2018.2843364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
With the advances in next-generation sequencing technology, huge amounts of data have been and continue to be generated in biology. A bottleneck in dealing with such datasets lies in developing effective algorithms for extracting useful information from them. Algorithms for finding patterns in biological data pave the way for extracting crucial information from these voluminous datasets. In this paper we focus on a fundamental pattern, namely, the closest l-mers. Given a set of m biological strings S1,S2,…,Sm and an integer l, the problem of interest is that of finding an l-mer from each string such that the distance among them is the least. That is, we want to find m l-mers X1,X2,…,Xm such that Xi is an l-mer in Si (for 1 ≤ i ≤ m) and the Hamming distance among these m l-mers is the least (from among all such possible l-mers). This problem has many applications, including motif search. Algorithms for finding the closest l-mers have been used in solving the (l,d)-motif search problem (see, e.g., [PeSz00, DBR07]). In this paper novel algorithms are proposed for this problem for the case of . A comprehensive experimental evaluation is performed for m=3, along with a further empirical study of m=4 and 5.
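The problem statement in this abstract can be illustrated with a naive exhaustive search. The sketch below is not the authors' algorithm; it simply enumerates every combination of one l-mer per string. The abstract does not say how the distance "among" m l-mers is aggregated, so the sum of pairwise Hamming distances is used here as an assumption, and the function names (`hamming`, `lmers`, `closest_lmers`) are hypothetical.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def lmers(s, l):
    """All contiguous substrings of length l in s, in order of position."""
    return [s[i:i + l] for i in range(len(s) - l + 1)]


def closest_lmers(strings, l):
    """Brute-force search: pick one l-mer per string minimising the
    sum of pairwise Hamming distances (an assumed aggregation).

    Returns (best_score, best_choice), or (None, None) if any input
    string is shorter than l.
    """
    best_score, best_choice = None, None
    # Enumerate every way of choosing one l-mer from each string.
    for choice in product(*(lmers(s, l) for s in strings)):
        score = sum(hamming(a, b) for a, b in combinations(choice, 2))
        if best_score is None or score < best_score:
            best_score, best_choice = score, choice
    return best_score, best_choice


if __name__ == "__main__":
    # Toy input with m = 3 strings; "ACG" occurs in all three.
    print(closest_lmers(["ACGTAC", "TTACGA", "GACGTT"], 3))
```

The running time of this baseline grows with the product of the string lengths, which is why the case of small m is the practical one and why more efficient algorithms are of interest.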
|
248
|
Shukla V, Varghese VK, Kabekkodu SP, Mallya S, Satyamoorthy K. A compilation of Web-based research tools for miRNA analysis. Brief Funct Genomics 2018; 16:249-273. [PMID: 28334134 DOI: 10.1093/bfgp/elw042] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 12/18/2022] Open
Abstract
Since the discovery of microRNAs (miRNAs), a class of noncoding RNAs that regulate gene expression posttranscriptionally in a sequence-specific manner, a number of tools useful for both basic and advanced applications have been released. This reflects the significance of miRNAs in many pathophysiological conditions, including cancer. The numerous bioinformatics tools developed for miRNA analysis are used for detection, expression, function, target prediction and many other related features. This review provides a comprehensive assessment of web-based tools for miRNA analysis that do not require prior knowledge of any programming language.
|
249
|
Yang F, Machalz D, Wang S, Li Z, Wolber G, Bureik M. A common polymorphic variant of UGT1A5 displays increased activity due to optimized cofactor binding. FEBS Lett 2018; 592:1837-1846. [DOI: 10.1002/1873-3468.13072] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 03/18/2018] [Revised: 04/19/2018] [Accepted: 04/20/2018] [Indexed: 12/19/2022]
Affiliation(s)
- Fan Yang
  - School of Pharmaceutical Science and Technology, Health Sciences Platform, Tianjin University, China
- David Machalz
  - Pharmaceutical and Medicinal Chemistry, Computer‐Aided Drug Design, Institute of Pharmacy, Free University Berlin, Germany
- Sisi Wang
  - School of Pharmaceutical Science and Technology, Health Sciences Platform, Tianjin University, China
- Zhengyi Li
  - School of Pharmaceutical Science and Technology, Health Sciences Platform, Tianjin University, China
- Gerhard Wolber
  - Pharmaceutical and Medicinal Chemistry, Computer‐Aided Drug Design, Institute of Pharmacy, Free University Berlin, Germany
- Matthias Bureik
  - School of Pharmaceutical Science and Technology, Health Sciences Platform, Tianjin University, China
|
250
|
Huang YC, Dang VD, Chang NC, Wang J. Multiple large inversions and breakpoint rewiring of gene expression in the evolution of the fire ant social supergene. Proc Biol Sci 2018; 285:20180221. [PMID: 29769360 PMCID: PMC5966598 DOI: 10.1098/rspb.2018.0221] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 01/29/2018] [Accepted: 04/16/2018] [Indexed: 12/16/2022] Open
Abstract
Supergenes consist of co-adapted loci that segregate together and are associated with adaptive traits. In the fire ant Solenopsis invicta, two 'social' supergene variants regulate differences in colony queen number and other traits. Suppressed recombination in this system is maintained, in part, by a greater than 9 Mb inversion, but the supergene is larger. Has the supergene in S. invicta undergone multiple large inversions? The initial gene content of the inverted allele of a supergene would be the same as that of the wild-type allele. So, how did the inversion increase in frequency? To address these questions, we cloned one extreme breakpoint in the fire ant supergene. In doing so, we found a second large (greater than 800 Kb) rearrangement. Furthermore, we determined the temporal order of the two big inversions based on the translocation pattern of a third small fragment. Because the S. invicta supergene lacks evolutionary strata, our finding of multiple inversions may support an introgression model of the supergene. Finally, we showed that one of the inversions swapped the promoter of a breakpoint-adjacent gene, which might have conferred a selective advantage relative to the non-inverted allele. Our findings provide a rare example of gene alterations arising directly from an inversion event.
Affiliation(s)
- Yu-Ching Huang
  - Biodiversity Research Center, Biodiversity Research Center, Academia Sinica, Taipei, Taiwan, Republic of China
- Viet Dai Dang
  - Biodiversity Research Center, Biodiversity Research Center, Academia Sinica, Taipei, Taiwan, Republic of China
  - Biodiversity Taiwan International Graduate Program, Biodiversity Research Center, Academia Sinica, Taipei, Taiwan, Republic of China
  - Department of Life Science, National Taiwan Normal University, Taipei, Taiwan, Republic of China
  - Department of Zoology, Southern Institute of Ecology, Hochiminh, Vietnam
- Ni-Chen Chang
  - Biodiversity Research Center, Biodiversity Research Center, Academia Sinica, Taipei, Taiwan, Republic of China
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- John Wang
  - Biodiversity Research Center, Biodiversity Research Center, Academia Sinica, Taipei, Taiwan, Republic of China
|