1
Hirsch N, Dahan I, D'haene E, Avni M, Vergult S, Vidal-García M, Magini P, Graziano C, Bonora E, Nardone AM, Brancati F, Fernández-Jaén A, Rory OJ, Hallgrimsson B, Birnbaum RY. HDAC9 structural variants disrupting TWIST1 transcriptional regulation lead to craniofacial and limb malformations. Genome Res 2022; 32:1242-1253. [PMID: 35710300 PMCID: PMC9341515 DOI: 10.1101/gr.276196.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2021] [Accepted: 06/02/2022] [Indexed: 11/25/2022]
Abstract
Structural variants (SVs) can affect protein-coding sequences as well as gene regulatory elements. However, SVs disrupting protein-coding sequences that also function as cis-regulatory elements remain largely uncharacterized. Here, we show that craniosynostosis patients with SVs containing the Histone deacetylase 9 (HDAC9) protein-coding sequence are associated with disruption of TWIST1 regulatory elements that reside within HDAC9 sequence. Based on SVs within the HDAC9-TWIST1 locus, we defined the 3'-HDAC9 sequence as a critical TWIST1 regulatory region, encompassing craniofacial TWIST1 enhancers and CTCF sites. Deletions of either Twist1 enhancers (eTw5-7Δ/Δ) or the Ctcf site (CtcfΔ/Δ) within the Hdac9 protein-coding sequence led to decreased Twist1 expression and altered anterior/posterior limb expression patterns of Shh pathway genes. This decreased Twist1 expression results in a smaller and asymmetric skull and polydactyly that resemble the Twist1+/- mouse phenotype. Chromatin conformation analysis revealed that the Twist1 promoter interacts with Hdac9 sequences that encompass Twist1 enhancers and a Ctcf site and that interactions depended on the presence of both regulatory regions. Finally, a large inversion of the entire Hdac9 sequence (Hdac9INV/+) in mice that does not disrupt HDAC9 expression but repositions Twist1 regulatory elements showed decreased Twist1 expression and led to a craniosynostosis-like phenotype and polydactyly. Thus, our study elucidated essential components of TWIST1 transcriptional machinery that reside within the HDAC9 sequence. It suggests that SVs encompassing protein-coding sequence could lead to a phenotype that is not attributed to its protein function but rather to a disruption of the transcriptional regulation of a nearby gene.
Affiliation(s)
- Pamela Magini
- U.O. Genetica Medica, IRCCS Azienda Ospedaliero Universitaria di Bologna
- Claudio Graziano
- U.O. Genetica Medica, IRCCS Azienda Ospedaliero Universitaria di Bologna
2
Boldyreva LV, Andreyeva EN, Pindyurin AV. Position Effect Variegation: Role of the Local Chromatin Context in Gene Expression Regulation. Mol Biol 2022. [DOI: 10.1134/s0026893322030049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
3
Spielmann M, Kircher M. Computational and experimental methods for classifying variants of unknown clinical significance. Cold Spring Harb Mol Case Stud 2022; 8:mcs.a006196. [PMID: 35483875 PMCID: PMC9059783 DOI: 10.1101/mcs.a006196] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/26/2022] Open
Abstract
The increase in sequencing capacity, reduction in costs, and national and international coordinated efforts have led to the widespread introduction of next-generation sequencing (NGS) technologies in patient care. More generally, human genetics and genomic medicine are gaining importance for more and more patients. Some communities are already discussing the prospect of sequencing each individual's genome at time of birth. Together with digital health records, this shall enable individualized treatments and preventive measures, so-called precision medicine. A central step in this process is the identification of disease causal mutations or variant combinations that make us more susceptible for diseases. Although various technological advances have improved the identification of genetic alterations, the interpretation and ranking of the identified variants remains a major challenge. Based on our knowledge of molecular processes or previously identified disease variants, we can identify potentially functional genetic variants and, using different lines of evidence, we are sometimes able to demonstrate their pathogenicity directly. However, the vast majority of variants are classified as variants of uncertain clinical significance (VUSs) with not enough experimental evidence to determine their pathogenicity. In these cases, computational methods may be used to improve the prioritization and an increasing toolbox of experimental methods is emerging that can be used to assay the molecular effects of VUSs. Here, we discuss how computational and experimental methods can be used to create catalogs of variant effects for a variety of molecular and cellular phenotypes. We discuss the prospects of integrating large-scale functional data with machine learning and clinical knowledge for the development of accurate pathogenicity predictions for clinical applications.
Affiliation(s)
- Malte Spielmann
- Institute of Human Genetics, University of Lübeck, 23562 Lübeck, Germany; Institute of Human Genetics, Christian-Albrechts-Universität, 24105 Kiel, Germany; Human Molecular Genomics Group, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), partner site Hamburg/Lübeck/Kiel, 23562 Lübeck, Germany
- Martin Kircher
- Institute of Human Genetics, University of Lübeck, 23562 Lübeck, Germany; Berlin Institute of Health at Charité—Universitätsmedizin Berlin, 10117 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), partner site Berlin, 10115 Berlin, Germany
4
Chen J, Guo JT. Structural and functional analysis of somatic coding and UTR indels in breast and lung cancer genomes. Sci Rep 2021; 11:21178. [PMID: 34707120 DOI: 10.1038/s41598-021-00583-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2021] [Accepted: 10/14/2021] [Indexed: 11/24/2022] Open
Abstract
Insertions and deletions (Indels) represent one of the major variation types in the human genome and have been implicated in diseases including cancer. To study the features of somatic indels in different cancer genomes, we investigated the indels from two large samples of cancer types: invasive breast carcinoma (BRCA) and lung adenocarcinoma (LUAD). Besides mapping somatic indels in both coding and untranslated regions (UTRs) from the cancer whole exome sequences, we investigated the overlap between these indels and transcription factor binding sites (TFBSs), the key elements for regulation of gene expression that have been found in both coding and non-coding sequences. Compared to the germline indels in healthy genomes, somatic indels contain more coding indels with higher than expected frame-shift (FS) indels in cancer genomes. LUAD has a higher ratio of deletions and higher coding and FS indel rates than BRCA. More importantly, these somatic indels in cancer genomes tend to locate in sequences with important functions, which can affect the core secondary structures of proteins and have a bigger overlap with predicted TFBSs in coding regions than the germline indels. The somatic CDS indels are also enriched in highly conserved nucleotides when compared with germline CDS indels.
5
Lange M, Begolli R, Giakountis A. Non-Coding Variants in Cancer: Mechanistic Insights and Clinical Potential for Personalized Medicine. Noncoding RNA 2021; 7:47. [PMID: 34449663 PMCID: PMC8395730 DOI: 10.3390/ncrna7030047] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/28/2021] [Revised: 07/26/2021] [Accepted: 08/01/2021] [Indexed: 12/11/2022] Open
Abstract
The cancer genome is characterized by extensive variability, in the form of Single Nucleotide Polymorphisms (SNPs) or structural variations such as Copy Number Alterations (CNAs) across wider genomic areas. At the molecular level, most SNPs and/or CNAs reside in non-coding sequences, ultimately affecting the regulation of oncogenes and/or tumor-suppressors in a cancer-specific manner. Notably, inherited non-coding variants can predispose for cancer decades prior to disease onset. Furthermore, accumulation of additional non-coding driver mutations during progression of the disease gives rise to genomic instability, acting as the driving force of neoplastic development and malignant evolution. Therefore, detection and characterization of such mutations can improve risk assessment for healthy carriers and expand the diagnostic and therapeutic toolbox for the patient. This review focuses on functional variants that reside in transcribed or non-transcribed non-coding regions of the cancer genome and presents a collection of appropriate state-of-the-art methodologies to study them.
Affiliation(s)
- Marios Lange
- Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, 41500 Larissa, Greece
- Rodiola Begolli
- Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, 41500 Larissa, Greece
- Antonis Giakountis
- Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, 41500 Larissa, Greece
- Institute for Fundamental Biomedical Research, B.S.R.C “Alexander Fleming”, 34 Fleming Str., 16672 Vari, Greece
6
Wang Y, Shi FY, Liang Y, Gao G. REVA as A Well-curated Database for Human Expression-modulating Variants. Genomics Proteomics Bioinformatics 2021; 19:590-601. [PMID: 34224878 PMCID: PMC9040024 DOI: 10.1016/j.gpb.2021.06.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2020] [Revised: 06/22/2021] [Accepted: 06/25/2021] [Indexed: 10/25/2022]
Abstract
More than 90% of disease- and trait-associated human variants are noncoding. By systematically screening multiple large-scale studies, we compiled REVA, a manually curated database for over 11.8 million experimentally tested noncoding variants with expression-modulating potentials. We provided 2424 functional annotations that could be used to pinpoint the plausible regulatory mechanism of these variants. We further benchmarked multiple state-of-the-art computational tools and found their limited sensitivity remains a serious challenge for effective large-scale analysis. REVA provides high-quality experimentally tested expression-modulating variants with extensive functional annotations, which will be useful for users in the noncoding variants community. REVA is available at http://reva.gao-lab.org.
Affiliation(s)
- Yu Wang
- Biomedical Pioneering Innovation Center (BIOPIC) & Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI) and State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, China
- Fang-Yuan Shi
- Biomedical Pioneering Innovation Center (BIOPIC) & Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI) and State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, China
- Yu Liang
- Human Aging Research Institute, School of Life Sciences, Nanchang University, Nanchang 330031, China
- Ge Gao
- Biomedical Pioneering Innovation Center (BIOPIC) & Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI) and State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, China.
7
Townsley KG, Brennand KJ, Huckins LM. Massively parallel techniques for cataloguing the regulome of the human brain. Nat Neurosci 2020; 23:1509-1521. [PMID: 33199899 PMCID: PMC8018778 DOI: 10.1038/s41593-020-00740-1] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 08/11/2020] [Accepted: 10/13/2020] [Indexed: 12/14/2022]
Abstract
Complex brain disorders are highly heritable and arise from a complex polygenic risk architecture. Many disease-associated loci are found in non-coding regions that house regulatory elements. These elements influence the transcription of target genes, many of which demonstrate cell-type-specific expression patterns, and thereby affect phenotypically relevant molecular pathways. Thus, cell-type-specificity must be considered when prioritizing candidate risk loci, variants and target genes. This Review discusses the use of high-throughput assays in human induced pluripotent stem cell-based neurodevelopmental models to probe genetic risk in a cell-type- and patient-specific manner. The application of massively parallel reporter assays in human induced pluripotent stem cells can characterize the human regulome and test the transcriptional responses of putative regulatory elements. Parallel CRISPR-based screens can further functionally dissect this genetic regulatory architecture. The integration of these emerging technologies could decode genetic risk into medically actionable information, thereby improving genetic diagnosis and identifying novel points of therapeutic intervention.
Affiliation(s)
- Kayla G Townsley
- Graduate School of Biomedical Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Kristen J Brennand
- Graduate School of Biomedical Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Laura M Huckins
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Mental Illness Research, Education and Clinical Centers, James J. Peters Department of Veterans Affairs Medical Center, Bronx, NY, USA.
8
Faure AJ, Schmiedel JM, Baeza-Centurion P, Lehner B. DiMSum: an error model and pipeline for analyzing deep mutational scanning data and diagnosing common experimental pathologies. Genome Biol 2020; 21:207. [PMID: 32799905 DOI: 10.1186/s13059-020-02091-3] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 12/04/2019] [Accepted: 07/05/2020] [Indexed: 12/30/2022] Open
Abstract
Deep mutational scanning (DMS) enables multiplexed measurement of the effects of thousands of variants of proteins, RNAs, and regulatory elements. Here, we present a customizable pipeline, DiMSum, that represents an end-to-end solution for obtaining variant fitness and error estimates from raw sequencing data. A key innovation of DiMSum is the use of an interpretable error model that captures the main sources of variability arising in DMS workflows, outperforming previous methods. DiMSum is available as an R/Bioconda package and provides summary reports to help researchers diagnose common DMS pathologies and take remedial steps in their analyses.
9
Ryan GE, Farley EK. Functional genomic approaches to elucidate the role of enhancers during development. Wiley Interdiscip Rev Syst Biol Med 2020; 12:e1467. [PMID: 31808313 PMCID: PMC7027484 DOI: 10.1002/wsbm.1467] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/07/2019] [Revised: 10/02/2019] [Accepted: 10/11/2019] [Indexed: 12/22/2022]
Abstract
Successful development depends on the precise tissue-specific regulation of genes by enhancers, genetic elements that act as switches to control when and where genes are expressed. Because enhancers are critical for development, and the majority of disease-associated mutations reside within enhancers, it is essential to understand which sequences within enhancers are important for function. Advances in sequencing technology have enabled the rapid generation of genomic data that predict putative active enhancers, but functionally validating these sequences at scale remains a fundamental challenge. Herein, we discuss the power of genome-wide strategies used to identify candidate enhancers, and also highlight limitations and misconceptions that have arisen from these data. We discuss the use of massively parallel reporter assays to test enhancers for function at scale. We also review recent advances in our ability to study gene regulation during development, including CRISPR-based tools to manipulate genomes and single-cell transcriptomics to finely map gene expression. Finally, we look ahead to a synthesis of complementary genomic approaches that will advance our understanding of enhancer function during development. This article is categorized under: Physiology > Mammalian Physiology in Health and Disease; Developmental Biology > Developmental Processes in Health and Disease; Laboratory Methods and Technologies > Genetic/Genomic Methods.
Affiliation(s)
- Genevieve E. Ryan
- Department of Medicine, University of California, San Diego, California
- Division of Biological Sciences, Department of Medicine, University of California, San Diego, California
- Emma K. Farley
- Department of Medicine, University of California, San Diego, California
- Division of Biological Sciences, Department of Medicine, University of California, San Diego, California
10
Myint L, Avramopoulos DG, Goff LA, Hansen KD. Linear models enable powerful differential activity analysis in massively parallel reporter assays. BMC Genomics 2019; 20:209. [PMID: 30866806 PMCID: PMC6417258 DOI: 10.1186/s12864-019-5556-x] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 11/30/2018] [Accepted: 02/22/2019] [Indexed: 12/15/2022] Open
Abstract
Background: Massively parallel reporter assays (MPRAs) have emerged as a popular means for understanding noncoding variation in a variety of conditions. While a large number of experiments have been described in the literature, analysis typically uses ad-hoc methods. There has been little attention to comparing performance of methods across datasets. Results: We present the mpralm method which we show is calibrated and powerful, by analyzing its performance on multiple MPRA datasets. We show that it outperforms existing statistical methods for analysis of this data type, in the first comprehensive evaluation of statistical methods on several datasets. We investigate theoretical and real-data properties of barcode summarization methods and show an unappreciated impact of summarization method for some datasets. Finally, we use our model to conduct a power analysis for this assay and show substantial improvements in power by performing up to 6 replicates per condition, whereas sequencing depth has smaller impact; we recommend to always use at least 4 replicates. An R package is available from the Bioconductor project. Conclusions: Together, these results inform recommendations for differential analysis, general group comparisons, and power analysis and will help improve design and analysis of MPRA experiments.
Affiliation(s)
- Leslie Myint
- Department of Mathematics, Statistics, and Computer Science, Macalester College, 1600 Grand Ave, Saint Paul, MN 55105, USA
- Dimitrios G Avramopoulos
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, USA
- Loyal A Goff
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, USA; Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, USA
- Kasper D Hansen
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe St, E3527, Baltimore, MD 21212, USA; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, USA.
11
Abstract
What proportion of coding sequence nucleotides have roles in splicing, and how strong is the selection that maintains them? Despite a large body of research into exonic splice regulatory signals, these questions have not been answered. This is because, to our knowledge, previous investigations have not explicitly disentangled the frequency of splice regulatory elements from the strength of the evolutionary constraint under which they evolve. Current data are consistent both with a scenario of weak and diffuse constraint, enveloping large swaths of sequence, as well as with well-defined pockets of strong purifying selection. In the former case, natural selection on exonic splice enhancers (ESEs) might primarily act as a slight modifier of codon usage bias. In the latter, mutations that disrupt ESEs are likely to have large fitness and, potentially, clinical effects. To distinguish between these scenarios, we used several different methods to determine the distribution of selection coefficients for new mutations within ESEs. The analyses converged to suggest that ∼15%-20% of fourfold degenerate sites are part of functional ESEs. Most of these sites are under strong evolutionary constraint. Therefore, exonic splice regulation does not simply impose a weak bias that gently nudges coding sequence evolution in a particular direction. Rather, the selection to preserve these motifs is a strong force that severely constrains the evolution of a substantial proportion of coding nucleotides. Thus synonymous mutations that disrupt ESEs should be considered as a potentially common cause of single-locus genetic disorders.
Affiliation(s)
- Rosina Savisaar
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom
- Laurence D Hurst
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom
12
Zhang P, Xia JH, Zhu J, Gao P, Tian YJ, Du M, Guo YC, Suleman S, Zhang Q, Kohli M, Tillmans LS, Thibodeau SN, French AJ, Cerhan JR, Wang LD, Wei GH, Wang L. High-throughput screening of prostate cancer risk loci by single nucleotide polymorphisms sequencing. Nat Commun 2018; 9:2022. [PMID: 29789573 PMCID: PMC5964124 DOI: 10.1038/s41467-018-04451-x] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 08/24/2017] [Accepted: 05/02/2018] [Indexed: 12/18/2022] Open
Abstract
Functional characterization of disease-causing variants at risk loci has been a significant challenge. Here we report a high-throughput single-nucleotide polymorphisms sequencing (SNPs-seq) technology to simultaneously screen hundreds to thousands of SNPs for their allele-dependent protein-binding differences. This technology takes advantage of higher retention rate of protein-bound DNA oligos in protein purification column to quantitatively sequence these SNP-containing oligos. We apply this technology to test prostate cancer-risk loci and observe differential allelic protein binding in a significant number of selected SNPs. We also test a unique application of self-transcribing active regulatory region sequencing (STARR-seq) in characterizing allele-dependent transcriptional regulation and provide detailed functional analysis at two risk loci (RGS17 and ASCL2). Together, we introduce a powerful high-throughput pipeline for large-scale screening of functional SNPs at disease risk loci. Functional characterization of disease-causing variants at risk loci in cancer is challenging. Here, in prostate cancer the authors report a pipeline for high-throughput single-nucleotide polymorphisms sequencing (SNPs-seq) for large scale screening of functional SNPs at disease risk loci.
Affiliation(s)
- Peng Zhang
- Henan Key Laboratory for Esophageal Cancer Research, The First Affiliated Hospital of Zhengzhou University, 40 Daxue Road, 450052, Zhengzhou, Henan, China; Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
- Ji-Han Xia
- Biocenter Oulu, Faculty of Biochemistry and Molecular Medicine, University of Oulu, Aapistie 5 A, 90220, Oulu, Finland
- Jing Zhu
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
- Ping Gao
- Biocenter Oulu, Faculty of Biochemistry and Molecular Medicine, University of Oulu, Aapistie 5 A, 90220, Oulu, Finland
- Yi-Jun Tian
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
- Meijun Du
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
- Yong-Chen Guo
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
- Sufyan Suleman
- Biocenter Oulu, Faculty of Biochemistry and Molecular Medicine, University of Oulu, Aapistie 5 A, 90220, Oulu, Finland
- Qin Zhang
- Biocenter Oulu, Faculty of Biochemistry and Molecular Medicine, University of Oulu, Aapistie 5 A, 90220, Oulu, Finland
- Manish Kohli
- Department of Oncology, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA
- Lori S Tillmans
- Department of Laboratory Medicine and Pathology, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA
- Stephen N Thibodeau
- Department of Laboratory Medicine and Pathology, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA
- Amy J French
- Department of Laboratory Medicine and Pathology, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA
- James R Cerhan
- Department of Health Sciences Research, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA
- Li-Dong Wang
- Henan Key Laboratory for Esophageal Cancer Research, The First Affiliated Hospital of Zhengzhou University, 40 Daxue Road, 450052, Zhengzhou, Henan, China.
- Gong-Hong Wei
- Biocenter Oulu, Faculty of Biochemistry and Molecular Medicine, University of Oulu, Aapistie 5 A, 90220, Oulu, Finland.
- Liang Wang
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA.
13
Abstract
While the principal force directing coding sequence (CDS) evolution is selection on protein function, to ensure correct gene expression CDSs must also maintain interactions with RNA-binding proteins (RBPs). Understanding how our genes are shaped by these RNA-level pressures is necessary for diagnostics and for improving transgenes. However, the evolutionary impact of the need to maintain RBP interactions remains unresolved. Are coding sequences constrained by the need to specify RBP binding motifs? If so, what proportion of mutations are affected? Might sequence evolution also be constrained by the need not to specify motifs that might attract unwanted binding, for instance because it would interfere with exon definition? Here, we have scanned human CDSs for motifs that have been experimentally determined to be recognized by RBPs. We observe two sets of motifs: those that are enriched over nucleotide-controlled null and those that are depleted. Importantly, the depleted set is enriched for motifs recognized by non-CDS binding RBPs. Supporting the functional relevance of our observations, we find that motifs that are more enriched are also slower-evolving. The net effect of this selection to preserve is a reduction in the overall rate of synonymous evolution of 2-3% in both primates and rodents. Stronger motif depletion, on the other hand, is associated with stronger selection against motif gain in evolution. The challenge faced by our CDSs is therefore not only one of attracting the right RBPs but also of avoiding the wrong ones, all while also evolving under selection pressures related to protein structure.
Affiliation(s)
- Rosina Savisaar
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
14
Kreimer A, Zeng H, Edwards MD, Guo Y, Tian K, Shin S, Welch R, Wainberg M, Mohan R, Sinnott-Armstrong NA, Li Y, Eraslan G, Amin TB, Goke J, Mueller NS, Kellis M, Kundaje A, Beer MA, Keles S, Gifford DK, Yosef N. Predicting gene expression in massively parallel reporter assays: A comparative study. Hum Mutat 2017; 38:1240-1250. [PMID: 28220625 PMCID: PMC5560998 DOI: 10.1002/humu.23197] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 08/26/2016] [Revised: 01/19/2017] [Accepted: 02/12/2017] [Indexed: 02/03/2023]
Abstract
In many human diseases, associated genetic changes tend to occur within noncoding regions, whose effect might be related to transcriptional control. A central goal in human genetics is to understand the function of such noncoding regions: given a region that is statistically associated with changes in gene expression (expression quantitative trait locus [eQTL]), does it in fact play a regulatory role? And if so, how is this role "coded" in its sequence? These questions were the subject of the Critical Assessment of Genome Interpretation eQTL challenge. Participants were given a set of sequences that flank eQTLs in humans and were asked to predict whether these are capable of regulating transcription (as evaluated by massively parallel reporter assays), and whether this capability changes between alternative alleles. Here, we report lessons learned from this community effort. By inspecting predictive properties in isolation, and conducting meta-analysis over the competing methods, we find that using chromatin accessibility and transcription factor binding as features in an ensemble of classifiers or regression models leads to the most accurate results. We then characterize the loci that are harder to predict, putting the spotlight on areas of weakness, which we expect to be the subject of future studies.
Affiliation(s)
- Anat Kreimer
  - Department of Electrical Engineering and Computer Science and Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA
  - Department of Bioengineering and Therapeutic Sciences, Institute for Human Genetics, University of California, San Francisco, San Francisco, California, USA
- Haoyang Zeng
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Matthew D. Edwards
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Yuchun Guo
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Kevin Tian
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Sunyoung Shin
  - Department of Statistics, Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Rene Welch
  - Department of Statistics, Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Michael Wainberg
  - Department of Genetics, Stanford University School of Medicine, Department of Computer Science, Stanford, California 94305, USA
- Rahul Mohan
  - Department of Genetics, Stanford University School of Medicine, Department of Computer Science, Stanford, California 94305, USA
- Nicholas A. Sinnott-Armstrong
  - Department of Genetics, Stanford University School of Medicine, Department of Computer Science, Stanford, California 94305, USA
- Yue Li
  - Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, Massachusetts 02139, USA
- Gökcen Eraslan
  - Computational Cell Maps, Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstr. 1 85764 Neuherberg, Germany
- Talal Bin Amin
  - Computational and Systems Biology, Genome Institute of Singapore, Singapore 138672, Singapore
- Jonathan Goke
  - Computational and Systems Biology, Genome Institute of Singapore, Singapore 138672, Singapore
- Nikola S. Mueller
  - Computational Cell Maps, Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstr. 1 85764 Neuherberg, Germany
- Manolis Kellis
  - Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, Massachusetts 02139, USA
- Anshul Kundaje
  - Department of Genetics, Stanford University School of Medicine, Department of Computer Science, Stanford, California 94305, USA
- Michael A Beer
  - McKusick-Nathans Institute of Genetic Medicine, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Sunduz Keles
  - Department of Statistics, Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, USA
- David K. Gifford
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Nir Yosef
  - Department of Electrical Engineering and Computer Science and Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA
  - Ragon Institute of Massachusetts General Hospital, MIT and Harvard, Cambridge, MA, 02139
|
15
|
Elkon R, Agami R. Characterization of noncoding regulatory DNA in the human genome. Nat Biotechnol 2017; 35:732-746. [DOI: 10.1038/nbt.3863] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 06/01/2016] [Accepted: 03/31/2017] [Indexed: 12/22/2022]
|
16
|
Abstract
In addition to coding information, human exons contain sequences necessary for correct splicing. These elements are known to be under purifying selection and their disruption can cause disease. However, the density of functional exonic splicing information remains profoundly uncertain. Several groups have experimentally investigated how mutations at different exonic positions affect splicing. They have found splice information to be distributed widely in exons, with one estimate putting the proportion of splicing-relevant nucleotides at >90%. These results suggest that splicing could place a major pressure on exon evolution. However, analyses of sequence conservation have concluded that the need to preserve splice regulatory signals only slightly constrains exon evolution, with a resulting decrease in the average human rate of synonymous evolution of only 1–4%. Why do these two lines of research come to such different conclusions? Among other reasons, we suggest that the methods are measuring different things: one assays the density of sites that affect splicing, the other the density of sites whose effects on splicing are visible to selection. In addition, the experimental methods typically consider short exons, thereby enriching for nucleotides close to the splice junction, such sites being enriched for splice-control elements. By contrast, in part owing to correction for nucleotide composition biases and to the assumption that constraint only operates on exon ends, the conservation-based methods can be overly conservative.
Affiliation(s)
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
|
17
|
Abstract
Due to plummeting costs, whole genome sequencing of patients and cancers will soon become routine medical practice; however, we cannot currently predict how non-coding genotype affects cellular gene expression. Gene regulation research has recently been dominated by observational approaches that correlate chromatin state with regulatory function. These approaches are limited to the available genotypes and cannot scratch the surface of possible sequence combinations, and thus there is a need for perturbation-based approaches to better understand how DNA encodes gene regulatory functions. CRISPR/Cas9 genome editing has revolutionized our ability to alter genome sequence, and CRISPR/Cas9-based assays have already begun to contribute to new paradigms of gene regulation. We discuss the variety of arenas in which current and future CRISPR-based technologies will aid in developing predictive understanding of how genome sequence leads to gene regulatory function.
Affiliation(s)
- Budhaditya Banerjee
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115
- Richard I Sherwood
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115
  - Hubrecht Institute and UMC Utrecht, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
|
18
|
Martinez AF, Abe Y, Hong S, Molyneux K, Yarnell D, Löhr H, Driever W, Acosta MT, Arcos-Burgos M, Muenke M. An Ultraconserved Brain-Specific Enhancer Within ADGRL3 (LPHN3) Underpins Attention-Deficit/Hyperactivity Disorder Susceptibility. Biol Psychiatry 2016; 80:943-954. [PMID: 27692237 PMCID: PMC5108697 DOI: 10.1016/j.biopsych.2016.06.026] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Received: 12/11/2015] [Revised: 06/28/2016] [Accepted: 06/30/2016] [Indexed: 12/22/2022]
Abstract
BACKGROUND: Genetic factors predispose individuals to attention-deficit/hyperactivity disorder (ADHD). Previous studies have reported linkage and association to ADHD of gene variants within ADGRL3. In this study, we functionally analyzed noncoding variants in this gene as likely pathological contributors. METHODS: In silico, in vitro, and in vivo approaches were used to identify and characterize evolutionary conserved elements within the ADGRL3 linkage region (~207 Kb). Family-based genetic analyses of 838 individuals (372 affected and 466 unaffected patients) identified ADHD-associated single nucleotide polymorphisms harbored in some of these conserved elements. Luciferase assays and zebrafish green fluorescent protein transgenesis tested conserved elements for transcriptional enhancer activity. Electromobility shift assays were used to verify transcription factor-binding disruption by ADHD risk alleles. RESULTS: An ultraconserved element was discovered (evolutionary conserved region 47) that functions as a transcriptional enhancer. A three-variant ADHD risk haplotype in evolutionary conserved region 47, formed by rs17226398, rs56038622, and rs2271338, reduced enhancer activity by 40% in neuroblastoma and astrocytoma cells (pBonferroni < .0001). This enhancer also drove green fluorescent protein expression in the zebrafish brain in a tissue-specific manner, sharing aspects of endogenous ADGRL3 expression. The rs2271338 risk allele disrupts binding of YY1 transcription factor, an important factor in the development and function of the central nervous system. Expression quantitative trait loci analysis of postmortem human brain tissues revealed an association between rs2271338 and reduced ADGRL3 expression in the thalamus. CONCLUSIONS: These results uncover the first functional evidence of common noncoding variants with potential implications for the pathology of ADHD.
|
19
|
Vockley CM, Barrera A, Reddy TE. Decoding the role of regulatory element polymorphisms in complex disease. Curr Opin Genet Dev 2016; 43:38-45. [PMID: 27984826 DOI: 10.1016/j.gde.2016.10.007] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 10/14/2016] [Accepted: 10/24/2016] [Indexed: 11/29/2022]
Abstract
Genetic variation in gene regulatory elements contributes to diverse human diseases, ranging from rare and severe developmental defects to common and complex diseases such as obesity and diabetes. Early examples of regulatory mechanisms of human diseases involve large chromosomal rearrangements that change the regulatory connections within the genome. Single nucleotide variants in regulatory elements can also contribute to disease, potentially via demonstrated associations with changes in transcription factor binding, enhancer activity, post-translational histone modifications, long-range enhancer-promoter interactions, or RNA polymerase recruitment. Establishing causality between non-coding genetic variants, gene regulation, and disease has recently become more feasible with advances in genome-editing and epigenome-editing technologies. As establishing causal regulatory mechanisms of diseases becomes routine, functional annotation of target genes is likely to emerge as a major bottleneck for translation into patient benefits. In this review, we discuss the history and recent advances in understanding the regulatory mechanisms of human disease, and new challenges likely to be encountered once establishing those mechanisms becomes rote.
Affiliation(s)
- Christopher M Vockley
  - Department of Cell Biology, Duke University Medical Center, Durham, NC 27710, United States; Department of Biostatistics & Bioinformatics, and Center for Genomic & Computational Biology, Duke University Medical Center, Durham, NC 27710, United States
- Alejandro Barrera
  - Department of Biostatistics & Bioinformatics, and Center for Genomic & Computational Biology, Duke University Medical Center, Durham, NC 27710, United States
- Timothy E Reddy
  - Department of Biostatistics & Bioinformatics, and Center for Genomic & Computational Biology, Duke University Medical Center, Durham, NC 27710, United States
|
20
|
Andersen OM, Rudolph IM, Willnow TE. Risk factor SORL1: from genetic association to functional validation in Alzheimer's disease. Acta Neuropathol 2016; 132:653-665. [PMID: 27638701 PMCID: PMC5073117 DOI: 10.1007/s00401-016-1615-4] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 06/30/2016] [Revised: 08/12/2016] [Accepted: 09/05/2016] [Indexed: 12/21/2022]
Abstract
Alzheimer's disease (AD) represents one of the most dramatic threats to healthy aging and devising effective treatments for this devastating condition remains a major challenge in biomedical research. Much has been learned about the molecular concepts that govern proteolytic processing of the amyloid precursor protein to amyloid-β peptides (Aβ), and how accelerated accumulation of neurotoxic Aβ peptides underlies neuronal cell death in rare familial but also common sporadic forms of this disease. Out of a plethora of proposed modulators of amyloidogenic processing, one protein emerged as a key factor in AD pathology, a neuronal sorting receptor termed SORLA. Independent approaches using human genetics, clinical pathology, or exploratory studies in animal models all converge on this receptor that is now considered a central player in AD-related processes by many. This review will provide a comprehensive overview of the evidence implicating SORLA-mediated protein sorting in neurodegenerative processes, and how receptor gene variants in the human population impair functional receptor expression in sporadic but possibly also in autosomal-dominant forms of AD.
Affiliation(s)
- Olav M Andersen
  - Department of Biomedicine, Danish Research Institute of Translational Neuroscience DANDRITE-Nordic EMBL Partnership for Molecular Medicine, Aarhus University, Ole Worms Alle 3, Aarhus C, 8000, Aarhus, Denmark
- Ina-Maria Rudolph
  - Max-Delbrueck-Center for Molecular Medicine, Robert-Roessle-Strasse 10, 13125, Berlin, Germany
- Thomas E Willnow
  - Max-Delbrueck-Center for Molecular Medicine, Robert-Roessle-Strasse 10, 13125, Berlin, Germany
|
21
|
Gasperini M, Starita L, Shendure J. The power of multiplexed functional analysis of genetic variants. Nat Protoc 2016; 11:1782-7. [PMID: 27583640 DOI: 10.1038/nprot.2016.135] [Citation(s) in RCA: 80] [Impact Index Per Article: 10.0] [Received: 06/08/2016] [Accepted: 07/13/2016] [Indexed: 12/30/2022]
Abstract
New technologies have recently enabled saturation mutagenesis and functional analysis of nearly all possible variants of regulatory elements or proteins of interest in single experiments. Here we discuss the past, present, and future of such multiplexed (functional) assays for variant effects (MAVEs). MAVEs provide detailed insight into sequence-function relationships, and they may prove critical for the prospective clinical interpretation of genetic variants.
|
22
|
Abstract
Ultraconserved elements (UCEs) are discrete genomic elements conserved across large evolutionary distances. Although UCEs have been linked to multiple facets of mammalian gene regulation, their extreme evolutionary conservation remains largely unexplained. Here, we apply a computational approach to investigate this question in Drosophila, exploring the molecular functions of more than 1,500 UCEs shared across the genomes of 12 Drosophila species. Our data indicate that Drosophila UCEs are hubs for gene regulatory functions and suggest that UCE sequence invariance originates from their combinatorial roles in gene control. We also note that the gene regulatory roles of intronic and intergenic UCEs (iUCEs) are distinct from those found in exonic UCEs (eUCEs). In iUCEs, transcription factor (TF) and epigenetic factor binding data strongly support iUCE roles in transcriptional and epigenetic regulation. In contrast, analyses of eUCEs indicate that they are two orders of magnitude more likely than expected to simultaneously include protein-coding sequence, TF-binding sites, splice sites, and RNA editing sites but have reduced roles in transcriptional or epigenetic regulation. Furthermore, we use a Drosophila cell culture system and transgenic Drosophila embryos to validate the notion of UCE combinatorial regulatory roles using an eUCE within the Hox gene Ultrabithorax and show that its protein-coding region also contains alternative splicing regulatory information. Taken together, our experiments indicate that UCEs emerge as a result of combinatorial gene regulatory roles and highlight common features in mammalian and insect UCEs, implying that similar processes might underlie ultraconservation in diverse animal taxa.
Affiliation(s)
- Maria Warnefors
  - Sussex Neuroscience, School of Life Sciences, University of Sussex, Brighton, United Kingdom; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Britta Hartmann
  - Institute of Human Genetics, Freiburg, Germany; BIOSS Centre for Biological Signaling Studies, University Medical Center Freiburg, Freiburg, Germany
- Stefan Thomsen
  - Sussex Neuroscience, School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Claudio R Alonso
  - Sussex Neuroscience, School of Life Sciences, University of Sussex, Brighton, United Kingdom
|
23
|
Abstract
Exonic enhancers (eExons) are coding exons that also function as enhancers of the gene in which they reside or (a) nearby gene(s). Mutations that affect the enhancer activity of these eExons have been associated with human disease. Therefore, eExon mutations should be taken into account in exome and genome sequencing projects, not only because of the ability of these mutations to modify the encoded proteins but also because of their effects on enhancer activity.
Affiliation(s)
- Nadav Ahituv
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, 94158, USA; Institute for Human Genetics, University of California San Francisco, San Francisco, CA, 94158, USA
|
24
|
Abstract
Nucleotide changes in gene regulatory elements can have a major effect on interindividual differences in drug response. For example, by reviewing all published pharmacogenomic genome-wide association studies, we show here that 96.4% of the associated single nucleotide polymorphisms reside in noncoding regions. We discuss how sequencing technologies are improving our ability to identify drug response-associated regulatory elements genome-wide and to annotate nucleotide variants within them. We highlight specific examples of how nucleotide changes in these elements can affect drug response and illustrate the techniques used to find them and functionally characterize them. Finally, we also discuss challenges in the field of drug-responsive regulatory elements that need to be considered in order to translate these findings into the clinic.
Affiliation(s)
- Marcelo R Luizon
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA; Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
- Nadav Ahituv
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA; Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
|
25
|
Abstract
In addition to coding for proteins, exons can also impact transcription by encoding regulatory elements such as enhancers. It has been debated whether such features confer heightened selective constraint, or evolve neutrally. We have addressed this question by developing a new approach to disentangle the sources of selection acting on exonic enhancers, in which we model the evolutionary rates of every possible substitution as a function of their effects on both protein sequence and enhancer activity. In three exonic enhancers, we found no significant association between evolutionary rates and effects on enhancer activity. This suggests that despite having biochemical activity, these exonic enhancers have no detectable selective constraint, and thus are unlikely to play a major role in protein evolution.
|
26
|
Abstract
Enhancers control the timing, location and expression levels of their target genes. Nucleotide variation in enhancers has been shown to lead to numerous phenotypes, including human disease. While putative enhancer sequences and nucleotide variation within them can now be detected rapidly using various genomic technologies, the functional consequences of these variants remain largely unknown. Massively parallel reporter assays (MPRAs) can overcome this hurdle by providing the ability to test thousands of sequences and nucleotide variants within them for enhancer activity en masse. Here, we describe this technology and specifically focus on how it is being used to obtain an increased understanding of enhancer regulatory code and grammar.
Affiliation(s)
- Fumitaka Inoue
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA; Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
- Nadav Ahituv
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA; Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
|
27
|
White MA. Understanding how cis-regulatory function is encoded in DNA sequence using massively parallel reporter assays and designed sequences. Genomics 2015; 106:165-170. [PMID: 26072432 DOI: 10.1016/j.ygeno.2015.06.003] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Received: 02/18/2015] [Revised: 05/09/2015] [Accepted: 06/08/2015] [Indexed: 01/07/2023]
Abstract
Genome-scale methods have identified thousands of candidate cis-regulatory elements (CREs), but methods to directly assay the regulatory function of these elements on a comparably large scale have not been available. The inability to directly test and perturb the regulatory activity of large numbers of DNA sequences has hindered efforts to discover how cis-regulatory function is encoded in genomic sequence. Recently developed massively parallel reporter gene assays combine next generation sequencing with high-throughput oligonucleotide synthesis to offer the capacity to test and mutationally perturb thousands of specifically chosen or designed cis-regulatory sequences in a single experiment. These assays are the basis of recent studies that include large-scale functional validation of genomic CREs, exhaustive mutational analyses of individual regulatory sequences, and tests of large libraries of synthetic CREs. The results demonstrate how massively parallel reporter assays with libraries of designed sequences provide the statistical power required to address previously intractable questions about cis-regulatory function.
Affiliation(s)
- Michael A White
  - Center for Genome Sciences and Systems Biology, Department of Genetics, Washington University in St. Louis School of Medicine, St. Louis, MO 63108, USA
|