1. A high-throughput real-time PCR tissue-of-origin test to distinguish blood from lymphoblastoid cell line DNA for (epi)genomic studies. Sci Rep 2022; 12:4684. [PMID: 35304543 PMCID: PMC8933453 DOI: 10.1038/s41598-022-08663-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/07/2021] [Accepted: 03/09/2022] [Indexed: 12/13/2022]
Abstract
Lymphoblastoid cell lines (LCLs) are derived from blood infected in vitro with Epstein–Barr virus (EBV) and have been used in several genetic, transcriptomic and epigenomic studies. Although few changes have been shown between LCL and blood genotypes (SNPs), validating their use in genetics, more have been reported for other genomic features and in their transcriptome and epigenome. This could make LCLs less appropriate for such studies, notably when blood DNA is still available. Here we developed a simple, high-throughput and cost-effective real-time PCR approach that distinguishes blood from LCL DNA samples based on relative EBV load and the presence of rearranged T-cell receptors γ and β. Our approach achieved 98.5% sensitivity and 100% specificity on DNA of known origin (458 blood and 316 LCL DNA samples). It was further applied to 1957 DNA samples from the CEPH Aging cohort, which includes DNA of uncertain origin, identifying 784 blood and 1016 LCL DNA samples. A subset of these DNA samples was further analyzed with an epigenetic clock, indicating that DNA extracted from blood should be preferred to LCL DNA for DNA methylation-based age prediction. Our approach could thereby be a powerful tool to ascertain the origin of DNA in old collections prior to (epi)genomic studies.
2. Keys KL, Mak ACY, White MJ, Eckalbar WL, Dahl AW, Mefford J, Mikhaylova AV, Contreras MG, Elhawary JR, Eng C, Hu D, Huntsman S, Oh SS, Salazar S, Lenoir MA, Ye JC, Thornton TA, Zaitlen N, Burchard EG, Gignoux CR. On the cross-population generalizability of gene expression prediction models. PLoS Genet 2020; 16:e1008927. [PMID: 32797036 PMCID: PMC7449671 DOI: 10.1371/journal.pgen.1008927] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 11/19/2019] [Revised: 08/26/2020] [Accepted: 06/10/2020] [Indexed: 11/21/2022]
Abstract
The genetic control of gene expression is a core component of human physiology. For the past several years, transcriptome-wide association studies have leveraged large datasets of linked genotype and RNA sequencing information to create a powerful gene-based test of association that has been used in dozens of studies. While numerous discoveries have been made, the populations in the training data are overwhelmingly of European descent, and little is known about the generalizability of these models to other populations. Here, we test for cross-population generalizability of gene expression prediction models using a dataset of African American individuals with RNA-Seq data in whole blood. We find that the default models trained in large datasets such as GTEx and DGN fare poorly in African Americans, with a notable reduction in prediction accuracy when compared to European Americans. We replicate these limitations in cross-population generalizability using the five populations in the GEUVADIS dataset. Via realistic simulations of both populations and gene expression, we show that accurate cross-population generalizability of transcriptome prediction only arises when eQTL architecture is substantially shared across populations. In contrast, models with non-identical eQTLs showed patterns similar to real-world data. Therefore, generating RNA-Seq data in diverse populations is a critical step towards multi-ethnic utility of gene expression prediction.
Affiliation(s)
- Kevin L. Keys
  - Department of Medicine, University of California, San Francisco, California, United States of America
  - Berkeley Institute for Data Science, University of California, Berkeley, California, United States of America
- Angel C. Y. Mak
  - Department of Medicine, University of California, San Francisco, California, United States of America
- Marquitta J. White
  - Department of Medicine, University of California, San Francisco, California, United States of America
- Walter L. Eckalbar
  - Department of Medicine, University of California, San Francisco, California, United States of America
- Andrew W. Dahl
  - Department of Medicine, University of California, San Francisco, California, United States of America
- Joel Mefford
  - Department of Medicine, University of California, San Francisco, California, United States of America
- Anna V. Mikhaylova
  - Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
- María G. Contreras
  - Department of Medicine, University of California, San Francisco, California, United States of America
  - San Francisco State University, San Francisco, California, United States of America
- Jennifer R. Elhawary
  - Department of Medicine, University of California, San Francisco, California, United States of America
- Celeste Eng
  - Department of Medicine, University of California, San Francisco, California, United States of America
- Donglei Hu
  - Department of Medicine, University of California, San Francisco, California, United States of America
- Scott Huntsman
  - Department of Medicine, University of California, San Francisco, California, United States of America
- Sam S. Oh
  - Department of Medicine, University of California, San Francisco, California, United States of America
- Sandra Salazar
  - Department of Medicine, University of California, San Francisco, California, United States of America
- Jimmie C. Ye
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, California, United States of America
  - Department of Bioengineering and Therapeutic Biosciences, University of California, San Francisco, California, United States of America
- Timothy A. Thornton
  - Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
- Noah Zaitlen
  - Department of Neurology, University of California, Los Angeles, California, United States of America
- Esteban G. Burchard
  - Department of Medicine, University of California, San Francisco, California, United States of America
  - Department of Bioengineering and Therapeutic Biosciences, University of California, San Francisco, California, United States of America
- Christopher R. Gignoux
  - Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
  - Department of Biostatistics and Informatics, School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
3. Jiang X, Assis R. Population-Specific Genetic and Expression Differentiation in Europeans. Genome Biol Evol 2020; 12:358-369. [PMID: 32365201 PMCID: PMC7197493 DOI: 10.1093/gbe/evaa021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Accepted: 01/29/2020] [Indexed: 12/14/2022]
Abstract
Much of the enormous phenotypic variation observed across human populations is thought to have arisen from events experienced as our ancestors peopled different regions of the world. However, little is known about the genes involved in these population-specific adaptations. Here, we explore this problem by simultaneously examining population-specific genetic and expression differentiation in four human populations. In particular, we derive a branch-based estimator of population-specific differentiation in four populations, and apply this statistic to single-nucleotide polymorphism and RNA-seq data from Italian, British, Finnish, and Yoruban populations. As expected, genome-wide estimates of genetic and expression differentiation each independently recapitulate the known relationships among these four human populations, highlighting the utility of our statistic for identifying putative targets of population-specific adaptations. Moreover, genes with large copy number variations display elevated levels of population-specific genetic and expression differentiation, consistent with the hypothesis that gene duplication and deletion events are key reservoirs of adaptive variation. Further, many top-scoring genes are well-known targets of adaptation in Europeans, including those involved in lactase persistence and vitamin D absorption, and a handful of novel candidates represent promising avenues for future research. Together, these analyses reveal that our statistic can aid in uncovering genes involved in population-specific genetic and expression differentiation, and that such genes often play important roles in a diversity of adaptive and disease-related phenotypes in humans.
Affiliation(s)
- Xueyuan Jiang
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802
- Raquel Assis
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802
  - Department of Biology, Pennsylvania State University, University Park, PA 16802
  - Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431
  - Institute for Human Health and Disease Intervention, Florida Atlantic University, Boca Raton, FL 33431
4. Peterson LA, Ignatovich IV, Grill AE, Beauchamp A, Ho YY, DiLernia AS, Zhang L. Individual Differences in the Response of Human β-Lymphoblastoid Cells to the Cytotoxic, Mutagenic, and DNA-Damaging Effects of a DNA Methylating Agent, N-Methylnitrosourethane. Chem Res Toxicol 2019; 32:2214-2226. [PMID: 31589032 DOI: 10.1021/acs.chemrestox.9b00266] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/07/2023]
Abstract
Metabolic activation of many carcinogens leads to formation of reactive intermediates that form DNA adducts. These adducts are cytotoxic when they interfere with cell division. They can also cause mutations by miscoding during DNA replication. Therefore, an individual's risk of developing cancer will depend on the balance between these processes as well as their ability to repair the DNA damage. Our hypothesis is that variations of genes participating in DNA damage repair and response pathways play significant roles in an individual's risk of developing tobacco-related cancers. To test this hypothesis, 61 human B-lymphocyte cell lines from the International HapMap project were phenotyped for their sensitivity to the cytotoxic and genotoxic properties of a model methylating agent, N-nitroso-N-methylurethane (NMUr). Cell viability was measured using a luciferase-based assay. Repair of the mutagenic and toxic DNA adduct, O6-methylguanine (O6-mG), was monitored by LC-MS/MS analysis. Genotoxic potential of NMUr was assessed employing a flow-cytometry based in vitro mutagenesis assay in the phosphatidylinositol-glycan biosynthesis class-A (PIG-A) gene. A wide distribution of responses to NMUr was observed with no correlation to gender or ethnicity. While the rate of O6-mG repair partially influenced the toxicity of NMUr, it did not appear to be the major factor affecting individual susceptibility to the mutagenic effects of NMUr. Genome-wide analysis identified several novel single nucleotide polymorphisms to be explored in future functional validation studies for a number of the toxicological end points.
5. Mikhaylova AV, Thornton TA. Accuracy of Gene Expression Prediction From Genotype Data With PrediXcan Varies Across and Within Continental Populations. Front Genet 2019; 10:261. [PMID: 31001318 PMCID: PMC6456650 DOI: 10.3389/fgene.2019.00261] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 11/30/2018] [Accepted: 03/08/2019] [Indexed: 01/08/2023]
Abstract
Using genetic data to predict gene expression has garnered significant attention in recent years. PrediXcan has become one of the most widely used gene-based methods for testing associations between predicted gene expression values and a phenotype, which has facilitated novel insights into the relationship between complex traits and the component of gene expression that can be attributed to genetic variation. The gene expression prediction models for PrediXcan were developed using supervised machine learning methods and training data from the Depression Genes and Networks (DGN) study and the Genotype-Tissue Expression (GTEx) project, where the majority of subjects are of European descent. Many genetic studies, however, include samples from multi-ethnic populations, and in this paper we evaluate the accuracy of PrediXcan for predicting gene expression in diverse populations. Using transcriptomic data from the GEUVADIS (Genetic European Variation in Disease) RNA sequencing project and whole genome sequencing data from the 1000 Genomes project, we evaluate and compare the predictive performance of PrediXcan in an African population (Yoruban) and four European ancestry populations for thousands of genes. We evaluate a range of models from the PrediXcan weight databases and use Pearson's correlation coefficient to assess gene expression prediction accuracy with PrediXcan. From our evaluation, we find that the predictive performance of PrediXcan varies substantially among populations from different continents (F-test p-value < 2.2 × 10⁻¹⁶), where prediction accuracy is lower in the Yoruban population from West Africa compared to the European-ancestry populations. Moreover, not only do we find differences in predictive performance between populations from different continents, we also find highly significant differences in prediction accuracy among the four European ancestry populations considered (F-test p-value < 2.2 × 10⁻¹⁶). Finally, while there is variability in prediction accuracy across different PrediXcan weight databases, we also find consistency in the qualitative performance of PrediXcan for the five populations considered, with the African ancestry population having the lowest accuracy across databases.
Affiliation(s)
- Anna V. Mikhaylova
  - Department of Biostatistics, University of Washington, Seattle, WA, United States
- Timothy A. Thornton
  - Department of Biostatistics, University of Washington, Seattle, WA, United States
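
The abstract of this entry states that prediction accuracy was assessed with Pearson's correlation coefficient between predicted and observed gene expression. As a minimal sketch of that measure only (this is not the authors' code and does not use PrediXcan itself; the function names, gene labels, and toy values are hypothetical), a per-gene correlation could be computed like this:

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def per_gene_accuracy(predicted, observed):
    """Map each gene present in both inputs to its Pearson r across individuals."""
    shared = predicted.keys() & observed.keys()
    return {gene: pearson_r(predicted[gene], observed[gene]) for gene in shared}


# Hypothetical toy data: expression values for four individuals per gene.
predicted = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}
observed = {"GENE_A": [2.0, 4.0, 6.0, 8.0], "GENE_B": [4.0, 3.0, 2.0, 1.0]}
accuracy = per_gene_accuracy(predicted, observed)
```

In this toy input, `GENE_A` is predicted in perfect proportion to the observed values and `GENE_B` is predicted in the opposite direction, so the two genes sit at the two extremes of the accuracy scale. Comparing the distribution of such per-gene values between populations is the kind of comparison the abstract describes.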
6. Brown BC, Bray NL, Pachter L. Expression reflects population structure. PLoS Genet 2018; 14:e1007841. [PMID: 30566439 PMCID: PMC6317812 DOI: 10.1371/journal.pgen.1007841] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 07/30/2018] [Revised: 01/03/2019] [Accepted: 11/20/2018] [Indexed: 11/19/2022]
Abstract
Population structure in genotype data has been extensively studied, and is revealed by looking at the principal components of the genotype matrix. However, no similar analysis of population structure in gene expression data has been conducted, in part because a naïve principal components analysis of the gene expression matrix does not cluster by population. We identify a linear projection that reveals population structure in gene expression data. Our approach relies on the coupling of the principal components of genotype to the principal components of gene expression via canonical correlation analysis. Our method is able to determine the significance of the variance in the canonical correlation projection explained by each gene. We identify 3,571 significant genes, only 837 of which had been previously reported to have an associated eQTL in the GEUVADIS results. We show that our projections are not primarily driven by differences in allele frequency at known cis-eQTLs and that similar projections can be recovered using only several hundred randomly selected genes and SNPs. Finally, we present preliminary work on the consequences for eQTL analysis. We observe that using our projection co-ordinates as covariates results in the discovery of slightly fewer genes with eQTLs, but that these genes replicate in GTEx matched tissue at a slightly higher rate.
Affiliation(s)
- Brielin C. Brown
  - Department of Computer Science, University of California Berkeley, Berkeley, California, United States of America
- Nicolas L. Bray
  - Institute for Innovative Genomics, University of California Berkeley, Berkeley, California, United States of America
  - Department of Molecular & Cell Biology, University of California Berkeley, Berkeley, California, United States of America
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
7. Kustatscher G, Grabowski P, Rappsilber J. Pervasive coexpression of spatially proximal genes is buffered at the protein level. Mol Syst Biol 2017; 13:937. [PMID: 28835372 PMCID: PMC5572396 DOI: 10.15252/msb.20177548] [Citation(s) in RCA: 75] [Impact Index Per Article: 9.4] [Indexed: 01/06/2023]
Abstract
Genes are not randomly distributed in the genome. In humans, 10% of protein-coding genes are transcribed from bidirectional promoters and many more are organised in larger clusters. Intriguingly, neighbouring genes are frequently coexpressed but rarely functionally related. Here we show that coexpression of bidirectional gene pairs, and closeby genes in general, is buffered at the protein level. Taking into account the 3D architecture of the genome, we find that co-regulation of spatially close, functionally unrelated genes is pervasive at the transcriptome level, but does not extend to the proteome. We present evidence that non-functional mRNA coexpression in human cells arises from stochastic chromatin fluctuations and direct regulatory interference between spatially close genes. Protein-level buffering likely reflects a lack of coordination of post-transcriptional regulation of functionally unrelated genes. Grouping human genes together along the genome sequence, or through long-range chromosome folding, is associated with reduced expression noise. Our results support the hypothesis that the selection for noise reduction is a major driver of the evolution of genome organisation.
Affiliation(s)
- Georg Kustatscher
  - Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh, UK
- Piotr Grabowski
  - Chair of Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, Berlin, Germany
- Juri Rappsilber
  - Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh, UK
  - Chair of Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, Berlin, Germany
8. Transcriptional signature of lymphoblastoid cell lines of BRCA1, BRCA2 and non-BRCA1/2 high risk breast cancer families. Oncotarget 2017; 8:78691-78712. [PMID: 29108258 PMCID: PMC5667991 DOI: 10.18632/oncotarget.20219] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 09/28/2016] [Accepted: 07/17/2017] [Indexed: 12/20/2022]
Abstract
Approximately 25% of hereditary breast cancer cases are associated with a strong familial history which can be explained by mutations in BRCA1 or BRCA2 and other lower penetrance genes. The remaining high-risk families could be classified as BRCAX (non-BRCA1/2) families. Alternative splicing is a well-known mechanism regulating the expression of multiple transcripts, which could be involved in cancer development. Thus, transcriptome analysis by RNA-seq was undertaken to potentially reveal transcripts implicated in breast cancer susceptibility and development. RNA was extracted from immortalized lymphoblastoid cell lines of 117 women (affected and unaffected) from BRCA1, BRCA2 and BRCAX families. ANOVA revealed a total of 95 transcripts corresponding to 85 different genes differentially expressed (Bonferroni-corrected p-value < 0.01) between those groups. Hierarchical clustering allowed distinctive subgrouping of BRCA1/2 subgroups from BRCAX individuals. We found 67 transcripts that could discriminate BRCAX from BRCA1/BRCA2 individuals, while 28 transcripts discriminated affected from unaffected BRCAX individuals. To our knowledge, this represents the first study identifying transcripts differentially expressed in lymphoblastoid cell lines from major classes of mutation-related breast cancer subgroups, namely BRCA1, BRCA2 and BRCAX. Moreover, some transcripts could discriminate affected from unaffected BRCAX individuals, which could represent potential therapeutic targets for breast cancer treatment.
9. Mandage R, Telford M, Rodríguez JA, Farré X, Layouni H, Marigorta UM, Cundiff C, Heredia-Genestar JM, Navarro A, Santpere G. Genetic factors affecting EBV copy number in lymphoblastoid cell lines derived from the 1000 Genome Project samples. PLoS One 2017; 12:e0179446. [PMID: 28654678 PMCID: PMC5487016 DOI: 10.1371/journal.pone.0179446] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 12/16/2016] [Accepted: 05/29/2017] [Indexed: 12/22/2022]
Abstract
Epstein-Barr virus (EBV), human herpes virus 4, has been classically associated with infectious mononucleosis, multiple sclerosis and several types of cancers. Many of these diseases show marked geographical differences in prevalence, which points to underlying genetic and/or environmental factors. Those factors may include a different susceptibility to EBV infection and viral copy number among human populations. Since EBV is commonly used to transform B-cells into lymphoblastoid cell lines (LCLs), we hypothesize that differences in EBV copy number among individual LCLs may reflect differential susceptibility to EBV infection. To test this hypothesis, we retrieved whole-genome sequenced EBV-mapping reads from 1,753 LCL samples derived from 19 populations worldwide that were sequenced within the context of the 1000 Genomes Project. An in silico methodology was developed to estimate EBV copy number in LCLs, and these estimates were validated by real-time PCR. After experimentally confirming that EBV relative copy number remains stable over cell passages, we performed a genome-wide association study (GWAS) to detect host genetic variants that may be associated with EBV copy number. Our GWAS has yielded several genomic regions suggestively associated with the number of EBV genomes per cell in LCLs, unraveling promising candidate genes such as CAND1, a known inhibitor of EBV replication. While this GWAS does not unequivocally establish the degree to which the genetic makeup of individuals determines viral levels within their derived LCLs, for which a larger sample size will be needed, it potentially highlighted human genes affecting EBV-related processes, which constitute interesting candidates to follow up in the context of EBV-related pathologies.
Affiliation(s)
- Rajendra Mandage
  - Institute of Evolutionary Biology (UPF-CSIC), Departament de Ciències Experimentals i la Salut, Universitat Pompeu Fabra, PRBB, Barcelona, Catalonia, Spain
- Marco Telford
  - Institute of Evolutionary Biology (UPF-CSIC), Departament de Ciències Experimentals i la Salut, Universitat Pompeu Fabra, PRBB, Barcelona, Catalonia, Spain
- Juan Antonio Rodríguez
  - Institute of Evolutionary Biology (UPF-CSIC), Departament de Ciències Experimentals i la Salut, Universitat Pompeu Fabra, PRBB, Barcelona, Catalonia, Spain
- Xavier Farré
  - Institute of Evolutionary Biology (UPF-CSIC), Departament de Ciències Experimentals i la Salut, Universitat Pompeu Fabra, PRBB, Barcelona, Catalonia, Spain
- Hafid Layouni
  - Institute of Evolutionary Biology (UPF-CSIC), Departament de Ciències Experimentals i la Salut, Universitat Pompeu Fabra, PRBB, Barcelona, Catalonia, Spain
  - Bioinformatics Studies, ESCI-UPF, Pg. Pujades 1, Barcelona, Spain
- Urko M. Marigorta
  - Institute of Evolutionary Biology (UPF-CSIC), Departament de Ciències Experimentals i la Salut, Universitat Pompeu Fabra, PRBB, Barcelona, Catalonia, Spain
  - Georgia Institute of Technology, Department of Biology, Atlanta, Georgia, United States of America
- Caitlin Cundiff
  - Georgia Institute of Technology, Department of Biology, Atlanta, Georgia, United States of America
- Jose Maria Heredia-Genestar
  - Institute of Evolutionary Biology (UPF-CSIC), Departament de Ciències Experimentals i la Salut, Universitat Pompeu Fabra, PRBB, Barcelona, Catalonia, Spain
- Arcadi Navarro
  - Institute of Evolutionary Biology (UPF-CSIC), Departament de Ciències Experimentals i la Salut, Universitat Pompeu Fabra, PRBB, Barcelona, Catalonia, Spain
  - National Institute for Bioinformatics (INB), PRBB, Barcelona, Catalonia, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), PRBB, Barcelona, Catalonia, Spain
  - Center for Genomic Regulation (CRG), PRBB, Barcelona, Catalonia, Spain
- Gabriel Santpere
  - Institute of Evolutionary Biology (UPF-CSIC), Departament de Ciències Experimentals i la Salut, Universitat Pompeu Fabra, PRBB, Barcelona, Catalonia, Spain
  - Department of Neuroscience, Yale School of Medicine, New Haven, CT, United States of America
10. Galarza-Muñoz G, Briggs FBS, Evsyukova I, Schott-Lerner G, Kennedy EM, Nyanhete T, Wang L, Bergamaschi L, Widen SG, Tomaras GD, Ko DC, Bradrick SS, Barcellos LF, Gregory SG, Garcia-Blanco MA. Human Epistatic Interaction Controls IL7R Splicing and Increases Multiple Sclerosis Risk. Cell 2017; 169:72-84.e13. [PMID: 28340352 DOI: 10.1016/j.cell.2017.03.007] [Citation(s) in RCA: 79] [Impact Index Per Article: 9.9] [Received: 03/18/2016] [Revised: 09/18/2016] [Accepted: 03/02/2017] [Indexed: 12/18/2022]
Abstract
Multiple sclerosis (MS) is an autoimmune disorder where T cells attack neurons in the central nervous system (CNS) leading to demyelination and neurological deficits. A driver of increased MS risk is the soluble form of the interleukin-7 receptor alpha chain gene (sIL7R) produced by alternative splicing of IL7R exon 6. Here, we identified the RNA helicase DDX39B as a potent activator of this exon and consequently a repressor of sIL7R, and we found strong genetic association of DDX39B with MS risk. Indeed, we showed that a genetic variant in the 5' UTR of DDX39B reduces translation of DDX39B mRNAs and increases MS risk. Importantly, this DDX39B variant showed strong genetic and functional epistasis with allelic variants in IL7R exon 6. This study establishes the occurrence of biological epistasis in humans and provides mechanistic insight into the regulation of IL7R exon 6 splicing and its impact on MS risk.
Affiliation(s)
- Gaddiel Galarza-Muñoz
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Center for RNA Biology, Duke University, Durham, NC 27710, USA; Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555, USA
- Farren B S Briggs
  - Department of Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Irina Evsyukova
  - Center for RNA Biology, Duke University, Durham, NC 27710, USA
- Geraldine Schott-Lerner
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555, USA
- Edward M Kennedy
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
- Tinashe Nyanhete
  - Department of Immunology, Duke University, Durham, NC 27710, USA; Department of Surgery, Duke University, Durham, NC 27710, USA
- Liuyang Wang
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
- Laura Bergamaschi
  - Duke Molecular Physiology Institute, Duke University, Durham, NC 27701, USA
- Steven G Widen
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555, USA
- Georgia D Tomaras
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Department of Immunology, Duke University, Durham, NC 27710, USA; Department of Surgery, Duke University, Durham, NC 27710, USA
- Dennis C Ko
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Department of Medicine, Duke University Medical Center, Durham, NC 27710, USA
- Shelton S Bradrick
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Center for RNA Biology, Duke University, Durham, NC 27710, USA; Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555, USA
- Lisa F Barcellos
  - Division of Epidemiology, School of Public Health, University of California Berkeley, Berkeley, CA 94720, USA
- Simon G Gregory
  - Duke Molecular Physiology Institute, Duke University, Durham, NC 27701, USA; Department of Neurology, Duke University Medical Center, Durham, NC 27710, USA
- Mariano A Garcia-Blanco
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Center for RNA Biology, Duke University, Durham, NC 27710, USA; Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555, USA
11. Kelly DE, Hansen MEB, Tishkoff SA. Global variation in gene expression and the value of diverse sampling. Current Opinion in Systems Biology 2017; 1:102-108. [PMID: 28596996 PMCID: PMC5458633 DOI: 10.1016/j.coisb.2016.12.018] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 11/25/2022]
Abstract
The genomics era has accelerated our understanding of how genetic and epigenetic factors influence both normal variable traits and disease risk in humans. However, the majority of "omics" studies have focused on individuals living in urban centers, primarily from Europe and Asia, neglecting much of the genetic and environmental variation that exists across worldwide populations. Comparative studies of gene regulation in ethnically diverse populations are informing our understanding of how evolutionary forces have shaped the genetic and molecular mechanisms underlying complex traits, and studying gene expression in different environmental contexts is enabling the dissection of disease-related pathways such as immune response. Such approaches are vital to the equitable application of genomics and medicine.
Affiliation(s)
- Derek E. Kelly
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Genomics and Computational Biology Graduate Group, University of Pennsylvania, Philadelphia, PA 19104, USA
- Sarah A. Tishkoff
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Genomics and Computational Biology Graduate Group, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA