1
Balliu B, Douglas C, Seok D, Shenhav L, Wu Y, Chatzopoulou D, Kaiser W, Chen V, Kim J, Deverasetty S, Arnaudova I, Gibbons R, Congdon E, Craske MG, Freimer N, Halperin E, Sankararaman S, Flint J. Personalized mood prediction from patterns of behavior collected with smartphones. NPJ Digit Med 2024; 7:49. [PMID: 38418551 PMCID: PMC10902386 DOI: 10.1038/s41746-024-01035-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Accepted: 02/09/2024] [Indexed: 03/01/2024] Open
Abstract
Over the last ten years, there has been considerable progress in using digital behavioral phenotypes, captured passively and continuously from smartphones and wearable devices, to infer depressive mood. However, most digital phenotype studies suffer from poor replicability, often fail to detect clinically relevant events, and use measures of depression that are not validated or suitable for collecting large and longitudinal data. Here, we report high-quality longitudinal validated assessments of depressive mood from computerized adaptive testing paired with continuous digital assessments of behavior from smartphone sensors for up to 40 weeks on 183 individuals experiencing mild to severe symptoms of depression. We apply a combination of cubic spline interpolation and idiographic models to generate individualized predictions of future mood from the digital behavioral phenotypes, achieving high prediction accuracy of depression severity up to three weeks in advance (R² ≥ 80%) and a 65.7% reduction in the prediction error over a baseline model which predicts future mood based on past depression severity alone. Finally, our study verified the feasibility of obtaining high-quality longitudinal assessments of mood from a clinical population and predicting symptom severity weeks in advance using passively collected digital behavioral data. Our results indicate the possibility of expanding the repertoire of patient-specific behavioral measures to enable future psychiatric research.
Affiliation(s)
- Brunilda Balliu
  - Departments of Computational Medicine, University of California Los Angeles, Los Angeles, USA.
  - Departments of Pathology and Laboratory Medicine, University of California Los Angeles, Los Angeles, USA.
  - Department of Biostatistics, University of California Los Angeles, Los Angeles, USA.
- Chris Douglas
  - Department of Psychiatry and Biobehavioral Science, University of California Los Angeles, Los Angeles, USA
- Darsol Seok
  - Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, USA
- Liat Shenhav
  - Department of Computer Science, University of California Los Angeles, Los Angeles, USA
- Yue Wu
  - Department of Computer Science, University of California Los Angeles, Los Angeles, USA
- Doxa Chatzopoulou
  - Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, USA
- William Kaiser
  - Department of Electrical Engineering, University of California Los Angeles, Los Angeles, USA
- Victor Chen
  - Department of Electrical Engineering, University of California Los Angeles, Los Angeles, USA
- Jennifer Kim
  - Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, USA
- Sandeep Deverasetty
  - Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, USA
- Inna Arnaudova
  - Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, USA
- Robert Gibbons
  - Departments of Medicine, Public Health Sciences and Comparative Human Development, University of Chicago, Chicago, USA
- Eliza Congdon
  - Department of Psychiatry and Biobehavioral Science, University of California Los Angeles, Los Angeles, USA
  - Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, USA
- Michelle G Craske
  - Department of Psychiatry and Biobehavioral Science, University of California Los Angeles, Los Angeles, USA
  - Department of Psychology, University of California Los Angeles, Los Angeles, USA
- Nelson Freimer
  - Department of Psychiatry and Biobehavioral Science, University of California Los Angeles, Los Angeles, USA
  - Department of Human Genetics, University of California Los Angeles, Los Angeles, USA
- Eran Halperin
  - Department of Computer Science, University of California Los Angeles, Los Angeles, USA
- Sriram Sankararaman
  - Departments of Computational Medicine, University of California Los Angeles, Los Angeles, USA
  - Department of Computer Science, University of California Los Angeles, Los Angeles, USA
  - Department of Human Genetics, University of California Los Angeles, Los Angeles, USA
- Jonathan Flint
  - Department of Psychiatry and Biobehavioral Science, University of California Los Angeles, Los Angeles, USA.
  - Department of Human Genetics, University of California Los Angeles, Los Angeles, USA.
2
Venema WJ, Hiddingh S, van Loosdregt J, Bowes J, Balliu B, de Boer JH, Ossewaarde-van Norel J, Thompson SD, Langefeld CD, de Ligt A, van der Veken LT, Krijger PHL, de Laat W, Kuiper JJW. A cis-regulatory element regulates ERAP2 expression through autoimmune disease risk SNPs. Cell Genom 2024; 4:100460. [PMID: 38190099 PMCID: PMC10794781 DOI: 10.1016/j.xgen.2023.100460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Revised: 10/04/2023] [Accepted: 11/09/2023] [Indexed: 01/09/2024]
Abstract
Single-nucleotide polymorphisms (SNPs) near the ERAP2 gene are associated with various autoimmune conditions, as well as protection against lethal infections. Due to high linkage disequilibrium, numerous trait-associated SNPs are correlated with ERAP2 expression; however, their functional mechanisms remain unidentified. We show by reciprocal allelic replacement that ERAP2 expression is directly controlled by the splice region variant rs2248374. However, disease-associated variants in the downstream LNPEP gene promoter are independently associated with ERAP2 expression. Allele-specific conformation capture assays revealed long-range chromatin contacts between the gene promoters of LNPEP and ERAP2 and showed that interactions were stronger in patients carrying the alleles that increase susceptibility to autoimmune diseases. Replacing the SNPs in the LNPEP promoter by reference sequences lowered ERAP2 expression. These findings show that multiple SNPs act in concert to regulate ERAP2 expression and that disease-associated variants can convert a gene promoter region into a potent enhancer of a distal gene.
Affiliation(s)
- Wouter J Venema
  - Department of Ophthalmology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands; Center for Translational Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Sanne Hiddingh
  - Department of Ophthalmology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands; Center for Translational Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Jorg van Loosdregt
  - Center for Translational Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- John Bowes
  - Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK
- Brunilda Balliu
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Joke H de Boer
  - Department of Ophthalmology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Susan D Thompson
  - Department of Pediatrics, University of Cincinnati College of Medicine, Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Carl D Langefeld
  - Department of Biostatistics and Data Science, and Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Aafke de Ligt
  - Department of Ophthalmology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands; Center for Translational Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Lars T van der Veken
  - Department of Genetics, Division Laboratories, Pharmacy and Biomedical Genetics, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Peter H L Krijger
  - Oncode Institute, Hubrecht Institute-KNAW and University Medical Center Utrecht, 3584 CT Utrecht, the Netherlands
- Wouter de Laat
  - Oncode Institute, Hubrecht Institute-KNAW and University Medical Center Utrecht, 3584 CT Utrecht, the Netherlands
- Jonas J W Kuiper
  - Department of Ophthalmology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands; Center for Translational Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands.
3
DeGorter MK, Goddard PC, Karakoc E, Kundu S, Yan SM, Nachun D, Abell N, Aguirre M, Carstensen T, Chen Z, Durrant M, Dwaracherla VR, Feng K, Gloudemans MJ, Hunter N, Moorthy MPS, Pomilla C, Rodrigues KB, Smith CJ, Smith KS, Ungar RA, Balliu B, Fellay J, Flicek P, McLaren PJ, Henn B, McCoy RC, Sugden L, Kundaje A, Sandhu MS, Gurdasani D, Montgomery SB. Transcriptomics and chromatin accessibility in multiple African population samples. bioRxiv 2023:2023.11.04.564839. [PMID: 37986808 PMCID: PMC10659267 DOI: 10.1101/2023.11.04.564839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
Mapping the functional human genome and impact of genetic variants is often limited to European-descendent population samples. To aid in overcoming this limitation, we measured gene expression using RNA sequencing in lymphoblastoid cell lines (LCLs) from 599 individuals from six African populations to identify novel transcripts including those not represented in the hg38 reference genome. We used whole genomes from the 1000 Genomes Project and 164 Maasai individuals to identify 8,881 expression and 6,949 splicing quantitative trait loci (eQTLs/sQTLs), and 2,611 structural variants associated with gene expression (SV-eQTLs). We further profiled chromatin accessibility using ATAC-Seq in a subset of 100 representative individuals, to identify chromatin accessibility quantitative trait loci (caQTLs) and allele-specific chromatin accessibility, and provide predictions for the functional effect of 78.9 million variants on chromatin accessibility. Using this map of eQTLs and caQTLs we fine-mapped GWAS signals for a range of complex diseases. Combined, this work expands global functional genomic data to identify novel transcripts, functional elements and variants, understand population genetic history of molecular quantitative trait loci, and further resolve the genetic basis of multiple human traits and disease.
Affiliation(s)
- Page C Goddard
  - Department of Genetics, Stanford University, Stanford, CA
- Emre Karakoc
  - Human Genetics, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- Soumya Kundu
  - Department of Computer Science, Stanford University, Stanford CA
- Daniel Nachun
  - Department of Pathology, Stanford University, Stanford, CA
- Nathan Abell
  - Department of Genetics, Stanford University, Stanford, CA
- Matthew Aguirre
  - Department of Biomedical Data Science, Stanford University, Stanford, CA
- Tommy Carstensen
  - Human Genetics, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- Ziwei Chen
  - Department of Computer Science, Stanford University, Stanford CA
- Karen Feng
  - Department of Biomedical Data Science, Stanford University, Stanford, CA
- Naiomi Hunter
  - Department of Genetics, Stanford University, Stanford, CA
- Cristina Pomilla
  - Human Genetics, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- Kevin S Smith
  - Department of Pathology, Stanford University, Stanford, CA
- Rachel A Ungar
  - Department of Genetics, Stanford University, Stanford, CA
- Brunilda Balliu
  - Department of Pathology and Laboratory Medicine, University of California Los Angeles, Los Angeles, CA and Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA
- Jacques Fellay
  - School of Life Sciences, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland and Precision Medicine Unit, Biomedical Data Science Center, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
- Paul Flicek
  - Department of Genetics, University of Cambridge, Cambridge, UK
- Paul J McLaren
  - Sexually Transmitted and Blood-Borne Infections Division at JC Wilt Infectious Diseases Research Centre, National Microbiology Laboratory Branch, Public Health Agency of Canada, Winnipeg, Canada and Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, Canada
- Brenna Henn
  - Department of Anthropology, University of California Davis, Davis CA and Genome Center, University of California Davis, Davis CA
- Rajiv C McCoy
  - Department of Biology, Johns Hopkins University, Baltimore
- Lauren Sugden
  - Department of Mathematics and Computer Science, Duquesne University, Pittsburgh, PA
- Anshul Kundaje
  - Department of Genetics, Stanford University, Stanford, CA
  - Department of Computer Science, Stanford University, Stanford CA
- Deepti Gurdasani
  - William Harvey Research Institute, Queen Mary University of London, London, UK; Kirby Institute, University of New South Wales, Australia; School of Medicine, University of Western Australia, Australia
4
Garske KM, Kar A, Comenho C, Balliu B, Pan DZ, Bhagat YV, Rosenberg G, Koka A, Das SS, Miao Z, Sinsheimer JS, Kaprio J, Pietiläinen KH, Pajukanta P. Increased body mass index is linked to systemic inflammation through altered chromatin co-accessibility in human preadipocytes. Nat Commun 2023; 14:4214. [PMID: 37452040 PMCID: PMC10349101 DOI: 10.1038/s41467-023-39919-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2021] [Accepted: 07/04/2023] [Indexed: 07/18/2023] Open
Abstract
Obesity-induced adipose tissue dysfunction can cause low-grade inflammation and downstream obesity comorbidities. Although preadipocytes may contribute to this pro-inflammatory environment, the underlying mechanisms are unclear. We used human primary preadipocytes from body mass index (BMI)-discordant monozygotic (MZ) twin pairs to generate epigenetic (ATAC-sequence) and transcriptomic (RNA-sequence) data for testing whether increased BMI alters the subnuclear compartmentalization of open chromatin in the twins' preadipocytes, causing downstream inflammation. Here we show that the co-accessibility of open chromatin, i.e. compartmentalization of chromatin activity, is altered in the higher vs lower BMI MZ siblings for a large subset (~88.5 Mb) of the active subnuclear compartments. Using the UK Biobank we show that variants within these regions contribute to systemic inflammation through interactions with BMI on C-reactive protein. In summary, open chromatin co-accessibility in human preadipocytes is disrupted among the higher BMI siblings, suggesting a mechanism by which obesity may lead to inflammation via gene-environment interactions.
Affiliation(s)
- Kristina M Garske
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA
- Asha Kar
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA
- Caroline Comenho
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA
- Brunilda Balliu
  - Department of Computational Medicine, UCLA, Los Angeles, CA, 90095, USA
- David Z Pan
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, 90095, USA
- Yash V Bhagat
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA
- Gregory Rosenberg
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA
- Amogha Koka
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA
- Sankha Subhra Das
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA
- Zong Miao
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, 90095, USA
- Janet S Sinsheimer
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA
  - Department of Computational Medicine, UCLA, Los Angeles, CA, 90095, USA
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, 90095, USA
- Jaakko Kaprio
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, 00014, Finland
- Kirsi H Pietiläinen
  - Obesity Research Unit, Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, University of Helsinki, Helsinki, 00014, Finland
  - Obesity Center, Abdominal Center, Helsinki University Hospital and University of Helsinki, Helsinki, 00014, Finland
- Päivi Pajukanta
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA.
  - Department of Computational Medicine, UCLA, Los Angeles, CA, 90095, USA.
  - Institute for Precision Health, David Geffen School of Medicine at UCLA, Los Angeles, CA, 90095, USA.
5
Caggiano C, Boudaie A, Shemirani R, Mefford J, Petter E, Chiu A, Ercelen D, He R, Tward D, Paul KC, Chang TS, Pasaniuc B, Kenny EE, Shortt JA, Gignoux CR, Balliu B, Arboleda VA, Belbin G, Zaitlen N. Disease risk and healthcare utilization among ancestrally diverse groups in the Los Angeles region. Nat Med 2023; 29:1845-1856. [PMID: 37464048 DOI: 10.1038/s41591-023-02425-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/12/2022] [Accepted: 05/30/2023] [Indexed: 07/20/2023]
Abstract
An individual's disease risk is affected by the populations that they belong to, due to shared genetics and environmental factors. The study of fine-scale populations in clinical care is important for identifying and reducing health disparities and for developing personalized interventions. To assess patterns of clinical diagnoses and healthcare utilization by fine-scale populations, we leveraged genetic data and electronic medical records from 35,968 patients as part of the UCLA ATLAS Community Health Initiative. We defined clusters of individuals using identity by descent, a form of genetic relatedness that utilizes shared genomic segments arising due to a common ancestor. In total, we identified 376 clusters, including clusters with patients of Afro-Caribbean, Puerto Rican, Lebanese Christian, Iranian Jewish and Gujarati ancestry. Our analysis uncovered 1,218 significant associations between disease diagnoses and clusters and 124 significant associations with specialty visits. We also examined the distribution of pathogenic alleles and found 189 significant alleles at elevated frequency in particular clusters, including many that are not regularly included in population screening efforts. Overall, this work progresses the understanding of health in understudied communities and can provide the foundation for further study into health inequities.
Affiliation(s)
- Christa Caggiano
  - Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Neurology, University of California, Los Angeles, Los Angeles, CA, USA
- Ruhollah Shemirani
  - Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Joel Mefford
  - Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, Los Angeles, CA, USA
- Ella Petter
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
- Alec Chiu
  - Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA, USA
- Defne Ercelen
  - Computational and Systems Biology Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, USA
- Rosemary He
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Daniel Tward
  - Department of Neurology, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Kimberly C Paul
  - Department of Neurology, University of California, Los Angeles, Los Angeles, CA, USA
- Timothy S Chang
  - Department of Neurology, University of California, Los Angeles, Los Angeles, CA, USA
- Bogdan Pasaniuc
  - Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
  - Institute of Precision Health, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Pathology and Laboratory Medicine, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
- Eimear E Kenny
  - Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Jonathan A Shortt
  - Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
  - Division of Bioinformatics and Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Christopher R Gignoux
  - Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
  - Division of Bioinformatics and Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Brunilda Balliu
  - Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Valerie A Arboleda
  - Department of Pathology and Laboratory Medicine, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
- Gillian Belbin
  - Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Noah Zaitlen
  - Department of Neurology, University of California, Los Angeles, Los Angeles, CA, USA.
  - Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA.
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA.
6
Deshpande D, Chhugani K, Chang Y, Karlsberg A, Loeffler C, Zhang J, Muszyńska A, Munteanu V, Yang H, Rotman J, Tao L, Balliu B, Tseng E, Eskin E, Zhao F, Mohammadi P, Łabaj PP, Mangul S. RNA-seq data science: From raw data to effective interpretation. Front Genet 2023; 14:997383. [PMID: 36999049 PMCID: PMC10043755 DOI: 10.3389/fgene.2023.997383] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 07/18/2022] [Accepted: 02/24/2023] [Indexed: 03/14/2023] Open
Abstract
RNA sequencing (RNA-seq) has become an exemplary technology in modern biology and clinical science. Its immense popularity is due in large part to the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools to analyze the enormous amounts of transcriptomic data that it produces. RNA-seq analysis enables genes and their corresponding transcripts to be probed for a variety of purposes, such as detecting novel exons or whole transcripts, assessing expression of genes and alternative transcripts, and studying alternative splicing structure. It can be a challenge, however, to obtain meaningful biological signals from raw RNA-seq data because of the enormous scale of the data as well as the inherent limitations of different sequencing technologies, such as amplification bias or biases of library preparation. The need to overcome these technical challenges has pushed the rapid development of novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. These tools, combined with the diverse computational skill sets of biomedical researchers, help to unlock the full potential of RNA-seq. The purpose of this review is to explain basic concepts in the computational analysis of RNA-seq data and define discipline-specific jargon.
Affiliation(s)
- Dhrithi Deshpande
  - Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Karishma Chhugani
  - Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Yutong Chang
  - Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Aaron Karlsberg
  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Caitlin Loeffler
  - Department of Computer Science, University of California, Los Angeles, CA, United States
- Jinyang Zhang
  - Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
- Agata Muszyńska
  - Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
  - Institute of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
- Viorel Munteanu
  - Department of Computers, Informatics and Microelectronics, Technical University of Moldova, Chisinau, Moldova
- Harry Yang
  - Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, CA, United States
- Jeremy Rotman
  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Laura Tao
  - Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
- Brunilda Balliu
  - Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
- Eleazar Eskin
  - Department of Computer Science, University of California, Los Angeles, CA, United States
  - Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, United States
- Fangqing Zhao
  - Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
  - Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
- Pejman Mohammadi
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States
- Paweł P. Łabaj
  - Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
  - Department of Biotechnology, Boku University Vienna, Vienna, Austria
- Serghei Mangul
  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
  - Department of Quantitative and Computational Biology, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, United States
  - *Correspondence: Serghei Mangul,
7
Johnson R, Ding Y, Venkateswaran V, Bhattacharya A, Boulier K, Chiu A, Knyazev S, Schwarz T, Freund M, Zhan L, Burch KS, Caggiano C, Hill B, Rakocz N, Balliu B, Denny CT, Sul JH, Zaitlen N, Arboleda VA, Halperin E, Sankararaman S, Butte MJ, Lajonchere C, Geschwind DH, Pasaniuc B. Author Correction: Leveraging genomic diversity for discovery in an electronic health record linked biobank: the UCLA ATLAS Community Health Initiative. Genome Med 2022; 14:128. [PMID: 36384576 PMCID: PMC9670414 DOI: 10.1186/s13073-022-01128-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Affiliation(s)
- Ruth Johnson
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Yi Ding
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Vidhya Venkateswaran
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Department of Oral Biology, School of Dentistry, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Arjun Bhattacharya
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Kristin Boulier
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Department of Medicine, Division of Cardiology, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Alec Chiu
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Sergey Knyazev
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Tommer Schwarz
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Malika Freund
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Department of Genetics, Stanford School of Medicine, Stanford, CA, 94305, USA
- Lingyu Zhan
  - Molecular Biology Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Kathryn S Burch
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Christa Caggiano
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Brian Hill
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Nadav Rakocz
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Brunilda Balliu
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Christopher T Denny
  - Division of Hematology/Oncology, Department of Pediatrics, Gwynne Hazen Cherry Memorial Laboratories, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Jae Hoon Sul
  - Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Noah Zaitlen
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Valerie A Arboleda
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Eran Halperin
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Anesthesiology and Perioperative Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Sriram Sankararaman
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Manish J Butte
- Department of Pediatrics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Clara Lajonchere
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Institute of Precision Health, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Daniel H Geschwind
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Institute of Precision Health, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Bogdan Pasaniuc
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Institute of Precision Health, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
| |
|
8
|
Thompson M, Gordon MG, Lu A, Tandon A, Halperin E, Gusev A, Ye CJ, Balliu B, Zaitlen N. Multi-context genetic modeling of transcriptional regulation resolves novel disease loci. Nat Commun 2022; 13:5704. [PMID: 36171194 PMCID: PMC9519579 DOI: 10.1038/s41467-022-33212-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/02/2021] [Accepted: 09/07/2022] [Indexed: 12/01/2022] Open
Abstract
A majority of the variants identified in genome-wide association studies fall in non-coding regions of the genome, indicating their mechanism of impact is mediated via gene expression. Leveraging this hypothesis, transcriptome-wide association studies (TWAS) have assisted in both the interpretation and discovery of additional genes associated with complex traits. However, existing methods for conducting TWAS do not take full advantage of the intra-individual correlation inherently present in multi-context expression studies and do not properly adjust for multiple testing across contexts. We introduce CONTENT, a computationally efficient method with proper cross-context false discovery correction that leverages correlation structure across contexts to improve power and generate context-specific and context-shared components of expression. We apply CONTENT to bulk multi-tissue and single-cell RNA-seq data sets and show that CONTENT leads to a 42% (bulk) and 110% (single cell) increase in the number of genetically predicted genes relative to previous approaches. We find the context-specific component of expression comprises 30% of heritability in tissue-level bulk data and 75% in single-cell data, consistent with cell-type heterogeneity in bulk tissue. In the context of TWAS, CONTENT increases the number of locus-phenotype associations discovered by over 51% relative to previous methods across 22 complex traits.
Affiliation(s)
- Mike Thompson
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA.
| | - Mary Grace Gordon
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Biological and Medical Informatics Graduate Program, University of California, San Francisco, San Francisco, CA, USA
| | - Andrew Lu
- UCLA-Caltech Medical Scientist Training Program, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Anchit Tandon
- Department of Mathematics, Indian Institute of Technology Delhi, Hauz Khas, Delhi, India
| | - Eran Halperin
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, University of California Los Angeles, Los Angeles, CA, USA
- Department of Anesthesiology and Perioperative Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Alexander Gusev
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, US
- Division of Genetics, Brigham and Women's Hospital, Boston, MA, US
| | - Chun Jimmie Ye
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Chan-Zuckerberg Biohub, San Francisco, CA, USA
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Institute for Computational Health Sciences, University of California, San Francisco, San Francisco, CA, USA
| | - Brunilda Balliu
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Noah Zaitlen
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA.
- Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA.
| |
|
9
|
Johnson R, Ding Y, Venkateswaran V, Bhattacharya A, Boulier K, Chiu A, Knyazev S, Schwarz T, Freund M, Zhan L, Burch KS, Caggiano C, Hill B, Rakocz N, Balliu B, Denny CT, Sul JH, Zaitlen N, Arboleda VA, Halperin E, Sankararaman S, Butte MJ, Lajonchere C, Geschwind DH, Pasaniuc B. Leveraging genomic diversity for discovery in an electronic health record linked biobank: the UCLA ATLAS Community Health Initiative. Genome Med 2022; 14:104. [PMID: 36085083 PMCID: PMC9461263 DOI: 10.1186/s13073-022-01106-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 09/28/2021] [Accepted: 08/03/2022] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND Large medical centers in urban areas, like Los Angeles, care for a diverse patient population and offer the potential to study the interplay between genetic ancestry and social determinants of health. Here, we explore the implications of genetic ancestry within the University of California, Los Angeles (UCLA) ATLAS Community Health Initiative, an ancestrally diverse biobank of genomic data linked with de-identified electronic health records (EHRs) of UCLA Health patients (N=36,736). METHODS We quantify the extensive continental and subcontinental genetic diversity within the ATLAS data through principal component analysis, identity-by-descent, and genetic admixture. We assess the relationship between genetically inferred ancestry (GIA) and >1500 EHR-derived phenotypes (phecodes). Finally, we demonstrate the utility of genetic data linked with EHR to perform ancestry-specific and multi-ancestry genome and phenome-wide scans across a broad set of disease phenotypes. RESULTS We identify 5 continental-scale GIA clusters including European American (EA), African American (AA), Hispanic Latino American (HL), South Asian American (SAA) and East Asian American (EAA) individuals and 7 subcontinental GIA clusters within the EAA GIA corresponding to Chinese American, Vietnamese American, and Japanese American individuals. Although we broadly find that self-identified race/ethnicity (SIRE) is highly correlated with GIA, we still observe marked differences between the two, emphasizing that the populations defined by these two criteria are not analogous. We find a total of 259 significant associations between continental GIA and phecodes even after accounting for individuals' SIRE, demonstrating that for some phenotypes, GIA provides information not already captured by SIRE. GWAS identifies significant associations for liver disease in the 22q13.31 locus across the HL and EAA GIA groups (HL p-value=2.32×10⁻¹⁶, EAA p-value=6.73×10⁻¹¹). A subsequent PheWAS at the top SNP reveals significant associations with neurologic and neoplastic phenotypes specifically within the HL GIA group. CONCLUSIONS Overall, our results explore the interplay between SIRE and GIA within a disease context and underscore the utility of studying the genomes of diverse individuals through biobank-scale genotyping linked with EHR-based phenotyping.
Affiliation(s)
- Ruth Johnson
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
| | - Yi Ding
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Vidhya Venkateswaran
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Oral Biology, School of Dentistry, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Arjun Bhattacharya
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Kristin Boulier
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Medicine, Division of Cardiology, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Alec Chiu
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Sergey Knyazev
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Tommer Schwarz
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Malika Freund
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Genetics, Stanford School of Medicine, Stanford, CA, 94305, USA
| | - Lingyu Zhan
- Molecular Biology Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Kathryn S Burch
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Christa Caggiano
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Brian Hill
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Nadav Rakocz
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Brunilda Balliu
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Christopher T Denny
- Division of Hematology/Oncology, Department of Pediatrics, Gwynne Hazen Cherry Memorial Laboratories, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Jae Hoon Sul
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Noah Zaitlen
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Valerie A Arboleda
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Eran Halperin
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Anesthesiology and Perioperative Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Sriram Sankararaman
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Manish J Butte
- Department of Pediatrics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Clara Lajonchere
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Institute of Precision Health, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Daniel H Geschwind
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Institute of Precision Health, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Bogdan Pasaniuc
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Institute of Precision Health, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
| |
|
10
|
Perez RK, Gordon MG, Subramaniam M, Kim MC, Hartoularos GC, Targ S, Sun Y, Ogorodnikov A, Bueno R, Lu A, Thompson M, Rappoport N, Dahl A, Lanata CM, Matloubian M, Maliskova L, Kwek SS, Li T, Slyper M, Waldman J, Dionne D, Rozenblatt-Rosen O, Fong L, Dall’Era M, Balliu B, Regev A, Yazdany J, Criswell LA, Zaitlen N, Ye CJ. Single-cell RNA-seq reveals cell type-specific molecular and genetic associations to lupus. Science 2022; 376:eabf1970. [PMID: 35389781 PMCID: PMC9297655 DOI: 10.1126/science.abf1970] [Citation(s) in RCA: 112] [Impact Index Per Article: 56.0] [Indexed: 12/13/2022]
Abstract
Systemic lupus erythematosus (SLE) is a heterogeneous autoimmune disease. Knowledge of circulating immune cell types and states associated with SLE remains incomplete. We profiled more than 1.2 million peripheral blood mononuclear cells (162 cases, 99 controls) with multiplexed single-cell RNA sequencing (mux-seq). Cases exhibited elevated expression of type 1 interferon-stimulated genes (ISGs) in monocytes, reduction of naïve CD4+ T cells that correlated with monocyte ISG expression, and expansion of repertoire-restricted cytotoxic GZMH+ CD8+ T cells. Cell type-specific expression features predicted case-control status and stratified patients into two molecular subtypes. We integrated dense genotyping data to map cell type-specific cis-expression quantitative trait loci and to link SLE-associated variants to cell type-specific expression. These results demonstrate mux-seq as a systematic approach to characterize cellular composition, identify transcriptional signatures, and annotate genetic variants associated with SLE.
Affiliation(s)
- Richard K. Perez
- School of Medicine, University of California, San Francisco, CA, USA
| | - M. Grace Gordon
- Biological and Medical Informatics Graduate Program, University of California, San Francisco, CA, USA
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, CA, USA
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA
| | - Meena Subramaniam
- Biological and Medical Informatics Graduate Program, University of California, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, CA, USA
| | - Min Cheol Kim
- School of Medicine, University of California, San Francisco, CA, USA
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, CA, USA
- Medical Scientist Training Program, University of California, San Francisco, CA, USA
- UC Berkeley–UCSF Graduate Program in Bioengineering, San Francisco, CA, USA
| | - George C. Hartoularos
- Biological and Medical Informatics Graduate Program, University of California, San Francisco, CA, USA
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, CA, USA
| | - Sasha Targ
- School of Medicine, University of California, San Francisco, CA, USA
- Biological and Medical Informatics Graduate Program, University of California, San Francisco, CA, USA
- Medical Scientist Training Program, University of California, San Francisco, CA, USA
| | - Yang Sun
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, CA, USA
| | - Anton Ogorodnikov
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, CA, USA
| | - Raymund Bueno
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, CA, USA
| | - Andrew Lu
- UCLA-Caltech Medical Scientist Training Program, Los Angeles, CA, USA
| | - Mike Thompson
- Department of Computer Science, University of California, Los Angeles, CA, USA
| | - Nadav Rappoport
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Be’er Sheva, Israel
| | - Andrew Dahl
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
| | - Cristina M. Lanata
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
- Rosalind Russell/Ephraim P. Engleman Rheumatology Research Center, University of California, San Francisco, CA, USA
| | - Mehrdad Matloubian
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
- Rosalind Russell/Ephraim P. Engleman Rheumatology Research Center, University of California, San Francisco, CA, USA
| | - Lenka Maliskova
- Institute for Human Genetics, University of California, San Francisco, CA, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA
| | - Serena S. Kwek
- Division of Hematology/Oncology, Department of Medicine, University of California, San Francisco, CA, USA
| | - Tony Li
- Division of Hematology/Oncology, Department of Medicine, University of California, San Francisco, CA, USA
| | - Michal Slyper
- Klarman Cell Observatory, Broad Institute, Cambridge, MA, USA
| | - Julia Waldman
- Klarman Cell Observatory, Broad Institute, Cambridge, MA, USA
| | - Danielle Dionne
- Klarman Cell Observatory, Broad Institute, Cambridge, MA, USA
| | | | - Lawrence Fong
- Division of Hematology/Oncology, Department of Medicine, University of California, San Francisco, CA, USA
| | - Maria Dall’Era
- School of Medicine, University of California, San Francisco, CA, USA
| | - Brunilda Balliu
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
| | - Aviv Regev
- Klarman Cell Observatory, Broad Institute, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Howard Hughes Medical Institute, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Jinoos Yazdany
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
| | - Lindsey A. Criswell
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, CA, USA
- Rosalind Russell/Ephraim P. Engleman Rheumatology Research Center, University of California, San Francisco, CA, USA
| | - Noah Zaitlen
- Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, CA, USA
| | - Chun Jimmie Ye
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, CA, USA
- Rosalind Russell/Ephraim P. Engleman Rheumatology Research Center, University of California, San Francisco, CA, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA
- Parker Institute for Cancer Immunotherapy, San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, USA
| |
|
11
|
Gloudemans MJ, Balliu B, Nachun D, Schnurr TM, Durrant MG, Ingelsson E, Wabitsch M, Quertermous T, Montgomery SB, Knowles JW, Carcamo-Orive I. Integration of genetic colocalizations with physiological and pharmacological perturbations identifies cardiometabolic disease genes. Genome Med 2022; 14:31. [PMID: 35292083 PMCID: PMC8925074 DOI: 10.1186/s13073-022-01036-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/19/2021] [Accepted: 03/04/2022] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND Identification of causal genes for polygenic human diseases has been extremely challenging, and our understanding of how physiological and pharmacological stimuli modulate genetic risk at disease-associated loci is limited. Specifically, insulin resistance (IR), a common feature of cardiometabolic disease, including type 2 diabetes, obesity, and dyslipidemia, lacks well-powered genome-wide association studies (GWAS), and therefore, few associated loci and causal genes have been identified. METHODS Here, we perform and integrate linkage disequilibrium (LD)-adjusted colocalization analyses across nine cardiometabolic traits (fasting insulin, fasting glucose, insulin sensitivity, insulin sensitivity index, type 2 diabetes, triglycerides, high-density lipoprotein, body mass index, and waist-hip ratio) combined with expression and splicing quantitative trait loci (eQTLs and sQTLs) from five metabolically relevant human tissues (subcutaneous and visceral adipose, skeletal muscle, liver, and pancreas). To elucidate the upstream regulators and functional mechanisms for these genes, we integrate their transcriptional responses to 21 relevant physiological and pharmacological perturbations in human adipocytes, hepatocytes, and skeletal muscle cells and map their protein-protein interactions. RESULTS We identify 470 colocalized loci and prioritize 207 loci with a single colocalized gene. Patterns of shared colocalizations across traits and tissues highlight different potential roles for colocalized genes in cardiometabolic disease and distinguish several genes involved in pancreatic β-cell function from others with a more direct role in skeletal muscle, liver, and adipose tissues. At the loci with a single colocalized gene, 42 of these genes were regulated by insulin and 35 by glucose in perturbation experiments, including 17 regulated by both. Other metabolic perturbations regulated the expression of 30 more genes not regulated by glucose or insulin, pointing to other potential upstream regulators of candidate causal genes. CONCLUSIONS Our use of transcriptional responses under metabolic perturbations to contextualize genetic associations from our custom colocalization approach provides a list of likely causal genes and their upstream regulators in the context of IR-associated cardiometabolic risk.
Affiliation(s)
- Michael J Gloudemans
- Biomedical Informatics Training Program, Stanford, CA, USA
- Department of Pathology, Stanford, CA, USA.
| | - Brunilda Balliu
- Department of Computational Medicine, UCLA, Los Angeles, CA, USA
| | - Daniel Nachun
- Department of Genetics, Stanford, CA, USA
- Department of Immunology, Stanford, CA, USA
| | - Theresia M Schnurr
- Department of Medicine, Division of Cardiovascular Medicine and Cardiovascular Institute, Stanford, CA, USA
| | | | - Erik Ingelsson
- Department of Medicine, Division of Cardiovascular Medicine and Cardiovascular Institute, Stanford, CA, USA
| | - Martin Wabitsch
- Department of Pediatrics and Adolescent Medicine, Division of Pediatric Endocrinology, Ulm University, Ulm, Germany
| | - Thomas Quertermous
- Department of Medicine, Division of Cardiovascular Medicine and Cardiovascular Institute, Stanford, CA, USA
- Diabetes Research Center, Stanford, CA, USA
| | - Stephen B Montgomery
- Department of Pathology, Stanford, CA, USA
- Department of Genetics, Stanford, CA, USA.
| | - Joshua W Knowles
- Department of Medicine, Division of Cardiovascular Medicine and Cardiovascular Institute, Stanford, CA, USA
- Diabetes Research Center, Stanford, CA, USA
- Prevention Research Center, Stanford, CA, USA.
| | - Ivan Carcamo-Orive
- Department of Medicine, Division of Cardiovascular Medicine and Cardiovascular Institute, Stanford, CA, USA. .,Diabetes Research Center, Stanford, CA, USA.
| |
Collapse
|
12
|
Briscoe L, Balliu B, Sankararaman S, Halperin E, Garud NR. Evaluating supervised and unsupervised background noise correction in human gut microbiome data. PLoS Comput Biol 2022; 18:e1009838. [PMID: 35130266 PMCID: PMC8853548 DOI: 10.1371/journal.pcbi.1009838] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/25/2021] [Revised: 02/17/2022] [Accepted: 01/15/2022] [Indexed: 12/13/2022] Open
Abstract
The ability to predict human phenotypes and identify biomarkers of disease from metagenomic data is crucial for the development of therapeutics for microbiome-associated diseases. However, metagenomic data is commonly affected by technical variables unrelated to the phenotype of interest, such as sequencing protocol, which can make it difficult to predict phenotype and find biomarkers of disease. Supervised methods to correct for background noise, originally designed for gene expression and RNA-seq data, are commonly applied to microbiome data but may be limited because they cannot account for unmeasured sources of variation. Unsupervised approaches address this issue, but current methods are limited because they are ill-equipped to deal with the unique aspects of microbiome data, which is compositional, highly skewed, and sparse. We perform a comparative analysis of the ability of different denoising transformations in combination with supervised correction methods as well as an unsupervised principal component correction approach that is presently used in other domains but has not been applied to microbiome data to date. We find that the unsupervised principal component correction approach is comparable to the supervised approaches in reducing false discovery of biomarkers, with the added benefit of not needing to know the sources of variation a priori. However, in prediction tasks, it appears to improve prediction only when technical variables contribute to the majority of variance in the data. As new and larger metagenomic datasets become increasingly available, background noise correction will become essential for generating reproducible microbiome analyses.
The human gut microbiome is known to play a major role in health and is associated with many diseases including colorectal cancer, obesity, and diabetes. The prediction of host phenotypes and identification of biomarkers of disease is essential for harnessing the therapeutic potential of the microbiome. However, many metagenomic datasets are affected by technical variables that introduce unwanted variation that can confound the ability to predict phenotypes and identify biomarkers. Currently, supervised methods originally designed for gene expression and RNA-seq data are commonly applied to microbiome data for correction of background noise, but they are limited in that they cannot correct for unmeasured sources of variation. Unsupervised approaches address this issue, but current methods are limited because they are ill-equipped to deal with the unique aspects of microbiome data, which is compositional, highly skewed, and sparse. We perform a comparative analysis of the ability of different denoising transformations in combination with supervised correction methods as well as an unsupervised principal component correction approach and find that all correction approaches reduce false positives for biomarker discovery. In the task of predicting phenotypes, the approaches vary in success, and the unsupervised correction can improve prediction when technical variables contribute to the majority of variance in the data. As new and larger metagenomic datasets become increasingly available, background noise correction will become essential for generating reproducible microbiome analyses.
Affiliation(s)
- Leah Briscoe
  - Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, California, United States of America
  - * E-mail: (LB); (EH); (NRG)
- Brunilda Balliu
  - Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- Sriram Sankararaman
  - Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
  - Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- Eran Halperin
  - Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
  - Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
  - Department of Anesthesiology and Perioperative Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
  - Institute of Precision Health, University of California Los Angeles, Los Angeles, California, United States of America
  - * E-mail: (LB); (EH); (NRG)
- Nandita R. Garud
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
  - Department of Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, California, United States of America
  - * E-mail: (LB); (EH); (NRG)
13
Balliu B, Carcamo-Orive I, Gloudemans MJ, Nachun DC, Durrant MG, Gazal S, Park CY, Knowles DA, Wabitsch M, Quertermous T, Knowles JW, Montgomery SB. An integrated approach to identify environmental modulators of genetic risk factors for complex traits. Am J Hum Genet 2021; 108:1866-1879. [PMID: 34582792 PMCID: PMC8546041 DOI: 10.1016/j.ajhg.2021.08.014] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/06/2021] [Accepted: 08/27/2021] [Indexed: 12/16/2022] Open
Abstract
Complex traits and diseases can be influenced by both genetics and environment. However, given the large number of environmental stimuli and power challenges for gene-by-environment testing, it remains a critical challenge to identify and prioritize specific disease-relevant environmental exposures. We propose a framework for leveraging signals from transcriptional responses to environmental perturbations to identify disease-relevant perturbations that can modulate genetic risk for complex traits and inform the functions of genetic variants associated with complex traits. We perturbed human skeletal-muscle-, fat-, and liver-relevant cell lines with 21 perturbations affecting insulin resistance, glucose homeostasis, and metabolic regulation in humans and identified thousands of environmentally responsive genes. By combining these data with GWASs from 31 distinct polygenic traits, we show that the heritability of multiple traits is enriched in regions surrounding genes responsive to specific perturbations and, further, that environmentally responsive genes are enriched for associations with specific diseases and phenotypes from the GWAS Catalog. Overall, we demonstrate the advantages of large-scale characterization of transcriptional changes in diversely stimulated and pathologically relevant cells to identify disease-relevant perturbations.
Affiliation(s)
- Brunilda Balliu
  - Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Ivan Carcamo-Orive
  - Department of Medicine, Division of Cardiovascular Medicine, Cardiovascular Institute and Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA 94305, USA
- Michael J Gloudemans
  - Biomedical Informatics Training Program and Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Daniel C Nachun
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Matthew G Durrant
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Steven Gazal
  - Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
- Chong Y Park
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
- David A Knowles
  - New York Genome Center, New York, NY 10013, USA; Department of Computer Science, Columbia University, New York, NY 10027, USA
- Martin Wabitsch
  - Department of Pediatrics and Adolescent Medicine, Division of Pediatric Endocrinology, Ulm University, Ulm 89075, Germany
- Thomas Quertermous
  - Department of Medicine, Division of Cardiology and Cardiovascular Institute, Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA 94305, USA
- Joshua W Knowles
  - Department of Medicine, Division of Cardiology and Cardiovascular Institute, Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA 94305, USA
- Stephen B Montgomery
  - Department of Pathology and Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
14
Alser M, Rotman J, Deshpande D, Taraszka K, Shi H, Baykal PI, Yang HT, Xue V, Knyazev S, Singer BD, Balliu B, Koslicki D, Skums P, Zelikovsky A, Alkan C, Mutlu O, Mangul S. Technology dictates algorithms: recent developments in read alignment. Genome Biol 2021; 22:249. [PMID: 34446078 PMCID: PMC8390189 DOI: 10.1186/s13059-021-02443-7] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 10/15/2020] [Accepted: 07/28/2021] [Indexed: 01/08/2023] Open
Abstract
Aligning sequencing reads onto a reference is an essential step of the majority of genomic analysis pipelines. Computational algorithms for read alignment have evolved in accordance with technological advances, leading to today's diverse array of alignment methods. We provide a systematic survey of algorithmic foundations and methodologies across 107 alignment methods, for both short and long reads. We provide a rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read alignment. We discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology.
Affiliation(s)
- Mohammed Alser
  - Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland
  - Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
  - Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
- Jeremy Rotman
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Dhrithi Deshpande
  - Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA
- Kodi Taraszka
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Huwenbo Shi
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Pelin Icer Baykal
  - Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
- Harry Taegyun Yang
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
  - Bioinformatics Interdepartmental Ph.D. Program, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Victor Xue
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Sergey Knyazev
  - Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
- Benjamin D Singer
  - Division of Pulmonary and Critical Care Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
  - Department of Biochemistry & Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, USA
  - Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Brunilda Balliu
  - Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA
- David Koslicki
  - Computer Science and Engineering, Pennsylvania State University, University Park, PA, 16801, USA
  - Biology Department, Pennsylvania State University, University Park, PA, 16801, USA
  - The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16801, USA
- Pavel Skums
  - Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
- Alex Zelikovsky
  - Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
  - The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
- Can Alkan
  - Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
  - Bilkent-Hacettepe Health Sciences and Technologies Program, Ankara, Turkey
- Onur Mutlu
  - Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland
  - Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
  - Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
- Serghei Mangul
  - Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA
15
Temple WC, Vo KT, Matthay KK, Balliu B, Coleman C, Michlitsch J, Phelps A, Behr S, Zapala MA. Association of image-defined risk factors with clinical features, histopathology, and outcomes in neuroblastoma. Cancer Med 2020; 10:2232-2241. [PMID: 33314708 PMCID: PMC7982630 DOI: 10.1002/cam4.3663] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 06/25/2020] [Revised: 10/05/2020] [Accepted: 11/17/2020] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Clinical, molecular, and histopathologic features guide treatment for neuroblastoma, but obtaining tumor tissue may cause complications and is subject to sampling error due to tumor heterogeneity. We hypothesized that image-defined risk factors (IDRFs) would reflect molecular features, histopathology, and clinical outcomes in neuroblastoma. METHODS We performed a retrospective cohort study of 76 patients with neuroblastoma or ganglioneuroblastoma. Diagnostic CT scans were reviewed for 20 IDRFs, which were consolidated into five IDRF groups (involvement of multiple body compartments, vascular encasement, tumor infiltration of adjacent organs/structures, airway compression, or intraspinal extension). IDRF groups were analyzed for association with clinical, molecular, and histopathologic features of neuroblastoma. RESULTS Patients with more IDRF groups had a higher risk of surgical complications (OR = 3.1, p = 0.001). Tumor vascular encasement was associated with increased risk of surgical complications (OR = 5.40, p = 0.009) and increased risk of undifferentiated/poorly differentiated histologic grade (OR = 11.11, p = 0.013). Tumor infiltration of adjacent organs and structures was associated with decreased survival (HR = 8.90, p = 0.007), MYCN amplification (OR = 9.91, p = 0.001), high MKI (OR = 6.20, p = 0.003), and increased risk of International Neuroblastoma Staging System stage 4 disease (OR = 8.96, p < 0.001). CONCLUSIONS The presence of IDRFs at diagnosis was associated with high-risk clinical, molecular, and histopathologic features of neuroblastoma. The IDRF group tumor infiltration into adjacent organs and structures was associated with decreased survival. Collectively, these findings may assist surgical planning and medical management for neuroblastoma patients.
Affiliation(s)
- William C Temple
  - Department of Pediatrics, UCSF School of Medicine and UCSF Benioff Children's Hospital, San Francisco, CA, USA
- Kieuhoa T Vo
  - Department of Pediatrics, UCSF School of Medicine and UCSF Benioff Children's Hospital, San Francisco, CA, USA
- Katherine K Matthay
  - Department of Pediatrics, UCSF School of Medicine and UCSF Benioff Children's Hospital, San Francisco, CA, USA
- Christina Coleman
  - Department of Hematology and Oncology, UCSF Benioff Children's Hospital, Oakland, Oakland, CA, USA
- Jennifer Michlitsch
  - Department of Hematology and Oncology, UCSF Benioff Children's Hospital, Oakland, Oakland, CA, USA
- Andrew Phelps
  - Department of Radiology and Biomedical Imaging, University of California, San Francisco, San Francisco, CA, USA
- Spencer Behr
  - Department of Radiology and Biomedical Imaging, University of California, San Francisco, San Francisco, CA, USA
- Matthew A Zapala
  - Department of Radiology and Biomedical Imaging, University of California, San Francisco, San Francisco, CA, USA
16
Oliva M, Muñoz-Aguirre M, Kim-Hellmuth S, Wucher V, Gewirtz ADH, Cotter DJ, Parsana P, Kasela S, Balliu B, Viñuela A, Castel SE, Mohammadi P, Aguet F, Zou Y, Khramtsova EA, Skol AD, Garrido-Martín D, Reverter F, Brown A, Evans P, Gamazon ER, Payne A, Bonazzola R, Barbeira AN, Hamel AR, Martinez-Perez A, Soria JM, Pierce BL, Stephens M, Eskin E, Dermitzakis ET, Segrè AV, Im HK, Engelhardt BE, Ardlie KG, Montgomery SB, Battle AJ, Lappalainen T, Guigó R, Stranger BE. The impact of sex on gene expression across human tissues. Science 2020; 369(6509):eaba3066. [PMID: 32913072 DOI: 10.1126/science.aba3066] [Citation(s) in RCA: 257] [Impact Index Per Article: 64.3] [Received: 11/25/2019] [Accepted: 08/03/2020] [Indexed: 12/12/2022]
Abstract
Many complex human phenotypes exhibit sex-differentiated characteristics. However, the molecular mechanisms underlying these differences remain largely unknown. We generated a catalog of sex differences in gene expression and in the genetic regulation of gene expression across 44 human tissue sources surveyed by the Genotype-Tissue Expression project (GTEx, v8 release). We demonstrate that sex influences gene expression levels and cellular composition of tissue samples across the human body. A total of 37% of all genes exhibit sex-biased expression in at least one tissue. We identify cis expression quantitative trait loci (eQTLs) with sex-differentiated effects and characterize their cellular origin. By integrating sex-biased eQTLs with genome-wide association study data, we identify 58 gene-trait associations that are driven by genetic regulation of gene expression in a single sex. These findings provide an extensive characterization of sex differences in the human transcriptome and its genetic regulation.
Affiliation(s)
- Meritxell Oliva
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA; Institute for Genomics and Systems Biology, University of Chicago, Chicago, IL, USA; Department of Public Health Sciences, University of Chicago, Chicago, IL, USA
- Manuel Muñoz-Aguirre
  - Centre for Genomic Regulation, Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain; Department of Statistics and Operations Research, Universitat Politècnica de Catalunya, Barcelona, Catalonia, Spain
- Sarah Kim-Hellmuth
  - Statistical Genetics, Max Planck Institute of Psychiatry, Munich, Germany; New York Genome Center, New York, NY, USA; Department of Systems Biology, Columbia University, New York, NY, USA
- Valentin Wucher
  - Centre for Genomic Regulation, Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
- Ariel D H Gewirtz
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Daniel J Cotter
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Princy Parsana
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Silva Kasela
  - New York Genome Center, New York, NY, USA; Department of Systems Biology, Columbia University, New York, NY, USA
- Brunilda Balliu
  - Department of Computational Medicine, University of California, Los Angeles, CA, USA
- Ana Viñuela
  - Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Stephane E Castel
  - New York Genome Center, New York, NY, USA; Department of Systems Biology, Columbia University, New York, NY, USA
- Pejman Mohammadi
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, Scripps Research Translational Institute, La Jolla, CA, USA
- Yuxin Zou
  - Department of Statistics, University of Chicago, Chicago, IL, USA
- Ekaterina A Khramtsova
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA; Computational Sciences, Janssen Pharmaceuticals, Spring House, PA, USA
- Andrew D Skol
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA; Institute for Genomics and Systems Biology, University of Chicago, Chicago, IL, USA; Center for Translational Data Science, University of Chicago, Chicago, IL, USA; Department of Pathology and Laboratory Medicine, Ann and Robert H. Lurie Children's Hospital of Chicago, Chicago, IL, USA
- Diego Garrido-Martín
  - Centre for Genomic Regulation, Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
- Ferran Reverter
  - Department of Genetics, Microbiology and Statistics, Faculty of Biology, University of Barcelona, Barcelona, Spain
- Patrick Evans
  - Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Eric R Gamazon
  - Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, USA; Clare Hall, University of Cambridge, Cambridge, UK
- Anthony Payne
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Rodrigo Bonazzola
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Alvaro N Barbeira
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Andrew R Hamel
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA; Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
- Angel Martinez-Perez
  - Genomics of Complex Diseases Group, Research Institute Hospital de la Sant Creu i Sant Pau, IIB Sant Pau, Barcelona, Spain
- José Manuel Soria
  - Genomics of Complex Diseases Group, Research Institute Hospital de la Sant Creu i Sant Pau, IIB Sant Pau, Barcelona, Spain
- Brandon L Pierce
  - Department of Public Health Sciences, University of Chicago, Chicago, IL, USA
- Matthew Stephens
  - Department of Statistics, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Eleazar Eskin
  - Departments of Computational Medicine, Computer Science, and Human Genetics, University of California, Los Angeles, CA, USA
- Emmanouil T Dermitzakis
  - Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Ayellet V Segrè
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA; Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
- Hae Kyung Im
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Barbara E Engelhardt
  - Department of Computer Science, Center for Statistics and Machine Learning, Princeton University, Princeton, NJ, USA; Genomics plc, Oxford, UK
- Stephen B Montgomery
  - Department of Genetics, Stanford University, Stanford, CA, USA; Department of Pathology, Stanford University, Stanford, CA, USA
- Alexis J Battle
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Tuuli Lappalainen
  - New York Genome Center, New York, NY, USA; Department of Systems Biology, Columbia University, New York, NY, USA
- Roderic Guigó
  - Centre for Genomic Regulation, Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain; Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Barbara E Stranger
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA; Institute for Genomics and Systems Biology, University of Chicago, Chicago, IL, USA; Center for Translational Data Science, University of Chicago, Chicago, IL, USA; Center for Genetic Medicine, Department of Pharmacology, Northwestern University, Chicago, IL, USA
17
Goodman-Meza D, Rudas A, Chiang JN, Adamson PC, Ebinger J, Sun N, Botting P, Fulcher JA, Saab FG, Brook R, Eskin E, An U, Kordi M, Jew B, Balliu B, Chen Z, Hill BL, Rahmani E, Halperin E, Manuel V. A machine learning algorithm to increase COVID-19 inpatient diagnostic capacity. PLoS One 2020; 15:e0239474. [PMID: 32960917 PMCID: PMC7508387 DOI: 10.1371/journal.pone.0239474] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 06/30/2020] [Accepted: 09/01/2020] [Indexed: 01/09/2023] Open
Abstract
Worldwide, testing capacity for SARS-CoV-2 is limited and bottlenecks in the scale-up of polymerase chain reaction (PCR)-based testing exist. Our aim was to develop and evaluate a machine learning algorithm to diagnose COVID-19 in the inpatient setting. The algorithm was based on basic demographic and laboratory features to serve as a screening tool at hospitals where testing is scarce or unavailable. We used retrospectively collected data from the UCLA Health System in Los Angeles, California. We included all emergency room or inpatient cases receiving SARS-CoV-2 PCR testing who also had a set of ancillary laboratory features (n = 1,455) between 1 March 2020 and 24 May 2020. We tested seven machine learning models and used a combination of those models for the final diagnostic classification. In the test set (n = 392), our combined model had an area under the receiver operator curve of 0.91 (95% confidence interval 0.87-0.96). The model achieved a sensitivity of 0.93 (95% CI 0.85-0.98) and a specificity of 0.64 (95% CI 0.58-0.69). We found that our machine learning algorithm had excellent diagnostic metrics compared to SARS-CoV-2 PCR. This ensemble machine learning algorithm to diagnose COVID-19 has the potential to be used as a screening tool in hospital settings where PCR testing is scarce or unavailable.
Affiliation(s)
- David Goodman-Meza
  - Division of Infectious Diseases, David Geffen School of Medicine at UCLA, Los Angeles, California, United States of America
- Akos Rudas
  - Department of Computational Medicine, UCLA, Los Angeles, California, United States of America
  - Faculty of Informatics, Eötvös Loránd University (ELTE), Budapest, Hungary
- Jeffrey N. Chiang
  - Department of Computational Medicine, UCLA, Los Angeles, California, United States of America
- Paul C. Adamson
  - Division of Infectious Diseases, David Geffen School of Medicine at UCLA, Los Angeles, California, United States of America
- Joseph Ebinger
  - Department of Cardiology, Cedars-Sinai Medical Center, Los Angeles, California, United States of America
- Nancy Sun
  - Department of Cardiology, Cedars-Sinai Medical Center, Los Angeles, California, United States of America
- Patrick Botting
  - Department of Cardiology, Cedars-Sinai Medical Center, Los Angeles, California, United States of America
- Jennifer A. Fulcher
  - Division of Infectious Diseases, David Geffen School of Medicine at UCLA, Los Angeles, California, United States of America
- Faysal G. Saab
  - Department of Medicine, David Geffen School of Medicine at UCLA, Los Angeles, California, United States of America
- Rachel Brook
  - Department of Medicine, David Geffen School of Medicine at UCLA, Los Angeles, California, United States of America
- Eleazar Eskin
  - Department of Computational Medicine, UCLA, Los Angeles, California, United States of America
  - Department of Computer Science, UCLA, Los Angeles, California, United States of America
  - Department of Human Genetics, UCLA, Los Angeles, California, United States of America
- Ulzee An
  - Department of Computer Science, UCLA, Los Angeles, California, United States of America
- Misagh Kordi
  - Department of Computational Medicine, UCLA, Los Angeles, California, United States of America
- Brandon Jew
  - Department of Computational Medicine, UCLA, Los Angeles, California, United States of America
- Brunilda Balliu
  - Department of Computational Medicine, UCLA, Los Angeles, California, United States of America
- Zeyuan Chen
  - Department of Computer Science, UCLA, Los Angeles, California, United States of America
- Brian L. Hill
  - Department of Computer Science, UCLA, Los Angeles, California, United States of America
- Elior Rahmani
  - Department of Computer Science, UCLA, Los Angeles, California, United States of America
- Eran Halperin
  - Department of Computational Medicine, UCLA, Los Angeles, California, United States of America
  - Department of Computer Science, UCLA, Los Angeles, California, United States of America
  - Department of Human Genetics, UCLA, Los Angeles, California, United States of America
  - Department of Anesthesiology, David Geffen School of Medicine at UCLA, Los Angeles, California, United States of America
- Vladimir Manuel
  - Faculty Practice Group, David Geffen School of Medicine at UCLA, Los Angeles, California, United States of America
  - UCLA Clinical and Translational Science Institute, Los Angeles, California, United States of America
18
Gay NR, Gloudemans M, Antonio ML, Abell NS, Balliu B, Park Y, Martin AR, Musharoff S, Rao AS, Aguet F, Barbeira AN, Bonazzola R, Hormozdiari F, Ardlie KG, Brown CD, Im HK, Lappalainen T, Wen X, Montgomery SB. Impact of admixture and ancestry on eQTL analysis and GWAS colocalization in GTEx. Genome Biol 2020; 21:233. [PMID: 32912333 PMCID: PMC7488497 DOI: 10.1186/s13059-020-02113-0] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 11/08/2019] [Accepted: 07/19/2020] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Population structure among study subjects may confound genetic association studies, and lack of proper correction can lead to spurious findings. The Genotype-Tissue Expression (GTEx) project largely contains individuals of European ancestry, but the v8 release also includes up to 15% of individuals of non-European ancestry. Assessing ancestry-based adjustments in GTEx improves portability of this research across populations and further characterizes the impact of population structure on GWAS colocalization. RESULTS Here, we identify a subset of 117 individuals in GTEx (v8) with a high degree of population admixture and estimate genome-wide local ancestry. We perform genome-wide cis-eQTL mapping using admixed samples in seven tissues, adjusted by either global or local ancestry. Consistent with previous work, we observe improved power with local ancestry adjustment. At loci where the two adjustments produce different lead variants, we observe 31 loci (0.02%) where a significant colocalization is called only with one eQTL ancestry adjustment method. Notably, both adjustments produce similar numbers of significant colocalizations within each of two different colocalization methods, COLOC and FINEMAP. Finally, we identify a small subset of eQTL-associated variants highly correlated with local ancestry, providing a resource to enhance functional follow-up. CONCLUSIONS We provide a local ancestry map for admixed individuals in the GTEx v8 release and describe the impact of ancestry and admixture on gene expression, eQTLs, and GWAS colocalization. While the majority of the results are concordant between local and global ancestry-based adjustments, we identify distinct advantages and disadvantages to each approach.
Affiliation(s)
- Nicole R. Gay
- Department of Genetics, Stanford University, Stanford, CA USA
| | | | | | - Nathan S. Abell
- Department of Genetics, Stanford University, Stanford, CA USA
| | - Brunilda Balliu
- Department of Biomathematics, University of California, Los Angeles, Los Angeles, CA USA
| | - YoSon Park
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA
| | - Alicia R. Martin
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA USA
- Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA USA
| | | | - Abhiram S. Rao
- Department of Bioengineering, Stanford University, Stanford, CA USA
| | - François Aguet
- The Broad Institute of MIT and Harvard, Cambridge, MA USA
| | - Alvaro N. Barbeira
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL USA
| | - Rodrigo Bonazzola
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL USA
| | - Farhad Hormozdiari
- The Broad Institute of MIT and Harvard, Cambridge, MA USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA USA
| | - GTEx Consortium
- Department of Genetics, Stanford University, Stanford, CA USA
- Biomedical Informatics, Stanford University, Stanford, CA USA
- Department of Biomathematics, University of California, Los Angeles, Los Angeles, CA USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA USA
- Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA USA
- Department of Bioengineering, Stanford University, Stanford, CA USA
- The Broad Institute of MIT and Harvard, Cambridge, MA USA
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA USA
- New York Genome Center, New York, NY USA
- Department of Systems Biology, Columbia University, New York, NY USA
- Department of Biostatistics, University of Michigan, Ann Arbor, MI USA
- Department of Pathology, Stanford University, Stanford, CA USA
| | | | - Christopher D. Brown
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA
| | - Hae Kyung Im
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL USA
| | - Tuuli Lappalainen
- New York Genome Center, New York, NY USA
- Department of Systems Biology, Columbia University, New York, NY USA
| | - Xiaoquan Wen
- Department of Biostatistics, University of Michigan, Ann Arbor, MI USA
| | - Stephen B. Montgomery
- Department of Genetics, Stanford University, Stanford, CA USA
- Department of Pathology, Stanford University, Stanford, CA USA
|
19
|
Sanford JA, Nogiec CD, Lindholm ME, Adkins JN, Amar D, Dasari S, Drugan JK, Fernández FM, Radom-Aizik S, Schenk S, Snyder MP, Tracy RP, Vanderboom P, Trappe S, Walsh MJ, Adkins JN, Amar D, Dasari S, Drugan JK, Evans CR, Fernandez FM, Li Y, Lindholm ME, Nogiec CD, Radom-Aizik S, Sanford JA, Schenk S, Snyder MP, Tomlinson L, Tracy RP, Trappe S, Vanderboom P, Walsh MJ, Lee Alekel D, Bekirov I, Boyce AT, Boyington J, Fleg JL, Joseph LJ, Laughlin MR, Maruvada P, Morris SA, McGowan JA, Nierras C, Pai V, Peterson C, Ramos E, Roary MC, Williams JP, Xia A, Cornell E, Rooney J, Miller ME, Ambrosius WT, Rushing S, Stowe CL, Jack Rejeski W, Nicklas BJ, Pahor M, Lu CJ, Trappe T, Chambers T, Raue U, Lester B, Bergman BC, Bessesen DH, Jankowski CM, Kohrt WM, Melanson EL, Moreau KL, Schauer IE, Schwartz RS, Kraus WE, Slentz CA, Huffman KM, Johnson JL, Willis LH, Kelly L, Houmard JA, Dubis G, Broskey N, Goodpaster BH, Sparks LM, Coen PM, Cooper DM, Haddad F, Rankinen T, Ravussin E, Johannsen N, Harris M, Jakicic JM, Newman AB, Forman DD, Kershaw E, Rogers RJ, Nindl BC, Page LC, Stefanovic-Racic M, Barr SL, Rasmussen BB, Moro T, Paddon-Jones D, Volpi E, Spratt H, Musi N, Espinoza S, Patel D, Serra M, Gelfond J, Burns A, Bamman MM, Buford TW, Cutter GR, Bodine SC, Esser K, Farrar RP, Goodyear LJ, Hirshman MF, Albertson BG, Qian WJ, Piehowski P, Gritsenko MA, Monore ME, Petyuk VA, McDermott JE, Hansen JN, Hutchison C, Moore S, Gaul DA, Clish CB, Avila-Pacheco J, Dennis C, Kellis M, Carr S, Jean-Beltran PM, Keshishian H, Mani D, Clauser K, Krug K, Mundorff C, Pearce C, Ivanova AA, Ortlund EA, Maner-Smith K, Uppal K, Zhang T, Sealfon SC, Zaslavsky E, Nair V, Li S, Jain N, Ge Y, Sun Y, Nudelman G, Ruf-zamojski F, Smith G, Pincas N, Rubenstein A, Anne Amper M, Seenarine N, Lappalainen T, Lanza IR, Sreekumaran Nair K, Klaus K, Montgomery SB, Smith KS, Gay NR, Zhao B, Hung CJ, Zebarjadi N, Balliu B, Fresard L, Burant CF, Li JZ, Kachman M, Soni T, Raskind AB, Gerszten R, Robbins J, Ilkayeva O, Muehlbauer MJ, Newgard CB, Ashley EA, Wheeler MT, Jimenez-Morales D, Raja A, Dalton KP, Zhen J, Suk Kim Y, Christle JW, Marwaha S, Chin ET, Hershman SG, Hastie T, Tibshirani R, Rivas MA. Molecular Transducers of Physical Activity Consortium (MoTrPAC): Mapping the Dynamic Responses to Exercise. Cell 2020; 181:1464-1474. [DOI: 10.1016/j.cell.2020.06.004] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 03/05/2020] [Revised: 05/19/2020] [Accepted: 06/01/2020] [Indexed: 12/31/2022]
|
20
|
Contrepois K, Wu S, Moneghetti KJ, Hornburg D, Ahadi S, Tsai MS, Metwally AA, Wei E, Lee-McMullen B, Quijada JV, Chen S, Christle JW, Ellenberger M, Balliu B, Taylor S, Durrant MG, Knowles DA, Choudhry H, Ashland M, Bahmani A, Enslen B, Amsallem M, Kobayashi Y, Avina M, Perelman D, Schüssler-Fiorenza Rose SM, Zhou W, Ashley EA, Montgomery SB, Chaib H, Haddad F, Snyder MP. Molecular Choreography of Acute Exercise. Cell 2020; 181:1112-1130.e16. [PMID: 32470399 PMCID: PMC7299174 DOI: 10.1016/j.cell.2020.04.043] [Citation(s) in RCA: 219] [Impact Index Per Article: 54.8] [Received: 07/08/2019] [Revised: 12/10/2019] [Accepted: 04/21/2020] [Indexed: 02/07/2023]
Abstract
Acute physical activity leads to several changes in metabolic, cardiovascular, and immune pathways. Although studies have examined selected changes in these pathways, the system-wide molecular response to an acute bout of exercise has not been fully characterized. We performed longitudinal multi-omic profiling of plasma and peripheral blood mononuclear cells including metabolome, lipidome, immunome, proteome, and transcriptome from 36 well-characterized volunteers, before and after a controlled bout of symptom-limited exercise. Time-series analysis revealed thousands of molecular changes and an orchestrated choreography of biological processes involving energy metabolism, oxidative stress, inflammation, tissue repair, and growth factor response, as well as regulatory pathways. Most of these processes were dampened and some were reversed in insulin-resistant participants. Finally, we discovered biological pathways involved in cardiopulmonary exercise response and developed prediction models revealing potential resting blood-based biomarkers of peak oxygen consumption.
Affiliation(s)
- Kévin Contrepois
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA
| | - Si Wu
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Kegan J Moneghetti
- Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA; Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA; Department of Medicine, St. Vincent's Hospital, University of Melbourne, Melbourne, VIC, Australia; Stanford Sports Cardiology, Department of Medicine, Stanford University, Stanford, CA, USA
| | - Daniel Hornburg
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Sara Ahadi
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Ming-Shian Tsai
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Ahmed A Metwally
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Eric Wei
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | | | - Jeniffer V Quijada
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Songjie Chen
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Jeffrey W Christle
- Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA; Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA; Stanford Sports Cardiology, Department of Medicine, Stanford University, Stanford, CA, USA
| | - Mathew Ellenberger
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Brunilda Balliu
- Department of Pathology, Stanford University, Stanford, CA, USA
| | - Shalina Taylor
- Pediatrics Department, Stanford University School of Medicine, Stanford, CA, USA
| | - Matthew G Durrant
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - David A Knowles
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Department of Radiology, Stanford University, Stanford, CA, USA
| | - Hani Choudhry
- Department of Biochemistry, Faculty of Science, Cancer and Mutagenesis Unit, King Fahd Center for Medical Research, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Melanie Ashland
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Amir Bahmani
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Brooke Enslen
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Myriam Amsallem
- Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA; Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Yukari Kobayashi
- Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA; Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Monika Avina
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Dalia Perelman
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | | | - Wenyu Zhou
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Euan A Ashley
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA; Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
| | - Stephen B Montgomery
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Department of Pathology, Stanford University, Stanford, CA, USA
| | - Hassan Chaib
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Francois Haddad
- Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA; Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA; Stanford Diabetes Research Center, Stanford University, Stanford, CA, USA.
| | - Michael P Snyder
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA; Stanford Diabetes Research Center, Stanford University, Stanford, CA, USA.
|
21
|
Balliu B, Durrant M, Goede OD, Abell N, Li X, Liu B, Gloudemans MJ, Cook NL, Smith KS, Knowles DA, Pala M, Cucca F, Schlessinger D, Jaiswal S, Sabatti C, Lind L, Ingelsson E, Montgomery SB. Genetic regulation of gene expression and splicing during a 10-year period of human aging. Genome Biol 2019; 20:230. [PMID: 31684996 PMCID: PMC6827221 DOI: 10.1186/s13059-019-1840-y] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 02/11/2019] [Accepted: 09/27/2019] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Molecular and cellular changes are intrinsic to aging and age-related diseases. Prior cross-sectional studies have investigated the combined effects of age and genetics on gene expression and alternative splicing; however, there has been no long-term, longitudinal characterization of these molecular changes, especially in older age. RESULTS We perform RNA sequencing in whole blood from the same individuals at ages 70 and 80 to quantify how gene expression, alternative splicing, and their genetic regulation are altered during this 10-year period of advanced aging at a population and individual level. We observe that individuals are more similar to their own expression profiles later in life than profiles of other individuals their own age. We identify 1291 and 294 genes differentially expressed and alternatively spliced with age, as well as 529 genes with outlying individual trajectories. Further, we observe a strong correlation of genetic effects on expression and splicing between the two ages, with a small subset of tested genes showing a reduction in genetic associations with expression and splicing in older age. CONCLUSIONS These findings demonstrate that, although the transcriptome and its genetic regulation is mostly stable late in life, a small subset of genes is dynamic and is characterized by a reduction in genetic regulation, most likely due to increasing environmental variance with age.
Affiliation(s)
- Brunilda Balliu
- Department of Pathology, Stanford University School of Medicine, Stanford, USA.
| | - Matthew Durrant
- Department of Genetics, Stanford University School of Medicine, Stanford, USA
| | - Olivia de Goede
- Department of Genetics, Stanford University School of Medicine, Stanford, USA
| | - Nathan Abell
- Department of Genetics, Stanford University School of Medicine, Stanford, USA
| | - Xin Li
- Department of Pathology, Stanford University School of Medicine, Stanford, USA
| | - Boxiang Liu
- Department of Biology, Stanford University School of Medicine, Stanford, USA
| | | | - Naomi L Cook
- Department of Medical Sciences, Uppsala University, Uppsala, Sweden
| | - Kevin S Smith
- Department of Pathology, Stanford University School of Medicine, Stanford, USA
| | | | - Mauro Pala
- Dipartimento di Scienze Biomediche, Universita di Sassari, Sassari, Italy
| | - Francesco Cucca
- Dipartimento di Scienze Biomediche, Universita di Sassari, Sassari, Italy
| | | | - Siddhartha Jaiswal
- Department of Pathology, Stanford University School of Medicine, Stanford, USA
| | - Chiara Sabatti
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, USA
| | - Lars Lind
- Department of Medical Sciences, Uppsala University, Uppsala, Sweden
| | - Erik Ingelsson
- Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, USA.
- Stanford Cardiovascular Institute, Stanford University, Stanford, USA.
- Stanford Diabetes Research Center, Stanford University, Stanford, USA.
| | - Stephen B Montgomery
- Department of Pathology, Stanford University School of Medicine, Stanford, USA.
- Department of Genetics, Stanford University School of Medicine, Stanford, USA.
|
22
|
Frésard L, Smail C, Ferraro NM, Teran NA, Li X, Smith KS, Bonner D, Kernohan KD, Marwaha S, Zappala Z, Balliu B, Davis JR, Liu B, Prybol CJ, Kohler JN, Zastrow DB, Reuter CM, Fisk DG, Grove ME, Davidson JM, Hartley T, Joshi R, Strober BJ, Utiramerur S, Lind L, Ingelsson E, Battle A, Bejerano G, Bernstein JA, Ashley EA, Boycott KM, Merker JD, Wheeler MT, Montgomery SB. Identification of rare-disease genes using blood transcriptome sequencing and large control cohorts. Nat Med 2019; 25:911-919. [PMID: 31160820 PMCID: PMC6634302 DOI: 10.1038/s41591-019-0457-8] [Citation(s) in RCA: 174] [Impact Index Per Article: 34.8] [Received: 08/30/2018] [Accepted: 04/15/2019] [Indexed: 02/08/2023]
Abstract
It is estimated that 350 million individuals worldwide suffer from rare diseases, which are predominantly caused by mutation in a single gene [1]. The current molecular diagnostic rate is estimated at 50%, with whole-exome sequencing (WES) among the most successful approaches [2-5]. For patients in whom WES is uninformative, RNA sequencing (RNA-seq) has shown diagnostic utility in specific tissues and diseases [6-8]. This includes muscle biopsies from patients with undiagnosed rare muscle disorders [6,9], and cultured fibroblasts from patients with mitochondrial disorders [7]. However, for many individuals, biopsies are not performed for clinical care, and tissues are difficult to access. We sought to assess the utility of RNA-seq from blood as a diagnostic tool for rare diseases of different pathophysiologies. We generated whole-blood RNA-seq from 94 individuals with undiagnosed rare diseases spanning 16 diverse disease categories. We developed a robust approach to compare data from these individuals with large sets of RNA-seq data for controls (n = 1,594 unrelated controls and n = 49 family members) and demonstrated the impacts of expression, splicing, gene and variant filtering strategies on disease gene identification. Across our cohort, we observed that RNA-seq yields a 7.5% diagnostic rate, and an additional 16.7% with improved candidate gene resolution.
Affiliation(s)
- Laure Frésard
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA.
| | - Craig Smail
- Biomedical Informatics Program, Stanford University, Stanford, CA, USA
| | - Nicole M Ferraro
- Biomedical Informatics Program, Stanford University, Stanford, CA, USA
| | - Nicole A Teran
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA
| | - Xin Li
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
| | - Kevin S Smith
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
| | - Devon Bonner
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
| | - Kristin D Kernohan
- Newborn Screening Ontario (NSO), Children's Hospital of Eastern Ontario, Ottawa, Ontario, Canada
| | - Shruti Marwaha
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Stanford Cardiovascular Institute, School of Medicine, Stanford University, Stanford, CA, USA
| | - Zachary Zappala
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA
| | - Brunilda Balliu
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
| | - Joe R Davis
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA
| | - Boxiang Liu
- Department of Biology, School of Humanities and Sciences, Stanford University, Stanford, CA, USA
| | - Cameron J Prybol
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA
| | - Jennefer N Kohler
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
| | - Diane B Zastrow
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
| | - Chloe M Reuter
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
| | - Dianna G Fisk
- Stanford Medicine Clinical Genomics Program, School of Medicine, Stanford University, Stanford, CA, USA
| | - Megan E Grove
- Stanford Medicine Clinical Genomics Program, School of Medicine, Stanford University, Stanford, CA, USA
| | - Jean M Davidson
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
| | - Taila Hartley
- Children's Hospital of Eastern Ontario Research Institute, University of Ottawa, Ottawa, Ontario, Canada
| | - Ruchi Joshi
- Stanford Medicine Clinical Genomics Program, School of Medicine, Stanford University, Stanford, CA, USA
| | - Benjamin J Strober
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Sowmithri Utiramerur
- Stanford Medicine Clinical Genomics Program, School of Medicine, Stanford University, Stanford, CA, USA
| | - Lars Lind
- Department of Medical Sciences, Cardiovascular Epidemiology, Uppsala University, Uppsala, Sweden
| | - Erik Ingelsson
- Stanford Cardiovascular Institute, School of Medicine, Stanford University, Stanford, CA, USA
- Department of Medicine, Division of Cardiovascular Medicine, School of Medicine, Stanford University, Stanford, CA, USA
| | - Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Gill Bejerano
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Department of Pediatrics, School of Medicine, Stanford University, Stanford, CA, USA
- Department of Developmental Biology, School of Medicine, Stanford University, Stanford, CA, USA
- Department of Biomedical Data Science, School of Medicine, Stanford University, Stanford, CA, USA
| | - Jonathan A Bernstein
- Department of Pediatrics, School of Medicine, Stanford University, Stanford, CA, USA
| | - Euan A Ashley
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Department of Medicine, Division of Cardiovascular Medicine, School of Medicine, Stanford University, Stanford, CA, USA
| | - Kym M Boycott
- Children's Hospital of Eastern Ontario Research Institute, University of Ottawa, Ottawa, Ontario, Canada
| | - Jason D Merker
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Stanford Medicine Clinical Genomics Program, School of Medicine, Stanford University, Stanford, CA, USA
- Departments of Pathology and Laboratory Medicine & Genetics, Lineberger Comprehensive Cancer Center, University of North Carolina School Medicine, Chapel Hill, NC, USA
| | - Matthew T Wheeler
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Stanford Cardiovascular Institute, School of Medicine, Stanford University, Stanford, CA, USA
| | - Stephen B Montgomery
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA.
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA.
|
23
|
Liu B, Calton MA, Abell NS, Benchorin G, Gloudemans MJ, Chen M, Hu J, Li X, Balliu B, Bok D, Montgomery SB, Vollrath D. Genetic analyses of human fetal retinal pigment epithelium gene expression suggest ocular disease mechanisms. Commun Biol 2019; 2:186. [PMID: 31123710 PMCID: PMC6527609 DOI: 10.1038/s42003-019-0430-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 10/13/2018] [Accepted: 04/17/2019] [Indexed: 02/07/2023] Open
Abstract
The retinal pigment epithelium (RPE) serves vital roles in ocular development and retinal homeostasis but has limited representation in large-scale functional genomics datasets. Understanding how common human genetic variants affect RPE gene expression could elucidate the sources of phenotypic variability in selected monogenic ocular diseases and pinpoint causal genes at genome-wide association study (GWAS) loci. We interrogated the genetics of gene expression of cultured human fetal RPE (fRPE) cells under two metabolic conditions and discovered hundreds of shared or condition-specific expression or splice quantitative trait loci (e/sQTLs). Co-localizations of fRPE e/sQTLs with age-related macular degeneration (AMD) and myopia GWAS data suggest new candidate genes, and mechanisms by which a common RDH5 allele contributes to both increased AMD risk and decreased myopia risk. Our study highlights the unique transcriptomic characteristics of fRPE and provides a resource to connect e/sQTLs in a critical ocular cell type to monogenic and complex eye disorders.
Affiliation(s)
- Boxiang Liu
- Department of Biology, Stanford University, Stanford, CA 94305 USA
| | - Melissa A. Calton
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305 USA
| | - Nathan S. Abell
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305 USA
| | - Gillie Benchorin
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305 USA
| | - Michael J. Gloudemans
- Program in Biomedical Informatics, Stanford University School of Medicine, Stanford, 94305 CA USA
| | - Ming Chen
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305 USA
| | - Jane Hu
- Department of Ophthalmology, Jules Stein Eye Institute, UCLA, Los Angeles, 90095 CA USA
| | - Xin Li
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305 USA
| | - Brunilda Balliu
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305 USA
| | - Dean Bok
- Department of Ophthalmology, Jules Stein Eye Institute, UCLA, Los Angeles, 90095 CA USA
| | - Stephen B. Montgomery
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305 USA
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305 USA
| | - Douglas Vollrath
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305 USA
|
24
|
Balliu B, Houwing‐Duistermaat JJ, Böhringer S. Powerful testing via hierarchical linkage disequilibrium in haplotype association studies. Biom J 2019; 61:747-768. [PMID: 30693553 PMCID: PMC6637384 DOI: 10.1002/bimj.201800053] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/06/2018] [Revised: 08/09/2018] [Accepted: 09/08/2018] [Indexed: 12/03/2022]
Abstract
Marginal tests based on individual SNPs are routinely used in genetic association studies. Studies have shown that haplotype-based methods may provide more power in disease mapping than methods based on single markers when, for example, multiple disease-susceptibility variants occur within the same gene. A limitation of haplotype-based methods is that the number of parameters increases exponentially with the number of SNPs, inducing a commensurate increase in the degrees of freedom and weakening the power to detect associations. To address this limitation, we introduce a hierarchical linkage disequilibrium model for disease mapping, based on a reparametrization of the multinomial haplotype distribution, where every parameter corresponds to the cumulant of each possible subset of a set of loci. This hierarchy present in the parameters enables us to employ flexible testing strategies over a range of parameter sets: from standard single SNP analyses through the full haplotype distribution tests, reducing degrees of freedom and increasing the power to detect associations. We show via extensive simulations that our approach maintains the type I error at nominal level and has increased power under many realistic scenarios, as compared to single SNP and standard haplotype-based studies. To evaluate the performance of our proposed methodology in real data, we analyze genome-wide data from the Wellcome Trust Case-Control Consortium.
Affiliation(s)
- Brunilda Balliu
- Department of Biomathematics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
| | | | - Stefan Böhringer
- Department of Biomedical Data Sciences, Section Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
|
25
|
Kraemer M, Huynh QB, Wieczorek D, Balliu B, Mikat B, Boehringer S. Distinctive facial features in idiopathic Moyamoya disease in Caucasians: a first systematic analysis. PeerJ 2018; 6:e4740. [PMID: 29977664 PMCID: PMC6029584 DOI: 10.7717/peerj.4740] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/15/2017] [Accepted: 04/19/2018] [Indexed: 12/02/2022] Open
Abstract
Background Craniofacial dysmorphic features are morphological changes of the face and skull which are associated with syndromic conditions. Moyamoya angiopathy is a rare cerebral vasculopathy that can be divided into Moyamoya syndrome, which is associated or secondary to other diseases, and into idiopathic Moyamoya disease. Facial dysmorphism has been described in rare genetic syndromes with associated Moyamoya syndrome. However, a direct relationship between idiopathic Moyamoya disease with dysmorphic facial changes is not known yet. Methods Landmarks were manually placed on frontal photographs of the face of 45 patients with bilateral Moyamoya disease and 50 matched controls. After procrustes alignment of landmarks a multivariate, penalized logistic regression (elastic-net) was performed on geometric features derived from landmark data to classify patients against controls. Classifiers were visualized in importance plots that colorcode importance of geometric locations for the classification decision. Results The classification accuracy for discriminating the total patient group from controls was 82.3% (P-value = 6.3×10⁻¹¹, binomial test, a-priori chance 50.2%) for an elastic-net classifier. Importance plots show that differences around the eyes and forehead were responsible for the discrimination. Subgroup analysis corrected for body mass index confirmed a similar result. Discussion Results suggest that there is a resemblance in faces of Caucasian patients with idiopathic Moyamoya disease and that there is a difference to matched controls. Replication of findings is necessary as it is difficult to control all residual confounding in study designs such as ours. If our results would be replicated in a larger cohort, this would be helpful for pathophysiological interpretation and early detection of the disease.
Affiliation(s)
- Markus Kraemer
- Department of Neurology, Alfried Krupp Hospital Essen, Essen, Germany; Department of Neurology, University Clinic of Duesseldorf, Duesseldorf, Germany
| | - Quoc Bao Huynh
- Department of Neurology, Alfried Krupp Hospital Essen, Essen, Germany
| | - Dagmar Wieczorek
- Institute of Human Genetics, University of Duesseldorf, Duesseldorf, Germany; Institute of Human Genetics, University of Essen, Essen, Germany
| | - Brunilda Balliu
- Institute of Genetics, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Barbara Mikat
- Institute of Human Genetics, University of Essen, Essen, Germany
| | - Stefan Boehringer
- Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
|
26
|
Tissier R, Uh HW, van den Akker E, Balliu B, Tsonaka S, Houwing-Duistermaat J. Gene coexpression network analysis for family studies based on a meta-analytic approach. BMC Proc 2016; 10:119-123. [PMID: 27980622 PMCID: PMC5133496 DOI: 10.1186/s12919-016-0016-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
Abstract
Networks are often useful tools in genetic studies for better understanding the biological mechanisms involved in complex traits or diseases; coexpression networks based on pairwise correlations between genes are commonly used. In a family-based design, large between-family variation in expression levels can be problematic. We propose here a gene coexpression network analysis for family studies: we build a coexpression network for each family and then combine the results. We applied our approach to data provided for analysis in the Genetic Analysis Workshop 19 and compared it to 2 naïve approaches (ignoring correlations among the expressions, and decorrelating the gene expression by using the residuals of a mixed model) and to a single-probe analysis. Our approach appeared to handle heterogeneity better than the naïve approaches. The naïve approaches did not provide any significant results, while our approach detected genes via indirect effects. It also detected more genes than the single-probe analysis.
Affiliation(s)
- Renaud Tissier
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
| | - Hae-Won Uh
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
| | - Erik van den Akker
- Molecular Epidemiology, Leiden University Medical Centre, Leiden, The Netherlands; Pattern Recognition & Bioinformatics, Delft University of Technology, Leiden, The Netherlands
| | - Brunilda Balliu
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
| | - Spyridoula Tsonaka
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
| | - Jeanine Houwing-Duistermaat
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands; Department of Statistics, University of Leeds, Leeds, UK
|
27
|
Kukurba KR, Parsana P, Balliu B, Smith KS, Zappala Z, Knowles DA, Favé MJ, Davis JR, Li X, Zhu X, Potash JB, Weissman MM, Shi J, Kundaje A, Levinson DF, Awadalla P, Mostafavi S, Battle A, Montgomery SB. Impact of the X Chromosome and sex on regulatory variation. Genome Res 2016; 26:768-77. [PMID: 27197214 PMCID: PMC4889977 DOI: 10.1101/gr.197897.115] [Citation(s) in RCA: 66] [Impact Index Per Article: 8.3] [Received: 08/06/2015] [Accepted: 04/18/2016] [Indexed: 02/07/2023]
Abstract
The X Chromosome, with its unique mode of inheritance, contributes to differences between the sexes at a molecular level, including sex-specific gene expression and sex-specific impact of genetic variation. Improving our understanding of these differences offers to elucidate the molecular mechanisms underlying sex-specific traits and diseases. However, to date, most studies have either ignored the X Chromosome or had insufficient power to test for the sex-specific impact of genetic variation. By analyzing whole blood transcriptomes of 922 individuals, we have conducted the first large-scale, genome-wide analysis of the impact of both sex and genetic variation on patterns of gene expression, including comparison between the X Chromosome and autosomes. We identified a depletion of expression quantitative trait loci (eQTL) on the X Chromosome, especially among genes under high selective constraint. In contrast, we discovered an enrichment of sex-specific regulatory variants on the X Chromosome. To resolve the molecular mechanisms underlying such effects, we generated chromatin accessibility data through ATAC-sequencing to connect sex-specific chromatin accessibility to sex-specific patterns of expression and regulatory variation. As sex-specific regulatory variants discovered in our study can inform sex differences in heritable disease prevalence, we integrated our data with genome-wide association study data for multiple immune traits identifying several traits with significant sex biases in genetic susceptibilities. Together, our study provides genome-wide insight into how genetic variation, the X Chromosome, and sex shape human gene regulation and disease.
Affiliation(s)
- Kimberly R Kukurba
- Department of Pathology, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
| | - Princy Parsana
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Brunilda Balliu
- Department of Pathology, Stanford University School of Medicine, Stanford, California 94305, USA
| | - Kevin S Smith
- Department of Pathology, Stanford University School of Medicine, Stanford, California 94305, USA
| | - Zachary Zappala
- Department of Pathology, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
| | - David A Knowles
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Marie-Julie Favé
- Sainte-Justine University Hospital Research Centre, Department of Pediatrics, University of Montreal, Montreal, Québec H3T 1J4, Canada
| | - Joe R Davis
- Department of Pathology, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
| | - Xin Li
- Department of Pathology, Stanford University School of Medicine, Stanford, California 94305, USA
| | - Xiaowei Zhu
- Department of Psychiatry, Stanford University School of Medicine, Stanford, California 94305, USA
| | - James B Potash
- Department of Psychiatry, University of Iowa Hospitals & Clinics, Iowa City, Iowa 52242, USA
| | - Myrna M Weissman
- Department of Psychiatry, Columbia University and New York State Psychiatric Institute, New York, New York 10032, USA
| | - Jianxin Shi
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland 20892, USA
| | - Anshul Kundaje
- Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Computer Science, Stanford University, Stanford, California 94305, USA
| | - Douglas F Levinson
- Department of Psychiatry, Stanford University School of Medicine, Stanford, California 94305, USA
| | - Philip Awadalla
- Sainte-Justine University Hospital Research Centre, Department of Pediatrics, University of Montreal, Montreal, Québec H3T 1J4, Canada
| | - Sara Mostafavi
- Department of Statistics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
| | - Alexis Battle
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Stephen B Montgomery
- Department of Pathology, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Computer Science, Stanford University, Stanford, California 94305, USA
|
28
|
Balliu B, Tsonaka R, Boehringer S, Houwing-Duistermaat J. A retrospective likelihood approach for efficient integration of multiple omics factors in case-control association studies. Genet Epidemiol 2015; 39:156-65. [PMID: 25620726 DOI: 10.1002/gepi.21884] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 06/26/2014] [Revised: 10/08/2014] [Accepted: 12/02/2014] [Indexed: 11/09/2022]
Abstract
Integrative omics, the joint analysis of outcome and multiple types of omics data, such as genomics, epigenomics, and transcriptomics data, constitutes a promising approach for powerful and biologically relevant association studies. These studies often employ a case-control design, and often include nonomics covariates, such as age and gender, that may modify the underlying omics risk factors. An open question is how best to integrate multiple omics and nonomics information to maximize statistical power in case-control studies that ascertain individuals based on the phenotype. Recent work on integrative omics has used prospective approaches, modeling case-control status conditional on omics and nonomics risk factors. Compared to univariate approaches, jointly analyzing multiple risk factors with a prospective approach increases power in nonascertained cohorts. However, these prospective approaches often lose power in case-control studies. In this article, we propose a novel statistical method for integrating multiple omics and nonomics factors in case-control association studies. Our method is based on a retrospective likelihood function that models the joint distribution of omics and nonomics factors conditional on case-control status. The new method provides accurate control of the Type I error rate and has increased efficiency over prospective approaches in both simulated and real data.
Affiliation(s)
- Brunilda Balliu
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, The Netherlands
|
29
|
Abstract
With the advance of next-generation sequencing technologies in recent years, rare genetic variant data have now become available for genetic epidemiology studies. For family samples, however, only a few statistical methods for association analysis of rare genetic variants have been developed. Rare variant approaches are of great interest, particularly for family data, because samples enriched for trait-relevant variants can be ascertained and rare variants are putatively enriched through segregation. To facilitate the evaluation of existing and new rare variant testing approaches for analyzing family data, Genetic Analysis Workshop 18 (GAW18) provided genotype and next-generation sequencing data and longitudinal blood pressure traits from extended pedigrees of Mexican American families from the San Antonio Family Study. Our GAW18 group members analyzed real and simulated phenotype data from GAW18 by using generalized linear mixed-effects models or principal components to adjust for familial correlation or by testing binary traits using a correction factor for familial effects. With one exception, approaches dealt with the extended pedigrees in their original state using information based on the kinship matrix or alternative genetic similarity measures. For simulated data our group demonstrated that the family-based kernel machine score test is superior in power to family-based single-marker or burden tests, except in a few specific scenarios. For real data three contributions identified significant associations. They substantially reduced the number of tests before performing the association analysis. We conclude from our real data analyses that further development of strategies for targeted testing or more focused screening of genetic variants is strongly desirable.
Affiliation(s)
- Han Chen
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, United States of America
|
30
|
Balliu B, Uh HW, Tsonaka R, Boehringer S, Helmer Q, Houwing-Duistermaat JJ. Combining information from linkage and association mapping for next-generation sequencing longitudinal family data. BMC Proc 2014; 8:S34. [PMID: 25519382 PMCID: PMC4143620 DOI: 10.1186/1753-6561-8-s1-s34] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/24/2022] Open
Abstract
In this analysis, we investigate the contributions that linkage-based methods, such as identical-by-descent mapping, can make to association mapping to identify rare variants in next-generation sequencing data. First, we identify regions in which cases share more segments identical-by-descent around a putative causal variant than do controls. Second, we use a two-stage mixed-effect model approach to summarize the single-nucleotide polymorphism data within each region and include them as covariates in the model for the phenotype. We assess the impact of linkage disequilibrium in determining identical-by-descent states between individuals by using markers with and without linkage disequilibrium for the first part, and the impact of imputation in testing for association by using imputed genome-wide association study markers or raw sequence markers for the second part. We apply the method to next-generation sequencing longitudinal family data from Genetic Analysis Workshop 18 and identify a significant region at chromosome 3: 40249244-41025167 (p-value = 2.3 × 10⁻³).
Affiliation(s)
- Brunilda Balliu
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, The Netherlands
| | - Hae-Won Uh
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, The Netherlands; Netherlands Consortium for Healthy Ageing, Leiden University Medical Center, P.O. Box 9600, 2300 RC Leiden, The Netherlands
| | - Roula Tsonaka
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, The Netherlands
| | - Stefan Boehringer
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, The Netherlands
| | - Quinta Helmer
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, The Netherlands; Netherlands Consortium for Healthy Ageing, Leiden University Medical Center, P.O. Box 9600, 2300 RC Leiden, The Netherlands
| | - Jeanine J Houwing-Duistermaat
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, The Netherlands
|
31
|
Balliu B, Tsonaka R, van der Woude D, Boehringer S, Houwing-Duistermaat JJ. Combining family and twin data in association studies to estimate the noninherited maternal antigens effect. Genet Epidemiol 2012; 36:811-9. [PMID: 22851506 DOI: 10.1002/gepi.21667] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/19/2012] [Revised: 06/06/2012] [Accepted: 06/20/2012] [Indexed: 11/08/2022]
Abstract
It is hypothesized that certain alleles can have a protective effect not only when inherited by the offspring but also as noninherited maternal antigens (NIMA). To estimate the NIMA effect, large samples of families are needed. When large samples are not available, we propose a combined approach to estimate the NIMA effect from ascertained nuclear families and twin pairs. We develop a likelihood-based approach that allows for several ascertainment schemes, to accommodate the outcome-dependent sampling scheme, and includes a family-specific random term, to take into account the correlation between family members. We estimate the parameters using maximum likelihood based on the combined joint likelihood (CJL) approach. Simulations show that the CJL is more efficient for estimating the NIMA odds ratios than a families-only approach. To illustrate our approach, we used data from a family study and a twin study from the United Kingdom on rheumatoid arthritis, and confirmed the protective NIMA effect, with an odds ratio of 0.477 (95% CI 0.264-0.864).
Affiliation(s)
- Brunilda Balliu
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
|