1
Herzig AF, Rubinacci S, Marenne G, Perdry H, Deleuze JF, Dina C, Barc J, Redon R, Delaneau O, Génin E. SURFBAT: a surrogate family based association test building on large imputation reference panels. G3 (Bethesda, Md.) 2025; 15:jkae287. [PMID: 39657733 PMCID: PMC12005154 DOI: 10.1093/g3journal/jkae287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2024] [Revised: 11/07/2024] [Accepted: 11/29/2024] [Indexed: 12/12/2024]
Abstract
Genotype-phenotype association tests are typically adjusted for population stratification using principal components that are estimated genome-wide. This lacks resolution when analyzing populations with fine structure and/or individuals with fine levels of admixture. This can affect power and precision, and is a particularly relevant consideration when control individuals are recruited using geographic selection criteria. Such is the case in France where we have recently created reference panels of individuals anchored to different geographic regions. To make correct comparisons against case groups, who would likely be gathered from large urban areas, new methods are needed. We present SURFBAT (a surrogate family based association test), which performs an approximation of the transmission-disequilibrium test. Our method hinges on the application of genotype imputation algorithms to match similar haplotypes between the case and control groups. This permits us to approximate local ancestry informed posterior probabilities of un-transmitted parental alleles of each case individual. This is achieved by assuming haplotypes from the imputation panel are well-matched for ancestry with the case individuals. When the first haplotype of an individual from the imputation panel matches that of a case individual, it is assumed that the second haplotype of the same reference individual can be used as a locally ancestry matched control haplotype and to approximately impute un-transmitted parental alleles. SURFBAT provides an association test that is inherently robust to fine-scale population stratification and opens up the possibility of efficiently using large imputation reference panels as control groups for association testing. In contrast to other methods for association testing that incorporate local-ancestry inference, SURFBAT does not require a set of ancestry groups to be defined, nor for local ancestry to be explicitly estimated. 
We demonstrate the interest of our tool on simulated datasets, as well as on a real-data example for a group of case individuals affected by Brugada syndrome.
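For orientation, the classical transmission-disequilibrium test that SURFBAT is said to approximate compares how often an allele is transmitted versus not transmitted from heterozygous parents, using the McNemar-type statistic (b − c)² / (b + c) with one degree of freedom. The sketch below implements only that classical statistic; it is not the SURFBAT implementation, and the function name and the example counts are hypothetical. Per the abstract, SURFBAT replaces observed un-transmitted alleles with imputation-based posterior probabilities, which is why the inputs are allowed to be non-integer here.

```python
import math


def tdt_statistic(transmitted, untransmitted):
    """Classical TDT (McNemar-type) statistic and its 1-df chi-square p-value.

    transmitted / untransmitted: counts (or expected counts) of the tested
    allele being transmitted or not transmitted from heterozygous parents.
    """
    total = transmitted + untransmitted
    if total <= 0:
        raise ValueError("need at least one informative transmission")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p = tdt_statistic(60, 40)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Under the null hypothesis of no association, transmitted and un-transmitted counts are expected to be equal, so the statistic is driven entirely by their imbalance.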
Affiliation(s)
- Anthony F Herzig
  - Inserm, Université de Bretagne-Occidentale, EFS, UMR 1078, GGB, Brest F-29200, France
- Simone Rubinacci
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki 00290, Finland
- Gaëlle Marenne
  - Inserm, Université de Bretagne-Occidentale, EFS, UMR 1078, GGB, Brest F-29200, France
- Hervé Perdry
  - CESP Inserm U1018, Université Paris-Saclay, Villejuif F-94807, France
- Jean-François Deleuze
  - Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine (CNRGH), Evry F-91000, France
  - CEPH, Fondation Jean Dausset, Paris F-75010, France
- Christian Dina
  - Nantes Université, CNRS, INSERM UMR 1087, L’Institut du Thorax, Nantes F-44000, France
- Julien Barc
  - Nantes Université, CNRS, INSERM UMR 1087, L’Institut du Thorax, Nantes F-44000, France
- Richard Redon
  - Nantes Université, CNRS, INSERM UMR 1087, L’Institut du Thorax, Nantes F-44000, France
- Emmanuelle Génin
  - Inserm, Université de Bretagne-Occidentale, EFS, UMR 1078, GGB, Brest F-29200, France
  - CHU Brest, Brest F-29200, France
2
Ulitzsch E, He Q, Ulitzsch V, Molter H, Nichterlein A, Niedermeier R, Pohl S. Combining Clickstream Analyses and Graph-Modeled Data Clustering for Identifying Common Response Processes. Psychometrika 2021; 86:190-214. [PMID: 33544300 PMCID: PMC8035117 DOI: 10.1007/s11336-020-09743-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/03/2020] [Revised: 12/10/2020] [Accepted: 12/15/2020] [Indexed: 06/12/2023]
Abstract
Complex interactive test items are becoming more widely used in assessments. Being computer-administered, assessments using interactive items allow logging time-stamped action sequences. These sequences pose a rich source of information that may facilitate investigating how examinees approach an item and arrive at their given response. There is a rich body of research leveraging action sequence data for investigating examinees' behavior. However, the associated timing data have been considered mainly on the item-level, if at all. Considering timing data on the action-level in addition to action sequences, however, has vast potential to support a more fine-grained assessment of examinees' behavior. We provide an approach that jointly considers action sequences and action-level times for identifying common response processes. In doing so, we integrate tools from clickstream analyses and graph-modeled data clustering with psychometrics. In our approach, we (a) provide similarity measures that are based on both actions and the associated action-level timing data and (b) subsequently employ cluster edge deletion for identifying homogeneous, interpretable, well-separated groups of action patterns, each describing a common response process. Guidelines on how to apply the approach are provided. The approach and its utility are illustrated on a complex problem-solving item from PIAAC 2012.
Affiliation(s)
- Esther Ulitzsch
  - Educational Measurement, IPN - Leibniz Institute for Science and Mathematics Education, Olshausenstraße 62, 24118, Kiel, Germany
- Qiwei He
  - Educational Testing Service, Princeton, USA
3
Stein MM, Conery M, Magnaye KM, Clay SM, Billstrand C, Nicolae R, Naughton K, Ober C, Thompson EE. Sex-specific differences in peripheral blood leukocyte transcriptional response to LPS are enriched for HLA region and X chromosome genes. Sci Rep 2021; 11:1107. [PMID: 33441806 PMCID: PMC7806814 DOI: 10.1038/s41598-020-80145-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/10/2020] [Accepted: 12/08/2020] [Indexed: 02/08/2023] Open
Abstract
Sex-specific differences in prevalence are well documented for many common, complex diseases, especially for immune-mediated diseases, yet the precise mechanisms through which factors associated with biological sex exert their effects throughout life are not well understood. We interrogated sex-specific transcriptional responses of peripheral blood leukocytes (PBLs) to innate immune stimulation by lipopolysaccharide (LPS) in 46 male and 66 female members of the Hutterite community, who practice a communal lifestyle. We identified 1217 autosomal and 54 X-linked genes with sex-specific responses to LPS, as well as 71 autosomal and one X-linked sex-specific expression quantitative trait loci (eQTLs). Despite a similar proportion of the 15 HLA genes responding to LPS compared to all expressed autosomal genes, there was a significant over-representation of genes with sex by treatment interactions among HLA genes. We also observed an enrichment of sex-specific differentially expressed genes in response to LPS for X-linked genes compared to autosomal genes, suggesting that HLA and X-linked genes may disproportionately contribute to sex disparities in risk for immune-mediated diseases.
Affiliation(s)
- Michelle M Stein
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Mitch Conery
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Kevin M Magnaye
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Selene M Clay
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Raluca Nicolae
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Katherine Naughton
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Carole Ober
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Emma E Thompson
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
4
Wu FL, Strand AI, Cox LA, Ober C, Wall JD, Moorjani P, Przeworski M. A comparison of humans and baboons suggests germline mutation rates do not track cell divisions. PLoS Biol 2020; 18:e3000838. [PMID: 32804933 PMCID: PMC7467331 DOI: 10.1371/journal.pbio.3000838] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 11/15/2019] [Revised: 09/02/2020] [Accepted: 07/28/2020] [Indexed: 12/19/2022] Open
Abstract
In humans, most germline mutations are inherited from the father. This observation has been widely interpreted as reflecting the replication errors that accrue during spermatogenesis. If so, the male bias in mutation should be substantially lower in a closely related species with similar rates of spermatogonial stem cell divisions but a shorter mean age of reproduction. To test this hypothesis, we resequenced two 3-4 generation nuclear families (totaling 29 individuals) of olive baboons (Papio anubis), who reproduce at approximately 10 years of age on average, and analyzed the data in parallel with three 3-generation human pedigrees (26 individuals). We estimated a mutation rate per generation in baboons of 0.57×10⁻⁸ per base pair, approximately half that of humans. Strikingly, however, the degree of male bias in germline mutations is approximately 4:1, similar to that of humans; indeed, a similar male bias is seen across mammals that reproduce months, years, or decades after birth. These results mirror the finding in humans that the male mutation bias is stable with parental ages and cast further doubt on the assumption that germline mutations track cell divisions. Our mutation rate estimates for baboons raise a further puzzle, suggesting a divergence time between apes and Old World monkeys of 65 million years, too old to be consistent with the fossil record; reconciling them now requires not only a slowdown of the mutation rate per generation in humans but also in baboons.
Affiliation(s)
- Felix L. Wu
  - Department of Systems Biology, Columbia University, New York, New York, United States of America
  - Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, New York, United States of America
- Alva I. Strand
  - Department of Biological Sciences, Columbia University, New York, New York, United States of America
- Laura A. Cox
  - Center for Precision Medicine, Department of Internal Medicine, Section of Molecular Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
  - Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, Texas, United States of America
- Carole Ober
  - Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Jeffrey D. Wall
  - Institute for Human Genetics, Department of Epidemiology & Biostatistics, University of California, San Francisco, San Francisco, California, United States of America
- Priya Moorjani
  - Department of Biological Sciences, Columbia University, New York, New York, United States of America
- Molly Przeworski
  - Department of Systems Biology, Columbia University, New York, New York, United States of America
  - Department of Biological Sciences, Columbia University, New York, New York, United States of America
5
Thompson EE, Dang Q, Mitchell-Handley B, Rajendran K, Ram-Mohan S, Solway J, Ober C, Krishnan R. Cytokine-induced molecular responses in airway smooth muscle cells inform genome-wide association studies of asthma. Genome Med 2020; 12:64. [PMID: 32690065 PMCID: PMC7370514 DOI: 10.1186/s13073-020-00759-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/01/2019] [Accepted: 06/26/2020] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND: A challenge in the post-GWAS era is to assign function to disease-associated variants. However, available resources do not include all tissues or environmental exposures that are relevant to all diseases. For example, exaggerated bronchoconstriction of airway smooth muscle cells (ASMCs) defines airway hyperresponsiveness (AHR), a cardinal feature of asthma. However, the contribution of ASMC to genetic and genomic studies has largely been overlooked. Our study aimed to address the gap in data availability from a critical tissue in genomic studies of asthma.
METHODS: We developed a cell model of AHR to discover variants associated with transcriptional, epigenetic, and cellular responses to two AHR promoting cytokines, IL-13 and IL-17A, and performed a GWAS of bronchial responsiveness (BRI) in humans.
RESULTS: Our study revealed significant response differences between ASMCs from asthma cases and controls, including genes implicated in asthma susceptibility. We defined molecular quantitative trait loci (QTLs) for expression (eQTLs) and methylation (meQTLs), and cellular QTLs for contractility (coQTLs) and performed a GWAS of BRI in human subjects. Variants in asthma GWAS were significantly enriched for ASM QTLs and BRI-associated SNPs, and near genes enriched for ASM function, many with small P values that did not reach stringent thresholds of significance in GWAS.
CONCLUSIONS: Our study identified significant differences between ASMCs from asthma cases and controls, potentially reflecting trained tolerance in these cells, as well as a set of variants, overlooked in previous GWAS, which reflect the AHR component of asthma.
Affiliation(s)
- Emma E Thompson
  - Department of Human Genetics, The University of Chicago, Chicago, IL, USA
- Quynh Dang
  - Center for Vascular Biology Research, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Kavitha Rajendran
  - Center for Vascular Biology Research, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Sumati Ram-Mohan
  - Center for Vascular Biology Research, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Julian Solway
  - Department of Medicine, The University of Chicago, Chicago, IL, USA
- Carole Ober
  - Department of Human Genetics, The University of Chicago, Chicago, IL, USA
- Ramaswamy Krishnan
  - Center for Vascular Biology Research, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
6
Gilly A, Southam L, Suveges D, Kuchenbaecker K, Moore R, Melloni GEM, Hatzikotoulas K, Farmaki AE, Ritchie G, Schwartzentruber J, Danecek P, Kilian B, Pollard MO, Ge X, Tsafantakis E, Dedoussis G, Zeggini E. Very low-depth whole-genome sequencing in complex trait association studies. Bioinformatics 2020; 35:2555-2561. [PMID: 30576415 PMCID: PMC6662288 DOI: 10.1093/bioinformatics/bty1032] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 07/20/2018] [Revised: 11/17/2018] [Accepted: 12/17/2018] [Indexed: 12/31/2022] Open
Abstract
MOTIVATION: Very low-depth sequencing has been proposed as a cost-effective approach to capture low-frequency and rare variation in complex trait association studies. However, a full characterization of the genotype quality and association power for very low-depth sequencing designs is still lacking.
RESULTS: We perform cohort-wide whole-genome sequencing (WGS) at low depth in 1239 individuals (990 at 1× depth and 249 at 4× depth) from an isolated population, and establish a robust pipeline for calling and imputing very low-depth WGS genotypes from standard bioinformatics tools. Using genotyping chip, whole-exome sequencing (75× depth) and high-depth (22×) WGS data in the same samples, we examine in detail the sensitivity of this approach, and show that imputed 1× WGS recapitulates 95.2% of variants found by imputed GWAS with an average minor allele concordance of 97% for common and low-frequency variants. In our study, 1× further allowed the discovery of 140 844 true low-frequency variants with 73% genotype concordance when compared to high-depth WGS data. Finally, using association results for 57 quantitative traits, we show that very low-depth WGS is an efficient alternative to imputed GWAS chip designs, allowing the discovery of up to twice as many true association signals as the classical imputed GWAS design.
AVAILABILITY AND IMPLEMENTATION: The HELIC genotype and WGS datasets have been deposited to the European Genome-phenome Archive (https://www.ebi.ac.uk/ega/home): EGAD00010000518; EGAD00010000522; EGAD00010000610; EGAD00001001636; EGAD00001001637. The peakplotter software is available at https://github.com/wtsi-team144/peakplotter, and the transformPhenotype app can be downloaded at https://github.com/wtsi-team144/transformPhenotype.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Arthur Gilly
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
  - Institute of Translational Genomics, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Lorraine Southam
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Daniel Suveges
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Karoline Kuchenbaecker
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Rachel Moore
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Giorgio E M Melloni
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Konstantinos Hatzikotoulas
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
  - Institute of Translational Genomics, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Aliki-Eleni Farmaki
  - Department of Nutrition and Dietetics, School of Health Science and Education, Harokopio University of Athens, Athens, Greece
- Graham Ritchie
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
  - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Jeremy Schwartzentruber
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Petr Danecek
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Britt Kilian
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Martin O Pollard
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Xiangyu Ge
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- George Dedoussis
  - Department of Nutrition and Dietetics, School of Health Science and Education, Harokopio University of Athens, Athens, Greece
- Eleftheria Zeggini
  - Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
  - Institute of Translational Genomics, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
7
Abney M, ElSherbiny A. Kinpute: using identity by descent to improve genotype imputation. Bioinformatics 2019; 35:4321-4326. [PMID: 30918937 PMCID: PMC6821425 DOI: 10.1093/bioinformatics/btz221] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/23/2018] [Revised: 02/21/2019] [Accepted: 03/26/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Genotype imputation, though generally accurate, often results in many genotypes being poorly imputed, particularly in studies where the individuals are not well represented by standard reference panels. When individuals in the study share regions of the genome identical by descent (IBD), it is possible to use this information in combination with a study-specific reference panel (SSRP) to improve the imputation results. Kinpute uses IBD information (due to recent, familial relatedness or distant, unknown ancestors) in conjunction with the output from linkage disequilibrium (LD) based imputation methods to compute more accurate genotype probabilities. Kinpute uses a novel method for IBD imputation, which works even in the absence of a pedigree, and results in substantially improved imputation quality.
RESULTS: Given initial estimates of average IBD between subjects in the study sample, Kinpute uses a novel algorithm to select an optimal set of individuals to sequence and use as an SSRP. Kinpute is designed to use as input both this SSRP and the genotype probabilities output from other LD-based imputation software, and uses a new method to combine the LD imputed genotype probabilities with IBD configurations to substantially improve imputation. We tested Kinpute on a human population isolate where 98 individuals have been sequenced. In half of this sample, whose sequence data was masked, we used Impute2 to perform LD-based imputation and Kinpute was used to obtain higher accuracy genotype probabilities. Measures of imputation accuracy improved significantly, particularly for those genotypes that Impute2 imputed with low certainty.
AVAILABILITY AND IMPLEMENTATION: Kinpute is an open-source and freely available C++ software package that can be downloaded from.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mark Abney
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Aisha ElSherbiny
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
8
Hoff JL, Decker JE, Schnabel RD, Seabury CM, Neibergs HL, Taylor JF. QTL-mapping and genomic prediction for bovine respiratory disease in U.S. Holsteins using sequence imputation and feature selection. BMC Genomics 2019; 20:555. [PMID: 31277567 PMCID: PMC6612181 DOI: 10.1186/s12864-019-5941-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 08/31/2018] [Accepted: 06/26/2019] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND: National genetic evaluations for disease resistance do not exist, precluding the genetic improvement of cattle for these traits. We imputed BovineHD genotypes to whole genome sequence for 2703 Holsteins that were cases or controls for Bovine Respiratory Disease and sampled from either California or New Mexico to construct and compare genomic prediction models. The sequence variation reference dataset comprised variants called for 1578 animals from Run 5 of the 1000 Bull Genomes Project, including 450 Holsteins and 29 animals sequenced from this study population. Genotypes for 9,282,726 variants with minor allele frequencies ≥5% were imputed and used to obtain genomic predictions in GEMMA using a Bayesian Sparse Linear Mixed Model.
RESULTS: Variation explained by markers increased from 13.6% using BovineHD data to 14.4% using imputed whole genome sequence data and the resolution of genomic regions detected as harbouring QTL substantially increased. Explained variation in the analysis of the combined California and New Mexico data was less than when data for each state were separately analysed and the estimated genetic correlation between risk of Bovine Respiratory Disease in California and New Mexico Holsteins was −0.36. Consequently, genomic predictions trained using the data from one state did not accurately predict disease risk in the other state. To determine if a prediction model could be developed with utility in both states, we selected variants within genomic regions harbouring: 1) genes involved in the normal immune response to infection by pathogens responsible for Bovine Respiratory Disease detected by RNA-Seq analysis, and/or 2) QTL identified in the association analysis of the imputed sequence variants. The model based on QTL selected variants is biased but when trained in one state generated BRD risk predictions with positive accuracies in the other state.
CONCLUSIONS: We demonstrate the utility of sequence-based and biology-driven model development for genomic selection. Disease phenotypes cannot be routinely recorded in most livestock species and the observed phenotypes may vary in their genomic architecture due to variation in the pathogen composition across environments. Elucidation of trait biology and genetic architecture may guide the development of prediction models with utility across breeds and environments.
Affiliation(s)
- Jesse L Hoff
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
- Jared E Decker
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
  - Informatics Institute, University of Missouri, Columbia, MO, 65211, USA
- Robert D Schnabel
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
  - Informatics Institute, University of Missouri, Columbia, MO, 65211, USA
- Christopher M Seabury
  - Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, 77843, USA
- Holly L Neibergs
  - Department of Animal Sciences, Washington State University, Pullman, WA, 99163, USA
- Jeremy F Taylor
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
9
Abdelfatah N, Chen R, Duff HJ, Seifer CM, Buffo I, Huculak C, Clarke S, Clegg R, Jassal DS, Gordon PMK, Ober C, Frosk P, Gerull B. Characterization of a Unique Form of Arrhythmic Cardiomyopathy Caused by Recessive Mutation in LEMD2. JACC Basic Transl Sci 2019; 4:204-221. [PMID: 31061923 PMCID: PMC6488817 DOI: 10.1016/j.jacbts.2018.12.001] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 10/04/2018] [Revised: 11/02/2018] [Accepted: 12/03/2018] [Indexed: 02/08/2023]
Abstract
Nuclear envelope proteins have been shown to play an important role in the pathogenesis of inherited dilated cardiomyopathy. Here, we present a remarkable cardiac phenotype caused by a homozygous LEMD2 mutation in patients of the Hutterite population with juvenile cataract. Mutation carriers develop arrhythmic cardiomyopathy with mild impairment of left ventricular systolic function but severe ventricular arrhythmias leading to sudden cardiac death. Affected cardiac tissue from a deceased patient and fibroblasts exhibit elongated nuclei with abnormal condensed heterochromatin at the periphery. The patient fibroblasts demonstrate cellular senescence and reduced proliferation capacity, which may suggest an involvement of LEM domain containing protein 2 in chromatin remodeling processes and premature aging.
Key Words
- ACM, arrhythmogenic cardiomyopathy
- BANF, barrier to autointegration factor
- CMR, cardiac magnetic resonance
- DAPI, 4′,6′-diamidino-2-phenylindole
- DCM, dilated cardiomyopathy
- DNA, deoxyribonucleic acid
- EMD, emerin
- ICD, implantable cardioverter-defibrillator
- LEMD2
- LEMD2, LEM domain containing protein 2
- LGE, late gadolinium enhancement
- LMNA, lamin A/C
- LV, left ventricular
- NE, nuclear envelope
- P, passage
- PBS, phosphate-buffered saline
- SAHF, senescence-associated heterochromatin foci
- SNV, single nucleotide variant
- chromatin remodeling
- dilated cardiomyopathy
- eGFP, enhanced green fluorescent protein
- inner nuclear membrane
- sudden death
Affiliation(s)
- Nelly Abdelfatah
  - Department of Cardiac Sciences, Libin Cardiovascular Institute of Alberta, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Ruping Chen
  - Comprehensive Heart Failure Center, University Hospital Würzburg, Würzburg, Germany
- Henry J Duff
  - Department of Cardiac Sciences, Libin Cardiovascular Institute of Alberta, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Colette M Seifer
  - Section of Cardiology, Department of Internal Medicine, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Ilan Buffo
  - Variety Children's Heart Centre, University of Manitoba, Winnipeg, Manitoba, Canada
- Cathleen Huculak
  - Department of Medical Genetics, Alberta Health Services, Calgary, Alberta, Canada
- Stephanie Clarke
  - Department of Biochemistry and Medical Genetics, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Robin Clegg
  - Department of Pediatrics, University of Calgary, Calgary, Alberta, Canada
- Davinder S Jassal
  - Section of Cardiology, Department of Internal Medicine, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Paul M K Gordon
  - Cumming School of Medicine Centre for Health Genomics and Informatics, University of Calgary, Calgary, Alberta, Canada
- Carole Ober
  - Department of Human Genetics, The University of Chicago, Chicago, Illinois
- Patrick Frosk
  - Department of Biochemistry and Medical Genetics, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
  - Department of Pediatrics and Child Health, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Brenda Gerull
  - Department of Cardiac Sciences, Libin Cardiovascular Institute of Alberta, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Comprehensive Heart Failure Center, University Hospital Würzburg, Würzburg, Germany
  - Department of Internal Medicine I, University Hospital Würzburg, Würzburg, Germany
10
Rediscovering the value of families for psychiatric genetics research. Mol Psychiatry 2019; 24:523-535. [PMID: 29955165 PMCID: PMC7028329 DOI: 10.1038/s41380-018-0073-x] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 11/21/2017] [Revised: 01/11/2018] [Accepted: 03/26/2018] [Indexed: 01/09/2023]
Abstract
As it is likely that both common and rare genetic variation are important for complex disease risk, studies that examine the full range of the allelic frequency distribution should be utilized to dissect the genetic influences on mental illness. The rate limiting factor for inferring an association between a variant and a phenotype is inevitably the total number of copies of the minor allele captured in the studied sample. For rare variation, with minor allele frequencies of 0.5% or less, very large samples of unrelated individuals are necessary to unambiguously associate a locus with an illness. Unfortunately, such large samples are often cost prohibitive. However, by using alternative analytic strategies and studying related individuals, particularly those from large multiplex families, it is possible to reduce the required sample size while maintaining statistical power. We contend that using whole genome sequence (WGS) in extended pedigrees provides a cost-effective strategy for psychiatric gene mapping that complements common variant approaches and WGS in unrelated individuals. This was our impetus for forming the "Pedigree-Based Whole Genome Sequencing of Affective and Psychotic Disorders" consortium. In this review, we provide a rationale for the use of WGS with pedigrees in modern psychiatric genetics research. We begin with a focused review of the current literature, followed by a short history of family-based research in psychiatry. Next, we describe several advantages of pedigrees for WGS research, including power estimates, methods for studying the environment, and endophenotypes. We conclude with a brief description of our consortium and its goals.
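The abstract's point that the number of minor-allele copies limits power can be made concrete with simple arithmetic: among N unrelated diploid individuals, the expected number of copies of an allele with population frequency p is 2Np. The sketch below is an illustration only, not a power calculation from the review; the function names and the target of 100 copies are hypothetical, and it deliberately does not model the enrichment that can arise in multiplex pedigrees.

```python
import math


def expected_minor_allele_copies(n_individuals, maf):
    """Expected copies of the minor allele among unrelated diploid individuals."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    return 2 * n_individuals * maf


def individuals_needed(target_copies, maf):
    """Smallest number of unrelated individuals expected to carry target_copies."""
    if not 0.0 < maf <= 0.5:
        raise ValueError("minor allele frequency must lie in (0, 0.5]")
    return math.ceil(target_copies / (2 * maf))


# A variant at 0.5% frequency in 1,000 unrelated individuals:
print(expected_minor_allele_copies(1000, 0.005))
# Unrelated individuals needed to expect 100 copies (hypothetical target):
print(individuals_needed(100, 0.005))
```

Because the expected count scales linearly with both N and p, halving the allele frequency doubles the number of unrelated individuals required for the same expected number of copies.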
|
11
|
Mozaffari SV, DeCara JM, Shah SJ, Sidore C, Fiorillo E, Cucca F, Lang RM, Nicolae DL, Ober C. Parent-of-origin effects on quantitative phenotypes in a large Hutterite pedigree. Commun Biol 2019; 2:28. [PMID: 30675526 PMCID: PMC6338666 DOI: 10.1038/s42003-018-0267-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 07/10/2018] [Accepted: 12/14/2018] [Indexed: 12/22/2022]
Abstract
The impact of the parental origin of associated alleles in GWAS has been largely ignored. Yet sequence variants could affect traits differently depending on whether they are inherited from the mother or the father, as in imprinted regions, where identical inherited DNA sequences can have different effects based on the parental origin. To explore parent-of-origin effects (POEs), we studied 21 quantitative phenotypes in a large Hutterite pedigree to identify variants with single parent (maternal-only or paternal-only) effects, and then variants with opposite parental effects. Here we show that POEs, which can be opposite in direction, are relatively common in humans, have potentially important clinical effects, and will be missed in traditional GWAS. We identified POEs with 11 phenotypes, most of which are risk factors for cardiovascular disease. Many of the loci identified are characteristic of imprinted regions and are associated with the expression of nearby genes.
Affiliation(s)
- Sahar V. Mozaffari
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637 USA
  - Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL 60637 USA
- Jeanne M. DeCara
  - Department of Medicine, University of Chicago, Chicago, IL 60637 USA
- Sanjiv J. Shah
  - Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611 USA
- Carlo Sidore
  - Istituto di Ricerca Genetica e Biomedica (IRGB), CNR, Monserrato, 09042 Italy
- Edoardo Fiorillo
  - Istituto di Ricerca Genetica e Biomedica (IRGB), CNR, Monserrato, 09042 Italy
- Francesco Cucca
  - Istituto di Ricerca Genetica e Biomedica (IRGB), CNR, Monserrato, 09042 Italy
  - Dipartimento di Scienze Biomediche, Universita di Sassari, Sassari, 07100 Italy
- Roberto M. Lang
  - Department of Medicine, University of Chicago, Chicago, IL 60637 USA
- Dan L. Nicolae
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637 USA
  - Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL 60637 USA
  - Department of Medicine, University of Chicago, Chicago, IL 60637 USA
  - Department of Statistics, University of Chicago, Chicago, IL 60637 USA
- Carole Ober
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637 USA
  - Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL 60637 USA
|
12
|
Nelson D, Moreau C, de Vriendt M, Zeng Y, Preuss C, Vézina H, Milot E, Andelfinger G, Labuda D, Gravel S. Inferring Transmission Histories of Rare Alleles in Population-Scale Genealogies. Am J Hum Genet 2018; 103:893-906. [PMID: 30526866 DOI: 10.1016/j.ajhg.2018.10.017] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 06/08/2018] [Accepted: 10/22/2018] [Indexed: 01/06/2023]
Abstract
Learning the transmission history of alleles through a family or population plays an important role in evolutionary, demographic, and medical genetic studies. Most classical models of population genetics have attempted to do so under the assumption that the genealogy of a population is unavailable and that its idiosyncrasies can be described by a small number of parameters describing population size and mate choice dynamics. Large genetic samples have increased sensitivity to such modeling assumptions, and large-scale genealogical datasets become a useful tool to investigate realistic genealogies. However, analyses in such large datasets are often intractable using conventional methods. We present an efficient method to infer transmission paths of rare alleles through population-scale genealogies. Based on backward-time Monte Carlo simulations of genetic inheritance, we use an importance sampling scheme to dramatically speed up convergence. The approach can take advantage of available genotypes of subsets of individuals in the genealogy including haplotype structure as well as information about the mode of inheritance and general prevalence of a mutation or disease in the population. Using a high-quality genealogical dataset of more than three million married individuals in the Quebec founder population, we apply the method to reconstruct the transmission history of chronic atrial and intestinal dysrhythmia (CAID), a rare recessive disease. We identify the most likely early carriers of the mutation and geographically map the expected carrier rate in the present-day French-Canadian population of Quebec.
|
13
|
Ullah E, Mall R, Abbas MM, Kunji K, Nato AQ, Bensmail H, Wijsman EM, Saad M. Comparison and assessment of family- and population-based genotype imputation methods in large pedigrees. Genome Res 2018; 29:125-134. [PMID: 30514702 PMCID: PMC6314157 DOI: 10.1101/gr.236315.118] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 02/26/2018] [Accepted: 11/30/2018] [Indexed: 01/19/2023]
Abstract
Genotype imputation is widely used in genome-wide association studies to boost variant density, allowing increased power in association testing. Many studies currently include pedigree data due to increasing interest in rare variants coupled with the availability of appropriate analysis tools. The performance of population-based (subjects are unrelated) imputation methods is well established. However, the performance of family- and population-based imputation methods on family data has been subject to much less scrutiny. Here, we extensively compare several family- and population-based imputation methods on family data of large pedigrees with both European and African ancestry. Our comparison includes many widely used family- and population-based tools and another method, Ped_Pop, which combines family- and population-based imputation results. We also compare four subject selection strategies for full sequencing to serve as the reference panel for imputation: GIGI-Pick, ExomePicks, PRIMUS, and random selection. Moreover, we compare two imputation accuracy metrics: the Imputation Quality Score and Pearson's correlation R² for predicting power of association analysis using imputation results. Our results show that (1) GIGI outperforms Merlin; (2) family-based imputation outperforms population-based imputation for rare variants but not for common ones; (3) combining family- and population-based imputation outperforms all imputation approaches for all minor allele frequencies; (4) GIGI-Pick gives the best selection strategy based on the R² criterion; and (5) R² is the best measure of imputation accuracy. Our study is the first to extensively evaluate the imputation performance of many available family- and population-based tools on the same family data and provides guidelines for future studies.
Affiliation(s)
- Ehsan Ullah
  - Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
- Raghvendra Mall
  - Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
- Mostafa M Abbas
  - Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
- Khalid Kunji
  - Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
- Alejandro Q Nato
  - Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington 98195-9460, USA
  - Department of Biomedical Sciences, Joan C. Edwards School of Medicine, Marshall University, Huntington, West Virginia 25755, USA
- Halima Bensmail
  - Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
- Ellen M Wijsman
  - Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington 98195-9460, USA
  - Department of Biostatistics, University of Washington, Seattle, Washington 98195-9460, USA
- Mohamad Saad
  - Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
|
14
|
Mozaffari SV, Stein MM, Magnaye KM, Nicolae DL, Ober C. Parent of origin gene expression in a founder population identifies two new candidate imprinted genes at known imprinted regions. PLoS One 2018; 13:e0203906. [PMID: 30204804 PMCID: PMC6133383 DOI: 10.1371/journal.pone.0203906] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 07/04/2018] [Accepted: 08/29/2018] [Indexed: 11/18/2022]
Abstract
Genomic imprinting is the phenomenon that leads to silencing of one copy of a gene inherited from a specific parent. Mutations in imprinted regions have been involved in diseases showing parent of origin effects. Identifying genes with evidence of parent of origin expression patterns in family studies allows the detection of more subtle imprinting. Here, we use allele specific expression in lymphoblastoid cell lines from 306 Hutterites related in a single pedigree to provide formal evidence for parent of origin effects. We take advantage of phased genotype data to assign parent of origin to RNA-seq reads in individuals with gene expression data. Our approach identified known imprinted genes, two putative novel imprinted genes, PXDC1 and PWAR6, and 14 genes with asymmetrical parent of origin gene expression. We used gene expression in peripheral blood leukocytes (PBL) to validate our findings, and then confirmed imprinting control regions (ICRs) using DNA methylation levels in the PBLs.
Affiliation(s)
- Sahar V. Mozaffari
  - Committee on Genetics, Genomics & Systems Biology, University of Chicago, Chicago, Illinois, United States of America
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Michelle M. Stein
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Kevin M. Magnaye
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Dan L. Nicolae
  - Committee on Genetics, Genomics & Systems Biology, University of Chicago, Chicago, Illinois, United States of America
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
  - Department of Statistics, University of Chicago, Chicago, Illinois, United States of America
- Carole Ober
  - Committee on Genetics, Genomics & Systems Biology, University of Chicago, Chicago, Illinois, United States of America
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
|
15
|
Knowles DA, Burrows CK, Blischak JD, Patterson KM, Serie DJ, Norton N, Ober C, Pritchard JK, Gilad Y. Determining the genetic basis of anthracycline-cardiotoxicity by molecular response QTL mapping in induced cardiomyocytes. eLife 2018; 7:e33480. [PMID: 29737278 PMCID: PMC6010343 DOI: 10.7554/elife.33480] [Citation(s) in RCA: 74] [Impact Index Per Article: 10.6] [Received: 11/11/2017] [Accepted: 04/30/2018] [Indexed: 12/17/2022]
Abstract
Anthracycline-induced cardiotoxicity (ACT) is a key limiting factor in setting optimal chemotherapy regimes, with almost half of patients expected to develop congestive heart failure given high doses. However, the genetic basis of sensitivity to anthracyclines remains unclear. We created a panel of iPSC-derived cardiomyocytes from 45 individuals and performed RNA-seq after 24 hr exposure to varying doxorubicin dosages. The transcriptomic response is substantial: the majority of genes are differentially expressed and over 6000 genes show evidence of differential splicing, the latter driven by reduced splicing fidelity in the presence of doxorubicin. We show that inter-individual variation in transcriptional response is predictive of in vitro cell damage, which in turn is associated with in vivo ACT risk. We detect 447 response-expression quantitative trait loci (QTLs) and 42 response-splicing QTLs, which are enriched in lower ACT GWAS p-values, supporting the in vivo relevance of our map of genetic regulation of cellular response to anthracyclines.
Affiliation(s)
- David A Knowles
  - Department of Genetics, Stanford University, Stanford, United States
  - Department of Radiology, Stanford University, Stanford, United States
- John D Blischak
  - Department of Human Genetics, University of Chicago, Chicago, United States
- Daniel J Serie
  - Department of Health Sciences Research, Mayo Clinic, Jacksonville, United States
- Nadine Norton
  - Department of Cancer Biology, Mayo Clinic, Jacksonville, United States
- Carole Ober
  - Department of Human Genetics, University of Chicago, Chicago, United States
- Jonathan K Pritchard
  - Department of Genetics, Stanford University, Stanford, United States
  - Department of Biology, Stanford University, Stanford, United States
  - Howard Hughes Medical Institute, Stanford University, Stanford, United States
- Yoav Gilad
  - Department of Human Genetics, University of Chicago, Chicago, United States
  - Department of Medicine, University of Chicago, Chicago, United States
|
16
|
Reppell M, Novembre J. Using pseudoalignment and base quality to accurately quantify microbial community composition. PLoS Comput Biol 2018; 14:e1006096. [PMID: 29659582 PMCID: PMC5945057 DOI: 10.1371/journal.pcbi.1006096] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 08/14/2017] [Revised: 05/10/2018] [Accepted: 03/19/2018] [Indexed: 12/31/2022]
Abstract
Pooled DNA from multiple unknown organisms arises in a variety of contexts, for example microbial samples from ecological or human health research. Determining the composition of pooled samples can be difficult, especially at the scale of modern sequencing data and reference databases. Here we propose a novel method for taxonomic profiling in pooled DNA that combines the speed and low-memory requirements of k-mer based pseudoalignment with a likelihood framework that uses base quality information to better resolve multiply mapped reads. We apply the method to the problem of classifying 16S rRNA reads using a reference database of known organisms, a common challenge in microbiome research. Using simulations, we show the method is accurate across a variety of read lengths, with different length reference sequences, at different sample depths, and when samples contain reads originating from organisms absent from the reference. We also assess performance in real 16S data, where we reanalyze previous genetic association data to show our method discovers a larger number of quantitative trait associations than other widely used methods. We implement our method in the software Karp, for k-mer based analysis of read pools, to provide a novel combination of speed and accuracy that is uniquely suited for enhancing discoveries in microbial studies.
Affiliation(s)
- Mark Reppell
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- John Novembre
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
|
17
|
Herzig AF, Nutile T, Babron MC, Ciullo M, Bellenguez C, Leutenegger AL. Strategies for phasing and imputation in a population isolate. Genet Epidemiol 2018; 42:201-213. [PMID: 29319195 DOI: 10.1002/gepi.22109] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 05/26/2017] [Revised: 11/16/2017] [Accepted: 11/16/2017] [Indexed: 11/05/2022]
Abstract
In the search for genetic associations with complex traits, population isolates offer the advantage of reduced genetic and environmental heterogeneity. In addition, cost-efficient next-generation association approaches have been proposed in these populations where only a subsample of representative individuals is sequenced and then genotypes are imputed into the rest of the population. Gene mapping in such populations thus requires high-quality genetic imputation and preliminary phasing. To identify an effective study design, we compare by simulation a range of phasing and imputation software and strategies. We simulated 1,115,604 variants on chromosome 10 for 477 members of the large complex pedigree of Campora, a village within the established isolate of Cilento in southern Italy. We assessed the phasing performance of identical by descent based software ALPHAPHASE and SLRP, LD-based software SHAPEIT2, SHAPEIT3, and BEAGLE, and new software EAGLE that combines both methodologies. For imputation we compared IMPUTE2, IMPUTE4, MINIMAC3, BEAGLE, and new software PBWT. Genotyping errors and missing genotypes were simulated to observe their effects on the performance of each software. Highly accurate phased data were achieved by all software with SHAPEIT2, SHAPEIT3, and EAGLE2 providing the most accurate results. MINIMAC3, IMPUTE4, and IMPUTE2 all performed strongly as imputation software and our study highlights the considerable gain in imputation accuracy provided by a genome sequenced reference panel specific to the population isolate.
Affiliation(s)
- Anthony Francis Herzig
  - Université Paris-Diderot, Sorbonne Paris Cité, U946, Paris, France
  - Inserm, U946, Genetic Variation and Human Diseases, Paris, France
- Teresa Nutile
  - Institute of Genetics and Biophysics A. Buzzati-Traverso-CNR, Naples, Italy
- Marie-Claude Babron
  - Université Paris-Diderot, Sorbonne Paris Cité, U946, Paris, France
  - Inserm, U946, Genetic Variation and Human Diseases, Paris, France
- Marina Ciullo
  - Institute of Genetics and Biophysics A. Buzzati-Traverso-CNR, Naples, Italy
  - IRCCS Neuromed, Pozzilli, Isernia, Italy
- Céline Bellenguez
  - Inserm, U1167, RID-AGE-Risk Factors and Molecular Determinants of Aging-Related Diseases, Lille, France
  - Institut Pasteur de Lille, Lille, France
  - Université de Lille, U1167-Excellence Laboratory LabEx DISTALZ, Lille, France
- Anne-Louise Leutenegger
  - Université Paris-Diderot, Sorbonne Paris Cité, U946, Paris, France
  - Inserm, U946, Genetic Variation and Human Diseases, Paris, France
|
18
|
Igartua C, Mozaffari SV, Nicolae DL, Ober C. Rare non-coding variants are associated with plasma lipid traits in a founder population. Sci Rep 2017; 7:16415. [PMID: 29180722 PMCID: PMC5704019 DOI: 10.1038/s41598-017-16550-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 07/12/2017] [Accepted: 11/14/2017] [Indexed: 12/31/2022]
Abstract
Founder populations are ideally suited for studies on the clinical effects of alleles that are rare in general populations but occur at higher frequencies in these isolated populations. Whole genome sequencing in 98 Hutterites, a founder population of European descent, and subsequent imputation revealed 660,238 single nucleotide polymorphisms that are rare (<1%) or absent in European populations, but occur at frequencies >1% in the Hutterites. We examined the effects of these rare-in-European variants on plasma lipid levels in 828 Hutterites and applied a Bayesian hierarchical framework to prioritize potentially causal variants based on functional annotations. We identified two novel non-coding rare variants associated with LDL cholesterol (rs17242388 in LDLR) and HDL cholesterol (rs189679427 between GOT2 and APOOP5), and replicated previous associations of a splice variant in APOC3 (rs138326449) with triglycerides and HDL-C. All three variants are at well-replicated loci in GWAS but are independent from and have larger effect sizes than the known common variation in these regions. Candidate eQTL analyses in LCLs in the Hutterites suggest that these rare non-coding variants are likely to mediate their effects on lipid traits by regulating gene expression.
Affiliation(s)
- Catherine Igartua
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Sahar V Mozaffari
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
  - Committee of Genetics, Genomics and Systems Biology, University of Chicago, Chicago, IL, 60637, USA
- Dan L Nicolae
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
  - Department of Statistics, University of Chicago, Chicago, IL, 60637, USA
  - Department of Medicine, University of Chicago, Chicago, IL, 60637, USA
- Carole Ober
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
  - Committee of Genetics, Genomics and Systems Biology, University of Chicago, Chicago, IL, 60637, USA
|
19
|
Ko A, Nielsen R. Composite likelihood method for inferring local pedigrees. PLoS Genet 2017; 13:e1006963. [PMID: 28827797 PMCID: PMC5578687 DOI: 10.1371/journal.pgen.1006963] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 01/10/2017] [Revised: 08/31/2017] [Accepted: 08/07/2017] [Indexed: 12/21/2022]
Abstract
Pedigrees contain information about the genealogical relationships among individuals and are of fundamental importance in many areas of genetic studies. However, pedigrees are often unknown and must be inferred from genetic data. Despite the importance of pedigree inference, existing methods are limited to inferring only close relationships or analyzing a small number of individuals or loci. We present a simulated annealing method for estimating pedigrees in large samples of otherwise seemingly unrelated individuals using genome-wide SNP data. The method supports complex pedigree structures such as polygamous families, multi-generational families, and pedigrees in which many of the member individuals are missing. Computational speed is greatly enhanced by the use of a composite likelihood function which approximates the full likelihood. We validate our method on simulated data and show that it can infer distant relatives more accurately than existing methods. Furthermore, we illustrate the utility of the method on a sample of Greenlandic Inuit.
Pedigrees contain information about the genealogical relationships among individuals. This information can be used in many areas of genetic studies such as disease association studies, conservation efforts, and for inferences about the demographic history and social structure of a population. Despite their importance, pedigrees are often unknown and must be estimated from genetic information. However, pedigree inference remains a difficult problem due to the high cost of likelihood computation and the enormous number of possible pedigrees that must be considered. These difficulties limit existing methods in their ability to infer pedigrees when the sample size or the number of markers is large, or when the sample contains only distant relatives. In this report, we present a method that circumvents these computational challenges in order to infer pedigrees of complex structure for a large number of individuals. Using simulations, we find that the method can infer distant relatives much more accurately than existing methods. Furthermore, we show that even pairwise inferences of relatedness can be improved substantially by consideration of the pedigree structure with other related individuals in the sample.
Affiliation(s)
- Amy Ko
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
- Rasmus Nielsen
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
  - Department of Statistics, University of California, Berkeley, Berkeley, California, United States of America
  - Museum of Natural History, University of Copenhagen, Copenhagen, Denmark
|
20
|
Pathogenic Variant in ACTB, p.Arg183Trp, Causes Juvenile-Onset Dystonia, Hearing Loss, and Developmental Delay without Midline Malformation. Case Rep Genet 2017; 2017:9184265. [PMID: 28487785 PMCID: PMC5405358 DOI: 10.1155/2017/9184265] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 01/16/2017] [Revised: 03/25/2017] [Accepted: 03/30/2017] [Indexed: 11/17/2022]
Abstract
ACTB encodes β-actin, and pathogenic variations in this gene have typically been associated with Baraitser-Winter cerebrofrontofacial syndrome, a congenital malformation syndrome characterized by short stature, craniofacial anomalies, and cerebral anomalies. Here, we describe the third case with the p.Arg183Trp variant in ACTB causing juvenile-onset dystonia. Our patient has severe, intractable dystonia, developmental delay, and sensorineural hearing loss, as well as hyperintensities in the caudate nuclei and putamen on brain MRI, a phenotype that is distinct from but overlaps with that of the previously reported case of identical twins with the same alteration in ACTB.
|
21
|
Igartua C, Davenport ER, Gilad Y, Nicolae DL, Pinto J, Ober C. Host genetic variation in mucosal immunity pathways influences the upper airway microbiome. MICROBIOME 2017; 5:16. [PMID: 28143570 PMCID: PMC5286564 DOI: 10.1186/s40168-016-0227-5] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Received: 07/29/2016] [Accepted: 12/25/2016] [Indexed: 05/08/2023]
Abstract
BACKGROUND: The degree to which host genetic variation can modulate microbial communities in humans remains an open question. Here, we performed a genetic mapping study of the microbiome in two accessible upper airway sites, the nasopharynx and the nasal vestibule, during two seasons in 144 adult members of a founder population of European descent. RESULTS: We estimated the relative abundances (RAs) of genus-level bacteria from 16S rRNA gene sequences and examined associations with 148,653 genetic variants (linkage disequilibrium [LD] r² < 0.5) selected from among all common variants discovered in genome sequences in this population. We identified 37 microbiome quantitative trait loci (mbQTLs) that showed evidence of association with the RAs of 22 genera (q < 0.05) and were enriched for genes in mucosal immunity pathways. The most significant association was between the RA of Dermacoccus (phylum Actinobacteria) and a variant 8 kb upstream of TINCR (rs117042385; p = 1.61 × 10⁻⁸; q = 0.002), a long non-coding RNA that binds to peptidoglycan recognition protein 3 (PGLYRP3) mRNA, a gene encoding a known antimicrobial protein. A second association was between a missense variant in PGLYRP4 (rs3006458) and the RA of an unclassified genus of family Micrococcaceae (phylum Actinobacteria) (p = 5.10 × 10⁻⁷; q = 0.032). CONCLUSIONS: Our findings provide evidence of host genetic influences on upper airway microbial composition in humans and implicate mucosal immunity genes in this relationship.
Affiliation(s)
- Catherine Igartua
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Emily R Davenport
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, 14853, USA
- Yoav Gilad
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
  - Department of Medicine, University of Chicago, Chicago, IL, 60637, USA
- Dan L Nicolae
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
  - Department of Medicine, University of Chicago, Chicago, IL, 60637, USA
  - Department of Statistics, University of Chicago, Chicago, IL, 60637, USA
- Jayant Pinto
  - Section of Otolaryngology-Head and Neck Surgery, Department of Surgery, University of Chicago, Chicago, IL, 60637, USA
- Carole Ober
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
|
22
|
A novel NDUFS4 frameshift mutation causes Leigh disease in the Hutterite population. Am J Med Genet A 2016; 173:596-600. [DOI: 10.1002/ajmg.a.37983] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 01/04/2016] [Accepted: 09/05/2016] [Indexed: 11/07/2022]
|
23
|
Burrows CK, Kosova G, Herman C, Patterson K, Hartmann KE, Velez Edwards DR, Stephenson MD, Lynch VJ, Ober C. Expression Quantitative Trait Locus Mapping Studies in Mid-secretory Phase Endometrial Cells Identifies HLA-F and TAP2 as Fecundability-Associated Genes. PLoS Genet 2016; 12:e1005858. [PMID: 27447835 PMCID: PMC4957750 DOI: 10.1371/journal.pgen.1005858] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 11/15/2015] [Accepted: 01/20/2016] [Indexed: 12/29/2022]
Abstract
Fertility traits in humans are heritable; however, little is known about the genes that influence reproductive outcomes or the genetic variants that contribute to differences in these traits between individuals, particularly women. To address this gap in knowledge, we performed an unbiased genome-wide expression quantitative trait locus (eQTL) mapping study to identify common regulatory (expression) single nucleotide polymorphisms (eSNPs) in mid-secretory endometrium. We identified 423 cis-eQTLs for 132 genes that were significant at a false discovery rate (FDR) of 1%. After pruning for strong LD (r² > 0.95), we tested for associations between eSNPs and fecundability (the ability to get pregnant), measured as the length of the interval to pregnancy, in 117 women. Two eSNPs were associated with fecundability at an FDR of 5%; both were in the HLA region and were eQTLs for the TAP2 gene (P = 1.3 × 10⁻⁴) and the HLA-F gene (P = 4.0 × 10⁻⁴), respectively. The effects of these SNPs on fecundability were replicated in an independent sample. The two eSNPs reside within or near regulatory elements in decidualized human endometrial stromal cells. Our study integrating eQTL mapping in a primary tissue with association studies of a related phenotype revealed novel genes and associated alleles with independent effects on fecundability, and identified a central role for two HLA region genes in human implantation success.
Little is known about the genetics of female fertility. In this study, we addressed this gap in knowledge by first searching for genetic variants that regulate gene expression in uterine endometrial cells, and then testing those functional variants for associations with the length of time to pregnancy in fertile women. Two functional genetic variants were associated with time to pregnancy in women after correcting for multiple testing. Those variants were each associated with the expression of genes in the HLA region, HLA-F and TAP2, which have not previously been implicated in female fertility. The association of HLA-F and TAP2 genotypes with the length of time to pregnancy was replicated in an independent cohort of women. Because HLA-F and TAP2 are involved in immune processes, these results suggest their role in specific immune regulation in the endometrium during implantation. Future studies will characterize these molecules in the implantation process and their potential as drug targets for treatment of conditions related to implantation failure.
Affiliation(s)
- Courtney K. Burrows
  - Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Gülüm Kosova
  - Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Catherine Herman
  - Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Kristen Patterson
  - Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Katherine E. Hartmann
  - Institute for Medicine and Public Health, Vanderbilt Epidemiology Center, Vanderbilt University, Nashville, Tennessee, United States of America
  - Departments of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Digna R. Velez Edwards
  - Institute for Medicine and Public Health, Vanderbilt Epidemiology Center, Vanderbilt University, Nashville, Tennessee, United States of America
  - Departments of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, Tennessee, United States of America
- Mary D. Stephenson
  - Department of Obstetrics and Gynecology, The University of Chicago, Chicago, Illinois, United States of America
- Vincent J. Lynch
  - Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Carole Ober
  - Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
  - Department of Obstetrics and Gynecology, The University of Chicago, Chicago, Illinois, United States of America
|
24
|
Cusanovich DA, Caliskan M, Billstrand C, Michelini K, Chavarria C, De Leon S, Mitrano A, Lewellyn N, Elias JA, Chupp GL, Lang RM, Shah SJ, Decara JM, Gilad Y, Ober C. Integrated analyses of gene expression and genetic association studies in a founder population. Hum Mol Genet 2016; 25:2104-2112. [PMID: 26931462 PMCID: PMC5062579 DOI: 10.1093/hmg/ddw061] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 08/15/2015] [Accepted: 02/21/2016] [Indexed: 12/17/2022]
Abstract
Genome-wide association studies (GWASs) have become a standard tool for dissecting genetic contributions to disease risk. However, these studies typically require extraordinarily large sample sizes to be adequately powered. Strategies that incorporate functional information alongside genetic associations have proved successful in increasing GWAS power. Following this paradigm, we present the results of 20 different genetic association studies for quantitative traits related to complex diseases, conducted in the Hutterites of South Dakota. To boost the power of these association studies, we collected RNA-sequencing data from lymphoblastoid cell lines for 431 Hutterite individuals. We then used Sherlock, a tool that integrates GWAS and expression quantitative trait locus (eQTL) data, to identify weak GWAS signals that are also supported by eQTL data. Using this approach, we found novel associations with quantitative phenotypes related to cardiovascular disease, including carotid intima-media thickness, left atrial volume index, monocyte count and serum YKL-40 levels.
Affiliation(s)
- Jack A Elias
  - Division of Biology and Medicine, Brown University, Providence, RI 02912, USA
- Geoffrey L Chupp
  - Pulmonary and Critical Care, Yale School of Medicine, New Haven, CT 06519, USA
- Roberto M Lang
  - Department of Medicine, Section of Cardiology, University of Chicago, Chicago, IL 60637, USA
- Sanjiv J Shah
  - Department of Medicine, Section of Cardiology, University of Chicago, Chicago, IL 60637, USA
- Jeanne M Decara
  - Department of Medicine, Section of Cardiology, University of Chicago, Chicago, IL 60637, USA
|
25
|
Sengupta S, Gulukota K, Zhu Y, Ober C, Naughton K, Wentworth-Sheilds W, Ji Y. Ultra-fast local-haplotype variant calling using paired-end DNA-sequencing data reveals somatic mosaicism in tumor and normal blood samples. Nucleic Acids Res 2016; 44:e25. [PMID: 26420835 PMCID: PMC4756850 DOI: 10.1093/nar/gkv953] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 04/18/2015] [Revised: 09/09/2015] [Accepted: 09/13/2015] [Indexed: 12/30/2022]
Abstract
Somatic mosaicism refers to the existence of somatic mutations in a fraction of somatic cells in a single biological sample. Its importance has mainly been discussed in theory, although experimental work has started to emerge linking somatic mosaicism to disease diagnosis. Through novel statistical modeling of paired-end DNA-sequencing data using blood-derived DNA from healthy donors as well as DNA from tumor samples, we present an ultra-fast computational pipeline, LocHap, which searches for multiple single nucleotide variants (SNVs) that are scaffolded by the same reads. We refer to scaffolded SNVs as local haplotypes (LH). When an LH exhibits more than two genotypes, we call it a local haplotype variant (LHV). The presence of LHVs is considered evidence of somatic mosaicism because a genetically homogeneous cell population will not harbor LHVs. Applying LocHap to whole-genome and whole-exome sequence data in DNA from normal blood and tumor samples, we find widespread LHVs across the genome. Importantly, we find more LHVs in tumor samples than in normal samples, and more in older adults than in younger ones. We confirm the existence of LHVs and somatic mosaicism by validation studies in normal blood samples. LocHap is publicly available at http://www.compgenome.org/lochap.
Affiliation(s)
- Subhajit Sengupta
  - Program of Computational Genomics & Medicine, NorthShore University HealthSystem, Evanston, IL 60201, USA
- Kamalakar Gulukota
  - Center for Molecular Medicine, NorthShore University HealthSystem, Evanston, IL 60201, USA
- Yitan Zhu
  - Program of Computational Genomics & Medicine, NorthShore University HealthSystem, Evanston, IL 60201, USA
- Carole Ober
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
- Katherine Naughton
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
- Yuan Ji
  - Program of Computational Genomics & Medicine, NorthShore University HealthSystem, Evanston, IL 60201, USA
  - Department of Health Studies, University of Chicago, Chicago, IL 60637, USA
|