1
Li M, Zhang YW, Zhang ZC, Xiang Y, Liu MH, Zhou YH, Zuo JF, Zhang HQ, Chen Y, Zhang YM. A compressed variance component mixed model for detecting QTNs and QTN-by-environment and QTN-by-QTN interactions in genome-wide association studies. MOLECULAR PLANT 2022; 15:630-650. [PMID: 35202864 DOI: 10.1016/j.molp.2022.02.012] [Citation(s) in RCA: 47] [Impact Index Per Article: 23.5] [Received: 09/04/2021] [Revised: 01/26/2022] [Accepted: 02/19/2022] [Indexed: 05/25/2023]
Abstract
Although genome-wide association studies are widely used to mine genes for quantitative traits, the effects to be estimated are confounded, and the methodologies for detecting interactions are imperfect. To address these issues, the mixed model proposed here first estimates the genotypic effects for AA, Aa, and aa, and the genotypic polygenic background replaces additive and dominance polygenic backgrounds. Then, the estimated genotypic effects are partitioned into additive and dominance effects using a one-way analysis of variance model. This strategy was further expanded to cover QTN-by-environment interactions (QEIs) and QTN-by-QTN interactions (QQIs) using the same mixed-model framework. Thus, a three-variance-component mixed model was integrated with our multi-locus random-SNP-effect mixed linear model (mrMLM) method to establish a new methodological framework, 3VmrMLM, that detects all types of loci and estimates their effects. In Monte Carlo studies, 3VmrMLM correctly detected all types of loci and almost unbiasedly estimated their effects, with high powers and accuracies and a low false positive rate. In re-analyses of 10 traits in 1439 rice hybrids, detection of 269 known genes, 45 known gene-by-environment interactions, and 20 known gene-by-gene interactions strongly validated 3VmrMLM. Further analyses of known genes showed more small (67.49%), minor-allele-frequency (35.52%), and pleiotropic (30.54%) genes, with higher repeatability across datasets (54.36%) and more dominance loci. In addition, a heteroscedasticity mixed model in multiple environments and dimension reduction methods in quite a number of environments were developed to detect QEIs, and variable selection under a polygenic background was proposed for QQI detection. This study provides a new approach for revealing the genetic architecture of quantitative traits.
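The partitioning step described in this abstract can be illustrated with a minimal sketch. It assumes the standard midpoint parameterization from quantitative genetics (additive effect as half the difference between the two homozygotes, dominance effect as the heterozygote's deviation from their midpoint). The abstract does not spell out the exact one-way analysis of variance model used in 3VmrMLM, so the function name and the numbers below are hypothetical and serve only as illustration.

```python
def partition_genotypic_effects(g_AA, g_Aa, g_aa):
    """Split three genotypic effects into midpoint, additive and dominance terms.

    Assumption: standard midpoint parameterization, not necessarily the
    exact model used by the authors.
    """
    midpoint = (g_AA + g_aa) / 2.0   # mean of the two homozygotes
    additive = (g_AA - g_aa) / 2.0   # half the homozygote difference
    dominance = g_Aa - midpoint      # heterozygote deviation from the midpoint
    return midpoint, additive, dominance


# Hypothetical estimated genotypic effects for AA, Aa and aa.
m, a, d = partition_genotypic_effects(g_AA=6.0, g_Aa=5.5, g_aa=2.0)
print(m, a, d)
```

Under this parameterization the three genotypic effects are recovered as m + a (AA), m + d (Aa) and m - a (aa), so no information is lost by the partition.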
Affiliation(s)
- Mei Li
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Ya-Wen Zhang
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
  - State Key Laboratory of Cotton Biology, Anyang 455000, China
- Ze-Chang Zhang
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yu Xiang
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Ming-Hui Liu
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Ya-Hui Zhou
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Jian-Fang Zuo
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Han-Qing Zhang
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Ying Chen
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yuan-Ming Zhang
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
2
Serres-Armero A, Davis BW, Povolotskaya IS, Morcillo-Suarez C, Plassais J, Juan D, Ostrander EA, Marques-Bonet T. Copy number variation underlies complex phenotypes in domestic dog breeds and other canids. Genome Res 2021; 31:762-774. [PMID: 33863806 PMCID: PMC8092016 DOI: 10.1101/gr.266049.120] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/15/2020] [Accepted: 02/26/2021] [Indexed: 01/02/2023]
Abstract
Extreme phenotypic diversity, a history of artificial selection, and socioeconomic value make domestic dog breeds a compelling subject for genomic research. Copy number variation (CNV) is known to account for a significant part of inter-individual genomic diversity in other systems. However, a comprehensive genome-wide study of structural variation as it relates to breed-specific phenotypes is lacking. We have generated whole genome CNV maps for more than 300 canids. Our data set extends the canine structural variation landscape to more than 100 dog breeds, including novel variants that cannot be assessed using microarray technologies. We have taken advantage of this data set to perform the first CNV-based genome-wide association study (GWAS) in canids. We identify 96 loci that display copy number differences across breeds, which are statistically associated with a previously compiled set of breed-specific morphometrics and disease susceptibilities. Among these, we highlight the discovery of a long-range interaction involving a CNV near MED13L and TBX3, which could influence breed standard height. Integration of the CNVs with chromatin interactions, long noncoding RNA expression, and single nucleotide variation highlights a subset of specific loci and genes with potential functional relevance and the prospect to explain trait variation between dog breeds.
Affiliation(s)
- Aitor Serres-Armero
  - IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, Barcelona 08003, Spain
- Brian W Davis
  - Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
  - Department of Veterinary Integrative Biosciences, College of Veterinary Medicine, Texas A&M University, College Station, Texas 77843, USA
- Inna S Povolotskaya
  - Veltischev Research and Clinical Institute for Pediatrics of the Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Carlos Morcillo-Suarez
  - IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, Barcelona 08003, Spain
- Jocelyn Plassais
  - Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- David Juan
  - IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, Barcelona 08003, Spain
- Elaine A Ostrander
  - Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Tomas Marques-Bonet
  - IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, Barcelona 08003, Spain
  - CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona 08028, Spain
  - Institucio Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia 08010, Spain
  - Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Catalonia 08201, Spain
3
Genome-wide association mapping for adult resistance to powdery mildew in common wheat. Mol Biol Rep 2019; 47:1241-1256. [PMID: 31813131 DOI: 10.1007/s11033-019-05225-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 09/11/2019] [Accepted: 12/04/2019] [Indexed: 12/23/2022]
Abstract
Blumeria graminis f. sp. tritici, the causal agent of wheat powdery mildew disease, can occur at all stages of the crop and constantly threatens wheat production. To identify candidate resistance genes for powdery mildew, we performed GWAS (genome-wide association studies) on a total set of 329 wheat varieties obtained from different origins. These wheat materials were genotyped using the wheat 90K SNP array and evaluated for their resistance under either field or glasshouse conditions from 2016 to 2018. Using a mixed linear model, 33 SNP markers, from which 14 QTL (quantitative trait loci) were later defined, were found to be associated with powdery mildew resistance. Among these, the QTL on chromosomes 3A, 3B, 6D and 7D were considered potentially new. Exploration of candidate genes for the new QTL suggested that these genes encode disease resistance and defence-related proteins and regulate the early immune response to the pathogen. Overall, the results reveal that GWAS can be an effective means of identifying marker-trait associations, though further functional validation and fine-mapping of gene candidates are required before creating opportunities for developing new resistant genotypes.
4
Nani JP, Rezende FM, Peñagaricano F. Predicting male fertility in dairy cattle using markers with large effect and functional annotation data. BMC Genomics 2019; 20:258. [PMID: 30940077 PMCID: PMC6444482 DOI: 10.1186/s12864-019-5644-y] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 10/21/2018] [Accepted: 03/25/2019] [Indexed: 11/22/2022]
Abstract
Background: Fertility is among the most important economic traits in dairy cattle. Genomic prediction for cow fertility has received much attention in the last decade, while bull fertility has been largely overlooked. The goal of this study was to assess genomic prediction of dairy bull fertility using markers with large effect and functional annotation data. Sire conception rate (SCR) was used as a measure of service sire fertility. Dataset consisted of 11.5 k U.S. Holstein bulls with SCR records and about 300 k single nucleotide polymorphism (SNP) markers. The analyses included the use of both single-kernel and multi-kernel predictive models fitting either all SNPs, markers with large effect, or markers with presumed functional roles, such as non-synonymous, synonymous, or non-coding regulatory variants.
Results: The entire set of SNPs yielded predictive correlations of 0.340. Five markers located on chromosomes BTA8, BTA9, BTA13, BTA17, and BTA27 showed marked dominance effects. Interestingly, the inclusion of these five major markers as fixed effects in the predictive models increased predictive correlations to 0.403, representing an increase in accuracy of about 19% compared with the standard model. Single-kernel models fitting functional SNP classes outperformed their counterparts using random sets of SNPs, suggesting that the predictive power of these functional variants is driven in part by their biological roles. Multi-kernel models fitting all the functional SNP classes together with the five major markers exhibited predictive correlations around 0.405.
Conclusions: The inclusion of markers with large effect markedly improved the prediction of dairy sire fertility. Functional variants exhibited higher predictive ability than random variants, but did not outperform the standard whole-genome approach. This research is the foundation for the development of novel strategies that could help the dairy industry make accurate genome-guided selection decisions on service sire fertility.
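The "about 19%" figure quoted in this abstract can be checked from the two predictive correlations it reports (0.340 and 0.403). The short sketch below is only an arithmetic illustration; the function name is hypothetical and not taken from the paper.

```python
def relative_gain(baseline, improved):
    """Relative increase of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline


# Predictive correlations reported in the abstract.
gain = relative_gain(baseline=0.340, improved=0.403)
print(f"{gain:.1%}")  # prints 18.5%
```

A relative gain of roughly 18.5% is consistent with the abstract's rounded statement of about 19%.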
Affiliation(s)
- Juan Pablo Nani
  - Department of Animal Sciences, University of Florida, 2250 Shealy Drive, Gainesville, FL, 32611, USA
  - Estación Experimental Agropecuaria Rafaela, Instituto Nacional de Tecnología Agropecuaria, 22-2300, Rafaela, SF, Argentina
- Fernanda M Rezende
  - Department of Animal Sciences, University of Florida, 2250 Shealy Drive, Gainesville, FL, 32611, USA
  - Faculdade de Medicina Veterinária, Universidade Federal de Uberlândia, Uberlândia, MG, 38410-337, Brazil
- Francisco Peñagaricano
  - Department of Animal Sciences, University of Florida, 2250 Shealy Drive, Gainesville, FL, 32611, USA
  - University of Florida Genetics Institute, University of Florida, Gainesville, FL, 32610, USA
5
Rezende FM, Dietsch GO, Peñagaricano F. Genetic dissection of bull fertility in US Jersey dairy cattle. Anim Genet 2018; 49:393-402. [PMID: 30109710 PMCID: PMC6175157 DOI: 10.1111/age.12710] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Accepted: 06/28/2018] [Indexed: 02/06/2023]
Abstract
The service sire has been recognized as an important factor affecting herd fertility in dairy cattle. Recent studies suggest that genetic factors explain part of the difference in fertility among Holstein sires. The main objective of this study was to dissect the genetic architecture of sire fertility in US Jersey cattle. The dataset included 1.5 K Jersey bulls with sire conception rate (SCR) records and 96 K single nucleotide polymorphism (SNP) markers spanning the whole genome. The analysis included whole-genome scans for both additive and non-additive effects and subsequent functional enrichment analyses using KEGG Pathway, Gene Ontology (GO) and Medical Subject Headings (MeSH) databases. Ten genomic regions located on eight different chromosomes explained more than 0.5% of the additive genetic variance for SCR. These regions harbor genes, such as PKDREJ, EPB41L2, PDGFD, STX2, SLC25A20 and IP6K1, that are directly implicated in testis development and spermatogenesis, sperm motility and the acrosome reaction. In addition, the genomic scan for non-additive effects identified two regions on BTA11 and BTA25 with marked recessive effects. These regions harbor three genes (FER1L5, CNNM4 and DNAH3) with known roles in sperm biology. Moreover, the gene-set analysis revealed terms associated with calcium regulation and signaling, membrane fusion, sperm cell energy metabolism, GTPase activity and MAPK signaling. These gene sets are directly implicated in sperm physiology and male fertility. Overall, this integrative genomic study unravels genetic variants and pathways affecting Jersey bull fertility. These findings may contribute to the development of novel genomic strategies for improving sire fertility in Jersey cattle.
Affiliation(s)
- F M Rezende
  - Department of Animal Sciences, University of Florida, Gainesville, FL, 32611, USA
  - Faculdade de Medicina Veterinária, Universidade Federal de Uberlândia, Uberlândia, MG, 38400-902, Brazil
- G O Dietsch
  - Department of Animal Sciences, University of Florida, Gainesville, FL, 32611, USA
- F Peñagaricano
  - Department of Animal Sciences, University of Florida, Gainesville, FL, 32611, USA
  - University of Florida Genetics Institute, University of Florida, Gainesville, FL, 32610, USA
6
Mandaviya PR, Joehanes R, Aïssi D, Kühnel B, Marioni RE, Truong V, Stolk L, Beekman M, Bonder MJ, Franke L, Gieger C, Huan T, Ikram MA, Kunze S, Liang L, Lindemans J, Liu C, McRae AF, Mendelson MM, Müller-Nurasyid M, Peters A, Slagboom PE, Starr JM, Trégouët DA, Uitterlinden AG, van Greevenbroek MMJ, van Heemst D, van Iterson M, Wells PS, Yao C, Deary IJ, Gagnon F, Heijmans BT, Levy D, Morange PE, Waldenberger M, Heil SG, van Meurs JBJ. Genetically defined elevated homocysteine levels do not result in widespread changes of DNA methylation in leukocytes. PLoS One 2017; 12:e0182472. [PMID: 29084233 PMCID: PMC5662081 DOI: 10.1371/journal.pone.0182472] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/17/2017] [Accepted: 07/19/2017] [Indexed: 12/21/2022]
Abstract
BACKGROUND: DNA methylation is affected by the activities of the key enzymes and intermediate metabolites of the one-carbon pathway, one of which involves homocysteine. We investigated the effect of the well-known genetic variant associated with mildly elevated homocysteine: MTHFR 677C>T independently and in combination with other homocysteine-associated variants, on genome-wide leukocyte DNA-methylation.
METHODS: Methylation levels were assessed using Illumina 450k arrays on 9,894 individuals of European ancestry from 12 cohort studies. Linear-mixed-models were used to study the association of additive MTHFR 677C>T and genetic-risk score (GRS) based on 18 homocysteine-associated SNPs, with genome-wide methylation.
RESULTS: Meta-analysis revealed that the MTHFR 677C>T variant was associated with 35 CpG sites in cis, and the GRS showed association with 113 CpG sites near the homocysteine-associated variants. Genome-wide analysis revealed that the MTHFR 677C>T variant was associated with 1 trans-CpG (nearest gene ZNF184), while the GRS model showed association with 5 significant trans-CpGs annotated to nearest genes PTF1A, MRPL55, CTDSP2, CRYM and FKBP5.
CONCLUSIONS: Our results do not show widespread changes in DNA-methylation across the genome, and therefore do not support the hypothesis that mildly elevated homocysteine is associated with widespread methylation changes in leukocytes.
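A genetic-risk score of the kind mentioned in this abstract is commonly computed as a sum of risk-allele counts across SNPs, optionally weighted by per-SNP effect sizes. The sketch below illustrates that general idea only. The abstract does not say whether the 18-SNP score was weighted, and the function name, dosages and weights here are hypothetical.

```python
def genetic_risk_score(dosages, weights=None):
    """Sum of risk-allele dosages (0, 1 or 2 per SNP), optionally weighted.

    Assumption: a generic GRS definition, not the specific score
    constructed in the study.
    """
    if weights is None:
        weights = [1.0] * len(dosages)  # unweighted score: plain allele count
    if len(weights) != len(dosages):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical individual genotyped at 18 SNPs.
dosages = [0, 1, 2, 1, 0, 0, 2, 1, 1, 0, 2, 2, 1, 0, 1, 1, 0, 2]
print(genetic_risk_score(dosages))  # unweighted risk-allele count
```

Passing a `weights` list of the same length turns the count into a weighted score, for example using effect sizes from a prior association study.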
Grants
- G0700704 Medical Research Council
- N01HC25195 NHLBI NIH HHS
- Wellcome Trust
- MR/K026992/1 Medical Research Council
- K99 HL136875 NHLBI NIH HHS
- BB/F019394/1 Biotechnology and Biological Sciences Research Council
- National Institutes of Health
- Erasmus Medical Center and Erasmus University, Rotterdam
- Netherlands Organization for the Health Research and Development (ZonMw)
- Research Institute for Diseases in the Elderly (RIDE)
- Ministry of Education, Culture and Science
- Ministry for Health, Welfare and Sports, the European Commission (DG XII)
- Municipality of Rotterdam
- Netherlands CardioVascular Research Initiative (the Dutch Heart Foundation, Dutch Federation of University Medical Centres, the Netherlands Organisation for Health Research and Development, and the Royal Netherlands Academy of Sciences) for the GENIUS project “Generating the best evidence-based pharmaceutical targets for atherosclerosis”
- BBMRI-NL, a research infrastructure financed by the Dutch government
- Région Ile de France (CORDDIM)
- Région Ile de France, Pierre and Marie Curie University
- ICAN Institute for Cardiometabolism and Nutrition
- Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health
Affiliation(s)
- Pooja R. Mandaviya
  - Department of Clinical Chemistry, Erasmus University Medical Center, Rotterdam, The Netherlands
  - Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, The Netherlands
- Roby Joehanes
  - Institute for Aging Research, Hebrew SeniorLife, Harvard Medical School, Boston, MA, United States of America
- Dylan Aïssi
  - Sorbonne Universités, UPMC Univ. Paris 06, INSERM, UMR_S 1166, Team Genomics & Pathophysiology of Cardiovascular Diseases, Paris, France
  - ICAN Institute for Cardiometabolism and Nutrition, Paris, France
- Brigitte Kühnel
  - Research Unit of Molecular Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
  - Institute of Epidemiology II, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Riccardo E. Marioni
  - Queensland Brain Institute, The University of Queensland, Brisbane, Australia
  - Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, United Kingdom
  - Medical Genetics Section, Centre for Genomic and Experimental Medicine, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Vinh Truong
  - Division of Epidemiology, Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
- Lisette Stolk
  - Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, The Netherlands
- Marian Beekman
  - Molecular Epidemiology Section, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
- Marc Jan Bonder
  - Department of Genetics, University Medical Center Groningen, Groningen, The Netherlands
- Lude Franke
  - Department of Genetics, University Medical Center Groningen, Groningen, The Netherlands
- Christian Gieger
  - Research Unit of Molecular Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
  - Institute of Epidemiology II, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Tianxiao Huan
  - Framingham Heart Study, Framingham, MA, United States of America
  - The Population Studies Branch, National Heart, Lung, and Blood Institute of the National Institutes of Health, Bethesda, MD, United States of America
- M. Arfan Ikram
  - Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Sonja Kunze
  - Research Unit of Molecular Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
  - Institute of Epidemiology II, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Liming Liang
  - Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
- Jan Lindemans
  - Department of Clinical Chemistry, Erasmus University Medical Center, Rotterdam, The Netherlands
- Chunyu Liu
  - Framingham Heart Study, Framingham, MA, United States of America
  - The Population Studies Branch, National Heart, Lung, and Blood Institute of the National Institutes of Health, Bethesda, MD, United States of America
- Allan F. McRae
  - Queensland Brain Institute, The University of Queensland, Brisbane, Australia
- Michael M. Mendelson
  - Framingham Heart Study, Framingham, MA, United States of America
  - The Population Studies Branch, National Heart, Lung, and Blood Institute of the National Institutes of Health, Bethesda, MD, United States of America
  - Department of Cardiology, Boston Children's Hospital, Boston, MA, United States of America
- Martina Müller-Nurasyid
  - DZHK (German Centre for Cardiovascular Research), partner site Munich Heart Alliance, Munich, Germany
  - Institute of Genetic Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
  - Department of Medicine I, University Hospital Munich, Campus Grosshadern, Ludwig-Maximilians-University, Munich, Germany
- Annette Peters
  - Research Unit of Molecular Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
  - Institute of Epidemiology II, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
  - DZHK (German Centre for Cardiovascular Research), partner site Munich Heart Alliance, Munich, Germany
- P. Eline Slagboom
  - Molecular Epidemiology Section, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
- John M. Starr
  - Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, United Kingdom
- David-Alexandre Trégouët
  - Sorbonne Universités, UPMC Univ. Paris 06, INSERM, UMR_S 1166, Team Genomics & Pathophysiology of Cardiovascular Diseases, Paris, France
  - ICAN Institute for Cardiometabolism and Nutrition, Paris, France
- André G. Uitterlinden
  - Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, The Netherlands
- Marleen M. J. van Greevenbroek
  - Department of Internal Medicine and School for Cardiovascular Diseases (CARIM), Maastricht University Medical Center, Maastricht, The Netherlands
- Diana van Heemst
  - Department of Gerontology and Geriatrics Section, Leiden University Medical Center, Leiden, The Netherlands
- Maarten van Iterson
  - Molecular Epidemiology Section, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
- Philip S. Wells
  - Department of Medicine, University of Ottawa, and the Ottawa Hospital Research Institute, Ottawa, Canada
- Chen Yao
  - Framingham Heart Study, Framingham, MA, United States of America
  - The Population Studies Branch, National Heart, Lung, and Blood Institute of the National Institutes of Health, Bethesda, MD, United States of America
- Ian J. Deary
  - Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, United Kingdom
  - Department of Psychology, University of Edinburgh, Edinburgh, United Kingdom
- France Gagnon
  - Division of Epidemiology, Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
- Bastiaan T. Heijmans
  - Molecular Epidemiology Section, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
- Daniel Levy
  - Framingham Heart Study, Framingham, MA, United States of America
  - The Population Studies Branch, National Heart, Lung, and Blood Institute of the National Institutes of Health, Bethesda, MD, United States of America
- Pierre-Emmanuel Morange
  - Laboratory of Haematology, La Timone Hospital, Marseille, France
  - Institut National pour la Santé et la Recherche Médicale (INSERM), UMR_S 1062, Inra UMR_1260, Aix-Marseille Université, Marseille, France
- Melanie Waldenberger
  - Research Unit of Molecular Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
  - Institute of Epidemiology II, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Sandra G. Heil
  - Department of Clinical Chemistry, Erasmus University Medical Center, Rotterdam, The Netherlands
- Joyce B. J. van Meurs
  - Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, The Netherlands
7
Abstract
Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from formulation of methodological ideas to application of software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the “core team”, facilitating agile statistical omics methodology development and fast dissemination.
Affiliation(s)
- Lennart C Karssen
  - PolyOmica, Groningen, 9722 HC, Netherlands
  - Department of Epidemiology, Erasmus Medical Center, Rotterdam, 3000 CA, Netherlands
- Cornelia M van Duijn
  - Department of Epidemiology, Erasmus Medical Center, Rotterdam, 3000 CA, Netherlands
- Yurii S Aulchenko
  - PolyOmica, Groningen, 9722 HC, Netherlands
  - Institute of Cytology and Genetics, Siberian Division of the Russian Academy of Sciences, Novosibirsk, 630090, Russian Federation
  - Novosibirsk State University, Novosibirsk, 630090, Russian Federation
  - Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Teviot Place, Edinburgh, EH8 9AG, UK
8
Vuckovic D, Dawson S, Scheffer DI, Rantanen T, Morgan A, Di Stazio M, Vozzi D, Nutile T, Concas MP, Biino G, Nolan L, Bahl A, Loukola A, Viljanen A, Davis A, Ciullo M, Corey DP, Pirastu M, Gasparini P, Girotto G. Genome-wide association analysis on normal hearing function identifies PCDH20 and SLC28A3 as candidates for hearing function and loss. Hum Mol Genet 2015; 24:5655-64. [PMID: 26188009 PMCID: PMC4572074 DOI: 10.1093/hmg/ddv279] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 03/16/2015] [Accepted: 07/10/2015] [Indexed: 12/16/2022]
Abstract
Hearing loss and individual differences in normal hearing both have a substantial genetic basis. Although many new genes contributing to deafness have been identified, very little is known about genes/variants modulating the normal range of hearing ability. To fill this gap, we performed a two-stage meta-analysis on hearing thresholds (tested at 0.25, 0.5, 1, 2, 4, 8 kHz) and on pure-tone averages (low-, medium- and high-frequency thresholds grouped) in several isolated populations from Italy and Central Asia (total N = 2636). Here, we detected two genome-wide significant loci close to PCDH20 and SLC28A3 (top hits: rs78043697, P = 4.71E−10 and rs7032430, P = 2.39E−09, respectively). For both loci, we sought replication in two independent cohorts: B58C from the UK (N = 5892) and FITSA from Finland (N = 270). Both loci were successfully replicated at a nominal level of significance (P < 0.05). In order to confirm our quantitative findings, we carried out RT-PCR and reported RNA-Seq data, which showed that both genes are expressed in mouse inner ear, especially in hair cells, further suggesting them as good candidates for modulatory genes in the auditory system. Sequencing data revealed no functional variants in the coding region of PCDH20 or SLC28A3, suggesting that variation in regulatory sequences may affect expression. Overall, these results contribute to a better understanding of the complex mechanisms underlying human hearing function.
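The two significance levels in this abstract can be made concrete with a small check. The sketch below compares the reported top-hit P values against a genome-wide threshold of 5e-8, which is a common convention and an assumption here because the abstract does not state the threshold used, and it records the nominal replication level of 0.05 that the abstract does state. Variable and function names are hypothetical.

```python
# Assumption: conventional genome-wide significance threshold (not stated in the abstract).
GENOME_WIDE_ALPHA = 5e-8
# Nominal replication level stated in the abstract.
NOMINAL_ALPHA = 0.05

# Top-hit P values reported in the abstract.
top_hits = {"rs78043697": 4.71e-10, "rs7032430": 2.39e-9}


def passes(p_value, alpha):
    """True when the P value is strictly below the chosen threshold."""
    return p_value < alpha


for snp, p in top_hits.items():
    print(snp, passes(p, GENOME_WIDE_ALPHA))
```

Both reported discovery P values fall below the assumed 5e-8 threshold, while replication in the independent cohorts only needs to meet the much looser nominal level.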
Affiliation(s)
- Dragana Vuckovic
  - Department of Medical, Surgical and Health Sciences, University of Trieste, Trieste 34100, Italy
- Sally Dawson
  - UCL Ear Institute, University College London, London WC1X 8EE, UK
- Deborah I Scheffer
  - Howard Hughes Medical Institute and Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA
- Taina Rantanen
  - Gerontology Research Center and Department of Health Sciences, University of Jyväskylä, Jyväskylä FI-40014, Finland
- Anna Morgan
  - Department of Medical, Surgical and Health Sciences, University of Trieste, Trieste 34100, Italy
- Mariateresa Di Stazio
  - Department of Medical, Surgical and Health Sciences, University of Trieste, Trieste 34100, Italy
- Diego Vozzi
  - Institute for Maternal and Child Health IRCCS 'Burlo Garofolo', Trieste 34100, Italy
- Teresa Nutile
  - Institute of Genetics and Biophysics 'A. Buzzati-Traverso', CNR, Naples 80131, Italy
- Maria P Concas
  - Institute of Population Genetics, National Research Council of Italy, Sassari 07100, Italy
- Ginevra Biino
  - Institute of Molecular Genetics, National Research Council of Italy, Pavia 27100, Italy
- Lisa Nolan
  - UCL Ear Institute, University College London, London WC1X 8EE, UK
- Aileen Bahl
  - Department of Public Health, Hjelt Institute, University of Helsinki, Helsinki FI-00014, Finland
- Anu Loukola
  - Department of Public Health, Hjelt Institute, University of Helsinki, Helsinki FI-00014, Finland
- Anne Viljanen
  - Gerontology Research Center and Department of Health Sciences, University of Jyväskylä, Jyväskylä FI-40014, Finland
- Adrian Davis
  - UCL Ear Institute, University College London, London WC1X 8EE, UK
- Marina Ciullo
  - Institute of Genetics and Biophysics 'A. Buzzati-Traverso', CNR, Naples 80131, Italy
- David P Corey
  - Howard Hughes Medical Institute and Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA
- Mario Pirastu
  - Institute of Population Genetics, National Research Council of Italy, Sassari 07100, Italy
- Paolo Gasparini
  - Department of Medical, Surgical and Health Sciences, University of Trieste, Trieste 34100, Italy
  - Institute for Maternal and Child Health IRCCS 'Burlo Garofolo', Trieste 34100, Italy
  - Experimental Genetics Division, Sidra, Doha, Qatar
- Giorgia Girotto
  - Department of Medical, Surgical and Health Sciences, University of Trieste, Trieste 34100, Italy
|
9
|
Tsepilov YA, Shin SY, Soranzo N, Spector TD, Prehn C, Adamski J, Kastenmüller G, Wang-Sattler R, Strauch K, Gieger C, Aulchenko YS, Ried JS. Nonadditive Effects of Genes in Human Metabolomics. Genetics 2015; 200:707-18. [PMID: 25977471 PMCID: PMC4512538 DOI: 10.1534/genetics.115.175760] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 02/22/2015] [Accepted: 05/04/2015] [Indexed: 12/30/2022] Open
Abstract
Genome-wide association studies (GWAS) are widely applied to analyze the genetic effects on phenotypes. With the availability of high-throughput technologies for metabolite measurements, GWAS successfully identified loci that affect metabolite concentrations and underlying pathways. In most GWAS, the effect of each SNP on the phenotype is assumed to be additive. Other genetic models, such as recessive, dominant, or overdominant, were considered by only very few studies. In contrast, there are theories that emphasize the relevance of nonadditive effects as a consequence of physiologic mechanisms. This might be especially important for metabolites because these intermediate phenotypes are closer to the underlying pathways than other traits or diseases. In this study we systematically analyzed nonadditive effects on a large panel of serum metabolites and all possible ratios (22,801 total) in a population-based study [Cooperative Health Research in the Region of Augsburg (KORA) F4, N = 1,785]. We applied four different 1-degree-of-freedom (1-df) tests corresponding to an additive, dominant, recessive, and overdominant trait model, as well as a genotypic model with two degrees of freedom (2-df) that allows a more general consideration of genetic effects. Twenty-three loci were found to be genome-wide significantly associated (Bonferroni-corrected P ≤ 2.19 × 10^−12) with at least one metabolite or ratio. For five of them, we show evidence of nonadditive effects. We replicated 17 loci, including 3 loci with nonadditive effects, in an independent study (TwinsUK, N = 846). In conclusion, we found that most genetic effects on metabolite concentrations and ratios were indeed additive, which supports the practice of using the additive model for analyzing SNP effects on metabolites.
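The four 1-df models and the 2-df genotypic model named in this abstract differ only in how a genotype is turned into regression predictors. The following is a minimal sketch of the conventional codings, assuming genotypes are stored as counts (0, 1, 2) of the effect allele; the names `MODEL_CODINGS`, `encode`, and `encode_genotypic` are hypothetical and are not taken from the study.

```python
# Conventional genotype codings for single-SNP association models.
# Assumption: genotypes are counts of the effect allele (0, 1 or 2).
MODEL_CODINGS = {
    "additive":     {0: 0, 1: 1, 2: 2},  # effect proportional to allele count
    "dominant":     {0: 0, 1: 1, 2: 1},  # one copy is enough for the full effect
    "recessive":    {0: 0, 1: 0, 2: 1},  # two copies are needed for an effect
    "overdominant": {0: 0, 1: 1, 2: 0},  # heterozygotes differ from both homozygotes
}


def encode(genotypes, model):
    """Return the single predictor column used by a 1-df test."""
    coding = MODEL_CODINGS[model]
    return [coding[g] for g in genotypes]


def encode_genotypic(genotypes):
    """Return two indicator columns (heterozygote, effect-allele homozygote)
    for the 2-df genotypic model."""
    return [(int(g == 1), int(g == 2)) for g in genotypes]


if __name__ == "__main__":
    sample = [0, 1, 2, 1, 0]
    for name in MODEL_CODINGS:
        print(name, encode(sample, name))
    print("genotypic", encode_genotypic(sample))
```

Each 1-df coding imposes one constraint on the three genotype means, whereas the two indicator columns leave them unconstrained, which is why the genotypic test costs an extra degree of freedom.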
Affiliation(s)
- Yakov A Tsepilov
- Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, 630090 Novosibirsk, Russia; Novosibirsk State University, 630090 Novosibirsk, Russia; Institute of Genetic Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany; Research Unit of Molecular Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany; Institute of Epidemiology II, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany
- So-Youn Shin
- Human Genetics, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, CB10 1HH, United Kingdom; MRC Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Bristol, BS8 1TH, United Kingdom
- Nicole Soranzo
- Human Genetics, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, CB10 1HH, United Kingdom; Department of Haematology, University of Cambridge, Cambridge, CB2 0AH, United Kingdom
- Tim D Spector
- Department of Twin Research and Genetic Epidemiology, King's College London, London, WC2R 2LS, United Kingdom
- Cornelia Prehn
- Institute of Experimental Genetics, Genome Analysis Center, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany
- Jerzy Adamski
- Institute of Experimental Genetics, Genome Analysis Center, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany; Institute of Experimental Genetics, Life and Food Science Center Weihenstephan, Technische Universität München, 85354 Freising-Weihenstephan, Germany; German Center for Diabetes Research, 85764 Neuherberg, Germany
- Gabi Kastenmüller
- Department of Twin Research and Genetic Epidemiology, King's College London, London, WC2R 2LS, United Kingdom; Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany
- Rui Wang-Sattler
- Research Unit of Molecular Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany; Institute of Epidemiology II, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany
- Konstantin Strauch
- Institute of Genetic Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany; Institute of Medical Informatics, Biometry and Epidemiology, Chair of Genetic Epidemiology, Ludwig-Maximilians-Universität, 85764 Neuherberg, Germany
- Christian Gieger
- Institute of Genetic Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany; Institute of Epidemiology II, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany; Research Unit of Molecular Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany
- Yurii S Aulchenko
- Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, 630090 Novosibirsk, Russia; Novosibirsk State University, 630090 Novosibirsk, Russia
- Janina S Ried
- Institute of Genetic Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany
|
10
|
Pirie A, Wood A, Lush M, Tyrer J, Pharoah PDP. The effect of rare variants on inflation of the test statistics in case-control analyses. BMC Bioinformatics 2015; 16:53. [PMID: 25888290 PMCID: PMC4339749 DOI: 10.1186/s12859-015-0496-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 12/05/2014] [Accepted: 02/12/2015] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND: The detection of bias due to cryptic population structure is an important step in the evaluation of findings of genetic association studies. The standard method of measuring this bias in a genetic association study is to compare the observed median association test statistic to the expected median test statistic. This ratio is inflated in the presence of cryptic population structure. However, inflation may also be caused by the properties of the association test itself, particularly in the analysis of rare variants. We compared the properties of the three most commonly used association tests (the likelihood ratio test, the Wald test and the score test) when testing rare variants for association using simulated data. RESULTS: We found evidence of inflation in the median test statistics of the likelihood ratio and score tests for tests of variants with fewer than 20 heterozygotes across the sample, regardless of the total sample size. The test statistics for the Wald test were under-inflated at the median for variants below the same minor allele frequency. CONCLUSIONS: In a genetic association study, if a substantial proportion of the genetic variants tested have rare minor allele frequencies, the properties of the association test may mask the presence or absence of bias due to population structure. The use of either the likelihood ratio test or the score test is likely to lead to inflation in the median test statistic in the absence of population structure. In contrast, the use of the Wald test is likely to result in under-inflation of the median test statistic, which may mask the presence of population structure.
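The ratio of observed to expected median test statistic described in this abstract is commonly reported as the genomic inflation factor. A minimal sketch of that calculation follows, assuming 1-df chi-square test statistics recovered from two-sided P values; the function names are hypothetical and the code is not taken from the study.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom,
# i.e. the squared 0.75 quantile of the standard normal distribution.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2  # about 0.4549


def p_to_chi2(p):
    """Convert a two-sided P value (0 < p < 1) to a 1-df chi-square statistic."""
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return z * z


def inflation_factor(p_values):
    """Observed median test statistic divided by the expected median."""
    observed = [p_to_chi2(p) for p in p_values]
    return median(observed) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Evenly spaced P values mimic a well-calibrated test under the null.
    null_p = [(i + 0.5) / 1000 for i in range(1000)]
    print(round(inflation_factor(null_p), 3))
```

A value near 1 indicates no inflation at the median. As the abstract argues, a departure from 1 can reflect the behaviour of the test on rare variants as well as population structure, so the ratio alone does not identify the cause.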
Affiliation(s)
- Ailith Pirie
- Department of Public Health and Primary Care, Strangeways Research Laboratory, University of Cambridge, 2 Worts' Causeway, Cambridge, CB1 8RN, UK.
- Angela Wood
- Department of Public Health and Primary Care, Strangeways Research Laboratory, University of Cambridge, 2 Worts' Causeway, Cambridge, CB1 8RN, UK.
- Michael Lush
- Department of Public Health and Primary Care, Strangeways Research Laboratory, University of Cambridge, 2 Worts' Causeway, Cambridge, CB1 8RN, UK.
- Jonathan Tyrer
- Department of Oncology, Strangeways Research Laboratory, University of Cambridge, 2 Worts' Causeway, Cambridge, CB1 8RN, UK.
- Paul D P Pharoah
- Department of Public Health and Primary Care, Strangeways Research Laboratory, University of Cambridge, 2 Worts' Causeway, Cambridge, CB1 8RN, UK.
- Department of Oncology, Strangeways Research Laboratory, University of Cambridge, 2 Worts' Causeway, Cambridge, CB1 8RN, UK.
|
11
|
Finno CJ, Aleman M, Higgins RJ, Madigan JE, Bannasch DL. Risk of false positive genetic associations in complex traits with underlying population structure: a case study. Vet J 2014; 202:543-9. [PMID: 25278384 DOI: 10.1016/j.tvjl.2014.09.013] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 02/20/2014] [Revised: 09/08/2014] [Accepted: 09/13/2014] [Indexed: 10/24/2022]
Abstract
Genome-wide association (GWA) studies are widely used to investigate the genetic etiology of diseases in domestic animals. In the horse, GWA studies using 40-50,000 single nucleotide polymorphisms (SNPs) in sample sizes of 30-40 individuals, consisting of only 6-14 affected horses, have led to the discovery of genetic mutations for simple monogenic traits. Equine neuroaxonal dystrophy is a common inherited neurological disorder characterized by symmetric ataxia. A case-control GWA study was performed using genotypes from 42,819 SNP marker loci distributed across the genome in 99 clinically phenotyped Quarter horses (37 affected, 62 unaffected). A significant GWA was not achieved, although a suggestive association was uncovered when only the most stringently phenotyped NAD-affected horses (n = 10) were included (chromosome 8:62130605 and 62134644 [log(1/P) = 5.56]). Candidate genes (PIK3C3, RIT2, and SYT4) within the associated region were excluded through sequencing, association testing of uncovered variants and quantitative RT-PCR. It was concluded that variants in PIK3C3, RIT2, and SYT4 are not responsible for equine neuroaxonal dystrophy. This study demonstrates the risk of false positive associations when performing GWA studies on complex traits with underlying population structure using 40-50,000 SNP markers and small sample sizes.
Affiliation(s)
- Carrie J Finno
- Department of Population Health and Reproduction, University of California, Davis, CA 95616, USA
- Monica Aleman
- Department of Medicine and Epidemiology, University of California, Davis, CA 95616, USA
- Robert J Higgins
- Department of Pathology, Microbiology and Immunology, University of California, Davis, CA 95616, USA
- John E Madigan
- Department of Medicine and Epidemiology, University of California, Davis, CA 95616, USA
- Danika L Bannasch
- Department of Population Health and Reproduction, University of California, Davis, CA 95616, USA
|
12
|
Genome wide association studies using a new nonparametric model reveal the genetic architecture of 17 agronomic traits in an enlarged maize association panel. PLoS Genet 2014; 10:e1004573. [PMID: 25211220 PMCID: PMC4161304 DOI: 10.1371/journal.pgen.1004573] [Citation(s) in RCA: 225] [Impact Index Per Article: 22.5] [Received: 09/19/2013] [Accepted: 06/30/2014] [Indexed: 11/19/2022] Open
Abstract
Association mapping is a powerful approach for dissecting the genetic architecture of complex quantitative traits using high-density SNP markers in maize. Here, we expanded our association panel size from 368 to 513 inbred lines with 0.5 million high-quality SNPs using a two-step data-imputation method that combines identity by descent (IBD) based projection and the k-nearest neighbor (KNN) algorithm. Genome-wide association studies (GWAS) were carried out for 17 agronomic traits with a panel of 513 inbred lines, applying both a mixed linear model (MLM) and a new method, the Anderson-Darling (A-D) test. Ten loci for five traits were identified using the MLM method at the Bonferroni-corrected threshold −log10 (P) >5.74 (α = 1). Between one and 34 loci per trait (107 loci for plant height) were identified for the 17 traits using the A-D test at the Bonferroni-corrected threshold −log10 (P) >7.05 (α = 0.05) using 556809 SNPs. Many known loci and new candidate loci were only observed by the A-D test, a few of which were also detected in independent linkage analysis. This study indicates that combining IBD based projection and the KNN algorithm is an efficient imputation method for inferring large missing genotype segments. In addition, we showed that the A-D test is a useful complement for GWAS analysis of complex quantitative traits. Especially for traits with abnormal phenotype distribution, controlled by moderate effect loci or rare variations, the A-D test balances false positives and statistical power. The candidate SNPs and associated genes also provide a rich resource for maize genetics and breeding.
Genotype imputation has been used widely in the analysis of genome-wide association studies (GWAS) to boost power and fine-map associations. We developed a two-step data imputation method to meet the challenge of a large proportion of missing genotypes. GWAS have uncovered an extensive genetic architecture of complex quantitative traits using high-density SNP markers in maize in the past few years. Here, GWAS were carried out for 17 agronomic traits with a panel of 513 inbred lines, applying both a mixed linear model and a new method, the Anderson-Darling (A-D) test. We intend to show that the A-D test is a complement to current GWAS methods, especially for complex quantitative traits controlled by moderate effect loci or rare variations and with abnormal phenotype distribution. In addition, the trait-associated QTL identified here provide a rich resource for maize genetics and breeding.
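The two thresholds quoted in this entry are consistent with the usual Bonferroni arithmetic, −log10(α/N). A minimal sketch follows, assuming N = 556,809 SNPs applies to both thresholds (the abstract states this count only alongside the A-D test); the function name is hypothetical.

```python
import math


def bonferroni_neg_log10(alpha, n_tests):
    """Return the -log10(P) cutoff for a Bonferroni correction of alpha over n_tests."""
    return -math.log10(alpha / n_tests)


if __name__ == "__main__":
    n_snps = 556809  # SNP count quoted in the abstract
    for alpha in (1, 0.05):
        print(alpha, round(bonferroni_neg_log10(alpha, n_snps), 3))
```

Under this assumption the α = 0.05 cutoff rounds to the reported 7.05, and the α = 1 cutoff comes out near 5.746, which agrees with the reported 5.74 up to rounding. Setting α = 1 corresponds to expecting about one false positive across all tests, a more permissive criterion than α = 0.05.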
|