1
Peng Q, Gilder DA, Bernert RA, Karriker-Jaffe KJ, Ehlers CL. Genetic factors associated with suicidal behaviors and alcohol use disorders in an American Indian population. Mol Psychiatry 2024; 29:902-913. [PMID: 38177348 PMCID: PMC11176067 DOI: 10.1038/s41380-023-02379-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 12/12/2023] [Accepted: 12/13/2023] [Indexed: 01/06/2024]
Abstract
American Indians (AI) demonstrate the highest rates of both suicidal behaviors (SB) and alcohol use disorders (AUD) among all ethnic groups in the US. Rates of suicide and AUD vary substantially between tribal groups and across different geographical regions, underscoring a need to delineate more specific risk and resilience factors. Using data from over 740 AI living within eight contiguous reservations, we assessed genetic risk factors for SB by investigating: (1) possible genetic overlap with AUD, and (2) impacts of rare and low-frequency genomic variants. Suicidal behaviors included lifetime history of suicidal thoughts and acts, including verified suicide deaths, scored using a ranking variable for the SB phenotype (range 0-4). We identified five loci significantly associated with SB and AUD, two of which are intergenic and three intronic on genes AACSP1, ANK1, and FBXO11. Nonsynonymous rare and low-frequency mutations in four genes including SERPINF1 (PEDF), ZNF30, CD34, and SLC5A9, and non-intronic rare and low-frequency mutations in genes OPRD1, HSD17B3 and one lincRNA were significantly associated with SB. One identified pathway related to hypoxia-inducible factor (HIF) regulation, whose 83 nonsynonymous rare and low-frequency variants on 10 genes were significantly linked to SB as well. Four additional genes, and two pathways related to vasopressin-regulated water metabolism and cellular hexose transport, also were strongly associated with SB. This study represents the first investigation of genetic factors for SB in an American Indian population that has high risk for suicide. Our study suggests that bivariate association analysis between comorbid disorders can increase statistical power; and rare and low-frequency variant analysis in a high-risk population enabled by whole-genome sequencing has the potential to identify novel genetic factors. 
Although such findings may be population specific, rare functional mutations relating to PEDF and HIF regulation align with past reports and suggest a biological mechanism for suicide risk and a potential therapeutic target for intervention.
Affiliation(s)
- Qian Peng
  - Department of Neuroscience, The Scripps Research Institute, La Jolla, CA, USA
- David A Gilder
  - Department of Neuroscience, The Scripps Research Institute, La Jolla, CA, USA
- Rebecca A Bernert
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
- Cindy L Ehlers
  - Department of Neuroscience, The Scripps Research Institute, La Jolla, CA, USA
2
Gilder D, Bernert R, Karriker-Jaffe K, Ehlers C, Peng Q. Genetic Factors Associated with Suicidal Behaviors and Alcohol Use Disorders in an American Indian Population. Research Square 2023:rs.3.rs-2950284. [PMID: 37398076 PMCID: PMC10312956 DOI: 10.21203/rs.3.rs-2950284/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
American Indians (AI) demonstrate the highest rates of both suicidal behaviors (SB) and alcohol use disorders (AUD) among all ethnic groups in the US. Rates of suicide and AUD vary substantially between tribal groups and across different geographical regions, underscoring a need to delineate more specific risk and resilience factors. Using data from over 740 AI living within eight contiguous reservations, we assessed genetic risk factors for SB by investigating: (1) possible genetic overlap with AUD, and (2) impacts of rare and low frequency genomic variants. Suicidal behaviors included lifetime history of suicidal thoughts and acts, including verified suicide deaths, scored using a ranking variable for the SB phenotype (range 0-4). We identified five loci significantly associated with SB and AUD, two of which are intergenic and three intronic on genes AACSP1, ANK1, and FBXO11. Nonsynonymous rare mutations in four genes including SERPINF1 (PEDF), ZNF30, CD34, and SLC5A9, and non-intronic rare mutations in genes OPRD1, HSD17B3 and one lincRNA were significantly associated with SB. One identified pathway related to hypoxia-inducible factor (HIF) regulation, whose 83 nonsynonymous rare variants on 10 genes were significantly linked to SB as well. Four additional genes, and two pathways related to vasopressin-regulated water metabolism and cellular hexose transport, also were strongly associated with SB. This study represents the first investigation of genetic factors for SB in an American Indian population that has high risk for suicide. Our study suggests that bivariate association analysis between comorbid disorders can increase statistical power; and rare variant analysis in a high-risk population enabled by whole-genome sequencing has the potential to identify novel genetic factors. 
Although such findings may be population specific, rare functional mutations relating to PEDF and HIF regulation align with past reports and suggest a biological mechanism for suicide risk and a potential therapeutic target for intervention.
3
Zhang T, Ji L, Luo J, Wang W, Tian X, Duan H, Xu C, Zhang D. A genetic correlation and bivariate genome-wide association study of grip strength and depression. PLoS One 2022; 17:e0278392. [PMID: 36520780 PMCID: PMC9754196 DOI: 10.1371/journal.pone.0278392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2022] [Accepted: 11/15/2022] [Indexed: 12/23/2022] Open
Abstract
Grip strength is an important biomarker reflecting muscle strength, and depression is a psychiatric disorder found worldwide. Several studies found a significant inverse association between grip strength and depression, and there is also evidence for common physiological mechanisms between them. We used twin data from Qingdao, China to calculate genetic correlations, and we performed a bivariate GWAS to explore potential SNPs, genes, and pathways in common between grip strength and depression. A total of 139 pairs of dizygotic twins were used for the bivariate GWAS. VEAGSE2 and PASCAL software were used for gene-based analysis and pathway enrichment analysis, respectively. The resulting SNPs were then subjected to eQTL analysis and pleiotropy analysis. The genetic correlation coefficient between grip strength and depression was -0.41 (-0.96, -0.15). In SNP-based analysis, 7 SNPs exceeded the genome-wide significance level (P < 5×10⁻⁸) and a total of 336 SNPs reached the level of suggestive significance (P < 1×10⁻⁵). Gene-based analysis and pathway-based analysis identified genes and pathways related to muscle strength and the nervous system. The results of eQTL analysis were mainly enriched in tissues such as the brain, thyroid, and skeletal muscle. Pleiotropy analysis showed that 9 of the 15 top SNPs were associated with both grip strength and depression. In conclusion, this bivariate GWAS identified potentially common pleiotropic SNPs, genes, and pathways in grip strength and depression.
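As a small illustration of the two significance levels quoted in this abstract (genome-wide, P < 5×10⁻⁸, and suggestive, P < 1×10⁻⁵), the sketch below labels p-values by threshold. The SNP names and p-values are invented for illustration and are not results from the study; treating the two labels as mutually exclusive is also an assumption, since the abstract does not say whether the suggestive count includes the genome-wide hits.

```python
# Thresholds as quoted in the abstract.
GENOME_WIDE = 5e-8
SUGGESTIVE = 1e-5


def classify(p_value: float) -> str:
    """Label a p-value using the two thresholds (strict inequalities)."""
    if p_value < GENOME_WIDE:
        return "genome-wide"
    if p_value < SUGGESTIVE:
        return "suggestive"
    return "not significant"


# Hypothetical SNP names and p-values, for illustration only.
example = {"snp_a": 3.2e-9, "snp_b": 4.0e-6, "snp_c": 0.02}
labels = {snp: classify(p) for snp, p in example.items()}
print(labels)
```

Because the comparisons are strict, a p-value exactly equal to 5×10⁻⁸ falls in the suggestive band under this sketch.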
Affiliation(s)
- Tianhao Zhang
  - Department of Epidemiology and Health Statistics, Public Health College, Qingdao University, Qingdao, Shandong Province, China
- Lujun Ji
  - Department of Epidemiology and Health Statistics, Public Health College, Qingdao University, Qingdao, Shandong Province, China
- Jia Luo
  - Department of Epidemiology and Health Statistics, Public Health College, Qingdao University, Qingdao, Shandong Province, China
- Weijing Wang
  - Department of Epidemiology and Health Statistics, Public Health College, Qingdao University, Qingdao, Shandong Province, China
- Xiaocao Tian
  - Qingdao Municipal Center for Disease Control and Prevention, Qingdao Institute of Preventive Medicine, Qingdao, Shandong, China
- Haiping Duan
  - Qingdao Municipal Center for Disease Control and Prevention, Qingdao Institute of Preventive Medicine, Qingdao, Shandong, China
- Chunsheng Xu
  - Qingdao Municipal Center for Disease Control and Prevention, Qingdao Institute of Preventive Medicine, Qingdao, Shandong, China
- Dongfeng Zhang
  - Department of Epidemiology and Health Statistics, Public Health College, Qingdao University, Qingdao, Shandong Province, China
4
Tan VY, Timpson NJ. The UK Biobank: A Shining Example of Genome-Wide Association Study Science with the Power to Detect the Murky Complications of Real-World Epidemiology. Annu Rev Genomics Hum Genet 2022; 23:569-589. [PMID: 35508184 DOI: 10.1146/annurev-genom-121321-093606] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/09/2022]
Abstract
Genome-wide association studies (GWASs) have successfully identified thousands of genetic variants that are reliably associated with human traits. Although GWASs are restricted to certain variant frequencies, they have improved our understanding of the genetic architecture of complex traits and diseases. The UK Biobank (UKBB) has brought substantial analytical opportunity and performance to association studies. The dramatic expansion of many GWAS sample sizes afforded by the inclusion of UKBB data has improved the power of estimation of effect sizes but, critically, has done so in a context where phenotypic depth and precision enable outcome dissection and the application of epidemiological approaches. However, at the same time, the availability of such a large, well-curated, and deeply measured population-based collection has the capacity to increase our exposure to the many complications and inferential complexities associated with GWASs and other analyses. In this review, we discuss the impact that UKBB has had in the GWAS era, some of the opportunities that it brings, and exemplar challenges that illustrate the reality of using data from this world-leading resource.
Affiliation(s)
- Vanessa Y Tan
  - Medical Research Council (MRC) Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
  - Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Nicholas J Timpson
  - Medical Research Council (MRC) Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
  - Bristol Medical School, University of Bristol, Bristol, United Kingdom
5
Sun J, Wang W, Zhang R, Duan H, Tian X, Xu C, Li X, Zhang D. Multivariate genome-wide association study of depression, cognition, and memory phenotypes and validation analysis identify 12 cross-ethnic variants. Transl Psychiatry 2022; 12:304. [PMID: 35907915 PMCID: PMC9338946 DOI: 10.1038/s41398-022-02074-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/13/2021] [Revised: 07/15/2022] [Accepted: 07/19/2022] [Indexed: 11/10/2022] Open
Abstract
To date, little is known about the pleiotropic genetic variants among depression, cognition, and memory. The current research aimed to identify the potential pleiotropic single nucleotide polymorphisms (SNPs), genes, and pathways of the three phenotypes by conducting a multivariate genome-wide association study and an additional pleiotropy analysis among Chinese individuals and further validate the top variants in the UK Biobank (UKB). In the discovery phase, the participants were 139 pairs of dizygotic twins from the Qingdao Twins Registry. The genome-wide efficient mixed-model analysis identified 164 SNPs reaching suggestive significance (P < 1 × 10⁻⁵). Among them, rs3967317 (P = 1.21 × 10⁻⁸) exceeded the genome-wide significance level (P < 5 × 10⁻⁸) and was also demonstrated to be associated with depression and memory in pleiotropy analysis, followed by rs9863698, rs3967316, and rs9261381 (P = 7.80 × 10⁻⁸ to 5.68 × 10⁻⁷), which were associated with all three phenotypes. After imputation, a total of 457 SNPs reached suggestive significance. The top SNP chr6:24597173 was located in the KIAA0319 gene, which had biased expression in brain tissues. Genes and pathways related to metabolism, immunity, and neuronal systems demonstrated nominal significance (P < 0.05) in gene-based and pathway enrichment analyses. In the validation phase, 12 of the abovementioned SNPs reached the nominal significance level (P < 0.05) in the UKB. Among them, three SNPs were located in the KIAA0319 gene, and four SNPs were identified as significant expression quantitative trait loci in brain tissues. These findings may provide evidence for pleiotropic variants among depression, cognition, and memory and clues for further exploring the shared genetic pathogenesis of depression with Alzheimer's disease.
Affiliation(s)
- Jing Sun
  - Department of Epidemiology and Health Statistics, The School of Public Health of Qingdao University, Qingdao, Shandong Province, China
  - Department of Big Data in Health Science School of Public Health, Center of Clinical Big Data and Analytics of The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Weijing Wang
  - Department of Epidemiology and Health Statistics, The School of Public Health of Qingdao University, Qingdao, Shandong Province, China
- Ronghui Zhang
  - Department of Epidemiology and Health Statistics, The School of Public Health of Qingdao University, Qingdao, Shandong Province, China
- Haiping Duan
  - Qingdao Municipal Center for Disease Control and Prevention, No. 175 Shandong Road, Shibei District, Qingdao, Shandong Province, China
- Xiaocao Tian
  - Qingdao Municipal Center for Disease Control and Prevention, No. 175 Shandong Road, Shibei District, Qingdao, Shandong Province, China
- Chunsheng Xu
  - Qingdao Municipal Center for Disease Control and Prevention, No. 175 Shandong Road, Shibei District, Qingdao, Shandong Province, China
- Xue Li
  - Department of Big Data in Health Science School of Public Health, Center of Clinical Big Data and Analytics of The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Dongfeng Zhang
  - Department of Epidemiology and Health Statistics, The School of Public Health of Qingdao University, Qingdao, Shandong Province, China
6
Liu F, Zhou Z, Cai M, Wen Y, Zhang J. AGNEP: An Agglomerative Nesting Clustering Algorithm for Phenotypic Dimension Reduction in Joint Analysis of Multiple Phenotypes. Front Genet 2021; 12:648831. [PMID: 33981331 PMCID: PMC8107386 DOI: 10.3389/fgene.2021.648831] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/02/2021] [Accepted: 04/01/2021] [Indexed: 11/17/2022] Open
Abstract
Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with complex traits and diseases. Compared with analyzing a single phenotype at a time, the joint analysis of multiple phenotypes can improve statistical power by taking into account the information from phenotypes. However, most established joint algorithms ignore the different levels of correlation among phenotypes and instead analyze all phenotypes simultaneously in a single genetic model. Thus, they may fail to capture the genetic structure of phenotypes and consequently reduce the statistical power. In this study, we develop a novel method, the agglomerative nesting clustering algorithm for phenotypic dimension reduction analysis (AGNEP), to jointly analyze multiple phenotypes for GWAS. First, AGNEP uses an agglomerative nesting clustering algorithm to group correlated phenotypes and then applies principal component analysis (PCA) to generate representative phenotypes for each group. Finally, multivariate analysis is employed to test associations between genetic variants and the representative phenotypes rather than all phenotypes. We perform three simulation experiments with various genetic structures and a real dataset analysis for 19 Arabidopsis phenotypes. Compared to established methods, AGNEP performs better in terms of statistical power, computing time, and the number of quantitative trait nucleotides (QTNs). The analysis of the Arabidopsis real dataset further illustrates the efficiency of AGNEP for detecting QTNs, which are confirmed by The Arabidopsis Information Resource gene bank.
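The first step described in this abstract, grouping correlated phenotypes bottom-up, can be sketched roughly as follows. This is not the AGNEP implementation: the distance (1 - |r|), the average linkage, the stopping threshold `max_dist`, and the phenotype names and values are all assumptions made for illustration, and the PCA and multivariate testing steps are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def group_phenotypes(phenos, max_dist=0.5):
    """Agglomerative grouping of phenotypes with average linkage.

    phenos maps a phenotype name to its values (same individuals, same order).
    Distance is 1 - |r|; merging stops once the closest pair of clusters is
    farther apart than max_dist. Both choices are assumptions of this sketch.
    """
    names = list(phenos)
    dist = {(a, b): 1 - abs(pearson(phenos[a], phenos[b]))
            for a in names for b in names if a != b}
    clusters = [[name] for name in names]  # start with one cluster per phenotype

    def linkage(c1, c2):
        # Average pairwise distance between members of two clusters.
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > max_dist:
            break
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


# Hypothetical phenotypes: p1/p2 are strongly correlated, as are p3/p4.
example = {
    "p1": [1, 1, 1, 1, -1, -1, -1, -1],
    "p2": [3, 3, 1, 1, -1, -1, -3, -3],
    "p3": [1, -1, 1, -1, 1, -1, 1, -1],
    "p4": [4, -2, 2, -4, 4, -2, 2, -4],
}
groups = group_phenotypes(example)
print(groups)
```

In a fuller pipeline, each resulting group would then be reduced to a representative phenotype (the abstract names PCA for this) before association testing.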
Affiliation(s)
- Fengrong Liu
  - College of Science, Nanjing Agricultural University, Nanjing, China
  - School of Data Science, University of Science and Technology of China, Hefei, China
- Ziyang Zhou
  - College of Science, Nanjing Agricultural University, Nanjing, China
- Mingzhi Cai
  - College of Science, Nanjing Agricultural University, Nanjing, China
- Yangjun Wen
  - College of Science, Nanjing Agricultural University, Nanjing, China
- Jin Zhang
  - College of Science, Nanjing Agricultural University, Nanjing, China
  - Postdoctoral Research Station of Crop Science, Nanjing Agricultural University, Nanjing, China
7
Abstract
Conventional longitudinal behavioral genetic models estimate the relative contribution of genetic and environmental factors to stability and change of traits and behaviors. Longitudinal models rarely explain the processes that generate observed differences between genetically and socially related individuals. We propose that exchanges between individuals and their environments (i.e., phenotype-environment effects) can explain the emergence of observed differences over time. Phenotype-environment models, however, would require violation of the independence assumption of standard behavioral genetic models; that is, uncorrelated genetic and environmental factors. We review how specification of phenotype-environment effects contributes to understanding observed changes in genetic variability over time and longitudinal correlations among nonshared environmental factors. We then provide an example using 30 days of positive and negative affect scores from an all-female sample of twins. Results demonstrate that the phenotype-environment effects explain how heritability estimates fluctuate as well as how nonshared environmental factors persist over time. We discuss possible mechanisms underlying change in gene-environment correlation over time, the advantages and challenges of including gene-environment correlation in longitudinal twin models, and recommendations for future research.
8
Ribeiro AH, Maria Pavan Soler J. Learning genetic and environmental graphical models from family data. Stat Med 2020; 39:2403-2422. [PMID: 32346898 DOI: 10.1002/sim.8545] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/04/2019] [Revised: 02/23/2020] [Accepted: 03/13/2020] [Indexed: 11/08/2022]
Abstract
Many challenging problems in biomedical research rely on understanding how variables are associated with each other and influenced by genetic and environmental factors. Probabilistic graphical models (PGMs) are widely acknowledged as a very natural and formal language to describe relationships among variables and have been extensively used for studying complex diseases and traits. In this work, we propose methods that leverage observational Gaussian family data for learning a decomposition of undirected and directed acyclic PGMs according to the influence of genetic and environmental factors. Many structure learning algorithms are strongly based on a conditional independence test. For independent measurements of normally distributed variables, conditional independence can be tested through standard tests for zero partial correlation. In family data, the assumption of independent measurements does not hold since related individuals are correlated due to mainly genetic factors. Based on univariate polygenic linear mixed models, we propose tests that account for the familial dependence structure and allow us to assess the significance of the partial correlation due to genetic (between-family) factors and due to other factors, denoted here as environmental (within-family) factors, separately. Then, we extend standard structure learning algorithms, including the IC/PC and the really fast causal inference (RFCI) algorithms, to Gaussian family data. The algorithms learn the most likely PGM and its decomposition into two components, one explained by genetic factors and the other by environmental factors. The proposed methods are evaluated by simulation studies and applied to the Genetic Analysis Workshop 13 simulated dataset, which captures significant features of the Framingham Heart Study.
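For reference, the standard test for zero partial correlation that this abstract mentions for independent, normally distributed measurements can be sketched as below, using the usual Fisher z-transform. It is not the family-data test proposed in the paper, which builds on polygenic mixed models. The function names and the toy data are hypothetical, and the sketch conditions on a single variable only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given one variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def fisher_z_pvalue(r, n, k=1):
    """Two-sided p-value for H0: partial correlation = 0.

    n is the sample size and k the number of conditioning variables;
    assumes independent Gaussian observations and n > k + 3.
    """
    z = math.atanh(r) * math.sqrt(n - k - 3)
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


# Toy data: x and y are correlated only through their shared dependence on z.
z = [1, 1, 1, 1, -1, -1, -1, -1]
x = [3, 3, 1, 1, -1, -1, -3, -3]   # 2*z plus a pattern orthogonal to z
y = [3, 1, 3, 1, -1, -3, -1, -3]   # 2*z plus a different orthogonal pattern

r_marginal = pearson(x, y)
r_partial = partial_corr(x, y, z)
p_partial = fisher_z_pvalue(r_partial, n=len(x), k=1)
print(r_marginal, r_partial, p_partial)
```

In this toy case the marginal correlation is high while the partial correlation given `z` vanishes, which is the pattern a conditional independence test is meant to detect.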
Affiliation(s)
- Adèle H Ribeiro
  - Department of Computer Science, Institute of Mathematics and Statistics, University of São Paulo (IME-USP), São Paulo, Brazil
- Júlia Maria Pavan Soler
  - Department of Statistics, Institute of Mathematics and Statistics, University of São Paulo (IME-USP), São Paulo, Brazil
9
Nguyen TH, Dobbyn A, Brown RC, Riley BP, Buxbaum JD, Pinto D, Purcell SM, Sullivan PF, He X, Stahl EA. mTADA is a framework for identifying risk genes from de novo mutations in multiple traits. Nat Commun 2020; 11:2929. [PMID: 32522981 PMCID: PMC7287090 DOI: 10.1038/s41467-020-16487-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/02/2018] [Accepted: 05/06/2020] [Indexed: 11/12/2022] Open
Abstract
Joint analysis of multiple traits can result in the identification of associations not found through the analysis of each trait in isolation. Studies of neuropsychiatric disorders and congenital heart disease (CHD) which use de novo mutations (DNMs) from parent-offspring trios have reported multiple putatively causal genes. However, a joint analysis method designed to integrate DNMs from multiple studies has yet to be implemented. We here introduce multiple-trait TADA (mTADA) which jointly analyzes two traits using DNMs from non-overlapping family samples. We first demonstrate that mTADA is able to leverage genetic overlaps to increase the statistical power of risk-gene identification. We then apply mTADA to large datasets of >13,000 trios for five neuropsychiatric disorders and CHD. We report additional risk genes for schizophrenia, epileptic encephalopathies and CHD. We outline some shared and specific biological information of intellectual disability and CHD by conducting systems biology analyses of genes prioritized by mTADA.
Affiliation(s)
- Tan-Hoang Nguyen
  - Division of Psychiatric Genomics, Department of Genetics and Genomic Sciences, Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
- Amanda Dobbyn
  - Division of Psychiatric Genomics, Department of Genetics and Genomic Sciences, Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ruth C Brown
  - Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
- Brien P Riley
  - Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
- Joseph D Buxbaum
  - Seaver Autism Center, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Dalila Pinto
  - Seaver Autism Center, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - The Mindich Child Health & Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Shaun M Purcell
  - Sleep Center, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Patrick F Sullivan
  - Departments of Genetics and Psychiatry, University of North Carolina, Chapel Hill, NC, USA
- Xin He
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
  - Grossman Institute for Neuroscience, Quantitative Biology and Human Behavior, University of Chicago, Chicago, IL, USA
- Eli A Stahl
  - Division of Psychiatric Genomics, Department of Genetics and Genomic Sciences, Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
10
Blancon J, Dutartre D, Tixier MH, Weiss M, Comar A, Praud S, Baret F. A High-Throughput Model-Assisted Method for Phenotyping Maize Green Leaf Area Index Dynamics Using Unmanned Aerial Vehicle Imagery. Frontiers in Plant Science 2019; 10:685. [PMID: 31231403 PMCID: PMC6568052 DOI: 10.3389/fpls.2019.00685] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 01/10/2019] [Accepted: 05/07/2019] [Indexed: 05/19/2023]
Abstract
The dynamics of the Green Leaf Area Index (GLAI) is of great interest for numerous applications such as yield prediction and plant breeding. We present a high-throughput model-assisted method for characterizing GLAI dynamics in maize (Zea mays subsp. mays) using multispectral imagery acquired from an Unmanned Aerial Vehicle (UAV). Two trials were conducted with a high diversity panel of 400 lines under well-watered and water-deficient treatments in 2016 and 2017. For each UAV flight, we first derived GLAI estimates from empirical relationships between the multispectral reflectance and ground level measurements of GLAI achieved over a small sample of microplots. We then fitted a simple but physiologically sound GLAI dynamics model over the GLAI values estimated previously. Results show that GLAI dynamics was estimated accurately throughout the cycle (R² > 0.9). Two parameters of the model, biggest leaf area and leaf longevity, were also estimated successfully. We showed that GLAI dynamics and the parameters of the fitted model are highly heritable (0.65 ≤ H² ≤ 0.98), responsive to environmental conditions, and linked to yield and drought tolerance. This method, combining growth modeling, UAV imagery and simple non-destructive field measurements, provides new high-throughput tools for understanding the adaptation of GLAI dynamics and its interaction with the environment. GLAI dynamics is also a promising trait for crop breeding, and paves the way for future genetic studies.
Affiliation(s)
- Justin Blancon
  - Biogemma, Centre de Recherche de Chappes, Chappes, France
- Marie Weiss
  - INRA UMR 114 EMMAH, UMT CAPTE, Domaine Saint-Paul, Avignon, France
- Frédéric Baret
  - INRA UMR 114 EMMAH, UMT CAPTE, Domaine Saint-Paul, Avignon, France
11
Loohuis LM, Albersen M, de Jong S, Wu T, Luykx JJ, Jans JJM, Verhoeven-Duif NM, Ophoff RA. The Alkaline Phosphatase (ALPL) Locus Is Associated with B6 Vitamer Levels in CSF and Plasma. Genes (Basel) 2018; 10:genes10010008. [PMID: 30583557 PMCID: PMC6357176 DOI: 10.3390/genes10010008] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 12/03/2018] [Revised: 12/13/2018] [Accepted: 12/14/2018] [Indexed: 01/27/2023] Open
Abstract
The active form of vitamin B6, pyridoxal phosphate (PLP), is essential for human metabolism. The brain is dependent on vitamin B6 for its neurotransmitter balance. To obtain insight into the genetic determinants of vitamin B6 homeostasis, we conducted a genome-wide association study (GWAS) of the B6 vitamers pyridoxal (PL), PLP and the degradation product of vitamin B6, pyridoxic acid (PA). We collected a unique sample set of cerebrospinal fluid (CSF) and plasma from the same healthy human subjects of Dutch ancestry (n = 493) and included concentrations and ratios in and between these body fluids in our analysis. Based on a multivariate joint analysis of all B6 vitamers and their ratios, we identified a genome-wide significant association at a locus on chromosome 1 containing the ALPL (alkaline phosphatase) gene (minimal p = 7.89 × 10⁻¹⁰, rs1106357, minor allele frequency (MAF) = 0.46), previously associated with vitamin B6 levels in blood. Subjects homozygous for the minor allele showed a 1.4-times-higher ratio between PLP and PL in plasma, and even a 1.6-times-higher ratio between PLP and PL in CSF than subjects homozygous for the major allele. In addition, we observed a suggestive association with the CSF:plasma ratio of PLP on chromosome 15 (minimal p = 7.93 × 10⁻⁷, and MAF = 0.06 for rs28789220). Even though this finding does not reach genome-wide significance, it highlights the potential of our experimental setup for studying transport and metabolism across the blood–CSF barrier. This GWAS of B6 vitamers identifies alkaline phosphatase as a key regulator in human vitamin B6 metabolism in CSF as well as plasma. Furthermore, our results demonstrate the potential of genetic studies of metabolites in plasma and CSF to elucidate biological aspects underlying metabolite generation, transport and degradation.
Collapse
Affiliation(s)
- Loes M Loohuis
  - Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles (UCLA), Los Angeles, CA 90095, USA.
- Monique Albersen
  - Section Metabolic Diagnostics, Department of Genetics, University Medical Center (UMC), 3584 EA, Utrecht, The Netherlands.
- Simone de Jong
  - Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles (UCLA), Los Angeles, CA 90095, USA.
- Timothy Wu
  - Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles (UCLA), Los Angeles, CA 90095, USA.
- Jurjen J Luykx
  - Department of Psychiatry, Rudolf Magnus Institute of Neuroscience, University Medical Center (UMC), 3584 CG, Utrecht, The Netherlands.
  - Department of Translational Neuroscience, Human Neurogenetics Unit, Brain Center Rudolf Magnus, University Medical Center Utrecht (UMC), 3584 CG, Utrecht, The Netherlands.
- Judith J M Jans
  - Section Metabolic Diagnostics, Department of Genetics, University Medical Center (UMC), 3584 EA, Utrecht, The Netherlands.
- Nanda M Verhoeven-Duif
  - Section Metabolic Diagnostics, Department of Genetics, University Medical Center (UMC), 3584 EA, Utrecht, The Netherlands.
- Roel A Ophoff
  - Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles (UCLA), Los Angeles, CA 90095, USA.
|
12
|
Pecanka J, van der Vaart AW, Jonker MA. Modeling association between multivariate correlated outcomes and high-dimensional sparse covariates: the adaptive SVS method. J Appl Stat 2018. [DOI: 10.1080/02664763.2018.1523377] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/28/2022]
Affiliation(s)
- J. Pecanka
  - Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, Netherlands
- A. W. van der Vaart
  - Mathematical Institute, Faculty of Science, Leiden University, Leiden, Netherlands
- M. A. Jonker
  - Department for Health Evidence – Biostatistics, Radboud University Medical Center, Nijmegen, Netherlands
|
13
|
Rudra P, Broadaway KA, Ware EB, Jhun MA, Bielak LF, Zhao W, Smith JA, Peyser PA, Kardia SL, Epstein MP, Ghosh D. Testing cross-phenotype effects of rare variants in longitudinal studies of complex traits. Genet Epidemiol 2018; 42:320-332. [PMID: 29601641 PMCID: PMC5980726 DOI: 10.1002/gepi.22121] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/04/2017] [Revised: 01/19/2018] [Accepted: 02/19/2018] [Indexed: 01/09/2023]
Abstract
Many gene mapping studies of complex traits have identified genes or variants that influence multiple phenotypes. With the advent of next-generation sequencing technology, there has been substantial interest in identifying rare variants in genes that possess cross-phenotype effects. In the presence of such effects, modeling both the phenotypes and rare variants collectively using multivariate models can achieve higher statistical power compared to univariate methods that either model each phenotype separately or perform separate tests for each variant. Several studies collect phenotypic data over time and using such longitudinal data can further increase the power to detect genetic associations. Although rare-variant approaches exist for testing cross-phenotype effects at a single time point, there is no analogous method for performing such analyses using longitudinal outcomes. In order to fill this important gap, we propose an extension of Gene Association with Multiple Traits (GAMuT) test, a method for cross-phenotype analysis of rare variants using a framework based on the distance covariance. The approach allows for both binary and continuous phenotypes and can also adjust for covariates. Our simple adjustment to the GAMuT test allows it to handle longitudinal data and to gain power by exploiting temporal correlation. The approach is computationally efficient and applicable on a genome-wide scale due to the use of a closed-form test whose significance can be evaluated analytically. We use simulated data to demonstrate that our method has favorable power over competing approaches and also apply our approach to exome chip data from the Genetic Epidemiology Network of Arteriopathy.
Affiliation(s)
- Pratyaydipta Rudra
  - Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO
- Erin B. Ware
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI
  - Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI
- Min A. Jhun
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI
- Wei Zhao
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI
- Debashis Ghosh
  - Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO
|
14
|
Wu Y, Duan H, Tian X, Xu C, Wang W, Jiang W, Pang Z, Zhang D, Tan Q. Genetics of Obesity Traits: A Bivariate Genome-Wide Association Analysis. Front Genet 2018; 9:179. [PMID: 29868124 PMCID: PMC5964872 DOI: 10.3389/fgene.2018.00179] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 10/19/2017] [Accepted: 04/30/2018] [Indexed: 12/19/2022] Open
Abstract
Previous genome-wide association studies on anthropometric measurements have identified more than 100 related loci, but only a small portion of heritability in obesity was explained. Here we present a bivariate twin study to look for the genetic variants associated with body mass index and waist-hip ratio, and to explore the obesity-related pathways in Northern Han Chinese. Cholesky decomposition model for 242 monozygotic and 140 dizygotic twin pairs indicated a moderate genetic correlation (r = 0.53, 95%CI: 0.42-0.64) between body mass index and waist-hip ratio. Bivariate genome-wide association analysis in 139 dizygotic twin pairs identified 26 associated SNPs with p < 10^-5. Further gene-based analysis found 291 nominally associated genes (P < 0.05), including F12, HCRTR1, PHOSPHO1, DOCK2, DOCK6, DGKB, GLP1R, TRHR, MMP1, GPR55, CCK, and OR2AK2, as well as 6 enriched gene-sets with FDR < 0.05. Expression quantitative trait loci analysis identified rs2242044 as a significant cis-eQTL in both the normal adipose-subcutaneous (P = 1.7 × 10^-9) and adipose-visceral (P = 4.4 × 10^-15) tissue. These findings may provide an important entry point to unravel genetic pleiotropy in obesity traits.
Affiliation(s)
- Yili Wu
  - Department of Epidemiology and Health Statistics, Public Health College, Qingdao University, Qingdao, China
- Haiping Duan
  - Department of Epidemiology and Health Statistics, Public Health College, Qingdao University, Qingdao, China
  - Qingdao Municipal Center for Disease Control and Prevention, Qingdao, China
- Xiaocao Tian
  - Qingdao Municipal Center for Disease Control and Prevention, Qingdao, China
- Chunsheng Xu
  - Department of Epidemiology and Health Statistics, Public Health College, Qingdao University, Qingdao, China
  - Qingdao Municipal Center for Disease Control and Prevention, Qingdao, China
- Weijing Wang
  - Department of Epidemiology and Health Statistics, Public Health College, Qingdao University, Qingdao, China
- Wenjie Jiang
  - Department of Epidemiology and Health Statistics, Public Health College, Qingdao University, Qingdao, China
- Zengchang Pang
  - Qingdao Municipal Center for Disease Control and Prevention, Qingdao, China
- Dongfeng Zhang
  - Department of Epidemiology and Health Statistics, Public Health College, Qingdao University, Qingdao, China
- Qihua Tan
  - Epidemiology and Biostatistics, Department of Public Health, University of Southern Denmark, Odense, Denmark
  - Unit of Human Genetics, Department of Clinical Research, University of Southern Denmark, Odense, Denmark
|
15
|
Lin YF, Chen CY, Öngür D, Betensky R, Smoller JW, Blacker D, Hall MH. Polygenic pleiotropy and potential causal relationships between educational attainment, neurobiological profile, and positive psychotic symptoms. Transl Psychiatry 2018; 8:97. [PMID: 29765027 PMCID: PMC5954124 DOI: 10.1038/s41398-018-0144-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/16/2017] [Revised: 01/30/2018] [Accepted: 04/03/2018] [Indexed: 11/23/2022] Open
Abstract
Event-related potential (ERP) components have been used to assess cognitive functions in patients with psychotic illness. Evidence suggests that among patients with psychosis there is a distinct heritable neurophysiologic phenotypic subtype captured by impairments across a range of ERP measures. In this study, we investigated the genetic basis of this "globally impaired" ERP cluster and its relationship to psychosis and cognitive abilities. We applied K-means clustering to six ERP measures to re-derive the globally impaired (n = 60) and the non-globally impaired ERP clusters (n = 323) in a sample of cases with schizophrenia (SCZ = 136) or bipolar disorder (BPD = 121) and healthy controls (n = 126). We used genome-wide association study (GWAS) results for SCZ, BPD, college completion, and childhood intelligence as the discovery datasets to derive polygenic risk scores (PRS) in our study sample and tested their associations with globally impaired ERP. We conducted mediation analyses to estimate the proportion of each PRS effect on severity of psychotic symptoms that is mediated through membership in the globally impaired ERP. Individuals with globally impaired ERP had significantly higher PANSS-positive scores (β = 3.95, P = 0.005). The SCZ-PRS was nominally associated with globally impaired ERP (unadjusted P = 0.01; R^2 = 3.07%). We also found a significant positive association between the college-PRS and globally impaired ERP (FDR-corrected P = 0.004; R^2 = 6.15%). The effect of college-PRS on PANSS positivity was almost entirely (97.1%) mediated through globally impaired ERP. These results suggest that the globally impaired ERP phenotype may represent some aspects of brain physiology on the path between genetic influences on educational attainment and psychotic symptoms.
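For readers unfamiliar with polygenic risk scores, the following is a minimal sketch of the usual weighted-sum definition: each individual's score is the sum of effect-allele dosages weighted by per-allele effect sizes from a discovery GWAS. It is not the pipeline used in the study above; the function name, SNP identifiers, and weights are hypothetical, and steps such as clumping and p-value thresholding are omitted.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one individual.

    dosages: dict mapping SNP id -> effect-allele count (0, 1, 2, or an
             imputed fractional dosage) in the target sample.
    weights: dict mapping SNP id -> per-allele effect size taken from a
             discovery GWAS (hypothetical values in the example below).
    SNPs that have a weight but no dosage are skipped.
    """
    return sum(w * dosages[snp] for snp, w in weights.items() if snp in dosages)


# Hypothetical discovery weights and one individual's genotypes.
example_weights = {"snp_a": 0.2, "snp_b": -0.1, "snp_c": 0.05}
example_dosages = {"snp_a": 2, "snp_b": 1}  # snp_c was not genotyped
example_score = polygenic_risk_score(example_dosages, example_weights)
```

Skipping missing SNPs is only one possible convention; imputing the mean dosage is another, and the choice affects how scores compare across individuals.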
Affiliation(s)
- Yen-Feng Lin
  - Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
  - Department of Psychiatry, Taipei City Psychiatric Center, Taipei City Hospital, Taipei, Taiwan
- Chia-Yen Chen
  - Psychiatric and Neurodevelopmental Genetics Unit, Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- Dost Öngür
  - Department of Psychiatry, Harvard Medical School, Boston, MA, USA
  - Psychotic Disorders Division, McLean Hospital, Belmont, MA, USA
- Rebecca Betensky
  - Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Jordan W. Smoller
  - Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
  - Psychiatric and Neurodevelopmental Genetics Unit, Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- Deborah Blacker
  - Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
  - Department of Psychiatry, Harvard Medical School, Boston, MA, USA
  - Gerontology Research Unit, Massachusetts General Hospital, Boston, MA, USA
- Mei-Hua Hall
  - Department of Psychiatry, Harvard Medical School, Boston, MA, USA
  - Psychosis Neurobiology Laboratory, McLean Hospital, Belmont, MA, USA
|
16
|
Salinas YD, Wang Z, DeWan AT. Statistical Analysis of Multiple Phenotypes in Genetic Epidemiologic Studies: From Cross-Phenotype Associations to Pleiotropy. Am J Epidemiol 2018; 187:855-863. [PMID: 29020254 DOI: 10.1093/aje/kwx296] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 02/07/2017] [Accepted: 08/03/2017] [Indexed: 12/15/2022] Open
Abstract
In the context of genetics, pleiotropy refers to the phenomenon in which a single genetic locus affects more than 1 trait or disease. Genetic epidemiologic studies have identified loci associated with multiple phenotypes, and these cross-phenotype associations are often incorrectly interpreted as examples of pleiotropy. Pleiotropy is only one possible explanation for cross-phenotype associations. Cross-phenotype associations may also arise due to issues related to study design, confounder bias, or nongenetic causal links between the phenotypes under analysis. Therefore, it is necessary to dissect cross-phenotype associations carefully to uncover true pleiotropic loci. In this review, we describe statistical methods that can be used to identify robust statistical evidence of pleiotropy. First, we provide an overview of univariate and multivariate methods for discovery of cross-phenotype associations and highlight important considerations for choosing among available methods. Then, we describe how to dissect cross-phenotype associations by using mediation analysis. Pleiotropic loci provide insights into the mechanistic underpinnings of disease comorbidity, and they may serve as novel targets for interventions that simultaneously treat multiple diseases. Discerning between different types of cross-phenotype associations is necessary to realize the public health potential of pleiotropic loci.
Affiliation(s)
- Yasmmyn D Salinas
  - Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, Connecticut
- Zuoheng Wang
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut
- Andrew T DeWan
  - Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, Connecticut
|
17
|
Association analysis of rare and common variants with multiple traits based on variable reduction method. Genet Res (Camb) 2018; 100:e2. [PMID: 29386084 DOI: 10.1017/s0016672317000052] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/05/2022] Open
Abstract
Pleiotropy, the effect of one variant on multiple traits, is widespread in complex diseases. Joint analysis of multiple traits can improve statistical power to detect genetic variants and uncover the underlying genetic mechanism. Currently, a large number of existing methods target either one common variant or only rare variants. Increasing evidence shows that complex diseases are caused by both common and rare variants. Here we propose a region-based method, built on a variable reduction method (abbreviated as MULVR), to test the association of both rare and common variants with multiple traits. However, in the presence of noise traits, the MULVR method may lose power, so we propose the MULVR-O method, which jointly analyses the optimal number of traits associated with genetic variants by the MULVR method, to guard against the effect of noise traits. Extensive simulation studies show that our proposed method (MULVR-O) is applicable not only to multiple quantitative traits but also to qualitative traits, and is more powerful than several other comparison methods in most scenarios. An application to the two genes (SHBG and CHRM3) and two phenotypes (systolic blood pressure and diastolic blood pressure) from the GAW19 dataset illustrates that our proposed methods (MULVR and MULVR-O) are feasible and efficient as a region-based method.
|
18
|
Lu S, Zhao LJ, Chen XD, Papasian CJ, Wu KH, Tan LJ, Wang ZE, Pei YF, Tian Q, Deng HW. Bivariate genome-wide association analyses identified genetic pleiotropic effects for bone mineral density and alcohol drinking in Caucasians. J Bone Miner Metab 2017; 35:649-658. [PMID: 28012008 PMCID: PMC5812284 DOI: 10.1007/s00774-016-0802-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 06/16/2016] [Accepted: 10/31/2016] [Indexed: 11/29/2022]
Abstract
Several studies indicated bone mineral density (BMD) and alcohol intake might share common genetic factors. The study aimed to explore potential SNPs/genes related to both phenotypes in US Caucasians at the genome-wide level. A bivariate genome-wide association study (GWAS) was performed in 2069 unrelated participants. Regular drinking was graded as 1, 2, 3, 4, 5, or 6, representing drinking alcohol never, less than once, once or twice, three to six times, seven to ten times, or more than ten times per week respectively. Hip, spine, and whole body BMDs were measured. The bivariate GWAS was conducted on the basis of a bivariate linear regression model. Sex-stratified association analyses were performed in the male and female subgroups. In males, the most significant association signal was detected in SNP rs685395 in DYNC2H1 with bivariate spine BMD and alcohol drinking (P = 1.94 × 10^-8). SNP rs685395 and five other SNPs, rs657752, rs614902, rs682851, rs626330, and rs689295, located in the same haplotype block in DYNC2H1 were among the ten most significant SNPs in the bivariate GWAS in males. Additionally, two SNPs in GRIK4 in males and three SNPs in OPRM1 in females were suggestively associated with BMDs (of the hip, spine, and whole body) and alcohol drinking. Nine SNPs in IL1RN were only suggestively associated with female whole body BMD and alcohol drinking. Our study indicated that DYNC2H1 may contribute to the genetic mechanisms of both spine BMD and alcohol drinking in male Caucasians. Moreover, our study suggested potential pleiotropic roles of OPRM1 and IL1RN in females and GRIK4 in males underlying variation of both BMD and alcohol drinking.
Affiliation(s)
- Shan Lu
  - Key Lab of Protein Chemistry and Developmental Biology of Ministry of Education, College of Life Sciences, Hunan Normal University, Changsha, China
- Lan-Juan Zhao
  - Center for Bioinformatics and Genomics, Department of Biostatistics, School of Public Health and Tropical Medicine, Tulane University, 1440 Canal St., Suite 2001, New Orleans, LA, 70112, USA
- Xiang-Ding Chen
  - Key Lab of Protein Chemistry and Developmental Biology of Ministry of Education, College of Life Sciences, Hunan Normal University, Changsha, China
- Ke-Hao Wu
  - Key Lab of Protein Chemistry and Developmental Biology of Ministry of Education, College of Life Sciences, Hunan Normal University, Changsha, China
- Li-Jun Tan
  - Key Lab of Protein Chemistry and Developmental Biology of Ministry of Education, College of Life Sciences, Hunan Normal University, Changsha, China
- Zhuo-Er Wang
  - Key Lab of Protein Chemistry and Developmental Biology of Ministry of Education, College of Life Sciences, Hunan Normal University, Changsha, China
- Yu-Fang Pei
  - Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, China
- Qing Tian
  - Center for Bioinformatics and Genomics, Department of Biostatistics, School of Public Health and Tropical Medicine, Tulane University, 1440 Canal St., Suite 2001, New Orleans, LA, 70112, USA
- Hong-Wen Deng
  - Key Lab of Protein Chemistry and Developmental Biology of Ministry of Education, College of Life Sciences, Hunan Normal University, Changsha, China.
  - Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, China.
  - Center for Bioinformatics and Genomics, Department of Biostatistics, School of Public Health and Tropical Medicine, Tulane University, 1440 Canal St., Suite 2001, New Orleans, LA, 70112, USA.
|
19
|
Yang JJ, Williams LK, Buu A. Identifying pleiotropic genes in genome-wide association studies from related subjects using the linear mixed model and Fisher combination function. BMC Bioinformatics 2017; 18:376. [PMID: 28836938 PMCID: PMC5571642 DOI: 10.1186/s12859-017-1791-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/29/2016] [Accepted: 08/15/2017] [Indexed: 11/11/2022] Open
Abstract
Background: A multivariate genome-wide association test is proposed for analyzing data on multivariate quantitative phenotypes collected from related subjects. The proposed method is a two-step approach. The first step models the association between the genotype and marginal phenotype using a linear mixed model. The second step uses the correlation between residuals of the linear mixed model to estimate the null distribution of the Fisher combination test statistic. Results: The simulation results show that the proposed method controls the type I error rate and is more powerful than the marginal tests across different population structures (admixed or non-admixed) and relatedness (related or independent). The statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that applying the multivariate association test may facilitate identification of the pleiotropic genes contributing to the risk for alcohol dependence commonly expressed by four correlated phenotypes. Conclusions: This study proposes a multivariate method for identifying pleiotropic genes while adjusting for cryptic relatedness and population structure between subjects. The two-step approach is not only powerful but also computationally efficient even when the number of subjects and the number of phenotypes are both very large.
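The Fisher combination statistic named in this abstract can be illustrated with a short sketch. The code below computes T = -2 Σ ln p_i and its p-value under the textbook assumption of independent marginal tests, where T follows a chi-square distribution with 2k degrees of freedom. The method described above instead estimates the null distribution from the correlation between linear-mixed-model residuals; that adjustment is not reproduced here, and the function name is an assumption for illustration.

```python
import math


def fisher_combination(p_values):
    """Fisher combination of k marginal p-values, assuming independence.

    Returns (statistic, combined_p). The statistic is -2 * sum(ln p_i).
    For even degrees of freedom 2k the chi-square survival function has
    the closed form exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    term, total = 1.0, 1.0  # j = 0 term of the series
    for j in range(1, k):
        term *= half / j    # (x/2)^j / j!, built incrementally
        total += term
    return statistic, math.exp(-half) * total


# Two marginal p-values of 0.5 combine to roughly 0.597 under independence.
stat, combined = fisher_combination([0.5, 0.5])
```

With positively correlated phenotypes the independence null is anti-conservative, which is why a correlation-aware null, as in the abstract, is needed in practice.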
Affiliation(s)
- James J Yang
  - School of Nursing, University of Michigan, Ann Arbor, 48104, Michigan, USA
- L Keoki Williams
  - Department of Internal Medicine, Henry Ford Health System, Detroit, 48202, Michigan, USA
  - The Center for Health Policy and Health Services Research, Henry Ford Health System, Detroit, 48202, Michigan, USA
- Anne Buu
  - Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, 48104, Michigan, USA
|
20
|
Schulthess AW, Reif JC, Ling J, Plieske J, Kollers S, Ebmeyer E, Korzun V, Argillier O, Stiewe G, Ganal MW, Röder MS, Jiang Y. The roles of pleiotropy and close linkage as revealed by association mapping of yield and correlated traits of wheat (Triticum aestivum L.). J Exp Bot 2017; 68:4089-4101. [PMID: 28922760 PMCID: PMC5853857 DOI: 10.1093/jxb/erx214] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 01/23/2017] [Accepted: 06/01/2017] [Indexed: 05/22/2023]
Abstract
Grain yield (GY) of bread wheat (Triticum aestivum L.) is quantitatively inherited. Correlated GY-syndrome traits such as plant height (PH), heading date (HD), thousand grain weight (TGW), test weight (TW), grains per ear (GPE), and ear weight (EW) influence GY. Most quantitative genetics studies assessed the multiple-trait (MT) complex of GY-syndrome using single-trait approaches, and little is known about its underlying pleiotropic architecture. We investigated the pleiotropic architecture of wheat GY-syndrome through MT association mapping (MT-GWAS) using 372 varieties phenotyped in up to eight environments and genotyped with 18 832 single nucleotide polymorphisms plus 24 polymorphic functional markers. MT-GWAS revealed a total of 345 significant markers spread genome wide, representing 8, 40, 11, 40, 34, and 35 effective GY-PH, GY-HD, GY-TGW, GY-TW, GY-GPE, and GY-EW associations, respectively. Among them, pleiotropic roles of Rht-B1 and TaGW2-6B loci were corroborated. Only one marker presented simultaneous associations for three traits (i.e. GY-TGW-TW). Close linkage was difficult to differentiate from pleiotropy; thus, the pleiotropic architecture of GY-syndrome was dissected more as a cause of pleiotropy rather than close linkage. Simulations showed that minor allele frequencies, along with sizes and distances between quantitative trait loci for two traits, influenced the ability to distinguish close linkage from pleiotropy.
Affiliation(s)
- Albert W Schulthess
  - Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Jochen C Reif
  - Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Jie Ling
  - Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Marion S Röder
  - Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Yong Jiang
  - Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
|
21
|
Kaakinen M, Mägi R, Fischer K, Heikkinen J, Järvelin MR, Morris AP, Prokopenko I. A rare-variant test for high-dimensional data. Eur J Hum Genet 2017; 25:988-994. [PMID: 28537275 PMCID: PMC5513099 DOI: 10.1038/ejhg.2017.90] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 08/05/2016] [Revised: 02/17/2017] [Accepted: 03/28/2017] [Indexed: 12/22/2022] Open
Abstract
Genome-wide association studies have facilitated the discovery of thousands of loci for hundreds of phenotypes. However, the issue of missing heritability remains unsolved for most complex traits. Locus discovery could be enhanced with both improved power through multi-phenotype analysis (MPA) and use of a wider allele frequency range, including rare variants (RVs). MPA methods for single-variant association have been proposed, but given their low power for RVs, more efficient approaches are required. We propose multi-phenotype analysis of rare variants (MARV), a burden test-based method for RVs extended to the joint analysis of multiple phenotypes through a powerful reverse regression technique. Specifically, MARV models the proportion of RVs at which minor alleles are carried by individuals within a genomic region as a linear combination of multiple phenotypes, which can be both binary and continuous, and the method accommodates directly the genotyped and imputed data. The full model, including all phenotypes, is tested for association for discovery, and a more thorough dissection of the phenotype combinations for any set of RVs is also enabled. We show, via simulations, that the type I error rate is well controlled under various correlations between two continuous phenotypes, and that the method outperforms a univariate burden test in all considered scenarios. Application of MARV to 4876 individuals from the Northern Finland Birth Cohort 1966 for triglycerides, high- and low-density lipoprotein cholesterols highlights known loci with stronger signals of association than those observed in univariate RV analyses and suggests novel RV effects for these lipid traits.
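The reverse-regression idea in this abstract can be sketched as follows: for each individual, compute the proportion of rare variants in a region at which a minor allele is carried, then regress that proportion on the phenotypes. This is an illustrative sketch only, not the MARV software; the function names are hypothetical, the fit is plain ordinary least squares via the normal equations, and no significance test, covariate adjustment, or handling of imputed dosages is included.

```python
def burden_proportion(genotypes):
    """Per-individual share of rare variants carrying at least one minor allele.

    genotypes: list of rows, one per individual, each a list of
    minor-allele counts (0, 1, 2) at the rare variants in the region.
    """
    return [sum(1 for g in row if g > 0) / len(row) for row in genotypes]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def reverse_regression(burden, phenotypes):
    """OLS of burden on an intercept plus phenotypes.

    phenotypes: list of rows, one per individual.
    Returns [intercept, beta_1, ..., beta_k].
    """
    design = [[1.0] + list(p) for p in phenotypes]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, burden)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Toy data: five individuals, four rare variants, two phenotypes.
toy_genotypes = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 2, 0, 0], [1, 1, 1, 0], [0, 0, 2, 0]]
toy_phenotypes = [(0, 0), (1, 0), (2, 1), (3, 1), (1, 1)]
toy_coefficients = reverse_regression(burden_proportion(toy_genotypes), toy_phenotypes)
```

Placing the genotype summary on the left-hand side lets binary and continuous phenotypes enter the same model as predictors, which is the property the abstract highlights.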
Affiliation(s)
- Marika Kaakinen
  - Department of Genomics of Common Disease, Imperial College London, London, UK
- Reedik Mägi
  - Estonian Genome Center, University of Tartu, Tartu, Estonia
- Krista Fischer
  - Estonian Genome Center, University of Tartu, Tartu, Estonia
- Jani Heikkinen
  - Department of Genomics of Common Disease, Imperial College London, London, UK
  - Neuroepidemiology and Ageing (NEA) Research Unit, Imperial College London, London, UK
- Marjo-Riitta Järvelin
  - Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London, UK
  - Center for Life Course Health Research, University of Oulu, Oulu, Finland
  - Unit of Primary Care, Oulu University Hospital, Oulu, Finland
  - Biocenter Oulu, University of Oulu, Oulu, Finland
- Andrew P Morris
  - Department of Biostatistics, University of Liverpool, Liverpool, UK
- Inga Prokopenko
  - Department of Genomics of Common Disease, Imperial College London, London, UK
|
22
|
Kaakinen M, Mägi R, Fischer K, Heikkinen J, Järvelin MR, Morris AP, Prokopenko I. MARV: a tool for genome-wide multi-phenotype analysis of rare variants. BMC Bioinformatics 2017; 18:110. [PMID: 28209135 PMCID: PMC5311849 DOI: 10.1186/s12859-017-1530-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 06/21/2016] [Accepted: 02/06/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Genome-wide association studies have enabled identification of thousands of loci for hundreds of traits. Yet, for most human traits a substantial part of the estimated heritability is unexplained. This and recent advances in technology to produce high-dimensional data cost-effectively have led to method development beyond standard common variant analysis, including single-phenotype rare variant and multi-phenotype common variant analysis, with the latter increasing power for locus discovery and providing suggestions of pleiotropic effects. However, there are currently no optimal methods and tools for the combined analysis of rare variants and multiple phenotypes. RESULTS: We propose a user-friendly software tool MARV for Multi-phenotype Analysis of Rare Variants. The tool is based on a method that collapses rare variants within a genomic region and models the proportion of minor alleles in the rare variants on a linear combination of multiple phenotypes. MARV provides analyses of all phenotype combinations within one run and calculates the Bayesian Information Criterion to facilitate model selection. The running time increases with the size of the genetic data while the number of phenotypes to analyse has little effect both on running time and required memory. We illustrate the use of MARV with analysis of triglycerides (TG), fasting insulin (FI) and waist-to-hip ratio (WHR) in 4,721 individuals from the Northern Finland Birth Cohort 1966. The analysis suggests novel multi-phenotype effects for these metabolic traits at APOA5 and ZNF259, and at ZNF259 provides stronger support for association (P_TG+FI = 1.8 × 10^-9) than observed in single phenotype rare variant analyses (P_TG = 6.5 × 10^-8 and P_FI = 0.27). CONCLUSIONS: MARV is a computationally efficient, flexible and user-friendly software tool allowing rapid identification of rare variant effects on multiple phenotypes, thus paving the way for novel discoveries and insights into biology of complex traits.
Affiliation(s)
- Marika Kaakinen
  - Department of Genomics of Common Disease, Imperial College London, London, W12 0NN UK
- Reedik Mägi
  - Estonian Genome Center, University of Tartu, Tartu, 51010 Estonia
- Krista Fischer
  - Estonian Genome Center, University of Tartu, Tartu, 51010 Estonia
- Jani Heikkinen
  - Department of Genomics of Common Disease, Imperial College London, London, W12 0NN UK
  - Neuroepidemiology and Ageing (NEA) Research Unit, Imperial College London, London, W6 8RP UK
- Marjo-Riitta Järvelin
  - Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London, W2 1PG UK
  - Center for Life Course Health Research, University of Oulu, 90014 Oulu, Finland
  - Unit of Primary Care, Oulu University Hospital, 90220 Oulu, Finland
  - Biocenter Oulu, University of Oulu, 90014 Oulu, Finland
- Andrew P. Morris
  - Department of Biostatistics, University of Liverpool, Liverpool, L69 3BX UK
- Inga Prokopenko
  - Department of Genomics of Common Disease, Imperial College London, London, W12 0NN UK
|
23
|
Identifying Pleiotropic Genes in Genome-Wide Association Studies for Multivariate Phenotypes with Mixed Measurement Scales. PLoS One 2017; 12:e0169893. [PMID: 28081206 PMCID: PMC5231271 DOI: 10.1371/journal.pone.0169893] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 08/12/2016] [Accepted: 12/22/2016] [Indexed: 11/30/2022]
Abstract
We propose a multivariate genome-wide association test for mixed continuous, binary, and ordinal phenotypes. A latent response model is used to estimate the correlation between phenotypes with different measurement scales so that the empirical distribution of Fisher’s combination statistic under the null hypothesis is estimated efficiently. The simulation study shows that our proposed correlation estimation methods have high levels of accuracy. More importantly, our approach conservatively estimates the variance of the test statistic so that the type I error rate is controlled. The simulation also shows that the proposed test maintains power at a level very close to that of the ideal analysis based on known latent phenotypes while controlling the type I error. In contrast, conventional approaches (dichotomizing all observed phenotypes or treating them as continuous variables) could either reduce the power or employ a linear regression model unfit for the data. Furthermore, the statistical analysis of the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that conducting a multivariate test on multiple phenotypes can increase the power of identifying markers that might not otherwise be selected using marginal tests. The proposed method also offers a new approach to analyzing the Fagerström Test for Nicotine Dependence as multivariate phenotypes in genome-wide association studies.
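For orientation, the textbook form of Fisher's combination statistic is X = -2 Σ ln p_i, which follows a chi-square distribution with 2k degrees of freedom when the k tests are independent. The sketch below implements only that independent-test version; it is not the method of this abstract, which estimates the null distribution empirically precisely because correlated phenotypes violate the independence assumption. The function name is hypothetical.

```python
import math


def fisher_combination(p_values):
    """Return (statistic, combined p-value) for Fisher's method.

    Assumes the k tests are independent, so the statistic is chi-square
    with 2k degrees of freedom under the null hypothesis.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    # For even degrees of freedom 2k the chi-square survival function is
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!, so no statistics library is needed.
    half = statistic / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total
```

With a single p-value the combined p-value equals the input, which is a quick sanity check. With correlated phenotypes this closed form would generally be miscalibrated, which is the problem the abstract's latent-response correlation estimate is meant to address.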
|
24
|
Chiu CY, Jung J, Wang Y, Weeks DE, Wilson AF, Bailey-Wilson JE, Amos CI, Mills JL, Boehnke M, Xiong M, Fan R. A comparison study of multivariate fixed models and Gene Association with Multiple Traits (GAMuT) for next-generation sequencing. Genet Epidemiol 2016; 41:18-34. [PMID: 27917525 DOI: 10.1002/gepi.22014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/19/2016] [Revised: 09/01/2016] [Accepted: 09/19/2016] [Indexed: 01/23/2023]
Abstract
In this paper, extensive simulations are performed to compare two statistical methods to analyze multiple correlated quantitative phenotypes: (1) approximate F-distributed tests of multivariate functional linear models (MFLM) and additive models of multivariate analysis of variance (MANOVA), and (2) Gene Association with Multiple Traits (GAMuT) for association testing of high-dimensional genotype data. It is shown that approximate F-distributed tests of MFLM and MANOVA have higher power and are more appropriate for major gene association analysis (i.e., scenarios in which some genetic variants have relatively large effects on the phenotypes); GAMuT has higher power and is more appropriate for analyzing polygenic effects (i.e., effects from a large number of genetic variants each of which contributes a small amount to the phenotypes). MFLM and MANOVA are very flexible and can be used to perform association analysis for (i) rare variants, (ii) common variants, and (iii) a combination of rare and common variants. Although GAMuT was designed to analyze rare variants, it can be applied to analyze a combination of rare and common variants and it performs well when (1) the number of genetic variants is large and (2) each variant contributes a small amount to the phenotypes (i.e., polygenes). MFLM and MANOVA are fixed effect models that perform well for major gene association analysis. GAMuT can be viewed as an extension of sequence kernel association tests (SKAT). Both GAMuT and SKAT are more appropriate for analyzing polygenic effects and they perform well not only in the rare variant case, but also in the case of a combination of rare and common variants. Data analyses of European cohorts and the Trinity Students Study are presented to compare the performance of the two methods.
Affiliation(s)
- Chi-Yang Chiu
- Biostatistics and Bioinformatics Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health (NIH), Bethesda, MD, USA
| | - Jeesun Jung
- Laboratory of Epidemiology and Biometry, National Institute on Alcohol Abuse and Alcoholism, NIH, Bethesda, MD, USA
| | - Yifan Wang
- Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
| | - Daniel E Weeks
- Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, USA
| | - Alexander F Wilson
- Computational and Statistical Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD, USA
| | - Joan E Bailey-Wilson
- Computational and Statistical Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD, USA
| | - Christopher I Amos
- Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Lebanon, NH, USA
| | - James L Mills
- Epidemiology Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health (NIH), Bethesda, MD, USA
| | - Michael Boehnke
- Department of Biostatistics, School of Public Health, The University of Michigan, Ann Arbor, MI, USA
| | - Momiao Xiong
- Human Genetics Center, University of Texas-Houston, Houston, TX, USA
| | - Ruzong Fan
- Biostatistics and Bioinformatics Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health (NIH), Bethesda, MD, USA
|
25
|
Fan R, Chiu CY, Jung J, Weeks DE, Wilson AF, Bailey-Wilson JE, Amos CI, Chen Z, Mills JL, Xiong M. A Comparison Study of Fixed and Mixed Effect Models for Gene Level Association Studies of Complex Traits. Genet Epidemiol 2016; 40:702-721. [PMID: 27374056 DOI: 10.1002/gepi.21984] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 09/01/2015] [Revised: 03/08/2016] [Accepted: 04/26/2016] [Indexed: 12/22/2022]
Abstract
In association studies of complex traits, fixed-effect regression models are usually used to test for association between traits and major gene loci. In recent years, variance-component tests based on mixed models were developed for region-based genetic variant association tests. In the mixed models, the association is tested by a null hypothesis of zero variance via a sequence kernel association test (SKAT), its optimal unified test (SKAT-O), and a combined sum test of rare and common variant effect (SKAT-C). Although there are some comparison studies to evaluate the performance of mixed and fixed models, there is no systematic analysis to determine when the mixed models perform better and when the fixed models perform better. Here we evaluated, based on extensive simulations, the performance of the fixed and mixed model statistics, using genetic variants located in 3, 6, 9, 12, and 15 kb simulated regions. We compared the performance of three models: (i) mixed models that lead to SKAT, SKAT-O, and SKAT-C, (ii) traditional fixed-effect additive models, and (iii) fixed-effect functional regression models. To evaluate the type I error rates of the tests of fixed models, we generated genotype data by two methods: (i) using all variants, (ii) using only rare variants. We found that the fixed-effect tests accurately control or have low false positive rates. We performed simulation analyses to compare power for two scenarios: (i) all causal variants are rare, (ii) some causal variants are rare and some are common. Either one or both of the fixed-effect models performed better than or similar to the mixed models except when (1) the region sizes are 12 and 15 kb and (2) effect sizes are small. Therefore, the assumption of mixed models could be satisfied and SKAT/SKAT-O/SKAT-C could perform better if the number of causal variants is large and each causal variant contributes a small amount to the traits (i.e., polygenes). 
In major gene association studies, we argue that the fixed-effect models perform better than or similarly to mixed models in most cases because some variants are expected to have relatively large effects on the traits. In practice, it makes sense to perform analysis by both the fixed and mixed effect models and to make a comparison, and this can be readily done using our R code and the SKAT packages.
Affiliation(s)
- Ruzong Fan
- Biostatistics and Bioinformatics Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Chi-Yang Chiu
- Biostatistics and Bioinformatics Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Jeesun Jung
- Laboratory of Epidemiology and Biometry, National Institute on Alcohol Abuse and Alcoholism, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Daniel E Weeks
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Alexander F Wilson
- Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Joan E Bailey-Wilson
- Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Christopher I Amos
- Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire, United States of America
| | - Zhen Chen
- Biostatistics and Bioinformatics Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, United States of America
| | - James L Mills
- Epidemiology Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Momiao Xiong
- Human Genetics Center, University of Texas-Houston, Houston, Texas, United States of America
|
26
|
A Statistical Approach for Testing Cross-Phenotype Effects of Rare Variants. Am J Hum Genet 2016; 98:525-540. [PMID: 26942286 DOI: 10.1016/j.ajhg.2016.01.017] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 07/09/2015] [Accepted: 01/29/2016] [Indexed: 11/20/2022]
Abstract
Increasing empirical evidence suggests that many genetic variants influence multiple distinct phenotypes. When cross-phenotype effects exist, multivariate association methods that consider pleiotropy are often more powerful than univariate methods that model each phenotype separately. Although several statistical approaches exist for testing cross-phenotype effects for common variants, there is a lack of similar tests for gene-based analysis of rare variants. In order to fill this important gap, we introduce a statistical method for cross-phenotype analysis of rare variants using a nonparametric distance-covariance approach that compares similarity in multivariate phenotypes to similarity in rare-variant genotypes across a gene. The approach can accommodate both binary and continuous phenotypes and further can adjust for covariates. Our approach yields a closed-form test whose significance can be evaluated analytically, thereby improving computational efficiency and permitting application on a genome-wide scale. We use simulated data to demonstrate that our method, which we refer to as the Gene Association with Multiple Traits (GAMuT) test, provides increased power over competing approaches. We also illustrate our approach using exome-chip data from the Genetic Epidemiology Network of Arteriopathy.
|
27
|
Dumancas GG, Ramasahayam S, Bello G, Hughes J, Kramer R. Chemometric regression techniques as emerging, powerful tools in genetic association studies. Trends Analyt Chem 2015. [DOI: 10.1016/j.trac.2015.05.007] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 12/23/2022]
|
28
|
Abstract
For many years, linkage analysis was the primary tool used for the genetic mapping of Mendelian and complex traits with familial aggregation. Linkage analysis was largely supplanted by the wide adoption of genome-wide association studies (GWASs). However, with the recent increased use of whole-genome sequencing (WGS), linkage analysis is again emerging as an important and powerful analysis method for the identification of genes involved in disease aetiology, often in conjunction with WGS filtering approaches. Here, we review the principles of linkage analysis and provide practical guidelines for carrying out linkage studies using WGS data.
Affiliation(s)
- Jurg Ott
- Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Beijing 100101, China
- Laboratory of Statistical Genetics, Rockefeller University, 1230 York Avenue, New York, New York 10065, USA
| | - Jing Wang
- Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Beijing 100101, China
| | - Suzanne M Leal
- Center for Statistical Genetics, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA
|
29
|
Alonso A, Marsal S, Julià A. Analytical methods in untargeted metabolomics: state of the art in 2015. Front Bioeng Biotechnol 2015; 3:23. [PMID: 25798438 PMCID: PMC4350445 DOI: 10.3389/fbioe.2015.00023] [Citation(s) in RCA: 395] [Impact Index Per Article: 39.5] [Received: 12/15/2014] [Accepted: 02/18/2015] [Indexed: 12/20/2022]
Abstract
Metabolomics comprises the methods and techniques that are used to measure the small molecule composition of biofluids and tissues, and is currently one of the most rapidly evolving research fields. The determination of the metabolomic profile - the metabolome - has multiple applications in many biological sciences, including the development of new diagnostic tools in medicine. Recent technological advances in nuclear magnetic resonance and mass spectrometry are significantly improving our capacity to obtain more data from each biological sample. Consequently, there is a need for fast and accurate statistical and bioinformatic tools that can deal with the complexity and volume of the data generated in metabolomic studies. In this review, we provide an update of the most commonly used analytical methods in metabolomics, starting from raw data processing and ending with pathway analysis and biomarker identification. Finally, the integration of metabolomic profiles with molecular data from other high-throughput biotechnologies is also reviewed.
Affiliation(s)
- Arnald Alonso
- Rheumatology Research Group, Vall d’Hebron Research Institute, Barcelona, Spain
- Department of Automatic Control (ESAII), Polytechnic University of Catalonia, Barcelona, Spain
| | - Sara Marsal
- Rheumatology Research Group, Vall d’Hebron Research Institute, Barcelona, Spain
| | - Antonio Julià
- Rheumatology Research Group, Vall d’Hebron Research Institute, Barcelona, Spain
|
30
|
Wu Y, Fan H, Wang Y, Zhang L, Gao X, Chen Y, Li J, Ren H, Gao H. Genome-wide association studies using haplotypes and individual SNPs in Simmental cattle. PLoS One 2014; 9:e109330. [PMID: 25330174 PMCID: PMC4203724 DOI: 10.1371/journal.pone.0109330] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 04/25/2014] [Accepted: 09/10/2014] [Indexed: 01/05/2023]
Abstract
Recent advances in high-throughput genotyping technologies have provided the opportunity to map genes using associations between complex traits and markers. Genome-wide association studies (GWAS) based on either a single marker or haplotype have identified genetic variants and underlying genetic mechanisms of quantitative traits. Prompted by the achievements of studies examining economic traits in cattle and to verify the consistency of these two methods using real data, the current study was conducted to construct the haplotype structure in the bovine genome and to detect relevant genes genuinely affecting a carcass trait and a meat quality trait. Using the Illumina BovineHD BeadChip, 942 young bulls with genotyping data were introduced as a reference population to identify the genes in the beef cattle genome significantly associated with foreshank weight and triglyceride levels. In total, 92,553 haplotype blocks were detected in the genome. The regions of high linkage disequilibrium extended up to approximately 200 kb, and the size of haplotype blocks ranged from 22 bp to 199,266 bp. Additionally, the individual SNP analysis and the haplotype-based analysis detected similar regions and common SNPs for these two representative traits. A total of 12 and 7 SNPs in the bovine genome were significantly associated with foreshank weight and triglyceride levels, respectively. By comparison, 4 and 5 haplotype blocks containing the majority of significant SNPs were strongly associated with foreshank weight and triglyceride levels, respectively. In addition, 36 SNPs with high linkage disequilibrium were detected in the GNAQ gene, a potential hotspot that may play a crucial role for regulating carcass trait components.
Affiliation(s)
- Yang Wu
- Institute of Animal Science, Chinese Academy of Agricultural Science, Beijing, China
| | - Huizhong Fan
- Institute of Animal Science, Chinese Academy of Agricultural Science, Beijing, China
| | - Yanhui Wang
- Institute of Animal Science, Chinese Academy of Agricultural Science, Beijing, China
| | - Lupei Zhang
- Institute of Animal Science, Chinese Academy of Agricultural Science, Beijing, China
| | - Xue Gao
- Institute of Animal Science, Chinese Academy of Agricultural Science, Beijing, China
| | - Yan Chen
- Institute of Animal Science, Chinese Academy of Agricultural Science, Beijing, China
| | - Junya Li
- Institute of Animal Science, Chinese Academy of Agricultural Science, Beijing, China
| | - HongYan Ren
- Department of life sciences, National Natural Science Foundation of China, Beijing, China
- * E-mail: (HG); (HR)
| | - Huijiang Gao
- Institute of Animal Science, Chinese Academy of Agricultural Science, Beijing, China
- * E-mail: (HG); (HR)
|
31
|
Xu HM, Sun XW, Qi T, Lin WY, Liu N, Lou XY. Multivariate dimensionality reduction approaches to identify gene-gene and gene-environment interactions underlying multiple complex traits. PLoS One 2014; 9:e108103. [PMID: 25259584 PMCID: PMC4178067 DOI: 10.1371/journal.pone.0108103] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 06/24/2014] [Accepted: 08/18/2014] [Indexed: 11/30/2022]
Abstract
The elusive but ubiquitous multifactor interactions represent a stumbling block that urgently needs to be removed in searching for determinants involved in human complex diseases. Dimensionality reduction approaches are a promising tool for this task. Many complex diseases exhibit composite syndromes that must be measured in a cluster of clinical traits with varying correlations and/or are inherently longitudinal in nature (changing over time and measured dynamically at multiple time points). A multivariate approach for detecting interactions is thus greatly needed for handling a multifaceted phenotype and longitudinal data, as well as for improving statistical power for multiple significance testing via a two-stage testing procedure that involves a multivariate analysis for grouped phenotypes followed by univariate analysis for the phenotypes in the significant group(s). In this article, we propose a multivariate extension of generalized multifactor dimensionality reduction (GMDR) based on multivariate generalized linear, multivariate quasi-likelihood and generalized estimating equations models. Simulations and real data analysis for the cohort from the Study of Addiction: Genetics and Environment are performed to investigate the properties and performance of the proposed method, as compared with the univariate method. The results suggest that the proposed multivariate GMDR substantially boosts statistical power.
Affiliation(s)
- Hai-Ming Xu
- Institute of Bioinformatics, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, P.R. China
- Research Center for Air Pollution and Health, Zhejiang University, Hangzhou, P.R. China
| | - Xi-Wei Sun
- Institute of Bioinformatics, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, P.R. China
| | - Ting Qi
- Institute of Bioinformatics, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, P.R. China
| | - Wan-Yu Lin
- Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
| | - Nianjun Liu
- Department of Biostatistics, University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Xiang-Yang Lou
- Department of Biostatistics, University of Alabama at Birmingham, Birmingham, Alabama, United States of America
|
32
|
Abstract
Joint association analysis of multiple traits in a genome-wide association study (GWAS), i.e. a multivariate GWAS, offers several advantages over analyzing each trait in a separate GWAS. In this study we directly compared a number of multivariate GWAS methods using simulated data. We focused on six methods that are implemented in the software packages PLINK, SNPTEST, MultiPhen, BIMBAM, PCHAT and TATES, and also compared them to standard univariate GWAS, analysis of the first principal component of the traits, and meta-analysis of univariate results. We simulated data (N = 1000) for three quantitative traits and one bi-allelic quantitative trait locus (QTL), and varied the number of traits associated with the QTL (explained variance 0.1%), minor allele frequency of the QTL, residual correlation between the traits, and the sign of the correlation induced by the QTL relative to the residual correlation. We compared the power of the methods using empirically fixed significance thresholds (α = 0.05). Our results showed that the multivariate methods implemented in PLINK, SNPTEST, MultiPhen and BIMBAM performed best for the majority of the tested scenarios, with a notable increase in power for scenarios with an opposite sign of genetic and residual correlation. All multivariate analyses resulted in a higher power than univariate analyses, even when only one of the traits was associated with the QTL. Hence, use of multivariate GWAS methods can be recommended, even when genetic correlations between traits are weak.
|
33
|
Reading and language disorders: the importance of both quantity and quality. Genes (Basel) 2014; 5:285-309. [PMID: 24705331 PMCID: PMC4094934 DOI: 10.3390/genes5020285] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 12/02/2013] [Revised: 03/11/2014] [Accepted: 03/12/2014] [Indexed: 01/25/2023]
Abstract
Reading and language disorders are common childhood conditions that often co-occur with each other and with other neurodevelopmental impairments. There is strong evidence that disorders, such as dyslexia and Specific Language Impairment (SLI), have a genetic basis, but we expect the contributing genetic factors to be complex in nature. To date, only a few genes have been implicated in these traits. Their functional characterization has provided novel insight into the biology of neurodevelopmental disorders. However, the lack of biological markers and clear diagnostic criteria have prevented the collection of the large sample sizes required for well-powered genome-wide screens. One of the main challenges of the field will be to combine careful clinical assessment with high throughput genetic technologies within multidisciplinary collaborations.
|
34
|
Serretti A, Kato M. The serotonin transporter gene and effectiveness of SSRIs. Expert Rev Neurother 2014; 8:111-20. [DOI: 10.1586/14737175.8.1.111] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022]
|
35
|
|
36
|
Hall MH, Smoller JW, Cook NR, Schulze K, Lee PH, Taylor G, Bramon E, Coleman MJ, Murray RM, Salisbury DF, Levy DL. Patterns of deficits in brain function in bipolar disorder and schizophrenia: a cluster analytic study. Psychiatry Res 2012; 200:272-80. [PMID: 22925372 PMCID: PMC3535009 DOI: 10.1016/j.psychres.2012.07.052] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 03/08/2012] [Revised: 06/29/2012] [Accepted: 07/30/2012] [Indexed: 11/27/2022]
Abstract
Historically, bipolar disorder and schizophrenia have been considered distinct disorders with different etiologies. Growing evidence suggests that overlapping genetic influences contribute to risk for these disorders and that each disease is genetically heterogeneous. Using cluster analytic methods, we empirically identified homogeneous subgroups of patients, their relatives, and controls based on distinct neurophysiologic profiles. Seven phenotypes were collected from two independent cohorts at two institutions. K-means clustering was used to identify neurophysiologic profiles. In the analysis of all participants, three distinct profiles emerged: "globally impaired", "sensory processing", and "high cognitive". In a secondary analysis, restricted to patients only, we observed a similar clustering into three profiles. The neurophysiological profiles of the Schizophrenia (SZ) and Bipolar Disorder (BPD) patients did not support the Diagnostic and Statistical Manual of Mental Disorders (DSM) diagnostic distinction between these two disorders. Smokers in the globally impaired group smoked significantly more cigarettes than those in the sensory processing or high cognitive groups. Our results suggest that empirical analyses of neurophysiological phenotypes can identify potentially biologically relevant homogenous subgroups independent of diagnostic boundaries. We hypothesize that each neurophysiology subgroup may share similar genotypic profiles, which may increase statistical power to detect genetic risk factors.
Affiliation(s)
- Mei-Hua Hall
- Psychology Research Laboratory, McLean Hospital, Harvard Medical School, Belmont, MA, USA.
| | - Jordan W Smoller
- Psychiatric Genetics Program in Mood and Anxiety Disorders, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Nancy R. Cook
- Division of Preventive Medicine, Brigham & Women’s Hospital, Harvard Medical School, Boston, MA, USA
| | - Katja Schulze
- Division of Psychological Medicine, Institute of Psychiatry, King’s College London, London, UK
| | - Phil Hyoun Lee
- Psychiatric Genetics Program in Mood and Anxiety Disorders, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Grantley Taylor
- Cognitive Neuroscience Laboratory, McLean Hospital, Harvard Medical School, Belmont, MA, USA
| | - Elvira Bramon
- Division of Psychological Medicine, Institute of Psychiatry, King’s College London, London, UK
| | - Michael J. Coleman
- Psychology Research Laboratory, McLean Hospital, Harvard Medical School, Belmont, MA, USA
| | - Robin M. Murray
- Division of Psychological Medicine, Institute of Psychiatry, King’s College London, London, UK
| | - Dean F Salisbury
- Cognitive Neuroscience Laboratory, McLean Hospital, Harvard Medical School, Belmont, MA, USA
| | - Deborah L. Levy
- Psychology Research Laboratory, McLean Hospital, Harvard Medical School, Belmont, MA, USA
|
37
|
Mehmood T, Warringer J, Snipen L, Sæbø S. Improving stability and understandability of genotype-phenotype mapping in Saccharomyces using regularized variable selection in L-PLS regression. BMC Bioinformatics 2012; 13:327. [PMID: 23216988 PMCID: PMC3598729 DOI: 10.1186/1471-2105-13-327] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 04/24/2012] [Accepted: 12/05/2012] [Indexed: 11/26/2022]
Abstract
Background Multivariate approaches have been successfully applied to genome wide association studies. Recently, a Partial Least Squares (PLS) based approach was introduced for mapping yeast genotype-phenotype relations, where background information, such as gene function classification, gene dispensability, recent or ancient gene copy number variations and the presence of premature stop codons or frameshift mutations in reading frames, was used post hoc to explain selected genes. One of the latest advancements in PLS, named L-Partial Least Squares (L-PLS), where ‘L’ represents the data structure used, enables the use of background information at the modeling level. Here, a modification of L-PLS with variable importance on projection (VIP) was implemented using a stepwise regularized procedure for gene and background information selection. Results were compared to PLS-based procedures, where no background information was used. Results Applying the proposed methodology to yeast Saccharomyces cerevisiae data, we found the relationship between genotype-phenotype to have improved understandability. Phenotypic variations were explained by the variations of relatively stable genes and stable background variations. The suggested procedure provides an automatic way for genotype-phenotype mapping. The selected phenotype influencing genes were evolving 29% faster than non-influential genes, and the current results are supported by a recently conducted study. Further power analysis on simulated data verified that the proposed methodology selects relevant variables. Conclusions A modification of L-PLS with VIP in a stepwise regularized elimination procedure can improve the understandability and stability of selected genes and background information. The approach is recommended for genome wide association studies where background information is available.
Affiliation(s)
- Tahir Mehmood
- Biostatistics, Department of Chemistry, Biotechnology and Food Sciences, Norwegian University of Life Sciences, Ås, Norway.
|
38
|
Inouye M, Ripatti S, Kettunen J, Lyytikäinen LP, Oksala N, Laurila PP, Kangas AJ, Soininen P, Savolainen MJ, Viikari J, Kähönen M, Perola M, Salomaa V, Raitakari O, Lehtimäki T, Taskinen MR, Järvelin MR, Ala-Korpela M, Palotie A, de Bakker PIW. Novel Loci for metabolic networks and multi-tissue expression studies reveal genes for atherosclerosis. PLoS Genet 2012; 8:e1002907. [PMID: 22916037 PMCID: PMC3420921 DOI: 10.1371/journal.pgen.1002907] [Citation(s) in RCA: 129] [Impact Index Per Article: 9.9] [Received: 04/19/2012] [Accepted: 07/01/2012] [Indexed: 12/16/2022] Open
Abstract
Association testing of multiple correlated phenotypes offers better power than univariate analysis of single traits. We analyzed 6,600 individuals from two population-based cohorts with both genome-wide SNP data and serum metabolomic profiles. From the observed correlation structure of 130 metabolites measured by nuclear magnetic resonance, we identified 11 metabolic networks and performed a multivariate genome-wide association analysis. We identified 34 genomic loci at genome-wide significance, of which 7 are novel. In comparison to univariate tests, multivariate association analysis identified nearly twice as many significant associations in total. Multi-tissue gene expression studies identified variants in our top loci, SERPINA1 and AQP9, as eQTLs and showed that SERPINA1 and AQP9 expression in human blood was associated with metabolites from their corresponding metabolic networks. Finally, liver expression of AQP9 was associated with atherosclerotic lesion area in mice, and in human arterial tissue both SERPINA1 and AQP9 were shown to be upregulated (6.3-fold and 4.6-fold, respectively) in atherosclerotic plaques. Our study illustrates the power of multi-phenotype GWAS and highlights candidate genes for atherosclerosis.
Affiliation(s)
- Michael Inouye
- Medical Systems Biology, Departments of Pathology and of Microbiology and Immunology, The University of Melbourne, Parkville, Victoria, Australia.
|
39
|
Melton PE, Pankratz N. Joint analyses of disease and correlated quantitative phenotypes using next-generation sequencing data. Genet Epidemiol 2012; 35 Suppl 1:S67-73. [PMID: 22128062 DOI: 10.1002/gepi.20653] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
The joint analysis of multiple disease phenotypes aims to increase statistical power and potentially identify pleiotropic genes involved in the biological development of common chronic diseases. As next-generation sequencing data become more common, it will be important to consider ways to maximize the ability to detect rare variants within the human genome. The two exome sequence data sets provided for analysis at Genetic Analysis Workshop 17 (GAW17) offered three quantitative phenotypes related to disease status in 200 simulated replicates for both families and unrelated individuals. Participants in Group 10 addressed the challenges and potential uses of next-generation sequencing data to identify causal variants through a broad range of statistical methods. These methods included investigating multiple phenotypes either through data reduction or joint methods, using family or unrelated individuals, and reducing the dimensionality inherent in these data. Most of the research teams regarded the use of multiple phenotypes as a means of increasing analytical power and as a way to clarify the biology of complex disease. Three major observations were gleaned from these Group 10 contributions. First, family and unrelated case-control samples are suited to finding different types of variants. In addition, collapsing either phenotypes or genotypes can reduce the dimensionality of the data and alleviate some of the problems of multiple testing. Finally, we were able to demonstrate in certain cases that performing a joint analysis of disease status and a quantitative trait can improve statistical power.
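The genotype "collapsing" mentioned in this abstract can be illustrated with a minimal sketch. It assumes a simple carrier-indicator scheme in which all rare variants in one gene are reduced to a single 0/1 value per person; the function names, the toy genotype matrix, and the 0.1 frequency cutoff are hypothetical choices for illustration, not the methods used by the GAW17 Group 10 contributors.

```python
from typing import List


def allele_frequency(allele_counts: List[int]) -> float:
    """Frequency of the counted allele, given per-person counts of 0, 1 or 2."""
    return sum(allele_counts) / (2 * len(allele_counts))


def collapse_rare_variants(matrix: List[List[int]], maf_cutoff: float) -> List[int]:
    """Collapse the rare variants of one gene into a 0/1 carrier indicator.

    matrix[i][j] is the allele count of person i at variant j. A person is a
    carrier if they hold at least one allele at any variant whose frequency
    is above zero and below maf_cutoff (hypothetical cutoff).
    """
    n_variants = len(matrix[0])
    rare = [
        j for j in range(n_variants)
        if 0 < allele_frequency([row[j] for row in matrix]) < maf_cutoff
    ]
    return [int(any(row[j] > 0 for j in rare)) for row in matrix]


# Toy data (hypothetical): 10 people, 3 variants in one gene.
# Variant 0 is common (frequency 0.30); variants 1 and 2 are rare (0.05 each).
toy = [
    [1, 0, 0], [1, 0, 0], [0, 1, 0], [2, 0, 0], [0, 0, 1],
    [1, 0, 0], [0, 0, 0], [1, 0, 0], [0, 0, 0], [0, 0, 0],
]
carriers = collapse_rare_variants(toy, maf_cutoff=0.1)
print(carriers)
```

Testing one collapsed indicator per gene instead of one test per variant is what reduces the dimensionality and the multiple-testing burden described in the abstract.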
Affiliation(s)
- Phillip E Melton
- Department of Genetics, Texas Biomedical Research Institute, San Antonio, Texas, USA
|
40
|
Yoon JH, Nguyen DV, McVay LM, Deramo P, Minzenberg MJ, Ragland JD, Niendham T, Solomon M, Carter CS. Automated classification of fMRI during cognitive control identifies more severely disorganized subjects with schizophrenia. Schizophr Res 2012; 135:28-33. [PMID: 22277668 PMCID: PMC3288252 DOI: 10.1016/j.schres.2012.01.001] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 09/21/2011] [Revised: 01/04/2012] [Accepted: 01/04/2012] [Indexed: 11/18/2022]
Abstract
The establishment of a neurobiologically based nosological system is one of the ultimate goals of modern biological psychiatry research. Developments in neuroimaging and statistical/machine learning have provided useful basic tools for these efforts. Recent studies have demonstrated the utility of fMRI as input data for the classification of schizophrenia, but none, to date, has used fMRI of cognitive control for this purpose. In this study, we evaluated the accuracy of an unbiased classification method on fMRI data from a large cohort of subjects with first-episode schizophrenia and a cohort of age-matched healthy control subjects while they completed the AX version of the Continuous Performance Task (AX-CPT). We compared these results to classifications based on AX-CPT behavioral data. Classification accuracy for DSM-IV defined schizophrenia using fMRI data was modest and comparable to classifications conducted with behavioral data. Interestingly, fMRI classifications did, however, identify a distinct subgroup of patients with greater behavioral disorganization, whereas behavioral data classifications did not. These results suggest that fMRI-based classification could be a useful tool in defining a neurobiologically distinct subgroup within the clinically defined syndrome of schizophrenia, reflecting alterations in discrete neural circuits. Independent validation of classification-based phenotypes using other biological data such as genetics would provide a strong test of this hypothesis.
Affiliation(s)
- Jong H Yoon
- Department of Psychiatry and Imaging Research Center, University of California Davis School of Medicine, Sacramento CA, USA.
|
41
|
Edwards K, Talmud P, Newman B, Krauss R, Austin M. Lipoprotein Candidate Genes for Multivariate Factors of the Insulin Resistance Syndrome: A Sib-pair Linkage Analysis in Women Twins. Twin Res 2012. [DOI: 10.1375/twin.4.1.41] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]
Abstract
The insulin resistance syndrome (IRS) is characterized by a combination of interrelated coronary heart disease risk factors, including low high-density lipoprotein cholesterol (HDLC) levels, obesity and increases in triglyceride (TG), systolic and diastolic blood pressure (BP), small low-density lipoprotein particles (LDL-size), and fasting and postload plasma insulin and glucose. Using factor analysis, we previously identified multivariate factors based on data from women participating in the Kaiser Permanente Women Twins Study: 1) Weight/Fat, 2) Insulin/Glucose, 3) Lipids, and 4) BP. The purpose of this study is to evaluate evidence for genetic linkage between the multivariate factors and candidate genes. Quantitative sib-pair analysis based on the factor scores with markers for 9 candidate genes was carried out using data from 126 pairs of dizygotic (DZ) women twins from the second exam of the Kaiser Permanente Women Twins Study. Suggestive evidence for linkage was found between the Weight/Fat factor and the Apo E gene (p = 0.01), and stronger evidence for linkage between the Lipids factor and the cholesterol ester transfer protein (CETP) gene (p = 0.002). Therefore, the CETP gene appears to influence covariation in LDL size, TG, and HDL, and may account for a portion of the well-established statistical and metabolic associations observed between these risk factors.
|
42
|
Tang CS, Ferreira MAR. A gene-based test of association using canonical correlation analysis. Bioinformatics 2012; 28:845-50. [PMID: 22296789 DOI: 10.1093/bioinformatics/bts051] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.2] [Indexed: 12/15/2022]
Abstract
MOTIVATION: Canonical correlation analysis (CCA) measures the association between two sets of multidimensional variables. We reasoned that CCA could provide an efficient and powerful approach for both univariate and multivariate gene-based tests of association without the need for permutation testing. RESULTS: Compared with a commonly used permutation-based approach, CCA (i) is faster; (ii) has appropriate type-I error rate for normally distributed quantitative traits; (iii) provides comparable power for small to medium-sized genes (<100 kb); (iv) provides greater power when the causal variants are uncommon; and (v) provides considerably less power for larger genes (≥100 kb) when the causal variants have a broad minor allele frequency (MAF) spectrum. Application to a GWAS of leukocyte levels identified SAFB and a histone gene cluster as novel putative loci harboring multiple independent variants regulating lymphocyte and neutrophil counts.
Affiliation(s)
- Clara S Tang
- Queensland Institute of Medical Research, Brisbane, QLD 4029, Australia
|
43
|
Liu DJ, Leal SM. A flexible likelihood framework for detecting associations with secondary phenotypes in genetic studies using selected samples: application to sequence data. Eur J Hum Genet 2011; 20:449-56. [PMID: 22166943 DOI: 10.1038/ejhg.2011.211] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 01/01/2023] Open
Abstract
For most complex trait association studies using next-generation sequencing, in addition to the primary phenotype of interest, many clinically important secondary traits are also available, which can be analyzed to map susceptibility genes. Owing to high sequencing costs, most studies use selected samples, and the sampling mechanisms of these studies can be complicated. When the primary and secondary traits are correlated, analyses of secondary phenotypes can cause spurious associations in selected samples and existing methods are inadequate to adjust for them. To address this problem, a likelihood-based method, MULTI-TRAIT-ASSOCIATION (MTA) was developed. MTA is flexible and can be applied to any study with known sampling mechanisms. It also allows efficient inferences of genetic parameters. To investigate the power of MTA and different study designs, extensive simulations were performed under rigorous population genetic and phenotypic models. It is demonstrated that there are great benefits for analyzing secondary phenotypes in selected samples. In particular, using case-control samples and samples with extreme primary phenotypes can be more powerful than analyzing random samples of equivalent size. One major challenge for sequence-based association studies is that most data sets are not of sufficient size to be adequately powered. By applying MTA, data sets ascertained under distinct mechanisms or targeted at different primary traits can be jointly analyzed to map common phenotypes and greatly increase power. The combined analysis can be performed using freely available data sets from public repositories, for example, dbGaP. In conclusion, MTA will have an important role in dissecting the etiology of complex traits.
Affiliation(s)
- Dajiang J Liu
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
|
44
|
|
45
|
Shared genetic architecture in the relationship between adult stature and subclinical coronary artery atherosclerosis. Atherosclerosis 2011; 219:679-83. [PMID: 21937044 DOI: 10.1016/j.atherosclerosis.2011.08.030] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/21/2011] [Revised: 08/16/2011] [Accepted: 08/17/2011] [Indexed: 11/21/2022]
Abstract
BACKGROUND: Short stature is associated with increased risk of coronary heart disease (CHD); although the mechanisms for this relationship are unknown, shared genetic factors have been proposed. Subclinical atherosclerosis, measured by coronary artery calcification (CAC), is associated with CHD events and represents part of the biological continuum to overt CHD. Many molecular mechanisms of CAC development are shared with bone growth. Thus, we examined whether there was evidence of shared genes (pleiotropy) between adult stature and CAC. METHODS: A total of 877 asymptomatic white adults (46% men) from 625 families in a community-based sample had computed tomography measures of CAC. Pleiotropy between height and CAC was determined using maximum-likelihood estimation implemented in SOLAR. RESULTS: Adult height was significantly and inversely associated with CAC score (P = 0.01). After adjusting for age, sex and CHD risk factors, the estimated genetic correlation between height and CAC score was -0.37 and was significantly different from 0 (P = 0.001) and -1 (P < 0.001). The environmental correlation between height and CAC score was 0.60 and was significantly different from 0 (P = 0.024). CONCLUSIONS: Further studies of shared genetic factors between height and CAC may provide important insight into the complex genetic architecture of CHD, in part through increased understanding of the molecular pathways underlying the process of both normal growth and disease development. Bivariate genetic linkage analysis may provide a powerful mechanism for identifying specific genomic regions associated with both height and CAC.
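As a reading aid for the genetic and environmental correlations reported in this abstract, the standard bivariate variance-component identity relates them to the phenotypic correlation. The symbols below are assumptions introduced here: the subscripts 1 and 2 stand for height and CAC score, and the heritabilities h_1^2 and h_2^2 are not reported in the abstract, so no numerical value for the phenotypic correlation is implied.

```latex
\rho_P \;=\; \rho_G \sqrt{h_1^2\, h_2^2} \;+\; \rho_E \sqrt{\left(1 - h_1^2\right)\left(1 - h_2^2\right)},
\qquad \rho_G = -0.37, \quad \rho_E = 0.60 .
```

Because the two reported correlations have opposite signs, the genetic and environmental terms partly offset each other in the phenotypic correlation.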
|
46
|
Mehmood T, Martens H, Saebø S, Warringer J, Snipen L. Mining for genotype-phenotype relations in Saccharomyces using partial least squares. BMC Bioinformatics 2011; 12:318. [PMID: 21812956 PMCID: PMC3175482 DOI: 10.1186/1471-2105-12-318] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 02/01/2011] [Accepted: 08/03/2011] [Indexed: 11/18/2022] Open
Abstract
Background: Multivariate approaches are important because of their versatility and their applications in many fields, where they provide decisive advantages over univariate analysis. Genome-wide association studies are rapidly emerging, but the approaches at hand pay little attention to the multivariate relation between genotype and phenotype. We introduce a methodology based on a BLAST approach for extracting information from genomic sequences and Soft-Thresholding Partial Least Squares (ST-PLS) for mapping genotype-phenotype relations. Results: Applying this methodology to an extensive data set for the model yeast Saccharomyces cerevisiae, we found that the genotype-phenotype relationship involves surprisingly few genes, in the sense that an overwhelmingly large fraction of the phenotypic variation can be explained by variation in less than 1% of the full gene reference set containing 5791 genes. These phenotype-influencing genes were evolving 20% faster than non-influential genes and were unevenly distributed over cellular functions, with strong enrichments in functions such as cellular respiration and transposition. These genes were also enriched with known paralogs, stop codon variations and copy number variations, suggesting that such molecular adjustments have had a disproportionate influence on the recent adaptation of Saccharomyces yeasts to environmental changes in their ecological niche. Conclusions: The BLAST- and PLS-based multivariate approach produced results that adhere to the known yeast phylogeny and gene ontology, and thus verify that the methodology extracts a set of fast-evolving genes that capture the phylogeny of the yeast strains. The approach is worth pursuing, and future investigations should aim to improve the computation of genotype signals as well as the variable selection procedure within the PLS framework.
Affiliation(s)
- Tahir Mehmood
- Biostatistics, Department of Chemistry, Biotechnology and Food Sciences, Norwegian University of Life Sciences, Norway.
|
47
|
Ott J, Wang J. Multiple phenotypes in genome-wide genetic mapping studies. Protein Cell 2011; 2:519-22. [PMID: 21647556 DOI: 10.1007/s13238-011-1059-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 05/17/2011] [Accepted: 05/23/2011] [Indexed: 11/24/2022] Open
Abstract
For many psychiatric and other traits, diagnoses are based on a number of different criteria or phenotypes. Rather than carrying out genetic analyses on the final diagnosis, it has been suggested that relevant phenotypes should be analyzed directly. We provide an overview of statistical methods for the joint analysis of multiple phenotypes in case-control association studies.
Affiliation(s)
- Jurg Ott
- Key Laboratory of Mental Health Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China.
|
48
|
Edwards KL, Wan JY, Hutter CM, Fong PY, Santorico SA. Multivariate linkage scan for metabolic syndrome traits in families with type 2 diabetes. Obesity (Silver Spring) 2011; 19:1235-43. [PMID: 21183932 DOI: 10.1038/oby.2010.299] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 12/31/2022]
Abstract
The purpose of this study was to evaluate evidence for linkage to interrelated quantitative features of the metabolic syndrome (MetS). Data on eight quantitative MetS traits (body weight, waist circumference, systolic and diastolic blood pressure, high-density lipoprotein (HDL) cholesterol, triglycerides (TGs), and fasting glucose and insulin measurements) and a 10 cM genome scan were available for 78 white families (n = 532 subjects). These data were used to conduct multipoint, multivariate linkage analyses, including tests for coincident linkage and complete pleiotropy. The strongest evidence for linkage from the bivariate analyses was observed on chromosome 1 (1p22.2) for HDL-TG (univariate lod score equivalent, lod(eq) = 3.99), with stronger results from the trivariate analysis at the same location (HDL-TG-Insulin; lod(eq) = 4.32). Seven additional susceptibility regions (lod(eq) scores >1.9) were observed (1p36, 1q23, 2q21.2, 8q23.3, 14q23.2, 14q32.11, and 20p11.21). The results from this study indicate that several correlated traits of the MetS are influenced by the same gene(s) that account for some of the clustering of the MetS features.
Affiliation(s)
- Karen L Edwards
- Department of Epidemiology, School of Public Health, University of Washington, Seattle, Washington, USA.
|
49
|
Gupta M, Cheung CL, Hsu YH, Demissie S, Cupples LA, Kiel DP, Karasik D. Identification of homogeneous genetic architecture of multiple genetically correlated traits by block clustering of genome-wide associations. J Bone Miner Res 2011; 26:1261-71. [PMID: 21611967 PMCID: PMC3312758 DOI: 10.1002/jbmr.333] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Indexed: 01/11/2023]
Abstract
Genome-wide association studies (GWAS) using high-density genotyping platforms offer an unbiased strategy to identify new candidate genes for osteoporosis. It is imperative to be able to clearly distinguish signal from noise by focusing on the best phenotype in a genetic study. We performed GWAS of multiple phenotypes associated with fractures [bone mineral density (BMD), bone quantitative ultrasound (QUS), bone geometry, and muscle mass] with approximately 433,000 single-nucleotide polymorphisms (SNPs) and created a database of resulting associations. We performed analysis of GWAS data from 23 phenotypes by a novel modification of a block clustering algorithm followed by gene-set enrichment analysis. A data matrix of standardized regression coefficients was partitioned along both axes: SNPs and phenotypes. Each partition represents a distinct cluster of SNPs that have similar effects over a particular set of phenotypes. Application of this method to our data shows several SNP-phenotype connections. We found a strong cluster of association coefficients of high magnitude for 10 traits (BMD at several skeletal sites, ultrasound measures, cross-sectional bone area, and section modulus of femoral neck and shaft). These clustered traits were highly genetically correlated. Gene-set enrichment analyses indicated the augmentation of genes that cluster with the 10 osteoporosis-related traits in pathways such as aldosterone signaling in epithelial cells, role of osteoblasts, osteoclasts, and chondrocytes in rheumatoid arthritis, and Parkinson signaling. In addition to several known candidate genes, we also identified PRKCH and SCNN1B as potential candidate genes for multiple bone traits. In conclusion, our mining of GWAS results revealed the similarity of association results between bone strength phenotypes that may be attributed to pleiotropic effects of genes. This knowledge may prove helpful in identifying novel genes and pathways that underlie several correlated phenotypes, as well as in deciphering genetic and phenotypic modularity underlying osteoporosis risk.
Affiliation(s)
- Mayetri Gupta
- Department of Biostatistics, Boston University, Boston, MA, USA
|
50
|
Saless N, Litscher SJ, Houlihan MJ, Han IK, Wilson D, Demant P, Blank RD. Comprehensive skeletal phenotyping and linkage mapping in an intercross of recombinant congenic mouse strains HcB-8 and HcB-23. Cells Tissues Organs 2011; 194:244-8. [PMID: 21625064 DOI: 10.1159/000324774] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022] Open
Abstract
Bone biomechanical performance is a complex trait or, more properly, an ensemble of complex traits. Biomechanical performance incorporates flexibility under loading, yield and failure load, and energy to failure; all are important measures of bone function. To date, the vast majority of work has focused on yield and failure load and its surrogate, bone mineral density. We performed a reciprocal intercross of the mouse strains HcB-8 and HcB-23 to map and ultimately identify genes that contribute to differences in biomechanical performance. Mechanical testing was performed by 3-point bending of the femora. We measured femoral diaphysis cross-sectional anatomy from photographs of the fracture surfaces. We used beam equations to calculate material level mechanical properties. We performed a principal component (PC) analysis of normalized whole bone phenotypes (17 input traits). We measured distances separating mandibular landmarks from calibrated digital photographs and performed linkage analysis. Experiment-wide α = 0.05 significance thresholds were established by permutation testing. Three quantitative trait loci (QTLs) identified in these studies illustrate the advantages of the comprehensive phenotyping approach. A pleiotropic QTL on chromosome 4 affected multiple whole bone phenotypes with LOD scores as large as 17.5, encompassing size, cross-sectional ellipticity, stiffness, yield and failure load, and bone mineral density. This locus was linked to 3 of the PCs but unlinked to any of the tissue level phenotypes. From this pattern, we infer that the QTL operates by modulating the proliferative response to mechanical loading. On this basis, we successfully predicted that this locus also affects the length of a specific region of the mandible. A pleiotropic locus on chromosome 10 displays opposite effects on failure load and toughness, with LOD scores of 4.5 and 5.5, respectively, so that the allele that increases failure load decreases toughness. A chromosome 19 QTL for PC2 with an LOD score of 4.8 was not detected with either the whole bone or tissue level phenotypes. We conclude that first, comprehensive, system-oriented phenotyping provides much information that could not be obtained by focusing on bone mineral density alone. Second, mechanical performance includes inherent trade-offs between strength and brittleness. Third, considering the aggregate phenotypic data allows prediction of novel QTLs.
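The permutation step named in this abstract (experiment-wide α = 0.05 thresholds) can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes single-marker regression, where LOD = -(n/2)·log10(1 - r²), rather than interval mapping, and the function names, the simulated genotypes, and the number of permutations are all hypothetical.

```python
import math
import random
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def marker_lod(genotypes: Sequence[float], phenotype: Sequence[float]) -> float:
    """LOD score of a single-marker linear regression (simplifying assumption)."""
    n = len(phenotype)
    r2 = min(pearson_r(genotypes, phenotype) ** 2, 1 - 1e-12)  # avoid log10(0)
    return -(n / 2) * math.log10(1 - r2)


def permutation_threshold(markers: List[List[int]], phenotype: List[float],
                          alpha: float = 0.05, n_perm: int = 200,
                          seed: int = 0) -> float:
    """Experiment-wide LOD threshold from the permutation null distribution.

    Each permutation shuffles the phenotype (breaking any genotype-phenotype
    link) and records the largest LOD over all markers; the threshold is the
    (1 - alpha) quantile of those maxima.
    """
    rng = random.Random(seed)
    shuffled = list(phenotype)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_lods.append(max(marker_lod(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return max_lods[index]


# Simulated toy data (hypothetical): 40 animals, 25 markers coded 0/1/2.
rng = random.Random(1)
n_animals = 40
markers = [[rng.choice([0, 1, 2]) for _ in range(n_animals)] for _ in range(25)]
phenotype = [rng.gauss(0.0, 1.0) for _ in range(n_animals)]
threshold = permutation_threshold(markers, phenotype)
print(f"experiment-wide LOD threshold: {threshold:.2f}")
```

Taking the maximum over all markers within each permutation is what makes the threshold experiment-wide: it controls the chance that any marker in the scan exceeds it under the null hypothesis.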
Affiliation(s)
- Neema Saless
- Department of Medicine, University of Wisconsin, Madison, Wisc., USA
|