1
Kontou PI, Bagos PG. The goldmine of GWAS summary statistics: a systematic review of methods and tools. BioData Min 2024; 17:31. [PMID: 39238044 PMCID: PMC11375927 DOI: 10.1186/s13040-024-00385-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 08/27/2024] [Indexed: 09/07/2024] Open
Abstract
Genome-wide association studies (GWAS) have revolutionized our understanding of the genetic architecture of complex traits and diseases. GWAS summary statistics have become essential tools for various genetic analyses, including meta-analysis, fine-mapping, and risk prediction. However, the increasing number of GWAS summary statistics and the diversity of software tools available for their analysis can make it challenging for researchers to select the most appropriate tools for their specific needs. This systematic review aims to provide a comprehensive overview of the currently available software tools and databases for GWAS summary statistics analysis. We conducted a comprehensive literature search to identify relevant software tools and databases. We categorized the tools and databases by their functionality, including data management, quality control, single-trait analysis, and multiple-trait analysis. We also compared the tools and databases based on their features, limitations, and user-friendliness. Our review identified a total of 305 functioning software tools and databases dedicated to GWAS summary statistics, each with unique strengths and limitations. We provide descriptions of the key features of each tool and database, including their input/output formats, data types, and computational requirements. We also discuss the overall usability and applicability of each tool for different research scenarios. This comprehensive review will serve as a valuable resource for researchers who are interested in using GWAS summary statistics to investigate the genetic basis of complex traits and diseases. By providing a detailed overview of the available tools and databases, we aim to facilitate informed tool selection and maximize the effectiveness of GWAS summary statistics analysis.
Affiliation(s)
- Pantelis G Bagos
  - Department of Computer Science and Biomedical Informatics, University of Thessaly, 35131, Lamia, Greece
2
Xu C, Ganesh SK, Zhou X. mtPGS: Leverage multiple correlated traits for accurate polygenic score construction. Am J Hum Genet 2023; 110:1673-1689. [PMID: 37716346 PMCID: PMC10577082 DOI: 10.1016/j.ajhg.2023.08.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/04/2023] [Revised: 08/18/2023] [Accepted: 08/27/2023] [Indexed: 09/18/2023] Open
Abstract
Accurate polygenic scores (PGSs) facilitate the genetic prediction of complex traits and aid in the development of personalized medicine. Here, we develop a statistical method called multi-trait assisted PGS (mtPGS), which can construct accurate PGSs for a target trait of interest by leveraging multiple traits relevant to the target trait. Specifically, mtPGS borrows SNP effect size similarity information between the target trait and its relevant traits to improve the effect size estimation on the target trait, thus achieving accurate PGSs. In the process, mtPGS flexibly models the shared genetic architecture between the target and the relevant traits to achieve robust performance, while explicitly accounting for the environmental covariance among them to accommodate different study designs with various sample overlap patterns. In addition, mtPGS uses only summary statistics as input and relies on a deterministic algorithm with several algebraic techniques for scalable computation. We evaluate the performance of mtPGS through comprehensive simulations and applications to 25 traits in the UK Biobank, where in the real data mtPGS achieves an average of 0.90%-52.91% accuracy gain compared to the state-of-the-art PGS methods. Overall, mtPGS represents an accurate, fast, and robust solution for PGS construction in biobank-scale datasets.
Affiliation(s)
- Chang Xu
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Santhi K Ganesh
  - Department of Internal Medicine, Division of Cardiovascular Medicine, University of Michigan, Ann Arbor, MI, USA; Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
3
Wang J, Wang W, Li H. Sparse block signal detection and identification for shared cross-trait association analysis. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Jianqiao Wang
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania
- Wanjie Wang
  - Department of Statistics and Applied Probability, National University of Singapore
- Hongzhe Li
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania
4
Xiao J, Cai M, Hu X, Wan X, Chen G, Yang C. XPXP: improving polygenic prediction by cross-population and cross-phenotype analysis. Bioinformatics 2022; 38:1947-1955. [PMID: 35040939 DOI: 10.1093/bioinformatics/btac029] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/20/2021] [Revised: 11/16/2021] [Accepted: 01/12/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: With increasing sample sizes in genome-wide association studies (GWASs), polygenic risk scores (PRSs) have shown great potential in personalized medicine for disease risk prediction, prevention and treatment. However, a PRS constructed using European samples becomes less accurate when it is applied to individuals from non-European populations. It is an urgent task to improve the accuracy of PRSs in under-represented populations, such as African and East Asian populations.
RESULTS: In this article, we propose a cross-population and cross-phenotype (XPXP) method for the construction of PRSs in under-represented populations. XPXP can construct accurate PRSs by leveraging biobank-scale datasets in European populations and multiple GWASs of genetically correlated phenotypes. XPXP also allows the incorporation of population-specific and phenotype-specific effects, and thus further improves the accuracy of PRSs. Through comprehensive simulation studies and real data analysis, we demonstrated that XPXP outperformed existing PRS approaches. We showed that the height PRSs constructed by XPXP achieved 9% and 18% improvement over the runner-up method in terms of predicted R² in East Asian and African populations, respectively. We also showed that XPXP substantially improved the stratification ability in identifying individuals at high genetic risk of type 2 diabetes.
AVAILABILITY AND IMPLEMENTATION: The XPXP software and all analysis code are available at github.com/YangLabHKUST/XPXP.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jiashun Xiao
  - Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Mingxuan Cai
  - Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Xianghong Hu
  - Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Xiang Wan
  - Shenzhen Research Institute of Big Data, Shenzhen 518172, China; Pazhou Lab, Guangzhou 510330, China
- Gang Chen
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Can Yang
  - Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
5
Wu Y, Furuya S, Wang Z, Nobles JE, Fletcher JM, Lu Q. GWAS on birth year infant mortality rates provides evidence of recent natural selection. Proc Natl Acad Sci U S A 2022; 119:e2117312119. [PMID: 35290122 PMCID: PMC8944929 DOI: 10.1073/pnas.2117312119] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/20/2021] [Accepted: 02/07/2022] [Indexed: 01/17/2023] Open
Abstract
Following more than a century of phenotypic measurement of natural selection processes, much recent work explores relationships between molecular genetic measurements and realized fitness in the next generation. We take an innovative approach to the study of contemporary selective pressure by examining which genetic variants are “sustained” in populations as mortality exposure increases. Specifically, we deploy a so-called “regional GWAS” (genome-wide association study) that links the infant mortality rate (IMR) by place and year in the United Kingdom with common genetic variants among birth cohorts in the UK Biobank. These cohorts (born between 1936 and 1970) saw a decline in IMR from above 65 to under 20 deaths per 1,000 live births, with substantial subnational variations and spikes alongside wartime exposures. Our results show several genome-wide significant loci, including LCT and TLR10/1/6, related to area-level cohort IMR exposure during gestation and infancy. Genetic correlations are found across multiple domains, including fertility, cognition, health behaviors, and health outcomes, suggesting an important role for cohort selection in modern populations.
Affiliation(s)
- Yuchang Wu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, Madison, WI 53706
  - Center for Demography of Health and Aging, University of Wisconsin–Madison, Madison, WI 53706
- Shiro Furuya
  - Department of Sociology, University of Wisconsin–Madison, Madison, WI 53706
- Zihang Wang
  - Department of Statistics, University of Wisconsin–Madison, Madison, WI 53706
- Jenna E. Nobles
  - Center for Demography of Health and Aging, University of Wisconsin–Madison, Madison, WI 53706
  - Department of Sociology, University of Wisconsin–Madison, Madison, WI 53706
- Jason M. Fletcher
  - Center for Demography of Health and Aging, University of Wisconsin–Madison, Madison, WI 53706
  - Department of Sociology, University of Wisconsin–Madison, Madison, WI 53706
  - La Follette School of Public Affairs, University of Wisconsin–Madison, Madison, WI 53706
- Qiongshi Lu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, Madison, WI 53706
  - Center for Demography of Health and Aging, University of Wisconsin–Madison, Madison, WI 53706
  - Department of Statistics, University of Wisconsin–Madison, Madison, WI 53706
6
Ma Y, Zhou X. Genetic prediction of complex traits with polygenic scores: a statistical review. Trends Genet 2021; 37:995-1011. [PMID: 34243982 PMCID: PMC8511058 DOI: 10.1016/j.tig.2021.06.004] [Citation(s) in RCA: 68] [Impact Index Per Article: 17.0] [Received: 03/04/2021] [Revised: 05/31/2021] [Accepted: 06/03/2021] [Indexed: 01/03/2023]
Abstract
Accurate genetic prediction of complex traits can facilitate disease screening, improve early intervention, and aid in the development of personalized medicine. Genetic prediction of complex traits requires the development of statistical methods that can properly model polygenic architecture and construct a polygenic score (PGS). We present a comprehensive review of 46 methods for PGS construction. We connect the majority of these methods through a multiple linear regression framework which can be instrumental for understanding their prediction performance for traits with distinct genetic architectures. We discuss the practical considerations of PGS analysis as well as challenges and future directions of PGS method development. We hope our review serves as a useful reference both for statistical geneticists who develop PGS methods and for data analysts who perform PGS analysis.
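As a brief illustration of the quantity this abstract discusses, a polygenic score is commonly computed as a weighted sum of an individual's allele dosages, with per-SNP effect size estimates as the weights. The sketch below assumes that additive form; the function name, the toy dosages, and the toy effect sizes are hypothetical and are not taken from the review or from any of its 46 methods.

```python
import numpy as np


def polygenic_score(dosages, effect_sizes):
    """Return one score per individual: the dosage-weighted sum of SNP effects.

    dosages      : (n_individuals, n_snps) array of allele counts (0, 1 or 2)
    effect_sizes : (n_snps,) array of estimated per-allele effects
    """
    dosages = np.asarray(dosages, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    if dosages.ndim != 2 or effect_sizes.ndim != 1:
        raise ValueError("expected a 2-D dosage matrix and a 1-D effect vector")
    if dosages.shape[1] != effect_sizes.shape[0]:
        raise ValueError("number of SNPs differs between dosages and effects")
    return dosages @ effect_sizes


# Hypothetical toy data: two individuals, three SNPs.
toy_dosages = np.array([[0, 1, 2],
                        [2, 0, 1]])
toy_effects = np.array([0.10, -0.20, 0.05])
toy_scores = polygenic_score(toy_dosages, toy_effects)
print(toy_scores)
```

PGS methods differ mainly in how the effect sizes are estimated; the summation step itself is the same.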
Affiliation(s)
- Ying Ma
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
7
Wang T, Lu H, Zeng P. Identifying pleiotropic genes for complex phenotypes with summary statistics from a perspective of composite null hypothesis testing. Brief Bioinform 2021; 23:6375058. [PMID: 34571531 DOI: 10.1093/bib/bbab389] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/05/2021] [Revised: 08/06/2021] [Accepted: 08/28/2021] [Indexed: 12/13/2022] Open
Abstract
Pleiotropy has important implications for the genetic connection among complex phenotypes and facilitates our understanding of disease etiology. Genome-wide association studies provide an unprecedented opportunity to detect pleiotropic associations; however, efficient pleiotropy test methods are still lacking. We here consider pleiotropy identification from a methodological perspective of high-dimensional composite null hypothesis testing and propose a powerful gene-based method called MAIUP. MAIUP is constructed based on the traditional intersection-union test with two sets of independent P-values as input and follows a novel idea that was originally proposed under the high-dimensional mediation analysis framework. The key improvement of MAIUP is that it takes the composite null nature of the pleiotropy test into account by fitting a three-component mixture null distribution, which can ultimately generate well-calibrated P-values for effective control of the family-wise error rate and false discovery rate. Another attractive advantage of MAIUP is its ability to effectively address the issue of overlapping subjects commonly encountered in association studies. Simulation studies demonstrate that, compared with other methods, only MAIUP can maintain correct type I error control and has higher power across a wide range of scenarios. We apply MAIUP to detect shared associated genes among 14 psychiatric disorders with summary statistics and discover many new pleiotropic genes that would otherwise not be identified if the issue of composite null hypothesis testing were not accounted for. Functional and enrichment analyses offer additional evidence supporting the validity of these identified pleiotropic genes associated with psychiatric disorders. Overall, MAIUP represents an efficient method for pleiotropy identification.
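For orientation, the traditional intersection-union test that the abstract names as MAIUP's starting point can be sketched as follows: a gene is declared pleiotropic only if it is associated with both traits, so the combined P-value is the larger of the two trait-level P-values. This is a minimal sketch of that classical rule only. It does not implement MAIUP's three-component mixture null or its handling of overlapping subjects, and the function name and example values are hypothetical.

```python
def intersection_union_pvalue(p_trait1, p_trait2):
    """Classical intersection-union P-value for 'associated with both traits'.

    The null is composite (no association with at least one trait), so it is
    rejected only when both P-values are small: the combined P-value is their
    maximum. This plain max-P rule is known to be conservative under the
    composite null, which is the calibration issue the abstract refers to.
    """
    for p in (p_trait1, p_trait2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("P-values must lie in [0, 1]")
    return max(p_trait1, p_trait2)


# Hypothetical gene-level P-values for two traits.
combined = intersection_union_pvalue(1e-6, 0.03)
print(combined)
```

The weaker of the two signals determines the result, which is why a gene strongly associated with only one trait is not flagged.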
Affiliation(s)
- Ting Wang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
- Haojie Lu
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
- Ping Zeng
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China; Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China; Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
8
Zambrano-Gonzalez G, Almanza-Pinzon MI, Vélez-T M. Genetic parameters in traits of productive importance in lines of Bombyx mori L. J Anim Breed Genet 2021; 139:136-144. [PMID: 34510553 PMCID: PMC9293116 DOI: 10.1111/jbg.12647] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/17/2021] [Revised: 08/14/2021] [Accepted: 09/01/2021] [Indexed: 11/26/2022]
Abstract
In the textile industry, complex cocoon traits are closely related to silk production. The aim of the present study was to estimate genetic parameters of economically important traits, namely cocoon length (CL), cocoon weight (CW) and shell weight (SW), in three B. mori lines, Chinese (C6), Japanese (J7) and Indian (C. Nichi), reared under different temperature and photoperiod conditions. For each of these lines, data were collected from several generations with a full-sibling family structure, and variance-covariance components were obtained via restricted maximum likelihood (REML) estimation based on a bi-trait animal model analysed with the multiple-trait derivative-free restricted maximum likelihood (MTDFREML) software. Genetic parameters of the traits varied between the silkworm lines evaluated. Heritabilities were highest in the J7 line (0.71, 0.89 and 0.93 for CL, CW and SW, respectively), followed by C6 (0.48, 0.54 and 0.50 for CL, CW and SW, respectively) and C. Nichi (0.36, 0.43 and 0.40 for CL, CW and SW, respectively). Phenotypic correlations among these lines were positive, with values ranging between 0.36 and 0.767. Similarly, genetic correlations between the analysed silkworm lines were observed to be positive, with high values ranging from 0.86 to 0.94. Evidence for environmental correlation in these lines was found only between the CW and SW traits, with moderate to high values ranging from 0.600 to 0.940. The magnitude of heritability and genetic correlations implies that phenotypic variations of the CL, CW and SW traits depend mainly on genotypic variation within the J7, C6 and C. Nichi lines, and that simultaneous genetic gains are possible by implementing selection processes for any of the evaluated traits.
Affiliation(s)
- Giselle Zambrano-Gonzalez
  - Department of Biology, Faculty of Natural, Exact and Education Sciences, Group of Geology, Ecology and Conservation Studies (GECO), Universidad del Cauca, Popayán, Colombia
- Martha I Almanza-Pinzon
  - Department of Agricultural Sciences, Faculty of Agricultural Sciences, Integrated Production Systems Group (SISINPRO), Universidad del Cauca, Popayán, Colombia
- Mauricio Vélez-T
  - Department of Animal Science, Universidad Nacional de Colombia, Sede Palmira, Colombia
9
Wang L, Gao B, Fan Y, Xue F, Zhou X. Mendelian randomization under the omnigenic architecture. Brief Bioinform 2021; 22:6347949. [PMID: 34379090 DOI: 10.1093/bib/bbab322] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/20/2021] [Revised: 07/22/2021] [Accepted: 07/24/2021] [Indexed: 11/15/2022] Open
Abstract
Mendelian randomization (MR) is a common analytic tool for exploring the causal relationship among complex traits. Existing MR methods require selecting a small set of single nucleotide polymorphisms (SNPs) to serve as instrument variables. However, selecting a small set of SNPs may not be ideal, as most complex traits have a polygenic or omnigenic architecture and are each influenced by thousands of SNPs. Here, motivated by the recent omnigenic hypothesis, we present an MR method that uses all genome-wide SNPs for causal inference. Our method uses summary statistics from genome-wide association studies as input, accommodates the commonly encountered horizontal pleiotropy effects and relies on a composite likelihood framework for scalable computation. We refer to our method as the omnigenic Mendelian randomization, or OMR. We examine the power and robustness of OMR through extensive simulations including those under various modeling misspecifications. We apply OMR to several real data applications, where we identify multiple complex traits that potentially causally influence coronary artery disease (CAD) and asthma. The identified new associations reveal important roles of blood lipids, blood pressure and immunity underlying CAD as well as important roles of immunity and obesity underlying asthma.
Affiliation(s)
- Lu Wang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Boran Gao
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Yue Fan
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; School of Public Health, Health Science Center of Xi'an Jiaotong University, Xi'an, Shaanxi 710061, China
- Fuzhong Xue
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
10
Malik P, Kumar J, Singh S, Sharma S, Meher PK, Sharma MK, Roy JK, Sharma PK, Balyan HS, Gupta PK, Sharma S. Single-trait, multi-locus and multi-trait GWAS using four different models for yield traits in bread wheat. Molecular Breeding: New Strategies in Plant Improvement 2021; 41:46. [PMID: 37309385 PMCID: PMC10236106 DOI: 10.1007/s11032-021-01240-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/08/2021] [Accepted: 06/30/2021] [Indexed: 06/14/2023]
Abstract
A genome-wide association study (GWAS) for 10 yield and yield component traits was conducted using an association panel comprising 225 diverse spring wheat genotypes. The panel was genotyped using 10,904 SNPs and evaluated for three years (2016-2019), which constituted three environments (E1, E2 and E3). Heritability for different traits ranged from 29.21 to 97.69%. Marker-trait associations (MTAs) were identified for each trait using data from each environment separately and also using BLUP values. Four different models were used, which included three single trait models (CMLM, FarmCPU, SUPER) and one multi-trait model (mvLMM). Hundreds of MTAs were obtained using each model, but after Bonferroni correction, only 6 MTAs for 3 traits were available using CMLM, and 21 MTAs for 4 traits were available using FarmCPU; none of the 525 MTAs obtained using SUPER could qualify after Bonferroni correction. Using BLUP, 20 MTAs were available, five of which also figured among MTAs identified for individual environments. Using mvLMM model, after Bonferroni correction, 38 multi-trait MTAs, for 15 different trait combinations were available. Epistatic interactions involving 28 pairs of MTAs were also available for seven of the 10 traits; no epistatic interactions were available for GNPS, PH, and BYPP. As many as 164 putative candidate genes (CGs) were identified using all the 50 MTAs (CMLM, 3; FarmCPU, 9; mvLMM, 6, epistasis, 21 and BLUP, 11 MTAs), which ranged from 20 (CMLM) to 66 (epistasis) CGs. In-silico expression analysis of CGs was also conducted in different tissues at different developmental stages. The information generated through the present study proved useful for developing a better understanding of the genetics of each of the 10 traits; the study also provided novel markers for marker-assisted selection (MAS) to be utilized for the development of wheat cultivars with improved agronomic traits. 
Supplementary Information: The online version contains supplementary material available at 10.1007/s11032-021-01240-1.
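The Bonferroni correction mentioned in the abstract above divides the family-wise significance level by the number of tests, and an association is retained only if its P-value falls below that threshold. The abstract does not state the significance level or the exact number of tests used, so the sketch below assumes a level of 0.05 and one test per genotyped SNP (10,904). Both values, and the function names, are illustrative assumptions rather than the authors' settings.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def passes_bonferroni(p_value, alpha, n_tests):
    """True if a single P-value survives the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Assumed settings (not stated in the abstract): alpha = 0.05, one test per SNP.
ASSUMED_ALPHA = 0.05
ASSUMED_N_TESTS = 10904
threshold = bonferroni_threshold(ASSUMED_ALPHA, ASSUMED_N_TESTS)
print(f"Bonferroni threshold: {threshold:.3e}")
```

Because the threshold shrinks in proportion to the number of markers, only a small fraction of nominally significant marker-trait associations is expected to survive it, which matches the sharp drop in retained associations that the abstract reports.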
Affiliation(s)
- Parveen Malik
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh University, Meerut 250004, India
- Jitendra Kumar
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh University, Meerut 250004, India
  - National Agri-Food Biotechnology Institute (NABI), Sector 81, Sahibzada Ajit Singh Nagar, 140306 Punjab India
- Sahadev Singh
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh University, Meerut 250004, India
- Shiveta Sharma
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh University, Meerut 250004, India
- Prabina Kumar Meher
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi 110012, India
- Mukesh Kumar Sharma
  - Department of Mathematics, Chaudhary Charan Singh University, Meerut 250004, India
- Joy Kumar Roy
  - National Agri-Food Biotechnology Institute (NABI), Sector 81, Sahibzada Ajit Singh Nagar, 140306 Punjab India
- Pradeep Kumar Sharma
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh University, Meerut 250004, India
- Harindra Singh Balyan
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh University, Meerut 250004, India
- Pushpendra Kumar Gupta
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh University, Meerut 250004, India
- Shailendra Sharma
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh University, Meerut 250004, India
11
Wu Y, Zhong X, Lin Y, Zhao Z, Chen J, Zheng B, Li JJ, Fletcher JM, Lu Q. Estimating genetic nurture with summary statistics of multigenerational genome-wide association studies. Proc Natl Acad Sci U S A 2021; 118:e2023184118. [PMID: 34131076 PMCID: PMC8237646 DOI: 10.1073/pnas.2023184118] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 12/13/2022] Open
Abstract
Marginal effect estimates in genome-wide association studies (GWAS) are mixtures of direct and indirect genetic effects. Existing methods to dissect these effects require family-based, individual-level genetic, and phenotypic data with large samples, which is difficult to obtain in practice. Here, we propose a statistical framework to estimate direct and indirect genetic effects using summary statistics from GWAS conducted on own and offspring phenotypes. Applied to birth weight, our method showed nearly identical results with those obtained using individual-level data. We also decomposed direct and indirect genetic effects of educational attainment (EA), which showed distinct patterns of genetic correlations with 45 complex traits. The known genetic correlations between EA and higher height, lower body mass index, less-active smoking behavior, and better health outcomes were mostly explained by the indirect genetic component of EA. In contrast, the consistently identified genetic correlation of autism spectrum disorder (ASD) with higher EA resides in the direct genetic component. A polygenic transmission disequilibrium test showed a significant overtransmission of the direct component of EA from healthy parents to ASD probands. Taken together, we demonstrate that traditional GWAS approaches, in conjunction with offspring phenotypic data collection in existing cohorts, could greatly benefit studies on genetic nurture and shed important light on the interpretation of genetic associations for human complex traits.
Affiliation(s)
- Yuchang Wu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53706
  - Center for Demography of Health and Aging, University of Wisconsin-Madison, Madison, WI 53706
- Xiaoyuan Zhong
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53706
- Yunong Lin
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53706
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706
- Zijie Zhao
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53706
- Jiawen Chen
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514
- Boyan Zheng
  - Center for Demography of Health and Aging, University of Wisconsin-Madison, Madison, WI 53706
  - Department of Sociology, University of Wisconsin-Madison, Madison, WI 53706
- James J Li
  - Center for Demography of Health and Aging, University of Wisconsin-Madison, Madison, WI 53706
  - Department of Psychology, University of Wisconsin-Madison, Madison, WI 53706
  - Waisman Center, University of Wisconsin-Madison, Madison, WI 53706
- Jason M Fletcher
  - Center for Demography of Health and Aging, University of Wisconsin-Madison, Madison, WI 53706
  - Department of Sociology, University of Wisconsin-Madison, Madison, WI 53706
  - La Follette School of Public Affairs, University of Wisconsin-Madison, Madison, WI 53706
- Qiongshi Lu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53706
  - Center for Demography of Health and Aging, University of Wisconsin-Madison, Madison, WI 53706
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706