1
Luo L, Mehrotra DV, Shen J, Tang ZZ. Multi-trait analysis of gene-by-environment interactions in large-scale genetic studies. Biostatistics 2024; 25:504-520. [PMID: 36897773 DOI: 10.1093/biostatistics/kxad004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Revised: 02/15/2023] [Accepted: 02/22/2023] [Indexed: 03/11/2023]
Abstract
Identifying genotype-by-environment interaction (GEI) is challenging because the GEI analysis generally has low power. Large-scale consortium-based studies are ultimately needed to achieve adequate power for identifying GEI. We introduce Multi-Trait Analysis of Gene-Environment Interactions (MTAGEI), a powerful, robust, and computationally efficient framework to test gene-environment interactions on multiple traits in large data sets, such as the UK Biobank (UKB). To facilitate the meta-analysis of GEI studies in a consortium, MTAGEI efficiently generates summary statistics of genetic associations for multiple traits under different environmental conditions and integrates the summary statistics for GEI analysis. MTAGEI enhances the power of GEI analysis by aggregating GEI signals across multiple traits and variants that would otherwise be difficult to detect individually. MTAGEI achieves robustness by combining complementary tests under a wide spectrum of genetic architectures. We demonstrate the advantages of MTAGEI over existing single-trait-based GEI tests through extensive simulation studies and the analysis of the whole exome sequencing data from the UKB.
Affiliation(s)
- Lan Luo
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA
- Devan V Mehrotra
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., North Wales, PA 19454, USA
- Judong Shen
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA
- Zheng-Zheng Tang
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, 330 N Orchard St, Madison, WI 53715, USA
2
Tissink EP, Shadrin AA, van der Meer D, Parker N, Hindley G, Roelfs D, Frei O, Fan CC, Nagel M, Nærland T, Budisteanu M, Djurovic S, Westlye LT, van den Heuvel MP, Posthuma D, Kaufmann T, Dale AM, Andreassen OA. Abundant pleiotropy across neuroimaging modalities identified through a multivariate genome-wide association study. Nat Commun 2024; 15:2655. [PMID: 38531894 DOI: 10.1038/s41467-024-46817-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Accepted: 03/12/2024] [Indexed: 03/28/2024]
Abstract
Genetic pleiotropy is abundant across spatially distributed brain characteristics derived from one neuroimaging modality (e.g. structural, functional or diffusion magnetic resonance imaging [MRI]). A better understanding of pleiotropy across modalities could inform us on the integration of brain function, micro- and macrostructure. Here we show extensive genetic overlap across neuroimaging modalities at a locus and gene level in the UK Biobank (N = 34,029) and ABCD Study (N = 8607). When jointly analysing phenotypes derived from structural, functional and diffusion MRI in a genome-wide association study (GWAS) with the Multivariate Omnibus Statistical Test (MOSTest), we boost the discovery of loci and genes beyond previously identified effects for each modality individually. Cross-modality genes are involved in fundamental biological processes and predominantly expressed during prenatal brain development. We additionally boost prediction of psychiatric disorders by conditioning independent GWAS on our multimodal multivariate GWAS. These findings shed light on the shared genetic mechanisms underlying variation in brain morphology, functional connectivity, and tissue composition.
Affiliation(s)
- E P Tissink
  - Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, Vrije Universiteit Amsterdam, Amsterdam Neuroscience, 1081 HV, Amsterdam, The Netherlands
  - Department of Sleep and Cognition, Netherlands Institute for Neuroscience, an institute of the Royal Netherlands Academy of Arts and Sciences, Amsterdam, The Netherlands
- A A Shadrin
  - NORMENT Centre, Division of Mental Health and Addiction, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Building 48, Oslo, Norway
- D van der Meer
  - NORMENT Centre, Division of Mental Health and Addiction, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Building 48, Oslo, Norway
  - School of Mental Health and Neuroscience, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
- N Parker
  - NORMENT Centre, Division of Mental Health and Addiction, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Building 48, Oslo, Norway
- G Hindley
  - NORMENT Centre, Division of Mental Health and Addiction, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Building 48, Oslo, Norway
  - Psychosis Studies, Institute of Psychiatry, Psychology and Neurosciences, King's College London, 16 De Crespigny Park, London, SE5 8AB, United Kingdom
- D Roelfs
  - NORMENT Centre, Division of Mental Health and Addiction, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Building 48, Oslo, Norway
- O Frei
  - NORMENT Centre, Division of Mental Health and Addiction, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Building 48, Oslo, Norway
- C C Fan
  - Laureate Institute for Brain Research, Tulsa, OK, USA
  - Department of Radiology, University of California San Diego, La Jolla, CA, 92037, USA
- M Nagel
  - Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, Vrije Universiteit Amsterdam, Amsterdam Neuroscience, 1081 HV, Amsterdam, The Netherlands
- T Nærland
  - K.G. Jebsen Centre for Neurodevelopmental disorders, Division of Paediatric Medicine, Institute of Clinical Medicine, University of Oslo, Building 31, Oslo, Norway
- M Budisteanu
  - Prof. Dr. Alex Obregia Clinical Hospital of Psychiatry, Bucharest, Romania
  - "Victor Babes" National Institute of Pathology, Bucharest, Romania
- S Djurovic
  - NORMENT Centre, Division of Mental Health and Addiction, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Building 48, Oslo, Norway
  - K.G. Jebsen Centre for Neurodevelopmental disorders, Division of Paediatric Medicine, Institute of Clinical Medicine, University of Oslo, Building 31, Oslo, Norway
  - Department of Medical Genetics, Oslo University Hospital, Oslo, Norway
- L T Westlye
  - NORMENT Centre, Division of Mental Health and Addiction, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Building 48, Oslo, Norway
  - K.G. Jebsen Centre for Neurodevelopmental disorders, Division of Paediatric Medicine, Institute of Clinical Medicine, University of Oslo, Building 31, Oslo, Norway
  - Department of Psychology, University of Oslo, Oslo, Norway
- M P van den Heuvel
  - Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, Vrije Universiteit Amsterdam, Amsterdam Neuroscience, 1081 HV, Amsterdam, The Netherlands
  - Department of Child and Adolescent Psychology and Psychiatry, section Complex Trait Genetics, Amsterdam Neuroscience, VU University Medical Centre, Amsterdam, The Netherlands
- D Posthuma
  - Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, Vrije Universiteit Amsterdam, Amsterdam Neuroscience, 1081 HV, Amsterdam, The Netherlands
  - Department of Child and Adolescent Psychology and Psychiatry, section Complex Trait Genetics, Amsterdam Neuroscience, VU University Medical Centre, Amsterdam, The Netherlands
- T Kaufmann
  - NORMENT Centre, Division of Mental Health and Addiction, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Building 48, Oslo, Norway
  - Department of Psychiatry and Psychotherapy, Tübingen Center for Mental Health, University of Tübingen, Tübingen, Germany
- A M Dale
  - Department of Radiology, University of California San Diego, La Jolla, CA, 92037, USA
  - Center for Multimodal Imaging and Genetics, University of California San Diego, La Jolla, CA, 92037, USA
  - Department of Neurosciences, University of California San Diego, La Jolla, CA, 92037, USA
- O A Andreassen
  - NORMENT Centre, Division of Mental Health and Addiction, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Building 48, Oslo, Norway
  - K.G. Jebsen Centre for Neurodevelopmental disorders, Division of Paediatric Medicine, Institute of Clinical Medicine, University of Oslo, Building 31, Oslo, Norway
3
Zhai S, Mehrotra DV, Shen J. Applying polygenic risk score methods to pharmacogenomics GWAS: challenges and opportunities. Brief Bioinform 2023; 25:bbad470. [PMID: 38152980 PMCID: PMC10782924 DOI: 10.1093/bib/bbad470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 11/20/2023] [Accepted: 11/28/2023] [Indexed: 12/29/2023]
Abstract
Polygenic risk scores (PRSs) have emerged as promising tools for the prediction of human diseases and complex traits in disease genome-wide association studies (GWAS). Applying PRSs to pharmacogenomics (PGx) studies has begun to show great potential for improving patient stratification and drug response prediction. However, there are unique challenges that arise when applying PRSs to PGx GWAS beyond those typically encountered in disease GWAS (e.g. Eurocentric or trans-ethnic bias). These challenges include: (i) the lack of knowledge about whether PGx or disease GWAS/variants should be used in the base cohort (BC); (ii) the small sample sizes in PGx GWAS with corresponding low power and (iii) the more complex PRS statistical modeling required for handling both prognostic and predictive effects simultaneously. To gain insights in this landscape about the general trends, challenges and possible solutions, we first conduct a systematic review of both PRS applications and PRS method development in PGx GWAS. To further address the challenges, we propose (i) a novel PRS application strategy by leveraging both PGx and disease GWAS summary statistics in the BC for PRS construction and (ii) a new Bayesian method (PRS-PGx-Bayesx) to reduce Eurocentric or cross-population PRS prediction bias. Extensive simulations are conducted to demonstrate their advantages over existing PRS methods applied in PGx GWAS. Our systematic review and methodology research work not only highlights current gaps and key considerations while applying PRS methods to PGx GWAS, but also provides possible solutions for better PGx PRS applications and future research.
Affiliation(s)
- Song Zhai
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA
- Devan V Mehrotra
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., North Wales, PA 19454, USA
- Judong Shen
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA
4
Li X, Chen H, Selvaraj MS, Van Buren E, Zhou H, Wang Y, Sun R, McCaw ZR, Yu Z, Arnett DK, Bis JC, Blangero J, Boerwinkle E, Bowden DW, Brody JA, Cade BE, Carson AP, Carlson JC, Chami N, Chen YDI, Curran JE, de Vries PS, Fornage M, Franceschini N, Freedman BI, Gu C, Heard-Costa NL, He J, Hou L, Hung YJ, Irvin MR, Kaplan RC, Kardia SL, Kelly T, Konigsberg I, Kooperberg C, Kral BG, Li C, Loos RJ, Mahaney MC, Martin LW, Mathias RA, Minster RL, Mitchell BD, Montasser ME, Morrison AC, Palmer ND, Peyser PA, Psaty BM, Raffield LM, Redline S, Reiner AP, Rich SS, Sitlani CM, Smith JA, Taylor KD, Tiwari H, Vasan RS, Wang Z, Yanek LR, Yu B, Rice KM, Rotter JI, Peloso GM, Natarajan P, Li Z, Liu Z, Lin X. A statistical framework for powerful multi-trait rare variant analysis in large-scale whole-genome sequencing studies. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.30.564764. [PMID: 37961350 PMCID: PMC10634938 DOI: 10.1101/2023.10.30.564764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Large-scale whole-genome sequencing (WGS) studies have improved our understanding of the contributions of coding and noncoding rare variants to complex human traits. Leveraging association effect sizes across multiple traits in WGS rare variant association analysis can improve statistical power over single-trait analysis, and also detect pleiotropic genes and regions. Existing multi-trait methods have limited ability to perform rare variant analysis of large-scale WGS data. We propose MultiSTAAR, a statistical framework and computationally-scalable analytical pipeline for functionally-informed multi-trait rare variant analysis in large-scale WGS studies. MultiSTAAR accounts for relatedness, population structure and correlation among phenotypes by jointly analyzing multiple traits, and further empowers rare variant association analysis by incorporating multiple functional annotations. We applied MultiSTAAR to jointly analyze three lipid traits (low-density lipoprotein cholesterol, high-density lipoprotein cholesterol and triglycerides) in 61,861 multi-ethnic samples from the Trans-Omics for Precision Medicine (TOPMed) Program. We discovered new associations with lipid traits missed by single-trait analysis, including rare variants within an enhancer of NIPSNAP3A and an intergenic region on chromosome 1.
Affiliation(s)
- Xihao Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Han Chen
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Margaret Sunitha Selvaraj
  - Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Eric Van Buren
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Hufeng Zhou
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Yuxuan Wang
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Ryan Sun
  - Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Zachary R. McCaw
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Zhi Yu
  - Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Donna K. Arnett
  - Provost Office, University of South Carolina, Columbia, SC, USA
- Joshua C. Bis
  - Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- John Blangero
  - Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
- Eric Boerwinkle
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Donald W. Bowden
  - Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Jennifer A. Brody
  - Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Brian E. Cade
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
- April P. Carson
  - Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
- Jenna C. Carlson
  - Department of Human Genetics and Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Nathalie Chami
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Yii-Der Ida Chen
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Joanne E. Curran
  - Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
- Paul S. de Vries
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Myriam Fornage
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Brown Foundation Institute of Molecular Medicine, McGovern Medical School, the University of Texas Health Science Center at Houston, Houston, TX, USA
- Nora Franceschini
  - Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Barry I. Freedman
  - Department of Internal Medicine, Nephrology, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Charles Gu
  - Division of Biology & Biomedical Sciences, Washington University School of Medicine, St. Louis, MO, USA
- Nancy L. Heard-Costa
  - Department of Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
  - Framingham Heart Study, Framingham, MA, USA
- Jiang He
  - Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
  - Tulane University Translational Science Institute, New Orleans, LA, USA
- Lifang Hou
  - Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
- Yi-Jen Hung
  - Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Marguerite R. Irvin
  - Department of Epidemiology, School of Public Health, University of Alabama at Birmingham, Birmingham, AL, USA
- Robert C. Kaplan
  - Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Sharon L.R. Kardia
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Tanika Kelly
  - Department of Medicine, Division of Nephrology, University of Illinois Chicago, Chicago, IL, USA
- Iain Konigsberg
  - Department of Biomedical Informatics, University of Colorado, Aurora, CO, USA
- Charles Kooperberg
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Brian G. Kral
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Changwei Li
  - Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
  - Tulane University Translational Science Institute, New Orleans, LA, USA
- Ruth J.F. Loos
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Michael C. Mahaney
  - Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
- Lisa W. Martin
  - George Washington University School of Medicine and Health Sciences, Washington, DC, USA
- Rasika A. Mathias
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Ryan L. Minster
  - Department of Human Genetics and Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Braxton D. Mitchell
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- May E. Montasser
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Alanna C. Morrison
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Nicholette D. Palmer
  - Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Patricia A. Peyser
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Bruce M. Psaty
  - Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
  - Department of Epidemiology, University of Washington, Seattle, WA, USA
  - Department of Health Systems and Population Health, University of Washington, Seattle, WA, USA
- Laura M. Raffield
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Susan Redline
  - Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
- Alexander P. Reiner
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
  - Department of Epidemiology, University of Washington, Seattle, WA, USA
- Stephen S. Rich
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Colleen M. Sitlani
  - Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Jennifer A. Smith
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Kent D. Taylor
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Hemant Tiwari
  - Department of Biostatistics, School of Public Health, University of Alabama at Birmingham, Birmingham, AL, USA
- Ramachandran S. Vasan
  - Framingham Heart Study, Framingham, MA, USA
  - Department of Quantitative and Qualitative Health Sciences, UT Health San Antonio School of Public Health, San Antonio, TX, USA
- Zhe Wang
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Lisa R. Yanek
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Bing Yu
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Kenneth M. Rice
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Jerome I. Rotter
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Gina M. Peloso
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Pradeep Natarajan
  - Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Zilin Li
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Zhonghua Liu
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, USA
- Xihong Lin
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Statistics, Harvard University, Cambridge, MA, USA
5
Kim K, Jun TH, Ha BK, Wang S, Sun H. New statistical selection method for pleiotropic variants associated with both quantitative and qualitative traits. BMC Bioinformatics 2023; 24:381. [PMID: 37817069 PMCID: PMC10563219 DOI: 10.1186/s12859-023-05505-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Accepted: 09/28/2023] [Indexed: 10/12/2023]
Abstract
BACKGROUND: Identification of pleiotropic variants associated with multiple phenotypic traits has received increasing attention in genetic association studies. Overlapping genetic associations from multiple traits help to detect weak genetic associations missed by single-trait analyses. Many statistical methods have been developed to identify pleiotropic variants, but most of them are limited to quantitative traits, even though pleiotropic effects on both quantitative and qualitative traits have been observed. This is a statistically challenging problem because no appropriate multivariate distribution exists to model quantitative and qualitative data together. Alternatively, meta-analysis methods can be applied, which integrate summary statistics of individual variants associated with either a quantitative or a qualitative trait without accounting for correlations among genetic variants.
RESULTS: We propose a new statistical selection method based on a unified selection score quantifying how strongly a genetic variant, i.e., a pleiotropic variant, is associated with both quantitative and qualitative traits. In our extensive simulation studies, where various types of pleiotropic effects on both quantitative and qualitative traits were considered, we demonstrated that the proposed method outperforms the existing meta-analysis methods in terms of true positive selection. We also applied the proposed method to a peanut dataset with 6 quantitative and 2 qualitative traits, and a cowpea dataset with 2 quantitative and 6 qualitative traits. We were able to detect some potentially pleiotropic variants missed by the existing methods in both analyses.
CONCLUSIONS: The proposed method is able to locate pleiotropic variants associated with both quantitative and qualitative traits. It has been implemented in an R package 'UNISS', which can be downloaded from http://github.com/statpng/uniss.
Affiliation(s)
- Kipoong Kim
  - Department of Statistics, Pusan National University, 46241, Busan, Korea
- Tae-Hwan Jun
  - Department of Plant Bioscience, Pusan National University, 50463, Miryang, Korea
- Bo-Keun Ha
  - Department of Applied Plant Science, Chonnam National University, 61186, Gwangju, Korea
- Shuang Wang
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, 10032, USA
- Hokeun Sun
  - Department of Statistics, Pusan National University, 46241, Busan, Korea
6
Liang X, Sun H. Weighted Selection Probability to Prioritize Susceptible Rare Variants in Multi-Phenotype Association Studies with Application to a Soybean Genetic Data Set. J Comput Biol 2023; 30:1075-1088. [PMID: 37871292 DOI: 10.1089/cmb.2022.0487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Rare variant association studies with multiple traits or diseases have drawn a lot of attention since association signals of rare variants can be boosted if more than one phenotype outcome is associated with the same rare variants. Most of the existing statistical methods to identify rare variants associated with multiple phenotypes are based on a group test, where a pre-specified genetic region is tested one at a time. However, these methods are not designed to locate susceptible rare variants within the genetic region. In this article, we propose new statistical methods to prioritize rare variants within a genetic region when a group test for the genetic region identifies a statistical association with multiple phenotypes. It computes the weighted selection probability (WSP) of individual rare variants and ranks them from largest to smallest according to their WSP. In simulation studies, we demonstrated that the proposed method outperforms other statistical methods in terms of true positive selection, when multiple phenotypes are correlated with each other. We also applied it to our soybean single nucleotide polymorphism (SNP) data with 13 highly correlated amino acids, where we identified some potentially susceptible rare variants in chromosome 19.
Affiliation(s)
- Xianglong Liang
  - Department of Statistics, Pusan National University, Busan, Korea
- Hokeun Sun
  - Department of Statistics, Pusan National University, Busan, Korea
7
Zhai S, Guo B, Wu B, Mehrotra DV, Shen J. Integrating multiple traits for improving polygenic risk prediction in disease and pharmacogenomics GWAS. Brief Bioinform 2023:7169140. [PMID: 37200155 DOI: 10.1093/bib/bbad181] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/22/2023] [Revised: 03/30/2023] [Accepted: 04/21/2023] [Indexed: 05/20/2023]
Abstract
Polygenic risk score (PRS) has been recently developed for predicting complex traits and drug responses. It remains unknown whether multi-trait PRS (mtPRS) methods, by integrating information from multiple genetically correlated traits, can improve prediction accuracy and power for PRS analysis compared with single-trait PRS (stPRS) methods. In this paper, we first review commonly used mtPRS methods and find that they do not directly model the underlying genetic correlations among traits, which has been shown to be useful in guiding multi-trait association analysis in the literature. To overcome this limitation, we propose a mtPRS-PCA method to combine PRSs from multiple traits with weights obtained from performing principal component analysis (PCA) on the genetic correlation matrix. To accommodate various genetic architectures covering different effect directions, signal sparseness and across-trait correlation structures, we further propose an omnibus mtPRS method (mtPRS-O) by combining P values from mtPRS-PCA, mtPRS-ML (mtPRS based on machine learning) and stPRSs using Cauchy Combination Test. Our extensive simulation studies show that mtPRS-PCA outperforms other mtPRS methods in both disease and pharmacogenomics (PGx) genome-wide association studies (GWAS) contexts when traits are similarly correlated, with dense signal effects and in similar effect directions, and mtPRS-O is consistently superior to most other methods due to its robustness under various genetic architectures. We further apply mtPRS-PCA, mtPRS-O and other methods to PGx GWAS data from a randomized clinical trial in the cardiovascular domain and demonstrate performance improvement of mtPRS-PCA in both prediction accuracy and patient stratification as well as the robustness of mtPRS-O in PRS association test.
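As a rough illustration of the mtPRS-PCA idea described in the abstract above, the sketch below weights per-trait PRSs by the loadings of the first principal component of a genetic correlation matrix. Using only the first component, the power-iteration routine, and all function names and input values are assumptions made for illustration; they are not taken from the paper or its software.

```python
import math


def leading_eigenvector(matrix, iterations=500, tol=1e-12):
    """Approximate the leading eigenvector of a symmetric matrix by power iteration.

    Assumption: the start vector (1, 2, ..., n) is not orthogonal to the
    leading eigenvector, which holds for typical correlation matrices.
    """
    n = len(matrix)
    v = [float(i + 1) for i in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        converged = max(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    # Fix the sign so the largest-magnitude loading is positive.
    pivot = max(v, key=abs)
    if pivot < 0:
        v = [-x for x in v]
    return v


def pca_weighted_prs(prs_by_person, genetic_corr):
    """Combine per-trait PRSs into one score per person.

    prs_by_person: list of rows, one per person, one PRS per trait (hypothetical input).
    genetic_corr:  trait-by-trait genetic correlation matrix (hypothetical input).
    """
    weights = leading_eigenvector(genetic_corr)
    scores = [sum(w * s for w, s in zip(weights, row)) for row in prs_by_person]
    return scores, weights


# Toy example with two positively correlated traits (made-up numbers).
corr = [[1.0, 0.5], [0.5, 1.0]]
prs = [[1.0, 1.0], [0.2, -0.4]]
scores, weights = pca_weighted_prs(prs, corr)
```

With two equally correlated traits the two weights come out equal, so the combined score is proportional to the sum of the two PRSs.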
Affiliation(s)
- Song Zhai
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA
| | - Bin Guo
- Data and Genome Science, Merck & Co., Inc., Cambridge, MA 02141, USA
| | - Baolin Wu
- Department of Epidemiology and Biostatistics, University of California Irvine, Irvine, CA 92697, USA
| | - Devan V Mehrotra
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., North Wales, PA 19454, USA
| | - Judong Shen
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA
|
8
|
Wang J, Jiang Z, Guo H, Li Z. Divided-and-combined omnibus test for genetic association analysis with high-dimensional data. Stat Methods Med Res 2023; 32:626-637. [PMID: 36652550 DOI: 10.1177/09622802231151204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2023]
Abstract
Advances in biological technology enable researchers to obtain huge amounts of genetic and genomic data, whose dimensions are often quite high in both phenotypes and variants. Testing the association of variants with multiple phenotypes has been a hot topic in recent years. Traditional single-phenotype multiple-variant analysis has to be adjusted for multiple testing and thus suffers substantial power loss because it ignores the correlation across phenotypes. The similarity-based method, which uses the trace of the product of two similarity matrices as a test statistic, has emerged as a useful tool to handle this problem. However, it loses power when the correlation among the phenotypes is moderate or strong, because some signals represented by the eigenvalues of the phenotypic similarity matrix are masked by others. We propose a divided-and-combined omnibus test to address this drawback of the similarity-based method. Following the divided-and-combined strategy, we first divide signals into two groups at a series of cut points according to the eigenvalues of the phenotypic similarity matrix and then combine the analysis results via the Cauchy-combined method to reach a final statistic. Extensive simulations and an application to a pig dataset demonstrate that the proposed statistic is much more powerful and robust than the original test under most of the considered scenarios, and sometimes the power increase can be more than 0.6. The divided-and-combined omnibus test facilitates genetic association analysis with high-dimensional data and achieves much higher power than the existing similarity-based method. In fact, the divided-and-combined omnibus test can be used whenever an association analysis between two multivariate variables needs to be conducted.
Affiliation(s)
- Jinjuan Wang
- School of Mathematics and Statistics, Beijing Institute of Technology, Beijing, China
| | - Zhenzhen Jiang
- LSC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China; School of Mathematical Science, University of Chinese Academy of Sciences, Beijing, China
| | - Hongping Guo
- School of Mathematics and Statistics, Hubei Normal University, Huangshi, China
| | - Zhengbang Li
- School of Mathematics and Statistics, Central China Normal University, Wuhan, China
|
9
|
Abbas-Aghababazadeh F, Xu W, Haibe-Kains B. The impact of violating the independence assumption in meta-analysis on biomarker discovery. Front Genet 2023; 13:1027345. [PMID: 36726714 PMCID: PMC9885264 DOI: 10.3389/fgene.2022.1027345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Accepted: 11/25/2022] [Indexed: 01/06/2023] Open
Abstract
With rapid advancements in high-throughput sequencing technologies, massive amounts of "-omics" data are now available in almost every biomedical field. Due to variance in biological models and analytic methods, findings from clinical and biological studies are often not generalizable when tested in independent cohorts. Meta-analysis, a set of statistical tools to integrate independent studies addressing similar research questions, has been proposed to improve the accuracy and robustness of new biological insights. However, it is common practice among biomarker discovery studies using preclinical pharmacogenomic data to borrow molecular profiles of cancer cell lines from one study to another, creating dependence across studies. The impact of violating the independence assumption in meta-analyses is largely unknown. In this study, we review and compare different meta-analyses to estimate variations across studies along with biomarker discoveries using preclinical pharmacogenomics data. We further evaluate, via simulation studies, the performance of conventional meta-analysis in which the dependence of the effects is ignored. Results show that, as the number of non-independent effects increased, the relative mean squared error increased and the coverage probability decreased. We also assess potential bias in the estimation of effects for established meta-analysis approaches when data are duplicated and the assumption of independence is violated. Using pharmacogenomics biomarker discovery, we find that treating dependent studies as independent can substantially increase the bias of meta-analyses. Importantly, we show that violating the independence assumption decreases the generalizability of the biomarker discovery process and increases false positive results, a key challenge in precision oncology.
Affiliation(s)
| | - Wei Xu
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
| | - Benjamin Haibe-Kains
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada; Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada; Ontario Institute for Cancer Research, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; *Correspondence: Benjamin Haibe-Kains
|
10
|
Chen W, Coombes BJ, Larson NB. Recent advances and challenges of rare variant association analysis in the biobank sequencing era. Front Genet 2022; 13:1014947. [PMID: 36276986 PMCID: PMC9582646 DOI: 10.3389/fgene.2022.1014947] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/09/2022] [Accepted: 09/22/2022] [Indexed: 12/04/2022] Open
Abstract
Causal variants for rare genetic diseases are often rare in the general population. Rare variants may also contribute to common complex traits and can have much larger per-allele effect sizes than common variants, although power to detect these associations can be limited. Sequencing costs have steadily declined with technological advancements, making it feasible to adopt whole-exome and whole-genome profiling for large biobank-scale sample sizes. These large amounts of sequencing data provide both opportunities and challenges for rare-variant association analysis. Herein, we review the basic concepts of rare-variant analysis methods, the current state-of-the-art methods that use variant annotations or external controls to improve statistical power, and particular challenges facing rare-variant analysis, such as accounting for population structure and extremely unbalanced case-control designs. We also review recent advances and challenges in rare-variant analysis for familial sequencing data and for more complex phenotypes such as survival data. Finally, we discuss other potential directions for further methodology investigation.
Affiliation(s)
- Wenan Chen
- Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, TN, United States
- *Correspondence: Wenan Chen; Brandon J. Coombes; Nicholas B. Larson
| | - Brandon J. Coombes
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, United States
| | - Nicholas B. Larson
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, United States
|
11
|
Identification of microbial features in multivariate regression under false discovery rate control. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2022.107621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
12
|
Shao Z, Wang T, Qiao J, Zhang Y, Huang S, Zeng P. A comprehensive comparison of multilocus association methods with summary statistics in genome-wide association studies. BMC Bioinformatics 2022; 23:359. [PMID: 36042399 PMCID: PMC9429742 DOI: 10.1186/s12859-022-04897-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/27/2022] [Accepted: 08/22/2022] [Indexed: 02/07/2023] Open
Abstract
Background: Multilocus analysis on a set of single nucleotide polymorphisms (SNPs) pre-assigned within a gene constitutes a valuable complement to single-marker analysis by aggregating data on complex traits in a biologically meaningful way. However, despite the existence of a wide variety of SNP-set methods, few comprehensive comparison studies have been performed to evaluate the effectiveness of these methods. Results: We sought to fill this knowledge gap by conducting a comprehensive empirical comparison of 22 commonly used summary-statistics-based SNP-set methods. We showed that only seven methods could effectively control the type I error, and that these well-calibrated approaches had varying power under the simulation scenarios. Overall, we confirmed that the burden test was generally underpowered and that score-based variance component tests (e.g., the sequence kernel association test) were much more powerful under the polygenic genetic architecture in both common and rare variant association analyses. We further revealed that two linkage-disequilibrium-free P value combination methods (the harmonic mean P value method and the aggregated Cauchy association test) behaved very well under the sparse genetic architecture in simulations and real-data applications to common and rare variant association analyses as well as in expression quantitative trait loci weighted integrative analysis. We also assessed the scalability of these approaches by recording computational time and found that all these methods scale to biobank-scale data, although some might be relatively slow. Conclusion: We hope that our findings offer important guidance on how to choose appropriate multilocus association analysis methods in the post-GWAS era. All the SNP-set methods are implemented in the R package called MCA, which is freely available at https://github.com/biostatpzeng/.
Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04897-3.
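For readers unfamiliar with the harmonic mean P value mentioned in the abstract above, the following minimal sketch computes the raw weighted harmonic mean of a set of per-SNP P values. It is an illustration only, not code from the MCA package: the function name and the example P values are hypothetical, and the raw harmonic mean is a summary statistic rather than a calibrated P value, since the published procedure applies a further adjustment that is omitted here.

```python
def harmonic_mean_p(pvalues, weights=None):
    """Raw weighted harmonic mean of P values.

    Note: this is the uncalibrated statistic; turning it into an exact
    P value needs an additional adjustment that this sketch leaves out.
    """
    if not pvalues:
        raise ValueError("at least one P value is required")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("P values must lie in (0, 1]")
    if weights is None:
        # Equal weights by default.
        weights = [1.0 / len(pvalues)] * len(pvalues)
    if len(weights) != len(pvalues):
        raise ValueError("weights and pvalues must have the same length")
    total = sum(weights)
    return total / sum(w / p for w, p in zip(weights, pvalues))

# Hypothetical per-SNP P values within one gene: one strong signal, two nulls.
p_values = [0.01, 0.5, 0.5]
hmp = harmonic_mean_p(p_values)
```

Because each term is weighted by 1/p, the smallest P value dominates the result, which is consistent with the abstract's observation that this kind of combination behaves well when signals are sparse.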
Affiliation(s)
- Zhonghe Shao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Ting Wang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Jiahao Qiao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Yuchen Zhang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Shuiping Huang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China; Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China; Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China; Key Laboratory of Environment and Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China; Engineering Research Innovation Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Ping Zeng
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China; Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China; Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China; Key Laboratory of Environment and Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China; Engineering Research Innovation Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
|
13
|
sumSTAAR: A flexible framework for gene-based association studies using GWAS summary statistics. PLoS Comput Biol 2022; 18:e1010172. [PMID: 35653402 PMCID: PMC9197066 DOI: 10.1371/journal.pcbi.1010172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Revised: 06/14/2022] [Accepted: 05/05/2022] [Indexed: 11/19/2022] Open
Abstract
Gene-based association analysis is an effective gene-mapping tool. Many gene-based methods have been proposed recently. However, their power depends on the underlying genetic architecture, which is rarely known in complex traits, and so it is likely that a combination of such methods could serve as a universal approach. Several frameworks combining different gene-based methods have been developed. However, they all imply a fixed set of methods, weights and functional annotations. Moreover, most of them use individual phenotypes and genotypes as input data. Here, we introduce sumSTAAR, a framework for gene-based association analysis using summary statistics obtained from genome-wide association studies (GWAS). It is an extended and modified version of STAAR framework proposed by Li and colleagues in 2020. The sumSTAAR framework offers a wider range of gene-based methods to combine. It allows the user to arbitrarily define a set of these methods, weighting functions and probabilities of genetic variants being causal. The methods used in the framework were adapted to analyse genes with large number of SNPs to decrease the running time. The framework includes the polygene pruning procedure to guard against the influence of the strong GWAS signals outside the gene. We also present new improved matrices of correlations between the genotypes of variants within genes. These matrices estimated on a sample of 265,000 individuals are a state-of-the-art replacement of widely used matrices based on the 1000 Genomes Project data. Gene-based association analysis is an effective gene mapping tool. Quite a few frameworks have been proposed recently for gene-based association analysis using a combination of different methods. However, all of these frameworks have at least one of the disadvantages: they use a fixed set of methods, they cannot use functional annotations, or they use individual phenotypes and genotypes as input data. 
To overcome these limitations, we propose sumSTAAR, a framework for gene-based association analysis using GWAS summary statistics. Our framework allows the user to arbitrarily define a set of the methods and functional annotations. Moreover, we adapted the methods for the analysis of genes with a large number of SNPs to decrease the running time. The framework includes the polygene pruning procedure to guard against the influence of strong GWAS signals outside the gene. We also present new, improved matrices of correlations between the genotypes of variants within genes, which now allow ultra-rare variants (MAF < 10⁻⁴) to be included in the analysis.
|
14
|
Wang P, Dong N, Wang M, Sun G, Jia Y, Geng X, Liu M, Wang W, Pan Z, Yang Q, Li H, Wei C, Wang L, Zheng H, He S, Zhang X, Wang Q, Du X. Introgression from Gossypium hirsutum is a driver for population divergence and genetic diversity in Gossypium barbadense. Plant J 2022; 110:764-780. [PMID: 35132720 DOI: 10.1111/tpj.15702] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/06/2021] [Revised: 01/22/2022] [Accepted: 02/03/2022] [Indexed: 05/26/2023]
Affiliation(s)
- Pengpeng Wang
- Institute of Cotton Research, Chinese Academy of Agricultural Sciences/Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
| | - Na Dong
- Henan Key Laboratory of Molecular Ecology and Germplasm Innovation of Cotton and Wheat, Collaborative Innovation Center of Modern Biological Breeding in Henan Province, Henan Institute of Science and Technology, Xinxiang, 453003, China
| | - Maojun Wang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, Hubei, 430070, China
| | - Gaofei Sun
- Anyang Institute of Technology, Anyang, 455000, China
| | - Yinhua Jia
- Institute of Cotton Research, Chinese Academy of Agricultural Sciences/Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
| | - Xiaoli Geng
- Institute of Cotton Research, Chinese Academy of Agricultural Sciences/Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
| | - Min Liu
- Biomarker Technologies Corporation, Beijing, China
| | - Weipeng Wang
- Henan Key Laboratory of Molecular Ecology and Germplasm Innovation of Cotton and Wheat, Collaborative Innovation Center of Modern Biological Breeding in Henan Province, Henan Institute of Science and Technology, Xinxiang, 453003, China
| | - Zhaoe Pan
- Institute of Cotton Research, Chinese Academy of Agricultural Sciences/Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
| | - Qiuyue Yang
- Henan Key Laboratory of Molecular Ecology and Germplasm Innovation of Cotton and Wheat, Collaborative Innovation Center of Modern Biological Breeding in Henan Province, Henan Institute of Science and Technology, Xinxiang, 453003, China
| | - Hongge Li
- Institute of Cotton Research, Chinese Academy of Agricultural Sciences/Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
| | - Chunyan Wei
- Henan Key Laboratory of Molecular Ecology and Germplasm Innovation of Cotton and Wheat, Collaborative Innovation Center of Modern Biological Breeding in Henan Province, Henan Institute of Science and Technology, Xinxiang, 453003, China
| | - Liru Wang
- Institute of Cotton Research, Chinese Academy of Agricultural Sciences/Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
| | | | - Shoupu He
- Institute of Cotton Research, Chinese Academy of Agricultural Sciences/Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
| | - Xianlong Zhang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, Hubei, 430070, China
| | - Qinglian Wang
- Henan Key Laboratory of Molecular Ecology and Germplasm Innovation of Cotton and Wheat, Collaborative Innovation Center of Modern Biological Breeding in Henan Province, Henan Institute of Science and Technology, Xinxiang, 453003, China
| | - Xiongming Du
- Institute of Cotton Research, Chinese Academy of Agricultural Sciences/Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
|
15
|
Wang YC, Wu Y, Choi J, Allington G, Zhao S, Khanfar M, Yang K, Fu PY, Wrubel M, Yu X, Mekbib KY, Ocken J, Smith H, Shohfi J, Kahle KT, Lu Q, Jin SC. Computational Genomics in the Era of Precision Medicine: Applications to Variant Analysis and Gene Therapy. J Pers Med 2022; 12:175. [PMID: 35207663 PMCID: PMC8878256 DOI: 10.3390/jpm12020175] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/30/2021] [Revised: 01/18/2022] [Accepted: 01/24/2022] [Indexed: 02/04/2023] Open
Abstract
Rapid methodological advances in statistical and computational genomics have enabled researchers to better identify and interpret both rare and common variants responsible for complex human diseases. As we continue to see an expansion of these advances in the field, it is now imperative for researchers to understand the resources and methodologies available for various data types and study designs. In this review, we provide an overview of recent methods for identifying rare and common variants and understanding their roles in disease etiology. Additionally, we discuss the strategy, challenge, and promise of gene therapy. As computational and statistical approaches continue to improve, we will have an opportunity to translate human genetic findings into personalized health care.
Affiliation(s)
- Yung-Chun Wang
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
| | - Yuchang Wu
- Department of Biostatistics & Medical Informatics, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Julie Choi
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
| | - Garrett Allington
- Department of Pathology, Yale School of Medicine, New Haven, CT 06510, USA
- Department of Neurosurgery, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Shujuan Zhao
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
| | - Mariam Khanfar
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
| | - Kuangying Yang
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
| | - Po-Ying Fu
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
| | - Max Wrubel
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
| | - Xiaobing Yu
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
- Department of Computer Science & Engineering, Washington University, St. Louis, MO 63130, USA
| | - Kedous Y. Mekbib
- Department of Neurosurgery, Yale University School of Medicine, New Haven, CT 06510, USA
| | - Jack Ocken
- Department of Neurosurgery, Yale University School of Medicine, New Haven, CT 06510, USA
| | - Hannah Smith
- Department of Neurosurgery, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurosurgery, Yale University School of Medicine, New Haven, CT 06510, USA
| | - John Shohfi
- Department of Neurosurgery, Yale University School of Medicine, New Haven, CT 06510, USA
| | - Kristopher T. Kahle
- Department of Neurosurgery, Massachusetts General Hospital, Boston, MA 02114, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115, USA
- Departments of Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Qiongshi Lu
- Department of Biostatistics & Medical Informatics, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Sheng Chih Jin
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
- Department of Pediatrics, School of Medicine, Washington University, St. Louis, MO 63110, USA
|
16
|
Shulman ED, Elkon R. Genetic mapping of developmental trajectories for complex traits and diseases. Comput Struct Biotechnol J 2021; 19:3458-3469. [PMID: 34194671 PMCID: PMC8220172 DOI: 10.1016/j.csbj.2021.05.055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2021] [Revised: 05/30/2021] [Accepted: 05/30/2021] [Indexed: 11/04/2022] Open
Abstract
Genome-wide association studies (GWAS) have identified numerous common genetic variants associated with complex human traits and diseases. However, the translation of GWAS discoveries into biological and clinical insights is highly challenging. In this study, we present a novel bioinformatics approach for enhancing the functional interpretation of GWAS signals, based on their integration with single-cell (sc)RNA-seq datasets that examine developmental processes. Our approach performs three tasks: (1) Identification of links between cell differentiation trajectories and traits; (2) Elucidation of biological processes and molecular pathways that underlie such trajectory-trait links; and (3) Prioritization of target genes that carry the links between trajectories, pathways and traits. We applied our method to a set of 11 traits of various pathologies, and 12 scRNA-seq datasets of diverse developmental processes, and it readily detected well-established biological connections, including those between the maturation of cortical inhibitory interneurons and schizophrenia, hepatocytes and cholesterol levels, and pancreatic beta-islet cells and type-2 diabetes. For each of these associations, our method pinpointed top candidate genes that are strongly associated with both the kinetics of the differentiation trajectory and the disease's genetic risk. By the identification of trajectory-disease links, molecular pathways that underlie them and prioritizing candidate risk genes, our method improves the understanding of the etiology of complex diseases, and thus holds promise for enhancing rational drug development that is aimed at targeting specific biological processes that mediate the genetic predisposition to diseases.
Affiliation(s)
- Eldad David Shulman
- Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Ran Elkon
- Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|
17
|
Matallana-Ramirez LP, Whetten RW, Sanchez GM, Payn KG. Breeding for Climate Change Resilience: A Case Study of Loblolly Pine (Pinus taeda L.) in North America. Front Plant Sci 2021; 12:606908. [PMID: 33995428 PMCID: PMC8119900 DOI: 10.3389/fpls.2021.606908] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/16/2020] [Accepted: 04/08/2021] [Indexed: 05/25/2023]
Abstract
Earth's atmosphere is warming and the effects of climate change are becoming evident. A key observation is that both the average levels and the variability of temperature and precipitation are changing. Information and data from new technologies are developing in parallel to provide multidisciplinary opportunities to address and overcome the consequences of these changes in forest ecosystems. Changes in temperature and water availability impose multidimensional environmental constraints that trigger changes from the molecular to the forest stand level. These can represent a threat for the normal development of the tree from early seedling recruitment to adulthood both through direct mortality, and by increasing susceptibility to pathogens, insect attack, and fire damage. This review summarizes the strengths and shortcomings of previous work in the areas of genetic variation related to cold and drought stress in forest species with particular emphasis on loblolly pine (Pinus taeda L.), the most-planted tree species in North America. We describe and discuss the implementation of management and breeding strategies to increase resilience and adaptation, and discuss how new technologies in the areas of engineering and genomics are shaping the future of phenotype-genotype studies. Lessons learned from the study of species important in intensively-managed forest ecosystems may also prove to be of value in helping less-intensively managed forest ecosystems adapt to climate change, thereby increasing the sustainability and resilience of forestlands for the future.
Affiliation(s)
- Lilian P. Matallana-Ramirez
- Department of Forestry and Environmental Resources, North Carolina State University, Raleigh, NC, United States
| | - Ross W. Whetten
- Department of Forestry and Environmental Resources, North Carolina State University, Raleigh, NC, United States
| | - Georgina M. Sanchez
- Center for Geospatial Analytics, North Carolina State University, Raleigh, NC, United States
| | - Kitt G. Payn
- Department of Forestry and Environmental Resources, North Carolina State University, Raleigh, NC, United States
|
18
|
Liu W, Guo Y, Liu Z. An Omnibus Test for Detecting Multiple Phenotype Associations Based on GWAS Summary Level Data. Front Genet 2021; 12:644419. [PMID: 33815478 PMCID: PMC8009968 DOI: 10.3389/fgene.2021.644419] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/21/2020] [Accepted: 02/23/2021] [Indexed: 11/30/2022] Open
Abstract
Abundant genome-wide association study (GWAS) findings have reflected the sharing of genetic variants among multiple phenotypes. Exploring the association between genetic variants and multiple traits can provide novel insights into the biological mechanism of complex human traits. In this article, we proposed to apply the generalized Berk-Jones (GBJ) test and the generalized higher criticism (GHC) test to identify the genetic variants that affect multiple traits based on GWAS summary statistics. To be more robust to different gene-multiple traits association patterns across the whole genome, we proposed an omnibus test (OMNI) by using the aggregated Cauchy association test. We conducted extensive simulation studies to investigate the type I error rates and compare the powers of the proposed tests (i.e., the GBJ, GHC and OMNI tests) and the existing tests (i.e., the minimum of the p-values (MinP) and the cross-phenotype association test (CPASSOC)) in a wide range of simulation settings. We found that all of these methods could control the type I error rates well and that the proposed OMNI test has robust power. We applied those methods to the summary statistics dataset from the Global Lipids Genetics Consortium and identified 19 new genetic variants that were missed by the original single-trait association analysis.
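The Cauchy-based aggregation that the abstract above uses to build its omnibus test can be sketched in a few lines. This is a minimal illustration of the general Cauchy combination formula, not the authors' code: each P value is mapped to a standard Cauchy variate, the variates are averaged with weights, and the average is mapped back to a P value. The function name, the equal default weights and the two example P values are assumptions made for the example.

```python
import math

def cauchy_combine(pvalues, weights=None):
    """Combine P values with the Cauchy combination rule.

    Each p is transformed to tan((0.5 - p) * pi), the transformed values
    are averaged with weights, and the Cauchy tail probability of that
    average is returned as the combined P value.
    """
    if not pvalues:
        raise ValueError("at least one P value is required")
    if any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("P values must lie strictly between 0 and 1")
    if weights is None:
        # Equal weights by default (an assumption of this sketch).
        weights = [1.0 / len(pvalues)] * len(pvalues)
    if len(weights) != len(pvalues):
        raise ValueError("weights and pvalues must have the same length")
    total = sum(weights)
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, pvalues)) / total
    return 0.5 - math.atan(stat) / math.pi

# Hypothetical P values from two component tests applied to the same variant.
omni_p = cauchy_combine([0.04, 0.002])
```

A useful sanity check on the formula is that combining a single P value, or several identical ones, returns that same value.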
Affiliation(s)
- Zhonghua Liu
- Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, China
19
Ning Z, Tsepilov YA, Sharapov SZ, Wang Z, Grishenko AK, Feng X, Shirali M, Joshi PK, Wilson JF, Pawitan Y, Haley CS, Aulchenko YS, Shen X. Nontrivial Replication of Loci Detected by Multi-Trait Methods. Front Genet 2021; 12:627989. [PMID: 33613642 PMCID: PMC7886991 DOI: 10.3389/fgene.2021.627989] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/10/2020] [Accepted: 01/04/2021] [Indexed: 11/21/2022]
Abstract
The ever-growing number of genome-wide association studies (GWAS) has revealed widespread pleiotropy. To exploit this, various methods that jointly consider associations of a genetic variant with multiple traits have been developed. Most efforts have focused on improving GWAS discovery power. However, how to replicate these discovered pleiotropic loci has yet to be discussed thoroughly. Unlike in a single-trait scenario, multi-trait replication is not trivial, given the underlying genotype-multi-phenotype map of the associations. Here, we evaluate four methods for replicating multi-trait associations, corresponding to four levels of replication strength. Weak replication cannot justify pleiotropic genetic effects, whereas strong replication using our developed correlation methods can inform consistent pleiotropic genetic effects across the discovery and replication samples. We provide a protocol for replicating multi-trait genetic associations in practice. The described methods are implemented in the free and open-source R package MultiABEL.
Affiliation(s)
- Zheng Ning
- Biostatistics Group, School of Life Sciences and School of Ecology, Sun Yat-sen University, Guangzhou, China; Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Yakov A Tsepilov
- Division of Biology, Novosibirsk State University, Novosibirsk, Russia; Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
- Zhipeng Wang
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden; College of Animal Science and Technology, Northeast Agricultural University, Harbin, China; Bioinformatics Center, Northeast Agricultural University, Harbin, China
- Xiao Feng
- Biostatistics Group, School of Life Sciences and School of Ecology, Sun Yat-sen University, Guangzhou, China
- Masoud Shirali
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
- Peter K Joshi
- Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- James F Wilson
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom; Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Yudi Pawitan
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Chris S Haley
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
- Yurii S Aulchenko
- Kurchatov Genomics Center, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia; PolyOmica, 's-Hertogenbosch, Netherlands
- Xia Shen
- Biostatistics Group, School of Life Sciences and School of Ecology, Sun Yat-sen University, Guangzhou, China; Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden; MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom; Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom