1
Kwong AM, Blackwell TW, LeFaive J, de Andrade M, Barnard J, Barnes KC, Blangero J, Boerwinkle E, Burchard EG, Cade BE, Chasman DI, Chen H, Conomos MP, Cupples LA, Ellinor PT, Eng C, Gao Y, Guo X, Irvin MR, Kelly TN, Kim W, Kooperberg C, Lubitz SA, Mak ACY, Manichaikul AW, Mathias RA, Montasser ME, Montgomery CG, Musani S, Palmer ND, Peloso GM, Qiao D, Reiner AP, Roden DM, Shoemaker MB, Smith JA, Smith NL, Su JL, Tiwari HK, Weeks DE, Weiss ST, Scott LJ, Smith AV, Abecasis GR, Boehnke M, Kang HM. Robust, flexible, and scalable tests for Hardy-Weinberg equilibrium across diverse ancestries. Genetics 2021; 218:iyab044. [PMID: 33720349 PMCID: PMC8128395 DOI: 10.1093/genetics/iyab044] [Received: 11/24/2020] [Accepted: 02/03/2021] [Indexed: 11/13/2022]
Abstract
Traditional Hardy-Weinberg equilibrium (HWE) tests (the χ² test and the exact test) have long been used as a metric for evaluating genotype quality, as technical artifacts leading to incorrect genotype calls often can be identified as deviations from HWE. However, in data sets composed of individuals from diverse ancestries, HWE can be violated even without genotyping error, complicating the use of HWE testing to assess genotype data quality. In this manuscript, we present the Robust Unified Test for HWE (RUTH) to test for HWE while accounting for population structure and genotype uncertainty, and to evaluate the impact of population heterogeneity and genotype uncertainty on the standard HWE tests and alternative methods using simulated and real sequence data sets. Our results demonstrate that ignoring population structure or genotype uncertainty in HWE tests can inflate false-positive rates by many orders of magnitude. Our evaluations demonstrate different tradeoffs between false positives and statistical power across the methods, with RUTH consistently among the best across all evaluations. RUTH is implemented as a practical and scalable software tool to rapidly perform HWE tests across millions of markers and hundreds of thousands of individuals while supporting standard VCF/BCF formats. RUTH is publicly available at https://www.github.com/statgen/ruth.
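To illustrate the abstract's point that population structure alone can violate HWE, here is a minimal Python sketch of the standard Pearson chi-square HWE test applied to pooled genotype counts. It is not RUTH and is not taken from the paper: the function name `hwe_chi2` and all genotype counts are hypothetical, chosen only so that each subpopulation is in exact HWE on its own.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Pearson chi-square test for HWE at one biallelic site (1 degree of freedom).

    Hypothetical helper for illustration; takes counts of the three genotypes
    and returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    pvalue = math.erfc(math.sqrt(stat / 2))
    return stat, pvalue


# Two made-up subpopulations, each in exact HWE (allele A frequency 0.1 and 0.9).
pop1 = (10, 180, 810)
pop2 = (810, 180, 10)
# Pooling them without accounting for ancestry mixes the two allele frequencies.
pooled = tuple(a + b for a, b in zip(pop1, pop2))

for label, counts in (("pop1", pop1), ("pop2", pop2), ("pooled", pooled)):
    stat, pvalue = hwe_chi2(*counts)
    print(f"{label}: chi2 = {stat:.1f}, p = {pvalue:.3g}")
```

Each subpopulation gives a statistic of essentially zero, while the pooled sample shows a large heterozygote deficit (statistic of about 819.2) even though no genotype is miscalled. This is the kind of false positive that motivates a structure-aware test.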
Affiliation(s)
- Alan M Kwong
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Thomas W Blackwell
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Jonathon LeFaive
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- John Barnard
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44106, USA
- Kathleen C Barnes
  - Department of Medicine, Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
- John Blangero
  - Department of Human Genetics, South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX 78520, USA
- Eric Boerwinkle
  - Department of Epidemiology, Human Genetics Center, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
- Esteban G Burchard
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94143, USA
  - Department of Medicine, University of California San Francisco, San Francisco, CA 94143, USA
- Brian E Cade
  - Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA 02115, USA
  - Division of Sleep Medicine, Harvard Medical School, Boston, MA 02115, USA
- Daniel I Chasman
  - Division of Preventive Medicine, Brigham and Women’s Hospital, Boston, MA 02215, USA
- Han Chen
  - Department of Epidemiology, Human Genetics Center, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - Center for Precision Health, School of Public Health and School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Matthew P Conomos
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
- L Adrienne Cupples
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
  - Framingham Heart Study, Framingham, MA 01702, USA
- Patrick T Ellinor
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA 02114, USA
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA 02124, USA
- Celeste Eng
  - Department of Medicine, University of California San Francisco, San Francisco, CA 94143, USA
- Yan Gao
  - Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, MS 39216, USA
- Xiuqing Guo
  - Department of Pediatrics, The Institute for Translational Genomics and Population Sciences, The Lundquist Institute at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
- Marguerite Ryan Irvin
  - Department of Epidemiology, School of Public Health, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Tanika N Kelly
  - Department of Epidemiology, Tulane University, New Orleans, LA 70112, USA
- Wonji Kim
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Steven A Lubitz
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA 02114, USA
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA 02124, USA
- Angel C Y Mak
  - Department of Medicine, University of California San Francisco, San Francisco, CA 94143, USA
- Ani W Manichaikul
  - Department of Public Health Sciences, Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
- Rasika A Mathias
  - GeneSTAR Research Program and Division of Allergy and Clinical Immunology, Department of Medicine, Johns Hopkins University, Baltimore, MD 21205, USA
- May E Montasser
  - Division of Endocrinology, Diabetes and Nutrition, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Courtney G Montgomery
  - Sarcoidosis Research Unit, Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Solomon Musani
  - Jackson Heart Study, University of Mississippi Medical Center, Jackson, MS 39216, USA
- Nicholette D Palmer
  - Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC 27157, USA
- Gina M Peloso
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
- Dandi Qiao
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Dan M Roden
  - Departments of Medicine, Pharmacology, and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- M Benjamin Shoemaker
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Jennifer A Smith
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Nicholas L Smith
  - Department of Epidemiology, University of Washington, Seattle, WA 98195, USA
  - Kaiser Permanente Washington Health Research Institute, Kaiser Permanente Washington, Seattle, WA 98101, USA
  - Department of Veterans Affairs, Seattle Epidemiologic Research and Information Center, Office of Research and Development, Seattle, WA 98108, USA
- Jessica Lasky Su
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Hemant K Tiwari
  - Department of Biostatistics, School of Public Health, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Daniel E Weeks
  - Departments of Human Genetics and Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Scott T Weiss
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Laura J Scott
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Albert V Smith
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Gonçalo R Abecasis
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Michael Boehnke
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Hyun Min Kang
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
3
Brieger K, Zajac GJM, Pandit A, Foerster JR, Li KW, Annis AC, Schmidt EM, Clark CP, McMorrow K, Zhou W, Yang J, Kwong AM, Boughton AP, Wu J, Scheller C, Parikh T, de la Vega A, Brazel DM, Frieser M, Rea-Sandin G, Fritsche LG, Vrieze SI, Abecasis GR. Genes for Good: Engaging the Public in Genetics Research via Social Media. Am J Hum Genet 2019; 105:65-77. [PMID: 31204010 PMCID: PMC6612519 DOI: 10.1016/j.ajhg.2019.05.006] [Received: 04/27/2018] [Accepted: 05/08/2019] [Indexed: 01/06/2023]
Abstract
The Genes for Good study uses social media to engage a large, diverse participant pool in genetics research and education. Health history and daily tracking surveys are administered through a Facebook application, and participants who complete a minimum number of surveys are mailed a saliva sample kit ("spit kit") to collect DNA for genotyping. As of March 2019, we engaged >80,000 individuals, sent spit kits to >32,000 individuals who met minimum participation requirements, and collected >27,000 spit kits. Participants come from all 50 states and include a diversity of ancestral backgrounds. Rates of important chronic health indicators are consistent with those estimated for the general U.S. population using more traditional study designs. However, our sample is younger and contains a greater percentage of females than the general population. As one means of verifying data quality, we have replicated genome-wide association studies (GWASs) for exemplar traits, such as asthma, diabetes, body mass index (BMI), and pigmentation. The flexible framework of the web application makes it relatively simple to add new questionnaires and for other researchers to collaborate. We anticipate that the study sample will continue to grow and that future analyses may further capitalize on the strengths of the longitudinal data in combination with genetic information.
Affiliation(s)
- Katharine Brieger
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Gregory J M Zajac
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Anita Pandit
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Johanna R Foerster
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Kevin W Li
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Aubrey C Annis
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Ellen M Schmidt
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
  - Wellcome Sanger Institute, Hinxton CB10 1SA, UK
- Chris P Clark
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Karly McMorrow
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Wei Zhou
  - Department of Computational Medicine and Bioinformatics, University of Michigan School of Medicine, Ann Arbor, MI 48109, USA
- Jingjing Yang
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA
- Alan M Kwong
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Andrew P Boughton
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Jinxi Wu
  - School of Information, University of Michigan, Ann Arbor, MI 48109, USA
- Chris Scheller
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Tanvi Parikh
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, CO 80309, USA
- Alejandro de la Vega
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, CO 80309, USA
- David M Brazel
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, CO 80309, USA
  - Department of Molecular, Cellular, and Developmental Biology, University of Colorado Boulder, Boulder, CO 80309, USA
- Maia Frieser
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, CO 80309, USA
  - Department of Psychology, University of Colorado Boulder, Boulder, CO 80309, USA
- Gianna Rea-Sandin
  - Department of Psychology, Arizona State University, Tempe, AZ 85281, USA
- Lars G Fritsche
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Scott I Vrieze
  - Department of Psychology, University of Minnesota, Minneapolis, MN 55455, USA
- Gonçalo R Abecasis
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA