1
Das A, Lakhani C, Terwagne C, Lin JST, Naito T, Raj T, Knowles DA. Leveraging functional annotations to map rare variants associated with Alzheimer's disease with gruyere. medRxiv: The Preprint Server for Health Sciences 2025:2024.12.06.24318577. [PMID: 39677477 PMCID: PMC11643288 DOI: 10.1101/2024.12.06.24318577] [Indexed: 12/17/2024]
Abstract
The increasing availability of whole-genome sequencing (WGS) has begun to elucidate the contribution of rare variants (RVs), both coding and non-coding, to complex disease. Multiple RV association tests are available to study the relationship between genotype and phenotype, but most are restricted to per-gene models and do not fully leverage the availability of variant-level functional annotations. We propose Genome-wide Rare Variant EnRichment Evaluation (gruyere), a Bayesian probabilistic model that complements existing methods by learning global, trait-specific weights for functional annotations to improve variant prioritization. We apply gruyere to WGS data from the Alzheimer's Disease (AD) Sequencing Project, consisting of 7,966 cases and 13,412 controls, to identify AD-associated genes and annotations. Growing evidence suggests that disruption of microglial regulation is a key contributor to AD risk, yet existing methods have not had sufficient power to examine rare non-coding effects that incorporate such cell-type specific information. To address this gap, we 1) use predicted enhancer and promoter regions in microglia and other potentially relevant cell types (oligodendrocytes, astrocytes, and neurons) to define per-gene non-coding RV test sets and 2) include cell-type specific variant effect predictions (VEPs) as functional annotations. gruyere identifies 15 significant genetic associations not detected by other RV methods and finds deep learning-based VEPs for splicing, transcription factor binding, and chromatin state are highly predictive of functional non-coding RVs. Our study establishes a novel and robust framework incorporating functional annotations, coding RVs, and cell-type associated non-coding RVs, to perform genome-wide association tests, uncovering AD-relevant genes and annotations.
Affiliation(s)
- Anjali Das
- Computer Science, Columbia University, New York, NY, USA
- New York Genome Center, New York, NY, USA
- Tatsuhiko Naito
- New York Genome Center, New York, NY, USA
- Neuroscience, Icahn School of Medicine, Mount Sinai, New York, NY, USA
- Towfique Raj
- Neuroscience, Icahn School of Medicine, Mount Sinai, New York, NY, USA
- David A Knowles
- Computer Science, Columbia University, New York, NY, USA
- New York Genome Center, New York, NY, USA
- Systems Biology, Columbia University, New York, NY, USA
- Data Science Institute, Columbia University, New York, NY, USA
2
Mangnier L, Ruczinski I, Ricard J, Moreau C, Girard S, Maziade M, Bureau A. RetroFun-RVS: A Retrospective Family-Based Framework for Rare Variant Analysis Incorporating Functional Annotations. Genet Epidemiol 2025; 49:e70001. [PMID: 39876583 PMCID: PMC11775437 DOI: 10.1002/gepi.70001] [Received: 04/03/2024] [Revised: 10/16/2024] [Accepted: 01/03/2025] [Indexed: 01/30/2025]
Abstract
A large proportion of genetic variations involved in complex diseases are rare and located within noncoding regions, making the interpretation of underlying biological mechanisms a daunting task. Although technical and methodological progress has been made to annotate the genome, current disease-rare-variant association tests incorporating such annotations suffer from two major limitations. First, they are generally restricted to case-control designs of unrelated individuals, which often require tens or hundreds of thousands of individuals to achieve sufficient power. Second, they were not evaluated with region-based annotations needed to interpret the causal regulatory mechanisms. In this work, we propose RetroFun-RVS, a new retrospective family-based score test, incorporating functional annotations. A critical feature of the proposed method is to aggregate genotypes to compare against rare variant-sharing expectations among affected family members. Through extensive simulations, we have demonstrated that RetroFun-RVS integrating networks based on 3D genome contacts as functional annotations reaches greater power than the region-wide test, other strategies for including subregions, and competing methods. Also, the proposed framework shows robustness to non-informative annotations, maintaining its power when causal variants are spread across regions. Asymptotic p-values are susceptible to Type I error inflation when the number of families with rare variants is small, and a bootstrap procedure is recommended in these instances. Application of RetroFun-RVS is illustrated on whole-genome sequence data from the Eastern Quebec Schizophrenia and Bipolar Disorder Kindred Study with networks constructed from 3D contacts and epigenetic data on neurons. In summary, the integration of functional annotations corresponding to regions or networks with transcriptional impacts in rare variant tests appears promising to highlight regulatory mechanisms involved in complex diseases.
Affiliation(s)
- Loïc Mangnier
- Department of Social and Preventive Medicine, Laval University, Quebec City, Quebec, Canada
- CERVO Brain Research Center, Quebec City, Quebec, Canada
- Big Data Research Center, Laval University, Quebec City, Quebec, Canada
- Ingo Ruczinski
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Claudia Moreau
- Department of Fundamental Sciences, University of Quebec in Chicoutimi, Saguenay, Quebec, Canada
- Simon Girard
- Department of Fundamental Sciences, University of Quebec in Chicoutimi, Saguenay, Quebec, Canada
- Michel Maziade
- CERVO Brain Research Center, Quebec City, Quebec, Canada
- Department of Psychiatry and Neurosciences, Laval University, Quebec City, Quebec, Canada
- Alexandre Bureau
- Department of Social and Preventive Medicine, Laval University, Quebec City, Quebec, Canada
- CERVO Brain Research Center, Quebec City, Quebec, Canada
- Big Data Research Center, Laval University, Quebec City, Quebec, Canada
3
Riccio C, Jansen ML, Thalén F, Koliopanos G, Link V, Ziegler A. Assessment of the functionality and usability of open-source rare variant analysis pipelines. Brief Bioinform 2025; 26:bbaf044. [PMID: 39907318 PMCID: PMC11795309 DOI: 10.1093/bib/bbaf044] [Received: 09/07/2024] [Revised: 01/07/2025] [Accepted: 01/20/2025] [Indexed: 02/06/2025] Open
Abstract
Sequencing of increasingly larger cohorts has revealed many rare variants, presenting an opportunity to further unravel the genetic basis of complex traits. Compared with common variants, rare variants are more complex to analyze. Specialized computational tools for these analyses should be both flexible and user-friendly. However, an overview of the available rare variant analysis pipelines and their functionalities is currently lacking. Here, we provide a systematic review of the currently available rare variant analysis pipelines. We searched MEDLINE and Google Scholar until 27 November 2023, and included open-source rare variant pipelines that accepted genotype data from cohort and case-control studies and group variants into testing units. Eligible pipelines were assessed based on functionality and usability criteria. We identified 17 rare variant pipelines that collectively support various trait types, association tests, testing units, and variant weighting schemes. Currently, no single pipeline can handle all data types in a scalable and flexible manner. We recommend different tools to meet diverse analysis needs. STAARpipeline is suitable for newcomers and common applications owing to its built-in definitions for the testing units. REGENIE is highly scalable, actively maintained, regularly updated, and well documented. Ravages is suitable for analyzing multinomial variables, and OrdinalGWAS is tailored for analyzing ordinal variables. Opportunities remain for developing a user-friendly pipeline that provides high degrees of flexibility and scalability. Such a pipeline would enable researchers to exploit the potential of rare variant analyses to uncover the genetic basis of complex traits.
Affiliation(s)
- Cristian Riccio
- Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 12, 7265 Davos Wolfgang, Switzerland
- Swiss Institute of Bioinformatics, Herman-Burchard-Str. 12, 7265 Davos Wolfgang, Switzerland
- Max L Jansen
- Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 12, 7265 Davos Wolfgang, Switzerland
- Swiss Institute of Bioinformatics, Herman-Burchard-Str. 12, 7265 Davos Wolfgang, Switzerland
- Felix Thalén
- Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 12, 7265 Davos Wolfgang, Switzerland
- Swiss Institute of Bioinformatics, Herman-Burchard-Str. 12, 7265 Davos Wolfgang, Switzerland
- Georgios Koliopanos
- Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 12, 7265 Davos Wolfgang, Switzerland
- Swiss Institute of Bioinformatics, Herman-Burchard-Str. 12, 7265 Davos Wolfgang, Switzerland
- Vivian Link
- Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 12, 7265 Davos Wolfgang, Switzerland
- Swiss Institute of Bioinformatics, Herman-Burchard-Str. 12, 7265 Davos Wolfgang, Switzerland
- Andreas Ziegler
- Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 12, 7265 Davos Wolfgang, Switzerland
- Swiss Institute of Bioinformatics, Herman-Burchard-Str. 12, 7265 Davos Wolfgang, Switzerland
- Center for Population Health Innovation (POINT), University Heart and Vascular Center Hamburg, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
- University Center of Cardiovascular Science & Department of Cardiology, University Heart and Vascular Center Hamburg, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
- School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Ave, Scottsville, Pietermaritzburg, 3201, South Africa
4
Li X, Chen H, Selvaraj MS, Van Buren E, Zhou H, Wang Y, Sun R, McCaw ZR, Yu Z, Jiang MZ, DiCorpo D, Gaynor SM, Dey R, Arnett DK, Benjamin EJ, Bis JC, Blangero J, Boerwinkle E, Bowden DW, Brody JA, Cade BE, Carson AP, Carlson JC, Chami N, Chen YDI, Curran JE, de Vries PS, Fornage M, Franceschini N, Freedman BI, Gu C, Heard-Costa NL, He J, Hou L, Hung YJ, Irvin MR, Kaplan RC, Kardia SLR, Kelly TN, Konigsberg I, Kooperberg C, Kral BG, Li C, Li Y, Lin H, Liu CT, Loos RJF, Mahaney MC, Martin LW, Mathias RA, Mitchell BD, Montasser ME, Morrison AC, Naseri T, North KE, Palmer ND, Peyser PA, Psaty BM, Redline S, Reiner AP, Rich SS, Sitlani CM, Smith JA, Taylor KD, Tiwari HK, Vasan RS, Viali S, Wang Z, Wessel J, Yanek LR, Yu B, Dupuis J, Meigs JB, Auer PL, Raffield LM, Manning AK, Rice KM, Rotter JI, Peloso GM, Natarajan P, Li Z, Liu Z, Lin X. A statistical framework for multi-trait rare variant analysis in large-scale whole-genome sequencing studies. Nature Computational Science 2025; 5:125-143. [PMID: 39920506 PMCID: PMC11981678 DOI: 10.1038/s43588-024-00764-8] [Received: 11/12/2023] [Accepted: 12/20/2024] [Indexed: 02/09/2025]
Abstract
Large-scale whole-genome sequencing (WGS) studies have improved our understanding of the contributions of coding and noncoding rare variants to complex human traits. Leveraging association effect sizes across multiple traits in WGS rare variant association analysis can improve statistical power over single-trait analysis, and also detect pleiotropic genes and regions. Existing multi-trait methods have limited ability to perform rare variant analysis of large-scale WGS data. We propose MultiSTAAR, a statistical framework and computationally scalable analytical pipeline for functionally informed multi-trait rare variant analysis in large-scale WGS studies. MultiSTAAR accounts for relatedness, population structure and correlation among phenotypes by jointly analyzing multiple traits, and further empowers rare variant association analysis by incorporating multiple functional annotations. We applied MultiSTAAR to jointly analyze three lipid traits in 61,838 multi-ethnic samples from the Trans-Omics for Precision Medicine (TOPMed) Program. We discovered and replicated new associations with lipid traits missed by single-trait analysis.
Grants
- U01 DK085524 NIDDK NIH HHS
- HHSN268201800001I U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 DK078616 NIDDK NIH HHS
- U01 HL054472 NHLBI NIH HHS
- R01 HL071025 NHLBI NIH HHS
- UL1 RR033176 NCRR NIH HHS
- R01 HL112064 NHLBI NIH HHS
- K26 DK138425 NIDDK NIH HHS
- 75N92020D00002 NHLBI NIH HHS
- R01 HL113323 NHLBI NIH HHS
- U01-HG012064 U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute (NHGRI)
- N01-HC-95160 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01-HL071251 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R35 CA197449 NCI NIH HHS
- 75N92020D00005 NHLBI NIH HHS
- R01 HL104135 NHLBI NIH HHS
- HHSN268201600002C NHLBI NIH HHS
- N01HC95160 NHLBI NIH HHS
- R01-DK117445 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 HL071251 NHLBI NIH HHS
- R01 HL120393 NHLBI NIH HHS
- R01 HL087698 NHLBI NIH HHS
- R01 HL046380 NHLBI NIH HHS
- R01 HL071259 NHLBI NIH HHS
- N01-HC-95163 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- U19 CA203654 NCI NIH HHS
- N01HC95163 NHLBI NIH HHS
- R01-HL071259 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- UL1 TR001079 NCATS NIH HHS
- R01 HL175681 NHLBI NIH HHS
- U01 HG012064 NHGRI NIH HHS
- N01-HC-95169 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 HL087660 NHLBI NIH HHS
- DK063491 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 AR048797 NIAMS NIH HHS
- R01-HL071205 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 HL092577 NHLBI NIH HHS
- N01-HC-95166 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- N01HC95169 NHLBI NIH HHS
- U01 HL054509 NHLBI NIH HHS
- 75N92020D00001 NHLBI NIH HHS
- U01 HL120393 NHLBI NIH HHS
- R01 HL113338 NHLBI NIH HHS
- R01 DK117445 NIDDK NIH HHS
- R01 HL153805 NHLBI NIH HHS
- R01 AG058921 NIA NIH HHS
- R01 HL071250 NHLBI NIH HHS
- R01-HL104135-04S1 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- UL1-TR-000040 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- N01-HC-95162 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- UL1-TR001881 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 NS058700 NINDS NIH HHS
- R01 HL127564 NHLBI NIH HHS
- R01 HL076784 NHLBI NIH HHS
- N01-HC-95167 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- N01HC95164 NHLBI NIH HHS
- R01-HL113338 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL163972 NHLBI NIH HHS
- HHSN268201600004C NHLBI NIH HHS
- HHSN268201700005I NHLBI NIH HHS
- R03-HL154284 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01-HL142711 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- 75N92020D00003 NHLBI NIH HHS
- F32 HL085989 NHLBI NIH HHS
- R01 MH078111 NIMH NIH HHS
- N01HC95162 NHLBI NIH HHS
- U01 HL054464 NHLBI NIH HHS
- R01 HL119443 NHLBI NIH HHS
- R01 HL105756 NHLBI NIH HHS
- N01HC95168 NHLBI NIH HHS
- NHLBI TOPMed Fellowship 75N92021F00229 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- HHSN268201500003I NHLBI NIH HHS
- HHSN268201700004I NHLBI NIH HHS
- R01-HL071051 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 HL067348 NHLBI NIH HHS
- 1R01AG086379-01 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 HL142711 NHLBI NIH HHS
- R35 HL135818 NHLBI NIH HHS
- R01-HL071250 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R35-CA197449 U.S. Department of Health & Human Services | NIH | National Cancer Institute (NCI)
- U01 HL072524 NHLBI NIH HHS
- DK078616 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- P30 DK063491 NIDDK NIH HHS
- R01 HL071051 NHLBI NIH HHS
- N01-HC-95161 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- U01 HL054457 NHLBI NIH HHS
- N01HC95165 NHLBI NIH HHS
- N01HC95159 NHLBI NIH HHS
- M01 RR000052 NCRR NIH HHS
- HHSN268201700003I NHLBI NIH HHS
- N01HC95161 NHLBI NIH HHS
- UL1 TR001420 NCATS NIH HHS
- R01 HL049762 NHLBI NIH HHS
- HL046389 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P01 HL045522 NHLBI NIH HHS
- U01-HG009088 U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute (NHGRI)
- 75N92020D00004 NHLBI NIH HHS
- R00 HG012956 NHGRI NIH HHS
- 75N92020D00007 NHLBI NIH HHS
- U01 HL072518 NHLBI NIH HHS
- U19-CA203654 U.S. Department of Health & Human Services | NIH | National Cancer Institute (NCI)
- U01 DK078616 NIDDK NIH HHS
- N01-HC-95168 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- HHSN268201700001I NHLBI NIH HHS
- 1R35-HL135818 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01 HL137162 NHLBI NIH HHS
- M01 RR007122 NCRR NIH HHS
- R01 HL059684 NHLBI NIH HHS
- U54 HG013247 NHGRI NIH HHS
- HHSN268201600018C NHLBI NIH HHS
- R01 AG086379 NIA NIH HHS
- R01 MH078143 NIMH NIH HHS
- R01 DK071891 NIDDK NIH HHS
- N01HC95167 NHLBI NIH HHS
- R01 HG013163 NHGRI NIH HHS
- N01HC25195 NHLBI NIH HHS
- R01-MD012765 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 HL071205 NHLBI NIH HHS
- U01 HL054481 NHLBI NIH HHS
- 75N92019D00031 NHLBI NIH HHS
- R03 HL154284 NHLBI NIH HHS
- R01 MD012765 NIMHD NIH HHS
- R00HG012956-02 U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute (NHGRI)
- UL1 TR000040 NCATS NIH HHS
- HL105756 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01-HL054472 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- HHSN268201700002I NHLBI NIH HHS
- R01 HL151855 NHLBI NIH HHS
- U01 HG009088 NHGRI NIH HHS
- UM1 DK078616 NIDDK NIH HHS
- R01 MH083824 NIMH NIH HHS
- R01 HL117626 NHLBI NIH HHS
- N01-HC-95159 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- 75N92020D00006 NHLBI NIH HHS
- HHSN268201600001C NHLBI NIH HHS
- N01HC95166 NHLBI NIH HHS
- U01-HL054473 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- N01-HC-95164 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 AG028321 NIA NIH HHS
- U01-HL054509 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- UL1-TR-001420 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- U01-HL054495 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01-HL137162 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01-HL071258 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- HHSN268201600003C NHLBI NIH HHS
- UL1-TR-001079 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- UL1 TR001881 NCATS NIH HHS
- UL1-RR033176 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- N01-HC-95165 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- U01 HL054495 NHLBI NIH HHS
- R01 HL071258 NHLBI NIH HHS
- R01-HL153805 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL055673 NHLBI NIH HHS
- R01-HL055673-18S1 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL092301 NHLBI NIH HHS
- U01 HL054473 NHLBI NIH HHS
- HL151855 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01-HL127564 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01-HL072524 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
Affiliation(s)
- Xihao Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Han Chen
- Human Genetics Center, Department of Epidemiology, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Margaret Sunitha Selvaraj
- Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Eric Van Buren
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Hufeng Zhou
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Yuxuan Wang
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Ryan Sun
- Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Zachary R McCaw
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Zhi Yu
- Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Clinical and Translational Epidemiology Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
- Min-Zhi Jiang
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Biostatistics, The Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Daniel DiCorpo
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Sheila M Gaynor
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Rounak Dey
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Donna K Arnett
- Provost Office, University of South Carolina, Columbia, SC, USA
- Emelia J Benjamin
- Section of Cardiovascular Medicine, Boston Medical Center, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
- Framingham Heart Study, Framingham, MA, USA
- Joshua C Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- John Blangero
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
- Eric Boerwinkle
- Human Genetics Center, Department of Epidemiology, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Donald W Bowden
- Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Jennifer A Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Brian E Cade
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA, USA
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
- April P Carson
- Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
- Jenna C Carlson
- Department of Human Genetics and Department of Biostatistics and Health Data Science, University of Pittsburgh, Pittsburgh, PA, USA
- Nathalie Chami
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Yii-Der Ida Chen
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Joanne E Curran
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
- Paul S de Vries
- Human Genetics Center, Department of Epidemiology, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Myriam Fornage
- Human Genetics Center, Department of Epidemiology, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Brown Foundation Institute of Molecular Medicine, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Nora Franceschini
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Barry I Freedman
- Department of Internal Medicine, Nephrology, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Charles Gu
- Division of Biology & Biomedical Sciences, Washington University School of Medicine, St. Louis, MO, USA
- Nancy L Heard-Costa
- Framingham Heart Study, Framingham, MA, USA
- Department of Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Jiang He
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
- Translational Science Institute, Tulane University, New Orleans, LA, USA
- Lifang Hou
- Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
- Yi-Jen Hung
- Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Marguerite R Irvin
- Department of Epidemiology, School of Public Health, University of Alabama at Birmingham, Birmingham, AL, USA
- Robert C Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Sharon L R Kardia
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Tanika N Kelly
- Department of Medicine, Division of Nephrology, University of Illinois Chicago, Chicago, IL, USA
- Iain Konigsberg
- Department of Biomedical Informatics, University of Colorado, Aurora, CO, USA
- Charles Kooperberg
- Department of Medicine, Division of Nephrology, University of Illinois Chicago, Chicago, IL, USA
- Brian G Kral
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Changwei Li
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
- Translational Science Institute, Tulane University, New Orleans, LA, USA
- Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Honghuang Lin
- Department of Medicine, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Ching-Ti Liu
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Ruth J F Loos
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Michael C Mahaney
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
- Lisa W Martin
- School of Medicine and Health Sciences, George Washington University, Washington, DC, USA
- Rasika A Mathias
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Braxton D Mitchell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- May E Montasser
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Alanna C Morrison
- Human Genetics Center, Department of Epidemiology, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Take Naseri
- Naseri & Associates Public Health Consultancy Firm and Family Health Clinic, Apia, Samoa
- Department of Epidemiology, Brown University, Providence, RI, USA
- Kari E North
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Nicholette D Palmer
- Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Patricia A Peyser
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Departments of Epidemiology, University of Washington, Seattle, WA, USA
- Department of Health Systems and Population Health, University of Washington, Seattle, WA, USA
- Susan Redline
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA, USA
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
- Alexander P Reiner
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Departments of Epidemiology, University of Washington, Seattle, WA, USA
- Stephen S Rich
- Department of Genome Sciences, University of Virginia, Charlottesville, VA, USA
- Colleen M Sitlani
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Jennifer A Smith
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Kent D Taylor
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Hemant K Tiwari
- Department of Biostatistics, School of Public Health, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Ramachandran S Vasan
- Framingham Heart Study, Framingham, MA, USA
- Department of Quantitative and Qualitative Health Sciences, UT Health San Antonio School of Public Health, San Antonio, TX, USA
| | - Satupa'itea Viali
- School of Medicine, National University of Samoa, Apia, Samoa
- Department of Chronic Disease Epidemiology, Yale University School of Public Health, New Haven, CT, USA
- Oceania University of Medicine, Apia, Samoa
| | - Zhe Wang
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Jennifer Wessel
- Department of Epidemiology, Fairbanks School of Public Health, Indiana University, Indianapolis, IN, USA
- Diabetes Translational Research Center, Indiana University, Indianapolis, IN, USA
| | - Lisa R Yanek
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Bing Yu
- Human Genetics Center, Department of Epidemiology, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Josée Dupuis
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada
| | - James B Meigs
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Division of General Internal Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Paul L Auer
- Division of Biostatistics, Data Science Institute, and Cancer Center, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Laura M Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Alisa K Manning
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Metabolism Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Clinical and Translational Epidemiology Unit, Mongan Institute, Massachusetts General Hospital, Boston, MA, USA
| | - Kenneth M Rice
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Jerome I Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Gina M Peloso
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Pradeep Natarajan
- Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
| | - Zilin Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
| | - Zhonghua Liu
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, USA.
| | - Xihong Lin
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Department of Statistics, Harvard University, Cambridge, MA, USA.
|
5
|
Hawkes G, Beaumont RN, Li Z, Mandla R, Li X, Albert CM, Arnett DK, Ashley-Koch AE, Ashrani AA, Barnes KC, Boerwinkle E, Brody JA, Carson AP, Chami N, Chen YDI, Chung MK, Curran JE, Darbar D, Ellinor PT, Fornage M, Gordeuk VR, Guo X, He J, Hwu CM, Kalyani RR, Kaplan R, Kardia SLR, Kooperberg C, Loos RJF, Lubitz SA, Minster RL, Naseri T, Viali S, Mitchell BD, Murabito JM, Palmer ND, Psaty BM, Redline S, Shoemaker MB, Silverman EK, Telen MJ, Weiss ST, Yanek LR, Zhou H, Liu CT, North KE, Justice AE, Locke JM, Owens N, Murray A, Patel K, Frayling TM, Wright CF, Wood AR, Lin X, Manning A, Weedon MN. Whole-genome sequencing in 333,100 individuals reveals rare non-coding single variant and aggregate associations with height. Nat Commun 2024; 15:8549. [PMID: 39362880 PMCID: PMC11450065 DOI: 10.1038/s41467-024-52579-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Accepted: 09/12/2024] [Indexed: 10/05/2024] Open
Abstract
The role of rare non-coding variation in complex human phenotypes is still largely unknown. To elucidate the impact of rare variants in regulatory elements, we performed a whole-genome sequencing association analysis for height using 333,100 individuals from three datasets: UK Biobank (N = 200,003), TOPMed (N = 87,652) and All of Us (N = 45,445). We performed rare (<0.1% minor allele frequency) single-variant and aggregate testing of non-coding variants in regulatory regions based on proximal-regulatory, intergenic-regulatory and deep-intronic annotation. We observed 29 independent variants associated with height at P < 6 × 10⁻¹⁰ after conditioning on previously reported variants, with effect sizes ranging from -7 cm to +4.7 cm. We also identified and replicated non-coding aggregate-based associations proximal to HMGA1 containing variants associated with a 5 cm taller height and of highly-conserved variants in MIR497HG on chromosome 17. We have developed an approach for identifying non-coding rare variants in regulatory regions with large effects from whole-genome sequencing data associated with complex traits.
Affiliation(s)
- Gareth Hawkes
- Clinical and Biomedical Sciences, University of Exeter, Exeter, UK.
| | - Robin N Beaumont
- Clinical and Biomedical Sciences, University of Exeter, Exeter, UK
| | - Zilin Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Ravi Mandla
- Department of Medicine, Harvard Medical School, Broad Institute, Boston, Massachusetts, USA
| | - Xihao Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Christine M Albert
- Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Donna K Arnett
- Provost Office, University of South Carolina, Columbia, SC, USA
| | - Allison E Ashley-Koch
- Department of Medicine, Duke Molecular Physiology Institute, Duke University Medical Center, Durham, NC, USA
| | - Aneel A Ashrani
- Division of Hematology, Department of Medicine, Mayo Clinic Rochester, Rochester, MN, USA
| | - Kathleen C Barnes
- Department of Medicine, School of Medicine, University of Colorado, Aurora, CO, USA
| | - Eric Boerwinkle
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Jennifer A Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - April P Carson
- Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
| | - Nathalie Chami
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Yii-Der Ida Chen
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Mina K Chung
- Department of Cardiovascular Medicine, Heart, Vascular & Thoracic Institute, Cleveland, OH, USA
| | - Joanne E Curran
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Dawood Darbar
- Division of Cardiology, Department of Medicine, University of Illinois Chicago, Chicago, IL, USA
| | - Patrick T Ellinor
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
| | - Myrian Fornage
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Victor R Gordeuk
- Department of Medicine, School of Medicine, University of Illinois at Chicago, Chicago, IL, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Jiang He
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
| | - Chii-Min Hwu
- Section of Endocrinology and Metabolism, Department of Medicine, Taipei Veterans General Hospital, Taipei City, Taiwan
| | - Rita R Kalyani
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Robert Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Sharon L R Kardia
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - Ruth J F Loos
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Steven A Lubitz
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
| | - Ryan L Minster
- Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, USA
| | - Take Naseri
- Naseri & Associates Public Health Consultancy Firm and Family Health Clinic, Apia, Samoa
- International Health Institute, Brown University, Providence, Rhode Island, US
| | - Satupa'itea Viali
- Oceania University of Medicine, Apia, Samoa
- School of Medicine, National University of Samoa, Apia, Samoa
- Dept of Chronic Disease Epidemiology, Yale University, New Haven, Connecticut, US
| | - Braxton D Mitchell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Joanne M Murabito
- Boston University's and National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, MA, USA
| | - Nicholette D Palmer
- Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Departments of Medicine, Epidemiology, and Health Systems and Population Health, University of Washington, Seattle, WA, USA
| | - Susan Redline
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA, USA
| | - M Benjamin Shoemaker
- Department of Medicine, Cardiovascular Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Edwin K Silverman
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Marilyn J Telen
- Department of Medicine, Duke University School of Medicine, Durham, NC, USA
| | - Scott T Weiss
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Lisa R Yanek
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Hufeng Zhou
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Ching-Ti Liu
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA, USA
| | - Kari E North
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Anne E Justice
- Population Health Sciences, Geisinger, Danville, PA, USA
| | - Jonathan M Locke
- Clinical and Biomedical Sciences, University of Exeter, Exeter, UK
| | - Nick Owens
- Clinical and Biomedical Sciences, University of Exeter, Exeter, UK
| | - Anna Murray
- Clinical and Biomedical Sciences, University of Exeter, Exeter, UK
| | - Kashyap Patel
- Clinical and Biomedical Sciences, University of Exeter, Exeter, UK
| | | | | | - Andrew R Wood
- Clinical and Biomedical Sciences, University of Exeter, Exeter, UK
| | - Xihong Lin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Statistics, Harvard University, Cambridge, MA, USA
| | - Alisa Manning
- Department of Medicine, Harvard Medical School, Broad Institute, Boston, Massachusetts, USA
| | - Michael N Weedon
- Clinical and Biomedical Sciences, University of Exeter, Exeter, UK.
|
6
|
Tseng YP, Chang YS, Mekala VR, Liu TY, Chang JG, Shieh GS. Whole-genome sequencing reveals rare variants associated with gout in Taiwanese males. Front Genet 2024; 15:1423714. [PMID: 39385933 PMCID: PMC11462091 DOI: 10.3389/fgene.2024.1423714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Accepted: 08/28/2024] [Indexed: 10/12/2024] Open
Abstract
To identify rare variants (RVs) associated with gout, we sequenced the whole genomes of 321 male gout patients and combined these with those of 64 male gout patients and 682 normal controls at Taiwan Biobank. We performed ACAT-O to identify 682 significant RVs (p < 3.8 × 10⁻⁸) clustered on chromosomes 1, 7, 10, 16, and 18. To prioritize causal variants effectively, we sifted them by Combined Annotation-Dependent Depletion (CADD) score >10 or |effect size| ≥ 1.5 for those without CADD scores. In particular, to the best of our knowledge, we identified the rare variants rs559954634, rs186763678, and 13-85340782-G-A for the first time to be associated with gout in Taiwanese males. Importantly, the RV rs559954634 positively affects gout, and its neighboring gene NPHS2 is involved in serum urate and expressed in kidney tissues. The kidneys play a major role in regulating uric acid levels. This suggests that rs559954634 may be involved in gout. Furthermore, rs186763678 is in the intron of NFIA that interacts with SLC2A9, which has the most significant effect on serum urate. Note that gene-gene interaction NFIA-SLC2A9 is significantly associated with serum urate in the Italian MICROS population and a Croatian population. Moreover, 13-85340782-G-A significantly affects gout susceptibility (odds ratio 6.0; P = 0.038). The >1% carrier frequencies of these potentially pathogenic (protective) RVs in cases (controls) suggest the revealed associations may be true; these RVs deserve further studies for the mechanism. Finally, multivariate logistic regression analysis shows that the rare variants rs559954634 and 13-85340782-G-A jointly are significantly associated with gout susceptibility.
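The prioritization rule in this abstract can be illustrated with a minimal sketch. It assumes one reading of the sentence: a variant with a CADD score is kept when that score exceeds 10, and a variant without a CADD score is kept when its absolute effect size is at least 1.5. The `Variant` class, the function names, and the example values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    variant_id: str          # hypothetical identifier, for illustration only
    cadd: Optional[float]    # None when no CADD score is available
    effect_size: float       # per-variant effect estimate


def passes_filter(v: Variant, cadd_cutoff: float = 10.0,
                  effect_cutoff: float = 1.5) -> bool:
    """Keep a variant by CADD when scored, otherwise by |effect size|."""
    if v.cadd is not None:
        return v.cadd > cadd_cutoff
    return abs(v.effect_size) >= effect_cutoff


def prioritize(variants: List[Variant]) -> List[Variant]:
    """Return the variants that pass the filter, in their input order."""
    return [v for v in variants if passes_filter(v)]


# Made-up example data (not from the study).
example = [
    Variant("v1", 12.0, 0.1),   # kept: CADD > 10
    Variant("v2", 5.0, 3.0),    # dropped: has a CADD score, but it is <= 10
    Variant("v3", None, -1.5),  # kept: no CADD score, |effect| >= 1.5
    Variant("v4", None, 1.0),   # dropped: no CADD score, |effect| < 1.5
]
kept_ids = [v.variant_id for v in prioritize(example)]
```

Under a looser reading, where either criterion alone is sufficient for every variant, `v2` would also be kept; the abstract does not settle which reading the authors intended.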
Affiliation(s)
- Yu-Ping Tseng
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
| | - Ya-Sian Chang
- Department of Pathology, Chung Shan Medical University Hospital, Taichung, Taiwan
| | | | - Ting-Yuan Liu
- Department of Medical Research, China Medical University Hospital, Taichung, Taiwan
| | - Jan-Gowth Chang
- Department of Laboratory Medicine, China Medical University Hospital, Taichung, Taiwan
| | - Grace S. Shieh
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
- Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
- Data Science Degree Program, Academia Sinica and National Taiwan University, Taipei, Taiwan
- Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, Taiwan
|
7
|
Weinstock JS, Chaudhry SA, Ioannou M, Viskadourou M, Reventun P, Jakubek YA, Liggett LA, Laurie C, Broome JG, Khan A, Taylor KD, Guo X, Peyser PA, Boerwinkle E, Chami N, Kenny EE, Loos RJ, Psaty BM, Russell TP, Brody JA, Yun JH, Cho MH, Vasan RS, Kardia SL, Smith JA, Raffield LM, Bidulescu A, O’Brien E, de Andrade M, Rotter JI, Rich SS, Tracy RP, Chen YDI, Gu CC, Hsiung CA, Kooperberg C, Haring B, Nassir R, Mathias R, Reiner A, Sankaran V, Lowenstein CJ, Blackwell TW, Abecasis GR, Smith AV, Kang HM, Natarajan P, Jaiswal S, Bick A, Post WS, Scheet P, Auer P, Karantanos T, Battle A, Arvanitis M. The Genetic Determinants and Genomic Consequences of Non-Leukemogenic Somatic Point Mutations. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.08.22.24312319. [PMID: 39228737 PMCID: PMC11370504 DOI: 10.1101/2024.08.22.24312319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
Clonal hematopoiesis (CH) is defined by the expansion of a lineage of genetically identical cells in blood. Genetic lesions that confer a fitness advantage, such as point mutations or mosaic chromosomal alterations (mCAs) in genes associated with hematologic malignancy, are frequent mediators of CH. However, recent analyses of both single cell-derived colonies of hematopoietic cells and population sequencing cohorts have revealed CH frequently occurs in the absence of known driver genetic lesions. To characterize CH without known driver genetic lesions, we used 51,399 deeply sequenced whole genomes from the NHLBI TOPMed sequencing initiative to perform simultaneous germline and somatic mutation analyses among individuals without leukemogenic point mutations (LPM), which we term CH-LPMneg. We quantified CH by estimating the total mutation burden. Because estimating somatic mutation burden without a paired-tissue sample is challenging, we developed a novel statistical method, the Genomic and Epigenomic informed Mutation (GEM) rate, that uses external genomic and epigenomic data sources to distinguish artifactual signals from true somatic mutations. We performed a genome-wide association study of GEM to discover the germline determinants of CH-LPMneg. After fine-mapping and variant-to-gene analyses, we identified seven genes associated with CH-LPMneg (TCL1A, TERT, SMC4, NRIP1, PRDM16, MSRA, SCARB1), and one locus associated with a sex-associated mutation pathway (SRGAP2C). We performed a secondary analysis excluding individuals with mCAs, finding that the genetic architecture was largely unaffected by their inclusion. Functional analyses of SMC4 and NRIP1 implicated altered HSC self-renewal and proliferation as the primary mediator of mutation burden in blood. We then performed comprehensive multi-tissue transcriptomic analyses, finding that the expression levels of 404 genes are associated with GEM. 
Finally, we performed phenotypic association meta-analyses across four cohorts, finding that GEM is associated with increased white blood cell count and increased risk for incident peripheral artery disease, but is not significantly associated with incident stroke or coronary disease events. Overall, we develop GEM for quantifying mutation burden from WGS without a paired-tissue sample and use GEM to discover the genetic, genomic, and phenotypic correlates of CH-LPMneg.
Affiliation(s)
- Joshua S. Weinstock
- Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA, USA
| | - Sharjeel A. Chaudhry
- Division of Cardiology, Department of Medicine, Johns Hopkins University, Baltimore, MD
- Department of Surgery, Division of Vascular and Endovascular Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
| | - Maria Ioannou
- Division of Hematological Malignancies, Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine
| | - Maria Viskadourou
- Division of Cardiology, Department of Medicine, Johns Hopkins University, Baltimore, MD
| | - Paula Reventun
- Division of Cardiology, Department of Medicine, Johns Hopkins University, Baltimore, MD
| | | | - L. Alexander Liggett
- Division of Hematology/Oncology, Boston Children's Hospital and Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02115, USA
| | - Cecelia Laurie
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
| | - Jai G. Broome
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA 98195, USA
| | - Alyna Khan
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
| | - Kent D. Taylor
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA USA
| | - Patricia A. Peyser
- Department of Epidemiology, School of Public Health, Boston University, Boston, MA USA
| | - Eric Boerwinkle
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Nathalie Chami
- The Charles Bronfman Institute of Personalized Medicine
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | | | - Ruth J. Loos
- The Charles Bronfman Institute of Personalized Medicine
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Bruce M. Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Department of Health Systems and Population Health, University of Washington, Seattle, WA, USA
| | - Tracy P. Russell
- Department of Pathology & Laboratory Medicine and Biochemistry, Larner College of Medicine at the University of Vermont, Colchester, VT, USA
| | - Jennifer A. Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Jeong H. Yun
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA USA
| | - Michael H. Cho
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, MA USA
| | - Ramachandran S. Vasan
- National Heart Lung and Blood Institute’s, Boston University’s Framingham Heart Study, Framingham, MA, USA
| | - Sharon L. Kardia
- Department of Epidemiology, University of Michigan, Ann Arbor, MI
| | - Jennifer A. Smith
- Department of Epidemiology, University of Michigan, Ann Arbor, MI
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI
| | - Laura M. Raffield
- Department of Genetics, University of North Carolina, Chapel Hill, NC, 27514
| | - Aurelian Bidulescu
- Department of Epidemiology and Biostatistics, Indiana University School of Public Health Bloomington, Bloomington, IN, USA
| | | | - Mariza de Andrade
- Mayo Clinic, Department of Health Sciences Research, Rochester, MN, USA
| | - Jerome I. Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA USA
| | - Stephen S. Rich
- Department of Public Health Sciences, Center for Public Health Genomics, University of Virginia, Charlottesville, VA USA
| | - Russell P. Tracy
- Department of Pathology & Laboratory Medicine and Biochemistry, Larner College of Medicine at the University of Vermont, Colchester, VT, USA
| | - Yii Der Ida Chen
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA USA
| | - C. Charles. Gu
- Center for Biostatistics and Data Sciences, Washington University, St. Louis, MO USA
| | - Chao A. Hsiung
- Department of Medicine, Taipei Veterans General Hospital, Taipei Taiwan - 201 Shi-Pai Rd. Sec. 2, Taipei Taiwan
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Bernhard Haring
- Department of Medicine III, Saarland University Hospital, Homburg, Saarland, Germany
- Department of Medicine I, University of Würzburg, Würzburg, Bavaria, Germany
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA
| | - Rami Nassir
- University of California Davis, Davis, CA, USA
| | - Rasika Mathias
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Alex Reiner
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Vijay Sankaran
- Division of Hematology/Oncology, Boston Children's Hospital and Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02115, USA
| | | | - Thomas W. Blackwell
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Goncalo R. Abecasis
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Regeneron Pharmaceuticals, Tarrytown, NY, USA
| | - Albert V. Smith
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Hyun M. Kang
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Pradeep Natarajan
- Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of Harvard & MIT, Cambridge, MA
- Department of Medicine, Harvard Medical School, Boston, MA
| | | | - Alexander Bick
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University, Nashville, TN, USA
| | - Wendy S. Post
- Department of Medicine, Cardiology Division, Johns Hopkins University
| | - Paul Scheet
- Department of Epidemiology, University of Texas M.D. Anderson Cancer Center, Houston, TX, USA
| | - Paul Auer
- Department of Biostatistics, Medical College of Wisconsin
- Division of Biostatistics, Institute for Health and Equity, and Cancer Center, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Theodoros Karantanos
- Division of Hematological Malignancies, Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine
| | - Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD
- Department of Computer Science, Johns Hopkins University, Baltimore, MD
| | - Marios Arvanitis
- Division of Cardiology, Department of Medicine, Johns Hopkins University, Baltimore, MD
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
|
8
|
Cirulli ET, Schiabor Barrett KM, Bolze A, Judge DP, Pawloski PA, Grzymski JJ, Lee W, Washington NL. A power-based sliding window approach to evaluate the clinical impact of rare genetic variants in the nucleotide sequence or the spatial position of the folded protein. HGG ADVANCES 2024; 5:100284. [PMID: 38509709 PMCID: PMC11004801 DOI: 10.1016/j.xhgg.2024.100284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 03/14/2024] [Accepted: 03/15/2024] [Indexed: 03/22/2024] Open
Abstract
Systematic determination of novel variant pathogenicity remains a major challenge, even when there is an established association between a gene and phenotype. Here we present Power Window (PW), a sliding window technique that identifies the impactful regions of a gene using population-scale clinico-genomic datasets. By sizing analysis windows on the number of variant carriers, rather than the number of variants or nucleotides, statistical power is held constant, enabling the localization of clinical phenotypes and removal of unassociated gene regions. The windows can be built by sliding across either the nucleotide sequence of the gene (through 1D space) or the positions of the amino acids in the folded protein (through 3D space). Using a training set of 350k exomes from the UK Biobank (UKB), we developed PW models for well-established gene-disease associations and tested their accuracy in two independent cohorts (117k UKB exomes and 65k exomes sequenced at Helix in the Healthy Nevada Project, myGenetics, or In Our DNA SC studies). The significant models retained a median of 49% of the qualifying variant carriers in each gene (range 2%-98%), with quantitative traits showing a median effect size improvement of 66% compared with aggregating variants across the entire gene, and binary traits' odds ratios improving by a median of 2.2-fold. PW showcases that electronic health record-based statistical analyses can accurately distinguish between novel coding variants in established genes that will have high phenotypic penetrance and those that will not, unlocking new potential for human genomics research, drug development, variant interpretation, and precision medicine.
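The idea of sizing windows by carrier count rather than by variant or nucleotide count can be sketched for the 1D case as follows. This is an illustrative two-pointer implementation under stated assumptions, not the authors' Power Window code: the function name, the input layout, and the example numbers are hypothetical, and carriers are counted by summing per-variant carrier counts, which assumes no individual carries more than one qualifying variant in the gene.

```python
from typing import List, Tuple


def carrier_windows(variants: List[Tuple[int, int]],
                    carriers_per_window: int) -> List[Tuple[int, int, int]]:
    """Build sliding windows that each hold at least a target carrier count.

    variants: (position, n_carriers) pairs sorted by position.
    Returns (start_position, end_position, total_carriers) for every start
    variant whose window can reach the target before the gene ends.
    """
    windows = []
    end = 0      # index one past the last variant in the current window
    total = 0    # carriers currently inside the window
    for start in range(len(variants)):
        # Extend the right edge until the window holds enough carriers.
        while end < len(variants) and total < carriers_per_window:
            total += variants[end][1]
            end += 1
        if total < carriers_per_window:
            break  # not enough carriers remain to fill another window
        windows.append((variants[start][0], variants[end - 1][0], total))
        # Slide the left edge past the current start variant.
        total -= variants[start][1]
    return windows


# Made-up positions and carrier counts (not from the study).
example = [(100, 3), (105, 1), (110, 2), (120, 4), (130, 1)]
result = carrier_windows(example, carriers_per_window=5)
# result == [(100, 110, 6), (105, 120, 7), (110, 120, 6), (120, 130, 5)]
```

Because every window holds roughly the same number of carriers, windows over sparsely carried regions span more nucleotides than windows over densely carried ones, which is what keeps the per-window sample size, and hence power, approximately constant. A 3D variant would replace the sorted-position order with distance from a focal residue in the folded protein.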
Affiliation(s)
| | | | - Alexandre Bolze
- Helix, 101 S Ellsworth Ave Suite 350, San Mateo, CA 94401, USA
| | - Daniel P Judge
- Division of Cardiology, Medical University of South Carolina, 30 Courtenay Drive, MSC 592, Charleston, SC 29425, USA
| | | | - Joseph J Grzymski
- University of Nevada, 2215 Raggio Pkwy, Reno, NV 89512, USA; Renown Institute for Health Innovation, Reno, NV 89512, USA
| | - William Lee
- Helix, 101 S Ellsworth Ave Suite 350, San Mateo, CA 94401, USA
|
9
|
Alireza Z, Maleeha M, Kaikkonen M, Fortino V. Enhancing prediction accuracy of coronary artery disease through machine learning-driven genomic variant selection. J Transl Med 2024; 22:356. [PMID: 38627847 PMCID: PMC11020205 DOI: 10.1186/s12967-024-05090-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 03/14/2024] [Indexed: 04/19/2024] Open
Abstract
Machine learning (ML) methods are increasingly becoming crucial in genome-wide association studies for identifying key genetic variants or SNPs that statistical methods might overlook. Statistical methods predominantly identify SNPs with notable effect sizes by conducting association tests on individual genetic variants, one at a time, to determine their relationship with the target phenotype. These genetic variants are then used to create polygenic risk scores (PRSs), estimating an individual's genetic risk for complex diseases like cancer or cardiovascular disorders. Unlike traditional methods, ML algorithms can identify groups of low-risk genetic variants that improve prediction accuracy when combined in a mathematical model. However, the application of ML strategies requires addressing the feature selection challenge to prevent overfitting. Moreover, ensuring the ML model depends on a concise set of genomic variants enhances its clinical applicability, where testing is feasible for only a limited number of SNPs. In this study, we introduce a robust pipeline that applies ML algorithms in combination with feature selection (ML-FS algorithms), aimed at identifying the most significant genomic variants associated with the coronary artery disease (CAD) phenotype. The proposed computational approach was tested on individuals from the UK Biobank, differentiating between CAD and non-CAD individuals within this extensive cohort, and benchmarked against standard PRS-based methodologies like LDpred2 and Lassosum. Our strategy incorporates cross-validation to ensure a more robust evaluation of genomic variant-based prediction models. This method is commonly applied in machine learning strategies but has often been neglected in previous studies assessing the predictive performance of polygenic risk scores. 
Our results demonstrate that the ML-FS algorithm can identify panels with as few as 50 genetic markers that can achieve approximately 80% accuracy when used in combination with known risk factors. The modest increase in accuracy over PRS performances is noteworthy, especially considering that PRS models incorporate a substantially larger number of genetic variants. This extensive variant selection can pose practical challenges in clinical settings. Additionally, the proposed approach revealed novel CAD-genetic variant associations.
Affiliation(s)
- Z Alireza
  - Institute of Biomedicine, University of Eastern Finland, 70210, Kuopio, Finland
- M Maleeha
  - Institute of Biomedicine, University of Eastern Finland, 70210, Kuopio, Finland
- M Kaikkonen
  - A.I.Virtanen Institute, University of Eastern Finland, 70210, Kuopio, Finland
- V Fortino
  - Institute of Biomedicine, University of Eastern Finland, 70210, Kuopio, Finland
10
|
Li X, Pura J, Allen A, Owzar K, Lu J, Harms M, Xie J. DYNATE: Localizing rare-variant association regions via multiple testing embedded in an aggregation tree. Genet Epidemiol 2024; 48:42-55. [PMID: 38014869 PMCID: PMC10842871 DOI: 10.1002/gepi.22542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Revised: 10/09/2023] [Accepted: 10/26/2023] [Indexed: 11/29/2023]
Abstract
Rare-variants (RVs) genetic association studies enable researchers to uncover the variation in phenotypic traits left unexplained by common variation. Traditional single-variant analysis lacks power; thus, researchers have developed various methods to aggregate the effects of RVs across genomic regions to study their collective impact. Some existing methods utilize a static delineation of genomic regions, often resulting in suboptimal effect aggregation, as neutral subregions within the test region will result in an attenuation of signal. Other methods use varying windows to search for signals but often result in long regions containing many neutral RVs. To pinpoint short genomic regions enriched for disease-associated RVs, we developed a novel method, DYNamic Aggregation TEsting (DYNATE). DYNATE dynamically and hierarchically aggregates smaller genomic regions into larger ones and performs multiple testing for disease associations with a controlled weighted false discovery rate. DYNATE's main advantage lies in its strong ability to identify short genomic regions highly enriched for disease-associated RVs. Extensive numerical simulations demonstrate the superior performance of DYNATE under various scenarios compared with existing methods. We applied DYNATE to an amyotrophic lateral sclerosis study and identified a new gene, EPG5, harboring possibly pathogenic mutations.
Affiliation(s)
- Xuechan Li
  - Novartis Pharmaceuticals Corporation, Basel, Switzerland
- Andrew Allen
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
- Kouros Owzar
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
- Jianfeng Lu
  - Department of Mathematics, Duke University, Durham, North Carolina, USA
- Matthew Harms
  - Department of Neurology, Columbia University, Broadway, New York, USA
- Jichun Xie
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
  - Department of Mathematics, Duke University, Durham, North Carolina, USA
11
|
Chen H, Naseri A, Zhi D. FiMAP: A fast identity-by-descent mapping test for biobank-scale cohorts. PLoS Genet 2023; 19:e1011057. [PMID: 38039339 PMCID: PMC10718418 DOI: 10.1371/journal.pgen.1011057] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/10/2023] [Revised: 12/13/2023] [Accepted: 11/07/2023] [Indexed: 12/03/2023] [Open Access]
Abstract
Although genome-wide association studies (GWAS) have identified tens of thousands of genetic loci, the genetic architecture is still not fully understood for many complex traits. Most GWAS and sequencing association studies have focused on single nucleotide polymorphisms or copy number variations, including common and rare genetic variants. However, phased haplotype information is often ignored in GWAS or variant set tests for rare variants. Here we leverage the identity-by-descent (IBD) segments inferred from a random projection-based IBD detection algorithm in the mapping of genetic associations with complex traits, to develop a computationally efficient statistical test for IBD mapping in biobank-scale cohorts. We used sparse linear algebra and random matrix algorithms to speed up the computation, and a genome-wide IBD mapping scan of more than 400,000 samples finished within a few hours. Simulation studies showed that our new method had well-controlled type I error rates under the null hypothesis of no genetic association in large biobank-scale cohorts, and outperformed traditional GWAS single-variant tests when the causal variants were untyped and rare, or in the presence of haplotype effects. We also applied our method to IBD mapping of six anthropometric traits using the UK Biobank data and identified a total of 3,442 associations, 2,131 (62%) of which remained significant after conditioning on suggestive tag variants in the ± 3 centimorgan flanking regions from GWAS.
Affiliation(s)
- Han Chen
  - Human Genetics Center, Department of Epidemiology, School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas, United States of America
- Ardalan Naseri
  - Center for Artificial Intelligence and Genome Informatics, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, United States of America
- Degui Zhi
  - Center for Artificial Intelligence and Genome Informatics, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, United States of America
12
|
Chen Y, Paramo MI, Zhang Y, Yao L, Shah SR, Jin Y, Zhang J, Pan X, Yu H. Finding Needles in the Haystack: Strategies for Uncovering Noncoding Regulatory Variants. Annu Rev Genet 2023; 57:201-222. [PMID: 37562413 DOI: 10.1146/annurev-genet-030723-120717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2023]
Abstract
Despite accumulating evidence implicating noncoding variants in human diseases, unraveling their functionality remains a significant challenge. Systematic annotations of the regulatory landscape and the growth of sequence variant data sets have fueled the development of tools and methods to identify causal noncoding variants and evaluate their regulatory effects. Here, we review the latest advances in the field and discuss potential future research avenues to gain a more in-depth understanding of noncoding regulatory variants.
Affiliation(s)
- You Chen
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Mauricio I Paramo
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Yingying Zhang
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Li Yao
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
  - Department of Computational Biology, Cornell University, Ithaca, New York, USA
- Sagar R Shah
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Yiyang Jin
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Junke Zhang
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
  - Department of Computational Biology, Cornell University, Ithaca, New York, USA
- Xiuqi Pan
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Haiyuan Yu
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
  - Department of Computational Biology, Cornell University, Ithaca, New York, USA
13
|
Li X, Chen H, Selvaraj MS, Van Buren E, Zhou H, Wang Y, Sun R, McCaw ZR, Yu Z, Arnett DK, Bis JC, Blangero J, Boerwinkle E, Bowden DW, Brody JA, Cade BE, Carson AP, Carlson JC, Chami N, Chen YDI, Curran JE, de Vries PS, Fornage M, Franceschini N, Freedman BI, Gu C, Heard-Costa NL, He J, Hou L, Hung YJ, Irvin MR, Kaplan RC, Kardia SL, Kelly T, Konigsberg I, Kooperberg C, Kral BG, Li C, Loos RJ, Mahaney MC, Martin LW, Mathias RA, Minster RL, Mitchell BD, Montasser ME, Morrison AC, Palmer ND, Peyser PA, Psaty BM, Raffield LM, Redline S, Reiner AP, Rich SS, Sitlani CM, Smith JA, Taylor KD, Tiwari H, Vasan RS, Wang Z, Yanek LR, Yu B, Rice KM, Rotter JI, Peloso GM, Natarajan P, Li Z, Liu Z, Lin X. A statistical framework for powerful multi-trait rare variant analysis in large-scale whole-genome sequencing studies. bioRxiv 2023:2023.10.30.564764. [PMID: 37961350 PMCID: PMC10634938 DOI: 10.1101/2023.10.30.564764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Large-scale whole-genome sequencing (WGS) studies have improved our understanding of the contributions of coding and noncoding rare variants to complex human traits. Leveraging association effect sizes across multiple traits in WGS rare variant association analysis can improve statistical power over single-trait analysis, and also detect pleiotropic genes and regions. Existing multi-trait methods have limited ability to perform rare variant analysis of large-scale WGS data. We propose MultiSTAAR, a statistical framework and computationally-scalable analytical pipeline for functionally-informed multi-trait rare variant analysis in large-scale WGS studies. MultiSTAAR accounts for relatedness, population structure and correlation among phenotypes by jointly analyzing multiple traits, and further empowers rare variant association analysis by incorporating multiple functional annotations. We applied MultiSTAAR to jointly analyze three lipid traits (low-density lipoprotein cholesterol, high-density lipoprotein cholesterol and triglycerides) in 61,861 multi-ethnic samples from the Trans-Omics for Precision Medicine (TOPMed) Program. We discovered new associations with lipid traits missed by single-trait analysis, including rare variants within an enhancer of NIPSNAP3A and an intergenic region on chromosome 1.
Affiliation(s)
- Xihao Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Han Chen
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Margaret Sunitha Selvaraj
  - Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Eric Van Buren
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Hufeng Zhou
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Yuxuan Wang
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Ryan Sun
  - Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Zachary R. McCaw
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Zhi Yu
  - Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Donna K. Arnett
  - Provost Office, University of South Carolina, Columbia, SC, USA
- Joshua C. Bis
  - Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- John Blangero
  - Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
- Eric Boerwinkle
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Donald W. Bowden
  - Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Jennifer A. Brody
  - Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Brian E. Cade
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
- April P. Carson
  - Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
- Jenna C. Carlson
  - Department of Human Genetics and Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Nathalie Chami
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Yii-Der Ida Chen
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Joanne E. Curran
  - Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
- Paul S. de Vries
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Myriam Fornage
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Brown Foundation Institute of Molecular Medicine, McGovern Medical School, the University of Texas Health Science Center at Houston, Houston, TX, USA
- Nora Franceschini
  - Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Barry I. Freedman
  - Department of Internal Medicine, Nephrology, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Charles Gu
  - Division of Biology & Biomedical Sciences, Washington University School of Medicine, St. Louis, MO, USA
- Nancy L. Heard-Costa
  - Department of Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
  - Framingham Heart Study, Framingham, MA, USA
- Jiang He
  - Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
  - Tulane University Translational Science Institute, New Orleans, LA, USA
- Lifang Hou
  - Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
- Yi-Jen Hung
  - Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Marguerite R. Irvin
  - Department of Epidemiology, School of Public Health, University of Alabama at Birmingham, Birmingham, AL, USA
- Robert C. Kaplan
  - Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Sharon L.R. Kardia
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Tanika Kelly
  - Department of Medicine, Division of Nephrology, University of Illinois Chicago, Chicago, IL, USA
- Iain Konigsberg
  - Department of Biomedical Informatics, University of Colorado, Aurora, CO, USA
- Charles Kooperberg
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Brian G. Kral
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Changwei Li
  - Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
  - Tulane University Translational Science Institute, New Orleans, LA, USA
- Ruth J.F. Loos
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Michael C. Mahaney
  - Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
- Lisa W. Martin
  - George Washington University School of Medicine and Health Sciences, Washington, DC, USA
- Rasika A. Mathias
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Ryan L. Minster
  - Department of Human Genetics and Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Braxton D. Mitchell
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- May E. Montasser
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Alanna C. Morrison
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Nicholette D. Palmer
  - Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Patricia A. Peyser
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Bruce M. Psaty
  - Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
  - Departments of Epidemiology, University of Washington, Seattle, WA, USA
  - Department of Health Systems and Population Health, University of Washington, Seattle, WA, USA
- Laura M. Raffield
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Susan Redline
  - Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
- Alexander P. Reiner
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
  - Departments of Epidemiology, University of Washington, Seattle, WA, USA
- Stephen S. Rich
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Colleen M. Sitlani
  - Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Jennifer A. Smith
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Kent D. Taylor
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Hemant Tiwari
  - Department of Biostatistics, School of Public Health, University of Alabama at Birmingham, Birmingham, AL, USA
- Ramachandran S. Vasan
  - Framingham Heart Study, Framingham, MA, USA
  - Department of Quantitative and Qualitative Health Sciences, UT Health San Antonio School of Public Health, San Antonio, TX, USA
- Zhe Wang
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Lisa R. Yanek
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Bing Yu
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Kenneth M. Rice
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Jerome I. Rotter
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Gina M. Peloso
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Pradeep Natarajan
  - Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Zilin Li
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Zhonghua Liu
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, USA
- Xihong Lin
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Statistics, Harvard University, Cambridge, MA, USA
14
|
Liang X, Sun H. Weighted Selection Probability to Prioritize Susceptible Rare Variants in Multi-Phenotype Association Studies with Application to a Soybean Genetic Data Set. J Comput Biol 2023; 30:1075-1088. [PMID: 37871292 DOI: 10.1089/cmb.2022.0487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023] [Open Access]
Abstract
Rare variant association studies with multiple traits or diseases have drawn a lot of attention since association signals of rare variants can be boosted if more than one phenotype outcome is associated with the same rare variants. Most of the existing statistical methods to identify rare variants associated with multiple phenotypes are based on a group test, where a pre-specified genetic region is tested one at a time. However, these methods are not designed to locate susceptible rare variants within the genetic region. In this article, we propose new statistical methods to prioritize rare variants within a genetic region when a group test for the genetic region identifies a statistical association with multiple phenotypes. It computes the weighted selection probability (WSP) of individual rare variants and ranks them from largest to smallest according to their WSP. In simulation studies, we demonstrated that the proposed method outperforms other statistical methods in terms of true positive selection, when multiple phenotypes are correlated with each other. We also applied it to our soybean single nucleotide polymorphism (SNP) data with 13 highly correlated amino acids, where we identified some potentially susceptible rare variants in chromosome 19.
Affiliation(s)
- Xianglong Liang
  - Department of Statistics, Pusan National University, Busan, Korea
- Hokeun Sun
  - Department of Statistics, Pusan National University, Busan, Korea
15
|
Bocher O, Marenne G, Génin E, Perdry H. Ravages: An R package for the simulation and analysis of rare variants in multicategory phenotypes. Genet Epidemiol 2023; 47:450-460. [PMID: 37158367 DOI: 10.1002/gepi.22529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Revised: 03/27/2023] [Accepted: 04/25/2023] [Indexed: 05/10/2023]
Abstract
Current software packages for the analysis and the simulations of rare variants are only available for binary and continuous traits. Ravages provides solutions in a single R package to perform rare variant association tests for multicategory, binary and continuous phenotypes, to simulate datasets under different scenarios and to compute statistical power. Association tests can be run in the whole genome thanks to C++ implementation of most of the functions, using either RAVA-FIRST, a recently developed strategy to filter and analyse genome-wide rare variants, or user-defined candidate regions. Ravages also includes a simulation module that generates genetic data for cases who can be stratified into several subgroups and for controls. Through comparisons with existing programmes, we show that Ravages complements existing tools and will be useful to study the genetic architecture of complex diseases. Ravages is available on the CRAN at https://cran.r-project.org/web/packages/Ravages/ and maintained on Github at https://github.com/genostats/Ravages.
Affiliation(s)
- Ozvan Bocher
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France
  - Institute of Translational Genomics, Helmholtz Zentrum München, Munich, Germany
- Hervé Perdry
  - CESP Inserm, U1018, UFR Médecine, Univ Paris-Sud, Université Paris-Saclay, Villejuif, France
16
|
Jiang Z, Zhang H, Ahearn TU, Garcia-Closas M, Chatterjee N, Zhu H, Zhan X, Zhao N. The sequence kernel association test for multicategorical outcomes. Genet Epidemiol 2023; 47:432-449. [PMID: 37078108 DOI: 10.1002/gepi.22527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Revised: 03/29/2023] [Accepted: 03/30/2023] [Indexed: 04/21/2023]
Abstract
Disease heterogeneity is ubiquitous in biomedical and clinical studies. In genetic studies, researchers are increasingly interested in understanding the distinct genetic underpinning of subtypes of diseases. However, existing set-based analysis methods for genome-wide association studies are either inadequate or inefficient to handle such multicategorical outcomes. In this paper, we proposed a novel set-based association analysis method, sequence kernel association test (SKAT)-MC, the sequence kernel association test for multicategorical outcomes (nominal or ordinal), which jointly evaluates the relationship between a set of variants (common and rare) and disease subtypes. Through comprehensive simulation studies, we showed that SKAT-MC effectively preserves the nominal type I error rate while substantially increasing the statistical power compared to existing methods under various scenarios. We applied SKAT-MC to the Polish breast cancer study (PBCS), and identified that the gene FGFR2 was significantly associated with estrogen receptor (ER)+ and ER- breast cancer subtypes. We also investigated educational attainment using UK Biobank data (N = 127,127) with SKAT-MC, and identified 21 significant genes in the genome. Consequently, SKAT-MC is a powerful and efficient analysis tool for genetic association studies with multicategorical outcomes. A freely distributed R package SKAT-MC can be accessed at https://github.com/Zhiwen-Owen-Jiang/SKATMC.
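For orientation, the score statistic that SKAT-type set tests build on can be sketched in a few lines of Python. This is the standard single-outcome SKAT statistic, not the SKAT-MC extension described in the abstract, and it does not compute a p-value. The function names `beta_density` and `skat_q`, the intercept-only null model (no covariates), and the default Beta(1, 25) weighting on minor allele frequency are assumptions made for illustration.

```python
import math


def beta_density(x, a=1.0, b=25.0):
    """Beta(a, b) density at x, used here as a variant weight based on MAF."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)


def skat_q(genotypes, y, weights=None):
    """SKAT-style score statistic Q = sum_j (w_j * S_j)^2.

    genotypes: list of rows, one per individual, each a list of allele
               counts (0, 1, 2) per variant.
    y:         list of outcomes (0/1 or continuous), one per individual.
    weights:   optional per-variant weights; if omitted, the Beta(1, 25)
               density of each variant's sample MAF is used (an assumption).
    """
    n = len(y)
    n_variants = len(genotypes[0])
    # Assumed intercept-only null model: the fitted mean is the sample mean.
    mu = sum(y) / n
    resid = [yi - mu for yi in y]

    q = 0.0
    for j in range(n_variants):
        col = [row[j] for row in genotypes]
        if weights is None:
            maf = sum(col) / (2.0 * n)
            w = beta_density(maf)
        else:
            w = weights[j]
        # Per-variant score: genotype-weighted sum of null-model residuals.
        score = sum(g * r for g, r in zip(col, resid))
        q += (w * score) ** 2
    return q
```

Because each variant contributes a squared score, variants with opposite effect directions do not cancel, which is the usual motivation for kernel tests over simple burden sums. A real analysis would adjust for covariates and obtain a p-value from the null distribution of Q, which this sketch omits.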
Affiliation(s)
- Zhiwen Jiang
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Haoyu Zhang
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland, USA
- Thomas U Ahearn
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland, USA
- Montserrat Garcia-Closas
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland, USA
- Nilanjan Chatterjee
  - Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland, USA
- Hongtu Zhu
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Xiang Zhan
  - Department of Biostatistics, Peking University, Beijing, China
- Ni Zhao
  - Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland, USA
17
|
Obry L, Dalmasso C. Weighted multiple testing procedures in genome-wide association studies. PeerJ 2023; 11:e15369. [PMID: 37337586 PMCID: PMC10276986 DOI: 10.7717/peerj.15369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 04/17/2023] [Indexed: 06/21/2023] [Open Access]
Abstract
Multiple testing procedures controlling the false discovery rate (FDR) are increasingly used in the context of genome wide association studies (GWAS), and weighted multiple testing procedures that incorporate covariate information are efficient to improve the power to detect associations. In this work, we evaluate some recent weighted multiple testing procedures in the specific context of GWAS through a simulation study. We also present a new efficient procedure called wBHa that prioritizes the detection of genetic variants with low minor allele frequencies while maximizing the overall detection power. The results indicate good performance of our procedure compared to other weighted multiple testing procedures. In particular, in all simulated settings, wBHa tends to outperform other procedures in detecting rare variants while maintaining good overall power. The use of the different procedures is illustrated with a real dataset.
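To make the idea of a weighted multiple testing procedure concrete, here is a minimal Python sketch of the generic weighted Benjamini-Hochberg step-up rule: each p-value is divided by a non-negative weight (rescaled to average 1), and the ordinary BH rule is applied to the weighted p-values. This is not the wBHa procedure itself, whose weighting scheme is not specified in the abstract; the function name `weighted_bh` and the example weights are hypothetical.

```python
def weighted_bh(pvalues, weights, alpha=0.05):
    """Indices rejected by weighted Benjamini-Hochberg at FDR level alpha.

    pvalues: list of p-values, one per hypothesis.
    weights: list of non-negative prior weights (larger = prioritized);
             they are rescaled internally so that their mean is 1.
    """
    m = len(pvalues)
    if m != len(weights):
        raise ValueError("pvalues and weights must have the same length")
    if m == 0:
        return []
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    # Rescale weights to average 1 so the overall error budget is unchanged.
    mean_w = total / m
    scaled = [w / mean_w for w in weights]

    # Weighted p-values; a zero weight means the hypothesis is never rejected.
    q = [p / w if w > 0 else float("inf") for p, w in zip(pvalues, scaled)]

    # Step-up rule: find the largest rank k with q_(k) <= k * alpha / m.
    order = sorted(range(m), key=lambda i: q[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if q[i] <= rank * alpha / m:
            k = rank
    return sorted(order[:k])
```

With equal weights this reduces to ordinary BH. Up-weighting a hypothesis (for example, a low-frequency variant) lowers its weighted p-value and can move it past the rejection threshold, at the cost of down-weighting others.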
Affiliation(s)
- Ludivine Obry
  - Université Paris-Saclay, CNRS, Univ Evry, Laboratoire de Mathématiques et Modélisation d’Evry, Evry-Courcouronnes, France
- Cyril Dalmasso
  - Université Paris-Saclay, CNRS, Univ Evry, Laboratoire de Mathématiques et Modélisation d’Evry, Evry-Courcouronnes, France
18
|
Weinstock JS, Gopakumar J, Burugula BB, Uddin MM, Jahn N, Belk JA, Bouzid H, Daniel B, Miao Z, Ly N, Mack TM, Luna SE, Prothro KP, Mitchell SR, Laurie CA, Broome JG, Taylor KD, Guo X, Sinner MF, von Falkenhausen AS, Kääb S, Shuldiner AR, O'Connell JR, Lewis JP, Boerwinkle E, Barnes KC, Chami N, Kenny EE, Loos RJF, Fornage M, Hou L, Lloyd-Jones DM, Redline S, Cade BE, Psaty BM, Bis JC, Brody JA, Silverman EK, Yun JH, Qiao D, Palmer ND, Freedman BI, Bowden DW, Cho MH, DeMeo DL, Vasan RS, Yanek LR, Becker LC, Kardia SLR, Peyser PA, He J, Rienstra M, Van der Harst P, Kaplan R, Heckbert SR, Smith NL, Wiggins KL, Arnett DK, Irvin MR, Tiwari H, Cutler MJ, Knight S, Muhlestein JB, Correa A, Raffield LM, Gao Y, de Andrade M, Rotter JI, Rich SS, Tracy RP, Konkle BA, Johnsen JM, Wheeler MM, Smith JG, Melander O, Nilsson PM, Custer BS, Duggirala R, Curran JE, Blangero J, McGarvey S, Williams LK, Xiao S, Yang M, Gu CC, Chen YDI, Lee WJ, Marcus GM, Kane JP, Pullinger CR, Shoemaker MB, Darbar D, Roden DM, Albert C, Kooperberg C, Zhou Y, Manson JE, Desai P, Johnson AD, Mathias RA, Blackwell TW, Abecasis GR, Smith AV, Kang HM, Satpathy AT, Natarajan P, Kitzman JO, Whitsel EA, Reiner AP, Bick AG, Jaiswal S. Aberrant activation of TCL1A promotes stem cell expansion in clonal haematopoiesis. Nature 2023; 616:755-763. [PMID: 37046083 PMCID: PMC10360040 DOI: 10.1038/s41586-023-05806-1] [Citation(s) in RCA: 50] [Impact Index Per Article: 25.0] [Received: 09/03/2021] [Accepted: 02/08/2023] [Indexed: 04/14/2023]
Abstract
Mutations in a diverse set of driver genes increase the fitness of haematopoietic stem cells (HSCs), leading to clonal haematopoiesis [1]. These lesions are precursors for blood cancers [2-6], but the basis of their fitness advantage remains largely unknown, partly owing to a paucity of large cohorts in which the clonal expansion rate has been assessed by longitudinal sampling. Here, to circumvent this limitation, we developed a method to infer the expansion rate from data from a single time point. We applied this method to 5,071 people with clonal haematopoiesis. A genome-wide association study revealed that a common inherited polymorphism in the TCL1A promoter was associated with a slower expansion rate in clonal haematopoiesis overall, but the effect varied by driver gene. Those carrying this protective allele exhibited markedly reduced growth rates or prevalence of clones with driver mutations in TET2, ASXL1, SF3B1 and SRSF2, but this effect was not seen in clones with driver mutations in DNMT3A. TCL1A was not expressed in normal or DNMT3A-mutated HSCs, but the introduction of mutations in TET2 or ASXL1 led to the expression of TCL1A protein and the expansion of HSCs in vitro. The protective allele restricted TCL1A expression and expansion of mutant HSCs, as did experimental knockdown of TCL1A expression. Forced expression of TCL1A promoted the expansion of human HSCs in vitro and mouse HSCs in vivo. Our results indicate that the fitness advantage of several commonly mutated driver genes in clonal haematopoiesis may be mediated by TCL1A activation.
Affiliation(s)
- Joshua S Weinstock
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Md Mesbah Uddin
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Nikolaus Jahn
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - Julia A Belk
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - Hind Bouzid
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - Bence Daniel
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - Zhuang Miao
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Nghi Ly
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - Taralynn M Mack
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University, Nashville, TN, USA
| | - Sofia E Luna
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, USA
| | - Katherine P Prothro
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
| | - Shaneice R Mitchell
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - Cecelia A Laurie
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- University of Washington, Seattle, WA, USA
| | - Jai G Broome
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- University of Washington, Seattle, WA, USA
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Kent D Taylor
- Department of Pediatrics, The Institute for Translational Genomics and Population Sciences, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Institute for Translational Genomics and Population Sciences, Lundquist Institute, Torrance, CA, USA
| | - Xiuqing Guo
- Department of Pediatrics, The Institute for Translational Genomics and Population Sciences, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Lundquist Institute, Torrance, CA, USA
| | - Moritz F Sinner
- Department of Medicine I, University Hospital, LMU Munich, Munich, Germany
- German Centre for Cardiovascular Research (DZHK), partner site: Munich Heart Alliance, Munich, Germany
| | - Aenne S von Falkenhausen
- Department of Medicine I, University Hospital, LMU Munich, Munich, Germany
- German Centre for Cardiovascular Research (DZHK), partner site: Munich Heart Alliance, Munich, Germany
| | - Stefan Kääb
- Department of Medicine I, University Hospital, LMU Munich, Munich, Germany
- German Centre for Cardiovascular Research (DZHK), partner site: Munich Heart Alliance, Munich, Germany
| | - Alan R Shuldiner
- Department of Medicine, University of Maryland, Baltimore, Baltimore, MD, USA
| | - Jeffrey R O'Connell
- Department of Medicine, University of Maryland, Baltimore, Baltimore, MD, USA
| | - Joshua P Lewis
- Department of Medicine, University of Maryland, Baltimore, Baltimore, MD, USA
- University of Maryland, Baltimore, MD, USA
| | - Eric Boerwinkle
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- University of Texas Health at Houston, Houston, TX, USA
| | - Kathleen C Barnes
- Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Nathalie Chami
- The Charles Bronfman Institute of Personalized Medicine, New York, NY, USA
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Eimear E Kenny
- Institute for Genomic Health, New York, NY, USA
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Ruth J F Loos
- The Charles Bronfman Institute of Personalized Medicine, New York, NY, USA
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Myriam Fornage
- University of Texas Health at Houston, Houston, TX, USA
- Brown Foundation Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Lifang Hou
- Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
| | - Susan Redline
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Brian E Cade
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Brigham and Women's Hospital, Boston, MA, USA
| | - Bruce M Psaty
- University of Washington, Seattle, WA, USA
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Department of Medicine, University of Washington, Seattle, WA, USA
| | - Joshua C Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Jennifer A Brody
- University of Washington, Seattle, WA, USA
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Edwin K Silverman
- Brigham and Women's Hospital, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Jeong H Yun
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Dandi Qiao
- Brigham and Women's Hospital, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Nicholette D Palmer
- Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
- Department of Biochemistry, Wake Forest Baptist Health, Winston-Salem, NC, USA
| | - Barry I Freedman
- Department of Internal Medicine, Section on Nephrology, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Donald W Bowden
- Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
- Department of Biochemistry, Wake Forest Baptist Health, Winston-Salem, NC, USA
| | - Michael H Cho
- Brigham and Women's Hospital, Boston, MA, USA
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Dawn L DeMeo
- Brigham and Women's Hospital, Boston, MA, USA
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Ramachandran S Vasan
- National Heart Lung and Blood Institute's, Boston University's Framingham Heart Study, Framingham, MA, USA
| | - Lisa R Yanek
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Johns Hopkins University, Baltimore, MD, USA
| | - Lewis C Becker
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Johns Hopkins University, Baltimore, MD, USA
| | - Sharon L R Kardia
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- University of Michigan, Ann Arbor, MI, USA
| | - Patricia A Peyser
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- University of Michigan, Ann Arbor, MI, USA
| | - Jiang He
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
- Tulane University, New Orleans, LA, USA
| | - Michiel Rienstra
- Department of Cardiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Pim Van der Harst
- Department of Cardiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Robert Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
- Albert Einstein College of Medicine, New York, NY, USA
| | - Susan R Heckbert
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Kaiser Permanente Washington Health Research Institute, Kaiser Permanente Washington, Seattle, WA, USA
| | - Nicholas L Smith
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Kaiser Permanente Washington Health Research Institute, Kaiser Permanente Washington, Seattle, WA, USA
- Seattle Epidemiologic Research and Information Center, Department of Veterans Affairs Office of Research and Development, Seattle, WA, USA
- Broad Institute, Cambridge, MA, USA
| | - Kerri L Wiggins
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Donna K Arnett
- College of Public Health, University of Kentucky, Lexington, KY, USA
- University of Kentucky, Lexington, KY, USA
| | - Hemant Tiwari
- Department of Biostatistics, University of Alabama, Birmingham, AL, USA
| | - Michael J Cutler
- Intermountain Heart Institute, Intermountain Medical Center, Salt Lake City, UT, USA
| | - Stacey Knight
- Intermountain Heart Institute, Intermountain Medical Center, Salt Lake City, UT, USA
| | - J Brent Muhlestein
- Intermountain Heart Institute, Intermountain Medical Center, Salt Lake City, UT, USA
| | - Adolfo Correa
- Department of Medicine, Jackson Heart Study, University of Mississippi Medical Center, Jackson, MS, USA
- Department of Population Health Science, University of Mississippi, Jackson, MS, USA
| | - Laura M Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Yan Gao
- Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
- University of Mississippi, Jackson, MS, USA
| | - Mariza de Andrade
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
| | - Jerome I Rotter
- Department of Pediatrics, The Institute for Translational Genomics and Population Sciences, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Department of Pediatrics, Lundquist Institute, Torrance, CA, USA
| | - Stephen S Rich
- Department of Public Health Sciences, Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- University of Virginia, Charlottesville, VA, USA
| | - Russell P Tracy
- Department of Pathology and Laboratory Medicine and Biochemistry, Larner College of Medicine at the University of Vermont, Colchester, VT, USA
- Department of Pathology and Laboratory Medicine, University of Vermont, Burlington, VT, USA
| | - Barbara A Konkle
- Department of Cardiology, Clinical Sciences, Lund University and Skåne University Hospital, Lund, Sweden
- Bloodworks Northwest, Seattle, WA, USA
| | - Jill M Johnsen
- Department of Cardiology, Clinical Sciences, Lund University and Skåne University Hospital, Lund, Sweden
- Research Institute, Bloodworks Northwest, Seattle, WA, USA
| | - J Gustav Smith
- Department of Cardiology, Clinical Sciences, Lund University and Skåne University Hospital, Lund, Sweden
- The Wallenberg Laboratory, Department of Molecular and Clinical Medicine, Institute of Medicine, Gothenburg University, Gothenburg, Sweden
- Wallenberg Center for Molecular Medicine and Lund University Diabetes Center, Lund University, Lund, Sweden
- Department of Cardiology, Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Olle Melander
- Department of Internal Medicine, Clinical Sciences, Lund University and Skåne University Hospital, Malmö, Sweden
| | - Peter M Nilsson
- Department of Internal Medicine, Clinical Sciences, Lund University and Skåne University Hospital, Malmö, Sweden
| | - Ravindranath Duggirala
- Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
- South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
| | - Joanne E Curran
- Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
- South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
- University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
| | - John Blangero
- Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
- South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
| | - Stephen McGarvey
- Department of Epidemiology and International Health Institute, Brown University School of Public Health, Providence, RI, USA
- Department of Epidemiology, Brown University, Providence, RI, USA
| | - L Keoki Williams
- Center for Individualized and Genomic Medicine Research (CIGMA), Department of Internal Medicine, Henry Ford Health System, Detroit, MI, USA
- Henry Ford Health System, Detroit, MI, USA
| | - Shujie Xiao
- Center for Individualized and Genomic Medicine Research (CIGMA), Department of Internal Medicine, Henry Ford Health System, Detroit, MI, USA
| | - Mao Yang
- Center for Individualized and Genomic Medicine Research (CIGMA), Department of Internal Medicine, Henry Ford Health System, Detroit, MI, USA
| | - C Charles Gu
- Division of Biostatistics, Washington University School of Medicine, St Louis, MO, USA
- Washington University in St Louis, St Louis, MO, USA
| | - Yii-Der Ida Chen
- Department of Pediatrics, The Institute for Translational Genomics and Population Sciences, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Lundquist Institute, Torrance, CA, USA
| | - Wen-Jane Lee
- Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan
- Taichung Veterans General Hospital Taiwan, Taichung City, Taiwan
| | - Gregory M Marcus
- Division of Cardiology, University of California, San Francisco, San Francisco, CA, USA
| | - John P Kane
- Department of Medicine, Cardiovascular Research Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Clive R Pullinger
- Cardiovascular Research Institute, University of California, San Francisco, USA
| | - M Benjamin Shoemaker
- Division of Cardiology, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Medicine and Cardiology, Vanderbilt University, Nashville, TN, USA
| | - Dawood Darbar
- Division of Cardiology, University of Illinois at Chicago, Chicago, IL, USA
- University of Illinois at Chicago, Chicago, IL, USA
| | - Dan M Roden
- Departments of Medicine, Pharmacology and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Christine Albert
- Department of Cardiology, Cedars-Sinai, Los Angeles, CA, USA
- Cedars-Sinai, Los Angeles, CA, USA
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Ying Zhou
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - JoAnn E Manson
- Brigham and Women's Hospital, Boston, MA, USA
- Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Pinkal Desai
- Division of Hematology and Oncology, Weill Cornell Medicine, New York, NY, USA
- Englander Institute of Precision Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Andrew D Johnson
- National Heart, Lung and Blood Institute, Population Sciences Branch, Framingham, MA, USA
- Population Sciences Branch, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Rasika A Mathias
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Johns Hopkins University, Baltimore, MD, USA
| | - Thomas W Blackwell
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Goncalo R Abecasis
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Regeneron Pharmaceuticals, Tarrytown, NY, USA
| | - Albert V Smith
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Hyun M Kang
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Ansuman T Satpathy
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - Pradeep Natarajan
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Broad Institute, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
| | - Jacob O Kitzman
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Eric A Whitsel
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
| | - Alexander P Reiner
- Broad Institute, Cambridge, MA, USA
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Fred Hutchinson Cancer Research Center, University of Washington, Seattle, WA, USA
| | - Alexander G Bick
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University, Nashville, TN, USA
| | - Siddhartha Jaiswal
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, CA, USA
|
19
|
Li X, Quick C, Zhou H, Gaynor SM, Liu Y, Chen H, Selvaraj MS, Sun R, Dey R, Arnett DK, Bielak LF, Bis JC, Blangero J, Boerwinkle E, Bowden DW, Brody JA, Cade BE, Correa A, Cupples LA, Curran JE, de Vries PS, Duggirala R, Freedman BI, Göring HHH, Guo X, Haessler J, Kalyani RR, Kooperberg C, Kral BG, Lange LA, Manichaikul A, Martin LW, McGarvey ST, Mitchell BD, Montasser ME, Morrison AC, Naseri T, O'Connell JR, Palmer ND, Peyser PA, Psaty BM, Raffield LM, Redline S, Reiner AP, Reupena MS, Rice KM, Rich SS, Sitlani CM, Smith JA, Taylor KD, Vasan RS, Willer CJ, Wilson JG, Yanek LR, Zhao W, Rotter JI, Natarajan P, Peloso GM, Li Z, Lin X. Powerful, scalable and resource-efficient meta-analysis of rare variant associations in large whole genome sequencing studies. Nat Genet 2023; 55:154-164. [PMID: 36564505 PMCID: PMC10084891 DOI: 10.1038/s41588-022-01225-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 08/16/2021] [Accepted: 10/13/2022] [Indexed: 12/24/2022]
Abstract
Meta-analysis of whole genome sequencing/whole exome sequencing (WGS/WES) studies provides an attractive solution to the problem of collecting large sample sizes for discovering rare variants associated with complex phenotypes. Existing rare variant meta-analysis approaches are not scalable to biobank-scale WGS data. Here we present MetaSTAAR, a powerful and resource-efficient rare variant meta-analysis framework for large-scale WGS/WES studies. MetaSTAAR accounts for relatedness and population structure, can analyze both quantitative and dichotomous traits and boosts the power of rare variant tests by incorporating multiple variant functional annotations. Through meta-analysis of four lipid traits in 30,138 ancestrally diverse samples from 14 studies of the Trans Omics for Precision Medicine (TOPMed) Program, we show that MetaSTAAR performs rare variant meta-analysis at scale and produces results comparable to using pooled data. Additionally, we identified several conditionally significant rare variant associations with lipid traits. We further demonstrate that MetaSTAAR is scalable to biobank-scale cohorts through meta-analysis of TOPMed WGS data and UK Biobank WES data of ~200,000 samples.
Affiliation(s)
- Xihao Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Corbin Quick
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Hufeng Zhou
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Sheila M Gaynor
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Yaowu Liu
- School of Statistics, Southwestern University of Finance and Economics, Chengdu, China
| | - Han Chen
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Margaret Sunitha Selvaraj
- Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
| | - Ryan Sun
- Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Rounak Dey
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Donna K Arnett
- University of Kentucky, College of Public Health, Lexington, KY, USA
| | - Lawrence F Bielak
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Joshua C Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - John Blangero
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Eric Boerwinkle
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Donald W Bowden
- Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Jennifer A Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Brian E Cade
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA, USA
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
| | - Adolfo Correa
- Jackson Heart Study, Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
| | - L Adrienne Cupples
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Framingham Heart Study, National Heart, Lung, and Blood Institute and Boston University, Framingham, MA, USA
| | - Joanne E Curran
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Paul S de Vries
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Ravindranath Duggirala
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Barry I Freedman
- Department of Internal Medicine, Nephrology, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Harald H H Göring
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Jeffrey Haessler
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - Rita R Kalyani
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - Brian G Kral
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Leslie A Lange
- Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Ani Manichaikul
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Lisa W Martin
- Division of Cardiology, George Washington School of Medicine and Health Sciences, Washington, DC, USA
| | - Stephen T McGarvey
- Department of Epidemiology, International Health Institute, Department of Anthropology, Brown University, Providence, RI, USA
| | - Braxton D Mitchell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Geriatrics Research and Education Clinical Center, Baltimore VA Medical Center, Baltimore, MD, USA
| | - May E Montasser
- Division of Endocrinology, Diabetes, and Nutrition, Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Alanna C Morrison
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Take Naseri
- Ministry of Health, Government of Samoa, Apia, Samoa
| | - Jeffrey R O'Connell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Nicholette D Palmer
- Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Patricia A Peyser
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Department of Health Systems and Population Health, University of Washington, Seattle, WA, USA
| | - Laura M Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Susan Redline
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA, USA
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
- Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Alexander P Reiner
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Kenneth M Rice
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Stephen S Rich
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Colleen M Sitlani
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Jennifer A Smith
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
| | - Kent D Taylor
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Ramachandran S Vasan
- Framingham Heart Study, National Heart, Lung, and Blood Institute and Boston University, Framingham, MA, USA
- Department of Medicine, Boston University School of Medicine, Boston, MA, USA
| | - Cristen J Willer
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - James G Wilson
- Division of Cardiology, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Lisa R Yanek
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Wei Zhao
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
| | - Jerome I Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Pradeep Natarajan
- Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
| | - Gina M Peloso
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Zilin Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Xihong Lin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Statistics, Harvard University, Cambridge, MA, USA
|
20
|
Li Z, Li X, Zhou H, Gaynor SM, Selvaraj MS, Arapoglou T, Quick C, Liu Y, Chen H, Sun R, Dey R, Arnett DK, Auer PL, Bielak LF, Bis JC, Blackwell TW, Blangero J, Boerwinkle E, Bowden DW, Brody JA, Cade BE, Conomos MP, Correa A, Cupples LA, Curran JE, de Vries PS, Duggirala R, Franceschini N, Freedman BI, Göring HHH, Guo X, Kalyani RR, Kooperberg C, Kral BG, Lange LA, Lin BM, Manichaikul A, Manning AK, Martin LW, Mathias RA, Meigs JB, Mitchell BD, Montasser ME, Morrison AC, Naseri T, O'Connell JR, Palmer ND, Peyser PA, Psaty BM, Raffield LM, Redline S, Reiner AP, Reupena MS, Rice KM, Rich SS, Smith JA, Taylor KD, Taub MA, Vasan RS, Weeks DE, Wilson JG, Yanek LR, Zhao W, Rotter JI, Willer CJ, Natarajan P, Peloso GM, Lin X. A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nat Methods 2022; 19:1599-1611. [PMID: 36303018 PMCID: PMC10008172 DOI: 10.1038/s41592-022-01640-x] [Citation(s) in RCA: 54] [Impact Index Per Article: 18.0] [Received: 11/06/2021] [Accepted: 09/06/2022] [Indexed: 02/07/2023]
Abstract
Large-scale whole-genome sequencing studies have enabled analysis of noncoding rare-variant (RV) associations with complex human diseases and traits. Variant-set analysis is a powerful approach to study RV association. However, existing methods have limited ability in analyzing the noncoding genome. We propose a computationally efficient and robust noncoding RV association detection framework, STAARpipeline, to automatically annotate a whole-genome sequencing study and perform flexible noncoding RV association analysis, including gene-centric analysis and fixed window-based and dynamic window-based non-gene-centric analysis by incorporating variant functional annotations. In gene-centric analysis, STAARpipeline uses STAAR to group noncoding variants based on functional categories of genes and incorporate multiple functional annotations. In non-gene-centric analysis, STAARpipeline uses SCANG-STAAR to incorporate dynamic window sizes and multiple functional annotations. We apply STAARpipeline to identify noncoding RV sets associated with four lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several of them in an additional 9,123 TOPMed samples. We also analyze five non-lipid TOPMed traits.
Grants
- R01 DK078616 NIDDK NIH HHS
- U01 HG007417 NHGRI NIH HHS
- KL2 TR001100 NCATS NIH HHS
- R01 HL112064 NHLBI NIH HHS
- N01-HC-95160 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R35 HG010692 NHGRI NIH HHS
- U01-HL054472 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01-HL142711 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01-DK071891 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
- F30 HL149180 NHLBI NIH HHS
- R01 NR019628 NINR NIH HHS
- R01 HL113323 NHLBI NIH HHS
- N01-HC-95166 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- UL1RR033176 U.S. Department of Health & Human Services | NIH | National Center for Research Resources (NCRR)
- R01 HL132947 NHLBI NIH HHS
- P30 DK040561 NIDDK NIH HHS
- U01 HL137183 NHLBI NIH HHS
- R01-HL127564 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P30 CA016672 NCI NIH HHS
- R01-HL071051 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL104135 NHLBI NIH HHS
- T32 HL144442 NHLBI NIH HHS
- R35 CA197449 NCI NIH HHS
- P30 ES010126 NIEHS NIH HHS
- DP5 OD029586 NIH HHS
- R01-NS058700 U.S. Department of Health & Human Services | NIH | National Institute of Neurological Disorders and Stroke (NINDS)
- R01 HL123915 NHLBI NIH HHS
- R01 HL120393 NHLBI NIH HHS
- R01HL071259 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL046380 NHLBI NIH HHS
- R01HL071251, R01HL071258, R01HL071259 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U54 HG003067 NHGRI NIH HHS
- 75N92020D00003 NHLBI NIH HHS
- K01 AG059898 NIA NIH HHS
- U01 DK085524 NIDDK NIH HHS
- KL2 TR002542 NCATS NIH HHS
- R01-HL055673-18S1 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R03 HL141439 NHLBI NIH HHS
- HHSN268201500001I NHLBI NIH HHS
- R01-MH078143, R01-MH078111, R01-MH083824 U.S. Department of Health & Human Services | NIH | National Institute of Mental Health (NIMH)
- U01 DK062413 NIDDK NIH HHS
- R01 HL109946 NHLBI NIH HHS
- U01-HL054495 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- K01 HL136700 NHLBI NIH HHS
- U19 CA203654 NCI NIH HHS
- R01-DK078616 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
- U01 HL080295 NHLBI NIH HHS
- NO1-HC-25195 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HG006703 NHGRI NIH HHS
- UL1-TR-001420 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- U01 HG012064 NHGRI NIH HHS
- R35-CA197449 U.S. Department of Health & Human Services | NIH | National Cancer Institute (NCI)
- P30 ES005605 NIEHS NIH HHS
- R01 AR042742 NIAMS NIH HHS
- R21 HL140385 NHLBI NIH HHS
- HHSN268201800015I NHLBI NIH HHS
- U01 HL130114 NHLBI NIH HHS
- R01 HL117191 NHLBI NIH HHS
- R01 HG009974 NHGRI NIH HHS
- U01-HL054473 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 DK113003 NIDDK NIH HHS
- UL1RR033176 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL059367 NHLBI NIH HHS
- R24 AG047115 NIA NIH HHS
- U01-HL137181 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P01 HL107202 NHLBI NIH HHS
- NR0224103 U.S. Department of Health & Human Services | NIH | National Institute of Nursing Research (NINR)
- P50 HL118006 NHLBI NIH HHS
- U01-HL72518, HL087698, HL49762, HL59684, HL58625, HL071025, HL112064 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01 HL120393 NHLBI NIH HHS
- R01 DK117445 NIDDK NIH HHS
- R01-AG058921 U.S. Department of Health & Human Services | NIH | National Institute on Aging (NIA)
- R03-HL154284 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- UL1-TR-000040, UL1-TR-001079, UL1-TR-001420, UL1-TR-001881 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- R01 AG058921 NIA NIH HHS
- R01 HL129132 NHLBI NIH HHS
- R01 HL113338 NHLBI NIH HHS
- HHSN268201800012I NHLBI NIH HHS
- R01 HL153805 NHLBI NIH HHS
- R01 DK072193 NIDDK NIH HHS
- R01 HL137922 NHLBI NIH HHS
- R01 AI079139 NIAID NIH HHS
- N01-HC-95164 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01-DK085524 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
- U19 AI111224 NIAID NIH HHS
- R35 HL135824 NHLBI NIH HHS
- 75N92019D00031 NHLBI NIH HHS
- R01 DK110113 NIDDK NIH HHS
- N01-HC-95159, N01-HC-95160, N01-HC-95161, N01-HC-95162 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- N01-HC-95165 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL138737 NHLBI NIH HHS
- P30 DK079626 NIDDK NIH HHS
- R01 NS058700 NINDS NIH HHS
- R01 HL127564 NHLBI NIH HHS
- T32 HG000040 NHGRI NIH HHS
- DK063491 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
- R01 HL141845 NHLBI NIH HHS
- R01 DK075787 NIDDK NIH HHS
- R01 AR072199 NIAMS NIH HHS
- R01 HL120854 NHLBI NIH HHS
- R01 HL163560 NHLBI NIH HHS
- R01HL071258 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01-HG009088 U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute (NHGRI)
- R01 HL163972 NHLBI NIH HHS
- K23 HL123778 NHLBI NIH HHS
- U01 HL137181 NHLBI NIH HHS
- R01 MH078111 NIMH NIH HHS
- HHSN268201700005I NHLBI NIH HHS
- N01-HC-95159 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01-HL113323 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL141944 NHLBI NIH HHS
- R01 HL119443 NHLBI NIH HHS
- R01-HL071051, R01-HL071205, R01HL071250 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P60-AG10484 U.S. Department of Health & Human Services | NIH | National Institute on Aging (NIA)
- 75N92020D00007 NHLBI NIH HHS
- UM1 AI068634 NIAID NIH HHS
- HHSN268201500003I NHLBI NIH HHS
- HHSN268201700004I NHLBI NIH HHS
- N01-HC-95163 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01-HL071205 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- F30 HL107066 NHLBI NIH HHS
- R01-HL153805 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL105756 NHLBI NIH HHS
- K01 HL125751 NHLBI NIH HHS
- R01 HL067348 NHLBI NIH HHS
- T32 HL007208 NHLBI NIH HHS
- R01 HL142711 NHLBI NIH HHS
- R35 HL135818 NHLBI NIH HHS
- R01-HL92301 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- T32 GM074897 NIGMS NIH HHS
- I01 BX005295 BLRD VA
- 75N92020D00001 NHLBI NIH HHS
- R01 HL113326 NHLBI NIH HHS
- R00 HL129045 NHLBI NIH HHS
- UL1-TR-000040 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- UL1-TR-001079 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- U01 HL072524 NHLBI NIH HHS
- R35-HL135818 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- K08 HL140203 NHLBI NIH HHS
- N01-HC-95162 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- K08 HL141601 NHLBI NIH HHS
- 75N92020D00005 NHLBI NIH HHS
- R01-DK117445 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
- R01-AR48797 U.S. Department of Health & Human Services | NIH | National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS)
- R56 AG058543 NIA NIH HHS
- U19 AI077439 NIAID NIH HHS
- R01 HL142028 NHLBI NIH HHS
- 75N92020D00004 NHLBI NIH HHS
- HHSN268201800011I NHLBI NIH HHS
- R35 GM127131 NIGMS NIH HHS
- U01 HL137880 NHLBI NIH HHS
- R01 HG010869 NHGRI NIH HHS
- R01-HL133040 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- HHSN268201700003I NHLBI NIH HHS
- R01HL071250 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- N01-HC-95168 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL148239 NHLBI NIH HHS
- U01-HL137162 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 AI132476 NIAID NIH HHS
- T32 GM007205 NIGMS NIH HHS
- HHSN268201800010I NHLBI NIH HHS
- R01-HL092577-06S1 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- UL1-TR-001881 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- R01-HL104135-04S1 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL132320 NHLBI NIH HHS
- U01 DK078616 NIDDK NIH HHS
- HHSN268201700001I NHLBI NIH HHS
- R01-HL141944 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01 HL137162 NHLBI NIH HHS
- R01 HG005701 NHGRI NIH HHS
- 75N92020D00001, 75N92020D00002, 75N92020D00003, 75N92020D00004 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 HL143221 NHLBI NIH HHS
- R01 HL142992 NHLBI NIH HHS
- K01 HL129039 NHLBI NIH HHS
- R01 HL133870 NHLBI NIH HHS
- R01 DA037904 NIDA NIH HHS
- R21 HL123677 NHLBI NIH HHS
- R01 DK071891 NIDDK NIH HHS
- HHSN268201800001I U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- 75N92020D00002 NHLBI NIH HHS
- K01 HL130609 NHLBI NIH HHS
- N01-HC-95167 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- T32 HL007374 NHLBI NIH HHS
- N01-HC-95169 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01-DK078616 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
- R01 AR063611 NIAMS NIH HHS
- KL2TR002490 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- R03 HL154284 NHLBI NIH HHS
- M01-RR000052 U.S. Department of Health & Human Services | NIH | National Center for Research Resources (NCRR)
- 75N92020D00006 NHLBI NIH HHS
- S10 OD020069 NIH HHS
- R01 MD012765 NIMHD NIH HHS
- N01-HC-95161 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- HHSN268201700002I NHLBI NIH HHS
- R01 HL151855 NHLBI NIH HHS
- K23 HL138461 NHLBI NIH HHS
- U01 CA182913 NCI NIH HHS
- UG3 HL151865 NHLBI NIH HHS
- F32 HL150992 NHLBI NIH HHS
- R01-MD012765 U.S. Department of Health & Human Services | NIH | National Institute on Minority Health and Health Disparities (NIMHD)
- 75N92020D00005, 75N92020D00006, 75N92020D00007 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 MH101244 NIMH NIH HHS
- U01 HG009088 NHGRI NIH HHS
- N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC-95166 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P42 ES016454 NIEHS NIH HHS
- UM1 DK078616 NIDDK NIH HHS
- U01-HL054509 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R35-HL135824 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- M01-RR07122 U.S. Department of Health & Human Services | NIH | National Center for Research Resources (NCRR)
- U01 DK105561 NIDDK NIH HHS
- U01-HL072524 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P20 GM121334 NIGMS NIH HHS
- N01-HC-95167, N01-HC-95168, N01-HC-95169 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL131565 NHLBI NIH HHS
- R01HL071251 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R13 CA124365 NCI NIH HHS
- R01-HL045522 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P01 HL132825 NHLBI NIH HHS
- R01 HL118267 NHLBI NIH HHS
- HHSN268201800013I NIMHD NIH HHS
- R01-HL67348 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U54 GM115428 NIGMS NIH HHS
- R01 HL055673 NHLBI NIH HHS
- HHSN268201600018C, HHSN268201600001C, HHSN268201600002C, HHSN268201600003C, and HHSN268201600004C U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- UM1-DK078616 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
- R01 HL149683 NHLBI NIH HHS
- R01 HL092301 NHLBI NIH HHS
- P30 DK020595 NIDDK NIH HHS
- R01 HL149836 NHLBI NIH HHS
- K08 HL145095 NHLBI NIH HHS
- K01 HL135405 NHLBI NIH HHS
- R03 OD030608 NIH HHS
- HHSN268201800014I NHLBI NIH HHS
- R01-HL113338 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- F32-HL085989 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- UM1 AI068636 NIAID NIH HHS
- R01 AG057381 NIA NIH HHS
- U19-CA203654 U.S. Department of Health & Human Services | NIH | National Cancer Institute (NCI)
Affiliation(s)
- Zilin Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Xihao Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Hufeng Zhou
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Sheila M Gaynor
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Margaret Sunitha Selvaraj
- Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
| | - Theodore Arapoglou
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Corbin Quick
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Yaowu Liu
- School of Statistics, Southwestern University of Finance and Economics, Chengdu, China
| | - Han Chen
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Ryan Sun
- Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Rounak Dey
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Donna K Arnett
- Dean's Office, University of Kentucky, College of Public Health, Lexington, KY, USA
| | - Paul L Auer
- Division of Biostatistics, Institute for Health & Equity and Cancer Center, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Lawrence F Bielak
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Joshua C Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Thomas W Blackwell
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
| | - John Blangero
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Eric Boerwinkle
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Donald W Bowden
- Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Jennifer A Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Brian E Cade
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA, USA
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
| | - Matthew P Conomos
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Adolfo Correa
- Jackson Heart Study, Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
| | - L Adrienne Cupples
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Framingham Heart Study, National Heart, Lung, and Blood Institute and Boston University, Framingham, MA, USA
| | - Joanne E Curran
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Paul S de Vries
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Ravindranath Duggirala
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Nora Franceschini
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
| | - Barry I Freedman
- Department of Internal Medicine, Nephrology, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Harald H H Göring
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Rita R Kalyani
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - Brian G Kral
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Leslie A Lange
- Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Bridget M Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
| | - Ani Manichaikul
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Alisa K Manning
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Metabolism Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Clinical and Translational Epidemiology Unit, Mongan Institute, Massachusetts General Hospital, Boston, MA, USA
| | - Lisa W Martin
- Division of Cardiology, George Washington School of Medicine and Health Sciences, Washington, DC, USA
| | - Rasika A Mathias
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - James B Meigs
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Division of General Internal Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Braxton D Mitchell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Geriatrics Research and Education Clinical Center, Baltimore VA Medical Center, Baltimore, MD, USA
| | - May E Montasser
- Division of Endocrinology, Diabetes, and Nutrition, Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Alanna C Morrison
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Take Naseri
- Ministry of Health, Government of Samoa, Apia, Samoa
| | - Jeffrey R O'Connell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Nicholette D Palmer
- Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Patricia A Peyser
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Departments of Health Systems and Population Health, University of Washington, Seattle, WA, USA
| | - Laura M Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Susan Redline
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA, USA
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
- Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Alexander P Reiner
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | | | - Kenneth M Rice
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Stephen S Rich
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Jennifer A Smith
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
| | - Kent D Taylor
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Margaret A Taub
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | - Ramachandran S Vasan
- Framingham Heart Study, National Heart, Lung, and Blood Institute and Boston University, Framingham, MA, USA
- Department of Medicine, Boston University School of Medicine, Boston, MA, USA
| | - Daniel E Weeks
- Department of Human Genetics and Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
| | - James G Wilson
- Division of Cardiology, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Lisa R Yanek
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Wei Zhao
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Jerome I Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Cristen J Willer
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Pradeep Natarajan
- Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
| | - Gina M Peloso
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Framingham Heart Study, National Heart, Lung, and Blood Institute and Boston University, Framingham, MA, USA
| | - Xihong Lin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Statistics, Harvard University, Cambridge, MA, USA
|
21
|
Chen X, Zhang H, Liu M, Deng HW, Wu Z. Simultaneous detection of novel genes and SNPs by adaptive p-value combination. Front Genet 2022; 13:1009428. [DOI: 10.3389/fgene.2022.1009428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Accepted: 11/03/2022] [Indexed: 11/18/2022] Open
Abstract
Combining SNP p-values from GWAS summary data is a promising strategy for detecting novel genetic factors. Existing statistical methods for the p-value-based SNP-set testing confront two challenges. First, the statistical power of different methods depends on unknown patterns of genetic effects that could drastically vary over different SNP sets. Second, they do not identify which SNPs primarily contribute to the global association of the whole set. We propose a new signal-adaptive analysis pipeline to address these challenges using the omnibus thresholding Fisher’s method (oTFisher). The oTFisher remains robustly powerful over various patterns of genetic effects. Its adaptive thresholding can be applied to estimate important SNPs contributing to the overall significance of the given SNP set. We develop efficient calculation algorithms to control the type I error rate, which accounts for the linkage disequilibrium among SNPs. Extensive simulations show that the oTFisher has robustly high power and provides a higher balanced accuracy in screening SNPs than the traditional Bonferroni and FDR procedures. We applied the oTFisher to study the genetic association of genes and haplotype blocks of the bone density-related traits using the summary data of the Genetic Factors for Osteoporosis Consortium. The oTFisher identified more novel and literature-reported genetic factors than existing p-value combination methods. Relevant computation has been implemented into the R package TFisher to support similar data analysis.
|
22
|
Kuksa PP, Greenfest-Allen E, Cifello J, Ionita M, Wang H, Nicaretta H, Cheng PL, Lee WP, Wang LS, Leung YY. Scalable approaches for functional analyses of whole-genome sequencing non-coding variants. Hum Mol Genet 2022; 31:R62-R72. [PMID: 35943817 PMCID: PMC9585666 DOI: 10.1093/hmg/ddac191] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/01/2022] [Revised: 08/04/2022] [Accepted: 08/08/2022] [Indexed: 11/23/2022] Open
Abstract
Non-coding genetic variants outside of protein-coding genome regions play an important role in genetic and epigenetic regulation. It has become increasingly important to understand their roles, as non-coding variants often make up the majority of top findings of genome-wide association studies (GWAS). In addition, the growing popularity of disease-specific whole-genome sequencing (WGS) efforts expands the library of and offers unique opportunities for investigating both common and rare non-coding variants, which are typically not detected in more limited GWAS approaches. However, the sheer size and breadth of WGS data introduce additional challenges to predicting functional impacts in terms of data analysis and interpretation. This review focuses on the recent approaches developed for efficient, at-scale annotation and prioritization of non-coding variants uncovered in WGS analyses. In particular, we review the latest scalable annotation tools, databases and functional genomic resources for interpreting the variant findings from WGS based on both experimental data and in silico predictive annotations. We also review machine learning-based predictive models for variant scoring and prioritization. We conclude with a discussion of future research directions which will enhance the data and tools necessary for the effective functional analyses of variants identified by WGS to improve our understanding of disease etiology.
Affiliation(s)
- Pavel P Kuksa
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Emily Greenfest-Allen
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Jeffrey Cifello
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Matei Ionita
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Hui Wang
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Heather Nicaretta
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Po-Liang Cheng
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Wan-Ping Lee
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Li-San Wang
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Yuk Yee Leung
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| |
|
23
|
Selvaraj MS, Li X, Li Z, Pampana A, Zhang DY, Park J, Aslibekyan S, Bis JC, Brody JA, Cade BE, Chuang LM, Chung RH, Curran JE, de las Fuentes L, de Vries PS, Duggirala R, Freedman BI, Graff M, Guo X, Heard-Costa N, Hidalgo B, Hwu CM, Irvin MR, Kelly TN, Kral BG, Lange L, Li X, Lisa M, Lubitz SA, Manichaikul AW, Michael P, Montasser ME, Morrison AC, Naseri T, O'Connell JR, Palmer ND, Peyser PA, Reupena MS, Smith JA, Sun X, Taylor KD, Tracy RP, Tsai MY, Wang Z, Wang Y, Bao W, Wilkins JT, Yanek LR, Zhao W, Arnett DK, Blangero J, Boerwinkle E, Bowden DW, Chen YDI, Correa A, Cupples LA, Dutcher SK, Ellinor PT, Fornage M, Gabriel S, Germer S, Gibbs R, He J, Kaplan RC, Kardia SLR, Kim R, Kooperberg C, Loos RJF, Viaud-Martinez KA, Mathias RA, McGarvey ST, Mitchell BD, Nickerson D, North KE, Psaty BM, Redline S, Reiner AP, Vasan RS, Rich SS, Willer C, Rotter JI, Rader DJ, Lin X, Peloso GM, Natarajan P. Whole genome sequence analysis of blood lipid levels in >66,000 individuals. Nat Commun 2022; 13:5995. [PMID: 36220816 PMCID: PMC9553944 DOI: 10.1038/s41467-022-33510-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 09/27/2021] [Accepted: 09/21/2022] [Indexed: 01/05/2023]
Abstract
Blood lipids are heritable modifiable causal factors for coronary artery disease. Despite well-described monogenic and polygenic bases of dyslipidemia, limitations remain in discovery of lipid-associated alleles using whole genome sequencing (WGS), partly due to limited sample sizes, ancestral diversity, and interpretation of clinical significance. Among 66,329 ancestrally diverse (56% non-European) participants, we associate 428M variants from deep-coverage WGS with lipid levels; ~400M variants were not assessed in prior lipids genetic analyses. We find multiple lipid-related genes strongly associated with blood lipids through analysis of common and rare coding variants. We discover several associated rare non-coding variants, largely at Mendelian lipid genes. Notably, we observe rare LDLR intronic variants associated with markedly increased LDL-C, similar to rare LDLR exonic variants. In conclusion, we conducted a systematic whole genome scan for blood lipids expanding the alleles linked to lipids for multiple ancestries and characterize a clinically-relevant rare non-coding variant model for lipids.
Affiliation(s)
- Margaret Sunitha Selvaraj
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA
| | - Xihao Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
| | - Zilin Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
| | - Akhil Pampana
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
| | - David Y Zhang
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Joseph Park
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Stella Aslibekyan
- Department of Epidemiology, University of Alabama at Birmingham School of Public Health, Birmingham, AL, USA
| | - Joshua C Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Jennifer A Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Brian E Cade
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Lee-Ming Chuang
- Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan
| | - Ren-Hua Chung
- Institute of Population Health Sciences, National Health Research Institutes, Zhunan, 350, Taiwan
| | - Joanne E Curran
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, 78520, USA
| | - Lisa de las Fuentes
- Department of Medicine, Cardiovascular Division, Washington University School of Medicine, St. Louis, MO, USA
- Division of Biostatistics, Washington University School of Medicine, St. Louis, MO, USA
| | - Paul S de Vries
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Ravindranath Duggirala
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, 78520, USA
| | - Barry I Freedman
- Department of Internal Medicine, Section on Nephrology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
| | - Mariaelisa Graff
- Department of Epidemiology, UNC Chapel Hill, Chapel Hill, NC, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Nancy Heard-Costa
- Department of Neurology, Boston University School of Medicine, Boston, MA, USA
| | - Bertha Hidalgo
- Department of Epidemiology, University of Alabama at Birmingham School of Public Health, Birmingham, AL, USA
| | - Chii-Min Hwu
- Section of Endocrinology and Metabolism, Department of Medicine, Taipei Veterans General Hospital, Taipei, Taiwan
| | - Marguerite R Irvin
- Department of Epidemiology, University of Alabama at Birmingham School of Public Health, Birmingham, AL, USA
| | - Tanika N Kelly
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, 70112, USA
- Tulane University Translational Science Institute, New Orleans, LA, 70112, USA
| | - Brian G Kral
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
| | - Leslie Lange
- Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Xiaohui Li
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Martin Lisa
- Department of Medicine, George Washington University, Washington, DC, USA
| | - Steven A Lubitz
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, 02124, USA
| | - Ani W Manichaikul
- Department of Public Health Sciences, Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Preuss Michael
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - May E Montasser
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Alanna C Morrison
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Take Naseri
- Ministry of Health, Government of Samoa, Samoa, USA
| | - Jeffrey R O'Connell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Nicholette D Palmer
- Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
| | - Patricia A Peyser
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, 48109, USA
| | | | - Jennifer A Smith
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Xiao Sun
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, 70112, USA
| | - Kent D Taylor
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Russell P Tracy
- Departments of Pathology & Laboratory Medicine and Biochemistry, Larner College of Medicine at the University of Vermont, Colchester, VT, USA
| | - Michael Y Tsai
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
| | - Zhe Wang
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Yuxuan Wang
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, 02118, USA
| | - Wei Bao
- Institute of Public Health, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230026, China
| | - John T Wilkins
- Department of Medicine (Cardiology) and Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - Lisa R Yanek
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
| | - Wei Zhao
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Donna K Arnett
- Dean's Office, University of Kentucky College of Public Health, Lexington, KY, USA
| | - John Blangero
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, 78520, USA
| | - Eric Boerwinkle
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Donald W Bowden
- Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
| | - Yii-Der Ida Chen
- Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Adolfo Correa
- Department of Population Health Science, University of Mississippi Medical Center, Jackson, MS, USA
| | - L Adrienne Cupples
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, 02118, USA
| | - Susan K Dutcher
- The McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, 63108, USA
| | - Patrick T Ellinor
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, 02124, USA
| | - Myriam Fornage
- Brown Foundation Institute of Molecular Medicine, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX, 7722, USA
| | | | - Soren Germer
- New York Genome Center, New York, NY, 10013, USA
| | - Richard Gibbs
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, 77030, USA
| | - Jiang He
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, 70112, USA
- Tulane University Translational Science Institute, New Orleans, LA, 70112, USA
| | - Robert C Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
| | - Sharon L R Kardia
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Ryan Kim
- Psomagen, Inc. (formerly Macrogen USA), Rockville, MD, USA
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
| | - Ruth J F Loos
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- NNF Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark
| | | | - Rasika A Mathias
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
| | - Stephen T McGarvey
- Department of Epidemiology, International Health Institute, Brown University, Providence, RI, USA
| | - Braxton D Mitchell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Geriatrics Research and Education Clinical Center, Baltimore Veterans Administration Medical Center, Baltimore, MD, USA
| | - Deborah Nickerson
- University of Washington, Department of Genome Sciences, Seattle, WA, 98195, USA
| | - Kari E North
- Department of Epidemiology, UNC Chapel Hill, Chapel Hill, NC, USA
| | - Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Department of Health Systems and Population Health, University of Washington, Seattle, WA, USA
| | - Susan Redline
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Alexander P Reiner
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Ramachandran S Vasan
- Sections of Preventive Medicine and Epidemiology, Cardiovascular Medicine, Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
- Framingham Heart Study, Framingham, MA, USA
| | - Stephen S Rich
- Department of Public Health Sciences, Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Cristen Willer
- University of Michigan, Internal Medicine, Ann Arbor, MI, 48109, USA
| | - Jerome I Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Daniel J Rader
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Xihong Lin
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Department of Statistics, Harvard University, Cambridge, MA, 02138, USA
| | - Gina M Peloso
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, 02118, USA
| | - Pradeep Natarajan
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA
| |
|
24
|
Bocher O, Ludwig TE, Oglobinsky MS, Marenne G, Deleuze JF, Suryakant S, Odeberg J, Morange PE, Trégouët DA, Perdry H, Génin E. Testing for association with rare variants in the coding and non-coding genome: RAVA-FIRST, a new approach based on CADD deleteriousness score. PLoS Genet 2022; 18:e1009923. [PMID: 36112662 PMCID: PMC9518893 DOI: 10.1371/journal.pgen.1009923] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/04/2021] [Revised: 09/28/2022] [Accepted: 08/15/2022] [Indexed: 11/18/2022]
Abstract
Rare variant association tests (RVAT) have been developed to study the contribution of rare variants widely accessible through high-throughput sequencing technologies. RVAT require aggregating rare variants into testing units and filtering variants to retain only the most likely causal ones. In the exome, genes are natural testing units and variants are usually filtered based on their functional consequences. However, when dealing with whole-genome sequence (WGS) data, both steps are challenging. No natural biological unit is available for aggregating rare variants. Sliding-window procedures have been proposed to circumvent this difficulty; however, they are blind to biological information and result in a large number of tests. We propose a new strategy to perform RVAT on WGS data: “RAVA-FIRST” (RAre Variant Association using Functionally-InfoRmed STeps), comprising three steps. (1) New testing units are defined genome-wide based on functionally-adjusted Combined Annotation Dependent Depletion (CADD) scores of variants observed in the gnomAD populations, which are referred to as “CADD regions”. (2) A region-dependent filtering of rare variants is applied in each CADD region. (3) A functionally-informed burden test is performed with sub-scores computed for each genomic category within each CADD region. On both simulations and real data, RAVA-FIRST was found to outperform other WGS-based RVAT. Applied to a WGS dataset of venous thromboembolism patients, we identified an intergenic region on chromosome 18 enriched for rare variants in early-onset patients. This region, which was missed by standard sliding-window procedures, is included in a TAD region that contains a strong candidate gene. RAVA-FIRST enables new investigations of rare non-coding variants in complex diseases, facilitated by its implementation in the R package Ravages.
Affiliation(s)
- Ozvan Bocher
- Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France
- Institute of Translational Genomics, Helmholtz Zentrum München, Munich, Germany
| | - Thomas E. Ludwig
- Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France
- CHU Brest, Brest, France
| | | | | | - Jean-François Deleuze
- Centre National de Recherche en Génomique Humaine CNRGH, Institut de Biologie François Jacob, Université Paris Saclay, CEA, Evry, France
| | - Suryakant Suryakant
- University of Bordeaux, Inserm, Bordeaux Population Health Research Center, team ELEANOR, UMR 1219, Bordeaux, France
| | - Jacob Odeberg
- Science for Life Laboratory, Department of Protein Science, CBH, KTH Royal Institute of Technology, Stockholm, Sweden
- Department of Clinical Medicine, Faculty of Health Science, The Arctic University of Tromsö, Tromsö, Norway
| | | | - David-Alexandre Trégouët
- University of Bordeaux, Inserm, Bordeaux Population Health Research Center, team ELEANOR, UMR 1219, Bordeaux, France
| | - Hervé Perdry
- CESP Inserm, U1018, UFR Médecine, Univ Paris-Sud, Université Paris-Saclay, Villejuif, France
| | - Emmanuelle Génin
- Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France
- CHU Brest, Brest, France
| |
|
25
|
Long M, Li Z, Zhang W, Li Q. The Cauchy Combination Test under Arbitrary Dependence Structures. AM STAT 2022. [DOI: 10.1080/00031305.2022.2116109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2022]
Affiliation(s)
- Mingya Long
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences
| | | | - Wei Zhang
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences
| | - Qizhai Li
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences
| |
|
26
|
Yang Y, Sun Q, Huang L, Broome JG, Correa A, Reiner A, Raffield LM, Yang Y, Li Y. eSCAN: scan regulatory regions for aggregate association testing using whole-genome sequencing data. Brief Bioinform 2022; 23:bbab497. [PMID: 34882196 PMCID: PMC8898002 DOI: 10.1093/bib/bbab497] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/15/2021] [Revised: 10/25/2021] [Accepted: 10/30/2021] [Indexed: 02/07/2023]
Abstract
Multiple statistical methods for aggregate association testing have been developed for whole-genome sequencing (WGS) data. Many aggregate variants in a given genomic window and ignore existing knowledge to define test regions, resulting in many identified regions not clearly linked to genes, and thus, limiting biological understanding. Functional information from new technologies (such as Hi-C and its derivatives), which can help link enhancers to their effector genes, can be leveraged to predefine variant sets for aggregate testing in WGS data. Here, we propose the eSCAN (scan the enhancers) method for genome-wide assessment of enhancer regions in sequencing studies, combining the advantages of dynamic window selection in SCANG (SCAN the Genome), a previously developed method, with the advantages of incorporating putative regulatory regions from annotation. eSCAN, by searching in putative enhancers, increases statistical power and aids mechanistic interpretation, as demonstrated by extensive simulation studies. We also apply eSCAN for blood cell traits using NHLBI Trans-Omics for Precision Medicine WGS data. Results from real data analysis show that eSCAN is able to capture more significant signals, and these signals are of shorter length (indicating higher resolution fine-mapping capability) and drive association of larger regions detected by other methods.
Affiliation(s)
- Yingxi Yang
- Department of Statistics and Data Science, Yale University, New Haven, CT, 06511, USA
| | - Quan Sun
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Le Huang
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Jai G Broome
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
- Department of Medicine, Division of Medical Genetics, University of Washington, Seattle, WA 98195, USA
| | - Adolfo Correa
- Department of Medicine and Population Health Science, University of Mississippi Medical Center, Jackson, MS, 39216, USA
| | - Alexander Reiner
- Department of Epidemiology, University of Washington, Seattle, WA, 98195, USA
- Fred Hutchinson Cancer Research Center, University of Washington, Seattle, WA, 98195, USA
| | | | - Laura M Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Yuchen Yang
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, 510275 Guangzhou, China
| | - Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| |
|
27
|
Zhang J, Lu H, Zhang S, Wang T, Zhao H, Guan F, Zeng P. Leveraging Methylation Alterations to Discover Potential Causal Genes Associated With the Survival Risk of Cervical Cancer in TCGA Through a Two-Stage Inference Approach. Front Genet 2021; 12:667877. [PMID: 34149809 PMCID: PMC8206792 DOI: 10.3389/fgene.2021.667877] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 02/15/2021] [Accepted: 04/19/2021] [Indexed: 12/24/2022]
Abstract
BACKGROUND: Multiple genes were previously identified to be associated with cervical cancer; however, the genetic architecture of cervical cancer remains unknown and many potential causal genes are yet to be discovered. METHODS: To explore potential causal genes related to cervical cancer, a two-stage causal inference approach was proposed within the framework of Mendelian randomization, where the gene expression was treated as exposure, with methylations located within the promoter regions of genes serving as instrumental variables. Five prediction models were first utilized to characterize the relationship between the expression and methylations for each gene; then, the methylation-regulated gene expression (MReX) was obtained and the association was evaluated via Cox mixed-effect model based on MReX. We further implemented the aggregated Cauchy association test (ACAT) combination to take advantage of respective strengths of these prediction models while accounting for dependency among the p-values. RESULTS: A total of 14 potential causal genes were discovered to be associated with the survival risk of cervical cancer in TCGA when the five prediction models were separately employed. The total number of potential causal genes was brought to 23 when conducting ACAT. Some of the newly discovered genes may be novel (e.g., YJEFN3, SPATA5L1, IMMP1L, C5orf55, PPIP5K2, ZNF330, CRYZL1, PPM1A, ESCO2, ZNF605, ZNF225, ZNF266, FICD, and OSTC). Functional analyses showed that these genes were enriched in tumor-associated pathways. Additionally, four genes (i.e., COL6A1, SYDE1, ESCO2, and GIPC1) were differentially expressed between tumor and normal tissues. CONCLUSION: Our study discovered promising candidate genes that were causally associated with the survival risk of cervical cancer and thus provided new insights into the genetic etiology of cervical cancer.
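Illustrative note: the ACAT combination step mentioned in this abstract can be sketched as below. This is a minimal sketch of the standard Cauchy combination rule, not the authors' code; the function name `cauchy_combine`, the equal default weights, and the example p-values are assumptions made for illustration. Each p-value is mapped to a Cauchy variate, the variates are averaged with weights, and the result is mapped back to a p-value.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination (ACAT-style) rule.

    Hypothetical helper for illustration; weights default to equal.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(p_values)
    total = sum(weights)
    # Normalise weights so they sum to one.
    weights = [w / total for w in weights]
    # tan((0.5 - p) * pi) is standard Cauchy when p is uniform on (0, 1);
    # a weighted average of such terms is treated as standard Cauchy under the null.
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values))
    # Upper-tail probability of a standard Cauchy at t.
    return 0.5 - math.atan(t) / math.pi


# Example: p-values for one gene from five prediction models (made-up numbers).
combined = cauchy_combine([0.04, 0.20, 0.01, 0.50, 0.08])
print(combined)
```

With equal weights the combined p-value is driven mainly by the smallest inputs, which is why the rule is often used to pool tests whose relative power is not known in advance.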
Affiliation(s)
- Jinhui Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
| | - Haojie Lu
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
| | - Shuo Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
| | - Ting Wang
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China
| | - Huashuo Zhao
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China
| | - Fengjun Guan
- Department of Pediatrics, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
| | - Ping Zeng
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China
| |
|
28
|
Tang H, He Z. Advances and challenges in quantitative delineation of the genetic architecture of complex traits. QUANTITATIVE BIOLOGY 2021; 9:168-184. [PMID: 35492964 PMCID: PMC9053444 DOI: 10.15302/j-qb-021-0249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Background Genome-wide association studies (GWAS) have been widely adopted in studies of human complex traits and diseases. Results This review surveys areas of active research: quantifying and partitioning trait heritability, fine mapping functional variants and integrative analysis, genetic risk prediction of phenotypes, and the analysis of sequencing studies that have identified millions of rare variants. Current challenges and opportunities are highlighted. Conclusion GWAS have fundamentally transformed the field of human complex trait genetics. Novel statistical and computational methods have expanded the scope of GWAS and have provided valuable insights on the genetic architecture underlying complex phenotypes.
Affiliation(s)
- Hua Tang
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Zihuai He
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA 94305, USA
- Quantitative Sciences Unit, Department of Medicine, Stanford University, Stanford, CA 94305, USA
| |
|
29
|
Guo H, Li JJ, Lu Q, Hou L. Detecting local genetic correlations with scan statistics. Nat Commun 2021; 12:2033. [PMID: 33795679 PMCID: PMC8016883 DOI: 10.1038/s41467-021-22334-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/06/2019] [Accepted: 03/08/2021] [Indexed: 02/06/2023]
Abstract
Genetic correlation analysis has quickly gained popularity in the past few years and provided insights into the genetic etiology of numerous complex diseases. However, existing approaches oversimplify the shared genetic architecture between different phenotypes and cannot effectively identify precise genetic regions contributing to the genetic correlation. In this work, we introduce LOGODetect, a powerful and efficient statistical method to identify small genome segments harboring local genetic correlation signals. LOGODetect automatically identifies genetic regions showing consistent associations with multiple phenotypes through a scan statistic approach. It uses summary association statistics from genome-wide association studies (GWAS) as input and is robust to sample overlap between studies. Applying LOGODetect to seven phenotypically distinct but genetically correlated neuropsychiatric traits, we identify 227 non-overlapping genome regions associated with multiple traits, including multiple hub regions showing concordant effects on five or more traits. Our method addresses critical limitations in existing analytic strategies and may have wide applications in post-GWAS analysis.
Affiliation(s)
- Hanmin Guo
- Center for Statistical Science, Tsinghua University, Beijing, China
- Department of Industrial Engineering, Tsinghua University, Beijing, China
| | - James J Li
- Department of Psychology, University of Wisconsin-Madison, Madison, WI, USA
- Waisman Center, University of Wisconsin-Madison, Madison, WI, USA
| | - Qiongshi Lu
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA.
| | - Lin Hou
- Center for Statistical Science, Tsinghua University, Beijing, China.
- Department of Industrial Engineering, Tsinghua University, Beijing, China.
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China.
| |
|
30
|
Hecker J, Townes FW, Kachroo P, Laurie C, Lasky-Su J, Ziniti J, Cho MH, Weiss ST, Laird NM, Lange C. A unifying framework for rare variant association testing in family-based designs, including higher criticism approaches, SKATs, and burden tests. Bioinformatics 2021; 36:5432-5438. [PMID: 33367522 PMCID: PMC8016468 DOI: 10.1093/bioinformatics/btaa1055] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/01/2020] [Revised: 11/20/2020] [Accepted: 12/10/2020] [Indexed: 12/13/2022]
Abstract
MOTIVATION Analysis of rare variants in family-based studies remains a challenge. Transmission-based approaches provide robustness against population stratification, but the evaluation of the significance of test statistics based on asymptotic theory can be imprecise. Also, power will depend heavily on the choice of the test statistic and on the underlying genetic architecture of the locus, which will be generally unknown. RESULTS In our proposed framework, we utilize the FBAT haplotype algorithm to obtain the conditional offspring genotype distribution under the null hypothesis given the sufficient statistic. Based on this conditional offspring genotype distribution, the significance of virtually any association test statistic can be evaluated based on simulations or exact computations, without the need for asymptotic approximations. Besides standard linear burden-type statistics, this enables our approach to also evaluate other test statistics such as variance components statistics, higher criticism approaches, and maximum-single-variant-statistics, where asymptotic theory might be involved or does not provide accurate approximations for rare variant data. Based on these P-values, combined test statistics such as the aggregated Cauchy association test (ACAT) can also be utilized. In simulation studies, we show that our framework outperforms existing approaches for family-based studies in several scenarios. We also applied our methodology to a TOPMed whole-genome sequencing dataset with 897 asthmatic trios from Costa Rica. AVAILABILITY AND IMPLEMENTATION FBAT software is available at https://sites.google.com/view/fbatwebpage. Simulation code is available at https://github.com/julianhecker/FBAT_rare_variant_test_simulations. Whole-genome sequencing data for 'NHLBI TOPMed: The Genetic Epidemiology of Asthma in Costa Rica' is available at https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000988.v4.p1. 
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Julian Hecker
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - F William Townes
- Department of Computer Science, Princeton University, Princeton, NJ 08540-5233, USA
| | - Priyadarshini Kachroo
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - Cecelia Laurie
- Department of Biostatistics, University of Washington, Seattle, WA 98195-1617, USA
| | - Jessica Lasky-Su
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - John Ziniti
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - Michael H Cho
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - Scott T Weiss
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - Nan M Laird
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Christoph Lange
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| |
|
31
|
Gorla A, Jew B, Zhang L, Sul JH. xGAP: A python based efficient, modular, extensible and fault tolerant genomic analysis pipeline for variant discovery. Bioinformatics 2021; 37:9-16. [PMID: 33416856 PMCID: PMC8034531 DOI: 10.1093/bioinformatics/btaa1097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2020] [Revised: 12/22/2020] [Accepted: 01/04/2021] [Indexed: 11/14/2022]
Abstract
MOTIVATION Since the first human genome was sequenced in 2001, there has been rapid growth in the number of bioinformatic methods to process and analyze next generation sequencing (NGS) data for research and clinical studies that aim to identify genetic variants influencing diseases and traits. To achieve this goal, one first needs to call genetic variants from NGS data, which requires multiple computationally intensive analysis steps. Unfortunately, no open source pipeline performs all these steps on NGS data in a manner that is fully automated, efficient, rapid, scalable, modular, user-friendly and fault tolerant. To address this, we introduce xGAP, an extensible Genome Analysis Pipeline, which implements a modified GATK best practice to analyze DNA-seq data with the aforementioned functionalities. RESULTS xGAP implements massive parallelization of the modified GATK best practice pipeline by splitting a genome into many smaller regions with efficient load-balancing to achieve high scalability. It can process 30x coverage whole-genome sequencing (WGS) data in approximately 90 minutes. In terms of accuracy of discovered variants, xGAP achieves average F1 scores of 99.37% for SNVs and 99.20% for Indels across seven benchmark WGS datasets. We achieve highly consistent results across multiple on-premises (SGE & SLURM) high performance clusters. Compared to the Churchill pipeline, with similar parallelization, xGAP is 20% faster when analyzing 50x coverage WGS in AWS. Finally, xGAP is user-friendly and fault tolerant: it can automatically re-initiate failed processes to minimize required user intervention. AVAILABILITY xGAP is available at https://github.com/Adigorla/xgap. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
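The idea of splitting a genome into smaller regions and balancing them across workers can be sketched generically as follows. This is not xGAP's actual code: the function names, the fixed chunk size, the toy chromosome lengths, and the greedy longest-first assignment are all assumptions chosen only to illustrate the concept.

```python
import heapq

def split_regions(chrom_lengths, chunk_size):
    """Cut each chromosome into 0-based, half-open (chrom, start, end) regions."""
    regions = []
    for chrom, length in chrom_lengths.items():
        for start in range(0, length, chunk_size):
            regions.append((chrom, start, min(start + chunk_size, length)))
    return regions

def balance(regions, n_workers):
    """Greedy longest-first assignment of regions to the least-loaded worker."""
    heap = [(0, worker) for worker in range(n_workers)]  # (assigned bases, worker id)
    heapq.heapify(heap)
    bins = [[] for _ in range(n_workers)]
    for region in sorted(regions, key=lambda r: r[2] - r[1], reverse=True):
        load, worker = heapq.heappop(heap)
        bins[worker].append(region)
        heapq.heappush(heap, (load + region[2] - region[1], worker))
    return bins

# Toy chromosome lengths (hypothetical, far smaller than real chromosomes).
toy_lengths = {"chrA": 250, "chrB": 100}
toy_regions = split_regions(toy_lengths, chunk_size=100)
toy_bins = balance(toy_regions, n_workers=2)
toy_loads = [sum(end - start for _, start, end in b) for b in toy_bins]
```

Region length is used here as a stand-in for compute cost; a real pipeline would more plausibly weight regions by read depth or expected runtime.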
Affiliation(s)
- Aditya Gorla
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA 90095, U.S.A.
| | - Brandon Jew
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, U.S.A.
| | - Luke Zhang
- Undergraduate Neuroscience Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, U.S.A.
| | - Jae Hoon Sul
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA 90095, U.S.A.
| |
|
32
|
Xiang Y, Xiang X, Li Y. Identifying rare variants for quantitative traits in extreme samples of population via Kullback-Leibler distance. BMC Genet 2020; 21:130. [PMID: 33234108 PMCID: PMC7687851 DOI: 10.1186/s12863-020-00951-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/10/2020] [Accepted: 11/10/2020] [Indexed: 11/23/2022]
Abstract
Background The rapid development of sequencing technology, together with the availability of large quantities of sequence data, has facilitated the identification of rare variants associated with quantitative traits. However, existing statistical methods depend on certain assumptions and thus lack uniform power. The present study focuses on mapping rare variants associated with quantitative traits. Results In the present study, we proposed a two-stage strategy to identify rare variants for quantitative traits using a phenotype extreme selection design and the Kullback-Leibler distance, where the first stage was association analysis and the second stage was fine mapping. We presented a statistic and a linkage disequilibrium measure for the first stage and the second stage, respectively. Theoretical analysis and a simulation study showed that (1) the power of the proposed statistic for association analysis increased with the stringency of the sample selection and was only slightly affected by non-causal variants and opposite-effect variants, (2) the statistic achieved higher power than three commonly used methods, and (3) the linkage disequilibrium measure for fine mapping was independent of the frequencies of non-causal variants and depended only on the frequencies of causal variants. Conclusions We conclude that the two-stage strategy can be used effectively to map rare variants associated with quantitative traits.
Affiliation(s)
- Yang Xiang
- School of Mathematics and Computational Science, Huaihua University, Huaihua, Hunan, 418008, People's Republic of China
- Key Laboratory of Research and Utilization of Ethnomedicinal Plant Resources of Hunan Province, Huaihua University, Huaihua, 418008, China
- Key Laboratory of Hunan Higher Education for Western Hunan Medicinal Plant and Ethnobotany, Huaihua University, Huaihua, 418008, China
| | - Xinrong Xiang
- School of Mathematics and Statistics, Hunan Normal University, Changsha, Hunan, 410081, People's Republic of China
| | - Yumei Li
- School of Mathematics and Computational Science, Huaihua University, Huaihua, Hunan, 418008, People's Republic of China.
- Key Laboratory of Research and Utilization of Ethnomedicinal Plant Resources of Hunan Province, Huaihua University, Huaihua, 418008, China.
- Key Laboratory of Hunan Higher Education for Western Hunan Medicinal Plant and Ethnobotany, Huaihua University, Huaihua, 418008, China.
| |
|
33
|
Li X, Li Z, Zhou H, Gaynor SM, Liu Y, Chen H, Sun R, Dey R, Arnett DK, Aslibekyan S, Ballantyne CM, Bielak LF, Blangero J, Boerwinkle E, Bowden DW, Broome JG, Conomos MP, Correa A, Cupples LA, Curran JE, Freedman BI, Guo X, Hindy G, Irvin MR, Kardia SLR, Kathiresan S, Khan AT, Kooperberg CL, Laurie CC, Liu XS, Mahaney MC, Manichaikul AW, Martin LW, Mathias RA, McGarvey ST, Mitchell BD, Montasser ME, Moore JE, Morrison AC, O'Connell JR, Palmer ND, Pampana A, Peralta JM, Peyser PA, Psaty BM, Redline S, Rice KM, Rich SS, Smith JA, Tiwari HK, Tsai MY, Vasan RS, Wang FF, Weeks DE, Weng Z, Wilson JG, Yanek LR, Neale BM, Sunyaev SR, Abecasis GR, Rotter JI, Willer CJ, Peloso GM, Natarajan P, Lin X. Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nat Genet 2020; 52:969-983. [PMID: 32839606 PMCID: PMC7483769 DOI: 10.1038/s41588-020-0676-4] [Citation(s) in RCA: 150] [Impact Index Per Article: 30.0] [Received: 07/17/2019] [Accepted: 07/02/2020] [Indexed: 12/13/2022]
Abstract
Large-scale whole-genome sequencing studies have enabled the analysis of rare variants (RVs) associated with complex phenotypes. Commonly used RV association tests have limited scope to leverage variant functions. We propose STAAR (variant-set test for association using annotation information), a scalable and powerful RV association test method that effectively incorporates both variant categories and multiple complementary annotations using a dynamic weighting scheme. For the latter, we introduce 'annotation principal components', multidimensional summaries of in silico variant annotations. STAAR accounts for population structure and relatedness and is scalable for analyzing very large cohort and biobank whole-genome sequencing studies of continuous and dichotomous traits. We applied STAAR to identify RVs associated with four lipid traits in 12,316 discovery and 17,822 replication samples from the Trans-Omics for Precision Medicine Program. We discovered and replicated new RV associations, including disruptive missense RVs of NPC1L1 and an intergenic region near APOC1P1 associated with low-density lipoprotein cholesterol.
Affiliation(s)
- Xihao Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Zilin Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Hufeng Zhou
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Sheila M Gaynor
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Yaowu Liu
- School of Statistics, Southwestern University of Finance and Economics, Chengdu, China
| | - Han Chen
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Center for Precision Health, School of Public Health and School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Ryan Sun
- Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Rounak Dey
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Donna K Arnett
- College of Public Health, University of Kentucky, Lexington, KY, USA
| | - Stella Aslibekyan
- Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, USA
| | | | - Lawrence F Bielak
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - John Blangero
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Eric Boerwinkle
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Donald W Bowden
- Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Jai G Broome
- Division of Medical Genetics, University of Washington, Seattle, WA, USA
| | - Matthew P Conomos
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Adolfo Correa
- Jackson Heart Study, Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
| | - L Adrienne Cupples
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Framingham Heart Study, National Heart, Lung, and Blood Institute and Boston University, Framingham, MA, USA
| | - Joanne E Curran
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Barry I Freedman
- Department of Internal Medicine, Nephrology, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - George Hindy
- Department of Population Medicine, Qatar University College of Medicine, QU Health, Doha, Qatar
| | - Marguerite R Irvin
- Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Sharon L R Kardia
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Sekar Kathiresan
- Verve Therapeutics, Cambridge, MA, USA
- Cardiology Division, Massachusetts General Hospital, Boston, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
| | - Alyna T Khan
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Charles L Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Cathy C Laurie
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - X Shirley Liu
- Department of Data Sciences, Dana-Farber Cancer Institute and Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Statistics, Harvard University, Cambridge, MA, USA
| | - Michael C Mahaney
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Ani W Manichaikul
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Lisa W Martin
- Division of Cardiology, George Washington School of Medicine and Health Sciences, Washington, DC, USA
| | - Rasika A Mathias
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Stephen T McGarvey
- Department of Epidemiology, International Health Institute, Department of Anthropology, Brown University, Providence, RI, USA
| | - Braxton D Mitchell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Geriatrics Research and Education Clinical Center, Baltimore VA Medical Center, Baltimore, MD, USA
| | - May E Montasser
- Division of Endocrinology, Diabetes, and Nutrition, Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - Alanna C Morrison
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Jeffrey R O'Connell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Nicholette D Palmer
- Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Akhil Pampana
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
| | - Juan M Peralta
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, School of Medicine, The University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Patricia A Peyser
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Bruce M Psaty
- Cardiovascular Health Research Unit, Departments of Medicine, Epidemiology, and Health Services, University of Washington, Seattle, WA, USA
- Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
| | - Susan Redline
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA, USA
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
- Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Kenneth M Rice
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Stephen S Rich
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Jennifer A Smith
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
| | - Hemant K Tiwari
- Department of Biostatistics, School of Public Health, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Michael Y Tsai
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
| | - Ramachandran S Vasan
- Framingham Heart Study, National Heart, Lung, and Blood Institute and Boston University, Framingham, MA, USA
- Department of Medicine, Boston University School of Medicine, Boston, MA, USA
| | - Fei Fei Wang
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Daniel E Weeks
- Department of Human Genetics and Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - James G Wilson
- Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, MS, USA
- Division of Cardiology, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Lisa R Yanek
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Benjamin M Neale
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | - Shamil R Sunyaev
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Division of Genetics, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Gonçalo R Abecasis
- Regeneron Pharmaceuticals, Tarrytown, NY, USA
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
| | - Jerome I Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Cristen J Willer
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Gina M Peloso
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Pradeep Natarajan
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
| | - Xihong Lin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Department of Statistics, Harvard University, Cambridge, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
| |
|
34
|
Tang ZZ, Sliwoski GR, Chen G, Jin B, Bush WS, Li B, Capra JA. PSCAN: Spatial scan tests guided by protein structures improve complex disease gene discovery and signal variant detection. Genome Biol 2020; 21:217. [PMID: 32847609 PMCID: PMC7448521 DOI: 10.1186/s13059-020-02121-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/09/2019] [Accepted: 07/27/2020] [Indexed: 12/25/2022]
Abstract
Germline disease-causing variants are generally more spatially clustered in protein 3-dimensional structures than benign variants. Motivated by this tendency, we develop a fast and powerful protein-structure-based scan (PSCAN) approach for evaluating gene-level associations with complex disease and detecting signal variants. We validate PSCAN's performance on synthetic data and two real data sets for lipid traits and Alzheimer's disease. Our results demonstrate that PSCAN performs competitively with existing gene-level tests while increasing power and identifying more specific signal variant sets. Furthermore, PSCAN enables generation of hypotheses about the molecular basis for the associations in the context of protein structures and functional domains.
Affiliation(s)
- Zheng-Zheng Tang
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, 53715 WI USA
- Wisconsin Institute for Discovery, Madison, 53715 WI USA
| | - Gregory R. Sliwoski
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, 37232 TN USA
| | - Guanhua Chen
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, 53715 WI USA
| | - Bowen Jin
- Department for Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106 OH USA
| | - William S. Bush
- Department for Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106 OH USA
- Institute for Computational Biology, Case Western Reserve University, Cleveland, 44106 OH USA
| | - Bingshan Li
- Department of Molecular Physiology & Biophysics, Vanderbilt University Medical Center, Nashville, 37232 TN USA
| | - John A. Capra
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, 37232 TN USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, 37232 TN USA
- Departments of Biological Sciences and Computer Science, Vanderbilt University, Nashville, 37232 TN USA
- Center for Structural Biology, Vanderbilt University, Nashville, 37232 TN USA
| |
|
35
|
Bu D, Yang Q, Meng Z, Zhang S, Li Q. Truncated tests for combining evidence of summary statistics. Genet Epidemiol 2020; 44:687-701. [DOI: 10.1002/gepi.22330] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/10/2020] [Revised: 04/24/2020] [Accepted: 06/01/2020] [Indexed: 12/15/2022]
Affiliation(s)
- Deliang Bu
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, China
| | - Qinglong Yang
- School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, China
| | - Zhen Meng
- LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
| | - Sanguo Zhang
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, China
| | - Qizhai Li
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
- LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
| |
|
36
|
Bocher O, Génin E. Rare variant association testing in the non-coding genome. Hum Genet 2020; 139:1345-1362. [PMID: 32500240 DOI: 10.1007/s00439-020-02190-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 04/22/2020] [Accepted: 05/29/2020] [Indexed: 12/25/2022]
Abstract
The development of next-generation sequencing technologies has opened up new possibilities to explore the contribution of genetic variants to human diseases, in particular that of rare variants. Statistical methods have been developed to test for association with rare variants; these require the definition of testing units and, within these testing units, the selection of qualifying variants to include in the test. In the coding regions of the genome, testing units are usually the different genes and qualifying variants are selected based on their functional effects on the encoded proteins. Extending these tests to the non-coding regions of the genome is challenging. Testing units are difficult to define because the organisation of the non-coding genome is still largely unknown. Qualifying variants are difficult to select because the functional impact of non-coding variants on gene expression is hard to predict. These difficulties could explain why very few investigators so far have analysed the non-coding parts of their whole genome sequencing data. Yet these non-coding parts represent the vast majority of the genome, and some studies suggest that they could play a major role in disease susceptibility. In this review, we discuss recent experimental and statistical developments to gain knowledge on the non-coding genome and how this knowledge could be used to include rare non-coding variants in association tests. We describe the few studies that have considered variants from the non-coding genome in association tests and how they managed to define testing units and select qualifying variants.
Affiliation(s)
- Ozvan Bocher
  - Génétique, Génomique Fonctionnelle Et Biotechnologies, Faculté de Médecine, Univ Brest, Inserm, Inserm UMR1078, Bâtiment E-IBRBS 2ieme étage, 22 avenue Camille Desmoulins, 29238, Brest Cedex 3, France.
- Emmanuelle Génin
  - Génétique, Génomique Fonctionnelle Et Biotechnologies, Faculté de Médecine, Univ Brest, Inserm, Inserm UMR1078, Bâtiment E-IBRBS 2ieme étage, 22 avenue Camille Desmoulins, 29238, Brest Cedex 3, France.
  - CHU Brest, Brest, France.
37
Zhang J, Xie S, Gonzales S, Liu J, Wang X. A fast and powerful eQTL weighted method to detect genes associated with complex trait using GWAS summary data. Genet Epidemiol 2020; 44:550-563. [PMID: 32350919 DOI: 10.1002/gepi.22297] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/14/2019] [Revised: 04/13/2020] [Accepted: 04/14/2020] [Indexed: 02/06/2023]
Abstract
Although genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits, a large fraction of heritability still remains unexplained. Integrative analysis that incorporates additional information, such as expression quantitative trait locus (eQTL) data, into sequencing studies (denoted as transcriptome-wide association study [TWAS]) can aid the discovery of trait-associated genetic variants. However, general TWAS methods only incorporate one eQTL-derived weight (e.g., cis-effect), and thus can suffer a substantial loss of power when the single estimated cis-effect is not predictive of the effect size of a genetic variant, when there are estimation errors in the estimated cis-effect, or when the data are not consistent with the model assumption. In this study, we propose an omnibus test (OT) which utilizes a Cauchy association test to integrate association evidence from three different traditional tests (burden test, quadratic test, and adaptive test) using GWAS summary data with multiple eQTL-derived weights. The p value of the proposed test can be calculated analytically, and thus it is fast and efficient. We applied our proposed test to two schizophrenia (SCZ) GWAS summary data sets and two lipid trait (HDL) GWAS summary data sets. Compared with the three traditional tests, our proposed OT can identify more trait-associated genes.
Affiliation(s)
- Jianjun Zhang
  - Department of Mathematics, University of North Texas, Denton, Texas
- Sicong Xie
  - Beijing National Day School, Beijing, China
- Samantha Gonzales
  - Department of Computer Science and Engineering, University of North Texas, Denton, Texas
- Jianguo Liu
  - Department of Mathematics, University of North Texas, Denton, Texas
- Xuexia Wang
  - Department of Mathematics, University of North Texas, Denton, Texas