1
Chen W, Coombes BJ, Larson NB. Recent advances and challenges of rare variant association analysis in the biobank sequencing era. Front Genet 2022; 13:1014947. [PMID: 36276986; PMCID: PMC9582646; DOI: 10.3389/fgene.2022.1014947] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/09/2022] [Accepted: 09/22/2022] [Indexed: 12/04/2022]
Abstract
Causal variants for rare genetic diseases are often rare in the general population. Rare variants may also contribute to common complex traits and can have much larger per-allele effect sizes than common variants, although power to detect these associations can be limited. Sequencing costs have steadily declined with technological advancements, making it feasible to adopt whole-exome and whole-genome profiling for large biobank-scale sample sizes. These large amounts of sequencing data provide both opportunities and challenges for rare-variant association analysis. Herein, we review the basic concepts of rare-variant analysis methods, the current state-of-the-art methods that use variant annotations or external controls to improve statistical power, and particular challenges facing rare-variant analysis, such as accounting for population structure and extremely unbalanced case-control designs. We also review recent advances and challenges in rare-variant analysis for familial sequencing data and for more complex phenotypes such as survival data. Finally, we discuss other potential directions for further methodology investigation.
Affiliation(s)
- Wenan Chen
- Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, TN, United States
- *Correspondence: Wenan Chen; Brandon J. Coombes; Nicholas B. Larson
- Brandon J. Coombes
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, United States
- Nicholas B. Larson
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, United States
2
Ray D, Vergara C, Taub MA, Wojcik G, Ladd‐Acosta C, Beaty TH, Duggal P. Benchmarking statistical methods for analyzing parent-child dyads in genetic association studies. Genet Epidemiol 2022; 46:266-284. [PMID: 35451532; PMCID: PMC9356976; DOI: 10.1002/gepi.22453] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/16/2021] [Revised: 02/06/2022] [Accepted: 03/15/2022] [Indexed: 11/24/2022]
Abstract
Genetic association studies of child health outcomes often employ family-based study designs. One of the most popular family-based designs is the case-parent trio design that considers the smallest possible nuclear family consisting of two parents and their affected child. This trio design is particularly advantageous for studying relatively rare disorders because it is less prone to type I error inflation due to population stratification compared to population-based study designs (e.g., case-control studies). However, obtaining genetic data from both parents is difficult from a practical perspective, and many large studies predominantly measure genetic variants in mother-child dyads. While some statistical methods for analyzing parent-child dyad data (most commonly involving mother-child pairs) exist, it is not clear if they provide the same advantage as trio methods in protecting against population stratification, or if a specific dyad design (e.g., case-mother dyads vs. case-mother/control-mother dyads) is more advantageous. In this article, we review existing statistical methods for analyzing genome-wide marker data on dyads and perform extensive simulation experiments to benchmark their type I errors and statistical power under different scenarios. We extend our evaluation to existing methods for analyzing a combination of case-parent trios and dyads together. We apply these methods to genotyped and imputed data from multiethnic mother-child pairs only, case-parent trios only, or combinations of both dyads and trios from the Gene, Environment Association Studies consortium (GENEVA), where each family was ascertained through a child affected by nonsyndromic cleft lip with or without cleft palate. Results from the GENEVA study corroborate the findings from our simulation experiments. Finally, we provide recommendations for using statistical genetic association methods for dyads.
Affiliation(s)
- Debashree Ray
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA
- Candelaria Vergara
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA
- Margaret A. Taub
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA
- Genevieve Wojcik
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA
- Christine Ladd‐Acosta
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA
- Terri H. Beaty
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA
- Priya Duggal
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA
3
Prokopenko D, Lee S, Hecker J, Mullin K, Morgan S, Katsumata Y, Weiner MW, Fardo DW, Laird N, Bertram L, Hide W, Lange C, Tanzi RE. Region-based analysis of rare genomic variants in whole-genome sequencing datasets reveal two novel Alzheimer's disease-associated genes: DTNB and DLG2. Mol Psychiatry 2022; 27:1963-1969. [PMID: 35246634; PMCID: PMC9126808; DOI: 10.1038/s41380-022-01475-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/12/2021] [Revised: 01/25/2022] [Accepted: 02/04/2022] [Indexed: 01/01/2023]
Abstract
Alzheimer's disease (AD) is a genetically complex disease for which nearly 40 loci have now been identified via genome-wide association studies (GWAS). We attempted to identify groups of rare variants (alternate allele frequency <0.01) associated with AD in a region-based, whole-genome sequencing (WGS) association study (rvGWAS) of two independent AD family datasets (NIMH/NIA; 2247 individuals; 605 families). Employing a sliding window approach across the genome, we identified several regions that achieved association p values <10⁻⁶, using the burden test or the SKAT statistic. The genomic region around the dystrobrevin beta (DTNB) gene was identified with both the burden and SKAT tests and replicated in case/control samples from the ADSP study, reaching genome-wide significance after meta-analysis (p_meta = 4.74 × 10⁻⁸). SKAT analysis also revealed a region-based association around the Discs large homolog 2 (DLG2) gene, which replicated in case/control samples from the ADSP study (p_meta = 1 × 10⁻⁶). In conclusion, in a region-based rvGWAS of AD we identified two novel AD genes, DLG2 and DTNB, based on association with rare variants.
Affiliation(s)
- Dmitry Prokopenko
- Genetics and Aging Research Unit and The Henry and Allison McCance Center for Brain Health, Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Sanghun Lee
- Department of Medical Consilience, Graduate School, Dankook University, Yongin, South Korea
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Julian Hecker
- Harvard Medical School, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Kristina Mullin
- Genetics and Aging Research Unit and The Henry and Allison McCance Center for Brain Health, Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
- Sarah Morgan
- Harvard Medical School, Boston, MA, USA
- Department of Pathology, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA, USA
- Yuriko Katsumata
- Department of Biostatistics, University of Kentucky, Lexington, KY, USA
- Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY, USA
- Michael W. Weiner
- Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA, USA
- David W. Fardo
- Department of Biostatistics, University of Kentucky, Lexington, KY, USA
- Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY, USA
- Nan Laird
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Lars Bertram
- Lübeck Interdisciplinary Platform for Genome Analytics, University of Lübeck, Lübeck, Germany
- Department of Psychology, University of Oslo, Oslo, Norway
- Winston Hide
- Harvard Medical School, Boston, MA, USA
- Department of Pathology, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA, USA
- Christoph Lange
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Rudolph E. Tanzi
- Genetics and Aging Research Unit and The Henry and Allison McCance Center for Brain Health, Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
4
Martin LJ, Benson DW. Focused Strategies for Defining the Genetic Architecture of Congenital Heart Defects. Genes (Basel) 2021; 12:827. [PMID: 34071175; PMCID: PMC8228798; DOI: 10.3390/genes12060827] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/21/2021] [Revised: 05/24/2021] [Accepted: 05/26/2021] [Indexed: 12/14/2022]
Abstract
Congenital heart defects (CHD) are malformations present at birth that occur during heart development. Increasing evidence supports a genetic origin of CHD, but in the process important challenges have been identified. This review begins with information about CHD and the importance of detailed phenotyping of study subjects. To facilitate appropriate genetic study design, we review DNA structure, genetic variation in the human genome and tools to identify the genetic variation of interest. Analytic approaches powered for both common and rare variants are assessed. While the ideal outcome of genetic studies is to identify variants that have a causal role, a more realistic goal for genetic analytics is to identify variants in specific genes that influence the occurrence of a phenotype and which provide keys to open biologic doors that inform how the genetic variants modulate heart development. It has never been truer that good genetic studies start with good planning. Continued progress in unraveling the genetic underpinnings of CHD will require multidisciplinary collaboration between geneticists, quantitative scientists, clinicians, and developmental biologists.
Affiliation(s)
- Lisa J. Martin
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
- Department of Pediatrics, University of Cincinnati School of Medicine, Cincinnati, OH 45229, USA
- D. Woodrow Benson
- Department of Pediatrics, Medical College of Wisconsin, Wauwatosa, WI 53226, USA