1
Si Y, Lu W, Holloway S, Wang H, Tucci AA, Brucker A, Cheng Y, Wang LS, Schellenberger G, Lee WP, Tzeng JY. CNV-Profile Regression: A New Approach for Copy Number Variant Association Analysis in Whole Genome Sequencing Data. bioRxiv 2024:2024.11.23.624994. [PMID: 39651129 PMCID: PMC11623527 DOI: 10.1101/2024.11.23.624994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/11/2024]
Abstract
Copy number variants (CNVs) are DNA gains or losses involving >50 base pairs. Assessing CNV effects on disease risk requires consideration of several factors. First, there are no natural definitions for CNV loci. Second, CNV effects can depend on dosage and length. Third, CNV effects can be more accurately estimated when all CNV events in a genomic region are analyzed together to assess their joint effects. We propose a new framework for association analysis that directly models an individual's entire CNV profile within a genomic region. This framework represents an individual's CNVs using a CNV profile curve to capture variations in CNV length and dosage and to bypass the need to predefine CNV loci. CNV effects are estimated at each genome position, making the results comparable across different studies. To jointly estimate the effects of all CNVs, we use a Lasso penalty to select CNVs associated with the trait and integrate a weighted L2-fusion penalty to encourage similar effects of adjacent CNVs when supported by the data. Simulations show that the proposed model can more effectively identify causal CNVs while maintaining false positive rates comparable to baseline methods and yield more precise effect-size estimates across different settings. When applied to CNVs derived from whole genome sequencing data of the Alzheimer's Disease Sequencing Project, the proposed methods identify additional CNVs associated with Alzheimer's Disease (AD). These identified CNVs overlap with several known AD-risk genes and are significantly enriched for biological processes related to neuron structures and functions crucial in AD development.
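The two ideas named in the abstract, a CNV profile curve and a Lasso penalty combined with a weighted L2-fusion penalty, can be illustrated with a minimal sketch. This is not the authors' implementation: the event representation `(start, end, dosage)`, the function names `profile_curve` and `penalized_objective`, the squared-error loss, and the choice of fusion weights are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation (an assumption, not from the paper):
# each CNV event is (start, end, dosage), where dosage is the copy-number
# change relative to the normal state (e.g. -1 deletion, +1 duplication).
CNV = Tuple[int, int, int]


def profile_curve(events: List[CNV], breakpoints: List[int]) -> List[int]:
    """Piecewise-constant dosage over segments between consecutive breakpoints.

    A segment receives an event's dosage only if the event fully covers it.
    """
    curve = []
    for left, right in zip(breakpoints[:-1], breakpoints[1:]):
        dosage = sum(d for (s, e, d) in events if s <= left and right <= e)
        curve.append(dosage)
    return curve


def penalized_objective(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    beta: Sequence[float],
    lam1: float,
    lam2: float,
    weights: Sequence[float],
) -> float:
    """Squared-error loss + Lasso + weighted L2-fusion on adjacent effects.

    weights[j] scales the squared difference between beta[j + 1] and beta[j];
    how such weights should be chosen is left open here.
    """
    n = len(y)
    loss = 0.0
    for row, yi in zip(X, y):
        pred = sum(x * b for x, b in zip(row, beta))
        loss += (yi - pred) ** 2
    loss /= 2 * n
    lasso = lam1 * sum(abs(b) for b in beta)
    fusion = lam2 * sum(
        w * (b_next - b_prev) ** 2
        for w, b_next, b_prev in zip(weights, beta[1:], beta[:-1])
    )
    return loss + lasso + fusion


# One individual carrying an overlapping deletion and duplication.
events = [(100, 300, -1), (200, 400, 1)]
curve = profile_curve(events, [100, 200, 300, 400])
```

In this toy setting the Lasso term pushes individual segment effects to zero, while the fusion term penalizes differences between neighbouring segment effects; a real analysis would also need to account for segment length and a trait-appropriate loss.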
2
Siecinski SK, Giamberardino SN, Spanos M, Hauser AC, Gibson JR, Chandrasekhar T, Trelles MDP, Rockhill CM, Palumbo ML, Cundiff AW, Montgomery A, Siper P, Minjarez M, Nowinski LA, Marler S, Kwee LC, Shuffrey LC, Alderman C, Weissman J, Zappone B, Mullett JE, Crosson H, Hong N, Luo S, She L, Bhapkar M, Dean R, Scheer A, Johnson JL, King BH, McDougle CJ, Sanders KB, Kim SJ, Kolevzon A, Veenstra-VanderWeele J, Hauser ER, Sikich L, Gregory SG. Genetic and epigenetic signatures associated with plasma oxytocin levels in children and adolescents with autism spectrum disorder. Autism Res 2023; 16:502-523. [PMID: 36609850 PMCID: PMC10023458 DOI: 10.1002/aur.2884] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/28/2022] [Accepted: 12/19/2022] [Indexed: 01/09/2023]
Abstract
Oxytocin (OT), the brain's most abundant neuropeptide, plays an important role in social salience and motivation. Clinical trials of the efficacy of OT in autism spectrum disorder (ASD) have reported mixed results due in part to ASD's complex etiology. We investigated whether genetic and epigenetic variation contribute to variable endogenous OT levels that modulate sensitivity to OT therapy. To carry out this analysis, we integrated genome-wide profiles of DNA-methylation, transcriptional activity, and genetic variation with plasma OT levels in 290 participants with ASD enrolled in a randomized controlled trial of OT. Our analysis identified genetic variants with novel association with plasma OT, several of which reside in known ASD risk genes. We also show subtle but statistically significant association of plasma OT levels with peripheral transcriptional activity and DNA-methylation profiles across several annotated gene sets. These findings broaden our understanding of the effects of the peripheral oxytocin system and provide novel genetic candidates for future studies to decode the complex etiology of ASD and its interaction with OT signaling and OT-based interventions.
LAY SUMMARY: Oxytocin (OT) is an abundant chemical produced by neurons that plays an important role in social interaction and motivation. We investigated whether genetic and epigenetic factors contribute to variable OT levels in the blood. To do this, we integrated genetic, gene expression, and non-DNA regulated (epigenetic) signatures with blood OT levels in 290 participants with autism enrolled in an OT clinical trial. We identified genetic associations with plasma OT, several of which reside in known autism risk genes. We also show statistically significant association of plasma OT levels with gene expression and epigenetic profiles across several gene pathways. These findings broaden our understanding of the factors that influence OT levels in the blood for future studies to decode the complex presentation of autism and its interaction with OT and OT-based treatment.
Affiliation(s)
- Stephen K Siecinski
  - Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, NC, USA
- Marina Spanos
  - Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, USA
- Annalise C Hauser
  - Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, NC, USA
- Jason R Gibson
  - Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, NC, USA
- Tara Chandrasekhar
  - Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, USA
- M D Pilar Trelles
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Carol M Rockhill
  - Department of Psychiatry, Seattle Children’s Hospital and the University of Washington, Seattle, WA, USA
- Michelle L Palumbo
  - Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Paige Siper
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mendy Minjarez
  - Department of Psychiatry, Seattle Children’s Hospital and the University of Washington, Seattle, WA, USA
- Lisa A Nowinski
  - Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Sarah Marler
  - Department of Psychiatry, Vanderbilt University, Nashville, TN, USA
- Lydia C Kwee
  - Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, NC, USA
- Cheryl Alderman
  - Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, USA
  - Duke Clinical Research Institute, Duke University School of Medicine, Durham, NC, USA
- Jordana Weissman
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Brooke Zappone
  - Department of Psychiatry, Seattle Children’s Hospital and the University of Washington, Seattle, WA, USA
- Jennifer E Mullett
  - Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Hope Crosson
  - Department of Psychiatry, Columbia University, New York, NY, USA
- Natalie Hong
  - Department of Psychiatry, Columbia University, New York, NY, USA
- Sheng Luo
  - Duke Clinical Research Institute, Duke University School of Medicine, Durham, NC, USA
  - Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA
- Lilin She
  - Duke Clinical Research Institute, Duke University School of Medicine, Durham, NC, USA
- Manjushri Bhapkar
  - Duke Clinical Research Institute, Duke University School of Medicine, Durham, NC, USA
- Russell Dean
  - Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Abby Scheer
  - Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, USA
- Jacqueline L Johnson
  - Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Bryan H King
  - Department of Psychiatry, Seattle Children’s Hospital and the University of Washington, Seattle, WA, USA
- Christopher J McDougle
  - Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Kevin B Sanders
  - Department of Psychiatry, Vanderbilt University, Nashville, TN, USA
- Soo-Jeong Kim
  - Department of Psychiatry, Seattle Children’s Hospital and the University of Washington, Seattle, WA, USA
- Alexander Kolevzon
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Elizabeth R Hauser
  - Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, NC, USA
  - Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA
- Linmarie Sikich
  - Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, USA
  - Duke Clinical Research Institute, Duke University School of Medicine, Durham, NC, USA
- Simon G Gregory
  - Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, NC, USA
  - Department of Neurology, Duke University School of Medicine, Durham, NC, USA
3
Lu TP, Kamatani Y, Belbin G, Park T, Hsiao CK. Editorial: Current Status and Future Challenges of Biobank Data Analysis. Front Genet 2022; 13:882611. [PMID: 35495141 PMCID: PMC9047950 DOI: 10.3389/fgene.2022.882611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Accepted: 03/24/2022] [Indexed: 11/23/2022] Open
Affiliation(s)
- Tzu-Pin Lu
  - Department of Public Health, College of Public Health, Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
- Yoichiro Kamatani
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Gillian Belbin
  - Institute of Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul, South Korea
- Chuhsing Kate Hsiao
  - Department of Public Health, College of Public Health, Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
- *Correspondence: Chuhsing Kate Hsiao,
5
Yu QY, Lu TP, Hsiao TH, Lin CH, Wu CY, Tzeng JY, Hsiao CK. An Integrative Co-localization (INCO) Analysis for SNV and CNV Genomic Features With an Application to Taiwan Biobank Data. Front Genet 2021; 12:709555. [PMID: 34567069 PMCID: PMC8456116 DOI: 10.3389/fgene.2021.709555] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/14/2021] [Accepted: 08/23/2021] [Indexed: 11/13/2022] Open
Abstract
Genomic studies have been a major approach to elucidating disease etiology and to exploring potential targets for treatments of many complex diseases. Statistical analyses in these studies often face the challenges of multiplicity, weak signals, and the nature of dependence among genetic markers. This situation becomes even more complicated when multi-omics data are available. To integrate the data from different platforms, various integrative analyses have been adopted, ranging from the direct union or intersection operation on sets derived from different single-platform analyses to complex hierarchical multi-level models. The former ignores the biological relationship between molecules while the latter can be hard to interpret. We propose in this study an integrative approach that combines both single nucleotide variants (SNVs) and copy number variations (CNVs) in the same genomic unit to co-localize the concurrent effect and to deal with the sparsity due to rare variants. This approach is illustrated with simulation studies to evaluate its performance and is applied to low-density lipoprotein cholesterol and triglyceride measurements from Taiwan Biobank. The results show that the proposed method can more effectively detect the collective effect from both SNVs and CNVs compared to traditional methods. For the biobank analysis, the identified genetic regions, including the gene VNN2, could be novel and deserve further investigation.
Affiliation(s)
- Qi-You Yu
  - Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
- Tzu-Pin Lu
  - Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
  - Department of Public Health, National Taiwan University, Taipei, Taiwan
- Tzu-Hung Hsiao
  - Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan
- Ching-Heng Lin
  - Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan
- Chi-Yun Wu
  - Graduate Group in Genomics and Computational Biology, University of Pennsylvania, Philadelphia, PA, United States
  - Department of Statistics, University of Pennsylvania, Philadelphia, PA, United States
- Jung-Ying Tzeng
  - Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
  - Department of Statistics and Bioinformatics Research Center, North Carolina State University, Raleigh, NC, United States
- Chuhsing Kate Hsiao
  - Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
  - Department of Public Health, National Taiwan University, Taipei, Taiwan