1
De T, Coin L, Herberg J, Johnson MR, Järvelin MR. Plasma metabolomic signatures for copy number variants and COVID-19 risk loci in Northern Finland populations. Sci Rep 2025; 15:13172. [PMID: 40240424 PMCID: PMC12003712 DOI: 10.1038/s41598-025-94839-9] [Received: 07/16/2024] [Accepted: 03/17/2025] [Indexed: 04/18/2025] [Open Access]
Abstract
Copy number variants (CNVs) are a major class of genomic variation with established roles in human physiology and disease. Here we present genome-wide metabolomic signatures for CNVs in two Finnish cohorts: the Northern Finland Birth Cohort 1966 (NFBC 1966) and NFBC 1986. We analysed and reported CNVs in over 9,300 individuals and characterised their dosage effect (CNV-metabolomic QTL) on 228 plasma lipoproteins and metabolites. We report reference (normal physiology) metabolomic signatures for up to ~2.6 million COVID-19 GWAS results from the National Institutes of Health (NIH) GRASP database, including outcomes related to COVID-19 death, severity, and hospitalisation. Furthermore, by analysing two exemplar genes for COVID-19 severity, LZTFL1 and OAS1, we report two additional candidate genes for COVID-19 severity biology: (1) NFIX, a gene related to viral (adenovirus) replication and hematopoietic stem cells, and (2) ACSL1, a known candidate gene for sepsis and bacterial inflammation. Based on our results and the current literature, we hypothesise that (1) charge imbalance across the cellular membrane between cations (e.g. Fe²⁺, Mg²⁺) and anions (e.g. ROS, hydroxide ions from cellular Fenton reactions, superoxide), (2) iron trafficking within and between different cell types, e.g. macrophages, and (3) systemic oxidative stress response (e.g. lipid peroxidation-mediated inflammation) could together be of relevance in severe COVID-19 cases. To conclude, our unique atlas of univariate and multivariate metabolomic signatures for CNVs (~7.2 million signatures), with deep annotations from various multi-omics data sets, provides an important reference knowledge base for human metabolism and diseases.
Affiliation(s)
- Tisham De
  - Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK
  - Department of Genomics of Common Diseases, Imperial College London, London, UK
  - Department of Infectious Disease, Imperial College London, London, UK
- Lachlan Coin
  - Department of Infectious Disease, Imperial College London, London, UK
  - Department of Microbiology and Immunology, Institute for Infection and Immunity, University of Melbourne at The Peter Doherty, Melbourne, Australia
- Jethro Herberg
  - Department of Infectious Disease, Imperial College London, London, UK
- Marjo-Riitta Järvelin
  - Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK
  - Centre for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu, Finland
  - Unit of Primary Health Care and Medical Research Center, Oulu University Hospital, Oulu, Finland
  - Centre for Environment and Health, Imperial College London, London, UK
  - Biocenter Oulu, University of Oulu, Oulu, Finland
2
Malekpour SA, Kalirad A, Majidian S. Inferring the Selective History of CNVs Using a Maximum Likelihood Model. Genome Biol Evol 2025; 17:evaf050. [PMID: 40100752 PMCID: PMC11950529 DOI: 10.1093/gbe/evaf050] [Received: 01/15/2024] [Revised: 02/27/2025] [Accepted: 03/13/2025] [Indexed: 03/20/2025] [Open Access]
Abstract
Copy number variations (CNVs), structural variations generated by deletion and/or duplication that result in a change in DNA dosage, are prevalent in nature. CNVs can drastically affect the phenotype of an organism and have been shown both to be involved in genetic disorders and to serve as raw material in adaptive evolution. In contrast to single-nucleotide variations, CNVs often have large and varied effects on phenotype, which hinders our ability to infer their selective advantage from population genetics data. Here, we present a likelihood-based approach, dubbed PoMoCNV (POlymorphism-aware phylogenetic MOdel for CNVs), that estimates evolutionary parameters, such as mutation rates among different copy numbers and relative fitness loss per copy deletion at a genomic locus, from population genetics data. As a case study, we analyze the genomics data of 40 strains of Caenorhabditis elegans, representing four different populations. We take advantage of data on chromatin accessibility to interpret the mutation rate and fitness of copy numbers, as inferred by PoMoCNV, specifically in open or closed chromatin loci. We further test the reliability of PoMoCNV by estimating the evolutionary parameters of CNVs for mutation-accumulation experiments in C. elegans with varying levels of genetic drift.
Affiliation(s)
- Seyed Amir Malekpour
  - School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran 19395-5746, Iran
- Ata Kalirad
  - Department for Integrative Evolutionary Biology, Max Planck Institute for Biology Tübingen, Tübingen 72076, Germany
- Sina Majidian
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne 1015, Switzerland
3
Harris L, McDonagh EM, Zhang X, Fawcett K, Foreman A, Daneck P, Sergouniotis PI, Parkinson H, Mazzarotto F, Inouye M, Hollox EJ, Birney E, Fitzgerald T. Genome-wide association testing beyond SNPs. Nat Rev Genet 2025; 26:156-170. [PMID: 39375560 DOI: 10.1038/s41576-024-00778-y] [Accepted: 09/03/2024] [Indexed: 10/09/2024]
Abstract
Decades of genetic association testing in human cohorts have provided important insights into the genetic architecture and biological underpinnings of complex traits and diseases. However, for certain traits, genome-wide association studies (GWAS) for common SNPs are approaching signal saturation, which underscores the need to explore other types of genetic variation to understand the genetic basis of traits and diseases. Copy number variation (CNV) is an important source of heritability that is well known to functionally affect human traits. Recent technological and computational advances enable the large-scale, genome-wide evaluation of CNVs, with implications for downstream applications such as polygenic risk scoring and drug target identification. Here, we review the current state of CNV-GWAS, discuss current limitations in resource infrastructure that need to be overcome to enable the wider uptake of CNV-GWAS results, highlight emerging opportunities and suggest guidelines and standards for future GWAS for genetic variation beyond SNPs at scale.
Affiliation(s)
- Laura Harris
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, UK
- Ellen M McDonagh
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, UK
- Xiaolei Zhang
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, UK
- Katherine Fawcett
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, UK
  - Department of Population Health Sciences, University of Leicester, Leicester, UK
- Amy Foreman
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, UK
- Petr Daneck
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Panagiotis I Sergouniotis
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, UK
  - Division of Evolution, Infection and Genomics, School of Biological Sciences, University of Manchester, Manchester, UK
- Helen Parkinson
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, UK
- Francesco Mazzarotto
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
  - National Heart and Lung Institute, Imperial College London, London, UK
- Michael Inouye
  - British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
  - Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
  - Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Australia
- Edward J Hollox
  - Department of Genetics and Genome Biology, University of Leicester, Leicester, UK
- Ewan Birney
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, UK
- Tomas Fitzgerald
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, UK
4
Wu XR, Wu BS, Kang JJ, Chen LM, Deng YT, Chen SD, Dong Q, Feng JF, Cheng W, Yu JT. Contribution of copy number variations to education, socioeconomic status and cognition from a genome-wide study of 305,401 subjects. Mol Psychiatry 2025; 30:889-898. [PMID: 39215183 DOI: 10.1038/s41380-024-02717-z] [Received: 02/04/2024] [Revised: 08/19/2024] [Accepted: 08/22/2024] [Indexed: 09/04/2024]
Abstract
Educational attainment (EA), socioeconomic status (SES) and cognition are phenotypically and genetically linked to health outcomes. However, the role of copy number variations (CNVs) in influencing EA/SES/cognition remains unclear. Using a large-scale (n = 305,401) genome-wide CNV-level association analysis, we discovered 33 CNV loci significantly associated with EA/SES/cognition, 20 of which were novel (deletions at 2p22.2, 2p16.2, 2p12, 3p25.3, 4p15.2, 5p15.33, 5q21.1, 8p21.3, 9p21.1, 11p14.3, 13q12.13, 17q21.31, and 20q13.33, as well as duplications at 3q12.2, 3q23, 7p22.3, 8p23.1, 8p23.2, 17q12 (105 kb), and 19q13.32). The genes identified in gene-level tests were enriched in biological pathways such as neurodegeneration, telomere maintenance and axon guidance. Phenome-wide association studies further identified novel associations of EA/SES/cognition-associated CNVs with mental and physical diseases, such as 6q27 duplication with upper respiratory disease and 17q12 (105 kb) duplication with mood disorders. Our findings provide a genome-wide CNV profile for EA/SES/cognition and bridge their connections to health. The expanded candidate CNVs database and the residing genes would be a valuable resource for future studies aimed at uncovering the biological mechanisms underlying cognitive function and related clinical phenotypes.
Affiliation(s)
- Xin-Rui Wu
  - Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Bang-Sheng Wu
  - Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Ju-Jiao Kang
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Li-Min Chen
  - Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Yue-Ting Deng
  - Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Shi-Dong Chen
  - Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Qiang Dong
  - Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Jian-Feng Feng
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
  - Department of Computer Science, University of Warwick, Coventry, CV4 7AL, UK
- Wei Cheng
  - Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Jin-Tai Yu
  - Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
5
Collins RL, Talkowski ME. Diversity and consequences of structural variation in the human genome. Nat Rev Genet 2025:10.1038/s41576-024-00808-9. [PMID: 39838028 DOI: 10.1038/s41576-024-00808-9] [Accepted: 11/26/2024] [Indexed: 01/23/2025]
Abstract
The biomedical community is increasingly invested in capturing all genetic variants across human genomes, interpreting their functional consequences and translating these findings to the clinic. A crucial component of this endeavour is the discovery and characterization of structural variants (SVs), which are ubiquitous in the human population, heterogeneous in their mutational processes, key substrates for evolution and adaptation, and profound drivers of human disease. The recent emergence of new technologies and the remarkable scale of sequence-based population studies have begun to crystalize our understanding of SVs as a mutational class and their widespread influence across phenotypes. In this Review, we summarize recent discoveries and new insights into SVs in the human genome in terms of their mutational patterns, population genetics, functional consequences, and impact on human traits and disease. We conclude by outlining three frontiers to be explored by the field over the next decade.
Affiliation(s)
- Ryan L Collins
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Michael E Talkowski
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
6
Krause J, Classen C, Dey D, Lausberg E, Kessler L, Eggermann T, Kurth I, Begemann M, Kraft F. CNVizard-a lightweight streamlit application for an interactive analysis of copy number variants. BMC Bioinformatics 2024; 25:376. [PMID: 39690401 DOI: 10.1186/s12859-024-06010-2] [Received: 09/10/2024] [Accepted: 12/09/2024] [Indexed: 12/19/2024] [Open Access]
Abstract
BACKGROUND: Methods to call, analyze and visualize copy number variations (CNVs) from massively parallel sequencing data have been widely adopted in clinical practice and genetic research. To enable a streamlined analysis of CNV data, comprehensive annotations and good visualizations are indispensable. The ability to detect single-exon CNVs is another important feature for genetic testing. Nonetheless, most available open-source tools have limitations in at least one of these areas. An additional drawback is that available tools deliver data in an unstructured, static format that requires subsequent visualization and formatting effort.
RESULTS: Here we present CNVizard, an interactive Streamlit app that provides a comprehensive visualization of CNVkit data. Furthermore, combining CNVizard with the CNVand pipeline allows the annotation and visualization of CNV or SV VCF files from any CNV caller.
CONCLUSION: CNVizard, in combination with CNVand, enables the comprehensive and streamlined analysis of short- and long-read sequencing data and provides an intuitive, webapp-like experience for interactive visualization of CNV data.
Affiliation(s)
- Jeremias Krause
  - Medical Faculty, Institute for Human Genetics and Genomic Medicine, Uniklinik RWTH Aachen, Pauwelsstrasse 30, 52074, Aachen, North-Rhine-Westphalia, Germany
- Carlos Classen
  - Medical Faculty, Institute for Human Genetics and Genomic Medicine, Uniklinik RWTH Aachen, Pauwelsstrasse 30, 52074, Aachen, North-Rhine-Westphalia, Germany
- Daniela Dey
  - Medical Faculty, Institute for Human Genetics and Genomic Medicine, Uniklinik RWTH Aachen, Pauwelsstrasse 30, 52074, Aachen, North-Rhine-Westphalia, Germany
- Eva Lausberg
  - Medical Faculty, Institute for Human Genetics and Genomic Medicine, Uniklinik RWTH Aachen, Pauwelsstrasse 30, 52074, Aachen, North-Rhine-Westphalia, Germany
- Luise Kessler
  - Medical Faculty, Institute for Human Genetics and Genomic Medicine, Uniklinik RWTH Aachen, Pauwelsstrasse 30, 52074, Aachen, North-Rhine-Westphalia, Germany
- Thomas Eggermann
  - Medical Faculty, Institute for Human Genetics and Genomic Medicine, Uniklinik RWTH Aachen, Pauwelsstrasse 30, 52074, Aachen, North-Rhine-Westphalia, Germany
- Ingo Kurth
  - Medical Faculty, Institute for Human Genetics and Genomic Medicine, Uniklinik RWTH Aachen, Pauwelsstrasse 30, 52074, Aachen, North-Rhine-Westphalia, Germany
- Matthias Begemann
  - Medical Faculty, Institute for Human Genetics and Genomic Medicine, Uniklinik RWTH Aachen, Pauwelsstrasse 30, 52074, Aachen, North-Rhine-Westphalia, Germany
- Florian Kraft
  - Medical Faculty, Institute for Human Genetics and Genomic Medicine, Uniklinik RWTH Aachen, Pauwelsstrasse 30, 52074, Aachen, North-Rhine-Westphalia, Germany
7
Kushima I, Nakatochi M, Ozaki N. Copy Number Variations and Human Well-Being: Integrating Psychiatric, Physical, and Socioeconomic Perspectives. Biol Psychiatry 2024:S0006-3223(24)01788-8. [PMID: 39643102 DOI: 10.1016/j.biopsych.2024.11.019] [Received: 04/16/2024] [Revised: 11/12/2024] [Accepted: 11/30/2024] [Indexed: 12/09/2024]
Abstract
Copy number variations (CNVs) have emerged as crucial genetic factors that influence a wide spectrum of human health outcomes, with particularly strong associations to psychiatric disorders. In this review, we present a synthesis of diverse impacts of psychiatric disorder-associated CNVs on neurodevelopment, brain function, and physical health across the lifespan. Large-scale studies have revealed that CNV carriers exhibit an increased risk for psychiatric disorders, cognitive deficits, sleep disturbances, neurological disorders, and other physical conditions, including cardiovascular diseases, diabetes, and renal disease, highlighting the wide-ranging impact of CNVs beyond the brain. Neuroimaging studies have revealed substantial CNV effects on brain structure, from cortical and subcortical alterations to white matter microstructure, with effect sizes often exceeding those observed in idiopathic psychiatric disorders. Cellular and animal models have begun to elucidate dynamic CNV effects on neurodevelopment, neuronal function, and cellular energy metabolism, while revealing complex CNV-environment interactions and cell type-specific responses, particularly in studies of 22q11.2 deletion syndrome. This review also explores the complex interplay between psychiatric and physical health conditions in CNV carriers and how these interactions contribute to adverse socioeconomic outcomes, including reduced educational attainment and income levels, creating a feedback loop that further impacts health outcomes. Finally, in this review, we also highlight research limitations and propose key priorities for clinical implementation, including the need for longitudinal studies, standardized guidelines for CNV result reporting and genetic counseling, and integrated care networks to provide a foundation for advancing the field of precision psychiatry.
Affiliation(s)
- Itaru Kushima
  - Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
  - Medical Genomics Center, Nagoya University Hospital, Nagoya, Japan
- Masahiro Nakatochi
  - Public Health Informatics Unit, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Norio Ozaki
  - Pathophysiology of Mental Disorders, Nagoya University Graduate School of Medicine, Nagoya, Japan
  - Institute for Glyco-core Research, Nagoya University, Nagoya, Japan
8
Hujoel MLA, Handsaker RE, Kamitaki N, Mukamel RE, Rubinacci S, Palamara PF, McCarroll SA, Loh PR. Insights into the causes and consequences of DNA repeat expansions from 700,000 biobank participants. bioRxiv 2024:2024.11.25.625248. [PMID: 39651202 PMCID: PMC11623664 DOI: 10.1101/2024.11.25.625248] [Indexed: 12/11/2024]
Abstract
Expansions and contractions of tandem DNA repeats are a source of genetic variation in human populations and in human tissues: some expanded repeats cause inherited disorders, and some are also somatically unstable. We analyzed DNA sequence data, derived from the blood cells of >700,000 participants in UK Biobank and the All of Us Research Program, and developed new computational approaches to recognize, measure and learn from DNA-repeat instability at 15 highly polymorphic CAG-repeat loci. We found that expansion and contraction rates varied widely across these 15 loci, even for alleles of the same length; repeats at different loci also exhibited widely variable relative propensities to mutate in the germline versus the blood. The high somatic instability of TCF4 repeats enabled a genome-wide association analysis that identified seven loci at which inherited variants modulate TCF4 repeat instability in blood cells. Three of the implicated loci contained genes (MSH3, FAN1, and PMS2) that also modulate Huntington's disease age-at-onset as well as somatic instability of the HTT repeat in blood; however, the specific genetic variants and their effects (instability-increasing or -decreasing) appeared to be tissue-specific and repeat-specific, suggesting that somatic mutation in different tissues, or of different repeats in the same tissue, proceeds independently and under the control of substantially different genetic variation. Additional modifier loci included DNA damage response genes ATAD5 and GADD45A. Analyzing DNA repeat expansions together with clinical data showed that inherited repeats in the 5' UTR of the glutaminase (GLS) gene are associated with stage 5 chronic kidney disease (OR=14.0 [5.7-34.3]) and liver diseases (OR=3.0 [1.5-5.9]). These and other results point to the dynamics of DNA repeats in human populations and across the human lifespan.
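The odds ratios quoted in this abstract are reported with bracketed intervals. As a reading aid only, and assuming those brackets are 95% Wald intervals on the log-odds scale (the abstract does not state this), the short Python sketch below backs out the standard error implied by an interval and checks that it reproduces the reported bounds. The function names are hypothetical and are not taken from the preprint.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def se_from_ci(lower, upper, z=Z_95):
    """Standard error of log(OR) implied by a Wald interval (assumed form)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def wald_ci(odds_ratio, se, z=Z_95):
    """Wald interval for an odds ratio, built on the log scale."""
    log_or = math.log(odds_ratio)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Values quoted in the abstract for the GLS repeat and stage 5 CKD.
se = se_from_ci(5.7, 34.3)
low, high = wald_ci(14.0, se)
print(f"implied SE of log(OR): {se:.3f}; interval: {low:.1f}-{high:.1f}")
```

If the interval were not symmetric on the log scale, the rebuilt bounds would drift away from the reported ones, which would suggest a different interval construction.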
9
Si Y, Lu W, Holloway S, Wang H, Tucci AA, Brucker A, Cheng Y, Wang LS, Schellenberger G, Lee WP, Tzeng JY. CNV-Profile Regression: A New Approach for Copy Number Variant Association Analysis in Whole Genome Sequencing Data. bioRxiv 2024:2024.11.23.624994. [PMID: 39651129 PMCID: PMC11623527 DOI: 10.1101/2024.11.23.624994] [Indexed: 12/11/2024]
Abstract
Copy number variants (CNVs) are DNA gains or losses involving >50 base pairs. Assessing CNV effects on disease risk requires consideration of several factors. First, there are no natural definitions for CNV loci. Second, CNV effects can depend on dosage and length. Third, CNV effects can be more accurately estimated when all CNV events in a genomic region are analyzed together to assess their joint effects. We propose a new framework for association analysis that directly models an individual's entire CNV profile within a genomic region. This framework represents an individual's CNVs using a CNV profile curve to capture variations in CNV length and dosage and to bypass the need to predefine CNV loci. CNV effects are estimated at each genome position, making the results comparable across different studies. To jointly estimate the effects of all CNVs, we use a Lasso penalty to select CNVs associated with the trait and integrate a weighted L2-fusion penalty to encourage similar effects of adjacent CNVs when supported by the data. Simulations show that the proposed model can more effectively identify causal CNVs while maintaining false positive rates comparable to baseline methods, and that it yields more precise effect-size estimates across different settings. When applied to CNVs derived from whole genome sequencing data of the Alzheimer's Disease Sequencing Project, the proposed method identifies additional CNVs associated with Alzheimer's Disease (AD). These identified CNVs overlap with several known AD-risk genes and are significantly enriched for biological processes related to neuron structures and functions crucial in AD development.
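The two ideas in this abstract, a per-individual CNV profile curve and a Lasso penalty combined with a weighted L2-fusion penalty, can be illustrated with a small Python sketch. This is not the authors' implementation: the binning of the genome, the dosage coding (-1 for a deletion, +1 for a duplication), the squared-error loss, and all function and parameter names are assumptions made for illustration. The sketch only evaluates the penalized objective and does not fit it.

```python
def cnv_profile(events, n_bins, bin_size):
    """Piecewise-constant dosage curve for one individual.

    events: list of (start_bp, end_bp, dosage) tuples, where dosage is
    -1 for a deletion and +1 for a duplication (hypothetical coding).
    """
    profile = [0.0] * n_bins
    for start, end, dosage in events:
        first = start // bin_size
        last = min(n_bins, -(-end // bin_size))  # ceiling division
        for b in range(first, last):
            profile[b] += dosage
    return profile


def penalized_loss(beta, X, y, lam_lasso, lam_fusion, weights):
    """Squared-error fit + Lasso + weighted L2-fusion penalty.

    weights[j] scales the penalty on (beta[j+1] - beta[j]) ** 2, so a
    large weight pulls the effects of adjacent positions together.
    """
    resid = [yi - sum(x * b for x, b in zip(row, beta))
             for row, yi in zip(X, y)]
    fit = 0.5 * sum(r * r for r in resid) / len(y)
    lasso = lam_lasso * sum(abs(b) for b in beta)
    fusion = lam_fusion * sum(
        w * (b_next - b_prev) ** 2
        for w, b_prev, b_next in zip(weights, beta, beta[1:])
    )
    return fit + lasso + fusion


# One individual with a deletion spanning 1,000-3,000 bp, in 1 kb bins.
print(cnv_profile([(1000, 3000, -1)], n_bins=5, bin_size=1000))
```

Stacking one profile per individual gives the design matrix `X`. The Lasso term shrinks most position-level effects to zero, while the fusion term discourages abrupt jumps between neighbouring positions.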
10
Sasako T, Ilboudo Y, Liang KYH, Chen Y, Yoshiji S, Richards JB. The Influence of Trinucleotide Repeats in the Androgen Receptor Gene on Androgen-related Traits and Diseases. J Clin Endocrinol Metab 2024; 109:3234-3244. [PMID: 38701087 PMCID: PMC11570371 DOI: 10.1210/clinem/dgae302] [Received: 02/09/2024] [Revised: 04/26/2024] [Accepted: 05/01/2024] [Indexed: 05/05/2024]
Abstract
CONTEXT: Trinucleotide repeats in the androgen receptor have been proposed to influence testosterone signaling in men, but the clinical relevance of these trinucleotide repeats remains controversial.
OBJECTIVE: To examine how androgen receptor trinucleotide repeat lengths affect androgen-related traits and disease risks and whether they influence the clinical importance of circulating testosterone levels.
METHODS: We quantified CAG and GGC repeat lengths in the androgen receptor (AR) gene of European-ancestry male participants in the UK Biobank from whole-genome and whole-exome sequence data using ExpansionHunter and tested associations with androgen-related traits and diseases. We also examined whether the associations between testosterone levels and these outcomes were affected by adjustment for the repeat lengths.
RESULTS: We successfully quantified the repeat lengths from whole-genome and/or whole-exome sequence data in 181,217 males. Both repeat lengths were positively associated with circulating total testosterone level and bone mineral density, whereas CAG repeat length was negatively associated with male-pattern baldness; however, these effects were relatively small, and the repeat lengths were not associated with most of the other outcomes. Circulating total testosterone level was associated with various outcomes, but this relationship was not affected by adjustment for the repeat lengths.
CONCLUSION: In this large-scale study, we found that longer CAG and GGC repeats in the AR gene influence androgen resistance, elevate circulating testosterone level via a feedback loop, and play a role in some androgen-targeted tissues. Generally, however, circulating testosterone level is a more important determinant of androgen action in males than repeat lengths.
Affiliation(s)
- Takayoshi Sasako
  - McGill University, Montréal, Québec H3T 1E2, Canada
  - Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec H3T 1E2, Canada
  - Tanaka Diabetes Clinic Omiya, Saitama 330-0846, Japan
  - Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-Ku, Tokyo 113-0033, Japan
- Yann Ilboudo
  - Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec H3T 1E2, Canada
- Kevin Y H Liang
  - Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec H3T 1E2, Canada
  - Quantitative Life Sciences Program, McGill University, Montréal, Québec H3T 1E2, Canada
- Yiheng Chen
  - Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec H3T 1E2, Canada
  - Department of Human Genetics, McGill University, Montréal, Québec H3T 1E2, Canada
- Satoshi Yoshiji
  - Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec H3T 1E2, Canada
  - Department of Human Genetics, McGill University, Montréal, Québec H3T 1E2, Canada
  - Kyoto-McGill International Collaborative Program in Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto 606-8501, Japan
  - Japan Society for the Promotion of Science, Tokyo 102-0083, Japan
- J Brent Richards
  - Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec H3T 1E2, Canada
  - Department of Human Genetics, McGill University, Montréal, Québec H3T 1E2, Canada
  - Five Prime Sciences Inc, Montréal, Québec H3Y 2W4, Canada
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, Québec H3T 1E2, Canada
  - Department of Twin Research, King's College London, London WC2R 2LS, UK
11
Smith CIE, Burger JA, Zain R. Estimating the Number of Polygenic Diseases Among Six Mutually Exclusive Entities of Non-Tumors and Cancer. Int J Mol Sci 2024; 25:11968. [PMID: 39596040 PMCID: PMC11593959 DOI: 10.3390/ijms252211968] [Received: 09/18/2024] [Revised: 11/04/2024] [Accepted: 11/05/2024] [Indexed: 11/28/2024] [Open Access]
Abstract
In the era of precision medicine with increasing amounts of sequenced cancer and non-cancer genomes of different ancestries, we here enumerate the resulting polygenic disease entities. Based on the cell number status, we first identified six fundamental types of polygenic illnesses, five of which are non-cancerous. Like complex, non-tumor disorders, neoplasms normally carry alterations in multiple genes, including in 'Drivers' and 'Passengers'. However, tumors also lack certain genetic alterations/epigenetic changes, recently named 'Goners', which are toxic for the neoplasm and potentially constitute therapeutic targets. Drivers are considered essential for malignant transformation, whereas environmental influences vary considerably among both types of polygenic diseases. For each form, hyper-rare disorders, defined as affecting <1/10^8 individuals, likely represent the largest number of disease entities. Loss of redundant tumor-suppressor genes exemplifies such a profoundly rare mutational event. For non-tumor, polygenic diseases, pathway-centered taxonomies seem preferable. This classification is not readily feasible in cancer, but the inclusion of Drivers and possibly also of epigenetic changes to the existing nomenclature might serve as initial steps in this direction. Based on the detailed genetic alterations, the number of polygenic diseases is essentially countless, but different forms of nosologies may be used to restrict the number.
Affiliation(s)
- C. I. Edvard Smith
  - Department of Laboratory Medicine, Karolinska Institutet, ANA Futura, Alfred Nobels Allé 8 Floor 8, SE-141 52 Huddinge, Sweden
  - Karolinska ATMP Center, Karolinska Institutet, Karolinska University Hospital, SE-171 76 Stockholm, Sweden
  - Department of Infectious Diseases, Karolinska University Hospital, SE-141 86 Huddinge, Sweden
- Jan A. Burger
  - Department of Leukemia, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Rula Zain
  - Department of Laboratory Medicine, Karolinska Institutet, ANA Futura, Alfred Nobels Allé 8 Floor 8, SE-141 52 Huddinge, Sweden
  - Karolinska ATMP Center, Karolinska Institutet, Karolinska University Hospital, SE-171 76 Stockholm, Sweden
  - Centre for Rare Diseases, Department of Clinical Genetics, Karolinska University Hospital, SE-171 76 Stockholm, Sweden
12
Auwerx C, Moix S, Kutalik Z, Reymond A. Disentangling mechanisms behind the pleiotropic effects of proximal 16p11.2 BP4-5 CNVs. Am J Hum Genet 2024; 111:2347-2361. [PMID: 39332408 PMCID: PMC11568757 DOI: 10.1016/j.ajhg.2024.08.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Revised: 08/06/2024] [Accepted: 08/21/2024] [Indexed: 09/29/2024] Open
Abstract
Whereas 16p11.2 BP4-5 copy-number variants (CNVs) represent one of the most pleiotropic etiologies of genomic syndromes in both clinical and population cohorts, the mechanisms leading to such pleiotropy remain understudied. Identifying 73 deletion and 89 duplication carrier individuals among unrelated White British UK Biobank participants, we performed a phenome-wide association study (PheWAS) between the region's copy number and 117 complex traits and diseases, mimicking four dosage models. Forty-six phenotypes (39%) were affected by 16p11.2 BP4-5 CNVs, with the deletion-only, mirror, U-shape, and duplication-only models being the best fit for 30, 10, 4, and 2 phenotypes, respectively, aligning with the stronger deleteriousness of the deletion. Upon individually adjusting CNV effects for either body mass index (BMI), height, or educational attainment (EA), we found that sixteen testable deletion-driven associations (primarily with cardiovascular and metabolic traits) were BMI dependent, with EA playing a more subtle role and no association depending on height. Bidirectional Mendelian randomization supported that 13 out of these 16 associations were secondary consequences of the CNV's impact on BMI. For the 23 traits that remained significantly associated upon individual adjustment for mediators, matched-control analyses found that 10 phenotypes, including musculoskeletal traits, liver enzymes, fluid intelligence, platelet count, and pneumonia and acute kidney injury risk, remained associated under strict Bonferroni correction, with 10 additional nominally significant associations. These results paint a complex picture of 16p11.2 BP4-5's pleiotropic pattern that involves direct effects on multiple physiological systems and indirect co-morbidities consequential to the CNV's impact on BMI and EA, acting through trait-specific dosage mechanisms.
Affiliation(s)
- Chiara Auwerx
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; University Center for Primary Care and Public Health, Lausanne, Switzerland
- Samuel Moix
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; University Center for Primary Care and Public Health, Lausanne, Switzerland
- Zoltán Kutalik
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; University Center for Primary Care and Public Health, Lausanne, Switzerland
- Alexandre Reymond
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
13
Auwerx C, Kutalik Z, Reymond A. The pleiotropic spectrum of proximal 16p11.2 CNVs. Am J Hum Genet 2024; 111:2309-2346. [PMID: 39332410 PMCID: PMC11568765 DOI: 10.1016/j.ajhg.2024.08.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Revised: 08/18/2024] [Accepted: 08/21/2024] [Indexed: 09/29/2024] Open
Abstract
Recurrent genomic rearrangements at 16p11.2 BP4-5 represent one of the most common causes of genomic disorders. Originally associated with increased risk for autism spectrum disorder, schizophrenia, and intellectual disability, as well as adiposity and head circumference, these CNVs have since been associated with a plethora of phenotypic alterations, albeit with high variability in expressivity and incomplete penetrance. Here, we comprehensively review the pleiotropy associated with 16p11.2 BP4-5 rearrangements to shine light on its full phenotypic spectrum. Illustrating this phenotypic heterogeneity, we expose many parallels between findings gathered from clinical versus population-based cohorts, which often point to the same physiological systems, and emphasize the role of the CNV beyond neuropsychiatric and anthropometric traits. Revealing the complex and variable clinical manifestations of this CNV is crucial for accurate diagnosis and personalized treatment strategies for carrier individuals. Furthermore, we discuss areas of research that will be key to identifying factors contributing to phenotypic heterogeneity and gaining mechanistic insights into the molecular pathways underlying observed associations, while demonstrating how diversity in affected individuals, cohorts, experimental models, and analytical approaches can catalyze discoveries.
Affiliation(s)
- Chiara Auwerx
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; University Center for Primary Care and Public Health, Lausanne, Switzerland
- Zoltán Kutalik
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; University Center for Primary Care and Public Health, Lausanne, Switzerland
- Alexandre Reymond
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
14
Schultz LM, Knighton A, Huguet G, Saci Z, Jean-Louis M, Mollon J, Knowles EEM, Glahn DC, Jacquemont S, Almasy L. Copy-number variants differ in frequency across genetic ancestry groups. HGG Advances 2024; 5:100340. [PMID: 39138864 PMCID: PMC11401192 DOI: 10.1016/j.xhgg.2024.100340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 08/07/2024] [Accepted: 08/07/2024] [Indexed: 08/15/2024] Open
Abstract
Copy-number variants (CNVs) have been implicated in a variety of neuropsychiatric and cognitive phenotypes. We found that deleterious CNVs are less prevalent in non-European ancestry groups than they are in European ancestry groups of both the UK Biobank (UKBB) and a US replication cohort (SPARK). We also identified specific recurrent CNVs that consistently differ in frequency across ancestry groups in both the UKBB and SPARK. These ancestry-related differences in CNV prevalence present in both an unselected community population and a family cohort enriched with individuals diagnosed with autism spectrum disorder (ASD) strongly suggest that genetic ancestry should be considered when probing associations between CNVs and health outcomes.
Affiliation(s)
- Laura M Schultz
  - Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Alexys Knighton
  - School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA
- Zohra Saci
  - CHU Sainte-Justine, Montréal, QC, Canada
- Josephine Mollon
  - Department of Psychiatry and Behavioral Sciences, Boston Children's Hospital, Boston, MA, USA; Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- Emma E M Knowles
  - Department of Psychiatry and Behavioral Sciences, Boston Children's Hospital, Boston, MA, USA; Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- David C Glahn
  - Department of Psychiatry and Behavioral Sciences, Boston Children's Hospital, Boston, MA, USA; Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- Sébastien Jacquemont
  - CHU Sainte-Justine, Montréal, QC, Canada; Department of Pediatrics, Université de Montréal, Montréal, QC, Canada
- Laura Almasy
  - Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
15
Yu Z, Coorens THH, Uddin MM, Ardlie KG, Lennon N, Natarajan P. Genetic variation across and within individuals. Nat Rev Genet 2024; 25:548-562. [PMID: 38548833 PMCID: PMC11457401 DOI: 10.1038/s41576-024-00709-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/09/2024] [Indexed: 04/12/2024]
Abstract
Germline variation and somatic mutation are intricately connected and together shape human traits and disease risks. Germline variants are present from conception, but they vary between individuals and accumulate over generations. By contrast, somatic mutations accumulate throughout life in a mosaic manner within an individual due to intrinsic and extrinsic sources of mutations and selection pressures acting on cells. Recent advancements, such as improved detection methods and increased resources for association studies, have drastically expanded our ability to investigate germline and somatic genetic variation and compare underlying mutational processes. A better understanding of the similarities and differences in the types, rates and patterns of germline and somatic variants, as well as their interplay, will help elucidate the mechanisms underlying their distinct yet interlinked roles in human health and biology.
Affiliation(s)
- Zhi Yu
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Md Mesbah Uddin
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Niall Lennon
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Pradeep Natarajan
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
16
Sonehara K, Yano Y, Naito T, Goto S, Yoshihara H, Otani T, Ozawa F, Kitaori T, Matsuda K, Nishiyama T, Okada Y, Sugiura-Ogasawara M. Common and rare genetic variants predisposing females to unexplained recurrent pregnancy loss. Nat Commun 2024; 15:5744. [PMID: 39019884 PMCID: PMC11255296 DOI: 10.1038/s41467-024-49993-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 06/25/2024] [Indexed: 07/19/2024] Open
Abstract
Recurrent pregnancy loss (RPL) is a major reproductive health issue with multifactorial causes, affecting 2.6% of all pregnancies worldwide. Nearly half of the RPL cases lack clinically identifiable causes (e.g., antiphospholipid syndrome, uterine anomalies, and parental chromosomal abnormalities), referred to as unexplained RPL (uRPL). Here, we perform a genome-wide association study focusing on uRPL in 1,728 cases and 24,315 female controls of Japanese ancestry. We detect significant associations in the major histocompatibility complex (MHC) region at 6p21 (lead variant = rs9263738; P = 1.4 × 10^-10; odds ratio [OR] = 1.51 [95% CI: 1.33-1.72]; risk allele frequency = 0.871). The MHC associations are fine-mapped to the classical HLA alleles, HLA-C*12:02, HLA-B*52:01, and HLA-DRB1*15:02 (P = 1.1 × 10^-10, 1.5 × 10^-10, and 1.2 × 10^-9, respectively), which constitute a population-specific common long-range haplotype with a protective effect (P = 2.8 × 10^-10; OR = 0.65 [95% CI: 0.57-0.75]; haplotype frequency = 0.108). Genome-wide copy-number variation (CNV) calling demonstrates rare predicted loss-of-function (pLoF) variants of the cadherin-11 gene (CDH11) conferring the risk of uRPL (P = 1.3 × 10^-4; OR = 3.29 [95% CI: 1.78-5.76]). Our study highlights the importance of reproductive immunology and rare variants in the uRPL etiology.
Affiliation(s)
- Kyuto Sonehara
  - Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan
  - Department of Statistical Genetics, Osaka University Graduate School of Medicine, Osaka, Suita, Japan
  - Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yoshitaka Yano
  - Department of Obstetrics and Gynecology, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Tatsuhiko Naito
  - Department of Statistical Genetics, Osaka University Graduate School of Medicine, Osaka, Suita, Japan
  - Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Shinobu Goto
  - Department of Obstetrics and Gynecology, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Hiroyuki Yoshihara
  - Department of Obstetrics and Gynecology, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Takahiro Otani
  - Department of Public Health, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Fumiko Ozawa
  - Department of Obstetrics and Gynecology, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Tamao Kitaori
  - Department of Obstetrics and Gynecology, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Koichi Matsuda
  - Laboratory of Genome Technology, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
  - Laboratory of Clinical Genome Sequencing, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, the University of Tokyo, Tokyo, Japan
- Takashi Nishiyama
  - Department of Public Health, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Yukinori Okada
  - Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan
  - Department of Statistical Genetics, Osaka University Graduate School of Medicine, Osaka, Suita, Japan
  - Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Osaka, Suita, Japan
  - Premium Research Institute for Human Metaverse Medicine (WPI-PRIMe), Osaka University, Osaka, Suita, Japan
- Mayumi Sugiura-Ogasawara
  - Department of Obstetrics and Gynecology, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
17
Liu A, Zhou L, Huang Y, Peng D. Analysis of copy number variants detected by sequencing in spontaneous abortion. Mol Cytogenet 2024; 17:13. [PMID: 38764094 PMCID: PMC11103966 DOI: 10.1186/s13039-024-00683-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Accepted: 05/13/2024] [Indexed: 05/21/2024] Open
Abstract
BACKGROUND: Spontaneous abortion (SA), which affects approximately 15-20% of pregnancies, is the most common complication of early pregnancy. Pathogenic copy number variations (CNVs) are recognized as potential genetic causes of SA. However, CNVs classified as variants of uncertain significance (VOUS) have been identified in products of conception (POCs), and their correlation with SA remains uncertain. RESULTS: Of 189 spontaneous abortion cases, trisomy 16 was the most common numerical chromosome abnormality, followed by monosomy X. CNVs most often occurred on chromosomes 4 and 8. Gene Ontology and signaling pathway analysis revealed significant enrichment of genes related to nervous system development, transmembrane transport, cell adhesion, and structural components of chromatin. Furthermore, genes within the VOUS CNVs were screened by integrating human placental expression profiles, PhyloP scores, and Residual Variance Intolerance Score (RVIS) percentiles to identify potential candidate genes associated with spontaneous abortion. Fourteen potential candidate genes (LZTR1, TSHZ1, AMIGO2, H1-4, H2BC4, H2AC7, H3C8, H4C3, H3C6, PHKG2, PRR14, RNF40, SRCAP, ZNF629) were identified. Variations in LZTR1, TSHZ1, and H4C3 may contribute to embryonic lethality. CONCLUSIONS: CNV sequencing (CNV-seq) analysis is an effective technique for detecting chromosomal abnormalities in POCs and identifying potential candidate genes for SA.
Affiliation(s)
- Anhui Liu
  - Hengyang Medical School, University of South China, Hengyang, 421000, China
- Liyuan Zhou
  - Hunan Provincial Key Laboratory of Regional Hereditary Birth Defects Prevention and Control, Changsha Hospital for Maternal & Child Health Care Affiliated to Hunan Normal University, Changsha, 410000, China
- Yazhou Huang
  - Department of Medical Genetics, Xiangya School of Medicine, Changde Hospital, Central South University (The First People's Hospital of Changde city), Changde, 415000, China
- Dan Peng
  - Hengyang Medical School, University of South China, Hengyang, 421000, China
  - Department of Medical Genetics, Xiangya School of Medicine, Changde Hospital, Central South University (The First People's Hospital of Changde city), Changde, 415000, China
18
Rossen J, Shi H, Strober BJ, Zhang MJ, Kanai M, McCaw ZR, Liang L, Weissbrod O, Price AL. MultiSuSiE improves multi-ancestry fine-mapping in All of Us whole-genome sequencing data. medRxiv 2024:2024.05.13.24307291. [PMID: 38798542 PMCID: PMC11118590 DOI: 10.1101/2024.05.13.24307291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Leveraging data from multiple ancestries can greatly improve fine-mapping power due to differences in linkage disequilibrium and allele frequencies. We propose MultiSuSiE, an extension of the sum of single effects model (SuSiE) to multiple ancestries that allows causal effect sizes to vary across ancestries based on a multivariate normal prior informed by empirical data. We evaluated MultiSuSiE via simulations and analyses of 14 quantitative traits leveraging whole-genome sequencing data in 47k African-ancestry and 94k European-ancestry individuals from All of Us. In simulations, MultiSuSiE applied to Afr47k+Eur47k was well-calibrated and attained higher power than SuSiE applied to Eur94k; interestingly, higher causal variant PIPs in Afr47k compared to Eur47k were entirely explained by differences in the extent of LD quantified by LD 4th moments. Compared to very recently proposed multi-ancestry fine-mapping methods, MultiSuSiE attained higher power and/or much lower computational costs, making the analysis of large-scale All of Us data feasible. In real trait analyses, MultiSuSiE applied to Afr47k+Eur94k identified 579 fine-mapped variants with PIP > 0.5, and MultiSuSiE applied to Afr47k+Eur47k identified 44% more fine-mapped variants with PIP > 0.5 than SuSiE applied to Eur94k. We validated MultiSuSiE results for real traits via functional enrichment of fine-mapped variants. We highlight several examples where MultiSuSiE implicates well-studied or biologically plausible fine-mapped variants that were not implicated by other methods.
19
Hujoel MLA, Handsaker RE, Sherman MA, Kamitaki N, Barton AR, Mukamel RE, Terao C, McCarroll SA, Loh PR. Protein-altering variants at copy number-variable regions influence diverse human phenotypes. Nat Genet 2024; 56:569-578. [PMID: 38548989 PMCID: PMC11018521 DOI: 10.1038/s41588-024-01684-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Accepted: 02/08/2024] [Indexed: 04/09/2024]
Abstract
Copy number variants (CNVs) are among the largest genetic variants, yet CNVs have not been effectively ascertained in most genetic association studies. Here we ascertained protein-altering CNVs from UK Biobank whole-exome sequencing data (n = 468,570) using haplotype-informed methods capable of detecting subexonic CNVs and variation within segmental duplications. Incorporating CNVs into analyses of rare variants predicted to cause gene loss of function (LOF) identified 100 associations of predicted LOF variants with 41 quantitative traits. A low-frequency partial deletion of RGL3 exon 6 conferred one of the strongest protective effects of gene LOF on hypertension risk (odds ratio = 0.86 (0.82-0.90)). Protein-coding variation in rapidly evolving gene families within segmental duplications (previously invisible to most analysis methods) generated some of the human genome's largest contributions to variation in type 2 diabetes risk, chronotype and blood cell traits. These results illustrate the potential for new genetic insights from genomic variation that has escaped large-scale analysis to date.
Affiliation(s)
- Margaux L A Hujoel
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Robert E Handsaker
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Boston, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- Maxwell A Sherman
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Serinus Biosciences Inc., New York, NY, USA
- Nolan Kamitaki
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Alison R Barton
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Ronen E Mukamel
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Chikashi Terao
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
  - Department of Applied Genetics, School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
- Steven A McCarroll
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Boston, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- Po-Ru Loh
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
20
Lappalainen T, Li YI, Ramachandran S, Gusev A. Genetic and molecular architecture of complex traits. Cell 2024; 187:1059-1075. [PMID: 38428388 PMCID: PMC10977002 DOI: 10.1016/j.cell.2024.01.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/20/2023] [Accepted: 01/16/2024] [Indexed: 03/03/2024]
Abstract
Human genetics has emerged as one of the most dynamic areas of biology, with a broadening societal impact. In this review, we discuss recent achievements, ongoing efforts, and future challenges in the field. Advances in technology, statistical methods, and the growing scale of research efforts have all provided many insights into the processes that have given rise to the current patterns of genetic variation. Vast maps of genetic associations with human traits and diseases have allowed characterization of their genetic architecture. Finally, studies of molecular and cellular effects of genetic variants have provided insights into biological processes underlying disease. Many outstanding questions remain, but the field is well poised for groundbreaking discoveries as it increases the use of genetic data to understand both the history of our species and its applications to improve human health.
Affiliation(s)
- Tuuli Lappalainen
  - New York Genome Center, New York, NY, USA; Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Yang I Li
  - Section of Genetic Medicine, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Sohini Ramachandran
  - Ecology, Evolution and Organismal Biology, Center for Computational Molecular Biology, and the Data Science Institute, Brown University, Providence, RI 029129, USA
- Alexander Gusev
  - Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
21
Benfica LF, Brito LF, do Bem RD, Mulim HA, Glessner J, Braga LG, Gloria LS, Cyrillo JNSG, Bonilha SFM, Mercadante MEZ. Genome-wide association study between copy number variation and feeding behavior, feed efficiency, and growth traits in Nellore cattle. BMC Genomics 2024; 25:54. [PMID: 38212678 PMCID: PMC10785391 DOI: 10.1186/s12864-024-09976-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/17/2023] [Accepted: 01/04/2024] [Indexed: 01/13/2024] Open
Abstract
BACKGROUND: Feeding costs represent the largest expenditures in beef production. Therefore, the efficiency with which animals convert feed into high-quality protein for human consumption plays a major role in the environmental impact of the beef industry and in beef producers' profitability. In this context, breeding animals for improved feed efficiency through genomic selection has been considered a strategic practice in modern breeding programs around the world. Copy number variation (CNV) is a less-studied source of genetic variation that can contribute to phenotypic variability in complex traits. This study aimed to: (1) identify CNVs and CNV regions (CNVRs) in the genome of Nellore cattle (Bos taurus indicus); (2) assess potential associations between the identified CNVRs and weaning weight (W210), body weight measured at the time of selection (WSel), average daily gain (ADG), dry matter intake (DMI), residual feed intake (RFI), time spent at the feed bunk (TF), and frequency of visits to the feed bunk (FF); and (3) perform functional enrichment analyses of the significant CNVRs identified for each of the traits evaluated. RESULTS: A total of 3,161 CNVs and 561 CNVRs ranging from 4,973 bp to 3,215,394 bp were identified. The CNVRs covered up to 99,221,894 bp (3.99%) of the Nellore autosomal genome. Seventeen CNVRs were significantly associated with dry matter intake and feeding frequency (number of daily visits to the feed bunk). The functional annotation of the associated CNVRs revealed important candidate genes related to metabolism that may be associated with the phenotypic expression of the evaluated traits. Furthermore, Gene Ontology (GO) analyses revealed 19 enrichment processes associated with FF. CONCLUSIONS: A total of 3,161 CNVs and 561 CNVRs were identified and characterized in a Nellore cattle population. Various CNVRs were significantly associated with DMI and FF, indicating that CNVs play an important role in key biological pathways and in the phenotypic expression of feeding behavior and growth traits in Nellore cattle.
Affiliation(s)
- Lorena F Benfica
  - Department of Animal Sciences, Purdue University, 270 S. Russell Street, West Lafayette, IN, 47907, USA
  - Department of Animal Science, Faculty of Agricultural and Veterinary Sciences, Sao Paulo State University, Jaboticabal, SP, Brazil
- Luiz F Brito
  - Department of Animal Sciences, Purdue University, 270 S. Russell Street, West Lafayette, IN, 47907, USA
- Ricardo D do Bem
  - Department of Animal Science, Faculty of Agricultural and Veterinary Sciences, Sao Paulo State University, Jaboticabal, SP, Brazil
- Henrique A Mulim
  - Department of Animal Sciences, Purdue University, 270 S. Russell Street, West Lafayette, IN, 47907, USA
- Joseph Glessner
  - Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Larissa G Braga
  - Department of Animal Science, Faculty of Agricultural and Veterinary Sciences, Sao Paulo State University, Jaboticabal, SP, Brazil
- Leonardo S Gloria
  - Department of Animal Sciences, Purdue University, 270 S. Russell Street, West Lafayette, IN, 47907, USA
22
Auwerx C, Jõeloo M, Sadler MC, Tesio N, Ojavee S, Clark CJ, Mägi R, Reymond A, Kutalik Z. Rare copy-number variants as modulators of common disease susceptibility. Genome Med 2024; 16:5. [PMID: 38185688 PMCID: PMC10773105 DOI: 10.1186/s13073-023-01265-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 11/27/2023] [Indexed: 01/09/2024] Open
Abstract
BACKGROUND Copy-number variations (CNVs) have been associated with rare and debilitating genomic disorders (GDs), but their impact on health later in life in the general population remains poorly described. METHODS Assessing four modes of CNV action, we performed genome-wide association scans (GWASs) between the copy-number of CNV-proxy probes and 60 curated ICD-10 based clinical diagnoses in 331,522 unrelated white British UK Biobank (UKBB) participants with replication in the Estonian Biobank. RESULTS We identified 73 signals involving 40 diseases, all of which indicated that CNVs increased disease risk and caused earlier onset. We estimated that 16% of these associations are indirect, acting by increasing body mass index (BMI). Signals mapped to 45 unique, non-overlapping regions, nine of which were linked to known GDs. The number and identity of genes affected by CNVs modulated their pathogenicity, with many associations being supported by colocalization with both common and rare single-nucleotide variant association signals. Dissection of association signals provided insights into the epidemiology of known gene-disease pairs (e.g., deletions in BRCA1 and LDLR increased risk for ovarian cancer and ischemic heart disease, respectively), clarified dosage mechanisms of action (e.g., both increased and decreased dosage of 17q12 impacted renal health), and identified putative causal genes (e.g., ABCC6 for kidney stones). Characterization of the pleiotropic pathological consequences of recurrent CNVs at 15q13, 16p13.11, 16p12.2, and 22q11.2 in adulthood indicated variable expressivity of these regions and the involvement of multiple genes. Finally, we show that while the total burden of rare CNVs, and especially deletions, was strongly associated with disease risk, it only accounted for ~0.02% of the UKBB disease burden. These associations are mainly driven by CNVs at known GD CNV regions, whose pleiotropic effect on common diseases was broader than anticipated by our CNV-GWAS. CONCLUSIONS Our results shed light on the prominent role of rare CNVs in determining common disease susceptibility within the general population and provide actionable insights for anticipating later-onset comorbidities in carriers of recurrent CNVs.
Affiliation(s)
- Chiara Auwerx
- Center for Integrative Genomics, University of Lausanne, Genopode building, 1015, Lausanne, Switzerland.
- Department of Computational Biology, University of Lausanne, Genopode building, 1015, Lausanne, Switzerland.
- Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland.
- University Center for Primary Care and Public Health, 1005, Lausanne, Switzerland.
| | - Maarja Jõeloo
- Institute of Molecular and Cell Biology, University of Tartu, 51010, Tartu, Estonia
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 51010, Tartu, Estonia
| | - Marie C Sadler
- Department of Computational Biology, University of Lausanne, Genopode building, 1015, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- University Center for Primary Care and Public Health, 1005, Lausanne, Switzerland
| | - Nicolò Tesio
- Center for Integrative Genomics, University of Lausanne, Genopode building, 1015, Lausanne, Switzerland
| | - Sven Ojavee
- Department of Computational Biology, University of Lausanne, Genopode building, 1015, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
| | - Charlie J Clark
- Center for Integrative Genomics, University of Lausanne, Genopode building, 1015, Lausanne, Switzerland
| | - Reedik Mägi
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 51010, Tartu, Estonia
| | - Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, Genopode building, 1015, Lausanne, Switzerland.
| | - Zoltán Kutalik
- Department of Computational Biology, University of Lausanne, Genopode building, 1015, Lausanne, Switzerland.
- Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland.
- University Center for Primary Care and Public Health, 1005, Lausanne, Switzerland.
|
23
|
Robinson D, Vanacloig-Pedros E, Cai R, Place M, Hose J, Gasch AP. Gene-by-environment interactions influence the fitness cost of gene copy-number variation in yeast. G3 (Bethesda) 2023; 13:jkad159. [PMID: 37481264 PMCID: PMC10542507 DOI: 10.1093/g3journal/jkad159] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/11/2023] [Revised: 05/11/2023] [Accepted: 07/12/2023] [Indexed: 07/24/2023]
Abstract
Variation in gene copy number can alter gene expression and influence downstream phenotypes; thus copy-number variation provides a route for rapid evolution if the benefits outweigh the cost. We recently showed that genetic background significantly influences how yeast cells respond to gene overexpression, revealing that the fitness costs of copy-number variation can vary substantially with genetic background in a common-garden environment. But the interplay between copy-number variation tolerance and environment remains unexplored on a genomic scale. Here, we measured the tolerance to gene overexpression in four genetically distinct Saccharomyces cerevisiae strains grown under sodium chloride stress. Overexpressed genes that are commonly deleterious during sodium chloride stress recapitulated those commonly deleterious under standard conditions. However, sodium chloride stress uncovered novel differences in strain responses to gene overexpression. West African strain NCYC3290 and North American oak isolate YPS128 are more sensitive to sodium chloride stress than vineyard strain BC187 and laboratory strain BY4743. Consistently, NCYC3290 and YPS128 showed the greatest sensitivities to overexpression of specific genes. Although most genes were deleterious, hundreds were beneficial when overexpressed; remarkably, most of these effects were strain specific. Few beneficial genes were shared between the sodium chloride-sensitive isolates, implicating mechanistic differences behind their sodium chloride sensitivity. Transcriptomic analysis suggested underlying vulnerabilities and tolerances across strains, and pointed to natural copy-number variation of a sodium export pump that likely contributes to strain-specific responses to overexpression of other genes. Our results reveal extensive strain-by-environment interactions in the response to gene copy-number variation, raising important implications for the accessibility of copy-number variation-dependent evolutionary routes in times of stress.
Affiliation(s)
- DeElegant Robinson
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53704, USA
| | - Elena Vanacloig-Pedros
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53704, USA
- Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53704, USA
| | - Ruoyi Cai
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53704, USA
| | - Michael Place
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53704, USA
- Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53704, USA
| | - James Hose
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53704, USA
| | - Audrey P Gasch
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53704, USA
- Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53704, USA
- Department of Medical Genetics, University of Wisconsin-Madison, Madison, WI 53704, USA
|
24
|
Meadows JRS, Kidd JM, Wang GD, Parker HG, Schall PZ, Bianchi M, Christmas MJ, Bougiouri K, Buckley RM, Hitte C, Nguyen AK, Wang C, Jagannathan V, Niskanen JE, Frantz LAF, Arumilli M, Hundi S, Lindblad-Toh K, Ginja C, Agustina KK, André C, Boyko AR, Davis BW, Drögemüller M, Feng XY, Gkagkavouzis K, Iliopoulos G, Harris AC, Hytönen MK, Kalthoff DC, Liu YH, Lymberakis P, Poulakakis N, Pires AE, Racimo F, Ramos-Almodovar F, Savolainen P, Venetsani S, Tammen I, Triantafyllidis A, vonHoldt B, Wayne RK, Larson G, Nicholas FW, Lohi H, Leeb T, Zhang YP, Ostrander EA. Genome sequencing of 2000 canids by the Dog10K consortium advances the understanding of demography, genome function and architecture. Genome Biol 2023; 24:187. [PMID: 37582787 PMCID: PMC10426128 DOI: 10.1186/s13059-023-03023-7] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 11/23/2022] [Accepted: 07/25/2023] [Indexed: 08/17/2023] Open
Abstract
BACKGROUND The international Dog10K project aims to sequence and analyze several thousand canine genomes. Incorporating 20× data from 1987 individuals, including 1611 dogs (321 breeds), 309 village dogs, 63 wolves, and four coyotes, we identify genomic variation across the canid family, setting the stage for detailed studies of domestication, behavior, morphology, disease susceptibility, and genome architecture and function. RESULTS We report the analysis of >48 M single-nucleotide, indel, and structural variants spanning the autosomes, X chromosome, and mitochondria. We discover more than 75% of variation for 239 sampled breeds. Allele sharing analysis indicates that 94.9% of breeds form monophyletic clusters and 25 major clades. German Shepherd Dogs and related breeds show the highest allele sharing with independent breeds from multiple clades. On average, each breed dog differs from the UU_Cfam_GSD_1.0 reference at 26,960 deletions and 14,034 insertions greater than 50 bp, with wolves having 14% more variants. Discovered variants include retrogene insertions from 926 parent genes. To aid functional prioritization, single-nucleotide variants were annotated with SnpEff and Zoonomia phyloP constraint scores. Constrained positions were negatively correlated with allele frequency. Finally, the utility of the Dog10K data as an imputation reference panel is assessed, generating high-confidence calls across varied genotyping platform densities, including for breeds not included in the Dog10K collection. CONCLUSIONS We have developed a dense dataset of 1987 sequenced canids that reveals patterns of allele sharing, identifies likely functional variants, informs breed structure, and enables accurate imputation. Dog10K data are publicly available.
Affiliation(s)
- Jennifer R S Meadows
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75132, Uppsala, Sweden.
| | - Jeffrey M Kidd
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48107, USA.
| | - Guo-Dong Wang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
| | - Heidi G Parker
- National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Building 50 Room 5351, Bethesda, MD, 20892, USA
| | - Peter Z Schall
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48107, USA
| | - Matteo Bianchi
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75132, Uppsala, Sweden
| | - Matthew J Christmas
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75132, Uppsala, Sweden
| | - Katia Bougiouri
- Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Øster Voldgade 5-7, 1350, Copenhagen, Denmark
| | - Reuben M Buckley
- National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Building 50 Room 5351, Bethesda, MD, 20892, USA
| | - Christophe Hitte
- University of Rennes, CNRS, Institute Genetics and Development Rennes - UMR6290, 35000, Rennes, France
| | - Anthony K Nguyen
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48107, USA
| | - Chao Wang
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75132, Uppsala, Sweden
| | - Vidhya Jagannathan
- Institute of Genetics, Vetsuisse Faculty, University of Bern, 3001, Bern, Switzerland
| | - Julia E Niskanen
- Department of Medical and Clinical Genetics, Department of Veterinary Biosciences, University of Helsinki and Folkhälsan Research Center, 02900, Helsinki, Finland
| | - Laurent A F Frantz
- School of Biological and Behavioural Sciences, Queen Mary University of London, London E14NS, UK and Palaeogenomics Group, Department of Veterinary Sciences, Ludwig Maximilian University, D-80539, Munich, Germany
| | - Meharji Arumilli
- Department of Medical and Clinical Genetics, Department of Veterinary Biosciences, University of Helsinki and Folkhälsan Research Center, 02900, Helsinki, Finland
| | - Sruthi Hundi
- Department of Medical and Clinical Genetics, Department of Veterinary Biosciences, University of Helsinki and Folkhälsan Research Center, 02900, Helsinki, Finland
| | - Kerstin Lindblad-Toh
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75132, Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Catarina Ginja
- BIOPOLIS-CIBIO-InBIO-Centro de Investigação Em Biodiversidade E Recursos Genéticos - ArchGen Group, Universidade Do Porto, 4485-661, Vairão, Portugal
| | | | - Catherine André
- University of Rennes, CNRS, Institute Genetics and Development Rennes - UMR6290, 35000, Rennes, France
| | - Adam R Boyko
- Department of Biomedical Sciences, Cornell University, 930 Campus Road, Ithaca, NY, 14853, USA
| | - Brian W Davis
- Department of Veterinary Integrative Biosciences, School of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX, 77843, USA
| | - Michaela Drögemüller
- Institute of Genetics, Vetsuisse Faculty, University of Bern, 3001, Bern, Switzerland
| | - Xin-Yao Feng
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
| | - Konstantinos Gkagkavouzis
- Department of Genetics, School of Biology, Aristotle University of Thessaloniki, Thessaloniki, Macedonia 54124, Greece and Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation (CIRI-AUTH), Balkan Center, Thessaloniki, Greece
| | - Giorgos Iliopoulos
- NGO "Callisto", Wildlife and Nature Conservation Society, 54621, Thessaloniki, Greece
| | - Alexander C Harris
- National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Building 50 Room 5351, Bethesda, MD, 20892, USA
| | - Marjo K Hytönen
- Department of Medical and Clinical Genetics, Department of Veterinary Biosciences, University of Helsinki and Folkhälsan Research Center, 02900, Helsinki, Finland
| | - Daniela C Kalthoff
- NGO "Callisto", Wildlife and Nature Conservation Society, 54621, Thessaloniki, Greece
| | - Yan-Hu Liu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
| | - Petros Lymberakis
- Natural History Museum of Crete & Department of Biology, University of Crete, 71202, Irakleio, Greece
- Biology Department, School of Sciences and Engineering, University of Crete, Heraklion, Greece
- Palaeogenomics and Evolutionary Genetics Lab, Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology - Hellas (FORTH), Heraklion, Greece
| | - Nikolaos Poulakakis
- Natural History Museum of Crete & Department of Biology, University of Crete, 71202, Irakleio, Greece
- Biology Department, School of Sciences and Engineering, University of Crete, Heraklion, Greece
- Palaeogenomics and Evolutionary Genetics Lab, Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology - Hellas (FORTH), Heraklion, Greece
| | - Ana Elisabete Pires
- BIOPOLIS-CIBIO-InBIO-Centro de Investigação Em Biodiversidade E Recursos Genéticos - ArchGen Group, Universidade Do Porto, 4485-661, Vairão, Portugal
| | - Fernando Racimo
- Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Øster Voldgade 5-7, 1350, Copenhagen, Denmark
| | | | - Peter Savolainen
- Department of Gene Technology, Science for Life Laboratory, KTH - Royal Institute of Technology, 17121, Solna, Sweden
| | - Semina Venetsani
- Department of Genetics, School of Biology, Aristotle University of Thessaloniki, 54124, Thessaloniki, Macedonia, Greece
| | - Imke Tammen
- Sydney School of Veterinary Science, The University of Sydney, Sydney, NSW, 2570, Australia
| | - Alexandros Triantafyllidis
- Department of Genetics, School of Biology, Aristotle University of Thessaloniki, Thessaloniki, Macedonia 54124, Greece and Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation (CIRI-AUTH), Balkan Center, Thessaloniki, Greece
| | - Bridgett vonHoldt
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, 08544, USA
| | - Robert K Wayne
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, 90095-7246, USA
| | - Greger Larson
- Palaeogenomics and Bio-Archaeology Research Network, School of Archaeology, University of Oxford, Oxford, OX1 3TG, UK
| | - Frank W Nicholas
- Sydney School of Veterinary Science, The University of Sydney, Sydney, NSW, 2570, Australia
| | - Hannes Lohi
- Department of Medical and Clinical Genetics, Department of Veterinary Biosciences, University of Helsinki and Folkhälsan Research Center, 02900, Helsinki, Finland
| | - Tosso Leeb
- Institute of Genetics, Vetsuisse Faculty, University of Bern, 3001, Bern, Switzerland
| | - Ya-Ping Zhang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
| | - Elaine A Ostrander
- National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Building 50 Room 5351, Bethesda, MD, 20892, USA.
|
25
|
Chumakova OS, Baulina NM. Advanced searching for hypertrophic cardiomyopathy heritability in real practice tomorrow. Front Cardiovasc Med 2023; 10:1236539. [PMID: 37583586 PMCID: PMC10425241 DOI: 10.3389/fcvm.2023.1236539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 07/17/2023] [Indexed: 08/17/2023] Open
Abstract
Hypertrophic cardiomyopathy (HCM) is the most common inherited cardiac disease and is associated with morbidity and mortality at any age. As studies in recent decades have shown, the genetic architecture of HCM is quite complex, both in the entire population and in each patient. In the rapidly advancing era of gene therapy, we have to provide a detailed molecular diagnosis to our patients to give them the chance of better and more personalized treatment. In addition to emphasizing the importance of genetic testing in routine practice, this review discusses the possibility of going a step further and creating an expanded genetic panel that contains not only variants in core genes but also new candidate genes, variants located in deep intronic regions, and structural variations. It also highlights the benefits of calculating polygenic risk scores for each patient based on a combination of rare and common genetic variants, and of using non-genetic HCM markers, such as microRNAs, that can enhance risk stratification for HCM in unselected populations alongside rare genetic variants and clinical factors. Although this review focuses on HCM, the issues discussed are relevant to other cardiomyopathies.
Affiliation(s)
- Olga S. Chumakova
- Laboratory of Functional Genomics of Cardiovascular Diseases, National Medical Research Centre of Cardiology Named After E.I. Chazov, Moscow, Russia
|
26
|
Robinson D, Vanacloig-Pedros E, Cai R, Place M, Hose J, Gasch AP. Gene-by-environment interactions influence the fitness cost of gene copy-number variation in yeast. bioRxiv 2023:2023.05.11.540375. [PMID: 37503218 PMCID: PMC10369901 DOI: 10.1101/2023.05.11.540375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Variation in gene copy number can alter gene expression and influence downstream phenotypes; thus copy-number variation (CNV) provides a route for rapid evolution if the benefits outweigh the cost. We recently showed that genetic background significantly influences how yeast cells respond to gene over-expression (OE), revealing that the fitness costs of CNV can vary substantially with genetic background in a common-garden environment. But the interplay between CNV tolerance and environment remains unexplored on a genomic scale. Here we measured the tolerance to gene OE in four genetically distinct Saccharomyces cerevisiae strains grown under sodium chloride (NaCl) stress. OE genes that are commonly deleterious during NaCl stress recapitulated those commonly deleterious under standard conditions. However, NaCl stress uncovered novel differences in strain responses to gene OE. West African strain NCYC3290 and North American oak isolate YPS128 are more sensitive to NaCl stress than vineyard strain BC187 and laboratory strain BY4743. Consistently, NCYC3290 and YPS128 showed the greatest sensitivities to gene OE. Although most genes were deleterious, hundreds were beneficial when overexpressed; remarkably, most of these effects were strain specific. Few beneficial genes were shared between the NaCl-sensitive isolates, implicating mechanistic differences behind their NaCl sensitivity. Transcriptomic analysis suggested underlying vulnerabilities and tolerances across strains, and pointed to natural CNV of a sodium export pump that likely contributes to strain-specific responses to OE of other genes. Our results reveal extensive strain-by-environment interaction in the response to gene CNV, raising important implications for the accessibility of CNV-dependent evolutionary routes in times of stress.
Affiliation(s)
- DeElegant Robinson
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison WI 53704
| | - Elena Vanacloig-Pedros
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison WI 53704
- Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison WI 53704
| | - Ruoyi Cai
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison WI 53704
| | - Michael Place
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison WI 53704
- Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison WI 53704
| | - James Hose
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison WI 53704
| | - Audrey P Gasch
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison WI 53704
- Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison WI 53704
- Department of Medical Genetics, University of Wisconsin-Madison, Madison WI 53704
|
27
|
Hujoel ML, Handsaker RE, Sherman MA, Kamitaki N, Barton AR, Mukamel RE, Terao C, McCarroll SA, Loh PR. Hidden protein-altering variants influence diverse human phenotypes. bioRxiv 2023:2023.06.07.544066. [PMID: 37333244 PMCID: PMC10274781 DOI: 10.1101/2023.06.07.544066] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/20/2023]
Abstract
Structural variants (SVs) comprise the largest genetic variants, altering from 50 base pairs to megabases of DNA. However, SVs have not been effectively ascertained in most genetic association studies, leaving a key gap in our understanding of human complex trait genetics. We ascertained protein-altering SVs from UK Biobank whole-exome sequencing data (n = 468,570) using haplotype-informed methods capable of detecting sub-exonic SVs and variation within segmental duplications. Incorporating SVs into analyses of rare variants predicted to cause gene loss-of-function (pLoF) identified 100 associations of pLoF variants with 41 quantitative traits. A low-frequency partial deletion of RGL3 exon 6 appeared to confer one of the strongest protective effects of gene LoF on hypertension risk (OR = 0.86 [0.82-0.90]). Protein-coding variation in rapidly-evolving gene families within segmental duplications, previously invisible to most analysis methods, appeared to generate some of the human genome's largest contributions to variation in type 2 diabetes risk, chronotype, and blood cell traits. These results illustrate the potential for new genetic insights from genomic variation that has escaped large-scale analysis to date.
Affiliation(s)
- Margaux L.A. Hujoel
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Robert E. Handsaker
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard University, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Maxwell A. Sherman
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Nolan Kamitaki
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Alison R. Barton
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Ronen E. Mukamel
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- Department of Applied Genetics, School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
| | - Steven A. McCarroll
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard University, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Po-Ru Loh
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
|
28
|
Fabo T, Khavari P. Functional characterization of human genomic variation linked to polygenic diseases. Trends Genet 2023; 39:462-490. [PMID: 36997428 PMCID: PMC11025698 DOI: 10.1016/j.tig.2023.02.014] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 12/01/2022] [Revised: 02/22/2023] [Accepted: 02/23/2023] [Indexed: 03/30/2023]
Abstract
The burden of human disease lies predominantly in polygenic diseases. Since the early 2000s, genome-wide association studies (GWAS) have identified genetic variants and loci associated with complex traits. These have ranged from variants in coding sequences to mutations in regulatory regions, such as promoters and enhancers, as well as mutations affecting mediators of mRNA stability and other downstream regulators, such as 5' and 3'-untranslated regions (UTRs), long noncoding RNA (lncRNA), and miRNA. Recent research advances in genetics have utilized a combination of computational techniques, high-throughput in vitro and in vivo screening modalities, and precise genome editing to impute the function of diverse classes of genetic variants identified through GWAS. In this review, we highlight the vastness of genomic variants associated with polygenic disease risk and address recent advances in how genetic tools can be used to functionally characterize them.
Affiliation(s)
- Tania Fabo
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA; Stanford Cancer Institute, Stanford University, Stanford, CA, USA; Graduate Program in Genetics, Stanford University, Stanford, CA, USA; Stanford University School of Medicine, Stanford University, Stanford, CA, USA
| | - Paul Khavari
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA; Stanford Cancer Institute, Stanford University, Stanford, CA, USA; Graduate Program in Genetics, Stanford University, Stanford, CA, USA; Stanford University School of Medicine, Stanford University, Stanford, CA, USA; Veterans Affairs Palo Alto Healthcare System, Palo Alto, CA, USA.
|
29
|
Conery M, Grant SFA. Human height: a model common complex trait. Ann Hum Biol 2023; 50:258-266. [PMID: 37343163 PMCID: PMC10368389 DOI: 10.1080/03014460.2023.2215546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 04/10/2023] [Accepted: 05/09/2023] [Indexed: 06/23/2023]
Abstract
CONTEXT Like other complex phenotypes, human height reflects a combination of environmental and genetic factors, but is notable for being exceptionally easy to measure. Height has therefore been commonly used to make observations that are later generalised to other phenotypes, though the appropriateness of such generalisations is not always considered. OBJECTIVES We aimed to assess height's suitability as a model for other complex phenotypes and review recent advances in height genetics with regard to their implications for complex phenotypes more broadly. METHODS We conducted a comprehensive literature search in PubMed and Google Scholar for articles relevant to the genetics of height and its comparability to other phenotypes. RESULTS Height is broadly similar to other phenotypes apart from its high heritability and ease of measurement. Recent genome-wide association studies (GWAS) have identified over 12,000 independent signals associated with height and have saturated height's common single-nucleotide-polymorphism-based heritability within a subset of the genome in individuals similar to European reference populations. CONCLUSIONS Given the similarity of height to other complex traits, the saturation of GWAS's ability to discover additional height-associated variants signals potential limitations to the omnigenic model of complex-phenotype inheritance, indicates the likely future power of polygenic scores and risk scores, and highlights the increasing need for large-scale variant-to-gene mapping efforts.
Affiliation(s)
- Mitchell Conery
- Division of Human Genetics, Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
- Department of Pharmacology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
- Struan F A Grant
- Division of Human Genetics, Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
- Division of Diabetes and Endocrinology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Institute for Diabetes, Obesity, and Metabolism, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
- Department of Genetics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
|
30
|
Sánchez S, Juárez U, Domínguez J, Molina B, Barrientos R, Martínez-Hernández A, Carnevale A, Grether-González P, Mayen DG, Villarroel C, Lieberman E, Yokoyama E, Del Castillo V, Torres L, Frias S. Frequent copy number variants in a cohort of Mexican-Mestizo individuals. Mol Cytogenet 2023; 16:2. [PMID: 36631885 PMCID: PMC9835318 DOI: 10.1186/s13039-022-00631-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Accepted: 12/13/2022] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND The human genome presents variation at distinct levels. Copy number variants (CNVs) are DNA segments of variable lengths that range from several base pairs to megabases and are present at a variable number of copies in human genomes. Common CNVs have no apparent influence on the phenotype; however, some rare CNVs have been associated with phenotypic traits, depending on their size and gene content. CNVs are detected by microarrays of different densities and are generally visualized, and their frequencies analysed, using the HapMap as the default reference population. Nevertheless, this default reference is inadequate when the samples analysed are from people from Mexico, since populations with a Hispanic genetic background are minimally represented. In this work, we describe the variation in the frequencies of four common CNVs in Mexican-Mestizo individuals. RESULTS In a cohort of 147 unrelated Mexican-Mestizo individuals, we found that the common CNVs 2p11.2 (99.6%), 8p11.22 (54.5%), 14q32.33 (100%), and 15q11.2 (71.1%) appeared with unexpectedly high frequencies when contrasted with the HapMap reference (ChAS). Yet, when compared to an ethnically related reference population, these differences were significantly reduced or even disappeared. CONCLUSION The findings in this work (1) contribute to a better description of the CNV characteristics of the Mexican-Mestizo population and enhance the knowledge of genome variation in different ethnic groups, and (2) emphasize the importance of contrasting CNVs identified in studied individuals against a reference group that, as best as possible, shares the same ethnicity, while keeping this relevant information in mind when conducting CNV studies at the population or clinical level.
Affiliation(s)
- Silvia Sánchez
- Laboratorio de Citogenética, Instituto Nacional de Pediatría, Insurgentes Sur 3700-C Insurgentes Cuicuilco, P01090, Ciudad de Mexico, México
- Posgrado en Ciencias Biológicas, Universidad Nacional Autónoma de México, Ciudad de México, México
- Ulises Juárez
- Laboratorio de Citogenética, Instituto Nacional de Pediatría, Insurgentes Sur 3700-C Insurgentes Cuicuilco, P01090, Ciudad de Mexico, México
- Julieta Domínguez
- Laboratorio de Citogenética, Instituto Nacional de Pediatría, Insurgentes Sur 3700-C Insurgentes Cuicuilco, P01090, Ciudad de Mexico, México
- Bertha Molina
- Laboratorio de Citogenética, Instituto Nacional de Pediatría, Insurgentes Sur 3700-C Insurgentes Cuicuilco, P01090, Ciudad de Mexico, México
- Rehotbevely Barrientos
- Laboratorio de Citogenética, Instituto Nacional de Pediatría, Insurgentes Sur 3700-C Insurgentes Cuicuilco, P01090, Ciudad de Mexico, México
- Angélica Martínez-Hernández
- Laboratorio de Inmunogenómica y Enfermedades Metabólicas, Instituto Nacional de Medicina Genómica, Ciudad de Mexico, México
- Alessandra Carnevale
- Laboratorio de Enfermedades Mendelianas, Instituto Nacional de Medicina Genómica, Ciudad de Mexico, México
- Patricia Grether-González
- Departamento de Genética y Genómica Humana, Instituto Nacional de Perinatología, Ciudad de Mexico, México
- Centro Médico ABC, Campus Santa Fe, Ciudad de Mexico, México
- Dora Gilda Mayen
- Unidad de Genética Aplicada. Hospital Ángeles Lomas, Huixquilucan, Edo. de México, México
- Camilo Villarroel
- Genética Humana, Instituto Nacional de Pediatría, Ciudad de Mexico, México
- Esther Lieberman
- Genética Humana, Instituto Nacional de Pediatría, Ciudad de Mexico, México
- Emiy Yokoyama
- Genética Humana, Instituto Nacional de Pediatría, Ciudad de Mexico, México
- Leda Torres
- Laboratorio de Citogenética, Instituto Nacional de Pediatría, Insurgentes Sur 3700-C Insurgentes Cuicuilco, P01090, Ciudad de Mexico, México
- Sara Frias
- Laboratorio de Citogenética, Instituto Nacional de Pediatría, Insurgentes Sur 3700-C Insurgentes Cuicuilco, P01090, Ciudad de Mexico, México
- Departamento de Medicina Genómica y Toxicología Ambiental, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Ciudad de México, México
|