1
Aspiring toward equitable benefits from genomic advances to individuals of ancestrally diverse backgrounds. Am J Hum Genet 2024; 111:809-824. [PMID: 38642557 PMCID: PMC11080611 DOI: 10.1016/j.ajhg.2024.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 04/01/2024] [Accepted: 04/01/2024] [Indexed: 04/22/2024] Open
Abstract
Advancements in genomic technologies have shown remarkable promise for improving health trajectories. The Human Genome Project has catalyzed the integration of genomic tools into clinical practice, such as disease risk assessment, prenatal testing and reproductive genomics, cancer diagnostics and prognostication, and therapeutic decision making. Despite the promise of genomic technologies, their full potential remains untapped without including individuals of diverse ancestries and integrating social determinants of health (SDOHs). The NHGRI launched the 2020 Strategic Vision with ten bold predictions by 2030, including "individuals from ancestrally diverse backgrounds will benefit equitably from advances in human genomics." Meeting this goal requires a holistic approach that brings together genomic advancements with careful consideration of healthcare access as well as SDOHs to ensure that translation of genetics research is inclusive, affordable, and accessible and ultimately narrows rather than widens health disparities. With this prediction in mind, this review delves into the two paramount applications of genetic testing: reproductive genomics and precision oncology. When discussing these applications of genomic advancements, we evaluate current accessibility limitations, highlight challenges in achieving representativeness, and propose paths forward to realize the ultimate goal of their equitable applications.
2
Effects of urban-induced mutations on ecology, evolution and health. Nat Ecol Evol 2024:10.1038/s41559-024-02401-z. [PMID: 38641700 DOI: 10.1038/s41559-024-02401-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 03/13/2024] [Indexed: 04/21/2024]
Abstract
Increasing evidence suggests that urbanization is associated with higher mutation rates, which can affect the health and evolution of organisms that inhabit cities. Elevated pollution levels in urban areas can induce DNA damage, leading to de novo mutations. Studies on mutations induced by urban pollution are most prevalent in humans and microorganisms, whereas studies of non-human eukaryotes are rare, even though increased mutation rates have the potential to affect organisms and their populations in contemporary time. Our Perspective explores how higher mutation rates in urban environments could impact the fitness, ecology and evolution of populations. Most mutations will be neutral or deleterious, and higher mutation rates associated with elevated pollution in urban populations can increase the risk of cancer in humans and potentially other species. We highlight the potential for urban-driven increased deleterious mutational loads in some organisms, which could lead to a decline in population growth of a wide diversity of organisms. Although beneficial mutations are expected to be rare, we argue that higher mutation rates in urban areas could influence adaptive evolution, especially in organisms with short generation times. Finally, we explore avenues for future research to better understand the effects of urban-induced mutations on the fitness, ecology and evolution of city-dwelling organisms.
3
Epistasis between mutator alleles contributes to germline mutation spectrum variability in laboratory mice. eLife 2024; 12:RP89096. [PMID: 38381482 PMCID: PMC10942616 DOI: 10.7554/elife.89096] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/22/2024] Open
Abstract
Maintaining germline genome integrity is essential and enormously complex. Although many proteins are involved in DNA replication, proofreading, and repair, mutator alleles have largely eluded detection in mammals. DNA replication and repair proteins often recognize sequence motifs or excise lesions at specific nucleotides. Thus, we might expect that the spectrum of de novo mutations - the frequencies of C>T, A>G, etc. - will differ between genomes that harbor either a mutator or wild-type allele. Previously, we used quantitative trait locus mapping to discover candidate mutator alleles in the DNA repair gene Mutyh that increased the C>A germline mutation rate in a family of inbred mice known as the BXDs (Sasani et al., 2022, Ashbrook et al., 2021). In this study we developed a new method to detect alleles associated with mutation spectrum variation and applied it to mutation data from the BXDs. We discovered an additional C>A mutator locus on chromosome 6 that overlaps Ogg1, a DNA glycosylase involved in the same base-excision repair network as Mutyh (David et al., 2007). Its effect depends on the presence of a mutator allele near Mutyh, and BXDs with mutator alleles at both loci have greater numbers of C>A mutations than those with mutator alleles at either locus alone. Our new methods for analyzing mutation spectra reveal evidence of epistasis between germline mutator alleles and may be applicable to mutation data from humans and other model organisms.
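The idea of comparing mutation spectra between genomes that carry different alleles can be illustrated with a minimal sketch. The abstract does not say which distance measure the authors' method uses, so the cosine distance below is an assumption chosen for illustration, and the counts are hypothetical rather than BXD data.

```python
import math

# Six single-nucleotide mutation classes, written in the A/C-based notation
# used in the abstract (C>T, A>G, etc.).
MUTATION_TYPES = ["C>A", "C>G", "C>T", "A>C", "A>G", "A>T"]


def spectrum(counts):
    """Convert raw counts per mutation class into fractions that sum to 1."""
    total = sum(counts.get(m, 0) for m in MUTATION_TYPES)
    if total == 0:
        raise ValueError("no mutations observed")
    return [counts.get(m, 0) / total for m in MUTATION_TYPES]


def cosine_distance(p, q):
    """Return 1 minus the cosine similarity of two spectra (0 means same shape)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / norm


# Hypothetical aggregate counts for lines carrying each allele at one locus.
wild_type = {"C>A": 10, "C>G": 10, "C>T": 40, "A>C": 10, "A>G": 20, "A>T": 10}
mutator = {"C>A": 40, "C>G": 10, "C>T": 40, "A>C": 10, "A>G": 20, "A>T": 10}

distance = cosine_distance(spectrum(wild_type), spectrum(mutator))
```

In a real analysis the observed distance would be compared against a null distribution, for example one obtained by permuting allele labels. That step is omitted here.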
4
Public platform with 39,472 exome control samples enables association studies without genotype sharing. Nat Genet 2024; 56:327-335. [PMID: 38200129 PMCID: PMC10864173 DOI: 10.1038/s41588-023-01637-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2018] [Accepted: 12/01/2023] [Indexed: 01/12/2024]
Abstract
Acquiring a sufficiently powered cohort of control samples matched to a case sample can be time-consuming or, in some cases, impossible. Accordingly, an ability to leverage genetic data from control samples that were already collected elsewhere could dramatically improve power in genetic association studies. Sharing of control samples can pose significant challenges, since most human genetic data are subject to strict sharing regulations. Here, using the properties of singular value decomposition and a subsampling algorithm, we developed a method that allows selection of the best-matching controls from an external pool of samples, complies with personal data protection, and eliminates the need for genotype sharing. We provide access to a library of 39,472 exome sequencing controls at http://dnascore.net enabling association studies for case cohorts lacking control subjects. Using this approach, control sets can be selected from this online library with a prespecified matching accuracy, ensuring well-calibrated association analysis for both rare and common variants.
5
Genome-wide significant risk loci for mood disorders in the Old Order Amish founder population. Mol Psychiatry 2023; 28:5262-5271. [PMID: 36882501 PMCID: PMC10483025 DOI: 10.1038/s41380-023-02014-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/01/2022] [Revised: 02/19/2023] [Accepted: 02/23/2023] [Indexed: 03/09/2023]
Abstract
Genome-wide association studies (GWAS) of mood disorders in large case-control cohorts have identified numerous risk loci, yet pathophysiological mechanisms remain elusive, primarily due to the very small effects of common variants. We sought to discover risk variants with larger effects by conducting a genome-wide association study of mood disorders in a founder population, the Old Order Amish (OOA, n = 1,672). Our analysis revealed four genome-wide significant risk loci, all of which were associated with >2-fold relative risk. Quantitative behavioral and neurocognitive assessments (n = 314) revealed effects of risk variants on sub-clinical depressive symptoms and information processing speed. Network analysis suggested that OOA-specific risk loci harbor novel risk-associated genes that interact with known neuropsychiatry-associated genes via gene interaction networks. Annotation of the variants at these risk loci revealed population-enriched, non-synonymous variants in two genes encoding neurodevelopmental transcription factors, CUX1 and CNOT1. Our findings provide insight into the genetic architecture of mood disorders and a substrate for mechanistic and clinical studies.
6
Epistasis between mutator alleles contributes to germline mutation spectra variability in laboratory mice. bioRxiv 2023:2023.04.25.537217. [PMID: 37162999 PMCID: PMC10168256 DOI: 10.1101/2023.04.25.537217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Maintaining germline genome integrity is essential and enormously complex. Although many proteins are involved in DNA replication, proofreading, and repair [1], mutator alleles have largely eluded detection in mammals. DNA replication and repair proteins often recognize sequence motifs or excise lesions at specific nucleotides. Thus, we might expect that the spectrum of de novo mutations - the frequencies of C>T, A>G, etc. - will differ between genomes that harbor either a mutator or wild-type allele. Previously, we used quantitative trait locus mapping to discover candidate mutator alleles in the DNA repair gene Mutyh that increased the C>A germline mutation rate in a family of inbred mice known as the BXDs [2,3]. In this study we developed a new method to detect alleles associated with mutation spectrum variation and applied it to mutation data from the BXDs. We discovered an additional C>A mutator locus on chromosome 6 that overlaps Ogg1, a DNA glycosylase involved in the same base-excision repair network as Mutyh [4]. Its effect depended on the presence of a mutator allele near Mutyh, and BXDs with mutator alleles at both loci had greater numbers of C>A mutations than those with mutator alleles at either locus alone. Our new methods for analyzing mutation spectra reveal evidence of epistasis between germline mutator alleles and may be applicable to mutation data from humans and other model organisms.
7
Evidence of Site-Specific and Male-Biased Germline Mutation Rate in a Wild Songbird. Genome Biol Evol 2023; 15:evad180. [PMID: 37793164 PMCID: PMC10627410 DOI: 10.1093/gbe/evad180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 09/07/2023] [Accepted: 09/26/2023] [Indexed: 10/06/2023] Open
Abstract
Germline mutations are the ultimate source of genetic variation and the raw material for organismal evolution. Despite their significance, the frequency and genomic locations of mutations, as well as potential sex bias, are yet to be widely investigated in most species. To address these gaps, we conducted whole-genome sequencing of 12 great reed warblers (Acrocephalus arundinaceus) in a pedigree spanning 3 generations to identify single-nucleotide de novo mutations (DNMs) and estimate the germline mutation rate. We detected 82 DNMs within the pedigree, primarily enriched at CpG sites but otherwise randomly located along the chromosomes. Furthermore, we observed a pronounced sex bias in DNM occurrence, with male warblers exhibiting three times more mutations than females. After correction for false negatives and adjusting for callable sites, we obtained a mutation rate of 7.16 × 10⁻⁹ mutations per site per generation (m/s/g) for the autosomes and 5.10 × 10⁻⁹ m/s/g for the Z chromosome. To demonstrate the utility of species-specific mutation rates, we applied our autosomal mutation rate in models reconstructing the demographic history of the great reed warbler. We uncovered signs of drastic population size reductions predating the last glacial period (LGP) and reduced gene flow between western and eastern populations during the LGP. In conclusion, our results provide one of the few direct estimates of the mutation rate in wild songbirds and evidence for male-driven mutations in accordance with theoretical expectations.
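The correction steps named in the abstract (false negatives and callable sites) can be sketched as a short calculation. This is a generic pedigree-style estimator for diploid autosomes, not necessarily the authors' exact formula. The function name, its parameters, and the example inputs are hypothetical, and the abstract does not report the callable-site counts or the false-negative rate.

```python
def mutation_rate(n_dnm, callable_sites, false_negative_rate=0.0):
    """Estimate mutations per site per generation for diploid autosomes.

    n_dnm: number of validated de novo mutations across all offspring.
    callable_sites: callable diploid site count for each offspring.
    false_negative_rate: assumed fraction of true DNMs missed by the pipeline.
    """
    if not 0.0 <= false_negative_rate < 1.0:
        raise ValueError("false_negative_rate must be in [0, 1)")
    # Scale the observed count up to account for mutations that were missed.
    corrected_dnms = n_dnm / (1.0 - false_negative_rate)
    # Each offspring carries two haploid genome copies, one from each parent.
    haploid_sites = 2 * sum(callable_sites)
    return corrected_dnms / haploid_sites


# Hypothetical inputs: 10 DNMs across two offspring, 5e8 callable sites each.
example_rate = mutation_rate(10, [5e8, 5e8])
```

A sex-chromosome estimate such as the Z-chromosome rate would need a different copy-number factor, because one sex carries a single Z copy.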
8
Validation of machine learning approach for direct mutation rate estimation. Mol Ecol Resour 2023; 23:1757-1771. [PMID: 37486035 DOI: 10.1111/1755-0998.13841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Revised: 06/16/2023] [Accepted: 07/05/2023] [Indexed: 07/25/2023]
Abstract
Mutations are the primary source of all genetic variation. Knowledge about their rates is critical for any evolutionary genetic analyses, but for a long time, that knowledge has remained elusive and indirectly inferred. In recent years, parent-offspring comparisons have yielded the first direct mutation rate estimates. The analyses are, however, challenging due to a high rate of false positives and the lack of consensus on standardized filtering of candidate de novo mutations. Here, we validate the application of a machine learning approach for such a task and estimate the mutation rate for the guppy (Poecilia reticulata), a model species in eco-evolutionary studies. We sequenced 4 parents and 20 offspring, followed by screening their genomes for de novo mutations. The initial large number of candidate de novo mutations was hard-filtered to remove false-positive results. These results were compared with the mutation rate estimated with a supervised machine learning approach. Both approaches were followed by molecular validation of all candidate de novo mutations and yielded similar results. The ML method uniquely identified three mutations, but overall required more hands-on curation and had higher rates of false positives and false negatives. Both methods concordantly showed no difference in mutation rates between families. The guppy mutation rate estimated here is among the lowest directly estimated mutation rates in vertebrates; however, previous research has also found low estimated rates in other teleost fishes. We discuss potential explanations for such a pattern, as well as future utility and limitations of machine learning approaches.
9
The divergence of mutation rates and spectra across the Tree of Life. EMBO Rep 2023; 24:e57561. [PMID: 37615267 PMCID: PMC10561183 DOI: 10.15252/embr.202357561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 08/01/2023] [Accepted: 08/02/2023] [Indexed: 08/25/2023] Open
Abstract
Owing to advances in genome sequencing, genome stability has become one of the most scrutinized cellular traits across the Tree of Life. Despite its centrality to all things biological, the mutation rate (per nucleotide site per generation) ranges over three orders of magnitude among species and several-fold within individual phylogenetic lineages. Within all major organismal groups, mutation rates scale negatively with the effective population size of a species and with the amount of functional DNA in the genome. This relationship is most parsimoniously explained by the drift-barrier hypothesis, which postulates that natural selection typically operates to reduce mutation rates until further improvement is thwarted by the power of random genetic drift. Despite this constraint, the molecular mechanisms underlying DNA replication fidelity and repair are free to wander, provided the performance of the entire system is maintained at the prevailing level. The evolutionary flexibility of the mutation rate bears on the resolution of several prior conundrums in phylogenetic and population-genetic analysis and raises challenges for future applications in these areas.
10
Wild pedigrees inform mutation rates and historic abundance in baleen whales. Science 2023; 381:990-995. [PMID: 37651509 DOI: 10.1126/science.adf2160] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/12/2022] [Accepted: 07/25/2023] [Indexed: 09/02/2023]
Abstract
Phylogeny-based estimates suggesting a low germline mutation rate (μ) in baleen whales have influenced research ranging from assessments of whaling impacts to evolutionary cancer biology. We estimated μ directly from pedigrees in four baleen whale species for both the mitochondrial control region and nuclear genome. The results suggest values higher than those obtained through phylogeny-based estimates and similar to pedigree-based values for primates and toothed whales. Applying our estimate of μ reduces previous genetic-based estimates of preexploitation whale abundance by 86% and suggests that μ cannot explain low cancer rates in gigantic mammals. Our study shows that it is feasible to estimate μ directly from pedigrees in natural populations, with wide-ranging implications for ecological and evolutionary research.
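The link between a higher μ and a lower abundance estimate can be made concrete with a back-of-the-envelope sketch. It assumes, as in the standard relation θ = 4Nμ with genetic diversity θ held fixed, that the abundance estimate is inversely proportional to μ. The abstract does not state the model used, so both the assumption and the function name below are illustrative only.

```python
def implied_mu_fold_change(fractional_reduction):
    """Fold increase in mu implied by a fractional drop in estimated abundance.

    Assumes the abundance estimate N is proportional to 1 / mu, so that
    N_new / N_old = mu_old / mu_new.
    """
    if not 0.0 <= fractional_reduction < 1.0:
        raise ValueError("fractional_reduction must be in [0, 1)")
    return 1.0 / (1.0 - fractional_reduction)


# An 86% reduction in estimated abundance corresponds to roughly a 7-fold
# higher mutation rate under this simple inverse-proportionality assumption.
fold = implied_mu_fold_change(0.86)
```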
11
Extensive variation in germline de novo mutations in Poecilia reticulata. Genome Res 2023; 33:1317-1324. [PMID: 37442578 PMCID: PMC10547258 DOI: 10.1101/gr.277936.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Accepted: 07/07/2023] [Indexed: 07/15/2023]
Abstract
The rate of germline mutation is fundamental to evolutionary processes, as it generates the variation upon which selection acts. The guppy, Poecilia reticulata, is a model of rapid adaptation; however, the relative contribution of standing genetic variation versus de novo mutation (DNM) to evolution in this species remains unclear. Here, we use pedigree-based approaches to quantify and characterize germline DNMs in three large guppy families. Our results suggest that the germline mutation rate in the guppy varies substantially across individuals and families. Most DNMs are shared across multiple siblings, suggesting they arose during early embryonic development. DNMs are randomly distributed throughout the genome, and male-biased mutation rate is low, as would be expected from the short guppy generation time. Overall, our study shows remarkable variation in germline mutation rate and provides insights into rapid evolution of guppies.
12
Challenges in screening for de novo noncoding variants contributing to genetically complex phenotypes. HGG ADVANCES 2023; 4:100210. [PMID: 37305558 PMCID: PMC10248550 DOI: 10.1016/j.xhgg.2023.100210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 05/15/2023] [Indexed: 06/13/2023] Open
Abstract
Understanding the genetic basis for complex, heterogeneous disorders, such as autism spectrum disorder (ASD), is a persistent challenge in human medicine. Owing to their phenotypic complexity, the genetic mechanisms underlying these disorders may be highly variable across individual patients. Furthermore, much of their heritability is unexplained by known regulatory or coding variants. Indeed, there is evidence that much of the causal genetic variation stems from rare and de novo variants arising from ongoing mutation. These variants occur mostly in noncoding regions, likely affecting regulatory processes for genes linked to the phenotype of interest. However, because there is no uniform code for assessing regulatory function, it is difficult to separate these mutations into likely functional and nonfunctional subsets. This makes finding associations between complex diseases and potentially causal de novo single-nucleotide variants (dnSNVs) a difficult task. To date, most published studies have struggled to find any significant associations between dnSNVs from ASD patients and any class of known regulatory elements. We sought to identify the underlying reasons for this and present strategies for overcoming these challenges. We show that, contrary to previous claims, the main reason for failure to find robust statistical enrichments is not only the number of families sampled, but also the quality and relevance to ASD of the annotations used to prioritize dnSNVs, and the reliability of the set of dnSNVs itself. We present a list of recommendations for designing future studies of this sort that will help researchers avoid common pitfalls.
13
A global catalog of whole-genome diversity from 233 primate species. Science 2023; 380:906-913. [PMID: 37262161 DOI: 10.1126/science.abn7829] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 12/19/2021] [Accepted: 02/06/2023] [Indexed: 06/03/2023]
Abstract
The rich diversity of morphology and behavior displayed across primate species provides an informative context in which to study the impact of genomic diversity on fundamental biological processes. Analysis of that diversity provides insight into long-standing questions in evolutionary and conservation biology and is urgent given severe threats these species are facing. Here, we present high-coverage whole-genome data from 233 primate species representing 86% of genera and all 16 families. This dataset was used, together with fossil calibration, to create a nuclear DNA phylogeny and to reassess evolutionary divergence times among primate clades. We found within-species genetic diversity across families and geographic regions to be associated with climate and sociality, but not with extinction risk. Furthermore, mutation rates differ across species, potentially influenced by effective population sizes. Lastly, we identified extensive recurrence of missense mutations previously thought to be human specific. This study will open a wide range of research avenues for future primate genomic research.
14
Genotype-phenotype characteristics and baseline natural history of Chinese myelin protein zero gene related neuropathy patients. Eur J Neurol 2023; 30:1069-1079. [PMID: 36692866 DOI: 10.1111/ene.15700] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/15/2022] [Revised: 12/24/2022] [Accepted: 12/30/2022] [Indexed: 01/25/2023]
Abstract
BACKGROUND AND PURPOSE: The aim was to characterize the phenotypic and genotypic features of myelin protein zero (MPZ) related neuropathy and provide baseline data for longitudinal natural history studies or drug clinical trials. METHOD: Clinical, neurophysiological and genetic data of 37 neuropathy patients with MPZ mutations were retrospectively collected. RESULTS: Nineteen different MPZ mutations in 23 unrelated neuropathy families were detected, and the frequency of MPZ mutations was 5.84% in total. Mutations c.103_104InsTGGTTTACACCG, c.513dupG, c.521_557del and c.696_699delCAGT had not been reported previously. Hot spot mutation p.Thr124Met was detected in four unrelated families, and seven patients carried de novo mutations. The onset age indicated a bimodal distribution: prominent clustering in the first and fourth decades. The infantile-onset group included 12 families, the childhood-onset group consisted of two families and the adult-onset group included nine families. The Charcot-Marie-Tooth Disease Neuropathy Score ranged from 3 to 25 with a mean value of 15.85 ± 5.88. Mutations that changed the cysteine residue (p.Arg98Cys, p.Cys127Trp, p.Ser140Cys and p.Cys127Arg) in the extracellular region were more likely to cause severe early-onset Charcot-Marie-Tooth disease type 1B (CMT1B) or Dejerine-Sottas syndrome. Nonsense-mediated mRNA decay mutations p.Asp35delInsVVYTD, p.Leu174Argfs*66 and p.Leu172Alafs*63 were related to severe infantile-onset CMT1B or Dejerine-Sottas syndrome; however, mutation p.Val232Valfs*19 was associated with a relatively milder childhood-onset CMT1 phenotype. CONCLUSION: Four novel MPZ mutations are reported that expand the genetic spectrum. De novo mutations accounted for 30.4% and were most related to a severe infantile-onset phenotype. Genetic and clinical data from this cohort will provide the baseline data necessary for clinical trials and natural history studies.
15
Inferring the mode and strength of ongoing selection. Genome Res 2023; 33:632-643. [PMID: 37055196 PMCID: PMC10234300 DOI: 10.1101/gr.276386.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Accepted: 03/29/2023] [Indexed: 04/15/2023]
Abstract
Genome sequence data are no longer scarce. The UK Biobank alone comprises 200,000 individual genomes, with more on the way, leading the field of human genetics toward sequencing entire populations. Within the next decades, other model organisms will follow suit, especially domesticated species such as crops and livestock. Having sequences from most individuals in a population will present new challenges for using these data to improve health and agriculture in the pursuit of a sustainable future. Existing population genetic methods are designed to model hundreds of randomly sampled sequences but are not optimized for extracting the information contained in the larger and richer data sets that are beginning to emerge, with thousands of closely related individuals. Here we develop a new method called trio-based inference of dominance and selection (TIDES) that uses data from tens of thousands of family trios to make inferences about natural selection acting in a single generation. TIDES further improves on the state of the art by making no assumptions regarding demography, linkage, or dominance. We discuss how our method paves the way for studying natural selection from new angles.
16
Spectrum of FAR1 (Fatty Acyl-CoA Reductase 1) Variants and Related Neurological Conditions. Mov Disord 2023; 38:502-504. [PMID: 36781603 DOI: 10.1002/mds.29323] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/12/2022] [Revised: 11/30/2022] [Accepted: 12/08/2022] [Indexed: 02/15/2023] Open
17
Abstract
The germline mutation rate determines the pace of genome evolution and is an evolving parameter itself [1]. However, little is known about what determines its evolution, as most studies of mutation rates have focused on single species with different methodologies [2]. Here we quantify germline mutation rates across vertebrates by sequencing and comparing the high-coverage genomes of 151 parent-offspring trios from 68 species of mammals, fishes, birds and reptiles. We show that the per-generation mutation rate varies among species by a factor of 40, with mutation rates being higher for males than for females in mammals and birds, but not in reptiles and fishes. The generation time, age at maturity and species-level fecundity are the key life-history traits affecting this variation among species. Furthermore, species with higher long-term effective population sizes tend to have lower mutation rates per generation, providing support for the drift barrier hypothesis [3]. The exceptionally high yearly mutation rates of domesticated animals, which have been continually selected on fecundity traits including shorter generation times, further support the importance of generation time in the evolution of mutation rates. Overall, our comparative analysis of pedigree-based mutation rates provides ecological insights on the mutation rate evolution in vertebrates.
18
Abstract
The generation times of our recent ancestors can tell us about both the biology and social organization of prehistoric humans, placing human evolution on an absolute time scale. We present a method for predicting historical male and female generation times based on changes in the mutation spectrum. Our analyses of whole-genome data reveal an average generation time of 26.9 years across the past 250,000 years, with fathers consistently older (30.7 years) than mothers (23.2 years). Shifts in sex-averaged generation times have been driven primarily by changes to the age of paternity, although we report a substantial increase in female generation times in the recent past. We also find a large difference in generation times among populations, reaching back to a time when all humans occupied Africa.
19
de novo variant calling identifies cancer mutation signatures in the 1000 Genomes Project. Hum Mutat 2022; 43:1979-1993. [PMID: 36054329 PMCID: PMC9771978 DOI: 10.1002/humu.24455] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/30/2021] [Revised: 07/22/2022] [Accepted: 08/29/2022] [Indexed: 01/25/2023]
Abstract
Detection of de novo variants (DNVs) is critical for studies of disease-related variation and mutation rates. To accelerate DNV calling, we developed a graphics processing unit (GPU)-based workflow. We applied our workflow to whole-genome sequencing data from three parent-child sequenced cohorts: the Simons Simplex Collection (SSC), Simons Foundation Powering Autism Research (SPARK), and the 1000 Genomes Project (1000G), which were sequenced using DNA from blood, saliva, and lymphoblastoid cell lines (LCLs), respectively. The SSC and SPARK DNV callsets were within expectations for number of DNVs, percent at CpG sites, phasing to the paternal chromosome of origin, and average allele balance. However, the 1000G DNV callset was not within expectations and contained excessive DNVs that are likely cell line artifacts. Mutation signature analysis revealed that 30% of 1000G DNV signatures matched B-cell lymphoma. Furthermore, we found variants in DNA repair genes and at ClinVar pathogenic or likely-pathogenic sites, and a significant excess of protein-coding DNVs in IGLL5, a gene known to be involved in B-cell lymphomas. Our study provides a new rapid DNV caller for the field and elucidates important implications of using sequencing data from LCLs for reference building and disease-related projects.
20
Landscape of the intratumoral microenvironment in bladder cancer: Implications for prognosis and immunotherapy. Comput Struct Biotechnol J 2022; 21:74-85. [PMID: 36514337 PMCID: PMC9730156 DOI: 10.1016/j.csbj.2022.11.052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Revised: 11/25/2022] [Accepted: 11/25/2022] [Indexed: 12/02/2022] Open
Abstract
Introduction: This study aims to present the landscape of the intratumoral microenvironment and thereby establish a classification system that can be used to predict the prognosis of bladder cancer patients and their response to anti-PD-L1 immunotherapy. Methods: The expression profiles of 1554 bladder cancer cases were downloaded from seven public datasets. Single-sample gene set enrichment analysis (ssGSEA), univariate Cox regression analysis, and meta-analysis were employed to establish the bladder cancer immune prognostic index (BCIPI). Extensive analyses were executed to investigate the association between BCIPI and overall survival, tumor-infiltrated immunocytes, immunotherapeutic response, mutation load, etc. Results: The results obtained from seven independent cohorts and meta-analyses suggested that the BCIPI is an effective classification system for estimating bladder cancer patients' overall survival. Patients in the BCIPI-High subgroup showed different immunophenotypic outcomes from those in the BCIPI-Low subgroup regarding tumor-infiltrated immunocytes and mutated genes. Subsequent analysis suggested that patients in the BCIPI-High subgroup were more sensitive to anti-PD-L1 immunotherapy than those in the BCIPI-Low subgroup. Conclusions: The newly established BCIPI is a valuable tool for predicting overall survival outcomes and immunotherapeutic responses in patients with bladder cancer.
Key Words
- AJCC, American Joint Committee on Cancer
- Anti-PD-L1, Antitumor response to atezolizumab
- BCG, Bacillus Calmette-Guerin
- BCIPI, Bladder cancer immune prognostic index
- Bladder cancer
- CNVs, Copy number variations
- FDA, Food and Drug Administration
- FPKM, Fragments per kilobase per million
- Genomic
- ICI, Immune checkpoint inhibitor
- IHC, Immunohistochemistry
- Immunotherapy
- MES, Mesenchymal transition
- NES, Normalized enrichment score
- Overall survival
- RMA, Robust multiarray average
- RMS, Restricted mean survival
- TPM, Transcripts per kilobase million
- ssGSEA, Single-sample GSEA
|
21
|
Estimating the genome-wide mutation rate from thousands of unrelated individuals. Am J Hum Genet 2022; 109:2178-2184. [PMID: 36370709 PMCID: PMC9748258 DOI: 10.1016/j.ajhg.2022.10.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 10/15/2022] [Indexed: 11/13/2022] Open
Abstract
We provide a method for estimating the genome-wide mutation rate from sequence data on unrelated individuals by using segments of identity by descent (IBD). The length of an IBD segment indicates the time to shared ancestor of the segment, and mutations that have occurred since the shared ancestor result in discordances between the two IBD haplotypes. Previous methods for IBD-based estimation of mutation rate have required the use of family data for accurate phasing of the genotypes. This has limited the scope of application of IBD-based mutation rate estimation. Here, we develop an IBD-based method for mutation rate estimation from population data, and we apply it to whole-genome sequence data on 4,166 European American individuals from the TOPMed Framingham Heart Study, 2,996 European American individuals from the TOPMed My Life, Our Future study, and 1,586 African American individuals from the TOPMed Hypertension Genetic Epidemiology Network study. Although mutation rates may differ between populations as a result of genetic factors, demographic factors such as average parental age, and environmental exposures, our results are consistent with equal genome-wide average mutation rates across these three populations. Our overall estimate of the average genome-wide mutation rate per 10⁸ base pairs per generation for single-nucleotide variants is 1.24 (95% CI 1.18-1.33).
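The reasoning in this abstract (mutations since the shared ancestor appear as discordances between two IBD haplotypes) can be illustrated with a short calculation. This is a minimal sketch under simplifying assumptions, not the authors' estimator: it assumes the time to the shared ancestor is known exactly and ignores genotype and phasing error. The function names, segment length, and number of generations are hypothetical; only the rate of 1.24 per 10⁸ base pairs per generation comes from the abstract.

```python
def expected_discordances(mu_per_bp, segment_bp, tmrca_generations):
    """Expected mismatches between two IBD haplotypes.

    Each of the two lineages accumulates mutations independently for
    `tmrca_generations` generations, hence the factor of 2.
    """
    return 2 * mu_per_bp * segment_bp * tmrca_generations


def estimate_mu(discordances, segment_bp, tmrca_generations):
    """Invert the relation above to recover a per-bp, per-generation rate."""
    return discordances / (2 * segment_bp * tmrca_generations)


mu = 1.24e-8            # reported estimate: 1.24 per 10^8 bp per generation
segment_bp = 5_000_000  # hypothetical IBD segment length (5 Mb)
tmrca = 50              # hypothetical generations back to the shared ancestor

d = expected_discordances(mu, segment_bp, tmrca)
mu_back = estimate_mu(d, segment_bp, tmrca)
print(f"expected discordances: {d:.2f}")
print(f"recovered rate: {mu_back:.3e}")
```

In practice the time to the shared ancestor is not observed and must be inferred from segment length, which is why the abstract emphasizes that segment length "indicates" rather than fixes it.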
|
22
|
Telomeres, Telomerase and Cancer. Arch Med Res 2022; 53:741-746. [PMID: 36334946 DOI: 10.1016/j.arcmed.2022.10.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/28/2022] [Revised: 10/05/2022] [Accepted: 10/12/2022] [Indexed: 11/06/2022]
Abstract
Telomeres and telomerase play a crucial role in human aging and cancer. Three "drivers" of human aging can be identified. The developmental program encoded in DNA is the primary determinant of lifespan. Faithful execution of the developmental program requires stability of the (epi-)genome which is challenged throughout life by damage to DNA as well as epigenetic 'scars' from error-free DNA repair and stochastic errors made during the establishment and maintenance of the "epigenome". Over time (epi-)mutations accumulate, compromising cellular function and causing (pre-)malignant alterations. Damage to the genome and epigenome can be considered the second "driver" of aging. A third driver of the aging process, important to suppress tumors in long-lived animals, is caused by progressive loss of telomeric DNA. Telomere erosion protects against cancer early in life but limits cell renewal late in life, in agreement with the Antagonistic Pleiotropy theory on the evolutionary origin of aging. Malignant tumors arise when mutations and/or epimutations in cells (clock 2) corrupt the developmental program (clock 1) as well as tumor suppression by telomere erosion (clock 3). In cancer cells clock 3 is typically inactivated by loss of p53 as well as increased expression of telomerase. Taken together, aging in humans can be described by the ticking of three clocks: the clock that directs development, the accumulation of (epi-)mutations over time and the telomere clock that limits the number of cell divisions in normal stem and immune cells.
|
23
|
Individual Genetic Heterogeneity. Genes (Basel) 2022; 13:genes13091626. [PMID: 36140794 PMCID: PMC9498725 DOI: 10.3390/genes13091626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Revised: 08/25/2022] [Accepted: 09/08/2022] [Indexed: 11/28/2022] Open
Abstract
Genetic variation has been widely covered in the literature, but not from the perspective of an individual in any species. Here, a synthesis of genetic concepts and variations relevant for individual genetic constitution is provided. All the different levels of genetic information and variation are covered, ranging from whether an organism is unmixed or hybrid, has variations in genome, chromosomes, and more locally in DNA regions, to epigenetic variants or alterations in selfish genetic elements. Genetic constitution and heterogeneity of microbiota are highly relevant for health and wellbeing of an individual. Mutation rates vary widely for variation types, e.g., due to the sequence context. Genetic information guides numerous aspects in organisms. Types of inheritance, whether Mendelian or non-Mendelian, zygosity, sexual reproduction, and sex determination are covered. Functions of DNA and functional effects of variations are introduced, along with mechanisms that reduce and modulate functional effects, including TARAR countermeasures and intraindividual genetic conflict. TARAR countermeasures for tolerance, avoidance, repair, attenuation, and resistance are essential for life, integrity of genetic information, and gene expression. The genetic composition, effects of variations, and their expression are considered also in diseases and personalized medicine. The text synthesizes knowledge and insight on individual genetic heterogeneity and organizes and systematizes the central concepts.
|
24
|
Abstract
Germline de novo mutations (DNMs) represent one of the important topics that need extensive attention from epidemiologists, geneticists, and other relevant stakeholders. Advances in next-generation sequencing technologies allowed examination of parent-offspring trios to ascertain the frequency of germline DNMs. Many epidemiological risk factors for childhood cancer are indicative of DNMs as a mechanism. The aim of this review was to give an overview of germline DNMs, their causes in general, and to discuss their relation to childhood cancer risk. In addition, we highlighted existing gaps in knowledge in many topics of germline DNMs in childhood cancer that need exploration and collaborative efforts.
|
25
|
Emerging pathogenic mechanisms in human brain arteriovenous malformations: a contemporary review in the multiomics era. Neurosurg Focus 2022; 53:E2. [DOI: 10.3171/2022.4.focus2291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2022] [Accepted: 04/18/2022] [Indexed: 11/06/2022]
Abstract
A variety of pathogenic mechanisms have been described in the formation, maturation, and rupture of brain arteriovenous malformations (bAVMs). While the understanding of bAVMs has largely been formulated based on animal models of rare hereditary diseases in which AVMs form, a new era of “omics” has permitted large-scale examinations of contributory genetic variations in human sporadic bAVMs. New findings regarding the pathogenesis of bAVMs implicate changes to endothelial and mural cells that result in increased angiogenesis, proinflammatory recruitment, and breakdown of vascular barrier properties that may result in hemorrhage; a greater diversity of cell populations that compose the bAVM microenvironment may also be implicated and complicate traditional models. Genomic sequencing of human bAVMs has uncovered inherited, de novo, and somatic activating mutations, such as KRAS, which contribute to the pathogenesis of bAVMs. New droplet-based, single-cell sequencing technologies have generated atlases of cell-specific molecular derangements. Herein, the authors review emerging genomic and transcriptomic findings underlying pathologic cell transformations in bAVMs derived from human tissues. The application of multiple sequencing modalities to bAVM tissues is a natural next step for researchers, although the potential therapeutic benefits or clinical applications remain unknown.
|
26
|
Patterns and distribution of de novo mutations in multiplex Middle Eastern families. J Hum Genet 2022; 67:579-588. [PMID: 35718832 PMCID: PMC9510050 DOI: 10.1038/s10038-022-01054-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Revised: 05/23/2022] [Accepted: 05/24/2022] [Indexed: 11/10/2022]
Abstract
While de novo mutations (DNMs) are key to genetic diversity, they are also responsible for a high number of rare disorders. To date, no study has systematically examined the rate and distribution of DNMs in multiplex families in highly consanguineous populations. Leveraging WGS profiles of 645 individuals in 146 families, we implemented a combinatorial approach using 3 complementary tools for DNM discovery in 353 unique trio combinations. We found a total of 27,168 DNMs (median: 70 single-nucleotide and 6 insertion-deletions per individual). Phasing revealed around 80% of DNMs were paternal in origin. Notably, using whole-genome methylation data of spermatogonial stem cells, these DNMs were significantly more likely to occur at highly methylated CpGs (OR: 2.03; p value = 6.62 × 10⁻¹¹). We then examined the effects of consanguinity and ethnicity on DNMs, and found that consanguinity does not seem to correlate with DNM rate, although special care is needed when measuring such a correlation. Additionally, we found that Middle-Eastern families with Arab ancestry had fewer DNMs than African families, although the difference was not significant (p value = 0.16). Finally, for families with diseased probands, we examined the difference in DNM counts and putative impact across affected and unaffected siblings, but did not find significant differences between disease groups, likely owing to the enrichment for recessive disorders in this part of the world, or the small sample size per clinical condition. This study serves as a reference for DNM discovery in multiplex families from the globally under-represented populations of the Middle-East.
|
27
|
The impact of genetic modifiers on variation in germline mutation rates within and among human populations. Genetics 2022; 221:6603115. [PMID: 35666194 DOI: 10.1093/genetics/iyac087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Accepted: 05/16/2022] [Indexed: 11/14/2022] Open
Abstract
Mutation rates and spectra differ among human populations. Here, we examine whether this variation could be explained by evolution at mutation modifiers. To this end, we consider genetic modifier sites at which mutations, "mutator alleles", increase genome-wide mutation rates and model their evolution under purifying selection due to the additional deleterious mutations that they cause, genetic drift, and demographic processes. We solve the model analytically for a constant population size and characterize how evolution at modifier sites impacts variation in mutation rates within and among populations. We then use simulations to study the effects of modifier sites under a plausible demographic model for Africans and Europeans. When comparing populations that evolve independently, weakly selected modifier sites (2Nₑs ≈ 1), which evolve slowly, contribute the most to variation in mutation rates. In contrast, when populations recently split from a common ancestral population, strongly selected modifier sites (2Nₑs ≫ 1), which evolve rapidly, contribute the most to variation between them. Moreover, a modest number of modifier sites (e.g., 10 per mutation type in the standard classification into 96 types) subject to moderate to strong selection (2Nₑs > 1) could account for the variation in mutation rates observed among human populations. If such modifier sites indeed underlie differences among populations, they should also cause variation in mutation rates within populations and their effects should be detectable in pedigree studies.
|
28
|
De novo mutations in children born after medical assisted reproduction. Hum Reprod 2022; 37:1360-1369. [PMID: 35413117 PMCID: PMC9156847 DOI: 10.1093/humrep/deac068] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/30/2021] [Revised: 03/08/2022] [Indexed: 01/23/2023] Open
Abstract
STUDY QUESTION: Are there more de novo mutations (DNMs) present in the genomes of children born through medical assisted reproduction (MAR) compared to spontaneously conceived children? SUMMARY ANSWER: In this pilot study, no statistically significant difference was observed in the number of DNMs observed in the genomes of MAR children versus spontaneously conceived children. WHAT IS KNOWN ALREADY: DNMs are known to play a major role in sporadic disorders with reduced fitness such as severe developmental disorders, including intellectual disability and epilepsy. Advanced paternal age is known to place offspring at increased disease risk, amongst others by increasing the number of DNMs in their genome. There are very few studies reporting on the effect of MAR on the number of DNMs in the offspring, especially when male infertility is known to be affecting the potential fathers. With delayed parenthood an ongoing epidemiological trend in the 21st century, there are more children born from fathers of advanced age and more children born through MAR every day. STUDY DESIGN, SIZE, DURATION: This observational pilot study was conducted from January 2015 to March 2019 in the tertiary care centre at Radboud University Medical Center. We included a total of 53 children and their respective parents, forming 49 trios (mother, father and child) and two quartets (mother, father and two siblings). One group of children was born after spontaneous conception (n = 18); a second group of children born after IVF (n = 17) and a third group of children born after ICSI combined with testicular sperm extraction (ICSI-TESE) (n = 18). In this pilot study, we also subdivided each group by paternal age, resulting in a subgroup of children born to younger fathers (<35 years of age at conception) and older fathers (>45 years of age at conception). PARTICIPANTS/MATERIALS, SETTING, METHODS: Whole-genome sequencing (WGS) was performed on all parent-offspring trios to identify DNMs. For 34 of 53 trios/quartets, WGS was performed twice to independently detect and validate the presence of DNMs. Quality of WGS-based DNM calling was independently assessed by targeted Sanger sequencing. MAIN RESULTS AND THE ROLE OF CHANCE: No significant differences were observed in the number of DNMs per child for the different methods of conception, independent of parental age at conception (multi-factorial ANOVA, f(2) = 0.17, P-value = 0.85). As expected, a clear paternal age effect was observed after adjusting for method of conception and maternal age at conception (multiple regression model, t = 5.636, P-value = 8.97 × 10⁻⁷), with on average 71 DNMs in the genomes of children born to young fathers (<35 years of age) and an average of 94 DNMs in the genomes of children born to older fathers (>45 years of age). LIMITATIONS, REASONS FOR CAUTION: This is a pilot study and other small-scale studies have recently reported contrasting results. Larger unbiased studies are required to confirm or falsify these results. WIDER IMPLICATIONS OF THE FINDINGS: This pilot study did not show an effect for the method of conception on the number of DNMs per genome in offspring. Given the role that DNMs play in disease risk, this negative result is good news for IVF and ICSI-TESE born children, if replicated in a larger cohort. STUDY FUNDING/COMPETING INTEREST(S): This research was funded by the Netherlands Organisation for Scientific Research (918-15-667) and by an Investigator Award in Science from the Wellcome Trust (209451). The authors have no conflicts of interest to declare. TRIAL REGISTRATION NUMBER: N/A.
|
29
|
A natural mutator allele shapes mutation spectrum variation in mice. Nature 2022; 605:497-502. [PMID: 35545679 DOI: 10.1038/s41586-022-04701-5] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 04/30/2021] [Accepted: 03/25/2022] [Indexed: 12/12/2022]
Abstract
Although germline mutation rates and spectra can vary within and between species, common genetic modifiers of the mutation rate have not been identified in nature [1]. Here we searched for loci that influence germline mutagenesis using a uniquely powerful resource: a panel of recombinant inbred mouse lines known as the BXD, descended from the laboratory strains C57BL/6J (B haplotype) and DBA/2J (D haplotype). Each BXD lineage has been maintained by brother-sister mating in the near absence of natural selection, accumulating de novo mutations for up to 50 years on a known genetic background that is a unique linear mosaic of B and D haplotypes [2]. We show that mice inheriting D haplotypes at a quantitative trait locus on chromosome 4 accumulate C>A germline mutations at a 50% higher rate than those inheriting B haplotypes, primarily owing to the activity of a C>A-dominated mutational signature known as SBS18. The B and D quantitative trait locus haplotypes encode different alleles of Mutyh, a DNA repair gene that underlies the heritable cancer predisposition syndrome that causes colorectal tumors with a high SBS18 mutation load [3,4]. Both B and D Mutyh alleles are present in wild populations of Mus musculus domesticus, providing evidence that common genetic variation modulates germline mutagenesis in a model mammalian species.
|
30
|
Familial long-read sequencing increases yield of de novo mutations. Am J Hum Genet 2022; 109:631-646. [PMID: 35290762 DOI: 10.1016/j.ajhg.2022.02.014] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 09/29/2021] [Accepted: 02/16/2022] [Indexed: 12/11/2022] Open
Abstract
Studies of de novo mutation (DNM) have typically excluded some of the most repetitive and complex regions of the genome because these regions cannot be unambiguously mapped with short-read sequencing data. To better understand the genome-wide pattern of DNM, we generated long-read sequence data from an autism parent-child quad with an affected female where no pathogenic variant had been discovered in short-read Illumina sequence data. We deeply sequenced all four individuals by using three sequencing platforms (Illumina, Oxford Nanopore, and Pacific Biosciences) and three complementary technologies (Strand-seq, optical mapping, and 10X Genomics). Using long-read sequencing, we initially discovered and validated 171 DNMs across two children, a 20% increase in the number of de novo single-nucleotide variants (SNVs) and indels when compared to short-read callsets. The number of DNMs further increased by 5% when considering a more complete human reference (T2T-CHM13) because of the recovery of events in regions absent from GRCh38 (e.g., three DNMs in heterochromatic satellites). In total, we validated 195 de novo germline mutations and 23 potential post-zygotic mosaic mutations across both children; the overall true substitution rate based on this integrated callset is at least 1.41 × 10⁻⁸ substitutions per nucleotide per generation. We also identified six de novo insertions and deletions in tandem repeats, two of which represent structural variants. We demonstrate that long-read sequencing and assembly, especially when combined with a more complete reference genome, increases the number of DNMs by >25% compared to previous studies, providing a more complete catalog of DNM compared to short-read data alone.
|
31
|
Role of sperm DNA damage in creating de novo mutations in human offspring: the ‘post-meiotic oocyte collusion’ hypothesis. Reprod Biomed Online 2022; 45:109-124. [DOI: 10.1016/j.rbmo.2022.03.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/30/2021] [Revised: 03/10/2022] [Accepted: 03/11/2022] [Indexed: 11/24/2022]
|
32
|
Telomeres, aging, and cancer: the big picture. Blood 2022; 139:813-821. [PMID: 35142846 DOI: 10.1182/blood.2021014299] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 10/04/2021] [Accepted: 12/16/2021] [Indexed: 12/13/2022] Open
Abstract
The role of telomeres in human health and disease is yet to be fully understood. The limitations of mouse models for the study of human telomere biology and difficulties in accurately measuring the length of telomere repeats in chromosomes and cells have diverted attention from many important and relevant observations. The goal of this perspective is to summarize some of these observations and to discuss the antagonistic role of telomere loss in aging and cancer in the context of developmental biology, cell turnover, and evolution. It is proposed that both damage to DNA and replicative loss of telomeric DNA contribute to aging in humans, with the differences in leukocyte telomere length between humans being linked to the risk of developing specific diseases. These ideas are captured in the Telomere Erosion in Disposable Soma theory of aging proposed herein.
|
33
|
Revisiting the neutral dynamics derived limiting guanine-cytosine content using human de novo point mutation data. Meta Gene 2022. [DOI: 10.1016/j.mgene.2021.100994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022] Open
|
34
|
The mutationathon highlights the importance of reaching standardization in estimates of pedigree-based germline mutation rates. eLife 2022; 11:73577. [PMID: 35018888 PMCID: PMC8830884 DOI: 10.7554/elife.73577] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/02/2021] [Accepted: 01/11/2022] [Indexed: 11/13/2022] Open
Abstract
In the past decade, several studies have estimated the human per-generation germline mutation rate using large pedigrees. More recently, estimates for various nonhuman species have been published. However, methodological differences among studies in detecting germline mutations and estimating mutation rates make direct comparisons difficult. Here, we describe the many different steps involved in estimating pedigree-based mutation rates, including sampling, sequencing, mapping, variant calling, filtering, and appropriately accounting for false-positive and false-negative rates. For each step, we review the different methods and parameter choices that have been used in the recent literature. Additionally, we present the results from a ‘Mutationathon,’ a competition organized among five research labs to compare germline mutation rate estimates for a single pedigree of rhesus macaques. We report almost a twofold variation in the final estimated rate among groups using different post-alignment processing, calling, and filtering criteria, and provide details into the sources of variation across studies. Though the difference among estimates is not statistically significant, this discrepancy emphasizes the need for standardized methods in mutation rate estimations and the difficulty in comparing rates from different studies. Finally, this work aims to provide guidelines for computational and statistical benchmarks for future studies interested in identifying germline mutations from pedigrees.
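The final steps named in this abstract (filtering candidates, then accounting for false-positive and false-negative rates) are often combined into one per-site, per-generation rate. The sketch below assumes a commonly used form: corrected DNM count divided by twice the callable genome size, adjusted for missed sites. It is not a reproduction of any participating lab's pipeline, and the function name and all input values are hypothetical.

```python
def pedigree_mutation_rate(n_candidates, false_positive_rate,
                           false_negative_rate, callable_sites):
    """Per-site, per-generation germline mutation rate from one trio.

    n_candidates        -- candidate DNMs remaining after filtering
    false_positive_rate -- fraction of candidates expected to be spurious
    false_negative_rate -- fraction of true DNMs expected to be missed
    callable_sites      -- haploid positions where a DNM could be called
    """
    if not 0 <= false_positive_rate < 1 or not 0 <= false_negative_rate < 1:
        raise ValueError("error rates must lie in [0, 1)")
    true_dnms = n_candidates * (1 - false_positive_rate)
    # Factor 2: a diploid offspring carries two transmitted haplotypes.
    return true_dnms / (2 * callable_sites * (1 - false_negative_rate))


# Hypothetical trio: 30 candidates, 10% false positives, 5% false negatives,
# 2.0 Gb of callable sequence.
rate = pedigree_mutation_rate(30, 0.10, 0.05, 2.0e9)
print(f"estimated rate: {rate:.3e} per site per generation")
```

Because each of the four inputs depends on choices made during calling and filtering, modest differences in any of them propagate directly into the final rate, which is consistent with the near-twofold spread the abstract reports.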
|
35
|
De novo mutations in childhood cases of sudden unexplained death that disrupt intracellular Ca²⁺ regulation. Proc Natl Acad Sci U S A 2021; 118:2115140118. [PMID: 34930847 PMCID: PMC8719874 DOI: 10.1073/pnas.2115140118] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Accepted: 10/20/2021] [Indexed: 01/04/2023] Open
Abstract
Approximately 400 United States children 1 y of age and older die suddenly from unexplained causes annually. We studied whole-exome sequence data from 124 “trios” (decedent child and living parents) to identify genetic risk factors. Nonsynonymous mutations, mostly de novo (present in child but absent in both biological parents), were highly enriched in genes associated with cardiac and seizure disorders relative to controls, and contributed to 9% of deaths. We found significant overtransmission of loss-of-function or pathogenic missense variants in cardiac and seizure disorder genes. Most pathogenic variants were de novo in origin, highlighting the importance of trio studies. Many of these pathogenic de novo mutations altered a protein network regulating calcium-related excitability at submembrane junctions in cardiomyocytes and neurons. Sudden unexplained death in childhood (SUDC) is an understudied problem. Whole-exome sequence data from 124 “trios” (decedent child, living parents) was used to test for excessive de novo mutations (DNMs) in genes involved in cardiac arrhythmias, epilepsy, and other disorders. Among decedents, nonsynonymous DNMs were enriched in genes associated with cardiac and seizure disorders relative to controls (odds ratio = 9.76, P = 2.15 × 10⁻⁴). We also found evidence for overtransmission of loss-of-function (LoF) or previously reported pathogenic variants in these same genes from heterozygous carrier parents (11 of 14 transmitted, P = 0.03). We identified a total of 11 SUDC proband genotypes (7 de novo, 1 transmitted parental mosaic, 2 transmitted parental heterozygous, and 1 compound heterozygous) as pathogenic and likely contributory to death, a genetic finding in 8.9% of our cohort. Two genes had recurrent missense DNMs, RYR2 and CACNA1C. Both RYR2 mutations are pathogenic (P = 1.7 × 10⁻⁷) and were previously studied in mouse models. Both CACNA1C mutations lie within a 104-nt exon (P = 1.0 × 10⁻⁷) and result in slowed L-type calcium channel inactivation and lower current density. In total, six pathogenic DNMs can alter calcium-related regulation of cardiomyocyte and neuronal excitability at a submembrane junction, suggesting a pathway conferring susceptibility to sudden death. There was a trend for excess LoF mutations in LoF intolerant genes, where ≥1 nonhealthy sample in denovo-db has a similar variant (odds ratio = 6.73, P = 0.02); additional uncharacterized genetic causes of sudden death in children might be discovered with larger cohorts.
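Two of the figures in this abstract can be checked with simple arithmetic. The sketch below assumes the overtransmission P value comes from a one-sided exact binomial test against the Mendelian expectation of 50% transmission; the abstract does not name the test, so this is one plausible reading rather than the authors' stated method. The helper name is hypothetical.

```python
from math import comb


def binom_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Overtransmission: 11 of 14 carrier-parent variants were transmitted.
p_over = binom_upper_tail(11, 14)

# Diagnostic yield: 11 contributory genotypes among 124 trios.
yield_pct = 100 * 11 / 124

print(f"one-sided binomial P = {p_over:.4f}")
print(f"yield = {yield_pct:.1f}%")
```

Under this assumption the tail probability is 470/16384, which rounds to the reported 0.03, and the yield rounds to the reported 8.9%.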
|
36
|
M-DATA: A statistical approach to jointly analyzing de novo mutations for multiple traits. PLoS Genet 2021; 17:e1009849. [PMID: 34735430 PMCID: PMC8568192 DOI: 10.1371/journal.pgen.1009849] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/16/2021] [Accepted: 09/29/2021] [Indexed: 11/22/2022] Open
Abstract
Recent studies have demonstrated that multiple early-onset diseases have shared risk genes, based on findings from de novo mutations (DNMs). Therefore, we may leverage information from one trait to improve statistical power to identify genes for another trait. However, there are few methods that can jointly analyze DNMs from multiple traits. In this study, we develop a framework called M-DATA (Multi-trait framework for De novo mutation Association Test with Annotations) to increase the statistical power of association analysis by integrating data from multiple correlated traits and their functional annotations. Using the number of DNMs from multiple diseases, we develop a method based on an Expectation-Maximization algorithm to both infer the degree of association between two diseases as well as to estimate the gene association probability for each disease. We apply our method to a case study of jointly analyzing data from congenital heart disease (CHD) and autism. Our method was able to identify 23 genes for CHD from joint analysis, including 12 novel genes, which is substantially more than single-trait analysis, leading to novel insights into CHD disease etiology.
|
37
|
Abstract
Despite years of active research into the role of DNA repair and replication in mutagenesis, surprisingly little is known about the origin of spontaneous human mutation in the germ line. With the advent of high-throughput sequencing, genome-scale data have revealed statistical properties of mutagenesis in humans. These properties include variation of the mutation rate and spectrum along the genome at different scales in relation to epigenomic features and dependency on parental age. Moreover, mutations originated in mothers are less frequent than mutations originated in fathers and have a distinct genomic distribution. Statistical analyses that interpret these patterns in the context of known biochemistry can provide mechanistic models of mutagenesis in humans.
|
38
|
A modified fluctuation assay reveals a natural mutator phenotype that drives mutation spectrum variation within Saccharomyces cerevisiae. eLife 2021; 10:68285. [PMID: 34523420 PMCID: PMC8497059 DOI: 10.7554/elife.68285] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/11/2021] [Accepted: 09/14/2021] [Indexed: 12/23/2022] Open
Abstract
Although studies of Saccharomyces cerevisiae have provided many insights into mutagenesis and DNA repair, most of this work has focused on a few laboratory strains. Much less is known about the phenotypic effects of natural variation within S. cerevisiae’s DNA repair pathways. Here, we use natural polymorphisms to detect historical mutation spectrum differences among several wild and domesticated S. cerevisiae strains. To determine whether these differences are likely caused by genetic mutation rate modifiers, we use a modified fluctuation assay with a CAN1 reporter to measure de novo mutation rates and spectra in 16 of the analyzed strains. We measure a 10-fold range of mutation rates and identify two strains with distinctive mutation spectra. These strains, known as AEQ and AAR, come from the panel’s ‘Mosaic beer’ clade and share an enrichment for C > A mutations that is also observed in rare variation segregating throughout the genomes of several Mosaic beer and Mixed origin strains. Both AEQ and AAR are haploid derivatives of the diploid natural isolate CBS 1782, whose rare polymorphisms are enriched for C > A as well, suggesting that the underlying mutator allele is likely active in nature. We use a plasmid complementation test to show that AAR and AEQ share a mutator allele in the DNA repair gene OGG1, which excises 8-oxoguanine lesions that can cause C > A mutations if left unrepaired.
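As a rough illustration of how a fluctuation assay turns culture counts into a mutation rate, the sketch below uses the classic p0 (null-class) estimator. This is an assumption made for illustration only: the abstract describes a modified assay and does not state which estimator the authors used. The function name and parameters are hypothetical, and dividing by the final cell count is just one common convention.

```python
import math


def p0_mutation_rate(cultures_without_mutants: int,
                     total_cultures: int,
                     cells_per_culture: float) -> float:
    """Estimate mutations per cell per division with the p0 method.

    Hypothetical helper: under a Poisson model, the fraction of parallel
    cultures with no mutants is p0 = exp(-m), where m is the expected
    number of mutation events per culture.
    """
    if not 0 < cultures_without_mutants <= total_cultures:
        raise ValueError("need at least one mutant-free culture, "
                         "and no more than the total")
    if cells_per_culture <= 0:
        raise ValueError("cells_per_culture must be positive")
    p0 = cultures_without_mutants / total_cultures
    m = -math.log(p0)  # expected mutation events per culture
    return m / cells_per_culture  # assumed convention: divide by final cell count


# Example: 24 of 48 cultures show no resistant colonies, 1e8 cells each.
rate = p0_mutation_rate(24, 48, 1e8)
```

The p0 method only works when some cultures are mutant-free, which is why the sketch rejects a zero count rather than returning infinity.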
|
39
|
Abstract
De novo mutations are central for evolution, since they provide the raw material for natural selection by regenerating genetic variation. However, studying de novo mutations is challenging and is generally restricted to model species, so we have a limited understanding of the evolution of the mutation rate and spectrum between closely related species. Here, we present a mutation accumulation (MA) experiment to study de novo mutation in the unicellular green alga Chlamydomonas incerta and perform comparative analyses with its closest known relative, Chlamydomonas reinhardtii. Using whole-genome sequencing data, we estimate that the median single nucleotide mutation (SNM) rate in C. incerta is μ = 7.6 × 10⁻¹⁰, and is highly variable between MA lines, ranging from μ = 0.35 × 10⁻¹⁰ to μ = 131.7 × 10⁻¹⁰. The SNM rate is strongly positively correlated with the mutation rate for insertions and deletions between lines (r > 0.97). We infer that the genomic factors associated with variation in the mutation rate are similar to those in C. reinhardtii, allowing for cross-prediction between species. Among these genomic factors, sequence context and complexity are more important than GC content. With the exception of a remarkably high C→T bias, the SNM spectrum differs markedly between the two Chlamydomonas species. Our results suggest that similar genomic and biological characteristics may result in a similar mutation rate in the two species, whereas the SNM spectrum has more freedom to diverge.
|
40
|
Mutability of mononucleotide repeats, not oxidative stress, explains the discrepancy between laboratory-accumulated mutations and the natural allele-frequency spectrum in C. elegans. Genome Res 2021; 31:1602-1613. [PMID: 34404692 PMCID: PMC8415377 DOI: 10.1101/gr.275372.121] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 02/09/2021] [Accepted: 07/12/2021] [Indexed: 11/24/2022]
Abstract
Important clues about natural selection can be gleaned from discrepancies between the properties of segregating genetic variants and of mutations accumulated experimentally under minimal selection, provided the mutational process is the same in the laboratory as in nature. The base-substitution spectrum differs between C. elegans laboratory mutation accumulation (MA) experiments and the standing site-frequency spectrum, which has been argued to be in part owing to increased oxidative stress in the laboratory environment. Using genome sequence data from C. elegans MA lines carrying a mutation (mev-1) that increases the cellular titer of reactive oxygen species (ROS), leading to increased oxidative stress, we find the base-substitution spectrum is similar between mev-1, its wild-type progenitor (N2), and another set of MA lines derived from a different wild strain (PB306). Conversely, the rate of short insertions is greater in mev-1, consistent with studies in other organisms in which environmental stress increased the rate of insertion–deletion mutations. Further, the mutational properties of mononucleotide repeats in all strains are different from those of nonmononucleotide sequence, both for indels and base-substitutions, and whereas the nonmononucleotide spectra are fairly similar between MA lines and wild isolates, the mononucleotide spectra are very different, with a greater frequency of A:T → T:A transversions and an increased proportion of ±1-bp indels. The discrepancy in mutational spectra between laboratory MA experiments and natural variation is likely owing to a consistent (but unknown) effect of the laboratory environment that manifests itself via different modes of mutability and/or repair at mononucleotide loci.
|
41
|
Genomic partitioning of inbreeding depression in humans. Am J Hum Genet 2021; 108:1488-1501. [PMID: 34214457 DOI: 10.1016/j.ajhg.2021.06.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/21/2020] [Accepted: 06/01/2021] [Indexed: 02/05/2023] Open
Abstract
Across species, offspring of related individuals often exhibit significant reduction in fitness-related traits, known as inbreeding depression (ID), yet the genetic and molecular basis for ID remains elusive. Here, we develop a method to quantify enrichment of ID within specific genomic annotations and apply it to human data. We analyzed the phenomes and genomes of ∼350,000 unrelated participants of the UK Biobank and found, on average of over 11 traits, significant enrichment of ID within genomic regions with high recombination rates (>21-fold; p < 10⁻⁵), with conserved function across species (>19-fold; p < 10⁻⁴), and within regulatory elements such as DNase I hypersensitive sites (∼5-fold; p = 8.9 × 10⁻⁷). We also quantified enrichment of ID within trait-associated regions and found suggestive evidence that genomic regions contributing to additive genetic variance in the population are enriched for ID signal. We find strong correlations between functional enrichment of SNP-based heritability and that of ID (r = 0.8, standard error: 0.1). These findings provide empirical evidence that ID is most likely due to many partially recessive deleterious alleles in low linkage disequilibrium regions of the genome. Our study suggests that functional characterization of ID may further elucidate the genetic architectures and biological mechanisms underlying complex traits and diseases.
|
42
|
Association of assisted reproductive technology, germline de novo mutations and congenital heart defects in a prospective birth cohort study. Cell Res 2021; 31:919-928. [PMID: 34108666 PMCID: PMC8324888 DOI: 10.1038/s41422-021-00521-w] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 01/14/2021] [Accepted: 05/17/2021] [Indexed: 01/05/2023] Open
Abstract
Emerging evidence suggests that children conceived through assisted reproductive technology (ART) have a higher risk of congenital heart defects (CHDs) even when there is no family history. De novo mutation (DNM) is a well-known cause of sporadic congenital diseases; however, whether ART procedures increase the number of germline DNMs (gDNMs) has not yet been well studied. Here, we performed whole-genome sequencing of 1137 individuals from 160 families conceived through ART and 205 families conceived spontaneously. Children conceived via ART carried 4.59 more gDNMs than children conceived spontaneously, including 3.32 paternal and 1.26 maternal DNMs, after correcting for parental age at conception, cigarette smoking, alcohol drinking, and exercise behaviors. Paternal DNMs in offspring conceived via ART are characterized by C>T substitutions at CpG sites, which potentially affect protein-coding genes and are significantly associated with the increased risk of CHD. In addition, the accumulation of non-coding functional mutations was independently associated with CHD, and 87.9% of the mutations originated from the father. Among ART offspring, infertility of the father was associated with elevated paternal DNMs; usage of both recombinant and urinary follicle-stimulating hormone and high-dosage human chorionic gonadotropin trigger was associated with an increase of maternal DNMs. In sum, the increased gDNMs in offspring conceived by ART originated primarily from fathers, indicating that ART itself may not be a major reason for the accumulation of gDNMs. Our findings emphasize the importance of evaluating the germline status of the fathers in families with the use of ART.
|
43
|
Discovery of genomic variation across a generation. Hum Mol Genet 2021; 30:R174-R186. [PMID: 34296264 PMCID: PMC8490016 DOI: 10.1093/hmg/ddab209] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/28/2021] [Revised: 07/09/2021] [Accepted: 07/19/2021] [Indexed: 11/12/2022] Open
Abstract
Over the past 30 years (the timespan of a generation), advances in genomics technologies have revealed tremendous and unexpected variation in the human genome and have provided increasingly accurate answers to long-standing questions of how much genetic variation exists in human populations and to what degree the DNA complement changes between parents and offspring. Tracking the characteristics of these inherited and spontaneous (or de novo) variations has been the basis of the study of human genetic disease. From genome-wide microarray and next-generation sequencing scans, we now know that each human genome contains over 3 million single nucleotide variants when compared with the ~ 3 billion base pairs in the human reference genome, along with roughly an order of magnitude more DNA—approximately 30 megabase pairs (Mb)—being ‘structurally variable’, mostly in the form of indels and copy number changes. Additional large-scale variations include balanced inversions (average of 18 Mb) and complex, difficult-to-resolve alterations. Collectively, ~1% of an individual’s genome will differ from the human reference sequence. When comparing across a generation, fewer than 100 new genetic variants are typically detected in the euchromatic portion of a child’s genome. Driven by increasingly higher-resolution and higher-throughput sequencing technologies, newer and more accurate databases of genetic variation (for instance, more comprehensive structural variation data and phasing of combinations of variants along chromosomes) of worldwide populations will emerge to underpin the next era of discovery in human molecular genetics.
|
44
|
Sequencing of a central nervous system tumor demonstrates cancer transmission in an organ transplant. Life Sci Alliance 2021; 4:e202000941. [PMID: 34301805 PMCID: PMC8321656 DOI: 10.26508/lsa.202000941] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2020] [Revised: 07/09/2021] [Accepted: 07/12/2021] [Indexed: 11/24/2022] Open
Abstract
This study uses DNA sequencing to trace a donor organ transplant–mediated cancer transmission and illustrates how precise molecular pathology profiles might reduce future risk for transplant recipients. Four organ transplant recipients from an organ donor diagnosed with anaplastic pleomorphic xanthoastrocytoma developed fatal malignancies for which the origin could not be confirmed by standard methods. We identified the somatic mutational profiles of the neoplasms using next-generation sequencing technologies and tracked the relationship between the different samples. The data were consistent with the presence of an aggressive clonal entity in the donor and the subsequent proliferation of descendent tumors in each recipient. Deleterious mutations in BRAF, PIK3CA, SDHC, DDR2, and FANCD2, and a chromosomal deletion spanning the CDKN2A/B genes, were shared between the recipients’ lesions. In addition to demonstrating that DNA sequencing tracked a donor/recipient cancer transmission, this study established that the genetic profile of a donor tumor and its potential aggressive phenotype could have been determined before transplantation was considered. As the genetic correlates of tumor invasion and metastases become better known, adding genetic profiling by DNA sequencing to the data considered for transplant safety should be considered.
|
45
|
Abstract
Mutations are the driving force of evolution, yet they underlie many diseases, in particular, cancer. They are thought to arise from a combination of stochastic errors in DNA processing, naturally occurring DNA damage (e.g., the spontaneous deamination of methylated CpG sites), replication errors, and dysregulation of DNA repair mechanisms. High-throughput sequencing has made it possible to generate large datasets to study mutational processes in health and disease. Since the emergence of the first mutational process studies in 2012, this field has been gaining increasing attention and has already accumulated a host of computational approaches and biomedical applications.
|
46
|
The challenge and promise of estimating the de novo mutation rate from whole-genome comparisons among closely related individuals. Mol Ecol 2021; 30:6087-6100. [PMID: 34062029 DOI: 10.1111/mec.16007] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 01/18/2021] [Revised: 04/22/2021] [Accepted: 05/26/2021] [Indexed: 12/20/2022]
Abstract
Germline mutations are the raw material for natural selection, driving species evolution and the generation of earth's biodiversity. Without this driver of genetic diversity, life on earth would stagnate. Yet, it is a double-edged sword. An excess of mutations can have devastating effects on fitness and population viability. It is therefore one of the great challenges of molecular ecology to determine the rate and mechanisms by which these mutations accrue across the tree of life. Advances in high-throughput sequencing technologies are providing new opportunities for characterizing the rates and mutational spectra within species and populations thus informing essential evolutionary parameters such as the timing of speciation events, the intricacies of historical demography, and the degree to which lineages are subject to the burdens of mutational load. Here, we will focus on both the challenge and promise of whole-genome comparisons among parents and their offspring from known pedigrees for the detection of germline mutations as they arise in a single generation. The potential of these studies is high, but the field is still in its infancy and much uncertainty remains. Namely, the technical challenges are daunting given that pedigree-based genome comparisons are essentially searching for needles in a haystack given the very low signal to noise ratio. Despite the challenges, we predict that rapidly developing methods for whole-genome comparisons hold great promise for integrating empirically derived estimates of de novo mutation rates and mutation spectra across many molecular ecological applications.
|
47
|
Using GC Content to Compare Recombination Patterns on the Sex Chromosomes and Autosomes of the Guppy, Poecilia reticulata, and Its Close Outgroup Species. Mol Biol Evol 2021; 37:3550-3562. [PMID: 32697821 DOI: 10.1093/molbev/msaa187] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 12/26/2022] Open
Abstract
Genetic and physical mapping of the guppy (Poecilia reticulata) have shown that recombination patterns differ greatly between males and females. Crossover events occur evenly across the chromosomes in females, but in male meiosis they are restricted to the tip furthest from the centromere of each chromosome, creating very high recombination rates per megabase, as in pseudoautosomal regions of mammalian sex chromosomes. We used GC content to indirectly infer recombination patterns on guppy chromosomes, based on evidence that recombination is associated with GC-biased gene conversion, so that genome regions with high recombination rates should be detectable by high GC content. We used intron sequences and third positions of codons to make comparisons between sequences that are matched, as far as possible, and are all probably under weak selection. Almost all guppy chromosomes, including the sex chromosome (LG12), have very high GC values near their assembly ends, suggesting high recombination rates due to strong crossover localization in male meiosis. Our test does not suggest that the guppy XY pair has stronger crossover localization than the autosomes, or than the homologous chromosome in the close relative, the platyfish (Xiphophorus maculatus). We therefore conclude that the guppy XY pair has not recently undergone an evolutionary change to a different recombination pattern, or reduced its crossover rate, but that the guppy evolved Y-linkage due to acquiring a male-determining factor that also conferred the male crossover pattern. We also identify the centromere ends of guppy chromosomes, which were not determined in the genome assembly.
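The third-codon-position GC measure mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function name `gc3_content` is hypothetical, and the sketch assumes the input is an in-frame coding sequence starting at codon position 1.

```python
def gc3_content(cds: str) -> float:
    """Fraction of G or C among third codon positions.

    Hypothetical helper; assumes `cds` is in frame and starts at the
    first position of a codon.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    thirds = [seq[i] for i in range(2, usable, 3)]
    # Count only unambiguous bases so that N or gaps do not bias the ratio.
    called = [base for base in thirds if base in "ACGT"]
    if not called:
        raise ValueError("no unambiguous third-position bases")
    return sum(base in "GC" for base in called) / len(called)


# Codons ATG, GCC, AAA have third positions G, C, A.
value = gc3_content("ATGGCCAAA")
```

Computing the same ratio in windows along a chromosome would show the kind of rise in GC near assembly ends that the abstract reports, under the stated assumption that GC-biased gene conversion tracks recombination.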
|
48
|
Constrained chromatin accessibility in PU.1-mutated agammaglobulinemia patients. J Exp Med 2021; 218:212070. [PMID: 33951726 PMCID: PMC8105723 DOI: 10.1084/jem.20201750] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 08/13/2020] [Revised: 02/09/2021] [Accepted: 03/16/2021] [Indexed: 12/19/2022] Open
Abstract
The pioneer transcription factor (TF) PU.1 controls hematopoietic cell fate by decompacting stem cell heterochromatin and allowing nonpioneer TFs to enter otherwise inaccessible genomic sites. PU.1 deficiency fatally arrests lymphopoiesis and myelopoiesis in mice, but human congenital PU.1 disorders have not previously been described. We studied six unrelated agammaglobulinemic patients, each harboring a heterozygous mutation (four de novo, two unphased) of SPI1, the gene encoding PU.1. Affected patients lacked circulating B cells and possessed few conventional dendritic cells. Introducing disease-similar SPI1 mutations into human hematopoietic stem and progenitor cells impaired early in vitro B cell and myeloid cell differentiation. Patient SPI1 mutations encoded destabilized PU.1 proteins unable to nuclear localize or bind target DNA. In PU.1-haploinsufficient pro–B cell lines, euchromatin was less accessible to nonpioneer TFs critical for B cell development, and gene expression patterns associated with the pro– to pre–B cell transition were undermined. Our findings molecularly describe a novel form of agammaglobulinemia and underscore PU.1’s critical, dose-dependent role as a hematopoietic euchromatin gatekeeper.
|
49
|
Automatic inference of demographic parameters using generative adversarial networks. Mol Ecol Resour 2021; 21:2689-2705. [PMID: 33745225 PMCID: PMC8596911 DOI: 10.1111/1755-0998.13386] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 07/29/2020] [Accepted: 03/05/2021] [Indexed: 12/12/2022]
Abstract
Population genetics relies heavily on simulated data for validation, inference and intuition. In particular, since the evolutionary ‘ground truth’ for real data is always limited, simulated data are crucial for training supervised machine learning methods. Simulation software can accurately model evolutionary processes but requires many hand‐selected input parameters. As a result, simulated data often fail to mirror the properties of real genetic data, which limits the scope of methods that rely on it. Here, we develop a novel approach to estimating parameters in population genetic models that automatically adapts to data from any population. Our method, pg‐gan, is based on a generative adversarial network that gradually learns to generate realistic synthetic data. We demonstrate that our method is able to recover input parameters in a simulated isolation‐with‐migration model. We then apply our method to human data from the 1000 Genomes Project and show that we can accurately recapitulate the features of real data.
|
50
|
Relationships among smoking, oxidative stress, inflammation, macromolecular damage, and cancer. Mutation Research. Reviews in Mutation Research 2021; 787:108365. [PMID: 34083039 PMCID: PMC8287787 DOI: 10.1016/j.mrrev.2021.108365] [Citation(s) in RCA: 164] [Impact Index Per Article: 54.7] [Received: 10/30/2020] [Revised: 01/06/2021] [Accepted: 01/07/2021] [Indexed: 02/07/2023]
Abstract
Smoking is a major risk factor for a variety of diseases, including cancer and immune-mediated inflammatory diseases. Tobacco smoke contains a mixture of chemicals, including a host of reactive oxygen- and nitrogen species (ROS and RNS), among others, that can damage cellular and sub-cellular targets, such as lipids, proteins, and nucleic acids. A growing body of evidence supports a key role for smoking-induced ROS and the resulting oxidative stress in inflammation and carcinogenesis. This comprehensive and up-to-date review covers four interrelated topics, including 'smoking', 'oxidative stress', 'inflammation', and 'cancer'. The review discusses each of the four topics, while exploring the intersections among the topics by highlighting the macromolecular damage attributable to ROS. Specifically, oxidative damage to macromolecular targets, such as lipid peroxidation, post-translational modification of proteins, and DNA adduction, as well as enzymatic and non-enzymatic antioxidant defense mechanisms, and the multi-faceted repair pathways of oxidized lesions are described. Also discussed are the biological consequences of oxidative damage to macromolecules if they evade the defense mechanisms and/or are not repaired properly or in time. Emphasis is placed on the genetic- and epigenetic alterations that may lead to transcriptional deregulation of functionally-important genes and disruption of regulatory elements. Smoking-associated oxidative stress also activates the inflammatory response pathway, which triggers a cascade of events of which ROS production is an initial yet indispensable step. The release of ROS at the site of damage and inflammation helps combat foreign pathogens and restores the injured tissue, while simultaneously increasing the burden of oxidative stress. 
This creates a vicious cycle in which smoking-related oxidative stress causes inflammation, which in turn, results in further generation of ROS, and potentially increased oxidative damage to macromolecular targets that may lead to cancer initiation and/or progression.
|