51. Wang RJ, Radivojac P, Hahn MW. Distinct error rates for reference and nonreference genotypes estimated by pedigree analysis. Genetics 2021; 217:1-10. [PMID: 33683359 DOI: 10.1093/genetics/iyaa014] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/19/2020] [Accepted: 11/13/2020] [Indexed: 01/06/2023] Open
Abstract
Errors in genotype calling can have perverse effects on genetic analyses, confounding association studies, and obscuring rare variants. Analyses now routinely incorporate error rates to control for spurious findings. However, reliable estimates of the error rate can be difficult to obtain because of their variance between studies. Most studies also report only a single estimate of the error rate even though genotypes can be miscalled in more than one way. Here, we report a method for estimating the rates at which different types of genotyping errors occur at biallelic loci using pedigree information. Our method identifies potential genotyping errors by exploiting instances where the haplotypic phase has not been faithfully transmitted. The expected frequency of inconsistent phase depends on the combination of genotypes in a pedigree and the probability of miscalling each genotype. We develop a model that uses the differences in these frequencies to estimate rates for different types of genotype error. Simulations show that our method accurately estimates these error rates in a variety of scenarios. We apply this method to a dataset from the whole-genome sequencing of owl monkeys (Aotus nancymaae) in three-generation pedigrees. We find significant differences between estimates for different types of genotyping error, with the most common being homozygous reference sites miscalled as heterozygous and vice versa. The approach we describe is applicable to any set of genotypes where haplotypic phase can reliably be called and should prove useful in helping to control for false discoveries.
Affiliation(s)
- Richard J Wang
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Predrag Radivojac
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
- Matthew W Hahn
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
  - Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
52. Wang C, Lv H, Ling X, Li H, Diao F, Dai J, Du J, Chen T, Xi Q, Zhao Y, Zhou K, Xu B, Han X, Liu X, Peng M, Chen C, Tao S, Huang L, Liu C, Wen M, Jiang Y, Jiang T, Lu C, Wu W, Wu D, Chen M, Lin Y, Guo X, Huo R, Liu J, Ma H, Jin G, Xia Y, Sha J, Shen H, Hu Z. Association of assisted reproductive technology, germline de novo mutations and congenital heart defects in a prospective birth cohort study. Cell Res 2021; 31:919-928. [PMID: 34108666 PMCID: PMC8324888 DOI: 10.1038/s41422-021-00521-w] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 01/14/2021] [Accepted: 05/17/2021] [Indexed: 01/05/2023] Open
Abstract
Emerging evidence suggests that children conceived through assisted reproductive technology (ART) have a higher risk of congenital heart defects (CHDs) even when there is no family history. De novo mutation (DNM) is a well-known cause of sporadic congenital diseases; however, whether ART procedures increase the number of germline DNMs (gDNMs) has not yet been well studied. Here, we performed whole-genome sequencing of 1137 individuals from 160 families conceived through ART and 205 families conceived spontaneously. Children conceived via ART carried 4.59 more gDNMs than children conceived spontaneously, including 3.32 paternal and 1.26 maternal DNMs, after correcting for parental age at conception, cigarette smoking, alcohol drinking, and exercise behaviors. Paternal DNMs in offspring conceived via ART are characterized by C>T substitutions at CpG sites, which potentially affect protein-coding genes and are significantly associated with the increased risk of CHD. In addition, the accumulation of non-coding functional mutations was independently associated with CHD, and 87.9% of the mutations originated from the father. Among ART offspring, infertility of the father was associated with elevated paternal DNMs; usage of both recombinant and urinary follicle-stimulating hormone and high-dosage human chorionic gonadotropin trigger was associated with an increase of maternal DNMs. In sum, the increased gDNMs in offspring conceived by ART originated primarily from fathers, indicating that ART itself may not be a major reason for the accumulation of gDNMs. Our findings emphasize the importance of evaluating the germline status of the fathers in families with the use of ART.
Affiliation(s)
- Cheng Wang
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Bioinformatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu, China
- Hong Lv
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
  - State Key Laboratory of Reproductive Medicine (Suzhou Centre), The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
- Xiufeng Ling
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Reproduction, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Hospital, Nanjing, Jiangsu, China
- Hong Li
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Reproductive Genetic Center, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
- Feiyang Diao
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Clinical Center of Reproductive Medicine, The First Affiliated Hospital of Nanjing Medical University, Nanjing Medical University, Nanjing, Jiangsu, China
- Juncheng Dai
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Jiangbo Du
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Ting Chen
  - Scientific Education Section, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Hospital, Nanjing, Jiangsu, China
- Qi Xi
  - Department of Obstetrics, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
- Yang Zhao
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Kun Zhou
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Bo Xu
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Xiumei Han
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Xiaoyu Liu
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Meijuan Peng
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Congcong Chen
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Shiyao Tao
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Lei Huang
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Cong Liu
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Mingyang Wen
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Yangqian Jiang
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Tao Jiang
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Chuncheng Lu
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Wei Wu
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Di Wu
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Minjian Chen
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Yuan Lin
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - State Key Laboratory of Reproductive Medicine (Suzhou Centre), The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
  - Department of Maternal, Child and Adolescent Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Xuejiang Guo
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Ran Huo
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Jiayin Liu
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - State Key Laboratory of Reproductive Medicine (Suzhou Centre), The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
  - Clinical Center of Reproductive Medicine, The First Affiliated Hospital of Nanjing Medical University, Nanjing Medical University, Nanjing, Jiangsu, China
- Hongxia Ma
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Guangfu Jin
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Yankai Xia
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Jiahao Sha
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
- Hongbing Shen
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
  - State Key Laboratory of Reproductive Medicine (Suzhou Centre), The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
- Zhibin Hu
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
  - Department of Epidemiology and Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
  - State Key Laboratory of Reproductive Medicine (Suzhou Centre), The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
53. Patil AB, Vijay N. Repetitive genomic regions and the inference of demographic history. Heredity (Edinb) 2021; 127:151-166. [PMID: 34002046 PMCID: PMC8322061 DOI: 10.1038/s41437-021-00443-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 02/20/2021] [Revised: 04/16/2021] [Accepted: 04/17/2021] [Indexed: 02/03/2023] Open
Abstract
Inference of demographic histories using whole-genome datasets has provided insights into diversification, adaptation, hybridization, and plant-pathogen interactions, and stimulated debate on the impact of anthropogenic interventions and past climate on species demography. However, the impact of repetitive genomic regions on these inferences has mostly been ignored by masking of repeats. We use the Populus trichocarpa genome (Pop_tri_v3) to show that masking of repeat regions leads to lower estimates of effective population size (Ne) in the distant past in contrast to an increase in Ne estimates in recent times. However, in human datasets, masking of repeats resulted in lower estimates of Ne at all time points. We demonstrate that repeats affect demographic inferences using diverse methods like PSMC, MSMC, SMC++, and the Stairway plot. Our genomic analysis revealed that the biases in Ne estimates were dependent on the repeat class type and its abundance in each atomic interval. Notably, we observed a weak, yet consistently significant negative correlation between the repeat abundance of an atomic interval and the Ne estimates for that interval, which potentially reflects the recombination rate variation within the genome. The rationale for the masking of repeats has been that variants identified within these regions are erroneous. We find that polymorphisms in some repeat classes occur in callable regions and reflect reliable coalescence histories (e.g., LTR Gypsy, LTR Copia). The current demography inference methods do not handle repeats explicitly, and hence the effect of individual repeat classes needs careful consideration in comparative analysis. Deciphering the repeat demographic histories might provide a clear understanding of the processes involved in repeat accumulation.
Affiliation(s)
- Ajinkya Bharatraj Patil
  - Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
- Nagarjun Vijay
  - Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
54. Kim YA, Leiserson MDM, Moorjani P, Sharan R, Wojtowicz D, Przytycka TM. Mutational Signatures: From Methods to Mechanisms. Annu Rev Biomed Data Sci 2021; 4:189-206. [PMID: 34465178 DOI: 10.1146/annurev-biodatasci-122320-120920] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/09/2022]
Abstract
Mutations are the driving force of evolution, yet they underlie many diseases, in particular cancer. They are thought to arise from a combination of stochastic errors in DNA processing, naturally occurring DNA damage (e.g., the spontaneous deamination of methylated CpG sites), replication errors, and dysregulation of DNA repair mechanisms. High-throughput sequencing has made it possible to generate large datasets to study mutational processes in health and disease. Since the emergence of the first mutational process studies in 2012, this field has gained increasing attention and has already accumulated a host of computational approaches and biomedical applications.
Affiliation(s)
- Yoo-Ah Kim
  - National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
- Mark D M Leiserson
  - Department of Computer Science and Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland 20742, USA
- Priya Moorjani
  - Department of Molecular and Cell Biology and Center for Computational Biology, University of California, Berkeley, California 94720, USA
- Roded Sharan
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
- Damian Wojtowicz
  - National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
- Teresa M Przytycka
  - National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
55. Bergman J, Schierup MH. Population dynamics of GC-changing mutations in humans and great apes. Genetics 2021; 218:6291657. [PMID: 34081117 DOI: 10.1093/genetics/iyab083] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/15/2021] [Accepted: 05/27/2021] [Indexed: 11/14/2022] Open
Abstract
The nucleotide composition of the genome is a balance between origin and fixation rates of different mutations. For example, it is well known that transitions occur more frequently than transversions, particularly at CpG sites. Differences in fixation rates of mutation types are less explored. Specifically, recombination-associated GC-biased gene conversion (gBGC) may differentially impact GC-changing mutations, due to differences in their genomic distributions and efficiency of mismatch repair mechanisms. Given that recombination evolves rapidly across species, we explore gBGC of different mutation types across human populations and great ape species. We report a stronger correlation between segregating GC frequency and recombination for transitions than for transversions. Notably, CpG transitions are most strongly affected by gBGC in humans and chimpanzees. We show that the overall strength of gBGC is generally correlated with effective population sizes in humans, with some notable exceptions, such as a stronger effect of gBGC on non-CpG transitions in populations of European descent. Furthermore, species of the Gorilla and Pongo genera have a greatly reduced gBGC effect on CpG sites. We also study the dependence of gBGC dynamics on flanking nucleotides and show that some mutation types evolve in opposition to the gBGC expectation, likely due to hypermutability of specific nucleotide contexts. Our results highlight the importance of different gBGC dynamics experienced by GC-changing mutations and their impact on nucleotide composition evolution.
Affiliation(s)
- Juraj Bergman
  - Bioinformatics Research Institute, Aarhus University, DK-8000 Aarhus C, Denmark
56. Goldberg ME, Harris K. Mutational signatures of replication timing and epigenetic modification persist through the global divergence of mutation spectra across the great ape phylogeny. Genome Biol Evol 2021; 14:6275268. [PMID: 33983415 PMCID: PMC8743035 DOI: 10.1093/gbe/evab104] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Accepted: 05/07/2021] [Indexed: 11/17/2022] Open
Abstract
Great ape clades exhibit variation in the relative mutation rates of different three-base-pair genomic motifs, with closely related species having more similar mutation spectra than distantly related species. This pattern cannot be explained by classical demographic or selective forces, but implies that DNA replication fidelity has been perturbed in different ways on each branch of the great ape phylogeny. Here, we use whole-genome variation from 88 great apes to investigate whether these species’ mutation spectra are broadly differentiated across the entire genome, or whether mutation spectrum differences are driven by DNA compartments that have particular functional features or chromatin states. We perform principal component analysis (PCA) and mutational signature deconvolution on mutation spectra ascertained from compartments defined by features including replication timing and ancient repeat content, finding evidence for consistent species-specific mutational signatures that do not depend on which functional compartments the spectra are ascertained from. At the same time, we find that many compartments have their own characteristic mutational signatures that appear stable across the great ape phylogeny. For example, in a mutation spectrum PCA compartmentalized by replication timing, the second principal component, explaining 21.2% of variation, separates all species’ late-replicating regions from their early-replicating regions. Our results suggest that great ape mutation spectrum evolution is not driven by epigenetic changes that modify mutation rates in specific genomic regions, but instead by trans-acting mutational modifiers that affect mutagenesis across the whole genome fairly uniformly.
Affiliation(s)
- Michael E Goldberg
  - University of Washington Department of Genome Sciences, 3720 15th Ave NE, Seattle WA 98105, United States of America
- Kelley Harris
  - University of Washington Department of Genome Sciences, 3720 15th Ave NE, Seattle WA 98105, United States of America
  - Fred Hutchinson Cancer Center Computational Biology Division, 1100 Fairview Ave N, Seattle, WA 98109, United States of America
57. Hubert JN, Suybeng V, Vallée M, Delhomme TM, Maubec E, Boland A, Bacq D, Deleuze JF, Jouenne F, Brennan P, McKay JD, Avril MF, Bressac-de Paillerets B, Chanudet E. The PI3K/mTOR Pathway Is Targeted by Rare Germline Variants in Patients with Both Melanoma and Renal Cell Carcinoma. Cancers (Basel) 2021; 13:2243. [PMID: 34067022 PMCID: PMC8125037 DOI: 10.3390/cancers13092243] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/10/2021] [Revised: 04/26/2021] [Accepted: 04/28/2021] [Indexed: 12/24/2022] Open
Abstract
Background: Malignant melanoma and renal cell carcinoma (RCC) have different embryonic origins and no common lifestyle risk factors, but intriguingly share biological properties such as immune regulation and radioresistance. An excess risk of malignant melanoma is observed in RCC patients and vice versa. This bidirectional association is poorly understood, and hypothetical genetic co-susceptibility remains largely unexplored. Results: We hereby provide a clinical and genetic description of a series of 125 cases affected by both malignant melanoma and RCC. Clinical germline mutation testing identified a pathogenic variant in a melanoma and/or RCC predisposing gene in 17/125 cases (13.6%). This included mutually exclusive variants in MITF (p.E318K locus, N = 9 cases), BAP1 (N = 3), CDKN2A (N = 2), FLCN (N = 2), and PTEN (N = 1). A subset of 46 early-onset cases, without underlying germline variation, was whole-exome sequenced. In this series, thirteen genes were significantly enriched in mostly exclusive rare variants predicted to be deleterious, compared to 19,751 controls of similar ancestry. The observed variation mainly consisted of novel or low-frequency variants (<0.01%) within genes displaying strong evolutionary mutational constraints along the PI3K/mTOR pathway, including PIK3CD, NFRKB, EP300, MTOR, and related epigenetic modifier SETD2. The screening of independently processed germline exomes from The Cancer Genome Atlas confirmed an association with melanoma and RCC but not with cancers of established differing etiology such as lung cancers. Conclusions: Our study highlights that an exome-wide case-control enrichment approach may better characterize the rare variant-based missing heritability of multiple primary cancers. In our series, the co-occurrence of malignant melanoma and RCC was associated with germline variation in the PI3K/mTOR signaling cascade, with potential relevance for early diagnosis and clinical management.
Affiliation(s)
- Jean-Noël Hubert
  - Section of Genetics, International Agency for Research on Cancer (IARC-WHO), 69372 Lyon, France
- Voreak Suybeng
  - Gustave Roussy, Département de Biopathologie, 94805 Villejuif, France
- Maxime Vallée
  - Section of Genetics, International Agency for Research on Cancer (IARC-WHO), 69372 Lyon, France
- Tiffany M. Delhomme
  - Section of Genetics, International Agency for Research on Cancer (IARC-WHO), 69372 Lyon, France
- Eve Maubec
  - Department of Dermatology, AP-HP, Hôpital Avicenne, University Paris 13, 93000 Bobigny, France
  - UMRS-1124, Campus Paris Saint-Germain-des-Prés, University of Paris, 75006 Paris, France
- Anne Boland
  - Centre National de Recherche en Génomique Humaine, Université Paris-Saclay, CEA, 91057 Evry, France
- Delphine Bacq
  - Centre National de Recherche en Génomique Humaine, Université Paris-Saclay, CEA, 91057 Evry, France
- Jean-François Deleuze
  - Centre National de Recherche en Génomique Humaine, Université Paris-Saclay, CEA, 91057 Evry, France
- Fanélie Jouenne
  - Gustave Roussy, Département de Biopathologie, 94805 Villejuif, France
- Paul Brennan
  - Section of Genetics, International Agency for Research on Cancer (IARC-WHO), 69372 Lyon, France
- James D. McKay
  - Section of Genetics, International Agency for Research on Cancer (IARC-WHO), 69372 Lyon, France
- Brigitte Bressac-de Paillerets
  - Gustave Roussy, Département de Biopathologie, 94805 Villejuif, France
  - INSERM U1279, Tumor Cell Dynamics, 94805 Villejuif, France
- Estelle Chanudet
  - Section of Genetics, International Agency for Research on Cancer (IARC-WHO), 69372 Lyon, France
58. Taliun D, Harris DN, Kessler MD, Carlson J, Szpiech ZA, Torres R, Taliun SAG, Corvelo A, Gogarten SM, Kang HM, Pitsillides AN, LeFaive J, Lee SB, Tian X, Browning BL, Das S, Emde AK, Clarke WE, Loesch DP, Shetty AC, Blackwell TW, Smith AV, Wong Q, Liu X, Conomos MP, Bobo DM, Aguet F, Albert C, Alonso A, Ardlie KG, Arking DE, Aslibekyan S, Auer PL, Barnard J, Barr RG, Barwick L, Becker LC, Beer RL, Benjamin EJ, Bielak LF, Blangero J, Boehnke M, Bowden DW, Brody JA, Burchard EG, Cade BE, Casella JF, Chalazan B, Chasman DI, Chen YDI, Cho MH, Choi SH, Chung MK, Clish CB, Correa A, Curran JE, Custer B, Darbar D, Daya M, de Andrade M, DeMeo DL, Dutcher SK, Ellinor PT, Emery LS, Eng C, Fatkin D, Fingerlin T, Forer L, Fornage M, Franceschini N, Fuchsberger C, Fullerton SM, Germer S, Gladwin MT, Gottlieb DJ, Guo X, Hall ME, He J, Heard-Costa NL, Heckbert SR, Irvin MR, Johnsen JM, Johnson AD, Kaplan R, Kardia SLR, Kelly T, Kelly S, Kenny EE, Kiel DP, Klemmer R, Konkle BA, Kooperberg C, Köttgen A, Lange LA, Lasky-Su J, Levy D, Lin X, Lin KH, Liu C, Loos RJF, Garman L, Gerszten R, Lubitz SA, Lunetta KL, Mak ACY, Manichaikul A, Manning AK, Mathias RA, McManus DD, McGarvey ST, Meigs JB, Meyers DA, Mikulla JL, Minear MA, Mitchell BD, Mohanty S, Montasser ME, Montgomery C, Morrison AC, Murabito JM, Natale A, Natarajan P, Nelson SC, North KE, O'Connell JR, Palmer ND, Pankratz N, Peloso GM, Peyser PA, Pleiness J, Post WS, Psaty BM, Rao DC, Redline S, Reiner AP, Roden D, Rotter JI, Ruczinski I, Sarnowski C, Schoenherr S, Schwartz DA, Seo JS, Seshadri S, Sheehan VA, Sheu WH, Shoemaker MB, Smith NL, Smith JA, Sotoodehnia N, Stilp AM, Tang W, Taylor KD, Telen M, Thornton TA, Tracy RP, Van Den Berg DJ, Vasan RS, Viaud-Martinez KA, Vrieze S, Weeks DE, Weir BS, Weiss ST, Weng LC, Willer CJ, Zhang Y, Zhao X, Arnett DK, Ashley-Koch AE, Barnes KC, Boerwinkle E, Gabriel S, Gibbs R, Rice KM, Rich SS, Silverman EK, Qasba P, Gan W, Papanicolaou GJ, Nickerson DA, Browning SR, Zody MC, Zöllner S, Wilson JG, Cupples LA, Laurie CC, Jaquish CE, Hernandez RD, O'Connor TD, Abecasis GR. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 2021; 590:290-299. [PMID: 33568819 PMCID: PMC7875770 DOI: 10.1038/s41586-021-03205-y] [Citation(s) in RCA: 1207] [Impact Index Per Article: 301.8] [Received: 03/06/2019] [Accepted: 01/07/2021] [Indexed: 02/08/2023]
Abstract
The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.
Affiliation(s)
- Daniel Taliun
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Daniel N Harris
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Michael D Kessler
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Jedidiah Carlson
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Zachary A Szpiech
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
  - Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA, USA
- Raul Torres
  - Biomedical Sciences Graduate Program, University of California, San Francisco, San Francisco, CA, USA
- Sarah A Gagliano Taliun
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Hyun Min Kang
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Jonathon LeFaive
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Seung-Been Lee
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Xiaowen Tian
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Brian L Browning
  - Department of Medicine, Division of Medical Genetics, University of Washington, Seattle, WA, USA
- Sayantan Das
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Douglas P Loesch
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Amol C Shetty
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Thomas W Blackwell
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Albert V Smith
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Quenna Wong
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Xiaoming Liu
  - USF Genomics, College of Public Health, University of South Florida, Tampa, FL, USA
- Matthew P Conomos
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Dean M Bobo
  - Icahn School of Medicine at Mount Sinai, New York, NY, USA
- François Aguet
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Alvaro Alonso
  - Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA
- Dan E Arking
  - McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Paul L Auer
  - Zilber School of Public Health, University of Wisconsin Milwaukee, Milwaukee, WI, USA
- R Graham Barr
  - Department of Medicine, Columbia University Medical Center, New York, NY, USA
  - Department of Epidemiology, Columbia University Medical Center, New York, NY, USA
- Rebecca L Beer
  - National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Emelia J Benjamin
  - Department of Medicine, Boston University School of Medicine, Boston, MA, USA
  - Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
  - Framingham Heart Study, Framingham, MA, USA
- Lawrence F Bielak
  - Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA
- John Blangero
  - Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
- Michael Boehnke
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Donald W Bowden
  - Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
- Jennifer A Brody
  - Department of Medicine, University of Washington, Seattle, WA, USA
  - Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
- Esteban G Burchard
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
  - Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Brian E Cade
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
  - Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- James F Casella
  - Department of Pediatrics, Johns Hopkins University, Baltimore, MD, USA
  - Division of Pediatric Hematology, Johns Hopkins University, Baltimore, MD, USA
- Brandon Chalazan
  - Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Daniel I Chasman
  - Division of Preventive Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Yii-Der Ida Chen
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation, Harbor-UCLA Medical Center, Torrance, CA, USA
- Michael H Cho
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Mina K Chung
  - Department of Cardiovascular Medicine, Heart & Vascular Institute, Cleveland Clinic, Cleveland, OH, USA
  - Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Clary B Clish
  - Metabolomics Platform, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Adolfo Correa
  - Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
  - Department of Pediatrics, University of Mississippi Medical Center, Jackson, MS, USA
  - Department of Population Health Science, University of Mississippi Medical Center, Jackson, MS, USA
- Joanne E Curran
  - Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
- Brian Custer
  - Vitalant Research Institute, San Francisco, CA, USA
  - Department of Laboratory Medicine, University of California, San Francisco, San Francisco, CA, USA
- Dawood Darbar
  - Department of Medicine, University of Illinois at Chicago, Chicago, IL, USA
- Michelle Daya
  - Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Dawn L DeMeo
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Susan K Dutcher
  - McDonnell Genome Institute, Washington University, St Louis, MO, USA
  - Department of Genetics, Washington University, St Louis, MO, USA
- Patrick T Ellinor
  - Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Leslie S Emery
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Celeste Eng
  - Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Diane Fatkin
  - Molecular Cardiology Division, Victor Chang Cardiac Research Institute, Darlinghurst, New South Wales, Australia
  - Faculty of Medicine, University of New South Wales, Kensington, New South Wales, Australia
  - Cardiology Department, St Vincent's Hospital, Darlinghurst, New South Wales, Australia
- Tasha Fingerlin
  - National Jewish Health, Center for Genes, Environment and Health, Denver, CO, USA
- Lukas Forer
  - Institute of Genetic Epidemiology, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
- Myriam Fornage
  - Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, TX, USA
- Nora Franceschini
  - Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
- Christian Fuchsberger
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Institute of Genetic Epidemiology, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
  - Institute for Biomedicine, Eurac Research, Bolzano, Italy
- Stephanie M Fullerton
  - Department of Bioethics & Humanities, University of Washington School of Medicine, Seattle, WA, USA
- Mark T Gladwin
  - Pittsburgh Heart, Lung, Blood and Vascular Medicine Institute, University of Pittsburgh, Pittsburgh, PA, USA
  - Pulmonary, Allergy and Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Daniel J Gottlieb
  - VA Boston Healthcare System, Boston, MA, USA
  - Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA, USA
- Xiuqing Guo
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation, Harbor-UCLA Medical Center, Torrance, CA, USA
- Michael E Hall
  - Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
- Jiang He
  - Department of Epidemiology, Tulane University, New Orleans, LA, USA
  - Tulane University Translational Science Institute, Tulane University, New Orleans, LA, USA
- Nancy L Heard-Costa
  - Framingham Heart Study, Framingham, MA, USA
  - Department of Neurology, Boston University School of Medicine, Boston, MA, USA
- Susan R Heckbert
  - Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
  - Department of Epidemiology, University of Washington, Seattle, WA, USA
- Marguerite R Irvin
  - Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, USA
- Jill M Johnsen
  - Department of Medicine, University of Washington, Seattle, WA, USA
  - Bloodworks Northwest Research Institute, Seattle, WA, USA
- Andrew D Johnson
  - Framingham Heart Study, Framingham, MA, USA
  - Population Sciences Branch, National Heart, Lung, and Blood Institute, National Institutes of Health, Framingham, MA, USA
- Robert Kaplan
  - Albert Einstein College of Medicine, New York, NY, USA
- Sharon L R Kardia
  - Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Tanika Kelly
  - Department of Epidemiology, Tulane University, New Orleans, LA, USA
- Shannon Kelly
  - Department of Epidemiology, Vitalant Research Institute, San Francisco, CA, USA
  - Department of Pediatrics, UCSF Benioff Children's Hospital, Oakland, CA, USA
  - Division of Pediatric Hematology, UCSF Benioff Children's Hospital, Oakland, CA, USA
- Eimear E Kenny
  - Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Douglas P Kiel
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
  - Hinda and Arthur Marcus Institute for Aging Research, Hebrew SeniorLife, Boston, MA, USA
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Robert Klemmer
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Barbara A Konkle
  - Department of Medicine, University of Washington, Seattle, WA, USA
  - Bloodworks Northwest Research Institute, Seattle, WA, USA
- Charles Kooperberg
  - Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Anna Köttgen
  - Department of Epidemiology, Johns Hopkins University, Baltimore, MD, USA
  - Institute of Genetic Epidemiology, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
- Leslie A Lange
  - Department of Medicine, University of Colorado at Denver, Aurora, CO, USA
- Jessica Lasky-Su
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
  - Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Brigham and Women's Hospital, Boston, MA, USA
- Daniel Levy
  - Department of Medicine, Boston University School of Medicine, Boston, MA, USA
  - Framingham Heart Study, Framingham, MA, USA
  - Population Sciences Branch, National Heart, Lung, and Blood Institute, National Institutes of Health, Framingham, MA, USA
- Xihong Lin
  - Biostatistics and Statistics, Harvard University, Boston, MA, USA
- Keng-Han Lin
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Chunyu Liu
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Ruth J F Loos
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Lori Garman
  - Department of Genes and Human Disease, Oklahoma Medical Research Foundation, Oklahoma City, OK, USA
- Kathryn L Lunetta
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Angel C Y Mak
  - Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Ani Manichaikul
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
  - Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
- Alisa K Manning
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
  - Clinical and Translational Epidemiology Unit, Mongan Institute, Massachusetts General Hospital, Boston, MA, USA
  - Metabolism Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Rasika A Mathias
  - Department of Medicine, Johns Hopkins University, Baltimore, MD, USA
- David D McManus
  - Cardiovascular Medicine, University of Massachusetts Medical School, Worcester, MA, USA
- Stephen T McGarvey
  - International Health Institute, Brown University, Providence, RI, USA
  - Department of Epidemiology, Brown University, Providence, RI, USA
  - Department of Anthropology, Brown University, Providence, RI, USA
- James B Meigs
  - Division of General Internal Medicine, Massachusetts General Hospital, Harvard Medical School, The Broad Institute of MIT and Harvard, Boston, MA, USA
- Julie L Mikulla
  - National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Mollie A Minear
  - National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Braxton D Mitchell
  - Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
  - Geriatrics Research and Education Clinical Center, Baltimore Veterans Administration Medical Center, Baltimore, MD, USA
- Sanghamitra Mohanty
  - Texas Cardiac Arrhythmia Institute, St David's Medical Center, Austin, TX, USA
  - Department of Internal Medicine, Dell Medical School, Austin, TX, USA
- May E Montasser
  - Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Courtney Montgomery
  - Department of Genes and Human Disease, Oklahoma Medical Research Foundation, Oklahoma City, OK, USA
- Alanna C Morrison
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA
- Joanne M Murabito
  - Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Andrea Natale
  - Texas Cardiac Arrhythmia Institute, St David's Medical Center, Austin, TX, USA
- Pradeep Natarajan
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
  - Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Sarah C Nelson
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Kari E North
  - Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
- Jeffrey R O'Connell
  - Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Nicholette D Palmer
  - Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
- Nathan Pankratz
  - Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
- Gina M Peloso
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Patricia A Peyser
  - Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Jacob Pleiness
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Wendy S Post
  - Division of Cardiology, Department of Medicine, Johns Hopkins University, Baltimore, MD, USA
- Bruce M Psaty
  - Department of Medicine, University of Washington, Seattle, WA, USA
  - Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
  - Department of Epidemiology, University of Washington, Seattle, WA, USA
  - Department of Health Services, University of Washington, Seattle, WA, USA
  - Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
- D C Rao
  - Division of Biostatistics, Washington University in St Louis, St Louis, MO, USA
- Susan Redline
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
  - Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Alexander P Reiner
  - Department of Epidemiology, University of Washington, Seattle, WA, USA
  - Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Dan Roden
  - Vanderbilt University Medical Center, Nashville, TN, USA
- Jerome I Rotter
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation, Harbor-UCLA Medical Center, Torrance, CA, USA
- Ingo Ruczinski
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Chloé Sarnowski
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Sebastian Schoenherr
  - Institute of Genetic Epidemiology, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
- Jeong-Sun Seo
  - Precision Medicine Center, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
  - Macrogen Inc, Seoul, Republic of Korea
  - Gong Wu Genomic Medicine Institute, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
- Sudha Seshadri
  - Framingham Heart Study, Framingham, MA, USA
  - Glenn Biggs Institute for Alzheimer's and Neurodegenerative Diseases, University of Texas Health Sciences Center at San Antonio, San Antonio, TX, USA
- Vivien A Sheehan
  - Department of Pediatrics, Emory University School of Medicine, Atlanta, GA, USA
  - Aflac Cancer and Blood Disorders Center, Children's Healthcare of Atlanta, Atlanta, GA, USA
- Wayne H Sheu
  - Taichung Veterans General Hospital Taiwan, Taichung City, Taiwan
- Nicholas L Smith
  - Department of Epidemiology, University of Washington, Seattle, WA, USA
  - Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
  - Seattle Epidemiologic Research and Information Center, Department of Veterans Affairs Office of Research and Development, Seattle, WA, USA
- Jennifer A Smith
  - Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
- Nona Sotoodehnia
  - Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
- Adrienne M Stilp
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Weihong Tang
  - Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Kent D Taylor
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation, Harbor-UCLA Medical Center, Torrance, CA, USA
- Russell P Tracy
  - Department of Pathology & Laboratory Medicine, University of Vermont Larner College of Medicine, Burlington, VT, USA
- David J Van Den Berg
  - Center for Genetic Epidemiology, Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
- Ramachandran S Vasan
  - Department of Medicine, Boston University School of Medicine, Boston, MA, USA
  - Framingham Heart Study, Framingham, MA, USA
- Scott Vrieze
  - Department of Psychology, University of Minnesota, Minneapolis, MN, USA
- Daniel E Weeks
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Bruce S Weir
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Scott T Weiss
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
  - Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Brigham and Women's Hospital, Boston, MA, USA
- Cristen J Willer
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Internal Medicine-Cardiology, University of Michigan, Ann Arbor, MI, USA
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Yingze Zhang
  - Pittsburgh Heart, Lung, Blood and Vascular Medicine Institute, University of Pittsburgh, Pittsburgh, PA, USA
  - Pulmonary, Allergy and Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Xutong Zhao
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Donna K Arnett
  - Department of Epidemiology, University of Kentucky, Lexington, KY, USA
- Allison E Ashley-Koch
  - Duke Molecular Physiology Institute, Duke University Medical Center, Durham, NC, USA
- Kathleen C Barnes
  - Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Eric Boerwinkle
  - University of Texas Health Science Center at Houston, Houston, TX, USA
  - Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
- Stacey Gabriel
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Richard Gibbs
  - Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
- Kenneth M Rice
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Stephen S Rich
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
  - Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
- Edwin K Silverman
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Pankaj Qasba
  - National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Weiniu Gan
  - National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- George J Papanicolaou
  - National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Deborah A Nickerson
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Northwest Genomics Center, Seattle, WA, USA
  - Brotman Baty Institute, Seattle, WA, USA
- Sharon R Browning
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Sebastian Zöllner
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
- James G Wilson
  - Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, MS, USA
- L Adrienne Cupples
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
  - Framingham Heart Study, Framingham, MA, USA
- Cathy C Laurie
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Cashell E Jaquish
  - National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Ryan D Hernandez
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
  - Department of Human Genetics, McGill University, Montreal, Quebec, Canada
  - Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Timothy D O'Connor
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Gonçalo R Abecasis
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
59
Jia X, Burugula BB, Chen V, Lemons RM, Jayakody S, Maksutova M, Kitzman JO. Massively parallel functional testing of MSH2 missense variants conferring Lynch syndrome risk. Am J Hum Genet 2021; 108:163-175. [PMID: 33357406 PMCID: PMC7820803 DOI: 10.1016/j.ajhg.2020.12.003] [Citation(s) in RCA: 77] [Impact Index Per Article: 19.3] [Received: 08/11/2020] [Accepted: 12/03/2020] [Indexed: 12/20/2022] Open
Abstract
The lack of functional evidence for the majority of missense variants limits their clinical interpretability and poses a key barrier to the broad utility of carrier screening. In Lynch syndrome (LS), one of the most highly prevalent cancer syndromes, nearly 90% of clinically observed missense variants are deemed “variants of uncertain significance” (VUS). To systematically resolve their functional status, we performed a massively parallel screen in human cells to identify loss-of-function missense variants in the key DNA mismatch repair factor MSH2. The resulting functional effect map is substantially complete, covering 94% of the 17,746 possible variants, and is highly concordant (96%) with existing functional data and expert clinicians’ interpretations. The large majority (89%) of missense variants were functionally neutral, perhaps unexpectedly in light of its evolutionary conservation. These data provide ready-to-use functional evidence to resolve the ∼1,300 extant missense VUSs in MSH2 and may facilitate the prospective classification of newly discovered variants in the clinic.
60
Identification of New, Functionally Relevant Mutations in the Coding Regions of the Human Fos and Jun Proto-Oncogenes in Rheumatoid Arthritis Synovial Tissue. Life (Basel) 2020; 11:life11010005. [PMID: 33374881 PMCID: PMC7823737 DOI: 10.3390/life11010005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/30/2020] [Revised: 12/16/2020] [Accepted: 12/22/2020] [Indexed: 02/06/2023] Open
Abstract
In rheumatoid arthritis (RA), the expression of many pro-destructive/pro-inflammatory proteins depends on the transcription factor AP-1. Therefore, our aim was to analyze the presence and functional relevance of mutations in the coding regions of the AP-1 subunits of the fos and jun family in peripheral blood (PB) and synovial membranes (SM) of RA and osteoarthritis patients (OA, disease control), as well as normal controls (NC). Using the non-isotopic RNAse cleavage assay, one known polymorphism (T252C: silent; rs1046117; present in RA, OA, and NC) and three novel germline mutations of the cfos gene were detected: (i) C361G/A367G: Gln121Glu/Ile123Val, denoted as “fos121/123”; present only in one OA sample; (ii) G374A: Arg125Lys, “fos125”; and (iii) C217A/G374A: Leu73Met/Arg125Lys, “fos73/125”, the latter two exclusively present in RA. In addition, three novel somatic cjun mutations (604–606ΔCAG: ΔGln202, “jun202”; C706T: Pro236Ser, “jun236”; G750A: silent) were found exclusively in the RA SM. Transgenic expression of fos125 and fos73/125 mutants in NIH-3T3 cells induced an activation of reporter constructs containing either the MMP-1 (matrix metalloproteinase) promoter (3- and 4-fold, respectively) or a pentameric AP-1 site (approximately 5-fold). Combined expression of these two cfos mutants with cjun wildtype or mutants (jun202, jun236) further enhanced reporter expression of the pentameric AP-1 construct. Finally, genotyping for the novel functionally relevant germline mutations in 298 RA, 288 OA, and 484 NC samples revealed no association with RA. Thus, functional cfos/cjun mutants may contribute to local joint inflammation/destruction in selected patients with RA by altering the transactivation capacity of AP-1 complexes.
61
Demongeot J, Moreira A, Seligmann H. Negative CG dinucleotide bias: An explanation based on feedback loops between Arginine codon assignments and theoretical minimal RNA rings. Bioessays 2020; 43:e2000071. [PMID: 33319381 DOI: 10.1002/bies.202000071] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/02/2020] [Revised: 11/23/2020] [Accepted: 11/26/2020] [Indexed: 01/05/2023]
Abstract
Theoretical minimal RNA rings are candidate primordial genes evolved for non-redundant coding of the genetic code's 22 coding signals (one codon per biogenic amino acid, a start and a stop codon) over the shortest possible length: 29520 22-nucleotide-long RNA rings solve this min-max constraint. Numerous RNA ring properties are reminiscent of natural genes. Here we present analyses showing that all RNA rings lack dinucleotide CG (a mutable, chemically instable dinucleotide coding for Arginine), bearing a resemblance to known CG-depleted genomes. CG in "incomplete" RNA rings (not coding for all coding signals, with only 3-12 nucleotides) gradually decreases towards CG absence in complete, 22-nucleotide-long RNA rings. Presumably, feedback loops during RNA ring growth during evolution (when amino acid assignment fixed the genetic code) assigned Arg to codons lacking CG (AGR) to avoid CG. Hence, as a chemical property of base pairs, CG mutability restructured the genetic code, thereby establishing itself as genetically encoded biological information.
Affiliation(s)
- Jacques Demongeot
- Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical & Labcom CNRS/UGA/OrangeLabs Telecom4Health, Faculty of Medicine, Université Grenoble Alpes, La Tronche, France
- Andrés Moreira
- Departamento de Informática, Universidad Técnica Federico Santa María, Santiago, Chile
- Hervé Seligmann
- Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical & Labcom CNRS/UGA/OrangeLabs Telecom4Health, Faculty of Medicine, Université Grenoble Alpes, La Tronche, France; The National Natural History Collections, The Hebrew University of Jerusalem, Jerusalem, Israel; Institute of Microstructure Technology, Karlsruhe Institute of Technology (KIT), Eggenstein-Leopoldshafen, Germany
62
Identification and characterization of constrained non-exonic bases lacking predictive epigenomic and transcription factor binding annotations. Nat Commun 2020; 11:6168. [PMID: 33268804 PMCID: PMC7710766 DOI: 10.1038/s41467-020-19962-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/29/2020] [Accepted: 11/06/2020] [Indexed: 12/12/2022] Open
Abstract
Annotations of evolutionary sequence constraint based on multi-species genome alignments and genome-wide maps of epigenomic marks and transcription factor binding provide important complementary information for understanding the human genome and genetic variation. Here we developed the Constrained Non-Exonic Predictor (CNEP) to quantify the evidence of each base in the genome being in an evolutionarily constrained non-exonic element from an input of over 60,000 epigenomic and transcription factor binding features. We find that the CNEP score outperforms baseline and related existing scores at predicting evolutionarily constrained non-exonic bases from such data. However, a subset of them are still not well predicted by CNEP. We developed a complementary Conservation Signature Score by CNEP (CSS-CNEP) that is predictive of those bases. We further characterize the nature of constrained non-exonic bases with low CNEP scores using additional types of information. CNEP and CSS-CNEP are resources for analyzing constrained non-exonic bases in the genome. Genome-wide maps of evolutionary constraint and large-scale compendia of epigenomic and transcription factor data provide complementary information for genome annotation. Here, the authors develop the Constrained Non-Exonic Predictor (CNEP) that enables better understanding of their relationship.
63
The Impact of DNA Methylation Dynamics on the Mutation Rate During Human Germline Development. G3 (Bethesda) 2020; 10:3337-3346. [PMID: 32727923 PMCID: PMC7466984 DOI: 10.1534/g3.120.401511] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/13/2022]
Abstract
DNA methylation is a dynamic epigenetic modification found in most eukaryotic genomes. It is known to lead to a high CpG to TpG mutation rate. However, the relationship between the methylation dynamics in germline development and the germline mutation rate remains unexplored. In this study, we used whole genome bisulfite sequencing (WGBS) data of cells at 13 stages of human germline development and rare variants from the 1000 Genomes Project as proxies for germline mutations to investigate the correlation between dynamic methylation levels and germline mutation rates at different scales. At the single-site level, we found a significant correlation between methylation and the germline point mutation rate at CpG sites during germline developmental stages. Then we explored the mutability of methylation dynamics in all stages. Our results also showed a broad correlation between the regional methylation level and the rate of C > T mutation at CpG sites in all genomic regions, especially in intronic regions; a similar link was also seen at all chromosomal levels. Our findings indicate that the dynamic DNA methylome during human germline development has a broader mutational impact than is commonly assumed.
64
Cortez Cardoso Penha R, Cortez de Almeida RF, Câmara Mariz J, Brewer Lisboa L, do Nascimento Barbosa L, Souto da Silva R. The deregulation of NOTCH pathway, inflammatory cytokines, and keratinization genes in two Dowling-Degos disease patients with hidradenitis suppurativa. Am J Med Genet A 2020; 182:2662-2665. [PMID: 33200913 DOI: 10.1002/ajmg.a.61800] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/26/2020] [Revised: 07/04/2020] [Accepted: 07/06/2020] [Indexed: 11/09/2022]
Abstract
Dowling-Degos disease (DDD) is a rare autosomal-dominant genodermatosis that has been associated with hidradenitis suppurativa (HS). Deregulation of the NOTCH pathway has been linked to the development of HS in the context of DDD (DDD-HS). However, molecular alterations in DDD-HS, including altered gene expression of NOTCH and downstream effectors that are involved in follicular differentiation and the inflammatory response, are poorly defined. We report two cases of patients diagnosed with DDD-HS, one of whom was under adalimumab treatment. Our results show downregulation of the NOTCH1/NCSTN pathway, distinct molecular profiles of inflammatory cytokines (IL23A and TNF), and a novel aberrant upregulation of genes involved in cornified envelope (CE) formation (SPRR1B, SPRR2D, SPRR3, and IVL) in paired HS lesions of two DDD patients.
Affiliation(s)
- Juliana Câmara Mariz
- Department of Dermatology, State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
- Lilian Brewer Lisboa
- Molecular Carcinogenesis Program, Brazilian National Cancer Institute (INCA), Rio de Janeiro, Brazil
- Roberto Souto da Silva
- Department of Dermatology, State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
65
Simon H, Huttley G. Quantifying Influences on Intragenomic Mutation Rate. G3 (Bethesda) 2020; 10:2641-2652. [PMID: 32527747 PMCID: PMC7407452 DOI: 10.1534/g3.120.401335] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/28/2020] [Accepted: 05/28/2020] [Indexed: 12/14/2022]
Abstract
We report work to quantify the impact on the probability of human genome polymorphism both of recombination and of sequence context at different scales. We use population-based analyses of data on human genetic variants obtained from the public Ensembl database. For recombination, we calculate the variance due to recombination and the probability that a recombination event causes a mutation. We employ novel statistical procedures to take account of the spatial auto-correlation of recombination and mutation rates along the genome. Our results support the view that genomic diversity in recombination hotspots arises largely from a direct effect of recombination on mutation rather than predominantly from the effect of selective sweeps. We also use the statistic of variance due to context to compare the effect on the probability of polymorphism of contexts of various sizes. We find that when the 12 point mutations are considered separately, variance due to context increases significantly as we move from 3-mer to 5-mer and from 5-mer to 7-mer contexts. However, when all mutations are considered in aggregate, these differences are outweighed by the effect of interaction between the central base and its immediate neighbors. This interaction is itself dominated by the transition mutations, including, but not limited to, the CpG effect. We also demonstrate strand-asymmetry of contextual influence in intronic regions, which is hypothesized to be a result of transcription coupled DNA repair. We consider the extent to which the measures we have used can be used to meaningfully compare the relative magnitudes of the impact of recombination and context on mutation.
Affiliation(s)
- Helmut Simon
- Research School of Biology, the Australian National University
- Gavin Huttley
- Research School of Biology, the Australian National University
66
Wu FL, Strand AI, Cox LA, Ober C, Wall JD, Moorjani P, Przeworski M. A comparison of humans and baboons suggests germline mutation rates do not track cell divisions. PLoS Biol 2020; 18:e3000838. [PMID: 32804933 PMCID: PMC7467331 DOI: 10.1371/journal.pbio.3000838] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 11/15/2019] [Revised: 09/02/2020] [Accepted: 07/28/2020] [Indexed: 12/19/2022] Open
Abstract
In humans, most germline mutations are inherited from the father. This observation has been widely interpreted as reflecting the replication errors that accrue during spermatogenesis. If so, the male bias in mutation should be substantially lower in a closely related species with similar rates of spermatogonial stem cell divisions but a shorter mean age of reproduction. To test this hypothesis, we resequenced two 3-4 generation nuclear families (totaling 29 individuals) of olive baboons (Papio anubis), who reproduce at approximately 10 years of age on average, and analyzed the data in parallel with three 3-generation human pedigrees (26 individuals). We estimated a mutation rate per generation in baboons of 0.57×10⁻⁸ per base pair, approximately half that of humans. Strikingly, however, the degree of male bias in germline mutations is approximately 4:1, similar to that of humans; indeed, a similar male bias is seen across mammals that reproduce months, years, or decades after birth. These results mirror the finding in humans that the male mutation bias is stable with parental ages and cast further doubt on the assumption that germline mutations track cell divisions. Our mutation rate estimates for baboons raise a further puzzle, suggesting a divergence time between apes and Old World monkeys of 65 million years, too old to be consistent with the fossil record; reconciling them now requires not only a slowdown of the mutation rate per generation in humans but also in baboons.
Affiliation(s)
- Felix L. Wu
- Department of Systems Biology, Columbia University, New York, New York, United States of America
- Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, New York, United States of America
- Alva I. Strand
- Department of Biological Sciences, Columbia University, New York, New York, United States of America
- Laura A. Cox
- Center for Precision Medicine, Department of Internal Medicine, Section of Molecular Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
- Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, Texas, United States of America
- Carole Ober
- Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Jeffrey D. Wall
- Institute for Human Genetics, Department of Epidemiology & Biostatistics, University of California, San Francisco, San Francisco, California, United States of America
- Priya Moorjani
- Department of Biological Sciences, Columbia University, New York, New York, United States of America
- Molly Przeworski
- Department of Systems Biology, Columbia University, New York, New York, United States of America
- Department of Biological Sciences, Columbia University, New York, New York, United States of America
67
Chintalapati M, Moorjani P. Evolution of the mutation rate across primates. Curr Opin Genet Dev 2020; 62:58-64. [PMID: 32634682 DOI: 10.1016/j.gde.2020.05.028] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 03/15/2020] [Revised: 05/10/2020] [Accepted: 05/22/2020] [Indexed: 12/31/2022]
Abstract
Germline mutations are the source of all heritable variation. In the past few years, whole genome sequencing has allowed direct and comprehensive surveys of mutation patterns in humans and other species. These studies have documented substantial variation in both mutation rates and spectra across primates, the causes of which remain unclear. Here, we review what is currently known about mutation rates in primates, highlight the factors proposed to explain the variation across species, and discuss some implications of these findings on our understanding of the chronology of primate evolution and the process of mutagenesis.
Affiliation(s)
- Manjusha Chintalapati
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, United States; Center for Computational Biology, University of California, Berkeley, CA, United States
- Priya Moorjani
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, United States; Center for Computational Biology, University of California, Berkeley, CA, United States.
68
Germline de novo mutation rates on exons versus introns in humans. Nat Commun 2020; 11:3304. [PMID: 32620809 PMCID: PMC7334200 DOI: 10.1038/s41467-020-17162-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 01/07/2020] [Accepted: 06/02/2020] [Indexed: 02/06/2023] Open
Abstract
A main assumption of molecular population genetics is that genomic mutation rate does not depend on sequence function. Challenging this assumption, a recent study has found a reduction in the mutation rate in exons compared to introns in somatic cells, ascribed to an enhanced exonic mismatch repair system activity. If this reduction happens also in the germline, it can compromise studies of population genomics, including the detection of selection when using introns as proxies for neutrality. Here we compile and analyze published germline de novo mutation data to test if the exonic mutation rate is also reduced in germ cells. After controlling for sampling bias in datasets with diseased probands and extended nucleotide context dependency, we find no reduction in the mutation rate in exons compared to introns in the germline. Therefore, there is no evidence that enhanced exonic mismatch repair activity determines the mutation rate in germline cells. Evidence that somatic mutation rates in introns exceed those in exons challenges the molecular evolution tenet that mutation rate and sequence function are independent. Here, authors analyze germline de novo mutations and reveal no evidence for mutation rate differences between exons and introns.
69
Vierstra J, Lazar J, Sandstrom R, Halow J, Lee K, Bates D, Diegel M, Dunn D, Neri F, Haugen E, Rynes E, Reynolds A, Nelson J, Johnson A, Frerker M, Buckley M, Kaul R, Meuleman W, Stamatoyannopoulos JA. Global reference mapping of human transcription factor footprints. Nature 2020; 583:729-736. [PMID: 32728250 PMCID: PMC7410829 DOI: 10.1038/s41586-020-2528-x] [Citation(s) in RCA: 225] [Impact Index Per Article: 45.0] [Received: 01/30/2020] [Accepted: 06/25/2020] [Indexed: 11/09/2022]
Abstract
Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits, but it remains challenging to distinguish variants that affect regulatory function. Genomic DNase I footprinting enables the quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin. However, only a small fraction of such sites have been precisely resolved on the human genome sequence. Here, to enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from 243 human cell and tissue types and states and integrated these data to delineate about 4.5 million compact genomic elements that encode transcription factor occupancy at nucleotide resolution. We map the fine-scale structure within about 1.6 million DNase I-hypersensitive sites and show that the overwhelming majority are populated by well-spaced sites of single transcription factor-DNA interaction. Cell-context-dependent cis-regulation is chiefly executed by wholesale modulation of accessibility at regulatory DNA rather than by differential transcription factor occupancy within accessible elements. We also show that the enrichment of genetic variants associated with diseases or phenotypic traits in regulatory regions is almost entirely attributable to variants within footprints, and that functional variants that affect transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles. Unexpectedly, we find increased density of human genetic variation within transcription factor footprints, revealing an unappreciated driver of cis-regulatory evolution. Our results provide a framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation.
Affiliation(s)
- Jeff Vierstra
- Altius Institute for Biomedical Sciences, Seattle, WA, USA.
- John Lazar
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Jessica Halow
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Kristen Lee
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Daniel Bates
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Morgan Diegel
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Douglas Dunn
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Fidencio Neri
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Eric Haugen
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Eric Rynes
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Alex Reynolds
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Jemma Nelson
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Audra Johnson
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Mark Frerker
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Rajinder Kaul
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- John A Stamatoyannopoulos
- Altius Institute for Biomedical Sciences, Seattle, WA, USA.
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Division of Oncology, Department of Medicine, University of Washington, Seattle, WA, USA.
70
Carlson J, DeWitt WS, Harris K. Inferring evolutionary dynamics of mutation rates through the lens of mutation spectrum variation. Curr Opin Genet Dev 2020; 62:50-57. [PMID: 32619789 DOI: 10.1016/j.gde.2020.05.024] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 02/04/2020] [Revised: 05/13/2020] [Accepted: 05/22/2020] [Indexed: 01/04/2023]
Abstract
There are many possible failure points in the transmission of genetic information that can produce heritable germline mutations. Once a mutation has been passed from parents to offspring for several generations, it can be difficult or impossible to identify its root cause; however, sometimes the nature of the ancestral and derived DNA sequences can provide mechanistic clues about a genetic change that happened hundreds or thousands of generations ago. Here, we review evidence that the sequence context 'spectrum' of germline mutagenesis has been evolving surprisingly rapidly over the history of humans and other species. We go on to discuss possible causal factors that might underlie rapid mutation spectrum evolution.
Affiliation(s)
- Jedidiah Carlson
- Department of Genome Sciences, Foege Hall, University of Washington, Seattle, WA 98105, United States
- William S DeWitt
- Department of Genome Sciences, Foege Hall, University of Washington, Seattle, WA 98105, United States; Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Eastlake Ave E, Seattle, WA 98109, United States
- Kelley Harris
- Department of Genome Sciences, Foege Hall, University of Washington, Seattle, WA 98105, United States; Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Eastlake Ave E, Seattle, WA 98109, United States.
71
Rong S, Buerer L, Rhine CL, Wang J, Cygan KJ, Fairbrother WG. Mutational bias and the protein code shape the evolution of splicing enhancers. Nat Commun 2020; 11:2845. [PMID: 32504065 PMCID: PMC7275064 DOI: 10.1038/s41467-020-16673-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/30/2019] [Accepted: 04/28/2020] [Indexed: 02/06/2023] Open
Abstract
Exonic splicing enhancers (ESEs) are enriched in exons relative to introns and bind splicing activators. This study considers a fundamental question of co-evolution: How did ESE motifs become enriched in exons prior to the evolution of ESE recognition? We hypothesize that the high exon to intron motif ratios necessary for ESE function were created by mutational bias coupled with purifying selection on the protein code. These two forces retain certain coding motifs in exons while passively depleting them from introns. Through the use of simulations, genomic analyses, and high throughput splicing assays, we confirm the key predictions of this hypothesis, including an overlap between protein and splicing information in ESEs. We discuss the implications of mutational bias as an evolutionary driver in other cis-regulatory systems. Splicing is regulated by cis-acting elements in pre-mRNAs such as exonic or intronic splicing enhancers and silencers. Here the authors show that exonic splicing enhancers are enriched in exons compared to introns due to mutational bias coupled with purifying selection on the protein code.
Affiliation(s)
- Stephen Rong
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA; Ecology and Evolutionary Biology, Brown University, Providence, RI, 02912, USA
- Luke Buerer
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
- Christy L Rhine
- Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- Jing Wang
- Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- Kamil J Cygan
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA; Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- William G Fairbrother
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA; Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA; Hassenfeld Child Health Innovation Institute of Brown University, Providence, RI, 02912, USA.
72
Extreme differences between human germline and tumor mutation densities are driven by ancestral human-specific deviations. Nat Commun 2020; 11:2512. [PMID: 32427823 PMCID: PMC7237693 DOI: 10.1038/s41467-020-16296-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/09/2019] [Accepted: 04/22/2020] [Indexed: 12/29/2022] Open
Abstract
Mutations do not accumulate uniformly across the genome. Human germline and tumor mutation density correlate poorly, and each is associated with different genomic features. Here, we use non-human great ape (NHGA) germlines to determine human germline- and tumor-specific deviations from an ancestral-like great ape genome-wide mutational landscape. Strikingly, we find that the distribution of mutation densities in tumors presents a stronger correlation with NHGA than with human germlines. This effect is driven by human-specific differences in the distribution of mutations at non-CpG sites. We propose that ancestral human demographic events, together with the human-specific mutation slowdown, disrupted the human genome-wide distribution of mutation densities. Tumors partially recover this distribution by accumulating preneoplastic-like somatic mutations. Our results highlight the potential utility of using NHGA population data, rather than human controls, to establish the expected mutational background of healthy somatic cells.
73
Golicz AA, Bhalla PL, Edwards D, Singh MB. Rice 3D chromatin structure correlates with sequence variation and meiotic recombination rate. Commun Biol 2020; 3:235. [PMID: 32398676 PMCID: PMC7217851 DOI: 10.1038/s42003-020-0932-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 05/29/2019] [Accepted: 03/31/2020] [Indexed: 11/30/2022] Open
Abstract
Genomes of many eukaryotic species have a defined three-dimensional architecture critical for cellular processes. They are partitioned into topologically associated domains (TADs), defined as regions of high chromatin inter-connectivity. While TADs are not a prominent feature of A. thaliana genome organization, they have been reported for other plants including rice, maize, tomato and cotton, in which TAD formation appears to be linked to transcription and chromatin epigenetic status. Here we show that in the rice genome, sequence variation and meiotic recombination rate correlate with the 3D genome structure. TADs display increased SNP and SV density and higher recombination rate compared to inter-TAD regions. We associate the observed differences with the TAD epigenetic landscape, TE composition and an increased incidence of meiotic crossovers. Golicz et al. report an increase in single nucleotide polymorphisms and structural variations across and within Topologically Associated Domains (TADs) in the rice genome, which is different to the pattern observed in the human genome. They show that this may be due to epigenetic modifications, transposable elements composition, and meiotic crossovers in the TAD regions.
Affiliation(s)
- Agnieszka A Golicz
- School of Agriculture and Food, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, 3010, Australia.
- Prem L Bhalla
- School of Agriculture and Food, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, 3010, Australia
- David Edwards
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, 6009, Australia
- Mohan B Singh
- School of Agriculture and Food, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, 3010, Australia
74
Li C, Luscombe NM. Nucleosome positioning stability is a modulator of germline mutation rate variation across the human genome. Nat Commun 2020; 11:1363. [PMID: 32170069 PMCID: PMC7070026 DOI: 10.1038/s41467-020-15185-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 10/18/2019] [Accepted: 02/23/2020] [Indexed: 02/08/2023] Open
Abstract
Nucleosome organization has been suggested to affect local mutation rates in the genome. However, the lack of de novo mutation and high-resolution nucleosome data has limited the investigation of this hypothesis. Additionally, analyses using indirect mutation rate measurements have yielded contradictory and potentially confounding results. Here, we combine data on >300,000 human de novo mutations with high-resolution nucleosome maps and find substantially elevated mutation rates around translationally stable (‘strong’) nucleosomes. We show that the mutational mechanisms affected by strong nucleosomes are low-fidelity replication, insufficient mismatch repair and increased double-strand breaks. Strong nucleosomes preferentially locate within young SINE/LINE transposons, suggesting that when subject to increased mutation rates, transposons are then more rapidly inactivated. Depletion of strong nucleosomes in older transposons suggests frequent positioning changes during evolution. The findings have important implications for human genetics and genome evolution. Nucleosome organization has been suggested to affect local mutation rates in the genome. Here, the authors analyse data on >300,000 human de novo mutations and high-resolution nucleosome maps and provide evidence that nucleosome positioning stability modulates germline mutation rate variation across the human genome.
Affiliation(s)
- Cai Li
- The Francis Crick Institute, London, NW1 1AT, UK; School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China.
- Nicholas M Luscombe
- The Francis Crick Institute, London, NW1 1AT, UK; Okinawa Institute of Science & Technology Graduate University, Okinawa, 904-0495, Japan; UCL Genetics Institute, University College London, London, WC1E 6BT, UK
|
75
|
Cytosine Methylation Affects the Mutability of Neighboring Nucleotides in Germline and Soma. Genetics 2020; 214:809-823. [PMID: 32079595 DOI: 10.1534/genetics.120.303028] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 09/20/2019] [Accepted: 02/12/2020] [Indexed: 02/07/2023] Open
Abstract
Methylated cytosines deaminate at higher rates than unmethylated cytosines, and the lesions they produce are repaired less efficiently. As a result, methylated cytosines are mutational hotspots. Here, combining rare polymorphism and base-resolution methylation data in humans, Arabidopsis thaliana, and rice (Oryza sativa), we present evidence that methylation state affects mutation dynamics not only at the focal cytosine but also at neighboring nucleotides. In humans, contrary to prior suggestions, we find that nucleotides in the close vicinity (±3 bp) of methylated cytosines mutate less frequently. Reduced mutability around methylated CpGs is also observed in cancer genomes, considering single nucleotide variants alongside tissue-of-origin-matched methylation data. In contrast, methylation is associated with increased neighborhood mutation risk in A. thaliana and rice. The difference in neighborhood mutation risk is less pronounced further away from the focal CpG and modulated by regional GC content. Our results are consistent with a model where altered risk at neighboring bases is linked to lesion formation at the focal CpG and subsequent long-patch repair. Our findings indicate that cytosine methylation has a broader mutational footprint than is commonly assumed.
|
76
|
Kessler MD, Loesch DP, Perry JA, Heard-Costa NL, Taliun D, Cade BE, Wang H, Daya M, Ziniti J, Datta S, Celedón JC, Soto-Quiros ME, Avila L, Weiss ST, Barnes K, Redline SS, Vasan RS, Johnson AD, Mathias RA, Hernandez R, Wilson JG, Nickerson DA, Abecasis G, Browning SR, Zöllner S, O'Connell JR, Mitchell BD, O'Connor TD. De novo mutations across 1,465 diverse genomes reveal mutational insights and reductions in the Amish founder population. Proc Natl Acad Sci U S A 2020; 117:2560-2569. [PMID: 31964835 PMCID: PMC7007577 DOI: 10.1073/pnas.1902766117] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Indexed: 12/19/2022] Open
Abstract
De novo mutations (DNMs), or mutations that appear in an individual despite not being seen in their parents, are an important source of genetic variation whose impact is relevant to studies of human evolution, genetics, and disease. Utilizing high-coverage whole-genome sequencing data as part of the Trans-Omics for Precision Medicine (TOPMed) Program, we called 93,325 single-nucleotide DNMs across 1,465 trios from an array of diverse human populations, and used them to directly estimate and analyze DNM counts, rates, and spectra. We find a significant positive correlation between local recombination rate and local DNM rate, and that DNM rate explains a substantial portion (8.98 to 34.92%, depending on the model) of the genome-wide variation in population-level genetic variation from 41K unrelated TOPMed samples. Genome-wide heterozygosity does correlate with DNM rate, but only explains <1% of variation. While we are underpowered to see small differences, we do not find significant differences in DNM rate between individuals of European, African, and Latino ancestry, nor across ancestrally distinct segments within admixed individuals. However, we did find significantly fewer DNMs in Amish individuals, even when compared with other Europeans, and even after accounting for parental age and sequencing center. Specifically, we found significant reductions in the number of C→A and T→C mutations in the Amish, which seem to underpin their overall reduction in DNMs. Finally, we calculated near-zero estimates of narrow sense heritability (h2), which suggest that variation in DNM rate is significantly shaped by nonadditive genetic effects and the environment.
Affiliation(s)
- Michael D Kessler
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201
- Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD 21201
- University of Maryland Marlene and Stewart Greenebaum Comprehensive Cancer Center, University of Maryland School of Medicine, Baltimore, MD 21201
- Douglas P Loesch
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201
- Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD 21201
- James A Perry
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201
- Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD 21201
- Nancy L Heard-Costa
- Department of Neurology, Boston University School of Medicine, Boston, MA 02118
- Framingham Heart Study, Framingham, MA 01702
- Daniel Taliun
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109
- Brian E Cade
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02142
- Heming Wang
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02142
- Michelle Daya
- Department of Medicine, University of Colorado Denver, Aurora, CO 80045
- John Ziniti
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115
- Soma Datta
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115
- Juan C Celedón
- Division of Pediatric Pulmonary Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213
- Manuel E Soto-Quiros
- Department of Pediatrics, Hospital Nacional de Niños, 10103 San José, Costa Rica
- Lydiana Avila
- Department of Pediatrics, Hospital Nacional de Niños, 10103 San José, Costa Rica
- Scott T Weiss
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115
- Department of Medicine, Harvard Medical School, Boston, MA 02115
- Kathleen Barnes
- Department of Medicine, University of Colorado Denver, Aurora, CO 80045
- Susan S Redline
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA 02115
- Division of Sleep Medicine, Harvard Medical School, Boston, MA 02115
- Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02215
- Andrew D Johnson
- Framingham Heart Study, Framingham, MA 01702
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung and Blood Institute, The Framingham Heart Study, Framingham, MA 01702
- Rasika A Mathias
- Division of Allergy and Clinical Immunology, The Johns Hopkins School of Medicine, Baltimore, MD 21224
- Bloomberg School of Public Health, The Johns Hopkins University, Baltimore, MD 21218
- Ryan Hernandez
- Quantitative Life Sciences, McGill University, Montreal, QC H3A 0G4, Canada
- James G Wilson
- Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, MS 39216
- Goncalo Abecasis
- School of Public Health, University of Michigan, Ann Arbor, MI 48109
- Sharon R Browning
- Department of Biostatistics, University of Washington, Seattle, WA 98195
- Sebastian Zöllner
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109
- Department of Psychiatry, University of Michigan, Ann Arbor, MI 48109
- Jeffrey R O'Connell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201
- Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD 21201
- Braxton D Mitchell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201
- Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD 21201
- Geriatrics Research and Education Clinical Center, Baltimore Veterans Administration Medical Center, Baltimore, MD 21201
- Timothy D O'Connor
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201
- Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD 21201
- University of Maryland Marlene and Stewart Greenebaum Comprehensive Cancer Center, University of Maryland School of Medicine, Baltimore, MD 21201
|
77
|
Mai TL, Chuang TJ. A-to-I RNA editing contributes to the persistence of predicted damaging mutations in populations. Genome Res 2019; 29:1766-1776. [PMID: 31515285 PMCID: PMC6836733 DOI: 10.1101/gr.246033.118] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/05/2018] [Accepted: 09/04/2019] [Indexed: 12/13/2022]
Abstract
Adenosine-to-inosine (A-to-I) RNA editing is a very common co-/posttranscriptional modification that can lead to A-to-G changes at the RNA level and compensate for G-to-A genomic changes to a certain extent. It has been shown that each healthy individual can carry dozens of missense variants predicted to be severely deleterious. Why strongly detrimental variants are preserved in a population and not eliminated by negative natural selection remains mostly unclear. Here, we ask if RNA editing correlates with the burden of deleterious A/G polymorphisms in a population. Integrating genome and transcriptome sequencing data from 447 human lymphoblastoid cell lines, we show that nonsynonymous editing activities (prevalence/level) are negatively correlated with the deleteriousness of A-to-G genomic changes and positively correlated with that of G-to-A genomic changes within the population. We find a significantly negative correlation between nonsynonymous editing activities and allele frequency of A within the population. This negative editing-allele frequency correlation is particularly strong when editing sites are located in highly important genes/loci. Examinations of deleterious missense variants from the 1000 Genomes Project further show a significantly higher proportion of rare missense mutations for G-to-A changes than for other types of changes. The proportion for G-to-A changes increases with increasing deleterious effects of the changes. Moreover, the deleteriousness of G-to-A changes is significantly positively correlated with the percentage of editing enzyme binding motifs at the variants. Overall, we show that nonsynonymous editing is associated with the increased burden of G-to-A missense mutations in healthy individuals, expanding RNA editing in pathogenomics studies.
Affiliation(s)
- Te-Lun Mai
- Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
|
78
|
Signatures of replication timing, recombination, and sex in the spectrum of rare variants on the human X chromosome and autosomes. Proc Natl Acad Sci U S A 2019; 116:17916-17924. [PMID: 31427530 PMCID: PMC6731651 DOI: 10.1073/pnas.1900714116] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 01/03/2023] Open
Abstract
The sources of human germline mutations are poorly understood. Part of the difficulty is that mutations occur very rarely, and so direct pedigree-based approaches remain limited in the numbers that they can examine. To address this problem, we consider the spectrum of low-frequency variants in a dataset (Genome Aggregation Database, gnomAD) of 13,860 human X chromosomes and autosomes. X-autosome differences are reflective of germline sex differences and have been used extensively to learn about male versus female mutational processes; what is less appreciated is that they also reflect chromosome-level biochemical features that differ between the X and autosomes. We tease these components apart by comparing the mutation spectrum in multiple genomic compartments on the autosomes and between the X and autosomes. In so doing, we are able to ascribe specific mutation patterns to replication timing and recombination and to identify differences in the types of mutations that accrue in males and females. In particular, we identify C > G as a mutagenic signature of male meiotic double-strand breaks on the X, which may result from late repair. Our results show how biochemical processes of damage and repair in the germline interact with sex-specific life history traits to shape mutation patterns on both the X chromosome and autosomes.
|
79
|
Supek F, Lehner B. Scales and mechanisms of somatic mutation rate variation across the human genome. DNA Repair (Amst) 2019; 81:102647. [PMID: 31307927 DOI: 10.1016/j.dnarep.2019.102647] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Indexed: 12/22/2022]
Abstract
Cancer genome sequencing has revealed that somatic mutation rates vary substantially across the human genome and at scales from megabase-sized domains to individual nucleotides. Here we review recent work that has both revealed the major mutation biases that operate across the genome and the molecular mechanisms that cause them. The default mutation rate landscape in mammalian genomes results in active genes having low mutation rates because of a combination of factors that increase DNA repair: early DNA replication, transcription, active chromatin modifications and accessible chromatin. Therefore, either an increase in the global mutation rate or a redistribution of mutations from inactive to active DNA can increase the rate at which consequential mutations are acquired in active genes. Several environmental carcinogens and intrinsic mechanisms operating in tumor cells likely cause cancer by this second mechanism: by specifically increasing the mutation rate in active regions of the genome.
Affiliation(s)
- Fran Supek
- Genome Data Science, Institut de Recerca Biomedica (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10, 08028, Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluís Companys 23, 08010 Barcelona, Spain.
- Ben Lehner
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluís Companys 23, 08010 Barcelona, Spain; Systems Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Doctor Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
|
80
|
Aikens RC, Johnson KE, Voight BF. Signals of Variation in Human Mutation Rate at Multiple Levels of Sequence Context. Mol Biol Evol 2019; 36:955-965. [PMID: 30753705 PMCID: PMC6501879 DOI: 10.1093/molbev/msz023] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 12/24/2022] Open
Abstract
Our understanding of the human mutation rate helps us build evolutionary models and interpret patterns of genetic variation observed in human populations. Recent work indicates that the frequencies of specific polymorphism types have been elevated in Europe, and that many more, subtler signatures of global polymorphism variation may yet remain unidentified. Here, we present an analysis of the 1000 Genomes Project supported by analysis in the Simons Genome Diversity Panel, suggesting additional putative signatures of mutation rate variation across populations and the extent to which they are shaped by local sequence context. First, we compiled a list of the most significantly variable polymorphism types in a cross-continental statistical test. Clustering polymorphisms together, we observe four sets that showed distinct shared patterns of relative enrichment among ancestral populations, and we characterize each one of these putative “signatures” of polymorphism variation. For three of these signatures, we found that a single flanking base pair of sequence context was sufficient to determine the majority of enrichment or depletion of a polymorphism type. However, local genetic context up to 2–3 bp away contributes additional variability and may help to interpret a previously noted enrichment of certain polymorphism types in some East Asian groups. Moreover, considering broader local genetic context highlights patterns of polymorphism variation, which were not captured by previous approaches. Building our understanding of mutation rate in this way can help us to construct more accurate evolutionary models and better understand the mechanisms that underlie genetic change.
Affiliation(s)
- Rachael C Aikens
- Program in Biomedical Informatics, Stanford University School of Medicine, Stanford University, Stanford, CA
- Kelsey E Johnson
- Genetics and Epigenetics Program, Cell and Molecular Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Benjamin F Voight
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA; Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA; Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA
|
81
|
Saxena AS, Salomon MP, Matsuba C, Yeh SD, Baer CF. Evolution of the Mutational Process under Relaxed Selection in Caenorhabditis elegans. Mol Biol Evol 2019; 36:239-251. [PMID: 30445510 DOI: 10.1093/molbev/msy213] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Indexed: 12/13/2022] Open
Abstract
The mutational process varies at many levels, from within genomes to among taxa. Many mechanisms have been linked to variation in mutation, but understanding of the evolution of the mutational process is rudimentary. Physiological condition is often implicated as a source of variation in microbial mutation rate and may contribute to mutation rate variation in multicellular organisms. Deleterious mutations are a ubiquitous source of variation in condition. We test the hypothesis that the mutational process depends on the underlying mutation load in two groups of Caenorhabditis elegans mutation accumulation (MA) lines that differ in their starting mutation loads. "First-order MA" (O1MA) lines maintained under minimal selection for ∼250 generations were divided into high-fitness and low-fitness groups and sets of "second-order MA" (O2MA) lines derived from each O1MA line were maintained for ∼150 additional generations. Genomes of 48 O2MA lines and their progenitors were sequenced. There is significant variation among O2MA lines in base-substitution rate (µbs), but no effect of initial fitness; the indel rate is greater in high-fitness O2MA lines. Overall, µbs is positively correlated with recombination and proximity to short tandem repeats and negatively correlated with 10 bp and 1 kb GC content. However, probability of mutation is sufficiently predicted by the three-nucleotide motif alone. Approximately 90% of the variance in standing nucleotide variation is explained by mutability. Total mutation rate increased in the O2MA lines, as predicted by the "drift barrier" model of mutation rate evolution. These data, combined with experimental estimates of fitness, suggest that epistasis is synergistic.
Affiliation(s)
- Matthew P Salomon
- Department of Biology, University of Florida, Gainesville, FL
- Department of Molecular Oncology, John Wayne Cancer Institute, Santa Monica, CA
- Chikako Matsuba
- Department of Biology, University of Florida, Gainesville, FL
- Department of Molecular Oncology, John Wayne Cancer Institute, Santa Monica, CA
- Shu-Dan Yeh
- Department of Biology, University of Florida, Gainesville, FL
- Department of Life Sciences, National Central University, Taoyuan, Taiwan
- Charles F Baer
- Department of Biology, University of Florida, Gainesville, FL
- University of Florida Genetics Institute
|
82
|
Abstract
Mutation provides the ultimate source of all new alleles in populations, including variants that cause disease and fuel adaptation. Recent whole genome sequencing studies have uncovered variation in the mutation rate among individuals and differences in the relative frequency of specific nucleotide changes (the mutation spectrum) between populations. Although parental age is a major driver of differences in overall mutation rate among individuals, the causes of variation in the mutation spectrum remain less well understood. Here, I use high-quality whole genome sequences from 29 inbred laboratory mouse strains to explore the root causes of strain variation in the mutation spectrum. My analysis leverages the unique, mosaic patterns of genetic relatedness among inbred mouse strains to identify strain-private variants residing on haplotypes shared between multiple strains due to their recent descent from a common ancestor. I show that these strain-private alleles are strongly enriched for recent de novo mutations and lack signals of widespread purifying selection, suggesting their faithful recapitulation of the spontaneous mutation landscape in single strains. The spectrum of strain-private variants varies significantly among inbred mouse strains reared under standardized laboratory conditions. This variation is not solely explained by strain differences in age at reproduction, raising the possibility that segregating genetic differences affect the constellation of new mutations that arise in a given strain. Collectively, these findings imply the action of remarkably precise nucleotide-specific genetic mechanisms for tuning the de novo mutation landscape in mammals and underscore the genetic complexity of mutation rate control.
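Illustrative note: the abstract above defines the mutation spectrum as the relative frequency of specific nucleotide changes. The sketch below shows one common way such a spectrum could be tabulated, by folding each change onto a pyrimidine reference base so that, for example, G>A is counted with C>T. The function names (`fold`, `mutation_spectrum`) and the toy input are assumptions made for illustration only; this is not the pipeline used in the study.

```python
from collections import Counter

# Complement map used to fold each change onto a pyrimidine (C or T) reference base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# The six strand-collapsed single-nucleotide change classes.
CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def fold(ref, alt):
    """Return the strand-collapsed class label for a ref>alt change (hypothetical helper)."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-nucleotide change: {ref}>{alt}")
    if ref in "AG":  # purine reference: report the change on the opposite strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def mutation_spectrum(changes):
    """Map each of the six classes to its fraction of the observed changes."""
    counts = Counter(fold(ref, alt) for ref, alt in changes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no changes supplied")
    return {cls: counts[cls] / total for cls in CLASSES}


# Toy input, made up for illustration (not data from the study).
toy = [("C", "T"), ("G", "A"), ("A", "G"), ("C", "A")]
spectrum = mutation_spectrum(toy)
```

Comparing two strains or populations would then amount to comparing two such dictionaries class by class; any statistical test applied on top of that is outside the scope of this sketch.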
|
83
|
Havrilla JM, Pedersen BS, Layer RM, Quinlan AR. A map of constrained coding regions in the human genome. Nat Genet 2018; 51:88-95. [PMID: 30531870 DOI: 10.1038/s41588-018-0294-6] [Citation(s) in RCA: 166] [Impact Index Per Article: 23.7] [Received: 12/14/2017] [Accepted: 10/29/2018] [Indexed: 12/13/2022]
Abstract
Deep catalogs of genetic variation from thousands of humans enable the detection of intraspecies constraint by identifying coding regions with a scarcity of variation. While existing techniques summarize constraint for entire genes, single gene-wide metrics conceal regional constraint variability within each gene. Therefore, we have created a detailed map of constrained coding regions (CCRs) by leveraging variation observed among 123,136 humans from the Genome Aggregation Database. The most constrained CCRs are enriched for pathogenic variants in ClinVar and mutations underlying developmental disorders. CCRs highlight protein domain families under high constraint and suggest unannotated or incomplete protein domains. The highest-percentile CCRs complement existing variant prioritization methods when evaluating de novo mutations in studies of autosomal dominant disease. Finally, we identify highly constrained CCRs within genes lacking known disease associations. This observation suggests that CCRs may identify regions under strong purifying selection that, when mutated, cause severe developmental phenotypes or embryonic lethality.
Affiliation(s)
- James M Havrilla
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA; USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
- Brent S Pedersen
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA; USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
- Ryan M Layer
- BioFrontiers Institute, University of Colorado, Boulder, CO, USA; Department of Computer Science, University of Colorado, Boulder, CO, USA
- Aaron R Quinlan
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA; USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA; Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA.
|
84
|
Liu Y, Liang Y, Cicek AE, Li Z, Li J, Muhle RA, Krenzer M, Mei Y, Wang Y, Knoblauch N, Morrison J, Zhao S, Jiang Y, Geller E, Ionita-Laza I, Wu J, Xia K, Noonan JP, Sun ZS, He X. A Statistical Framework for Mapping Risk Genes from De Novo Mutations in Whole-Genome-Sequencing Studies. Am J Hum Genet 2018; 102:1031-1047. [PMID: 29754769 PMCID: PMC5992125 DOI: 10.1016/j.ajhg.2018.03.023] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 08/23/2017] [Accepted: 03/22/2018] [Indexed: 10/16/2022] Open
Abstract
Analysis of de novo mutations (DNMs) from sequencing data of nuclear families has identified risk genes for many complex diseases, including multiple neurodevelopmental and psychiatric disorders. Most of these efforts have focused on mutations in protein-coding sequences. Evidence from genome-wide association studies (GWASs) strongly suggests that variants important to human diseases often lie in non-coding regions. Extending DNM-based approaches to non-coding sequences is challenging, however, because the functional significance of non-coding mutations is difficult to predict. We propose a statistical framework for analyzing DNMs from whole-genome sequencing (WGS) data. This method, TADA-Annotations (TADA-A), is a major advance of the TADA method we developed earlier for DNM analysis in coding regions. TADA-A is able to incorporate many functional annotations such as conservation and enhancer marks, to learn from data which annotations are informative of pathogenic mutations, and to combine both coding and non-coding mutations at the gene level to detect risk genes. It also supports meta-analysis of multiple DNM studies, while adjusting for study-specific technical effects. We applied TADA-A to WGS data of ∼300 autism-affected family trios across five studies and discovered several autism risk genes. The software is freely available for all research uses.
Collapse
Affiliation(s)
- Yuwen Liu
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
| | - Yanyu Liang
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15123, USA
| | - A Ercument Cicek
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15123, USA; Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
| | - Zhongshan Li
- Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, Zhejiang 325000, China
| | - Jinchen Li
- Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan 410078, China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan 410078, China
| | | | - Martina Krenzer
- Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA
| | - Yue Mei
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100000, China
| | - Yan Wang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100000, China
| | - Nicholas Knoblauch
- Committee on Genetics, Genomics and Systems Biology, The University of Chicago, Chicago, IL 60637, USA
| | - Jean Morrison
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
| | - Siming Zhao
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
| | - Yi Jiang
- Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, Zhejiang 325000, China; Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan 410078, China
| | - Evan Geller
- Department of Genetics, Yale School of Medicine, New Haven, CT 06520, USA; Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA
| | | | - Jinyu Wu
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100000, China; Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, Zhejiang 325000, China
| | - Kun Xia
- Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan 410078, China
| | - James P Noonan
- Department of Genetics, Yale School of Medicine, New Haven, CT 06520, USA; Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA
| | - Zhong Sheng Sun
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100000, China; Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, Zhejiang 325000, China.
| | - Xin He
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
| |
|
85
|
Chen C, Qi H, Shen Y, Pickrell J, Przeworski M. Contrasting Determinants of Mutation Rates in Germline and Soma. Genetics 2017; 207:255-267. [PMID: 28733365 PMCID: PMC5586376 DOI: 10.1534/genetics.117.1114] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 03/15/2017] [Accepted: 07/01/2017] [Indexed: 12/13/2022] Open
Abstract
Recent studies of somatic and germline mutations have led to the identification of a number of factors that influence point mutation rates, including CpG methylation, expression levels, replication timing, and GC content. Intriguingly, some of the effects appear to differ between soma and germline: in particular, whereas mutation rates have been reported to decrease with expression levels in tumors, no clear effect has been detected in the germline. Distinct approaches were taken to analyze the data, however, so it is hard to know whether these apparent differences are real. To enable a cleaner comparison, we considered a statistical model in which the mutation rate of a coding region is predicted by GC content, expression levels, replication timing, and two histone repressive marks. We applied this model to both a set of germline mutations identified in exomes and to exonic somatic mutations in four types of tumors. Most determinants of mutations are shared: notably, we detected an effect of expression levels on both germline and somatic mutation rates. Moreover, in all tissues considered, higher expression levels are associated with greater strand asymmetry of mutations. However, mutation rates increase with expression levels in testis (and, more tentatively, in ovary), whereas they decrease with expression levels in somatic tissues. This contrast points to differences in damage or repair rates during transcription in soma and germline.
Affiliation(s)
- Chen Chen
- Department of Biological Sciences, Columbia University, New York, New York 10025
- New York Genome Center, New York, New York 10013
| | - Hongjian Qi
- Department of Systems Biology, Columbia University Medical Center, New York, New York 10032
- Department of Applied Physics and Applied Mathematics, Columbia University, New York, New York 10025
| | - Yufeng Shen
- Department of Systems Biology, Columbia University Medical Center, New York, New York 10032
- Department of Biomedical Informatics, Columbia University, New York, New York 10025
| | - Joseph Pickrell
- Department of Biological Sciences, Columbia University, New York, New York 10025
- New York Genome Center, New York, New York 10013
| | - Molly Przeworski
- Department of Biological Sciences, Columbia University, New York, New York 10025
- Department of Systems Biology, Columbia University Medical Center, New York, New York 10032
| |
|
86
|
Harris K, Pritchard JK. Rapid evolution of the human mutation spectrum. eLife 2017; 6. [PMID: 28440220 PMCID: PMC5435464 DOI: 10.7554/elife.24284] [Citation(s) in RCA: 102] [Impact Index Per Article: 12.8] [Received: 12/15/2016] [Accepted: 04/07/2017] [Indexed: 01/02/2023] Open
Abstract
DNA is a remarkably precise medium for copying and storing biological information. This high fidelity results from the action of hundreds of genes involved in replication, proofreading, and damage repair. Evolutionary theory suggests that in such a system, selection has limited ability to remove genetic variants that change mutation rates by small amounts or in specific sequence contexts. Consistent with this, using SNV variation as a proxy for mutational input, we report here that mutational spectra differ substantially among species, human continental groups and even some closely related populations. Close examination of one signal, an increased TCC→TTC mutation rate in Europeans, indicates a burst of mutations from about 15,000 to 2000 years ago, perhaps due to the appearance, drift, and ultimate elimination of a genetic modifier of mutation rate. Our results suggest that mutation rates can evolve markedly over short evolutionary timescales and suggest the possibility of mapping mutational modifiers. DOI: http://dx.doi.org/10.7554/eLife.24284.001
DNA is a molecule that contains the information needed to build an organism. This information is stored as a code made up of four chemicals: adenine (A), guanine (G), cytosine (C), and thymine (T). Every time a cell divides and copies its DNA, it accidentally introduces ‘typos’ into the code, known as mutations. Most mutations are harmless, but some can cause damage. All cells have ways to proofread DNA, and the more resources are devoted to proofreading, the fewer mutations occur. Simple organisms such as bacteria use less energy to reduce mutations, because their genomes may tolerate more damage. More complex organisms, from yeast to humans, instead need to proofread their genomes more thoroughly. Recent research has shown that humans have a lower mutation rate than chimpanzees and gorillas, their closest living relatives.
Humans and other apes copy and proofread their DNA with basically the same biological machinery as yeast, which is about a billion years old. Yet, humans and apes have only existed for a small fraction of this time, a few million years. Why then do humans need to replicate and proofread their DNA differently from apes, and could it be that the way mutations arise is still evolving? Previous research revealed that European people experience more mutations within certain DNA motifs (specifically, the DNA sequences ‘TCC’, ‘TCT’, ‘CCC’ and ‘ACC’) than Africans or East Asians do. Now, Harris (who conducted the previous research) and Pritchard have compared how various human ethnic groups accumulate mutations and how these processes differ in different groups. Statistical analysis of the genomes of thousands of people from all over the world did indeed show that the mutation rates of many different three-letter DNA motifs have changed during the past 20,000 years of human evolution. Harris and Pritchard report that when groups of humans left Africa and settled in isolated populations across different continents, each population quickly became better at avoiding mutations in some genomic contexts, but worse in others. This suggests that the risk of passing on harmful mutations to future generations is changing and evolving at an even faster rate than was originally suspected. The results suggest that every human ethnic group carries specific variants of the genes which ensure that DNA replication and repair are accurate. These differences appear to influence which types of mutations are frequently passed down to future generations. An important next step will be to identify the genetic variants that could be controlling mutational patterns and how they affect human health. DOI:http://dx.doi.org/10.7554/eLife.24284.002
Affiliation(s)
- Kelley Harris
- Department of Genetics, Stanford University, Stanford, United States
| | - Jonathan K Pritchard
- Department of Genetics, Stanford University, Stanford, United States; Department of Biology, Stanford University, Stanford, United States; Howard Hughes Medical Institute, Stanford University, Stanford, United States
| |
|