1
Oliveira HR, Chud TCS, Oliveira GA, Hermisdorff IC, Narayana SG, Rochus CM, Butty AM, Malchiodi F, Stothard P, Miglior F, Baes CF, Schenkel FS. Genome-wide association analyses reveals copy number variant regions associated with reproduction and disease traits in Canadian Holstein cattle. J Dairy Sci 2024:S0022-0302(24)00810-5. [PMID: 38788846 DOI: 10.3168/jds.2023-24295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 04/01/2024] [Indexed: 05/26/2024]
Abstract
This study aimed to evaluate the impact of copy number variants (CNVs) on 13 reproduction and 12 disease traits in Holstein cattle. Intensity signal files containing Log R ratio and B allele frequency information from 13,730 Holstein animals genotyped with a 95K SNP panel, and 8,467 Holstein animals genotyped with a 50K SNP panel were used to identify the CNVs. Subsequently, the identified CNVs were validated using whole genome sequence data from 126 animals, resulting in 870 high-confidence CNV regions (CNVRs) on 12,131 animals. Out of these, 54 CNVRs had frequencies higher than or equal to 1% in the population and were used in the genome-wide association analysis (one CNVR at a time, including the G matrix). Results revealed that 4 CNVRs were significantly (p-value < 3.7 × 10⁻⁵) associated with at least one of the traits analyzed in this study. Specifically, 2 CNVRs were associated with 3 reproduction traits (i.e., calf survival, first service to conception, and non-return rate), and 2 CNVRs were associated with 2 disease traits (i.e., metritis and retained placenta). These CNVRs harbored genes implicated in immune response, cellular signaling, and neuronal development, supporting their potential involvement in these traits. Further investigations to unravel the mechanistic and functional implications of these CNVRs on the mentioned traits are warranted.
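The abstract does not say how the significance threshold was derived. One reading that happens to match the reported numbers is a Bonferroni correction over every CNVR-trait pair. The sketch below checks that arithmetic only; the family-wise alpha of 0.05 and the one-test-per-pair count are assumptions, not details taken from the paper.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Counts taken from the abstract.
N_CNVRS = 54        # CNVRs with frequency >= 1%
N_TRAITS = 13 + 12  # reproduction + disease traits

# Assumptions (not stated in the abstract): a family-wise alpha of 0.05
# and one association test per CNVR-trait pair.
ALPHA = 0.05

n_tests = N_CNVRS * N_TRAITS
threshold = bonferroni_threshold(ALPHA, n_tests)
print(f"{n_tests} tests -> threshold {threshold:.1e}")
```

Under these assumptions the result rounds to the 3.7 × 10⁻⁵ quoted in the abstract, which is suggestive but does not confirm that the authors used this correction.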
Affiliation(s)
- Hinayah R Oliveira
  - Department of Animal Sciences, Purdue University, West Lafayette, Indiana, USA; Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada
- Tatiane C S Chud
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada
- Gerson A Oliveira
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada
- Isis C Hermisdorff
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada
- Saranya G Narayana
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada; Lactanet, Guelph, Ontario, Canada
- Christina M Rochus
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada
- Francesca Malchiodi
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada; Semex, Guelph, Ontario, Canada
- Filippo Miglior
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada; Lactanet, Guelph, Ontario, Canada
- Christine F Baes
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada; Institute of Genetics, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- Flavio S Schenkel
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada
2
Zhang L, Chang M, Liu C, Xu Y, Feng Q, Yin S, Wu W. A case of de novo -α 3.7 thalassaemia and the utility of CATSA for detecting de novo mutations in thalassaemia. Br J Haematol 2024. [PMID: 38757312 DOI: 10.1111/bjh.19507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 04/23/2024] [Indexed: 05/18/2024]
Affiliation(s)
- Lei Zhang
  - Medical Genetics Center, Shenzhen Maternity and Child Healthcare Hospital, Shenzhen, Guangdong, China
- Ming Chang
  - Department of Hematology, The Seventh Affiliated Hospital of Sun Yat-Sen University, Shenzhen, Guangdong, China
- Chao Liu
  - Medical Genetics Center, Shenzhen Maternity and Child Healthcare Hospital, Shenzhen, Guangdong, China
- Yong Xu
  - Medical Genetics Center, Shenzhen Maternity and Child Healthcare Hospital, Shenzhen, Guangdong, China
- Qing Feng
  - Medical Genetics Center, Shenzhen Maternity and Child Healthcare Hospital, Shenzhen, Guangdong, China
- Shanshan Yin
  - Medical Genetics Center, Shenzhen Maternity and Child Healthcare Hospital, Shenzhen, Guangdong, China
- Weiqing Wu
  - Medical Genetics Center, Shenzhen Maternity and Child Healthcare Hospital, Shenzhen, Guangdong, China
3
Jay P, Jeffries D, Hartmann FE, Véber A, Giraud T. Why do sex chromosomes progressively lose recombination? Trends Genet 2024:S0168-9525(24)00067-2. [PMID: 38677904 DOI: 10.1016/j.tig.2024.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 03/18/2024] [Accepted: 03/19/2024] [Indexed: 04/29/2024]
Abstract
Progressive recombination loss is a common feature of sex chromosomes. Yet, the evolutionary drivers of this phenomenon remain a mystery. For decades, differences in trait optima between sexes (sexual antagonism) have been the favoured hypothesis, but convincing evidence is lacking. Recent years have seen a surge of alternative hypotheses to explain progressive extensions and maintenance of recombination suppression: neutral accumulation of sequence divergence, selection of nonrecombining fragments with fewer deleterious mutations than average, sheltering of recessive deleterious mutations by linkage to heterozygous alleles, early evolution of dosage compensation, and constraints on recombination restoration. Here, we explain these recent hypotheses and dissect their assumptions, mechanisms, and predictions. We also review empirical studies that have brought support to the various hypotheses.
Affiliation(s)
- Paul Jay
  - Center for GeoGenetics, University of Copenhagen, Copenhagen, Denmark; Université Paris-Saclay, CNRS, AgroParisTech, Laboratoire Ecologie Systématique et Evolution, UMR 8079, Bâtiment 680, 12 route RD128, 91190 Gif-sur-Yvette, France
- Daniel Jeffries
  - Division of Evolutionary Ecology, Institute of Ecology and Evolution, University of Bern, 3012 Bern, Switzerland
- Fanny E Hartmann
  - Université Paris-Saclay, CNRS, AgroParisTech, Laboratoire Ecologie Systématique et Evolution, UMR 8079, Bâtiment 680, 12 route RD128, 91190 Gif-sur-Yvette, France
- Amandine Véber
  - Université Paris Cité, CNRS, MAP5, F-75006 Paris, France
- Tatiana Giraud
  - Université Paris-Saclay, CNRS, AgroParisTech, Laboratoire Ecologie Systématique et Evolution, UMR 8079, Bâtiment 680, 12 route RD128, 91190 Gif-sur-Yvette, France
4
Jurgens JA, Barry BJ, Chan WM, MacKinnon S, Whitman MC, Matos Ruiz PM, Pratt BM, England EM, Pais L, Lemire G, Groopman E, Glaze C, Russell KA, Singer-Berk M, Di Gioia SA, Lee AS, Andrews C, Shaaban S, Wirth MM, Bekele S, Toffoloni M, Bradford VR, Foster EE, Berube L, Rivera-Quiles C, Mensching FM, Sanchis-Juan A, Fu JM, Wong I, Zhao X, Wilson MW, Weisburd B, Lek M, Brand H, Talkowski ME, MacArthur DG, O’Donnell-Luria A, Robson CD, Hunter DG, Engle EC. Expanding the genetics and phenotypes of ocular congenital cranial dysinnervation disorders. medRxiv 2024:2024.03.22.24304594. [PMID: 38585811 PMCID: PMC10996726 DOI: 10.1101/2024.03.22.24304594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Purpose: To identify genetic etiologies and genotype/phenotype associations for unsolved ocular congenital cranial dysinnervation disorders (oCCDDs). Methods: We coupled phenotyping with exome or genome sequencing of 467 pedigrees with genetically unsolved oCCDDs, integrating analyses of pedigrees, human and animal model phenotypes, and de novo variants to identify rare candidate single nucleotide variants, insertion/deletions, and structural variants disrupting protein-coding regions. Prioritized variants were classified for pathogenicity and evaluated for genotype/phenotype correlations. Results: Analyses elucidated phenotypic subgroups, identified pathogenic/likely pathogenic variant(s) in 43/467 probands (9.2%), and prioritized variants of uncertain significance in 70/467 additional probands (15.0%). These included known and novel variants in established oCCDD genes, genes associated with syndromes that sometimes include oCCDDs (e.g., MYH10, KIF21B, TGFBR2, TUBB6), genes that fit the syndromic component of the phenotype but had no prior oCCDD association (e.g., CDK13, TGFB2), genes with no reported association with oCCDDs or the syndromic phenotypes (e.g., TUBA4A, KIF5C, CTNNA1, KLB, FGF21), and genes associated with oCCDD phenocopies that had resulted in misdiagnoses. Conclusion: This study suggests that unsolved oCCDDs are clinically and genetically heterogeneous disorders often overlapping other Mendelian conditions and nominates many candidates for future replication and functional studies.
Affiliation(s)
- Julie A. Jurgens
  - F.M. Kirby Neurobiology Center, Boston Children’s Hospital, Boston, MA, USA
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
  - Department of Neurology, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Brenda J. Barry
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Wai-Man Chan
  - F.M. Kirby Neurobiology Center, Boston Children’s Hospital, Boston, MA, USA
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
  - Department of Neurology, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Sarah MacKinnon
  - Department of Ophthalmology, Boston Children’s Hospital, Boston, MA, USA
  - Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
- Mary C. Whitman
  - F.M. Kirby Neurobiology Center, Boston Children’s Hospital, Boston, MA, USA
  - Department of Ophthalmology, Boston Children’s Hospital, Boston, MA, USA
  - Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
- Brandon M. Pratt
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
- Eleina M. England
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Genetics and Genomics, Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA
- Lynn Pais
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Genetics and Genomics, Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA
- Gabrielle Lemire
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Genetics and Genomics, Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA
- Emily Groopman
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Genetics and Genomics, Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA
- Carmen Glaze
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Kathryn A. Russell
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Moriel Singer-Berk
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Silvio Alessandro Di Gioia
  - F.M. Kirby Neurobiology Center, Boston Children’s Hospital, Boston, MA, USA
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
  - Department of Neurology, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Regeneron Pharmaceuticals, Tarrytown, NY, 10591, USA
- Arthur S. Lee
  - F.M. Kirby Neurobiology Center, Boston Children’s Hospital, Boston, MA, USA
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
  - Department of Neurology, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Caroline Andrews
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
- Sherin Shaaban
  - F.M. Kirby Neurobiology Center, Boston Children’s Hospital, Boston, MA, USA
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
  - Department of Neurology, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, University of Utah School of Medicine, Salt Lake City, UT, USA
- Megan M. Wirth
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
- Sarah Bekele
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
- Melissa Toffoloni
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
- Emma E. Foster
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
- Lindsay Berube
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
- Alba Sanchis-Juan
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Jack M. Fu
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Isaac Wong
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Xuefang Zhao
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Michael W. Wilson
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ben Weisburd
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Monkol Lek
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harrison Brand
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Pediatric Surgical Research Laboratories, Massachusetts General Hospital, Boston, MA, USA
- Michael E. Talkowski
  - Department of Neurology, Harvard Medical School, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Daniel G. MacArthur
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Anne O’Donnell-Luria
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Genetics and Genomics, Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Caroline D. Robson
  - Division of Neuroradiology, Department of Radiology, Boston Children’s Hospital, Boston, MA, USA
  - Department of Radiology, Harvard Medical School, Boston, MA, USA
- David G. Hunter
  - Department of Ophthalmology, Boston Children’s Hospital, Boston, MA, USA
  - Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
- Elizabeth C. Engle
  - F.M. Kirby Neurobiology Center, Boston Children’s Hospital, Boston, MA, USA
  - Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
  - Department of Neurology, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
  - Department of Ophthalmology, Boston Children’s Hospital, Boston, MA, USA
  - Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
5
Lanciano S, Philippe C, Sarkar A, Pratella D, Domrane C, Doucet AJ, van Essen D, Saccani S, Ferry L, Defossez PA, Cristofari G. Locus-level L1 DNA methylation profiling reveals the epigenetic and transcriptional interplay between L1s and their integration sites. Cell Genomics 2024; 4:100498. [PMID: 38309261 PMCID: PMC10879037 DOI: 10.1016/j.xgen.2024.100498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Revised: 07/20/2023] [Accepted: 01/09/2024] [Indexed: 02/05/2024]
Abstract
Long interspersed element 1 (L1) retrotransposons are implicated in human disease and evolution. Their global activity is repressed by DNA methylation, but deciphering the regulation of individual copies has been challenging. Here, we combine short- and long-read sequencing to unveil L1 methylation heterogeneity across cell types, families, and individual loci and elucidate key principles involved. We find that the youngest primate L1 families are specifically hypomethylated in pluripotent stem cells and the placenta but not in most tumors. Locally, intronic L1 methylation is intimately associated with gene transcription. Conversely, the L1 methylation state can propagate to the proximal region up to 300 bp. This phenomenon is accompanied by the binding of specific transcription factors, which drive the expression of L1 and chimeric transcripts. Finally, L1 hypomethylation alone is typically insufficient to trigger L1 expression due to redundant silencing pathways. Our results illuminate the epigenetic and transcriptional interplay between retrotransposons and their host genome.
Affiliation(s)
- Sophie Lanciano
  - University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
- Claude Philippe
  - University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
- Arpita Sarkar
  - University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
- David Pratella
  - University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
- Cécilia Domrane
  - University Paris Cité, CNRS, Epigenetics and Cell Fate, Paris, France
- Aurélien J Doucet
  - University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
- Dominic van Essen
  - University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
- Simona Saccani
  - University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
- Laure Ferry
  - University Paris Cité, CNRS, Epigenetics and Cell Fate, Paris, France
- Gael Cristofari
  - University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
6
Pinglay S, Lalanne JB, Daza RM, Koeppel J, Li X, Lee DS, Shendure J. Multiplex generation and single cell analysis of structural variants in a mammalian genome. bioRxiv 2024:2024.01.22.576756. [PMID: 38405830 PMCID: PMC10888807 DOI: 10.1101/2024.01.22.576756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
The functional consequences of structural variants (SVs) in mammalian genomes are challenging to study. This is due to several factors, including: 1) their numerical paucity relative to other forms of standing genetic variation such as single nucleotide variants (SNVs) and short insertions or deletions (indels); 2) the fact that a single SV can involve and potentially impact the function of more than one gene and/or cis regulatory element; and 3) the relative immaturity of methods to generate and map SVs, either randomly or in targeted fashion, in in vitro or in vivo model systems. Towards addressing these challenges, we developed Genome-Shuffle-seq, a straightforward method that enables the multiplex generation and mapping of several major forms of SVs (deletions, inversions, translocations) throughout a mammalian genome. Genome-Shuffle-seq is based on the integration of "shuffle cassettes" to the genome, wherein each shuffle cassette contains components that facilitate its site-specific recombination (SSR) with other integrated shuffle cassettes (via Cre-loxP), its mapping to a specific genomic location (via T7-mediated in vitro transcription or IVT), and its identification in single-cell RNA-seq (scRNA-seq) data (via T7-mediated in situ transcription or IST). In this proof-of-concept, we apply Genome-Shuffle-seq to induce and map thousands of genomic SVs in mouse embryonic stem cells (mESCs) in a single experiment. Induced SVs are rapidly depleted from the cellular population over time, possibly due to Cre-mediated toxicity and/or negative selection on the rearrangements themselves. Leveraging T7 IST of barcodes whose positions are already mapped, we further demonstrate that we can efficiently genotype which SVs are present in association with each of many single cell transcriptomes in scRNA-seq data. Finally, preliminary evidence suggests our method may be a powerful means of generating extrachromosomal circular DNAs (ecDNAs). 
Looking forward, we anticipate that Genome-Shuffle-seq may be broadly useful for the systematic exploration of the functional consequences of SVs on gene expression, the chromatin landscape, and 3D nuclear architecture. We further anticipate potential uses for in vitro modeling of ecDNAs, as well as in paving the path to a minimal mammalian genome.
Affiliation(s)
- Sudarshan Pinglay
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
  - Seattle Hub for Synthetic Biology, Seattle, WA, USA
- Riza M Daza
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Seattle Hub for Synthetic Biology, Seattle, WA, USA
- Xiaoyi Li
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- David S Lee
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Jay Shendure
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
  - Seattle Hub for Synthetic Biology, Seattle, WA, USA
  - Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
  - Howard Hughes Medical Institute, Seattle, WA, USA
7
Smolka M, Paulin LF, Grochowski CM, Horner DW, Mahmoud M, Behera S, Kalef-Ezra E, Gandhi M, Hong K, Pehlivan D, Scholz SW, Carvalho CMB, Proukakis C, Sedlazeck FJ. Detection of mosaic and population-level structural variants with Sniffles2. Nat Biotechnol 2024:10.1038/s41587-023-02024-y. [PMID: 38168980 DOI: 10.1038/s41587-023-02024-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Accepted: 10/11/2023] [Indexed: 01/05/2024]
Abstract
Calling structural variations (SVs) is technically challenging, but using long reads remains the most accurate way to identify complex genomic alterations. Here we present Sniffles2, which improves over current methods by implementing a repeat aware clustering coupled with a fast consensus sequence and coverage-adaptive filtering. Sniffles2 is 11.8 times faster and 29% more accurate than state-of-the-art SV callers across different coverages (5-50×), sequencing technologies (ONT and HiFi) and SV types. Furthermore, Sniffles2 solves the problem of family-level to population-level SV calling to produce fully genotyped VCF files. Across 11 probands, we accurately identified causative SVs around MECP2, including highly complex alleles with three overlapping SVs. Sniffles2 also enables the detection of mosaic SVs in bulk long-read data. As a result, we identified multiple mosaic SVs in brain tissue from a patient with multiple system atrophy. The identified SV showed a remarkable diversity within the cingulate cortex, impacting both genes involved in neuron function and repetitive elements.
Affiliation(s)
- Moritz Smolka
  - Human Genome Sequencing Center Baylor College of Medicine, Houston, TX, USA
- Luis F Paulin
  - Human Genome Sequencing Center Baylor College of Medicine, Houston, TX, USA
- Dominic W Horner
  - Department of Clinical and Movement Neurosciences, Royal Free Campus, Queen Square Institute of Neurology, University College London, London, UK
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
- Medhat Mahmoud
  - Human Genome Sequencing Center Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Sairam Behera
  - Human Genome Sequencing Center Baylor College of Medicine, Houston, TX, USA
- Ester Kalef-Ezra
  - Department of Clinical and Movement Neurosciences, Royal Free Campus, Queen Square Institute of Neurology, University College London, London, UK
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
- Mira Gandhi
  - Pacific Northwest Research Institute (PNRI), Seattle, WA, USA
- Karl Hong
  - Bionano Genomics, San Diego, CA, USA
- Davut Pehlivan
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
  - Division of Neurology and Developmental Neuroscience, Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
- Sonja W Scholz
  - Neurodegenerative Diseases Research Unit, National Institute of Neurological Disorders and Stroke, Bethesda, MD, USA
  - Department of Neurology, Johns Hopkins University Medical Center, Baltimore, MD, USA
- Claudia M B Carvalho
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
  - Pacific Northwest Research Institute (PNRI), Seattle, WA, USA
- Christos Proukakis
  - Department of Clinical and Movement Neurosciences, Royal Free Campus, Queen Square Institute of Neurology, University College London, London, UK
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
- Fritz J Sedlazeck
  - Human Genome Sequencing Center Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
  - Department of Computer Science, Rice University, Houston, TX, USA
8
Hahn MW, Peña-Garcia Y, Wang RJ. The 'faulty male' hypothesis for sex-biased mutation and disease. Curr Biol 2023; 33:R1166-R1172. [PMID: 37989088 DOI: 10.1016/j.cub.2023.09.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2023]
Abstract
Biological differences between males and females lead to many differences in physiology, disease, and overall health. One of the most prominent disparities is in the number of germline mutations passed to offspring: human males transmit three times as many mutations as do females. While the classic explanation for this pattern invokes differences in post-puberty germline replication between the sexes, recent whole-genome evidence in humans and other mammals has cast doubt on this mechanism. Here, we review recent work that is inconsistent with a replication-driven model of male-biased mutation, and propose an alternative, 'faulty male' hypothesis. This model proposes that males are less able to repair and/or protect DNA from damage compared to females. Importantly, we suggest that this new model for male-biased mutation may also help to explain several pronounced differences between the sexes in cancer, aging, and DNA repair. Although the detailed contributions of genetic, epigenetic, and hormonal influences of biological sex on mutation remain to be fully understood, a reconsideration of the mechanisms underlying these differences will lead to a deeper understanding of evolution and disease.
Affiliation(s)
- Matthew W Hahn
  - Department of Biology, Indiana University, 1001 E. 3rd Street, Bloomington, IN 47405, USA; Department of Computer Science, 700 N. Woodlawn Avenue, Bloomington, IN 47405, USA
- Yadira Peña-Garcia
  - Department of Biology, Indiana University, 1001 E. 3rd Street, Bloomington, IN 47405, USA
- Richard J Wang
  - Department of Biology, Indiana University, 1001 E. 3rd Street, Bloomington, IN 47405, USA; Department of Computer Science, 700 N. Woodlawn Avenue, Bloomington, IN 47405, USA
9
Li S, Vazquez JM, Sudmant PH. The evolution of aging and lifespan. Trends Genet 2023; 39:830-843. [PMID: 37714733 PMCID: PMC11147682 DOI: 10.1016/j.tig.2023.08.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/22/2023] [Revised: 08/18/2023] [Accepted: 08/21/2023] [Indexed: 09/17/2023]
Abstract
Aging is a nearly inescapable trait among organisms yet lifespan varies tremendously across different species and spans several orders of magnitude in vertebrates alone. This vast phenotypic diversity is driven by distinct evolutionary trajectories and tradeoffs that are reflected in patterns of diversification and constraint in organismal genomes. Age-specific impacts of selection also shape allele frequencies in populations, thus impacting disease susceptibility and environment-specific mortality risk. Further, the mutational processes that spawn this genetic diversity in both germline and somatic cells are strongly influenced by age and life history. We discuss recent advances in our understanding of the evolution of aging and lifespan at organismal, population, and cellular scales, and highlight outstanding questions that remain unanswered.
Affiliation(s)
- Stacy Li
  - Department of Integrative Biology, University of California, Berkeley, CA, USA; Center for Computational Biology, University of California, Berkeley, CA, USA
- Juan Manuel Vazquez
  - Department of Integrative Biology, University of California, Berkeley, CA, USA
- Peter H Sudmant
  - Department of Integrative Biology, University of California, Berkeley, CA, USA; Center for Computational Biology, University of California, Berkeley, CA, USA
|
10
|
Bhati M, Mapel XM, Lloret-Villas A, Pausch H. Structural variants and short tandem repeats impact gene expression and splicing in bovine testis tissue. Genetics 2023; 225:iyad161. [PMID: 37655920 PMCID: PMC10627265 DOI: 10.1093/genetics/iyad161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 06/05/2023] [Accepted: 08/24/2023] [Indexed: 09/02/2023] Open
Abstract
Structural variants (SVs) and short tandem repeats (STRs) are significant sources of genetic variation. However, the impacts of these variants on gene regulation have not been investigated in cattle. Here, we genotyped and characterized 19,408 SVs and 374,821 STRs in 183 bovine genomes and investigated their impact on molecular phenotypes derived from testis transcriptomes. We found that 71% STRs were multiallelic. The vast majority (95%) of STRs and SVs were in intergenic and intronic regions. Only 37% SVs and 40% STRs were in high linkage disequilibrium (LD) (R² > 0.8) with surrounding SNPs/insertions and deletions (Indels), indicating that SNP-based association testing and genomic prediction are blind to a nonnegligible portion of genetic variation. We showed that both SVs and STRs were more than 2-fold enriched among expression and splicing QTL (e/sQTL) relative to SNPs/Indels and were often associated with differential expression and splicing of multiple genes. Deletions and duplications had larger impacts on splicing and expression than any other type of SV. Exonic duplications predominantly increased gene expression either through alternative splicing or other mechanisms, whereas expression- and splicing-associated STRs primarily resided in intronic regions and exhibited bimodal effects on the molecular phenotypes investigated. Most e/sQTL resided within 100 kb of the affected genes or splicing junctions. We pinpoint candidate causal STRs and SVs associated with the expression of SLC13A4 and TTC7B and alternative splicing of a lncRNA and CAPP1. We provide a catalog of STRs and SVs for taurine cattle and show that these variants contribute substantially to gene expression and splicing variation.
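The LD threshold quoted in this abstract (R² > 0.8) can be illustrated with a small sketch. It computes the squared Pearson correlation between allele dosages (0, 1, 2) of one variant and each nearby SNP, then reports whether any SNP exceeds the threshold. The genotype vectors, the function names `r_squared` and `is_tagged`, and the use of unphased dosage correlation are assumptions made for illustration; the abstract does not state how LD was computed.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # monomorphic variant: correlation is undefined
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def is_tagged(variant, nearby_snps, threshold=0.8):
    """True if at least one nearby SNP is in high LD with the variant."""
    return any(r_squared(variant, snp) > threshold for snp in nearby_snps)


# Hypothetical dosages for eight animals (toy data, not from the study).
sv = [0, 1, 2, 0, 1, 2, 0, 1]
snp_linked = [0, 1, 2, 0, 1, 2, 0, 2]    # differs from the SV in one animal
snp_unlinked = [2, 0, 1, 1, 0, 2, 1, 0]  # unrelated pattern

print(round(r_squared(sv, snp_linked), 3))
print(is_tagged(sv, [snp_linked, snp_unlinked]))
print(is_tagged(sv, [snp_unlinked]))
```

A variant for which `is_tagged` returns False is the kind that, in the abstract's terms, SNP-based association testing would miss.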
Affiliation(s)
- Meenu Bhati
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
- Xena Marie Mapel
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
- Hubert Pausch
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
|
11
|
Lee YL, Bouwman AC, Harland C, Bosse M, Costa Monteiro Moreira G, Veerkamp RF, Mullaart E, Cambisano N, Groenen MAM, Karim L, Coppieters W, Georges M, Charlier C. The rate of de novo structural variation is increased in in vitro-produced offspring and preferentially affects the paternal genome. Genome Res 2023; 33:1455-1464. [PMID: 37793781 PMCID: PMC10620045 DOI: 10.1101/gr.277884.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 08/08/2023] [Indexed: 10/06/2023]
Abstract
Assisted reproductive technologies (ARTs), including in vitro maturation and fertilization (IVF), are increasingly used in human and animal reproduction. Whether these technologies directly affect the rate of de novo mutation (DNM), and to what extent, has been a matter of debate. Here we take advantage of domestic cattle, characterized by complex pedigrees that are ideally suited to detect DNMs and by the systematic use of ART, to study the rate of de novo structural variation (dnSV) in this species and how it is impacted by IVF. By exploiting features of associated de novo point mutations (dnPMs) and dnSVs in clustered DNMs, we provide strong evidence that (1) IVF increases the rate of dnSV approximately fivefold, and (2) the corresponding mutations occur during the very early stages of embryonic development (one- and two-cell stage), yet primarily affect the paternal genome.
Affiliation(s)
- Young-Lim Lee
- Unit of Animal Genomics, GIGA-R, Faculty of Veterinary Medicine, University of Liège, B-4000 Liège, Belgium
- Wageningen University and Research, Animal Breeding and Genomics, 6708 WG Wageningen, The Netherlands
- Aniek C Bouwman
- Wageningen University and Research, Animal Breeding and Genomics, 6708 WG Wageningen, The Netherlands
- Chad Harland
- Unit of Animal Genomics, GIGA-R, Faculty of Veterinary Medicine, University of Liège, B-4000 Liège, Belgium
- Livestock Improvement Corporation, Hamilton 3240, New Zealand
- Mirte Bosse
- Wageningen University and Research, Animal Breeding and Genomics, 6708 WG Wageningen, The Netherlands
- Roel F Veerkamp
- Wageningen University and Research, Animal Breeding and Genomics, 6708 WG Wageningen, The Netherlands
- Nadine Cambisano
- GIGA Genomics Platform, GIGA Institute, University of Liège, B-4000 Liège, Belgium
- Martien A M Groenen
- Wageningen University and Research, Animal Breeding and Genomics, 6708 WG Wageningen, The Netherlands
- Latifa Karim
- GIGA Genomics Platform, GIGA Institute, University of Liège, B-4000 Liège, Belgium
- Wouter Coppieters
- Unit of Animal Genomics, GIGA-R, Faculty of Veterinary Medicine, University of Liège, B-4000 Liège, Belgium
- GIGA Genomics Platform, GIGA Institute, University of Liège, B-4000 Liège, Belgium
- Michel Georges
- Unit of Animal Genomics, GIGA-R, Faculty of Veterinary Medicine, University of Liège, B-4000 Liège, Belgium
- Carole Charlier
- Unit of Animal Genomics, GIGA-R, Faculty of Veterinary Medicine, University of Liège, B-4000 Liège, Belgium
|
12
|
Babadi M, Fu JM, Lee SK, Smirnov AN, Gauthier LD, Walker M, Benjamin DI, Zhao X, Karczewski KJ, Wong I, Collins RL, Sanchis-Juan A, Brand H, Banks E, Talkowski ME. GATK-gCNV enables the discovery of rare copy number variants from exome sequencing data. Nat Genet 2023; 55:1589-1597. [PMID: 37604963 PMCID: PMC10904014 DOI: 10.1038/s41588-023-01449-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 08/30/2022] [Accepted: 06/16/2023] [Indexed: 08/23/2023]
Abstract
Copy number variants (CNVs) are major contributors to genetic diversity and disease. While standardized methods, such as the genome analysis toolkit (GATK), exist for detecting short variants, technical challenges have confounded uniform large-scale CNV analyses from whole-exome sequencing (WES) data. Given the profound impact of rare and de novo coding CNVs on genome organization and human disease, we developed GATK-gCNV, a flexible algorithm to discover rare CNVs from sequencing read-depth information, complete with open-source distribution via GATK. We benchmarked GATK-gCNV in 7,962 exomes from individuals in quartet families with matched genome sequencing and microarray data, finding up to 95% recall of rare coding CNVs at a resolution of more than two exons. We used GATK-gCNV to generate a reference catalog of rare coding CNVs in WES data from 197,306 individuals in the UK Biobank, and observed strong correlations between per-gene CNV rates and measures of mutational constraint, as well as rare CNV associations with multiple traits. In summary, GATK-gCNV is a tunable approach for sensitive and specific CNV discovery in WES data, with broad applications.
Affiliation(s)
- Mehrtash Babadi
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jack M Fu
- Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Samuel K Lee
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Andrey N Smirnov
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Laura D Gauthier
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Mark Walker
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- David I Benjamin
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Xuefang Zhao
- Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Konrad J Karczewski
- Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Isaac Wong
- Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Ryan L Collins
- Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Alba Sanchis-Juan
- Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Harrison Brand
- Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Eric Banks
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Michael E Talkowski
- Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
|
13
|
Sanchis-Juan A, Megy K, Stephens J, Armirola Ricaurte C, Dewhurst E, Low K, French CE, Grozeva D, Stirrups K, Erwood M, McTague A, Penkett CJ, Shamardina O, Tuna S, Daugherty LC, Gleadall N, Duarte ST, Hedrera-Fernández A, Vogt J, Ambegaonkar G, Chitre M, Josifova D, Kurian MA, Parker A, Rankin J, Reid E, Wakeling E, Wassmer E, Woods CG, Raymond FL, Carss KJ. Genome sequencing and comprehensive rare-variant analysis of 465 families with neurodevelopmental disorders. Am J Hum Genet 2023; 110:1343-1355. [PMID: 37541188 PMCID: PMC10432178 DOI: 10.1016/j.ajhg.2023.07.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/13/2023] [Revised: 07/07/2023] [Accepted: 07/07/2023] [Indexed: 08/06/2023] Open
Abstract
Despite significant progress in unraveling the genetic causes of neurodevelopmental disorders (NDDs), a substantial proportion of individuals with NDDs remain without a genetic diagnosis after microarray and/or exome sequencing. Here, we aimed to assess the power of short-read genome sequencing (GS), complemented with long-read GS, to identify causal variants in participants with NDD from the National Institute for Health and Care Research (NIHR) BioResource project. Short-read GS was conducted on 692 individuals (489 affected and 203 unaffected relatives) from 465 families. Additionally, long-read GS was performed on five affected individuals who had structural variants (SVs) in technically challenging regions, had complex SVs, or required distal variant phasing. Causal variants were identified in 36% of affected individuals (177/489), and a further 23% (112/489) had a variant of uncertain significance after multiple rounds of re-analysis. Among all reported variants, 88% (333/380) were coding nuclear SNVs or insertions and deletions (indels), and the remainder were SVs, non-coding variants, and mitochondrial variants. Furthermore, long-read GS facilitated the resolution of challenging SVs and invalidated variants of difficult interpretation from short-read GS. This study demonstrates the value of short-read GS, complemented with long-read GS, in investigating the genetic causes of NDDs. GS provides a comprehensive and unbiased method of identifying all types of variants throughout the nuclear and mitochondrial genomes in individuals with NDD.
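The percentages in this abstract follow directly from the stated counts, and a minimal sketch can recompute them as a consistency check. The helper name `percent` and the rounding to whole percentages are assumptions for illustration and are not part of the study.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


# Counts taken from the abstract above.
affected = 489            # affected individuals sequenced
unaffected = 203          # unaffected relatives sequenced
causal = 177              # affected individuals with a causal variant
uncertain = 112           # affected individuals with a variant of uncertain significance
reported_variants = 380   # all reported variants
coding_snv_indel = 333    # coding nuclear SNVs or indels among them

print(affected + unaffected)                         # 692 individuals in total
print(percent(causal, affected))                     # 36
print(percent(uncertain, affected))                  # 23
print(percent(coding_snv_indel, reported_variants))  # 88
```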
Affiliation(s)
- Alba Sanchis-Juan
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK; Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Karyn Megy
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK; Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Jonathan Stephens
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Camila Armirola Ricaurte
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Eleanor Dewhurst
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Kayyi Low
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Detelina Grozeva
- Department of Medical Genetics, University of Cambridge, Cambridge, UK; Centre for Trials Research, Cardiff University, Cardiff, UK
- Kathleen Stirrups
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Marie Erwood
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Amy McTague
- Molecular Neurosciences, Zayed Centre for Research into Rare Disease in Children, UCL Great Ormond Street Institute of Child Health, London, UK; Department of Neurology, Great Ormond Street Hospital for Children NHS Foundation Trust, London, UK
- Christopher J Penkett
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Olga Shamardina
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Salih Tuna
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Louise C Daugherty
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Nicholas Gleadall
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Sofia T Duarte
- Hospital Dona Estefânia, Centro Hospitalar de Lisboa Central, Lisbon, Portugal
- Julie Vogt
- West Midlands Regional Genetics Service, Birmingham Women's and Children's Hospital, Birmingham, UK
- Gautam Ambegaonkar
- Child Development Centre, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Manali Chitre
- Clinical Medical School, University of Cambridge, Cambridge, UK
- Manju A Kurian
- Molecular Neurosciences, Zayed Centre for Research into Rare Disease in Children, UCL Great Ormond Street Institute of Child Health, London, UK
- Alasdair Parker
- Clinical Medical School, University of Cambridge, Cambridge, UK; Child Development Centre, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Julia Rankin
- Department of Clinical Genetics, Royal Devon University Healthcare NHS Foundation Trust, Exeter, UK
- Evan Reid
- Cambridge Institute for Medical Research and Department of Medical Genetics, University of Cambridge, Cambridge, UK
- Emma Wakeling
- North West Thames Regional Genetics Service, Harrow, UK
- Evangeline Wassmer
- Neurology Department, Birmingham Women and Children's Hospital, Birmingham, UK
- C Geoffrey Woods
- Clinical Medical School, University of Cambridge, Cambridge, UK; Department of Medical Genetics, University of Cambridge, Cambridge, UK
- F Lucy Raymond
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK; Department of Medical Genetics, University of Cambridge, Cambridge, UK
- Keren J Carss
- Department of Haematology, University of Cambridge, Cambridge, UK; NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK; Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
|
14
|
Wang M. Estimating the parental age effect on intelligence with controlling for confounding effects from genotypic differences. Personality and Individual Differences 2023. [DOI: 10.1016/j.paid.2023.112137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
|
15
|
Kucuk E, van der Sanden BPGH, O'Gorman L, Kwint M, Derks R, Wenger AM, Lambert C, Chakraborty S, Baybayan P, Rowell WJ, Brunner HG, Vissers LELM, Hoischen A, Gilissen C. Comprehensive de novo mutation discovery with HiFi long-read sequencing. Genome Med 2023; 15:34. [PMID: 37158973 PMCID: PMC10169305 DOI: 10.1186/s13073-023-01183-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/26/2022] [Accepted: 04/19/2023] [Indexed: 05/10/2023] Open
Abstract
BACKGROUND: Long-read sequencing (LRS) techniques have been very successful in identifying structural variants (SVs). However, the high error rate of LRS made the detection of small variants (substitutions and short indels < 20 bp) more challenging. The introduction of PacBio HiFi sequencing makes LRS also suited for detecting small variation. Here we evaluate the ability of HiFi reads to detect de novo mutations (DNMs) of all types, which are technically challenging variant types and a major cause of sporadic, severe, early-onset disease.
METHODS: We sequenced the genomes of eight parent-child trios using high coverage PacBio HiFi LRS (~30-fold coverage) and Illumina short-read sequencing (SRS) (~50-fold coverage). De novo substitutions, small indels, short tandem repeats (STRs) and SVs were called in both datasets and compared to each other to assess the accuracy of HiFi LRS. In addition, we determined the parent-of-origin of the small DNMs using phasing.
RESULTS: We identified a total of 672 and 859 de novo substitutions/indels, 28 and 126 de novo STRs, and 24 and 1 de novo SVs in LRS and SRS respectively. For the small variants, there was a 92 and 85% concordance between the platforms. For the STRs and SVs, the concordance was 3.6 and 0.8%, and 4 and 100% respectively. We successfully validated 27/54 LRS-unique small variants, of which 11 (41%) were confirmed as true de novo events. For the SRS-unique small variants, we validated 42/133 DNMs and 8 (19%) were confirmed as true de novo events. Validation of 18 LRS-unique de novo STR calls confirmed none of the repeat expansions as true DNM. Confirmation of the 23 LRS-unique SVs was possible for 19 candidate SVs of which 10 (52.6%) were true de novo events. Furthermore, we were able to assign 96% of DNMs to their parental allele with LRS data, as opposed to just 20% with SRS data.
CONCLUSIONS: HiFi LRS can now produce the most comprehensive variant dataset obtainable by a single technology in a single laboratory, allowing accurate calling of substitutions, indels, STRs and SVs. The accuracy even allows sensitive calling of DNMs on all variant levels, and also allows for phasing, which helps to distinguish true positive from false positive DNMs.
Affiliation(s)
- Erdi Kucuk
- Department of Human Genetics, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Bart P G H van der Sanden
- Department of Human Genetics, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Luke O'Gorman
- Department of Human Genetics, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Michael Kwint
- Department of Human Genetics, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Ronny Derks
- Department of Human Genetics, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Han G Brunner
- Department of Human Genetics, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Department of Clinical Genetics, Maastricht University Medical Center, Maastricht, The Netherlands
- GROW School for Oncology and Developmental Biology, Maastricht University Medical Center, Maastricht, The Netherlands
- Lisenka E L M Vissers
- Department of Human Genetics, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Alexander Hoischen
- Department of Human Genetics, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Department of Internal Medicine, Radboud University Medical Center for Infectious Diseases (RCI), Radboud University Medical Center, Nijmegen, The Netherlands
- Christian Gilissen
- Department of Human Genetics, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
|
16
|
De Vas MG, Boulet F, Joshi SS, Garstang MG, Khan TN, Atla G, Parry D, Moore D, Cebola I, Zhang S, Cui W, Lampe AK, Lam WW, Ferrer J, Pradeepa MM, Atanur SS. Regulatory de novo mutations underlying intellectual disability. Life Sci Alliance 2023; 6:e202201843. [PMID: 36854624 PMCID: PMC9978454 DOI: 10.26508/lsa.202201843] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/21/2022] [Revised: 02/11/2023] [Accepted: 02/13/2023] [Indexed: 03/02/2023] Open
Abstract
The genetic aetiology of a major fraction of patients with intellectual disability (ID) remains unknown. De novo mutations (DNMs) in protein-coding genes explain up to 40% of cases, but the potential role of regulatory DNMs is still poorly understood. We sequenced 63 whole genomes from 21 ID probands and their unaffected parents. In addition, we analysed 30 previously sequenced genomes from exome-negative ID probands. We found that regulatory DNMs were selectively enriched in fetal brain-specific enhancers as compared with adult brain enhancers. DNM-containing enhancers were associated with genes that show preferential expression in the prefrontal cortex. Furthermore, we identified recurrently mutated enhancer clusters that regulate genes involved in nervous system development (CSMD1, OLFM1, and POU3F3). Most of the DNMs from ID probands showed allele-specific enhancer activity when tested using luciferase assay. Using CRISPR-mediated mutation and editing of epigenomic marks, we show that DNMs at regulatory elements affect the expression of putative target genes. Our results, therefore, provide new evidence to indicate that DNMs in fetal brain-specific enhancers play an essential role in the aetiology of ID.
Affiliation(s)
- Matias G De Vas
- Section of Genetics and Genomics, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK
- Fanny Boulet
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- Shweta S Joshi
- Section of Genetics and Genomics, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK
- Myles G Garstang
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- School of Biological Sciences, University of Essex, Colchester, UK
- Tahir N Khan
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- Department of Biological Sciences, National University of Medical Sciences, Rawalpindi, Pakistan
- Goutham Atla
- Section of Genetics and Genomics, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK
- Regulatory Genomics and Diabetes, Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Diabetes y Enfermedades Metabólicas Asociadas, Barcelona, Spain
- David Parry
- MRC Human Genetics Unit, University of Edinburgh, Edinburgh, UK
- David Moore
- South-East Scotland Regional Genetics Service, Western General Hospital, Edinburgh, UK
- Inês Cebola
- Section of Genetics and Genomics, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK
- Shuchen Zhang
- Institute of Reproductive and Developmental Biology, Faculty of Medicine, Imperial College London, London, UK
- Wei Cui
- Institute of Reproductive and Developmental Biology, Faculty of Medicine, Imperial College London, London, UK
- Anne K Lampe
- South-East Scotland Regional Genetics Service, Western General Hospital, Edinburgh, UK
- Wayne W Lam
- South-East Scotland Regional Genetics Service, Western General Hospital, Edinburgh, UK
- Jorge Ferrer
- Section of Genetics and Genomics, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK
- Regulatory Genomics and Diabetes, Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Diabetes y Enfermedades Metabólicas Asociadas, Barcelona, Spain
- Madapura M Pradeepa
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- School of Biological Sciences, University of Essex, Colchester, UK
- Santosh S Atanur
- Section of Genetics and Genomics, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK
- NIHR Imperial Biomedical Research Centre, ITMAT Data Science Group, Imperial College London, London, UK
- Previous Institute: Centre for Genomic and Experimental Medicine, University of Edinburgh, Edinburgh, UK
|
17
|
Steensma MJ, Lee YL, Bouwman AC, Pita Barros C, Derks MFL, Bink MCAM, Harlizius B, Huisman AE, Crooijmans RPMA, Groenen MAM, Mulder HA, Rochus CM. Identification and characterisation of de novo germline structural variants in two commercial pig lines using trio-based whole genome sequencing. BMC Genomics 2023; 24:208. [PMID: 37072725 PMCID: PMC10114323 DOI: 10.1186/s12864-023-09296-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 04/04/2023] [Indexed: 04/20/2023] Open
Abstract
BACKGROUND: De novo mutations arising in the germline are a source of genetic variation and their discovery broadens our understanding of genetic disorders and evolutionary patterns. Although the number of de novo single nucleotide variants (dnSNVs) has been studied in a number of species, relatively little is known about the occurrence of de novo structural variants (dnSVs). In this study, we investigated 37 deeply sequenced pig trios from two commercial lines to identify dnSVs present in the offspring. The identified dnSVs were characterised by identifying their parent of origin, their functional annotations and characterizing sequence homology at the breakpoints.
RESULTS: We identified four swine germline dnSVs, all located in intronic regions of protein-coding genes. Our conservative, first estimate of the swine germline dnSV rate is 0.108 (95% CI 0.038-0.255) per generation (one dnSV per nine offspring), detected using short-read sequencing. Two detected dnSVs are clusters of mutations. Mutation cluster 1 contains a de novo duplication, a dnSNV and a de novo deletion. Mutation cluster 2 contains a de novo deletion and three de novo duplications, of which one is inverted. Mutation cluster 2 is 25 kb in size, whereas mutation cluster 1 (197 bp) and the other two individual dnSVs (64 and 573 bp) are smaller. Only mutation cluster 2 could be phased and is located on the paternal haplotype. Mutation cluster 2 originates from both micro-homology as well as non-homology mutation mechanisms, where mutation cluster 1 and the other two dnSVs are caused by mutation mechanisms lacking sequence homology. The 64 bp deletion and mutation cluster 1 were validated through PCR. Lastly, the 64 bp deletion and the 573 bp duplication were validated in sequenced offspring of probands with three generations of sequence data.
CONCLUSIONS: Our estimate of 0.108 dnSVs per generation in the swine germline is conservative, due to our small sample size and restricted possibilities of dnSV detection from short-read sequencing. The current study highlights the complexity of dnSVs and shows the potential of breeding programs for pigs and livestock species in general, to provide a suitable population structure for identification and characterisation of dnSVs.
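The point estimate in this abstract corresponds to four dnSVs across 37 trios. The sketch below recomputes it and attaches an adjusted Wald 95% interval (the "plus four" variant, which adds two successes and two failures before applying the Wald formula). The abstract does not say how its interval was derived, so the choice of method is an assumption, as are the function and variable names.

```python
from math import sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def rate_with_adjusted_wald_ci(events, trials, z=Z_95):
    """Return (rate, lower, upper) using the 'plus four' adjusted Wald interval."""
    rate = events / trials
    # Assumption: add two successes and two failures, then apply the Wald formula.
    n_adj = trials + 4
    p_adj = (events + 2) / n_adj
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    lower = max(0.0, p_adj - half_width)
    upper = min(1.0, p_adj + half_width)
    return rate, lower, upper


# Counts taken from the abstract: four dnSVs detected in 37 trios.
rate, lower, upper = rate_with_adjusted_wald_ci(events=4, trials=37)
print(f"rate = {rate:.3f} per generation, 95% CI {lower:.3f}-{upper:.3f}")
print(f"about one dnSV per {round(1 / rate)} offspring")
```

Other binomial intervals (exact or Wilson, for example) give somewhat different bounds for counts this small, which is one reason the abstract describes the estimate as conservative and preliminary.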
Affiliation(s)
- Marije J Steensma
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
- Y L Lee
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- A C Bouwman
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- C Pita Barros
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- M F L Derks
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- Topigs Norsvin Research Center, Schoenaker 6, Beuningen, 6641 SZ, the Netherlands
- M C A M Bink
- Hendrix Genetics, P.O. Box 114, Boxmeer, 5830 AC, the Netherlands
- B Harlizius
- Topigs Norsvin Research Center, Schoenaker 6, Beuningen, 6641 SZ, the Netherlands
- A E Huisman
- Hendrix Genetics, P.O. Box 114, Boxmeer, 5830 AC, the Netherlands
- R P M A Crooijmans
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- M A M Groenen
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- H A Mulder
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- C M Rochus
- University of Guelph, Centre for Genetic Improvement of Livestock, 50 Stone Rd E, Guelph, ON, N1G 2W1, Canada
18
Denti L, Khorsand P, Bonizzoni P, Hormozdiari F, Chikhi R. SVDSS: structural variation discovery in hard-to-call genomic regions using sample-specific strings from accurate long reads. Nat Methods 2023; 20:550-558. [PMID: 36550274 DOI: 10.1038/s41592-022-01674-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2022] [Accepted: 10/08/2022] [Indexed: 12/24/2022]
Abstract
Structural variants (SVs) account for a large amount of sequence variability across genomes and play an important role in human genomics and precision medicine. Despite intense efforts over the years, the discovery of SVs in individuals remains challenging due to the diploid and highly repetitive structure of the human genome and to the presence of SVs that vastly exceed sequencing read lengths. However, the recent introduction of low-error long-read sequencing technologies such as PacBio HiFi may finally enable these barriers to be overcome. Here we present SV discovery with sample-specific strings (SVDSS), a method for discovery of SVs from long-read sequencing technologies (for example, PacBio HiFi) that combines and effectively leverages mapping-free, mapping-based and assembly-based methodologies for overall superior SV discovery performance. Our experiments on several human samples show that SVDSS outperforms state-of-the-art mapping-based methods for discovery of insertion and deletion SVs in PacBio HiFi reads and achieves notable improvements in calling SVs in repetitive regions of the genome.
Affiliation(s)
- Luca Denti
- Sequence Bioinformatics, Department of Computational Biology, Institut Pasteur, Paris, France
- Paola Bonizzoni
- Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, Italy.
- Fereydoun Hormozdiari
- Genome Center, UC Davis, Davis, CA, USA.
- UC Davis MIND Institute, Sacramento, CA, USA.
- Department of Biochemistry and Molecular Medicine, Sacramento, UC Davis, Sacramento, CA, USA.
- Rayan Chikhi
- Sequence Bioinformatics, Department of Computational Biology, Institut Pasteur, Paris, France.
19
Kirsche M, Prabhu G, Sherman R, Ni B, Battle A, Aganezov S, Schatz MC. Jasmine and Iris: population-scale structural variant comparison and analysis. Nat Methods 2023; 20:408-417. [PMID: 36658279 PMCID: PMC10006329 DOI: 10.1038/s41592-022-01753-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Received: 05/27/2021] [Accepted: 12/15/2022] [Indexed: 01/21/2023]
Abstract
The availability of long reads is revolutionizing studies of structural variants (SVs). However, because SVs vary across individuals and are discovered through imprecise read technologies and methods, they can be difficult to compare. Addressing this, we present Jasmine and Iris (https://github.com/mkirsche/Jasmine/), for fast and accurate SV refinement, comparison and population analysis. Using an SV proximity graph, Jasmine outperforms six widely used comparison methods, including reducing the rate of Mendelian discordance in trio datasets by more than fivefold, and reveals a set of high-confidence de novo SVs confirmed by multiple technologies. We also present a unified callset of 122,813 SVs and 82,379 indels from 31 samples of diverse ancestry sequenced with long reads. We genotype these variants in 1,317 samples from the 1000 Genomes Project and the Genotype-Tissue Expression project with DNA and RNA-sequencing data and assess their widespread impact on gene expression, including within medically relevant genes.
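To make the Mendelian discordance metric concrete, the following sketch counts the fraction of trio genotype calls that cannot be explained by inheritance from the parents. It is not Jasmine's implementation. It assumes autosomal, diploid genotypes coded as the number of SV alleles (0, 1 or 2), and the function names and example calls are hypothetical.

```python
def transmissible(genotype):
    """Alleles (0 = reference, 1 = SV) a parent with this SV-allele count can pass on."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def is_mendelian_consistent(child, father, mother):
    """True if the child's genotype can be built from one allele of each parent."""
    return any(
        a + b == child
        for a in transmissible(father)
        for b in transmissible(mother)
    )


def discordance_rate(trio_genotypes):
    """Fraction of (child, father, mother) triples that violate Mendelian inheritance."""
    calls = list(trio_genotypes)
    if not calls:
        return 0.0
    discordant = sum(not is_mendelian_consistent(*g) for g in calls)
    return discordant / len(calls)


# Hypothetical calls for five SVs in one trio: (child, father, mother).
calls = [(1, 0, 1), (0, 1, 1), (2, 1, 1), (1, 0, 0), (2, 0, 1)]
print(discordance_rate(calls))
```

A discordant call such as `(1, 0, 0)` is either a genuine de novo SV or, far more often, a merging or genotyping error, which is why a lower discordance rate is used as a quality measure for SV comparison methods.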
Affiliation(s)
- Melanie Kirsche
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Gautam Prabhu
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Rachel Sherman
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Bohan Ni
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Sergey Aganezov
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
- Michael C Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA.
20
López-Cortegano E, Craig RJ, Chebib J, Balogun EJ, Keightley PD. Rates and spectra of de novo structural mutations in Chlamydomonas reinhardtii. Genome Res 2023; 33:45-60. [PMID: 36617667 PMCID: PMC9977147 DOI: 10.1101/gr.276957.122] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/23/2022] [Accepted: 12/06/2022] [Indexed: 12/14/2022]
Abstract
Genetic variation originates from several types of spontaneous mutation, including single-nucleotide substitutions, short insertions and deletions (indels), and larger structural changes. Structural mutations (SMs) drive genome evolution and are thought to play major roles in evolutionary adaptation, speciation, and genetic disease, including cancers. Sequencing of mutation accumulation (MA) lines has provided estimates of rates and spectra of single-nucleotide and indel mutations in many species, yet the rate of new SMs is largely unknown. Here, we use long-read sequencing to determine the full mutation spectrum in MA lines derived from two strains (CC-1952 and CC-2931) of the green alga Chlamydomonas reinhardtii. The SM rate is highly variable between strains and between MA lines, and SMs represent a substantial proportion of all mutations in both strains (CC-1952 6%; CC-2931 12%). The SM spectra differ considerably between the two strains, with almost all inversions and translocations occurring in CC-2931 MA lines. This variation is associated with heterogeneity in the number and type of active transposable elements (TEs), which comprise major proportions of SMs in both strains (CC-1952 22%; CC-2931 38%). In CC-2931, a Crypton and a previously undescribed type of DNA element have caused 71% of chromosomal rearrangements, whereas in CC-1952, a Dualen LINE is associated with 87% of duplications. Other SMs, notably large duplications in CC-2931, are likely products of various double-strand break repair pathways. Our results show that diverse types of SMs occur at substantial rates, and support prominent roles for SMs and TEs in evolution.
Affiliation(s)
- Eugenio López-Cortegano
- Institute of Ecology and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Rory J. Craig
- Institute of Ecology and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom; California Institute for Quantitative Biosciences, UC Berkeley, Berkeley, California 94720, USA
- Jobran Chebib
- Institute of Ecology and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Eniolaye J. Balogun
- Department of Ecology and Evolutionary Biology, University of Toronto, Ontario ON M5S 3B2, Canada; Department of Biology, University of Toronto Mississauga, Mississauga ON L5L 1C6, Canada
- Peter D. Keightley
- Institute of Ecology and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
21
Scheuren M, Möhner J, Zischler H. R-loop landscape in mature human sperm: Regulatory and evolutionary implications. Front Genet 2023; 14:1069871. [PMID: 37139234 PMCID: PMC10149866 DOI: 10.3389/fgene.2023.1069871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Accepted: 04/03/2023] [Indexed: 05/05/2023]
Abstract
R-loops are three-stranded nucleic acid structures consisting of an RNA:DNA hybrid and a displaced DNA strand. While R-loops pose a potential threat to genome integrity, they constitute 5% of the human genome. The role of R-loops in transcriptional regulation, DNA replication, and chromatin signature is becoming increasingly clear. R-loops are associated with various histone modifications, suggesting that they may modulate chromatin accessibility. To potentially harness transcription-coupled repair mechanisms in the germline, nearly the entire genome is expressed during the early stages of male gametogenesis in mammals, providing ample opportunity for the formation of a transcriptome-dependent R-loop landscape in male germ cells. In this study, our data demonstrated the presence of R-loops in fully mature human and bonobo sperm heads and their partial correspondence to transcribed regions and chromatin structure, which is massively reorganized from mainly histone to mainly protamine-packed chromatin in mature sperm. The sperm R-loop landscape resembles characteristic patterns of somatic cells. Surprisingly, we detected R-loops in both residual histone and protamine-packed chromatin and localized them to still-active retroposons, ALUs and SINE-VNTR-ALUs (SVAs), the latter of which have recently arisen in hominoid primates. We detected both evolutionarily conserved and species-specific localizations. Comparing our DNA-RNA immunoprecipitation (DRIP) data with published DNA methylation and histone chromatin immunoprecipitation (ChIP) data, we hypothesize that R-loops epigenetically reduce methylation of SVAs. Strikingly, we observe a strong influence of R-loops on the transcriptomes of zygotes from early developmental stages before zygotic genome activation. Overall, these findings suggest that chromatin accessibility influenced by R-loops may represent a system of inherited gene regulation.
22
Wang S, Meyer DH, Schumacher B. Inheritance of paternal DNA damage by histone-mediated repair restriction. Nature 2023; 613:365-374. [PMID: 36544019 PMCID: PMC9834056 DOI: 10.1038/s41586-022-05544-w] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 04/27/2022] [Accepted: 11/08/2022] [Indexed: 12/24/2022]
Abstract
How paternal exposure to ionizing radiation affects genetic inheritance and disease risk in the offspring has been a long-standing question in radiation biology. In humans, nearly 80% of transmitted mutations arise in the paternal germline, but the transgenerational effects of ionizing radiation exposure have remained controversial and the mechanisms are unknown. Here we show that in sex-separated Caenorhabditis elegans strains, paternal, but not maternal, exposure to ionizing radiation leads to transgenerational embryonic lethality. The offspring of irradiated males displayed various genome instability phenotypes, including DNA fragmentation, chromosomal rearrangement and aneuploidy. Paternal DNA double strand breaks were repaired by maternally provided error-prone polymerase theta-mediated end joining. Mechanistically, we show that depletion of an orthologue of human histone H1.0, HIS-24, or the heterochromatin protein HPL-1, could significantly reverse the transgenerational embryonic lethality. Removal of HIS-24 or HPL-1 reduced histone 3 lysine 9 dimethylation and enabled error-free homologous recombination repair in the germline of the F1 generation from ionizing radiation-treated P0 males, consequently improving the viability of the F2 generation. This work establishes the mechanistic underpinnings of the heritable consequences of paternal radiation exposure on the health of offspring, which may lead to congenital disorders and cancer in humans.
Affiliation(s)
- Siyao Wang
- Institute for Genome Stability in Aging and Disease, Medical Faculty, University Hospital and University of Cologne, Cologne, Germany.
- Cologne Excellence Cluster for Cellular Stress Responses in Aging-Associated Diseases (CECAD), Center for Molecular Medicine Cologne (CMMC), University of Cologne, Cologne, Germany.
- David H Meyer
- Institute for Genome Stability in Aging and Disease, Medical Faculty, University Hospital and University of Cologne, Cologne, Germany
- Cologne Excellence Cluster for Cellular Stress Responses in Aging-Associated Diseases (CECAD), Center for Molecular Medicine Cologne (CMMC), University of Cologne, Cologne, Germany
- Björn Schumacher
- Institute for Genome Stability in Aging and Disease, Medical Faculty, University Hospital and University of Cologne, Cologne, Germany.
- Cologne Excellence Cluster for Cellular Stress Responses in Aging-Associated Diseases (CECAD), Center for Molecular Medicine Cologne (CMMC), University of Cologne, Cologne, Germany.
23
Steely CJ, Watkins WS, Baird L, Jorde LB. The mutational dynamics of short tandem repeats in large, multigenerational families. Genome Biol 2022; 23:253. [PMID: 36510265 PMCID: PMC9743774 DOI: 10.1186/s13059-022-02818-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/30/2022] [Accepted: 11/17/2022] [Indexed: 12/14/2022]
Abstract
BACKGROUND Short tandem repeats (STRs) compose approximately 3% of the genome, and mutations at STR loci have been linked to dozens of human diseases including amyotrophic lateral sclerosis, Friedreich ataxia, Huntington disease, and fragile X syndrome. Improving our understanding of these mutations would increase our knowledge of the mutational dynamics of the genome and may uncover additional loci that contribute to disease. To estimate the genome-wide pattern of mutations at STR loci, we analyze blood-derived whole-genome sequencing data for 544 individuals from 29 three-generation CEPH pedigrees. These pedigrees contain both sets of grandparents, the parents, and an average of 9 grandchildren per family. RESULTS We use HipSTR to identify de novo STR mutations in the 2nd generation of these pedigrees and require transmission to the third generation for validation. Analyzing approximately 1.6 million STR loci, we estimate the empirical de novo STR mutation rate to be 5.24 × 10⁻⁵ mutations per locus per generation. Perfect repeats mutate about twice as often as imperfect repeats. De novo STRs are significantly enriched in Alu elements. CONCLUSIONS Approximately 30% of new STR mutations occur within Alu elements, which compose only 11% of the genome, but only 10% are found in LINE-1 insertions, which compose 17% of the genome. Phasing these mutations to the parent of origin shows that parental transmission biases vary among families. We estimate the average number of de novo genome-wide STR mutations per individual to be approximately 85, which is similar to the average number of observed de novo single nucleotide variants.
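The per-individual figure follows from the per-locus rate and the number of loci analysed, as the short calculation below shows. It also computes a crude enrichment ratio for Alu and LINE-1 elements. That ratio compares the share of mutations with the share of the genome, not the share of STR loci, so it is only a rough illustration and not the authors' enrichment test. Variable names are my own.

```python
# Figures taken from the abstract.
mutation_rate = 5.24e-5  # de novo STR mutations per locus per generation
n_loci = 1.6e6           # approximate number of STR loci analysed

expected_per_individual = mutation_rate * n_loci
print(f"expected de novo STR mutations per individual: {expected_per_individual:.1f}")

# Crude enrichment: share of new STR mutations divided by share of the genome.
alu_ratio = 0.30 / 0.11    # Alu elements
line1_ratio = 0.10 / 0.17  # LINE-1 insertions
print(f"Alu: {alu_ratio:.1f}x, LINE-1: {line1_ratio:.1f}x relative to genome share")
```

The product is about 84, in line with the abstract's "approximately 85". The small difference is expected because both inputs are rounded.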
Affiliation(s)
- Cody J. Steely
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
- W. Scott Watkins
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
- Lisa Baird
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
- Lynn B. Jorde
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
24
Pokrovac I, Pezer Ž. Recent advances and current challenges in population genomics of structural variation in animals and plants. Front Genet 2022; 13:1060898. [PMID: 36523759 PMCID: PMC9745067 DOI: 10.3389/fgene.2022.1060898] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/03/2022] [Accepted: 11/15/2022] [Indexed: 05/02/2024]
Abstract
The field of population genomics has seen a surge of studies on genomic structural variation over the past two decades. These studies have shown that structural variation is taxonomically ubiquitous and represents a dominant form of genetic variation within species. Recent advances in technology, especially the development of long-read sequencing platforms, have enabled the discovery of structural variants (SVs) in previously inaccessible genomic regions, which unlocked additional structural variation for population studies and revealed that more SVs contribute to evolution than previously perceived. An increasing number of studies suggest that SVs of all types and sizes may have a large effect on phenotype and consequently a major impact on rapid adaptation, population divergence, and speciation. However, the functional effect of the vast majority of SVs is unknown and the field generally lacks evidence on the phenotypic consequences of most SVs that are suggested to have adaptive potential. Non-human genomes are heavily under-represented in population-scale studies of SVs. We argue that more research on other species is needed to objectively estimate the contribution of SVs to evolution. We discuss technical challenges associated with SV detection and outline the most recent advances towards more representative reference genomes, which open a new era in population-scale studies of structural variation.
Affiliation(s)
- Željka Pezer
- Laboratory for Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
25
Tan R, Shen Y. Accurate in silico confirmation of rare copy number variant calls from exome sequencing data using transfer learning. Nucleic Acids Res 2022; 50:e123. [PMID: 36124672 PMCID: PMC9756945 DOI: 10.1093/nar/gkac788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 08/08/2022] [Accepted: 09/01/2022] [Indexed: 12/24/2022]
Abstract
Exome sequencing is widely used in genetic studies of human diseases and clinical genetic diagnosis. Accurate detection of copy number variants (CNVs) is important to fully utilize exome sequencing data. However, exome data are noisy. None of the existing methods alone can achieve both high precision and high recall. A common practice is to perform heuristic filtration followed by manual inspection of read depth of putative CNVs. This approach does not scale in large studies. To address this issue, we developed a transfer learning method, CNV-espresso, for in silico confirmation of rare CNVs from exome sequencing data. CNV-espresso encodes candidate CNVs from exome data as images and uses pretrained convolutional neural network models to classify copy number states. We trained CNV-espresso using an offspring-parents trio exome sequencing dataset, with inherited CNVs as positives and CNVs with Mendelian errors as negatives. We evaluated the performance using additional samples that have both exome and whole-genome sequencing (WGS) data. Taking the CNVs detected from WGS data as a proxy for ground truth, CNV-espresso significantly improves precision while keeping recall almost intact, especially for CNVs that span a small number of exons. CNV-espresso can effectively replace manual inspection of CNVs in large-scale exome sequencing studies.
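The evaluation described here, scoring exome CNV calls against WGS calls treated as a proxy for ground truth, comes down to precision and recall. A minimal sketch follows, assuming calls have already been matched to shared identifiers. The identifiers and call sets are hypothetical, and this is not CNV-espresso's own evaluation code.

```python
def precision_recall(calls, truth):
    """Precision and recall of a call set against a proxy truth set."""
    calls, truth = set(calls), set(truth)
    true_positives = len(calls & truth)
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical CNV identifiers; WGS calls stand in for the truth.
wgs_truth = {"cnv1", "cnv2", "cnv3", "cnv4"}
raw_exome_calls = {"cnv1", "cnv2", "cnv3", "cnv4", "cnv5", "cnv6", "cnv7", "cnv8"}
confirmed_calls = {"cnv1", "cnv2", "cnv3", "cnv4", "cnv5"}

print(precision_recall(raw_exome_calls, wgs_truth))   # before confirmation
print(precision_recall(confirmed_calls, wgs_truth))   # after confirmation
```

In this toy example the confirmation step removes three false positives and no true positives, so precision rises while recall is unchanged, which is the pattern the abstract reports.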
Affiliation(s)
- Renjie Tan
- Department of Systems Biology, Columbia University, New York, NY 10032, USA
- Yufeng Shen
- Department of Systems Biology, Columbia University, New York, NY 10032, USA
- Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- JP Sulzberger Columbia Genome Center, Columbia University, New York, NY 10032, USA
26
Schuy J, Grochowski CM, Carvalho CMB, Lindstrand A. Complex genomic rearrangements: an underestimated cause of rare diseases. Trends Genet 2022; 38:1134-1146. [PMID: 35820967 PMCID: PMC9851044 DOI: 10.1016/j.tig.2022.06.003] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 02/15/2022] [Revised: 05/12/2022] [Accepted: 06/06/2022] [Indexed: 01/24/2023]
Abstract
Complex genomic rearrangements (CGRs) are known contributors to disease but are often missed during routine genetic screening. Identifying CGRs requires (i) identifying copy number variants (CNVs) concurrently with inversions, (ii) phasing multiple breakpoint junctions in cis, as well as (iii) detecting and resolving structural variants (SVs) within repeats. We demonstrate how combining cytogenetics and new sequencing methodologies is being successfully applied to gain insights into the genomic architecture of CGRs. In addition, we review CGR patterns and molecular features revealed by studying constitutional genomic disorders. These data offer invaluable lessons to individuals interested in investigating CGRs, evaluating their clinical relevance and frequency, as well as assessing their impact(s) on rare genetic diseases.
Affiliation(s)
- Jakob Schuy
- Department of Molecular Medicine and Surgery and Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
- Claudia M B Carvalho
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA; Pacific Northwest Research Institute, Seattle, WA, USA
- Anna Lindstrand
- Department of Molecular Medicine and Surgery and Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden; Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden.
27
Du H, Jolly A, Grochowski CM, Yuan B, Dawood M, Jhangiani SN, Li H, Muzny D, Fatih JM, Coban-Akdemir Z, Carlin ME, Scheuerle AE, Witzl K, Posey JE, Pendleton M, Harrington E, Juul S, Hastings PJ, Bi W, Gibbs RA, Sedlazeck FJ, Lupski JR, Carvalho CMB, Liu P. The multiple de novo copy number variant (MdnCNV) phenomenon presents with peri-zygotic DNA mutational signatures and multilocus pathogenic variation. Genome Med 2022; 14:122. [PMID: 36303224 PMCID: PMC9609164 DOI: 10.1186/s13073-022-01123-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Accepted: 10/10/2022] [Indexed: 11/30/2022]
Abstract
BACKGROUND The multiple de novo copy number variant (MdnCNV) phenotype is described by having four or more constitutional de novo CNVs (dnCNVs) arising independently throughout the human genome within one generation. It is a rare peri-zygotic mutational event, previously reported to be seen once in every 12,000 individuals referred for genome-wide chromosomal microarray analysis due to congenital abnormalities. These rare families provide a unique opportunity to understand the genetic factors of peri-zygotic genome instability and the impact of dnCNV on human diseases. METHODS Chromosomal microarray analysis (CMA), array-based comparative genomic hybridization, short- and long-read genome sequencing (GS) were performed on the newly identified MdnCNV family to identify de novo mutations including dnCNVs, de novo single-nucleotide variants (dnSNVs), and indels. Short-read GS was performed on four previously published MdnCNV families for dnSNV analysis. Trio-based rare variant analysis was performed on the newly identified individual and four previously published MdnCNV families to identify potential genetic etiologies contributing to the peri-zygotic genomic instability. Lin semantic similarity scores informed quantitative human phenotype ontology analysis on three MdnCNV families to identify gene(s) driving or contributing to the clinical phenotype. RESULTS In the newly identified MdnCNV case, we revealed eight de novo tandem duplications, each ~ 1 Mb, with microhomology at 6/8 breakpoint junctions. Enrichment of de novo single-nucleotide variants (SNV; 6/79) and de novo indels (1/12) was found within 4 Mb of the dnCNV genomic regions. An elevated post-zygotic SNV mutation rate was observed in MdnCNV families. Maternal rare variant analyses identified three genes in distinct families that may contribute to the MdnCNV phenomenon. Phenotype analysis suggests that gene(s) within dnCNV regions contribute to the observed proband phenotype in 3/3 cases. 
CNVs in two cases, a contiguous gene duplication encompassing PMP22 and RAI1 and another duplication affecting NSD1 and SMARCC2, contribute to the clinically observed phenotypic manifestations. CONCLUSIONS Characteristic features of dnCNVs reported here are consistent with a microhomology-mediated break-induced replication (MMBIR)-driven mechanism during the peri-zygotic period. Maternal genetic variants in DNA repair genes potentially contribute to peri-zygotic genomic instability. Variable phenotypic features were observed across a cohort of three MdnCNV probands, and computational quantitative phenotyping revealed that two out of three had evidence for the contribution of more than one genetic locus to the proband's phenotype supporting the hypothesis of de novo multilocus pathogenic variation (MPV) in those families.
Affiliation(s)
- Haowei Du
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Angad Jolly
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, TX, 77030, USA
- Christopher M Grochowski
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Bo Yuan
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Baylor Genetics Laboratory, Houston, TX, 77021, USA
- Seattle Children's Hospital, Seattle, WA, 98105, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Moez Dawood
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, TX, 77030, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Shalini N Jhangiani
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- He Li
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Donna Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Jawid M Fatih
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Zeynep Coban-Akdemir
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Mary Esther Carlin
- Division of Genetics and Metabolism, Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Angela E Scheuerle
- Division of Genetics and Metabolism, Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Division of Genetics Diagnostics, Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Karin Witzl
- Clinical Institute of Medical Genetics, University Medical Centre Ljubljana, 1000, Ljubljana, Slovenia
- Medical Faculty, University of Ljubljana, 1000, Ljubljana, Slovenia
- Jennifer E Posey
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Sissel Juul
- Oxford Nanopore Technologies Inc, New York, NY, 10013, USA
- P J Hastings
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Dan L. Duncan Comprehensive Cancer Center, BCM, Houston, TX, 77030, USA
- Weimin Bi
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Baylor Genetics Laboratory, Houston, TX, 77021, USA
- Richard A Gibbs
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Fritz J Sedlazeck
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- James R Lupski
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA.
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA.
- Texas Children's Hospital, Houston, TX, 77030, USA.
- Claudia M B Carvalho
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Pacific Northwest Research Institute, 720 Broadway, Seattle, WA, 98122, USA.
- Pengfei Liu
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Baylor Genetics Laboratory, Houston, TX, 77021, USA.
|
28
|
Fu JM, Satterstrom FK, Peng M, Brand H, Collins RL, Dong S, Wamsley B, Klei L, Wang L, Hao SP, Stevens CR, Cusick C, Babadi M, Banks E, Collins B, Dodge S, Gabriel SB, Gauthier L, Lee SK, Liang L, Ljungdahl A, Mahjani B, Sloofman L, Smirnov AN, Barbosa M, Betancur C, Brusco A, Chung BHY, Cook EH, Cuccaro ML, Domenici E, Ferrero GB, Gargus JJ, Herman GE, Hertz-Picciotto I, Maciel P, Manoach DS, Passos-Bueno MR, Persico AM, Renieri A, Sutcliffe JS, Tassone F, Trabetti E, Campos G, Cardaropoli S, Carli D, Chan MCY, Fallerini C, Giorgio E, Girardi AC, Hansen-Kiss E, Lee SL, Lintas C, Ludena Y, Nguyen R, Pavinato L, Pericak-Vance M, Pessah IN, Schmidt RJ, Smith M, Costa CIS, Trajkova S, Wang JYT, Yu MHC, Cutler DJ, De Rubeis S, Buxbaum JD, Daly MJ, Devlin B, Roeder K, Sanders SJ, Talkowski ME. Rare coding variation provides insight into the genetic architecture and phenotypic context of autism. Nat Genet 2022; 54:1320-1331. [PMID: 35982160 PMCID: PMC9653013 DOI: 10.1038/s41588-022-01104-0] [Citation(s) in RCA: 143] [Impact Index Per Article: 71.5] [Received: 12/07/2021] [Accepted: 05/24/2022] [Indexed: 01/11/2023]
Abstract
Some individuals with autism spectrum disorder (ASD) carry functional mutations rarely observed in the general population. We explored the genes disrupted by these variants from joint analysis of protein-truncating variants (PTVs), missense variants and copy number variants (CNVs) in a cohort of 63,237 individuals. We discovered 72 genes associated with ASD at false discovery rate (FDR) ≤ 0.001 (185 at FDR ≤ 0.05). De novo PTVs, damaging missense variants and CNVs represented 57.5%, 21.1% and 8.44% of association evidence, while CNVs conferred greatest relative risk. Meta-analysis with cohorts ascertained for developmental delay (DD) (n = 91,605) yielded 373 genes associated with ASD/DD at FDR ≤ 0.001 (664 at FDR ≤ 0.05), some of which differed in relative frequency of mutation between ASD and DD cohorts. The DD-associated genes were enriched in transcriptomes of progenitor and immature neuronal cells, whereas genes showing stronger evidence in ASD were more enriched in maturing neurons and overlapped with schizophrenia-associated genes, emphasizing that these neuropsychiatric disorders may share common pathways to risk.
Affiliation(s)
- Jack M Fu
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - F Kyle Satterstrom
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Minshi Peng
- Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Harrison Brand
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Pediatric Surgical Research Laboratories, Department of Surgery, Massachusetts General Hospital, Boston, MA, USA
| | - Ryan L Collins
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Program in Bioinformatics and Integrative Genomics, Harvard Medical School, Boston, MA, USA
| | - Shan Dong
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Brie Wamsley
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Lambertus Klei
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - Lily Wang
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Bioinformatics and Integrative Genomics, Harvard Medical School, Boston, MA, USA
| | - Stephanie P Hao
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Pediatric Surgical Research Laboratories, Department of Surgery, Massachusetts General Hospital, Boston, MA, USA
| | - Christine R Stevens
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Caroline Cusick
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Mehrtash Babadi
- Data Sciences Platform, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Eric Banks
- Data Sciences Platform, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Brett Collins
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Sheila Dodge
- Genomics Platform, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Stacey B Gabriel
- Genomics Platform, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Laura Gauthier
- Data Sciences Platform, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Samuel K Lee
- Data Sciences Platform, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Lindsay Liang
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Alicia Ljungdahl
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Behrang Mahjani
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Laura Sloofman
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Andrey N Smirnov
- Data Sciences Platform, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Mafalda Barbosa
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Catalina Betancur
- Sorbonne Université, INSERM, CNRS, Neuroscience Paris Seine, Institut de Biologie Paris Seine, Paris, France
| | - Alfredo Brusco
- Department of Medical Sciences, University of Torino, Turin, Italy
- Medical Genetics Unit, 'Città della Salute e della Scienza' University Hospital, Turin, Italy
| | - Brian H Y Chung
- Department of Pediatrics and Adolescent Medicine, Duchess of Kent Children's Hospital, The University of Hong Kong, Hong Kong Special Administrative Region, China
| | - Edwin H Cook
- Institute for Juvenile Research, Department of Psychiatry, University of Illinois at Chicago, Chicago, IL, USA
| | - Michael L Cuccaro
- The John P Hussman Institute for Human Genomics, The University of Miami Miller School of Medicine, Miami, FL, USA
| | - Enrico Domenici
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | | | - J Jay Gargus
- Center for Autism Research and Translation, University of California Irvine, Irvine, CA, USA
| | - Gail E Herman
- The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
| | - Irva Hertz-Picciotto
- MIND (Medical Investigation of Neurodevelopmental Disorders) Institute, University of California Davis, Davis, CA, USA
| | - Patricia Maciel
- Life and Health Sciences Research Institute, School of Medicine, University of Minho, Braga, Portugal
| | - Dara S Manoach
- Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Maria Rita Passos-Bueno
- Centro de Pesquisas sobre o Genoma Humano e Células tronco, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
| | - Antonio M Persico
- Interdepartmental Program 'Autism 0-90', 'Gaetano Martino' University Hospital, University of Messina, Messina, Italy
| | - Alessandra Renieri
- Med Biotech Hub and Competence Center, Department of Medical Biotechnologies, University of Siena, Siena, Italy
- Medical Genetics, University of Siena, Siena, Italy
- Genetica Medica, Azienda Ospedaliera Universitaria Senese, Siena, Italy
| | - James S Sutcliffe
- Department of Molecular Physiology & Biophysics and Psychiatry, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Flora Tassone
- MIND (Medical Investigation of Neurodevelopmental Disorders) Institute, University of California Davis, Davis, CA, USA
- Department of Biochemistry and Molecular Medicine, University of California Davis, School of Medicine, Sacramento, CA, USA
| | - Elisabetta Trabetti
- Department of Neurosciences, Biomedicine and Movement Sciences, Section of Biology and Genetics, University of Verona, Verona, Italy
| | - Gabriele Campos
- Centro de Pesquisas sobre o Genoma Humano e Células tronco, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
| | - Simona Cardaropoli
- Department of Public Health and Pediatrics, University of Torino, Turin, Italy
| | - Diana Carli
- Department of Public Health and Pediatrics, University of Torino, Turin, Italy
| | - Marcus C Y Chan
- Department of Pediatrics and Adolescent Medicine, Duchess of Kent Children's Hospital, The University of Hong Kong, Hong Kong Special Administrative Region, China
| | - Chiara Fallerini
- Med Biotech Hub and Competence Center, Department of Medical Biotechnologies, University of Siena, Siena, Italy
- Medical Genetics, University of Siena, Siena, Italy
| | - Elisa Giorgio
- Department of Medical Sciences, University of Torino, Turin, Italy
| | - Ana Cristina Girardi
- Centro de Pesquisas sobre o Genoma Humano e Células tronco, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
| | - Emily Hansen-Kiss
- Department of Diagnostic and Biomedical Sciences, University of Texas Health Science Center at Houston, School of Dentistry, Houston, TX, USA
| | - So Lun Lee
- Department of Pediatrics and Adolescent Medicine, Duchess of Kent Children's Hospital, The University of Hong Kong, Hong Kong Special Administrative Region, China
| | - Carla Lintas
- Service for Neurodevelopmental Disorders, University Campus Bio-medico of Rome, Rome, Italy
| | - Yunin Ludena
- MIND (Medical Investigation of Neurodevelopmental Disorders) Institute, University of California Davis, Davis, CA, USA
| | - Rachel Nguyen
- Center for Autism Research and Translation, University of California Irvine, Irvine, CA, USA
| | - Lisa Pavinato
- Department of Medical Sciences, University of Torino, Turin, Italy
| | - Margaret Pericak-Vance
- The John P Hussman Institute for Human Genomics, The University of Miami Miller School of Medicine, Miami, FL, USA
| | - Isaac N Pessah
- MIND (Medical Investigation of Neurodevelopmental Disorders) Institute, University of California Davis, Davis, CA, USA
- Department of Molecular Biosciences, University of California Davis, School of Veterinary Medicine, Davis, CA, USA
| | - Rebecca J Schmidt
- MIND (Medical Investigation of Neurodevelopmental Disorders) Institute, University of California Davis, Davis, CA, USA
| | - Moyra Smith
- Center for Autism Research and Translation, University of California Irvine, Irvine, CA, USA
| | - Claudia I S Costa
- Centro de Pesquisas sobre o Genoma Humano e Células tronco, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
| | - Slavica Trajkova
- Department of Medical Sciences, University of Torino, Turin, Italy
| | - Jaqueline Y T Wang
- Centro de Pesquisas sobre o Genoma Humano e Células tronco, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
| | - Mullin H C Yu
- Department of Pediatrics and Adolescent Medicine, Duchess of Kent Children's Hospital, The University of Hong Kong, Hong Kong Special Administrative Region, China
| | - David J Cutler
- Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
| | - Silvia De Rubeis
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Joseph D Buxbaum
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| | - Mark J Daly
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA.
- Harvard Medical School, Boston, MA, USA.
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland.
| | - Bernie Devlin
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA.
| | - Kathryn Roeder
- Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA.
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA.
| | - Stephan J Sanders
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA.
| | - Michael E Talkowski
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA.
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Program in Bioinformatics and Integrative Genomics, Harvard Medical School, Boston, MA, USA.
| |
|
29
|
Fitzgerald T, Birney E. CNest: A novel copy number association discovery method uncovers 862 new associations from 200,629 whole-exome sequence datasets in the UK Biobank. Cell Genomics 2022; 2:100167. [PMID: 36779085 PMCID: PMC9903682 DOI: 10.1016/j.xgen.2022.100167] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/21/2021] [Revised: 04/11/2022] [Accepted: 07/13/2022] [Indexed: 10/15/2022]
Abstract
Copy number variation (CNV) is known to influence human traits, having a rich history of research into common and rare genetic disease, and although CNV is accepted as an important class of genomic variation, progress on copy-number-based genome-wide association studies (GWASs) from next-generation sequencing (NGS) data has been limited. Here we present a novel method for large-scale copy number analysis from NGS data generating robust copy number estimates and allowing copy number GWASs (CN-GWASs) to be performed genome-wide in discovery mode. We provide a detailed analysis in the UK Biobank resource and a specifically designed software package. We use these methods to perform CN-GWAS analysis across 78 human traits, discovering over 800 genetic associations that are likely to contribute strongly to trait distributions. Finally, we compare CNV and SNP association signals across the same traits and samples, defining specific CNV association classes.
Affiliation(s)
- Tomas Fitzgerald
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK
| | - Ewan Birney
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK
| |
|
30
|
Zuffardi O, Fichera M, Bonaglia MC. The embryo battle against adverse genomes: Are de novo terminal deletions the rescue of unfavorable zygotic imbalances? Eur J Med Genet 2022; 65:104532. [PMID: 35724817 DOI: 10.1016/j.ejmg.2022.104532] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/23/2021] [Revised: 04/02/2022] [Accepted: 05/21/2022] [Indexed: 11/03/2022]
Abstract
De novo distal deletions are structural variants considered to be already present in the zygote. However, investigations especially in the prenatal setting have documented that they are often in mosaic with cell lines in which the same deleted chromosome shows different types of aberrations such as: 1) neutral copy variants with loss of heterozygosity that replace the deleted region with equivalent portions of the homologous chromosome and create distal uniparental disomy (UPD); 2) derivative chromosomes where the deleted one ends with the distal region of another chromosome or has the shape of a ring; 3) U-type mirror dicentric or inv-dup del rearrangements. Unstable dicentrics had already been entailed as causative of terminal deletions even when no trace of the reciprocal inv-dup del had been detected. To clarify the mechanism of origin of distal deletions, we examined PubMed using as keywords: complex/mosaic chromosomal deletions, distal UPD, U-type dicentrics, inv-dup del chromosomes, excluding the recurrent inv-dup del(8p)s which are known to originate by NAHR at the maternal meiosis. The literature has shown that U-type dicentrics leading to nearly complete trisomy and therefore incompatible with zygotic survival underlie many types of de novo unbalanced rearrangements, including terminal deletions. In the early embryo, the position of the postzygotic breaks of the dicentric, the different ways of acquiring telomeres by the broken portions and the selection of the most favorable cell lines in the different tissues determine the prevalence of one or the other rearrangement. Multiple lines with simple terminal deletions, inv-dup dels, unbalanced translocations and segmental UPDs can coexist in various mosaic combinations although it is rare to identify them all in the blood. 
Regarding the origin of the dicentric, among the 30 cases of non-recurrent inv-dup del with sufficient genotyping information, paternal origin was markedly prevalent with consistently identical polymorphisms within the duplication region, regardless of parental origin. The non-random parental origin made any postzygotic origin unlikely and suggested the occurrence of these dicentrics mainly in spermatogenesis. This study strengthens the evidence that non-recurrent de novo structural rearrangements are often secondary to the rescue of a zygotic genome incompatible with embryo survival.
Affiliation(s)
- Orsetta Zuffardi
- Department of Molecular Medicine, University of Pavia, Pavia, Italy.
| | - Marco Fichera
- Department of Biomedical and Biotechnological Sciences, Medical Genetics, University of Catania, Catania, Italy; Oasi Research Institute-IRCCS, Troina, Italy.
| | - Maria Clara Bonaglia
- Cytogenetics Laboratory, Scientific Institute, IRCCS Eugenio Medea, Bosisio Parini, Lecco, Italy.
| |
|
31
|
Duan X, Pan M, Fan S. Comprehensive evaluation of structural variant genotyping methods based on long-read sequencing data. BMC Genomics 2022; 23:324. [PMID: 35461238 PMCID: PMC9034514 DOI: 10.1186/s12864-022-08548-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/07/2022] [Accepted: 04/11/2022] [Indexed: 12/28/2022] [Open Access]
Abstract
Background: Structural variants (SVs) play a crucial role in gene regulation, trait association, and disease in humans. SV genotyping has been extensively applied in genomics research and clinical diagnosis. Although a growing number of SV genotyping methods for long reads have been developed, a comprehensive performance assessment of these methods has yet to be done. Results: Based on one simulated and three real SV datasets, we performed an in-depth evaluation of five SV genotyping methods, including cuteSV, LRcaller, Sniffles, SVJedi, and VaPoR. The results show that for insertions and deletions, cuteSV and LRcaller have similar F1 scores (cuteSV, insertions: 0.69–0.90, deletions: 0.77–0.90 and LRcaller, insertions: 0.67–0.87, deletions: 0.74–0.91) and are superior to other methods. For duplications, inversions, and translocations, LRcaller yields the most accurate genotyping results (0.84, 0.68, and 0.47, respectively). When genotyping SVs located in tandem repeat region or with imprecise breakpoints, cuteSV (insertions and deletions) and LRcaller (duplications, inversions, and translocations) are better than other methods. In addition, we observed a decrease in F1 scores when the SV size increased. Finally, our analyses suggest that the F1 scores of these methods reach the point of diminishing returns at 20× depth of coverage. Conclusions: We present an in-depth benchmark study of long-read SV genotyping methods. Our results highlight the advantages and disadvantages of each genotyping method, which provide practical guidance for optimal application selection and prospective directions for tool improvement. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08548-y.
Affiliation(s)
- Xiaoke Duan
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai, 200438, China; MOE Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Sciences, Fudan University, Shanghai, 200433, China
| | - Mingpei Pan
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai, 200438, China; MOE Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Sciences, Fudan University, Shanghai, 200433, China
| | - Shaohua Fan
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai, 200438, China.
| |
|
32
|
Noyes MD, Harvey WT, Porubsky D, Sulovari A, Li R, Rose NR, Audano PA, Munson KM, Lewis AP, Hoekzema K, Mantere T, Graves-Lindsay TA, Sanders AD, Goodwin S, Kramer M, Mokrab Y, Zody MC, Hoischen A, Korbel JO, McCombie WR, Eichler EE. Familial long-read sequencing increases yield of de novo mutations. Am J Hum Genet 2022; 109:631-646. [PMID: 35290762 DOI: 10.1016/j.ajhg.2022.02.014] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 09/29/2021] [Accepted: 02/16/2022] [Indexed: 12/11/2022] [Open Access]
Abstract
Studies of de novo mutation (DNM) have typically excluded some of the most repetitive and complex regions of the genome because these regions cannot be unambiguously mapped with short-read sequencing data. To better understand the genome-wide pattern of DNM, we generated long-read sequence data from an autism parent-child quad with an affected female where no pathogenic variant had been discovered in short-read Illumina sequence data. We deeply sequenced all four individuals by using three sequencing platforms (Illumina, Oxford Nanopore, and Pacific Biosciences) and three complementary technologies (Strand-seq, optical mapping, and 10X Genomics). Using long-read sequencing, we initially discovered and validated 171 DNMs across two children, a 20% increase in the number of de novo single-nucleotide variants (SNVs) and indels when compared to short-read callsets. The number of DNMs further increased by 5% when considering a more complete human reference (T2T-CHM13) because of the recovery of events in regions absent from GRCh38 (e.g., three DNMs in heterochromatic satellites). In total, we validated 195 de novo germline mutations and 23 potential post-zygotic mosaic mutations across both children; the overall true substitution rate based on this integrated callset is at least 1.41 × 10⁻⁸ substitutions per nucleotide per generation. We also identified six de novo insertions and deletions in tandem repeats, two of which represent structural variants. We demonstrate that long-read sequencing and assembly, especially when combined with a more complete reference genome, increases the number of DNMs by >25% compared to previous studies, providing a more complete catalog of DNM compared to short-read data alone.
Affiliation(s)
- Michelle D Noyes
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Arvis Sulovari
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Ruiyang Li
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Nicholas R Rose
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Peter A Audano
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Tuomo Mantere
- Department of Human Genetics, Radboud University Medical Center, 6500 Nijmegen, the Netherlands; Laboratory of Cancer Genetics and Tumor Biology, Cancer and Translational Medicine Research Unit and Biocenter Oulu, University of Oulu, 90220 Oulu, Finland
| | | | - Ashley D Sanders
- European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
| | - Sara Goodwin
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Melissa Kramer
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Younes Mokrab
- Department of Human Genetics, Sidra Medicine, PO Box 26999, Doha, Qatar; Weill Cornell Medicine, PO Box 24144, Doha, Qatar; College of Health and Life Sciences, Hamad Bin Khalifa University, PO Box 34110, Doha, Qatar
| | | | - Alexander Hoischen
- Department of Human Genetics, Radboud University Medical Center, 6500 Nijmegen, the Netherlands; Radboud Institute of Medical Life Sciences and Department of Internal Medicine and Radboud Center for Infectious Diseases, Radboud University Medical Center, 6500 Nijmegen, the Netherlands
| | - Jan O Korbel
- European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
| | - W Richard McCombie
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.
| |
|
33
|
Bergeron LA, Besenbacher S, Turner T, Versoza CJ, Wang RJ, Price AL, Armstrong E, Riera M, Carlson J, Chen HY, Hahn MW, Harris K, Kleppe AS, López-Nandam EH, Moorjani P, Pfeifer SP, Tiley GP, Yoder AD, Zhang G, Schierup MH. The mutationathon highlights the importance of reaching standardization in estimates of pedigree-based germline mutation rates. eLife 2022; 11:73577. [PMID: 35018888 PMCID: PMC8830884 DOI: 10.7554/elife.73577] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/02/2021] [Accepted: 01/11/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
In the past decade, several studies have estimated the human per-generation germline mutation rate using large pedigrees. More recently, estimates for various nonhuman species have been published. However, methodological differences among studies in detecting germline mutations and estimating mutation rates make direct comparisons difficult. Here, we describe the many different steps involved in estimating pedigree-based mutation rates, including sampling, sequencing, mapping, variant calling, filtering, and appropriately accounting for false-positive and false-negative rates. For each step, we review the different methods and parameter choices that have been used in the recent literature. Additionally, we present the results from a ‘Mutationathon,’ a competition organized among five research labs to compare germline mutation rate estimates for a single pedigree of rhesus macaques. We report almost a twofold variation in the final estimated rate among groups using different post-alignment processing, calling, and filtering criteria, and provide details into the sources of variation across studies. Though the difference among estimates is not statistically significant, this discrepancy emphasizes the need for standardized methods in mutation rate estimations and the difficulty in comparing rates from different studies. Finally, this work aims to provide guidelines for computational and statistical benchmarks for future studies interested in identifying germline mutations from pedigrees.
Affiliation(s)
- Lucie A Bergeron
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Søren Besenbacher
- Department of Molecular Medicine (MOMA), Aarhus University, Aarhus N, Denmark
| | - Tychele Turner
- Department of Genetics, Washington University in St. Louis, Saint Louis, United States
| | - Cyril J Versoza
- Center for Evolution and Medicine, Arizona State University, Tempe, United States
| | - Richard J Wang
- Department of Biology, Indiana University, Bloomington, United States
| | - Alivia Lee Price
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Ellie Armstrong
- Department of Biology, Stanford University, Stanford, United States
| | - Meritxell Riera
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
| | - Jedidiah Carlson
- Department of Genome Sciences, University of Washington, Seattle, United States
| | - Hwei-Yen Chen
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, United States
| | - Kelley Harris
- Department of Genome Sciences, University of Washington, Seattle, United States
| | | | | | - Priya Moorjani
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
| | - Susanne P Pfeifer
- School of Life Sciences, Arizona State University, Tempe, United States
| | - George P Tiley
- Department of Biology, Duke University, Durham, United States
| | - Anne D Yoder
- Department of Biology, Duke University, Durham, United States
| | - Guojie Zhang
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | | |
|
34
|
Borges-Monroy R, Chu C, Dias C, Choi J, Lee S, Gao Y, Shin T, Park PJ, Walsh CA, Lee EA. Whole-genome analysis reveals the contribution of non-coding de novo transposon insertions to autism spectrum disorder. Mob DNA 2021; 12:28. [PMID: 34838103 PMCID: PMC8627061 DOI: 10.1186/s13100-021-00256-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/23/2021] [Accepted: 11/02/2021] [Indexed: 12/30/2022]
Abstract
Background: Retrotransposons have been implicated as causes of Mendelian disease, but their role in autism spectrum disorder (ASD) has not been systematically defined, because they are only called with adequate sensitivity from whole genome sequencing (WGS) data and a large enough cohort for this analysis has only recently become available. Results: We analyzed WGS data from a cohort of 2288 ASD families from the Simons Simplex Collection by establishing a scalable computational pipeline for retrotransposon insertion detection. We report 86,154 polymorphic retrotransposon insertions (including >60% not previously reported) and 158 de novo retrotransposition events. The overall burden of de novo events was similar between ASD individuals and unaffected siblings, with 1 de novo insertion per 29, 117, and 206 births for Alu, L1, and SVA, respectively, and 1 de novo insertion per 21 births total. However, ASD cases showed more de novo L1 insertions than expected in ASD genes. Additionally, we observed exonic insertions in loss-of-function intolerant genes, including a likely pathogenic exonic insertion in CSDE1, only in ASD individuals. Conclusions: These findings suggest a modest, but important, impact of intronic and exonic retrotransposon insertions in ASD, show the importance of WGS for their analysis, and highlight the utility of specific bioinformatic tools for high-throughput detection of retrotransposon insertions.
Affiliation(s)
- Rebeca Borges-Monroy
- Division of Genetics and Genomics, Manton Center for Orphan Disease, Boston Children's Hospital, Boston, MA, USA.,Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.,Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Chong Chu
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Caroline Dias
- Division of Genetics and Genomics, Manton Center for Orphan Disease, Boston Children's Hospital, Boston, MA, USA.,Division of Developmental Medicine, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
| | - Jaejoon Choi
- Division of Genetics and Genomics, Manton Center for Orphan Disease, Boston Children's Hospital, Boston, MA, USA.,Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.,Department of Genetics, Harvard Medical School, MA, Boston, USA
| | - Soohyun Lee
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Yue Gao
- Division of Genetics and Genomics, Manton Center for Orphan Disease, Boston Children's Hospital, Boston, MA, USA.,Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.,Department of Pediatrics, Harvard Medical School, MA, Boston, USA
| | - Taehwan Shin
- Division of Genetics and Genomics, Manton Center for Orphan Disease, Boston Children's Hospital, Boston, MA, USA.,Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.,Department of Pediatrics, Harvard Medical School, MA, Boston, USA
| | - Peter J Park
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Christopher A Walsh
- Division of Genetics and Genomics, Manton Center for Orphan Disease, Boston Children's Hospital, Boston, MA, USA.,Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.,Department of Pediatrics, Harvard Medical School, MA, Boston, USA.,Department of Neurology, Harvard Medical School, Boston, MA, USA.,Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA
| | - Eunjung Alice Lee
- Division of Genetics and Genomics, Manton Center for Orphan Disease, Boston Children's Hospital, Boston, MA, USA.,Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.,Department of Pediatrics, Harvard Medical School, MA, Boston, USA
| |
|
35
|
Maroilley T, Li X, Oldach M, Jean F, Stasiuk SJ, Tarailo-Graovac M. Deciphering complex genome rearrangements in C. elegans using short-read whole genome sequencing. Sci Rep 2021; 11:18258. [PMID: 34521941 PMCID: PMC8440550 DOI: 10.1038/s41598-021-97764-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/06/2021] [Accepted: 08/30/2021] [Indexed: 12/14/2022]
Abstract
Genomic rearrangements cause congenital disorders, cancer, and complex diseases in humans. Yet, they are still understudied in rare diseases because their detection is challenging, despite the advent of whole genome sequencing (WGS) technologies. Short-read (srWGS) and long-read WGS approaches are regularly compared, and the latter is commonly recommended in studies focusing on genomic rearrangements. However, srWGS is currently the most economical, accurate, and widely supported technology. In Caenorhabditis elegans (C. elegans), such variants, induced by various mutagenesis processes, have been used for decades to balance large genomic regions by preventing chromosomal crossover events and allowing the maintenance of lethal mutations. Interestingly, those chromosomal rearrangements have rarely been characterized on a molecular level. To evaluate the ability of srWGS to detect various types of complex genomic rearrangements, we sequenced three balancer strains using short-read Illumina technology. As we experimentally validated the breakpoints uncovered by srWGS, we showed that, by combining several types of analyses, srWGS enables the detection of a reciprocal translocation (eT1), a free duplication (sDp3), a large deletion (sC4), and chromoanagenesis events. Thus, applying srWGS to decipher real complex genomic rearrangements in model organisms may help in designing efficient bioinformatics pipelines with systematic detection of complex rearrangements in human genomes.
Affiliation(s)
- Tatiana Maroilley
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada.,Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB, T2N 4N1, Canada
| | - Xiao Li
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada.,Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB, T2N 4N1, Canada
| | - Matthew Oldach
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada.,Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB, T2N 4N1, Canada
| | - Francesca Jean
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada.,Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB, T2N 4N1, Canada
| | - Susan J Stasiuk
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada.,Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB, T2N 4N1, Canada
| | - Maja Tarailo-Graovac
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada.,Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB, T2N 4N1, Canada
| |
|
36
|
Gill PS, Clothier JL, Veerapandiyan A, Dweep H, Porter-Gill PA, Schaefer GB. Molecular Dysregulation in Autism Spectrum Disorder. J Pers Med 2021; 11:848. [PMID: 34575625 PMCID: PMC8466026 DOI: 10.3390/jpm11090848] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/16/2021] [Revised: 08/21/2021] [Accepted: 08/26/2021] [Indexed: 12/14/2022]
Abstract
Autism Spectrum Disorder (ASD) comprises a heterogeneous group of neurodevelopmental disorders with a strong heritable genetic component. At present, ASD is diagnosed solely by behavioral criteria. Advances in genomic analysis have contributed to numerous candidate genes for the risk of ASD, where rare mutations and common variants contribute to its susceptibility. Moreover, studies show rare de novo variants, copy number variation and single nucleotide polymorphisms (SNPs) also impact neurodevelopment signaling. Exploration of rare and common variants involved in common dysregulated pathways can provide new diagnostic and therapeutic strategies for ASD. Contributions of current innovative molecular strategies to understand etiology of ASD will be explored which are focused on whole exome sequencing (WES), whole genome sequencing (WGS), microRNA, long non-coding RNAs and CRISPR/Cas9 models. Some promising areas of pharmacogenomic and endophenotype directed therapies as novel personalized treatment and prevention will be discussed.
Affiliation(s)
- Pritmohinder S. Gill
- Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, AR 72202, USA
- Arkansas Children’s Research Institute, 13 Children’s Way, Little Rock, AR 72202, USA
| | - Jeffery L. Clothier
- Psychiatric Research Institute, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
| | - Aravindhan Veerapandiyan
- Pediatric Neurology, Arkansas Children’s Hospital, 1 Children’s Way, Little Rock, AR 72202, USA
| | - Harsh Dweep
- The Wistar Institute, 3601 Spruce St., Philadelphia, PA 19104, USA
| | | | - G. Bradley Schaefer
- Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, AR 72202, USA
- Genetics and Pediatrics, University of Arkansas for Medical Sciences, Little Rock, AR 72202, USA
- Arkansas Children’s Hospital NW, Springdale, AR 72762, USA
| |
|
37
|
Trost B, Loureiro LO, Scherer SW. Discovery of genomic variation across a generation. Hum Mol Genet 2021; 30:R174-R186. [PMID: 34296264 PMCID: PMC8490016 DOI: 10.1093/hmg/ddab209] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/28/2021] [Revised: 07/09/2021] [Accepted: 07/19/2021] [Indexed: 11/12/2022]
Abstract
Over the past 30 years (the timespan of a generation), advances in genomics technologies have revealed tremendous and unexpected variation in the human genome and have provided increasingly accurate answers to long-standing questions of how much genetic variation exists in human populations and to what degree the DNA complement changes between parents and offspring. Tracking the characteristics of these inherited and spontaneous (or de novo) variations has been the basis of the study of human genetic disease. From genome-wide microarray and next-generation sequencing scans, we now know that each human genome contains over 3 million single nucleotide variants when compared with the ~ 3 billion base pairs in the human reference genome, along with roughly an order of magnitude more DNA—approximately 30 megabase pairs (Mb)—being ‘structurally variable’, mostly in the form of indels and copy number changes. Additional large-scale variations include balanced inversions (average of 18 Mb) and complex, difficult-to-resolve alterations. Collectively, ~1% of an individual’s genome will differ from the human reference sequence. When comparing across a generation, fewer than 100 new genetic variants are typically detected in the euchromatic portion of a child’s genome. Driven by increasingly higher-resolution and higher-throughput sequencing technologies, newer and more accurate databases of genetic variation (for instance, more comprehensive structural variation data and phasing of combinations of variants along chromosomes) of worldwide populations will emerge to underpin the next era of discovery in human molecular genetics.
Affiliation(s)
- Brett Trost
- The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Livia O Loureiro
- The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Stephen W Scherer
- The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada.,McLaughlin Centre and Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
| |
|
38
|
Thomas GWC, Wang RJ, Nguyen J, Alan Harris R, Raveendran M, Rogers J, Hahn MW. Origins and Long-Term Patterns of Copy-Number Variation in Rhesus Macaques. Mol Biol Evol 2021; 38:1460-1471. [PMID: 33226085 PMCID: PMC8042740 DOI: 10.1093/molbev/msaa303] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/16/2022]
Abstract
Mutations play a key role in the development of disease in an individual and the evolution of traits within species. Recent work in humans and other primates has clarified the origins and patterns of single-nucleotide variants, showing that most arise in the father’s germline during spermatogenesis. It remains unknown whether larger mutations, such as deletions and duplications of hundreds or thousands of nucleotides, follow similar patterns. Such mutations lead to copy-number variation (CNV) within and between species, and can have profound effects by deleting or duplicating genes. Here, we analyze patterns of CNV mutations in 32 rhesus macaque individuals from 14 parent–offspring trios. We find the rate of CNV mutations per generation is low (less than one per genome) and we observe no correlation between parental age and the number of CNVs that are passed on to offspring. We also examine segregating CNVs within the rhesus macaque sample and compare them to a similar data set from humans, finding that both species have far more segregating deletions than duplications. We contrast this with long-term patterns of gene copy-number evolution between 17 mammals, where the proportion of deletions that become fixed along the macaque lineage is much smaller than the proportion of segregating deletions. These results suggest purifying selection acting on deletions, such that the majority of them are removed from the population over time. Rhesus macaques are an important biomedical model organism, so these results will aid in our understanding of this species and the disease models it supports.
Affiliation(s)
- Gregg W C Thomas
- Division of Biological Sciences, University of Montana, Missoula, MT, USA
| | - Richard J Wang
- Department of Biology, Indiana University, Bloomington, IN, USA
| | - Jelena Nguyen
- Department of Computer Science, Indiana University, Bloomington, IN, USA
| | - R Alan Harris
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.,Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Muthuswamy Raveendran
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.,Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Jeffrey Rogers
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.,Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN, USA.,Department of Computer Science, Indiana University, Bloomington, IN, USA
| |
|