1
|
Ma W, Chaisson M. Genotyping sequence-resolved copy number variation using pangenomes reveals paralog-specific global diversity and expression divergence of duplicated genes. bioRxiv 2025:2024.08.11.607269. [PMID: 39149335 PMCID: PMC11326217 DOI: 10.1101/2024.08.11.607269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Copy number variant (CNV) genes are important in evolution and disease, yet sequence variation in CNV genes remains a blind spot in large-scale studies. We present ctyper, a method that leverages pangenomes to produce allele-specific copy numbers with locally phased variants from next-generation sequencing (NGS) reads. Benchmarking on 3,351 CNV genes, including HLA, SMN, and CYP2D6, and 212 challenging medically relevant (CMR) genes that are poorly mapped by NGS, ctyper captures 96.5% of phased variants with ≥99.1% correctness of copy number on CNV genes and 94.8% of phased variants on CMR genes. Applying alignment-free algorithms, ctyper requires 1.5 hours per genome on a single CPU. The results improve prediction of gene expression compared to known expression quantitative trait loci (eQTL) variants. Allele-specific expression quantified divergent expression on 7.94% of paralogs and tissue-specific biases on 4.68% of paralogs. We found reduced expression of SMN2 due to SMN1 conversion, potentially affecting spinal muscular atrophy, and increased expression of translocated duplications of AMY2B. Overall, ctyper enables biobank-scale genotyping of CNV and CMR genes.
|
2
|
de los Angeles Becerra Rodriguez M, Gonzalez Muñoz E, Moore T. Oligodendrocyte-specific expression of PSG8-AS1 suggests a role in myelination with prognostic value in oligodendroglioma. Noncoding RNA Res 2024; 9:1061-1068. [PMID: 39022681 PMCID: PMC11254506 DOI: 10.1016/j.ncrna.2024.06.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 05/03/2024] [Accepted: 06/10/2024] [Indexed: 07/20/2024] Open
Abstract
The segmentally duplicated Pregnancy-specific glycoprotein (PSG) locus on chromosome 19q13 may be one of the most rapidly evolving in the human genome. It comprises ten coding genes (PSG1-9, 11) and one predominantly non-coding gene (PSG10) that are expressed in the placenta and gut, in addition to several poorly characterized long non-coding RNAs. We report that long non-coding RNA PSG8-AS1 has an oligodendrocyte-specific expression pattern and is co-expressed with genes encoding key myelin constituents. PSG8-AS1 exhibits two peaks of expression during human brain development coinciding with the most active periods of oligodendrogenesis and myelination. PSG8-AS1 orthologs were found in the genomes of several primates but significant expression was found only in the human, suggesting a recent evolutionary origin of its proposed role in myelination. Additionally, because co-deletion of chromosomes 1p/19q is a genomic marker of oligodendroglioma, expression of PSG8-AS1 was examined in these tumors. PSG8-AS1 may be a promising diagnostic biomarker for glioma, with prognostic value in oligodendroglioma.
Affiliation(s)
- Maria de los Angeles Becerra Rodriguez
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- SFI Centre for Research Training in Genomics Data Science, University College Cork, Cork, Ireland
| | - Elena Gonzalez Muñoz
- Instituto de Investigación Biomédica de Málaga y Plataforma en Nanomedicina-IBIMA Plataforma BIONAND, 29590, Málaga, Spain
- Universidad de Malaga, Dpto. Biología Celular, Genética y Fisiología, 29071, Málaga, Spain
| | - Tom Moore
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| |
|
3
|
Elron E, Maya I, Shefer-Averbuch N, Kahana S, Matar R, Klein K, Agmon-Fishman I, Gurevitch M, Basel-Salmon L, Levy M. The Diagnostic Yield of Chromosomal Microarray Analysis in Third-Trimester Fetal Abnormalities. Am J Perinatol 2024; 41:2232-2242. [PMID: 38688298 DOI: 10.1055/s-0044-1786514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2024]
Abstract
OBJECTIVE: This study aimed to determine the diagnostic yield of chromosomal microarray analysis (CMA) performed in cases of fetal abnormalities detected during the third trimester of pregnancy. STUDY DESIGN: A retrospective review of medical records was conducted for women who underwent amniocentesis at or beyond 28 weeks of gestation between January 2017 and February 2023. CMA results of pregnancies with abnormal sonographic findings not detected before 28 weeks were included. RESULTS: A total of 482 fetuses met the inclusion criteria. The average maternal age was 31.3 years, and the average gestational age at amniocentesis was 32.3 weeks. The overall diagnostic yield of CMA was 6.2% (30 clinically significant copy number variations [CNVs]). The yield was 16.4% in cases with two or more fetal malformations, while cases with a single anomaly revealed a diagnostic yield of 7.3%. Cases presenting isolated polyhydramnios or isolated fetal growth restriction had a lower yield of 9.3 and 5.4%, respectively. Of the 30 clinically significant cases, 19 (or 63.4%) exhibited recurrent CNVs. The remaining 11 cases (or 36.6%) presented unique CNVs. The theoretical yield of Noninvasive Prenatal Testing (NIPT) in our cohort is 2% for aneuploidy, which implies that it could potentially miss up to 70% of the significant findings that could be identified by CMA. In 80% of the fetuses (or 24 out of 30) with clinically significant CNVs, the structural abnormalities detected on fetal ultrasound examinations corresponded with the CMA results. CONCLUSION: The 6.2% detection rate of significant CNVs in late-onset fetal anomalies confirms the value of CMA in third-trimester amniocentesis. The findings underscore the necessity of CMA for detecting CNVs potentially overlooked by NIPT and emphasize the importance of thorough genetic counseling. KEY POINTS: · CMA yields 6.2% for third-trimester anomalies. · NIPT may miss 70% of CMA findings. · Ultrasound matched 80% of CMA results.
Affiliation(s)
- Eyal Elron
- Department of Neonatology, Schneider Children's Medical Center, Petah Tikva, Israel
- Pediatric Genetics Unit, Schneider Children's Medical Center, Petah Tikva, Israel
- The Raphael Recanati Genetic Institute, Rabin Medical Center, Petah Tikva, Israel
- Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Idit Maya
- The Raphael Recanati Genetic Institute, Rabin Medical Center, Petah Tikva, Israel
- Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Noa Shefer-Averbuch
- Pediatric Genetics Unit, Schneider Children's Medical Center, Petah Tikva, Israel
- The Raphael Recanati Genetic Institute, Rabin Medical Center, Petah Tikva, Israel
- Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- The Jesse Z. and Sara Lea Shafer Institute for Endocrinology and Diabetes, Schneider Children's Medical Center of Israel, The Raphael Recanati Genetics Institute, Rabin Medical Center, Beilinson Campus, Petah Tikva, Israel
| | - Sarit Kahana
- The Raphael Recanati Genetic Institute, Rabin Medical Center, Petah Tikva, Israel
| | - Reut Matar
- The Raphael Recanati Genetic Institute, Rabin Medical Center, Petah Tikva, Israel
| | - Kochav Klein
- The Raphael Recanati Genetic Institute, Rabin Medical Center, Petah Tikva, Israel
| | - Ifat Agmon-Fishman
- The Raphael Recanati Genetic Institute, Rabin Medical Center, Petah Tikva, Israel
| | - Merav Gurevitch
- The Raphael Recanati Genetic Institute, Rabin Medical Center, Petah Tikva, Israel
| | - Lina Basel-Salmon
- Pediatric Genetics Unit, Schneider Children's Medical Center, Petah Tikva, Israel
- The Raphael Recanati Genetic Institute, Rabin Medical Center, Petah Tikva, Israel
- Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Felsenstein Medical Research Center, Petach Tikva, Israel
| | - Michal Levy
- The Raphael Recanati Genetic Institute, Rabin Medical Center, Petah Tikva, Israel
- Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
| |
|
4
|
Shimada MK, Nishida T. Haplotype-Based Approach Represents Locus Specificity in the Genomic Diversification Process in Humans (Homo sapiens). Genes (Basel) 2024; 15:1554. [PMID: 39766821 PMCID: PMC11675571 DOI: 10.3390/genes15121554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2024] [Revised: 11/23/2024] [Accepted: 11/25/2024] [Indexed: 01/11/2025] Open
Abstract
BACKGROUND/OBJECTIVES: Recent progress in evolutionary genomics on human (Homo sapiens) populations has revealed complex demographic events and genomic changes. These include population expansion with complicated migration, substantial population structure, and ancient introgression from other hominins, as well as selection on human characteristics. Nevertheless, the genomic regions in which such evolutionary events took place have remained unclear. METHODS: Here, we focused on eight loci containing the haplotypes that were previously presented as atypical for the mutation pattern in sequence and/or geographic distribution pattern with the model of recent African origin, which constitute two major clusters: African only, and global. This was the consensus model before information regarding introgression from Neanderthal (Homo neanderthalensis) was available. We compared diversity in identical datasets of the modern human population genome, with the 1000 Genomes project among them. RESULTS/CONCLUSIONS: This study identified representative genomic regions that show traces of various demographic events and genomic changes that modern humans have undergone by categorizing the relationships in sequence similarity and in worldwide geographic distribution among haplotypes.
Affiliation(s)
- Makoto K. Shimada
- Center for Medical Science, Fujita Health University, Toyoake 470-1192, Aichi, Japan
| | | |
|
5
|
Guitart X, Porubsky D, Yoo D, Dougherty ML, Dishuck PC, Munson KM, Lewis AP, Hoekzema K, Knuth J, Chang S, Pastinen T, Eichler EE. Independent expansion, selection, and hypervariability of the TBC1D3 gene family in humans. Genome Res 2024; 34:1798-1810. [PMID: 39107043 DOI: 10.1101/gr.279299.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Accepted: 07/29/2024] [Indexed: 08/09/2024]
Abstract
TBC1D3 is a primate-specific gene family that has expanded in the human lineage and has been implicated in neuronal progenitor proliferation and expansion of the frontal cortex. The gene family and its expression have been challenging to investigate because it is embedded in high-identity and highly variable segmental duplications. We sequenced and assembled the gene family using long-read sequencing data from 34 humans and 11 nonhuman primate species. Our analysis shows that this particular gene family has independently duplicated in at least five primate lineages, and the duplicated loci are enriched at sites of large-scale chromosomal rearrangements on Chromosome 17. We find that all human copy-number variation maps to two distinct clusters located at Chromosome 17q12 and that humans are highly structurally variable at this locus, differing by as many as 20 copies and ∼1 Mbp in length depending on haplotypes. We also show evidence of positive selection, as well as a significant change in the predicted human TBC1D3 protein sequence. Last, we find that, despite multiple duplications, human TBC1D3 expression is limited to a subset of copies and, most notably, from a single paralog group: TBC1D3-CDKL. These observations may help explain why a gene potentially important in cortical development can be so variable in the human population.
Affiliation(s)
- Xavi Guitart
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - DongAhn Yoo
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Max L Dougherty
- Tisch Cancer Institute, Division of Hematology and Medical Oncology, The Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
| | - Philip C Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Jordan Knuth
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Stephen Chang
- Department of Biochemistry
- Department of Medicine, Division of Cardiovascular Medicine, Stanford University, Stanford, California 94305, USA
| | - Tomi Pastinen
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, Missouri 64108, USA
- Department of Pediatrics, School of Medicine, University of Missouri Kansas City, Kansas City, Missouri 64108, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
| |
|
6
|
Chen Y, Khan MZ, Wang X, Liang H, Ren W, Kou X, Liu X, Chen W, Peng Y, Wang C. Structural variations in livestock genomes and their associations with phenotypic traits: a review. Front Vet Sci 2024; 11:1416220. [PMID: 39600883 PMCID: PMC11588642 DOI: 10.3389/fvets.2024.1416220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2024] [Accepted: 10/29/2024] [Indexed: 11/29/2024] Open
Abstract
Genomic structural variation (SV) refers to differences in gene sequences between individuals on a genomic scale. It is widely distributed in the genome, primarily in the form of insertions, deletions, duplications, inversions, and translocations. Because SVs span long segments and cover large portions of the genome, they significantly impact the genetic characteristics and production performance of livestock and play a crucial role in studies of breed diversity, biological evolution, and disease correlation. Research on SVs contributes to an enhanced understanding of chromosome function and genetic characteristics and is important for understanding the mechanisms of hereditary diseases. In this article, we review the concept, classification, main formation mechanisms, detection methods, and advancement of research on SVs in the genomes of cattle, buffalo, equine, sheep, and goats, aiming to reveal the genetic basis of differences in phenotypic traits and adaptive genetic mechanisms through genomic research, which will provide a theoretical basis for better understanding and utilizing the genetic resources of herbivorous livestock.
Affiliation(s)
| | - Muhammad Zahoor Khan
- College of Agronomy and Agricultural Engineering Liaocheng University, Liaocheng, China
| | | | | | | | | | | | | | - Yongdong Peng
- College of Agronomy and Agricultural Engineering Liaocheng University, Liaocheng, China
| | - Changfa Wang
- College of Agronomy and Agricultural Engineering Liaocheng University, Liaocheng, China
| |
|
7
|
Versoza CJ, Jensen JD, Pfeifer SP. The landscape of structural variation in aye-ayes (Daubentonia madagascariensis). bioRxiv 2024:2024.11.08.622672. [PMID: 39605644 PMCID: PMC11601217 DOI: 10.1101/2024.11.08.622672] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 11/29/2024]
Abstract
Aye-ayes (Daubentonia madagascariensis) are one of the 25 most critically endangered primate species in the world. Endemic to Madagascar, their small and highly fragmented populations make them particularly vulnerable to both genetic disease and anthropogenic environmental changes. Over the past decade, conservation genomic efforts have largely focused on inferring and monitoring population structure based on single nucleotide variants to identify and protect critical areas of genetic diversity. However, the recent release of a highly contiguous genome assembly allows, for the first time, for the study of structural genomic variation (deletions, duplications, insertions, and inversions) which are likely to impact a substantial proportion of the species' genome. Based on whole-genome, short-read sequencing data from 14 individuals, >1,000 high-confidence autosomal structural variants were detected, affecting ~240 kb of the aye-aye genome. The majority of these variants (>85%) were deletions shorter than 200 bp, consistent with the notion that longer structural mutations are often associated with strongly deleterious fitness effects. For example, two deletions longer than 850 bp located within disease-linked genes were predicted to impose substantial fitness deficits owing to a resulting frameshift and gene fusion, respectively; whereas several other major effect variants outside of coding regions are likely to impact gene regulatory landscapes. Taken together, this first glimpse into the landscape of structural variation in aye-ayes will enable future opportunities to advance our understanding of the traits impacting the fitness of this endangered species, as well as allow for enhanced evolutionary comparisons across the full primate clade.
Affiliation(s)
- Cyril J. Versoza
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Jeffrey D. Jensen
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Susanne P. Pfeifer
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| |
|
8
|
Fornezza S, Delvecchio VS, Harvey WT, Dishuck PC, Eichler EE, Giannuzzi G. AGAP duplicons associate with structural diversity at Chromosome 10q11.22. Genome Res 2024; 34:1487-1499. [PMID: 39322278 PMCID: PMC11534156 DOI: 10.1101/gr.279454.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2024] [Accepted: 09/10/2024] [Indexed: 09/27/2024]
Abstract
The 10q11.22 chromosomal region is a duplication-rich interval of the human genome and one of the last to be fully assembled. It carries copy number-variable genes associated with intellectual disability, bipolar disorder, and obesity. In this study, we characterized the structural diversity at this locus by analyzing 64 haploid assemblies produced by the Human Pangenome Reference Consortium. We identified 11 alternative haplotypes that differ in the copy number and/or orientation of large genomic segments, ranging from hundreds of kilobase pairs (kbp) to over one megabase pair (Mbp). We uncovered a 2.4 Mbp size difference between the shortest and longest haplotypes. Breakpoint analysis revealed that genomic instability results from nonallelic homologous recombination between segmental duplication (SD) pairs with varying similarity (94.4%-99.6%). Nonetheless, these pairs generally recombine at positions where their identity is higher (>99.6%). Recurrent inversions occur with different breakpoints within the same inverted SD pair. Inversion polymorphisms shuffle the entire SD arrangement, creating new predispositions to copy-number variations. The SD architecture is associated with a catarrhine-specific subgroup of the AGAP gene family, which likely triggered the accumulation of SDs at this locus over the past 25 million years of human evolution. Our results reveal extensive structural diversity and genomic instability at the 10q11.22 locus, and expand the general understanding of the mutational mechanisms behind SD-mediated rearrangements.
Affiliation(s)
| | | | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Philip C Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
| | | |
|
9
|
Wang S, Shen Y, Lin Z, Miao Y, Wang C, Zhang W, Zhang Y. New genes driven by segmental duplications share a testis-specific expression pattern in the chromosome-level genome assembly of tree sparrow. Integr Zool 2024; 19:1004-1008. [PMID: 38014459 DOI: 10.1111/1749-4877.12789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Based on a chromosome-level genome assembly, a burst of new genes with different structures but a similar testis-specific expression pattern was detected in tree sparrow.
Affiliation(s)
- Shengnan Wang
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| | - Yue Shen
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| | - Zhaocun Lin
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| | - Yuquan Miao
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| | - Chengqi Wang
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| | - Wenya Zhang
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| | - Yingmei Zhang
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| |
|
10
|
Zhou B, Purmann C, Guo H, Shin G, Huang Y, Pattni R, Meng Q, Greer SU, Roychowdhury T, Wood RN, Ho M, zu Dohna H, Abyzov A, Hallmayer JF, Wong WH, Ji HP, Urban AE. Resolving the 22q11.2 deletion using CTLR-Seq reveals chromosomal rearrangement mechanisms and individual variance in breakpoints. Proc Natl Acad Sci U S A 2024; 121:e2322834121. [PMID: 39042694 PMCID: PMC11295037 DOI: 10.1073/pnas.2322834121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Accepted: 06/15/2024] [Indexed: 07/25/2024] Open
Abstract
We developed a generally applicable method, CRISPR/Cas9-targeted long-read sequencing (CTLR-Seq), to resolve, haplotype-specifically, the large and complex regions in the human genome that had been previously impenetrable to sequencing analysis, such as large segmental duplications (SegDups) and their associated genome rearrangements. CTLR-Seq combines in vitro Cas9-mediated cutting of the genome and pulsed-field gel electrophoresis to isolate intact large (i.e., up to 2,000 kb) genomic regions that encompass previously unresolvable genomic sequences. These targets are then sequenced (amplification-free) at high on-target coverage using long-read sequencing, allowing for their complete sequence assembly. We applied CTLR-Seq to the SegDup-mediated rearrangements that constitute the boundaries of, and give rise to, the 22q11.2 Deletion Syndrome (22q11DS), the most common human microdeletion disorder. We then performed de novo assembly to resolve, at base-pair resolution, the full sequence rearrangements and exact chromosomal breakpoints of 22q11DS (including all common subtypes). Across multiple patients, we found a high degree of variability for both the rearranged SegDup sequences and the exact chromosomal breakpoint locations, which coincide with various transposons within the 22q11.2 SegDups, suggesting that 22q11DS can be driven by transposon-mediated genome recombination. Guided by CTLR-Seq results from two 22q11DS patients, we performed three-dimensional chromosomal folding analysis for the 22q11.2 SegDups from patient-derived neurons and astrocytes and found chromosome interactions anchored within the SegDups to be both cell type-specific and patient-specific. Lastly, we demonstrated that CTLR-Seq enables cell type-specific analysis of DNA methylation patterns within the deletion haplotype of 22q11DS.
Affiliation(s)
- Bo Zhou
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305
- Stanford Maternal and Child Health Research Institute, Stanford University School of Medicine, Stanford, CA 94305
| | - Carolin Purmann
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305
- Stanford Maternal and Child Health Research Institute, Stanford University School of Medicine, Stanford, CA 94305
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305
| | - Hanmin Guo
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305
- Stanford Maternal and Child Health Research Institute, Stanford University School of Medicine, Stanford, CA 94305
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305
- Department of Statistics, Stanford University, Stanford, CA 94305
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305
| | - GiWon Shin
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305
| | - Yiling Huang
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305
| | - Reenal Pattni
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305
| | - Qingxi Meng
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305
| | - Stephanie U. Greer
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305
| | - Tanmoy Roychowdhury
- Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905
| | - Raegan N. Wood
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305
| | - Marcus Ho
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305
| | - Heinrich zu Dohna
- Department of Biology, American University of Beirut, Beirut 1107 2020, Lebanon
| | - Alexej Abyzov
- Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905
| | - Joachim F. Hallmayer
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305
| | - Wing H. Wong
- Department of Statistics, Stanford University, Stanford, CA 94305
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305
| | - Hanlee P. Ji
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305
| | - Alexander E. Urban
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305
- Stanford Maternal and Child Health Research Institute, Stanford University School of Medicine, Stanford, CA 94305
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305
- Program on Genetics of Brain Function, Stanford Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA 94305
| |
|
11
|
Hu J, Wang Z, Sun Z, Hu B, Ayoola AO, Liang F, Li J, Sandoval JR, Cooper DN, Ye K, Ruan J, Xiao CL, Wang D, Wu DD, Wang S. NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads. Genome Biol 2024; 25:107. [PMID: 38671502 PMCID: PMC11046930 DOI: 10.1186/s13059-024-03252-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Accepted: 04/17/2024] [Indexed: 04/28/2024] Open
Abstract
Long-read sequencing data, particularly those derived from the Oxford Nanopore sequencing platform, tend to exhibit high error rates. Here, we present NextDenovo, an efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. We apply NextDenovo to assemble 35 diverse human genomes from around the world using Nanopore long-read data. These genomes allow us to identify the landscape of segmental duplication and gene copy number variation in modern human populations. The use of NextDenovo should pave the way for population-scale long-read assembly using Nanopore long-read data.
Affiliation(s)
- Jiang Hu
- GrandOmics Biosciences, Beijing, 102206, China
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
| | - Zhuo Wang
- GrandOmics Biosciences, Beijing, 102206, China
| | - Zongyi Sun
- GrandOmics Biosciences, Beijing, 102206, China
| | - Benxia Hu
- Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
| | - Adeola Oluwakemi Ayoola
- Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
| | - Fan Liang
- GrandOmics Biosciences, Beijing, 102206, China
| | - Jingjing Li
- GrandOmics Biosciences, Beijing, 102206, China
| | - José R Sandoval
- Centro de Investigación de Genética y Biología Molecular (CIGBM), Instituto de Investigación, Facultad de Medicina, Universidad de San Martín de Porres, Lima, 15102, Peru
| | - David N Cooper
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Kai Ye
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
| | - Jue Ruan
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Chuan-Le Xiao
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, #7 Jinsui Road, Tianhe District, Guangzhou, China
| | - Depeng Wang
- GrandOmics Biosciences, Beijing, 102206, China.
| | - Dong-Dong Wu
- Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China.
- Kunming Primate Research Center, and National Research Facility for Phenotypic and Genetic Analysis of Model Animals (Primate Facility), National Resource Center for Non-Human Primates, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650107, China.
- Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China.
| | - Sheng Wang
- Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China.
- Yunnan Key Laboratory of Biodiversity Information, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China.
| |
Collapse
|
12
|
Schloissnig S, Pani S, Rodriguez-Martin B, Ebler J, Hain C, Tsapalou V, Söylev A, Hüther P, Ashraf H, Prodanov T, Asparuhova M, Hunt S, Rausch T, Marschall T, Korbel JO. Long-read sequencing and structural variant characterization in 1,019 samples from the 1000 Genomes Project. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.18.590093. [PMID: 38659906 PMCID: PMC11042266 DOI: 10.1101/2024.04.18.590093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Structural variants (SVs) contribute significantly to human genetic diversity and disease [1-4]. Previously, SVs have remained incompletely resolved by population genomics, with short-read sequencing facing limitations in capturing the whole spectrum of SVs at nucleotide resolution [5-7]. Here we leveraged nanopore sequencing [8] to construct an intermediate-coverage resource of 1,019 long-read genomes sampled within 26 human populations from the 1000 Genomes Project. By integrating linear and graph-based approaches for SV analysis via pangenome graph-augmentation, we uncover 167,291 sequence-resolved SVs in these samples, considerably advancing SV characterization compared to population-wide short-read sequencing studies [3,4]. Our analysis details diverse SV classes (deletions, duplications, insertions, and inversions) at population scale. LINE-1 and SVA retrotransposition activities frequently mediate transductions [9,10] of unique sequences, with both mobile element classes transducing sequences at either the 3'- or 5'-end, depending on the source element locus. Furthermore, analyses of SV breakpoint junctions suggest that a continuum of homology-mediated rearrangement processes is integral to SV formation, and highlight evidence for SV recurrence involving repeat sequences. Our open-access dataset underscores the transformative impact of long-read sequencing in advancing the characterisation of polymorphic genomic architectures, and provides a resource for guiding variant prioritisation in future long-read sequencing-based disease studies.
|
13
|
Laudanski K, Elmadhoun O, Mathew A, Kahn-Pascual Y, Kerfeld MJ, Chen J, Sisniega DC, Gomez F. Anesthetic Considerations for Patients with Hereditary Neuropathy with Liability to Pressure Palsies: A Narrative Review. Healthcare (Basel) 2024; 12:858. [PMID: 38667620 PMCID: PMC11050561 DOI: 10.3390/healthcare12080858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2024] [Revised: 03/28/2024] [Accepted: 03/29/2024] [Indexed: 04/28/2024] Open
Abstract
Hereditary neuropathy with liability to pressure palsies (HNPP) is an autosomal dominant demyelinating neuropathy characterized by an increased susceptibility to peripheral nerve injury from trauma, compression, or shear forces. Patients with this condition are unique, necessitating distinct considerations for anesthesia and surgical teams. This review describes the etiology, prevalence, clinical presentation, and management of HNPP and presents contemporary evidence and recommendations for optimal care for HNPP patients in the perioperative period. While the incidence of HNPP is reported at 7-16:100,000, this figure may be an underestimation due to underdiagnosis, further complicating medicolegal issues. With the subtle nature of symptoms associated with HNPP, patients with this condition may remain unrecognized during the perioperative period, posing significant risks. Several aspects of caring for this population, including anesthetic choices, intraoperative positioning, and monitoring strategy, may deviate from standard practices. As such, a tailored approach to caring for this unique population, coupled with meticulous preoperative planning, is crucial and requires a multidisciplinary approach.
Affiliation(s)
- Krzysztof Laudanski
  - Department of Anesthesiology and Perioperative Care, Mayo Clinic, Rochester, MN 55902, USA
- Omar Elmadhoun
  - Department of Anesthesiology and Perioperative Care, Mayo Clinic, Rochester, MN 55902, USA
- Amal Mathew
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA 19104, USA
- Yul Kahn-Pascual
  - St George’s University Hospitals NHS Foundation Trust, London SW17 0QT, UK
- Mitchell J. Kerfeld
  - Department of Anesthesiology and Perioperative Care, Mayo Clinic, Rochester, MN 55902, USA
- James Chen
  - Department of Anesthesiology and Perioperative Care, Mayo Clinic, Rochester, MN 55902, USA
- Daniella C. Sisniega
  - Department of Neurology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Francisco Gomez
  - Department of Neurology, University of Missouri, Columbia, MO 65211, USA
|
14
|
Meng A, Li X, Li Z, Miao F, Ma L, Li S, Sun W, Huang J, Yang G. Genome assembly of Melilotus officinalis provides a new reference genome for functional genomics. BMC Genom Data 2024; 25:37. [PMID: 38637749 PMCID: PMC11025269 DOI: 10.1186/s12863-024-01224-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Accepted: 04/10/2024] [Indexed: 04/20/2024] Open
Abstract
BACKGROUND: Sweet yellow clover (Melilotus officinalis) is a diploid plant (2n = 16) that is native to Europe. It is an excellent legume forage. It can both fix nitrogen and serve as a medicine. A genome assembly of Melilotus officinalis that was collected from Best corporation in Beijing is available based on Nanopore sequencing. The genome of Melilotus officinalis was sequenced, assembled, and annotated. RESULTS: The latest PacBio third generation HiFi assembly and sequencing strategies were used to produce a Melilotus officinalis genome assembly size of 1,066 Mbp, contig N50 = 5 Mbp, scaffold N50 = 130 Mbp, and complete benchmarking universal single-copy orthologs (BUSCOs) = 96.4%. This annotation produced 47,873 high-confidence gene models, which will substantially aid in our research on molecular breeding. A collinear analysis showed that Melilotus officinalis and Medicago truncatula shared conserved synteny. The expansion and contraction of gene families showed that Melilotus officinalis expanded by 565 gene families and shrank by 56 gene families. The contracted gene families were associated with response to stimulus, nucleotide binding, and small molecule binding. Thus, it is related to a family of genes associated with peptidase activity, which could lead to better stress tolerance in plants. CONCLUSIONS: In this study, the latest PacBio technology was used to assemble and sequence the genome of Melilotus officinalis and annotate its protein-coding genes. These results will expand the genomic resources available for Melilotus officinalis and should assist in subsequent research on sweet yellow clover plants.
Affiliation(s)
- Aoran Meng
  - Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, College of Grassland Science, Qingdao Agricultural University, 266109, Qingdao, China
- Xinru Li
  - Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, College of Grassland Science, Qingdao Agricultural University, 266109, Qingdao, China
- Zhiguang Li
  - Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, College of Grassland Science, Qingdao Agricultural University, 266109, Qingdao, China
- Fuhong Miao
  - Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, College of Grassland Science, Qingdao Agricultural University, 266109, Qingdao, China
- Lichao Ma
  - Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, College of Grassland Science, Qingdao Agricultural University, 266109, Qingdao, China
- Shuo Li
  - Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, College of Grassland Science, Qingdao Agricultural University, 266109, Qingdao, China
- Wenfei Sun
  - Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, College of Grassland Science, Qingdao Agricultural University, 266109, Qingdao, China
- Guofeng Yang
  - Key Laboratory of National Forestry and Grassland Administration on Grassland Resources and Ecology in the Yellow River Delta, College of Grassland Science, Qingdao Agricultural University, 266109, Qingdao, China
|
15
|
Guitart X, Porubsky D, Yoo D, Dougherty ML, Dishuck PC, Munson KM, Lewis AP, Hoekzema K, Knuth J, Chang S, Pastinen T, Eichler EE. Independent expansion, selection and hypervariability of the TBC1D3 gene family in humans. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.12.584650. [PMID: 38654825 PMCID: PMC11037872 DOI: 10.1101/2024.03.12.584650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
TBC1D3 is a primate-specific gene family that has expanded in the human lineage and has been implicated in neuronal progenitor proliferation and expansion of the frontal cortex. The gene family and its expression have been challenging to investigate because it is embedded in high-identity and highly variable segmental duplications. We sequenced and assembled the gene family using long-read sequencing data from 34 humans and 11 nonhuman primate species. Our analysis shows that this particular gene family has independently duplicated in at least five primate lineages, and the duplicated loci are enriched at sites of large-scale chromosomal rearrangements on chromosome 17. We find that most humans vary along two TBC1D3 clusters where human haplotypes are highly variable in copy number, differing by as many as 20 copies, and structure (structural heterozygosity 90%). We also show evidence of positive selection, as well as a significant change in the predicted human TBC1D3 protein sequence. Lastly, we find that, despite multiple duplications, human TBC1D3 expression is limited to a subset of copies and, most notably, from a single paralog group: TBC1D3-CDKL. These observations may help explain why a gene potentially important in cortical development can be so variable in the human population.
Affiliation(s)
- Xavi Guitart
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- DongAhn Yoo
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Max L. Dougherty
  - Tisch Cancer Institute, Division of Hematology and Medical Oncology, The Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Philip C. Dishuck
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Katherine M. Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Alexandra P. Lewis
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Jordan Knuth
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Stephen Chang
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University, Stanford, CA, USA
- Tomi Pastinen
  - Department of Pediatrics, Genomic Medicine Center, Children’s Mercy Kansas City, Kansas City, MO, USA
  - Department of Pediatrics, School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
|
16
|
Bukhman YV, Morin PA, Meyer S, Chu LF, Jacobsen JK, Antosiewicz-Bourget J, Mamott D, Gonzales M, Argus C, Bolin J, Berres ME, Fedrigo O, Steill J, Swanson SA, Jiang P, Rhie A, Formenti G, Phillippy AM, Harris RS, Wood JMD, Howe K, Kirilenko BM, Munegowda C, Hiller M, Jain A, Kihara D, Johnston JS, Ionkov A, Raja K, Toh H, Lang A, Wolf M, Jarvis ED, Thomson JA, Chaisson MJP, Stewart R. A High-Quality Blue Whale Genome, Segmental Duplications, and Historical Demography. Mol Biol Evol 2024; 41:msae036. [PMID: 38376487 PMCID: PMC10919930 DOI: 10.1093/molbev/msae036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Revised: 01/11/2024] [Accepted: 01/22/2024] [Indexed: 02/21/2024] Open
Abstract
The blue whale, Balaenoptera musculus, is the largest animal known to have ever existed, making it an important case study in longevity and resistance to cancer. To further this and other blue whale-related research, we report a reference-quality, long-read-based genome assembly of this fascinating species. We assembled the genome from PacBio long reads and utilized Illumina/10×, optical maps, and Hi-C data for scaffolding, polishing, and manual curation. We also provided long read RNA-seq data to facilitate the annotation of the assembly by NCBI and Ensembl. Additionally, we annotated both haplotypes using TOGA and measured the genome size by flow cytometry. We then compared the blue whale genome with other cetaceans and artiodactyls, including vaquita (Phocoena sinus), the world's smallest cetacean, to investigate blue whale's unique biological traits. We found a dramatic amplification of several genes in the blue whale genome resulting from a recent burst in segmental duplications, though the possible connection between this amplification and giant body size requires further study. We also discovered sites in the insulin-like growth factor-1 gene correlated with body size in cetaceans. Finally, using our assembly to examine the heterozygosity and historical demography of Pacific and Atlantic blue whale populations, we found that the genomes of both populations are highly heterozygous and that their genetic isolation dates to the last interglacial period. Taken together, these results indicate how a high-quality, annotated blue whale genome will serve as an important resource for biology, evolution, and conservation research.
Affiliation(s)
- Yury V Bukhman
  - Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
- Phillip A Morin
  - Southwest Fisheries Science Center, National Oceanic and Atmospheric Administration (NOAA), La Jolla, CA 92037, USA
- Susanne Meyer
  - Neuroscience Research Institute, University of California, Santa Barbara, CA, USA
- Li-Fang Chu
  - Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
  - Department of Comparative Biology and Experimental Medicine, University of Calgary, Calgary, Canada
- Daniel Mamott
  - Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
- Maylie Gonzales
  - Neuroscience Research Institute, University of California, Santa Barbara, CA, USA
- Cara Argus
  - Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
- Jennifer Bolin
  - Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
- Mark E Berres
  - University of Wisconsin Biotechnology Center, Bioinformatics Resource Center, University of Wisconsin - Madison, Madison, WI 53706, USA
- Olivier Fedrigo
  - Vertebrate Genome Lab, The Rockefeller University, New York, NY 10065, USA
- John Steill
  - Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
- Scott A Swanson
  - Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
- Peng Jiang
  - Center for Gene Regulation in Health and Disease (GRHD), Cleveland State University, Cleveland, OH, USA
  - Department of Biological, Geological and Environmental Sciences, Cleveland State University, Cleveland, OH, USA
  - Center for RNA Science and Therapeutics, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Arang Rhie
  - Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD 20892, USA
- Giulio Formenti
  - Laboratory of Neurogenetics of Language, The Rockefeller University/HHMI, New York, NY 10065, USA
- Adam M Phillippy
  - Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD 20892, USA
- Robert S Harris
  - Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
- Kerstin Howe
  - Tree of Life, Wellcome Sanger Institute, Cambridge CB10 1SA, UK
- Bogdan M Kirilenko
  - LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
  - Senckenberg Research Institute, 60325 Frankfurt, Germany
  - Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, 60438 Frankfurt, Germany
- Chetan Munegowda
  - LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
  - Senckenberg Research Institute, 60325 Frankfurt, Germany
  - Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, 60438 Frankfurt, Germany
- Michael Hiller
  - LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
  - Senckenberg Research Institute, 60325 Frankfurt, Germany
  - Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, 60438 Frankfurt, Germany
- Aashish Jain
  - Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- J Spencer Johnston
  - Department of Entomology, Texas A&M University, College Station, TX 77843, USA
- Alexander Ionkov
  - Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
- Kalpana Raja
  - Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
- Huishi Toh
  - Neuroscience Research Institute, University of California, Santa Barbara, CA, USA
- Aimee Lang
  - Southwest Fisheries Science Center, National Oceanic and Atmospheric Administration (NOAA), La Jolla, CA 92037, USA
- Magnus Wolf
  - Institute for Evolution and Biodiversity (IEB), University of Muenster, 48149, Muenster, Germany
  - Senckenberg Biodiversity and Climate Research Centre (BiK-F), Frankfurt am Main, Germany
- Erich D Jarvis
  - Vertebrate Genome Lab, The Rockefeller University, New York, NY 10065, USA
  - Laboratory of Neurogenetics of Language, The Rockefeller University/HHMI, New York, NY 10065, USA
- James A Thomson
  - Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
  - Department of Molecular, Cellular and Developmental Biology, University of California Santa Barbara, Santa Barbara, CA 93106, USA
  - Department of Cell and Regenerative Biology, University of Wisconsin School of Medicine and Public Health, Madison, WI 53726, USA
- Mark J P Chaisson
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, Los Angeles, CA 90089, USA
- Ron Stewart
  - Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
|
17
|
Oehler J, Morrow CA, Whitby MC. Gene duplication and deletion caused by over-replication at a fork barrier. Nat Commun 2023; 14:7730. [PMID: 38007544 PMCID: PMC10676400 DOI: 10.1038/s41467-023-43494-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/17/2023] [Accepted: 11/10/2023] [Indexed: 11/27/2023] Open
Abstract
Replication fork stalling can provoke fork reversal to form a four-way DNA junction. This remodelling of the replication fork can facilitate repair, aid bypass of DNA lesions, and enable replication restart, but may also pose a risk of over-replication during fork convergence. We show that replication fork stalling at a site-specific barrier in fission yeast can induce gene duplication-deletion rearrangements that are independent of replication restart-associated template switching and Rad51-dependent multi-invasion. Instead, they resemble targeted gene replacements (TGRs), requiring the DNA annealing activity of Rad52, the 3'-flap nuclease Rad16-Swi10, and mismatch repair protein Msh2. We propose that excess DNA, generated during the merging of a canonical fork with a reversed fork, can be liberated by a nuclease and integrated at an ectopic site via a TGR-like mechanism. This highlights how over-replication at replication termination sites can threaten genome stability in eukaryotes.
Affiliation(s)
- Judith Oehler
  - Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, UK
- Carl A Morrow
  - Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, UK
- Matthew C Whitby
  - Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, UK
|
18
|
Feng LY, Lin PF, Xu RJ, Kang HQ, Gao LZ. Comparative Genomic Analysis of Asian Cultivated Rice and Its Wild Progenitor (Oryza rufipogon) Has Revealed Evolutionary Innovation of the Pentatricopeptide Repeat Gene Family through Gene Duplication. Int J Mol Sci 2023; 24:16313. [PMID: 38003501 PMCID: PMC10671101 DOI: 10.3390/ijms242216313] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/20/2023] [Revised: 11/10/2023] [Accepted: 11/12/2023] [Indexed: 11/26/2023] Open
Abstract
The pentatricopeptide repeat (PPR) gene family is one of the largest gene families in land plants. However, current knowledge about the evolution of the PPR gene family remains largely limited. In this study, we performed a comparative genomic analysis of the PPR gene family in O. sativa and its wild progenitor, O. rufipogon, and outlined a comprehensive landscape of gene duplications. Our findings suggest that the majority of PPR genes originated from dispersed duplications. Although segmental duplications have only expanded approximately 11.30% and 13.57% of the PPR gene families in the O. sativa and O. rufipogon genomes, we interestingly obtained evidence that segmental duplication promotes the structural diversity of PPR genes through incomplete gene duplications. In the O. sativa and O. rufipogon genomes, 10 (~33.33%) and 22 pairs of gene duplications (~45.83%) had non-PPR paralogous genes through incomplete gene duplication. Segmental duplications leading to incomplete gene duplications might result in the acquisition of domains, thus promoting functional innovation and structural diversification of PPR genes. This study offers a unique perspective on the evolution of PPR gene structures and underscores the potential role of segmental duplications in PPR gene structural diversity.
Affiliation(s)
- Li-Ying Feng
  - Institution of Genomics and Bioinformatics, South China Agricultural University, Guangzhou 510642, China
- Pei-Fan Lin
  - Institution of Genomics and Bioinformatics, South China Agricultural University, Guangzhou 510642, China
- Rong-Jing Xu
  - Tropical Biodiversity and Genomics Research Center, Hainan University, Haikou 570228, China
- Hai-Qi Kang
  - Tropical Biodiversity and Genomics Research Center, Hainan University, Haikou 570228, China
- Li-Zhi Gao
  - Institution of Genomics and Bioinformatics, South China Agricultural University, Guangzhou 510642, China
  - Tropical Biodiversity and Genomics Research Center, Hainan University, Haikou 570228, China
|
19
|
Vance Z, McLysaght A. Ohnologs and SSD Paralogs Differ in Genomic and Expression Features Related to Dosage Constraints. Genome Biol Evol 2023; 15:evad174. [PMID: 37776514 PMCID: PMC10563793 DOI: 10.1093/gbe/evad174] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/14/2023] [Revised: 09/21/2023] [Accepted: 09/26/2023] [Indexed: 10/02/2023] Open
Abstract
Gene duplication is recognized as a critical process in genome evolution; however, many questions about this process remain unanswered. Although gene duplicability has been observed to differ by duplication mechanism and evolutionary rate, there is so far no broad characterization of its determinants. Many features correlate with this difference in duplicability; however, our ability to exploit these observations to advance our understanding of the role of duplication in evolution is hampered by limitations within existing work. In particular, the existence of methodological differences across studies impedes meaningful comparison. Here, we use consistent definitions of duplicability in the human lineage to explore these associations, allow resolution of the impact of confounding factors, and define the overall relevance of individual features. Using a classifier approach and controlling for the confounding effect of duplicate longevity, we find a subset of gene features important in differentiating genes duplicable by small-scale duplication from those duplicable by whole-genome duplication, revealing critical roles for gene dosage and expression costs in duplicability. We further delve into patterns of functional enrichment and find a lack of constraint on duplicate retention in any context for genes duplicable by small-scale duplication.
Affiliation(s)
- Zoe Vance
  - Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
- Aoife McLysaght
  - Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
|
20
|
Soto DC, Uribe-Salazar JM, Shew CJ, Sekar A, McGinty S, Dennis MY. Genomic structural variation: A complex but important driver of human evolution. AMERICAN JOURNAL OF BIOLOGICAL ANTHROPOLOGY 2023; 181 Suppl 76:118-144. [PMID: 36794631 PMCID: PMC10329998 DOI: 10.1002/ajpa.24713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2022] [Revised: 01/21/2023] [Accepted: 02/05/2023] [Indexed: 02/17/2023]
Abstract
Structural variants (SVs), including duplications, deletions, and inversions of DNA, can have significant genomic and functional impacts but are technically difficult to identify and assay compared with single-nucleotide variants. With the aid of new genomic technologies, it has become clear that SVs account for significant differences across and within species. This phenomenon is particularly well-documented for humans and other primates due to the wealth of sequence data available. In great apes, SVs affect a larger number of nucleotides than single-nucleotide variants, with many identified SVs exhibiting population and species specificity. In this review, we highlight the importance of SVs in human evolution by examining (1) how they have shaped great ape genomes, resulting in sensitized regions associated with traits and diseases, (2) their impact on gene functions and regulation, which subsequently has played a role in natural selection, and (3) the role of gene duplications in human brain evolution. We further discuss how to incorporate SVs in research, including the strengths and limitations of various genomic approaches. Finally, we propose future considerations in integrating existing data and biospecimens with the ever-expanding SV compendium propelled by biotechnology advancements.
Collapse
Affiliation(s)
- Daniela C. Soto
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
- Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
| | - José M. Uribe-Salazar
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
- Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
| | - Colin J. Shew
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
- Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
| | - Aarthi Sekar
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
- Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
| | - Sean McGinty
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
- Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
| | - Megan Y. Dennis
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
- Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
| |
|
21
|
Erenpreisa J, Vainshelbaum NM, Lazovska M, Karklins R, Salmina K, Zayakin P, Rumnieks F, Inashkina I, Pjanova D, Erenpreiss J. The Price of Human Evolution: Cancer-Testis Antigens, the Decline in Male Fertility and the Increase in Cancer. Int J Mol Sci 2023; 24:11660. [PMID: 37511419 PMCID: PMC10380301 DOI: 10.3390/ijms241411660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 07/15/2023] [Accepted: 07/17/2023] [Indexed: 07/30/2023] Open
Abstract
The increasing frequency of general and particularly male cancer coupled with the reduction in male fertility seen worldwide motivated us to seek a potential evolutionary link between these two phenomena, concerning the reproductive transcriptional modules observed in cancer and the expression of cancer-testis antigens (CTA). The phylostratigraphy analysis of the human genome allowed us to link the early evolutionary origin of cancer via the reproductive life cycles of the unicellulars and early multicellulars, potentially driving soma-germ transition, female meiosis, and the parthenogenesis of polyploid giant cancer cells (PGCCs), with the expansion of the CTA multi-families, very late during their evolution. CTA adaptation was aided by retrovirus domestication in the unstable genomes of mammals, for protecting male fertility in stress conditions, particularly that of humans, as compensation for the energy consumption of a large complex brain which also exploited retrotransposition. We found that the early and late evolutionary branches of human cancer are united by the immunity-proto-placental network, which evolved in the Cambrian and shares stress regulators with the finely-tuned sex determination system. We further propose that social stress and endocrine disruption caused by environmental pollution with organic materials, which alter sex determination in male foetuses and further spermatogenesis in adults, bias the development of PGCC-parthenogenetic cancer by default.
Affiliation(s)
| | | | - Marija Lazovska
- Molecular Genetics Scientific Laboratory, Riga Stradins University, Dzirciema 16, LV-1007 Riga, Latvia
| | - Roberts Karklins
- Molecular Genetics Scientific Laboratory, Riga Stradins University, Dzirciema 16, LV-1007 Riga, Latvia
| | - Kristine Salmina
- Latvian Biomedical Research and Study Centre, Ratsupites 1-1k, LV-1067 Riga, Latvia
| | - Pawel Zayakin
- Latvian Biomedical Research and Study Centre, Ratsupites 1-1k, LV-1067 Riga, Latvia
| | - Felikss Rumnieks
- Latvian Biomedical Research and Study Centre, Ratsupites 1-1k, LV-1067 Riga, Latvia
| | - Inna Inashkina
- Latvian Biomedical Research and Study Centre, Ratsupites 1-1k, LV-1067 Riga, Latvia
| | - Dace Pjanova
- Latvian Biomedical Research and Study Centre, Ratsupites 1-1k, LV-1067 Riga, Latvia
- Molecular Genetics Scientific Laboratory, Riga Stradins University, Dzirciema 16, LV-1007 Riga, Latvia
| | - Juris Erenpreiss
- Molecular Genetics Scientific Laboratory, Riga Stradins University, Dzirciema 16, LV-1007 Riga, Latvia
- Clinic iVF-Riga, Zala 1, LV-1010 Riga, Latvia
| |
|
22
|
Qian R, Xie F, Zhang W, Kong J, Zhou X, Wang C, Li X. Genome-wide detection of CNV regions between Anqing six-end-white and Duroc pigs. Mol Cytogenet 2023; 16:12. [PMID: 37400846 PMCID: PMC10316616 DOI: 10.1186/s13039-023-00646-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 06/19/2023] [Indexed: 07/05/2023] Open
Abstract
BACKGROUND Anqing six-end-white pig is a native breed in Anhui Province. The pigs have the disadvantages of a slow growth rate, low proportion of lean meat, and thick back fat, but feature the advantages of strong stress resistance and excellent meat quality. Duroc pig is an introduced pig breed with a fast growth rate and high proportion of lean meat. With the latter breed featuring superior growth characteristics but inferior meat quality traits, the underlying molecular mechanism that causes these phenotypic differences between Chinese and foreign pigs is still unclear. RESULTS In this study, copy number variation (CNV) detection was performed using the re-sequencing data of Anqing six-end-white pigs and Duroc pigs. A total of 65,701 CNVs were obtained. After merging the CNVs with overlapping genomic positions, 881 CNV regions (CNVRs) were obtained. Based on the obtained CNVR information combined with their positions on the 18 chromosomes, a whole-genome map of the pig CNVs was drawn. GO analysis of the genes in the CNVRs showed that they were primarily involved in the cellular processes of proliferation, differentiation, and adhesion, and primarily involved in the biological processes of fat metabolism, reproductive traits, and immune processes. CONCLUSION The difference analysis of the CNVs between the Chinese and foreign pig breeds showed that the CNV of the Anqing six-end-white pig genome was higher than that of the introduced pig breed Duroc. Six genes related to fat metabolism, reproductive performance, and stress resistance were found in genome-wide CNVRs (DPF3, LEPR, MAP2K6, PPARA, TRAF6, NLRP4).
Affiliation(s)
- Rong Qian
- Institute of Agricultural Economics and Information, Anhui Academy of Agricultural Sciences, Hefei, 230031, Anhui, China
| | - Fei Xie
- College of Animal Science, Anhui Science and Technology University, Fengyang County, 233100, Anhui Province, China
| | - Wei Zhang
- Institute of Animal Husbandry and Veterinary Medicine, Anhui Academy of Agricultural Sciences, Hefei, 230031, Anhui, China
| | - JuanJuan Kong
- Institute of Agricultural Economics and Information, Anhui Academy of Agricultural Sciences, Hefei, 230031, Anhui, China
| | - Xueli Zhou
- Institute of Animal Husbandry and Veterinary Medicine, Anhui Academy of Agricultural Sciences, Hefei, 230031, Anhui, China
| | - Chonglong Wang
- Institute of Animal Husbandry and Veterinary Medicine, Anhui Academy of Agricultural Sciences, Hefei, 230031, Anhui, China.
| | - Xiaojin Li
- College of Animal Science, Anhui Science and Technology University, Fengyang County, 233100, Anhui Province, China.
| |
|
23
|
Sun YH, Cui H, Song C, Shen JT, Zhuo X, Wang RH, Yu X, Ndamba R, Mu Q, Gu H, Wang D, Murthy GG, Li P, Liang F, Liu L, Tao Q, Wang Y, Orlowski S, Xu Q, Zhou H, Jagne J, Gokcumen O, Anthony N, Zhao X, Li XZ. Amniotes co-opt intrinsic genetic instability to protect germ-line genome integrity. Nat Commun 2023; 14:812. [PMID: 36781861 PMCID: PMC9925758 DOI: 10.1038/s41467-023-36354-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/14/2022] [Accepted: 01/27/2023] [Indexed: 02/15/2023] Open
Abstract
Unlike PIWI-interacting RNA (piRNA) in other species that mostly target transposable elements (TEs), >80% of piRNAs in adult mammalian testes lack obvious targets. However, mammalian piRNA sequences and piRNA-producing loci evolve more rapidly than the rest of the genome for unknown reasons. Here, through comparative studies of chickens, ducks, mice, and humans, as well as long-read nanopore sequencing on diverse chicken breeds, we find that piRNA loci across amniotes experience: (1) a high local mutation rate of structural variations (SVs, mutations ≥ 50 bp in size); (2) positive selection to suppress young and actively mobilizing TEs commencing at the pachytene stage of meiosis during germ cell development; and (3) negative selection to purge deleterious SV hotspots. Our results indicate that genetic instability at pachytene piRNA loci, while producing certain pathogenic SVs, also protects genome integrity against TE mobilization by driving the formation of rapid-evolving piRNA sequences.
Affiliation(s)
- Yu H Sun
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| | - Hongxiao Cui
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, 712100, China
| | - Chi Song
- College of Public Health, Division of Biostatistics, The Ohio State University, Columbus, OH, 43210, USA
| | - Jiafei Teng Shen
- International Institutes of Medicine, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, Zhejiang, 322000, China
| | - Xiaoyu Zhuo
- Department of Genetics, The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, 63110, USA
| | - Ruoqiao Huiyi Wang
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, 712100, China
| | - Xiaohui Yu
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, 712100, China
| | - Rudo Ndamba
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| | - Qian Mu
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| | - Hanwen Gu
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| | - Duolin Wang
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| | - Gayathri Guru Murthy
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| | - Pidong Li
- Grandomics Biosciences Co., Ltd, Beijing, 102206, China
| | - Fan Liang
- Grandomics Biosciences Co., Ltd, Beijing, 102206, China
| | - Lei Liu
- Grandomics Biosciences Co., Ltd, Beijing, 102206, China
| | - Qing Tao
- Grandomics Biosciences Co., Ltd, Beijing, 102206, China
| | - Ying Wang
- Department of Animal Science, University of California, Davis, CA, 95616, USA
| | - Sara Orlowski
- Department of Poultry Science, University of Arkansas, Fayetteville, AR, 72701, USA
| | - Qi Xu
- Department of Animal Science, McGill University, Quebec, H9X 3V9, Canada
| | - Huaijun Zhou
- Department of Animal Science, University of California, Davis, CA, 95616, USA
| | - Jarra Jagne
- Animal Health Diagnostic Center, Cornell University College of Veterinary Medicine, Ithaca, NY, 14850, USA
| | - Omer Gokcumen
- Department of Biological Sciences, University at Buffalo, State University of New York, Buffalo, NY, 14260, USA
| | - Nick Anthony
- Department of Poultry Science, University of Arkansas, Fayetteville, AR, 72701, USA
| | - Xin Zhao
- Department of Animal Science, McGill University, Quebec, H9X 3V9, Canada.
| | - Xin Zhiguo Li
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA.
| |
|
24
|
Serrano C, Lopes-Marques M, Amorim A, João Prata M, Azevedo L. A partial duplication of an X-linked gene exclusive of a primate lineage (Macaca). Gene 2023; 851:146997. [DOI: 10.1016/j.gene.2022.146997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Revised: 10/12/2022] [Accepted: 10/18/2022] [Indexed: 11/04/2022]
|
25
|
Toh H, Yang C, Formenti G, Raja K, Yan L, Tracey A, Chow W, Howe K, Bergeron LA, Zhang G, Haase B, Mountcastle J, Fedrigo O, Fogg J, Kirilenko B, Munegowda C, Hiller M, Jain A, Kihara D, Rhie A, Phillippy AM, Swanson SA, Jiang P, Clegg DO, Jarvis ED, Thomson JA, Stewart R, Chaisson MJP, Bukhman YV. A haplotype-resolved genome assembly of the Nile rat facilitates exploration of the genetic basis of diabetes. BMC Biol 2022; 20:245. [DOI: 10.1186/s12915-022-01427-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Accepted: 09/29/2022] [Indexed: 11/09/2022] Open
Abstract
Background
The Nile rat (Arvicanthis niloticus) is an important animal model because of its robust diurnal rhythm, a cone-rich retina, and a propensity to develop diet-induced diabetes without chemical or genetic modifications. A closer similarity to humans in these aspects, compared to the widely used Mus musculus and Rattus norvegicus models, holds the promise of better translation of research findings to the clinic.
Results
We report a 2.5 Gb, chromosome-level reference genome assembly with fully resolved parental haplotypes, generated with the Vertebrate Genomes Project (VGP). The assembly is highly contiguous, with contig N50 of 11.1 Mb, scaffold N50 of 83 Mb, and 95.2% of the sequence assigned to chromosomes. We used a novel workflow to identify 3613 segmental duplications and quantify duplicated genes. Comparative analyses revealed unique genomic features of the Nile rat, including some that affect genes associated with type 2 diabetes and metabolic dysfunctions. We discuss 14 genes that are heterozygous in the Nile rat or highly diverged from the house mouse.
Conclusions
Our findings reflect the exceptional level of genomic resolution present in this assembly, which will greatly expand the potential of the Nile rat as a model organism.
|
26
|
Mallik S, Tawfik DS, Levy ED. How gene duplication diversifies the landscape of protein oligomeric state and function. Curr Opin Genet Dev 2022; 76:101966. [PMID: 36007298 PMCID: PMC9548406 DOI: 10.1016/j.gde.2022.101966] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/01/2022] [Revised: 07/01/2022] [Accepted: 07/08/2022] [Indexed: 11/29/2022]
Abstract
Oligomeric proteins are central to cellular life and the duplication and divergence of their genes is a key driver of evolutionary innovations. The duplication of a gene coding for an oligomeric protein has numerous possible outcomes, which motivates questions on the relationship between structural and functional divergence. How do protein oligomeric states diversify after gene duplication? In the simple case of duplication of a homo-oligomeric protein gene, what properties can influence the fate of descendant paralogs toward forming independent homomers or maintaining their interaction as a complex? Furthermore, how are functional innovations associated with the diversification of oligomeric states? Here, we review recent literature and present specific examples in an attempt to illustrate and answer these questions.
Affiliation(s)
- Saurav Mallik
- Department of Chemical and Structural Biology, The Weizmann Institute of Science, Rehovot 7610001, Israel; Department of Biomolecular Sciences, The Weizmann Institute of Science, Rehovot 7610001, Israel.
| | - Dan S Tawfik
- Department of Biomolecular Sciences, The Weizmann Institute of Science, Rehovot 7610001, Israel
| | - Emmanuel D Levy
- Department of Chemical and Structural Biology, The Weizmann Institute of Science, Rehovot 7610001, Israel.
| |
|
27
|
Porubsky D, Höps W, Ashraf H, Hsieh P, Rodriguez-Martin B, Yilmaz F, Ebler J, Hallast P, Maria Maggiolini FA, Harvey WT, Henning B, Audano PA, Gordon DS, Ebert P, Hasenfeld P, Benito E, Zhu Q, Lee C, Antonacci F, Steinrücken M, Beck CR, Sanders AD, Marschall T, Eichler EE, Korbel JO. Recurrent inversion polymorphisms in humans associate with genetic instability and genomic disorders. Cell 2022; 185:1986-2005.e26. [PMID: 35525246 PMCID: PMC9563103 DOI: 10.1016/j.cell.2022.04.017] [Citation(s) in RCA: 84] [Impact Index Per Article: 28.0] [Received: 11/11/2021] [Revised: 02/14/2022] [Accepted: 04/08/2022] [Indexed: 12/13/2022]
Abstract
Unlike copy number variants (CNVs), inversions remain an underexplored genetic variation class. By integrating multiple genomic technologies, we discover 729 inversions in 41 human genomes. Approximately 85% of inversions <2 kbp form by twin-priming during L1 retrotransposition; 80% of the larger inversions are balanced and affect twice as many nucleotides as CNVs. Balanced inversions show an excess of common variants, and 72% are flanked by segmental duplications (SDs) or retrotransposons. Since flanking repeats promote non-allelic homologous recombination, we developed complementary approaches to identify recurrent inversion formation. We describe 40 recurrent inversions encompassing 0.6% of the genome, showing inversion rates up to 2.7 × 10⁻⁴ per locus per generation. Recurrent inversions exhibit a sex-chromosomal bias and co-localize with genomic disorder critical regions. We propose that inversion recurrence results in an elevated number of heterozygous carriers and structural SD diversity, which increases mutability in the population and predisposes specific haplotypes to disease-causing CNVs.
Affiliation(s)
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Wolfram Höps
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstr. 1, 69117 Heidelberg, Germany
| | - Hufsah Ashraf
- Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Moorenstraße 5, 40225 Düsseldorf, Germany
| | - PingHsun Hsieh
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Bernardo Rodriguez-Martin
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstr. 1, 69117 Heidelberg, Germany
| | - Feyza Yilmaz
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
| | - Jana Ebler
- Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Moorenstraße 5, 40225 Düsseldorf, Germany
| | - Pille Hallast
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
| | - Flavia Angela Maria Maggiolini
- Department of Biology, University of Bari "Aldo Moro", 70125 Bari, Italy; Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria-Centro di Ricerca Viticoltura ed Enologia (CREA-VE), Via Casamassima 148, 70010 Turi, Italy
| | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Barbara Henning
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Peter A Audano
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
| | - David S Gordon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Peter Ebert
- Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Moorenstraße 5, 40225 Düsseldorf, Germany
| | - Patrick Hasenfeld
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstr. 1, 69117 Heidelberg, Germany
| | - Eva Benito
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstr. 1, 69117 Heidelberg, Germany
| | - Qihui Zhu
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
| | - Charles Lee
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
| | | | - Matthias Steinrücken
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
| | - Christine R Beck
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA; The University of Connecticut Health Center, 400 Farmington Rd., Farmington, CT 06032, USA
| | - Ashley D Sanders
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany; Berlin Institute of Health (BIH), Berlin, Germany; Charité-Universitätsmedizin, Berlin, Berlin, Germany
| | - Tobias Marschall
- Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Moorenstraße 5, 40225 Düsseldorf, Germany.
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
| | - Jan O Korbel
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstr. 1, 69117 Heidelberg, Germany; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
| |
|
28
|
Shieh YK, Peng DY, Chen YH, Wu TW, Lu CL. An Integer Linear Programming Approach for Scaffolding Based on Exemplar Breakpoint Distance. J Comput Biol 2022; 29:961-973. [PMID: 35638936 DOI: 10.1089/cmb.2021.0399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Reference-based scaffolding is an important process used in genomic sequencing to order and orient the contigs in a draft genome based on a reference genome. In this study, we utilize the concept of genome rearrangement to formulate this process as an exemplar breakpoint distance (EBD)-based scaffolding problem, whose aim is to scaffold the contigs of two given draft genomes, both containing duplicate genes (or sequence markers) and each acting as a reference for the other, such that the EBD between the scaffolded genomes is minimized. The EBD-based scaffolding problem is difficult to solve because it is non-deterministic polynomial-time (NP)-hard. In this work, we design an integer linear programming (ILP)-based algorithm to exactly solve the EBD-based scaffolding problem. Our experimental results on both simulated and biological data sets show that our ILP-based scaffolding algorithm can accurately and efficiently use a reference genome to scaffold the contigs of a draft genome. Moreover, our ILP-based scaffolding algorithm is more accurate when it considers duplicate genes than when it does not, suggesting that duplicate genes and their exemplars are helpful for the application of genome rearrangement in the study of the reference-based scaffolding problem. When compared with RaGOO, a current state-of-the-art alignment-based scaffolder, our ILP-based scaffolding algorithm is still more accurate on the biological data sets.
Affiliation(s)
- Yi-Kung Shieh
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
| | - Dao-Yuan Peng
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
| | - Yu-Han Chen
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
| | - Tsung-Wei Wu
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
| | - Chin Lung Lu
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
| |
|
29
|
High-quality chromosome-scale de novo assembly of the Paspalum notatum 'Flugge' genome. BMC Genomics 2022; 23:293. [PMID: 35410159 PMCID: PMC9004155 DOI: 10.1186/s12864-022-08489-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2021] [Accepted: 03/16/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Paspalum notatum 'Flugge' is a diploid (2n = 20), multi-purpose subtropical herb that is native to South America and has high ecological significance. It is currently widely planted in tropical and subtropical regions. Although the gene pool of P. notatum 'Flugge' has been explored to a large extent in the past decade, no genomic information has been reported for any species in Paspalum. In this study, the complete genome of P. notatum was sequenced, assembled de novo, and annotated. RESULTS PacBio third-generation HiFi sequencing and assembly revealed that the genome size of P. notatum 'Flugge' is 541 M. The assembly ranks highly among the gramineous genomes published so far, with a contig N50 of 52 Mbp, a scaffold N50 of 49 Mbp, and a BUSCO score of 98.1%, accounting for 98.5% of the estimated genome. Genome annotation revealed 36,511 high-confidence gene models, thus providing an important resource for future molecular breeding and evolutionary research. A comparison of the genome annotation of P. notatum 'Flugge' with that of related species revealed that it is closely related to Zea mays but less closely related to Brachypodium distachyon, Setaria viridis, Oryza sativa, Puccinellia tenuiflora, and Echinochloa crusgalli. An analysis of the expansion and contraction of gene families suggested that P. notatum 'Flugge' contains gene families associated with environmental resistance, increased reproductive ability, and molecular evolution, which may explain its excellent agronomic traits. CONCLUSION This study is the first to report a high-quality chromosome-scale genome of P. notatum 'Flugge', assembled using PacBio third-generation HiFi sequencing reads. The study provides an excellent genetic resource for gramineous crops and valuable perspectives on the evolution of gramineous plants.
|
30
|
Išerić H, Alkan C, Hach F, Numanagić I. Fast characterization of segmental duplication structure in multiple genome assemblies. Algorithms Mol Biol 2022; 17:4. [PMID: 35303886 PMCID: PMC8932185 DOI: 10.1186/s13015-022-00210-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 11/16/2021] [Accepted: 02/08/2022] [Indexed: 11/29/2022] Open
Abstract
MOTIVATION The increasing availability of high-quality genome assemblies raised interest in the characterization of genomic architecture. Major architectural elements, such as common repeats and segmental duplications (SDs), increase genome plasticity that stimulates further evolution by changing the genomic structure and inventing new genes. Optimal computation of SDs within a genome requires quadratic-time local alignment algorithms that are impractical due to the size of most genomes. Additionally, to perform evolutionary analysis, one needs to characterize SDs in multiple genomes and find relations between those SDs and unique (non-duplicated) segments in other genomes. A naïve approach consisting of multiple sequence alignment would make the optimal solution to this problem even more impractical. Thus there is a need for fast and accurate algorithms to characterize SD structure in multiple genome assemblies to better understand the evolutionary forces that shaped the genomes of today. RESULTS Here we introduce a new approach, BISER, to quickly detect SDs in multiple genomes and identify elementary SDs and core duplicons that drive the formation of such SDs. BISER improves earlier tools by (i) scaling the detection of SDs with low homology to multiple genomes while introducing further 7-33× speed-ups over the existing tools, and by (ii) characterizing elementary SDs and detecting core duplicons to help trace the evolutionary history of duplications to as far as 300 million years. AVAILABILITY AND IMPLEMENTATION BISER is implemented in Seq programming language and is publicly available at https://github.com/0xTCG/biser.
Affiliation(s)
- Hamza Išerić
- Department of Computer Science, University of Victoria, Victoria, BC, V8P 5C2, Canada
| | - Can Alkan
- Department of Computer Engineering, Bilkent University, 06800, Ankara, Turkey
| | - Faraz Hach
- Vancouver Prostate Centre, Vancouver, BC, V6H 3Z6, Canada
- Department of Urologic Sciences, University of British Columbia, Vancouver, BC, V5Z 1M9, Canada
| | - Ibrahim Numanagić
- Department of Computer Science, University of Victoria, Victoria, BC, V8P 5C2, Canada.
| |
|
31
|
Sønderby IE, Ching CRK, Thomopoulos SI, van der Meer D, Sun D, Villalon‐Reina JE, Agartz I, Amunts K, Arango C, Armstrong NJ, Ayesa‐Arriola R, Bakker G, Bassett AS, Boomsma DI, Bülow R, Butcher NJ, Calhoun VD, Caspers S, Chow EWC, Cichon S, Ciufolini S, Craig MC, Crespo‐Facorro B, Cunningham AC, Dale AM, Dazzan P, de Zubicaray GI, Djurovic S, Doherty JL, Donohoe G, Draganski B, Durdle CA, Ehrlich S, Emanuel BS, Espeseth T, Fisher SE, Ge T, Glahn DC, Grabe HJ, Gur RE, Gutman BA, Haavik J, Håberg AK, Hansen LA, Hashimoto R, Hibar DP, Holmes AJ, Hottenga J, Hulshoff Pol HE, Jalbrzikowski M, Knowles EEM, Kushan L, Linden DEJ, Liu J, Lundervold AJ, Martin‐Brevet S, Martínez K, Mather KA, Mathias SR, McDonald‐McGinn DM, McRae AF, Medland SE, Moberget T, Modenato C, Monereo Sánchez J, Moreau CA, Mühleisen TW, Paus T, Pausova Z, Prieto C, Ragothaman A, Reinbold CS, Reis Marques T, Repetto GM, Reymond A, Roalf DR, Rodriguez‐Herreros B, Rucker JJ, Sachdev PS, Schmitt JE, Schofield PR, Silva AI, Stefansson H, Stein DJ, Tamnes CK, Tordesillas‐Gutiérrez D, Ulfarsson MO, Vajdi A, van 't Ent D, van den Bree MBM, Vassos E, Vázquez‐Bourgon J, Vila‐Rodriguez F, Walters GB, Wen W, Westlye LT, Wittfeld K, Zackai EH, Stefánsson K, Jacquemont S, Thompson PM, Bearden CE, Andreassen OA. Effects of copy number variations on brain structure and risk for psychiatric illness: Large-scale studies from the ENIGMA working groups on CNVs. Hum Brain Mapp 2022; 43:300-328. [PMID: 33615640 PMCID: PMC8675420 DOI: 10.1002/hbm.25354] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 08/31/2020] [Revised: 01/07/2021] [Accepted: 01/13/2021] [Indexed: 01/21/2023] Open
Abstract
The Enhancing NeuroImaging Genetics through Meta-Analysis copy number variant (ENIGMA-CNV) and 22q11.2 Deletion Syndrome Working Groups (22q-ENIGMA WGs) were created to gain insight into the involvement of genetic factors in human brain development and related cognitive, psychiatric and behavioral manifestations. To that end, the ENIGMA-CNV WG has collated CNV and magnetic resonance imaging (MRI) data from ~49,000 individuals across 38 global research sites, yielding one of the largest studies to date on the effects of CNVs on brain structures in the general population. The 22q-ENIGMA WG includes 12 international research centers that assessed over 533 individuals with a confirmed 22q11.2 deletion syndrome, 40 with 22q11.2 duplications, and 333 typically developing controls, creating the largest-ever 22q11.2 CNV neuroimaging data set. In this review, we outline the ENIGMA infrastructure and procedures for multi-site analysis of CNVs and MRI data. So far, ENIGMA has identified effects of the 22q11.2, 16p11.2 distal, 15q11.2, and 1q21.1 distal CNVs on subcortical and cortical brain structures. Each CNV is associated with differences in cognitive, neurodevelopmental and neuropsychiatric traits, with characteristic patterns of brain structural abnormalities. Evidence of gene-dosage effects on distinct brain regions also emerged, providing further insight into genotype-phenotype relationships. Taken together, these results offer a more comprehensive picture of molecular mechanisms involved in typical and atypical brain development. This "genotype-first" approach also contributes to our understanding of the etiopathogenesis of brain disorders. Finally, we outline future directions to better understand effects of CNVs on brain structure and behavior.
Affiliation(s)
- Ida E. Sønderby
- Department of Medical Genetics, Oslo University Hospital, Oslo, Norway
- Norwegian Centre for Mental Disorders Research (NORMENT), Division of Mental Health and Addiction, Oslo University Hospital and University of Oslo, Oslo, Norway
- KG Jebsen Centre for Neurodevelopmental Disorders, University of Oslo, Oslo, Norway
| | - Christopher R. K. Ching
- Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, California, USA
| | - Sophia I. Thomopoulos
- Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, California, USA
| | - Dennis van der Meer
- Norwegian Centre for Mental Disorders Research (NORMENT), Division of Mental Health and Addiction, Oslo University Hospital and University of Oslo, Oslo, Norway
- School of Mental Health and Neuroscience, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
| | - Daqiang Sun
- Semel Institute for Neuroscience and Human Behavior, Departments of Psychiatry and Biobehavioral Sciences and Psychology, University of California Los Angeles, Los Angeles, California, USA
- Department of Mental Health, Veterans Affairs Greater Los Angeles Healthcare System, Los Angeles, California, USA
| | - Julio E. Villalon‐Reina
- Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, California, USA
| | - Ingrid Agartz
- NORMENT, Institute of Clinical Psychiatry, University of Oslo, Oslo, Norway
- Department of Psychiatric Research, Diakonhjemmet Hospital, Oslo, Norway
- Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
| | - Katrin Amunts
- Institute of Neuroscience and Medicine (INM‐1), Research Centre Jülich, Jülich, Germany
- Cecile and Oskar Vogt Institute for Brain Research, Medical Faculty, University Hospital Düsseldorf, Heinrich‐Heine‐University Düsseldorf, Düsseldorf, Germany
| | - Celso Arango
- Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañon, IsSGM, Universidad Complutense, School of Medicine, Madrid, Spain
- Centro Investigación Biomédica en Red de Salud Mental (CIBERSAM), Madrid, Spain
| | | | - Rosa Ayesa‐Arriola
- Centro Investigación Biomédica en Red de Salud Mental (CIBERSAM), Madrid, Spain
- Department of Psychiatry, Marqués de Valdecilla University Hospital, Valdecilla Biomedical Research Institute (IDIVAL), Santander, Spain
| | - Geor Bakker
- Department of Psychiatry and Neuropsychology, Maastricht University, Maastricht, The Netherlands
- Department of Radiology and Nuclear Medicine, VU University Medical Center, Amsterdam, The Netherlands
| | - Anne S. Bassett
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Dalglish Family 22q Clinic for Adults with 22q11.2 Deletion Syndrome, Toronto General Hospital, University Health Network, Toronto, Ontario, Canada
- Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
| | - Dorret I. Boomsma
- Department of Biological Psychology, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Amsterdam Public Health (APH) Research Institute, Amsterdam UMC, Amsterdam, The Netherlands
| | - Robin Bülow
- Institute of Diagnostic Radiology and Neuroradiology, University Medicine Greifswald, Greifswald, Germany
| | - Nancy J. Butcher
- Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
- Child Health Evaluative Sciences, The Hospital for Sick Children Research Institute, Toronto, Ontario, Canada
| | - Vince D. Calhoun
- Tri‐institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State, Georgia Tech, Emory, Atlanta, Georgia, USA
| | - Svenja Caspers
- Institute of Neuroscience and Medicine (INM‐1), Research Centre Jülich, Jülich, Germany
- Institute for Anatomy I, Medical Faculty & University Hospital Düsseldorf, University of Düsseldorf, Düsseldorf, Germany
| | - Eva W. C. Chow
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
| | - Sven Cichon
- Institute of Neuroscience and Medicine (INM‐1), Research Centre Jülich, Jülich, Germany
- Institute of Medical Genetics and Pathology, University Hospital Basel, Basel, Switzerland
- Department of Biomedicine, University of Basel, Basel, Switzerland
| | - Simone Ciufolini
- Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
| | - Michael C. Craig
- Department of Forensic and Neurodevelopmental Sciences, The Sackler Institute for Translational Neurodevelopmental Sciences, Institute of Psychiatry, Psychology and Neuroscience, King's College, London, United Kingdom
| | | | - Adam C. Cunningham
- MRC Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, United Kingdom
| | - Anders M. Dale
- Center for Multimodal Imaging and Genetics, University of California San Diego, La Jolla, California, USA
- Department Radiology, University of California San Diego, La Jolla, California, USA
| | - Paola Dazzan
- Department of Psychological Medicine, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
| | - Greig I. de Zubicaray
- Faculty of Health, Queensland University of Technology (QUT), Brisbane, Queensland, Australia
| | - Srdjan Djurovic
- Department of Medical Genetics, Oslo University Hospital, Oslo, Norway
- NORMENT, Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Joanne L. Doherty
- MRC Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, United Kingdom
- Cardiff University Brain Research Imaging Centre (CUBRIC), Cardiff, United Kingdom
| | - Gary Donohoe
- Center for Neuroimaging, Genetics and Genomics, School of Psychology, NUI Galway, Galway, Ireland
| | - Bogdan Draganski
- LREN, Centre for Research in Neuroscience, Department of Neuroscience, University Hospital Lausanne and University Lausanne, Lausanne, Switzerland
- Neurology Department, Max‐Planck Institute for Human Brain and Cognitive Sciences, Leipzig, Germany
| | - Courtney A. Durdle
- MIND Institute and Department of Psychiatry and Behavioral Sciences, University of California Davis, Davis, California, USA
| | - Stefan Ehrlich
- Division of Psychological and Social Medicine and Developmental Neurosciences, Faculty of Medicine, TU Dresden, Dresden, Germany
| | - Beverly S. Emanuel
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Thomas Espeseth
- Department of Psychology, University of Oslo, Oslo, Norway
- Department of Psychology, Bjørknes College, Oslo, Norway
| | - Simon E. Fisher
- Language and Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| | - Tian Ge
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
| | - David C. Glahn
- Tommy Fuss Center for Neuropsychiatric Disease Research, Boston Children's Hospital, Boston, Massachusetts, USA
- Department of Psychiatry, Harvard Medical School, Boston, Massachusetts, USA
| | - Hans J. Grabe
- German Center for Neurodegenerative Diseases (DZNE), Site Rostock/Greifswald, Greifswald, Germany
- Department of Psychiatry and Psychotherapy, University Medicine Greifswald, Greifswald, Germany
| | - Raquel E. Gur
- Department of Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Youth Suicide Prevention, Intervention and Research Center, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Boris A. Gutman
- Medical Imaging Research Center, Department of Biomedical Engineering, Illinois Institute of Technology, Chicago, Illinois, USA
| | - Jan Haavik
- Department of Biomedicine, University of Bergen, Bergen, Norway
- Division of Psychiatry, Haukeland University Hospital, Bergen, Norway
| | - Asta K. Håberg
- Department of Neuromedicine and Movement Science, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- Department of Radiology and Nuclear Medicine, St. Olavs Hospital, Trondheim, Norway
| | - Laura A. Hansen
- Department of Psychiatry and Biobehavioral Sciences, University of California Los Angeles, Los Angeles, California, USA
| | - Ryota Hashimoto
- Department of Pathology of Mental Diseases, National Institute of Mental Health, National Center of Neurology and Psychiatry, Tokyo, Japan
- Department of Psychiatry, Osaka University Graduate School of Medicine, Osaka, Japan
| | - Derrek P. Hibar
- Personalized Healthcare Analytics, Genentech, Inc., South San Francisco, California, USA
| | - Avram J. Holmes
- Department of Psychology, Yale University, New Haven, Connecticut, USA
- Department of Psychiatry, Yale University, New Haven, Connecticut, USA
| | - Jouke‐Jan Hottenga
- Department of Biological Psychology, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Hilleke E. Hulshoff Pol
- Department of Psychiatry, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | | | - Emma E. M. Knowles
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Psychiatry, Boston Children's Hospital, Boston, Massachusetts, USA
| | - Leila Kushan
- Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, California, USA
| | - David E. J. Linden
- School for Mental Health and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Neuroscience and Mental Health Research Institute, Cardiff University, Cardiff, United Kingdom
| | - Jingyu Liu
- Tri‐institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State, Georgia Tech, Emory, Atlanta, Georgia, USA
- Computer Science, Georgia State University, Atlanta, Georgia, USA
| | - Astri J. Lundervold
- Department of Biological and Medical Psychology, University of Bergen, Bergen, Norway
| | - Sandra Martin‐Brevet
- LREN, Centre for Research in Neuroscience, Department of Neuroscience, University Hospital Lausanne and University Lausanne, Lausanne, Switzerland
| | - Kenia Martínez
- Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañon, IsSGM, Universidad Complutense, School of Medicine, Madrid, Spain
- Centro Investigación Biomédica en Red de Salud Mental (CIBERSAM), Madrid, Spain
- Facultad de Psicología, Universidad Autónoma de Madrid, Madrid, Spain
| | - Karen A. Mather
- Centre for Healthy Brain Ageing (CHeBA), School of Psychiatry, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
- Neuroscience Research Australia, Sydney, New South Wales, Australia
| | - Samuel R. Mathias
- Department of Psychiatry, Harvard Medical School, Boston, Massachusetts, USA
- Department of Psychiatry, Boston Children's Hospital, Boston, Massachusetts, USA
| | - Donna M. McDonald‐McGinn
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Division of Human Genetics and 22q and You Center, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Allan F. McRae
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
| | - Sarah E. Medland
- Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Torgeir Moberget
- Department of Psychology, Faculty of Social Sciences, University of Oslo, Oslo, Norway
| | - Claudia Modenato
- LREN, Centre for Research in Neuroscience, Department of Neuroscience, University Hospital Lausanne and University Lausanne, Lausanne, Switzerland
- University of Lausanne, Lausanne, Switzerland
| | - Jennifer Monereo Sánchez
- School for Mental Health and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
- Department of Radiology and Nuclear Medicine, Maastricht University Medical Center, Maastricht, The Netherlands
| | - Clara A. Moreau
- Sainte Justine Hospital Research Center, University of Montreal, Montreal, QC, Canada
| | - Thomas W. Mühleisen
- Institute of Neuroscience and Medicine (INM‐1), Research Centre Jülich, Jülich, Germany
- Cecile and Oskar Vogt Institute for Brain Research, Medical Faculty, University Hospital Düsseldorf, Heinrich‐Heine‐University Düsseldorf, Düsseldorf, Germany
- Department of Biomedicine, University of Basel, Basel, Switzerland
| | - Tomas Paus
- Bloorview Research Institute, Holland Bloorview Kids Rehabilitation Hospital, Toronto, Ontario, Canada
- Departments of Psychology and Psychiatry, University of Toronto, Toronto, Ontario, Canada
| | - Zdenka Pausova
- Translational Medicine, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Carlos Prieto
- Bioinformatics Service, Nucleus, University of Salamanca, Salamanca, Spain
| | | | - Céline S. Reinbold
- Department of Biomedicine, University of Basel, Basel, Switzerland
- Centre for Lifespan Changes in Brain and Cognition, Department of Psychology, University of Oslo, Oslo, Norway
| | - Tiago Reis Marques
- Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
- Psychiatric Imaging Group, MRC London Institute of Medical Sciences (LMS), Hammersmith Hospital, Imperial College London, London, United Kingdom
| | - Gabriela M. Repetto
- Center for Genetics and Genomics, Facultad de Medicina, Clinica Alemana Universidad del Desarrollo, Santiago, Chile
| | - Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
| | - David R. Roalf
- Department of Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | | | - James J. Rucker
- Department of Psychological Medicine, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
| | - Perminder S. Sachdev
- Centre for Healthy Brain Ageing (CHeBA), School of Psychiatry, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
- Neuropsychiatric Institute, The Prince of Wales Hospital, Sydney, New South Wales, Australia
| | - James E. Schmitt
- Department of Radiology and Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Peter R. Schofield
- Neuroscience Research Australia, Sydney, New South Wales, Australia
- School of Medical Sciences, UNSW Sydney, Sydney, New South Wales, Australia
| | - Ana I. Silva
- Neuroscience and Mental Health Research Institute, Cardiff University, Cardiff, United Kingdom
- School for Mental Health and Neuroscience, Department of Psychiatry and Neuropsychology, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
| | | | - Dan J. Stein
- SA MRC Unit on Risk & Resilience in Mental Disorders, Department of Psychiatry and Neuroscience Institute, University of Cape Town, Cape Town, South Africa
| | - Christian K. Tamnes
- Norwegian Centre for Mental Disorders Research (NORMENT), Division of Mental Health and Addiction, Oslo University Hospital and University of Oslo, Oslo, Norway
- Department of Psychiatric Research, Diakonhjemmet Hospital, Oslo, Norway
- PROMENTA Research Center, Department of Psychology, University of Oslo, Oslo, Norway
| | - Diana Tordesillas‐Gutiérrez
- Centro Investigación Biomédica en Red de Salud Mental (CIBERSAM), Madrid, Spain
- Neuroimaging Unit, Technological Facilities, Valdecilla Biomedical Research Institute (IDIVAL), Santander, Spain
| | - Magnus O. Ulfarsson
- Population Genomics, deCODE genetics/Amgen, Reykjavik, Iceland
- Faculty of Electrical and Computer Engineering, University of Iceland, Reykjavik, Iceland
| | - Ariana Vajdi
- Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, California, USA
| | - Dennis van 't Ent
- Department of Biological Psychology, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Marianne B. M. van den Bree
- MRC Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, United Kingdom
| | - Evangelos Vassos
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, United Kingdom
| | - Javier Vázquez‐Bourgon
- Centro Investigación Biomédica en Red de Salud Mental (CIBERSAM), Madrid, Spain
- Department of Psychiatry, Marqués de Valdecilla University Hospital, Valdecilla Biomedical Research Institute (IDIVAL), Santander, Spain
- School of Medicine, University of Cantabria, Santander, Spain
| | - Fidel Vila‐Rodriguez
- Department of Psychiatry, The University of British Columbia, Vancouver, British Columbia, Canada
| | - G. Bragi Walters
- Population Genomics, deCODE genetics/Amgen, Reykjavik, Iceland
- Faculty of Medicine, University of Iceland, Reykjavik, Iceland
| | - Wei Wen
- Centre for Healthy Brain Ageing (CHeBA), School of Psychiatry, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
| | - Lars T. Westlye
- KG Jebsen Centre for Neurodevelopmental Disorders, University of Oslo, Oslo, Norway
- Department of Psychology, University of Oslo, Oslo, Norway
- NORMENT, Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
| | - Katharina Wittfeld
- German Center for Neurodegenerative Diseases (DZNE), Site Rostock/Greifswald, Greifswald, Germany
- Department of Psychiatry and Psychotherapy, University Medicine Greifswald, Greifswald, Germany
| | - Elaine H. Zackai
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Kári Stefánsson
- Population Genomics, deCODE genetics/Amgen, Reykjavik, Iceland
- Faculty of Medicine, University of Iceland, Reykjavik, Iceland
| | - Sebastien Jacquemont
- Sainte Justine Hospital Research Center, University of Montreal, Montreal, QC, Canada
- Department of Pediatrics, University of Montreal, Montreal, QC, Canada
| | - Paul M. Thompson
- Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, California, USA
| | - Carrie E. Bearden
- Semel Institute for Neuroscience and Human Behavior, Departments of Psychiatry and Biobehavioral Sciences and Psychology, University of California Los Angeles, Los Angeles, California, USA
- Center for Neurobehavioral Genetics, University of California Los Angeles, Los Angeles, California, USA
| | - Ole A. Andreassen
- Norwegian Centre for Mental Disorders Research (NORMENT), Division of Mental Health and Addiction, Oslo University Hospital and University of Oslo, Oslo, Norway
| |
|
32
|
Quantitative assessment reveals the dominance of duplicated sequences in germline-derived extrachromosomal circular DNA. Proc Natl Acad Sci U S A 2021; 118:2102842118. [PMID: 34789574 PMCID: PMC8617514 DOI: 10.1073/pnas.2102842118] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Accepted: 10/04/2021] [Indexed: 01/08/2023] Open
Abstract
Extrachromosomal circular DNA (eccDNA) plays a role in human diseases such as cancer, but little is known about the impact of eccDNA in healthy human biology. Since eccDNA is a tiny fraction of nuclear DNA, artificial amplification has been employed to increase eccDNA amounts, resulting in the loss of native compositions. We developed an approach to enrich eccDNA populations at the native state (naïve small circular DNA, nscDNA) and investigated their origins in the human genome. We found that, in human sperm, the vast majority of nscDNA came from high-copy genomic regions, including the most variable regions between individuals. Because eccDNA can be incorporated back into chromosomes, eccDNA may promote human genetic variation.
Extrachromosomal circular DNA (eccDNA) originates from linear chromosomal DNA in various human tissues under physiological and disease conditions. The genomic origins of eccDNA have largely been investigated using in vitro–amplified DNA. However, in vitro amplification obscures quantitative information by skewing the total population stoichiometry. In addition, the analyses have focused on eccDNA stemming from single-copy genomic regions, leaving eccDNA from multicopy regions unexamined. To address these issues, we isolated eccDNA without in vitro amplification (naïve small circular DNA, nscDNA) and assessed the populations quantitatively by integrated genomic, molecular, and cytogenetic approaches. nscDNA of up to tens of kilobases were successfully enriched by our approach and were predominantly derived from multicopy genomic regions including segmental duplications (SDs). SDs, which account for 5% of the human genome and are hotspots for copy number variations, were significantly overrepresented in sperm nscDNA, with three times more sequencing reads derived from SDs than from the entire single-copy regions. SDs were also overrepresented in mouse sperm nscDNA, which we estimated to comprise 0.2% of nuclear DNA. Considering that eccDNA can be integrated into chromosomes, germline-derived nscDNA may be a mediator of genome diversity.
|
33
|
Comprehensive characterization of copy number variation (CNV) called from array, long- and short-read data. BMC Genomics 2021; 22:826. [PMID: 34789167 PMCID: PMC8596897 DOI: 10.1186/s12864-021-08082-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/09/2021] [Accepted: 10/13/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: SNP arrays and short- and long-read genome sequencing are genome-wide high-throughput technologies that may be used to assay copy number variants (CNVs) in a personal genome. Each of these technologies comes with its own limitations and biases, many of which are well known, but not all of them are thoroughly quantified.
RESULTS: We assembled an ensemble of public datasets of published CNV calls and raw data for the well-studied Genome in a Bottle individual NA12878. This assembly represents a variety of methods and pipelines used for CNV calling from array, short- and long-read technologies. We then performed cross-technology comparisons regarding their ability to call CNVs. Unlike other studies, we refrained from using a gold standard. Instead, we attempted to validate the CNV calls by the raw data of each technology.
CONCLUSIONS: Our study confirms that long-read platforms enable recalling CNVs in genomic regions inaccessible to arrays or short reads. We also found that the reproducibility of a CNV by different pipelines within each technology is strongly linked to other CNV evidence measures. Importantly, the three technologies show distinct public database frequency profiles, which differ depending on what technology the database was built on.
|
34
|
A rare familial rearrangement of chromosomes 9 and 15 associated with intellectual disability: a clinical and molecular study. Mol Cytogenet 2021; 14:47. [PMID: 34607577 PMCID: PMC8489072 DOI: 10.1186/s13039-021-00565-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2021] [Accepted: 03/09/2021] [Indexed: 11/22/2022] Open
Abstract
Background: There are many reports on rearrangements occurring separately in the regions of chromosomes 9p and 15q affected in the case under study. 15q duplication syndrome is caused by the presence of at least one extra maternally derived copy of the Prader–Willi/Angelman critical region. Trisomy 9p is the fourth most frequent chromosome anomaly with a clinically recognizable syndrome, often accompanied by intellectual disability. Here we report a new case of a patient with a maternally derived, unique complex sSMC resulting in partial trisomy of both chromosomes 9 and 15 associated with intellectual disability.
Case presentation: We characterise a supernumerary derivative chromosome 15: 47,XY,+der(15)t(9;15)(p21.2;q13.2), likely resulting from 3:1 malsegregation during maternal gametogenesis. Chromosomal analysis showed that the phenotypically normal mother is a carrier of a balanced translocation t(9;15)(p21.1;q13.2). Her 7-year-old son showed signs of intellectual disability and a number of physical abnormalities including bilateral cryptorchidism and congenital megaureter. The child’s magnetic resonance imaging showed changes in brain volume and in structural and functional connectivity, revealing phenotypic changes caused by the presence of the extra chromosome material, whereas the mother’s brain MRI was normal. Sequence analyses of the microdissected der(15) chromosome detected two breakpoint regions: HSA9:25,928,021-26,157,441 (9p21.2 band) and HSA15:30,552,104-30,765,905 (15q13.2 band). The breakpoint region on chromosome HSA9 is poor in genetic features, with several areas of high homology with the breakpoint region on chromosome 15. The breakpoint region on HSA15 is located in the area of a large segmental duplication.
Conclusions: We discuss the case of these phenotypic and brain MRI features in light of reported signatures for 9p partial trisomy and 15 duplication syndromes and analyze how the genomic characteristics of the found breakpoint regions have contributed to the origin of the derivative chromosome. We recommend MRI for all patients with a developmental delay, especially in cases with identified rearrangements, to accumulate more information on brain phenotypes related to chromosomal syndromes.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13039-021-00565-y.
|
35
|
Riba A, Fumagalli MR, Caselle M, Osella M. A Model-Driven Quantitative Analysis of Retrotransposon Distributions in the Human Genome. Genome Biol Evol 2021; 12:2045-2059. [PMID: 32986810 PMCID: PMC7750997 DOI: 10.1093/gbe/evaa201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 09/19/2020] [Indexed: 12/21/2022] Open
Abstract
Retrotransposons, DNA sequences capable of creating copies of themselves, compose about half of the human genome and played a central role in the evolution of mammals. Their current position in the host genome is the result of the retrotranscription process and of the subsequent evolution of the host genome. We apply a model from statistical physics to show that the genomic distribution of the two most populated classes of retrotransposons in humans deviates from random placement, and that this deviation increases with time. The time dependence suggests a major role of the host genome dynamics in shaping the current retrotransposon distributions. Focusing on a neutral scenario, we show that a simple model based on random placement followed by genome expansion and sequence duplications can reproduce the empirical retrotransposon distributions, even though more complex and possibly selective mechanisms may have contributed. Besides the inherent interest in understanding the origin of current retrotransposon distributions, this work sets a general analytical framework to analyze quantitatively the effects of genome evolutionary dynamics on the distribution of genomic elements.
Affiliation(s)
| | - Maria Rita Fumagalli
- Institute of Biophysics - CNR, National Research Council, Genova, Italy; Department of Environmental Science and Policy, Center for Complexity and Biosystems, University of Milan, Milano, Italy
| | - Michele Caselle
- Department of Physics and INFN, University of Torino, Torino, Italy
| | - Matteo Osella
- Department of Physics and INFN, University of Torino, Torino, Italy
| |
|
36
|
Vervoort L, Dierckxsens N, Pereboom Z, Capozzi O, Rocchi M, Shaikh TH, Vermeesch JR. 22q11.2 Low Copy Repeats Expanded in the Human Lineage. Front Genet 2021; 12:706641. [PMID: 34335701 PMCID: PMC8320366 DOI: 10.3389/fgene.2021.706641] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/13/2021] [Accepted: 06/23/2021] [Indexed: 11/13/2022] Open
Abstract
Segmental duplications or low copy repeats (LCRs) constitute duplicated regions interspersed in the human genome, currently neglected in standard analyses due to their extreme complexity. Recent functional studies have indicated the potential of genes within LCRs in synaptogenesis, neuronal migration, and neocortical expansion in the human lineage. One of the regions with the highest proportion of duplicated sequence is the 22q11.2 locus, carrying eight LCRs (LCR22-A through LCR22-H), and rearrangements between them cause the 22q11.2 deletion syndrome. The LCR22-A block was recently reported to be hypervariable in the human population. It remains unknown whether this variability also exists in non-human primates, since research is strongly hampered by the presence of sequence gaps in the human and non-human primate reference genomes. To chart the LCR22 haplotypes and the associated inter- and intra-species variability, we de novo assembled the region in non-human primates by a combination of optical mapping techniques. A minimal and likely ancient haplotype is present in the chimpanzee, bonobo, and rhesus monkey without intra-species variation. In addition, the optical maps identified assembly errors and closed gaps in the orthologous chromosome 22 reference sequences. These findings indicate the LCR22 expansion to be unique to the human population, which might indicate involvement of the region in human evolution and adaptation. These maps will enable LCR22-specific functional studies and the investigation of potential associations with the phenotypic variability in the 22q11.2 deletion syndrome.
Affiliation(s)
- Zjef Pereboom
  - Centre for Research and Conservation, Royal Zoological Society of Antwerp, Antwerp, Belgium
  - Evolutionary Ecology Group, Department of Biology, Antwerp University, Antwerp, Belgium
- Tamim H. Shaikh
  - Section of Genetics and Metabolism, Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, United States
|
37
|
Requena F, Abdallah HH, García A, Nitschké P, Romana S, Malan V, Rausell A. CNVxplorer: a web tool to assist clinical interpretation of CNVs in rare disease patients. Nucleic Acids Res 2021; 49:W93-W103. [PMID: 34019647 PMCID: PMC8262689 DOI: 10.1093/nar/gkab347] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/07/2021] [Revised: 04/12/2021] [Accepted: 05/20/2021] [Indexed: 12/20/2022] Open
Abstract
Copy Number Variants (CNVs) are an important cause of rare diseases. Array-based Comparative Genomic Hybridization tests yield a ∼12% diagnostic rate, with ∼8% of patients presenting CNVs of unknown significance. CNV interpretation is particularly challenging in genomic regions that do not overlap previously reported structural variants or disease-associated genes. Recent studies showed that a more comprehensive evaluation of CNV features, leveraging both coding and non-coding impacts, can significantly improve diagnostic rates. However, currently available CNV interpretation tools are mostly gene-centric or provide only non-interactive annotations that are difficult to assess in clinical practice. Here, we present CNVxplorer, a web server suited for the functional assessment of CNVs in a clinical diagnostic setting. CNVxplorer mines a comprehensive set of clinical, genomic, and epigenomic features associated with CNVs. It provides sequence constraint metrics, impact on regulatory elements and topologically associating domains, as well as expression patterns. Analyses offered cover (a) agreement with patient phenotypes; (b) visualizations of associations among genes, regulatory elements and transcription factors; (c) enrichment on functional and pathway annotations and (d) co-occurrence of terms across PubMed publications related to the query CNVs. A flexible evaluation workflow allows dynamic re-interrogation in clinical sessions. CNVxplorer is publicly available at http://cnvxplorer.com.
Affiliation(s)
- Francisco Requena
  - Université de Paris, Institut Imagine, F-75006 Paris, France
  - Clinical Bioinformatics Laboratory, Imagine Institute, INSERM UMR1163, F-75015 Paris, France
- Hamza Hadj Abdallah
  - Université de Paris, Institut Imagine, F-75006 Paris, France
  - Service de Cytogénétique, Hôpital Necker-Enfants Malades, APHP, F-75015 Paris, France
- Alejandro García
  - Université de Paris, Institut Imagine, F-75006 Paris, France
  - Clinical Bioinformatics Laboratory, Imagine Institute, INSERM UMR1163, F-75015 Paris, France
- Patrick Nitschké
  - Université de Paris, Institut Imagine, F-75006 Paris, France
  - Plateforme de Bioinformatique, Université Paris Descartes, F-75015 Paris, France
- Sergi Romana
  - Université de Paris, Institut Imagine, F-75006 Paris, France
  - Service de Cytogénétique, Hôpital Necker-Enfants Malades, APHP, F-75015 Paris, France
- Valérie Malan
  - Université de Paris, Institut Imagine, F-75006 Paris, France
  - Service de Cytogénétique, Hôpital Necker-Enfants Malades, APHP, F-75015 Paris, France
- Antonio Rausell
  - Université de Paris, Institut Imagine, F-75006 Paris, France
  - Clinical Bioinformatics Laboratory, Imagine Institute, INSERM UMR1163, F-75015 Paris, France
  - Service de Génétique Moleculaire, Hôpital Necker-Enfants Malades, APHP, F-75015, Paris, France
|
38
|
Abdullaev ET, Umarova IR, Arndt PF. Modelling segmental duplications in the human genome. BMC Genomics 2021; 22:496. [PMID: 34215180 PMCID: PMC8254307 DOI: 10.1186/s12864-021-07789-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 11/03/2020] [Accepted: 06/10/2021] [Indexed: 11/22/2022] Open
Abstract
Background: Segmental duplications (SDs) are long DNA sequences that are repeated in a genome and have high sequence identity. In contrast to repetitive elements they are often unique and only sometimes have multiple copies in a genome. There are several well-studied mechanisms responsible for segmental duplications: non-allelic homologous recombination, non-homologous end joining and replication slippage. Such duplications play an important role in evolution; however, we do not have a full understanding of the dynamic properties of the duplication process. Results: We study segmental duplications through a graph representation where nodes represent genomic regions and edges represent duplications between them. The resulting network (the SD network) is quite complex and has distinct features which allow us to make inferences on the evolution of segmental duplications. We propose a network growth model that explains features of the SD network, thus giving us insights into the dynamics of segmental duplications in the human genome. Based on our analysis of genomes of other species, the network growth model seems to be applicable to multiple mammalian genomes. Conclusions: Our analysis suggests that duplication rates of genomic loci grow linearly with the number of copies of a duplicated region. Several scenarios explaining such preferential duplication rates were suggested. Supplementary Information: The online version contains supplementary material available at (10.1186/s12864-021-07789-7).
Affiliation(s)
- Eldar T Abdullaev
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestraße 63/73, Berlin, 14195, Germany
- Iren R Umarova
  - Faculty of Computational Mathematics and Cybernetics, Moscow State University, Leninskiye Gory 1-52, Moscow, 119991, Russia
- Peter F Arndt
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestraße 63/73, Berlin, 14195, Germany
|
39
|
Moreira A, Croze M, Delehelle F, Cussat-Blanc S, Luga H, Mollereau C, Balaresque P. Hearing Sensitivity of Primates: Recurrent and Episodic Positive Selection in Hair Cells and Stereocilia Protein-Coding Genes. Genome Biol Evol 2021; 13:6302699. [PMID: 34137817 PMCID: PMC8358225 DOI: 10.1093/gbe/evab133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 06/06/2021] [Indexed: 12/29/2022] Open
Abstract
The large spectrum of hearing sensitivity observed in primates results from the impact of environmental and behavioral pressures to optimize sound perception and localization. Although evidence of positive selection in auditory genes has been detected in mammals including in Hominoids, selection has never been investigated in other primates. We analyzed 123 genes highly expressed in the inner ear of 27 primate species and tested to what extent positive selection may have shaped these genes in the order Primates tree. We combined both site and branch-site tests to obtain a comprehensive picture of the positively selected genes (PSGs) involved in hearing sensitivity, and drew a detailed description of the most affected branches in the tree. We chose a conservative approach, and thus focused on confounding factors potentially affecting PSG signals (alignment, GC-biased gene conversion, duplications, heterogeneous sequencing qualities). Using site tests, we showed that around 12% of these genes are PSGs, an α selection value consistent with average human genome estimates (10-15%). Using branch-site tests, we showed that the primate tree is heterogeneously affected by positive selection, with the black snub-nosed monkey, the bushbaby, and the orangutan, being the most impacted branches. A large proportion of these genes is inclined to shape hair cells and stereocilia, which are involved in the mechanotransduction process, known to influence frequency perception. Adaptive selection, and more specifically recurrent adaptive evolution, could have acted in parallel on a set of genes (ADGRV1, USH2A, PCDH15, PTPRQ, and ATP8A2) involved in stereocilia growth and the whole complex of bundle links connecting them, in species across different habitats, including high altitude and nocturnal environments.
Affiliation(s)
- Andreia Moreira
  - Anthropologie Moléculaire et Imagerie de Synthèse (AMIS), Faculté de Médecine Purpan, CNRS UMR5288, Université de Toulouse, Université Toulouse III Paul Sabatier, France
  - Institut de Recherche en Informatique de Toulouse (IRIT), CNRS UMR5505, Université Toulouse III Paul Sabatier, France
- Myriam Croze
  - Anthropologie Moléculaire et Imagerie de Synthèse (AMIS), Faculté de Médecine Purpan, CNRS UMR5288, Université de Toulouse, Université Toulouse III Paul Sabatier, France
- Franklin Delehelle
  - Anthropologie Moléculaire et Imagerie de Synthèse (AMIS), Faculté de Médecine Purpan, CNRS UMR5288, Université de Toulouse, Université Toulouse III Paul Sabatier, France
  - Institut de Recherche en Informatique de Toulouse (IRIT), CNRS UMR5505, Université Toulouse III Paul Sabatier, France
- Sylvain Cussat-Blanc
  - Institut de Recherche en Informatique de Toulouse (IRIT), CNRS UMR5505, Université Toulouse III Paul Sabatier, France
- Hervé Luga
  - Institut de Recherche en Informatique de Toulouse (IRIT), CNRS UMR5505, Université Toulouse III Paul Sabatier, France
- Catherine Mollereau
  - Anthropologie Moléculaire et Imagerie de Synthèse (AMIS), Faculté de Médecine Purpan, CNRS UMR5288, Université de Toulouse, Université Toulouse III Paul Sabatier, France
- Patricia Balaresque
  - Anthropologie Moléculaire et Imagerie de Synthèse (AMIS), Faculté de Médecine Purpan, CNRS UMR5288, Université de Toulouse, Université Toulouse III Paul Sabatier, France
|
40
|
Ferrari R, Grandi N, Tramontano E, Dieci G. Retrotransposons as Drivers of Mammalian Brain Evolution. Life (Basel) 2021; 11:life11050376. [PMID: 33922141 PMCID: PMC8143547 DOI: 10.3390/life11050376] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 03/30/2021] [Revised: 04/20/2021] [Accepted: 04/21/2021] [Indexed: 12/11/2022] Open
Abstract
Retrotransposons, a large and diverse class of transposable elements that are still active in humans, represent a remarkable force of genomic innovation underlying mammalian evolution. Among the features distinguishing mammals from all other vertebrates, the presence of a neocortex with a peculiar neuronal organization, composition and connectivity is perhaps the one that, by affecting the cognitive abilities of mammals, contributed mostly to their evolutionary success. Among mammals, hominids and especially humans display an extraordinarily expanded cortical volume, an enrichment of the repertoire of neural cell types and more elaborate patterns of neuronal connectivity. Retrotransposon-derived sequences have recently been implicated in multiple layers of gene regulation in the brain, from transcriptional and post-transcriptional control to both local and large-scale three-dimensional chromatin organization. Accordingly, an increasing variety of neurodevelopmental and neurodegenerative conditions are being recognized to be associated with retrotransposon dysregulation. We review here a large body of recent studies lending support to the idea that retrotransposon-dependent evolutionary novelties were crucial for the emergence of mammalian, primate and human peculiarities of brain morphology and function.
Affiliation(s)
- Roberto Ferrari
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, 43124 Parma, Italy
- Nicole Grandi
  - Laboratory of Molecular Virology, Department of Life and Environmental Sciences, University of Cagliari, Cittadella Universitaria di Monserrato, 09042 Monserrato, Italy
- Enzo Tramontano
  - Laboratory of Molecular Virology, Department of Life and Environmental Sciences, University of Cagliari, Cittadella Universitaria di Monserrato, 09042 Monserrato, Italy
  - Istituto di Ricerca Genetica e Biomedica, Consiglio Nazionale delle Ricerche, 09042 Monserrato, Italy
- Giorgio Dieci
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, 43124 Parma, Italy
|
41
|
Warren WC, Harris RA, Haukness M, Fiddes IT, Murali SC, Fernandes J, Dishuck PC, Storer JM, Raveendran M, Hillier LW, Porubsky D, Mao Y, Gordon D, Vollger MR, Lewis AP, Munson KM, DeVogelaere E, Armstrong J, Diekhans M, Walker JA, Tomlinson C, Graves-Lindsay TA, Kremitzki M, Salama SR, Audano PA, Escalona M, Maurer NW, Antonacci F, Mercuri L, Maggiolini FAM, Catacchio CR, Underwood JG, O'Connor DH, Sanders AD, Korbel JO, Ferguson B, Kubisch HM, Picker L, Kalin NH, Rosene D, Levine J, Abbott DH, Gray SB, Sanchez MM, Kovacs-Balint ZA, Kemnitz JW, Thomasy SM, Roberts JA, Kinnally EL, Capitanio JP, Skene JHP, Platt M, Cole SA, Green RE, Ventura M, Wiseman RW, Paten B, Batzer MA, Rogers J, Eichler EE. Sequence diversity analyses of an improved rhesus macaque genome enhance its biomedical utility. Science 2021; 370:370/6523/eabc6617. [PMID: 33335035 DOI: 10.1126/science.abc6617] [Citation(s) in RCA: 107] [Impact Index Per Article: 26.8] [Received: 05/07/2020] [Accepted: 10/29/2020] [Indexed: 12/15/2022]
Abstract
The rhesus macaque (Macaca mulatta) is the most widely studied nonhuman primate (NHP) in biomedical research. We present an updated reference genome assembly (Mmul_10, contig N50 = 46 Mbp) that increases the sequence contiguity 120-fold and annotate it using 6.5 million full-length transcripts, thus improving our understanding of gene content, isoform diversity, and repeat organization. With the improved assembly of segmental duplications, we discovered new lineage-specific genes and expanded gene families that are potentially informative in studies of evolution and disease susceptibility. Whole-genome sequencing (WGS) data from 853 rhesus macaques identified 85.7 million single-nucleotide variants (SNVs) and 10.5 million indel variants, including potentially damaging variants in genes associated with human autism and developmental delay, providing a framework for developing noninvasive NHP models of human disease.
Affiliation(s)
- Wesley C Warren
  - Department of Animal Sciences, Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
  - Department of Surgery, School of Medicine, University of Missouri, Columbia, MO 65211, USA
  - Institute of Data Science and Informatics, University of Missouri, Columbia, MO 65211, USA
- R Alan Harris
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Marina Haukness
  - Computational Genomics Laboratory, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- Shwetha C Murali
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
- Jason Fernandes
  - Department of Biomolecular Engineering, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- Philip C Dishuck
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Jessica M Storer
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
  - Institute for Systems Biology, Seattle, WA 98109, USA
- Muthuswamy Raveendran
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- LaDeana W Hillier
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Yafei Mao
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- David Gordon
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
- Mitchell R Vollger
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Alexandra P Lewis
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Elizabeth DeVogelaere
  - Computational Genomics Laboratory, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- Joel Armstrong
  - Computational Genomics Laboratory, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- Mark Diekhans
  - Computational Genomics Laboratory, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- Jerilyn A Walker
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Chad Tomlinson
  - McDonnell Genome Institute, Washington University, St. Louis, MO 63108, USA
- Milinn Kremitzki
  - McDonnell Genome Institute, Washington University, St. Louis, MO 63108, USA
- Sofie R Salama
  - Department of Biomolecular Engineering, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- Peter A Audano
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Merly Escalona
  - Department of Biomolecular Engineering, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- Nicholas W Maurer
  - Department of Biomolecular Engineering, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- Ludovica Mercuri
  - Department of Biology, University of Bari 'Aldo Moro', 70125 Bari, Italy
- David H O'Connor
  - Department of Pathology and Laboratory Medicine, Wisconsin National Primate Research Center, University of Wisconsin-Madison, Madison, WI 53711, USA
- Ashley D Sanders
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Jan O Korbel
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Betsy Ferguson
  - Division of Genetics, Oregon National Primate Research Center, Oregon Health and Science University, Beaverton, OR 97006, USA
- Louis Picker
  - Oregon National Primate Research Center and Vaccine and Gene Therapy Institute, Oregon Health Sciences University, Beaverton, OR 97006, USA
- Ned H Kalin
  - Department of Psychiatry, University of Wisconsin School of Medicine and Public Health, Madison, WI 53719, USA
- Douglas Rosene
  - Department of Anatomy and Neurobiology, Boston University School of Medicine, Boston, MA 02118, USA
- Jon Levine
  - Department of Neuroscience, University of Wisconsin, Madison, WI 53175, USA
  - Wisconsin National Primate Research Center, University of Wisconsin, Madison, WI 53171, USA
- David H Abbott
  - Wisconsin National Primate Research Center, University of Wisconsin, Madison, WI 53171, USA
  - Department of Obstetrics and Gynecology, Wisconsin National Primate Research Center, University of Wisconsin, Madison, WI 53715, USA
- Stanton B Gray
  - The University of Texas MD Anderson Cancer Center, Michale E. Keeling Center for Comparative Medicine and Research, Bastrop, TX 78602, USA
- Mar M Sanchez
  - Yerkes National Primate Research Center, Atlanta, GA 30329, USA
  - Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA 30329, USA
- Joseph W Kemnitz
  - Wisconsin National Primate Research Center, University of Wisconsin, Madison, WI 53171, USA
  - Department of Cell and Regenerative Biology, University of Wisconsin, Madison, WI 53706, USA
- Sara M Thomasy
  - Department of Surgical and Radiological Sciences, School of Veterinary Medicine, University of California-Davis, Davis, CA 95616, USA
  - Department of Ophthalmology and Vision Science, School of Medicine, University of California-Davis, Davis, CA 95817, USA
- Erin L Kinnally
  - California National Primate Research Center, Davis, CA 95616, USA
  - Department of Psychology, University of California, Davis, CA 95616, USA
- John P Capitanio
  - California National Primate Research Center, Davis, CA 95616, USA
  - Department of Psychology, University of California, Davis, CA 95616, USA
- J H Pate Skene
  - Department of Neurobiology, Duke University School of Medicine, Durham, NC 27710, USA
- Michael Platt
  - Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Shelley A Cole
  - Population Health Program, Texas Biomedical Research Institute and Southwest National Primate Research Center, San Antonio, TX 78227, USA
- Richard E Green
  - Department of Biomolecular Engineering, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- Mario Ventura
  - Department of Biology, University of Bari 'Aldo Moro', 70125 Bari, Italy
- Roger W Wiseman
  - Department of Pathology and Laboratory Medicine, Wisconsin National Primate Research Center, University of Wisconsin-Madison, Madison, WI 53711, USA
- Benedict Paten
  - Computational Genomics Laboratory, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- Mark A Batzer
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Jeffrey Rogers
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
|
42
|
Gualtieri CT. Genomic Variation, Evolvability, and the Paradox of Mental Illness. Front Psychiatry 2021; 11:593233. [PMID: 33551865 PMCID: PMC7859268 DOI: 10.3389/fpsyt.2020.593233] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/10/2020] [Accepted: 11/27/2020] [Indexed: 12/30/2022] Open
Abstract
Twentieth-century genetics was hard put to explain the irregular behavior of neuropsychiatric disorders. Autism and schizophrenia defy a principle of natural selection; they are highly heritable but associated with low reproductive success. Nevertheless, they persist. The genetic origins of such conditions are confounded by the problem of variable expression, that is, when a given genetic aberration can lead to any one of several distinct disorders. Also, autism and schizophrenia occur on a spectrum of severity, from mild and subclinical cases to the overt and disabling. Such irregularities reflect the problem of missing heritability; although hundreds of genes may be associated with autism or schizophrenia, together they account for only a small proportion of cases. Techniques for higher resolution, genomewide analysis have begun to illuminate the irregular and unpredictable behavior of the human genome. Thus, the origins of neuropsychiatric disorders in particular and complex disease in general have been illuminated. The human genome is characterized by a high degree of structural and behavioral variability: DNA content variation, epistasis, stochasticity in gene expression, and epigenetic changes. These elements have grown more complex as evolution scaled the phylogenetic tree. They are especially pertinent to brain development and function. Genomic variability is a window on the origins of complex disease, neuropsychiatric disorders, and neurodevelopmental disorders in particular. Genomic variability, as it happens, is also the fuel of evolvability. The genomic events that presided over the evolution of the primate and hominid lineages are over-represented in patients with autism and schizophrenia, as well as intellectual disability and epilepsy. That the special qualities of the human genome that drove evolution might, in some way, contribute to neuropsychiatric disorders is a matter of no little interest.
|
43
|
Lavrichenko K, Helgeland Ø, Njølstad PR, Jonassen I, Johansson S. SeeCiTe: a method to assess CNV calls from SNP arrays using trio data. Bioinformatics 2021; 37:1876-1883. [PMID: 33459766 PMCID: PMC8317106 DOI: 10.1093/bioinformatics/btab028] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/28/2020] [Revised: 12/17/2020] [Accepted: 01/11/2021] [Indexed: 11/15/2022] Open
Abstract
Motivation: Single nucleotide polymorphism (SNP) genotyping arrays remain an attractive platform for assaying copy number variants (CNVs) in large population-wide cohorts. However, current tools for calling CNVs are still prone to extensive false positive calls when applied to biobank-scale arrays. Moreover, there is a lack of methods exploiting cohorts with trios available (e.g. nuclear family) to assist in quality control and downstream analyses following the calling. Results: We developed SeeCiTe (Seeing CNVs in Trios), a novel CNV-quality control tool that postprocesses output from current CNV-calling tools exploiting child-parent trio data to classify calls in quality categories and provide a set of visualizations for each putative CNV call in the offspring. We apply it to the Norwegian Mother, Father and Child Cohort Study (MoBa) and show that SeeCiTe improves the specificity and sensitivity compared to the common empiric filtering strategies. To our knowledge, it is the first tool that utilizes probe-level CNV data in trios (and singletons) to systematically highlight potential artifacts and visualize signal intensities in a streamlined fashion suitable for biobank-scale studies. Availability and implementation: The software is implemented in R with the source code freely available at https://github.com/aksenia/SeeCiTe. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ksenia Lavrichenko
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
  - Department of Clinical Science, University of Bergen, Bergen, Norway
- Øyvind Helgeland
  - Department of Clinical Science, University of Bergen, Bergen, Norway
  - Department of Genetics and Bioinformatics, Norwegian Institute of Public Health, Oslo, Norway
- Pål R Njølstad
  - Department of Clinical Science, University of Bergen, Bergen, Norway
  - Department of Pediatrics and Adolescents, Haukeland University Hospital, Bergen, Norway
- Inge Jonassen
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Stefan Johansson
  - Department of Clinical Science, University of Bergen, Bergen, Norway
  - Department of Medical Genetics, Haukeland University Hospital, Bergen, Norway
|
44
|
Global Genome Conformational Programming during Neuronal Development Is Associated with CTCF and Nuclear FGFR1-The Genome Archipelago Model. Int J Mol Sci 2020; 22:ijms22010347. [PMID: 33396256 PMCID: PMC7795191 DOI: 10.3390/ijms22010347] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/18/2020] [Revised: 12/17/2020] [Accepted: 12/18/2020] [Indexed: 01/15/2023] Open
Abstract
During the development of mouse embryonic stem cells (ESC) to neuronal committed cells (NCC), coordinated changes in the expression of 2851 genes take place, mediated by the nuclear form of FGFR1 (nFGFR1). In this paper, widespread differences are demonstrated between ESC and NCC in inter- and intra-chromosomal interactions, chromatin looping, and the formation of CTCF- and nFGFR1-linked Topologically Associating Domains (TADs), both on a genome-wide scale and in exemplary HoxA-D loci. Analysis centered on the HoxA cluster shows that blocking FGFR1 disrupts loop formation. FGFR1 binding and genome locales are predictive of the genome interactions; likewise, chromatin interactions along with nFGFR1 binding are predictive of the genome function and correlate with genome regulatory attributes and gene expression. This study advances a topologically integrated genome archipelago model that undergoes structural transformations through the formation of nFGFR1-associated TADs. The makeover of the TAD islands serves to recruit distinct ontogenic programs during the development of the ESC to NCC.
|
45
|
Binversie EE, Baker LA, Engelman CD, Hao Z, Moran JJ, Piazza AM, Sample SJ, Muir P. Analysis of copy number variation in dogs implicates genomic structural variation in the development of anterior cruciate ligament rupture. PLoS One 2020; 15:e0244075. [PMID: 33382735 PMCID: PMC7774950 DOI: 10.1371/journal.pone.0244075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2019] [Accepted: 12/02/2020] [Indexed: 11/19/2022] Open
Abstract
Anterior cruciate ligament (ACL) rupture is an important condition of the human knee. Second ruptures are common and societal costs are substantial. Canine cranial cruciate ligament (CCL) rupture closely models the human disease. CCL rupture is common in the Labrador Retriever (5.79% prevalence), ~100-fold more prevalent than in humans. Labrador Retriever CCL rupture is a polygenic complex disease, based on genome-wide association study (GWAS) of single nucleotide polymorphism (SNP) markers. Dissection of genetic variation in complex traits can be enhanced by studying structural variation, including copy number variants (CNVs). Dogs are an ideal model for CNV research because of reduced genetic variability within breeds and extensive phenotypic diversity across breeds. We studied the genetic etiology of CCL rupture by association analysis of CNV regions (CNVRs) using 110 case and 164 control Labrador Retrievers. CNVs were called from SNPs using three different programs (PennCNV, CNVPartition, and QuantiSNP). After quality control, CNV calls were combined to create CNVRs using ParseCNV and an association analysis was performed. We found no strong effect CNVRs but found 46 small effect (max(T) permutation P<0.05) CCL rupture associated CNVRs in 22 autosomes; 25 were deletions and 21 were duplications. Of the 46 CCL rupture associated CNVRs, we identified 39 unique regions. Thirty four were identified by a single calling algorithm, 3 were identified by two calling algorithms, and 2 were identified by all three algorithms. For 42 of the associated CNVRs, frequency in the population was <10% while 4 occurred at a frequency in the population ranging from 10–25%. Average CNVR length was 198,872bp and CNVRs covered 0.11 to 0.15% of the genome. All CNVRs were associated with case status. CNVRs did not overlap previous canine CCL rupture risk loci identified by GWAS. Associated CNVRs contained 152 annotated genes; 12 CNVRs did not have genes mapped to CanFam3.1. 
Using pathway analysis, a cluster of 19 homeobox domain transcript regulator genes was associated with CCL rupture (P = 6.6E-13). This gene cluster influences cranial-caudal body pattern formation during embryonic limb development. Clustered genes were found in 3 CNVRs on chromosome 14 (HoxA), 28 (NKX6-2), and 36 (HoxD). When analysis was limited to deletion CNVRs, the association was strengthened (P = 8.7E-16). This study suggests a component of the polygenic risk of CCL rupture in Labrador Retrievers is associated with small effect CNVs and may include aspects of stifle morphology regulated by homeobox domain transcript regulator genes.
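The CNVR counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' analysis. It assumes (the abstract does not say so explicitly) that a region detected by k calling algorithms contributes k caller-level CNVRs to the total of 46. All variable names are hypothetical.

```python
# Figures quoted in the abstract above.
total_associated_cnvrs = 46
deletions, duplications = 25, 21
# Number of calling algorithms -> number of unique regions they identified.
unique_regions_by_n_callers = {1: 34, 2: 3, 3: 2}
# CNVRs at <10% population frequency, and at 10-25% frequency.
low_freq_cnvrs, mid_freq_cnvrs = 42, 4

# Unique regions are counted once each, regardless of caller support.
unique_regions = sum(unique_regions_by_n_callers.values())

# Assumption (not stated in the abstract): a region found by k callers
# contributes k caller-level CNVRs to the total.
caller_level_cnvrs = sum(k * n for k, n in unique_regions_by_n_callers.items())

print("unique regions:", unique_regions)
print("caller-level CNVRs:", caller_level_cnvrs)
print("deletions + duplications:", deletions + duplications)
print("frequency classes:", low_freq_cnvrs + mid_freq_cnvrs)
```

Under that assumption, the 39 unique regions and the 46 associated CNVRs are mutually consistent, as are the deletion/duplication split and the two frequency classes.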
Affiliation(s)
- Emily E. Binversie
  - Comparative Orthopaedic and Genetics Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Lauren A. Baker
  - Comparative Orthopaedic and Genetics Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Corinne D. Engelman
  - Department of Population Health Sciences, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Zhengling Hao
  - Comparative Orthopaedic and Genetics Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- John J. Moran
  - Department of Comparative Biosciences, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Alexander M. Piazza
  - Comparative Orthopaedic and Genetics Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Susannah J. Sample
  - Comparative Orthopaedic and Genetics Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Peter Muir
  - Comparative Orthopaedic and Genetics Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
|
46
|
Single-cell strand sequencing of a macaque genome reveals multiple nested inversions and breakpoint reuse during primate evolution. Genome Res 2020; 30:1680-1693. [PMID: 33093070 PMCID: PMC7605249 DOI: 10.1101/gr.265322.120] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 04/30/2020] [Accepted: 09/02/2020] [Indexed: 12/14/2022]
Abstract
Rhesus macaque is an Old World monkey that shared a common ancestor with human ∼25 Myr ago and is an important animal model for human disease studies. A deep understanding of its genetics is therefore required for both biomedical and evolutionary studies. Among structural variants, inversions represent a driving force in speciation and play an important role in disease predisposition. Here we generated a genome-wide map of inversions between human and macaque, combining single-cell strand sequencing with cytogenetics. We identified 375 total inversions between 859 bp and 92 Mbp, increasing by eightfold the number of previously reported inversions. Among these, 19 inversions flanked by segmental duplications overlap with recurrent copy number variants associated with neurocognitive disorders. Evolutionary analyses show that in 17 out of 19 cases, the Hominidae orientation of these disease-associated regions is always derived. This suggests that duplicated sequences likely played a fundamental role in generating inversions in humans and great apes, creating architectures that nowadays predispose these regions to disease-associated genetic instability. Finally, we identified 861 genes mapping at 156 inversions breakpoints, with some showing evidence of differential expression in human and macaque cell lines, thus highlighting candidates that might have contributed to the evolution of species-specific features. This study depicts the most accurate fine-scale map of inversions between human and macaque using a two-pronged integrative approach, such as single-cell strand sequencing and cytogenetics, and represents a valuable resource toward understanding of the biology and evolution of primate species.
|
47
|
Van Bibber NW, Haerle C, Khalife R, Dayhoff GW, Uversky VN. Intrinsic Disorder in Human Proteins Encoded by Core Duplicon Gene Families. J Phys Chem B 2020; 124:8050-8070. [PMID: 32880174 DOI: 10.1021/acs.jpcb.0c07676] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]
Abstract
Segmental duplications (i.e., highly homologous DNA fragments greater than 1 kb in length that are present within a genome at more than one site) are typically found in genome regions that are prone to rearrangements. A noticeable fraction of the human genome (∼5%) includes segmental duplications (or duplicons) that are assumed to play a number of vital roles in human evolution, human-specific adaptation, and genomic instability. Despite their importance for crucial events such as synaptogenesis, neuronal migration, and neocortical expansion, these segmental duplications continue to be rather poorly characterized. Of particular interest are the core duplicon gene (CDG) families, which are replicates sharing common "core" DNA among the randomly attached pieces and which expand along single chromosomes and might harbor newly acquired protein domains. Another important feature of proteins encoded by CDG families is their multifunctionality. Although it seems that these proteins might possess many characteristic features of intrinsically disordered proteins, to the best of our knowledge, a systematic investigation of the intrinsic disorder predisposition of the proteins encoded by core duplicon gene families has not been conducted yet. To fill this gap and to determine the degree to which these proteins might be affected by intrinsic disorder, we analyzed a set of human proteins encoded by the members of 10 core duplicon gene families, such as NBPF, RGPD, GUSBP, PMS2P, SPATA31, TRIM51, GOLGA8, NPIP, TBC1D3, and LRRC37. Our analysis revealed that the vast majority of these proteins are highly disordered, with their disordered regions often being utilized as means for the protein-protein interactions and/or targeted for numerous posttranslational modifications of different nature.
Affiliation(s)
- Nathan W Van Bibber
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
- Cornelia Haerle
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
- Roy Khalife
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
- Guy W Dayhoff
  - Department of Chemistry, College of Art and Sciences, University of South Florida, Tampa, Florida 33620, United States
- Vladimir N Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
  - USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
  - Institute for Biological Instrumentation, Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", 4 Institutskaya St., Pushchino, 142290, Moscow Region, Russia
|
48
|
Lallemand T, Leduc M, Landès C, Rizzon C, Lerat E. An Overview of Duplicated Gene Detection Methods: Why the Duplication Mechanism Has to Be Accounted for in Their Choice. Genes (Basel) 2020; 11:E1046. [PMID: 32899740 PMCID: PMC7565063 DOI: 10.3390/genes11091046] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Received: 07/30/2020] [Revised: 09/01/2020] [Accepted: 09/02/2020] [Indexed: 12/11/2022] Open
Abstract
Gene duplication is an important evolutionary mechanism that provides new genetic material and thus opportunities for an organism to acquire new gene functions, with major implications such as speciation events. Various processes are known to allow a gene to be duplicated, and different models explain how duplicated genes can be maintained in genomes. Because of their particular importance, identifying duplicated genes is essential when studying genome evolution, but it can still be a challenge because of the various fates duplicated genes can encounter. In this review, we first describe the evolutionary processes that allow duplicated genes to form, and then the various bioinformatic approaches that can be used to identify them in genome sequences. These bioinformatic approaches differ according to the underlying duplication mechanism. Hence, understanding the specificity of the duplicated genes of interest is a great asset for tool selection and should be taken into account when exploring a biological question.
Affiliation(s)
- Tanguy Lallemand
  - IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France
- Martin Leduc
  - IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France
- Claudine Landès
  - IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France
- Carène Rizzon
  - Laboratoire de Mathématiques et Modélisation d’Evry (LaMME), Université d’Evry Val d’Essonne, Université Paris-Saclay, UMR CNRS 8071, ENSIIE, USC INRAE, 23 bvd de France, CEDEX, 91037 Evry Paris, France
- Emmanuelle Lerat
  - Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, F-69622 Villeurbanne, France
|
49
|
Ji QM, Xin JW, Chai ZX, Zhang CF, Dawa Y, Luo S, Zhang Q, Pingcuo Z, Peng MS, Zhu Y, Cao HW, Wang H, Han JL, Zhong JC. A chromosome-scale reference genome and genome-wide genetic variations elucidate adaptation in yak. Mol Ecol Resour 2020; 21:201-211. [PMID: 32745324 PMCID: PMC7754329 DOI: 10.1111/1755-0998.13236] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 11/13/2019] [Revised: 07/03/2020] [Accepted: 07/20/2020] [Indexed: 11/28/2022]
Abstract
Yak is an important livestock animal for the people indigenous to the harsh, oxygen‐limited Qinghai‐Tibetan Plateau and Hindu Kush ranges of the Himalayas. The yak genome was sequenced in 2012, but its assembly was fragmented because of the inherent limitations of the Illumina sequencing technology used to analyse it. An accurate and complete reference genome is essential for the study of genetic variations in this species. Long‐read sequences are more complete than their short‐read counterparts and have been successfully applied towards high‐quality genome assembly for various species. In this study, we present a high‐quality chromosome‐scale yak genome assembly (BosGru_PB_v1.0) constructed with long‐read sequencing and chromatin interaction technologies. Compared to an existing yak genome assembly (BosGru_v2.0), BosGru_PB_v1.0 shows substantially improved chromosome sequence continuity, reduced repetitive structure ambiguity, and gene model completeness. To characterize genetic variation in yak, we generated de novo genome assemblies based on Illumina short reads for seven recognized domestic yak breeds in Tibet and Sichuan and one wild yak from Hoh Xil. We compared these eight assemblies to the BosGru_PB_v1.0 genome, obtained a comprehensive map of yak genetic diversity at the whole‐genome level, and identified several protein‐coding genes absent from the BosGru_PB_v1.0 assembly. Despite the genetic bottleneck experienced by wild yak, their diversity was nonetheless higher than that of domestic yak. Here, we identified breed‐specific sequences and genes by whole‐genome alignment, which may facilitate yak breed identification.
Affiliation(s)
- Qiu-Mei Ji
  - State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, China
  - Institute of Animal Science and Veterinary Research, Tibet Academy of Agricultural and Animal Husbandry Sciences, Lhasa, China
- Jin-Wei Xin
  - State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, China
  - Institute of Animal Science and Veterinary Research, Tibet Academy of Agricultural and Animal Husbandry Sciences, Lhasa, China
- Zhi-Xin Chai
  - Key Laboratory of Qinghai-Tibetan Plateau Animal Genetic Resource Reservation and Utilization, Sichuan Province and Ministry of Education, Southwest Minzu University, Chengdu, China
- Cheng-Fu Zhang
  - State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, China
  - Institute of Animal Science and Veterinary Research, Tibet Academy of Agricultural and Animal Husbandry Sciences, Lhasa, China
- Yangla Dawa
  - State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, China
  - Institute of Animal Science and Veterinary Research, Tibet Academy of Agricultural and Animal Husbandry Sciences, Lhasa, China
- Sang Luo
  - State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, China
  - Institute of Animal Science and Veterinary Research, Tibet Academy of Agricultural and Animal Husbandry Sciences, Lhasa, China
- Qiang Zhang
  - State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, China
  - Institute of Animal Science and Veterinary Research, Tibet Academy of Agricultural and Animal Husbandry Sciences, Lhasa, China
- Zhandui Pingcuo
  - State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, China
  - Institute of Animal Science and Veterinary Research, Tibet Academy of Agricultural and Animal Husbandry Sciences, Lhasa, China
- Min-Sheng Peng
  - State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Yong Zhu
  - State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, China
  - Institute of Animal Science and Veterinary Research, Tibet Academy of Agricultural and Animal Husbandry Sciences, Lhasa, China
- Han-Wen Cao
  - State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, China
  - Institute of Animal Science and Veterinary Research, Tibet Academy of Agricultural and Animal Husbandry Sciences, Lhasa, China
- Hui Wang
  - Key Laboratory of Qinghai-Tibetan Plateau Animal Genetic Resource Reservation and Utilization, Sichuan Province and Ministry of Education, Southwest Minzu University, Chengdu, China
- Jian-Lin Han
  - CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agriculture Sciences (CAAS), Beijing, China
- Jin-Cheng Zhong
  - Key Laboratory of Qinghai-Tibetan Plateau Animal Genetic Resource Reservation and Utilization, Sichuan Province and Ministry of Education, Southwest Minzu University, Chengdu, China
|
50
|
Bekpen C, Tautz D. Human core duplicon gene families: game changers or game players? Brief Funct Genomics 2020; 18:402-411. [PMID: 31529038 PMCID: PMC6920530 DOI: 10.1093/bfgp/elz016] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 01/30/2019] [Revised: 05/01/2019] [Accepted: 06/24/2019] [Indexed: 01/09/2023] Open
Abstract
Illuminating the role of specific gene duplications within the human lineage can provide insights into human-specific adaptations. The so-called human core duplicon gene families have received particular attention in this respect, due to special features such as expansion along single chromosomes, newly acquired protein domains and signatures of positive selection. Here, we summarize the data available for 10 such families and include some new analyses. A picture emerges that suggests broad functions for these protein families, possibly through modification of core cellular pathways. Still, more dedicated studies are required to elucidate the function of core duplicon gene families and how they have shaped the adaptations and evolution of humans.
Affiliation(s)
- Diethard Tautz
  - Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
|