1
Bespalova AV, Kulikova DA, Zelentsova ES, Rezvykh AP, Guseva IO, Dorador AP, Evgen’ev MB, Funikov SY. Paramutation-Like Behavior of Genic piRNA-Producing Loci in Drosophila virilis. Int J Mol Sci 2025; 26:4243. [PMID: 40362480 PMCID: PMC12072073 DOI: 10.3390/ijms26094243] [Received: 03/26/2025] [Revised: 04/24/2025] [Accepted: 04/28/2025] [Indexed: 05/15/2025] Open
Abstract
Piwi-interacting RNAs (piRNAs) play a crucial role in silencing transposable elements (TEs) in the germ cells of Metazoa by acting as sequence-specific guides. Originating from distinct genomic loci, called piRNA clusters, piRNAs can trigger an epigenetic conversion of TE insertions into piRNA clusters by means of a paramutation-like process. However, the variability in piRNA clusters' capacity to induce such conversions remains poorly understood. Here, we investigated two Drosophila virilis strains with differing capacities to produce piRNAs from the subtelomeric RhoGEF3 and Adar gene loci. We found that active piRNA generation correlates with high levels of the heterochromatic mark histone 3 lysine 9 trimethylation (H3K9me3) over genomic regions that give rise to piRNAs. Importantly, the maternal transmission of piRNAs drives their production in the progeny, even from homologous loci previously inactive in piRNA biogenesis. The RhoGEF3 locus, once epigenetically converted, maintained enhanced piRNA production in subsequent generations lacking the original allele carrying the active piRNA cluster. In contrast, piRNA expression from the converted Adar locus was lost in offspring lacking the inducer allele. The present findings suggest that the paramutation-like behavior of piRNA clusters may be influenced not only by piRNAs but also by structural features and the chromatin environment in proximity to telomeres, providing new insights into the epigenetic regulation of the Drosophila genome.
Affiliation(s)
- Alina V. Bespalova
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
- Dina A. Kulikova
- Koltzov Institute of Developmental Biology, Russian Academy of Sciences, 119334 Moscow, Russia
- Elena S. Zelentsova
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
- Alexander P. Rezvykh
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
- Iuliia O. Guseva
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
- Moscow Center for Advanced Studies, Kulakova Str. 20, 123592 Moscow, Russia
- Ana P. Dorador
- Howard Hughes Medical Institute, Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
- Mikhail B. Evgen’ev
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
- Sergei Y. Funikov
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
2
Dong C, Xia S, Zhang L, Arsala D, Fang C, Tan S, Clark AG, Long M. Subcellular Enrichment Patterns of New Genes in Drosophila Evolution. Mol Biol Evol 2025; 42:msaf038. [PMID: 39920336 PMCID: PMC11843443 DOI: 10.1093/molbev/msaf038] [Received: 10/06/2024] [Revised: 12/31/2024] [Accepted: 01/14/2025] [Indexed: 02/09/2025] Open
Abstract
The evolutionary patterns of proteins within subcellular compartments underlie innovation and diversification in eukaryotic organisms. The location of proteins in subcellular compartments promotes the formation of network interaction modules, which in turn reshape the architecture of higher-level protein-protein interaction networks. Here, we conducted the most up-to-date gene age dating of Drosophila melanogaster by employing recently available long-read sequencing genomes as references. We found that elevated gene fixation in the most recent common ancestor of the Drosophila genus predated the divergence between the two Drosophila subgenera, and that these genes in D. melanogaster show a significant tendency to encode proteins that localize to the extracellular matrix, accompanying the adaptive radiation of Drosophila species. Proteins encoded by genes located in the extracellular space exhibit higher sequence divergence, suggesting a rapid evolutionary process. We also observed that proteins encoded by genes originating from the same evolutionary branches tend to co-localize in the same subcellular compartments, and proteins in the same subcellular compartment tend to interact with each other. The proteins encoded by genes that have persisted through deeper branches exhibit broader localization across multiple subcellular compartments, enhancing the likelihood of their integration into various protein or gene regulatory networks, thereby increasing functional diversity. These evolutionary patterns not only contribute to understanding the evolution of subcellular localization in proteins encoded by genes originating from different branches, but also provide insights into the evolution of protein-protein networks driven by the emergence of new genes.
Affiliation(s)
- Chuan Dong
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Hangzhou, Zhejiang, China
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
- Shengqian Xia
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
- Li Zhang
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
- Deanna Arsala
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
- Chengchi Fang
- The Key Laboratory of Aquatic Biodiversity and Conservation of Chinese Academy of Sciences, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, China
- Shengjun Tan
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
- Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Manyuan Long
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
3
Sharakhov IV, Sharakhova MV. Chromosomal inversions and their impact on insect evolution. Current Opinion in Insect Science 2024; 66:101280. [PMID: 39374869 PMCID: PMC11611660 DOI: 10.1016/j.cois.2024.101280] [Received: 06/28/2024] [Revised: 08/20/2024] [Accepted: 10/02/2024] [Indexed: 10/09/2024]
Abstract
Insects can adapt quickly and effectively to rapid environmental change and maintain long-term adaptations, but the genetic mechanisms underlying this response are not fully understood. In this review, we summarize studies on the potential impact of chromosomal inversion polymorphisms on insect evolution at different spatial and temporal scales, ranging from long-term evolutionary stability to rapid emergence in response to emerging biotic and abiotic factors. The study of inversions has recently been advanced by comparative, population, and 3D genomics methods. The impact of inversions on insect genome evolution can be profound, including increased gene order rearrangements on sex chromosomes, accumulation of transposable elements, and facilitation of genome divergence. Understanding these processes provides critical insights into the evolutionary mechanisms shaping insect diversity.
Affiliation(s)
- Igor V Sharakhov
- Department of Entomology and Fralin Life Sciences Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA; The Center for Emerging, Zoonotic, and Arthropod-borne Pathogens, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA; The Center for Mathematics of Biosystems, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA; Department of Genetics and Cell Biology, Tomsk State University, Tomsk 634050, Russia.
- Maria V Sharakhova
- Department of Entomology and Fralin Life Sciences Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA; The Center for Emerging, Zoonotic, and Arthropod-borne Pathogens, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA; Laboratory of Cell Differentiation Mechanisms, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
4
Volarić M, Despot-Slade E, Veseljak D, Mravinac B, Meštrović N. Long-read genome assembly of the insect model organism Tribolium castaneum reveals spread of satellite DNA in gene-rich regions by recurrent burst events. Genome Res 2024; 34:1878-1894. [PMID: 39438111 DOI: 10.1101/gr.279225.124] [Received: 02/28/2024] [Accepted: 06/11/2024] [Indexed: 10/25/2024]
Abstract
Eukaryotic genomes are replete with satellite DNAs (satDNAs), large stretches of tandemly repeated sequences that are mostly underrepresented in genome assemblies. Here we combined nanopore long-read sequencing with a reference-guided assembly approach to generate an improved, high-quality genome assembly, TcasONT, of the model beetle Tribolium castaneum. Enriched by 45 Mb in repetitive regions, the new assembly comprises almost the entire genome sequence. We use the enhanced assembly to conduct global and in-depth analyses of abundant euchromatic satDNAs. Unexpectedly, we show the extensive spread of satDNAs in gene-rich regions, including long arrays. The sequence similarity relationships between satDNA monomers and arrays indicate a recent exchange of satDNA arrays between different chromosomes. We propose a scenario of their genome dynamics characterized by repeated bursts of satDNAs spreading through euchromatin, followed by a process of elongation and homogenization of arrays. We find that suppressed recombination on the X Chromosome has no significant effect on the spread of satDNAs but the X rather tolerates the amplification of satDNAs into longer arrays. Analyses of arrays' neighboring regions show a tendency of one satDNA to be associated with transposable-like elements. Using 2D electrophoresis followed by Southern blotting, we prove Cast satDNAs' presence in the fraction of extrachromosomal circular DNA (eccDNA). We point to two mechanisms that enable this satDNA spread to occur: transposition by transposable elements and insertion mediated by eccDNA. The presence of such a large proportion of satDNA in gene-rich regions inevitably gives rise to speculation about their possible influence on gene expression.
5
Huang YH, Sun YF, Li H, Li HS, Pang H. PhyloAln: A Convenient Reference-Based Tool to Align Sequences and High-Throughput Reads for Phylogeny and Evolution in the Omic Era. Mol Biol Evol 2024; 41:msae150. [PMID: 39041199 PMCID: PMC11287380 DOI: 10.1093/molbev/msae150] [Received: 02/14/2024] [Revised: 05/15/2024] [Accepted: 07/16/2024] [Indexed: 07/24/2024] Open
Abstract
The current trend in phylogenetic and evolutionary analyses predominantly relies on omic data. However, prior to core analyses, traditional methods typically involve intricate and time-consuming procedures, including assembly from high-throughput reads, decontamination, gene prediction, homology search, orthology assignment, multiple sequence alignment, and matrix trimming. Such processes significantly impede the efficiency of research when dealing with extensive data sets. In this study, we develop PhyloAln, a convenient reference-based tool capable of directly aligning high-throughput reads or complete sequences with existing alignments as a reference for phylogenetic and evolutionary analyses. Through testing with simulated data sets of species spanning the tree of life, PhyloAln demonstrates consistently robust performance compared with other reference-based tools across different data types, sequencing technologies, coverages, and species, with percent completeness and identity at least 50 percentage points higher in the alignments. Additionally, we validate the efficacy of PhyloAln in removing a minimum of 90% foreign and 70% cross-contamination issues, which are prevalent in sequencing data but often overlooked by other tools. Moreover, we showcase the broad applicability of PhyloAln by generating alignments (completeness mostly larger than 80%, identity larger than 90%) and reconstructing robust phylogenies using real data sets of transcriptomes of ladybird beetles, plastid genes of peppers, or ultraconserved elements of turtles. With these advantages, PhyloAln is expected to facilitate phylogenetic and evolutionary analyses in the omic era. The tool is accessible at https://github.com/huangyh45/PhyloAln.
Affiliation(s)
- Yu-Hao Huang
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen 518107, China
- Yi-Fei Sun
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen 518107, China
- Hao Li
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen 518107, China
- Hao-Sen Li
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen 518107, China
- Hong Pang
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Shenzhen 518107, China
6
Brand CL, Oliver GT, Farkas IZ, Buszczak M, Levine MT. Recurrent Duplication and Diversification of a Vital DNA Repair Gene Family Across Drosophila. Mol Biol Evol 2024; 41:msae113. [PMID: 38865490 PMCID: PMC11210505 DOI: 10.1093/molbev/msae113] [Received: 11/07/2023] [Revised: 05/30/2024] [Accepted: 06/04/2024] [Indexed: 06/14/2024] Open
Abstract
Maintaining genome integrity is vital for organismal survival and reproduction. Essential, broadly conserved DNA repair pathways actively preserve genome integrity. However, many DNA repair proteins evolve adaptively. Ecological forces like UV exposure are classically cited drivers of DNA repair evolution. Intrinsic forces like repetitive DNA, which also imperil genome integrity, have received less attention. We recently reported that a Drosophila melanogaster-specific DNA satellite array triggered species-specific, adaptive evolution of a DNA repair protein called Spartan/MH. The Spartan family of proteases cleave hazardous, covalent crosslinks that form between DNA and proteins ("DNA-protein crosslink repair"). Appreciating that DNA satellites are both ubiquitous and universally fast-evolving, we hypothesized that satellite DNA turnover spurs adaptive evolution of DNA-protein crosslink repair beyond a single gene and beyond the D. melanogaster lineage. This hypothesis predicts pervasive Spartan gene family diversification across Drosophila species. To study the evolutionary history of the Drosophila Spartan gene family, we conducted population genetic, molecular evolution, phylogenomic, and tissue-specific expression analyses. We uncovered widespread signals of positive selection across multiple Spartan family genes and across multiple evolutionary timescales. We also detected recurrent Spartan family gene duplication, divergence, and gene loss. Finally, we found that ovary-enriched parent genes consistently birthed functionally diverged, testis-enriched daughter genes. To account for Spartan family diversification, we introduce a novel mechanistic model of antagonistic coevolution that links DNA satellite evolution and adaptive regulation of Spartan protease activity. This framework promises to accelerate our understanding of how DNA repeats drive recurrent evolutionary innovation to preserve genome integrity.
Affiliation(s)
- Cara L Brand
- Department of Biology and Epigenetics Institute, University of Pennsylvania, Philadelphia, PA 19104, USA
- Genevieve T Oliver
- Department of Biology and Epigenetics Institute, University of Pennsylvania, Philadelphia, PA 19104, USA
- Isabella Z Farkas
- Department of Biology and Epigenetics Institute, University of Pennsylvania, Philadelphia, PA 19104, USA
- Michael Buszczak
- Department of Molecular Biology and Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Mia T Levine
- Department of Biology and Epigenetics Institute, University of Pennsylvania, Philadelphia, PA 19104, USA
7
Flynn JM, Yamashita YM. The implications of satellite DNA instability on cellular function and evolution. Semin Cell Dev Biol 2024; 156:152-159. [PMID: 37852904 DOI: 10.1016/j.semcdb.2023.10.005] [Received: 04/26/2023] [Revised: 09/21/2023] [Accepted: 10/11/2023] [Indexed: 10/20/2023]
Abstract
Abundant tandemly repeated satellite DNA is present in most eukaryotic genomes. Previous limitations including a pervasive view that it was uninteresting junk DNA, combined with challenges in studying it, are starting to dissolve - and recent studies have found important functions for satellite DNAs. The observed rapid evolution and implied instability of satellite DNA now has important significance for their functions and maintenance within the genome. In this review, we discuss the processes that lead to satellite DNA copy number instability, and the importance of mechanisms to manage the potential negative effects of instability. Satellite DNA is vulnerable to challenges during replication and repair, since it forms difficult-to-process secondary structures and its homology within tandem arrays can result in various types of recombination. Satellite DNA instability may be managed by DNA or chromatin-binding proteins ensuring proper nuclear localization and repair, or by proteins that process aberrant structures that satellite DNAs tend to form. We also discuss the pattern of satellite DNA mutations from recent mutation accumulation (MA) studies that have tracked changes in satellite DNA for up to 1000 generations with minimal selection. Finally, we highlight examples of satellite evolution from studies that have characterized satellites across millions of years of Drosophila fruit fly evolution, and discuss possible ways that selection might act on the satellite DNA composition.
Affiliation(s)
- Jullien M Flynn
- Whitehead Institute for Biomedical Research, Cambridge, MA, USA; Howard Hughes Medical Institute, Cambridge, MA, USA.
- Yukiko M Yamashita
- Whitehead Institute for Biomedical Research, Cambridge, MA, USA; Howard Hughes Medical Institute, Cambridge, MA, USA; Massachusetts Institute of Technology, Cambridge, MA, USA.
8
Flynn JM, Ahmed-Braimah YH, Long M, Wing RA, Clark AG. High-Quality Genome Assemblies Reveal Evolutionary Dynamics of Repetitive DNA and Structural Rearrangements in the Drosophila virilis Subgroup. Genome Biol Evol 2024; 16:evad238. [PMID: 38159044 PMCID: PMC10783647 DOI: 10.1093/gbe/evad238] [Received: 08/14/2023] [Revised: 12/18/2023] [Accepted: 12/23/2023] [Indexed: 01/03/2024] Open
Abstract
High-quality genome assemblies across a range of nontraditional model organisms can accelerate the discovery of novel aspects of genome evolution. The Drosophila virilis group has several attributes that distinguish it from more highly studied species in the Drosophila genus, such as an unusual abundance of repetitive elements and extensive karyotype evolution, in addition to being an attractive model for speciation genetics. Here, we used long-read sequencing to assemble five genomes of three virilis group species and characterized sequence and structural divergence and repetitive DNA evolution. We find that our contiguous genome assemblies allow characterization of chromosomal arrangements with ease and can facilitate analysis of inversion breakpoints. We also leverage a small panel of resequenced strains to explore the genomic pattern of divergence and polymorphism in this species and show that known demographic histories largely predict the extent of genome-wide segregating polymorphism. We further find that a neo-X chromosome in Drosophila americana displays X-like levels of nucleotide diversity. We also found that unusual repetitive elements were responsible for much of the divergence in genome composition among species. Helitron-derived tandem repeats tripled in abundance on the Y chromosome in D. americana compared to Drosophila novamexicana, accounting for most of the difference in repeat content between these sister species. Repeats with characteristics of both transposable elements and satellite DNAs expanded by 3-fold, mostly in euchromatin, in both D. americana and D. novamexicana compared to D. virilis. Our results represent a major advance in our understanding of genome biology in this emerging model clade.
Affiliation(s)
- Jullien M Flynn
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Whitehead Institute for Biomedical Research, Cambridge, MA, USA
- Manyuan Long
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Rod A Wing
- School of Plant Sciences, Arizona Genomics Institute, University of Arizona, Tucson, AZ, USA
- Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
9
Flynn JM, Ahmed-Braimah YH, Long M, Wing RA, Clark AG. High quality genome assemblies reveal evolutionary dynamics of repetitive DNA and structural rearrangements in the Drosophila virilis sub-group. bioRxiv 2023:2023.08.13.553086. [PMID: 37645834 PMCID: PMC10462019 DOI: 10.1101/2023.08.13.553086] [Indexed: 08/31/2023]
Abstract
High-quality genome assemblies across a range of non-traditional model organisms can accelerate the discovery of novel aspects of genome evolution. The Drosophila virilis group has several attributes that distinguish it from more highly studied species in the Drosophila genus, such as an unusual abundance of repetitive elements and extensive karyotype evolution, in addition to being an attractive model for speciation genetics. Here we used long-read sequencing to assemble five genomes of three virilis group species and characterized sequence and structural divergence and repetitive DNA evolution. We find that our contiguous genome assemblies allow characterization of chromosomal arrangements with ease and can facilitate analysis of inversion breakpoints. We also leverage a small panel of resequenced strains to explore the genomic pattern of divergence and polymorphism in this species and show that known demographic histories largely predict the extent of genome-wide segregating polymorphism. We further find that a neo-X chromosome in D. americana displays X-like levels of nucleotide diversity. We also found that unusual repetitive elements were responsible for much of the divergence in genome composition among species. Helitron-derived tandem repeats tripled in abundance on the Y chromosome in D. americana compared to D. novamexicana, accounting for most of the difference in repeat content between these sister species. Repeats with characteristics of both transposable elements and satellite DNAs expanded by three-fold, mostly in euchromatin, in both D. americana and D. novamexicana compared to D. virilis. Our results represent a major advance in our understanding of genome biology in this emerging model clade.
Affiliation(s)
- Jullien M. Flynn
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Whitehead Institute for Biomedical Research, Cambridge, MA, USA
- Manyuan Long
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Rod A. Wing
- School of Plant Sciences, Arizona Genomics Institute, University of Arizona, Tucson, AZ
- Andrew G. Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
10
Flynn JM, Hu KB, Clark AG. Three recent sex chromosome-to-autosome fusions in a Drosophila virilis strain with high satellite DNA content. Genetics 2023; 224:iyad062. [PMID: 37052958 PMCID: PMC10213488 DOI: 10.1093/genetics/iyad062] [Received: 12/15/2022] [Revised: 12/02/2022] [Accepted: 04/07/2023] [Indexed: 04/14/2023] Open
Abstract
The karyotype, or number and arrangement of chromosomes, has varying levels of stability across both evolution and disease. Karyotype changes often originate from DNA breaks near the centromeres of chromosomes, which generally contain long arrays of tandem repeats or satellite DNA. Drosophila virilis possesses among the highest relative satellite abundances of studied species, with almost half its genome composed of three related 7 bp satellites. We discovered a strain of D. virilis that we infer recently underwent three independent chromosome fusion events involving the X and Y chromosomes, in addition to one subsequent fission event. Here, we isolate and characterize the four different karyotypes we discovered in this strain, which we believe demonstrates remarkable genome instability. We discovered that one of the substrains with an X-autosome fusion has an X-to-Y chromosome nondisjunction rate 20× higher than the D. virilis reference strain (21% vs 1%). Finally, we found an overall higher rate of DNA breakage in the substrain with higher satellite DNA compared to a genetically similar substrain with less satellite DNA. This suggests that satellite DNA abundance may play a role in the risk of genome instability. Overall, we introduce a novel system consisting of a single strain with four different karyotypes, which we believe will be useful for future studies of genome instability, centromere function, and sex chromosome evolution.
Affiliation(s)
- Jullien M Flynn
- Department of Molecular Biology and Genetics, Cornell University, Biotechnology Building Room 227, Ithaca, NY 14853, USA
- Kevin B Hu
- Department of Molecular Biology and Genetics, Cornell University, Biotechnology Building Room 227, Ithaca, NY 14853, USA
- Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University, Biotechnology Building Room 227, Ithaca, NY 14853, USA
11
Nurk S, Koren S, Rhie A, Rautiainen M, Bzikadze AV, Mikheenko A, Vollger MR, Altemose N, Uralsky L, Gershman A, Aganezov S, Hoyt SJ, Diekhans M, Logsdon GA, Alonge M, Antonarakis SE, Borchers M, Bouffard GG, Brooks SY, Caldas GV, Chen NC, Cheng H, Chin CS, Chow W, de Lima LG, Dishuck PC, Durbin R, Dvorkina T, Fiddes IT, Formenti G, Fulton RS, Fungtammasan A, Garrison E, Grady PG, Graves-Lindsay TA, Hall IM, Hansen NF, Hartley GA, Haukness M, Howe K, Hunkapiller MW, Jain C, Jain M, Jarvis ED, Kerpedjiev P, Kirsche M, Kolmogorov M, Korlach J, Kremitzki M, Li H, Maduro VV, Marschall T, McCartney AM, McDaniel J, Miller DE, Mullikin JC, Myers EW, Olson ND, Paten B, Peluso P, Pevzner PA, Porubsky D, Potapova T, Rogaev EI, Rosenfeld JA, Salzberg SL, Schneider VA, Sedlazeck FJ, Shafin K, Shew CJ, Shumate A, Sims Y, Smit AFA, Soto DC, Sović I, Storer JM, Streets A, Sullivan BA, Thibaud-Nissen F, Torrance J, Wagner J, Walenz BP, Wenger A, Wood JMD, Xiao C, Yan SM, Young AC, Zarate S, Surti U, McCoy RC, Dennis MY, Alexandrov IA, Gerton JL, O’Neill RJ, Timp W, Zook JM, Schatz MC, Eichler EE, Miga KH, Phillippy AM. The complete sequence of a human genome. Science 2022; 376:44-53. [PMID: 35357919 PMCID: PMC9186530 DOI: 10.1126/science.abj6987] [Indexed: 12/11/2022]
Abstract
Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.
Affiliation(s)
- Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
- Mikko Rautiainen
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
- Andrey V. Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego; La Jolla, CA, USA
- Alla Mikheenko
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University; Saint Petersburg, Russia
- Mitchell R. Vollger
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
- Nicolas Altemose
- Department of Bioengineering, University of California, Berkeley; Berkeley, CA, USA
- Lev Uralsky
- Sirius University of Science and Technology; Sochi, Russia
- Vavilov Institute of General Genetics; Moscow, Russia
- Ariel Gershman
- Department of Molecular Biology and Genetics, Johns Hopkins University; Baltimore, MD, USA
- Sergey Aganezov
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
- Savannah J. Hoyt
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
- Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
- Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
- Michael Alonge
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
- Gerard G. Bouffard
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
- Shelise Y. Brooks
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
- Gina V. Caldas
- Department of Molecular and Cell Biology, University of California, Berkeley; Berkeley, CA, USA
- Nae-Chyun Chen
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
- Haoyu Cheng
- Department of Data Sciences, Dana-Farber Cancer Institute; Boston, MA
- Department of Biomedical Informatics, Harvard Medical School; Boston, MA
- Philip C. Dishuck
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
- Richard Durbin
- Wellcome Sanger Institute; Cambridge, UK
- Department of Genetics, University of Cambridge; Cambridge, UK
- Tatiana Dvorkina
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University; Saint Petersburg, Russia
- Giulio Formenti
- Laboratory of Neurogenetics of Language and The Vertebrate Genome Lab, The Rockefeller University; New York, NY, USA
- Howard Hughes Medical Institute; Chevy Chase, MD, USA
- Robert S. Fulton
- Department of Genetics, Washington University School of Medicine; St. Louis, MO, USA
- Erik Garrison
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
- University of Tennessee Health Science Center; Memphis, TN, USA
| | - Patrick G.S. Grady
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
| | | | - Ira M. Hall
- Department of Genetics, Yale University School of Medicine; New Haven, CT, USA
| | - Nancy F. Hansen
- Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Gabrielle A. Hartley
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
| | - Marina Haukness
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | | | | | - Chirag Jain
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
- Department of Computational and Data Sciences, Indian Institute of Science; Bangalore KA, India
| | - Miten Jain
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | - Erich D. Jarvis
- Laboratory of Neurogenetics of Language and The Vertebrate Genome Lab, The Rockefeller University; New York, NY, USA
- Howard Hughes Medical Institute; Chevy Chase, MD, USA
| | | | - Melanie Kirsche
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
| | - Mikhail Kolmogorov
- Department of Computer Science and Engineering, University of California, San Diego; San Diego, CA, USA
| | | | - Milinn Kremitzki
- McDonnell Genome Institute, Washington University in St. Louis; St. Louis, MO, USA
| | - Heng Li
- Department of Data Sciences, Dana-Farber Cancer Institute; Boston, MA
- Department of Biomedical Informatics, Harvard Medical School; Boston, MA
| | - Valerie V. Maduro
- Undiagnosed Diseases Program, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Tobias Marschall
- Heinrich Heine University Düsseldorf, Medical Faculty, Institute for Medical Biometry and Bioinformatics; Düsseldorf, Germany
| | - Ann M. McCartney
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
| | - Jennifer McDaniel
- Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
| | - Danny E. Miller
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children’s Hospital; Seattle, WA, USA
| | - James C. Mullikin
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
- Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Eugene W. Myers
- Max-Planck Institute of Molecular Cell Biology and Genetics; Dresden, Germany
| | - Nathan D. Olson
- Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | | | - Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California, San Diego; San Diego, CA, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
| | - Tamara Potapova
- Stowers Institute for Medical Research; Kansas City, MO, USA
| | - Evgeny I. Rogaev
- Sirius University of Science and Technology; Sochi, Russia
- Vavilov Institute of General Genetics; Moscow, Russia
- Department of Psychiatry, University of Massachusetts Medical School; Worcester, MA, USA
- Faculty of Biology, Lomonosov Moscow State University; Moscow, Russia
| | | | - Steven L. Salzberg
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, USA
| | - Valerie A. Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health; Bethesda, MD, USA
| | - Fritz J. Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine; Houston TX, USA
| | - Kishwar Shafin
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | - Colin J. Shew
- Genome Center, MIND Institute, Department of Biochemistry and Molecular Medicine, University of California, Davis; CA, USA
| | - Alaina Shumate
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, USA
| | - Ying Sims
- Wellcome Sanger Institute; Cambridge, UK
| | | | - Daniela C. Soto
- Genome Center, MIND Institute, Department of Biochemistry and Molecular Medicine, University of California, Davis; CA, USA
| | - Ivan Sović
- Pacific Biosciences; Menlo Park, CA, USA
- Digital BioLogic d.o.o.; Ivanić-Grad, Croatia
| | | | - Aaron Streets
- Department of Bioengineering, University of California, Berkeley; Berkeley, CA, USA
- Chan Zuckerberg Biohub; San Francisco, CA, USA
| | - Beth A. Sullivan
- Department of Molecular Genetics and Microbiology, Duke University School of Medicine; Durham, NC, USA
| | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health; Bethesda, MD, USA
| | | | - Justin Wagner
- Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
| | - Brian P. Walenz
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
| | | | | | - Chunlin Xiao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health; Bethesda, MD, USA
| | - Stephanie M. Yan
- Department of Biology, Johns Hopkins University; Baltimore, MD, USA
| | - Alice C. Young
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Samantha Zarate
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
| | - Urvashi Surti
- Department of Pathology, University of Pittsburgh; Pittsburgh, PA, USA
| | - Rajiv C. McCoy
- Department of Biology, Johns Hopkins University; Baltimore, MD, USA
| | - Megan Y. Dennis
- Genome Center, MIND Institute, Department of Biochemistry and Molecular Medicine, University of California, Davis; CA, USA
| | - Ivan A. Alexandrov
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University; Saint Petersburg, Russia
- Vavilov Institute of General Genetics; Moscow, Russia
- Research Center of Biotechnology of the Russian Academy of Sciences; Moscow, Russia
| | - Jennifer L. Gerton
- Stowers Institute for Medical Research; Kansas City, MO, USA
- Department of Biochemistry and Molecular Biology, University of Kansas Medical School; Kansas City, MO, USA
| | - Rachel J. O’Neill
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
| | - Winston Timp
- Department of Molecular Biology and Genetics, Johns Hopkins University; Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, USA
| | - Justin M. Zook
- Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
| | - Michael C. Schatz
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
- Department of Biology, Johns Hopkins University; Baltimore, MD, USA
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
- Howard Hughes Medical Institute; Chevy Chase, MD, USA
| | - Karen H. Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
| | - Adam M. Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
|
12
|
Chang CH, Gregory LE, Gordon KE, Meiklejohn CD, Larracuente AM. Unique structure and positive selection promote the rapid divergence of Drosophila Y chromosomes. eLife 2022; 11:e75795. [PMID: 34989337 PMCID: PMC8794474 DOI: 10.7554/elife.75795] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 11/24/2021] [Accepted: 12/18/2021] [Indexed: 02/06/2023]
Abstract
Y chromosomes across diverse species convergently evolve a gene-poor, heterochromatic organization enriched for duplicated genes, LTR retrotransposons, and satellite DNA. Sexual antagonism and a loss of recombination play major roles in the degeneration of young Y chromosomes. However, the processes shaping the evolution of mature, already degenerated Y chromosomes are less well-understood. Because Y chromosomes evolve rapidly, comparisons between closely related species are particularly useful. We generated de novo long-read assemblies complemented with cytological validation to reveal Y chromosome organization in three closely related species of the Drosophila simulans complex, which diverged only 250,000 years ago and share >98% sequence identity. We find these Y chromosomes are divergent in their organization and repetitive DNA composition and discover new Y-linked gene families whose evolution is driven by both positive selection and gene conversion. These Y chromosomes are also enriched for large deletions, suggesting that the repair of double-strand breaks on Y chromosomes may be biased toward microhomology-mediated end joining over canonical non-homologous end-joining. We propose that this repair mechanism contributes to the convergent evolution of Y chromosome organization across organisms.
Affiliation(s)
- Ching-Ho Chang
- Department of Biology, University of Rochester, Rochester, United States
| | - Lauren E Gregory
- Department of Biology, University of Rochester, Rochester, United States
| | - Kathleen E Gordon
- School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, United States
| | - Colin D Meiklejohn
- School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, United States
|
13
|
Affiliation(s)
| | - Francisco J. Ruiz-Ruano
- Department of Organismal Biology – Systematic Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- School of Biological Sciences, Norwich Research Park University of East Anglia, Norwich, UK
|
14
|
Abstract
We are entering a new era in genomics where entire centromeric regions are accurately represented in human reference assemblies. Access to these high-resolution maps will enable new surveys of sequence and epigenetic variation in the population and offer new insight into satellite array genomics and centromere function. Here, we focus on the sequence organization and evolution of alpha satellites, which are credited as the genetic and genomic definition of human centromeres due to their interaction with inner kinetochore proteins and their importance in the development of human artificial chromosome assays. We provide an overview of alpha satellite repeat structure and array organization in the context of these high-quality reference data sets; discuss the emergence of variation-based surveys; and provide perspective on the role of this new source of genetic and epigenetic variation in the context of chromosome biology, genome instability, and human disease.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California 95064, USA; Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA
| | - Ivan A Alexandrov
- Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia; Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199004, Russia; Research Center of Biotechnology of the Russian Academy of Sciences, Moscow 119071, Russia
|
15
|
Subgenome Discrimination in Brassica and Raphanus Allopolyploids Using Microsatellites. Cells 2021; 10:cells10092358. [PMID: 34572008 PMCID: PMC8466703 DOI: 10.3390/cells10092358] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/27/2021] [Revised: 09/01/2021] [Accepted: 09/03/2021] [Indexed: 01/11/2023]
Abstract
Intergeneric crosses between Brassica species and Raphanus sativus have produced crops with prominent shoot and root systems of Brassica and R. sativus, respectively. It is necessary to discriminate donor genomes when studying cytogenetic stability in distant crosses to identify homologous chromosome pairing, and microsatellite repeats have been used to discriminate subgenomes in allopolyploids. To identify genome-specific microsatellites, we explored the microsatellite content in three Brassica species (B. rapa, AA, B. oleracea, CC, and B. nigra, BB) and R. sativus (RR) genomes, and validated their genome specificity by fluorescence in situ hybridization. We identified three microsatellites showing A, C, and B/R genome specificity. ACBR_msat14 and ACBR_msat20 were detected in the A and C chromosomes, respectively, and ACBR_msat01 was detected in B and R genomes. However, we did not find a microsatellite that discriminated the B and R genomes. The localization of ACBR_msat20 in the 45S rDNA array in ×Brassicoraphanus 977 corroborated the association of the 45S rDNA array with genome rearrangement. Along with the rDNA and telomeric repeat probes, these microsatellites enabled the easy identification of homologous chromosomes. These data demonstrate the utility of microsatellites as probes in identifying subgenomes within closely related Brassica and Raphanus species for the analysis of genetic stability of new synthetic polyploids of these genomes.
|
16
|
Kuhn GCS, Heringer P, Dias GB. Structure, Organization, and Evolution of Satellite DNAs: Insights from the Drosophila repleta and D. virilis Species Groups. Progress in Molecular and Subcellular Biology 2021; 60:27-56. [PMID: 34386871 DOI: 10.1007/978-3-030-74889-0_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
The fact that satellite DNAs (satDNAs) in eukaryotes are abundant genomic components, can perform functional roles, but can also change rapidly across species while being homogenous within a species, makes them an intriguing and fascinating genomic component to study. It is also becoming clear that satDNAs represent an important piece in genome architecture and that changes in their structure, organization, and abundance can affect the evolution of genomes and species in many ways. Since the discovery of satDNAs more than 50 years ago, species from the Drosophila genus have continuously been used as models to study several aspects of satDNA biology. These studies have been largely concentrated in D. melanogaster and closely related species from the Sophophora subgenus, even though the vast majority of all Drosophila species belong to the Drosophila subgenus. This chapter highlights some studies on the satDNA structure, organization, and evolution in two species groups from the Drosophila subgenus: the repleta and virilis groups. We also discuss and review the classification of other abundant tandem repeats found in these species in the light of the current information available.
Affiliation(s)
- Gustavo C S Kuhn
- Departamento de Genética, Ecologia e Evolução, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil
| | - Pedro Heringer
- Departamento de Genética, Ecologia e Evolução, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil
| | - Guilherme Borges Dias
- Department of Genetics and Institute of Bioinformatics, University of Georgia, Athens, GA, USA
|
17
|
Kim BY, Wang JR, Miller DE, Barmina O, Delaney E, Thompson A, Comeault AA, Peede D, D'Agostino ERR, Pelaez J, Aguilar JM, Haji D, Matsunaga T, Armstrong EE, Zych M, Ogawa Y, Stamenković-Radak M, Jelić M, Veselinović MS, Tanasković M, Erić P, Gao JJ, Katoh TK, Toda MJ, Watabe H, Watada M, Davis JS, Moyle LC, Manoli G, Bertolini E, Košťál V, Hawley RS, Takahashi A, Jones CD, Price DK, Whiteman N, Kopp A, Matute DR, Petrov DA. Highly contiguous assemblies of 101 drosophilid genomes. eLife 2021; 10:e66405. [PMID: 34279216 PMCID: PMC8337076 DOI: 10.7554/elife.66405] [Citation(s) in RCA: 96] [Impact Index Per Article: 24.0] [Received: 01/11/2021] [Accepted: 07/16/2021] [Indexed: 12/13/2022]
Abstract
Over 100 years of studies in Drosophila melanogaster and related species in the genus Drosophila have facilitated key discoveries in genetics, genomics, and evolution. While high-quality genome assemblies exist for several species in this group, they only encompass a small fraction of the genus. Recent advances in long-read sequencing allow high-quality genome assemblies for tens or even hundreds of species to be efficiently generated. Here, we utilize Oxford Nanopore sequencing to build an open community resource of genome assemblies for 101 lines of 93 drosophilid species encompassing 14 species groups and 35 sub-groups. The genomes are highly contiguous and complete, with an average contig N50 of 10.5 Mb and greater than 97% BUSCO completeness in 97/101 assemblies. We show that Nanopore-based assemblies are highly accurate in coding regions, particularly with respect to coding insertions and deletions. These assemblies, along with a detailed laboratory protocol and assembly pipelines, are released as a public resource and will serve as a starting point for addressing broad questions of genetics, ecology, and evolution at the scale of hundreds of species.
Affiliation(s)
- Bernard Y Kim
- Department of Biology, Stanford University, Stanford, United States
| | - Jeremy R Wang
- Department of Genetics, University of North Carolina, Chapel Hill, United States
| | - Danny E Miller
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children’s Hospital, Seattle, United States
| | - Olga Barmina
- Department of Evolution and Ecology, University of California Davis, Davis, United States
| | - Emily Delaney
- Department of Evolution and Ecology, University of California Davis, Davis, United States
| | - Ammon Thompson
- Department of Evolution and Ecology, University of California Davis, Davis, United States
| | - Aaron A Comeault
- School of Natural Sciences, Bangor University, Bangor, United Kingdom
| | - David Peede
- Biology Department, University of North Carolina, Chapel Hill, United States
| | | | - Julianne Pelaez
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | - Jessica M Aguilar
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | - Diler Haji
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | - Teruyuki Matsunaga
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | | | - Molly Zych
- Molecular and Cellular Biology Program, University of Washington, Seattle, United States
| | - Yoshitaka Ogawa
- Department of Biological Sciences, Tokyo Metropolitan University, Hachioji, Japan
| | | | - Mihailo Jelić
- Faculty of Biology, University of Belgrade, Belgrade, Serbia
| | | | - Marija Tanasković
- University of Belgrade, Institute for Biological Research "Siniša Stanković", National Institute of Republic of Serbia, Belgrade, Serbia
| | - Pavle Erić
- University of Belgrade, Institute for Biological Research "Siniša Stanković", National Institute of Republic of Serbia, Belgrade, Serbia
| | - Jian-Jun Gao
- School of Ecology and Environmental Science, Yunnan University, Kunming, China
| | - Takehiro K Katoh
- School of Ecology and Environmental Science, Yunnan University, Kunming, China
| | | | - Hideaki Watabe
- Biological Laboratory, Sapporo College, Hokkaido University of Education, Sapporo, Japan
| | - Masayoshi Watada
- Graduate School of Science and Engineering, Ehime University, Matsuyama, Japan
| | - Jeremy S Davis
- Department of Biology, University of Kentucky, Lexington, United States
| | - Leonie C Moyle
- Department of Biology, Indiana University, Bloomington, United States
| | - Giulia Manoli
- Neurobiology and Genetics, Theodor Boveri Institute, Biocentre, University of Würzburg, Würzburg, Germany
| | - Enrico Bertolini
- Neurobiology and Genetics, Theodor Boveri Institute, Biocentre, University of Würzburg, Würzburg, Germany
| | - Vladimír Košťál
- Institute of Entomology, Biology Centre, Academy of Sciences of the Czech Republic, Prague, Czech Republic
| | - R Scott Hawley
- Department of Molecular and Integrative Physiology, University of Kansas Medical Center, Stowers Institute for Medical Research, Kansas City, United States
| | - Aya Takahashi
- Department of Biological Sciences, Tokyo Metropolitan University, Hachioji, Japan
| | - Corbin D Jones
- Biology Department, University of North Carolina, Chapel Hill, United States
| | - Donald K Price
- School of Life Science, University of Nevada, Las Vegas, United States
| | - Noah Whiteman
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | - Artyom Kopp
- Department of Evolution and Ecology, University of California Davis, Davis, United States
| | - Daniel R Matute
- Biology Department, University of North Carolina, Chapel Hill, United States
| | - Dmitri A Petrov
- Department of Biology, Stanford University, Stanford, United States
|
18
|
Wallace AD, Sasani TA, Swanier J, Gates BL, Greenland J, Pedersen BS, Varley KE, Quinlan AR. CaBagE: A Cas9-based Background Elimination strategy for targeted, long-read DNA sequencing. PLoS One 2021; 16:e0241253. [PMID: 33830997 PMCID: PMC8031414 DOI: 10.1371/journal.pone.0241253] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 10/05/2020] [Accepted: 01/19/2021] [Indexed: 11/29/2022]
Abstract
A substantial fraction of the human genome is difficult to interrogate with short-read DNA sequencing technologies due to paralogy, complex haplotype structures, or tandem repeats. Long-read sequencing technologies, such as Oxford Nanopore's MinION, enable direct measurement of complex loci without introducing many of the biases inherent to short-read methods, though they suffer from relatively lower throughput. This limitation has motivated recent efforts to develop amplification-free strategies to target and enrich loci of interest for subsequent sequencing with long reads. Here, we present CaBagE, a method for target enrichment that is efficient and useful for sequencing large, structurally complex targets. The CaBagE method leverages the stable binding of Cas9 to its DNA target to protect desired fragments from digestion with exonuclease. Enriched DNA fragments are then sequenced with Oxford Nanopore's MinION long-read sequencing technology. Enrichment with CaBagE resulted in a median of 116X coverage (range 39-416) of target loci when tested on five genomic targets ranging from 4-20kb in length using healthy donor DNA. Four cancer gene targets were enriched in a single reaction and multiplexed on a single MinION flow cell. We further demonstrate the utility of CaBagE in two ALS patients with C9orf72 short tandem repeat expansions to produce genotype estimates commensurate with genotypes derived from repeat-primed PCR for each individual. With CaBagE there is a physical enrichment of on-target DNA in a given sample prior to sequencing. This feature allows adaptability across sequencing platforms and potential use as an enrichment strategy for applications beyond sequencing. CaBagE is a rapid enrichment method that can illuminate regions of the 'hidden genome' underlying human disease.
Affiliation(s)
- Amelia D. Wallace
- Department of Human Genetics, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
- Utah Center for Genetic Discovery, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
| | - Thomas A. Sasani
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Jordan Swanier
- Department of Human Genetics, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
| | - Brooke L. Gates
- Department of Oncological Sciences, Huntsman Cancer Institute, Salt Lake City, Utah, United States of America
| | - Jeff Greenland
- Department of Oncological Sciences, Huntsman Cancer Institute, Salt Lake City, Utah, United States of America
| | - Brent S. Pedersen
- Department of Human Genetics, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
- Utah Center for Genetic Discovery, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
| | - Katherine E. Varley
- Department of Oncological Sciences, Huntsman Cancer Institute, Salt Lake City, Utah, United States of America
| | - Aaron R. Quinlan
- Department of Human Genetics, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
- Utah Center for Genetic Discovery, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
- Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, Utah, United States of America
|
19
|
Sena RS, Heringer P, Valeri MP, Pereira VS, Kuhn GCS, Svartman M. Identification and characterization of satellite DNAs in two-toed sloths of the genus Choloepus (Megalonychidae, Xenarthra). Sci Rep 2020; 10:19202. [PMID: 33154538 PMCID: PMC7644632 DOI: 10.1038/s41598-020-76199-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/23/2020] [Accepted: 10/19/2020] [Indexed: 11/09/2022]
Abstract
Choloepus, the only extant genus of the Megalonychidae family, is composed of two living species of two-toed sloths: Choloepus didactylus and C. hoffmanni. In this work, we identified and characterized the main satellite DNAs (satDNAs) in the sequenced genomes of these two species. SATCHO1, the most abundant satDNA in both species, is composed of 117 bp tandem repeat sequences. The second most abundant satDNA, SATCHO2, is composed of ~2292 bp tandem repeats. Fluorescence in situ hybridization in C. hoffmanni revealed that both satDNAs are located in the centromeric regions of all chromosomes, except the X. In fact, these satDNAs present some centromeric characteristics in their sequences, such as dyad symmetries predicted to form secondary structures. PCR experiments indicated the presence of SATCHO1 sequences in two other Xenarthra species: the three-toed sloth Bradypus variegatus and the anteater Myrmecophaga tridactyla. Nevertheless, SATCHO1 is present as large tandem arrays only in Choloepus species, thus likely representing a satDNA exclusively in this genus. Our results reveal interesting features of the satDNA landscape in Choloepus species with the potential to aid future phylogenetic studies in Xenarthra and mammalian genomes in general.
Affiliation(s)
- Radarane Santos Sena
- Laboratório de Citogenômica Evolutiva, Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Pedro Heringer
- Laboratório de Citogenômica Evolutiva, Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Mirela Pelizaro Valeri
- Laboratório de Citogenômica Evolutiva, Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | | | - Gustavo C S Kuhn
- Laboratório de Citogenômica Evolutiva, Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Marta Svartman
- Laboratório de Citogenômica Evolutiva, Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
|
20
|
Gamez S, Srivastav S, Akbari OS, Lau NC. Diverse Defenses: A Perspective Comparing Dipteran Piwi-piRNA Pathways. Cells 2020; 9:E2180. [PMID: 32992598 PMCID: PMC7601171 DOI: 10.3390/cells9102180] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/02/2020] [Revised: 09/22/2020] [Accepted: 09/23/2020] [Indexed: 02/07/2023]
Abstract
Animals face the dual threat of virus infections hijacking cellular function and transposons proliferating in germline genomes. For insects, the deeply conserved RNA interference (RNAi) pathways and other chromatin regulators provide an important line of defense against both viruses and transposons. For example, this innate immune system displays adaptiveness to new invasions by generating cognate small RNAs for targeting gene silencing measures against the viral and genomic intruders. However, within the Dipteran clade of insects, Drosophilid fruit flies and Culicid mosquitoes have evolved several unique mechanistic aspects of their RNAi defenses to combat invading transposons and viruses, with the Piwi-piRNA arm of the RNAi pathways showing the greatest degree of novel evolution. Whereas central features of Piwi-piRNA pathways are conserved between Drosophilids and Culicids, multiple lineage-specific innovations have arisen that may reflect distinct genome composition differences and specific ecological and physiological features dividing these two branches of Dipterans. This perspective review focuses on the most recent findings illuminating the Piwi/piRNA pathway distinctions between fruit flies and mosquitoes, and raises open questions that need to be addressed in order to ameliorate human diseases caused by pathogenic viruses that mosquitoes transmit as vectors.
Affiliation(s)
- Stephanie Gamez
- Division of Biological Sciences, Section of Cell and Developmental Biology, University of California, San Diego, CA 92093, USA
- Satyam Srivastav
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853-2703, USA
- Omar S. Akbari
- Division of Biological Sciences, Section of Cell and Developmental Biology, University of California, San Diego, CA 92093, USA
- Nelson C. Lau
- Department of Biochemistry and Genome Science Institute, Boston University School of Medicine, Boston, MA 02118, USA
|
21
|
Shatskikh AS, Kotov AA, Adashev VE, Bazylev SS, Olenina LV. Functional Significance of Satellite DNAs: Insights From Drosophila. Front Cell Dev Biol 2020; 8:312. [PMID: 32432114 PMCID: PMC7214746 DOI: 10.3389/fcell.2020.00312] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 01/19/2020] [Accepted: 04/08/2020] [Indexed: 12/12/2022]
Abstract
Since their discovery more than 60 years ago, satellite repeats have remained one of the most enigmatic parts of eukaryotic genomes. Being non-coding DNA, satellites were earlier considered to be non-functional “junk,” but recently this concept has been extensively revised. Satellite DNA contributes to the essential processes of formation of crucial chromosome structures, heterochromatin establishment, dosage compensation, reproductive isolation, genome stability and development. Genomic abundance of satellites is under stabilizing selection owing to their role in the maintenance of vital regions of the genome – centromeres, pericentromeric regions, and telomeres. Many satellites are transcribed with the generation of long or small non-coding RNAs. Misregulation of their expression is found to lead to various defects in the maintenance of genomic architecture, chromosome segregation and gametogenesis. This review summarizes our current knowledge concerning satellite functions, the mechanisms of regulation and evolution of satellites, focusing on recent findings in Drosophila. We discuss here experimental and bioinformatics data obtained in Drosophila in recent years, suggesting relevance of our analysis to a wide range of eukaryotic organisms.
Affiliation(s)
- Aleksei S Shatskikh
- Laboratory of Analysis of Clinical and Model Tumor Pathologies on the Organismal Level, Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
- Alexei A Kotov
- Laboratory of Biochemical Genetics of Animals, Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
- Vladimir E Adashev
- Laboratory of Biochemical Genetics of Animals, Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
- Sergei S Bazylev
- Laboratory of Biochemical Genetics of Animals, Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
- Ludmila V Olenina
- Laboratory of Biochemical Genetics of Animals, Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
|
22
|
Louzada S, Lopes M, Ferreira D, Adega F, Escudeiro A, Gama-Carvalho M, Chaves R. Decoding the Role of Satellite DNA in Genome Architecture and Plasticity-An Evolutionary and Clinical Affair. Genes (Basel) 2020; 11:E72. [PMID: 31936645 PMCID: PMC7017282 DOI: 10.3390/genes11010072] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 12/16/2019] [Revised: 12/29/2019] [Accepted: 01/08/2020] [Indexed: 12/11/2022]
Abstract
Repetitive DNA is a major organizational component of eukaryotic genomes, being intrinsically related to their architecture and evolution. Tandemly repeated satellite DNAs (satDNAs) can be found clustered in specific heterochromatin-rich chromosomal regions, building vital structures like functional centromeres, and also dispersed within euchromatin. Interestingly, despite their association with critical chromosomal structures, satDNAs are widely variable among species due to their high turnover rates. This dynamic behavior has been associated with genome plasticity and chromosome rearrangements, leading to the reshaping of genomes. Here we present the current knowledge regarding satDNAs in the light of new genomic technologies, and the challenges in the study of these sequences. Furthermore, we discuss how these sequences, together with other repeats, influence genome architecture, impacting its evolution and association with disease.
Affiliation(s)
- Sandra Louzada
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
- Mariana Lopes
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
- Daniela Ferreira
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
- Filomena Adega
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
- Ana Escudeiro
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
- Margarida Gama-Carvalho
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
- Raquel Chaves
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
|